aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-06-06 13:12:50 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-06-06 13:12:50 -0700
commit6d8854216ebb60959ddb6f4ea4123bd449ba6cf6 (patch)
tree065c7ffc29952ae309fcb815fbd4f81423655292 /Documentation
parentMerge tag 'io_uring-6.16-20250606' of git://git.kernel.dk/linux (diff)
parentMerge tag 'nvme-6.16-2025-06-05' of git://git.infradead.org/nvme into block-6.16 (diff)
downloadlinux-rng-6d8854216ebb60959ddb6f4ea4123bd449ba6cf6.tar.xz
linux-rng-6d8854216ebb60959ddb6f4ea4123bd449ba6cf6.zip
Merge tag 'block-6.16-20250606' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe: - NVMe pull request via Christoph: - TCP error handling fix (Shin'ichiro Kawasaki) - TCP I/O stall handling fixes (Hannes Reinecke) - fix command limits status code (Keith Busch) - support vectored buffers also for passthrough (Pavel Begunkov) - spelling fixes (Yi Zhang) - MD pull request via Yu: - fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10 - fix max_write_behind setting for dm-raid - some minor cleanups - Integrity data direction fix and cleanup - bcache NULL pointer fix - Fix for loop missing write start/end handling - Decouple hardware queues and IO threads in ublk - Slew of ublk selftests additions and updates * tag 'block-6.16-20250606' of git://git.kernel.dk/linux: (29 commits) nvme: spelling fixes nvme-tcp: fix I/O stalls on congested sockets nvme-tcp: sanitize request list handling nvme-tcp: remove tag set when second admin queue config fails nvme: enable vectored registered bufs for passthrough cmds nvme: fix implicit bool to flags conversion nvme: fix command limits status code selftests: ublk: kublk: improve behavior on init failure block: flip iter directions in blk_rq_integrity_map_user() block: drop direction param from bio_integrity_copy_user() selftests: ublk: cover PER_IO_DAEMON in more stress tests Documentation: ublk: document UBLK_F_PER_IO_DAEMON selftests: ublk: add stress test for per io daemons selftests: ublk: add functional test for per io daemons selftests: ublk: kublk: decouple ublk_queues from ublk server threads selftests: ublk: kublk: move per-thread data out of ublk_queue selftests: ublk: kublk: lift queue initialization out of thread selftests: ublk: kublk: tie sqe allocation to io instead of queue selftests: ublk: kublk: plumb q_id in io_uring user_data ublk: have a per-io daemon instead of a per-queue daemon ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/block/ublk.rst35
1 files changed, 24 insertions, 11 deletions
diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index 854f823b46c2..c368e1081b41 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -115,15 +115,15 @@ managing and controlling ublk devices with help of several control commands:
- ``UBLK_CMD_START_DEV``
- After the server prepares userspace resources (such as creating per-queue
- pthread & io_uring for handling ublk IO), this command is sent to the
+ After the server prepares userspace resources (such as creating I/O handler
+ threads & io_uring for handling ublk IO), this command is sent to the
driver for allocating & exposing ``/dev/ublkb*``. Parameters set via
``UBLK_CMD_SET_PARAMS`` are applied for creating the device.
- ``UBLK_CMD_STOP_DEV``
Halt IO on ``/dev/ublkb*`` and remove the device. When this command returns,
- ublk server will release resources (such as destroying per-queue pthread &
+ ublk server will release resources (such as destroying I/O handler threads &
io_uring).
- ``UBLK_CMD_DEL_DEV``
@@ -208,15 +208,15 @@ managing and controlling ublk devices with help of several control commands:
modify how I/O is handled while the ublk server is dying/dead (this is called
the ``nosrv`` case in the driver code).
- With just ``UBLK_F_USER_RECOVERY`` set, after one ubq_daemon(ublk server's io
- handler) is dying, ublk does not delete ``/dev/ublkb*`` during the whole
+ With just ``UBLK_F_USER_RECOVERY`` set, after the ublk server exits,
+ ublk does not delete ``/dev/ublkb*`` during the whole
recovery stage and ublk device ID is kept. It is ublk server's
responsibility to recover the device context by its own knowledge.
Requests which have not been issued to userspace are requeued. Requests
which have been issued to userspace are aborted.
- With ``UBLK_F_USER_RECOVERY_REISSUE`` additionally set, after one ubq_daemon
- (ublk server's io handler) is dying, contrary to ``UBLK_F_USER_RECOVERY``,
+ With ``UBLK_F_USER_RECOVERY_REISSUE`` additionally set, after the ublk server
+ exits, contrary to ``UBLK_F_USER_RECOVERY``,
requests which have been issued to userspace are requeued and will be
re-issued to the new process after handling ``UBLK_CMD_END_USER_RECOVERY``.
``UBLK_F_USER_RECOVERY_REISSUE`` is designed for backends who tolerate
@@ -241,10 +241,11 @@ can be controlled/accessed just inside this container.
Data plane
----------
-ublk server needs to create per-queue IO pthread & io_uring for handling IO
-commands via io_uring passthrough. The per-queue IO pthread
-focuses on IO handling and shouldn't handle any control & management
-tasks.
+The ublk server should create dedicated threads for handling I/O. Each
+thread should have its own io_uring through which it is notified of new
+I/O, and through which it can complete I/O. These dedicated threads
+should focus on IO handling and shouldn't handle any control &
+management tasks.
The's IO is assigned by a unique tag, which is 1:1 mapping with IO
request of ``/dev/ublkb*``.
@@ -265,6 +266,18 @@ with specified IO tag in the command data:
destined to ``/dev/ublkb*``. This command is sent only once from the server
IO pthread for ublk driver to setup IO forward environment.
+ Once a thread issues this command against a given (qid,tag) pair, the thread
+ registers itself as that I/O's daemon. In the future, only that I/O's daemon
+ is allowed to issue commands against the I/O. If any other thread attempts
+ to issue a command against a (qid,tag) pair for which the thread is not the
+ daemon, the command will fail. Daemons can be reset only be going through
+ recovery.
+
+ The ability for every (qid,tag) pair to have its own independent daemon task
+ is indicated by the ``UBLK_F_PER_IO_DAEMON`` feature. If this feature is not
+ supported by the driver, daemons must be per-queue instead - i.e. all I/Os
+ associated to a single qid must be handled by the same task.
+
- ``UBLK_IO_COMMIT_AND_FETCH_REQ``
When an IO request is destined to ``/dev/ublkb*``, the driver stores