aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/block (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-05-08ataflop: use a statically allocated error countersWilly Tarreau1-4/+6
This is the last driver making use of fd_request->error_count, which is easy to get wrong as was shown in floppy.c. We don't need to keep it there, it can be moved to the atari_floppy_struct instead, so let's do this. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Minh Yuan <yuanmingbuaa@gmail.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-08floppy: use a statically allocated error counterWilly Tarreau1-10/+8
Interrupt handler bad_flp_intr() may cause a UAF on the recently freed request just to increment the error count. There's no point keeping that one in the request anyway, and since the interrupt handler uses a static pointer to the error which cannot be kept in sync with the pending request, better make it use a static error counter that's reset for each new request. This reset now happens when entering redo_fd_request() for a new request via set_next_request(). One initial concern about a single error counter was that errors on one floppy drive could be reported on another one, but this problem is not real given that the driver uses a single drive at a time, as that PC-compatible controllers also have this limitation by using shared signals. As such the error count is always for the "current" drive. Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Tested-by: Denis Efremov <efremov@linux.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27floppy: disable FDRAWCMD by defaultWilly Tarreau2-11/+48
Minh Yuan reported a concurrency use-after-free issue in the floppy code between raw_cmd_ioctl and seek_interrupt. [ It turns out this has been around, and that others have reported the KASAN splats over the years, but Minh Yuan had a reproducer for it and so gets primary credit for reporting it for this fix - Linus ] The problem is, this driver tends to break very easily and nowadays, nobody is expected to use FDRAWCMD anyway since it was used to manipulate non-standard formats. The risk of breaking the driver is higher than the risk presented by this race, and accessing the device requires privileges anyway. Let's just add a config option to completely disable this ioctl and leave it disabled by default. Distros shouldn't use it, and only those running on antique hardware might need to enable it. Link: https://lore.kernel.org/all/000000000000b71cdd05d703f6bf@google.com/ Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Reported-by: syzbot+8e8958586909d62b6840@syzkaller.appspotmail.com Reported-by: cruise k <cruise4k@gmail.com> Reported-by: Kyungtae Kim <kt0755@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Tested-by: Denis Efremov <efremov@linux.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-14block: null_blk: end timed out poll requestMing Lei1-1/+1
When poll request is timed out, it is removed from the poll list, but not completed, so the request is leaked, and never get chance to complete. Fix the issue by ending it in timeout handler. Fixes: 0a593fbbc245 ("null_blk: poll queue support") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220413084836.1571995-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-06drbd: set QUEUE_FLAG_STABLE_WRITESChristoph Böhmwalder1-0/+1
We want our pages not to change while they are being written. Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-06drbd: fix an invalid memory access caused by incorrect use of list iteratorXiaomeng Tong1-4/+2
The bug is here: idr_remove(&connection->peer_devices, vnr); If the previous for_each_connection() don't exit early (no goto hit inside the loop), the iterator 'connection' after the loop will be a bogus pointer to an invalid structure object containing the HEAD (&resource->connections). As a result, the use of 'connection' above will lead to a invalid memory access (including a possible invalid free as idr_remove could call free_layer). The original intention should have been to remove all peer_devices, but the following lines have already done the work. So just remove this line and the unneeded label, to fix this bug. Cc: stable@vger.kernel.org Fixes: c06ece6ba6f1b ("drbd: Turn connection->volumes into connection->peer_devices") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-06drbd: Fix five use after free bugs in get_initial_stateLv Yunlong4-33/+42
In get_initial_state, it calls notify_initial_state_done(skb,..) if cb->args[5]==1. If genlmsg_put() failed in notify_initial_state_done(), the skb will be freed by nlmsg_free(skb). Then get_initial_state will goto out and the freed skb will be used by return value skb->len, which is a uaf bug. What's worse, the same problem goes even further: skb can also be freed in the notify_*_state_change -> notify_*_state calls below. Thus 4 additional uaf bugs happened. My patch lets the problem callee functions: notify_initial_state_done and notify_*_state_change return an error code if errors happen. So that the error codes could be propagated and the uaf bugs can be avoid. v2 reports a compilation warning. This v3 fixed this warning and built successfully in my local environment with no additional warnings. v2: https://lore.kernel.org/patchwork/patch/1435218/ Fixes: a29728463b254 ("drbd: Backport the "events2" command") Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-02Merge tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-blockLinus Torvalds1-12/+12
Pull block driver fix from Jens Axboe: "Got two reports on nbd spewing warnings on load now, which is a regression from a commit that went into your tree yesterday. Revert the problematic change for now" * tag 'for-5.18/drivers-2022-04-02' of git://git.kernel.dk/linux-block: Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"
2022-04-02Revert "nbd: fix possible overflow on 'first_minor' in nbd_dev_add()"Jens Axboe1-12/+12
This reverts commit 6d35d04a9e18990040e87d2bbf72689252669d54. Both Gabriel and Borislav report that this commit casues a regression with nbd: sysfs: cannot create duplicate filename '/dev/block/43:0' Revert it before 5.18-rc1 and we'll investigage this separately in due time. Link: https://lore.kernel.org/all/YkiJTnFOt9bTv6A2@zn.tnic/ Reported-by: Gabriel L. Somlo <somlo@cmu.edu> Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-04-01Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-blockLinus Torvalds7-33/+50
Pull block driver fixes from Jens Axboe: "Followup block driver updates and fixes for the 5.18-rc1 merge window. In detail: - NVMe pull request - Fix multipath hang when disk goes live over reconnect (Anton Eidelman) - fix RCU hole that allowed for endless looping in multipath round robin (Chris Leech) - remove redundant assignment after left shift (Colin Ian King) - add quirks for Samsung X5 SSDs (Monish Kumar R) - fix the read-only state for zoned namespaces with unsupposed features (Pankaj Raghav) - use a private workqueue instead of the system workqueue in nvmet (Sagi Grimberg) - allow duplicate NSIDs for private namespaces (Sungup Moon) - expose use_threaded_interrupts read-only in sysfs (Xin Hao)" - nbd minor allocation fix (Zhang) - drbd fixes and maintainer addition (Lars, Jakob, Christoph) - n64cart build fix (Jackie) - loop compat ioctl fix (Carlos) - misc fixes (Colin, Dongli)" * tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block: drbd: remove check of list iterator against head past the loop body drbd: remove usage of list iterator variable after loop nbd: fix possible overflow on 'first_minor' in nbd_dev_add() MAINTAINERS: add drbd co-maintainer drbd: fix potential silent data corruption loop: fix ioctl calls using compat_loop_info nvme-multipath: fix hang when disk goes live over reconnect nvme: fix RCU hole that allowed for endless looping in multipath round robin nvme: allow duplicate NSIDs for private namespaces nvmet: remove redundant assignment after left shift nvmet: use a private workqueue instead of the system workqueue nvme-pci: add quirks for Samsung X5 SSDs nvme-pci: expose use_threaded_interrupts read-only in sysfs nvme: fix the read-only state for zoned namespaces with unsupposed features n64cart: convert bi_disk to bi_bdev->bd_disk fix build xen/blkfront: fix comment for need_copy xen-blkback: remove redundant assignment to variable i
2022-03-31drbd: remove check of list iterator against head past the loop bodyJakob Koschel1-15/+27
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element. While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided. In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1]. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20220331220349.885126-2-jakobkoschel@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-31drbd: remove usage of list iterator variable after loopJakob Koschel1-2/+5
In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to iterate through the list [1]. Since that variable should not be used past the loop iteration, a separate variable is used to 'remember the current location within the loop'. To either continue iterating from that position or skip the iteration (if the previous iteration was complete) list_prepare_entry() is used. Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com> Link: https://lore.kernel.org/r/20220331220349.885126-1-jakobkoschel@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-31nbd: fix possible overflow on 'first_minor' in nbd_dev_add()Zhang Wensheng1-12/+12
When 'index' is a big numbers, it may become negative which forced to 'int'. then 'index << part_shift' might overflow to a positive value that is not greater than '0xfffff', then sysfs might complains about duplicate creation. Because of this, move the 'index' judgment to the front will fix it and be better. Fixes: b0d9111a2d53 ("nbd: use an idr to keep track of nbd devices") Fixes: 940c264984fd ("nbd: fix possible overflow for 'first_minor' in nbd_dev_add()") Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220310093224.4002895-1-zhangwensheng5@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-30drbd: fix potential silent data corruptionLars Ellenberg1-1/+2
Scenario: --------- bio chain generated by blk_queue_split(). Some split bio fails and propagates its error status to the "parent" bio. But then the (last part of the) parent bio itself completes without error. We would clobber the already recorded error status with BLK_STS_OK, causing silent data corruption. Reproducer: ----------- How to trigger this in the real world within seconds: DRBD on top of degraded parity raid, small stripe_cache_size, large read_ahead setting. Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED", umount and mount again, "reboot"). Cause significant read ahead. Large read ahead request is split by blk_queue_split(). Parts of the read ahead that are already in the stripe cache, or find an available stripe cache to use, can be serviced. Parts of the read ahead that would need "too much work", would need to wait for a "stripe_head" to become available, are rejected immediately. For larger read ahead requests that are split in many pieces, it is very likely that some "splits" will be serviced, but then the stripe cache is exhausted/busy, and the remaining ones will be rejected. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Cc: <stable@vger.kernel.org> # 4.13.x Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-30loop: fix ioctl calls using compat_loop_infoCarlos Llamas1-0/+1
Support for cryptoloop was deleted in commit 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer"), making the usage of loop_info->lo_encrypt_type obsolete. However, this member was also removed from the compat_loop_info definition and this breaks userspace ioctl calls for 32-bit binaries and CONFIG_COMPAT=y. This patch restores the compat_loop_info->lo_encrypt_type member and marks it obsolete as well as in the uapi header definitions. Fixes: 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220329201815.1347500-1-cmllamas@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-28Merge tag 'for-linus-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds1-4/+4
Pull xen updates from Juergen Gross: - A bunch of minor cleanups - A fix for kexec in Xen dom0 when executed on a high cpu number - A fix for resuming after suspend of a Xen guest with assigned PCI devices - A fix for a crash due to not disabled preemption when resuming as Xen dom0 * tag 'for-linus-5.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix is_xen_pmu() xen: don't hang when resuming PCI device arch:x86:xen: Remove unnecessary assignment in xen_apic_read() xen/grant-table: remove readonly parameter from functions xen/grant-table: remove gnttab_*transfer*() functions drivers/xen: use helper macro __ATTR_RW x86/xen: Fix kerneldoc warning xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 xen: use time_is_before_eq_jiffies() instead of open coding it
2022-03-24Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds10-160/+21
Pull SCSI updates from James Bottomley: "This series consists of the usual driver updates (qla2xxx, pm8001, libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates and bug fixes. The high blast radius core update is the removal of write same, which affects block and several non-SCSI devices. The other big change, which is more local, is the removal of the SCSI pointer" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (281 commits) scsi: scsi_ioctl: Drop needless assignment in sg_io() scsi: bsg: Drop needless assignment in scsi_bsg_sg_io_fn() scsi: lpfc: Copyright updates for 14.2.0.0 patches scsi: lpfc: Update lpfc version to 14.2.0.0 scsi: lpfc: SLI path split: Refactor BSG paths scsi: lpfc: SLI path split: Refactor Abort paths scsi: lpfc: SLI path split: Refactor SCSI paths scsi: lpfc: SLI path split: Refactor CT paths scsi: lpfc: SLI path split: Refactor misc ELS paths scsi: lpfc: SLI path split: Refactor VMID paths scsi: lpfc: SLI path split: Refactor FDISC paths scsi: lpfc: SLI path split: Refactor LS_RJT paths scsi: lpfc: SLI path split: Refactor LS_ACC paths scsi: lpfc: SLI path split: Refactor the RSCN/SCR/RDF/EDC/FARPR paths scsi: lpfc: SLI path split: Refactor PLOGI/PRLI/ADISC/LOGO paths scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI path scsi: lpfc: SLI path split: Introduce lpfc_prep_wqe scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4 scsi: lpfc: SLI path split: Refactor lpfc_iocbq scsi: lpfc: Use kcalloc() ...
2022-03-22Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-0/+1
Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
2022-03-22Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-5/+1
Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...
2022-03-22remove bdi_congested() and wb_congested() and related functionsNeilBrown2-5/+1
These functions are no longer useful as no BDIs report congestions any more. Removing the test on bdi_write_contested() in current_may_throttle() could cause a small change in behaviour, but only when PF_LOCAL_THROTTLE is set. So replace the calls by 'false' and simplify the code - and remove the functions. [akpm@linux-foundation.org: fix build] Link: https://lkml.kernel.org/r/164549983742.9187.2570198746005819592.stgit@noble.brown Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs] Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Paolo Valente <paolo.valente@linaro.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-21Merge tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-blockLinus Torvalds13-84/+86
Pull block driver updates from Jens Axboe: - NVMe updates via Christoph: - add vectored-io support for user-passthrough (Kanchan Joshi) - add verbose error logging (Alan Adamson) - support buffered I/O on block devices in nvmet (Chaitanya Kulkarni) - central discovery controller support (Martin Belanger) - fix and extended the globally unique idenfier validation (Christoph) - move away from the deprecated IDA APIs (Sagi Grimberg) - misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin, Chaitanya Kulkarni) - add lockdep annotations for in-kernel sockets (Chris Leech) - use vmalloc for ANA log buffer (Hannes Reinecke) - kerneldoc fixes (Chaitanya Kulkarni) - cleanups (Guoqing Jiang, Chaitanya Kulkarni, Christoph) - warn about shared namespaces without multipathing (Christoph) - MD updates via Song with a set of cleanups (Christoph, Mariusz, Paul, Erik, Dirk) - loop cleanups and queue depth configuration (Chaitanya) - null_blk cleanups and fixes (Chaitanya) - Use descriptive init/exit names in virtio_blk (Randy) - Use bvec_kmap_local() in drivers (Christoph) - bcache fixes (Mingzhe) - xen blk-front persistent grant speedups (Juergen) - rnbd fix and cleanup (Gioh) - Misc fixes (Christophe, Colin) * tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-block: (76 commits) virtio_blk: eliminate anonymous module_init & module_exit nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH nvme: remove nvme_alloc_request and nvme_alloc_request_qid nvme: cleanup how disk->disk_name is assigned nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate nvmet: use snprintf() with PAGE_SIZE in configfs nvmet: don't fold lines nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport nvme-tcp: lockdep: annotate in-kernel sockets nvme-tcp: don't fold the line nvme-tcp: don't initialize ret variable nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio nvme-multipath: use vmalloc for ANA log buffer xen/blkfront: speed up purge_persistent_grants() raid5: initialize the stripe_head embeeded bios as needed raid5-cache: statically allocate the recovery ra bio raid5-cache: fully initialize flush_bio when needed raid5-ppl: fully initialize the bio in ppl_new_iounit ...
2022-03-21Merge tag 'for-5.18/block-2022-03-18' of git://git.kernel.dk/linux-blockLinus Torvalds22-254/+70
Pull block updates from Jens Axboe: - BFQ cleanups and fixes (Yu, Zhang, Yahu, Paolo) - blk-rq-qos completion fix (Tejun) - blk-cgroup merge fix (Tejun) - Add offline error return value to distinguish it from an IO error on the device (Song) - IO stats fixes (Zhang, Christoph) - blkcg refcount fixes (Ming, Yu) - Fix for indefinite dispatch loop softlockup (Shin'ichiro) - blk-mq hardware queue management improvements (Ming) - sbitmap dead code removal (Ming, John) - Plugging merge improvements (me) - Show blk-crypto capabilities in sysfs (Eric) - Multiple delayed queue run improvement (David) - Block throttling fixes (Ming) - Start deprecating auto module loading based on dev_t (Christoph) - bio allocation improvements (Christoph, Chaitanya) - Get rid of bio_devname (Christoph) - bio clone improvements (Christoph) - Block plugging improvements (Christoph) - Get rid of genhd.h header (Christoph) - Ensure drivers use appropriate flush helpers (Christoph) - Refcounting improvements (Christoph) - Queue initialization and teardown improvements (Ming, Christoph) - Misc fixes/improvements (Barry, Chaitanya, Colin, Dan, Jiapeng, Lukas, Nian, Yang, Eric, Chengming) * tag 'for-5.18/block-2022-03-18' of git://git.kernel.dk/linux-block: (127 commits) block: cancel all throttled bios in del_gendisk() block: let blkcg_gq grab request queue's refcnt block: avoid use-after-free on throttle data block: limit request dispatch loop duration block/bfq-iosched: Fix spelling mistake "tenative" -> "tentative" sr: simplify the local variable initialization in sr_block_open() block: don't merge across cgroup boundaries if blkcg is enabled block: fix rq-qos breakage from skipping rq_qos_done_bio() block: flush plug based on hardware and software queue order block: ensure plug merging checks the correct queue at least once block: move rq_qos_exit() into disk_release() block: do more work in elevator_exit block: move blk_exit_queue into disk_release block: move q_usage_counter release into blk_queue_release block: don't remove hctx debugfs dir from blk_mq_exit_queue block: move blkcg initialization/destroy into disk allocation/release handler sr: implement ->free_disk to simplify refcounting sd: implement ->free_disk to simplify refcounting sd: delay calling free_opal_dev sd: call sd_zbc_release_disk before releasing the scsi_device reference ...
2022-03-21fs: Move many prototypes to pagemap.hMatthew Wilcox (Oracle)1-0/+1
These functions are page cache functionality and don't need to be declared in fs.h. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
2022-03-21n64cart: convert bi_disk to bi_bdev->bd_disk fix buildJackie Liu1-1/+1
My kernel robot report below: drivers/block/n64cart.c: In function ‘n64cart_submit_bio’: drivers/block/n64cart.c:91:26: error: ‘struct bio’ has no member named ‘bi_disk’ 91 | struct device *dev = bio->bi_disk->private_data; | ^~ CC drivers/slimbus/qcom-ctrl.o CC drivers/auxdisplay/hd44780.o CC drivers/watchdog/watchdog_core.o CC drivers/nvme/host/fault_inject.o AR drivers/accessibility/braille/built-in.a make[2]: *** [scripts/Makefile.build:288: drivers/block/n64cart.o] Error 1 Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio"); Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220321071216.1549596-1-liu.yun@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-21xen/blkfront: fix comment for need_copyDongli Zhang1-1/+1
The 'need_copy' is set when rq_data_dir(req) returns WRITE, in order to copy the written data to persistent page. ".need_copy = rq_data_dir(req) && info->feature_persistent," Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Fixes: c004a6fe0c40 ('block/xen-blkfront: Make it running on 64KB page granularity') Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220317220930.5698-1-dongli.zhang@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-21xen-blkback: remove redundant assignment to variable iColin Ian King1-1/+1
Variable i is being assigned a value that is never read, it is being re-assigned later in a for-loop. The assignment is redundant and can be removed. Cleans up clang scan build warning: drivers/block/xen-blkback/blkback.c:934:14: warning: Although the value stored to 'i' is used in the enclosing expression, the value is never actually read from 'i' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220317234646.78158-1-colin.i.king@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-17Merge tag 'nvme-5.18-2022-03-17' of git://git.infradead.org/nvme into for-5.18/driversJens Axboe1-0/+1
Pull NVMe updates from Christoph: "Second round of nvme updates for Linux 5.18 - add lockdep annotations for in-kernel sockets (Chris Leech) - use vmalloc for ANA log buffer (Hannes Reinecke) - kerneldoc fixes (Chaitanya Kulkarni) - cleanups (Guoqing Jiang, Chaitanya Kulkarni, me) - warn about shared namespaces without multipathing (me)" * tag 'nvme-5.18-2022-03-17' of git://git.infradead.org/nvme: nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH nvme: remove nvme_alloc_request and nvme_alloc_request_qid nvme: cleanup how disk->disk_name is assigned nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate nvmet: use snprintf() with PAGE_SIZE in configfs nvmet: don't fold lines nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport nvme-tcp: lockdep: annotate in-kernel sockets nvme-tcp: don't fold the line nvme-tcp: don't initialize ret variable nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio nvme-multipath: use vmalloc for ANA log buffer
2022-03-17virtio_blk: eliminate anonymous module_init & module_exitRandy Dunlap1-4/+4
Eliminate anonymous module_init() and module_exit(), which can lead to confusion or ambiguity when reading System.map, crashes/oops/bugs, or an initcall_debug log. Give each of these init and exit functions unique driver-specific names to eliminate the anonymous names. Example 1: (System.map) ffffffff832fc78c t init ffffffff832fc79e t init ffffffff832fc8f8 t init Example 2: (initcall_debug log) calling init+0x0/0x12 @ 1 initcall init+0x0/0x12 returned 0 after 15 usecs calling init+0x0/0x60 @ 1 initcall init+0x0/0x60 returned 0 after 2 usecs calling init+0x0/0x9a @ 1 initcall init+0x0/0x9a returned 0 after 74 usecs Fixes: e467cde23818 ("Block driver using virtio.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: virtualization@lists.linux-foundation.org Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220316192010.19001-2-rdunlap@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-16nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATHChristoph Hellwig1-0/+1
Start warning about exposing a namespace as multiple block devices, and set a fixed deprecation release. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2022-03-15xen/grant-table: remove readonly parameter from functionsJuergen Gross1-4/+4
The gnttab_end_foreign_access() family of functions is taking a "readonly" parameter, which isn't used. Remove it from the function parameters. Signed-off-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220311103429.12845-3-jgross@suse.com Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2022-03-11xen/blkfront: speed up purge_persistent_grants()Juergen Gross1-1/+4
purge_persistent_grants() is scanning the grants list for persistent grants being no longer in use by the backend. When having found such a grant, it will be set to "invalid" and pushed to the tail of the list. Instead of pushing it directly to the end of the list, add it first to a temporary list, avoiding to scan those entries again in the main list traversal. After having finished the scan, append the temporary list to the grant list. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Link: https://lore.kernel.org/r/20220311103527.12931-1-jgross@suse.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-09Merge tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds1-26/+37
Pull xen fixes from Juergen Gross: "Several Linux PV device frontends are using the grant table interfaces for removing access rights of the backends in ways being subject to race conditions, resulting in potential data leaks, data corruption by malicious backends, and denial of service triggered by malicious backends: - blkfront, netfront, scsifront and the gntalloc driver are testing whether a grant reference is still in use. If this is not the case, they assume that a following removal of the granted access will always succeed, which is not true in case the backend has mapped the granted page between those two operations. As a result the backend can keep access to the memory page of the guest no matter how the page will be used after the frontend I/O has finished. The xenbus driver has a similar problem, as it doesn't check the success of removing the granted access of a shared ring buffer. - blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p, kbdfront, and pvcalls are using a functionality to delay freeing a grant reference until it is no longer in use, but the freeing of the related data page is not synchronized with dropping the granted access. As a result the backend can keep access to the memory page even after it has been freed and then re-used for a different purpose. - netfront will fail a BUG_ON() assertion if it fails to revoke access in the rx path. This will result in a Denial of Service (DoS) situation of the guest which can be triggered by the backend" * tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/netfront: react properly to failing gnttab_end_foreign_access_ref() xen/gnttab: fix gnttab_end_foreign_access() without page specified xen/pvcalls: use alloc/free_pages_exact() xen/9p: use alloc/free_pages_exact() xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done() xen: remove gnttab_query_foreign_access() xen/gntalloc: don't use gnttab_query_foreign_access() xen/scsifront: don't use gnttab_query_foreign_access() for mapped status xen/netfront: don't use gnttab_query_foreign_access() for mapped status xen/blkfront: don't use gnttab_query_foreign_access() for mapped status xen/grant-table: add gnttab_try_end_foreign_access() xen/xenbus: don't let xenbus_grant_ring() remove grants in error case
2022-03-08blk-mq: prepare for implementing hctx table via xarrayMing Lei1-1/+1
It is inevitable to cause use-after-free on q->queue_hw_ctx between queue_for_each_hw_ctx() and blk_mq_update_nr_hw_queues(). And converting to xarray can fix the uaf, meantime code gets cleaner. Prepare for converting q->queue_hctx_ctx into xarray, one thing is that xa_for_each() can only accept 'unsigned long' as index, so changes type of hctx index of queue_for_each_hw_ctx() into 'unsigned long'. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-6-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-08block: mtip32xx: don't touch q->queue_hw_ctxMing Lei1-3/+1
q->queue_hw_ctx is really one blk-mq internal structure for retrieving hctx via its index, not supposed to be used by drivers. Meantime drivers can get the tags structure easily from tagset. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220308073219.91173-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07pktcdvd: remove a pointless debug check in pkt_submit_bioChristoph Hellwig1-8/+1
->queuedata is set up in pkt_init_queue, so it can't be NULL here. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20220304180105.409765-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07xen/blkfront: don't use gnttab_query_foreign_access() for mapped statusJuergen Gross1-26/+37
It isn't enough to check whether a grant is still being in use by calling gnttab_query_foreign_access(), as a mapping could be realized by the other side just after having called that function. In case the call was done in preparation of revoking a grant it is better to do so via gnttab_end_foreign_access_ref() and check the success of that operation instead. For the ring allocation use alloc_pages_exact() in order to avoid high order pages in case of a multi-page ring. If a grant wasn't unmapped by the backend without persistent grants being used, set the device state to "error". This is CVE-2022-23036 / part of XSA-396. Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> --- V2: - use gnttab_try_end_foreign_access() V4: - use alloc_pages_exact() and free_pages_exact() - set state to error if backend didn't unmap (Roger Pau Monné)
2022-03-06virtio-blk: Remove BUG_ON() in virtio_queue_rq()Xie Yongji1-10/+2
Currently we have a BUG_ON() to make sure the number of sg list does not exceed queue_max_segments() in virtio_queue_rq(). However, the block layer uses queue_max_discard_segments() instead of queue_max_segments() to limit the sg list for discard requests. So the BUG_ON() might be triggered if virtio-blk device reports a larger value for max discard segment than queue_max_segments(). To fix it, let's simply remove the BUG_ON() which has become unnecessary after commit 02746e26c39e("virtio-blk: avoid preallocating big SGL for data"). And the unused vblk->sg_elems can also be removed together. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20220304100058.116-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zeroXie Yongji1-2/+8
Currently the value of max_discard_segment will be set to MAX_DISCARD_SEGMENTS (256) with no basis in hardware if device set 0 to max_discard_seg in configuration space. It's incorrect since the device might not be able to handle such large descriptors. To fix it, let's follow max_segments restrictions in this case. Fixes: 1f23816b8eb8 ("virtio_blk: add discard and write zeroes support") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20220304100058.116-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-04floppy: use memcpy_{to,from}_bvecChristoph Hellwig1-4/+2
Use the helpers instead of open coding them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-04drbd: use bvec_kmap_local in recv_dless_readChristoph Hellwig1-2/+2
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-04drbd: use bvec_kmap_local in drbd_csum_bioChristoph Hellwig1-3/+3
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-04zram: use memcpy_from_bvec in zram_bvec_writeChristoph Hellwig1-4/+1
Use memcpy_from_bvec instead of open coding the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-04zram: use memcpy_to_bvec in zram_bvec_readChristoph Hellwig1-3/+1
Use the proper helper instead of open coding the copy. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-04aoe: use bvec_kmap_local in bvcpyChristoph Hellwig1-2/+2
Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220303111905.321089-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-01scsi: core: Move the result field from struct scsi_request to struct scsi_cmndChristoph Hellwig1-1/+1
Prepare for removing the scsi_request structure by moving the result field to struct scsi_cmnd. Link: https://lore.kernel.org/r/20220224175552.988286-7-hch@lst.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-03-01scsi: core: Remove the cmd field from struct scsi_requestChristoph Hellwig1-2/+4
Now that each scsi_request is backed by a scsi_cmnd, there is no need to indirect the CDB storage. Change all submitters of SCSI passthrough requests to store the CDB information directly in the scsi_cmnd, and while doing so allocate the full 32 bytes that cover all Linux supported SCSI hosts instead of requiring dynamic allocation for > 16 byte CDBs. On 64-bit systems this does not change the size of the scsi_cmnd at all, while on 32-bit systems it slightly increases it for now, but that increase will be made up by the removal of the remaining scsi_request fields. Link: https://lore.kernel.org/r/20220224175552.988286-4-hch@lst.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-02-27null_blk: null_alloc_page() cleanupChaitanya Kulkarni1-7/+5
Remove goto labels and use direct returns as error unwinding code only needs to free t_page variable if we alloc_pages() call fails as having two labels for one kfree() can be avoided easily. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220222152852.26043-3-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-27null_blk: remove hardcoded null_alloc_page() paramChaitanya Kulkarni1-4/+4
Only caller of null_alloc_page() is null_insert_page() unconditionally sets only parameter to GFP_NOIO and that is statically hard-coded in null_blk. There is no point in having statically hardcoded function parameter. Remove the unnecessary parameter gfp_flags and adjust the code, so it can retain existing behavior null_alloc_page() with GFP_NOIO. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220222152852.26043-2-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-27null_blk: remove hardcoded alloc_cmd() parameterChaitanya Kulkarni1-17/+12
Only caller of alloc_cmd() is null_submit_bio() unconditionally sets second parameter to true and that is statically hard-coded in null_blk. There is no point in having statically hardcoded function parameter. Remove the unnecessary parameter can_wait and adjust the code so it can retain existing behavior of waiting when we don't get valid nullb_cmd from __alloc_cmd() in alloc_cmd(). The restructured code avoids multiple return statements, multiple calls to __alloc_cmd() and resulting a fast path call to prepare_to_wait() due to removal of first alloc_cmd() call. Follow the pattern that we have in bio_alloc() to set the structure members in the structure allocation function in alloc_cmd() and pass bio to initialize newly allocated cmd->bio member. Follow the pattern in copy_to_nullb() to use result of one function call (null_cache_active()) to be used as a parameter to another function call (null_insert_page()), use result of alloc_cmd() as a first parameter to the null_handle_cmd() in null_submit_bio() function. This allow us to remove the local variable cmd on stack in null_submit_bio() that is in fast path. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Link: https://lore.kernel.org/r/20220216172945.31124-2-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-27loop: allow user to set the queue depthChaitanya Kulkarni1-1/+20
Instead of hardcoding queue depth allow user to set the hw queue depth using module parameter. Set default value to 128 to retain the existing behavior. Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20220215213310.7264-5-kch@nvidia.com Signed-off-by: Jens Axboe <axboe@kernel.dk>