aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi/scsi_lib.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-05-18scsi: zero per-cmd private driver data for each MQ I/OLong Li1-1/+1
In lower layer driver's (LLD) scsi_host_template, the driver may optionally ask SCSI to allocate its private driver memory for each command, by specifying cmd_size. This memory is allocated at the end of scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use scsi_cmd_priv to get to its private data. Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In this case, the LLD may get to stale or uninitialized data in its private driver memory. This may result in unexpected driver and hardware behavior. Fix this problem by also zeroing the private driver memory before passing them to LLD. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> CC: <stable@vger.kernel.org> # 4.11+ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-08scsi: scsi_lib: Add #include <scsi/scsi_transport.h>Bart Van Assche1-0/+1
This patch avoids that when building with W=1 the compiler complains that __scsi_init_queue() has not been declared. See also commit d48777a633d6 ("scsi: remove __scsi_alloc_queue"). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-06Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-7/+6
Pull block fixes and updates from Jens Axboe: "Some fixes and followup features/changes that should go in, in this merge window. This contains: - Two fixes for lightnvm from Javier, fixing problems in the new code merge previously in this merge window. - A fix from Jan for the backing device changes, fixing an issue in NFS that causes a failure to mount on certain setups. - A change from Christoph, cleaning up the blk-mq init and exit request paths. - Remove elevator_change(), which is now unused. From Bart. - A fix for queue operation invocation on a dead queue, from Bart. - A series fixing up mtip32xx for blk-mq scheduling, removing a bandaid we previously had in place for this. From me. - A regression fix for this series, fixing a case where we wait on workqueue flushing from an invalid (non-blocking) context. From me. - A fix/optimization from Ming, ensuring that we don't both quiesce and freeze a queue at the same time. - A fix from Peter on lock ordering for CPU hotplug. Not a real problem right now, but will be once the CPU hotplug rework goes in. - A series from Omar, cleaning up out blk-mq debugfs support, and adding support for exporting info from schedulers in debugfs as well. This is really useful in debugging stalls or livelocks. From Omar" * 'for-linus' of git://git.kernel.dk/linux-block: (28 commits) mq-deadline: add debugfs attributes kyber: add debugfs attributes blk-mq-debugfs: allow schedulers to register debugfs attributes blk-mq: untangle debugfs and sysfs blk-mq: move debugfs declarations to a separate header file blk-mq: Do not invoke queue operations on a dead queue blk-mq-debugfs: get rid of a bunch of boilerplate blk-mq-debugfs: rename hw queue directories from <n> to hctx<n> blk-mq-debugfs: don't open code strstrip() blk-mq-debugfs: error on long write to queue "state" file blk-mq-debugfs: clean up flag definitions blk-mq-debugfs: separate flags with | nfs: Fix bdi handling for cloned superblocks block/mq: Cure cpu hotplug lock inversion lightnvm: fix bad back free on error path lightnvm: create cmd before allocating request blk-mq: don't use sync workqueue flushing from drivers mtip32xx: convert internal commands to regular block infrastructure mtip32xx: cleanup internal tag assumptions block: don't call blk_mq_quiesce_queue() after queue is frozen ...
2017-05-04Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...
2017-05-02blk-mq: update ->init_request and ->exit_request prototypesChristoph Hellwig1-7/+6
Remove the request_idx parameter, which can't be used safely now that we support I/O schedulers with blk-mq. Except for a superflous check in mtip32xx it was unused anyway. Also pass the tag_set instead of just the driver data - this allows drivers to avoid some code duplication in a follow on cleanup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-01Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-blockLinus Torvalds1-11/+14
Pull block layer updates from Jens Axboe: - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ was initially a fork of CFQ, but subsequently changed to implement fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant to be used on desktop type single drives, providing good fairness. From Paolo. - Add Kyber IO scheduler. This is a full multiqueue aware scheduler, using a scalable token based algorithm that throttles IO based on live completion IO stats, similary to blk-wbt. From Omar. - A series from Jan, moving users to separately allocated backing devices. This continues the work of separating backing device life times, solving various problems with hot removal. - A series of updates for lightnvm, mostly from Javier. Includes a 'pblk' target that exposes an open channel SSD as a physical block device. - A series of fixes and improvements for nbd from Josef. - A series from Omar, removing queue sharing between devices on mostly legacy drivers. This helps us clean up other bits, if we know that a queue only has a single device backing. This has been overdue for more than a decade. - Fixes for the blk-stats, and improvements to unify the stats and user windows. This both improves blk-wbt, and enables other users to register a need to receive IO stats for a device. From Omar. - blk-throttle improvements from Shaohua. This provides a scalable framework for implementing scalable priotization - particularly for blk-mq, but applicable to any type of block device. The interface is marked experimental for now. - Bucketized IO stats for IO polling from Stephen Bates. This improves efficiency of polled workloads in the presence of mixed block size IO. - A few fixes for opal, from Scott. - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics. From a variety of folks, mostly Sagi and James Smart. - A series from Bart, improving our exposed info and capabilities from the blk-mq debugfs support. - A series from Christoph, cleaning up how handle WRITE_ZEROES. - A series from Christoph, cleaning up the block layer handling of how we track errors in a request. On top of being a nice cleanup, it also shrinks the size of struct request a bit. - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was never used by platforms, and the latter has outlived it's usefulness. - Various little bug fixes and cleanups from a wide variety of folks. * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits) block: hide badblocks attribute by default blk-mq: unify hctx delay_work and run_work block: add kblock_mod_delayed_work_on() blk-mq: unify hctx delayed_run_work and run_work nbd: fix use after free on module unload MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler blk-mq-sched: alloate reserved tags out of normal pool mtip32xx: use runtime tag to initialize command header scsi: Implement blk_mq_ops.show_rq() blk-mq: Add blk_mq_ops.show_rq() blk-mq: Show operation, cmd_flags and rq_flags names blk-mq: Make blk_flags_show() callers append a newline character blk-mq: Move the "state" debugfs attribute one level down blk-mq: Unregister debugfs attributes earlier blk-mq: Only unregister hctxs for which registration succeeded blk-mq-debugfs: Rename functions for registering and unregistering the mq directory blk-mq: Let blk_mq_debugfs_register() look up the queue name blk-mq: Register <dev>/queue/mq after having registered <dev>/queue ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset ide-pm: always pass 0 error to __blk_end_request_all ..
2017-04-26scsi: Implement blk_mq_ops.show_rq()Bart Van Assche1-0/+4
Show the SCSI CDB for pending SCSI commands in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list. An example of how SCSI commands are displayed by this code: ffff8801703245c0 {.op=READ, .cmd_flags=META PRIO, .rq_flags=DONTPREP IO_STAT STATS, .tag=14, .internal_tag=-1, .cmd=Read(10) 28 00 2a 81 1b 30 00 00 08 00} Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Hannes Reinecke <hare@suse.com> Cc: <linux-scsi@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-24Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull SCSI fix from James Bottomley: "Our final fix before the 4.12 release (hopefully). It's an error leg again: the fix to not bug on empty DMA transfers is returning the wrong code and confusing the block layer" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: return correct blkprep status code in case scsi_init_io() fails.
2017-04-20blk-mq: remove the error argument to blk_mq_complete_requestChristoph Hellwig1-1/+1
Now that all drivers that call blk_mq_complete_requests have a ->complete callback we can remove the direct call to blk_mq_end_request, as well as the error argument to blk_mq_complete_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20scsi: introduce a result field in struct scsi_requestChristoph Hellwig1-8/+7
This passes on the scsi_cmnd result field to users of passthrough requests. Currently we abuse req->errors for this purpose, but that field will go away in its current form. Note that the old IDE code abuses the errors field in very creative ways and stores all kinds of different values in it. I didn't dare to touch this magic, so the abuses are brought forward 1:1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-15Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixesJames Bottomley1-2/+2
2017-04-13scsi: return correct blkprep status code in case scsi_init_io() fails.Johannes Thumshirn1-2/+2
When instrumenting the SCSI layer to run into the !blk_rq_nr_phys_segments(rq) case the following warning emitted from the block layer: blk_peek_request: bad return=-22 This happens because since commit fd3fc0b4d730 ("scsi: don't BUG_ON() empty DMA transfers") we return the wrong error value from scsi_prep_fn() back to the block layer. [mkp: silenced checkpatch] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: fd3fc0b4d730 scsi: don't BUG_ON() empty DMA transfers Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07Merge branch 'for-linus' into for-4.12/blockJens Axboe1-3/+3
We've added a considerable amount of fixes for stalls and issues with the blk-mq scheduling in the 4.11 series since forking off the for-4.12/block branch. We need to do improvements on top of that for 4.12, so pull in the previous fixes to make our lives easier going forward. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07scsi: Avoid that SCSI queues get stuckBart Van Assche1-3/+3
If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block driver that implements that function is responsible for rerunning the hardware queue once requests can be queued again successfully. commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") removed the blk_mq_stop_hw_queue() call from scsi_queue_rq() for the BLK_MQ_RQ_QUEUE_BUSY case. Hence change all calls to functions that are intended to rerun a busy queue such that these examine all hardware queues instead of only stopped queues. Since no other functions than scsi_internal_device_block() and scsi_internal_device_unblock() should ever stop or restart a SCSI queue, change the blk_mq_delay_queue() call into a blk_mq_delay_run_hw_queue() call. Fixes: commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") Fixes: commit 7e79dadce222 ("blk-mq: stop hardware queue in blk_mq_delay_queue()") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-06scsi: make asynchronous aborts mandatoryHannes Reinecke1-1/+1
There hasn't been any reports for HBAs where asynchronous abort would not work, so we should make it mandatory and remove the fallback. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make scsi_eh_scmd_add() always succeedHannes Reinecke1-2/+2
scsi_eh_scmd_add() currently only will fail if no error handler thread is started (which will never be the case) or if the state machine encounters an illegal transition. But if we're encountering an invalid state transition chances is we cannot fixup things with the error handler. So better add a WARN_ON for illegal host states and make scsi_dh_scmd_add() a void function. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-05block, scsi: move the retries field to struct scsi_requestChristoph Hellwig1-2/+2
Instead of bloating the generic struct request with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-31blk-mq: constify struct blk_mq_opsEric Biggers1-1/+1
Constify all instances of blk_mq_ops, as they are never modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-07Merge remote-tracking branch 'mkp-scsi/fixes' into fixesJames Bottomley1-4/+10
2017-03-03Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-63/+50
Pull more SCSI updates from James Bottomley: "This is the set of stuff that didn't quite make the initial pull and a set of fixes for stuff which did. The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes cover a lot of previously submitted stuff, the most important of which probably covers some of the failing irq vectors allocation and other fallout from having the SCSI command allocated as part of the block allocation functions" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (59 commits) scsi: qedi: Fix memory leak in tmf response processing. scsi: aacraid: remove redundant zero check on ret scsi: lpfc: use proper format string for dma_addr_t scsi: lpfc: use div_u64 for 64-bit division scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m scsi: cciss: correct check map error. scsi: qla2xxx: fix spelling mistake: "seperator" -> "separator" scsi: aacraid: Fixed expander hotplug for SMART family scsi: mpt3sas: switch to pci_alloc_irq_vectors scsi: qedf: fixup compilation warning about atomic_t usage scsi: remove scsi_execute_req_flags scsi: merge __scsi_execute into scsi_execute scsi: simplify scsi_execute_req_flags scsi: make the sense header argument to scsi_test_unit_ready mandatory scsi: sd: improve TUR handling in sd_check_events scsi: always zero sshdr in scsi_normalize_sense scsi: scsi_dh_emc: return success in clariion_std_inquiry() scsi: fix memory leak of sdpk on when gd fails to allocate scsi: sd: make sd_devt_release() static scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework. ...
2017-03-01scsi: mpt3sas: Avoid sleeping in interrupt contextBart Van Assche1-4/+10
Commit 669f044170d8 ("scsi: srp_transport: Move queuecommand() wait code to SCSI core") can make scsi_internal_device_block() sleep. However, the mpt3sas driver can call this function from an interrupt handler. Hence add a second argument to scsi_internal_device_block() that restores the old behavior of this function for the mpt3sas handler. The call chain that triggered an "IRQ handler enabled interrupts" complaint is as follows: _base_interrupt() -> _base_async_event() -> mpt3sas_scsih_event_callback() -> _scsih_check_topo_delete_events() -> _scsih_block_io_to_children_attached_directly() -> _scsih_block_io_device() -> _scsih_internal_device_block() -> scsi_internal_device_block() Reported-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Omar Sandoval <osandov@osandov.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Christoph Hellwig <hch@lst.de> Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Chaitra P B <chaitra.basappa@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Cc: <stable@vger.kernel.org> # v4.10+ Tested-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-23scsi: remove scsi_execute_req_flagsChristoph Hellwig1-11/+0
And switch all callers to use scsi_execute instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-23scsi: merge __scsi_execute into scsi_executeChristoph Hellwig1-27/+21
All but one caller want the decoded sense header, so offer the existing __scsi_execute helper as the public scsi_execute API to simply the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22scsi: simplify scsi_execute_req_flagsChristoph Hellwig1-18/+9
Add a sshdr argument to __scsi_execute so that we can decode the sense data directly into the sense header instead of needing a copy of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22scsi: make the sense header argument to scsi_test_unit_ready mandatoryChristoph Hellwig1-12/+2
It's a tiny structure that can be allocated on the stack, don't complicate the code by making it optional. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22scsi: use 'scsi_device_from_queue()' for scsi_dhHannes Reinecke1-0/+23
The device handler needs to check if a given queue belongs to a scsi device; only then does it make sense to attach a device handler. [mkp: dropped flags] Cc: <stable@vger.kernel.org> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-21scsi: zero per-cmd driver data before each I/OChristoph Hellwig1-1/+1
Without this drivers that don't clear the state themselves can see off effects. For example Hyper-V VMs using the storvsc driver will often hang during boot due to uncleared Test Unit Ready failures. Fixes: e9c787e6 ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Dexuan Cui <decui@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-21Merge tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-blockLinus Torvalds1-88/+176
Pull block layer updates from Jens Axboe: - blk-mq scheduling framework from me and Omar, with a port of the deadline scheduler for this framework. A port of BFQ from Paolo is in the works, and should be ready for 4.12. - Various fixups and improvements to the above scheduling framework from Omar, Paolo, Bart, me, others. - Cleanup of the exported sysfs blk-mq data into debugfs, from Omar. This allows us to export more information that helps debug hangs or performance issues, without cluttering or abusing the sysfs API. - Fixes for the sbitmap code, the scalable bitmap code that was migrated from blk-mq, from Omar. - Removal of the BLOCK_PC support in struct request, and refactoring of carrying SCSI payloads in the block layer. This cleans up the code nicely, and enables us to kill the SCSI specific parts of struct request, shrinking it down nicely. From Christoph mainly, with help from Hannes. - Support for ranged discard requests and discard merging, also from Christoph. - Support for OPAL in the block layer, and for NVMe as well. Mainly from Scott Bauer, with fixes/updates from various others folks. - Error code fixup for gdrom from Christophe. - cciss pci irq allocation cleanup from Christoph. - Making the cdrom device operations read only, from Kees Cook. - Fixes for duplicate bdi registrations and bdi/queue life time problems from Jan and Dan. - Set of fixes and updates for lightnvm, from Matias and Javier. - A few fixes for nbd from Josef, using idr to name devices and a workqueue deadlock fix on receive. Also marks Josef as the current maintainer of nbd. - Fix from Josef, overwriting queue settings when the number of hardware queues is updated for a blk-mq device. - NVMe fix from Keith, ensuring that we don't repeatedly mark and IO aborted, if we didn't end up aborting it. - SG gap merging fix from Ming Lei for block. - Loop fix also from Ming, fixing a race and crash between setting loop status and IO. - Two block race fixes from Tahsin, fixing request list iteration and fixing a race between device registration and udev device add notifiations. - Double free fix from cgroup writeback, from Tejun. - Another double free fix in blkcg, from Hou Tao. - Partition overflow fix for EFI from Alden Tondettar. * tag 'for-4.11/linus-merge-signed' of git://git.kernel.dk/linux-block: (156 commits) nvme: Check for Security send/recv support before issuing commands. block/sed-opal: allocate struct opal_dev dynamically block/sed-opal: tone down not supported warnings block: don't defer flushes on blk-mq + scheduling blk-mq-sched: ask scheduler for work, if we failed dispatching leftovers blk-mq: don't special case flush inserts for blk-mq-sched blk-mq-sched: don't add flushes to the head of requeue queue blk-mq: have blk_mq_dispatch_rq_list() return if we queued IO or not block: do not allow updates through sysfs until registration completes lightnvm: set default lun range when no luns are specified lightnvm: fix off-by-one error on target initialization Maintainers: Modify SED list from nvme to block Move stack parameters for sed_ioctl to prevent oversized stack with CONFIG_KASAN uapi: sed-opal fix IOW for activate lsp to use correct struct cdrom: Make device operations read-only elevator: fix loading wrong elevator type for blk-mq devices cciss: switch to pci_irq_alloc_vectors block/loop: fix race between I/O and set_status blk-mq-sched: don't hold queue_lock when calling exit_icq block: set make_request_fn manually in blk_mq_update_nr_hw_queues ...
2017-02-19scsi: don't BUG_ON() empty DMA transfersJohannes Thumshirn1-1/+2
Don't crash the machine just because of an empty transfer. Use WARN_ON() combined with returning an error. Found by Dmitry Vyukov and syzkaller. [ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON() might still be a cause of excessive log spamming. NOTE! If this warning ever triggers, we may end up leaking resources, since this doesn't bother to try to clean the command up. So this WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is much worse. People really need to stop using BUG_ON() for "this shouldn't ever happen". It makes pretty much any bug worse. - Linus ] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-31block: fold cmd_type into the REQ_OP_ spaceChristoph Hellwig1-17/+13
Instead of keeping two levels of indirection for requests types, fold it all into the operations. The little caveat here is that previously cmd_type only applied to struct request, while the request and bio op fields were set to plain REQ_OP_READ/WRITE even for passthrough operations. Instead this patch adds new REQ_OP_* for SCSI passthrough and driver private requests, althought it has to add two for each so that we can communicate the data in/out nature of the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31block: introduce blk_rq_is_passthroughChristoph Hellwig1-3/+3
This can be used to check for fs vs non-fs requests and basically removes all knowledge of BLOCK_PC specific from the block layer, as well as preparing for removing the cmd_type field in struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27block: split scsi_request out of struct requestChristoph Hellwig1-23/+28
And require all drivers that want to support BLOCK_PC to allocate it as the first thing of their private data. To support this the legacy IDE and BSG code is switched to set cmd_size on their queues to let the block layer allocate the additional space. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27scsi: allocate scsi_cmnd structures as part of struct requestChristoph Hellwig1-40/+82
Rely on the new block layer functionality to allocate additional driver specific data behind struct request instead of implementing it in SCSI itѕelf. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27scsi: remove __scsi_alloc_queueChristoph Hellwig1-15/+4
Instead do an internal export of __scsi_init_queue for the transport classes that export BSG nodes. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-27scsi: respect unchecked_isa_dma for blk-mqChristoph Hellwig1-3/+59
Currently blk-mq always allocates the sense buffer using normal GFP_KERNEL allocation. Refactor the cmd pool code to split the cmd and sense allocation and share the code to allocate the sense buffers as well as the sense buffer slab caches between the legacy and blk-mq path. Note that this switches to lazy allocation of the sense slab caches - the slab caches (not the actual allocations) won't be destroy until the scsi module is unloaded instead of keeping track of hosts using them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-14Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull block fixes from Jens Axboe: - the virtio_blk stack DMA corruption fix from Christoph, fixing and issue with VMAP stacks. - O_DIRECT blkbits calculation fix from Chandan. - discard regression fix from Christoph. - queue init error handling fixes for nbd and virtio_blk, from Omar and Jeff. - two small nvme fixes, from Christoph and Guilherme. - rename of blk_queue_zone_size and bdev_zone_size to _sectors instead, to more closely follow what we do in other places in the block layer. This interface is new for this series, so let's get the naming right before releasing a kernel with this feature. From Damien. * 'for-linus' of git://git.kernel.dk/linux-block: block: don't try to discard from __blkdev_issue_zeroout sd: remove __data_len hack for WRITE SAME nvme: use blk_rq_payload_bytes scsi: use blk_rq_payload_bytes block: add blk_rq_payload_bytes block: Rename blk_queue_zone_size and bdev_zone_size nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too nvme-rdma: fix nvme_rdma_queue_is_ready virtio_blk: fix panic in initialization error path nbd: blk_mq_init_queue returns an error code on failure, not NULL virtio_blk: avoid DMA to stack for the sense buffer do_direct_IO: Use inode->i_blkbits to compute block count to be cleaned
2017-01-13scsi: use blk_rq_payload_bytesChristoph Hellwig1-1/+1
Without that we'll pass a wrong payload size in cmd->sdb, which can lead to hangs with drivers that need the total transfer size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Chris Valean <v-chvale@microsoft.com> Reported-by: Dexuan Cui <decui@microsoft.com> Fixes: f9d03f96 ("block: improve handling of the magic discard payload") Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-12-20scsi: scsi-mq: Wait for .queue_rq() if necessaryBart Van Assche1-1/+1
Ensure that if scsi-mq is enabled that scsi_internal_device_block() waits until ongoing shost->hostt->queuecommand() calls have finished. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-12-14Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+49
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (ncr5380, lpfc, hisi_sas, megaraid_sas, ufs, ibmvscsis, mpt3sas). There's also an assortment of minor fixes, mostly in error legs or other not very user visible stuff. The major change is the pci_alloc_irq_vectors replacement for the old pci_msix_.. calls; this effectively makes IRQ mapping generic for the drivers and allows blk_mq to use the information" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (256 commits) scsi: qla4xxx: switch to pci_alloc_irq_vectors scsi: hisi_sas: support deferred probe for v2 hw scsi: megaraid_sas: switch to pci_alloc_irq_vectors scsi: scsi_devinfo: remove synchronous ALUA for NETAPP devices scsi: be2iscsi: set errno on error path scsi: be2iscsi: set errno on error path scsi: hpsa: fallback to use legacy REPORT PHYS command scsi: scsi_dh_alua: Fix RCU annotations scsi: hpsa: use %phN for short hex dumps scsi: hisi_sas: fix free'ing in probe and remove scsi: isci: switch to pci_alloc_irq_vectors scsi: ipr: Fix runaway IRQs when falling back from MSI to LSI scsi: dpt_i2o: double free on error path scsi: cxlflash: Migrate scsi command pointer to AFU command scsi: cxlflash: Migrate IOARRIN specific routines to function pointers scsi: cxlflash: Cleanup queuecommand() scsi: cxlflash: Cleanup send_tmf() scsi: cxlflash: Remove AFU command lock scsi: cxlflash: Wait for active AFU commands to timeout upon tear down scsi: cxlflash: Remove private command pool ...
2016-12-09block: improve handling of the magic discard payloadChristoph Hellwig1-3/+3
Instead of allocating a single unused biovec for discard requests, send them down without any payload. Instead we allow the driver to add a "special" payload using a biovec embedded into struct request (unioned over other fields never used while in the driver), and overloading the number of segments for this case. This has a couple of advantages: - we don't have to allocate the bio_vec - the amount of special casing for discard requests in the block layer is significantly reduced - using this same scheme for other request types is trivial, which will be important for implementing the new WRITE_ZEROES op on devices where it actually requires a payload (e.g. SCSI) - we can get rid of playing games with the request length, as we'll never touch it and completions will work just fine - it will allow us to support ranged discard operations in the future by merging non-contiguous discard bios into a single request - last but not least it removes a lot of code This patch is the common base for my WIP series for ranges discards and to remove discard_zeroes_data in favor of always using REQ_OP_WRITE_ZEROES, so it would be good to get it in quickly. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-29scsi: srp_transport: Move queuecommand() wait code to SCSI coreBart Van Assche1-2/+39
Additionally, rename srp_wait_for_queuecommand() into scsi_wait_for_queuecommand() and add a comment about the queuecommand() call from scsi_send_eh_cmnd(). Note: this patch changes scsi_internal_device_block from a function that did not sleep into a function that may sleep. This is fine for all callers of this function: * scsi_internal_device_block() is called from the mpt3sas device while that driver holds the ioc->dm_cmds.mutex. This means that the mpt3sas driver calls this function from thread context. * scsi_target_block() is called by __iscsi_block_session() from kernel thread context and with IRQs enabled. * The SRP transport code also calls scsi_target_block() from kernel thread context while sleeping is allowed. * The snic driver also calls scsi_target_block() from a context from which sleeping is allowed. The scsi_target_block() call namely occurs immediately after a scsi_flush_work() call. [mkp: s/shost/sdev/] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-15scsi_lib: untangle 0 and BLK_MQ_RQ_QUEUE_OKOmar Sandoval1-3/+3
Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having specific values. No functional change. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-08scsi: allow LLDDs to expose the queue mapping to blk-mqChristoph Hellwig1-0/+10
Just hand through the blk-mq map_queues method in the host template. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-02blk-mq: Add a kick_requeue_list argument to blk_mq_requeue_request()Bart Van Assche1-3/+1
Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls are followed by kicking the requeue list. Hence add an argument to these two functions that allows to kick the requeue list. This was proposed by Christoph Hellwig. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-11-02blk-mq: Avoid that requeueing starts stopped queuesBart Van Assche1-1/+0
Since blk_mq_requeue_work() starts stopped queues and since execution of this function can be scheduled after a queue has been stopped it is not possible to stop queues without using an additional state variable to track whether or not the queue has been stopped. Hence modify blk_mq_requeue_work() such that it does not start stopped queues. My conclusion after a review of the blk_mq_stop_hw_queues() and blk_mq_{delay_,}kick_requeue_list() callers is as follows: * In the dm driver starting and stopping queues should only happen if __dm_suspend() or __dm_resume() is called and not if the requeue list is processed. * In the SCSI core queue stopping and starting should only be performed by the scsi_internal_device_block() and scsi_internal_device_unblock() functions but not by any other function. Although the blk_mq_stop_hw_queue() call in scsi_queue_rq() may help to reduce CPU load if a LLD queue is full, figuring out whether or not a queue should be restarted when requeueing a command would require to introduce additional locking in scsi_mq_requeue_cmd() to avoid a race with scsi_internal_device_block(). Avoid this complexity by removing the blk_mq_stop_hw_queue() call from scsi_queue_rq(). * In the NVMe core only the functions that call blk_mq_start_stopped_hw_queues() explicitly should start stopped queues. * A blk_mq_start_stopped_hwqueues() call must be added in the xen-blkfront driver in its blkif_recover() function. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: James Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-10-28block: split out request-only flags into a new namespaceChristoph Hellwig1-32/+43
A lot of the REQ_* flags are only used on struct requests, and only of use to the block layer and a few drivers that dig into struct request internals. This patch adds a new req_flags_t rq_flags field to struct request for them, and thus dramatically shrinks the number of common requests. It also removes the unfortunate situation where we have to fit the fields from the same enum into 32 bits for struct bio and 64 bits for struct request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15blk-mq: remove ->map_queueChristoph Hellwig1-1/+0
All drivers use the default, so provide an inline version of it. If we ever need other queue mapping we can add an optional method back, although supporting will also require major changes to the queue setup code. This provides better code generation, and better debugability as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-22scsi_lib: correctly retry failed zero length REQ_TYPE_FS commandsJames Bottomley1-2/+5
When SCSI was written, all commands coming from the filesystem (REQ_TYPE_FS commands) had data. This meant that our signal for needing to complete the command was the number of bytes completed being equal to the number of bytes in the request. Unfortunately, with the advent of flush barriers, we can now get zero length REQ_TYPE_FS commands, which confuse this logic because they satisfy the condition every time. This means they never get retried even for retryable conditions, like UNIT ATTENTION because we complete them early assuming they're done. Fix this by special casing the early completion condition to recognise zero length commands with errors and let them drop through to the retry code. Cc: stable@vger.kernel.org Reported-by: Sebastian Parschauer <s.parschauer@gmx.de> Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Tested-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-05-10scsi_lib: Decode T10 vendor IDsHannes Reinecke1-0/+16
Some arrays / HBAs will only present T10 vendor IDs, so we should be decoding them, too. [mkp: Fixed T10 spelling] Suggested-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Hannes Reinecke <hare@suse.com> Tested-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-04-15lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.cMing Lin1-137/+0
Now it's ready to move the mempool based SG chained allocator code from SCSI driver to lib/sg_pool.c, which will be compiled only based on a Kconfig symbol CONFIG_SG_POOL. SCSI selects CONFIG_SG_POOL. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>