aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/scsi/scsi_lib.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-11-07Revert "scsi: make 'state' device attribute pollable"Linus Torvalds1-3/+0
This reverts commit 8a97712e5314aefe16b3ffb4583a34deaa49de04. This commit added a call to sysfs_notify() from within scsi_device_set_state(), which in turn turns out to make libata very unhappy, because ata_eh_detach_dev() does spin_lock_irqsave(ap->lock, flags); .. if (ata_scsi_offline_dev(dev)) { dev->flags |= ATA_DFLAG_DETACHED; ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG; } and ata_scsi_offline_dev() then does that scsi_device_set_state() to set it offline. So now we called sysfs_notify() from within a spinlocked region, which really doesn't work. The 0day robot reported this as: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238 because sysfs_notify() ends up calling kernfs_find_and_get_ns() which then does mutex_lock(&kernfs_mutex).. The pollability of the device state isn't critical, so revert this all for now, and maybe we'll do it differently in the future. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-10-23scsi: Suppress a kernel warning in case the prep function returns BLKPREP_DEFERBart Van Assche1-7/+1
The legacy block layer handles requests as follows: - If the prep function returns BLKPREP_OK, let blk_peek_request() return the pointer to that request. - If the prep function returns BLKPREP_DEFER, keep the RQF_STARTED flag and retry calling the prep function later. - If the prep function returns BLKPREP_KILL or BLKPREP_INVALID, end the request. In none of these cases it is correct to clear the SCMD_INITIALIZED flag from inside scsi_prep_fn(). Since scsi_prep_fn() already guarantees that scsi_init_command() will be called once even if scsi_prep_fn() is called multiple times, remove the code that clears SCMD_INITIALIZED from scsi_prep_fn(). The scsi-mq code handles requests as follows: - If scsi_mq_prep_fn() returns BLKPREP_OK, set the RQF_DONTPREP flag and submit the request to the SCSI LLD. - If scsi_mq_prep_fn() returns BLKPREP_DEFER, call blk_mq_delay_run_hw_queue() and return BLK_STS_RESOURCE. - If the prep function returns BLKPREP_KILL or BLKPREP_INVALID, call scsi_mq_uninit_cmd() and let the blk-mq core end the request. In none of these cases scsi_mq_prep_fn() should clear the SCMD_INITIALIZED flag. Hence remove the code from scsi_mq_prep_fn() function that clears that flag. This patch avoids that the following warning is triggered when using the legacy block layer: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 4198 at drivers/scsi/scsi_lib.c:654 scsi_end_request+0x1de/0x220 CPU: 1 PID: 4198 Comm: mkfs.f2fs Not tainted 4.14.0-rc5+ #1 task: ffff91c147a4b800 task.stack: ffffb282c37b8000 RIP: 0010:scsi_end_request+0x1de/0x220 Call Trace: <IRQ> scsi_io_completion+0x204/0x5e0 scsi_finish_command+0xce/0xe0 scsi_softirq_done+0x126/0x130 blk_done_softirq+0x6e/0x80 __do_softirq+0xcf/0x2a8 irq_exit+0xab/0xb0 do_IRQ+0x7b/0xc0 common_interrupt+0x90/0x90 </IRQ> RIP: 0010:_raw_spin_unlock_irqrestore+0x9/0x10 __test_set_page_writeback+0xc7/0x2c0 __block_write_full_page+0x158/0x3b0 block_write_full_page+0xc4/0xd0 blkdev_writepage+0x13/0x20 __writepage+0x12/0x40 write_cache_pages+0x204/0x500 generic_writepages+0x48/0x70 blkdev_writepages+0x9/0x10 do_writepages+0x34/0xc0 __filemap_fdatawrite_range+0x6c/0x90 file_write_and_wait_range+0x31/0x90 blkdev_fsync+0x16/0x40 vfs_fsync_range+0x44/0xa0 do_fsync+0x38/0x60 SyS_fsync+0xb/0x10 entry_SYSCALL_64_fastpath+0x13/0x94 ---[ end trace 86e8ef85a4a6c1d1 ]--- Fixes: commit 64104f703212 ("scsi: Call scsi_initialize_rq() for filesystem requests") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-31scsi: scsi-mq: Always unprepare before requeuing a requestBart Van Assche1-2/+8
One of the two scsi-mq functions that requeue a request unprepares a request before requeueing (scsi_io_completion()) but the other function not (__scsi_queue_insert()). Make sure that a request is unprepared before requeuing it. Fixes: commit d285203cf647 ("scsi: add support for a blk-mq based I/O path.") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-31scsi: Improve requeuing behaviorBart Van Assche1-2/+13
Requests are unprepared and reprepared when being requeued. Avoid that requeuing resets .jiffies_at_alloc and .retries by initializing these two member variables from inside scsi_initialize_rq() and by preserving both member variables when preparing a request. This patch affects the requeuing behavior of both the legacy scsi and the scsi-mq code paths. Reported-by: Brian King <brking@linux.vnet.ibm.com> References: https://lkml.org/lkml/2017/8/18/923 ("Re: [BUG][bisected 270065e] linux-next fails to boot on powerpc") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-31scsi: Call scsi_initialize_rq() for filesystem requestsBart Van Assche1-4/+22
If a pass-through request is submitted then blk_get_request() initializes that request by calling scsi_initialize_rq(). Also call this function for filesystem requests. Introduce CMD_INITIALIZED to keep track of whether or not a request has already been initialized. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-29scsi: Rework handling of scsi_device.vpd_pg8[03]Bart Van Assche1-8/+8
Introduce struct scsi_vpd for the VPD page length, data and the RCU head that will be used to free the VPD data. Use kfree_rcu() instead of kfree() to free VPD data. Move the VPD buffer pointer check inside the RCU read lock in the sysfs code. Only annotate pointers that are shared across threads with __rcu. Use rcu_dereference() when dereferencing an RCU pointer. This patch suppresses about twenty sparse complaints about the vpd_pg8[03] pointers. This patch also fixes a race condition, namely that updating of the VPD pointers and length variables in struct scsi_device was not atomic with reference to the code reading these variables. See also "Does the update code tolerate concurrent accesses?" in Documentation/RCU/checklist.txt. Fixes: commit 09e2b0b14690 ("scsi: rescan VPD attributes") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Shane Seymour <shane.seymour@hpe.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Shane Seymour <shane.seymour@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: Fix the kerneldoc for scsi_initialize_rq()Jonathan Corbet1-0/+1
The kerneldoc comment for scsi_initialize_rq() neglected to document the "rq" parameter, leading to this docs build warning: ./drivers/scsi/scsi_lib.c:1116: warning: No description found for parameter 'rq' Document the parameter and make the build slightly quieter. [mkp: used wording suggested by Bart] Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: fix comment in scsi_device_set_state()Hannes Reinecke1-1/+1
The function returns '0' if successful; with the original comment the function doesn't have a way to indicate success ... Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart van Assche <bvanassche@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: Use blk_mq_rq_to_pdu() to convert a request to a SCSI command pointerBart Van Assche1-9/+9
Since commit e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct request") struct request and struct scsi_cmnd are adjacent. This means that there is now an alternative to reading req->special to convert a pointer to a prepared request into a SCSI command pointer, namely by using blk_mq_rq_to_pdu(). Make this change where appropriate. Although this patch does not change any functionality, it slightly improves performance and slightly improves readability. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-25scsi: Document which queue type a function is intended forBart Van Assche1-11/+12
Rename several functions to make it easy to see which queue type a function is intended for. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: make 'state' device attribute pollableHannes Reinecke1-0/+3
While the 'state' attribute can (and will) change occasionally, calling 'poll()' or 'select()' on it fails as sysfs is never notified that the state has changed. With this patch calling 'poll()' or 'select()' will work properly. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24scsi: scsi_lib: rework scsi_internal_device_unblock_nowait()Hannes Reinecke1-6/+11
Rework scsi_internal_device_unblock_nowait() into using a switch statement. No functional changes. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-07-06Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-115/+191
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, bnx2fc, qedf, hpsa, hisi_sas, smartpqi, cxlflash, aacraid, csiostor along with a host of minor and miscellaneous changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (276 commits) qla2xxx: Fix NVMe entry_type for iocb packet on BE system scsi: qla2xxx: avoid unused-function warning scsi: snic: fix a couple of spelling mistakes/typos scsi: qla2xxx: fix a bunch of typos and spelling mistakes scsi: lpfc: don't double count abort errors scsi: lpfc: spin_lock_irq() is not nestable scsi: hisi_sas: optimise DMA slot memory scsi: ibmvfc: constify dev_pm_ops structures. scsi: ibmvscsi: constify dev_pm_ops structures. scsi: cxlflash: Update debug prints in reset handlers scsi: cxlflash: Update send_tmf() parameters scsi: cxlflash: Avoid double free of character device scsi: Add STARGET_CREATED_REMOVE state to scsi_target_state scsi: ses: do not add a device to an enclosure if enclosure_add_links() fails. scsi: ufs: flush eh_work when eh_work scheduled. scsi: qla2xxx: Protect access to qpair members with qpair->qp_lock scsi: sun_esp: fix device reference leaks scsi: fnic: changing queue command to return result DID_IMM_RETRY when rport is init scsi: fnic: correct speed display and add support for 25,40 and 100G scsi: fnic: added timestamp reporting in fnic debug stats ...
2017-06-20block: Change argument type of scsi_req_init()Bart Van Assche1-1/+3
Since scsi_req_init() works on a struct scsi_request, change the argument type into struct scsi_request *. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-20block: Make most scsi_req_init() calls implicitBart Van Assche1-1/+14
Instead of explicitly calling scsi_req_init() after blk_get_request(), call that function from inside blk_get_request(). Add an .initialize_rq_fn() callback function to the block drivers that need it. Merge the IDE .init_rq_fn() function into .initialize_rq_fn() because it is too small to keep it as a separate function. Keep the scsi_req_init() call in ide_prep_sense() because it follows a blk_rq_init() call. References: commit 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-18blk-mq: use the introduced blk_mq_unquiesce_queue()Ming Lei1-2/+2
blk_mq_unquiesce_queue() is used for unquiescing the queue explicitly, so replace blk_mq_start_stopped_hw_queues() with it. For the scsi part, this patch takes Bart's suggestion to switch to block quiesce/unquiesce API completely. Cc: linux-nvme@lists.infradead.org Cc: linux-scsi@vger.kernel.org Cc: dm-devel@redhat.com Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-06-12scsi: Make scsi_mq_prep_fn() call scsi_init_command()Bart Van Assche1-16/+9
This patch reduces code duplication. There are two functional changes in this patch: - It causes scsi_mq_prep_fn() to clear driver-private command data, just like the already upstream commit 1bad6c4a57ef ("scsi: zero per-cmd private driver data for each MQ I/O"). - The initialization of .prot_sdb is moved from scsi_mq_prep_fn() into scsi_init_request(). [mkp: applied by hand] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Introduce scsi_mq_sgl_size()Bart Van Assche1-9/+10
This patch does not change any functionality but makes the next patch easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Only add commands to the device command list if required by the LLDBart Van Assche1-20/+32
Just like for the scsi-mq code path, in the single queue SCSI code path only add commands to the per-device command list if required by the SCSI LLD. This patch will make it easier to merge the single-queue and multiqueue command initialization code. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Make __scsi_remove_device go straight from BLOCKED to DELBart Van Assche1-1/+1
If a device is blocked, make __scsi_remove_device() cause it to transition to the DEL state. This means that all the commands issued in .shutdown() will error in the mid-layer, thus making the removal proceed without being stopped. This patch is a slightly modified version of a patch from James Bottomley. This patch avoids that the following lockup occurs: Call Trace: schedule+0x35/0x80 schedule_timeout+0x237/0x2d0 io_schedule_timeout+0xa6/0x110 wait_for_completion_io+0xa3/0x110 blk_execute_rq+0xdf/0x120 scsi_execute+0xce/0x150 [scsi_mod] scsi_execute_req_flags+0x8f/0xf0 [scsi_mod] sd_sync_cache+0xa9/0x190 [sd_mod] sd_shutdown+0x6a/0x100 [sd_mod] sd_remove+0x64/0xc0 [sd_mod] __device_release_driver+0x8d/0x120 device_release_driver+0x1e/0x30 bus_remove_device+0xf9/0x170 device_del+0x127/0x240 __scsi_remove_device+0xc1/0xd0 [scsi_mod] scsi_forget_host+0x57/0x60 [scsi_mod] scsi_remove_host+0x72/0x110 [scsi_mod] srp_remove_work+0x8b/0x200 [ib_srp] Reported-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Introduce scsi_start_queue()Bart Van Assche1-10/+15
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Protect SCSI device state changes with a mutexBart Van Assche1-6/+21
Serializing SCSI device state changes avoids that two state changes can occur concurrently, e.g. the state changes in scsi_target_block() and __scsi_remove_device(). This serialization is essential to make patch "Make __scsi_remove_device go straight from BLOCKED to DEL" work reliably. Enable this mechanism for all scsi_target_*block() callers but not for the scsi_internal_device_unblock() calls from the mpt3sas driver because that driver can call scsi_internal_device_unblock() from atomic context. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Create two versions of scsi_internal_device_unblock()Bart Van Assche1-14/+32
This will make it easier to serialize SCSI device state changes through a mutex. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Split scsi_internal_device_block()Bart Van Assche1-26/+47
Instead of passing a "wait" argument to scsi_internal_device_block(), split this function into a function that waits and a function that doesn't wait. This will make it easier to serialize SCSI device state changes through a mutex. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12scsi: Avoid that scsi_exit_rq() triggers a use-after-freeBart Van Assche1-18/+29
Dereferencing shost from scsi_exit_rq() is not safe because the SCSI host may already have been freed when scsi_exit_rq() is called. Increasing the shost reference count in scsi_init_rq() and dropping that reference in scsi_exit_rq() is nontrivial since scsi_host_dev_release() may sleep and since scsi_exit_rq() may be called from interrupt context. Since scsi_exit_rq() only needs a single bit from shost, copy that bit into struct scsi_cmnd. Reported-by: Scott Bauer <scott.bauer@intel.com> Fixes: e9c787e65c0c ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Scott Bauer <scott.bauer@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-09blk-mq: switch ->queue_rq return value to blk_status_tChristoph Hellwig1-15/+15
Use the same values for use for request completion errors as the return value from ->queue_rq. BLK_STS_RESOURCE is special cased to cause a requeue, and all the others are completed as-is. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-09block: introduce new block status code typeChristoph Hellwig1-34/+17
Currently we use nornal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have a errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return somethings that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruite to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-06-01block: Introduce queue flag QUEUE_FLAG_SCSI_PASSTHROUGHBart Van Assche1-0/+2
From the context where a SCSI command is submitted it is not always possible to figure out whether or not the queue the command is submitted to has struct scsi_request as the first member of its private data. Hence introduce the flag QUEUE_FLAG_SCSI_PASSTHROUGH. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Don Brace <don.brace@microsemi.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-18scsi: zero per-cmd private driver data for each MQ I/OLong Li1-1/+1
In lower layer driver's (LLD) scsi_host_template, the driver may optionally ask SCSI to allocate its private driver memory for each command, by specifying cmd_size. This memory is allocated at the end of scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use scsi_cmd_priv to get to its private data. Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In this case, the LLD may get to stale or uninitialized data in its private driver memory. This may result in unexpected driver and hardware behavior. Fix this problem by also zeroing the private driver memory before passing them to LLD. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> CC: <stable@vger.kernel.org> # 4.11+ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-08scsi: scsi_lib: Add #include <scsi/scsi_transport.h>Bart Van Assche1-0/+1
This patch avoids that when building with W=1 the compiler complains that __scsi_init_queue() has not been declared. See also commit d48777a633d6 ("scsi: remove __scsi_alloc_queue"). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-06Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-7/+6
Pull block fixes and updates from Jens Axboe: "Some fixes and followup features/changes that should go in, in this merge window. This contains: - Two fixes for lightnvm from Javier, fixing problems in the new code merge previously in this merge window. - A fix from Jan for the backing device changes, fixing an issue in NFS that causes a failure to mount on certain setups. - A change from Christoph, cleaning up the blk-mq init and exit request paths. - Remove elevator_change(), which is now unused. From Bart. - A fix for queue operation invocation on a dead queue, from Bart. - A series fixing up mtip32xx for blk-mq scheduling, removing a bandaid we previously had in place for this. From me. - A regression fix for this series, fixing a case where we wait on workqueue flushing from an invalid (non-blocking) context. From me. - A fix/optimization from Ming, ensuring that we don't both quiesce and freeze a queue at the same time. - A fix from Peter on lock ordering for CPU hotplug. Not a real problem right now, but will be once the CPU hotplug rework goes in. - A series from Omar, cleaning up out blk-mq debugfs support, and adding support for exporting info from schedulers in debugfs as well. This is really useful in debugging stalls or livelocks. From Omar" * 'for-linus' of git://git.kernel.dk/linux-block: (28 commits) mq-deadline: add debugfs attributes kyber: add debugfs attributes blk-mq-debugfs: allow schedulers to register debugfs attributes blk-mq: untangle debugfs and sysfs blk-mq: move debugfs declarations to a separate header file blk-mq: Do not invoke queue operations on a dead queue blk-mq-debugfs: get rid of a bunch of boilerplate blk-mq-debugfs: rename hw queue directories from <n> to hctx<n> blk-mq-debugfs: don't open code strstrip() blk-mq-debugfs: error on long write to queue "state" file blk-mq-debugfs: clean up flag definitions blk-mq-debugfs: separate flags with | nfs: Fix bdi handling for cloned superblocks block/mq: Cure cpu hotplug lock inversion lightnvm: fix bad back free on error path lightnvm: create cmd before allocating request blk-mq: don't use sync workqueue flushing from drivers mtip32xx: convert internal commands to regular block infrastructure mtip32xx: cleanup internal tag assumptions block: don't call blk_mq_quiesce_queue() after queue is frozen ...
2017-05-04Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull SCSI updates from James Bottomley: "This update includes the usual round of major driver updates (hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the usual amount of cosmetic and spelling stuff" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits) scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template" scsi: stex: make S6flag static scsi: mac_esp: fix to pass correct device identity to free_irq() scsi: aacraid: pci_alloc_consistent() failures on ARM64 scsi: ufs: make ufshcd_get_lists_status() register operation obvious scsi: ufs: use MASK_EE_STATUS scsi: mac_esp: Replace bogus memory barrier with spinlock scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static scsi: sd_zbc: Do not write lock zones for reset scsi: sd_zbc: Remove superfluous assignments scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd scsi: Improve scsi_get_sense_info_fld scsi: sd: Cleanup sd_done sense data handling scsi: sd: Improve sd_completed_bytes scsi: sd: Fix function descriptions scsi: mpt3sas: remove redundant wmb scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host() scsi: sg: reset 'res_in_use' after unlinking reserved array scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency" ...
2017-05-02blk-mq: update ->init_request and ->exit_request prototypesChristoph Hellwig1-7/+6
Remove the request_idx parameter, which can't be used safely now that we support I/O schedulers with blk-mq. Except for a superflous check in mtip32xx it was unused anyway. Also pass the tag_set instead of just the driver data - this allows drivers to avoid some code duplication in a follow on cleanup. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-01Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-blockLinus Torvalds1-11/+14
Pull block layer updates from Jens Axboe: - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ was initially a fork of CFQ, but subsequently changed to implement fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant to be used on desktop type single drives, providing good fairness. From Paolo. - Add Kyber IO scheduler. This is a full multiqueue aware scheduler, using a scalable token based algorithm that throttles IO based on live completion IO stats, similary to blk-wbt. From Omar. - A series from Jan, moving users to separately allocated backing devices. This continues the work of separating backing device life times, solving various problems with hot removal. - A series of updates for lightnvm, mostly from Javier. Includes a 'pblk' target that exposes an open channel SSD as a physical block device. - A series of fixes and improvements for nbd from Josef. - A series from Omar, removing queue sharing between devices on mostly legacy drivers. This helps us clean up other bits, if we know that a queue only has a single device backing. This has been overdue for more than a decade. - Fixes for the blk-stats, and improvements to unify the stats and user windows. This both improves blk-wbt, and enables other users to register a need to receive IO stats for a device. From Omar. - blk-throttle improvements from Shaohua. This provides a scalable framework for implementing scalable priotization - particularly for blk-mq, but applicable to any type of block device. The interface is marked experimental for now. - Bucketized IO stats for IO polling from Stephen Bates. This improves efficiency of polled workloads in the presence of mixed block size IO. - A few fixes for opal, from Scott. - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics. From a variety of folks, mostly Sagi and James Smart. - A series from Bart, improving our exposed info and capabilities from the blk-mq debugfs support. - A series from Christoph, cleaning up how handle WRITE_ZEROES. - A series from Christoph, cleaning up the block layer handling of how we track errors in a request. On top of being a nice cleanup, it also shrinks the size of struct request a bit. - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was never used by platforms, and the latter has outlived it's usefulness. - Various little bug fixes and cleanups from a wide variety of folks. * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits) block: hide badblocks attribute by default blk-mq: unify hctx delay_work and run_work block: add kblock_mod_delayed_work_on() blk-mq: unify hctx delayed_run_work and run_work nbd: fix use after free on module unload MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler blk-mq-sched: alloate reserved tags out of normal pool mtip32xx: use runtime tag to initialize command header scsi: Implement blk_mq_ops.show_rq() blk-mq: Add blk_mq_ops.show_rq() blk-mq: Show operation, cmd_flags and rq_flags names blk-mq: Make blk_flags_show() callers append a newline character blk-mq: Move the "state" debugfs attribute one level down blk-mq: Unregister debugfs attributes earlier blk-mq: Only unregister hctxs for which registration succeeded blk-mq-debugfs: Rename functions for registering and unregistering the mq directory blk-mq: Let blk_mq_debugfs_register() look up the queue name blk-mq: Register <dev>/queue/mq after having registered <dev>/queue ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset ide-pm: always pass 0 error to __blk_end_request_all ..
2017-04-26scsi: Implement blk_mq_ops.show_rq()Bart Van Assche1-0/+4
Show the SCSI CDB for pending SCSI commands in /sys/kernel/debug/block/*/mq/*/dispatch and */rq_list. An example of how SCSI commands are displayed by this code: ffff8801703245c0 {.op=READ, .cmd_flags=META PRIO, .rq_flags=DONTPREP IO_STAT STATS, .tag=14, .internal_tag=-1, .cmd=Read(10) 28 00 2a 81 1b 30 00 00 08 00} Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Hannes Reinecke <hare@suse.com> Cc: <linux-scsi@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-24Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull SCSI fix from James Bottomley: "Our final fix before the 4.12 release (hopefully). It's an error leg again: the fix to not bug on empty DMA transfers is returning the wrong code and confusing the block layer" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: return correct blkprep status code in case scsi_init_io() fails.
2017-04-20blk-mq: remove the error argument to blk_mq_complete_requestChristoph Hellwig1-1/+1
Now that all drivers that call blk_mq_complete_requests have a ->complete callback we can remove the direct call to blk_mq_end_request, as well as the error argument to blk_mq_complete_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-20scsi: introduce a result field in struct scsi_requestChristoph Hellwig1-8/+7
This passes on the scsi_cmnd result field to users of passthrough requests. Currently we abuse req->errors for this purpose, but that field will go away in its current form. Note that the old IDE code abuses the errors field in very creative ways and stores all kinds of different values in it. I didn't dare to touch this magic, so the abuses are brought forward 1:1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-15Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixesJames Bottomley1-2/+2
2017-04-13scsi: return correct blkprep status code in case scsi_init_io() fails.Johannes Thumshirn1-2/+2
When instrumenting the SCSI layer to run into the !blk_rq_nr_phys_segments(rq) case the following warning emitted from the block layer: blk_peek_request: bad return=-22 This happens because since commit fd3fc0b4d730 ("scsi: don't BUG_ON() empty DMA transfers") we return the wrong error value from scsi_prep_fn() back to the block layer. [mkp: silenced checkpatch] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: fd3fc0b4d730 scsi: don't BUG_ON() empty DMA transfers Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-07Merge branch 'for-linus' into for-4.12/blockJens Axboe1-3/+3
We've added a considerable amount of fixes for stalls and issues with the blk-mq scheduling in the 4.11 series since forking off the for-4.12/block branch. We need to do improvements on top of that for 4.12, so pull in the previous fixes to make our lives easier going forward. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07scsi: Avoid that SCSI queues get stuckBart Van Assche1-3/+3
If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block driver that implements that function is responsible for rerunning the hardware queue once requests can be queued again successfully. commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") removed the blk_mq_stop_hw_queue() call from scsi_queue_rq() for the BLK_MQ_RQ_QUEUE_BUSY case. Hence change all calls to functions that are intended to rerun a busy queue such that these examine all hardware queues instead of only stopped queues. Since no other functions than scsi_internal_device_block() and scsi_internal_device_unblock() should ever stop or restart a SCSI queue, change the blk_mq_delay_queue() call into a blk_mq_delay_run_hw_queue() call. Fixes: commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") Fixes: commit 7e79dadce222 ("blk-mq: stop hardware queue in blk_mq_delay_queue()") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-06scsi: make asynchronous aborts mandatoryHannes Reinecke1-1/+1
There hasn't been any reports for HBAs where asynchronous abort would not work, so we should make it mandatory and remove the fallback. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-06scsi: make scsi_eh_scmd_add() always succeedHannes Reinecke1-2/+2
scsi_eh_scmd_add() currently only will fail if no error handler thread is started (which will never be the case) or if the state machine encounters an illegal transition. But if we're encountering an invalid state transition chances is we cannot fixup things with the error handler. So better add a WARN_ON for illegal host states and make scsi_dh_scmd_add() a void function. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-04-05block, scsi: move the retries field to struct scsi_requestChristoph Hellwig1-2/+2
Instead of bloating the generic struct request with it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-31blk-mq: constify struct blk_mq_opsEric Biggers1-1/+1
Constify all instances of blk_mq_ops, as they are never modified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-07Merge remote-tracking branch 'mkp-scsi/fixes' into fixesJames Bottomley1-4/+10
2017-03-03Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-63/+50
Pull more SCSI updates from James Bottomley: "This is the set of stuff that didn't quite make the initial pull and a set of fixes for stuff which did. The new stuff is basically lpfc (nvme), qedi and aacraid. The fixes cover a lot of previously submitted stuff, the most important of which probably covers some of the failing irq vectors allocation and other fallout from having the SCSI command allocated as part of the block allocation functions" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (59 commits) scsi: qedi: Fix memory leak in tmf response processing. scsi: aacraid: remove redundant zero check on ret scsi: lpfc: use proper format string for dma_addr_t scsi: lpfc: use div_u64 for 64-bit division scsi: mac_scsi: Fix MAC_SCSI=m option when SCSI=m scsi: cciss: correct check map error. scsi: qla2xxx: fix spelling mistake: "seperator" -> "separator" scsi: aacraid: Fixed expander hotplug for SMART family scsi: mpt3sas: switch to pci_alloc_irq_vectors scsi: qedf: fixup compilation warning about atomic_t usage scsi: remove scsi_execute_req_flags scsi: merge __scsi_execute into scsi_execute scsi: simplify scsi_execute_req_flags scsi: make the sense header argument to scsi_test_unit_ready mandatory scsi: sd: improve TUR handling in sd_check_events scsi: always zero sshdr in scsi_normalize_sense scsi: scsi_dh_emc: return success in clariion_std_inquiry() scsi: fix memory leak of sdpk on when gd fails to allocate scsi: sd: make sd_devt_release() static scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework. ...
2017-03-01scsi: mpt3sas: Avoid sleeping in interrupt contextBart Van Assche1-4/+10
Commit 669f044170d8 ("scsi: srp_transport: Move queuecommand() wait code to SCSI core") can make scsi_internal_device_block() sleep. However, the mpt3sas driver can call this function from an interrupt handler. Hence add a second argument to scsi_internal_device_block() that restores the old behavior of this function for the mpt3sas handler. The call chain that triggered an "IRQ handler enabled interrupts" complaint is as follows: _base_interrupt() -> _base_async_event() -> mpt3sas_scsih_event_callback() -> _scsih_check_topo_delete_events() -> _scsih_block_io_to_children_attached_directly() -> _scsih_block_io_device() -> _scsih_internal_device_block() -> scsi_internal_device_block() Reported-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Omar Sandoval <osandov@osandov.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Christoph Hellwig <hch@lst.de> Cc: Sathya Prakash <sathya.prakash@broadcom.com> Cc: Chaitra P B <chaitra.basappa@broadcom.com> Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com> Cc: <stable@vger.kernel.org> # v4.10+ Tested-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-23scsi: remove scsi_execute_req_flagsChristoph Hellwig1-11/+0
And switch all callers to use scsi_execute instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>