aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/nvme/target (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-08-28nvmet: free workqueue object if module init failsChaitanya Kulkarni1-1/+3
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-28nvme-fcloop: Fix dropped LS's to removed target portJames Smart1-1/+2
When a targetport is removed from the config, fcloop will avoid calling the LS done() routine thinking the targetport is gone. This leaves the initiator reset/reconnect hanging as it waits for a status on the Create_Association LS for the reconnect. Change the filter in the LS callback path. If tport null (set when failed validation before "sending to remote port"), be sure to call done. This was the main bug. But, continue the logic that only calls done if tport was set but there is no remoteport (e.g. case where remoteport has been removed, thus host doesn't expect a completion). Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-16Merge branch 'linus/master' into rdma.git for-nextJason Gunthorpe9-79/+846
rdma.git merge resolution for the 4.19 merge window Conflicts: drivers/infiniband/core/rdma_core.c - Use the rdma code and revise with the new spelling for atomic_fetch_add_unless drivers/nvme/host/rdma.c - Replace max_sge with max_send_sge in new blk code drivers/nvme/target/rdma.c - Use the blk code and revise to use NULL for ib_post_recv when appropriate - Replace max_sge with max_recv_sge in new blk code net/rds/ib_send.c - Use the net code and revise to use NULL for ib_post_recv when appropriate Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16Merge tag 'v4.18' into rdma.git for-nextJason Gunthorpe4-13/+52
Resolve merge conflicts from the -rc cycle against the rdma.git tree: Conflicts: drivers/infiniband/core/uverbs_cmd.c - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next - Merge removal of file->ucontext in for-next with new code in -rc drivers/infiniband/core/uverbs_main.c - for-next removed code from ib_uverbs_write() that was modified in for-rc Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-14Merge tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-blockLinus Torvalds9-79/+845
Pull block updates from Jens Axboe: "First pull request for this merge window, there will also be a followup request with some stragglers. This pull request contains: - Fix for a thundering heard issue in the wbt block code (Anchal Agarwal) - A few NVMe pull requests: * Improved tracepoints (Keith) * Larger inline data support for RDMA (Steve Wise) * RDMA setup/teardown fixes (Sagi) * Effects log suppor for NVMe target (Chaitanya Kulkarni) * Buffered IO suppor for NVMe target (Chaitanya Kulkarni) * TP4004 (ANA) support (Christoph) * Various NVMe fixes - Block io-latency controller support. Much needed support for properly containing block devices. (Josef) - Series improving how we handle sense information on the stack (Kees) - Lightnvm fixes and updates/improvements (Mathias/Javier et al) - Zoned device support for null_blk (Matias) - AIX partition fixes (Mauricio Faria de Oliveira) - DIF checksum code made generic (Max Gurtovoy) - Add support for discard in iostats (Michael Callahan / Tejun) - Set of updates for BFQ (Paolo) - Removal of async write support for bsg (Christoph) - Bio page dirtying and clone fixups (Christoph) - Set of bcache fix/changes (via Coly) - Series improving blk-mq queue setup/teardown speed (Ming) - Series improving merging performance on blk-mq (Ming) - Lots of other fixes and cleanups from a slew of folks" * tag 'for-4.19/block-20180812' of git://git.kernel.dk/linux-block: (190 commits) blkcg: Make blkg_root_lookup() work for queues in bypass mode bcache: fix error setting writeback_rate through sysfs interface null_blk: add lock drop/acquire annotation Blk-throttle: reduce tail io latency when iops limit is enforced block: paride: pd: mark expected switch fall-throughs block: Ensure that a request queue is dissociated from the cgroup controller block: Introduce blk_exit_queue() blkcg: Introduce blkg_root_lookup() block: Remove two superfluous #include directives blk-mq: count the hctx as active before allocating tag block: bvec_nr_vecs() returns value for wrong slab bcache: trivial - remove tailing backslash in macro BTREE_FLAG bcache: make the pr_err statement used for ENOENT only in sysfs_attatch section bcache: set max writeback rate when I/O request is idle bcache: add code comments for bset.c bcache: fix mistaken comments in request.c bcache: fix mistaken code comments in bcache.h bcache: add a comment in super.c bcache: avoid unncessary cache prefetch bch_btree_node_get() bcache: display rate debug parameters to 0 when writeback is not running ...
2018-08-08nvmet: add ns write protect supportChaitanya Kulkarni5-5/+114
This patch implements the Namespace Write Protect feature described in "NVMe TP 4005a Namespace Write Protect". In this version, we implement No Write Protect and Write Protect states for target ns which can be toggled by set-features commands from the host side. For write-protect state transition, we need to flush the ns specified as a part of command so we also add helpers for carrying out synchronous flush operations. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: fixed an incorrect endianess conversion, minor cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-27nvmet: use Retain Async Event bit to clear AENChaitanya Kulkarni1-2/+15
In the current implementation, we clear the AEN bit when we get the "get log page" command if given log page is associated with AEN. This patch allows optionally retaining the AEN for the ctrl under consideration when Retain Asynchronous Event (RAE) bit is set as a part of "get log page" command. This allows the host to read the Log page and optionally retaining the AEN associated with this log page when using userspace tools like nvme-cli. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: also use the new helper in the just merged ANA code] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-27nvmet: support configuring ANA groupsChristoph Hellwig4-4/+236
Allow creating non-default ANA groups (group ID > 1). Groups are created either by assigning the group ID to a namespace, or by creating a configfs group object under a specific port. All namespaces assigned to a group that doesn't have a configfs object for a given port are marked as inaccessible. Allow changing the ANA state on a per-port basis by creating an ana_groups directory under each port, and another directory with an ana_state file in it. The default ANA group 1 directory is created automatically for each port. For all changes in ANA configuration the ANA change AEN is sent. We only keep a global changecount instead of additional per-group changecounts to keep the implementation as simple as possible. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-27nvmet: add minimal ANA supportChristoph Hellwig4-4/+142
Add support for Asynchronous Namespace Access as specified in NVMe 1.3 TP 4004. Just add a default ANA group 1 that is optimized on all ports. This is (and will remain) the default assignment for any namespace not epxlicitly assigned to another ANA group. The ANA state can be manually changed through the configfs interface, including the change state. Includes fixes and improvements from Hannes Reinecke. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-27nvmet: track and limit the number of namespaces per subsystemChristoph Hellwig3-1/+16
TP 4004 introduces a new 'Maximum Number of Allocated Namespaces' field in the Identify controller data to help the host size resources. Put an upper limit on the supported namespaces to be able to support this value as supporting 32-bits worth of namespaces would lead to very large buffers. The limit is completely arbitrary at this point. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-27nvmet: keep a port pointer in nvmet_ctrlChristoph Hellwig2-0/+4
This will be needed for the ANA AEN code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-25nvmet: only check for filebacking on -ENOTBLKHannes Reinecke1-1/+1
We only need to check for a file-backed namespace if nvmet_bdev_ns_enable() returns -ENOTBLK. For any other error it's pointless as the open() error will remain the same. Fixes: d5eff33e ("nvmet: add simple file backed ns support") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-25nvmet: fixup crash on NULL device pathHannes Reinecke1-2/+7
When writing an empty string into the device_path attribute the kernel will crash with nvmet: failed to open block device (null): (-22) BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 This patch sanitizes the error handling for invalid device path settings. Fixes: a07b4970 ("nvmet: add a generic NVMe target") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-24nvmet-rdma: Simplify ib_post_(send|recv|srq_recv)() callsBart Van Assche1-6/+4
Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL as third argument to ib_post_(send|recv|srq_recv)(). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-24nvmet: don't use uuid_le typeAndy Shevchenko1-1/+1
Don't use sizeof(uuid_le) where none of the parameters is type of uuid_le. Since both arguments are u8 [16], use size of destination there. Moreover, uuid_le is a deprecated type, and nvmet is using uuid_t already. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-24nvmet: check fileio lba range access boundariesSagi Grimberg1-2/+17
Fail out-of-bounds with a proper status code. Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-24nvmet: fix file discard return statusSagi Grimberg1-8/+10
If nvmet_copy_from_sgl failed, we falsly return successful completion status. Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-24nvme: if_ready checks to fail io to deleting controllerJames Smart1-1/+1
The revised if_ready checks skipped over the case of returning error when the controller is being deleted. Instead it was returning BUSY, which caused the ios to retry, which caused the ns delete to hang waiting for the ios to drain. Stack trace of hang looks like: kworker/u64:2 D 0 74 2 0x80000000 Workqueue: nvme-delete-wq nvme_delete_ctrl_work [nvme_core] Call Trace: ? __schedule+0x26d/0x820 schedule+0x32/0x80 blk_mq_freeze_queue_wait+0x36/0x80 ? remove_wait_queue+0x60/0x60 blk_cleanup_queue+0x72/0x160 nvme_ns_remove+0x106/0x140 [nvme_core] nvme_remove_namespaces+0x7e/0xa0 [nvme_core] nvme_delete_ctrl_work+0x4d/0x80 [nvme_core] process_one_work+0x160/0x350 worker_thread+0x1c3/0x3d0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_bind+0x10/0x10 ret_from_fork+0x1f/0x30 Extend nvmf_fail_nonready_command() to supply the controller pointer so that the controller state can be looked at. Fail any io to a controller that is deleting. Fixes: 3bc32bb1186c ("nvme-fabrics: refactor queue ready check") Fixes: 35897b920c8a ("nvme-fabrics: fix and refine state checks in __nvmf_check_ready") Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com>
2018-07-24nvmet-fc: fix target sgl list on large transfersJames Smart1-9/+35
The existing code to carve up the sg list expected an sg element-per-page which can be very incorrect with iommu's remapping multiple memory pages to fewer bus addresses. To hit this error required a large io payload (greater than 256k) and a system that maps on a per-page basis. It's possible that large ios could get by fine if the system condensed the sgl list into the first 64 elements. This patch corrects the sg list handling by specifically walking the sg list element by element and attempting to divide the transfer up on a per-sg element boundary. While doing so, it still tries to keep sequences under 256k, but will exceed that rule if a single sg element is larger than 256k. Fixes: 48fa362b6c3f ("nvmet-fc: simplify sg list handling") Cc: <stable@vger.kernel.org> # 4.14 Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvme: cache struct nvme_ctrl reference to struct nvme_requestSagi Grimberg1-0/+1
We will need to reference the controller in the setup and completion time for tracing and future traffic based keep alive support. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvmet-rdma: add an error flow for post_recv failuresMax Gurtovoy1-5/+21
Posting receive buffer operation can fail, thus we should make sure to have an error flow during initialization phase. While we're here, add a debug print in case of a failure. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvmet-rdma: add unlikely check in the fast pathMax Gurtovoy1-1/+1
ib_post_send operation should succeed unless something unusual happened to the ib device. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvmet-rdma: support max(16KB, PAGE_SIZE) inline dataSteve Wise6-43/+169
The patch enables inline data sizes using up to 4 recv sges, and capping the size at 16KB or at least 1 page size. So on a 4K page system, up to 16KB is supported, and for a 64K page system 1 page of 64KB is supported. We avoid > 0 order page allocations for the inline buffers by using multiple recv sges, one for each page. If the device cannot support the configured inline data size due to lack of enough recv sges, then log a warning and reduce the inline size. Add a new configfs port attribute, called param_inline_data_size, to allow configuring the size of inline data for a given nvmf port. The maximum size allowed is still enforced by nvmet-rdma with NVMET_RDMA_MAX_INLINE_DATA_SIZE, which is now max(16KB, PAGE_SIZE). And the default size, if not specified via configfs, is still PAGE_SIZE. This preserves the existing behavior, but allows larger inline sizes for small page systems. If the configured inline data size exceeds NVMET_RDMA_MAX_INLINE_DATA_SIZE, a warning is logged and the size is reduced. If param_inline_data_size is set to 0, then inline data is disabled for that nvmf port. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvmet: add buffered I/O support for file backed nsChaitanya Kulkarni4-5/+67
Add a new "buffered_io" attribute, which disabled direct I/O and thus enables page cache based caching when enabled. The attribute can only be changed when the namespace is disabled as the file has to be reopend for the change to take effect. The possibly blocking read/write are deferred to a newly introduced global workqueue. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-23nvmet: add commands supported and effects log pageChaitanya Kulkarni1-1/+34
This patch adds support for Commands Supported and Effects log page (Log Identifier 05h) for NVMeOF. This also makes it easier to find which commands are supported, e.g. :- subnqn : testnqn1 Admin Command Set ACS2 [Get Log Page ] 00000001 ACS6 [Identify ] 00000001 ACS8 [Abort ] 00000001 ACS9 [Set Features ] 00000001 ACS10 [Get Features ] 00000001 ACS12 [Asynchronous Event Request ] 00000001 ACS24 [Keep Alive ] 00000001 NVM Command Set IOCS0 [Flush ] 00000001 IOCS1 [Write ] 00000001 IOCS2 [Read ] 00000001 IOCS8 [Write Zeroes ] 00000001 IOCS9 [Dataset Management ] 00000001 This partticular functionality can be used from the host side to examine the NVMeOF ctrl commands supported. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-20nvmet: reset keep alive timer in controller enableMax Gurtuvoy1-0/+8
Controllers that are not yet enabled should not really enforce keep alive timeouts, but we still want to track a timeout and cleanup in case a host died before it enabled the controller. Hence, simply reset the keep alive timer when the controller is enabled. Suggested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-18IB/core: add max_send_sge and max_recv_sge attributesSteve Wise1-1/+1
This patch replaces the ib_device_attr.max_sge with max_send_sge and max_recv_sge. It allows ulps to take advantage of devices that have very different send and recv sge depths. For example cxgb4 has a max_recv_sge of 4, yet a max_send_sge of 16. Splitting out these attributes allows much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW API. Consider a large RDMA WRITE that has 16 scattergather entries. With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of 16, it can be done with 1 WRITE WR. Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-15nvme-fabrics: refactor queue ready checkChristoph Hellwig1-4/+3
Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com>
2018-06-11nvmet: free smart-log buffer after useChaitanya Kulkarni1-1/+3
Free smart-log buffer allocated in the function after use. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-06-08nvmet: filter newlines from user inputSagi Grimberg1-5/+9
We should avoid consuming the newlines in traddr, trsvcid and device_path. Add minimal processing to make sure they are gone. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-08nvmet: return all zeroed buffer when we can't find an active namespaceChristoph Hellwig1-6/+9
Quote from Figure 106 in NVMe 1.3a: The Identify Namespace data structure is returned to the host for the namespace specified in the Namespace Identifier (CDW1.NSID) field if it is an active NSID. If the specified namespace is not an active NSID, then the controller returns a zero filled data structure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@rimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01nvmet: mask pending AENsChristoph Hellwig3-1/+10
Per section 5.2 of the NVMe 1.3 spec: "When the controller posts a completion queue entry for an outstanding Asynchronous Event Request command and thus reports an asynchronous event, subsequent events of that event type are automatically masked by the controller until the host clears that event. An event is cleared by reading the log page associated with that event using the Get Log Page command (see section 5.14)." Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-06-01nvmet: add AEN configuration supportChristoph Hellwig3-2/+32
AEN configuration via the 'Get Features' and 'Set Features' admin command is mandatory, so we should be implemeting handling for it. Signed-off-by: Hannes Reinecke <hare@suse.com> [hch: use WRITE_ONCE, check for invalid values] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-06-01nvmet: implement the changed namespaces logChristoph Hellwig3-9/+76
Just keep a per-controller buffer of changed namespaces and copy it out in the get log page implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
2018-06-01nvmet: split log page implementationChristoph Hellwig1-63/+36
Remove the common code to allocate a buffer and copy it into the SGL. Instead the two no-op implementations just zero the SGL directly, and the smart log allocates a buffer on its own. This prepares for the more elaborate ANA log page. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-06-01nvmet: add a new nvmet_zero_sgl helperChristoph Hellwig2-0/+8
Zeroes the SGL in the payload. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-05-31nvmet: fix error return code in nvmet_file_ns_enable()Wei Yongjun1-2/+6
Fix to return error code -ENOMEM from the memory alloc fail error handling case instead of 0, as done elsewhere in this function. Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-31nvmet: fix a typo in nvmet_file_ns_enable()Wei Yongjun1-1/+1
Fix a typo in nvmet_file_ns_enable(). Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-30nvme-loop: add support for multiple portsChristoph Hellwig2-19/+31
This is useful at least for multipath testing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-05-29Merge branch 'nvme-4.18-2' of git://git.infradead.org/nvme into for-4.18/blockJens Axboe10-67/+417
Pull NVMe changes from Christoph: "Here is the current batch of nvme updates for 4.18, we have a few more patches in the queue, but I'd like to get this pile into your tree and linux-next ASAP. The biggest item is support for file-backed namespaces in the NVMe target from Chaitanya, in addition to that we mostly small fixes from all the usual suspects." * 'nvme-4.18-2' of git://git.infradead.org/nvme: nvme: fixup memory leak in nvme_init_identify() nvme: fix KASAN warning when parsing host nqn nvmet-loop: use nr_phys_segments when map rq to sgl nvmet-fc: increase LS buffer count per fc port nvmet: add simple file backed ns support nvmet: remove duplicate NULL initialization for req->ns nvmet: make a few error messages more generic nvme-fabrics: allow duplicate connections to the discovery controller nvme-fabrics: centralize discovery controller defaults nvme-fabrics: remove unnecessary controller subnqn validation nvme-fc: remove setting DNR on exception conditions nvme-rdma: stop admin queue before freeing it nvme-pci: Fix AER reset handling nvme-pci: set nvmeq->cq_vector after alloc cq/sq nvme: host: core: fix precedence of ternary operator nvme: fix lockdep warning in nvme_mpath_clear_current_path
2018-05-29nvme: return BLK_EH_DONE from ->timeoutChristoph Hellwig1-1/+1
NVMe always completes the request before returning from ->timeout, either by polling for it, or by disabling the controller. Return BLK_EH_DONE so that the block layer doesn't even try to complete it again. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-05-25nvmet-loop: use nr_phys_segments when map rq to sglChaitanya Kulkarni1-1/+1
Use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes() to check if a command contains data to me mapped. This fixes the case where a struct requests contains LBAs, but no data will actually be send, e.g. the pending Write Zeroes support. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-25nvmet-fc: increase LS buffer count per fc portJames Smart1-1/+1
Todays limit on concurrent LS's is very small - 4 buffers. With large subsystem counts or large numbers of initiators connecting, the limit may be exceeded. Raise the LS buffer count to 256. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-25nvmet: add simple file backed ns supportChaitanya Kulkarni6-52/+413
This patch adds simple file backed namespace support for NVMeOF target. The new file io-cmd-file.c is responsible for handling the code for I/O commands when ns is file backed. Also, we introduce mempools based slow path using sync I/Os for file backed ns to ensure forward progress under reclaim. The old block device based implementation is moved to io-cmd-bdev.c and use a "nvmet_bdev_" symbol prefix. The enable/disable calls are also move into the respective files. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: updated changelog, fixed double req->ns lookup in bdev case] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-25nvmet: remove duplicate NULL initialization for req->nsChaitanya Kulkarni5-12/+1
Remove the duplicate NULL initialization for req->ns. req->ns is always initialized to NULL in nvmet_req_init(), so there is no need to reset it later on failures unless we have previously assigned a value to it. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-25nvmet: make a few error messages more genericChaitanya Kulkarni1-2/+2
"nvmet_check_ctrl_status()" is called from admin-cmd.c along with io-cmd.c, make the error message more generic. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-05-04Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-1/+1
Pull rdma fixes from Doug Ledford: "This is our first pull request of the rc cycle. It's not that it's been overly quiet, we were just waiting on a few things before sending this off. For instance, the 6 patch series from Intel for the hfi1 driver had actually been pulled in on Tuesday for a Wednesday pull request, only to have Jason notice something I missed, so we held off for some testing, and then on Thursday had to respin the series because the very first patch needed a minor fix (unnecessary cast is all). There is a sizable hns patch series in here, as well as a reasonably largish hfi1 patch series, then all of the lines of uapi updates are just the change to the new official Linux-OpenIB SPDX tag (a bunch of our files had what amounts to a BSD-2-Clause + MIT Warranty statement as their license as a result of the initial code submission years ago, and the SPDX folks decided it was unique enough to warrant a unique tag), then the typical mlx4 and mlx5 updates, and finally some cxgb4 and core/cache/cma updates to round out the bunch. None of it was overly large by itself, but in the 2 1/2 weeks we've been collecting patches, it has added up :-/. As best I can tell, it's been through 0day (I got a notice about my last for-next push, but not for my for-rc push, but Jason seems to think that failure messages are prioritized and success messages not so much). It's also been through linux-next. And yes, we did notice in the context portion of the CMA query gid fix patch that there is a dubious BUG_ON() in the code, and have plans to audit our BUG_ON usage and remove it anywhere we can. Summary: - Various build fixes (USER_ACCESS=m and ADDR_TRANS turned off) - SPDX license tag cleanups (new tag Linux-OpenIB) - RoCE GID fixes related to default GIDs - Various fixes to: cxgb4, uverbs, cma, iwpm, rxe, hns (big batch), mlx4, mlx5, and hfi1 (medium batch)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (52 commits) RDMA/cma: Do not query GID during QP state transition to RTR IB/mlx4: Fix integer overflow when calculating optimal MTT size IB/hfi1: Fix memory leak in exception path in get_irq_affinity() IB/{hfi1, rdmavt}: Fix memory leak in hfi1_alloc_devdata() upon failure IB/hfi1: Fix NULL pointer dereference when invalid num_vls is used IB/hfi1: Fix loss of BECN with AHG IB/hfi1 Use correct type for num_user_context IB/hfi1: Fix handling of FECN marked multicast packet IB/core: Make ib_mad_client_id atomic iw_cxgb4: Atomically flush per QP HW CQEs IB/uverbs: Fix kernel crash during MR deregistration flow IB/uverbs: Prevent reregistration of DM_MR to regular MR RDMA/mlx4: Add missed RSS hash inner header flag RDMA/hns: Fix a couple misspellings RDMA/hns: Submit bad wr RDMA/hns: Update assignment method for owner field of send wqe RDMA/hns: Adjust the order of cleanup hem table RDMA/hns: Only assign dqpn if IB_QP_PATH_DEST_QPN bit is set RDMA/hns: Remove some unnecessary attr_mask judgement RDMA/hns: Only assign mtu if IB_QP_PATH_MTU bit is set ...
2018-05-03nvmet: switch loopback target state to connecting when resettingJohannes Thumshirn1-0/+6
After commit bb06ec31452f ("nvme: expand nvmf_check_if_ready checks") resetting of the loopback nvme target failed as we forgot to switch it's state to NVME_CTRL_CONNECTING before we reconnect the admin queues. Therefore the checks in nvmf_check_if_ready() choose to go to the reject_io case and thus we couldn't sent out an identify controller command to reconnect. Change the controller state to NVME_CTRL_CONNECTING after tearing down the old connection and before re-establishing the connection. Fixes: bb06ec31452f ("nvme: expand nvmf_check_if_ready checks") Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-04-27nvmet-rdma: depend on INFINIBAND_ADDR_TRANSGreg Thelen1-1/+1
NVME_TARGET_RDMA code depends on INFINIBAND_ADDR_TRANS provided symbols. So declare the kconfig dependency. This is necessary to allow for enabling INFINIBAND without INFINIBAND_ADDR_TRANS. Signed-off-by: Greg Thelen <gthelen@google.com> Cc: Tarick Bedeir <tarick@google.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-12nvme: expand nvmf_check_if_ready checksJames Smart1-9/+2
The nvmf_check_if_ready() checks that were added are very simplistic. As such, the routine allows a lot of cases to fail ios during windows of reset or re-connection. In cases where there are not multi-path options present, the error goes back to the callee - the filesystem or application. Not good. The common routine was rewritten and calling syntax slightly expanded so that per-transport is_ready routines don't need to be present. The transports now call the routine directly. The routine is now a fabrics routine rather than an inline function. The routine now looks at controller state to decide the action to take. Some states mandate io failure. Others define the condition where a command can be accepted. When the decision is unclear, a generic queue-or-reject check is made to look for failfast or multipath ios and only fails the io if it is so marked. Otherwise, the io will be queued and wait for the controller state to resolve. Admin commands issued via ioctl share a live admin queue with commands from the transport for controller init. The ioctls could be intermixed with the initialization commands. It's possible for the ioctl cmd to be issued prior to the controller being enabled. To block this, the ioctl admin commands need to be distinguished from admin commands used for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to reflect this division and set it on ioctls requests. As the nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(), ensure that commands allocated by the ioctl path (actually anything in core.c) preps the nvme_req(req) before starting the io. This will preserve the USERCMD flag during execution and/or retry. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.e> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>