aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/nvme/target/nvmet.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-09-27nvmet: handle keep-alive timer when kato is modified by a set features cmdAmit Engel1-0/+2
A user may modify the kato by a set features cmd. To properly deal with races or a kato value of 0 (no keep alive enabled) change nvmet_set_feat_kato to first disable the timer, then set the value and then re-enable the timer. Signed-off-by: Amit Engel <amit.engel@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: introduce the passthru configfs interfaceLogan Gunthorpe1-0/+1
When CONFIG_NVME_TARGET_PASSTHRU as 'passthru' directory will be added to each subsystem. The directory is similar to a namespace and has two attributes: device_path and enable. The user must set the path to the nvme controller's char device and write '1' to enable the subsystem to use passthru. Any given subsystem is prevented from enabling both a regular namespace and the passthru device. If one is enabled, enabling the other will produce an error. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: Add passthru enable/disable helpersLogan Gunthorpe1-0/+12
This patch adds helper functions which are used in the NVMeOF configfs when the user is configuring the passthru subsystem. Here we ensure that only one subsys is assigned to each nvme_ctrl by using an xarray on the cntlid. The subsystem's version number is overridden by the passed through controller's version. However, if that version is less than 1.2.1, then we bump the advertised version to that and print a warning in dmesg. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: add passthru code to process commandsLogan Gunthorpe1-0/+39
Add passthru command handling capability for the NVMeOF target and export passthru APIs which are used to integrate passthru code with nvmet-core. The new file passthru.c handles passthru cmd parsing and execution. In the passthru mode, we create a block layer request from the nvmet request and map the data on to the block layer request. Admin commands and features are on an allow list as there are a number of each that don't make too much sense with passthrough. We use an allow list such that new commands can be considered before being blindly passed through. In both cases, vendor specific commands are always allowed. We also reject reservation IO commands as the underlying device cannot differentiate between multiple hosts behind a fabric. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-29nvmet: use xarray for ctrl ns storingChaitanya Kulkarni1-2/+1
This patch replaces the ctrl->namespaces tracking from linked list to xarray and improves the performance when accessing one namespce :- XArray vs Default:- IOPS and BW (more the better) increase BW (~1.8%):- --------------------------------------------------- XArray :- read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=162k, BW=631MiB/s (662MB/s)(18.5GiB/30001msec) Default:- read: IOPS=156k, BW=609MiB/s (639MB/s)(17.8GiB/30001msec) read: IOPS=157k, BW=613MiB/s (643MB/s)(17.0GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) Submission latency (less the better) decrease (~8.3%):- ------------------------------------------------------- XArray:- slat (usec): min=7, max=8386, avg=11.19, stdev=5.96 slat (usec): min=7, max=441, avg=11.09, stdev=4.48 slat (usec): min=7, max=1088, avg=11.21, stdev=4.54 Default :- slat (usec): min=8, max=2826.5k, avg=23.96, stdev=3911.50 slat (usec): min=8, max=503, avg=12.52, stdev=5.07 slat (usec): min=8, max=2384, avg=12.50, stdev=5.28 CPU Usage (less the better) decrease (~5.2%):- ---------------------------------------------- XArray:- cpu : usr=1.84%, sys=18.61%, ctx=949471, majf=0, minf=250 cpu : usr=1.83%, sys=18.41%, ctx=950262, majf=0, minf=237 cpu : usr=1.82%, sys=18.82%, ctx=957224, majf=0, minf=234 Default:- cpu : usr=1.70%, sys=19.21%, ctx=858196, majf=0, minf=251 cpu : usr=1.82%, sys=19.98%, ctx=929720, majf=0, minf=227 cpu : usr=1.83%, sys=20.33%, ctx=947208, majf=0, minf=235. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08nvmet: introduce flags member in nvmet_fabrics_opsMax Gurtovoy1-2/+3
Replace has_keyed_sgls and metadata_support booleans with a flags member that will be used for adding more features in the future. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: add metadata support for block devicesIsrael Rukshin1-2/+5
Allocate the metadata SGL buffers and set metadata fields for the request. Then create a block IO request for the metadata from the protection SG list. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: add metadata/T10-PI supportIsrael Rukshin1-0/+19
Expose the namespace metadata format when PI is enabled. The user needs to enable the capability per subsystem and per port. The other metadata properties are taken from the namespace/bdev. Usage example: echo 1 > /config/nvmet/subsystems/${NAME}/attr_pi_enable echo 1 > /config/nvmet/ports/${PORT_NUM}/param_pi_enable Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: rename nvmet_check_data_len to nvmet_check_transfer_lenIsrael Rukshin1-1/+1
The function doesn't check only the data length, because the transfer length includes also the metadata length in some cases. This is preparation for adding metadata (T10-PI) support. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: rename nvmet_rw_len to nvmet_rw_data_lenIsrael Rukshin1-1/+1
The function doesn't add the metadata length (only data length is calculated). This is preparation for adding metadata (T10-PI) support. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: add metadata characteristics for a namespaceIsrael Rukshin1-0/+3
Fill those namespace fields from the block device format for adding metadata (T10-PI) over fabric support with block devices. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27nvmet: add helper to revalidate bdev and file nsChaitanya Kulkarni1-0/+1
This patch adds a wrapper helper to indicate size change in the bdev & file-backed namespace when revalidating ns. This helper is needed in order to minimize code repetition in the next patch for configfs.c and existing admin-cmd.c.   Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-09nvmet: add ns revalidation supportAnthony Iliopoulos1-0/+2
Add support for detecting capacity changes on nvmet blockdev and file backed namespaces. This allows for emulating and testing online resizing of nvme devices and filesystems on top. Signed-off-by: Anthony Iliopoulos <ailiop@suse.com> [chaitanya: Fix comments posted on V1] Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: reuse code a bit more] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-26nvmet: Add get_mdts op for controllersMax Gurtovoy1-0/+1
Some transports, such as RDMA, would like to set the Maximum Data Transfer Size (MDTS) according to device/port/ctrl characteristics. This will enable the transport to set the optimal MDTS according to controller needs and device capabilities. Add a new nvmet transport op that is called during ctrl identification. This will not effect transports that don't implement this option. The return value of the new op is according to the NVMe spec definition for MDTS. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com>
2020-03-04nvmet: make ctrl model configurableMark Ruijter1-0/+8
This patch adds a new target subsys attribute which allows user to optionally specify model name which then used in the nvmet_execute_identify_ctrl() to fill up the nvme_id_ctrl structure. The default value for the model is set to "Linux" for backward compatibility. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Mark Ruijter <MRuijter@onestopsystems.com> [chaitanya.kulkarni@wdc.com *Use macro for default model, coding style fixes. *Use RCU for accessing model in for configfs and in nvmet_execute_identify_ctrl(). ] Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-03-04nvmet: make ctrl-id configurableChaitanya Kulkarni1-0/+2
This patch adds a new target subsys attribute which allows user to optionally specify target controller IDs which then used in the nvmet_execute_identify_ctrl() to fill up the nvme_id_ctrl structure. For example, when using a cluster setup with two nodes, with a dual ported NVMe drive and exporting the drive from both the nodes, The connection to the host fails due to the same controller ID and results in the following error message:- "nvme nvmeX: Duplicate cntlid XXX with nvmeX, rejecting" With this patch now user can partition the controller IDs for each subsystem by setting up the cntlid_min and cntlid_max. These values will be used at the time of the controller ID creation. By partitioning the ctrl-ids for each subsystem results in the unique ctrl-id space which avoids the collision. When new attribute is not specified target will fall back to original cntlid calculation method. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2020-02-04nvmet: fix dsm failure when payload does not match sgl descriptorSagi Grimberg1-0/+1
The host is allowed to pass the controller an sgl describing a buffer that is larger than the dsm payload itself, allow it when executing dsm. Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Reviewed-by: Christoph Hellwig <hch@lst.de>, Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2019-11-04nvmet: Open code nvmet_req_execute()Christoph Hellwig1-1/+0
Now that nvmet_req_execute does nothing, open code it. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-04nvmet: Remove the data_len field from the nvmet_req structChristoph Hellwig1-2/+1
Instead of storing the expected length and checking it when it's executed, just check the length inside the command themselves. A new helper, nvmet_check_data_len() is created to help with this check. Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, udpate changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-04nvmet: Introduce nvmet_dsm_len() helperChristoph Hellwig1-0/+6
Similar to the nvmet_rw_len helper. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-07-31nvmet: Fix use-after-free bug when a port is removedLogan Gunthorpe1-0/+3
When a port is removed through configfs, any connected controllers are still active and can still send commands. This causes a use-after-free bug which is detected by KASAN for any admin command that dereferences req->port (like in nvmet_execute_identify_ctrl). To fix this, disconnect all active controllers when a subsystem is removed from a port. This ensures there are no active controllers when the port is eventually removed. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
2019-07-09nvmet: export I/O characteristics attributes in IdentifyBart Van Assche1-0/+8
Make the NVMe NAWUN, NAWUPF, NACWU, NPWG, NPWA, NPDG and NOWS attributes available to initator systems for the block backend. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-06-21nvmet: add transport discovery change opJames Smart1-0/+2
Some transports, such as FC-NVME, support discovery controller change events without the use of a persistent discovery controller. FC receives events via RSCN from the FC Fabric Controller or subsystem FC port. This patch adds a nvmet transport op that is called whenever a discovery change event occurs in the nvmet layer. To facilitate the callback without adding another layer to cross into core.c to reference the transport ops, the port structure snapshots the transport ops when the port is enabled and clears them when disabled. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-04-25nvmet: rename nvme_completion instances from rsp to cqeMax Gurtovoy1-2/+2
Use NVMe namings for improving code readability. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-04-11nvmet: fix discover log page when offsets are usedKeith Busch1-0/+1
The nvme target hadn't been taking the Get Log Page offset parameter into consideration, and so has been returning corrupted log pages when offsets are used. Since many tools, including nvme-cli, split the log request to 4k, we've been breaking discovery log responses when more than 3 subsystems exist. Fix the returned data by internally generating the entire discovery log page and copying only the requested bytes into the user buffer. The command log page offset type has been modified to a native __le64 to make it easier to extract the value from a command. Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-02-20nvmet: convert to SPDX identifiersChristoph Hellwig1-9/+1
Update license to use SPDX-License-Identifier instead of verbose license text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
2018-12-18nvmet: use a macro for default error locationChaitanya Kulkarni1-0/+1
This patch defines a new macro NVMET_NO_ERROR_LOC to represent the default error location value in the nvme-error-log-page. This is a pure cleanup patch and it does not change any functionality. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvmet: add error log support for file backendChaitanya Kulkarni1-0/+2
This patch adds support for the file backend to populate the error log entries. Here we map the errno to the NVMe status codes. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvmet: add interface to update error-log pageChaitanya Kulkarni1-5/+0
This patch adds nvmet_req based interface to the nvmet-core so that we can update the error log page. We update error log page in the request completion path when status is not set to NVME_SC_SUCCESS. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvmet: add error-log definitionsChaitanya Kulkarni1-0/+6
This patch adds necessary fields in the target data structures to support error log page. For a target controller, we add a new error log field to maintain the error log, at any given point we maintain error entries equal to NVMET_ERROR_LOG_SLOTS for each controller. In the following patch, we also update the error log page entry in the I/O completion path so we introduce a spinlock for synchronization of the log. For nvmet_req, we add a new field error_loc to hold the location of the error in the command when the actual error occurs for each request and a starting LBA if applicable. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvme: remove nvme_common command cdw10 arrayChaitanya Kulkarni1-1/+1
This is a preparation patch which removes the nvme common command cdw10 array and replace with individual fields. This is needed for the nvmet error log page implementation make is error log page entry offset assignment easier. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-13nvmet: Add install_queue calloutSagi Grimberg1-0/+1
nvmet-tcp will implement it to allocate queue commands which are only known at nvmf connect time (sq size). Signed-off-by: Sagi Grimberg <sagi@lightbitslabs.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07nvmet: fix the structure member indentationChaitanya Kulkarni1-4/+4
This is a cleanup patch which fixes the structure member indentation introduced by the p2p: commit c6925093d0b2 ("nvmet: Optionally use PCI P2P memory"). We don't change any functionality in this patch. This is needed so that any future members will also follow the uniform indentation. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: support fabrics sq flow controlSagi Grimberg1-0/+1
Technical proposal 8005 "fabrics SQ flow control" introduces a mode where a host and controller agree to omit sq_head pointer updates when sending nvme completions. In case the host indicated desire to operate in this mode (connect attribute) the controller will return back a connect completion with sq_head value of 0xffff as indication that it will omit sq_head pointer updates. This mode saves us an atomic update in the I/O path. Reviewed-by: Hannes Reinecke <hare@suse.com> [hch: suggested better implementation] Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: mark nvmet_genctr staticChristoph Hellwig1-1/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: enable Discovery Controller AENsJay Sternberg1-1/+10
Add functions to find connections requesting Discovery Change events and send a notification to hosts that maintain an explicit persistent connection and have and active Asynchronous Event Request pending. Only Hosts that have access to the Subsystem effected by the change will receive notifications of Discovery Change event. Call these functions each time there is a configfs change that effects the Discover Log Pages. Set the OAES field in the Identify Controller response to advertise the support for Asynchronous Event Notifications. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: allow host connect even if no allowed subsystems are exportedSagi Grimberg1-2/+1
It is perfectly valid that a host connects to a discovery subsystem and gets an empty discovery log page since no subsystems are provisioned to it. No reason to disallow connecting to the discovery subsystem all together. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: add defines for discovery change async eventsJay Sternberg1-0/+2
Add AEN/AER values as defined by the specification Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: make kato and AEN processing for use by other controllersJay Sternberg1-0/+6
Make common process of get/set features available to other controllers by making simple functions static inline and others not static and prototypes in nvmet.h file Also remove static from nvmet_execute_async_event and add prototype to nvmet.h to allow used by other controllers Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: allow Keep Alive for Discovery controllerJay Sternberg1-1/+3
Per change to specification allowing Discovery controllers to have explicit persistent connections, remove restriction on Discovery controllers allowing kato on connect. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: change aen mask functions to use bit numbersJay Sternberg1-5/+5
Functions nvmet_aen_disabled and nvmet_clear_aen were using values not bit numbers ie 1 << 9 not 9 for bit function clear_bit and test_and_set_bit. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: provide aen bit functions for multiple controller typesJay Sternberg1-0/+15
Move nvmet_aen_disabled and nvmet_clear_aen in preparation for other types of controllers to use, initially the discovery controller. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07nvmet: support for traffic based keep-aliveSagi Grimberg1-0/+2
A controller that supports traffic based keep-alive can restart the keep alive timer even when no keep-alive was not received in the kato period as long as other admin or I/O commands were received. For each command set ctrl->cmd_seen to true, and when keep-alive timer expires, if any commands were seen, resched ka_work instead of escalating to a fatal error. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-10-25Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds1-0/+17
Pull PCI updates from Bjorn Helgaas: - Fix ASPM link_state teardown on removal (Lukas Wunner) - Fix misleading _OSC ASPM message (Sinan Kaya) - Make _OSC optional for PCI (Sinan Kaya) - Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set (Patrick Talbert) - Remove x86 and arm64 node-local allocation for host bridge structures (Punit Agrawal) - Pay attention to device-specific _PXM node values (Jonathan Cameron) - Support new Immediate Readiness bit (Felipe Balbi) - Differentiate between pciehp surprise and safe removal (Lukas Wunner) - Remove unnecessary pciehp includes (Lukas Wunner) - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner) - Tolerate PCIe Slot Presence Detect being hardwired to zero to workaround broken hardware, e.g., the Wilocity switch/wireless device (Lukas Wunner) - Unify pciehp controller & slot structs (Lukas Wunner) - Constify hotplug_slot_ops (Lukas Wunner) - Drop hotplug_slot_info (Lukas Wunner) - Embed hotplug_slot struct into users instead of allocating it separately (Lukas Wunner) - Initialize PCIe port service drivers directly instead of relying on initcall ordering (Keith Busch) - Restore PCI config state after a slot reset (Keith Busch) - Save/restore DPC config state along with other PCI config state (Keith Busch) - Reference count devices during AER handling to avoid race issue with concurrent hot removal (Keith Busch) - If an Upstream Port reports ERR_FATAL, don't try to read the Port's config space because it is probably unreachable (Keith Busch) - During error handling, use slot-specific reset instead of secondary bus reset to avoid link up/down issues on hotplug ports (Keith Busch) - Restore previous AER/DPC handling that does not remove and re-enumerate devices on ERR_FATAL (Keith Busch) - Notify all drivers that may be affected by error recovery resets (Keith Busch) - Always generate error recovery uevents, even if a driver doesn't have error callbacks (Keith Busch) - Make PCIe link active reporting detection generic (Keith Busch) - Support D3cold in PCIe hierarchies during system sleep and runtime, including hotplug and Thunderbolt ports (Mika Westerberg) - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots are empty or occupied (Jon Derrick) - Remove duplicated include from pci/pcie/err.c and unused variable from cpqphp (YueHaibing) - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza Pawandeep) - Uninline PCI bus accessors for better ftracing (Keith Busch) - Remove unused AER Root Port .error_resume method (Keith Busch) - Use kfifo in AER instead of a local version (Keith Busch) - Use threaded IRQ in AER bottom half (Keith Busch) - Use managed resources in AER core (Keith Busch) - Reuse pcie_port_find_device() for AER injection (Keith Busch) - Abstract AER interrupt handling to disconnect error injection (Keith Busch) - Refactor AER injection callbacks to simplify future improvments (Keith Busch) - Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski) - Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko) - Add switch fall-through annotations (Gustavo A. R. Silva) - Remove unused Switchtec quirk variable (Joshua Abraham) - Fix pci.c kernel-doc warning (Randy Dunlap) - Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig) - Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng) - Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid useless dmesg errors (Logan Gunthorpe) - Update Switchtec NTB documentation (Wesley Yung) - Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz) - Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang) - Add PCI support for peer-to-peer DMA (Logan Gunthorpe) - Add sysfs group for PCI peer-to-peer memory statistics (Logan Gunthorpe) - Add PCI peer-to-peer DMA scatterlist mapping interface (Logan Gunthorpe) - Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan Gunthorpe) - Add PCI peer-to-peer DMA driver writer's documentation (Logan Gunthorpe) - Add block layer flag to indicate driver support for PCI peer-to-peer DMA (Logan Gunthorpe) - Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P memory (Logan Gunthorpe) - Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan Gunthorpe) - Add nvme-pci support for PCI peer-to-peer memory in requests (Logan Gunthorpe) - Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise, Christoph Hellwig, Logan Gunthorpe) - Cache VF config space size to optimize enumeration of many VFs (KarimAllah Ahmed) - Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas) - Fix VMD AERSID quirk Device ID matching (Jon Derrick) - Fix Cadence PHY handling during probe (Alan Douglas) - Signal Cadence Endpoint interrupts via AXI region 0 instead of last region (Alan Douglas) - Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan Douglas) - Remove redundant controller tests for "device_type == pci" (Rob Herring) - Document R-Car E3 (R8A77990) bindings (Tho Vu) - Add device tree support for R-Car r8a7744 (Biju Das) - Drop unused mvebu PCIe capability code (Thomas Petazzoni) - Add shared PCI bridge emulation code (Thomas Petazzoni) - Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni) - Add aardvark Root Port emulation (Thomas Petazzoni) - Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach) - Add initial power management for i.MX7 (Leonard Crestez) - Add PME_Turn_Off support for i.MX7 (Leonard Crestez) - Fix qcom runtime power management error handling (Bjorn Andersson) - Update TI dra7xx unaligned access errata workaround for host mode as well as endpoint mode (Vignesh R) - Fix kirin section mismatch warning (Nathan Chancellor) - Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare) - Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I) - Update Keystone to use MRRS quirk for host bridge instead of open coding (Kishon Vijay Abraham I) - Refactor Keystone link establishment (Kishon Vijay Abraham I) - Simplify and speed up Keystone link training (Kishon Vijay Abraham I) - Remove unused Keystone host_init argument (Kishon Vijay Abraham I) - Merge Keystone driver files into one (Kishon Vijay Abraham I) - Remove redundant Keystone platform_set_drvdata() (Kishon Vijay Abraham I) - Rename Keystone functions for uniformity (Kishon Vijay Abraham I) - Add Keystone device control module DT binding (Kishon Vijay Abraham I) - Use SYSCON API to get Keystone control module device IDs (Kishon Vijay Abraham I) - Clean up Keystone PHY handling (Kishon Vijay Abraham I) - Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I) - Clean up Keystone config space access checks (Kishon Vijay Abraham I) - Get Keystone outbound window count from DT (Kishon Vijay Abraham I) - Clean up Keystone outbound window configuration (Kishon Vijay Abraham I) - Clean up Keystone DBI setup (Kishon Vijay Abraham I) - Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I) - Fix Keystone IRQ status checking (Kishon Vijay Abraham I) - Add debug messages for all Keystone errors (Kishon Vijay Abraham I) - Clean up Keystone includes and macros (Kishon Vijay Abraham I) - Fix Mediatek unchecked return value from devm_pci_remap_iospace() (Gustavo A. R. Silva) - Fix Mediatek endpoint/port matching logic (Honghui Zhang) - Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui Zhang) - Remove redundant Mediatek PM domain check (Honghui Zhang) - Convert Mediatek to pci_host_probe() (Honghui Zhang) - Fix Mediatek MSI enablement (Honghui Zhang) - Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang) - Add Mediatek loadable module support (Honghui Zhang) - Detach VMD resources after stopping root bus to prevent orphan resources (Jon Derrick) - Convert pcitest build process to that used by other tools (iio, perf, etc) (Gustavo Pimentel) * tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI/AER: Refactor error injection fallbacks PCI/AER: Abstract AER interrupt handling PCI/AER: Reuse existing pcie_port_find_device() interface PCI/AER: Use managed resource allocations PCI: pcie: Remove redundant 'default n' from Kconfig PCI: aardvark: Implement emulated root PCI bridge config space PCI: mvebu: Convert to PCI emulated bridge config space PCI: mvebu: Drop unused PCI express capability code PCI: Introduce PCI bridge emulated config space common logic PCI: vmd: Detach resources after stopping root bus nvmet: Optionally use PCI P2P memory nvmet: Introduce helper functions to allocate and free request SGLs nvme-pci: Add support for P2P memory in requests nvme-pci: Use PCI p2pmem subsystem to manage the CMB IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]() block: Add PCI P2P flag for request queue PCI/P2PDMA: Add P2P DMA driver writer's documentation docs-rst: Add a new directory for PCI documentation PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset ...
2018-10-17nvmet: Optionally use PCI P2P memoryLogan Gunthorpe1-0/+15
Create a configfs attribute in each nvme-fabrics namespace to enable P2P memory use. The attribute may be enabled (with a boolean) or a specific P2P device may be given (with the device's PCI name). When enabled, the namespace will ensure the underlying block device supports P2P and is compatible with any specified P2P device. If no device was specified it will ensure there is compatible P2P memory somewhere in the system. Enabling a namespace with P2P memory will fail with EINVAL (and an appropriate dmesg error) if any of these conditions are not met. Once a controller is set up on a specific port, the P2P device to use for each namespace will be found and stored in a radix tree by namespace ID. When memory is allocated for a request, the tree is used to look up the P2P device to allocate memory against. If no device is in the tree (because no appropriate device was found), or if allocation of P2P memory fails, fall back to using regular memory. Signed-off-by: Stephen Bates <sbates@raithlin.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> [hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-17nvmet: Introduce helper functions to allocate and free request SGLsLogan Gunthorpe1-0/+2
Add helpers to allocate and free the SGL in a struct nvmet_req: int nvmet_req_alloc_sgl(struct nvmet_req *req) void nvmet_req_free_sgl(struct nvmet_req *req) This will be expanded in a future patch to implement peer-to-peer memory DMAs and should be common with all target drivers. The new helpers are used in nvmet-rdma. Seeing we use req.transfer_len as the length of the SGL it is set earlier and cleared on any error. It also seems to be unnecessary to accumulate the length as the map_sgl functions should only ever be called once per request. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me>
2018-10-01nvmet: don't split large I/Os unconditionallySagi Grimberg1-0/+1
If we know that the I/O size exceeds our inline bio vec, no point using it and split the rest to begin with. We could in theory reuse the inline bio and only allocate the bio_vec, but its really not worth optimizing for. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-08nvmet: add ns write protect supportChaitanya Kulkarni1-0/+4
This patch implements the Namespace Write Protect feature described in "NVMe TP 4005a Namespace Write Protect". In this version, we implement No Write Protect and Write Protect states for target ns which can be toggled by set-features commands from the host side. For write-protect state transition, we need to flush the ns specified as a part of command so we also add helpers for carrying out synchronous flush operations. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: fixed an incorrect endianess conversion, minor cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-07-27nvmet: support configuring ANA groupsChristoph Hellwig1-3/+27
Allow creating non-default ANA groups (group ID > 1). Groups are created either by assigning the group ID to a namespace, or by creating a configfs group object under a specific port. All namespaces assigned to a group that doesn't have a configfs object for a given port are marked as inaccessible. Allow changing the ANA state on a per-port basis by creating an ana_groups directory under each port, and another directory with an ana_state file in it. The default ANA group 1 directory is created automatically for each port. For all changes in ANA configuration the ANA change AEN is sent. We only keep a global changecount instead of additional per-group changecounts to keep the implementation as simple as possible. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2018-07-27nvmet: add minimal ANA supportChristoph Hellwig1-0/+15
Add support for Asynchronous Namespace Access as specified in NVMe 1.3 TP 4004. Just add a default ANA group 1 that is optimized on all ports. This is (and will remain) the default assignment for any namespace not epxlicitly assigned to another ANA group. The ANA state can be manually changed through the configfs interface, including the change state. Includes fixes and improvements from Hannes Reinecke. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>