aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/drivers/infiniband/ulp/srp/ib_srp.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2023-10-02IB/srp: Annotate struct srp_fr_pool with __counted_byKees Cook1-1/+1
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct srp_fr_pool. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230929180431.3005464-5-keescook@chromium.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-12-29RDMA/srp: Move large values to a new enum for gcc13Jiri Slaby (SUSE)1-3/+5
Since gcc13, each member of an enum has the same type as the enum [1]. And that is inherited from its members. Provided these two: SRP_TAG_NO_REQ = ~0U, SRP_TAG_TSK_MGMT = 1U << 31 all other members are unsigned ints. Esp. with SRP_MAX_SGE and SRP_TSK_MGMT_SQ_SIZE and their use in min(), this results in the following warnings: include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast drivers/infiniband/ulp/srp/ib_srp.c:563:42: note: in expansion of macro 'min' include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast drivers/infiniband/ulp/srp/ib_srp.c:2369:27: note: in expansion of macro 'min' So move the large values away to a separate enum, so that they don't affect other members. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113 Link: https://lore.kernel.org/r/20221212120411.13750-1-jirislaby@kernel.org Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22RDMA/srp: Support more than 255 rdma portsMikhael Goikhman1-1/+1
Currently ib_srp module does not support devices with more than 256 ports. Switch from u8 to u32 to fix the problem. Fixes: 1fb7f8973f51 ("RDMA: Support more than 255 rdma ports") Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Mikhael Goikhman <migo@nvidia.com> Link: https://lore.kernel.org/r/7d80d8844f1abb3a54170b7259f0a02be38080a6.1663747327.git.leonro@nvidia.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28RDMA/srp: Remove the srp_host.released completionBart Van Assche1-1/+0
Move the kfree(host) calls into srp_release_dev(). Convert a device_unregister() call into a device_del() and a device_put() call. Remove the host->released completion object. This patch prepares for handling dev_set_name() failure in srp_add_port(). Link: https://lore.kernel.org/r/20220825213900.864587-3-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-02-23RDMA/ib_srp: Add more documentationBart Van Assche1-1/+10
Make it more clear what the different ib_srp data structures represent. Link: https://lore.kernel.org/r/20220215210511.28303-2-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-28RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacentBart Van Assche1-2/+0
Define .init_cmd_priv and .exit_cmd_priv callback functions in struct scsi_host_template. Set .cmd_size such that the SCSI core allocates per-command private data. Use scsi_cmd_priv() to access that private data. Remove the req_ring pointer from struct srp_rdma_ch since it is no longer necessary. Convert srp_alloc_req_data() and srp_free_req_data() into functions that initialize one instance of the SRP-private command data. This is a micro-optimization since this patch removes several pointer dereferences from the hot path. Note: due to commit e73a5e8e8003 ("scsi: core: Only return started requests from scsi_host_find_tag()"), it is no longer necessary to protect the completion path against duplicate responses. Link: https://lore.kernel.org/r/20210524041211.9480-6-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-02RDMA/srp: Remove support for FMR memory registrationMax Gurtovoy1-21/+6
FMR is not supported on most recent RDMA devices (that use fast memory registration mechanism). Also, FMR was recently removed from NFS/RDMA ULP. Link: https://lore.kernel.org/r/2-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-20RDMA: Replace zero-length array with flexible-array memberGustavo A. R. Silva1-1/+1
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Link: https://lore.kernel.org/r/20200213010425.GA13068@embeddedor.com Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> # added a few more
2019-10-08RDMA/srp: Add parse function for maximum initiator to target IU sizeHonggang Li1-0/+1
According to SRP specifications 'srp-r16a' and 'srp2r06', IOControllerProfile attributes for SRP target port include the maximum initiator to target IU size. SRP connection daemons, such as srp_daemon, can get the value from the subnet manager. The SRP connection daemon can pass this value to kernel. This patch adds a parse function for it. Upstream commit [1] enables the kernel parameter, 'use_imm_data', by default. [1] also use (8 * 1024) as the default value for kernel parameter 'max_imm_data'. With those default values, the maximum initiator to target IU size will be 8260. In case the SRPT modules, which include the in-tree 'ib_srpt.ko' module, do not support SRP-2 'immediate data' feature, the default maximum initiator to target IU size is significantly smaller than 8260. For 'ib_srpt.ko' module, which built from source before [2], the default maximum initiator to target IU is 2116. [1] introduces a regression issue for old srp targets with default kernel parameters, as the connection will be rejected because of a too large maximum initiator to target IU size. [1] commit 882981f4a411 ("RDMA/srp: Add support for immediate data") [2] commit 5dabcd0456d7 ("RDMA/srpt: Add support for immediate data") Link: https://lore.kernel.org/r/20190927174352.7800-1-honli@redhat.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Honggang Li <honli@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-04RDMA/srp: Honor the max_send_sge device attributeBart Van Assche1-0/+1
Instead of assuming that max_send_sge >= 3, restrict the number of scatter gather elements to what is supported by the RDMA adapter. Link: https://lore.kernel.org/r/20190930231707.48259-8-bvanassche@acm.org Cc: Honggang LI <honli@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-04RDMA/srp: Remove two castsBart Van Assche1-0/+2
This patch does not change any functionality. Link: https://lore.kernel.org/r/20190930231707.48259-7-bvanassche@acm.org Cc: Honggang LI <honli@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19RDMA/srp: Add support for immediate dataBart Van Assche1-0/+12
Request permission to send immediate data during login. If the SRP target grants this request, send the payload of write requests <= 8 KB as immediate data. Cc: Sergey Gorenko <sergeygo@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-19RDMA/srp: Rework handling of the maximum information unit lengthBart Van Assche1-1/+2
Move the maximum initiator to target information unit length parameter from struct srp_target_port into struct srp_rdma_ch. This patch does not change any functionality but makes the next patch easier to read. Cc: Sergey Gorenko <sergeygo@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-19RDMA/srp: Move srp_rdma_ch.max_ti_iu_len declarationBart Van Assche1-1/+2
Since srp_rdma_ch.max_ti_iu_len is used in the hot path, move it to the section with data structure members used in the hot path. Cc: Sergey Gorenko <sergeygo@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-19RDMA/srp: Handle large SCSI CDBs correctlyBart Van Assche1-0/+2
Reserve additional space for CDBs that contain more than sixteen bytes and set the add_cdb_len field for such CDBs as required. From the SRP standard: "The ADDITIONAL CDB LENGTH field contains the length in dwords of the ADDITIONAL CDB field." Cc: Sergey Gorenko <sergeygo@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-23IB/srp: Add target_can_queue login parameterBart Van Assche1-0/+1
Although I'm not sure this parameter is useful for regular SRP users, setting this parameter to 1 has shown to be invaluable for testing the block layer core, SCSI core and device mapper queue running mechanisms. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-23IB/srp: Add RDMA/CM supportBart Van Assche1-7/+35
Since the SRP_LOGIN_REQ defined in the SRP standard is larger than what fits in the RDMA/CM login request private data, introduce a new login request format for the RDMA/CM. Note: since srp_daemon and ibsrpdm rely on the subnet manager and since there is no equivalent of the IB subnet manager in non-IB networks, login has to be performed manually for non-IB networks. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/srp: Cache global rkeyBart Van Assche1-1/+2
This is a micro-optimization for the hot path. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-05-01IB/SA: Rename ib_sa_path_rec to sa_path_recDasaratharaman Chandramouli1-1/+1
Rename ib_sa_path_rec to a more generic sa_path_rec. This is part of extending ib_sa to also support OPA path records in addition to the IB defined path records. Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19IB/srp: Fix race conditions related to task managementBart Van Assche1-0/+1
Avoid that srp_process_rsp() overwrites the status information in ch if the SRP target response timed out and processing of another task management function has already started. Avoid that issuing multiple task management functions concurrently triggers list corruption. This patch prevents that the following stack trace appears in the system log: WARNING: CPU: 8 PID: 9269 at lib/list_debug.c:52 __list_del_entry_valid+0xbc/0xc0 list_del corruption. prev->next should be ffffc90004bb7b00, but was ffff8804052ecc68 CPU: 8 PID: 9269 Comm: sg_reset Tainted: G W 4.10.0-rc7-dbg+ #3 Call Trace: dump_stack+0x68/0x93 __warn+0xc6/0xe0 warn_slowpath_fmt+0x4a/0x50 __list_del_entry_valid+0xbc/0xc0 wait_for_completion_timeout+0x12e/0x170 srp_send_tsk_mgmt+0x1ef/0x2d0 [ib_srp] srp_reset_device+0x5b/0x110 [ib_srp] scsi_ioctl_reset+0x1c7/0x290 scsi_ioctl+0x12a/0x420 sd_ioctl+0x9d/0x100 blkdev_ioctl+0x51e/0x9f0 block_ioctl+0x38/0x40 do_vfs_ioctl+0x8f/0x700 SyS_ioctl+0x3c/0x70 entry_SYSCALL_64_fastpath+0x18/0xad Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: Steve Feeley <Steve.Feeley@sandisk.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-09-23IB/srp: use IB_PD_UNSAFE_GLOBAL_RKEYChristoph Hellwig1-2/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13IB/srp: Prevent mapping failuresBart Van Assche1-0/+1
If both max_sectors and the queue_depth are high enough it can happen that the MR pool is depleted temporarily. This causes the SRP initiator to report mapping failures. Although the SRP initiator recovers from such mapping failures, prevent that this can happen by allocating more memory regions. Additionally, only enable memory registration if at least two pages can be registered per memory region. Reported-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Laurence Oberman <loberman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-12IB/srp: Introduce target->mr_pool_sizeBart Van Assche1-0/+1
This patch does not change any functionality. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-15Merge branch 'rdma-cq.2' of git://git.infradead.org/users/hch/rdma into 4.5/rdma-cqDoug Ledford1-5/+2
Signed-off-by: Doug Ledford <dledford@redhat.com> Conflicts: drivers/infiniband/ulp/srp/ib_srp.c - Conflicts with changes in ib_srp.c introduced during 4.4-rc updates
2015-12-11IB/srp: use the new CQ APIChristoph Hellwig1-5/+2
This also moves recv completion handling from hardirq context into softirq context. Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-12-07IB/srp: Fix srp_map_sg_fr()Bart Van Assche1-4/+1
After dma_map_sg() has been called the return value of that function must be used as the number of elements in the scatterlist instead of scsi_sg_count(). Fixes: commit f7f7aab1a5c0 ("IB/srp: Convert to new registration API") Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: stable <stable@vger.kernel.org> # v4.4+ Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/srp: Convert to new registration APISagi Grimberg1-3/+8
Instead of constructing a page list, call ib_map_mr_sg and post a new ib_reg_wr. srp_map_finish_fr now returns the number of sg elements registered. Remove srp_finish_mapping since no one is calling it. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/srp: Create an insecure all physical rkey only if neededBart Van Assche1-2/+2
The SRP initiator only needs this if the insecure register_always=N performance optimization is enabled, or if FRWR/FMR is not supported in the driver. Do not create an all physical MR unless it is needed to support either of those modes. Default register_always to true so the out of the box configuration does not create an insecure all physical MR. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [bvanassche: reworked and rebased this patch] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/srp: Register the indirect data buffer descriptorBart Van Assche1-0/+4
Instead of always using the global rkey for the indirect data buffer descriptor, register that descriptor with the HCA if the kernel module parameter register_always has been set to Y. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/srp: Introduce srp_device.use_fmrBart Van Assche1-0/+1
Introduce the variable srp_device.use_fmr. Leave out the dev->has_fr / dev->has_fmr and ch->fr_pool / ch->fmr_pool checks since these are redundant. This patch does not change any functionality but makes the source code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/srp: Remove the memory registration backtracking codeBart Van Assche1-6/+0
Mapping a discontiguous sg-list requires multiple memory regions and hence can exhaust the memory region pool. The SRP initiator already handles this by temporarily reducing the queue depth. This means that it is safe to remove the memory registration backtracking code. This patch has been tested with direct I/O sizes up to 256 MB. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30IB/srp: Add memory descriptor array pointer range checkingBart Van Assche1-2/+8
Although most paths through which a request is submitted check block layer parameters like the max_segments limit, these are not checked when an SG_IO or direct I/O request is submitted. Hence add a range check for the memory descriptor array pointer. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/srp: Add 64-bit LUN supportBart Van Assche1-1/+0
The SCSI standard defines 64-bit values for LUNs. Large arrays employing large or hierarchical LUN numbers become more and more common. So update the SRP initiator to use 64-bit LUN numbers. See also Hannes Reinecke, commit 9cb78c16f5da ("scsi: use 64-bit LUNs"), June 2014. The largest LUN number that has been tested is 0xd2003fff00000000. Checked the following structure sizes with gdb: * sizeof(struct srp_cmd) = 48 * sizeof(struct srp_tsk_mgmt) = 48 * sizeof(struct srp_aer_req) = 36 The ibmvscsi changes have been compile tested only (on a PPC system). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: Brian King <brking@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-18IB/srp: Fix connection state trackingBart Van Assche1-1/+1
Reception of a DREQ message only causes the state of a single channel to change. Hence move the 'connected' member variable from the target to the channel data structure. This patch avoids that following false positive warning can be reported by srp_destroy_qp(): WARNING: at drivers/infiniband/ulp/srp/ib_srp.c:617 srp_destroy_qp+0xa6/0x120 [ib_srp]() Call Trace: [<ffffffff8106e10f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8106e16a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa0440226>] srp_destroy_qp+0xa6/0x120 [ib_srp] [<ffffffffa0440322>] srp_free_ch_ib+0x82/0x1e0 [ib_srp] [<ffffffffa044408b>] srp_create_target+0x7ab/0x998 [ib_srp] [<ffffffff81346f60>] dev_attr_store+0x20/0x30 [<ffffffff811dd90f>] sysfs_write_file+0xef/0x170 [<ffffffff8116d248>] vfs_write+0xc8/0x190 [<ffffffff8116d411>] sys_write+0x51/0x90 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com> Cc: <stable@vger.kernel.org> #v3.19 Signed-off-by: Doug Ledford <dledford@redhat.com>
2014-11-12IB/srp: Fix a race condition triggered by destroying a queue pairBart Van Assche1-0/+2
At least LID reassignment can trigger a race condition in the SRP initiator driver, namely the receive completion handler trying to post a request on a QP during or after QP destruction and before the CQ's have been destroyed. Avoid this race by modifying a QP into the error state and by waiting until all receive completions have been processed before destroying a QP. Reported-by: Max Gurtuvoy <maxg@mellanox.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12IB/srp: Add multichannel supportBart Van Assche1-1/+2
Improve performance by using multiple RDMA/RC channels per SCSI host for communication with an SRP target. About the implementation: - Introduce a loop over all channels in the code that uses target->ch. - Set the SRP_MULTICHAN_MULTI flag during login for the creation of the second and subsequent channels. - RDMA completion vectors are chosen such that RDMA completion interrupts are handled by the CPU socket that submitted the I/O request. As one can see in this patch it has been assumed if a system contains n CPU sockets and m RDMA completion vectors have been assigned to an RDMA HCA that IRQ affinity has been configured such that completion vectors [i*m/n..(i+1)*m/n) are bound to CPU socket i with 0 <= i < n. - Modify srp_free_ch_ib() and srp_free_req_data() such that it becomes safe to invoke these functions after the corresponding allocation function failed. - Add a ch_count sysfs attribute per target port. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12IB/srp: Use block layer tagsBart Van Assche1-3/+0
Since the block layer already contains functionality to assign a tag to each request, use that functionality instead of reimplementing that functionality in the SRP initiator driver. This change makes the free_reqs list superfluous. Hence remove that list. [hch: updated to use .use_blk_tags instead scsi_activate_tcq] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12IB/srp: Separate target and channel variablesBart Van Assche1-22/+42
Changes in this patch: - Move channel variables into a new structure (struct srp_rdma_ch). - Add an srp_target_port pointer, 'lock' and 'comp_vector' members in struct srp_rdma_ch. - Add code to initialize these three new member variables. - Many boring "target->" into "ch->" changes. - The cm_id and completion handler context pointers are now of type srp_rdma_ch * instead of srp_target_port *. - Three kzalloc(a * b, f) calls have been changed into kcalloc(a, b, f) to avoid that this patch would trigger a checkpatch warning. - Two casts from u64 into unsigned long long have been left out because these are superfluous. Since considerable time u64 is defined as unsigned long long for all architectures supported by the Linux kernel. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12IB/srp: Introduce two new srp_target_port member variablesBart Van Assche1-1/+3
Introduce the srp_target_port member variables 'sgid' and 'pkey'. Change the type of 'orig_dgid' from __be16[8] into union ib_gid. This patch does not change any functionality but makes the "Separate target and channel variables" patch easier to verify. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-11-12IB/srp: Avoid that I/O hangs due to a cable pull during LUN scanningBart Van Assche1-0/+1
If a cable is pulled during LUN scanning it can happen that the SRP rport and the SCSI host have been created but no LUNs have been added to the SCSI host. Since multipathd only sends SCSI commands to a SCSI target if one or more SCSI devices are present and since there is no keepalive mechanism for IB queue pairs this means that after a LUN scan failed and after a reconnect has succeeded no data will be sent over the QP and hence that a subsequent cable pull will not be detected. Avoid this by not creating an rport or SCSI host if a cable is pulled during a SCSI LUN scan. Note: so far the above behavior has only been observed with the kernel module parameter ch_count set to a value >= 2. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-05-20IB/srp: Add fast registration supportBart Van Assche1-5/+70
Certain HCA types (e.g. Connect-IB) and certain configurations (e.g. ConnectX VF) support fast registration but not FMR. Hence add fast registration support. In function srp_rport_reconnect(), move the the srp_finish_req() loop from after to before the srp_create_target_ib() call. This is needed to avoid that srp_finish_req() tries to queue any invalidation requests for rkeys associated with the old queue pair on the newly allocated queue pair. Invoking srp_finish_req() before the queue pair has been reallocated is safe since srp_claim_req() handles completions correctly that arrive after srp_finish_req() has been invoked. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-20IB/srp: Rename FMR-related variablesBart Van Assche1-8/+8
The next patch will cause the renamed variables to be shared between the code for FMR and for FR memory registration. Make the names of these variables independent of the memory registration mode. This patch does not change any functionality. The start of this patch was the changes applied via the following shell command: sed -i.orig 's/SRP_FMR_SIZE/SRP_MAX_PAGES_PER_MR/g; \ s/fmr_page_mask/mr_page_mask/g;s/fmr_page_size/mr_page_size/g; \ s/fmr_page_shift/mr_page_shift/g;s/fmr_max_size/mr_max_size/g; \ s/max_pages_per_fmr/max_pages_per_mr/g;s/nfmr/nmdesc/g; \ s/fmr_len/dma_len/g' drivers/infiniband/ulp/srp/ib_srp.[ch] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-05-20IB/srp: One FMR pool per SRP connectionBart Van Assche1-4/+3
Allocate one FMR pool per SRP connection instead of one SRP pool per HCA. This improves scalability of the SRP initiator. Only request the SCSI mid-layer to retry a SCSI command after a temporary mapping failure (-ENOMEM) but not after a permanent mapping failure. This avoids that SCSI commands are retried indefinitely if a permanent memory mapping failure occurs. Tell the SCSI mid-layer to reduce queue depth temporarily in the unlikely case where an application is queuing many requests with more than max_pages_per_fmr sg-list elements. For FMR pool allocation, base the max_pages_per_fmr parameter on the HCA memory registration limit. Only try to allocate an FMR pool if FMR is supported. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-24IB/srp: Avoid duplicate connectionsBart Van Assche1-0/+1
The connection uniqueness check is performed before a new connection is added to the target list. This patch protects both actions by a mutex such that simultaneous writes from two different threads into the "add_target" variable do not result in duplicate connections. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/srp: Make queue size configurableBart Van Assche1-9/+8
Certain storage configurations, e.g. a sufficiently large array of hard disks in a RAID configuration, need a queue depth above 64 to achieve optimal performance. Hence make the queue depth configurable. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Tested-by: Jack Wang <xjtuwjp@gmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/srp: Start timers if a transport layer error occursBart Van Assche1-0/+1
Start the reconnect timer, fast_io_fail timer and dev_loss timers if a transport layer error occurs. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/srp: Use SRP transport layer error recoveryBart Van Assche1-1/+0
Enable fast_io_fail_tmo and dev_loss_tmo functionality for the IB SRP initiator. Add kernel module parameters that allow to specify default values for these parameters. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/srp: Keep rport as long as the IB transport layerBart Van Assche1-0/+1
Keep the rport data structure around after srp_remove_host() has finished until cleanup of the IB transport layer has finished completely. This is necessary because later patches use the rport pointer inside the queuecommand callback. Without this patch accessing the rport from inside a queuecommand callback is racy because srp_remove_host() must be invoked before scsi_remove_host() and because the queuecommand callback could get invoked after srp_remove_host() has finished. In other words, without this patch the queuecommand callback can get invoked after the rport data structure has been freed. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08IB/srp: Make transport layer retry count configurableVu Pham1-0/+1
Allow the InfiniBand RC retry count to be configured by the user as an option in the target login string. Reducing this retry count allows to reduce the path failover time. Signed-off-by: Vu Pham <vu@mellanox.com> [ bvanassche: Rewrote patch description / changed default retry count ] Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-07-01IB/srp: Make HCA completion vector configurableBart Van Assche1-0/+1
Several InfiniBand HCAs allow configuring the completion vector per CQ. This allows spreading the workload created by IB completion interrupts over multiple MSI-X vectors and hence over multiple CPU cores. In other words, configuring the completion vector properly not only allows reducing latency on an initiator connected to multiple SRP targets but also allows improving throughput. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>