aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-07-11RDMA/irdma: Fix sleep from invalid context BUGMustafa Ismail1-50/+0
Taking the qos_mutex to process RoCEv2 QP's on netdev events causes a kernel splat. Fix this by removing the handling for RoCEv2 in irdma_cm_teardown_connections that uses the mutex. This handling is only needed for iWARP to avoid having connections established while the link is down or having connections remain functional after the IP address is removed. BUG: sleeping function called from invalid context at kernel/locking/mutex. Call Trace: kernel: dump_stack+0x66/0x90 kernel: ___might_sleep.cold.92+0x8d/0x9a kernel: mutex_lock+0x1c/0x40 kernel: irdma_cm_teardown_connections+0x28e/0x4d0 [irdma] kernel: ? check_preempt_curr+0x7a/0x90 kernel: ? select_idle_sibling+0x22/0x3c0 kernel: ? select_task_rq_fair+0x94c/0xc90 kernel: ? irdma_exec_cqp_cmd+0xc27/0x17c0 [irdma] kernel: ? __wake_up_common+0x7a/0x190 kernel: irdma_if_notify+0x3cc/0x450 [irdma] kernel: ? sched_clock_cpu+0xc/0xb0 kernel: irdma_inet6addr_event+0xc6/0x150 [irdma] Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-07-11RDMA/irdma: Do not advertise 1GB page size for x722Mustafa Ismail4-2/+5
x722 does not support 1GB page size but the irdma driver incorrectly advertises 1GB page size support for x722 device to ib_core to compute the best page size to use on this MR. This could lead to incorrect start offsets computed by hardware on the MR. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-06-24RDMA/cm: Fix memory leak in ib_cm_insert_listenMiaoqian Lin1-1/+3
cm_alloc_id_priv() allocates resource for the cm_id_priv. When cm_init_listen() fails it doesn't free it, leading to memory leak. Add the missing error unwind. Fixes: 98f67156a80f ("RDMA/cm: Simplify establishing a listen cm_id") Link: https://lore.kernel.org/r/20220621052546.4821-1-linmq006@gmail.com Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-06-07RDMA/qedr: Fix reporting QP timeout attributeKamal Heib2-1/+4
Make sure to save the passed QP timeout attribute when the QP gets modified, so when calling query QP the right value is reported and not the converted value that is required by the firmware. This issue was found while running the pyverbs tests. Fixes: cecbcddf6461 ("qedr: Add support for QP verbs") Link: https://lore.kernel.org/r/20220525132029.84813-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2022-05-26RDMA/rtrs-clt: Fix one kernel-doc commentYang Li1-1/+1
Add the description of @pathname and remove @sessname in rtrs_clt_open() kernel-doc comment to remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. drivers/infiniband/ulp/rtrs/rtrs-clt.c:2809: warning: Function parameter or member 'pathname' not described in 'rtrs_clt_open' drivers/infiniband/ulp/rtrs/rtrs-clt.c:2809: warning: Excess function parameter 'sessname' description in 'rtrs_clt_open' Link: https://lore.kernel.org/r/20220526130945.98601-1-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Remove all traces of diagpkt supportDennis Dalessandro1-23/+0
One of the concessions we made to get our driver upstream was to remove the diagnostic packet support. There is however still some cruft that was left over. Remove it. Link: https://lore.kernel.org/r/20220520183727.48973.93587.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Consolidate software versionsDennis Dalessandro2-18/+1
There is no need to have separate user and kernel software versions. There is a single software that the kernel is compatible with. Also remove the notion of a "kernel type" that is long since deprecated. Link: https://lore.kernel.org/r/20220520183722.48973.60262.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Remove pointless driver versionDennis Dalessandro2-21/+0
Driver versions have long been forbidden in the RDMA subsystem. We removed most of the code relating to them and have been very strict about not allowing. However there is some leftover versioning that we do not need. Get rid of that. Link: https://lore.kernel.org/r/20220520183717.48973.17418.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Fix potential integer multiplication overflow errorsDennis Dalessandro1-1/+1
When multiplying of different types, an overflow is possible even when storing the result in a larger type. This is because the conversion is done after the multiplication. So arithmetic overflow and thus in incorrect value is possible. Correct an instance of this in the inter packet delay calculation. Fix by ensuring one of the operands is u64 which will promote the other to u64 as well ensuring no overflow. Cc: stable@vger.kernel.org Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20220520183712.48973.29855.stgit@awfm-01.cornelisnetworks.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Prevent panic when SDMA is disabledDouglas Miller1-0/+2
If the hfi1 module is loaded with HFI1_CAP_SDMA off, a call to hfi1_write_iter() will dereference a NULL pointer and panic. A typical stack frame is: sdma_select_user_engine [hfi1] hfi1_user_sdma_process_request [hfi1] hfi1_write_iter [hfi1] do_iter_readv_writev do_iter_write vfs_writev do_writev do_syscall_64 The fix is to test for SDMA in hfi1_write_iter() and fail the I/O with EINVAL. Link: https://lore.kernel.org/r/20220520183706.48973.79803.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/hfi1: Prevent use of lock before it is initializedDouglas Miller1-5/+7
If there is a failure during probe of hfi1 before the sdma_map_lock is initialized, the call to hfi1_free_devdata() will attempt to use a lock that has not been initialized. If the locking correctness validator is on then an INFO message and stack trace resembling the following may be seen: INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. Call Trace: register_lock_class+0x11b/0x880 __lock_acquire+0xf3/0x7930 lock_acquire+0xff/0x2d0 _raw_spin_lock_irq+0x46/0x60 sdma_clean+0x42a/0x660 [hfi1] hfi1_free_devdata+0x3a7/0x420 [hfi1] init_one+0x867/0x11a0 [hfi1] pci_device_probe+0x40e/0x8d0 The use of sdma_map_lock in sdma_clean() is for freeing the sdma_map memory, and sdma_map is not allocated/initialized until after sdma_map_lock has been initialized. This code only needs to be run if sdma_map is not NULL, and so checking for that condition will avoid trying to use the lock before it is initialized. Fixes: 473291b3ea0e ("IB/hfi1: Fix for early release of sdma context") Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20220520183701.48973.72434.stgit@awfm-01.cornelisnetworks.com Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24RDMA/rxe: Fix an error handling path in rxe_get_mcg()Christophe JAILLET1-2/+4
The commit in the Fixes tag has shuffled some code. Now 'mcg_num' is incremented before the kzalloc(). So if the memory allocation fails, this increment must be undone. Fixes: a926a903b7dc ("RDMA/rxe: Do not call dev_mc_add/del() under a spinlock") Link: https://lore.kernel.org/r/fe137cd8b1f17593243aa73d59c18ea71ab9ee36.1653225896.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24IB/core: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20220521111145.81697-87-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24IB/hf1: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20220521111145.81697-83-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24IB/qib: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20220521111145.81697-56-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24IB/iser: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/20220521111145.81697-4-Julia.Lawall@inria.fr Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-20RDMA/mlx4: Avoid flush_scheduled_work() usageTetsuo Handa3-8/+34
Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_wq with local cm_wq. Link: https://lore.kernel.org/r/22f7183b-cc16-5a34-e879-7605f5efc6e6@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-20IB/isert: Avoid flush_scheduled_work() usageTetsuo Handa1-9/+16
Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_wq with local isert_login_wq. Link: https://lore.kernel.org/r/fbe5e9a8-0110-0c22-b7d6-74d53948d042@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-19RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr()Daisuke Matsuda1-1/+0
The pointer imr->umem is assigned twice. Fix this by removing the redundant one. Link: https://lore.kernel.org/r/20220518044914.1903125-1-matsuda-daisuke@fujitsu.com Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-17RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq()Minghao Chi1-1/+0
Calling synchronize_irq() right before free_irq() is quite useless. On one hand the IRQ can easily fire again before free_irq() is entered, on the other hand free_irq() itself calls synchronize_irq() internally (in a race condition free way), before any state associated with the IRQ is freed. Link: https://lore.kernel.org/r/20220513081647.1631141-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-12RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx()Wenpeng Liang4-254/+104
To reduce the code size and make the code clearer, replace all roce_get_xxx() with hr_reg_read() to read the data fields. Link: https://lore.kernel.org/r/20220512080012.38728-3-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-12RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx()Wenpeng Liang2-270/+157
To reduce the code size and make the code clearer, replace all roce_set_xxx() with hr_reg_xxx() to write the data fields. Link: https://lore.kernel.org/r/20220512080012.38728-2-liangwenpeng@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-11RDMA/irdma: Add SW mechanism to generate completions on errorMustafa Ismail4-37/+210
HW flushes after QP in error state is not reliable. This can lead to application hang waiting on a completion for outstanding WRs. Implement a SW mechanism to generate completions for any outstanding WR's after the QP is modified to error. This is accomplished by starting a delayed worker after the QP is modified to error and the HW flush is performed. The worker will generate completions that will be returned to the application when it polls the CQ. This mechanism only applies to Kernel applications. Link: https://lore.kernel.org/r/20220425181624.1617-1-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-11RDMA/siw: Enable siw on tunnel devicesBernard Metzler1-2/+3
Enable siw to attach to tunnel devices, there is no reason not to, siw properly generates all packets already. Link: https://lore.kernel.org/r/20220510143917.23735-1-bmt@zurich.ibm.com Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-10scsi: target: iscsi: Rename iscsi_conn to iscsit_connMax Gurtovoy2-33/+33
The structure iscsi_conn naming is used by the iSCSI initiator driver. Rename the target conn to iscsit_conn to have more readable code. Link: https://lore.kernel.org/r/20220428092939.36768-2-mgurtovoy@nvidia.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-10scsi: target: iscsi: Rename iscsi_cmd to iscsit_cmdMax Gurtovoy2-33/+33
The structure iscsi_cmd naming is used by the iSCSI initiator driver. Rename the target cmd to iscsit_cmd to have more readable code. Link: https://lore.kernel.org/r/20220428092939.36768-1-mgurtovoy@nvidia.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-09net/mlx5: Lag, expose number of lag portsMark Bloch4-2/+4
Downstream patches will add support for hardware lag with more than 2 ports. Add a way for users to query the number of lag ports. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09RDMA/rxe: Enforce IBA C11-17Bob Pearson3-0/+17
Add a counter to keep track of the number of WQs connected to a CQ and return an error if destroy_cq() is called while the counter is non zero. Link: https://lore.kernel.org/r/20220421014042.26985-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-09RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup()Bob Pearson2-29/+29
Move code from rxe_dealloc_mw() to rxe_mw_cleanup() to allow flows which hold a reference to mw to complete. Link: https://lore.kernel.org/r/20220421014042.26985-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-09RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup()Bob Pearson1-6/+4
Move the code which tears down an mr to rxe_mr_cleanup to allow operations holding a reference to the mr to complete. Link: https://lore.kernel.org/r/20220421014042.26985-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-09RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup()Bob Pearson3-10/+4
Move the code from rxe_qp_destroy() to rxe_qp_do_cleanup(). This allows flows holding references to qp to complete before the qp object is torn down. Link: https://lore.kernel.org/r/20220421014042.26985-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-09RDMA/rxe: Check rxe_get() return valueBob Pearson3-3/+6
In the tasklets (completer, responder, and requester) check the return value from rxe_get() to detect failures to get a reference. This only occurs if the qp has had its reference count drop to zero which indicates that it no longer should be used. The ref is never 0 today because the tasklets are flushed before the ref is dropped. The next patch changes this so that the ref is dropped then the tasklets are flushed. Link: https://lore.kernel.org/r/20220421014042.26985-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-09RDMA/rxe: Add rxe_srq_cleanup()Bob Pearson4-22/+25
Move cleanup code from rxe_destroy_srq() to rxe_srq_cleanup() which is called after all references are dropped to allow code depending on the srq object to complete. Link: https://lore.kernel.org/r/20220421014042.26985-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-06RDMA/rxe: Remove IB_SRQ_INIT_MASKBob Pearson3-57/+74
Currently the #define IB_SRQ_INIT_MASK is used to distinguish the rxe_create_srq verb from the rxe_modify_srq verb so that some code can be shared between these two subroutines. This commit splits rxe_srq_chk_attr into two subroutines: rxe_srq_chk_init and rxe_srq_chk_attr which handle the create_srq and modify_srq verbs separately. Link: https://lore.kernel.org/r/20220421014042.26985-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-06RDMA/rxe: Skip adjusting remote addr for write in retry operationChengguang Xu1-2/+0
For write request the remote addr will be sent only with first packet so we don't have to adjust wqe->iova in retry operation. Link: https://lore.kernel.org/r/20220502053907.6388-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/rxe: Optimize the mr pool structZhu Yanjun2-11/+3
Based on the commit c9f4c695835c ("RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC"), only the mr pool uses the RXE_POOL_ALLOC, As such, replace this flags with pool type to save memory. Link: https://lore.kernel.org/r/20220428041028.1363139-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/hns: Remove the num_cqc_timer variableYixing Liu4-5/+3
The bt number of cqc_timer of HIP09 increases compared with that of HIP08. Therefore, cqc_timer_bt_num and num_cqc_timer do not match. As a result, the driver may fail to allocate cqc_timer. So the driver needs to uniquely uses cqc_timer_bt_num to represent the bt number of cqc_timer. Fixes: 0e40dc2f70cd ("RDMA/hns: Add timer allocation support for hip08") Link: https://lore.kernel.org/r/20220429093545.58070-1-liangwenpeng@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/hns: Add the detection for CMDQ status in the device initialization processYangyang Li2-0/+27
CMDQ may fail during HNS ROCEE initialization. The following is the log when the execution fails: hns3 0000:bd:00.2: In reset process RoCE client reinit. hns3 0000:bd:00.2: CMDQ move tail from 840 to 839 hns3 0000:bd:00.2 hns_2: failed to set gid, ret = -11! hns3 0000:bd:00.2: CMDQ move tail from 840 to 839 <...> hns3 0000:bd:00.2: CMDQ move tail from 840 to 839 hns3 0000:bd:00.2: CMDQ move tail from 840 to 0 hns3 0000:bd:00.2: [cmd]token 14e mailbox 20 timeout. hns3 0000:bd:00.2 hns_2: set HEM step 0 failed! hns3 0000:bd:00.2 hns_2: set HEM address to HW failed! hns3 0000:bd:00.2 hns_2: failed to alloc mtpt, ret = -16. infiniband hns_2: Couldn't create ib_mad PD infiniband hns_2: Couldn't open port 1 hns3 0000:bd:00.2: Reset done, RoCE client reinit finished. However, even if ib_mad client registration failed, ib_register_device() still returns success to the driver. In the device initialization process, CMDQ execution fails because HW/FW is abnormal. Therefore, if CMDQ fails, the initialization function should set CMDQ to a fatal error state and return a failure to the caller. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/20220429093104.26687-1-liangwenpeng@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/rxe: Change mcg_lock to a _bh lockBob Pearson1-21/+15
rxe_mcast.c currently uses _irqsave spinlocks for rxe->mcg_lock while rxe_recv.c uses _bh spinlocks for the same lock. As there is no case where the mcg_lock can be taken from an IRQ, change these all to bh locks so we don't have confusing mismatched lock types on the same spinlock. Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/rxe: Do not call dev_mc_add/del() under a spinlockBob Pearson1-28/+23
These routines were not intended to be called under a spinlock and will throw debugging warnings: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 13 PID: 3107 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2f/0x50 CPU: 13 PID: 3107 Comm: python3 Tainted: G E 5.18.0-rc1+ #7 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:warn_bogus_irq_restore+0x2f/0x50 Call Trace: <TASK> _raw_spin_unlock_irqrestore+0x75/0x80 rxe_attach_mcast+0x304/0x480 [rdma_rxe] ib_attach_mcast+0x88/0xa0 [ib_core] ib_uverbs_attach_mcast+0x186/0x1e0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xcd/0x140 [ib_uverbs] ib_uverbs_cmd_verbs+0xdb0/0xea0 [ib_uverbs] ib_uverbs_ioctl+0xd2/0x160 [ib_uverbs] do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Move them out of the spinlock, it is OK if there is some races setting up the MC reception at the ethernet layer with rbtree lookups. Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/hns: Remove unnecessary ret variable from hns_roce_dereg_mr()Guo Zhengkui1-2/+1
Fix the following coccicheck warning: drivers/infiniband/hw/hns/hns_roce_mr.c:343:5-8: Unneeded variable: "ret". Return 0 directly instead. Link: https://lore.kernel.org/r/20220426070858.9098-1-guozhengkui@vivo.com Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/siw: Fix a condition race issue in MPA request processingCheng Xu1-3/+4
The calling of siw_cm_upcall and detaching new_cep with its listen_cep should be atomistic semantics. Otherwise siw_reject may be called in a temporary state, e,g, siw_cm_upcall is called but the new_cep->listen_cep has not being cleared. This fixes a WARN: WARNING: CPU: 7 PID: 201 at drivers/infiniband/sw/siw/siw_cm.c:255 siw_cep_put+0x125/0x130 [siw] CPU: 2 PID: 201 Comm: kworker/u16:22 Kdump: loaded Tainted: G E 5.17.0-rc7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: iw_cm_wq cm_work_handler [iw_cm] RIP: 0010:siw_cep_put+0x125/0x130 [siw] Call Trace: <TASK> siw_reject+0xac/0x180 [siw] iw_cm_reject+0x68/0xc0 [iw_cm] cm_work_handler+0x59d/0xe20 [iw_cm] process_one_work+0x1e2/0x3b0 worker_thread+0x50/0x3a0 ? rescuer_thread+0x390/0x390 kthread+0xe5/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Fixes: 6c52fdc244b5 ("rdma/siw: connection management") Link: https://lore.kernel.org/r/d528d83466c44687f3872eadcb8c184528b2e2d4.1650526554.git.chengyou@linux.alibaba.com Reported-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/rxe: Replace paylen by payloadBob Pearson1-6/+6
In finish_packet() in rxe_req.c a variable was incorrectly called paylen instead of payload. Elsewhere in the rxe source payload is always used for the RoCE payload length and paylen is always used for the UDP payload length. This will cause unnecessary confusion. Replace paylen by payload in finish_packet(). Link: https://lore.kernel.org/r/20220420172316.5465-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-02RDMA/irdma: Fix possible crash due to NULL netdev in notifierMustafa Ismail1-12/+9
For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by moving all processing to the NETEVENT_NEIGH_UPDATE case where the netdev is guaranteed to not be NULL. Fixes: 6702bc147448 ("RDMA/irdma: Fix netdev notifications for vlan's") Link: https://lore.kernel.org/r/20220425181703.1634-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-02RDMA/irdma: Reduce iWARP QP destroy timeShiraz Saleem1-6/+4
QP destroy is synchronous and waits for its refcnt to be decremented in irdma_cm_node_free_cb (for iWARP) which fires after the RCU grace period elapses. Applications running a large number of connections are exposed to high wait times on destroy QP for events like SIGABORT. The long pole for this wait time is the firing of the call_rcu callback during a CM node destroy which can be slow. It holds the QP reference count and blocks the destroy QP from completing. call_rcu only needs to make sure that list walkers have a reference to the cm_node object before freeing it and thus need to wait for grace period elapse. The rest of the connection teardown in irdma_cm_node_free_cb is moved out of the grace period wait in irdma_destroy_connection. Also, replace call_rcu with a simple kfree_rcu as it just needs to do a kfree on the cm_node Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Link: https://lore.kernel.org/r/20220425181703.1634-3-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-02RDMA/irdma: Flush iWARP QP if modified to ERR from RTR stateTatyana Nikolova2-13/+7
When connection establishment fails in iWARP mode, an app can drain the QPs and hang because flush isn't issued when the QP is modified from RTR state to error. Issue a flush in this case using function irdma_cm_disconn(). Update irdma_cm_disconn() to do flush when cm_id is NULL, which is the case when the QP is in RTR state and there is an error in the connection establishment. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20220425181703.1634-2-shiraz.saleem@intel.com Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-25RDMA/core: Avoid flush_workqueue(system_unbound_wq) usageTetsuo Handa1-10/+14
Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_unbound_wq with local ib_unreg_wq. Link: https://lore.kernel.org/r/252cefb0-a400-83f6-2032-333d69f52c1b@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-25RDMA/rxe: Remove useless parameters for update_state()Li Zhijian1-3/+2
wqe was not used by update_state() so far. Commit aaaf62e06623 ("RDMA/rxe: Remove useless argument for update_state()") just did a partial fixes. Link: https://lore.kernel.org/r/20220412022903.574238-1-lizhijian@fujitsu.com Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-25RDMA/mlx5: Clean UMR QP type flow from mlx5_ib_post_send()Aharon Landau4-145/+1
No internal UMR operation is using mlx5_ib_post_send(), remove the UMR QP type logic from this function. Link: https://lore.kernel.org/r/0b2f368f14bc9266ebdf92a601ca4e1e5b1e1188.1649747695.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-25RDMA/mlx5: Use mlx5_umr_post_send_wait() to update xltAharon Landau5-208/+116
Move mlx5_ib_update_mr_pas logic to umr.c, and use mlx5_umr_post_send_wait() instead of mlx5_ib_post_send_wait(). Since it is the last use of mlx5_ib_post_send_wait(), remove it. Link: https://lore.kernel.org/r/55a4972f156aba3592a2fc9bcb33e2059acf295f.1649747695.git.leonro@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>