aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-01-10RDMA/core: Remove err in iw_query_portGuoqing Jiang1-6/+1
Since we can return device->ops.query_port directly, so no need to keep those lines. Link: https://lore.kernel.org/r/20200109134043.15568-1-guoqing.jiang@cloud.ionos.com Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/hns: Add support for reporting wc as software modeXi Wang6-34/+252
When hardware is in resetting stage, we may can't poll back all the expected work completions as the hardware won't generate cqe anymore. This patch allows the driver to compose the expected wc instead of the hardware during resetting stage. Once the hardware finished resetting, we can poll cq from hardware again. Link: https://lore.kernel.org/r/1578572412-25756-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/hns: Bugfix for posting a wqe with sgeLijun Ou1-16/+25
Driver should first check whether the sge is valid, then fill the valid sge and the caculated total into hardware, otherwise invalid sges will cause an error. Fixes: 52e3b42a2f58 ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode") Fixes: 7bdee4158b37 ("RDMA/hns: Fill sq wqe context of ud type in hip08") Link: https://lore.kernel.org/r/1578571852-13704-1-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add RcvShortLengthErrCnt to hfi1statsMike Marciniszyn3-0/+3
This counter, RxShrErr, is required for error analysis and debug. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200106134235.119356.29123.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add software counter for ctxt0 seq dropMike Marciniszyn4-0/+14
All other code paths increment some form of drop counter. This was missed in the original implementation. Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0") Link: https://lore.kernel.org/r/20200106134228.119356.96828.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Return void in packet receiving functionsGrzegorz Andrejczuk2-25/+18
Packet receiving functions returns int value, and yet the return values are not used at all. This patch converts the functions to return void. Link: https://lore.kernel.org/r/20200106134222.119356.84098.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Decouple IRQ name from typeGrzegorz Andrejczuk2-48/+59
IRQ name was connected to IRQ type, this is not sufficient and it would be better to use name as argument to msix_request_irq instead of assigning it to variables when function is called. Index argument was required to generate name and now it can be removed. To generate name correctly helpers function were added and updated. Link: https://lore.kernel.org/r/20200106134216.119356.44478.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Create API for auto activateMike Marciniszyn1-14/+23
Add an auto activate routine for use by the interrupt handler. Link: https://lore.kernel.org/r/20200106134210.119356.43079.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: IB/hfi1: Add an API to handle special case dropMike Marciniszyn3-8/+23
This patch pushes special case drop logic into an API to be shared by all interrupt handlers. Additionally, convert do_drop to a bool. Link: https://lore.kernel.org/r/20200106134203.119356.36962.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Move common receive IRQ code to functionGrzegorz Andrejczuk1-30/+52
Tracing interrupts, incrementing interrupt counter and ASPM are part that will be reused by HFI1 receive IRQ handlers. Create common function to have shared code in one place. Link: https://lore.kernel.org/r/20200106134157.119356.32656.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Add fast and slow handlers for receive contextMike Marciniszyn4-60/+58
This patch eliminate special cases by adding a fast_handler member to the receive context and changes to the fast handler as specified in the new variable. Initialize the variable as soon as the setting for dma tail is known when the context is created. Setting fast path is called every time when any context has entered slow path. Add function to check if contexts is using fast path and do not set fast path when it is already done to improve RCD fastpath setting. Link: https://lore.kernel.org/r/20200106134150.119356.87558.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10IB/hfi1: Move chip specific functions to chip.cMike Marciniszyn3-69/+87
Move routines and defines associated with hdrq size validation to a chip specific routine since the limits are specific to the device. Fix incorrect value for min size 2 -> 32 CSR writes should also be in chip.c. Create a chip routine to write the hdrq specific CSRs and call as appropriate. Link: https://lore.kernel.org/r/20200106134144.119356.74312.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-10RDMA/core: Fix locking in ib_uverbs_event_readJason Gunthorpe1-18/+14
This should not be using ib_dev to test for disassociation, during disassociation is_closed is set under lock and the waitq is triggered. Instead check is_closed and be sure to re-obtain the lock to test the value after the wait_event returns. Fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications") Link: https://lore.kernel.org/r/1578504126-9400-12-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-09IB/core: Fix build failure without hugepagesArnd Bergmann1-2/+3
HPAGE_SHIFT is only defined on architectures that support hugepages: drivers/infiniband/core/umem_odp.c: In function 'ib_umem_odp_get': drivers/infiniband/core/umem_odp.c:245:26: error: 'HPAGE_SHIFT' undeclared (first use in this function); did you mean 'PAGE_SHIFT'? Enclose this in an #ifdef. Fixes: 9ff1b6466a29 ("IB/core: Fix ODP with IB_ACCESS_HUGETLB handling") Link: https://lore.kernel.org/r/20200109084740.2872079-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Rename event_handler_lock to qp_open_list_lockParav Pandit3-8/+8
This lock is used to protect the qp->open_list linked list. As a side effect it seems to also globally serialize the qp event_handler, but it isn't clear if that is a deliberate design. Link: https://lore.kernel.org/r/20191212113024.336702-5-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Cut down single member ib_cache structureParav Pandit2-20/+17
Given that ib_cache structure has only single member now, merge the cache lock directly in the ib_device. Link: https://lore.kernel.org/r/20191212113024.336702-4-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/core: Let IB core distribute cache update eventsParav Pandit4-72/+92
Currently when the low level driver notifies Pkey, GID, and port change events they are notified to the registered handlers in the order they are registered. IB core and other ULPs such as IPoIB are interested in GID, LID, Pkey change events. Since all GID queries done by ULPs are serviced by IB core, and the IB core deferes cache updates to a work queue, it is possible for other clients to see stale cache data when they handle their own events. For example, the below call tree shows how ipoib will call rdma_query_gid() concurrently with the update to the cache sitting in the WQ. mlx5_ib_handle_event() ib_dispatch_event() ib_cache_event() queue_work() -> slow cache update [..] ipoib_event() queue_work() [..] work handler ipoib_ib_dev_flush_light() __ipoib_ib_dev_flush() ipoib_dev_addr_changed_valid() rdma_query_gid() <- Returns old GID, cache not updated. Move all the event dispatch to a work queue so that the cache update is always done before any clients are notified. Fixes: f35faa4ba956 ("IB/core: Simplify ib_query_gid to always refer to cache") Link: https://lore.kernel.org/r/20191212113024.336702-3-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07IB/mlx5: Do reverse sequence during device removalParav Pandit1-0/+2
When IB device profile initialization completes, device is marked as active. However, IB device is not marked inactive, during device removal flow. It should be the mirror of the add flow. Hence, mark it inactive during remove sequence. Link: https://lore.kernel.org/r/20191212113024.336702-2-leon@kernel.org Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Fix coding style issuesLijun Ou2-143/+92
Fix some coding style issuses without changing logic of codes, most of the modification is unreasonable line breaks and alignments. Link: https://lore.kernel.org/r/1578313276-29080-8-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Replace custom macros HNS_ROCE_ALIGN_UPWenpeng Liang2-26/+20
HNS_ROCE_ALIGN_UP can be replaced by round_up() which is defined in kernel.h. Link: https://lore.kernel.org/r/1578313276-29080-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Remove redundant print informationYixing Liu1-1/+0
There are already necessary prints in outer function, prints in hns_roce_function_clear() may confuse users. So these prints is removed. Link: https://lore.kernel.org/r/1578313276-29080-6-git-send-email-liweihang@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Delete unnessary parameters in hns_roce_v2_qp_modify()Lijun Ou1-3/+1
Current state and new state of qp won't be configured when modifying qp, so these two redundant parameters should be removed. Link: https://lore.kernel.org/r/1578313276-29080-5-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Update the value of qp typeLijun Ou1-5/+7
The values used to represent service type of RC and UD should be interchanged according to design of hardware. And it's better to define these types in enumeration than macros. Link: https://lore.kernel.org/r/1578313276-29080-4-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Remove unused function hns_roce_init_eq_table()Lijun Ou1-1/+0
hns_roce_init_eq_table() is an unused function that only retains its declaration in driver. Link: https://lore.kernel.org/r/1578313276-29080-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/hns: Avoid printing address of mtt pageWenpeng Liang1-2/+2
Address of a page shouldn't be printed in case of security issues. Link: https://lore.kernel.org/r/1578313276-29080-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/core: Add trace points to follow MR allocationChuck Lever2-11/+151
Track the lifetime of ib_mr objects. Here's sample output from a test run with NFS/RDMA: <...>-361 [009] 79238.772782: mr_alloc: pd.id=3 mr.id=11 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772812: mr_alloc: pd.id=3 mr.id=12 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772839: mr_alloc: pd.id=3 mr.id=13 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772866: mr_alloc: pd.id=3 mr.id=14 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772893: mr_alloc: pd.id=3 mr.id=15 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772921: mr_alloc: pd.id=3 mr.id=16 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772947: mr_alloc: pd.id=3 mr.id=17 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.772974: mr_alloc: pd.id=3 mr.id=18 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773001: mr_alloc: pd.id=3 mr.id=19 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773028: mr_alloc: pd.id=3 mr.id=20 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79238.773055: mr_alloc: pd.id=3 mr.id=21 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.270942: mr_alloc: pd.id=3 mr.id=22 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.270975: mr_alloc: pd.id=3 mr.id=23 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271007: mr_alloc: pd.id=3 mr.id=24 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271036: mr_alloc: pd.id=3 mr.id=25 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271067: mr_alloc: pd.id=3 mr.id=26 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271095: mr_alloc: pd.id=3 mr.id=27 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271121: mr_alloc: pd.id=3 mr.id=28 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271153: mr_alloc: pd.id=3 mr.id=29 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271181: mr_alloc: pd.id=3 mr.id=30 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271208: mr_alloc: pd.id=3 mr.id=31 type=MEM_REG max_num_sg=30 rc=0 <...>-361 [009] 79240.271236: mr_alloc: pd.id=3 mr.id=32 type=MEM_REG max_num_sg=30 rc=0 <...>-4351 [001] 79242.299400: mr_dereg: mr.id=32 <...>-4351 [001] 79242.299467: mr_dereg: mr.id=31 <...>-4351 [001] 79242.299554: mr_dereg: mr.id=30 <...>-4351 [001] 79242.299615: mr_dereg: mr.id=29 <...>-4351 [001] 79242.299684: mr_dereg: mr.id=28 <...>-4351 [001] 79242.299748: mr_dereg: mr.id=27 <...>-4351 [001] 79242.299812: mr_dereg: mr.id=26 <...>-4351 [001] 79242.299874: mr_dereg: mr.id=25 <...>-4351 [001] 79242.299944: mr_dereg: mr.id=24 <...>-4351 [001] 79242.300009: mr_dereg: mr.id=23 <...>-4351 [001] 79242.300190: mr_dereg: mr.id=22 <...>-4351 [001] 79242.300263: mr_dereg: mr.id=21 <...>-4351 [001] 79242.300326: mr_dereg: mr.id=20 <...>-4351 [001] 79242.300388: mr_dereg: mr.id=19 <...>-4351 [001] 79242.300450: mr_dereg: mr.id=18 <...>-4351 [001] 79242.300516: mr_dereg: mr.id=17 <...>-4351 [001] 79242.300629: mr_dereg: mr.id=16 <...>-4351 [001] 79242.300718: mr_dereg: mr.id=15 <...>-4351 [001] 79242.300784: mr_dereg: mr.id=14 <...>-4351 [001] 79242.300879: mr_dereg: mr.id=13 <...>-4351 [001] 79242.300945: mr_dereg: mr.id=12 <...>-4351 [001] 79242.301012: mr_dereg: mr.id=11 Some features of the output: - The lifetime and owner PD of each MR is clearly visible. - The type of MR is captured, as is the SGE array size. - Failing MR allocation can be recorded. Link: https://lore.kernel.org/r/20191218201820.30584.34636.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/core: Trace points for diagnosing completion queue issuesChuck Lever6-4/+320
Sample trace events: kworker/u29:0-300 [007] 120.042217: cq_alloc: cq.id=4 nr_cqe=161 comp_vector=2 poll_ctx=WORKQUEUE <idle>-0 [002] 120.056292: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.056402: cq_process: cq.id=4 wake-up took 109 [us] from interrupt kworker/2:1H-482 [002] 120.056407: cq_poll: cq.id=4 requested 16, returned 1 <idle>-0 [002] 120.067503: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.067537: cq_process: cq.id=4 wake-up took 34 [us] from interrupt kworker/2:1H-482 [002] 120.067541: cq_poll: cq.id=4 requested 16, returned 1 <idle>-0 [002] 120.067657: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 120.067672: cq_process: cq.id=4 wake-up took 15 [us] from interrupt kworker/2:1H-482 [002] 120.067674: cq_poll: cq.id=4 requested 16, returned 1 ... systemd-1 [002] 122.392653: cq_schedule: cq.id=4 kworker/2:1H-482 [002] 122.392688: cq_process: cq.id=4 wake-up took 35 [us] from interrupt kworker/2:1H-482 [002] 122.392693: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.392836: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.392970: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.393083: cq_poll: cq.id=4 requested 16, returned 16 kworker/2:1H-482 [002] 122.393195: cq_poll: cq.id=4 requested 16, returned 3 Several features to note in this output: - The WCE count and context type are reported at allocation time - The CPU and kworker for each CQ is evident - The CQ's restracker ID is tagged on each trace event - CQ poll scheduling latency is measured - Details about how often single completions occur versus multiple completions are evident - The cost of the ULP's completion handler is recorded Link: https://lore.kernel.org/r/20191218201815.30584.3481.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-07RDMA/cma: Add trace points in RDMA Connection ManagerChuck Lever4-23/+475
Record state transitions as each connection is established. The IP address of both peers and the Type of Service is reported. These trace points are not in performance hot paths. Also, record each cm_event_handler call to ULPs. This eliminates the need for each ULP to add its own similar trace point in its CM event handler function. These new trace points appear in a new trace subsystem called "rdma_cma". Sample events: <...>-220 [004] 121.430733: cm_id_create: cm.id=0 <...>-472 [003] 121.430991: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ADDR_RESOLVED (0/0) <...>-472 [003] 121.430995: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-472 [003] 121.431172: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ROUTE_RESOLVED (2/0) <...>-472 [003] 121.431174: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 121.433480: cm_qp_create: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 pd.id=2 qp_type=RC send_wr=4091 recv_wr=256 qp_num=521 rc=0 <...>-220 [004] 121.433577: cm_send_req: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 kworker/1:2-973 [001] 121.436190: cm_send_mra: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436340: cm_send_rtu: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/1:2-973 [001] 121.436359: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 ESTABLISHED (9/0) kworker/1:2-973 [001] 121.436365: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-1975 [005] 123.161954: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-1975 [005] 123.161974: cm_sent_dreq: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 <...>-220 [004] 123.162102: cm_disconnect: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 kworker/0:1-13 [000] 123.162391: cm_event_handler: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 DISCONNECTED (10/0) kworker/0:1-13 [000] 123.162393: cm_event_done: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 result=0 <...>-220 [004] 123.164456: cm_qp_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 qp_num=521 <...>-220 [004] 123.165290: cm_id_destroy: cm.id=0 src=192.168.2.51:35090 dst=192.168.2.55:20049 tos=0 Some features to note: - restracker ID of the rdma_cm_id is tagged on each trace event - The source and destination IP addresses and TOS are reported - CM event upcalls are shown with decoded event and status - CM state transitions are reported - rdma_cm_id lifetime events are captured - The latency of ULP CM event handlers is reported - Lifetime events of associated QPs are reported - Device removal and insertion is reported This patch is based on previous work by: Saeed Mahameed <saeedm@mellanox.com> Mukesh Kacker <mukesh.kacker@oracle.com> Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com> Aron Silverton <aron.silverton@oracle.com> Avinash Repaka <avinash.repaka@oracle.com> Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Link: https://lore.kernel.org/r/20191218201810.30584.3052.stgit@manet.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/cm: Delete unused CM ARP functionsLeon Romanovsky2-100/+0
Clean the code by deleting ARP functions, which are not called anyway. Link: https://lore.kernel.org/r/20191212093830.316934-46-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/cm: Delete unused CM LAP functionsLeon Romanovsky2-168/+0
Clean the code by deleting LAP functions, which are not called anyway. Link: https://lore.kernel.org/r/20191212093830.316934-43-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/i40iw: fix a potential NULL pointer dereferenceXiyu Yang1-0/+2
A NULL pointer can be returned by in_dev_get(). Thus add a corresponding check so that a NULL pointer dereference will be avoided at this place. Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status") Link: https://lore.kernel.org/r/1577672668-46499-1-git-send-email-xiyuyang19@fudan.edu.cn Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/rxe: Fix error type of mmap_offsetJiewei Ke1-1/+1
The type of mmap_offset should be u64 instead of int to match the type of mminfo.offset. If otherwise, after we create several thousands of CQs, it will run into overflow issues. Link: https://lore.kernel.org/r/20191227113613.5020-1-kejiewei.cn@gmail.com Signed-off-by: Jiewei Ke <kejiewei.cn@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/mlx5: use true,false for bool variablezhengbin2-3/+3
Fixes coccicheck warning: drivers/infiniband/hw/mlx5/mr.c:150:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/mr.c:1455:2-26: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx5/qp.c:1874:6-20: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-6-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/mlx4: use true,false for bool variablezhengbin1-2/+2
Fixes coccicheck warning: drivers/infiniband/hw/mlx4/qp.c:852:2-14: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/hw/mlx4/qp.c:3087:3-10: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-5-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/iser: use true,false for bool variablezhengbin2-2/+2
Fixes coccicheck warning: drivers/infiniband/ulp/iser/iser_memory.c:530:2-21: WARNING: Assignment of 0/1 to bool variable drivers/infiniband/ulp/iser/iser_verbs.c:1096:2-21: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-4-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/hfi1: use true,false for bool variablezhengbin1-1/+1
Fixes coccicheck warning: drivers/infiniband/hw/hfi1/rc.c:2602:1-8: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-3-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/siw: use true,false for bool variablezhengbin1-1/+1
Fixes coccicheck warning: drivers/infiniband/sw/siw/siw_cm.c:32:18-41: WARNING: Assignment of 0/1 to bool variable Link: https://lore.kernel.org/r/1577176812-2238-2-git-send-email-zhengbin13@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/core: Fix ODP with IB_ACCESS_HUGETLB handlingYishai Hadas1-17/+4
As VMAs for a given range might not be available as part of the registration phase in ODP. ib_init_umem_odp() considered the expected page shift value that was previously set and initializes its internals accordingly. If memory isn't backed by physical contiguous pages aligned to a hugepage boundary an error will be set as part of the page fault flow and come back to the user as some failed RDMA operation. Fixes: 0008b84ea9af ("IB/umem: Add support to huge ODP") Link: https://lore.kernel.org/r/20191222124649.52300-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/core: Fix ODP get user pages flowYishai Hadas1-1/+1
The nr_pages argument of get_user_pages_remote() should always be in terms of the system page size, not the MR page size. Use PAGE_SIZE instead of umem_odp->page_shift. Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support") Link: https://lore.kernel.org/r/20191222124649.52300-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/mlx5: Unify ODP MR code paths to allow extra flexibilityArtemy Kovalyov4-67/+67
Building MR translation table in the ODP case requires additional flexibility, namely random access to DMA addresses. Make both direct and indirect ODP MR use same code path, separated from the non-ODP MR code path. With the restructuring the correct page_shift is now used around __mlx5_ib_populate_pas(). Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Link: https://lore.kernel.org/r/20191222124649.52300-2-leon@kernel.org Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/rdmavt: Correct comments in rdmavt_qp.h headerMike Marciniszyn2-22/+9
Comments need to be with the definition of rvt_restart_sge(). Other comments were duplicated in sw/rdmavt/rc.c and were removed. Fixes: 385156c5f2a6 ("IB/hfi: Move RC functions into a header file") Link: https://lore.kernel.org/r/20191219211934.58387.88014.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/hfi1: List all receive contexts from debugfsMichael J. Ruhl2-4/+10
The current debugfs output for receive contexts (rcds), stops after the kernel receive contexts have been displayed. This is not enough information to fully diagnose packet drops. Display all of the receive contexts. Augment the output with some more context information. Limit the ring buffer header output to 5 entries to avoid overextending the sequential file output. Fixes: bf808b5039c ("IB/hfi1: Add kernel receive context info to debugfs") Link: https://lore.kernel.org/r/20191219211928.58387.20737.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/hfi1: Add accessor API routines to access context membersMike Marciniszyn9-85/+183
This patch adds a set of accessor routines to access context members. Link: https://lore.kernel.org/r/20191219211922.58387.26548.stgit@awfm-01.aw.intel.com Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/mlx4: Redo TX checksum offload in line with docsEugene Crosser2-11/+12
Ingress checksum offload was not working for IPv6 frames because the conditional expression that checks validation status passed from the hardware was not matching the algorithm described in the documentation. This patch defines L4_CSUM flag (which falls inside the badfcs_enc field in the existing definition of the CQE layout) and replaces the conditional expression with the one defined in the "ConnectX(r) Family Programmer's Manual" document. Link: https://lore.kernel.org/r/20191219134847.413582-1-leon@kernel.org Signed-off-by: Eugene Crosser <evgenii.cherkashin@profitbricks.com> Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/cm: Use RCU synchronization mechanism to protect cm_id_private xa_load()Danit Goldberg1-30/+13
The RCU mechanism is optimized for read-mostly scenarios and therefore more suitable to protect the cm_id_private to decrease "cm.lock" congestion. This patch replaces the existing spinlock locking mechanism and kfree with RCU mechanism in places where spinlock(cm.lock) protected xa_load returning the cm_id_priv In addition, delete the cm_get_id() function as there is no longer a distinction if the caller already holds the cm_lock. Remove an open coded version of cm_get_id(). Link: https://lore.kernel.org/r/20191219134750.413429-1-leon@kernel.org Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/srpt: Remove unnecessary assertion in srpt_queue_responseAditya Pakki1-2/+0
Since ch has already been de-referenced by the time we get to the BUG_ON, it is useless. The back trace alone is enough to tell what is going on, delete the redundant BUG_ON. Link: https://lore.kernel.org/r/20191217194437.25568-1-pakki001@umn.edu Signed-off-by: Aditya Pakki <pakki001@umn.edu> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/netlink: Do not always generate an ACK for some netlink operationsHåkon Bugge2-3/+3
In rdma_nl_rcv_skb(), the local variable err is assigned the return value of the supplied callback function, which could be one of ib_nl_handle_resolve_resp(), ib_nl_handle_set_timeout(), or ib_nl_handle_ip_res_resp(). These three functions all return skb->len on success. rdma_nl_rcv_skb() is merely a copy of netlink_rcv_skb(). The callback functions used by the latter have the convention: "Returns 0 on success or a negative error code". In particular, the statement (equal for both functions): if (nlh->nlmsg_flags & NLM_F_ACK || err) implies that rdma_nl_rcv_skb() always will ack a message, independent of the NLM_F_ACK being set in nlmsg_flags or not. The fix could be to change the above statement, but it is better to keep the two *_rcv_skb() functions equal in this respect and instead change the three callback functions in the rdma subsystem to the correct convention. Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload") Link: https://lore.kernel.org/r/20191216120436.3204814-1-haakon.bugge@oracle.com Suggested-by: Mark Haywood <mark.haywood@oracle.com> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Tested-by: Mark Haywood <mark.haywood@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03IB/mlx5: Fix outstanding_pi index for GSI qpsPrabhath Sajeepa1-2/+1
Commit b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") changed the way outstanding WRs are tracked for the GSI QP. But the fix did not cover the case when a call to ib_post_send() fails and updates index to track outstanding. Since the prior commmit outstanding_pi should not be bounded otherwise the loop generate_completions() will fail. Fixes: b0ffeb537f3a ("IB/mlx5: Fix iteration overrun in GSI qps") Link: https://lore.kernel.org/r/1576195889-23527-1-git-send-email-psajeepa@purestorage.com Signed-off-by: Prabhath Sajeepa <psajeepa@purestorage.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/siw: Simplify QP representationBernard Metzler7-70/+42
Change siw_qp to contain ib_qp. Use rdma_is_kernel_res() on contained ib_qp to distinguish kernel level from user level applications resources. Apply same mechanism for kernel/user level application detection to completion queues. Link: https://lore.kernel.org/r/20191210161729.31598-1-bmt@zurich.ibm.com Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03RDMA/hns: Simplify the calculation and usage of wqe idx for post verbsYixian Liu3-48/+35
Currently, the wqe idx is calculated repeatly everywhere it is used. This patch defines wqe_idx and calculated it only once, then just use it as needed. Fixes: 2d40788825ac ("RDMA/hns: Add support for processing send wr and receive wr") Link: https://lore.kernel.org/r/1575981902-5274-1-git-send-email-liweihang@hisilicon.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>