aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-01-28RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failureLeon Romanovsky1-4/+1
Failure in XRCD FW deallocation command leaves memory leaked and returns error to the user which he can't do anything about it. This patch changes behavior to always free memory and always return success to the user. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-26IB/mthca: Fix gup usage in mthca_map_user_db()Davidlohr Bueso1-1/+1
get_user_pages() must be called with mmap_sem held, currently it is not. In fact it is called under the user db_table->mutex. To fix this we can convert gup to use the fast alternative, and safely avoid taking mmap_sem, if possible. Furthermore this is safe wrt to the mutex as other callers that take the lock (unmap and alloc_db) are not called under mmap_sem (hence possible deadlock). Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-25RDMA/qedr: lower print level of flushed CQEsKalderon, Michal1-3/+3
There are races where can still get flush on CQEs before the QP enters error state. This is not an error and should be treated as debug information. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-24RDMA/qedr: Use pci_enable_atomic_ops_to_root()Felix Kuehling1-51/+8
Use PCI core interface pci_enable_atomic_ops_to_root() to enable atomic capability. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michal Kalderon <michal.kalderon@cavium.com> CC: Ram Amrani <Ram.Amrani@cavium.com> CC: Doug Ledford <dledford@redhat.com>
2018-01-19RDMA/mlx5: Remove redundant allocation warning printLeon Romanovsky1-11/+8
The kmalloc() failure to allocate memory generates enough information and doesn't need to be accompanied by another driver print. Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-18RDMA/hns: Remove unnecessary platform_get_resource() error checkweiyongjun (A)1-4/+0
devm_ioremap_resource() already checks if the resource is NULL, so remove the unnecessary platform_get_resource() error check. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18IB/mlx5: Mmap the HCA's clock info to user-spaceFeras Daoud2-12/+42
This patch maps the new page to user space applications to allow converting a user space completion timestamp to system wall time at the lowest possible latency cost. By using a versioning scheme we allow compatibility between current and future userspace libraries. The change moves mlx5_ib_mmap_cmd enum from mlx5_ib.h to the abi header file mlx5-abi.h. Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Eitan Rabin <rabin@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18RDMA/bnxt_re: Add SRQ support for Broadcom adaptersDevesh Sharma7-98/+816
Shared receive queue (SRQ) is defined as a pool of receive buffers shared among multiple QPs which belong to same protection domain in a given process context. Use of SRQ reduces the memory foot print of IB applications. Broadcom adapters support SRQ, adding code-changes to enable shared receive queue. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18RDMA/bnxt_re: expose detailed stats retrieved from HWSelvin Xavier7-10/+417
Broadcom's adapter supports more granular statistics to allow better understanding about the state of the chip when data traffic is flowing. Exposing the detailed stats to the consumer through the standard hook available in the kverbs interface. In order to retrieve all the information, driver implements a firmware command. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-18RDMA/bnxt_re: Add support for MRs with Huge pagesSomnath Kotur5-57/+140
Depending on the OS page-table configurations, applications may request MRs which has page size alignment other than 4K Underlying provider driver needs to adjust its PBL boundaries according to the incoming page boundaries in the PA list. Adding a capability to register MRs having pages-sizes other than 4K (Hugepages). Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-17RDMA/bnxt_re: Add support for query firmware versionSelvin Xavier6-15/+39
The device now reports firmware version thus, removing the hard coded values of the FW version string and redundant fw_rev hook from sysfs. Adding code to query firmware version from underlying device and report it through the kernel verb to get firmware version string. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-17RDMA/bnxt_re: Enable RoCE on virtual functionsSelvin Xavier6-37/+182
RoCE can be used by virtual functions (VFs) as well. Adding code changes to allow resource reservation, initialization and avail the resources to the RDMA applications running on those VFs. Currently, fifty percent of the total available resources are reserved for PF and remaining are equally divided among active VFs. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-10/+12
Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16i40iw: Free IEQ resourcesMustafa Ismail3-2/+3
The iWARP Exception Queue (IEQ) resources are not freed when a QP is destroyed. Fix this by freeing IEQ resources when freeing QP resources. Fixes: d37498417947 ("i40iw: add files for iwarp interface") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16i40iw: Remove setting of rem_addr.lenMustafa Ismail1-3/+0
Remove setting of rem_addr.len before calling iw_rdma_write, iw_inline_rdma_write and rdma_read. rem_addr.len is not used in those functions. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16i40iw: Remove limit on re-posting AEQ entries to HWSindhu Devale2-3/+0
Currently, if the number of processed Asynchronous Event Queue (AEQ) entries exceeds 255, they are not returned to HW for re-use. During scale-up, the unreturned AEQ entries can grow to the max AEQ size and cause the HW to report an AEQ overflow. Remove the check which limits the number of processed AEQ entries returned to HW. Fixes: 86dbcd0f12e9 ("RDMA/i40iw: add file to handle cqp calls") Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16i40iw: Zero-out consumer key on allocate stag for FMRShiraz Saleem1-0/+1
If the application invalidates the MR before the FMR WR, HW parses the consumer key portion of the stag and returns an invalid stag key Asynchronous Event (AE) that tears down the QP. Fix this by zeroing-out the consumer key portion of the allocated stag returned to application for FMR. Fixes: ee855d3b93f3 ("RDMA/i40iw: Add base memory management extensions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16i40iw: Remove extra call to i40iw_est_sd()Shiraz Saleem1-2/+0
Remove redundant estimate SD function call. sd_needed should already be updated at the end of the do while resource reduction loop. Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Set the guid for hip08 RoCE deviceoulijun1-0/+4
This patch assign a guid(Global Unique identifer) value to the hip08 device. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Update the verbs of polling for completionoulijun2-1/+13
If the port is a RoCEv2 port, the remote port address and QP information which returned for UD will be modified. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Assign zero for pkey_index of wc in hip08oulijun1-0/+3
Because pkey is fixed for hip08 RoCE, it needs to assign zero for pkey_index of wc. otherwise, it will happen an error when establishing connection by communication management mechanism. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Fill sq wqe context of ud type in hip08oulijun2-145/+385
This patch mainly configure the fields of sq wqe of ud type when posting wr of gsi qp type. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Add gsi qp support for modifying qp in hip08oulijun2-16/+49
It needs to Assign the values for some fields in qp context when qp type is gsi qp type in hip08. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Create gsi qp in hip08oulijun1-2/+16
The gsi qp and rc qp use the same qp context structure and the created flow, only differentiate them by qpn and qp type. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16RDMA/hns: Assign the correct value for tx_cqnoulijun1-1/+1
When modifying qp from init to init, it need to assign the cqn of send cq for tx cqn field of qp context. Otherwise, it will cause a mistake when the send and recv cq sizes are different. Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2-6/+5
Pull rdma fixes from Doug Ledford: "We had a few more items creep up over the last week. Given we are in -rc8, these are obviously limited to bugs that have a big downside and for which we are certain of the fix. The first is a straight up oops bug that all you have to do is read the code to see it's a guaranteed 100% oops bug. The second is a use-after-free issue. We get away lucky if the queue we are shutting down is empty, but if it isn't, we can end up oopsing. We really need to drain the queue before destroying it. The final one is an issue with bad user input causing us to access our port array out of bounds. While fixing the array out of bounds issue, it was noticed that the original code did the same thing twice (the call to rdma_ah_set_port_num()), so its removal is not balanced by a readd elsewhere, it was already where it needed to be in addition to where it didn't need to be. Summary: - Oops fix in hfi1 driver - use-after-free issue in iser-target - use of user supplied array index without proper checking" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Fix out-of-bound access while querying AH IB/hfi1: Prevent a NULL dereference iser-target: Fix possible use-after-free in connection establishment error
2018-01-15IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH portsJack Morgenstein1-8/+5
Allocating steerable UD QPs depends on having at least one IB port, while releasing those QPs does not. As a result, when there are only ETH ports, the IB (RoCE) driver requests releasing a qp range whose base qp is zero, with qp count zero. When SR-IOV is enabled, and the VF driver is running on a VM over a hypervisor which treats such qp release calls as errors (rather than NOPs), we see lines in the VM message log like: mlx4_core 0002:00:02.0: Failed to release qp range base:0 cnt:0 Fix this by adding a check for a zero count in mlx4_release_qp_range() (which thus treats releasing 0 qps as a nop), and eliminating the check for device managed flow steering when releasing steerable UD QPs. (Freeing ib_uc_qpns_bitmap unconditionally is also OK, since it remains NULL when steerable UD QPs are not allocated). Cc: <stable@vger.kernel.org> Fixes: 4196670be786 ("IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-15RDMA/qedr: Fix endian problems around imm_dataJason Gunthorpe1-2/+2
The double swap matches what user space rdma-core does to imm_data. wc->imm_data is not used in the kernel so this change has no practical impact. Acked-by: Michal Kalderon <michal.kalderon@cavium.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-15RDMA/hns: Fix endian problems around imm_data and rkeyJason Gunthorpe3-7/+11
This matches the changes made recently to the userspace hns driver when it was made sparse clean. See rdma-core commit bffd380cfe56 ("libhns: Make the provider sparse clean") wc->imm_data is not used in the kernel so this change has no practical impact. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-15RDMA/mlx5: Fix out-of-bound access while querying AHLeon Romanovsky1-4/+3
The rdma_ah_find_type() accesses the port array based on an index controlled by userspace. The existing bounds check is after the first use of the index, so userspace can generate an out of bounds access, as shown by the KASN report below. ================================================================== BUG: KASAN: slab-out-of-bounds in to_rdma_ah_attr+0xa8/0x3b0 Read of size 4 at addr ffff880019ae2268 by task ibv_rc_pingpong/409 CPU: 0 PID: 409 Comm: ibv_rc_pingpong Not tainted 4.15.0-rc2-00031-gb60a3faf5b83-dirty #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 Call Trace: dump_stack+0xe9/0x18f print_address_description+0xa2/0x350 kasan_report+0x3a5/0x400 to_rdma_ah_attr+0xa8/0x3b0 mlx5_ib_query_qp+0xd35/0x1330 ib_query_qp+0x8a/0xb0 ib_uverbs_query_qp+0x237/0x7f0 ib_uverbs_write+0x617/0xd80 __vfs_write+0xf7/0x500 vfs_write+0x149/0x310 SyS_write+0xca/0x190 entry_SYSCALL_64_fastpath+0x18/0x85 RIP: 0033:0x7fe9c7a275a0 RSP: 002b:00007ffee5498738 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007fe9c7ce4b00 RCX: 00007fe9c7a275a0 RDX: 0000000000000018 RSI: 00007ffee5498800 RDI: 0000000000000003 RBP: 000055d0c8d3f010 R08: 00007ffee5498800 R09: 0000000000000018 R10: 00000000000000ba R11: 0000000000000246 R12: 0000000000008000 R13: 0000000000004fb0 R14: 000055d0c8d3f050 R15: 00007ffee5498560 Allocated by task 1: __kmalloc+0x3f9/0x430 alloc_mad_private+0x25/0x50 ib_mad_post_receive_mads+0x204/0xa60 ib_mad_init_device+0xa59/0x1020 ib_register_device+0x83a/0xbc0 mlx5_ib_add+0x50e/0x5c0 mlx5_add_device+0x142/0x410 mlx5_register_interface+0x18f/0x210 mlx5_ib_init+0x56/0x63 do_one_initcall+0x15b/0x270 kernel_init_freeable+0x2d8/0x3d0 kernel_init+0x14/0x190 ret_from_fork+0x24/0x30 Freed by task 0: (stack is not available) The buggy address belongs to the object at ffff880019ae2000 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 104 bytes to the right of 512-byte region [ffff880019ae2000, ffff880019ae2200) The buggy address belongs to the page: page:000000005d674e18 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 0000000000000000 0000000000000000 00000001000c000c raw: dead000000000100 dead000000000200 ffff88001a402000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880019ae2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880019ae2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc >ffff880019ae2200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880019ae2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880019ae2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Disabling lock debugging due to kernel taint Cc: <stable@vger.kernel.org> Fixes: 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-12net/mlx5: Fix mlx5_get_uars_page to return error codeEran Ben Elisha1-1/+1
Change mlx5_get_uars_page to return ERR_PTR in case of allocation failure. Change all callers accordingly to check the IS_ERR(ptr) instead of NULL. Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-12{net,ib}/mlx5: Don't disable local loopback multicast traffic when neededEran Ben Elisha1-3/+6
There are systems platform information management interfaces (such as HOST2BMC) for which we cannot disable local loopback multicast traffic. Separate disable_local_lb_mc and disable_local_lb_uc capability bits so driver will not disable multicast loopback traffic if not supported. (It is expected that Firmware will not set disable_local_lb_mc if HOST2BMC is running for example.) Function mlx5_nic_vport_update_local_lb will do best effort to disable/enable UC/MC loopback traffic and return success only in case it succeeded to changed all allowed by Firmware. Adapt mlx5_ib and mlx5e to support the new cap bits. Fixes: 2c43c5a036be ("net/mlx5e: Enable local loopback in loopback selftest") Fixes: c85023e153e3 ("IB/mlx5: Add raw ethernet local loopback support") Fixes: bded747bb432 ("net/mlx5: Add raw ethernet local loopback firmware command") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-10IB/mlx5: remove redundant assignment of mdevColin Ian King1-1/+1
The initial assignment to mdev is redundant as mdev is re-assigned later and the first assigned value is never read. Remove this redundant assignment. Cleans up clang warning: drivers/infiniband/hw/mlx5/main.c:359:24: warning: Value stored to 'mdev' during its initialization is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-10IB/hfi1: Prevent a NULL dereferenceDan Carpenter1-2/+2
In the original code, we set "fd->uctxt" to NULL and then dereference it which will cause an Oops. Fixes: f2a3bc00a03c ("IB/hfi1: Protect context array set/clear with spinlock") Cc: <stable@vger.kernel.org> # 4.14.x Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
2018-01-08Merge branch 'bart-srpt-for-next' into k.o/wip/dl-for-nextDoug Ledford1-1/+1
Merging in 12 patch series from Bart that required changes in the current for-rc branch in order to apply cleanly. Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-08IB/mlx5: Don't advertise RAW QP support in dual port modeDaniel Jurgens1-7/+14
When operating in dual port RoCE mode FW doesn't support steering for raw QPs on the slave port. They still work on the master port, but the user has no way of knowing which port is the master. The capability is reported per device, not per port. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Route MADs for dual port RoCEDaniel Jurgens1-7/+14
Route performance query MADs to the correct mlx5_core_dev when using dual port RoCE mode. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08{net, IB}/mlx5: Change set_roce_gid to take a port numberDaniel Jurgens1-1/+1
When in dual port mode setting a RoCE GID for any port flows through the master ports mlx5_core_dev. Provide an interface to set the port when sending this command. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Update counter implementation for dual port RoCEDaniel Jurgens2-28/+42
Update the counter interface for multiple ports. Some counter sets always comes from the primary device. Port specific counters should be accessed per mlx5_core_dev not always through the IB master mdev. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Change debugfs to have per port contentsParav Pandit3-29/+73
When there are multiple ports for single IB(RoCE) device, support debugfs entries to be available for each port. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Implement dual port functionality in query routinesDaniel Jurgens1-17/+85
Port operations must be routed to their native mlx5_core_dev. A multiport RoCE device registers itself as having 2 ports even before a 2nd port is affiliated. If an unaffilated port is queried use capability information from the master port, these values are the same. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Move IB event processing onto a workqueueDaniel Jurgens1-13/+56
Because mlx5_ib_event can be called from atomic context move event handling onto a workqueue. A mutex lock is required to get the IB device for slave ports, so move event processing onto a work queue. When an IB event is received, check if the mlx5_core_dev is a slave port, if so attempt to get the IB device it's affiliated with. If found process the event for that device, otherwise return. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08{net, IB}/mlx5: Manage port association for multiport RoCEDaniel Jurgens2-25/+424
When mlx5_ib_add is called determine if the mlx5 core device being added is capable of dual port RoCE operation. If it is, determine whether it is a master device or a slave device using the num_vhca_ports and affiliate_nic_vport_criteria capabilities. If the device is a slave, attempt to find a master device to affiliate it with. Devices that can be affiliated will share a system image guid. If none are found place it on a list of unaffiliated ports. If a master is found bind the port to it by configuring the port affiliation in the NIC vport context. Similarly when mlx5_ib_remove is called determine the port type. If it's a slave port, unaffiliate it from the master device, otherwise just remove it from the unaffiliated port list. The IB device is registered as a multiport device, even if a 2nd port is not available for affiliation. When the 2nd port is affiliated later the GID cache must be refreshed in order to get the default GIDs for the 2nd port in the cache. Export roce_rescan_device to provide a mechanism to refresh the cache after a new port is bound. In a multiport configuration all IB object (QP, MR, PD, etc) related commands should flow through the master mlx5_core_dev, other commands must be sent to the slave port mlx5_core_mdev, an interface is provide to get the correct mdev for non IB object commands. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Make netdev notifications multiport capableDaniel Jurgens3-37/+55
When multiple RoCE ports are supported registration for events on multiple netdevs is required. Refactor the event registration and handling to support multiple ports. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Reduce the use of num_port capabilityDaniel Jurgens3-12/+11
Remove use of the num_ports general capability throughout. The number of ports will be variable in the future, and reported in a different way. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Add support for DC target QPMoni Shoua3-3/+212
A DC Target (DCT) QP is represented in the hardware as a unique object. This object is created by CREATE_DCT command and destroyed by DESTROY_DCT command. However, in the driver we describe it as a QP. The hardware command that creates a DCT needs parameters that the verb create_qp() does not provide. Those remaining parameters are provided with the call to the verb modify_qp(). Therefore we delay the actual creation of a DCT in the hardware until the stage of modify_qp() to RTR. A support for query_qp() was added as well. It uses QUERY_DCT command to retrieve the applicable fields. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Add support for DC Initiator QPMoni Shoua1-5/+72
DC Initiator (DCI) QP is represented like any other QP in the hardware. However, like any other transport QP there are attributes and settings that are special to DCI QP and needs specific attention and care. Make necessary changes to configure DCI QP. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-08IB/mlx5: Handle type IB_QPT_DRIVER when creating a QPMoni Shoua2-1/+112
The QP type IB_QPT_DRIVER doesn't describe the transport or the service that the QP provides but those are known only to the hardware driver. The actual type of the QP is stored in the hardware driver context (i.e. mlx5_qp) under the field qp_sub_type. Take the real QP type and any extra data that is required to create the QP from the driver channel and modify the QP initial attributes before continuing with create_qp(). Downstream patches from this series will add support for both DCI and DCT driver QPs. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-05bnxt_re: report RoCE device support at info levelJonathan Toppins1-1/+1
Reporting that a device doesn't support RoCE seems like a valuable piece of information to have when trying to determine why a driver is not binding to a device. Better to report this at info log level instead of requiring a user to enable all debug messages in the driver. Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-By: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>