aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband (follow)
AgeCommit message (Collapse)AuthorFilesLines
2020-01-25RDMA/cm: Use IBA functions for simple get/set acessorsJason Gunthorpe2-509/+104
Use a Coccinelle spatch to replace CM helper functions with IBA_GET/SET versions. Applied with $ spatch --sp-file edits.sp --in-place drivers/infiniband/core/cm.c The spatch file was generated using the template pattern: @@ expression val; {struct} *msg; @@ - {old_setter} + IBA_SET({new_name}, msg, val) @@ {struct} *msg; @@ - {old_getter} + IBA_GET({new_name}, msg) Iterated for every IBA_CHECK_GET()/IBA_CHECK_GET() pairing. Touched up with clang-format after. Link: https://lore.kernel.org/r/20200116170037.30109-4-jgg@ziepe.ca Tested-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25RDMA/cm: Add SET/GET implementations to hide IBA wire formatLeon Romanovsky2-0/+268
There is no separation between RDMA-CM wire format as it is declared in IBTA and kernel logic which implements needed support. Such situation causes to many mistakes in conversion between big-endian (wire format) and CPU format used by kernel. It also mixes RDMA core code with combination of uXX and beXX variables. The idea that all accesses to IBA definitions will go through special GET/SET macros to ensure that no conversion mistakes are made. The shifting and masking required to read the value is automatically deduced using the field offset description from the tables in the IBA specification. This starts with the CM MADs described in IBTA release 1.3 volume 1. To confirm that the new macros behave the same as the old accessors a self-test is included in this patch. Each macro replacing a straightforward struct field compile-time tests that the new field has the same offsetof() and width as the old field. For the fields with accessor functions a runtime test, the 'all ones' value is placed in a dummy message and read back in several ways to confirm that both approaches give identical results. Later patches in this series delete the self test. This creates a tested table of new field name, old field name(s) and some meta information like BE coding for the functions which will be used in the next patches. Link: https://lore.kernel.org/r/20200116170037.30109-3-jgg@ziepe.ca Link: https://lore.kernel.org/r/20191212093830.316934-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25RDMA/cm: Add accessors for CM_REQ transport_typeJason Gunthorpe1-12/+29
Access the two fields through wrappers, like all other fields, to make it clearer what is happening. Link: https://lore.kernel.org/r/20200116170037.30109-2-jgg@ziepe.ca Tested-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25IB/mlx5: Return the administrative GUID if existsDanit Goldberg1-16/+12
A user can change the operational GUID (a.k.a affective GUID) through link/infiniband. Therefore it is preferred to return the currently set GUID if it exists instead of the operational. This way the PF can query which VF GUID will be set in the next bind. In order to align with MAC address, zero is returned if administrative GUID is not set. For example, before setting administrative GUID: $ ip link show ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, NODE_GUID 00:00:00:00:00:00:00:00, PORT_GUID 00:00:00:00:00:00:00:00, link-state auto, trust off, query_rss off Then: $ ip link set ib0 vf 0 node_guid 11:00:af:21:cb:05:11:00 $ ip link set ib0 vf 0 port_guid 22:11:af:21:cb:05:11:00 After setting administrative GUID: $ ip link show ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 4092 qdisc mq state UP mode DEFAULT group default qlen 256 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff vf 0 link/infiniband 00:00:00:08:fe:80:00:00:00:00:00:00:52:54:00:c0:fe:12:34:55 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof checking off, NODE_GUID 11:00:af:21:cb:05:11:00, PORT_GUID 22:11:af:21:cb:05:11:00, link-state auto, trust off, query_rss off Fixes: 9c0015ef0928 ("IB/mlx5: Implement callbacks for getting VFs GUID attributes") Link: https://lore.kernel.org/r/20200116120048.12744-1-leon@kernel.org Signed-off-by: Danit Goldberg <danitg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-25RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fenceJason Gunthorpe1-0/+2
The set of entry->driver_removed is missing locking, protect it with xa_lock() which is held by the only reader. Otherwise readers may continue to see driver_removed = false after rdma_user_mmap_entry_remove() returns and may continue to try and establish new mmaps. Fixes: 3411f9f01b76 ("RDMA/core: Create mmap database and cookie helper functions") Link: https://lore.kernel.org/r/20200115202041.GA17199@ziepe.ca Reviewed-by: Gal Pressman <galpress@amazon.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21Merge tag 'rds-odp-for-5.5' into rdma.git for-nextJason Gunthorpe38-209/+317
From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma Leon Romanovsky says: ==================== Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. ==================== * tag 'rds-odp-for-5.5': net/rds: Use prefetch for On-Demand-Paging MR net/rds: Handle ODP mr registration/unregistration net/rds: Detect need of On-Demand-Paging memory registration RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs RDMA/mlx5: Don't fake udata for kernel path IB/mlx5: Add ODP WQE handlers for kernel QPs IB/core: Add interface to advise_mr for kernel users IB/core: Introduce ib_reg_user_mr IB: Allow calls to ib_umem_get from kernel ULPs Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21Merge tag 'rds-odp-for-5.5' of https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdmaDavid S. Miller35-193/+290
Leon Romanovsky says: ==================== Use ODP MRs for kernel ULPs The following series extends MR creation routines to allow creation of user MRs through kernel ULPs as a proxy. The immediate use case is to allow RDS to work over FS-DAX, which requires ODP (on-demand-paging) MRs to be created and such MRs were not possible to create prior this series. The first part of this patchset extends RDMA to have special verb ib_reg_user_mr(). The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP acts as an agent for the userspace application. The second part provides advise MR functionality for ULPs. This is integral part of ODP flows and used to trigger pagefaults in advance to prepare memory before running working set. The third part is actual user of those in-kernel APIs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-21scsi: RDMA/isert: Fix a recently introduced regression related to logoutBart Van Assche1-12/+0
iscsit_close_connection() calls isert_wait_conn(). Due to commit e9d3009cb936 both functions call target_wait_for_sess_cmds() although that last function should be called only once. Fix this by removing the target_wait_for_sess_cmds() call from isert_wait_conn() and by only calling isert_wait_conn() after target_wait_for_sess_cmds(). Fixes: e9d3009cb936 ("scsi: target: iscsi: Wait for all commands to finish before freeing a session"). Link: https://lore.kernel.org/r/20200116044737.19507-1-bvanassche@acm.org Reported-by: Rahul Kundu <rahul.kundu@chelsio.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-20Merge tag 'v5.5-rc7' into perf/core, to pick up fixesIngo Molnar5-16/+27
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-01-19Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller5-16/+27
2020-01-16Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxSaeed Mahameed1-4/+6
This merge syncs with mlx5-next latest HW bits and layout updates for next features, in addition one patch that improves mlx5_create_auto_grouped_flow_table() API across all mlx5 users. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Refactor mlx5_create_auto_grouped_flow_table net/mlx5e: Add discard counters per priority net/mlx5e: Expose FEC feilds and related capability bit net/mlx5: Add mlx5_ifc definitions for connection tracking support net/mlx5: Add copy header action struct layout net/mlx5: Expose resource dump register mapping net/mlx5: Add structures and defines for MIRC register net/mlx5: Read MCAM register groups 1 and 2 net/mlx5: Add structures layout for new MCAM access reg groups net/mlx5: Expose vDPA emulation device capabilities net/mlx5: Add Virtio Emulation related device capabilities Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16net/mlx5: Refactor mlx5_create_auto_grouped_flow_tablePaul Blakey1-4/+6
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param which already carries the max_fte, prio and flags memebers, and is used the same in similar mlx5_create_flow_table() function. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-16IB/mlx4: Fix memory leak in add_gid error flowJack Morgenstein1-4/+16
In procedure mlx4_ib_add_gid(), if the driver is unable to update the FW gid table, there is a memory leak in the driver's copy of the gid table: the gid entry's context buffer is not freed. If such an error occurs, free the entry's context buffer, and mark the entry as available (by setting its context pointer to NULL). Fixes: e26be1bfef81 ("IB/mlx4: Implement ib_device callbacks") Link: https://lore.kernel.org/r/20200115085050.73746-1-leon@kernel.org Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16IB/mlx5: Expose RoCE accelerator countersAvihai Horon1-0/+18
Introduce the following RoCE accelerator counters: * roce_adp_retrans - number of adaptive retransmission for RoCE traffic. * roce_adp_retrans_to - number of times RoCE traffic reached time out due to adaptive retransmission. * roce_slow_restart - number of times RoCE slow restart was used. * roce_slow_restart_cnps - number of times RoCE slow restart generate CNP packets. * roce_slow_restart_trans - number of times RoCE slow restart change state to slow restart. Link: https://lore.kernel.org/r/20200115145459.83280-3-leon@kernel.org Signed-off-by: Avihai Horon <avihaih@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/mlx5: Set relaxed ordering when requestedMichael Guralnik4-5/+23
Enable relaxed ordering in the mkey context when requested. As relaxed ordering is not currently supported in UMR, disable UMR usage for relaxed ordering MRs. Link: https://lore.kernel.org/r/1578506740-22188-11-git-send-email-yishaih@mellanox.com Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/core: Add the core support field to METHOD_GET_CONTEXTMichael Guralnik1-0/+8
Add the core support field to METHOD_GET_CONTEXT, this field should represent capabilities that are not device-specific. Return support for optional access flags for memory regions. User-space will use this capability to mask the optional access flags for unsupporting kernels. Link: https://lore.kernel.org/r/1578506740-22188-10-git-send-email-yishaih@mellanox.com Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/efa: Allow passing of optional access flags for MR registrationMichael Guralnik1-0/+1
As part of adding a range of optional access flags that drivers need to be able to accept, mask this range inside efa driver. This will prevent the driver from failing when an access flag from that range is passed. Link: https://lore.kernel.org/r/1578506740-22188-8-git-send-email-yishaih@mellanox.com Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/uverbs: Add ioctl command to get a device contextJason Gunthorpe5-62/+114
Allow future extensions of the get context command through the uverbs ioctl kabi. Unlike the uverbs version this does not return an async_fd as well, that has to be done with another command. Link: https://lore.kernel.org/r/1578506740-22188-5-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/core: Remove ucontext_lock from the uverbs_destry_ufile_hw() pathJason Gunthorpe2-21/+5
This lock only serializes ucontext creation. Instead of checking the ucontext_lock during destruction hold the existing hw_destroy_rwsem during creation, which is the standard pattern for object creation. The simplification of locking is needed for the next patch. Link: https://lore.kernel.org/r/1578506740-22188-4-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/core: Add UVERBS_METHOD_ASYNC_EVENT_ALLOCJason Gunthorpe2-2/+25
Allow the async FD to be allocated separately from the context. This is necessary to introduce the ioctl to create a context, as an ioctl should only ever create a single uobject at a time. If multiple async FDs are created then the first one is used to deliver affiliated events from any ib_uevent_object, with all subsequent ones will receive only unaffiliated events. Link: https://lore.kernel.org/r/1578506740-22188-3-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-16RDMA/mlx5: Fix handling of IOVA != user_va in ODP pathsJason Gunthorpe2-6/+15
Till recently it was not possible for userspace to specify a different IOVA, but with the new ibv_reg_mr_iova() library call this can be done. To compute the user_va we must compute: user_va = (iova - iova_start) + user_va_start while being cautious of overflow and other math problems. The iova is not reliably stored in the mmkey when the MR is created. Only the cached creation path (the common one) set it, so it must also be set when creating uncached MRs. Fix the weird use of iova when computing the starting page index in the MR. In the normal case, when iova == umem.address: iova & (~(BIT(page_shift) - 1)) == ALIGN_DOWN(umem.address, odp->page_size) == ib_umem_start(odp) And when iova is different using it in math with a user_va is wrong. Finally, do not allow an implicit ODP to be created with a non-zero IOVA as we have no support for that. Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16IB/mlx5: Mask out unsupported ODP capabilities for kernel QPsMoni Shoua1-0/+17
The ODP handler for WQEs in RQ or SRQ is not implented for kernel QPs. Therefore don't report support in these if query comes from a kernel user. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16RDMA/mlx5: Don't fake udata for kernel pathLeon Romanovsky1-18/+16
Kernel paths must not set udata and provide NULL pointer, instead of faking zeroed udata struct. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16IB/mlx5: Add ODP WQE handlers for kernel QPsMoni Shoua3-70/+117
One of the steps in ODP page fault handler for WQEs is to read a WQE from a QP send queue or receive queue buffer at a specific index. Since the implementation of this buffer is different between kernel and user QP the implementation of the handler needs to be aware of that and handle it in a different way. ODP for kernel MRs is currently supported only for RDMA_READ and RDMA_WRITE operations so change the handler to - read a WQE from a kernel QP send queue - fail if access to receive queue or shared receive queue is required for a kernel QP Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16IB/core: Add interface to advise_mr for kernel usersMoni Shoua1-0/+11
Allow ULPs to call advise_mr, so they can control ODP regions in the same way as user space applications. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16IB/core: Introduce ib_reg_user_mrMoni Shoua3-1/+34
Add ib_reg_user_mr() for kernel ULPs to register user MRs. The common use case that uses this function is a userspace application that allocates memory for HCA access but the responsibility to register the memory at the HCA is on an kernel ULP. This ULP that acts as an agent for the userspace application. This function is intended to be used without a user context so vendor drivers need to be aware of calling reg_user_mr() device operation with udata equal to NULL. Among all drivers, i40iw is the only driver which relies on presence of udata, so check udata existence for that driver. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16IB: Allow calls to ib_umem_get from kernel ULPsMoni Shoua32-98/+80
So far the assumption was that ib_umem_get() and ib_umem_odp_get() are called from flows that start in UVERBS and therefore has a user context. This assumption restricts flows that are initiated by ULPs and need the service that ib_umem_get() provides. This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device directly by relying on the fact that both UVERBS and ULPs sets that field correctly. Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-15IB/srp: Never use immediate data if it is disabled by a userSergey Gorenko1-1/+2
Some SRP targets that do not support specification SRP-2, put the garbage to the reserved bits of the SRP login response. The problem was not detected for a long time because the SRP initiator ignored those bits. But now one of them is used as SRP_LOGIN_RSP_IMMED_SUPP. And it causes a critical error on the target when the initiator sends immediate data. The ib_srp module has a use_imm_date parameter to enable or disable immediate data manually. But it does not help in the above case, because use_imm_date is ignored at handling the SRP login response. The problem is definitely caused by a bug on the target side, but the initiator's behavior also does not look correct. The initiator should not use immediate data if use_imm_date is disabled by a user. This commit adds an additional checking of use_imm_date at the handling of SRP login response to avoid unexpected use of immediate data. Fixes: 882981f4a411 ("RDMA/srp: Add support for immediate data") Link: https://lore.kernel.org/r/20200115133055.30232-1-sergeygo@mellanox.com Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/rxe: Compute the maximum sges and inline size based on the WQE sizeRao Shoaib1-10/+8
The SGE buffer size and max_inline data should be derived from the size of the WQE. Each value individually sets the WQE size, so compute the actual sizes based on the actual WQE size and configure the QP with the maximums. Also fix the missing return of the actual maximum capability to the caller. Link: https://lore.kernel.org/r/1578962480-17814-3-git-send-email-rao.shoaib@oracle.com Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15Introduce maximum WQE size to check limitsRao Shoaib1-1/+6
Introduce maximum WQE size to impose limits on max SGE's and inline data Link: https://lore.kernel.org/r/1578962480-17814-2-git-send-email-rao.shoaib@oracle.com Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/efa: Remove unused ucontext parameter from efa_qp_user_mmap_entries_removeGal Pressman1-7/+4
The ucontext parameter is unused, remove it. Link: https://lore.kernel.org/r/20200114085706.82229-6-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/efa: Remove {} brackets from single statement ifGal Pressman1-2/+1
The {} brackets are not needed according to the Linux coding style. Link: https://lore.kernel.org/r/20200114085706.82229-5-galpress@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas JahJah <firasj@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/efa: Device definitions documentation updatesGal Pressman1-13/+24
Various clarifications and updates to the documentation of the device definitions. No functional changes in this patch. Link: https://lore.kernel.org/r/20200114085706.82229-4-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/hns: Add support for extended atomic in userspaceJiaran Zhang2-2/+17
To support extended atomic operations including cmp & swap and fetch & add of 8 bytes, 16 bytes, 32 bytes, 64 bytes in userspace, some field in qpc should be configured. Link: https://lore.kernel.org/r/1579052546-11746-1-git-send-email-liweihang@huawei.com Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/hns: Get pf capabilities from firmwareLijun Ou2-109/+6
Get pf capabilities from firmware according to different hardwares, if it fails, all capabilities will be set with a default value. Link: https://lore.kernel.org/r/1578738761-3176-4-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/hns: Add interfaces to get pf capabilities from firmwareLijun Ou3-0/+527
pf capabilities are set by default for hip08 previously which should depends on different types of hardware. So add new interfaces to get them from firmware. Link: https://lore.kernel.org/r/1578738761-3176-3-git-send-email-liweihang@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-15RDMA/hns: Remove some redundant variables related to capabilitiesWeihang Li3-7/+1
In struct hns_roce_caps, max_srq_sg and max_srqwqes is unused, and max_srqs has the same effect with num_srqs. So remove them from this structrue. Link: https://lore.kernel.org/r/1578738761-3176-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Use READ_ONCE for ib_ufile.async_fileJason Gunthorpe5-26/+18
The writer for async_file holds the ucontext_lock, while the readers are left unlocked. Most readers rely on an implicit locking, either by having a uobject (which cannot be created before a context) or by holding the ib_ufile kref. However ib_uverbs_free_hw_resources() has no implicit lock and has a possible race. Make this all clear and sane by using READ_ONCE consistently. Link: https://lore.kernel.org/r/1578504126-9400-15-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Make ib_uverbs_async_event_file into a uobjectJason Gunthorpe8-143/+95
This makes async events aligned with completion events as both are full uobjects of FD type and use the same uobject lifecycle. A bunch of duplicate code is consolidated and the general flow between the two FDs is now very similar. Link: https://lore.kernel.org/r/1578504126-9400-14-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Remove the ufile arg from rdma_alloc_begin_uobjectJason Gunthorpe1-17/+19
Now that all callers provide a non-NULL attrs the ufile is redundant. Adjust things so that the context handling is done inside alloc_uobj, and the ib_uverbs_get_ucontext_file() is avoided if we already have the context. Link: https://lore.kernel.org/r/1578504126-9400-13-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Simplify type usage for ib_uverbs_async_handler()Jason Gunthorpe3-56/+34
This function works on an ib_uverbs_async_file. Accept that as a parameter instead of the struct ib_uverbs_file. Consoldiate all the callers working from an ib_uevent_object to a single function and locate the async_file directly from the struct ib_uobject instead of using context_ptr. Link: https://lore.kernel.org/r/1578504126-9400-11-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_wq.uobjectJason Gunthorpe2-7/+9
This is a struct ib_uwq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-10-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_srq.uobjectJason Gunthorpe2-8/+12
This is a struct ib_usrq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-9-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_qp.uobjectJason Gunthorpe3-16/+23
This is a struct ib_uqp_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-8-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not erase the type of ib_cq.uobjectJason Gunthorpe4-21/+31
This is a struct ib_ucq_object pointer, instead of using container_of() all over the place just store it with its actual type. Link: https://lore.kernel.org/r/1578504126-9400-7-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Make ib_ucq_object use ib_uevent_objectJason Gunthorpe4-35/+25
Any uobject that sends events into the async_event_file should be using ib_uevent_object so it can use the standard uevent based helper functions. CQ pushes events into both the async_event and the comp_channel in an open coded way. Move the async events related stuff to ib_uevent_object. Link: https://lore.kernel.org/r/1578504126-9400-6-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Do not allow alloc_commit to failJason Gunthorpe4-128/+96
This is a left over from an earlier version that creates a lot of complexity for error unwind, particularly for FD uobjects. The only reason this was done is so that anon_inode_get_file() could be called with the final fops and a fully setup uobject. Both need to be setup since unwinding anon_inode_get_file() via fput will call the driver's release(). Now that the driver does not provide release, we no longer need to worry about this complicated sequence, simply create the struct file at the start and allow the core code's release function to deal with the abort case. This allows all the confusing error paths around commit to be removed. Link: https://lore.kernel.org/r/1578504126-9400-5-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/mlx5: Simplify devx async commandsJason Gunthorpe1-14/+10
With the new FD structure the async commands do not need to hold any references while running. The existing mlx5_cmd_exec_cb() and mlx5_cmd_cleanup_async_ctx() provide enough synchronization to ensure that all outstanding commands are completed before the uobject can be destructed. Remove the now confusing get_file() and the type erasure of the devx_async_cmd_event_file. Link: https://lore.kernel.org/r/1578504126-9400-4-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/core: Simplify destruction of FD uobjectsJason Gunthorpe6-117/+88
FD uobjects have a weird split between the struct file and uobject world. Simplify this to make them pure uobjects and use a generic release method for all struct file operations. This fixes the control flow so that mlx5_cmd_cleanup_async_ctx() is always called before erasing the linked list contents to make the concurrancy simpler to understand. For this to work the uobject destruction must fence anything that it is cleaning up - the design must not rely on struct file lifetime. Only deliver_event() relies on the struct file to when adding new events to the queue, add a is_destroyed check under lock to block it. Link: https://lore.kernel.org/r/1578504126-9400-3-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-13RDMA/mlx5: Use RCU and direct refcounts to keep memory aliveJason Gunthorpe3-37/+23
dispatch_event_fd() runs from a notifier with minimal locking, and relies on RCU and a file refcount to keep the uobject and eventfd alive. As the next patch wants to remove the file_operations release function from the drivers, re-organize things so that the devx_event_notifier() path uses the existing RCU to manage the lifetime of the uobject and eventfd. Move the refcount puts to a call_rcu so that the objects are guaranteed to exist and remove the indirect file refcount. Link: https://lore.kernel.org/r/1578504126-9400-2-git-send-email-yishaih@mellanox.com Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>