aboutsummaryrefslogtreecommitdiffstats
path: root/include/rdma (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-07-04RDMA/uverbs: Remove UA_FLAGSJason Gunthorpe1-14/+21
This bit of boilerplate isn't really necessary, we can use bitfields instead of a flags enum and the macros can then individually initialize them through the __VA_ARGS__ like everything else. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04RDMA/uverbs: Get rid of the & in method specificationsJason Gunthorpe2-13/+24
Hide it inside the macros. The & is confusing and interferes with using this as a generic DSL in later patches. Since this also touches almost every line, also run the specs through clang-format (with 'BinPackParameters: false') to make the maintenance easier. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04RDMA/uverbs: Simplify UVERBS_OBJECT and _TREE family of macrosJason Gunthorpe2-34/+48
Instead of the large set of indirecting macros, define the few needed macros to directly instantiate the struct uverbs_oject_tree_def and associated objects list. This is small amount of code duplication but the readability is far better. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04RDMA/uverbs: Simplify method definition macrosJason Gunthorpe2-51/+40
Instead of the large set of indirecting macros, define the few needed macros to directly instantiate the struct uverbs_method_def and associated attributes list. This is small amount of code duplication but the readability is far better. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04RDMA/uverbs: Simplify UVERBS_ATTR family of macrosJason Gunthorpe1-98/+74
Instead of using a complex cascade of macros, just directly provide the initializer list each of the declarations is trying to create. Now that the macros are simplified this also reworks the uverbs_attr_spec to be friendly to older compilers by eliminating any unnamed structures/unions inside, and removing the duplication of some fields. The structure size remains at 16 bytes which was the original motivation for some of this oddness. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-04RDMA/uverbs: Store the specs_root in the struct ib_uverbs_deviceJason Gunthorpe1-1/+1
The specs are required to operate the uverbs file, so they belong inside the ib_uverbs_device, not inside the ib_device. The spec passed in the ib_device is just a communication from the driver and should not be used during runtime. This also changes the lifetime of the spec memory to match the ib_uverbs_device, however at this time the spec_root can still contain driver pointers after disassociation, so it cannot be used if ib_dev is NULL. This is preparation for another series. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-07-03include/rdma/opa_addr.h: Fix an endianness issueBart Van Assche1-1/+1
IB_MULTICAST_LID_BASE is defined as follows: #define IB_MULTICAST_LID_BASE cpu_to_be16(0xC000) Hence use be16_to_cpu() to convert it to CPU endianness. Compile-tested only. Fixes: af808ece5ce9 ("IB/SA: Check dlid before SA agent queries for ClassPortInfo") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-29IB: Improve uverbs_cleanup_ucontext algorithmYishai Hadas2-8/+49
Improve uverbs_cleanup_ucontext algorithm to work properly when the topology graph of the objects cannot be determined at compile time. This is the case with objects created via the devx interface in mlx5. Typically uverbs objects must be created in a strict topologically sorted order, so that LIFO ordering will generally cause them to be freed properly. There are only a few cases (eg memory windows) where objects can point to things out of the strict LIFO order. Instead of using an explicit ordering scheme where the HW destroy is not allowed to fail, go over the list multiple times and allow the destroy function to fail. If progress halts then a final, desperate, cleanup is done before leaking the memory. This indicates a driver bug. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-25RDMA/verbs: Drop kernel variant of destroy_flowLeon Romanovsky1-2/+0
Following the removal of ib_create_flow(), adjust the code to get rid of ib_destroy_flow() too. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-25RDMA/verbs: Drop kernel variant of create_flowLeon Romanovsky1-2/+0
There are no kernel users of this interface so lets drop it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-25RDMA/core: Remove unused ib cache functionsJason Gunthorpe1-39/+0
Now that all users have been converted to use the version of these APIs that returns a gid_attr pointer we can delete the old entry points. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25IB/cm: Replace members of sa_path_rec with 'struct sgid_attr *'Parav Pandit1-46/+3
While processing a path record entry in CM messages the associated GID attribute is now also supplied. Currently for RoCE a netdevice's net namespace pointer and ifindex are stored in path record entry. Both of these fields of the netdev can change anytime while processing CM messages. Additionally storing net namespace without holding reference will lead to use-after-free crash. Therefore it is removed. Netdevice information for RoCE is instead provided via referenced gid attribute in ib_cm requests. Such a design leads to a situation where the kernel can crash when the net pointer becomes invalid. However today it is always initialized to init_net, which cannot become invalid. In order to support processing packets in any arbitrary namespace of the received packet, it is necessary to avoid such conditions. This patch removes the dependency on the net pointer and ifindex; instead it will rely on SGID attribute which contains a pointer to netdev. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25IB/cm: Pass the sgid_attr through various eventsParav Pandit1-0/+3
Make the sgid_attr available along with path information to the event consumer, this allows the consumer to keep using the same GID table entry as the event is related to. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25IB/cm: Keep track of the sgid_attr that created the cm idParav Pandit1-0/+2
Hold reference to the the sgid_attr which is used in a cm_id until the cm_id is destroyed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-25IB: Make ib_init_ah_attr_from_wc set sgid_attrParav Pandit1-0/+7
The work completion is inspected to determine what dgid table entry was used to receieve the packet, produces a sgid_attr that matches and sticks it in the ah_attr. All callers of this function are now required to release the ah_attr on success. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-22IB/uverbs: Delete type and id from uverbs_obj_attrJason Gunthorpe1-4/+0
In this context the uobject is not allowed to be NULL, so type is the same as uobject->type, and at least for IDR, id is the same as uobject->id. FD objects should never handle the FD number outside the uAPI boundary code. Suggested-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19IB/rdmavt, IB/hfi1: Create device dependent s_flagsMike Marciniszyn1-13/+17
Move some s_flags defines out of rdmavt and into hfi1 because they are hfi1 specific and therefore should remain in the driver instead of bubbling up to rdmavt. Document device specific ranges in rdmavt and remap those in hfi1. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19IB/core: Expose ib_ucontext from a given ib_uverbs_fileYishai Hadas1-0/+2
Drivers that use the IOCTL API may have the ib_uverbs_file and need a way to get the related ib_ucontext from it, this is enabled by this patch. Downstream patches from this series will use it. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19IB/core: Introduce DECLARE_UVERBS_GLOBAL_METHODSYishai Hadas2-2/+4
Introduce a new macro to be used for global methods on a singleton object. This macros sets internally the type_attrs to be NULL as such an object can't be created. Downstream patches from this series will use this macro. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19IB/uverbs: Add a macro to define a type with no kernel known sizeMatan Barak1-0/+2
Sometimes the uverbs uAPI doesn't really care about the structure it gets from user-space. All it wants to do is to allocate enough space and send it to the hardware/provider driver. Adding a UVERBS_ATTR_MIN_SIZE that could be used for this scenarios. We use USHRT_MAX as the kernel known size to bypass any zero validations. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-19IB/uverbs: Add PTR_IN attributes that are allocated/copied automaticallyMatan Barak1-1/+35
Adding UVERBS_ATTR_SPEC_F_ALLOC_AND_COPY flag to PTR_IN attributes. By using this flag, the parse automatically allocates and copies the user-space data. This data is accessible by using uverbs_attr_get_len and uverbs_attr_get_alloced_ptr inline accessor functions from the handler. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/core: add max_send_sge and max_recv_sge attributesSteve Wise1-1/+2
This patch replaces the ib_device_attr.max_sge with max_send_sge and max_recv_sge. It allows ulps to take advantage of devices that have very different send and recv sge depths. For example cxgb4 has a max_recv_sge of 4, yet a max_send_sge of 16. Splitting out these attributes allows much more efficient use of the SQ for cxgb4 with ulps that use the RDMA_RW API. Consider a large RDMA WRITE that has 16 scattergather entries. With max_sge of 4, the ulp would send 4 WRITE WRs, but with max_sge of 16, it can be done with 1 WRITE WR. Acked-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18RDMA: Hold the sgid_attr inside the struct ib_ah/qpJason Gunthorpe1-0/+4
If the AH has a GRH then hold a reference to the sgid_attr inside the common struct. If the QP is modified with an AV that includes a GRH then also hold a reference to the sgid_attr inside the common struct. This informs the cache that the sgid_index is in-use so long as the AH or QP using it exists. This also means that all drivers can access the sgid_attr directly from the ah_attr instead of querying the cache during their UD post-send paths. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18RDMA: Convert drivers to use sgid_attr instead of sgid_indexParav Pandit1-4/+4
The core code now ensures that all driver callbacks that receive an rdma_ah_attrs will have a sgid_attr's pointer if there is a GRH present. Drivers can use this pointer instead of calling a query function with sgid_index. This simplifies the drivers and also avoids races where a gid_index lookup may return different data if it is changed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18IB{cm, core}: Introduce and use ah_attr copy, move, replace APIsJason Gunthorpe1-0/+5
Introduce AH attribute copy, move and replace APIs to be used by core and provider drivers. In CM code flow when ah attribute might be re-initialized twice while processing incoming request, or initialized once while from path record while sending out CM requests. Therefore use rdma_move_ah_attr API to handle such scenarios instead of memcpy(). Provider drivers keeps a copy ah_attr during the lifetime of the ah. Therefore, use rdma_replace_ah_attr() which conditionally release reference to old ah_attr and holds reference to new attribute whose referrence is released when the AH is freed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18IB/core: Add a sgid_attr pointer to struct rdma_ah_attrJason Gunthorpe1-0/+7
The sgid_attr will ultimately replace the sgid_index in the ah_attr. This will allow for all layers to have a consistent view of what gid table entry was selected as processing runs through all stages of the stack. This commit introduces the pointer and ensures it is set before calling any driver callback that includes a struct ah_attr callback, allowing future patches to adjust both the drivers and the callers to use sgid_attr instead of sgid_index. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-18IB: Replace ib_query_gid/ib_get_cached_gid with rdma_query_gidParav Pandit1-4/+0
If the gid_attr argument is NULL then the functions behave identically to rdma_query_gid. ib_query_gid just calls ib_get_cached_gid, so everything can be consolidated to one function. Now that all callers either use rdma_query_gid() or ib_get_cached_gid(), ib_query_gid() API is removed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/core: Provide rdma_ versions of the gid cache APIJason Gunthorpe1-0/+17
These versions are functionally similar but all return gid_attrs and related information via reference instead of via copy. The old API is preserved, implemented as wrappers around the new, until all callers can be converted. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/core: Introduce GID attribute get, put and hold APIsParav Pandit1-0/+4
This patch introduces three APIs, rdma_get_gid_attr(), rdma_put_gid_attr(), and rdma_hold_gid_attr() which expose the reference counting for GID table entries to the entire stack. The kref counting is based on the struct ib_gid_attr pointer Later patches will convert more cache query function to return struct ib_gid_attrs. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18RDMA: Use GID from the ib_gid_attr during the add_gid() callbackParav Pandit1-2/+1
Now that ib_gid_attr contains the GID, make use of that in the add_gid() callback functions for the provider drivers to simplify the add_gid() implementations. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/core: Introduce GID entry reference countsParav Pandit1-0/+1
In order to be able to expose pointers to the ib_gid_attrs in the GID table we need to make it so the value of the pointer cannot be changed. Thus each GID table entry gets a unique piece of kref'd memory that is written only during initialization and remains constant for its lifetime. This eventually will allow the struct ib_gid_attrs to be returned without copy from many of query the APIs, but it also provides a way to track when all users of a HW table index go away. For roce we no longer allow an in-use HW table index to be re-used for a new an different entry. When a GID table entry needs to be removed it is hidden from the find API, but remains as a valid HW index and all ib_gid_attr points remain valid. The HW index is not relased until all users put the kref. Later patches will broadly replace the use of the sgid_index integer with the kref'd structure. Ultimately this will prevent security problems where the OS changes the properties of a HW GID table entry while an active user object is still using the entry. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-12Convert infiniband uverbs to struct_sizeMatthew Wilcox1-4/+1
The flows were hidden from the C compiler; expose them as a zero-length array to allow struct_size to work. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-05RDMA/restrack: Change SPDX tag to properly reflect licenseLeon Romanovsky1-1/+1
Resource tracking is supposed to be dual licensed: GPL-2.0 and OpenIB, but the SPDX tag was not compliant to it. Update the tag to properly reflect license. Fixes: 02d8883f520e ("RDMA/restrack: Add general infrastructure to track RDMA resources") Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04RDMA/core: introduce check masks for T10-PI offloadMax Gurtovoy1-0/+13
T10-PI offload capability is currently supported in iSER protocol only, and the definition of the HCA protection information checks are missing from the core layer. Add those definition to avoid code duplication in other drivers (such iSER target and NVMeoF). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-04Merge tag 'verbs_flow_counters' of git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git into for-nextJason Gunthorpe2-2/+52
Pull verbs counters series from Leon Romanovsky: ==================== Verbs flow counters support This series comes to allow user space applications to monitor real time traffic activity and events of the verbs objects it manages, e.g.: ibv_qp, ibv_wq, ibv_flow. The API enables generic counters creation and define mapping to association with a verbs object, the current mlx5 driver is using this API for flow counters. With this API, an application can monitor the entire life cycle of object activity, defined here as a static counters attachment. This API also allows dynamic counters monitoring of measurement points for a partial period in the verbs object life cycle. In addition it presents the implementation of the generic counters interface. This will be achieved by extending flow creation by adding a new flow count specification type which allows the user to associate a previously created flow counters using the generic verbs counters interface to the created flow, once associated the user could read statistics by using the read function of the generic counters interface. The API includes: 1. create and destroyed API of a new counters objects 2. read the counters values from HW Note: Attaching API to allow application to define the measurement points per objects is a user space only API and this data is passed to kernel when the counted object (e.g. flow) is created with the counters object. =================== * tag 'verbs_flow_counters': IB/mlx5: Add counters read support IB/mlx5: Add flow counters read support IB/mlx5: Add flow counters binding support IB/mlx5: Add counters create and destroy support IB/uverbs: Add support for flow counters IB/core: Add support for flow counters IB/core: Support passing uhw for create_flow IB/uverbs: Add read counters support IB/core: Introduce counters read verb IB/uverbs: Add create/destroy counters support IB/core: Introduce counters object and its create/destroy IB/uverbs: Add an ib_uobject getter to ioctl() infrastructure net/mlx5: Export flow counter related API net/mlx5: Use flow counter pointer as input to the query function
2018-06-02IB/core: Add support for flow countersRaed Salem1-1/+14
A counters object could be attached to flow on creation by providing the counter specification action. General counters description which count packets and bytes are introduced, downstream patches from this series will use them as part of flow counters binding. In addition, increase number of flow specifications supported layers to 10 upon adding count specification and for the previously added drop specification. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-02IB/core: Support passing uhw for create_flowMatan Barak1-1/+2
This is required when user-space drivers need to pass extra information regarding how to handle this flow steering specification. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-02IB/core: Introduce counters read verbRaed Salem1-0/+14
The user supplies counters instance and a reference to an output array of uint64_t. The driver reads the hardware counters values and writes them to the output index location in the user supplied array. All counters values are represented as uint64_t types. To be able to successfully read the data the counters must be first bound to an IB object. Downstream patches will present binding method for flow counters. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-02IB/core: Introduce counters object and its create/destroyRaed Salem1-0/+11
A verbs application may need to get statistics and info on various aspects of a verb object (e.g. Flow, QP, ...), in general case the application will state which object's counters its interested in (we refer to this action as attach), bind this new counters object to the appropriate verb object and on later stage read their values using the counters object. This series introduces a general API for counters object that may accumulate any ib object counters type, bound and read on demand. Counters instance is allocated on an IB context and belongs to that context. Upon successful creation the counters can be bound to a verbs object so that hardware counter instances can be created and read. Downstream patches in this series will introduce the attach, bind and the read functionality. Counters instance can be de-allocated, upon successful destruction the related hardware resources are released. Prior to destroy call the user must first make sure that the counters is not being used by any IB object, e.g. not attached to any of its counted type otherwise an EBUSY error is invoked. Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Raed Salem <raeds@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-02IB/uverbs: Add an ib_uobject getter to ioctl() infrastructureMatan Barak1-0/+11
Previously, the user had to dig inside the attribute to get the uobject. Add a helper function that correctly extract it (and do the required checks) for him/her. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-28Merge branch 'mr_fix' into git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma for-nextJason Gunthorpe3-6/+19
Update mlx4 to support user MR creation against read-only memory, previously it required the memory to be writable. Based on rdma for-rc due to dependencies. * mr_fix: (2 commits) IB/mlx4: Mark user MR as writable if actual virtual memory is writable IB/core: Make testing MR flags for writability a static inline function
2018-05-28IB/core: Make testing MR flags for writability a static inline functionJack Morgenstein1-0/+14
Make the MR writability flags check, which is performed in umem.c, a static inline function in file ib_verbs.h This allows the function to be used by low-level infiniband drivers. Cc: <stable@vger.kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-05-24IB/core: Reduce the places that use zgidParav Pandit1-0/+1
Instead of open coding memcmp() to check whether a given GID is zero or not, use a helper function to do so, and replace instances of memcpy(z,&zgid) with memset. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-23IB/uverbs: Fix uverbs_attr_get_objJason Gunthorpe1-5/+5
The err pointer comes from uverbs_attr_get, not from the uobject member, which does not store an ERR_PTR. Fixes: be934cca9e98 ("IB/uverbs: Add device memory registration ioctl support") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-05-22RDMA/CMA: add rdma_iw_cm_id() and rdma_res_to_id() helpersSteve Wise1-0/+3
Add a helper function for iwarp drivers to be able to map an rdma_cm_id to an iw_cm_id. This is useful for dumping driver specific NLDEV/RESTRACK connection state. Add a helper to return the rdma_cm_id pointer from the rdma_restack pointer. This is needed for rdma drivers to map a res entry back to the public rdma_cm_id struct. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-16IB/uverbs: Introduce a MPLS steering match filterAriel Levkovich1-0/+15
Add a new MPLS steering match filter that can match against a single MPLS tag field. Since the MPLS header can reside in different locations in the packet's protocol stack as well as be encapsulated with a tunnel protocol, it is required to know the exact location of the header in the protocol stack. Therefore, when including the MPLS protocol spec in the specs list, it is mandatory to provide the list in an ordered manner, so that it represents the actual header order in a matching packet. Drivers that process the spec list and apply the matching rule should treat the position of the MPLS spec in the spec list as the actual location of the MPLS label in the packet's protocol stack. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-16IB/uverbs: Introduce a GRE steering match filterAriel Levkovich1-0/+17
Adding a new GRE steering match filter that can match against key and protocol fields. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-15IB/umem: Use the correct mm during ib_umem_releaseLidong Chen1-1/+0
User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads. If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has exited, get_pid_task will return NULL and ib_umem_release will not decrease mm->pinned_vm. Instead of using threads to locate the mm, use the overall tgid from the ib_ucontext struct instead. This matches the behavior of ODP and disassociate in handling the mm of the process that called ibv_reg_mr. Cc: <stable@vger.kernel.org> Fixes: 87773dd56d54 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get") Signed-off-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-09IB/{hfi1, qib, rdmavt}: Move logic to allocate receive WQE into rdmavtBrian Welty1-0/+1
Moving receive-side WQE allocation logic into rdmavt will allow further code reuse between qib and hfi1 drivers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-05-09IB/{hfi1, rdmavt, qib}: Implement CQ completion vector supportSebastian Sanchez2-5/+7
Currently the driver doesn't support completion vectors. These are used to indicate which sets of CQs should be grouped together into the same vector. A vector is a CQ processing thread that runs on a specific CPU. If an application has several CQs bound to different completion vectors, and each completion vector runs on different CPUs, then the completion queue workload is balanced. This helps scale as more nodes are used. Implement CQ completion vector support using a global workqueue where a CQ entry is queued to the CPU corresponding to the CQ's completion vector. Since the workqueue is global, it's guaranteed to always be there when queueing CQ entries; Therefore, the RCU locking for cq->rdi->worker in the hot path is superfluous. Each completion vector is assigned to a different CPU. The number of completion vectors available is computed by taking the number of online, physical CPUs from the local NUMA node and subtracting the CPUs used for kernel receive queues and the general interrupt. Special use cases: * If there are no CPUs left for completion vectors, the same CPU for the general interrupt is used; Therefore, there would only be one completion vector available. * For multi-HFI systems, the number of completion vectors available for each device is the total number of completion vectors in the local NUMA node divided by the number of devices in the same NUMA node. If there's a division remainder, the first device to get initialized gets an extra completion vector. Upon a CQ creation, an invalid completion vector could be specified. Handle it as follows: * If the completion vector is less than 0, set it to 0. * Set the completion vector to the result of the passed completion vector moded with the number of device completion vectors available. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>