aboutsummaryrefslogtreecommitdiffstats
path: root/include/net/ip6_fib.h (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2019-02-08RDMA/device: Use an ida instead of a free page in alloc_nameJason Gunthorpe1-11/+17
ida is the proper data structure to hold list of clustered small integers and then allocate an unused integer. Get rid of the convoluted and limited open-coded bitmap. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/device: Get rid of reg_stateJason Gunthorpe2-12/+2
This really has no purpose anymore, refcount can be used to tell if the device is still registered. Keeping it around just invites mis-use. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-02-08RDMA/device: Call ib_cache_release_one() only from ib_device_release()Jason Gunthorpe2-29/+15
Instead of complicated logic about when this memory is freed, always free it during device release(). All the cache pointers start out as NULL, so it is safe to call this before the cache is initialized. This makes for a simpler error unwind flow, and a simpler understanding of the lifetime of the memory allocations inside the struct ib_device. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/device: Ensure that security memory is always freedJason Gunthorpe3-12/+6
Since this only frees memory it should be done during the release callback. Otherwise there are possible error flows where it might not get called if registration aborts. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/device: Check that the rename is nop under the lockJason Gunthorpe1-4/+6
Since another rename could be running in parallel it is safer to check that the name is not changing inside the lock, where we already know the device name will not change. Fixes: d21943dd19b5 ("RDMA/core: Implement IB device rename function") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-02-08RDMA: Handle PD allocations by IB/coreLeon Romanovsky39-409/+325
The PD allocations in IB/core allows us to simplify drivers and their error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand in had with relevant update in .dealloc_pd(). We will use this opportunity and convert .dealloc_pd() to don't fail, as it was suggested a long time ago, failures are not happening as we have never seen a WARN_ON print. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/core: Share driver structure size with coreLeon Romanovsky2-0/+15
Add new macros to be used in drivers while registering ops structure and IB/core while calling allocation routines, so drivers won't need to perform kzalloc/kfree in their paths. The change in allocation stage allows us to initialize common fields prior to calling to drivers (e.g. restrack). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/core: Don't register each MAD agent for LSM notifierDaniel Jurgens4-24/+35
When creating many MAD agents in a short period of time, receive packet processing can be delayed long enough to cause timeouts while new agents are being added to the atomic notifier chain with IRQs disabled. Notifier chain registration and unregstration is an O(n) operation. With large numbers of MAD agents being created and destroyed simultaneously the CPUs spend too much time with interrupts disabled. Instead of each MAD agent registering for it's own LSM notification, maintain a list of agents internally and register once, this registration already existed for handling the PKeys. This list is write mostly, so a normal spin lock is used vs a read/write lock. All MAD agents must be checked, so a single list is used instead of breaking them down per device. Notifier calls are done under rcu_read_lock, so there isn't a risk of similar packet timeouts while checking the MAD agents security settings when notified. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/core: Eliminate a hole in MAD agent structDaniel Jurgens1-2/+2
Move the security related fields above the u8s to eliminate a hole in the struct. pahole before: struct ib_mad_agent { ... u32 hi_tid; /* 48 4 */ u32 flags; /* 52 4 */ u8 port_num; /* 56 1 */ u8 rmpp_version; /* 57 1 */ /* XXX 6 bytes hole, try to pack */ /* --- cacheline 1 boundary (64 bytes) --- */ void * security; /* 64 8 */ bool smp_allowed; /* 72 1 */ bool lsm_nb_reg; /* 73 1 */ /* XXX 6 bytes hole, try to pack */ struct notifier_block lsm_nb; /* 80 24 */ /* XXX last struct has 4 bytes of padding */ /* size: 104, cachelines: 2, members: 14 */ ... }; pahole after: struct ib_mad_agent { ... u32 hi_tid; /* 48 4 */ u32 flags; /* 52 4 */ void * security; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ struct notifier_block lsm_nb; /* 64 24 */ /* XXX last struct has 4 bytes of padding */ u8 port_num; /* 88 1 */ u8 rmpp_version; /* 89 1 */ bool smp_allowed; /* 90 1 */ bool lsm_nb_reg; /* 91 1 */ /* size: 96, cachelines: 2, members: 14 */ ... }; Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/core: Fix potential memory leak while creating MAD agentsDaniel Jurgens1-2/+6
If the MAD agents isn't allowed to manage the subnet, or fails to register for the LSM notifier, the security context is leaked. Free the context in these cases. Fixes: 47a2b338fe63 ("IB/core: Enforce security on management datagrams") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reported-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/core: Unregister notifier before freeing MAD securityDaniel Jurgens1-1/+2
If the notifier runs after the security context is freed an access of freed memory can occur. Fixes: 47a2b338fe63 ("IB/core: Enforce security on management datagrams") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/usnic: Fix locking when unregisteringParvi Kaustubhi1-2/+4
Move the call to usnic_ib_device_remove after usnic_ib_ibdev_list_lock has been released. Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08iw_cxgb4: use tos when finding ipv6 routesSteve Wise1-2/+3
When IPv6 support was added, the correct tos was not passed to cxgb_find_route6(). This potentially results in the wrong route entry. Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08iw_cxgb4: use tos when importing the endpointSteve Wise1-1/+1
import_ep() is passed the correct tos, but doesn't use it correctly. Fixes: ac8e4c69a021 ("cxgb4/iw_cxgb4: TOS support") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08iw_cxgb4: use listening ep tos when accepting new connectionsSteve Wise1-2/+7
If the parent listening endpoint has a service type set, then use that when setting up the connection. This allows server-side applications to mandate the tos for passive side connections via rdma_set_service_type() on the listening endpoints. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/iwcm: add tos_set bool to iw_cm structSteve Wise2-1/+4
This allows drivers to know the tos was actively set by the application. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08RDMA/cma: listening device cm_ids should inherit tosSteve Wise1-0/+2
If a user binds to INADDR_ANY and sets the service id, then the device-specific cm_ids should also use this tos. This allows an app to do: rdma_bind_addr(INADDR_ANY) set_service_type() rdma_listen() And connections setup via this listening endpoint will use the correct tos. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08IB/cma: Define option to set ack timeout and pack tos_setDanit Goldberg5-1/+47
Define new option in 'rdma_set_option' to override calculated QP timeout when requested to provide QP attributes to modify a QP. At the same time, pack tos_set to be bitfield. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_en: Enable RDMA driver support for 57500 chipDevesh Sharma1-3/+0
Re-enabling RDMA driver support on 57500 chips. Removing the forced error code for 57500 chip. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Update kernel user abi to pass chip contextDevesh Sharma2-3/+25
User space verbs provider library would need chip context. Changing the ABI to add chip version details in structure. Furthermore, changing the kernel driver ucontext allocation code to initialize the abi structure with appropriate values. As suggested by community, appended the new fields at the bottom of the ABI structure and retaining to older fields as those were in the older versions. Keeping the ABI version at 1 and adding a new field in the ucontext response structure to hold the component mask. The user space library should check pre-defined flags to figure out if a certain feature is supported on not. Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Add extended psn structure for 57500 adaptersDevesh Sharma5-18/+70
The new 57500 series of adapter has bigger psn search structure. The size of new structure is 16B. Changing the control path memory allocation and fast path code to accommodate the new psn structure while maintaining the backward compatibility. There are few additional changes listed below: - For 57500 chip max-sge are limited to 6 for now. - For 57500 chip max-receive-sge should be set to 6 for now. - Add driver/hardware interface structure for new chip. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Enable GSI QP support for 57500 seriesDevesh Sharma4-55/+99
In the new 57500 series of adapters the GSI qp is a UD type QP unlike the previous generation where it was a Raw Eth QP. Changing the control and data path to support the same. Listing all the significant diffs: - AH creation resolve network type unconditionally - Add check at relevant places to distinguish from Raw Eth processing flow. - bnxt_re_process_res_ud_wc report completion with GRH flag when qp is GSI. - Change length, cfa_meta and smac to match new driver/hardware interface. - Add new driver/hardware interface. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Skip backing store allocation for 57500 seriesDevesh Sharma4-7/+14
The backing store to keep HW context data structures is allocated and initialized by L2 driver. For 57500 chip RoCE driver do not require to allocate and initialize additional memory. Changing to skip duplicate allocation and initialization for 57500 adapters. Driver continues as before for older chips. This patch also takes care of stats context memory alignment to 128 boundary, a requirement for 57500 series of chip. Older chips do not care of alignment, thus the change is unconditional. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Add 64bit doorbells for 57500 seriesDevesh Sharma9-153/+276
The new chip series has 64 bit doorbell for notification queues. Thus, both control and data path event queues need new routines to write 64 bit doorbell. Adding the same. There is new doorbell interface between the chip and driver. Changing the chip specific data structure definitions. Additional significant changes are listed below - bnxt_re_net_ring_free/alloc takes a new argument - bnxt_qplib_enable_nq and enable_rcfw uses new doorbell offset for new chip. - DB mapping for NQ and CREQ now maps 8 bytes. - DBR_DBR_* macros renames to DBC_DBC_* - store nq_db_offset in a 32bit data type. - got rid of __iowrite64_copy, used writeq instead. - changed the DB header initialization to simpler scheme. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07RDMA/bnxt_re: Add chip context to identify 57500 seriesDevesh Sharma5-1/+51
Adding setup and destroy routines for chip-context. The chip context would be used frequently in control and data path to take execution flow depending on the chip type. chip context structure pointer is added to the relevant data structures. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07IB/mlx5: Simplify WQE count power of two checkGal Pressman1-3/+3
Use is_power_of_2() instead of hard coding it in the driver. While at it, fix the meaningless error print. Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07Documentation/infiniband: update from locked to pinned_vmDavidlohr Bueso1-2/+2
We are really talking about pinned_vm here. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07drivers/IB,core: reduce scope of mmap_semDavidlohr Bueso1-39/+2
ib_umem_get() uses gup_longterm() and relies on the lock to stabilze the vma_list, so we cannot really get rid of mmap_sem altogether, but now that the counter is atomic, we can get of some complexity that mmap_sem brings with only pinned_vm. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07drivers/IB,usnic: reduce scope of mmap_semDavidlohr Bueso3-55/+6
usnic_uiom_get_pages() uses gup_longterm() so we cannot really get rid of mmap_sem altogether in the driver, but we can get rid of some complexity that mmap_sem brings with only pinned_vm. We can get rid of the wq altogether as we no longer need to defer work to unpin pages as the counter is now atomic. We also share the lock. Acked-by: Parvi Kaustubhi <pkaustub@cisco.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07drivers/IB,hfi1: do not se mmap_semDavidlohr Bueso1-6/+0
This driver already uses gup_fast() and thus we can just drop the mmap_sem protection around the pinned_vm counter. Note that the window between when hfi1_can_pin_pages() is called and the actual counter is incremented remains the same as mmap_sem was _only_ used for when ->pinned_vm was touched. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.det> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07drivers/IB,qib: optimize mmap_sem usageDavidlohr Bueso1-46/+27
The driver uses mmap_sem for both pinned_vm accounting and get_user_pages(). Because rdma drivers might want to use gup_longterm() in the future we still need some sort of mmap_sem serialization (as opposed to removing it entirely by using gup_fast()). Now that pinned_vm is atomic the writer lock can therefore be converted to reader. This also fixes a bug that __qib_get_user_pages was not taking into account the current value of pinned_vm. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07drivers/mic/scif: do not use mmap_semDavidlohr Bueso1-25/+11
The driver uses mmap_sem for both pinned_vm accounting and get_user_pages(). By using gup_fast() and letting the mm handle the lock if needed, we can no longer rely on the semaphore and simplify the whole thing. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07mm: make mm->pinned_vm an atomic64 counterDavidlohr Bueso10-27/+28
Taking a sleeping lock to _only_ increment a variable is quite the overkill, and pretty much all users do this. Furthermore, some drivers (ie: infiniband and scif) that need pinned semantics can go to quite some trouble to actually delay via workqueue (un)accounting for pinned pages when not possible to acquire it. By making the counter atomic we no longer need to hold the mmap_sem and can simply some code around it for pinned_vm users. The counter is 64-bit such that we need not worry about overflows such as rdma user input controlled from userspace. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05RDMA/iwpm: move kdoc comments to functionsSteve Wise3-182/+123
Move the iwpm kdoc comments from the prototype declarations to above the function bodies. There are no functional changes in this patch. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05RDMA/cma: Remove CM_ID statistics provided by rdma-cm moduleLeon Romanovsky3-101/+3
Netlink statistics exported by rdma-cm never had any working user space component published to the mailing list or to any open source project. Canvassing various proprietary users, and the original requester, we find that there are no real users of this interface. This patch simply removes all occurrences of RDMA CM netlink in favour of modern nldev implementation, which provides the same information and accompanied by widely used user space component. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05IB/mlx5: Do not use hw_access_flags for be and CPU dataBart Van Assche1-8/+8
Avoid that sparse reports the following for the mlx5 driver: drivers/infiniband/hw/mlx5/qp.c:2671:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2671:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2671:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2679:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2679:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2679:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2680:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2680:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2680:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2684:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2684:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2684:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: incorrect type in argument 1 (different base types) drivers/infiniband/hw/mlx5/qp.c:2686:28: expected unsigned int [usertype] val drivers/infiniband/hw/mlx5/qp.c:2686:28: got restricted __be32 [usertype] drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 This patch does not change any functionality. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/IWPM: Support no port mapping requirementsSteve Wise7-7/+192
A soft iwarp driver that uses the host TCP stack via a kernel mode socket does not need port mapping. In fact, if the port map daemon, iwpmd, is running, then iwpmd must not try and create/bind a socket to the actual port for a soft iwarp connection, since the driver already has that socket bound. Yet if the soft iwarp driver wants to interoperate with hard iwarp devices that -are- using port mapping, then the soft iwarp driver's mappings still need to be maintained and advertised by the iwpm protocol. This patch enhances the rdma driver<->iwcm interface to allow an iwarp driver to specify that it does not want port mapping. The iwpm kernel<->iwpmd interface is also enhanced to pass up this information on map requests. Care is taken to interoperate with the current iwpmd version (ABI version 3) and only use the new NL attributes if iwpmd supports ABI version 4. The ABI version define has also been created in rdma_netlink.h so both kernel and user code can share it. The iwcm and iwpmd negotiate the ABI version to use with a new HELLO netlink message. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/IWPM: refactor the IWPM message attribute namesSteve Wise2-20/+41
In order to add new IWPM_NL attributes, the enums for the IWPM commands attributes are refactored such that a new attribute can be added without breaking ABI version 3. Instead of sharing nl attribute enums for both request and response messages, we create separate enums for each IWPM message request and reply. This allows us to extend any given IWPM message by adding new attributes for just that message. These new enums are created, though, in a way to avoid breaking ABI version 3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04iw_cxgb*: kzalloc the iwcm verbs structSteve Wise2-2/+2
So future additions to that struct get initialized by default. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/hns: Fix the chip hanging caused by sending doorbell during resetWei Hu (Xavier)3-9/+28
On hi08 chip, There is a possibility of chip hanging when sending doorbell during reset. We can fix it by prohibiting doorbell during reset. Fixes: 2d40788825ac ("RDMA/hns: Add support for processing send wr and receive wr") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during resetWei Hu (Xavier)4-13/+167
On hi08 chip, There is a possibility of chip hanging and some errors when sending mailbox & doorbell during reset. We can fix it by prohibiting mailbox and doorbell during reset and reset occurred to ensure that hardware can work normally. Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/hns: Fix the Oops during rmmod or insmod ko when reset occursWei Hu (Xavier)3-13/+112
In the reset process, the hns3 NIC driver notifies the RoCE driver to perform reset related processing by calling the .reset_notify() interface registered by the RoCE driver in hip08 SoC. In the current version, if a reset occurs simultaneously during the execution of rmmod or insmod ko, there may be Oops error as below: Internal error: Oops: 86000007 [#1] PREEMPT SMP Modules linked in: hns_roce(O) hns3(O) hclge(O) hnae3(O) [last unloaded: hns_roce_hw_v2] CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G O 4.19.0-ge00d540 #1 Hardware name: Huawei Technologies Co., Ltd. Workqueue: events hclge_reset_service_task [hclge] pstate: 60c00009 (nZCv daif +PAN +UAO) pc : 0xffff00000100b0b8 lr : 0xffff00000100aea0 sp : ffff000009afbab0 x29: ffff000009afbab0 x28: 0000000000000800 x27: 0000000000007ff0 x26: ffff80002f90c004 x25: 00000000000007ff x24: ffff000008f97000 x23: ffff80003efee0a8 x22: 0000000000001000 x21: ffff80002f917ff0 x20: ffff8000286ea070 x19: 0000000000000800 x18: 0000000000000400 x17: 00000000c4d3225d x16: 00000000000021b8 x15: 0000000000000400 x14: 0000000000000400 x13: 0000000000000000 x12: ffff80003fac6e30 x11: 0000800036303000 x10: 0000000000000001 x9 : 0000000000000000 x8 : ffff80003016d000 x7 : 0000000000000000 x6 : 000000000000003f x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000004 x2 : 00000000000007ff x1 : 0000000000000000 x0 : 0000000000000000 Process kworker/0:1 (pid: 14, stack limit = 0x00000000af8f0ad9) Call trace: 0xffff00000100b0b8 0xffff00000100b3a0 hns_roce_init+0x624/0xc88 [hns_roce] 0xffff000001002df8 0xffff000001006960 hclge_notify_roce_client+0x74/0xe0 [hclge] hclge_reset_service_task+0xa58/0xbc0 [hclge] process_one_work+0x1e4/0x458 worker_thread+0x40/0x450 kthread+0x12c/0x130 ret_from_fork+0x10/0x18 Code: bad PC value In the reset process, we will release the resources firstly, and after the hardware reset is completed, we will reapply resources and reconfigure the hardware. We can solve this problem by modifying both the NIC and the RoCE driver. We can modify the concurrent processing in the NIC driver to avoid calling the .reset_notify and .uninit_instance ops at the same time. And we need to modify the RoCE driver to record the reset stage and the driver's init/uninit state, and check the state in the .reset_notify, .init_instance. and uninit_instance functions to avoid NULL pointer operation. Fixes: cb7a94c9c808 ("RDMA/hns: Add reset process for RoCE in hip08") Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/rxe: Improve loopback markingKamal Heib3-8/+5
Currently a packet is marked for loopback only if the source and destination addresses equals. This is not enough when multiple gids are present in rxe device's gid table and the traffic is from one gid to another. Fix it by marking the packet for loopback if the destination MAC address is equal to the source MAC address. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/rxe: Move rxe_init_av() to rxe_av.cKamal Heib4-14/+11
Move the function rxe_init_av() to rxe_av.c file and use it instead of calling rxe_av_from_attr() and rxe_av_fill_ip_info(), also remove the unused rxe_dev parameter from rxe_init_av(). Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/hns: Make some function staticYueHaibing2-5/+6
Fixes the following sparse warnings: drivers/infiniband/hw/hns/hns_roce_hw_v2.c:5822:5: warning: symbol 'hns_roce_v2_query_srq' was not declared. Should it be static? drivers/infiniband/hw/hns/hns_roce_srq.c:158:6: warning: symbol 'hns_roce_srq_free' was not declared. Should it be static? drivers/infiniband/hw/hns/hns_roce_srq.c:81:5: warning: symbol 'hns_roce_srq_alloc' was not declared. Should it be static? Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/core: Remove ib_sg_dma_address() and ib_sg_dma_len()Bart Van Assche9-76/+36
Keeping single line wrapper functions is not useful. Hence remove the ib_sg_dma_address() and ib_sg_dma_len() functions. This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/mlx5: Advertise XRC ODP supportMoni Shoua1-0/+18
Query all per transport caps for XRC and set the appropriate bits in the per transport field of the advertised struct. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/mlx5: Advertise SRQ ODP support for supported transportsMoni Shoua1-0/+6
ODP support in SRQ is per transport capability. Based on device capabilities set this flag in device structure for future queries. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/mlx5: Add ODP SRQ supportMoni Shoua1-23/+61
Add changes to the WQE page-fault handler to 1. Identify that the event is for a SRQ WQE 2. Pass SRQ object instead of a QP to the function that reads the WQE 3. Parse the SRQ WQE with respect to its structure The rest is handled as for regular RQ WQE. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/mlx5: Let read user wqe also from SRQ bufferMoni Shoua3-55/+166
Reading a WQE from SRQ is almost identical to reading from regular RQ. The differences are the size of the queue, the size of a WQE and buffer location. Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE from a SRQ or RQ by caller choice. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>