aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux/mlx5 (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-03-01net/mlx5: Emit port affinity event for multipath offloadsRoi Dayan1-0/+1
Under multipath offload scheme, as part of handling fib events, emit mlx5 port affinity event on the enabled ports which will be handled by the tc offloads code. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-03-01net/mlx5: Add multipath modeRoi Dayan1-0/+1
In order to offload ecmp-on-host scheme where next-hop routes are used, we will make use of HW LAG. Add accessor function to let upper layers in the driver to realize if the lag acts in multi-path mode. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22net/mlx5e: Fix GRE key by controlling port tunnel entropy calculationEli Britstein1-0/+2
Flow entropy is calculated on the inner packet headers and used for flow distribution in processing, routing etc. For GRE-type encapsulations the entropy value is placed in the eight LSB of the key field in the GRE header as defined in NVGRE RFC 7637. For UDP based encapsulations the entropy value is placed in the source port of the UDP header. The hardware may support entropy calculation specifically for GRE and for all tunneling protocols. With commit df2ef3bff193 ("net/mlx5e: Add GRE protocol offloading") GRE is offloaded, but the hardware is configured by default to calculate flow entropy so packets transmitted on the wire have a wrong key. To support UDP based tunnels (i.e VXLAN), GRE (i.e. no flow entropy) and NVGRE (i.e. with flow entropy) the hardware behaviour must be controlled by the driver. Ensure port entropy calculation is enabled for offloaded VXLAN tunnels and disable port entropy calculation in the presence of offloaded GRE tunnels by monitoring the presence of entropy enabling tunnels (i.e VXLAN) and entropy disabing tunnels (i.e GRE). Fixes: df2ef3bff193 ("net/mlx5e: Add GRE protocol offloading") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-22net/mlx5: Introduce tunnel entropy control in PCMR registerEli Britstein1-2/+10
When using the device packet encapsulation offload, the device calculates an entropy value, representing the inner packet headers. The entropy field is placed inside the outer packet headers. For UDP-type encapsulations, the entropy is placed in the source port field of the UDP header. For GRE-type encapsulations, the entropy is placed in the 8 LSB of the key field in the GRE header. If the device does not recognize the encapsulation type, the entropy is not placed in the packet. Entropy setting can be controlled using PCMR register. if encapsulation offload is not used force_entropy_cap should be set to 0x0. Entropy setting is enabled/disabled using entropy_calc, and could be additionally enabled/disabled for GRE encapsulation by entropy_gre_calc. As a pre-step to automatically control the tunnel entropy, introduce the entropy fields in the PCMR register with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Consider ECPF vport depends on eswitch ownershipBodong Wang3-4/+12
ECPF connects to the eswitch through vport 0xfffe. ECPF may or may not be the eswitch manager depending on firmware configuration. 1. If ECPF is eswitch manager: ECPF will take over the eswitch manager responsibility. A rep of the host PF shall be created at the ECPF side for the eswitch manager to control. 2. If ECPF is not eswitch manager: host PF will be the eswitch manager, ECPF acts similar as a VF to the host PF. Host PF will be aware of the ECPF vport presence and control it's rep. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Assign a different position for uplink rep and vportBodong Wang1-3/+8
In offloads mode, the current implementation puts the uplink representor at index zero of the vport reps array. It is not "natural" to place it at index 0 since we want to put the representor for vport 0 at index 0 with the introduction of SmartNIC. A separate patch will handle the case whether a rep is needed for vport 0 (PF vport). So, we want to have a different placeholder for uplink vport and representor. It was placed at the end of vport and rep array. Since vport number can no longer act as an index into the vport or representors arrays, use functions to map vport numbers to indices when accessing the vports or representors arrays, and vice versa. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driverBodong Wang1-7/+4
Eswitch has two users: IB and ETH. They both register repersentors when mlx5 interface is added, and unregister the repersentors when mlx5 interface is removed. Ideally, each driver should only deal with the entities which are unique to itself. However, current IB and ETH drivers have to perform the following eswitch operations: 1. When registering, specify how many vports to register. This number is the same for both drivers which is the total available vport numbers. 2. When unregistering, specify the number of registered vports to do unregister. Also, unload the repersentors which are already loaded. It's unnecessary for eswitch driver to hands out the control of above operations to individual driver users, as they're not unique to each driver. Instead, such operations should be centralized to eswitch driver. This consolidates eswitch control flow, and simplified IB and ETH driver. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Add state to eswitch vport representorsBodong Wang1-1/+7
Currently the eswitch vport reps have a valid indicator, which is set on register and unset on unregister. However, a rep can be loaded or not loaded when doing unregister, current driver checks if the vport of that rep is enabled as a flag to imply the rep is loaded. However, for ECPF, this is not valid as the host PF will enable the vports for its VFs instead. Add three states: {unregistered, registered, loaded}, with the following state changes across different operations: create: (none) -> unregistered reg: unregistered -> registered load: registered -> loaded unload: loaded -> registered unreg: registered -> unregistered Note that the state shall only be updated inside eswitch driver rather than individual drivers such as ETH or IB. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Split VF and special vports for offloads modeBodong Wang1-0/+1
When driver is entering offloads mode, there are two major tasks to do: initialize flow steering and create representors. Flow steering should make sure enough flow table/group spaces are reserved for all reps. Representors will be created in a group, all or none. With the introduction of ECPF, flow steering should still reserve the same spaces. But, the representors are not always loaded/unloaded in a single piece. Once ECPF is in offloads mode, it will get the number of VF changing event from host PF. In such scenario, only the VF reps should be loaded/unloaded, not the reps for special vports (such as the uplink vport). Thus, when entering offloads mode, driver should specify the total number of reps, and the number of VF reps separately. When leaving offloads mode, the cleanup should use the information self-contained in eswitch such as number of VFs. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to host PF vport as other vportBodong Wang1-2/+2
Commands referring to vports use the following scheme: 1. When referring to my own vport, put 0 in vport and 0 in other_vport. 2. When referring to another vport, put the vport number of the referred vport and put 1 in other_vport. It was assumed that driver is accessing other vport when vport number is greater than 0. With the above scheme, the case that ECPF eswitch manager is trying to access host PF vport will fall over with scheme 1 as the vport number is 0. This is apparently wrong as driver is trying to refer other vport. As such usage can only happen in the eswitch context, change relevant functions to provide other vport input properly. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-15net/mlx5: E-Switch, Properly refer to the esw manager vportBodong Wang1-0/+2
In SmartNIC mode, the eswitch manager is not necessarily the PF (vport 0). Use a helper function to get the correct eswitch manager vport number and cache on the eswitch instance for fast reference. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Add support to ext_* fields introduced in Port Type and Speed registerAya Levin1-0/+19
This patch exposes new link modes (including 50Gbps per lane), and ext_* fields which describes the new link modes in Port Type and Speed register (PTYS). Access functions, translation functions (speed <-> HW bits) and link max speed function were modified. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Add new fields to Port Type and Speed registerAya Levin1-4/+8
Register Port Type and Speed (PTYS) introduces three new fields extending the speed/protocols the can be reported and configured. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Refactor queries to speed fields in Port Type and Speed registerAya Levin1-11/+0
This patch fascicles queries to speed related fields in Port Type and Speed register (PTYS) into a single API. I addition, this patch refactors functions which serves only Ethernet driver: remove the protocol type as an input parameter, move code from 'core' directory into 'en' directory and add 'eth' prefix to the function's name. The patch also encapsulates functions that are not used outside the Ethernet driver removes redundant include files. Signed-off-by: Aya Levin <ayal@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: E-Switch, Avoid magic numbers when initializing offloads modeBodong Wang1-1/+3
When dealing with the offloads mode initialization, driver refers to the number of VFs and add magic number one (1) to take account of the uplink. This is not clear and will make the code less readable after adding other vports (e.g. host PF). As these are special vports compared to VF vports, add a helper macro to denote such special vports and eliminate the use of magic number. Moreover, when creating offloads flow table and groups, the driver reserves two more slots for UC and MC miss rules. Replace this magic number with a helper macro as well. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Relocate vport macros to the vport header fileBodong Wang2-6/+7
These are two macros in the driver general header which deal with the number of total vports and if a vport is vport manager. Such macros are vport entities, better to place them at the vport header file. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: E-Switch, Normalize the name of uplink vport numberBodong Wang1-0/+4
Driver used to name uplink vport as FDB_UPLINK_VPORT, it's hard to comply with the same naming convention along with the introduction of other vports. Use MLX5_VPORT as the prefix for such vports and relocate the uplink vport definition to public header file for the benefits of both net and IB drivers. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Provide an alternative VF upper bound for ECPFBodong Wang1-1/+10
ECPF doesn't support SR-IOV, but an ECPF E-Switch manager shall know the max VFs supported by its peer host PF in order to control those VF vports. The current driver implementation uses the total vfs quantity as provided by the pci sub-system for an upper bound of the VF vports the e-switch code needs to deal with. This obviously can't work as is on ECPF e-switch manager. For now, we use a hard coded value of 128 on such systems. Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Add host params change eventBodong Wang2-0/+7
In Embedded CPU (EC) configurations, the EC driver needs to know when the number of virtual functions change on the corresponding PF at the host side. This is required so the EC driver can create or destroy representor net devices that represent the VFs ports. Whenever a change in the number of VFs occurs, firmware will generate an event towards the EC which will trigger a work to complete the rest of the handling. The specifics of the handling will be introduced in a downstream patch. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Add query host params commandBodong Wang1-0/+41
The QUERY_HOST_PARAMS command is used by an Embedded CPU Physical Function (ECPF) driver to identify and retrieve information about the PF on the host side. E.g, number of virtual functions and PCI BDF. The number of VFs can be changed on the fly, a function is added to query current number of VFs and will be used in downstream patches. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Update enable HCA dependencyBodong Wang1-2/+4
With the introduction of ECPF, we require that the ECPF driver will aways call enable/disable HCA for that PF in the same way a PF does this for its VFs. The PF is still responsible for calling enable and disable HCA for its VFs. To distinguish between the ECPF executing enable/disable HCA for itself or for the PF, it sets the embedded CPU function bit in the input params struct of these commands. When the bit is cleared and function ID is zero, it refers to the peer PF. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Introduce Mellanox SmartNIC and modify page management logicBodong Wang3-6/+17
Mellanox's SmartNIC combines embedded CPU(e.g, ARM) processing power with advanced network offloads to accelerate a multitude of security, networking and storage applications. With the introduction of the SmartNIC, there is a new PCI function called Embedded CPU Physical Function(ECPF). And it's possible for a PF to get its ICM pages from the ECPF PCI function. Driver shall identify if it is running on such a function by reading a bit in the initialization segment. When firmware asks for pages, it would issue a page request event specifying how many pages it requests and for which function. That driver responds with a manage_pages command providing the requested pages along with an indication for which function it is providing these pages. The encoding before this patch was as follows: function_id == 0: pages are requested for the function receiving the EQE. function_id != 0: pages are requested for VF identified by the function_id value A new one bit field in the EQE identifies that pages are requested for the ECPF. The notion of page_supplier can be introduced here and to support that, manage pages and query pages were modified so firmware can distinguish the following cases: 1. Function provides pages for itself 2. PF provides pages for its VF 3. ECPF provides pages to itself 4. ECPF provides pages for another function This distinction is possible through the introduction of the bit "embedded_cpu_function" in query_pages, manage_pages and page request EQE. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Use consistent vport num argument typeBodong Wang1-4/+4
Use u16 for vport number, which matches how hardware refers to this argument throughout commands. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-14net/mlx5: Use void pointer as the type in address_of macroBodong Wang1-1/+1
Better to use void * and avoid unnecessary casts. This patch doesn't change any functionality. Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-02-03net/mlx5: Set ODP SRQ support in firmwareMoni Shoua2-0/+4
To avoid compatibility issue with older kernels the firmware doesn't allow SRQ to work with ODP unless kernel asks for it. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-02-03net/mlx5: Add XRC transport to ODP device capabilities layoutMoni Shoua1-1/+3
The device capabilities for ODP structure was missing the field for XRC transport so add it here. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-01-24net/mlx5: Make mlx5_cmd_exec_cb() a safe APIJason Gunthorpe1-6/+26
APIs that have deferred callbacks should have some kind of cleanup function that callers can use to fence the callbacks. Otherwise things like module unloading can lead to dangling function pointers, or worse. The IB MR code is the only place that calls this function and had a really poor attempt at creating this fence. Provide a good version in the core code as future patches will add more places that need this fence. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-01-15RDMA/mad: Reduce MAD scope to mlx5_ib onlyLeon Romanovsky1-2/+0
Management Datagram Interface (MAD) is applicable only when physical port is Infiniband. It makes MAD command logic to be completely unrelated to eth/core parts of mlx5. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2-22/+58
Pull rdma updates from Jason Gunthorpe: "This has been a fairly typical cycle, with the usual sorts of driver updates. Several series continue to come through which improve and modernize various parts of the core code, and we finally are starting to get the uAPI command interface cleaned up. - Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4, mlx5, qib, rxe, usnic - Rework the entire syscall flow for uverbs to be able to run over ioctl(). Finally getting past the historic bad choice to use write() for command execution - More functional coverage with the mlx5 'devx' user API - Start of the HFI1 series for 'TID RDMA' - SRQ support in the hns driver - Support for new IBTA defined 2x lane widths - A big series to consolidate all the driver function pointers into a big struct and have drivers provide a 'static const' version of the struct instead of open coding initialization - New 'advise_mr' uAPI to control device caching/loading of page tables - Support for inline data in SRPT - Modernize how umad uses the driver core and creates cdev's and sysfs files - First steps toward removing 'uobject' from the view of the drivers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits) RDMA/srpt: Use kmem_cache_free() instead of kfree() RDMA/mlx5: Signedness bug in UVERBS_HANDLER() IB/uverbs: Signedness bug in UVERBS_HANDLER() IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported IB/umad: Start using dev_groups of class IB/umad: Use class_groups and let core create class file IB/umad: Refactor code to use cdev_device_add() IB/umad: Avoid destroying device while it is accessed IB/umad: Simplify and avoid dynamic allocation of class IB/mlx5: Fix wrong error unwind IB/mlx4: Remove set but not used variable 'pd' RDMA/iwcm: Don't copy past the end of dev_name() string IB/mlx5: Fix long EEH recover time with NVMe offloads IB/mlx5: Simplify netdev unbinding IB/core: Move query port to ioctl RDMA/nldev: Expose port_cap_flags2 IB/core: uverbs copy to struct or zero helper IB/rxe: Reuse code which sets port state IB/rxe: Make counters thread safe IB/mlx5: Use the correct commands for UMEM and UCTX allocation ...
2018-12-20net/mlx5e: XDP, Support Enhanced Multi-Packet TX WQETariq Toukan1-0/+1
Add support for the HW feature of multi-packet WQE in XDP xmit flow. The conventional TX descriptor (WQE, Work Queue Element) serves a single packet. Our HW has support for multi-packet WQE (MPWQE) in which a single descriptor serves multiple TX packets. This reduces both the PCI overhead and the CPU cycles wasted on writing them. In this patch we add support for the HW feature, which is supported starting from ConnectX-5. Performance: Tested packet rate for UDP 64Byte multi-stream over ConnectX-5 NICs. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz XDP_TX: We see a huge gain on single port ConnectX-5, and reach the 100 Mpps milestone. * Single-port HCA: Before: 70 Mpps After: 100 Mpps (+42.8%) * Dual-port HCA: Before: 51.7 Mpps After: 57.3 Mpps (+10.8%) * In both cases we tested traffic on one port and for now On Dual-port HCAs we see only small gain, we are working to overcome this bottleneck, but for the moment only with experimental firmware on dual port HCAs we can reach the wanted numbers as seen on Single-port HCAs. XDP_REDIRECT: Redirect from (A) ConnectX-5 to (B) ConnectX-5. Due to a setup limitation, (A) and (B) are on different NUMA nodes, so absolute performance numbers are not optimal. Note: Below is the transmit rate of (B), not the redirect rate of (A) which is in some cases higher. * (B) is single-port: Before: 77 Mpps After: 90 Mpps (+16.8%) * (B) is dual-port: Before: 61 Mpps After: 72 Mpps (+18%) Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-20IB/mlx5: Use the correct commands for UMEM and UCTX allocationYishai Hadas1-20/+42
During testing the command format was changed to close a security hole. Revise the driver to use the command format that will actually be supported in GA firmware. Both the UMEM and UCTX are intended only for use by the kernel and cannot be executed using a general command. Since the UMEM and CTX are not part of the general object the caps bits were moved to be some log_xxx location in the general HCA caps. The firmware code was adapted as well to match the above. Fixes: a8b92ca1b0e5 ("IB/mlx5: Introduce DEVX") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Achiad Shochat <achiad@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-20Merge branch 'mlx5-next' into rdma.gitJason Gunthorpe4-14/+141
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates taken for dependencies on following patches. * branche 'mlx5-next': (23 commits) IB/mlx5: Introduce uid as part of alloc/dealloc transport domain net/mlx5: Add shared Q counter bits net/mlx5: Continue driver initialization despite debugfs failure net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c net/mlx5: Remove the get protocol device interface entry net/mlx5: Support extended destination format in flow steering command net/mlx5: E-Switch, Change vhca id valid bool field to bit flag net/mlx5: Introduce extended destination fields net/mlx5: Revise gre and nvgre key formats net/mlx5: Add monitor commands layout and event data net/mlx5: Add support for plugged-disabled cable status in PME net/mlx5: Add support for PCIe power slot exceeded error in PME net/mlx5: Rework handling of port module events net/mlx5: Move flow counters data structures from flow steering header ...
2018-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+6
Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20IB/mlx5: Introduce uid as part of alloc/dealloc transport domainYishai Hadas1-2/+2
Introduce uid as part of alloc/dealloc transport domain to match the device specification. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-19net/mlx5: Add shared Q counter bitsLeon Romanovsky1-1/+5
Updated HW specification file with needed bits to allow sharing of Q counters between DEVX contexts and kernel. Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-14net/mlx5: Make RoCE and SR-IOV LAG modes explicitAviv Heller1-0/+2
With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-14net/mlx5: Introduce inter-device communication mechanismAviv Heller1-0/+2
This introduces devcom, a generic mechanism for performing operations on both physical functions of the same Connect-X card. The first user of this API is merged eswitch, which will be introduced in subsequent patches. Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-14Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxSaeed Mahameed1-9/+15
mlx5-next shared branch with rdma subtree to avoid mlx5 rdma v.s. netdev conflicts. Highlights: 1) Lag refactroing and flow counter affinity bits. 2) mlx5 core cleanups By Roi Dayan (2) and others * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-14net/mlx5: E-Switch, Introduce flow counter affinityShahar Klein1-1/+7
This dictates the device affinity for eswitch flow counters, set by the FW according to the HW device capabilities. Under "source eswitch" affinity, the counter should be allocated on the device related to the source vport in the match. This covers both non merged e-switch mode as well as old FW that does not advertise this cap. Under "flow eswitch" affinity, the counter should be allocated on the device where the eswitch rule is set. Signed-off-by: Shahar Klein <shahark@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-14net/mlx5: Use lowercase 'X' for hex valuesSaeed Mahameed1-8/+8
Apparently gcc is cool with upper case '0X' but it is not commonly used. Replace '0X' with lowercase '0x' in mlx5_ifc.h file. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-13net/mlx5: E-Switch, Fix fdb cap bits swapVu Pham1-4/+6
The cap bits locations for the fdb caps of multi path to table (used for local mirroring) and multi encap (used for prio/chains) were wrongly used in swapped locations. This went unnoted so far b/c we tested the offending patch with CX5 FW that supports both of them. On different environments where not both caps are supported, we will be messed up, fix that. Fixes: b9aa0ba17af5 ('net/mlx5: Add cap bits for multi fdb encap') Signed-off-by: Vu Pham <vu@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11net/mlx5e: Avoid query PPCNT register if not supported by the deviceEyal Davidovich1-1/+3
PPCNT is not supported if PCAM access reg is supported and ppcnt bit is clear. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11net/mlx5e: Use CQE padding for Ethernet CQsDaniel Jurgens1-5/+5
Writing 64B CQEs to 128B cache lines results in a RMW operation. Padding the CQEs to 128B if possible improves performance on 128B cache line systems like PPC. Testing on PPC showed up to a 24% improvement in small packet throughput vs the default behavior, depending on the workload and system topology. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11IB/mlx5: Report CapabilityMask2 in ib_query_portMichael Guralnik1-2/+2
CapabilityMask2 exists when IB_PORT_CAP_MASK2_SUP is set in the original capability mask. In such cases, query its value and report it in query port. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-10net/mlx5: Remove the get protocol device interface entryOr Gerlitz1-2/+0
This isn't used anywhere across the mlx5 driver stack, remove it. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5: Support extended destination format in flow steering commandEli Britstein1-0/+2
Update the flow steering command formatting according to the extended destination API. Note that the FW dictates that multi destination FTEs that involve at least one encap must use the extended destination format, while single destination ones must use the legacy format. Using extended destination format requires FW support. Check for its capabilities and return error if not supported. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5: E-Switch, Change vhca id valid bool field to bit flagEli Britstein1-1/+5
Change the driver flow destination struct to use bit flags with the vhca id valid being the 1st one. The flags field is more extendable and will be used in downstream patch. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5: Introduce extended destination fieldsEli Britstein1-3/+16
Extended destinations provide the ability to configure different encapsulation properties per destination on a single FTE. This is needed for use-cases such as remote mirroring over tunneled networks. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5: Revise gre and nvgre key formatsOz Shlomo1-2/+11
GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l). Define the two key parsing alternatives in a union, thus enabling both access methods. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5: Add monitor commands layout and event dataEyal Davidovich2-1/+87
Will be used in downstream patch to monitor counter changes by the HCA and report it to the driver by an event. The driver will update its counters cached data accordingly. Signed-off-by: Eyal Davidovich <eyald@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>