aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2018-10-17net/mlx5: Split FDB fast path prio to multiple namespacesPaul Blakey1-1/+1
Towards supporting multi-chains and priorities, split the FDB fast path to multiple namespaces (sub namespaces), each with multiple priorities. This patch adds a new flow steering type, FS_TYPE_PRIO_CHAINS, which is like current FS_TYPE_PRIO, but may contain only namespaces, and those will be in parallel to one another in terms of managing of the flow tables connections inside them. Meaning, while searching for the next or previous flow table to connect for a new table inside such namespace we skip the parallel namespaces in the same level under the FS_TYPE_PRIO_CHAINS prio we originated from. We use this new type for splitting the fast path prio into multiple parallel namespaces, each containing normal prios. The prios inside them (and their tables) will be connected to one another, but not from one parallel namespace to another, instead the last prio in each namespace will be connected to the next prio in the containing FDB namespace, which is the slow path prio. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17net/mlx5: Use flow counter IDs and not the wrapping cache objectMark Bloch1-2/+2
Currently, when a flow rule is created using the FS core layer, the caller has to pass the entire flow counter object and not just the counter HW handle (ID). This requires both the FS core and the caller to have knowledge about the inner implementation of the FS layer flow counters cache and limits the possible users. Move to use the counter ID across the place when dealing with flows. Doing this decoupling, now can we privatize the inner implementation of the flow counters. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-nextSaeed Mahameed1-1/+1
mlx5 updates for both net-next and rdma-next * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits) net/mlx5: Expose DC scatter to CQE capability bit net/mlx5: Update mlx5_ifc with DEVX UID bits net/mlx5: Set uid as part of DCT commands net/mlx5: Set uid as part of SRQ commands net/mlx5: Set uid as part of SQ commands net/mlx5: Set uid as part of RQ commands net/mlx5: Set uid as part of QP commands net/mlx5: Set uid as part of CQ commands net/mlx5: Rename incorrect naming in IFC file net/mlx5: Export packet reformat alloc/dealloc functions net/mlx5: Pass a namespace for packet reformat ID allocation net/mlx5: Expose new packet reformat capabilities {net, RDMA}/mlx5: Rename encap to reformat packet net/mlx5: Move header encap type to IFC header file net/mlx5: Break encap/decap into two separated flow table creation flags net/mlx5: Add support for more namespaces when allocating modify header net/mlx5: Export modify header alloc/dealloc functions net/mlx5: Add proper NIC TX steering flow tables support net/mlx5: Cleanup flow namespace getter switch logic net/mlx5: Add memic command opcode to command checker ... Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-01net/mlx5: E-Switch, Fix out of bound access when setting vport rateEran Ben Elisha1-2/+2
The code that deals with eswitch vport bw guarantee was going beyond the eswitch vport array limit, fix that. This was pointed out by the kernel address sanitizer (KASAN). The error from KASAN log: [2018-09-15 15:04:45] BUG: KASAN: slab-out-of-bounds in mlx5_eswitch_set_vport_rate+0x8c1/0xae0 [mlx5_core] Fixes: c9497c98901c ("net/mlx5: Add support for setting VF min rate") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-09-05{net, RDMA}/mlx5: Rename encap to reformat packetMark Bloch1-1/+1
Renames all encap mlx5_{core,ib} code to use the new naming of packet reformat. This change doesn't introduce any function change and is needed to properly reflect the operation being done by this action. For example not only can we encapsulate a packet, but also decapsulate it. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-08-08net/mlx5: E-Switch, Remove unused argument when creating legacy FDBEli Cohen1-2/+2
Remove unused nvports argument. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08net/mlx5: Rename modify/query_vport state related enumsEran Ben Elisha1-5/+5
Modify and query vport state commands share the same admin_state and op_mod values, rename the enums to fit them both. In addition, remove the esw prefix from the admin state enum as this also applied for vnic. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31net/mlx5e: E-Switch, Initialize eswitch only if eswitch managerEli Cohen1-2/+2
Execute mlx5_eswitch_init() only if we have MLX5_ESWITCH_MANAGER capabilities. Do the same for mlx5_eswitch_cleanup(). Fixes: a9f7705ffd66 ("net/mlx5: Unify vport manager capability check") Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-18net/mlx5: E-Switch, UBSAN fix undefined behavior in mlx5_eswitch_modeSaeed Mahameed1-1/+1
With debug kernel UBSAN detects the following issue, which might happen when eswitch instance is not created, fix this by testing the eswitch pointer before returning the eswitch mode, if not set return mode = SRIOV_NONE. [ 32.528951] UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/eswitch.c:2219:12 [ 32.528951] member access within null pointer of type 'struct mlx5_eswitch' [ 32.528951] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-rc3-dirty #181 [ 32.528951] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 [ 32.528951] Call Trace: [ 32.528951] dump_stack+0xc7/0x13b [ 32.528951] ? show_regs_print_info+0x5/0x5 [ 32.528951] ? __pm_runtime_use_autosuspend+0x140/0x140 [ 32.528951] ubsan_epilogue+0x9/0x49 [ 32.528951] ubsan_type_mismatch_common+0x1f9/0x2c0 [ 32.528951] ? ucs2_as_utf8+0x310/0x310 [ 32.528951] ? device_initialize+0x229/0x2e0 [ 32.528951] __ubsan_handle_type_mismatch+0x9f/0xc9 [ 32.528951] ? __ubsan_handle_divrem_overflow+0x19b/0x19b [ 32.578008] ? ib_device_get_by_index+0xf0/0xf0 [ 32.578008] mlx5_eswitch_mode+0x30/0x40 [ 32.578008] mlx5_ib_add+0x1e0/0x4a0 Fixes: 57cbd893c4c5 ("net/mlx5: E-Switch, Move representors definition to a global scope") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-06-26net/mlx5: E-Switch, Disallow vlan/spoofcheck setup if not being esw managerEli Cohen1-7/+5
In smartnic env, if the host (PF) driver is not an e-switch manager, we are not allowed to apply eswitch ports setups such as vlan (VST), spoof-checks, min/max rate or state. Make sure we are eswitch manager when coming to issue these callbacks and err otherwise. Also fix the definition of ESW_ALLOWED to rely on eswitch_manager capability and on the vport_group_manger. Operations on the VF nic vport context, such as setting a mac or reading the vport counters are allowed to the PF in this scheme. The modify nic vport guid code was modified to omit checking the nic_vport_node_guid_modify eswitch capability. The reason for doing so is that modifying node guid requires vport group manager capability, and there's no need to check further capabilities. 1. set_vf_vlan - disallowed 2. set_vf_spoofchk - disallowed 3. set_vf_mac - allowed 4. get_vf_config - allowed 5. set_vf_trust - disallowed 6. set_vf_rate - disallowed 7. get_vf_stat - allowed 8. set_vf_link_state - disallowed Fixes: f942380c1239 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-06-26net/mlx5: E-Switch, Avoid setup attempt if not being e-switch managerOr Gerlitz1-1/+1
In smartnic env, the host (PF) driver might not be an e-switch manager, hence the FW will err on driver attempts to deal with setting/unsetting the eswitch and as a result the overall setup of sriov will fail. Fix that by avoiding the operation if e-switch management is not allowed for this driver instance. While here, move to use the correct name for the esw manager capability name. Fixes: 81848731ff40 ('net/mlx5: E-Switch, Add SR-IOV (FDB) support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Guy Kushnir <guyk@mellanox.com> Reviewed-by: Eli Cohen <eli@melloanox.com> Tested-by: Eli Cohen <eli@melloanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-06-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-9/+6
Pull rdma updates from Jason Gunthorpe: "This has been a quiet cycle for RDMA, the big bulk is the usual smallish driver updates and bug fixes. About four new uAPI related things. Not as much Szykaller patches this time, the bugs it finds are getting harder to fix. Summary: - More work cleaning up the RDMA CM code - Usual driver bug fixes and cleanups for qedr, qib, hfi1, hns, i40iw, iw_cxgb4, mlx5, rxe - Driver specific resource tracking and reporting via netlink - Continued work for name space support from Parav - MPLS support for the verbs flow steering uAPI - A few tricky IPoIB fixes improving robustness - HFI1 driver support for the '16B' management packet format - Some auditing to not print kernel pointers via %llx or similar - Mark the entire 'UCM' user-space interface as BROKEN with the intent to remove it entirely. The user space side of this was long ago replaced with RDMA-CM and syzkaller is finding bugs in the residual UCM interface nobody wishes to fix because nobody uses it. - Purge more bogus BUG_ON's from Leon - 'flow counters' verbs uAPI - T10 fixups for iser/isert, these are Acked by Martin but going through the RDMA tree due to dependencies" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (138 commits) RDMA/mlx5: Update SPDX tags to show proper license RDMA/restrack: Change SPDX tag to properly reflect license IB/hfi1: Fix comment on default hdr entry size IB/hfi1: Rename exp_lock to exp_mutex IB/hfi1: Add bypass register defines and replace blind constants IB/hfi1: Remove unused variable IB/hfi1: Ensure VL index is within bounds IB/hfi1: Fix user context tail allocation for DMA_RTAIL IB/hns: Use zeroing memory allocator instead of allocator/memset infiniband: fix a possible use-after-free bug iw_cxgb4: add INFINIBAND_ADDR_TRANS dependency IB/isert: use T10-PI check mask definitions from core layer IB/iser: use T10-PI check mask definitions from core layer RDMA/core: introduce check masks for T10-PI offload IB/isert: fix T10-pi check mask setting IB/mlx5: Add counters read support IB/mlx5: Add flow counters read support IB/mlx5: Add flow counters binding support IB/mlx5: Add counters create and destroy support IB/uverbs: Add support for flow counters ...
2018-06-02net/mlx5: Use flow counter pointer as input to the query functionOr Gerlitz1-9/+6
This allows to un-expose the details of struct mlx5_fc and keep it internal to the core driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-25net/mlx5: E-Switch, Reorganize and rename fdb flow tablesChris Mi1-11/+11
We have several fdb flow tables for each of the legacy and switchdev modes. In the switchdev mode, there are fast path and slow path flow tables. Towards adding more flow tables in upcoming patches, reorganize and rename the various existing ones to reflect their functionality. Signed-off-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-18Merge tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxDavid S. Miller1-1/+1
Saeed Mahameed says: ==================== mlx5-updates-2018-05-17 mlx5 core dirver updates for both net-next and rdma-next branches. From Christophe JAILLET, first three patche to use kvfree where needed. From: Or Gerlitz <ogerlitz@mellanox.com> Next six patches from Roi and Co adds support for merged sriov e-switch which comes to serve cases where both PFs, VFs set on them and both uplinks are to be used in single v-switch SW model. When merged e-switch is supported, the per-port e-switch is logically merged into one e-switch that spans both physical ports and all the VFs. This model allows to offload TC eswitch rules between VFs belonging to different PFs (and hence have different eswitch affinity), it also sets the some of the foundations needed for uplink LAG support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-17net/mlx5: Add destination e-switch ownerShahar Klein1-1/+1
The destination e-switch owner allows a rule in namespace of one e-switch owner to point to a vport that is natively associated with another e-switch owner. Signed-off-by: Shahar Klein <shahark@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-10net/mlx5: E-Switch, Include VF RDMA stats in vport statisticsAdi Nissim1-1/+10
The host side reporting of VF vport statistics didn't include the VF RDMA traffic. Fixes: 3b751a2a418a ("net/mlx5: E-Switch, Introduce get vf statistics") Signed-off-by: Adi Nissim <adin@mellanox.com> Reported-by: Ariel Almog <ariela@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-26net/mlx5: Add packet dropped while vport down statisticsMoshe Shemesh1-5/+26
Added the following packets dropped while vport down statistics: Rx dropped while vport down - counts packets which were steered by e-switch to a vport, but dropped since the vport was down. This counter will be shown on ip link tool as part of the vport rx_dropped counter. Tx dropped while vport down - counts packets which were transmitted by a vport, but dropped due to vport logical link down. This counter will be shown on ip link tool as part of the vport tx_dropped counter. The counters are read from FW by command QUERY_VNIC_ENV. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-28Merge tag 'mlx5-updates-2018-02-23' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxDavid S. Miller1-2/+21
Saeed Mahameed says: mlx5-update-2018-02-23 (IB representors) From: Mark Bloch <markb@mellanox.com> ========= Add IB representor when in switchdev mode The following series adds support for an IB (RAW Ethernet only) device representor which is created when the user switches to switchdev mode. Today when switching to switchdev mode the only representors which are created are net devices. Each netdev is a representor of a virtual function and any data sent via the representor is received on the virtual function, and any data sent via the virtual function is received by the representor. For the mlx5 driver the main use of this functionality is to be able to use Open vSwitch on the hypervisor in order to manage/control traffic from/to the virtual functions. Open vSwitch can also work with DPDK devices and not just net devices, this series exposes an IB device, which Mellanox PMD driver uses, which then can be used by Open vSwitch DPDK. An IB device representor exposes only RAW Ethernet QP capabilities and the ability to create flow rules to direct traffic to its RX queues. The state of the IB device (ACTIVE/DOWN etc..) is based on the state of the corresponding net device representor. No other RDMA/RoCE functionality is currently supported and no GID table is exposed. ========= Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-23net/mlx5: E-Switch, Reload IB interface when switching devlink modesMark Bloch1-2/+15
Up until this point it wasn't possible to activate IB representors when switching to switchdev mode, remove this limitation. We trigger reload of the PF IB interface in order to make sure that already allocated resources are invalid and new resources will be opened correctly with all the limitations of switchdev mode applied (only raw packet capabilities, without RoCE). We also move the remove/add to a place where the E-Switch mode is set/unset to better control when to trigger this action, this will allow the IB side to start in the correct mode. For better code reuse, create a function which reloads an interface and export it. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Move representors definition to a global scopeMark Bloch1-0/+6
In preparation for IB representors, move representors structs to a global scope, also expose functions needed for registration, unregistration, eswitch mode and creating a flow rule to direct traffic from SQs to the right VF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-20net/mlx5: E-Switch, Fix drop counters use before creationEugenia Emantayev1-4/+4
First use of drop counters happens in esw_apply_vport_conf function, while they are allocated later in the flow. Fix that by moving esw_vport_create_drop_counters function to be called before the first use. Fixes: b8a0dbe3a90b ("net/mlx5e: E-switch, Add steering drop counters") Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-09net/mlx5e: E-switch, Add steering drop countersEugenia Emantayev1-2/+97
Add flow counters to count packets dropped due to drop rules configured in eswitch egress and ingress ACLs. These counters will count VFs violations and incoming traffic drops. Will be presented on hypervisor via standard 'ip -s link show' command. Example: "ip -s link show dev enp5s0f0" 6: enp5s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 24:8a:07:a5:28:f0 brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped overrun mcast 0 0 0 0 0 2 TX: bytes packets errors dropped carrier collsns 1406 17 0 0 0 0 vf 0 MAC 00:00:ca:fe:ca:fe, vlan 5, spoof checking off, link-state auto, trust off, query_rss off RX: bytes packets mcast bcast dropped 1666 29 14 32 0 TX: bytes packets dropped 2880 44 2412 Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29net/mlx5: Separate ingress/egress namespaces for each vportGal Pressman1-4/+6
Each vport has its own root flow table for the ACL flow tables and root flow table is per namespace, therefore we should create a namespace for each vport. Fixes: efdc810ba39d ("net/mlx5: Flow steering, Add vport ACL support") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29net/mlx5e: E-Switch, Use the name of static array instead of its addressGal Pressman1-13/+13
Using the address of a static array is the same as using its name (in this specific use-case), but it's confusing and makes the code less readable. Fixes: 1bd27b11c1df ("net/mlx5: Introduce E-switch QoS management") Fixes: bd77bf1cb595 ("net/mlx5: Add SRIOV VF max rate configuration support") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-28net/mlx5: E-Switch, Refactor vport representors initializationMark Bloch1-8/+4
Refactor the init stage of vport representors registration. vport number and hw id can be assigned by the E-Switch driver and not by the netdevice driver. While here, make the error path of mlx5_eswitch_init() a reverse order of the good path, also use kcalloc to allocate an array instead of kzalloc. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-11-04net/mlx5: Initialize destination_flow struct to 0Rabie Loulou1-1/+1
This is needed in order to enlarge it with more members that will get value of 0 when not set. Signed-off-by: Rabie Loulou <rabiel@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-09-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-166/+45
Pull networking updates from David Miller: 1) Support ipv6 checksum offload in sunvnet driver, from Shannon Nelson. 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric Dumazet. 3) Allow generic XDP to work on virtual devices, from John Fastabend. 4) Add bpf device maps and XDP_REDIRECT, which can be used to build arbitrary switching frameworks using XDP. From John Fastabend. 5) Remove UFO offloads from the tree, gave us little other than bugs. 6) Remove the IPSEC flow cache, from Florian Westphal. 7) Support ipv6 route offload in mlxsw driver. 8) Support VF representors in bnxt_en, from Sathya Perla. 9) Add support for forward error correction modes to ethtool, from Vidya Sagar Ravipati. 10) Add time filter for packet scheduler action dumping, from Jamal Hadi Salim. 11) Extend the zerocopy sendmsg() used by virtio and tap to regular sockets via MSG_ZEROCOPY. From Willem de Bruijn. 12) Significantly rework value tracking in the BPF verifier, from Edward Cree. 13) Add new jump instructions to eBPF, from Daniel Borkmann. 14) Rework rtnetlink plumbing so that operations can be run without taking the RTNL semaphore. From Florian Westphal. 15) Support XDP in tap driver, from Jason Wang. 16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal. 17) Add Huawei hinic ethernet driver. 18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan Delalande. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits) i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq i40e: avoid NVM acquire deadlock during NVM update drivers: net: xgene: Remove return statement from void function drivers: net: xgene: Configure tx/rx delay for ACPI drivers: net: xgene: Read tx/rx delay for ACPI rocker: fix kcalloc parameter order rds: Fix non-atomic operation on shared flag variable net: sched: don't use GFP_KERNEL under spin lock vhost_net: correctly check tx avail during rx busy polling net: mdio-mux: add mdio_mux parameter to mdio_mux_init() rxrpc: Make service connection lookup always check for retry net: stmmac: Delete dead code for MDIO registration gianfar: Fix Tx flow control deactivation cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6 cxgb4: Fix pause frame count in t4_get_port_stats cxgb4: fix memory leak tun: rename generic_xdp to skb_xdp tun: reserve extra headroom only when XDP is set net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping net: dsa: bcm_sf2: Advertise number of egress queues ...
2017-09-03Merge tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds1-1/+1
Pull rdma updates from Doug Ledford: "This is a big pull request. Of note is that I'm sending you the new ioctl API for the rdma subsystem. We put it up on linux-api@, but didn't get much response. The API is complex, but it solves two different problems in one go: 1) The bi-directional nature of the RDMA file write calls, which created the security hole we had to handle (and for which the fix is now causing problems for systems in production, we were a bit over zealous in the fix and the ability to open a device, then fork, then create new queue pairs on the device and use them is broken). 2) The bloat caused by different vendors implementing extensions to the base verbs API. Each vendor's hardware is slightly different, and the hardware might be suitable for one extension but not another. By the time we add generic extensions for all the different ways that the different hardware can offload things, the API becomes bloated. Things like our completion structs have started to exceed a cache line in size because of all the elements needed to support this. That in turn shows up heavily in the performance graphs with a noticable drop in performance on 100Gigabit links as our completion structs go from occupying one cache line to 1+. This API makes things like the completion structs modular in a very similar way to netlink so that your structs can only include the items needed for the offloads/features you are actually using on a given queue pair. In that way we support everything, but only use what we need, and our structs stay smaller. The ioctl API is better explained by the posting on linux-api@ than I can explain it here, so I'll just leave it at that. The rest of the pull request is typical stuff. Updates for 4.14 kernel merge window - Lots of hfi1 driver updates (mixed with a few qib and core updates as well) - rxe updates - various mlx updates - Set default roce type to RoCEv2 - Several larger fixes for bnxt_re that were too big for -rc - Several larger fixes for qedr that, likewise, were too big for -rc - Misc core changes - Make the hns_roce driver compilable on arches other than aarch64 so we can more easily debug build issues related to it - Add rdma-netlink infrastructure updates - Add automatic IRQ affinity infrastructure - Add 32bit lid support - Lots of misc fixes across the subsystem from random people - Autoloading of RDMA netlink modules - PCI pool cleanups from Romain Perier - mlx5 driver feature additions and fixes - Hardware tag matchine feature - Fix sleeping in atomic when resolving roce ah - Add experimental ioctl interface as posted to linux-api@" * tag 'for-linus-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (328 commits) IB/core: Expose ioctl interface through experimental Kconfig IB/core: Assign root to all drivers IB/core: Add completion queue (cq) object actions IB/core: Add legacy driver's user-data IB/core: Export ioctl enum types to user-space IB/core: Explicitly destroy an object while keeping uobject IB/core: Add macros for declaring methods and attributes IB/core: Add uverbs merge trees functionality IB/core: Add DEVICE object and root tree structure IB/core: Declare an object instead of declaring only type attributes IB/core: Add new ioctl interface RDMA/vmw_pvrdma: Fix a signedness RDMA/vmw_pvrdma: Report network header type in WC IB/core: Add might_sleep() annotation to ib_init_ah_from_wc() IB/cm: Fix sleeping in atomic when RoCE is used IB/core: Add support to finalize objects in one transaction IB/core: Add a generic way to execute an operation on a uobject Documentation: Hardware tag matching IB/mlx5: Support IB_SRQT_TM net/mlx5: Add XRQ support ...
2017-08-18mlx5: ensure 0 is returned when vport is zeroColin Ian King1-1/+1
Currently, if vport is zero then then an uninialized return status in err is returned. Since the only return status at the end of the function esw_add_uc_addr is zero for the current set of return paths we may as well just return 0 rather than err to fix this issue. Detected by CoverityScan, CID#1452698 ("Uninitialized scalar variable") Fixes: eeb66cdb6826 ("net/mlx5: Separate between E-Switch and MPFS") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08mlx5: convert to generic pci_alloc_irq_vectorsSagi Grimberg1-1/+1
Now that we have a generic code to allocate an array of irq vectors and even correctly spread their affinity, correctly handle cpu hotplug events and more, were much better off using it. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-07net/mlx5: Separate between E-Switch and MPFSSaeed Mahameed1-152/+38
Multi-Physical Function Switch (MPFs) is required for when multi-PF configuration is enabled to allow passing user configured unicast MAC addresses to the requesting PF. Before this patch eswitch.c used to manage the HW MPFS l2 table, E-Switch always (regardless of sriov) enabled vport(0) (NIC PF) vport's contexts update on unicast mac address list changes, to populate the PF's MPFS L2 table accordingly. In downstream patch we would like to allow compiling the driver without E-Switch functionalities, for that we move MPFS l2 table logic out of eswitch.c into its own file, and provide Kconfig flag (MLX5_MPFS) to allow compiling out MPFS for those who don't want Multi-PF support. NIC PF netdevice will now directly update MPFS l2 table via the new MPFS API. VF netdevice has no access to MPFS L2 table, so E-Switch will remain responsible of updating its MPFS l2 table on behalf of its VFs. Due to this change we also don't require enabling vport(0) (PF vport) unicast mac changes events anymore, for when SRIOV is not enabled. Which means E-Switch is now activated only on SRIOV activation, and not required otherwise. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com
2017-08-07net/mlx5: Unify vport manager capability checkSaeed Mahameed1-15/+8
Expose MLX5_VPORT_MANAGER macro to check for strict vport manager E-switch and MPFS (Multi Physical Function Switch) abilities. VPORT manager must be a PF with an ethernet link and with FW advertised vport group manager capability Replace older checks with the new macro and use it where needed in eswitch.c and mlx5e netdev eswitch related flows. The same macro will be reused in MPFS separation downstream patch. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-07-27net/mlx5: Clean SRIOV eswitch resources upon VF creation failureEran Ben Elisha1-1/+2
Upon sriov enable, eswitch is always enabled. Currently, if enable hca failed over all VFs, we would skip eswitch disable as part of sriov disable, which will lead to resources leak. Fix it by disabling eswitch if it was enabled (use indication from eswitch mode). Fixes: 6b6adee3dad2 ('net/mlx5: SRIOV core code refactoring') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-16net/mlx5: Avoid blank lines before/after closing/opening bracesOr Gerlitz1-1/+0
Fixed checkpatch complaints on that: CHECK: Blank lines aren't necessary before a close brace '}' CHECK: Blank lines aren't necessary after an open brace '{' and one on missing blank line.. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-08net/mlx5e: Add cache for HW modify header IDsOr Gerlitz1-0/+1
Packets belonging to flows which are different by matching may still need to go through the same header re-write. Add a cache for header re-write IDs keyed by the binary chain of modify header actions. The caching is supported for both eswitch and NIC use-cases, where the actual conversion of the code to use caching comes in next patches, one per use-case. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-05-14{net, IB}/mlx5: Replace mlx5_vzalloc with kvzallocLeon Romanovsky1-15/+9
Commit a7c3e901a46f ("mm: introduce kv[mz]alloc helpers") added proper implementation of mlx5_vzalloc function to the MM core. This made the mlx5_vzalloc function useless, so let's remove it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30net/mlx5: E-Switch, Avoid redundant memory allocationEli Cohen1-18/+2
struct esw_mc_addr is a small struct that can be part of struct mlx5_eswitch. Define it as a field and not as a pointer and save the kzalloc call and then error flow handling. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-22net/mlx5: E-Switch, Add control for encapsulationRoi Dayan1-0/+5
Implement the devlink e-switch encapsulation control set and get callbacks. Apply the value set by the user on the switchdev offloads mode when creating the fast FDB table where offloaded rules will be set. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-17net/mlx5: Refactor create flow table method to accept underlay QPErez Shitrit1-1/+4
IB flow tables need the underlay qp to perform flow steering. Here we change the API of the flow tables creation to accept the underlay QP number as a parameter in order to support IB (IPoIB) flow steering. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-5/+5
All merge conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-29net/mlx5: Return EOPNOTSUPP when failing to get steering name-spaceOr Gerlitz1-3/+3
When we fail to retrieve a hardware steering name-space, the returned error code should say that this operation is not supported. Align the various places in the driver where this call is made to this convention. Also, make sure to warn when we fail to retrieve a SW (ANCHOR) name-space. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-29net/mlx5: Change ENOTSUPP to EOPNOTSUPPOr Gerlitz1-2/+2
As ENOTSUPP is specific to NFS, change the return error value to EOPNOTSUPP in various places in the mlx5 driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-24net/mlx5: Add support for setting VF min rateMohamad Haj Yahia1-6/+91
Add support for SRIOV VF min rate guarantee by using the TSAR BW share weights mechanism. The TSAR BW share vport attribute represents the weight of that vport among the other vports weights which means that the actual vport BW percentage is the same vport weight percentage among the total vports weights sum. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-19net/mlx5: Add support to s-tag in mlx5 firmware interfaceMohamad Haj Yahia1-6/+6
Add svlan_tag and rename vlan_tag to cvlan_tag in flow table entry match param. Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2016-12-28net/mlx5: Prevent setting multicast macs for VFsMohamad Haj Yahia1-1/+1
Need to check that VF mac address entered by the admin user is either zero or unicast mac. Multicast mac addresses are prohibited. Fixes: 77256579c6b4 ('net/mlx5: E-Switch, Introduce Vport administration functions') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-24net/mlx5: E-Switch, Add control for inline modeRoi Dayan1-0/+1
Implement devlink show and set of HW inline-mode. The supported modes: none, link, network, transport. We currently support one mode for all vports so set is done on all vports. When eswitch is first initialized the inline-mode is queried from the FW. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net/mlx5e: Add basic TC tunnel set action for SRIOV offloadsHadar Hen Zion1-0/+1
In mlx5 HW, encapsulation is offloaded by the steering rule having index into an encapsulation table containing the entire set of headers to be added by the HW. The driver sets these headers in a buffer when we are offloading the action. The code maintains mlx5_encap_entry for each encap header it has encountered when attempted to offload TC tunnel set action. This entry maintains a linked list of all the flows sharing the same encap header, when the last flow is removed from the list the encap entry is removed. The actual encap_header is allocated by the driver in the hardware only if we have layer two neighbour info when the encap entry is created. While the flow is in the driver, the driver holds a reference on the neighbour. When a new flow with encap action is inserted, the code first checks if the required encap entry exists according to the tunnel set parameters. If it does the encap is shared, otherwise a new mlx5_encap_entry is created. TC action parsing implementation in the driver assumes that tunnel set action is provided in the same order set by the user, e.g before the mirred_redirect action. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net/mlx5: Support encap id when setting new steering entryHadar Hen Zion1-10/+13
In order to support steering rules which add encapsulation headers, encap_id parameter is needed. Add new mlx5_flow_act struct which holds action related parameter: action, flow_tag and encap_id. Use mlx5_flow_act struct when adding a new steering rule. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09net/mlx5: Add creation flags when adding new flow tableHadar Hen Zion1-1/+1
When creating flow tables, allow the caller to specify creation flags. Currently no flags are used and as such this patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>