aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2019-02-06net/mlx5e: Update hw flows when encap source mac changedTonghao Zhang1-0/+1
When we offload tc filters to hardware, hardware flows can be updated when mac of encap destination ip is changed. But we ignore one case, that the mac of local encap ip can be changed too, so we should also update them. To fix it, add route_dev in mlx5e_encap_entry struct to save the local encap netdevice, and when mac changed, kernel will flush all the neighbour on the netdevice and send NETEVENT_NEIGH_UPDATE event. The mlx5 driver will delete the flows and add them when neighbour available again. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Cc: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-20net/mlx5e: Fail attempt to offload e-switch TC flows with egress upper devicesEli Britstein1-0/+3
We use the switchdev parent HW id helper to identify if the mirred device shares the same ASIC/port with the ingress device. This can get us wrong in the presence of upper devices such as vlan or bridge set over the HW devices (VF or uplink representors), b/c the switchdev ID is retrieved recursively. To fail offload attempts in such cases, we condition the check on the egress device to have not only the same switchdev ID but also the relevant mlx5 netdev ops. Fixes: 03a9d11e6eeb ('net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-17net/mlx5e: Uninstantiate esw manager vport netdev on switchdev modeOr Gerlitz1-4/+2
Now, when we have a dedicated uplink representor, the netdev instance set over the esw manager vport (PF) is of no-use. As such, remove it once we're on switchdev mode and get it back to life when off switchdev. This is done by reloading the Ethernet interface as well (we already do that for the IB interface) from the eswitch code while going in/out of switchdev mode. The Eth add/remove entries are modified to act differently when called in switchdev mode. In this case we only deal with registration of the eth vport representors. The rep netdevices are created from the eswitch call to load the registered eth representors. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-17net/mlx5e: Remove leftover code from the PF netdev being uplink repOr Gerlitz1-3/+0
Remove some last leftovers from using the PF netdev as the e-switch uplink representor. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-17net/mlx5e: Use dedicated uplink vport netdev representorOr Gerlitz1-5/+8
Currently, when running in sriov switchdev mode, we are using the PF netdevice as the uplink representor, this is problematic from few aspects: - will break when the PF isn't eswitch manager (e.g smart NIC env) - misalignment with other NIC switchdev drivers - makes us have and maintain special code, hurts the driver quality/robustness - which in turn opens the door for future bugs As of each and all of the above, we move to have a dedicated netdev representor for the uplink vport in a similar manner done for for the VF vports. This includes the following: 1. have an uplink rep netdev as we have for VF reps 2. all reps use same load/unload functions 3. HW stats for uplink based on physical port counters and not vport counters 4. link state for the uplink managed through PAOS and not vport state 5. the uplink rep has sysfs link to the PF PCI function && uses the PF MAC address Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5e: Branch according to classified tunnel typeOz Shlomo1-0/+2
Currently the tunnel offloading encap/decap methods assumes that VXLAN is the sole tunneling protocol. Lay the infrastructure for supporting multiple tunneling protocols by branching according to the tunnel net device kind. Encap filters tunnel type is determined according to the egress/mirred net device. Decap filters classify the tunnel type according to the filter's ingress net device kind. Distinguish between the tunnel type as defined by the SW model and the FW reformat type that specifies the HW operation being made. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5e: Support TC indirect block notifications for eswitch uplink reprsOz Shlomo1-0/+13
Towards using this mechanism as the means to offload tunnel decap rules set on SW tunnel devices instead of egdev, add the supporting structures and functions. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-10net/mlx5e: Store eswitch uplink representor state on a dedicated structOz Shlomo1-1/+8
Currently only a single field in the representor private structure is relevant for uplink representors. As a pre-step to allow adding additional uplink representor fields, introduce uplink representor private structure. This is prepration step towards replacing egdev logic with the indirect block notification mechanism. This patch doesn't change any functionality. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-05-17net/mlx5e: Use shared table for offloaded TC eswitch flowsOr Gerlitz1-0/+1
Currently, each representor netdev use their own hash table to keep the mapping from TC flow (f->cookie) to the driver offloaded instance. The table is the one which originally was added for offloading TC NIC (not eswitch) rules. This scheme breaks when the core TC code calls us to add the same flow twice, (e.g under egdev use case) since we don't spot that and offload a 2nd flow into the HW with the wrong source vport. As a pre-step to solve that, we move to use a single table which keeps all offloaded TC eswitch flows. The table is located at the eswitch uplink representor object. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29net/mlx5e: E-Switch, Move send-to-vport rule struct to en_repMark Bloch1-0/+5
Move struct mlx5_esw_sq which keeps send-to-vport rule to from the eswitch code to mlx5e and rename it to better reflect where it belongs Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-29net/mlx5: E-Switch, Create generic header struct to be used by representorsMark Bloch1-1/+1
Now that we don't store type dependent data in struct mlx5_eswitch_rep we can create a generic interface, and representor type. struct mlx5_eswitch_rep will store an array of interfaces, each interface is used by a different representor type. Once we moved to a more generic interface, rdma driver representors can be added and utilize the same mechanism as the Ethernet driver representors use. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-12-28net/mlx5e: Move ethernet representors data into separate structMark Bloch1-0/+9
Ethernet representors have a need to store data which is applicable only for them. Create a priv void pointer in struct mlx5_eswitch_rep and move mlx5e to store the relevant data there. As part of this change we also initialize rep_if in mlx5e_rep_register_vf_vports() as otherwise the E-Switch code will copy a priv value which is garbage. We also rename mlx5_eswitch_get_uplink_netdev() to mlx5_eswitch_get_uplink_priv() and make it return void *. This way E-Switch code doesn't need to deal with net devices and we leave the task of getting it to mlx5e. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-08-07net/mlx5: Add CONFIG_MLX5_ESWITCH KconfigSaeed Mahameed1-0/+8
Allow to selectively build the driver with or without sriov eswitch, VF representors and TC offloads. Also remove the need of two ndo ops structures (sriov & basic) and keep only one unified ndo ops, compile out VF SRIOV ndos when not needed (MLX5_ESWITCH=n), and for VF netdev calling those ndos will result in returning -EPERM. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: Jes Sorensen <jsorensen@fb.com> Cc: kernel-team@fb.com
2017-08-07net/mlx5e: NIC netdev init flow cleanupSaeed Mahameed1-0/+1
Remove redundant call to unregister vport representor in mlx5e_add error flow. Hide the representor priv and eswitch internal structures from en_main.c as preparation step for downstream patches which would allow building the driver without support for representors and eswitch. Fixes: 6f08a22c5fb2 ("net/mlx5e: Register/unregister vport representors on interface attach/detach") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2017-04-30net/mlx5e: Update neighbour 'used' state using HW flow rules countersHadar Hen Zion1-0/+11
When IP tunnel encapsulation rules are offloaded, the kernel can't see the traffic of the offloaded flow. The neighbour for the IP tunnel destination of the offloaded flow can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To make sure that a neighbour which is used by the HW won't become STALE, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME period, when packets were matched and counted by the HW for one of the tunnel encap flows related to this neighbour. The periodic task that updates the used neighbours is scheduled when a tunnel encap rule is successfully offloaded into HW and keeps re-scheduling itself as long as the representor's neighbours list isn't empty. Add, remove, lookup and status change operations done over the representor's neighbours list or the neighbour hash entry encaps list are all serialized by RTNL lock. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30net/mlx5e: Add support to neighbour update flowHadar Hen Zion1-1/+37
In order to offload TC encap rules, the driver does a lookup for the IP tunnel neighbour according to the output device and the destination IP given by the user. To keep tracking after the validity state of such neighbours, we keep the neighbours information (pair of device pointer and destination IP) in a hash table maintained at the relevant egress representor and register to get NETEVENT_NEIGH_UPDATE events. When getting neighbour update netevent, we search for a match among the cached neighbours entries used for encapsulation. In case the neighbour isn't valid, we can't offload the flow into the HW. We cache the flow (requested matching and actions) in the driver and offload the rule later, when the neighbour is resolved and becomes valid. When a flow is only cached in the driver and not offloaded into HW yet, we use EAGAIN return value to mark it internally, the TC ndo still returns success. Listen to kernel neighbour update netevents to trace relevant neighbours validity state: 1. If a neighbour becomes valid, offload the related rules to HW. 2. If the neighbour becomes invalid, remove the related rules from HW. 3. If the neighbour mac address was changed, update the encap header. Remove all the offloaded rules using the old encap header from the HW and insert new rules to HW with updated encap header. Access to the neighbors hash table is protected by RTNL lock of its caller or by the table's spinlock. Details of the locking/synchronization among the different actions applied on the neighbour table: Add/remove operations - protected by RTNL lock of its caller (all TC commands are protected by RTNL lock). Add and remove operations are initiated only when the user inserts/removes a TC rule into/from the driver. Lookup/remove operations - since the lookup operation is done from netevent notifier block, RTNL lock can't be used (atomic context). Use the table's spin lock to protect lookups from TC user removal operation. bh is used since netevent can be called from a softirq context. Lookup/add operations - The hash table access functions are taking care of the protection between lookup and add operations. When adding/removing encap headers and rules to/from the HW, RTNL lock is used. It can happen when: 1. The user inserts/removes a TC rule into/from the driver (TC commands are protected by RTNL lock of it's caller). 2. The driver gets neighbour notification event, which reports about neighbour validity status change. Before adding/removing encap headers and rules to/from the HW, RTNL lock is taken. A neighbour hash table entry should be freed when its encap list is empty. Since The neighbour update netevent notification schedules a neighbour update work that uses the neighbour hash entry, it can't be freed unconditionally when the encap list becomes empty during TC delete rule flow. Use reference count to protect from freeing neighbour hash table entry while it's still in use. When the user asks to unregister a netdvice used by one of the neigbours, neighbour removal notification is received. Then we take a reference on the neighbour and don't free it until the relevant encap entries (and flows) are marked as invalid (not offloaded) and removed from HW. As long as the encap entry is still valid (checked under RTNL lock) we can safely access the neighbour device saved on mlx5e_neigh struct. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30net/mlx5e: Add neighbour hash table to the representorsHadar Hen Zion1-0/+30
Add hash table to the representors which is to be used by the next patch to save neighbours information in the driver. In order to offload IP tunnel encapsulation rules, the driver must find the tunnel dst neighbour according to the output device and the destination address given by the user. The next patch will cache the neighbors information in the driver to allow support in neigh update flow for tunnel encap rules. The neighbour entries are also saved in a list so we easily iterate over them when querying statistics in order to provide 'used' feedback to the kernel neighbour NUD core. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30net/mlx5e: Move the encap entry structure from the eswitch headerOr Gerlitz1-0/+13
The encap entry structure isn't manipulated by the eswitch code, hence it can/needs to be removed from the eswitch header. Do that, and change it to have mlx5e_ prefix. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30net/mlx5e: Extendable vport representor netdev private dataSaeed Mahameed1-0/+55
Make representor netdev private data extendable by adding new struct "mlx5e_rep_priv" and use it as the rep netdev private data struct instead of directly pointing to mlx5_eswitch_rep. Added new en_rep.h header file to contain all representor related definitions and prototypes, and moved all representor specific logic into en_rep.c. Needed for downstream patches to extend representor functionality to support neighbour update. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>