aboutsummaryrefslogtreecommitdiffstatshomepage
AgeCommit message (Collapse)AuthorFilesLines
2020-03-17net: bridge: vlan tunnel: constify bridge and port argumentsNikolay Aleksandrov3-11/+13
The vlan tunnel code changes vlan options, it shouldn't touch port or bridge options so we can constify the port argument. This would later help us to re-use these functions from the vlan options code. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: bridge: vlan options: rename br_vlan_opts_eq to br_vlan_opts_eq_rangeNikolay Aleksandrov3-7/+7
It is more appropriate name as it shows the intent of why we need to check the options' state. It also allows us to give meaning to the two arguments of the function: the first is the current vlan (v_curr) being checked if it could enter the range ending in the second one (range_end). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'stmmac-100GB-Enterprise-MAC-support'David S. Miller8-3/+340
Jose Abreu says: ==================== net: stmmac: 100GB Enterprise MAC support Adds the support for Enterprise MAC IP version which allows operating speeds up to 100GB. Patch 1/4, adds the support in XPCS for XLGMII interface that is used in this kind of Enterprise MAC IPs. Patch 2/4, adds the XLGMII interface support in stmmac. Patch 3/4, adds the HW specific support for Enterprise MAC. We end in patch 4/4, by updating stmmac documentation to mention the support for this new IP version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Documentation: networking: stmmac: Mention new XLGMAC supportJose Abreu1-2/+5
Add the Enterprise MAC support to the list of supported IP versions and the newly added XLGMII interface support. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: stmmac: Add support for Enterprise MAC versionJose Abreu5-1/+173
Adds the support for Enterprise MAC IP version which is very similar to XGMAC. It's so similar that we just need to check the device id and add new speeds definitions and some minor callbacks. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: stmmac: Add XLGMII supportJose Abreu2-0/+64
Add XLGMII support for stmmac including the list of speeds and defines for them. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: phy: xpcs: Add XLGMII supportJose Abreu1-0/+98
Add XLGMII support for XPCS. This does not include Autoneg feature. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'ionic-bits-and-bytes'David S. Miller4-7/+22
Shannon Nelson says: ==================== ionic bits and bytes These are a few little updates to the ionic driver while we are in between other feature work. While these are mostly Fixes, they are almost all low priority and needn't be promoted to net. The one higher need is patch 1, but it is fixing something that hasn't made it out of net-next yet. v3: allow decode of unknown transciever and use type codes from sfp.h v2: add Fixes tags to patches 1-4, and a little description for patch 5 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17ionic: add decode for IONIC_RC_ENOSUPPShannon Nelson1-0/+3
Add decoding for a new firmware error code. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17ionic: print data for unknown xcvr typeShannon Nelson1-4/+9
If we don't recognize the transceiver type, set the xcvr type and data length such that ethtool can at least print the first 256 bytes and the reader can figure out why the transceiver is not recognized. While we're here, we can update the phy_id type values to use the enum values in sfp.h. Fixes: 4d03e00a2140 ("ionic: Add initial ethtool support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17ionic: remove adminq napi instanceShannon Nelson1-0/+1
Remove the adminq's napi struct when tearing down the adminq. Fixes: 1d062b7b6f64 ("ionic: Add basic adminq support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17ionic: deinit rss only if selectedShannon Nelson1-1/+2
Don't bother de-initing RSS if it wasn't selected. Fixes: aa3198819bea ("ionic: Add RSS support") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17ionic: stop devlink warn on mgmt deviceShannon Nelson1-2/+7
If we don't set a port type, the devlink code will eventually print a WARN in the kernel log. Because the mgmt device is not really a useful port, don't register it as a devlink port. Fixes: b3f064e9746d ("ionic: add support for device id 0x1004") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'net_sched-allow-use-of-hrtimer-slack'David S. Miller4-12/+42
Eric Dumazet says: ==================== net_sched: allow use of hrtimer slack Packet schedulers have used hrtimers with exact expiry times. Some of them can afford having a slack, in order to reduce the number of timer interrupts and feed bigger batches to increase efficiency. FQ for example does not care if throttled packets are sent with an additional (small) delay. Original observation of having maybe too many interrupts was made by Willem de Bruijn. v2: added strict netlink checking (Jakub Kicinski) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net_sched: sch_fq: enable use of hrtimer slackEric Dumazet2-4/+19
Add a new attribute to control the fq qdisc hrtimer slack. Default is set to 10 usec. When/if packets are throttled, fq set up an hrtimer that can lead to one interrupt per packet in the throttled queue. By using a timer slack, we allow better use of timer interrupts, by giving them a chance to call multiple timer callbacks at each hardware interrupt. Also, giving a slack allows FQ to dequeue batches of packets instead of a single one, thus increasing xmit_more efficiency. This has no negative effect on the rate a TCP flow can sustain, since each TCP flow maintains its own precise vtime (tp->tcp_wstamp_ns) v2: added strict netlink checking (as feedback from Jakub Kicinski) Tested: 1000 concurrent flows all using paced packets. 1,000,000 packets sent per second. Before the patch : $ vmstat 2 10 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 0 60726784 23628 3485992 0 0 138 1 977 535 0 12 87 0 0 0 0 0 60714700 23628 3485628 0 0 0 0 1568827 26462 0 22 78 0 0 1 0 0 60716012 23628 3485656 0 0 0 0 1570034 26216 0 22 78 0 0 0 0 0 60722420 23628 3485492 0 0 0 0 1567230 26424 0 22 78 0 0 0 0 0 60727484 23628 3485556 0 0 0 0 1568220 26200 0 22 78 0 0 2 0 0 60718900 23628 3485380 0 0 0 40 1564721 26630 0 22 78 0 0 2 0 0 60718096 23628 3485332 0 0 0 0 1562593 26432 0 22 78 0 0 0 0 0 60719608 23628 3485064 0 0 0 0 1563806 26238 0 22 78 0 0 1 0 0 60722876 23628 3485236 0 0 0 130 1565874 26566 0 22 78 0 0 1 0 0 60722752 23628 3484908 0 0 0 0 1567646 26247 0 22 78 0 0 After the patch, slack of 10 usec, we can see a reduction of interrupts per second, and a small decrease of reported cpu usage. $ vmstat 2 10 procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 0 60722564 23628 3484728 0 0 133 1 696 545 0 13 87 0 0 1 0 0 60722568 23628 3484824 0 0 0 0 977278 25469 0 20 80 0 0 0 0 0 60716396 23628 3484764 0 0 0 0 979997 25326 0 20 80 0 0 0 0 0 60713844 23628 3484960 0 0 0 0 981394 25249 0 20 80 0 0 2 0 0 60720468 23628 3484916 0 0 0 0 982860 25062 0 20 80 0 0 1 0 0 60721236 23628 3484856 0 0 0 0 982867 25100 0 20 80 0 0 1 0 0 60722400 23628 3484456 0 0 0 8 982698 25303 0 20 80 0 0 0 0 0 60715396 23628 3484428 0 0 0 0 981777 25176 0 20 80 0 0 0 0 0 60716520 23628 3486544 0 0 0 36 978965 27857 0 21 79 0 0 0 0 0 60719592 23628 3486516 0 0 0 22 977318 25106 0 20 80 0 0 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net_sched: do not reprogram a timer about to expireEric Dumazet1-2/+7
qdisc_watchdog_schedule_range_ns() can use the newly added slack and avoid rearming the hrtimer a bit earlier than the current value. This patch has no effect if delta_ns parameter is zero. Note that this means the max slack is potentially doubled. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net_sched: add qdisc_watchdog_schedule_range_ns()Eric Dumazet2-6/+16
Some packet schedulers might want to add a slack when programming hrtimers. This can reduce number of interrupts and increase batch sizes and thus give good xmit_more savings. This commit adds qdisc_watchdog_schedule_range_ns() helper, with an extra delta_ns parameter. Legacy qdisc_watchdog_schedule_n() becomes an inline passing a zero slack. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'nfp-type'David S. Miller13-52/+49
Jakub Kicinski says: ==================== net: rename flow_action stats and set NFP type Jiri, I hope this is okay with you, I just dropped the "type" from the helper and value names, and now things should be able to fit on a line, within 80 characters. Second patch makes the NFP able to offload DELAYED stats, which is the type it supports. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17nfp: allow explicitly selected delayed statsJakub Kicinski1-1/+2
NFP flower offload uses delayed stats. Kernel recently gained the ability to specify stats types. Make nfp accept DELAYED stats, not just the catch all "any". Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: rename flow_action_hw_stats_types* -> flow_action_hw_stats*Jakub Kicinski13-52/+48
flow_action_hw_stats_types_check() helper takes one of the FLOW_ACTION_HW_STATS_*_BIT values as input. If we align the arguments to the opening bracket of the helper there is no way to call this helper and stay under 80 characters. Remove the "types" part from the new flow_action helpers and enum values. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'net-phy-improve-phy_driver-callback-handle_interrupt'David S. Miller3-19/+28
Heiner Kallweit says: ==================== net: phy: improve phy_driver callback handle_interrupt did_interrupt() clears the interrupt, therefore handle_interrupt() can not check which event triggered the interrupt. To overcome this constraint and allow more flexibility for customer interrupt handlers, let's decouple handle_interrupt() from parts of the phylib interrupt handling. Custom interrupt handlers now have to implement the did_interrupt() functionality in handle_interrupt() if needed. Fortunately we have just one custom interrupt handler so far (in the mscc PHY driver), convert it to the changed API and make use of the benefits. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: phy: mscc: consider interrupt source in interrupt handlerHeiner Kallweit1-2/+5
Trigger the respective interrupt handler functionality only if the related interrupt source bit is set. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: phy: improve phy_driver callback handle_interruptHeiner Kallweit3-17/+23
did_interrupt() clears the interrupt, therefore handle_interrupt() can not check which event triggered the interrupt. To overcome this constraint and allow more flexibility for customer interrupt handlers, let's decouple handle_interrupt() from parts of the phylib interrupt handling. Custom interrupt handlers now have to implement the did_interrupt() functionality in handle_interrupt() if needed. Fortunately we have just one custom interrupt handler so far (in the mscc PHY driver), convert it to the changed API. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17Merge branch 'ethtool-consolidate-irq-coalescing-last-part'David S. Miller14-64/+35
Jakub Kicinski says: ==================== ethtool: consolidate irq coalescing - last part Convert remaining drivers following the groundwork laid in a recent patch set [1] and continued in [2], [3], [4], [5]. The aim of the effort is to consolidate irq coalescing parameter validation in the core. This set is the sixth and last installment. It converts the remaining 8 drivers in drivers/net/ethernet. The last patch makes declaring supported IRQ coalescing parameters a requirement. [1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/ [2] https://lore.kernel.org/netdev/20200306010602.1620354-1-kuba@kernel.org/ [3] https://lore.kernel.org/netdev/20200310021512.1861626-1-kuba@kernel.org/ [4] https://lore.kernel.org/netdev/20200311223302.2171564-1-kuba@kernel.org/ [5] https://lore.kernel.org/netdev/20200313040803.2367590-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: ethtool: require drivers to set supported_coalesce_paramsJakub Kicinski4-3/+17
Now that all in-tree drivers have been updated we can make the supported_coalesce_params mandatory. To save debugging time in case some driver was missed (or is out of tree) add a warning when netdev is registered with set_coalesce but without supported_coalesce_params. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: axienet: let core reject the unsupported coalescing parametersJakub Kicinski1-21/+1
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver already correctly rejected all unsupported parameters. No functional changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: ll_temac: let core reject the unsupported coalescing parametersJakub Kicinski1-19/+2
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver already correctly rejected all unsupported parameters. No functional changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: davinci_emac: reject unsupported coalescing paramsJakub Kicinski1-0/+1
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: cpsw: reject unsupported coalescing paramsJakub Kicinski2-0/+2
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: tehuti: reject unsupported coalescing paramsJakub Kicinski1-0/+2
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: dwc-xlgmac: let core reject the unsupported coalescing parametersJakub Kicinski1-15/+2
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver already correctly rejected all unsupported parameters. While at it remove unnecessary zeroing on get. No functional changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: socionext: reject unsupported coalescing paramsJakub Kicinski1-0/+2
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net: sfc: reject unsupported coalescing paramsJakub Kicinski2-6/+6
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. The check for use_adaptive_tx_coalesce will now be done by the core. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-17net/mlx5: Avoid forwarding to other eswitch uplinkEli Cohen1-0/+3
Do not allow forwarding of encapsulated traffic received from one eswtich's uplink to another eswtich's uplink. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: Eswitch, enable forwarding back to uplink portEli Cohen2-17/+45
Add dependencny on cap termination_table_raw_traffic to allow non encapsulated packets received from uplink to be forwarded back to the received uplink port. Refactor the conditions into a separate function. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: Add support for offloading traffic from uplink to uplinkEli Cohen1-27/+70
Termination tables change the direction of a packet in hw from RX to SX pipeline. Use that to offload hairpin flows received from uplink and sent back to uplink. Currently termination tables are used for pushing VLAN to packets received from uplink and targeting a VF. Extend the implementation to allow forwarding packets to uplink. These packets can either be encapsulated or not. In case encapsulation is needed before forwarding, move the reformat object to the termination table as required. Extend the hash table key to include tunnel information for the sake of reusing reformat objects. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: Don't use termination tables in slow pathEli Cohen3-7/+19
Don't use termination tables for packets that are steered to the slow path, as a pre-step for supporting packet encap (packet reformat) action on termination tables. Packet encap (reformat action) actions steer the packet to the slow path until outer arp entries are resolved. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: Avoid configuring eswitch QoS if not supportedEli Cohen2-0/+11
Check if QoS is enabled for the eswitch before attempting to configure QoS parameters and emit a netlink error if not supported. Introduce an API to check if QoS is supported for the eswitch. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: Fix devlink port register sequenceVladyslav Tarasiuk3-24/+21
If udevd is configured to rename interfaces according to persistent naming rules and if a network interface has phys_port_name in sysfs, its contents will be appended to the interface name. However, register_netdev creates device in sysfs and if devlink_port_register is called after that, there is a timeframe in which udevd may read an empty phys_port_name value. The consequence is that the interface will lose this suffix and its name will not be really persistent. The solution is to register the port before registering a netdev. Fixes: c6acd629eec7 ("net/mlx5e: Add support for devlink-port in non-representors mode") Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: Fix rejecting all egress rules not on vlanRoi Dayan1-14/+1
The original condition rejected all egress rules that are not on tunnel device. Also, the whole point of this egress reject was to disallow bad rules because of egdev which doesn't exists today, so remove this check entirely. Fixes: 0a7fcb78cc21 ("net/mlx5e: Support inner header rewrite with goto action") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: en_tc: Rely just on register loopback for tunnel restorationPaul Blakey1-3/+3
Register loopback which is needed for tunnel restoration, is now always enabled if supported and not just with metadata enabled, check for that instead. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: CT: Fix stack usage compiler warningSaeed Mahameed1-9/+22
Fix the following warnings: [-Werror=frame-larger-than=] In function ‘mlx5_tc_ct_entry_add_rule’: drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c:541:1: error: the frame size of 1136 bytes is larger than 1024 bytes In function ‘__mlx5_tc_ct_flow_offload’: drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c:1049:1: error: the frame size of 1168 bytes is larger than 1024 bytes Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com>
2020-03-17net/mlx5e: CT: Fix insert rules when TC_CT config isn't enabledPaul Blakey1-0/+9
If CONFIG_MLX5_TC_CT isn't enabled, all offloading of eswitch tc rules fails on parsing ct match, even if there is no ct match. Return success if there is no ct match, regardless of config. Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5e: CT: remove set but not used variable 'unnew'YueHaibing1-2/+1
drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c: In function mlx5_tc_ct_parse_match: drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c:699:36: warning: variable unnew set but not used [-Wunused-but-set-variable] Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: E-Switch, Skip restore modify header between prios of same chainPaul Blakey1-1/+1
Restore modify header writes the chain mapping on the packet. This modify header and action is added on all prios connections, and gets overwritten with the same value consecutively in prios of the same chain. Use the chain's modify header only for the last prio of a given tc chain. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: E-Switch: Fix using fwd and modify when firmware doesn't support itPaul Blakey1-2/+13
Currently, if firmware doesn't support fwd and modify, driver fails initializing eswitch chains while entering switchdev mode. Instead, on such cases, disable the chains and prio feature (as we can't restore the chain on miss) and the usage of fwd and modify. Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-17net/mlx5: Add missing inline to stub esw_add_restore_ruleNathan Chancellor1-1/+1
When CONFIG_MLX5_ESWITCH is unset, clang warns: In file included from drivers/net/ethernet/mellanox/mlx5/core/main.c:58: drivers/net/ethernet/mellanox/mlx5/core/eswitch.h:670:1: warning: unused function 'esw_add_restore_rule' [-Wunused-function] esw_add_restore_rule(struct mlx5_eswitch *esw, u32 tag) ^ 1 warning generated. This stub function is missing inline; add it to suppress the warning. Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-03-18netfilter: Introduce egress hookLukas Wunner7-8/+83
Commit e687ad60af09 ("netfilter: add netfilter ingress hook after handle_ing() under unique static key") introduced the ability to classify packets on ingress. Allow the same on egress. Position the hook immediately before a packet is handed to tc and then sent out on an interface, thereby mirroring the ingress order. This order allows marking packets in the netfilter egress hook and subsequently using the mark in tc. Another benefit of this order is consistency with a lot of existing documentation which says that egress tc is performed after netfilter hooks. Egress hooks already exist for the most common protocols, such as NF_INET_LOCAL_OUT or NF_ARP_OUT, and those are to be preferred because they are executed earlier during packet processing. However for more exotic protocols, there is currently no provision to apply netfilter on egress. A common workaround is to enslave the interface to a bridge and use ebtables, or to resort to tc. But when the ingress hook was introduced, consensus was that users should be given the choice to use netfilter or tc, whichever tool suits their needs best: https://lore.kernel.org/netdev/20150430153317.GA3230@salvia/ This hook is also useful for NAT46/NAT64, tunneling and filtering of locally generated af_packet traffic such as dhclient. There have also been occasional user requests for a netfilter egress hook in the past, e.g.: https://www.spinics.net/lists/netfilter/msg50038.html Performance measurements with pktgen surprisingly show a speedup rather than a slowdown with this commit: * Without this commit: Result: OK: 34240933(c34238375+d2558) usec, 100000000 (60byte,0frags) 2920481pps 1401Mb/sec (1401830880bps) errors: 0 * With this commit: Result: OK: 33997299(c33994193+d3106) usec, 100000000 (60byte,0frags) 2941410pps 1411Mb/sec (1411876800bps) errors: 0 * Without this commit + tc egress: Result: OK: 39022386(c39019547+d2839) usec, 100000000 (60byte,0frags) 2562631pps 1230Mb/sec (1230062880bps) errors: 0 * With this commit + tc egress: Result: OK: 37604447(c37601877+d2570) usec, 100000000 (60byte,0frags) 2659259pps 1276Mb/sec (1276444320bps) errors: 0 * With this commit + nft egress: Result: OK: 41436689(c41434088+d2600) usec, 100000000 (60byte,0frags) 2413320pps 1158Mb/sec (1158393600bps) errors: 0 Tested on a bare-metal Core i7-3615QM, each measurement was performed three times to verify that the numbers are stable. Commands to perform a measurement: modprobe pktgen echo "add_device lo@3" > /proc/net/pktgen/kpktgend_3 samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i 'lo@3' -n 100000000 Commands for testing tc egress: tc qdisc add dev lo clsact tc filter add dev lo egress protocol ip prio 1 u32 match ip dst 4.3.2.1/32 Commands for testing nft egress: nft add table netdev t nft add chain netdev t co \{ type filter hook egress device lo priority 0 \; \} nft add rule netdev t co ip daddr 4.3.2.1/32 drop All testing was performed on the loopback interface to avoid distorting measurements by the packet handling in the low-level Ethernet driver. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-18netfilter: Generalize ingress hookLukas Wunner2-15/+32
Prepare for addition of a netfilter egress hook by generalizing the ingress hook introduced by commit e687ad60af09 ("netfilter: add netfilter ingress hook after handle_ing() under unique static key"). In particular, rename and refactor the ingress hook's static inlines such that they can be reused for an egress hook. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-03-18netfilter: Rename ingress hook include fileLukas Wunner2-1/+1
Prepare for addition of a netfilter egress hook by renaming <linux/netfilter_ingress.h> to <linux/netfilter_netdev.h>. The egress hook also necessitates a refactoring of the include file, but that is done in a separate commit to ease reviewing. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>