aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2019-08-09r8169: fix performance issue on RTL8168evlHolger Hoffstätte1-3/+3
Disabling TSO but leaving SG active results is a significant performance drop. Therefore disable also SG on RTL8168evl. This restores the original performance. Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO") Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09tcp: Update TCP_BASE_MSS commentJosh Hunt1-1/+1
TCP_BASE_MSS is used as the default initial MSS value when MTU probing is enabled. Update the comment to reflect this. Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09tcp: add new tcp_mtu_probe_floor sysctlJosh Hunt5-1/+18
The current implementation of TCP MTU probing can considerably underestimate the MTU on lossy connections allowing the MSS to get down to 48. We have found that in almost all of these cases on our networks these paths can handle much larger MTUs meaning the connections are being artificially limited. Even though TCP MTU probing can raise the MSS back up we have seen this not to be the case causing connections to be "stuck" with an MSS of 48 when heavy loss is present. Prior to pushing out this change we could not keep TCP MTU probing enabled b/c of the above reasons. Now with a reasonble floor set we've had it enabled for the past 6 months. The new sysctl will still default to TCP_MIN_SND_MSS (48), but gives administrators the ability to control the floor of MSS probing. Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09devlink: remove pointless data_len arg from region snapshot createJiri Pirko3-12/+6
The size of the snapshot has to be the same as the size of the region, therefore no need to pass it again during snapshot creation. Remove the arg and use region->size instead. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-09tcp: batch calls to sk_flush_backlog()Eric Dumazet1-5/+6
Starting from commit d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog") loopback flows got hurt, because for each skb sent, the socket receives an immediate ACK and sk_flush_backlog() causes extra work. Intent was to not let the backlog grow too much, but we went a bit too far. We can check the backlog every 16 skbs (about 1MB chunks) to increase TCP over loopback performance by about 15 % Note that the call to sk_flush_backlog() handles a single ACK, thanks to coalescing done on backlog, but cleans the 16 skbs found in rtx rb-tree. Reported-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08liquidio: Use pcie_flr() instead of reimplementing itDenis Efremov1-3/+1
octeon_mbox_process_cmd() directly writes the PCI_EXP_DEVCTL_BCR_FLR bit, which bypasses timing requirements imposed by the PCIe spec. This patch fixes the function to use the pcie_flr() interface instead. Signed-off-by: Denis Efremov <efremov@linux.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08r8169: allocate rx buffers using alloc_pages_nodeHeiner Kallweit1-26/+19
We allocate 16kb per rx buffer, so we can avoid some overhead by using alloc_pages_node directly instead of bothering kmalloc_node. Due to this change buffers are page-aligned now, therefore the alignment check can be removed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08fq_codel: remove set but not used variables 'prev_ecn_mark' and 'prev_drop_count'YueHaibing1-4/+0
Fixes gcc '-Wunused-but-set-variable' warning: net/sched/sch_fq_codel.c: In function fq_codel_dequeue: net/sched/sch_fq_codel.c:288:23: warning: variable prev_ecn_mark set but not used [-Wunused-but-set-variable] net/sched/sch_fq_codel.c:288:6: warning: variable prev_drop_count set but not used [-Wunused-but-set-variable] They are not used since commit 77ddaff218fc ("fq_codel: Kill useless per-flow dropped statistic") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08mlxsw: spectrum: Extend to support Spectrum-3 ASICJiri Pirko3-3/+59
Extend existing driver for Spectrum and Spectrum-2 ASICs to support Spectrum-3 ASIC as well. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08Merge branch 'stmmac-next'David S. Miller14-9/+1474
Jose Abreu says: ==================== net: stmmac: Improvements for -next [ This is just a rebase of v2 into latest -next in order to avoid a merge conflict ] Couple of improvements for -next tree. More info in commit logs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: selftests: Add a selftest for Flexible RX ParserJose Abreu1-1/+97
Add a selftest for the Flexible RX Parser feature. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: Add Flexible RX Parser support in XGMACJose Abreu3-0/+206
XGMAC cores also support the Flexible RX Parser feature. Add the support for it in the XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: Implement Safety Features in XGMAC coreJose Abreu3-0/+311
XGMAC also supports Safety Features. This patch implements the configuration and handling of this feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: selftests: Add test for VLAN and Double VLAN FilteringJose Abreu1-0/+205
Add a selftest for VLAN and Double VLAN Filtering in stmmac. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: Implement VLAN Hash Filtering in XGMACJose Abreu7-0/+139
Implement the VLAN Hash Filtering feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: selftests: Add RSS testJose Abreu1-0/+19
Add a test for RSS in the stmmac selftests. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: Implement RSS and enable it in XGMAC coreJose Abreu10-4/+242
Implement the RSS functionality and add the corresponding callbacks in XGMAC core. Changes from v1: - Do not use magic constants (Jakub) - Use ethtool_rxfh_indir_default() (Jakub) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: xgmac: Implement tx_queue_prio()Jose Abreu2-1/+22
Implement the TX Queue Priority callback in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: xgmac: Implement set_mtl_tx_queue_weight()Jose Abreu1-1/+21
Implement the TX Queue Weight callback. In order for this to be active we also need to set ETS algorithm when configuring Queue. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: stmmac: xgmac: Implement MMC countersJose Abreu7-2/+212
Implement the MMC counters feature in XGMAC core. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08tipc: add loopback device trackingJohn Rutherford7-1/+88
Since node internal messages are passed directly to the socket, it is not possible to observe those messages via tcpdump or wireshark. We now remedy this by making it possible to clone such messages and send the clones to the loopback interface. The clones are dropped at reception and have no functional role except making the traffic visible. The feature is enabled if network taps are active for the loopback device. pcap filtering restrictions require the messages to be presented to the receiving side of the loopback device. v3 - Function dev_nit_active used to check for network taps. - Procedure netif_rx_ni used to send cloned messages to loopback device. Signed-off-by: John Rutherford <john.rutherford@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08Merge branch 'flow_offload-add-indr-block-in-nf_table_offload'David S. Miller10-285/+464
wenxu says: ==================== flow_offload: add indr-block in nf_table_offload This series patch make nftables offload support the vlan and tunnel device offload through indr-block architecture. The first four patches mv tc indr block to flow offload and rename to flow-indr-block. Because the new flow-indr-block can't get the tcf_block directly. The fifth patch provide a callback list to get flow_block of each subsystem immediately when the device register and contain a block. The last patch make nf_tables_offload support flow-indr-block. This version add a mutex lock for add/del flow_indr_block_ing_cb ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08netfilter: nf_tables_offload: support indr block callwenxu3-24/+135
nftable support indr-block call. It makes nftable an offload vlan and tunnel device. nft add table netdev firewall nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; } nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0 nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; } nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0 Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08flow_offload: support get multi-subsystem blockwenxu3-15/+55
It provide a callback list to find the blocks of tc and nft subsystems Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08flow_offload: move tc indirect block to flow offloadwenxu7-263/+280
move tc indirect block to flow_offload and rename it to flow indirect block.The nf_tables can use the indr block architecture. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08cls_api: add flow_indr_block_call functionwenxu1-10/+17
This patch make indr_block_call don't access struct tc_indr_block_cb and tc_indr_block_dev directly Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08cls_api: remove the tcf_block cachewenxu1-8/+8
Remove the tcf_block in the tc_indr_block_dev for muti-subsystem support. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08cls_api: modify the tc_indr_block_ing_cmd parameters.wenxu1-11/+15
This patch make tc_indr_block_ing_cmd can't access struct tc_indr_block_dev and tc_indr_block_cb. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08Merge branch 'net-batched-receive-in-GRO-path'David S. Miller5-11/+54
Edward Cree says: ==================== net: batched receive in GRO path This series listifies part of GRO processing, in a manner which allows those packets which are not GROed (i.e. for which dev_gro_receive returns GRO_NORMAL) to be passed on to the listified regular receive path. dev_gro_receive() itself is not listified, nor the per-protocol GRO callback, since GRO's need to hold packets on lists under napi->gro_hash makes keeping the packets on other lists awkward, and since the GRO control block state of held skbs can refer only to one 'new' skb at a time. Instead, when napi_frags_finish() handles a GRO_NORMAL result, stash the skb onto a list in the napi struct, which is received at the end of the napi poll or when its length exceeds the (new) sysctl net.core.gro_normal_batch. Performance figures with this series, collected on a back-to-back pair of Solarflare sfn8522-r2 NICs with 120-second NetPerf tests. In the stats, sample size n for old and new code is 6 runs each; p is from a Welch t-test. Tests were run both with GRO enabled and disabled, the latter simulating uncoalesceable packets (e.g. due to IP or TCP options). The receive side (which was the device under test) had the NetPerf process pinned to one CPU, and the device interrupts pinned to a second CPU. CPU utilisation figures (used in cases of line-rate performance) are summed across all CPUs. net.core.gro_normal_batch was left at its default value of 8. TCP 4 streams, GRO on: all results line rate (9.415Gbps) net-next: 210.3% cpu after #1: 181.5% cpu (-13.7%, p=0.031 vs net-next) after #3: 196.7% cpu (- 8.4%, p=0.136 vs net-next) TCP 4 streams, GRO off: net-next: 8.017 Gbps after #1: 7.785 Gbps (- 2.9%, p=0.385 vs net-next) after #3: 7.604 Gbps (- 5.1%, p=0.282 vs net-next. But note *) TCP 1 stream, GRO off: net-next: 6.553 Gbps after #1: 6.444 Gbps (- 1.7%, p=0.302 vs net-next) after #3: 6.790 Gbps (+ 3.6%, p=0.169 vs net-next) TCP 1 stream, GRO on, busy_read = 50: all results line rate net-next: 156.0% cpu after #1: 174.5% cpu (+11.9%, p=0.015 vs net-next) after #3: 165.0% cpu (+ 5.8%, p=0.147 vs net-next) TCP 1 stream, GRO off, busy_read = 50: net-next: 6.488 Gbps after #1: 6.625 Gbps (+ 2.1%, p=0.059 vs net-next) after #3: 7.351 Gbps (+13.3%, p=0.026 vs net-next) TCP_RR 100 streams, GRO off, 8000 byte payload net-next: 995.083 us after #1: 969.167 us (- 2.6%, p=0.204 vs net-next) after #3: 976.433 us (- 1.9%, p=0.254 vs net-next) TCP_RR 100 streams, GRO off, 8000 byte payload, busy_read = 50: net-next: 2.851 ms after #1: 2.871 ms (+ 0.7%, p=0.134 vs net-next) after #3: 2.937 ms (+ 3.0%, p<0.001 vs net-next) TCP_RR 100 streams, GRO off, 1 byte payload, busy_read = 50: net-next: 867.317 us after #1: 865.717 us (- 0.2%, p=0.334 vs net-next) after #3: 868.517 us (+ 0.1%, p=0.414 vs net-next) (*) These tests produced a mixture of line-rate and below-line-rate results, meaning that statistically speaking the results were 'censored' by the upper bound, and were thus not normally distributed, making a Welch t-test mathematically invalid. I therefore also calculated estimators according to [1], which gave the following: net-next: 8.133 Gbps after #1: 8.130 Gbps (- 0.0%, p=0.499 vs net-next) after #3: 7.680 Gbps (- 5.6%, p=0.285 vs net-next) (though my procedure for determining ν wasn't mathematically well-founded either, so take that p-value with a grain of salt). A further check came from dividing the bandwidth figure by the CPU usage for each test run, giving: net-next: 3.461 after #1: 3.198 (- 7.6%, p=0.145 vs net-next) after #3: 3.641 (+ 5.2%, p=0.280 vs net-next) The above results are fairly mixed, and in most cases not statistically significant. But I think we can roughly conclude that the series marginally improves non-GROable throughput, without hurting latency (except in the large-payload busy-polling case, which in any case yields horrid performance even on net-next (almost triple the latency without busy-poll). Also, drivers which, unlike sfc, pass UDP traffic to GRO would expect to see a benefit from gaining access to batching. Changed in v3: * gro_normal_batch sysctl now uses SYSCTL_ONE instead of &one * removed RFC tags (no comments after a week means no-one objects, right?) Changed in v2: * During busy poll, call gro_normal_list() to receive batched packets after each cycle of the napi busy loop. See comments in Patch #3 for complications of doing the same in busy_poll_stop(). [1]: Cohen 1959, doi: 10.1080/00401706.1959.10489859 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: use listified RX for handling GRO_NORMAL skbsEdward Cree3-3/+52
When GRO decides not to coalesce a packet, in napi_frags_finish(), instead of passing it to the stack immediately, place it on a list in the napi struct. Then, at flush time (napi_complete_done(), napi_poll(), or napi_busy_loop()), call netif_receive_skb_list_internal() on the list. We'd like to do that in napi_gro_flush(), but it's not called if !napi->gro_bitmask, so we have to do it in the callers instead. (There are a handful of drivers that call napi_gro_flush() themselves, but it's not clear why, or whether this will affect them.) Because a full 64 packets is an inefficiently large batch, also consume the list whenever it exceeds gro_normal_batch, a new net/core sysctl that defaults to 8. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08sfc: falcon: don't score irq moderation points for GROEdward Cree1-4/+1
Same rationale as for sfc, except that this wasn't performance-tested. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08sfc: don't score irq moderation points for GROEdward Cree1-4/+1
We already scored points when handling the RX event, no-one else does this, and looking at the history it appears this was originally meant to only score on merges, not on GRO_NORMAL. Moreover, it gets in the way of changing GRO to not immediately pass GRO_NORMAL skbs to the stack. Performance testing with four TCP streams received on a single CPU (where throughput was line rate of 9.4Gbps in all tests) showed a 13.7% reduction in RX CPU usage (n=6, p=0.03). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08qed: Add new ethtool supported port types based on media.Rahul Verma3-3/+8
Supported ports in ethtool <eth1> are displayed based on media type. For media type fibre and twinaxial, port type is "FIBRE". Media type Base-T is "TP" and media KR is "Backplane". V1->V2: Corrected the subject. Signed-off-by: Rahul Verma <rahulv@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08cxgb4: smt: Use normal int for refcountChuhong Yuan2-8/+8
All refcount operations are protected by spinlocks now. Then the atomic counter can be replaced by a normal int. This patch depends on PATCH 1/2. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08cxgb4: smt: Add lock for atomic_dec_and_testChuhong Yuan1-2/+2
The atomic_dec_and_test() is not safe because it is outside of locks. Move the locks of t4_smte_free() to its caller, cxgb4_smt_release() to protect the atomic decrement. Fixes: 3bdb376e6944 ("cxgb4: introduce SMT ops to prepare for SMAC rewrite support") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08selftests: Add l2tp testsDavid Ahern2-1/+383
Add IPv4 and IPv6 l2tp tests. Current set is over IP and with IPsec. v2 - add l2tp.sh to TEST_PROGS in Makefile Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08net: delete "register" keywordAlexey Dobriyan4-21/+21
Delete long obsoleted "register" keyword. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08mkiss: Use refcount_t for refcountChuhong Yuan1-5/+6
refcount_t is better for reference counters since its implementation can prevent overflows. So convert atomic_t ref counters to refcount_t. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08dpaa_eth: Use refcount_t for refcountChuhong Yuan2-4/+5
refcount_t is better for reference counters since its implementation can prevent overflows. So convert atomic_t ref counters to refcount_t. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-08Merge tag 'batadv-next-for-davem-20190808' of git://git.open-mesh.org/linux-mergeDavid S. Miller7-8/+205
Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich - Replace usage of strlcpy with strscpy, by Sven Eckelmann - Add OGMv2 per-interface queue and aggregations, by Linus Luessing (2 patches) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1012-7699/+9587
Just minor overlapping changes in the conflicts here. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds225-1274/+2402
Pull networking fixes from David Miller: "Yeah I should have sent a pull request last week, so there is a lot more here than usual: 1) Fix memory leak in ebtables compat code, from Wenwen Wang. 2) Several kTLS bug fixes from Jakub Kicinski (circular close on disconnect etc.) 3) Force slave speed check on link state recovery in bonding 802.3ad mode, from Thomas Falcon. 4) Clear RX descriptor bits before assigning buffers to them in stmmac, from Jose Abreu. 5) Several missing of_node_put() calls, mostly wrt. for_each_*() OF loops, from Nishka Dasgupta. 6) Double kfree_skb() in peak_usb can driver, from Stephane Grosjean. 7) Need to hold sock across skb->destructor invocation, from Cong Wang. 8) IP header length needs to be validated in ipip tunnel xmit, from Haishuang Yan. 9) Use after free in ip6 tunnel driver, also from Haishuang Yan. 10) Do not use MSI interrupts on r8169 chips before RTL8168d, from Heiner Kallweit. 11) Upon bridge device init failure, we need to delete the local fdb. From Nikolay Aleksandrov. 12) Handle erros from of_get_mac_address() properly in stmmac, from Martin Blumenstingl. 13) Handle concurrent rename vs. dump in netfilter ipset, from Jozsef Kadlecsik. 14) Setting NETIF_F_LLTX on mac80211 causes complete breakage with some devices, so revert. From Johannes Berg. 15) Fix deadlock in rxrpc, from David Howells. 16) Fix Kconfig deps of enetc driver, we must have PHYLIB. From Yue Haibing. 17) Fix mvpp2 crash on module removal, from Matteo Croce. 18) Fix race in genphy_update_link, from Heiner Kallweit. 19) bpf_xdp_adjust_head() stopped working with generic XDP when we fixes generic XDP to support stacked devices properly, fix from Jesper Dangaard Brouer. 20) Unbalanced RCU locking in rt6_update_exception_stamp_rt(), from David Ahern. 21) Several memory leaks in new sja1105 driver, from Vladimir Oltean" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (214 commits) net: dsa: sja1105: Fix memory leak on meta state machine error path net: dsa: sja1105: Fix memory leak on meta state machine normal path net: dsa: sja1105: Really fix panic on unregistering PTP clock net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well net: dsa: sja1105: Fix broken learning with vlan_filtering disabled net: dsa: qca8k: Add of_node_put() in qca8k_setup_mdio_bus() net: sched: sample: allow accessing psample_group with rtnl net: sched: police: allow accessing police->params with rtnl net: hisilicon: Fix dma_map_single failed on arm64 net: hisilicon: fix hip04-xmit never return TX_BUSY net: hisilicon: make hip04_tx_reclaim non-reentrant tc-testing: updated vlan action tests with batch create/delete net sched: update vlan action for batched events operations net: stmmac: tc: Do not return a fragment entry net: stmmac: Fix issues when number of Queues >= 4 net: stmmac: xgmac: Fix XGMAC selftests be2net: disable bh with spin_lock in be_process_mcc net: cxgb3_main: Fix a resource leak in a error path in 'init_one()' net: ethernet: sun4i-emac: Support phy-handle property for finding PHYs net: bridge: move default pvid init/deinit to NETDEV_REGISTER/UNREGISTER ...
2019-08-06Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller5-51/+90
Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2019-08-05 This series contains updates to i40e driver only. Dmitrii adds missing statistic counters for VEB and VEB TC's. Slawomir adds support for logging the "Disable Firmware LLDP" flag option and its current status. Jake fixes an issue where VF's being notified of their link status before their queues are enabled which was causing issues. So always report link status down when the VF queues are not enabled. Also adds future proofing when statistics are added or removed by adding checks to ensure the data pointer for the strings lines up with the expected statistics count. Czeslaw fixes the advertised mode reported in ethtool for FEC, where the "None BaseR RS" was always being displayed no matter what the mode it was in. Also added logging information when the PF is entering or leaving "allmulti" (or promiscuous) mode. Fixed up the logging logic for VF's when leaving multicast mode to not include unicast as well. v2: drop Aleksandr's patch (previously patch #2 in the series) to display the VF MAC address that is set by the VF while community feedback is addressed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06openvswitch: Print error when ovs_execute_actions() failsYifeng Sun1-2/+5
Currently in function ovs_dp_process_packet(), return values of ovs_execute_actions() are silently discarded. This patch prints out an debug message when error happens so as to provide helpful hints for debugging. Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06Merge branch 'sja1105-fixes'David S. Miller4-98/+75
Vladimir Oltean says: ==================== Fixes for SJA1105 DSA: FDBs, Learning and PTP This is an assortment of functional fixes for the sja1105 switch driver targeted for the "net" tree (although they apply on net-next just as well). Patch 1/5 ("net: dsa: sja1105: Fix broken learning with vlan_filtering disabled") repairs a breakage introduced in the early development stages of the driver: support for traffic from the CPU has broken "normal" frame forwarding (based on DMAC) - there is connectivity through the switch only because all frames are flooded. I debated whether this patch qualifies as a fix, since it puts the switch into a mode it has never operated in before (aka SVL). But "normal" forwarding did use to work before the "Traffic support for SJA1105 DSA driver" patchset, and arguably this patch should have been part of that. Also, it would be strange for this feature to be broken in the 5.2 LTS. Patch 2/5 ("net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as well") is a simplification of a previous FDB-related patch that is currently in the 5.3 rc's. Patches 3/5 - 5/5 fix various crashes found while running linuxptp over the switch ports for extended periods of time, or in conjunction with other error conditions. The fixed-up commits were all introduced in 5.2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Fix memory leak on meta state machine error pathVladimir Oltean1-0/+1
When RX timestamping is enabled and two link-local (non-meta) frames are received in a row, this constitutes an error. The tagger is always caching the last link-local frame, in an attempt to merge it with the meta follow-up frame when that arrives. To recover from the above error condition, the initial cached link-local frame is dropped and the second frame in a row is cached (in expectance of the second meta frame). However, when dropping the initial link-local frame, its backing memory was being leaked. Fixes: f3097be21bf1 ("net: dsa: sja1105: Add a state machine for RX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Fix memory leak on meta state machine normal pathVladimir Oltean1-10/+1
After a meta frame is received, it is associated with the cached sp->data->stampable_skb from the DSA tagger private structure. Cached means its refcount is incremented with skb_get() in order for dsa_switch_rcv() to not free it when the tagger .rcv returns NULL. The mistake is that skb_unref() is not the correct function to use. It will correctly decrement the refcount (which will go back to zero) but the skb memory will not be freed. That is the job of kfree_skb(), which also calls skb_unref(). But it turns out that freeing the cached stampable_skb is in fact not necessary. It is still a perfectly valid skb, and now it is even annotated with the partial RX timestamp. So remove the skb_copy() altogether and simply pass the stampable_skb with a refcount of 1 (incremented by us, decremented by dsa_switch_rcv) up the stack. Fixes: f3097be21bf1 ("net: dsa: sja1105: Add a state machine for RX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Really fix panic on unregistering PTP clockVladimir Oltean2-6/+5
The IS_ERR_OR_NULL(priv->clock) check inside sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync from actually being run. Additionally, sja1105_ptp_clock_unregister() does not actually get run, when placed in sja1105_remove(). The DSA switch gets torn down, but the sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister needs to be moved to sja1105_teardown, to be symmetrical with sja1105_ptp_clock_register which is called from the DSA sja1105_setup. It is strange to fix a "fixes" patch, but the probe failure can only be seen when the attached PHY does not respond to MDIO (issue which I can't pinpoint the reason to) and it goes away after I power-cycle the board. This time the patch was validated on a failing board, and the kernel panic from the fixed commit's message can no longer be seen. Fixes: 29dd908d355f ("net: dsa: sja1105: Cancel PTP delayed work on unregister") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Use the LOCKEDS bit for SJA1105 E/T as wellVladimir Oltean2-16/+13
It looks like the FDB dump taken from first-generation switches also contains information on whether entries are static or not. So use that instead of searching through the driver's tables. Fixes: d763778224ea ("net: dsa: sja1105: Implement is_static for FDB entries on E/T") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Fix broken learning with vlan_filtering disabledVladimir Oltean1-66/+55
When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>