aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2020-07-13dccp: Replace HTTP links with HTTPS onesAlexander A. Klimov6-7/+7
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13net: fddi: skfp: Remove addr_to_string().Tetsuo Handa3-44/+27
kbuild test robot found that addr_to_string() is available only when DEBUG is defined. And I found that what that function is doing is what %pM will do. Thus, replace %s with %pM and remove thread-unsafe addr_to_string() function. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13net: bridge: fix undefined br_vlan_can_enter_range in tunnel codeNikolay Aleksandrov1-0/+6
If bridge vlan filtering is not defined we won't have br_vlan_can_enter_range and thus will get a compile error as was reported by Stephen and the build bot. So let's define a stub for when vlan filtering is not used. Fixes: 94339443686b ("net: bridge: notify on vlan tunnel changes done via the old api") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12net: sky2: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-42/+45
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GPF_ with a correct flag. It has been compile tested. When memory is allocated in 'sky2_alloc_buffers()', GFP_KERNEL can be used because some other memory allocations in the same function already use this flag. When memory is allocated in 'sky2_probe()', GFP_KERNEL can be used because another memory allocations in the same function already uses this flag. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12net: skge: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-40/+36
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GPF_ with a correct flag. It has been compile tested. When memory is allocated in 'skge_up()', GFP_KERNEL can be used because some other memory allocations done a few lines below in 'skge_ring_alloc()' already use this flag. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12Merge branch 'Fix-MTU-warnings-for-fec-mv886xxx-combo'David S. Miller2-0/+29
Andrew Lunn says: ==================== Fix MTU warnings for fec/mv886xxx combo Since changing the MTU of dsa slave interfaces was implemented, the fec/mv88e6xxx combo has been giving warnings: [ 2.275925] mv88e6085 0.2:00: nonfatal error -95 setting MTU on port 9 [ 2.284306] eth1: mtu greater than device maximum [ 2.287759] fec 400d1000.ethernet eth1: error -22 setting MTU to include DSA overhead This patchset adds support for changing the MTU on mv88e6xxx switches, which do support jumbo frames. And it modifies the FEC driver to support its true MTU range, which is larger than the default Ethernet MTU. ==================== Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12net: fec: Set max MTU size to allow the MTU to be changedAndrew Lunn1-0/+2
The FEC allocates 2K buffers, but looses some of it due to alignment. It can however support an MTU bigger than the default. This is particularly interesting when used in combination with Ethernet switches supporting DSA, which have extra headers. The DSA core will try to increase the MTU to support these extra headers. If the max size defaults to that of standard Ethernet we get a warning. By setting the max to what the driver actually supports, we avoid this warning. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12net: dsa: mv88e6xxx: Implement MTU changeAndrew Lunn1-0/+27
The Marvell Switches support jumbo packages. So implement the callbacks needed for changing the MTU. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-12net: bridge: notify on vlan tunnel changes done via the old apiNikolay Aleksandrov1-2/+47
If someone uses the old vlan API to configure tunnel mappings we'll only generate the old-style full port notification. That would be a problem if we are monitoring the new vlan notifications for changes. The patch resolves the issue by adding vlan notifications to the old tunnel netlink code. As usual we try to compress the notifications for as many vlans in a range as possible, thus a vlan tunnel change is considered able to enter the "current" vlan notification range if: 1. vlan exists 2. it has actually changed (curr_change == true) 3. it passes all standard vlan notification range checks done by br_vlan_can_enter_range() such as option equality, id continuity etc Note that vlan tunnel changes (add/del) are considered a part of vlan options so only RTM_NEWVLAN notification is generated with the relevant information inside. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller849-3770/+7181
All conflicts seemed rather trivial, with some guidance from Saeed Mameed on the tc_ct.c one. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds1-1/+1
Pull libnvdimm fix from Dan Williams: "A one-line Fix for key ring search permissions to address a regression from -rc1" * tag 'libnvdimm-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/security: Fix key lookup permissions
2020-07-10Merge tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds5-18/+22
Pull cifs fixes from Steve French: "Four cifs/smb3 fixes: the three for stable fix problems found recently with change notification including a reference count leak" * tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number cifs: fix reference leak for tlink smb3: fix unneeded error message on change notify cifs: remove the retry in cifs_poxis_lock_set smb3: fix access denied on change notify request to some servers
2020-07-10Merge tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linuxLinus Torvalds1-0/+20
Pull coding style terminology documentation from Dan Williams: "The discussion has tapered off as well as the incoming ack, review, and sign-off tags. I did not see a reason to wait for the next merge window" * tag 'inclusive-terminology' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux: CodingStyle: Inclusive Terminology
2020-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds197-922/+1634
Pull networking fixes from David Miller: 1) Restore previous behavior of CAP_SYS_ADMIN wrt loading networking BPF programs, from Maciej Żenczykowski. 2) Fix dropped broadcasts in mac80211 code, from Seevalamuthu Mariappan. 3) Slay memory leak in nl80211 bss color attribute parsing code, from Luca Coelho. 4) Get route from skb properly in ip_route_use_hint(), from Miaohe Lin. 5) Don't allow anything other than ARPHRD_ETHER in llc code, from Eric Dumazet. 6) xsk code dips too deeply into DMA mapping implementation internals. Add dma_need_sync and use it. From Christoph Hellwig 7) Enforce power-of-2 for BPF ringbuf sizes. From Andrii Nakryiko. 8) Check for disallowed attributes when loading flow dissector BPF programs. From Lorenz Bauer. 9) Correct packet injection to L3 tunnel devices via AF_PACKET, from Jason A. Donenfeld. 10) Don't advertise checksum offload on ipa devices that don't support it. From Alex Elder. 11) Resolve several issues in TCP MD5 signature support. Missing memory barriers, bogus options emitted when using syncookies, and failure to allow md5 key changes in established states. All from Eric Dumazet. 12) Fix interface leak in hsr code, from Taehee Yoo. 13) VF reset fixes in hns3 driver, from Huazhong Tan. 14) Make loopback work again with ipv6 anycast, from David Ahern. 15) Fix TX starvation under high load in fec driver, from Tobias Waldekranz. 16) MLD2 payload lengths not checked properly in bridge multicast code, from Linus Lüssing. 17) Packet scheduler code that wants to find the inner protocol currently only works for one level of VLAN encapsulation. Allow Q-in-Q situations to work properly here, from Toke Høiland-Jørgensen. 18) Fix route leak in l2tp, from Xin Long. 19) Resolve conflict between the sk->sk_user_data usage of bpf reuseport support and various protocols. From Martin KaFai Lau. 20) Fix socket cgroup v2 reference counting in some situations, from Cong Wang. 21) Cure memory leak in mlx5 connection tracking offload support, from Eli Britstein. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits) mlxsw: pci: Fix use-after-free in case of failed devlink reload mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON() net: macb: fix call to pm_runtime in the suspend/resume functions net: macb: fix macb_suspend() by removing call to netif_carrier_off() net: macb: fix macb_get/set_wol() when moving to phylink net: macb: mark device wake capable when "magic-packet" property present net: macb: fix wakeup test in runtime suspend/resume routines bnxt_en: fix NULL dereference in case SR-IOV configuration fails libbpf: Fix libbpf hashmap on (I)LP32 architectures net/mlx5e: CT: Fix memory leak in cleanup net/mlx5e: Fix port buffers cell size value net/mlx5e: Fix 50G per lane indication net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash net/mlx5e: Fix VXLAN configuration restore after function reload net/mlx5e: Fix usage of rcu-protected pointer net/mxl5e: Verify that rpriv is not NULL net/mlx5: E-Switch, Fix vlan or qos setting in legacy mode net/mlx5: Fix eeprom support for SFP module cgroup: Fix sock_cgroup_data on big-endian. selftests: bpf: Fix detach from sockmap tests ...
2020-07-10mips: Remove compiler check in unroll macroNathan Chancellor1-3/+1
CONFIG_CC_IS_GCC is undefined when Clang is used, which breaks the build (see our Travis link below). Clang 8 was chosen as a minimum version for this check because there were some improvements around __builtin_constant_p in that release. In reality, MIPS was not even buildable until clang 9 so that check was not technically necessary. Just remove all compiler checks and just assume that we have a working compiler. Fixes: d4e60453266b ("Restore gcc check in mips asm/unroll.h") Link: https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/359642821 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-10inet: Remove an unnecessary argument of syn_ack_recalc().Kuniyuki Iwashima1-20/+13
Commit 0c3d79bce48034018e840468ac5a642894a521a3 ("tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT") introduces syn_ack_recalc() which decides if a minisock is held and a SYN+ACK is retransmitted or not. If rskq_defer_accept is not zero in syn_ack_recalc(), max_retries always has the same value because max_retries is overwritten by rskq_defer_accept in reqsk_timer_handler(). This commit adds three changes: - remove redundant non-zero check for rskq_defer_accept in reqsk_timer_handler(). - remove max_retries from the arguments of syn_ack_recalc() and use rskq_defer_accept instead. - rename thresh to max_syn_ack_retries for readability. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> CC: Julian Anastasov <ja@ssi.bg> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge branch 'mlxsw-Various-fixes'David S. Miller2-17/+39
Ido Schimmel says: ==================== mlxsw: Various fixes Fix two issues found by syzkaller. Patch #1 removes inappropriate usage of WARN_ON() following memory allocation failure. Constantly triggered when syzkaller injects faults. Patch #2 fixes a use-after-free that can be triggered by 'devlink dev info' following a failed devlink reload. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10mlxsw: pci: Fix use-after-free in case of failed devlink reloadIdo Schimmel1-16/+38
In case devlink reload failed, it is possible to trigger a use-after-free when querying the kernel for device info via 'devlink dev info' [1]. This happens because as part of the reload error path the PCI command interface is de-initialized and its mailboxes are freed. When the devlink '->info_get()' callback is invoked the device is queried via the command interface and the freed mailboxes are accessed. Fix this by initializing the command interface once during probe and not during every reload. This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c') and also allows user space to query the running firmware version (for example) from the device after a failed reload. [1] BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline] BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675 Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355 CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xf6/0x16e lib/dump_stack.c:118 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 check_memory_region_inline mm/kasan/generic.c:186 [inline] check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192 memcpy+0x39/0x60 mm/kasan/common.c:106 memcpy include/linux/string.h:406 [inline] mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675 mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335 mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline] mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline] mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985 mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline] mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090 devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588 devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648 genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575 netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245 __netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353 genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638 genl_family_rcv_msg net/netlink/genetlink.c:733 [inline] genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469 genl_rcv+0x24/0x40 net/netlink/genetlink.c:764 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0x150/0x190 net/socket.c:672 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363 ___sys_sendmsg+0xff/0x170 net/socket.c:2417 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a9c8336f6544 ("mlxsw: core: Add support for devlink info command") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()Ido Schimmel1-1/+1
We should not trigger a warning when a memory allocation fails. Remove the WARN_ON(). The warning is constantly triggered by syzkaller when it is injecting faults: [ 2230.758664] FAULT_INJECTION: forcing a failure. [ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0 [ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28 ... [ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0 [ 2230.898179] Kernel panic - not syncing: panic_on_warn set ... [ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28 [ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Fixes: 3057224e014c ("mlxsw: spectrum_router: Implement FIB offload in deferred work") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge branch 'devlink-health'David S. Miller5-76/+216
Moshe Shemesh says: ==================== Add devlink-health support for devlink ports Implement support for devlink health reporters on per-port basis. This patchset comes to fix a design issue as some health reporters report on errors and run recovery on device level while the actual functionality is on port level. As for the current implemented devlink health reporters it is relevant only to Tx and Rx reporters of mlx5, which has only one port, so no real effect on functionality, but this should be fixed before more drivers will use devlink health reporters. First part in the series prepares common functions parts for health reporter implementation. Second introduces required API to devlink-health and mlx5e ones demonstrate its usage and implement the feature for mlx5 driver. The per-port reporter functionality is achieved by adding a list of devlink_health_reporters to devlink_port struct in a manner similar to existing device infrastructure. This is the only major difference and it makes possible to fully reuse device reporters operations. The effect will be seen in conjunction with iproute2 additions and will affect all devlink health commands. User can distinguish between device and port reporters by looking at a devlink handle. Port reporters have a port index at the end of the address and such addresses can be provided as a parameter in every place where devlink-health accepted it. These can be obtained from devlink port show command. For example: $ devlink health show pci/0000:00:0a.0: reporter fw state healthy error 0 recover 0 auto_dump true pci/0000:00:0a.0/1: reporter tx state healthy error 0 recover 0 grace_period 500 auto_recover true auto_dump true $ devlink health set pci/0000:00:0a.0/1 reporter tx grace_period 1000 \ auto_recover false auto_dump false $ devlink health show pci/0000:00:0a.0/1 reporter tx pci/0000:00:0a.0/1: reporter tx state healthy error 0 recover 0 grace_period 1000 auto_recover flase auto_dump false Note: User can use the same devlink health uAPI commands can get now either port health reporter or device health reporter. For example, the recover command: Before this patchset: devlink health recover DEV reporter REPORTER_NAME After this patchset: devlink health recover { DEV | DEV/PORT_INDEX } reporter REPORTER_NAME Changes v1 -> v2: Fixed functions comment to match parameters list. Changes v2 -> v3: Added motivation to cover letter and note on uAPI. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net/mlx5e: Move devlink-health rx and tx reporters to devlink portVladyslav Tarasiuk2-15/+7
Utilize new devlink-health port reporters API to move rx and tx reporters from device to port. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net/mlx5e: Move devlink port register and unregister callsVladyslav Tarasiuk1-10/+5
Register devlink ports upon NIC init. TX and RX health reporters handle errors which may occur early on at driver initialization. And because these reporters are to be moved to port context, they require devlink ports to be already registered. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10devlink: Add devlink health port reporters APIVladyslav Tarasiuk2-0/+59
In order to use new devlink port health reporters infrastructure, add corresponding constructor and destructor functions. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10devlink: Implement devlink health reporters on per-port basisVladyslav Tarasiuk2-17/+79
Add devlink-health reporter support on per-port basis. The main difference existing devlink-health is that port reporters are stored in per-devlink_port lists. Upon creation of such health reporter the reference to a port it belongs to is stored in reporter struct. Fill the port index attribute in devlink-health response to allow devlink userspace utility to distinguish between device and port reporters. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10devlink: Create generic devlink health reporter search functionVladyslav Tarasiuk1-4/+14
Add a generic __devlink_health_reporter_find_by_name() that can be used with arbitrary devlink health reporter list. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10devlink: Rework devlink health reporter destructorVladyslav Tarasiuk1-13/+24
Devlink keeps its own reference to every reporter in a list and inits refcount to 1 upon reporter's creation. Existing destructor waits to free the memory indefinitely using msleep() until all references except devlink's own are put. Rework this mechanism by moving memory free routine to a separate function, which is called when the last reporter reference is put. Besides, it allows to call __devlink_health_reporter_destroy() while locked on a reporters list mutex in symmetry to __devlink_health_reporter_create(), which is required in follow-up patch. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10devlink: Refactor devlink health reporter constructorVladyslav Tarasiuk1-17/+28
Prepare a common routine in devlink_health_reporter_create() for usage in similar functions for devlink port health reporters. Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge branch 'macb-WOL-fixes'David S. Miller1-12/+19
Nicolas Ferre says: ==================== net: macb: Wake-on-Lan magic packet fixes and GEM handling Here is a split series to fix WoL magic-packet on the current macb driver. Only fixes in this one based on current net/master. Changes in v5: - Addressed the error code returned by phylink_ethtool_set_wol() as suggested by Russell. If PHY handles WoL, MAC doesn't stay in the way. - Removed Florian's tag on 3/5 because of the above changes. - Correct the "Fixes" tag on 1/5. Changes in v4: - Pure bug fix series for 'net'. GEM addition and MACB update removed: will be sent later. Changes in v3: - Revert some of the v2 changes done in macb_resume(). Now the resume function supports in-depth re-configuration of the controller in order to deal with deeper sleep states. Basically as it was before changes introduced by this series - Tested for non-regression with our deeper Power Management mode which cuts power to the controller completely Changes in v2: - Add patch 4/7 ("net: macb: fix macb_suspend() by removing call to netif_carrier_off()") needed for keeping phy state consistent - Add patch 5/7 ("net: macb: fix call to pm_runtime in the suspend/resume functions") that prevent putting the macb in runtime pm suspend mode when WoL is used - Collect review tags on 3 first patches from Florian: Thanks! - Review of macb_resume() function - Addition of pm_wakeup_event() in both MACB and GEM WoL IRQ handlers ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix call to pm_runtime in the suspend/resume functionsNicolas Ferre1-2/+4
The calls to pm_runtime_force_suspend/resume() functions are only relevant if the device is not configured to act as a WoL wakeup source. Add the device_may_wakeup() test before calling them. Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Sergio Prado <sergio.prado@e-labworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix macb_suspend() by removing call to netif_carrier_off()Nicolas Ferre1-1/+0
As we now use the phylink call to phylink_stop() in the non-WoL path, there is no need for this call to netif_carrier_off() anymore. It can disturb the underlying phylink FSM. Fixes: 7897b071ac3b ("net: macb: convert to phylink") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix macb_get/set_wol() when moving to phylinkNicolas Ferre1-6/+12
Keep previous function goals and integrate phylink actions to them. phylink_ethtool_get_wol() is not enough to figure out if Ethernet driver supports Wake-on-Lan. Initialization of "supported" and "wolopts" members is done in phylink function, no need to keep them in calling function. phylink_ethtool_set_wol() return value is considered and determines if the MAC has to handle WoL or not. The case where the PHY doesn't implement WoL leads to the MAC configuring it to provide this feature. Fixes: 7897b071ac3b ("net: macb: convert to phylink") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Antoine Tenart <antoine.tenart@bootlin.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: mark device wake capable when "magic-packet" property presentNicolas Ferre1-1/+1
Change the way the "magic-packet" DT property is handled in the macb_probe() function, matching DT binding documentation. Now we mark the device as "wakeup capable" instead of calling the device_init_wakeup() function that would enable the wakeup source. For Ethernet WoL, enabling the wakeup_source is done by using ethtool and associated macb_set_wol() function that already calls device_set_wakeup_enable() for this purpose. That would reduce power consumption by cutting more clocks if "magic-packet" property is set but WoL is not configured by ethtool. Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Cc: Sergio Prado <sergio.prado@e-labworks.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10net: macb: fix wakeup test in runtime suspend/resume routinesNicolas Ferre1-2/+2
Use the proper struct device pointer to check if the wakeup flag and wakeup source are positioned. Use the one passed by function call which is equivalent to &bp->dev->dev.parent. It's preventing the trigger of a spurious interrupt in case the Wake-on-Lan feature is used. Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support") Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Cc: Harini Katakam <harini.katakam@xilinx.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10bnxt_en: fix NULL dereference in case SR-IOV configuration failsDavide Caratti1-1/+1
we need to set 'active_vfs' back to 0, if something goes wrong during the allocation of SR-IOV resources: otherwise, further VF configurations will wrongly assume that bp->pf.vf[x] are valid memory locations, and commands like the ones in the following sequence: # echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs # ip link set dev ens1f0np0 up # ip link set dev ens1f0np0 vf 0 trust on will cause a kernel crash similar to this: bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV BUG: kernel NULL pointer dereference, address: 0000000000000014 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 43 PID: 2059 Comm: ip Tainted: G I 5.8.0-rc2.upstream+ #871 Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019 RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en] Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89 RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900 RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008 R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000 R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0 FS: 00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: do_setlink+0x994/0xfe0 __rtnl_newlink+0x544/0x8d0 rtnl_newlink+0x47/0x70 rtnetlink_rcv_msg+0x29f/0x350 netlink_rcv_skb+0x4a/0x110 netlink_unicast+0x21d/0x300 netlink_sendmsg+0x329/0x450 sock_sendmsg+0x5b/0x60 ____sys_sendmsg+0x204/0x280 ___sys_sendmsg+0x88/0xd0 __sys_sendmsg+0x5e/0xa0 do_syscall_64+0x47/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.") Reported-by: Fei Liu <feliu@redhat.com> CC: Jonathan Toppins <jtoppins@redhat.com> CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'mlx5-updates-2020-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller10-280/+672
Saeed Mahameed says: ==================== mlx5-updates-2020-07-09 This series provides updates to mlx5 CT (connection tracking) offloads For more information please see tag log below. Please pull and let me know if there is any problem. The following conflict is expected when net is merged into net-next: to resolve just use the hunks from net-next. <<<<<<< HEAD (net-next) mlx5_tc_ct_del_ft_entry(ct_priv, entry); kfree(entry); ======= (net) mlx5_tc_ct_entry_del_rules(ct_priv, entry); kfree(entry); >>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af mlx5 connection tracking offloads updates: 1) Restore CT state from lookup in zone instead of tupleid On a miss, Use this zone + 5 tuple taken from the skb, to lookup the CT entry and restore it, instead of the driver allocated tuple id. This improves flow insertion rate by avoiding the allocation of a header rewrite context to maintain the tupleid. 2) Re-use modify header HW objects for identical modify actions. 3) Expand tunnel register mappings Reg_c1 is 32 bits wide. Before this patchset, 24 bit were allocated for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel options mappings. Restoring the ct state from zone lookup instead of tuple id requires reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel mappings. Expand tunnel and tunnel options register mappings to 12 bit each. 4) Trivial cleanup and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller4-15/+26
Alexei Starovoitov says: ==================== pull-request: bpf 2020-07-09 The following pull-request contains BPF updates for your *net* tree. We've added 4 non-merge commits during the last 1 day(s) which contain a total of 4 files changed, 26 insertions(+), 15 deletions(-). The main changes are: 1) fix crash in libbpf on 32-bit archs, from Jakub and Andrii. 2) fix crash when l2tp and bpf_sk_reuseport conflict, from Martin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'mlx5-fixes-2020-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller14-62/+196
Saeed Mahameed says: ==================== mlx5 fixes 2020-07-02 This series introduces some fixes to mlx5 driver. V1->v2: - Drop "ip -s" patch and mirred device hold reference patch. - Will revise them in a later submission. Please pull and let me know if there is any problem. For -stable v5.2 ('net/mlx5: Fix eeprom support for SFP module') For -stable v5.4 ('net/mlx5e: Fix 50G per lane indication') For -stable v5.5 ('net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash') ('net/mlx5e: Fix VXLAN configuration restore after function reload') For -stable v5.7 ('net/mlx5e: CT: Fix memory leak in cleanup') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge branch 'udp_tunnel-add-NIC-RX-port-offload-infrastructure'David S. Miller35-397/+2608
Jakub Kicinski says: ==================== udp_tunnel: add NIC RX port offload infrastructure Kernel has a facility to notify drivers about the UDP tunnel ports so that devices can recognize tunneled packets. This is important mostly for RX - devices which don't support CHECKSUM_COMPLETE can report checksums of inner packets, and compute RSS over inner headers. Some drivers also match the UDP tunnel ports also for TX, although doing so may lead to false positives and negatives. Unfortunately the user experience when trying to take adavantage of these facilities is suboptimal. First of all there is no way for users to check which ports are offloaded. Many drivers resort to printing messages to aid debugging, other use debugfs. Even worse the availability of the RX features (NETIF_F_RX_UDP_TUNNEL_PORT) is established purely on the basis of the driver having the ndos installed. For most drivers, however, the ability to perform offloads is contingent on device capabilities (driver support multiple device and firmware versions). Unless driver resorts to hackish clearing of features set incorrectly by the core - users are left guessing whether their device really supports UDP tunnel port offload or not. There is currently no way to indicate or configure whether RX features include just the checksum offload or checksum and using inner headers for RSS. Many drivers default to not using inner headers for RSS because most implementations populate the source port with entropy from the inner headers. This, however, is not always the case, for example certain switches are only able to use a fixed source port during encapsulation. We have also seen many driver authors get the intricacies of UDP tunnel port offloads wrong. Most commonly the drivers forget to perform reference counting, or take sleeping locks in the callbacks. This work tries to improve the situation by pulling the UDP tunnel port table maintenance out of the drivers. It turns out that almost all drivers maintain a fixed size table of ports (in most cases one per tunnel type), so we can take care of all the refcounting in the core, and let the driver specify if they need to sleep in the callbacks or not. The new common implementation will also support replacing ports - when a port is removed from a full table it will try to find a previously missing port to take its place. This patch only implements the core functionality along with a few drivers I was hoping to test manually [1] along with a test based on a netdevsim implementation. Following patches will convert all the drivers. Once that's complete we can remove the ndos, and rely directly on the new infrastrucutre. Then after RSS (RXFH) is converted to netlink we can add the ability to configure the use of inner RSS headers for UDP tunnels. [1] Unfortunately I wasn't able to, turns out 2 of the devices I had access to were older generation or had old FW, and they did not actually support UDP tunnel port notifications (see the second paragraph). The thrid device appears to program the UDP ports correctly but it generates bad UDP checksums with or without these patches. Long story short - I'd appreciate reviews and testing here.. v4: - better build fix (hopefully this one does it..) v3: - fix build issue; - improve bnxt changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10mlx4: convert to new udp_tunnel_nic infraJakub Kicinski2-84/+25
Convert to new infra, make use of the ability to sleep in the callback. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10bnxt: convert to new udp_tunnel_nic infraJakub Kicinski2-113/+40
Convert to new infra, taking advantage of sleeping in callbacks. v2: - use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication that the offload is active. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: convert to new udp_tunnel_nic infraJakub Kicinski2-139/+44
Make use of new common udp_tunnel_nic infra. ixgbe supports IPv4 only, and only single VxLAN and Geneve ports (one each). v2: - split out the RXCSUM feature handling to separate change; - declare structs separately; - use ti.type instead of assuming table 0 is VxLAN; - move setting netdev->udp_tunnel_nic_info to its own switch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: don't clear UDP tunnel ports when RXCSUM is disabledJakub Kicinski1-20/+0
It appears the clearing of UDP tunnel ports when RXCSUM is disabled is unnecessary. Driver will not pay attention to checksum bits if RXCSUM is not set, so we can let the hardware parse the packets. Note that the UDP tunnel port NDO handlers don't pay attention to the state of RXCSUM, so the ports could had been re-programmed, anyway. This cleanup simplifies later conversion patch. v2: - break this out of the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10selftests: net: add a test for UDP tunnel info infraJakub Kicinski1-0/+786
Add validating the UDP tunnel infra works. $ ./udp_tunnel_nic.sh PASSED all 383 checks Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10netdevsim: add UDP tunnel port offload supportJakub Kicinski5-2/+224
Add UDP tunnel port handlers to our fake driver so we can test the core infra. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ethtool: add tunnel info interfaceJakub Kicinski12-1/+472
Add an interface to report offloaded UDP ports via ethtool netlink. Now that core takes care of tracking which UDP tunnel ports the NICs are aware of we can quite easily export this information out to user space. The responsibility of writing the netlink dumps is split between ethtool code and udp_tunnel_nic.c - since udp_tunnel module may not always be loaded, yet we should always report the capabilities of the NIC. $ ethtool --show-tunnels eth0 Tunnel information for eth0: UDP port table 0: Size: 4 Types: vxlan No entries UDP port table 1: Size: 4 Types: geneve, vxlan-gpe Entries (1): port 1230, vxlan-gpe v4: - back to v2, build fix is now directly in udp_tunnel.h v3: - don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET not set. v2: - fix string set count, - reorder enums in the uAPI, - fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset in docs and comments. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10udp_tunnel: add central NIC RX port offload infrastructureJakub Kicinski8-5/+983
Cater to devices which: (a) may want to sleep in the callbacks; (b) only have IPv4 support; (c) need all the programming to happen while the netdev is up. Drivers attach UDP tunnel offload info struct to their netdevs, where they declare how many UDP ports of various tunnel types they support. Core takes care of tracking which ports to offload. Use a fixed-size array since this matches what almost all drivers do, and avoids a complexity and uncertainty around memory allocations in an atomic context. Make sure that tunnel drivers don't try to replay the ports when new NIC netdev is registered. Automatic replays would mess up reference counting, and will be removed completely once all drivers are converted. v4: - use a #define NULL to avoid build issues with CONFIG_INET=n. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10udp_tunnel: re-number the offload tunnel typesJakub Kicinski1-3/+3
Make it possible to use tunnel types as flags more easily. There doesn't appear to be any user using the type as an array index, so this should make no difference. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10debugfs: make sure we can remove u32_array files cleanlyJakub Kicinski5-30/+31
debugfs_create_u32_array() allocates a small structure to wrap the data and size information about the array. If users ever try to remove the file this leads to a leak since nothing ever frees this wrapper. That said there are no upstream users of debugfs_create_u32_array() that'd remove a u32 array file (we only have one u32 array user in CMA), so there is no real bug here. Make callers pass a wrapper they allocated. This way the lifetime management of the wrapper is on the caller, and we can avoid the potential leak in debugfs. CC: Chucheng Luo <luochucheng@vivo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds7-36/+63
Pull rdma fixes from Jason Gunthorpe: "Small update, a few more merge window bugs and normal driver bug fixes: - Two merge window regressions in mlx5: a error path bug found by syzkaller and some lost code during a rework preventing ipoib from working in some configurations - Silence clang compilation warning in OPA related code - Fix a long standing race condition in ib_nl for ACM - Resolve when the HFI1 is shutdown" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Set PD pointers for the error flow unwind IB/mlx5: Fix 50G per lane indication RDMA/siw: Fix reporting vendor_part_id IB/sa: Resolv use-after-free in ib_nl_make_request() IB/hfi1: Do not destroy link_wq when the device is shut down IB/hfi1: Do not destroy hfi1_wq when the device is shut down RDMA/mlx5: Fix legacy IPoIB QP initialization IB/hfi1: Add explicit cast OPA_MTU_8192 to 'enum ib_mtu'
2020-07-10Merge tag 'linux-kselftest-fixes-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftestLinus Torvalds5-50/+53
Pull kselftest fixes from Shuah Khan: "TPM2 test changes to run on python3 and kselftest framework fix to incorrect return type" * tag 'linux-kselftest-fixes-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kselftest: ksft_test_num return type should be unsigned selftests: tpm: upgrade TPM2 tests from Python 2 to Python 3