aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/export-to-postgresql.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-05-03bnxt_en: Initiallize bp->ptp_lock first before using itMichael Chan1-8/+7
bnxt_ptp_init() calls bnxt_ptp_init_rtc() which will acquire the ptp_lock spinlock. The spinlock is not initialized until later. Move the bnxt_ptp_init_rtc() call after the spinlock is initialized. Fixes: 24ac1ecd5240 ("bnxt_en: Add driver support to use Real Time Counter for PTP") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flagSomnath Kotur1-5/+4
bnxt_open() can fail in this code path, especially on a VF when it fails to reserve default rings: bnxt_open() __bnxt_open_nic() bnxt_clear_int_mode() bnxt_init_dflt_ring_mode() RX rings would be set to 0 when we hit this error path. It is possible for a subsequent bnxt_open() call to potentially succeed with a code path like this: bnxt_open() bnxt_hwrm_if_change() bnxt_fw_init_one() bnxt_fw_init_one_p3() bnxt_set_dflt_rfs() bnxt_rfs_capable() bnxt_hwrm_reserve_rings() On older chips, RFS is capable if we can reserve the number of vnics that is equal to RX rings + 1. But since RX rings is still set to 0 in this code path, we may mistakenly think that RFS is supported for 0 RX rings. Later, when the default RX rings are reserved and we try to enable RFS, it would fail and cause bnxt_open() to fail unnecessarily. We fix this in 2 places. bnxt_rfs_capable() will always return false if RX rings is not yet set. bnxt_init_dflt_ring_mode() will call bnxt_set_dflt_rfs() which will always clear the RFS flags if RFS is not supported. Fixes: 20d7d1c5c9b1 ("bnxt_en: reliably allocate IRQ table on reset to avoid crash") Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03smsc911x: allow using IRQ0Sergey Shtylyov1-1/+1
The AlphaProject AP-SH4A-3A/AP-SH4AD-0A SH boards use IRQ0 for their SMSC LAN911x Ethernet chip, so the networking on them must have been broken by commit 965b2aa78fbc ("net/smsc911x: fix irq resource allocation failure") which filtered out 0 as well as the negative error codes -- it was kinda correct at the time, as platform_get_irq() could return 0 on of_irq_get() failure and on the actual 0 in an IRQ resource. This issue was fixed by me (back in 2016!), so we should be able to fix this driver to allow IRQ0 usage again... When merging this to the stable kernels, make sure you also merge commit e330b9a6bb35 ("platform: don't return 0 from platform_get_irq[_byname]() on error") -- that's my fix to platform_get_irq() for the DT platforms... Fixes: 965b2aa78fbc ("net/smsc911x: fix irq resource allocation failure") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/656036e4-6387-38df-b8a7-6ba683b16e63@omp.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03net: sfp: Add tx-fault workaround for Huawei MA5671A SFP ONTMatthew Hagan1-1/+11
As noted elsewhere, various GPON SFP modules exhibit non-standard TX-fault behaviour. In the tested case, the Huawei MA5671A, when used in combination with a Marvell mv88e6085 switch, was found to persistently assert TX-fault, resulting in the module being disabled. This patch adds a quirk to ignore the SFP_F_TX_FAULT state, allowing the module to function. Change from v1: removal of erroneous return statment (Andrew Lunn) Signed-off-by: Matthew Hagan <mnhagan88@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220502223315.1973376-1-mnhagan88@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03net: rds: acquire refcount on TCP socketsTetsuo Handa1-0/+8
syzbot is reporting use-after-free read in tcp_retransmit_timer() [1], for TCP socket used by RDS is accessing sock_net() without acquiring a refcount on net namespace. Since TCP's retransmission can happen after a process which created net namespace terminated, we need to explicitly acquire a refcount. Link: https://syzkaller.appspot.com/bug?extid=694120e1002c117747ed [1] Reported-by: syzbot <syzbot+694120e1002c117747ed@syzkaller.appspotmail.com> Fixes: 26abe14379f8e2fa ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Fixes: 8a68173691f03661 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+694120e1002c117747ed@syzkaller.appspotmail.com> Link: https://lore.kernel.org/r/a5fb1fc4-2284-3359-f6a0-e4e390239d7b@I-love.SAKURA.ne.jp Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03selftests/net: so_txtime: usage(): fix documentation of default clockMarc Kleine-Budde1-1/+1
The program uses CLOCK_TAI as default clock since it was added to the Linux repo. In commit: | 040806343bb4 ("selftests/net: so_txtime multi-host support") a help text stating the wrong default clock was added. This patch fixes the help text. Fixes: 040806343bb4 ("selftests/net: so_txtime multi-host support") Cc: Carlos Llamas <cmllamas@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20220502094638.1921702-3-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03selftests/net: so_txtime: fix parsing of start time stamp on 32 bit systemsMarc Kleine-Budde1-1/+1
This patch fixes the parsing of the cmd line supplied start time on 32 bit systems. A "long" on 32 bit systems is only 32 bit wide and cannot hold a timestamp in nano second resolution. Fixes: 040806343bb4 ("selftests/net: so_txtime multi-host support") Cc: Carlos Llamas <cmllamas@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20220502094638.1921702-2-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operationalIdo Schimmel1-0/+3
In emulated environments, the bridge ports enslaved to br1 get a carrier before changing br1's PVID. This means that by the time the PVID is changed, br1 is already operational and configured with an IPv6 link-local address. When the test is run with netdevs registered by mlxsw, changing the PVID is vetoed, as changing the VID associated with an existing L3 interface is forbidden. This restriction is similar to the 8021q driver's restriction of changing the VID of an existing interface. Fix this by taking br1 down and bringing it back up when it is fully configured. With this fix, the test reliably passes on top of both the SW and HW data paths (emulated or not). Fixes: 239e754af854 ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20220502084507.364774-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03net: emaclite: Add error handling for of_address_to_resource()Shravya Kumbham1-3/+12
check the return value of of_address_to_resource() and also add missing of_node_put() for np and npp nodes. Fixes: e0a3bc65448c ("net: emaclite: Support multiple phys connected to one MDIO bus") Addresses-Coverity: Event check_return value. Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03net: emaclite: Don't advertise 1000BASE-T and do auto negotiationShravya Kumbham1-15/+0
In xemaclite_open() function we are setting the max speed of emaclite to 100Mb using phy_set_max_speed() function so, there is no need to write the advertising registers to stop giga-bit speed and the phy_start() function starts the auto-negotiation so, there is no need to handle it separately using advertising registers. Remove the phy_read and phy_write of advertising registers in xemaclite_open() function. Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-01net: dsa: b53: convert to phylink_pcsRussell King (Oracle)5-72/+75
Convert B53 to use phylink_pcs for the serdes rather than hooking it into the MAC-layer callbacks. Fixes: 81c1681cbb9f ("net: dsa: b53: mark as non-legacy") Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01pci_irq_vector() can't be used in atomic context any longer. This conflictsThomas Gleixner1-8/+8
with the usage of this function in nic_mbx_intr_handler(). Cache the Linux interrupt numbers in struct nicpf and use that cache in the interrupt handler to select the mailbox. Fixes: 495c66aca3da ("genirq/msi: Convert to new functions") Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Sunil Goutham <sgoutham@marvell.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org Link: https://bugzilla.redhat.com/show_bug.cgi?id=2041772 Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugsDuoming Zhou1-1/+1
There are destructive operations such as nfcmrvl_fw_dnld_abort and gpio_free in nfcmrvl_nci_unregister_dev. The resources such as firmware, gpio and so on could be destructed while the upper layer functions such as nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame is executing, which leads to double-free, use-after-free and null-ptr-deref bugs. There are three situations that could lead to double-free bugs. The first situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_fw_dnld_start | ... | nfcmrvl_nci_unregister_dev release_firmware() | nfcmrvl_fw_dnld_abort kfree(fw) //(1) | fw_dnld_over | release_firmware ... | kfree(fw) //(2) | ... The second situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_fw_dnld_start | ... | mod_timer | (wait a time) | fw_dnld_timeout | nfcmrvl_nci_unregister_dev fw_dnld_over | nfcmrvl_fw_dnld_abort release_firmware | fw_dnld_over kfree(fw) //(1) | release_firmware ... | kfree(fw) //(2) The third situation is shown below: (Thread 1) | (Thread 2) nfcmrvl_nci_recv_frame | if(..->fw_download_in_progress)| nfcmrvl_fw_dnld_recv_frame | queue_work | | fw_dnld_rx_work | nfcmrvl_nci_unregister_dev fw_dnld_over | nfcmrvl_fw_dnld_abort release_firmware | fw_dnld_over kfree(fw) //(1) | release_firmware | kfree(fw) //(2) The firmware struct is deallocated in position (1) and deallocated in position (2) again. The crash trace triggered by POC is like below: BUG: KASAN: double-free or invalid-free in fw_dnld_over Call Trace: kfree fw_dnld_over nfcmrvl_nci_unregister_dev nci_uart_tty_close tty_ldisc_kill tty_ldisc_hangup __tty_hangup.part.0 tty_release ... What's more, there are also use-after-free and null-ptr-deref bugs in nfcmrvl_fw_dnld_start. If we deallocate firmware struct, gpio or set null to the members of priv->fw_dnld in nfcmrvl_nci_unregister_dev, then, we dereference firmware, gpio or the members of priv->fw_dnld in nfcmrvl_fw_dnld_start, the UAF or NPD bugs will happen. This patch reorders destructive operations after nci_unregister_device in order to synchronize between cleanup routine and firmware download routine. The nci_unregister_device is well synchronized. If the device is detaching, the firmware download routine will goto error. If firmware download routine is executing, nci_unregister_device will wait until firmware download routine is finished. Fixes: 3194c6870158 ("NFC: nfcmrvl: add firmware download support") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01nfc: replace improper check device_is_registered() in netlink related functionsDuoming Zhou1-15/+14
The device_is_registered() in nfc core is used to check whether nfc device is registered in netlink related functions such as nfc_fw_download(), nfc_dev_up() and so on. Although device_is_registered() is protected by device_lock, there is still a race condition between device_del() and device_is_registered(). The root cause is that kobject_del() in device_del() is not protected by device_lock. (cleanup task) | (netlink task) | nfc_unregister_device | nfc_fw_download device_del | device_lock ... | if (!device_is_registered)//(1) kobject_del//(2) | ... ... | device_unlock The device_is_registered() returns the value of state_in_sysfs and the state_in_sysfs is set to zero in kobject_del(). If we pass check in position (1), then set zero in position (2). As a result, the check in position (1) is useless. This patch uses bool variable instead of device_is_registered() to judge whether the nfc device is registered, which is well synchronized. Fixes: 3e256b8f8dfa ("NFC: add nfc subsystem core") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-01net: stmmac: disable Split Header (SPH) for Intel platformsTan Tee Min3-1/+3
Based on DesignWare Ethernet QoS datasheet, we are seeing the limitation of Split Header (SPH) feature is not supported for Ipv4 fragmented packet. This SPH limitation will cause ping failure when the packets size exceed the MTU size. For example, the issue happens once the basic ping packet size is larger than the configured MTU size and the data is lost inside the fragmented packet, replaced by zeros/corrupted values, and leads to ping fail. So, disable the Split Header for Intel platforms. v2: Add fixes tag in commit message. Fixes: 67afd6d1cfdf("net: stmmac: Add Split Header support and enable it in XGMAC cores") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Ong, Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30mld: respect RCU rules in ip6_mc_source() and ip6_mc_msfilter()Eric Dumazet1-4/+4
Whenever RCU protected list replaces an object, the pointer to the new object needs to be updated _before_ the call to kfree_rcu() or call_rcu() Also ip6_mc_msfilter() needs to update the pointer before releasing the mc_lock mutex. Note that linux-5.13 was supporting kfree_rcu(NULL, rcu), so this fix does not need the conditional test I was forced to use in the equivalent patch for IPv4. Fixes: 882ba1f73c06 ("mld: convert ipv6_mc_socklist->sflist to RCU") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30net: igmp: respect RCU rules in ip_mc_source() and ip_mc_msfilter()Eric Dumazet1-3/+6
syzbot reported an UAF in ip_mc_sf_allow() [1] Whenever RCU protected list replaces an object, the pointer to the new object needs to be updated _before_ the call to kfree_rcu() or call_rcu() Because kfree_rcu(ptr, rcu) got support for NULL ptr only recently in commit 12edff045bc6 ("rcu: Make kfree_rcu() ignore NULL pointers"), I chose to use the conditional to make sure stable backports won't miss this detail. if (psl) kfree_rcu(psl, rcu); net/ipv6/mcast.c has similar issues, addressed in a separate patch. [1] BUG: KASAN: use-after-free in ip_mc_sf_allow+0x6bb/0x6d0 net/ipv4/igmp.c:2655 Read of size 4 at addr ffff88807d37b904 by task syz-executor.5/908 CPU: 0 PID: 908 Comm: syz-executor.5 Not tainted 5.18.0-rc4-syzkaller-00064-g8f4dd16603ce #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 ip_mc_sf_allow+0x6bb/0x6d0 net/ipv4/igmp.c:2655 raw_v4_input net/ipv4/raw.c:190 [inline] raw_local_deliver+0x4d1/0xbe0 net/ipv4/raw.c:218 ip_protocol_deliver_rcu+0xcf/0xb30 net/ipv4/ip_input.c:193 ip_local_deliver_finish+0x2ee/0x4c0 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_local_deliver+0x1b3/0x200 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:461 [inline] ip_rcv_finish+0x1cb/0x2f0 net/ipv4/ip_input.c:437 NF_HOOK include/linux/netfilter.h:307 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] ip_rcv+0xaa/0xd0 net/ipv4/ip_input.c:556 __netif_receive_skb_one_core+0x114/0x180 net/core/dev.c:5405 __netif_receive_skb+0x24/0x1b0 net/core/dev.c:5519 netif_receive_skb_internal net/core/dev.c:5605 [inline] netif_receive_skb+0x13e/0x8e0 net/core/dev.c:5664 tun_rx_batched.isra.0+0x460/0x720 drivers/net/tun.c:1534 tun_get_user+0x28b7/0x3e30 drivers/net/tun.c:1985 tun_chr_write_iter+0xdb/0x200 drivers/net/tun.c:2015 call_write_iter include/linux/fs.h:2050 [inline] new_sync_write+0x38a/0x560 fs/read_write.c:504 vfs_write+0x7c0/0xac0 fs/read_write.c:591 ksys_write+0x127/0x250 fs/read_write.c:644 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f3f12c3bbff Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 99 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 cc fd ff ff 48 RSP: 002b:00007f3f13ea9130 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f3f12d9bf60 RCX: 00007f3f12c3bbff RDX: 0000000000000036 RSI: 0000000020002ac0 RDI: 00000000000000c8 RBP: 00007f3f12ce308d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000036 R11: 0000000000000293 R12: 0000000000000000 R13: 00007fffb68dd79f R14: 00007f3f13ea9300 R15: 0000000000022000 </TASK> Allocated by task 908: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:45 [inline] set_alloc_info mm/kasan/common.c:436 [inline] ____kasan_kmalloc mm/kasan/common.c:515 [inline] ____kasan_kmalloc mm/kasan/common.c:474 [inline] __kasan_kmalloc+0xa6/0xd0 mm/kasan/common.c:524 kasan_kmalloc include/linux/kasan.h:234 [inline] __do_kmalloc mm/slab.c:3710 [inline] __kmalloc+0x209/0x4d0 mm/slab.c:3719 kmalloc include/linux/slab.h:586 [inline] sock_kmalloc net/core/sock.c:2501 [inline] sock_kmalloc+0xb5/0x100 net/core/sock.c:2492 ip_mc_source+0xba2/0x1100 net/ipv4/igmp.c:2392 do_ip_setsockopt net/ipv4/ip_sockglue.c:1296 [inline] ip_setsockopt+0x2312/0x3ab0 net/ipv4/ip_sockglue.c:1432 raw_setsockopt+0x274/0x2c0 net/ipv4/raw.c:861 __sys_setsockopt+0x2db/0x6a0 net/socket.c:2180 __do_sys_setsockopt net/socket.c:2191 [inline] __se_sys_setsockopt net/socket.c:2188 [inline] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2188 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 753: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track+0x21/0x30 mm/kasan/common.c:45 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370 ____kasan_slab_free mm/kasan/common.c:366 [inline] ____kasan_slab_free+0x13d/0x180 mm/kasan/common.c:328 kasan_slab_free include/linux/kasan.h:200 [inline] __cache_free mm/slab.c:3439 [inline] kmem_cache_free_bulk+0x69/0x460 mm/slab.c:3774 kfree_bulk include/linux/slab.h:437 [inline] kfree_rcu_work+0x51c/0xa10 kernel/rcu/tree.c:3318 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 Last potentially related work creation: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 __kasan_record_aux_stack+0x7e/0x90 mm/kasan/generic.c:348 kvfree_call_rcu+0x74/0x990 kernel/rcu/tree.c:3595 ip_mc_msfilter+0x712/0xb60 net/ipv4/igmp.c:2510 do_ip_setsockopt net/ipv4/ip_sockglue.c:1257 [inline] ip_setsockopt+0x32e1/0x3ab0 net/ipv4/ip_sockglue.c:1432 raw_setsockopt+0x274/0x2c0 net/ipv4/raw.c:861 __sys_setsockopt+0x2db/0x6a0 net/socket.c:2180 __do_sys_setsockopt net/socket.c:2191 [inline] __se_sys_setsockopt net/socket.c:2188 [inline] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2188 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Second to last potentially related work creation: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 __kasan_record_aux_stack+0x7e/0x90 mm/kasan/generic.c:348 call_rcu+0x99/0x790 kernel/rcu/tree.c:3074 mpls_dev_notify+0x552/0x8a0 net/mpls/af_mpls.c:1656 notifier_call_chain+0xb5/0x200 kernel/notifier.c:84 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1938 call_netdevice_notifiers_extack net/core/dev.c:1976 [inline] call_netdevice_notifiers net/core/dev.c:1990 [inline] unregister_netdevice_many+0x92e/0x1890 net/core/dev.c:10751 default_device_exit_batch+0x449/0x590 net/core/dev.c:11245 ops_exit_list+0x125/0x170 net/core/net_namespace.c:167 cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 The buggy address belongs to the object at ffff88807d37b900 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 4 bytes inside of 64-byte region [ffff88807d37b900, ffff88807d37b940) The buggy address belongs to the physical page: page:ffffea0001f4dec0 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88807d37b180 pfn:0x7d37b flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff) raw: 00fff00000000200 ffff888010c41340 ffffea0001c795c8 ffff888010c40200 raw: ffff88807d37b180 ffff88807d37b000 000000010000001f 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x342040(__GFP_IO|__GFP_NOWARN|__GFP_COMP|__GFP_HARDWALL|__GFP_THISNODE), pid 2963, tgid 2963 (udevd), ts 139732238007, free_ts 139730893262 prep_new_page mm/page_alloc.c:2441 [inline] get_page_from_freelist+0xba2/0x3e00 mm/page_alloc.c:4182 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5408 __alloc_pages_node include/linux/gfp.h:587 [inline] kmem_getpages mm/slab.c:1378 [inline] cache_grow_begin+0x75/0x350 mm/slab.c:2584 cache_alloc_refill+0x27f/0x380 mm/slab.c:2957 ____cache_alloc mm/slab.c:3040 [inline] ____cache_alloc mm/slab.c:3023 [inline] __do_cache_alloc mm/slab.c:3267 [inline] slab_alloc mm/slab.c:3309 [inline] __do_kmalloc mm/slab.c:3708 [inline] __kmalloc+0x3b3/0x4d0 mm/slab.c:3719 kmalloc include/linux/slab.h:586 [inline] kzalloc include/linux/slab.h:714 [inline] tomoyo_encode2.part.0+0xe9/0x3a0 security/tomoyo/realpath.c:45 tomoyo_encode2 security/tomoyo/realpath.c:31 [inline] tomoyo_encode+0x28/0x50 security/tomoyo/realpath.c:80 tomoyo_realpath_from_path+0x186/0x620 security/tomoyo/realpath.c:288 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_path_perm+0x21b/0x400 security/tomoyo/file.c:822 security_inode_getattr+0xcf/0x140 security/security.c:1350 vfs_getattr fs/stat.c:157 [inline] vfs_statx+0x16a/0x390 fs/stat.c:232 vfs_fstatat+0x8c/0xb0 fs/stat.c:255 __do_sys_newfstatat+0x91/0x110 fs/stat.c:425 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1356 [inline] free_pcp_prepare+0x549/0xd20 mm/page_alloc.c:1406 free_unref_page_prepare mm/page_alloc.c:3328 [inline] free_unref_page+0x19/0x6a0 mm/page_alloc.c:3423 __vunmap+0x85d/0xd30 mm/vmalloc.c:2667 __vfree+0x3c/0xd0 mm/vmalloc.c:2715 vfree+0x5a/0x90 mm/vmalloc.c:2746 __do_replace+0x16b/0x890 net/ipv6/netfilter/ip6_tables.c:1117 do_replace net/ipv6/netfilter/ip6_tables.c:1157 [inline] do_ip6t_set_ctl+0x90d/0xb90 net/ipv6/netfilter/ip6_tables.c:1639 nf_setsockopt+0x83/0xe0 net/netfilter/nf_sockopt.c:101 ipv6_setsockopt+0x122/0x180 net/ipv6/ipv6_sockglue.c:1026 tcp_setsockopt+0x136/0x2520 net/ipv4/tcp.c:3696 __sys_setsockopt+0x2db/0x6a0 net/socket.c:2180 __do_sys_setsockopt net/socket.c:2191 [inline] __se_sys_setsockopt net/socket.c:2188 [inline] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2188 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Memory state around the buggy address: ffff88807d37b800: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc ffff88807d37b880: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc >ffff88807d37b900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ^ ffff88807d37b980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88807d37ba00: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc Fixes: c85bb41e9318 ("igmp: fix ip_mc_sf_allow race [v5]") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Flavio Leitner <fbl@sysclose.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30rxrpc: Enable IPv6 checksums on transport socketDavid Howells1-0/+3
AF_RXRPC doesn't currently enable IPv6 UDP Tx checksums on the transport socket it opens and the checksums in the packets it generates end up 0. It probably should also enable IPv6 UDP Rx checksums and IPv4 UDP checksums. The latter only seem to be applied if the socket family is AF_INET and don't seem to apply if it's AF_INET6. IPv4 packets from an IPv6 socket seem to have checksums anyway. What seems to have happened is that the inet_inv_convert_csum() call didn't get converted to the appropriate udp_port_cfg parameters - and udp_sock_create() disables checksums unless explicitly told not too. Fix this by enabling the three udp_port_cfg checksum options. Fixes: 1a9b86c9fd95 ("rxrpc: use udp tunnel APIs instead of open code in rxrpc_open_socket") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: Vadim Fedorenko <vfedorenko@novek.ru> cc: David S. Miller <davem@davemloft.net> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30net: cpsw: add missing of_node_put() in cpsw_probe_dt()Yang Yingliang1-1/+4
'tmp_node' need be put before returning from cpsw_probe_dt(), so add missing of_node_put() in error path. Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29net: stmmac: dwmac-sun8i: add missing of_node_put() in sun8i_dwmac_register_mdio_mux()Yang Yingliang1-0/+1
The node pointer returned by of_get_child_by_name() with refcount incremented, so add of_node_put() after using it. Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220428095716.540452-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29net: dsa: mt7530: add missing of_node_put() in mt7530_setup()Yang Yingliang1-0/+1
Add of_node_put() if of_get_phy_mode() fails in mt7530_setup() Fixes: 0c65b2b90d13 ("net: of_get_phy_mode: Change API to solve int/unit warnings") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220428095317.538829-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29net: dsa: ksz9477: port mirror sniffing limited to one portArun Ramadoss1-4/+34
This patch limits the sniffing to only one port during the mirror add. And during the mirror_del it checks for all the ports using the sniff, if and only if no other ports are referring, sniffing is disabled. The code is updated based on the review comments of LAN937x port mirror patch. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210422094257.1641396-8-prasanna.vengateshan@microchip.com/ Fixes: b987e98e50ab ("dsa: add DSA switch driver for Microchip KSZ9477") Signed-off-by: Prasanna Vengateshan <prasanna.vengateshan@microchip.com> Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Link: https://lore.kernel.org/r/20220428070709.7094-1-arun.ramadoss@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29hinic: fix bug of wq out of bound accessQiao Ma1-2/+5
If wq has only one page, we need to check wqe rolling over page by compare end_idx and curr_idx, and then copy wqe to shadow wqe to avoid out of bound access. This work has been done in hinic_get_wqe, but missed for hinic_read_wqe. This patch fixes it, and removes unnecessary MASKED_WQE_IDX(). Fixes: 7dd29ee12865 ("hinic: add sriov feature support") Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com> Reviewed-by: Xunlei Pang <xlpang@linux.alibaba.com> Link: https://lore.kernel.org/r/282817b0e1ae2e28fdf3ed8271a04e77f57bf42e.1651148587.git.mqaio@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29net: mdio: Fix ENOMEM return value in BCM6368 mux bus controllerNiels Dossche1-1/+1
Error values inside the probe function must be < 0. The ENOMEM return value has the wrong sign: it is positive instead of negative. Add a minus sign. Fixes: e239756717b5 ("net: mdio: Add BCM6368 MDIO mux bus controller") Signed-off-by: Niels Dossche <dossche.niels@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220428211931.8130-1-dossche.niels@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29net: ethernet: mediatek: add missing of_node_put() in mtk_sgmii_init()Yang Yingliang1-0/+1
The node pointer returned by of_parse_phandle() with refcount incremented, so add of_node_put() after using it in mtk_sgmii_init(). Fixes: 9ffee4a8276c ("net: ethernet: mediatek: Extend SGMII related functions") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220428062543.64883-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29selftests/net/forwarding: add missing tests to MakefileHangbin Liu1-0/+33
When generating the selftests to another folder, the fixed tests are missing as they are not in Makefile, e.g. make -C tools/testing/selftests/ install \ TARGETS="net/forwarding" INSTALL_PATH=/tmp/kselftests Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29selftests/net: add missing tests to MakefileHangbin Liu1-1/+2
When generating the selftests to another folder, the fixed tests are missing as they are not in Makefile, e.g. make -C tools/testing/selftests/ install \ TARGETS="net" INSTALL_PATH=/tmp/kselftests Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29can: grcan: only use the NAPI poll budget for RXAndreas Larsson1-15/+7
The previous split budget between TX and RX made it return not using the entire budget but at the same time not having calling called napi_complete. This sometimes led to the poll to not be called, and at the same time having TX and RX interrupts disabled resulting in the driver getting stuck. Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores") Link: https://lore.kernel.org/all/20220429084656.29788-4-andreas@gaisler.com Cc: stable@vger.kernel.org Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-04-29can: grcan: grcan_probe(): fix broken system id check for errata workaround needsAndreas Larsson1-5/+11
The systemid property was checked for in the wrong place of the device tree and compared to the wrong value. Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores") Link: https://lore.kernel.org/all/20220429084656.29788-3-andreas@gaisler.com Cc: stable@vger.kernel.org Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-04-29can: grcan: use ofdev->dev when allocating DMA memoryDaniel Hellstrom1-2/+4
Use the device of the device tree node should be rather than the device of the struct net_device when allocating DMA buffers. The driver got away with it on sparc32 until commit 53b7670e5735 ("sparc: factor the dma coherent mapping into helper") after which the driver oopses. Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores") Link: https://lore.kernel.org/all/20220429084656.29788-2-andreas@gaisler.com Cc: stable@vger.kernel.org Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-04-29can: grcan: grcan_close(): fix deadlockDuoming Zhou1-0/+2
There are deadlocks caused by del_timer_sync(&priv->hang_timer) and del_timer_sync(&priv->rr_timer) in grcan_close(), one of the deadlocks are shown below: (Thread 1) | (Thread 2) | grcan_reset_timer() grcan_close() | mod_timer() spin_lock_irqsave() //(1) | (wait a time) ... | grcan_initiate_running_reset() del_timer_sync() | spin_lock_irqsave() //(2) (wait timer to stop) | ... We hold priv->lock in position (1) of thread 1 and use del_timer_sync() to wait timer to stop, but timer handler also need priv->lock in position (2) of thread 2. As a result, grcan_close() will block forever. This patch extracts del_timer_sync() from the protection of spin_lock_irqsave(), which could let timer handler to obtain the needed lock. Link: https://lore.kernel.org/all/20220425042400.66517-1-duoming@zju.edu.cn Fixes: 6cec9b07fe6a ("can: grcan: Add device driver for GRCAN and GRHCAN cores") Cc: stable@vger.kernel.org Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-04-29can: isotp: remove re-binding of bound socketOliver Hartkopp1-20/+5
As a carry over from the CAN_RAW socket (which allows to change the CAN interface while mantaining the filter setup) the re-binding of the CAN_ISOTP socket needs to take care about CAN ID address information and subscriptions. It turned out that this feature is so limited (e.g. the sockopts remain fix) that it finally has never been needed/used. In opposite to the stateless CAN_RAW socket the switching of the CAN ID subscriptions might additionally lead to an interrupted ongoing PDU reception. So better remove this unneeded complexity. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://lore.kernel.org/all/20220422082337.1676-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-04-28tcp: fix F-RTO may not work correctly when receiving DSACKPengcheng Yang1-1/+2
Currently DSACK is regarded as a dupack, which may cause F-RTO to incorrectly enter "loss was real" when receiving DSACK. Packetdrill to demonstrate: // Enable F-RTO and TLP 0 `sysctl -q net.ipv4.tcp_frto=2` 0 `sysctl -q net.ipv4.tcp_early_retrans=3` 0 `sysctl -q net.ipv4.tcp_congestion_control=cubic` // Establish a connection +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // RTT 10ms, RTO 210ms +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.01 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 2 data segments +0 write(4, ..., 2000) = 2000 +0 > P. 1:2001(2000) ack 1 // TLP +.022 > P. 1001:2001(1000) ack 1 // Continue to send 8 data segments +0 write(4, ..., 10000) = 10000 +0 > P. 2001:10001(8000) ack 1 // RTO +.188 > . 1:1001(1000) ack 1 // The original data is acked and new data is sent(F-RTO step 2.b) +0 < . 1:1(0) ack 2001 win 257 +0 > P. 10001:12001(2000) ack 1 // D-SACK caused by TLP is regarded as a dupack, this results in // the incorrect judgment of "loss was real"(F-RTO step 3.a) +.022 < . 1:1(0) ack 2001 win 257 <sack 1001:2001,nop,nop> // Never-retransmitted data(3001:4001) are acked and // expect to switch to open state(F-RTO step 3.b) +0 < . 1:1(0) ack 4001 win 257 +0 %{ assert tcpi_ca_state == 0, tcpi_ca_state }% Fixes: e33099f96d99 ("tcp: implement RFC5682 F-RTO") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1650967419-2150-1-git-send-email-yangpc@wangsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"Dany Madden2-100/+35
This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae When client requests channel or ring size larger than what the server can support the server will cap the request to the supported max. So, the client would not be able to successfully request resources that exceed the server limit. Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits") Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASKVladimir Oltean1-4/+0
The Time-Specified Departure feature is indeed mutually exclusive with TX IP checksumming in ENETC, but TX checksumming in itself is broken and was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx checksumming offload code"). The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply with software TSO's expectations, and still did the checksumming in software by calling skb_checksum_help(). So there isn't any restriction for the Time-Specified Departure feature. However, enetc_setup_tc_txtime() doesn't understand that, and blindly looks for NETIF_F_CSUM_MASK. Instead of checking for things which can literally never happen in the current code base, just remove the check and let the driver offload tc-etf qdiscs. Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28ixgbe: ensure IPsec VF<->PF compatibilityLeon Romanovsky1-1/+2
The VF driver can forward any IPsec flags and such makes the function is not extendable and prone to backward/forward incompatibility. If new software runs on VF, it won't know that PF configured something completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag. Fixes: eda0333ac293 ("ixgbe: add VF IPsec management") Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shannon Nelson <snelson@pensando.io> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28MAINTAINERS: Update BNXT entry with firmware filesFlorian Fainelli1-0/+2
There appears to be a maintainer gap for BNXT TEE firmware files which causes some patches to be missed. Update the entry for the BNXT Ethernet controller with its companion firmware files. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220427163606.126154-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28netfilter: nft_socket: only do sk lookups when indev is availableFlorian Westphal1-14/+38
Check if the incoming interface is available and NFT_BREAK in case neither skb->sk nor input device are set. Because nf_sk_lookup_slow*() assume packet headers are in the 'in' direction, use in postrouting is not going to yield a meaningful result. Same is true for the forward chain, so restrict the use to prerouting, input and output. Use in output work if a socket is already attached to the skb. Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-04-28gfs2: No short reads or writes upon glock contentionAndreas Gruenbacher1-4/+0
Commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") changed gfs2_file_read_iter() and gfs2_file_buffered_write() to allow dropping the inode glock while faulting in user buffers. When the lock was dropped, a short result was returned to indicate that the operation was interrupted. As pointed out by Linus (see the link below), this behavior is broken and the operations should always re-acquire the inode glock and resume the operation instead. Link: https://lore.kernel.org/lkml/CAHk-=whaz-g_nOOoo8RRiWNjnv2R+h6_xk2F1J4TuSRxk1MtLw@mail.gmail.com/ Fixes: 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2022-04-27net: fec: add missing of_node_put() in fec_enet_init_stop_mode()Yang Yingliang1-1/+1
Put device node in error path in fec_enet_init_stop_mode(). Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27bnx2x: fix napi API usage sequenceManish Chopra1-4/+5
While handling PCI errors (AER flow) driver tries to disable NAPI [napi_disable()] after NAPI is deleted [__netif_napi_del()] which causes unexpected system hang/crash. System message log shows the following: ======================================= [ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures. [ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)' [ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->error_detected(IO frozen) [ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports: 'need reset' [ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking bnx2x->error_detected(IO frozen) [ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports: 'need reset' [ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset' [ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow: [ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows: [ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow: [ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows: [ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset' [ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking bnx2x->slot_reset() [ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing... [ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset --> driver unload Uninterruptible tasks ===================== crash> ps | grep UN 213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd] 215 2 0 c000000004c80000 UN 0.0 0 0 [kworker/0:2] 2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd 4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty 4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty 32423 2 26 c00000020038c580 UN 0.0 0 0 [kworker/26:3] 32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd 32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail 33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail 33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup 33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd 33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail 33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd EEH handler hung while bnx2x sleeping and holding RTNL lock =========================================================== crash> bt 213 PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd" #0 [c000000004d477e0] __schedule at c000000000c70808 #1 [c000000004d478b0] schedule at c000000000c70ee0 #2 [c000000004d478e0] schedule_timeout at c000000000c76dec #3 [c000000004d479c0] msleep at c0000000002120cc #4 [c000000004d479f0] napi_disable at c000000000a06448 ^^^^^^^^^^^^^^^^ #5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x] #6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x] #7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc #8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8 #9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64 And the sleeping source code ============================ crash> dis -ls c000000000a06448 FILE: ../net/core/dev.c LINE: 6702 6697 { 6698 might_sleep(); 6699 set_bit(NAPI_STATE_DISABLE, &n->state); 6700 6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state)) * 6702 msleep(1); 6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state)) 6704 msleep(1); 6705 6706 hrtimer_cancel(&n->timer); 6707 6708 clear_bit(NAPI_STATE_DISABLE, &n->state); 6709 } EEH calls into bnx2x twice based on the system log above, first through bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes the following call chains: bnx2x_io_error_detected() +-> bnx2x_eeh_nic_unload() +-> bnx2x_del_all_napi() +-> __netif_napi_del() bnx2x_io_slot_reset() +-> bnx2x_netif_stop() +-> bnx2x_napi_disable() +->napi_disable() Fix this by correcting the sequence of NAPI APIs usage, that is delete the NAPI after disabling it. Fixes: 7fa6f34081f1 ("bnx2x: AER revised") Reported-by: David Christensen <drc@linux.vnet.ibm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27tls: Skip tls_append_frag on zero copy sizeMaxim Mikityanskiy1-5/+7
Calling tls_append_frag when max_open_record_len == record->len might add an empty fragment to the TLS record if the call happens to be on the page boundary. Normally tls_append_frag coalesces the zero-sized fragment to the previous one, but not if it's on page boundary. If a resync happens then, the mlx5 driver posts dump WQEs in tx_post_resync_dump, and the empty fragment may become a data segment with byte_count == 0, which will confuse the NIC and lead to a CQE error. This commit fixes the described issue by skipping tls_append_frag on zero size to avoid adding empty fragments. The fix is not in the driver, because an empty fragment is hardly the desired behavior. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220426154949.159055-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27docs: vm/page_owner: use literal blocks for param descriptionAkira Yokosawa1-2/+3
Sphinx generates hard-to-read lists of parameters at the bottom of the page. Fix them by putting literal-block markers of "::" in front of them. Link: https://lkml.kernel.org/r/cfd3bcc0-b51d-0c68-c065-ca1c4c202447@gmail.com Signed-off-by: Akira Yokosawa <akiyks@gmail.com> Fixes: 57f2b54a9379 ("Documentation/vm/page_owner.rst: update the documentation") Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Haowen Bai <baihaowen@meizu.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Alex Shi <seakeel@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same timeZqiang1-0/+7
kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/ destroy(). The kasan_quarantine_remove_cache() call is protected by cpuslock in kmem_cache_destroy() to ensure serialization with kasan_cpu_offline(). However the kasan_quarantine_remove_cache() call is not protected by cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache shrink occurs at same time, the cpu_quarantine may be corrupted by interrupt (per_cpu_remove_cache operation). So add a cpu_quarantine offline flags check in per_cpu_remove_cache(). [akpm@linux-foundation.org: add comment, per Zqiang] Link: https://lkml.kernel.org/r/20220414025925.2423818-1-qiang1.zhang@intel.com Signed-off-by: Zqiang <qiang1.zhang@intel.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27intel_idle: Fix SPR C6 optimizationArtem Bityutskiy1-5/+3
The Sapphire Rapids (SPR) C6 optimization was added to the end of the 'spr_idle_state_table_update()' function. However, the function has a 'return' which may happen before the optimization has a chance to run. And this may prevent the optimization from happening. This is an unlikely scenario, but possible if user boots with, say, the 'intel_idle.preferred_cstates=6' kernel boot option. This patch fixes the issue by eliminating the problematic 'return' statement. Fixes: 3a9cf77b60dc ("intel_idle: add core C6 optimization for SPR") Suggested-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27intel_idle: Fix the 'preferred_cstates' module parameterArtem Bityutskiy1-7/+12
Problem description. When user boots kernel up with the 'intel_idle.preferred_cstates=4' option, we enable C1E and disable C1 states on Sapphire Rapids Xeon (SPR). In order for C1E to work on SPR, we have to enable the C1E promotion bit on all CPUs. However, we enable it only on one CPU. Fix description. The 'intel_idle' driver already has the infrastructure for disabling C1E promotion on every CPU. This patch uses the same infrastructure for enabling C1E promotion on every CPU. It changes the boolean 'disable_promotion_to_c1e' variable to a tri-state 'c1e_promotion' variable. Tested on a 2-socket SPR system. I verified the following combinations: * C1E promotion enabled and disabled in BIOS. * Booted with and without the 'intel_idle.preferred_cstates=4' kernel argument. In all 4 cases C1E promotion was correctly set on all CPUs. Also tested on an old Broadwell system, just to make sure it does not cause a regression. C1E promotion was correctly disabled on that system, both C1 and C1E were exposed (as expected). Fixes: da0e58c038e6 ("intel_idle: add 'preferred_cstates' module argument") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> [ rjw: Minor changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-27hex2bin: fix access beyond string endMikulas Patocka1-3/+6
If we pass too short string to "hex2bin" (and the string size without the terminating NUL character is even), "hex2bin" reads one byte after the terminating NUL character. This patch fixes it. Note that hex_to_bin returns -1 on error and hex2bin return -EINVAL on error - so we can't just return the variable "hi" or "lo" on error. This inconsistency may be fixed in the next merge window, but for the purpose of fixing this bug, we just preserve the existing behavior and return -1 and -EINVAL. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Fixes: b78049831ffe ("lib: add error checking to hex2bin") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27hex2bin: make the function hex_to_bin constant-timeMikulas Patocka2-8/+26
The function hex2bin is used to load cryptographic keys into device mapper targets dm-crypt and dm-integrity. It should take constant time independent on the processed data, so that concurrently running unprivileged code can't infer any information about the keys via microarchitectural convert channels. This patch changes the function hex_to_bin so that it contains no branches and no memory accesses. Note that this shouldn't cause performance degradation because the size of the new function is the same as the size of the old function (on x86-64) - and the new function causes no branch misprediction penalties. I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64 i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32 sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64 powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no branches in the generated code. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-27Add Eric Dumazet to networking maintainersJakub Kicinski1-0/+2
Welcome Eric! Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Link: https://lore.kernel.org/r/20220426175723.417614-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27floppy: disable FDRAWCMD by defaultWilly Tarreau2-11/+48
Minh Yuan reported a concurrency use-after-free issue in the floppy code between raw_cmd_ioctl and seek_interrupt. [ It turns out this has been around, and that others have reported the KASAN splats over the years, but Minh Yuan had a reproducer for it and so gets primary credit for reporting it for this fix - Linus ] The problem is, this driver tends to break very easily and nowadays, nobody is expected to use FDRAWCMD anyway since it was used to manipulate non-standard formats. The risk of breaking the driver is higher than the risk presented by this race, and accessing the device requires privileges anyway. Let's just add a config option to completely disable this ioctl and leave it disabled by default. Distros shouldn't use it, and only those running on antique hardware might need to enable it. Link: https://lore.kernel.org/all/000000000000b71cdd05d703f6bf@google.com/ Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com Reported-by: Minh Yuan <yuanmingbuaa@gmail.com> Reported-by: syzbot+8e8958586909d62b6840@syzkaller.appspotmail.com Reported-by: cruise k <cruise4k@gmail.com> Reported-by: Kyungtae Kim <kt0755@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Tested-by: Denis Efremov <efremov@linux.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>