wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2022-06-07	af_unix: Fix a data-race in unix_dgram_peer_wake_me().	Kuniyuki Iwashima	1	-1/+1
	unix_dgram_poll() calls unix_dgram_peer_wake_me() without `other`'s lock held and check if its receive queue is full. Here we need to use unix_recvq_full_lockless() instead of unix_recvq_full(), otherwise KCSAN will report a data-race. Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20220605232325.11804-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-07	stmmac: intel: Fix an error handling path in intel_eth_pci_probe()	Christophe JAILLET	1	-3/+1
	When the managed API is used, there is no need to explicitly call pci_free_irq_vectors(). This looks to be a left-over from the commit in the Fixes tag. Only the .remove() function had been updated. So remove this unused function call and update goto label accordingly. Fixes: 8accc467758e ("stmmac: intel: use managed PCI function on probe and resume") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Link: https://lore.kernel.org/r/1ac9b6787b0db83b0095711882c55c77c8ea8da0.1654462241.git.christophe.jaillet@wanadoo.fr Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-06	net: ethernet: bgmac: Fix refcount leak in bcma_mdio_mii_register	Miaoqian Lin	1	-0/+1
	of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. Add missing of_node_put() to avoid refcount leak. Fixes: 55954f3bfdac ("net: ethernet: bgmac: move BCMA MDIO Phy code into a separate file") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220603133238.44114-1-linmq006@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-06	amt: fix wrong type string definition	Taehee Yoo	1	-0/+1
	amt message type definition starts from 1, not 0. But type_str[] starts from 0. So, it prints wrong type information. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-06	amt: fix possible null-ptr-deref in amt_rcv()	Taehee Yoo	1	-1/+2
	When amt interface receives amt message, it tries to obtain amt private data from sock. If there is no amt private data, it frees an skb immediately. After kfree_skb(), it increases the rx_dropped stats. But in order to use rx_dropped, amt private data is needed. So, it makes amt_rcv() to do not increase rx_dropped stats when it can not obtain amt private data. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 1a1a0e80e005 ("amt: fix possible memory leak in amt_rcv()") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-06	amt: fix wrong usage of pskb_may_pull()	Taehee Yoo	1	-18/+37
	It adds missing pskb_may_pull() in amt_update_handler() and amt_multicast_data_handler(). And it fixes wrong parameter of pskb_may_pull() in amt_advertisement_handler() and amt_membership_query_handler(). Reported-by: Jakub Kicinski <kuba@kernel.org> Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-06	netfilter: nf_tables: bail out early if hardware offload is not supported	Pablo Neira Ayuso	5	-3/+31
	If user requests for NFT_CHAIN_HW_OFFLOAD, then check if either device provides the .ndo_setup_tc interface or there is an indirect flow block that has been registered. Otherwise, bail out early from the preparation phase. Moreover, validate that family == NFPROTO_NETDEV and hook is NF_NETDEV_INGRESS. Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-06	net: phy: dp83867: retrigger SGMII AN when link change	Tan Tee Min	1	-0/+29
	There is a limitation in TI DP83867 PHY device where SGMII AN is only triggered once after the device is booted up. Even after the PHY TPI is down and up again, SGMII AN is not triggered and hence no new in-band message from PHY to MAC side SGMII. This could cause an issue during power up, when PHY is up prior to MAC. At this condition, once MAC side SGMII is up, MAC side SGMII wouldn`t receive new in-band message from TI PHY with correct link status, speed and duplex info. As suggested by TI, implemented a SW solution here to retrigger SGMII Auto-Neg whenever there is a link change. v2: Add Fixes tag in commit message. Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy") Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Sit, Michael Wei Hong <michael.wei.hong.sit@intel.com> Reviewed-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Tan Tee Min <tee.min.tan@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220526090347.128742-1-tee.min.tan@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-06	netfilter: nf_tables: memleak flow rule from commit path	Pablo Neira Ayuso	1	-0/+6
	Abort path release flow rule object, however, commit path does not. Update code to destroy these objects before releasing the transaction. Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-06	netfilter: nf_tables: release new hooks on unsupported flowtable flags	Pablo Neira Ayuso	1	-4/+8
	Release the list of new hooks that are pending to be registered in case that unsupported flowtable flags are provided. Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-02	netfilter: nf_tables: always initialize flowtable hook list in transaction	Pablo Neira Ayuso	1	-0/+1
	The hook list is used if nft_trans_flowtable_update(trans) == true. However, initialize this list for other cases for safety reasons. Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-02	module: Fix prefix for module.sig_enforce module param	Saravana Kannan	1	-0/+3
	Commit cfc1d277891e ("module: Move all into module/") changed the prefix of the module param by moving/renaming files. A later commit also moves the module_param() into a different file, thereby changing the prefix yet again. This would break kernel cmdline compatibility and also userspace compatibility at /sys/module/module/parameters/sig_enforce. So, set the prefix back to "module.". Fixes: cfc1d277891e ("module: Move all into module/") Link: https://lore.kernel.org/lkml/20220602034111.4163292-1-saravanak@google.com/ Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Aaron Tomlin <atomlin@redhat.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Saravana Kannan <saravanak@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-02	net/af_packet: make sure to pull mac header	Eric Dumazet	1	-2/+4
	GSO assumes skb->head contains link layer headers. tun device in some case can provide base 14 bytes, regardless of VLAN being used or not. After blamed commit, we can end up setting a network header offset of 18+, we better pull the missing bytes to avoid a posible crash in GSO. syzbot report was: kernel BUG at include/linux/skbuff.h:2699! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 3601 Comm: syz-executor210 Not tainted 5.18.0-syzkaller-11338-g2c5ca23f7414 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__skb_pull include/linux/skbuff.h:2699 [inline] RIP: 0010:skb_mac_gso_segment+0x48f/0x530 net/core/gro.c:136 Code: 00 48 c7 c7 00 96 d4 8a c6 05 cb d3 45 06 01 e8 26 bb d0 01 e9 2f fd ff ff 49 c7 c4 ea ff ff ff e9 f1 fe ff ff e8 91 84 19 fa <0f> 0b 48 89 df e8 97 44 66 fa e9 7f fd ff ff e8 ad 44 66 fa e9 48 RSP: 0018:ffffc90002e2f4b8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000000 RDX: ffff88805bb58000 RSI: ffffffff8760ed0f RDI: 0000000000000004 RBP: 0000000000005dbc R08: 0000000000000004 R09: 0000000000000fe0 R10: 0000000000000fe4 R11: 0000000000000000 R12: 0000000000000fe0 R13: ffff88807194d780 R14: 1ffff920005c5e9b R15: 0000000000000012 FS: 000055555730f300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200015c0 CR3: 0000000071ff8000 CR4: 0000000000350ee0 Call Trace: <TASK> __skb_gso_segment+0x327/0x6e0 net/core/dev.c:3411 skb_gso_segment include/linux/netdevice.h:4749 [inline] validate_xmit_skb+0x6bc/0xf10 net/core/dev.c:3669 validate_xmit_skb_list+0xbc/0x120 net/core/dev.c:3719 sch_direct_xmit+0x3d1/0xbe0 net/sched/sch_generic.c:327 __dev_xmit_skb net/core/dev.c:3815 [inline] __dev_queue_xmit+0x14a1/0x3a00 net/core/dev.c:4219 packet_snd net/packet/af_packet.c:3071 [inline] packet_sendmsg+0x21cb/0x5550 net/packet/af_packet.c:3102 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492 ___sys_sendmsg+0xf3/0x170 net/socket.c:2546 __sys_sendmsg net/socket.c:2575 [inline] __do_sys_sendmsg net/socket.c:2584 [inline] __se_sys_sendmsg net/socket.c:2582 [inline] __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f4b95da06c9 Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffd7defc4c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffd7defc4f0 RCX: 00007f4b95da06c9 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 0000000000000003 R08: bb1414ac00000050 R09: bb1414ac00000050 R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd7defc4e0 R14: 00007ffd7defc4d8 R15: 00007ffd7defc4d4 </TASK> Fixes: dfed913e8b55 ("net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	net: add debug info to __skb_pull()	Eric Dumazet	1	-1/+8
	While analyzing yet another syzbot report, I found the following patch very useful. It allows to better understand what went wrong. This debug info is only enabled if CONFIG_DEBUG_NET=y, which is the case for syzbot builds. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	net: CONFIG_DEBUG_NET depends on CONFIG_NET	Eric Dumazet	1	-1/+1
	It makes little sense to debug networking stacks if networking is not compiled in. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	stmmac: intel: Add RPL-P PCI ID	Michael Sit Wei Hong	1	-0/+2
	Add PCI ID for Ethernet TSN Controller on RPL-P. Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com> Link: https://lore.kernel.org/r/20220602073507.3955721-1-michael.wei.hong.sit@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	net: stmmac: use dev_err_probe() for reporting mdio bus registration failure	Rasmus Villemoes	2	-4/+4
	I have a board where these two lines are always printed during boot: imx-dwmac 30bf0000.ethernet: Cannot register the MDIO bus imx-dwmac 30bf0000.ethernet: stmmac_dvr_probe: MDIO bus (id: 1) registration failed It's perfectly fine, and the device is successfully (and silently, as far as the console goes) probed later. Use dev_err_probe() instead, which will demote these messages to debug level (thus removing the alarming messages from the console) when the error is -EPROBE_DEFER, and also has the advantage of including the error code if/when it happens to be something other than -EPROBE_DEFER. While here, add the missing \n to one of the format strings. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20220602074840.1143360-1-linux@rasmusvillemoes.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	tipc: check attribute length for bearer name	Hoang Le	1	-2/+1
	syzbot reported uninit-value: ===================================================== BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:644 [inline] BUG: KMSAN: uninit-value in string+0x4f9/0x6f0 lib/vsprintf.c:725 string_nocheck lib/vsprintf.c:644 [inline] string+0x4f9/0x6f0 lib/vsprintf.c:725 vsnprintf+0x2222/0x3650 lib/vsprintf.c:2806 vprintk_store+0x537/0x2150 kernel/printk/printk.c:2158 vprintk_emit+0x28b/0xab0 kernel/printk/printk.c:2256 vprintk_default+0x86/0xa0 kernel/printk/printk.c:2283 vprintk+0x15f/0x180 kernel/printk/printk_safe.c:50 _printk+0x18d/0x1cf kernel/printk/printk.c:2293 tipc_enable_bearer net/tipc/bearer.c:371 [inline] __tipc_nl_bearer_enable+0x2022/0x22a0 net/tipc/bearer.c:1033 tipc_nl_bearer_enable+0x6c/0xb0 net/tipc/bearer.c:1042 genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline] - Do sanity check the attribute length for TIPC_NLA_BEARER_NAME. - Do not use 'illegal name' in printing message. Reported-by: syzbot+e820fdc8ce362f2dea51@syzkaller.appspotmail.com Fixes: cb30a63384bc ("tipc: refactor function tipc_enable_bearer()") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Link: https://lore.kernel.org/r/20220602063053.5892-1-hoang.h.le@dektech.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-02	i2c: ismt: prevent memory corruption in ismt_access()	Dan Carpenter	1	-0/+3
	The "data->block[0]" variable comes from the user and is a number between 0-255. It needs to be capped to prevent writing beyond the end of dma_buffer[]. Fixes: 5e9a97b1f449 ("i2c: ismt: Adding support for I2C_SMBUS_BLOCK_PROC_CALL") Reported-and-tested-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-02	ice: fix access-beyond-end in the switch code	Alexander Lobakin	4	-139/+115
	Global `-Warray-bounds` enablement revealed some problems, one of which is the way we define and use AQC rules messages. In fact, they have a shared header, followed by the actual message, which can be of one of several different formats. So it is straightforward enough to define that header as a separate struct and then embed it into message structures as needed, but currently all the formats reside in one union coupled with the header. Then, the code allocates only the memory needed for a particular message format, leaving the union potentially incomplete. There are no actual reads or writes beyond the end of an allocated chunk, but at the same time, the whole implementation is fragile and backed by an equilibrium rather than strong type and memory checks. Define the structures the other way around: one for the common header and the rest for the actual formats with the header embedded. There are no places where several union members would be used at the same time anyway. This allows to use proper struct_size() and let the compiler know what is going to be done. Finally, unsilence `-Warray-bounds` back for ice_switch.c. Other little things worth mentioning: * &ice_sw_rule_vsi_list_query is not used anywhere, remove it. It's weird anyway to talk to hardware with purely kernel types (bitmaps); * expand the ICE_SW_RULE__SIZE() macros to pass a structure variable name to struct_size() to let it do strict typechecking; rename ice_sw_rule_lkup_rx_tx::hdr to ::hdr_data to keep ::hdr for the header structure to have the same name for it constistenly everywhere; * drop the duplicate of %ICE_SW_RULE_RX_TX_NO_HDR_SIZE residing in ice_switch.h. Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Fixes: 66486d8943ba ("ice: replace single-element array used for C struct hack") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220601105924.2841410-1-alexandr.lobakin@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-02	nfp: remove padding in nfp_nfdk_tx_desc	Fei Qin	3	-9/+17
	NFDK firmware supports 48-bit dma addressing and parses 16 high bits of dma addresses. In nfp_nfdk_tx_desc, dma related structure and tso related structure are union. When "mss" be filled with nonzero value due to enable tso, the memory used by "padding" may be also filled. Then, firmware may parse wrong dma addresses which causes TX watchdog timeout problem. This patch removes padding and unifies the dma_addr_hi bits with the one in firmware. nfp_nfdk_tx_desc_set_dma_addr is also added to match this change. Fixes: c10d12e3dce8 ("nfp: add support for NFDK data path") Signed-off-by: Fei Qin <fei.qin@corigine.com> Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220601083449.50556-1-simon.horman@corigine.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-02	ax25: Fix ax25 session cleanup problems	Duoming Zhou	4	-11/+20
	There are session cleanup problems in ax25_release() and ax25_disconnect(). If we setup a session and then disconnect, the disconnected session is still in "LISTENING" state that is shown below. Active AX.25 sockets Dest Source Device State Vr/Vs Send-Q Recv-Q DL9SAU-4 DL9SAU-3 ??? LISTENING 000/000 0 0 DL9SAU-3 DL9SAU-4 ??? LISTENING 000/000 0 0 The first reason is caused by del_timer_sync() in ax25_release(). The timers of ax25 are used for correct session cleanup. If we use ax25_release() to close ax25 sessions and ax25_dev is not null, the del_timer_sync() functions in ax25_release() will execute. As a result, the sessions could not be cleaned up correctly, because the timers have stopped. In order to solve this problem, this patch adds a device_up flag in ax25_dev in order to judge whether the device is up. If there are sessions to be cleaned up, the del_timer_sync() in ax25_release() will not execute. What's more, we add ax25_cb_del() in ax25_kill_by_device(), because the timers have been stopped and there are no functions that could delete ax25_cb if we do not call ax25_release(). Finally, we reorder the position of ax25_list_lock in ax25_cb_del() in order to synchronize among different functions that call ax25_cb_del(). The second reason is caused by improper check in ax25_disconnect(). The incoming ax25 sessions which ax25->sk is null will close heartbeat timer, because the check "if(!ax25->sk \|\| ..)" is satisfied. As a result, the session could not be cleaned up properly. In order to solve this problem, this patch changes the improper check to "if(ax25->sk && ..)" in ax25_disconnect(). What`s more, the ax25_disconnect() may be called twice, which is not necessary. For example, ax25_kill_by_device() calls ax25_disconnect() and sets ax25->state to AX25_STATE_0, but ax25_release() calls ax25_disconnect() again. In order to solve this problem, this patch add a check in ax25_release(). If the flag of ax25->sk equals to SOCK_DEAD, the ax25_disconnect() in ax25_release() should not be executed. Fixes: 82e31755e55f ("ax25: Fix UAF bugs in ax25 timers") Fixes: 8a367e74c012 ("ax25: Fix segfault after sock connection timeout") Reported-and-tested-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://lore.kernel.org/r/20220530152158.108619-1-duoming@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-02	netfilter: nf_tables: delete flowtable hooks via transaction list	Pablo Neira Ayuso	2	-26/+6
	Remove inactive bool field in nft_hook object that was introduced in abadb2f865d7 ("netfilter: nf_tables: delete devices from flowtable"). Move stale flowtable hooks to transaction list instead. Deleting twice the same device does not result in ENOENT. Fixes: abadb2f865d7 ("netfilter: nf_tables: delete devices from flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-01	assoc_array: Fix BUG_ON during garbage collect	Stephen Brennan	1	-0/+8
	A rare BUG_ON triggered in assoc_array_gc: [3430308.818153] kernel BUG at lib/assoc_array.c:1609! Which corresponded to the statement currently at line 1593 upstream: BUG_ON(assoc_array_ptr_is_meta(p)); Using the data from the core dump, I was able to generate a userspace reproducer[1] and determine the cause of the bug. [1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc After running the iterator on the entire branch, an internal tree node looked like the following: NODE (nr_leaves_on_branch: 3) SLOT [0] NODE (2 leaves) SLOT [1] NODE (1 leaf) SLOT [2..f] NODE (empty) In the userspace reproducer, the pr_devel output when compressing this node was: -- compress node 0x5607cc089380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] after: 3 At slot 0, an internal node with 2 leaves could not be folded into the node, because there was only one available slot (slot 0). Thus, the internal node was retained. At slot 1, the node had one leaf, and was able to be folded in successfully. The remaining nodes had no leaves, and so were removed. By the end of the compression stage, there were 14 free slots, and only 3 leaf nodes. The tree was ascended and then its parent node was compressed. When this node was seen, it could not be folded, due to the internal node it contained. The invariant for compression in this function is: whenever nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all leaf nodes. The compression step currently cannot guarantee this, given the corner case shown above. To fix this issue, retry compression whenever we have retained a node, and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second compression will then allow the node in slot 1 to be folded in, satisfying the invariant. Below is the output of the reproducer once the fix is applied: -- compress node 0x560e9c562380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] internal nodes remain despite enough space, retrying -- compress node 0x560e9c562380 -- free=14, leaves=1 [0] fold node 2/15 [nx 0] after: 3 Changes ======= DH: - Use false instead of 0. - Reorder the inserted lines in a couple of places to put retained before next_slot. ver #2) - Fix typo in pr_devel, correct comparison to "<=" Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Cc: <stable@vger.kernel.org> Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Andrew Morton <akpm@linux-foundation.org> cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/ # v1 Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/ # v2 Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-01	net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline	Slark Xiao	1	-0/+1
	Adding support for Cinterion device MV31 with Qualcomm new baseline. Use different PIDs to separate it from previous base line products. All interfaces settings keep same as previous. T: Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=1e2d ProdID=00b9 Rev=04.14 S: Manufacturer=Cinterion S: Product=Cinterion PID 0x00B9 USB Mobile Broadband S: SerialNumber=90418e79 C: #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan I: If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option I: If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option I: If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option Signed-off-by: Slark Xiao <slark_xiao@163.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20220601040531.6016-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01	sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels	Íñigo Huguet	1	-4/+2
	tx_channel_offset is calculated in efx_allocate_msix_channels, but it is also calculated again in efx_set_channels because it was originally done there, and when efx_allocate_msix_channels was introduced it was forgotten to be removed from efx_set_channels. Moreover, the old calculation is wrong when using efx_separate_tx_channels because now we can have XDP channels after the TX channels, so n_channels - n_tx_channels doesn't point to the first TX channel. Remove the old calculation from efx_set_channels, and add the initialization of this variable if MSI or legacy interrupts are used, next to the initialization of the rest of the related variables, where it was missing. This has been already done for sfc, do it also for sfc_siena. Fixes: 3990a8fffbda ("sfc: allocate channels for XDP tx queues") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01	sfc/siena: fix considering that all channels have TX queues	Martin Habets	1	-1/+1
	Normally, all channels have RX and TX queues, but this is not true if modparam efx_separate_tx_channels=1 is used. In that cases, some channels only have RX queues and others only TX queues (or more preciselly, they have them allocated, but not initialized). Fix efx_channel_has_tx_queues to return the correct value for this case too. This has been already done for sfc, do it also for sfc_siena. Messages shown at probe time before the fix: sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0 ------------[ cut here ]------------ netdevice: ens6f0np0: failed to initialise TXQ -1 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped Call Trace: efx_init_tx_queue+0xaa/0xf0 [sfc] efx_start_channels+0x49/0x120 [sfc] efx_start_all+0x1f8/0x430 [sfc] efx_net_open+0x5a/0xe0 [sfc] __dev_open+0xd0/0x190 __dev_change_flags+0x1b3/0x220 dev_change_flags+0x21/0x60 [...] stripped Messages shown at remove time before the fix: sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues sfc 0000:03:00.0 ens6f0np0: failed to flush queues Fixes: 8700aff08984 ("sfc: fix channel allocation with brute force") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Tested-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01	socket: Don't use u8 type in uapi socket.h	Tobias Klauser	1	-1/+1
	Use plain 255 instead, which also avoid introducing an additional header dependency on <linux/types.h> Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/r/20220531094345.13801-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01	rtc: mxc: Silence a clang warning	Fabio Estevam	1	-1/+1
	Change the of_device_get_match_data() cast to (uintptr_t) to silence the following clang warning: drivers/rtc/rtc-mxc.c:315:19: warning: cast to smaller integer type 'enum imx_rtc_type' from 'const void *' [-Wvoid-pointer-to-enum-cast] Reported-by: kernel test robot <lkp@intel.com> Fixes: ba7aa63000f2 ("rtc: mxc: use of_device_get_match_data") Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20220526011459.1167197-1-festevam@gmail.com
2022-06-01	MAINTAINERS: rectify entries for some i3c drivers after dt conversion	Lukas Bulwahn	1	-2/+2
	Commit 4bd69ecfa672 ("dt-bindings: i3c: Convert cdns,i3c-master to DT schema") and commit 6742ca620bd9 ("dt-bindings: i3c: Convert snps,dw-i3c-master to DT schema") convert some i3c dt-bindings to yaml, but miss to adjust its reference in MAINTAINERS. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about broken references. Repair these file references in I3C DRIVER FOR CADENCE I3C MASTER IP and I3C DRIVER FOR SYNOPSYS DESIGNWARE. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20220601074212.19984-1-lukas.bulwahn@gmail.com
2022-06-01	afs: Fix infinite loop found by xfstest generic/676	David Howells	1	-1/+4
	In AFS, a directory is handled as a file that the client downloads and parses locally for the purposes of performing lookup and getdents operations. The in-kernel afs filesystem has a number of functions that do this. A directory file is arranged as a series of 2K blocks divided into 32-byte slots, where a directory entry occupies one or more slots, plus each block starts with one or more metadata blocks. When parsing a block, if the last slots are occupied by a dirent that occupies more than a single slot and the file position points at a slot that's not the initial one, the logic in afs_dir_iterate_block() that skips over it won't advance the file pointer to the end of it. This will cause an infinite loop in getdents() as it will keep retrying that block and failing to advance beyond the final entry. Fix this by advancing the file pointer if the next entry will be beyond it when we skip a block. This was found by the generic/676 xfstest but can also be triggered with something like: ~/xfstests-dev/src/t_readdir_3 /xfstest.test/z 4000 1 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: http://lore.kernel.org/r/165391973497.110268.2939296942213894166.stgit@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-01	netfilter: nf_tables: use kfree_rcu(ptr, rcu) to release hooks in clean_net path	Pablo Neira Ayuso	1	-1/+1
	Use kfree_rcu(ptr, rcu) variant instead as described by ae089831ff28 ("netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant"). Fixes: f9a43007d3f7 ("netfilter: nf_tables: double hook unregistration in netns path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-01	netfilter: nat: really support inet nat without l3 address	Florian Westphal	2	-1/+45
	When no l3 address is given, priv->family is set to NFPROTO_INET and the evaluation function isn't called. Call it too so l4-only rewrite can work. Also add a test case for this. Fixes: a33f387ecd5aa ("netfilter: nft_nat: allow to specify layer 4 protocol NAT only") Reported-by: Yi Chen <yiche@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-06-01	net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()	Dan Carpenter	1	-1/+1
	The tcf_ct_flow_table_fill_tuple_ipv6() function is supposed to return false on failure. It should not return negatives because that means succes/true. Fixes: fcb6aa86532c ("act_ct: Support GRE offload") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Link: https://lore.kernel.org/r/YpYFnbDxFl6tQ3Bn@kili Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-01	net: ping6: Fix ping -6 with interface name	Aya Levin	1	-4/+4
	When passing interface parameter to ping -6: $ ping -6 ::11:141:84:9 -I eth2 Results in: PING ::11:141:84:10(::11:141:84:10) from ::11:141:84:9 eth2: 56 data bytes ping: sendmsg: Invalid argument ping: sendmsg: Invalid argument Initialize the fl6's outgoing interface (OIF) before triggering ip6_datagram_send_ctl. Don't wipe fl6 after ip6_datagram_send_ctl() as changes in fl6 that may happen in the function are overwritten explicitly. Update comment accordingly. Fixes: 13651224c00b ("net: ping6: support setting basic SOL_IPV6 options via cmsg") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220531084544.15126-1-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-01	macsec: fix UAF bug for real_dev	Ziyang Xuan	1	-0/+7
	Create a new macsec device but not get reference to real_dev. That can not ensure that real_dev is freed after macsec. That will trigger the UAF bug for real_dev as following: ================================================================== BUG: KASAN: use-after-free in macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662 Call Trace: ... macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662 dev_get_iflink+0x73/0xe0 net/core/dev.c:637 default_operstate net/core/link_watch.c:42 [inline] rfc2863_policy+0x233/0x2d0 net/core/link_watch.c:54 linkwatch_do_dev+0x2a/0x150 net/core/link_watch.c:161 Allocated by task 22209: ... alloc_netdev_mqs+0x98/0x1100 net/core/dev.c:10549 rtnl_create_link+0x9d7/0xc00 net/core/rtnetlink.c:3235 veth_newlink+0x20e/0xa90 drivers/net/veth.c:1748 Freed by task 8: ... kfree+0xd6/0x4d0 mm/slub.c:4552 kvfree+0x42/0x50 mm/util.c:615 device_release+0x9f/0x240 drivers/base/core.c:2229 kobject_cleanup lib/kobject.c:673 [inline] kobject_release lib/kobject.c:704 [inline] kref_put include/linux/kref.h:65 [inline] kobject_put+0x1c8/0x540 lib/kobject.c:721 netdev_run_todo+0x72e/0x10b0 net/core/dev.c:10327 After commit faab39f63c1f ("net: allow out-of-order netdev unregistration") and commit e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev"), we can add dev_hold_track() in macsec_dev_init() and dev_put_track() in macsec_free_netdev() to fix the problem. Fixes: 2bce1ebed17d ("macsec: fix refcnt leak in module exit routine") Reported-by: syzbot+d0e94b65ac259c29ce7a@syzkaller.appspotmail.com Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20220531074500.1272846-1-william.xuanziyang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-01	octeontx2-af: fix error code in is_valid_offset()	Dan Carpenter	1	-1/+1
	The is_valid_offset() function returns success/true if the call to validate_and_get_cpt_blkaddr() fails. Fixes: ecad2ce8c48f ("octeontx2-af: cn10k: Add mailbox to configure reassembly timeout") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YpXDrTPb8qV01JSP@kili Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-01	wifi: mac80211: fix use-after-free in chanctx code	Johannes Berg	1	-5/+2
	In ieee80211_vif_use_reserved_context(), when we have an old context and the new context's replace_state is set to IEEE80211_CHANCTX_REPLACE_NONE, we free the old context in ieee80211_vif_use_reserved_reassign(). Therefore, we cannot check the old_ctx anymore, so we should set it to NULL after this point. However, since the new_ctx replace state is clearly not IEEE80211_CHANCTX_REPLACES_OTHER, we're not going to do anything else in this function and can just return to avoid accessing the freed old_ctx. Cc: stable@vger.kernel.org Fixes: 5bcae31d9cb1 ("mac80211: implement multi-vif in-place reservations") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220601091926.df419d91b165.I17a9b3894ff0b8323ce2afdb153b101124c821e5@changeid
2022-06-01	bonding: guard ns_targets by CONFIG_IPV6	Hangbin Liu	3	-4/+14
	Guard ns_targets in struct bond_params by CONFIG_IPV6, which could save 256 bytes if IPv6 not configed. Also add this protection for function bond_is_ip6_target_ok() and bond_get_targets_ip6(). Remove the IS_ENABLED() check for bond_opts[] as this will make BOND_OPT_NS_TARGETS uninitialized if CONFIG_IPV6 not enabled. Add a dummy bond_option_ns_ip6_targets_set() for this situation. Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Link: https://lore.kernel.org/r/20220531063727.224043-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-31	tcp: tcp_rtx_synack() can be called from process context	Eric Dumazet	1	-2/+2
	Laurent reported the enclosed report [1] This bug triggers with following coditions: 0) Kernel built with CONFIG_DEBUG_PREEMPT=y 1) A new passive FastOpen TCP socket is created. This FO socket waits for an ACK coming from client to be a complete ESTABLISHED one. 2) A socket operation on this socket goes through lock_sock() release_sock() dance. 3) While the socket is owned by the user in step 2), a retransmit of the SYN is received and stored in socket backlog. 4) At release_sock() time, the socket backlog is processed while in process context. 5) A SYNACK packet is cooked in response of the SYN retransmit. 6) -> tcp_rtx_synack() is called in process context. Before blamed commit, tcp_rtx_synack() was always called from BH handler, from a timer handler. Fix this by using TCP_INC_STATS() & NET_INC_STATS() which do not assume caller is in non preemptible context. [1] BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180 caller is tcp_rtx_synack.part.0+0x36/0xc0 CPU: 10 PID: 2180 Comm: epollpep Tainted: G OE 5.16.0-0.bpo.4-amd64 #1 Debian 5.16.12-1~bpo11+1 Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021 Call Trace: <TASK> dump_stack_lvl+0x48/0x5e check_preemption_disabled+0xde/0xe0 tcp_rtx_synack.part.0+0x36/0xc0 tcp_rtx_synack+0x8d/0xa0 ? kmem_cache_alloc+0x2e0/0x3e0 ? apparmor_file_alloc_security+0x3b/0x1f0 inet_rtx_syn_ack+0x16/0x30 tcp_check_req+0x367/0x610 tcp_rcv_state_process+0x91/0xf60 ? get_nohz_timer_target+0x18/0x1a0 ? lock_timer_base+0x61/0x80 ? preempt_count_add+0x68/0xa0 tcp_v4_do_rcv+0xbd/0x270 __release_sock+0x6d/0xb0 release_sock+0x2b/0x90 sock_setsockopt+0x138/0x1140 ? __sys_getsockname+0x7e/0xc0 ? aa_sk_perm+0x3e/0x1a0 __sys_setsockopt+0x198/0x1e0 __x64_sys_setsockopt+0x21/0x30 do_syscall_64+0x38/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-31	net: sched: add barrier to fix packet stuck problem for lockless qdisc	Guoju Fang	1	-0/+6
	In qdisc_run_end(), the spin_unlock() only has store-release semantic, which guarantees all earlier memory access are visible before it. But the subsequent test_bit() has no barrier semantics so may be reordered ahead of the spin_unlock(). The store-load reordering may cause a packet stuck problem. The concurrent operations can be described as below, CPU 0 \| CPU 1 qdisc_run_end() \| qdisc_run_begin() . \| . ----> /* may be reorderd here */ \| . \| . \| . \| spin_unlock() \| set_bit() \| . \| smp_mb__after_atomic() ---- test_bit() \| spin_trylock() . \| . Consider the following sequence of events: CPU 0 reorder test_bit() ahead and see MISSED = 0 CPU 1 calls set_bit() CPU 1 calls spin_trylock() and return fail CPU 0 executes spin_unlock() At the end of the sequence, CPU 0 calls spin_unlock() and does nothing because it see MISSED = 0. The skb on CPU 1 has beed enqueued but no one take it, until the next cpu pushing to the qdisc (if ever ...) will notice and dequeue it. This patch fix this by adding one explicit barrier. As spin_unlock() and test_bit() ordering is a store-load ordering, a full memory barrier smp_mb() is needed here. Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220528101628.120193-1-gjfang@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-31	netfilter: flowtable: fix nft_flow_route source address for nat case	wenxu	1	-2/+2
	For snat and dnat cases, the saddr should be taken from reverse tuple. Fixes: 3412e1641828 (netfilter: flowtable: nft_flow_route use more data for reverse route) Signed-off-by: wenxu <wenxu@chinatelecom.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-31	netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag	wenxu	1	-0/+2
	The nf_flow_table gets route through ip_route_output_key. If the saddr is not local one, then FLOWI_FLAG_ANYSRC flag should be set. Without this flag, the route lookup for other_dst will fail. Fixes: 3412e1641828 (netfilter: flowtable: nft_flow_route use more data for reverse route) Signed-off-by: wenxu <wenxu@chinatelecom.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-31	netfilter: nf_tables: double hook unregistration in netns path	Pablo Neira Ayuso	1	-13/+41
	__nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: 767d1216bff8 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-31	netfilter: nf_tables: hold mutex on netns pre_exit path	Pablo Neira Ayuso	1	-0/+4
	clean_net() runs in workqueue while walking over the lists, grab mutex. Fixes: 767d1216bff8 ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-31	netfilter: nf_tables: sanitize nft_set_desc_concat_parse()	Pablo Neira Ayuso	1	-4/+13
	Add several sanity checks for nft_set_desc_concat_parse(): - validate desc->field_count not larger than desc->field_len array. - field length cannot be larger than desc->field_len (ie. U8_MAX) - total length of the concatenation cannot be larger than register array. Joint work with Florian Westphal. Fixes: f3a2181e16f1 ("netfilter: nf_tables: Support for sets with multiple ranged fields") Reported-by: <zhangziming.zzm@antgroup.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-05-31	NFSv4.1 mark qualified async operations as MOVEABLE tasks	Olga Kornievskaia	5	-12/+30
	Mark async operations such as RENAME, REMOVE, COMMIT MOVEABLE for the nfsv4.1+ sessions. Fixes: 85e39feead948 ("NFSv4.1 identify and mark RPC tasks that can move between transports") Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-31	xprtrdma: treat all calls not a bcall when bc_serv is NULL	Kinglong Mee	1	-0/+5
	When a rdma server returns a fault format reply, nfs v3 client may treats it as a bcall when bc service is not exist. The debug message at rpcrdma_bc_receive_call are, [56579.837169] RPC: rpcrdma_bc_receive_call: callback XID 00000001, length=20 [56579.837174] RPC: rpcrdma_bc_receive_call: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04 After that, rpcrdma_bc_receive_call will meets NULL pointer as, [ 226.057890] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8 ... [ 226.058704] RIP: 0010:_raw_spin_lock+0xc/0x20 ... [ 226.059732] Call Trace: [ 226.059878] rpcrdma_bc_receive_call+0x138/0x327 [rpcrdma] [ 226.060011] __ib_process_cq+0x89/0x170 [ib_core] [ 226.060092] ib_cq_poll_work+0x26/0x80 [ib_core] [ 226.060257] process_one_work+0x1a7/0x360 [ 226.060367] ? create_worker+0x1a0/0x1a0 [ 226.060440] worker_thread+0x30/0x390 [ 226.060500] ? create_worker+0x1a0/0x1a0 [ 226.060574] kthread+0x116/0x130 [ 226.060661] ? kthread_flush_work_fn+0x10/0x10 [ 226.060724] ret_from_fork+0x35/0x40 ... Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-31	NFSv4: Fix free of uninitialized nfs4_label on referral lookup.	Benjamin Coddington	5	-14/+25
	Send along the already-allocated fattr along with nfs4_fs_locations, and drop the memcpy of fattr. We end up growing two more allocations, but this fixes up a crash as: PID: 790 TASK: ffff88811b43c000 CPU: 0 COMMAND: "ls" #0 [ffffc90000857920] panic at ffffffff81b9bfde #1 [ffffc900008579c0] do_trap at ffffffff81023a9b #2 [ffffc90000857a10] do_error_trap at ffffffff81023b78 #3 [ffffc90000857a58] exc_stack_segment at ffffffff81be1f45 #4 [ffffc90000857a80] asm_exc_stack_segment at ffffffff81c009de #5 [ffffc90000857b08] nfs_lookup at ffffffffa0302322 [nfs] #6 [ffffc90000857b70] __lookup_slow at ffffffff813a4a5f #7 [ffffc90000857c60] walk_component at ffffffff813a86c4 #8 [ffffc90000857cb8] path_lookupat at ffffffff813a9553 #9 [ffffc90000857cf0] filename_lookup at ffffffff813ab86b Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: 9558a007dbc3 ("NFS: Remove the label from the nfs4_lookup_res struct") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-05-31	net/mlx5: Fix mlx5_get_next_dev() peer device matching	Saeed Mahameed	1	-11/+23
	In some use-cases, mlx5 instances will need to search for their peer device (the other port on the same HCA). For that, mlx5 device matching mechanism relied on auxiliary_find_device() to search, and used a bad matching callback function. This approach has two issues: 1) next_phys_dev() the matching function, assumed all devices are of the type mlx5_adev (mlx5 auxiliary device) which is wrong and could lead to crashes, this worked for a while, since only lately other drivers started registering auxiliary devices. 2) using the auxiliary class bus (auxiliary_find_device) to search for mlx5_core_dev devices, who are actually PCIe device instances, is wrong. This works since mlx5_core always has at least one mlx5_adev instance hanging around in the aux bus. As suggested by others we can fix 1. by comparing device names prefixes if they have the string "mlx5_core" in them, which is not a best practice ! but even with that fixed, still 2. needs fixing, we are trying to match pcie device peers so we should look in the right bus (pci bus), hence this fix. The fix: 1) search the pci bus for mlx5 peer devices, instead of the aux bus 2) to validated devices are the same type "mlx5_core_dev" compare if they have the same driver, which is bulletproof. This wouldn't have worked with the aux bus since the various mlx5 aux device types don't share the same driver, even if they share the same device wrapper struct (mlx5_adev) "which helped to find the parent device" Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus") Reported-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reported-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com>