linux-dev - Linux kernel development work

Age	Commit message (Collapse)	Author	Files	Lines
2020-11-17	net/mlx5: Add handling of port type in rule deletion	Michael Guralnik	1	-0/+7
	Handle destruction of rules with port destination type to enable full destruction of flow. Without this handling of TX rules the deletion of these rules fails. Dmesg of flow destruction failure: [ 203.714146] mlx5_core 0000:00:0b.0: mlx5_cmd_check:753:(pid 342): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x144b7a) [ 210.547387] ------------[ cut here ]------------ [ 210.548663] refcount_t: decrement hit 0; leaking memory. [ 210.550651] WARNING: CPU: 4 PID: 342 at lib/refcount.c:31 refcount_warn_saturate+0x5c/0x110 [ 210.550654] Modules linked in: mlx5_ib mlx5_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core [ 210.550675] CPU: 4 PID: 342 Comm: test Not tainted 5.8.0-rc2+ #116 [ 210.550678] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 210.550680] RIP: 0010:refcount_warn_saturate+0x5c/0x110 [ 210.550685] Code: c6 d1 1b 01 00 0f 84 ad 00 00 00 5b 5d c3 80 3d b5 d1 1b 01 00 75 f4 48 c7 c7 20 d1 15 82 c6 05 a5 d1 1b 01 01 e8 a7 eb af ff <0f> 0b eb dd 80 3d 99 d1 1b 01 00 75 d4 48 c7 c7 c0 cf 15 82 c6 05 [ 210.550687] RSP: 0018:ffff8881642e77e8 EFLAGS: 00010282 [ 210.550691] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000 [ 210.550694] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102c85ceef [ 210.550696] RBP: ffff888161720428 R08: ffffffff8124c10e R09: ffffed103243beae [ 210.550698] R10: ffff8881921df56b R11: ffffed103243bead R12: ffff8881841b4180 [ 210.550701] R13: ffff888161720428 R14: ffff8881616d0000 R15: ffff888161720380 [ 210.550704] FS: 00007fc27f025740(0000) GS:ffff888192000000(0000) knlGS:0000000000000000 [ 210.550706] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 210.550708] CR2: 0000557e4b41a6a0 CR3: 0000000002415004 CR4: 0000000000360ea0 [ 210.550711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 210.550713] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 210.550715] Call Trace: [ 210.550717] mlx5_del_flow_rules+0x484/0x490 [mlx5_core] [ 210.550720] ? mlx5_cmd_set_fte+0xa80/0xa80 [mlx5_core] [ 210.550722] mlx5_ib_destroy_flow+0x17f/0x280 [mlx5_ib] [ 210.550724] uverbs_free_flow+0x4c/0x90 [ib_uverbs] [ 210.550726] destroy_hw_idr_uobject+0x41/0xb0 [ib_uverbs] [ 210.550728] uverbs_destroy_uobject+0xaa/0x390 [ib_uverbs] [ 210.550731] __uverbs_cleanup_ufile+0x129/0x1b0 [ib_uverbs] [ 210.550733] ? uverbs_destroy_uobject+0x390/0x390 [ib_uverbs] [ 210.550735] uverbs_destroy_ufile_hw+0x78/0x190 [ib_uverbs] [ 210.550737] ib_uverbs_close+0x36/0x140 [ib_uverbs] [ 210.550739] __fput+0x181/0x380 [ 210.550741] task_work_run+0x88/0xd0 [ 210.550743] do_exit+0x5f6/0x13b0 [ 210.550745] ? sched_clock_cpu+0x30/0x140 [ 210.550747] ? is_current_pgrp_orphaned+0x70/0x70 [ 210.550750] ? lock_downgrade+0x360/0x360 [ 210.550752] ? mark_held_locks+0x1d/0x90 [ 210.550754] do_group_exit+0x8a/0x140 [ 210.550756] get_signal+0x20a/0xf50 [ 210.550758] do_signal+0x8c/0xbe0 [ 210.550760] ? hrtimer_nanosleep+0x1d8/0x200 [ 210.550762] ? nanosleep_copyout+0x50/0x50 [ 210.550764] ? restore_sigcontext+0x320/0x320 [ 210.550766] ? __hrtimer_init+0xf0/0xf0 [ 210.550768] ? timespec64_add_safe+0x150/0x150 [ 210.550770] ? mark_held_locks+0x1d/0x90 [ 210.550772] ? lockdep_hardirqs_on_prepare+0x14c/0x240 [ 210.550774] __prepare_exit_to_usermode+0x119/0x170 [ 210.550776] do_syscall_64+0x65/0x300 [ 210.550778] ? trace_hardirqs_off+0x10/0x120 [ 210.550781] ? mark_held_locks+0x1d/0x90 [ 210.550783] ? asm_sysvec_apic_timer_interrupt+0xa/0x20 [ 210.550785] ? lockdep_hardirqs_on+0x112/0x190 [ 210.550787] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 210.550789] RIP: 0033:0x7fc27f1cd157 [ 210.550791] Code: Bad RIP value. [ 210.550793] RSP: 002b:00007ffd4db27ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000023 [ 210.550798] RAX: fffffffffffffdfc RBX: ffffffffffffff80 RCX: 00007fc27f1cd157 [ 210.550800] RDX: 00007fc27f025740 RSI: 00007ffd4db27eb0 RDI: 00007ffd4db27eb0 [ 210.550803] RBP: 0000000000000016 R08: 0000000000000000 R09: 000000000000000e [ 210.550805] R10: 00007ffd4db27dc7 R11: 0000000000000246 R12: 0000000000400c00 [ 210.550808] R13: 00007ffd4db285f0 R14: 0000000000000000 R15: 0000000000000000 [ 210.550809] irq event stamp: 49399 [ 210.550812] hardirqs last enabled at (49399): [<ffffffff81172d36>] console_unlock+0x556/0x6f0 [ 210.550815] hardirqs last disabled at (49398): [<ffffffff81172897>] console_unlock+0xb7/0x6f0 [ 210.550818] softirqs last enabled at (48706): [<ffffffff81e0037b>] __do_softirq+0x37b/0x60c [ 210.550820] softirqs last disabled at (48697): [<ffffffff81c00e2f>] asm_call_on_stack+0xf/0x20 [ 210.550822] ---[ end trace ad18c0e6fa846454 ]--- [ 210.581862] mlx5_core 0000:00:0c.0: mlx5_destroy_flow_table:2132:(pid 342): Flow table 262150 wasn't destroyed, refcount > 1 Fixes: a7ee18bdee83 ("RDMA/mlx5: Allow creating a matcher for a NIC TX flow table") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17	net/mlx5e: Fix check if netdev is bond slave	Maor Dickman	1	-1/+1
	Bond events handler uses bond_slave_get_rtnl to check if net device is bond slave. bond_slave_get_rtnl return the rcu rx_handler pointer from the netdev which exists for bond slaves but also exists for devices that are attached to linux bridge so using it as indication for bond slave is wrong. Fix by using netif_is_lag_port instead. Fixes: 7e51891a237f ("net/mlx5e: Use netdev events to set/del egress acl forward-to-vport rule") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17	net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb	Huy Nguyen	4	-13/+16
	Both TC and IPsec crypto offload use metadata_regB to store private information. Since TC does not use bit 31 of regB, IPsec will use bit 31 as the IPsec packet marker. The IPsec's regB usage is changed to: Bit31: IPsec marker Bit30-24: IPsec syndrome Bit23-0: IPsec obj id Fixes: b2ac7541e377 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17	net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.	Huy Nguyen	1	-7/+6
	The IP's checksum partial still requires L4 csum flag on Ethernet WQE. Make the IPsec WAs only for the IP's non checksum partial case (for example icmd packet) Fixes: 5be019040cb7 ("net/mlx5e: IPsec: Add Connect-X IPsec Tx data path offload") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17	net/mlx5e: Fix refcount leak on kTLS RX resync	Maxim Mikityanskiy	1	-5/+8
	On resync, the driver calls inet_lookup_established (__inet6_lookup_established) that increases sk_refcnt of the socket. To decrease it, the driver set skb->destructor to sock_edemux. However, it didn't work well, because the TCP stack also sets this destructor for early demux, and the refcount gets decreased only once, while increased two times (in mlx5e and in the TCP stack). It leads to a socket leak, a TLS context leak, which in the end leads to calling tls_dev_del twice: on socket close and on driver unload, which in turn leads to a crash. This commit fixes the refcount leak by calling sock_gen_put right away after using the socket, thus fixing all the subsequent issues. Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-17	tcp: only postpone PROBE_RTT if RTT is < current min_rtt estimate	Ryan Sharpelletti	1	-1/+1
	During loss recovery, retransmitted packets are forced to use TCP timestamps to calculate the RTT samples, which have a millisecond granularity. BBR is designed using a microsecond granularity. As a result, multiple RTT samples could be truncated to the same RTT value during loss recovery. This is problematic, as BBR will not enter PROBE_RTT if the RTT sample is <= the current min_rtt sample, meaning that if there are persistent losses, PROBE_RTT will constantly be pushed off and potentially never re-entered. This patch makes sure that BBR enters PROBE_RTT by checking if RTT sample is < the current min_rtt sample, rather than <=. The Netflix transport/TCP team discovered this bug in the Linux TCP BBR code during lab tests. Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control") Signed-off-by: Ryan Sharpelletti <sharpelletti@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Link: https://lore.kernel.org/r/20201116174412.1433277-1-sharpelletti.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17	net: ftgmac100: Fix crash when removing driver	Joel Stanley	1	-0/+4
	When removing the driver we would hit BUG_ON(!list_empty(&dev->ptype_specific)) in net/core/dev.c due to still having the NC-SI packet handler registered. # echo 1e660000.ethernet > /sys/bus/platform/drivers/ftgmac100/unbind ------------[ cut here ]------------ kernel BUG at net/core/dev.c:10254! Internal error: Oops - BUG: 0 [#1] SMP ARM CPU: 0 PID: 115 Comm: sh Not tainted 5.10.0-rc3-next-20201111-00007-g02e0365710c4 #46 Hardware name: Generic DT based system PC is at netdev_run_todo+0x314/0x394 LR is at cpumask_next+0x20/0x24 pc : [<806f5830>] lr : [<80863cb0>] psr: 80000153 sp : 855bbd58 ip : 00000001 fp : 855bbdac r10: 80c03d00 r9 : 80c06228 r8 : 81158c54 r7 : 00000000 r6 : 80c05dec r5 : 80c05d18 r4 : 813b9280 r3 : 813b9054 r2 : 8122c470 r1 : 00000002 r0 : 00000002 Flags: Nzcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none Control: 00c5387d Table: 85514008 DAC: 00000051 Process sh (pid: 115, stack limit = 0x7cb5703d) ... Backtrace: [<806f551c>] (netdev_run_todo) from [<80707eec>] (rtnl_unlock+0x18/0x1c) r10:00000051 r9:854ed710 r8:81158c54 r7:80c76bb0 r6:81158c10 r5:8115b410 r4:813b9000 [<80707ed4>] (rtnl_unlock) from [<806f5db8>] (unregister_netdev+0x2c/0x30) [<806f5d8c>] (unregister_netdev) from [<805a8180>] (ftgmac100_remove+0x20/0xa8) r5:8115b410 r4:813b9000 [<805a8160>] (ftgmac100_remove) from [<805355e4>] (platform_drv_remove+0x34/0x4c) Fixes: bd466c3fb5a4 ("net/faraday: Support NCSI mode") Signed-off-by: Joel Stanley <joel@jms.id.au> Link: https://lore.kernel.org/r/20201117024448.1170761-1-joel@jms.id.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17	net: b44: fix error return code in b44_init_one()	Zhang Changzhong	1	-1/+2
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 39a6f4bce6b4 ("b44: replace the ssb_dma API with the generic DMA API") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1605582131-36735-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-17	qed: fix error return code in qed_iwarp_ll2_start()	Zhang Changzhong	1	-3/+9
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 469981b17a4f ("qed: Add unaligned and packed packet processing") Fixes: fcb39f6c10b2 ("qed: Add mpa buffer descriptors for storing and processing mpa fpdus") Fixes: 1e28eaad07ea ("qed: Add iWARP support for fpdu spanned over more than two tcp packets") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Michal Kalderon <michal.kalderon@marvell.com> Link: https://lore.kernel.org/r/1605532033-27373-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	bnxt_en: Avoid unnecessary NVM_GET_DEV_INFO cmd error log on VFs.	Vasundhara Volam	1	-0/+3
	VFs do not have access permissions to issue NVM_GET_DEV_INFO firmware command. Fixes: 4933f6753b50 ("bnxt_en: Add bnxt_hwrm_nvm_get_dev_info() to query NVM info.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	bnxt_en: Fix counter overflow logic.	Michael Chan	1	-0/+1
	bnxt_add_one_ctr() adds a hardware counter to a software counter and adjusts for the hardware counter wraparound against the mask. The logic assumes that the hardware counter is always smaller than or equal to the mask. This assumption is mostly correct. But in some cases if the firmware is older and does not provide the accurate mask, the driver can use a mask that is smaller than the actual hardware mask. This can cause some extra carry bits to be added to the software counter, resulting in counters that far exceed the actual value. Fix it by masking the hardware counter with the mask passed into bnxt_add_one_ctr(). Fixes: fea6b3335527 ("bnxt_en: Accumulate all counters.") Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	bnxt_en: Free port stats during firmware reset.	Michael Chan	1	-1/+2
	Firmware is unable to retain the port counters during any kind of fatal or non-fatal resets, so we must clear the port counters to avoid false detection of port counter overflow. Fixes: fea6b3335527 ("bnxt_en: Accumulate all counters.") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	bnxt_en: read EEPROM A2h address using page 0	Edwin Peer	1	-1/+1
	The module eeprom address range returned by bnxt_get_module_eeprom() should be 256 bytes of A0h address space, the lower half of the A2h address space, and page 0 for the upper half of the A2h address space. Fix the firmware call by passing page_number 0 for the A2h slave address space. Fixes: 42ee18fe4ca2 ("bnxt_en: Add Support for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPRO") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: ipa: lock when freeing transaction	Alex Elder	1	-3/+12
	Transactions sit on one of several lists, depending on their state (allocated, pending, complete, or polled). A spinlock protects against concurrent access when transactions are moved between these lists. Transactions are also reference counted. A newly-allocated transaction has an initial count of 1; a transaction is released in gsi_trans_free() only if its decremented reference count reaches 0. Releasing a transaction includes removing it from the polled (or if unused, allocated) list, so the spinlock is acquired when we release a transaction. The reference count is used to allow a caller to synchronously wait for a committed transaction to complete. In this case, the waiter takes an extra reference to the transaction before committing it (so it won't be freed), and releases its reference (calls gsi_trans_free()) when it is done with it. Similarly, gsi_channel_update() takes an extra reference to ensure a transaction isn't released before the function is done operating on it. Until the transaction is moved to the completed list (by this function) it won't be freed, so this reference is taken "safely." But in the quiesce path, we want to wait for the "last" transaction, which we find in the completed or polled list. Transactions on these lists can be freed at any time, so we (try to) prevent that by taking the reference while holding the spinlock. Currently gsi_trans_free() decrements a transaction's reference count unconditionally, acquiring the lock to remove the transaction from its list only when the count reaches 0. This does not protect the quiesce path, which depends on the lock to ensure its extra reference prevents release of the transaction. Fix this by only dropping the last reference to a transaction in gsi_trans_free() while holding the spinlock. Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions") Reported-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20201114182017.28270-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net/tls: fix corrupted data in recvmsg	Vadim Fedorenko	1	-1/+1
	If tcp socket has more data than Encrypted Handshake Message then tls_sw_recvmsg will try to decrypt next record instead of returning full control message to userspace as mentioned in comment. The next message - usually Application Data - gets corrupted because it uses zero copy for decryption that's why the data is not stored in skb for next iteration. Revert check to not decrypt next record if current is not Application Data. Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Link: https://lore.kernel.org/r/1605413760-21153-1-git-send-email-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: qualcomm: rmnet: Fix incorrect receive packet handling during cleanup	Subash Abhinov Kasiviswanathan	1	-0/+5
	During rmnet unregistration, the real device rx_handler is first cleared followed by the removal of rx_handler_data after the rcu synchronization. Any packets in the receive path may observe that the rx_handler is NULL. However, there is no check when dereferencing this value to use the rmnet_port information. This fixes following splat by adding the NULL check. Unable to handle kernel NULL pointer dereference at virtual address 000000000000000d pc : rmnet_rx_handler+0x124/0x284 lr : rmnet_rx_handler+0x124/0x284 rmnet_rx_handler+0x124/0x284 __netif_receive_skb_core+0x758/0xd74 __netif_receive_skb+0x50/0x17c process_backlog+0x15c/0x1b8 napi_poll+0x88/0x284 net_rx_action+0xbc/0x23c __do_softirq+0x20c/0x48c Fixes: ceed73a2cf4a ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation") Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Link: https://lore.kernel.org/r/1605298325-3705-1-git-send-email-subashab@codeaurora.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: mvneta: fix possible memory leak in mvneta_swbm_add_rx_fragment	Lorenzo Bianconi	1	-2/+3
	Recycle the page running page_pool_put_full_page() in mvneta_swbm_add_rx_fragment routine when the last descriptor contains just the FCS or if the received packet contains more than MAX_SKB_FRAGS fragments Fixes: ca0e014609f0 ("net: mvneta: move skb build after descriptors processing") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/df6a2bad70323ee58d3901491ada31c1ca2a40b9.1605291228.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: stmmac: Use rtnl_lock/unlock on netif_set_real_num_rx_queues() call	Wong Vee Khee	1	-0/+2
	Fix an issue where dump stack is printed on suspend resume flow due to netif_set_real_num_rx_queues() is not called with rtnl_lock held(). Fixes: 686cff3d7022 ("net: stmmac: Fix incorrect location to set real_num_rx\|tx_queues") Reported-by: Christophe ROULLIER <christophe.roullier@st.com> Tested-by: Christophe ROULLIER <christophe.roullier@st.com> Cc: Alexandre TORGUE <alexandre.torgue@st.com> Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com> Link: https://lore.kernel.org/r/20201115074210.23605-1-vee.khee.wong@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: bridge: add missing counters to ndo_get_stats64 callback	Heiner Kallweit	1	-0/+1
	In br_forward.c and br_input.c fields dev->stats.tx_dropped and dev->stats.multicast are populated, but they are ignored in ndo_get_stats64. Fixes: 28172739f0a2 ("net: fix 64 bit counters on 32 bit arches") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/58ea9963-77ad-a7cf-8dfd-fc95ab95f606@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: ethernet: ti: cpsw: fix error return code in cpsw_probe()	Zhang Changzhong	1	-0/+1
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 83a8471ba255 ("net: ethernet: ti: cpsw: refactor probe to group common hw initialization") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1605250173-18438-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: stmmac: dwmac-intel-plat: fix error return code in intel_eth_plat_probe()	Zhang Changzhong	1	-1/+3
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 9efc9b2b04c7 ("net: stmmac: Add dwmac-intel-plat for GBE driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1605249243-17262-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	qlcnic: fix error return code in qlcnic_83xx_restart_hw()	Zhang Changzhong	1	-1/+2
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 3ced0a88cd4c ("qlcnic: Add support to run firmware POST") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1605248186-16013-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	cx82310_eth: fix error return code in cx82310_bind()	Zhang Changzhong	1	-1/+2
	Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: ca139d76b0d9 ("cx82310_eth: re-enable ethernet mode after router reboot") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1605247627-15385-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	MAINTAINERS: update cxgb4 and cxgb3 maintainer	Raju Rangoju	1	-3/+3
	Update cxgb4 and cxgb3 driver maintainer Signed-off-by: Raju Rangoju <rajur@chelsio.com> Link: https://lore.kernel.org/r/20201116104322.3959-1-rajur@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: lantiq: Wait for the GPHY firmware to be ready	Martin Blumenstingl	1	-0/+11
	A user reports (slightly shortened from the original message): libphy: lantiq,xrx200-mdio: probed mdio_bus 1e108000.switch-mii: MDIO device at address 17 is missing. gswip 1e108000.switch lan: no phy at 2 gswip 1e108000.switch lan: failed to connect to port 2: -19 lantiq,xrx200-net 1e10b308.eth eth0: error -19 setting up slave phy This is a single-port board using the internal Fast Ethernet PHY. The user reports that switching to PHY scanning instead of configuring the PHY within device-tree works around this issue. The documentation for the standalone variant of the PHY11G (which is probably very similar to what is used inside the xRX200 SoCs but having the firmware burnt onto that standalone chip in the factory) states that the PHY needs 300ms to be ready for MDIO communication after releasing the reset. Add a 300ms delay after initializing all GPHYs to ensure that the GPHY firmware had enough time to initialize and to appear on the MDIO bus. Unfortunately there is no (known) documentation on what the minimum time to wait after releasing the reset on an internal PHY so play safe and take the one for the external variant. Only wait after the last GPHY firmware is loaded to not slow down the initialization too much ( xRX200 has two GPHYs but newer SoCs have at least three GPHYs). Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20201115165757.552641-1-martin.blumenstingl@googlemail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	MAINTAINERS: Add Martin Schiller as a maintainer for the X.25 stack	Xie He	1	-10/+9
	Martin Schiller is an active developer and reviewer for the X.25 code. His company is providing products based on the Linux X.25 stack. So he is a good candidate for maintainers of the X.25 code. The original maintainer of the X.25 network layer (Andrew Hendry) has not sent any email to the netdev mail list since 2013. So he is probably inactive now. Cc: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: Xie He <xie.he.0141@gmail.com> Acked-by: Martin Schiller <ms@dev.tdt.de> Link: https://lore.kernel.org/r/20201114111029.326972-1-xie.he.0141@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	ipv6/netfilter: Discard first fragment not including all headers	Georg Kohmann	3	-20/+46
	Packets are processed even though the first fragment don't include all headers through the upper layer header. This breaks TAHI IPv6 Core Conformance Test v6LC.1.3.6. Referring to RFC8200 SECTION 4.5: "If the first fragment does not include all headers through an Upper-Layer header, then that fragment should be discarded and an ICMP Parameter Problem, Code 3, message should be sent to the source of the fragment, with the Pointer field set to zero." The fragment needs to be validated the same way it is done in commit 2efdaaaf883a ("IPv6: reply ICMP error if the first fragment don't include all headers") for ipv6. Wrap the validation into a common function, ipv6_frag_thdr_truncated() to check for truncation in the upper layer header. This validation does not fullfill all aspects of RFC 8200, section 4.5, but is at the moment sufficient to pass mentioned TAHI test. In netfilter, utilize the fragment offset returned by find_prev_fhdr() to let ipv6_frag_thdr_truncated() start it's traverse from the fragment header. Return 0 to drop the fragment in the netfilter. This is the same behaviour as used on other protocol errors in this function, e.g. when nf_ct_frag6_queue() returns -EPROTO. The Fragment will later be picked up by ipv6_frag_rcv() in reassembly.c. ipv6_frag_rcv() will then send an appropriate ICMP Parameter Problem message back to the source. References commit 2efdaaaf883a ("IPv6: reply ICMP error if the first fragment don't include all headers") Signed-off-by: Georg Kohmann <geokohma@cisco.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://lore.kernel.org/r/20201111115025.28879-1-geokohma@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	net: fec: Fix reference count leak in fec series ops	Zhang Qilong	1	-7/+5
	pm_runtime_get_sync() will increment pm usage at first and it will resume the device later. If runtime of the device has error or device is in inaccessible state(or other error state), resume operation will fail. If we do not call put operation to decrease the reference, it will result in reference count leak. Moreover, this device cannot enter the idle state and always stay busy or other non-idle state later. So we fixed it by replacing it with pm_runtime_resume_and_get. Fixes: 8fff755e9f8d0 ("net: fec: Ensure clocks are enabled while using mdio bus") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-16	PM: runtime: Add pm_runtime_resume_and_get to deal with usage counter	Zhang Qilong	1	-0/+21
	In many case, we need to check return value of pm_runtime_get_sync, but it brings a trouble to the usage counter processing. Many callers forget to decrease the usage counter when it failed, which could resulted in reference leak. It has been discussed a lot[0][1]. So we add a function to deal with the usage counter for better coding. [0]https://lkml.org/lkml/2020/6/14/88 [1]https://patchwork.ozlabs.org/project/linux-tegra/list/?series=178139 Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-15	can: m_can: m_can_stop(): set device to software init mode before closing	Faiz Abbas	1	-0/+3
	There might be some requests pending in the buffer when the interface close sequence occurs. In some devices, these pending requests might lead to the module not shutting down properly when m_can_clk_stop() is called. Therefore, move the device to init state before potentially powering it down. Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Acked-by: Dan Murphy <dmurphy@ti.com> Link: https://lore.kernel.org/r/20200825055442.16994-1-faiz_abbas@ti.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: m_can: Fix freeing of can device from peripherials	Dan Murphy	3	-19/+33
	Fix leaking netdev device from peripherial devices. The call to allocate the netdev device is made from and managed by the peripherial. Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Dan Murphy <dmurphy@ti.com> Link: http://lore.kernel.org/r/20200227183829.21854-2-dmurphy@ti.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: m_can: m_can_class_free_dev(): introduce new function	Dan Murphy	2	-0/+7
	This patch creates a common function that peripherials can call to free the netdev device when failures occur. Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework") Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Dan Murphy <dmurphy@ti.com> Link: http://lore.kernel.org/r/20200227183829.21854-2-dmurphy@ti.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: m_can: m_can_handle_state_change(): fix state change	Wu Bo	1	-2/+2
	m_can_handle_state_change() is called with the new_state as an argument. In the switch statements for CAN_STATE_ERROR_ACTIVE, the comment and the following code indicate that a CAN_STATE_ERROR_WARNING is handled. This patch fixes this problem by changing the case to CAN_STATE_ERROR_WARNING. Signed-off-by: Wu Bo <wubo.oduw@gmail.com> Link: http://lore.kernel.org/r/20200129022330.21248-2-wubo.oduw@gmail.com Cc: Dan Murphy <dmurphy@ti.com> Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: tcan4x5x: tcan4x5x_can_remove(): fix order of deregistration	Marc Kleine-Budde	1	-2/+2
	Change the order in tcan4x5x_can_remove() to be the exact inverse of tcan4x5x_can_probe(). First m_can_class_unregister(), then power down the device. Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Cc: Dan Murphy <dmurphy@ti.com> Link: http://lore.kernel.org/r/20201019154233.1262589-10-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: tcan4x5x: tcan4x5x_can_probe(): add missing error checking for devm_regmap_init()	Marc Kleine-Budde	1	-0/+4
	This patch adds the missing error checking when initializing the regmap interface fails. Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Cc: Dan Murphy <dmurphy@ti.com> Link: http://lore.kernel.org/r/20201019154233.1262589-7-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: tcan4x5x: replace depends on REGMAP_SPI with depends on SPI	Enric Balletbo i Serra	1	-1/+2
	regmap is a library function that gets selected by drivers that need it. No driver modules should depend on it. Instead depends on SPI and select REGMAP_SPI. Depending on REGMAP_SPI makes this driver only build if another driver already selected REGMAP_SPI, as the symbol can't be selected through the menu kernel configuration. Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Link: http://lore.kernel.org/r/20200413141013.506613-1-enric.balletbo@collabora.com Reviewed-by: Dan Murphy <dmurphy@ti.com> Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: flexcan: fix failure handling of pm_runtime_get_sync()	Zhang Qilong	1	-2/+6
	pm_runtime_get_sync() will increment pm usage at first and it will resume the device later. If runtime of the device has error or device is in inaccessible state(or other error state), resume operation will fail. If we do not call put operation to decrease the reference, it will result in reference leak in the two functions flexcan_get_berr_counter() and flexcan_open(). Moreover, this device cannot enter the idle state and always stay busy or other non-idle state later. So we should fix it through adding pm_runtime_put_noidle(). Fixes: ca10989632d88 ("can: flexcan: implement can Runtime PM") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Link: https://lore.kernel.org/r/20201108083000.2599705-1-zhangqilong3@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: flexcan: flexcan_setup_stop_mode(): add missing "req_bit" to stop mode property comment	Marc Kleine-Budde	1	-1/+1
	In the patch d9b081e3fc4b ("can: flexcan: remove ack_grp and ack_bit handling from driver") the unused ack_grp and ack_bit were removed from the driver. However in the comment, the "req_bit" was accidentally removed, too. This patch adds back the "req_bit" bit. Fixes: d9b081e3fc4b ("can: flexcan: remove ack_grp and ack_bit handling from driver") Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: http://lore.kernel.org/r/20201014114810.2911135-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: peak_usb: fix potential integer overflow on shift of a int	Colin Ian King	1	-2/+2
	The left shift of int 32 bit integer constant 1 is evaluated using 32 bit arithmetic and then assigned to a signed 64 bit variable. In the case where time_ref->adapter->ts_used_bits is 32 or more this can lead to an oveflow. Avoid this by shifting using the BIT_ULL macro instead. Fixes: bb4785551f64 ("can: usb: PEAK-System Technik USB adapters driver core") Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201105112427.40688-1-colin.king@canonical.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: mcba_usb: mcba_usb_start_xmit(): first fill skb, then pass to can_put_echo_skb()	Marc Kleine-Budde	1	-2/+2
	The driver has to first fill the skb with data and then handle it to can_put_echo_skb(). This patch moves the can_put_echo_skb() down, right before sending the skb out via USB. Fixes: 51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer") Cc: Remigiusz Kołłątaj <remigiusz.kollataj@mobica.com> Link: https://lore.kernel.org/r/20201111221204.1639007-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: ti_hecc: Fix memleak in ti_hecc_probe	Zhang Qilong	1	-5/+8
	In the error handling, we should goto the probe_exit_candev to free ndev to prevent memory leak. Fixes: dabf54dd1c63 ("can: ti_hecc: Convert TI HECC driver to DT only driver") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Link: https://lore.kernel.org/r/20201114111708.3465543-1-zhangqilong3@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: dev: can_restart(): post buffer from the right context	Alejandro Concepcion Rodriguez	1	-1/+1
	netif_rx() is meant to be called from interrupt contexts. can_restart() may be called by can_restart_work(), which is called from a worqueue, so it may run in process context. Use netif_rx_ni() instead. Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Co-developed-by: Loris Fauster <loris.fauster@ttcontrol.com> Signed-off-by: Loris Fauster <loris.fauster@ttcontrol.com> Signed-off-by: Alejandro Concepcion Rodriguez <alejandro@acoro.eu> Link: https://lore.kernel.org/r/4e84162b-fb31-3a73-fa9a-9438b4bd5234@acoro.eu [mkl: use netif_rx_ni() instead of netif_rx_any_context()] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: af_can: prevent potential access of uninitialized member in canfd_rcv()	Anant Thazhemadam	1	-5/+14
	In canfd_rcv(), cfd->len is uninitialized when skb->len = 0, and this uninitialized cfd->len is accessed nonetheless by pr_warn_once(). Fix this uninitialized variable access by checking cfd->len's validity condition (cfd->len > CANFD_MAX_DLEN) separately after the skb->len's condition is checked, and appropriately modify the log messages that are generated as well. In case either of the required conditions fail, the skb is freed and NET_RX_DROP is returned, same as before. Fixes: d4689846881d ("can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once") Reported-by: syzbot+9bcb0c9409066696d3aa@syzkaller.appspotmail.com Tested-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Link: https://lore.kernel.org/r/20201103213906.24219-3-anant.thazhemadam@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-15	can: af_can: prevent potential access of uninitialized member in can_rcv()	Anant Thazhemadam	1	-5/+14
	In can_rcv(), cfd->len is uninitialized when skb->len = 0, and this uninitialized cfd->len is accessed nonetheless by pr_warn_once(). Fix this uninitialized variable access by checking cfd->len's validity condition (cfd->len > CAN_MAX_DLEN) separately after the skb->len's condition is checked, and appropriately modify the log messages that are generated as well. In case either of the required conditions fail, the skb is freed and NET_RX_DROP is returned, same as before. Fixes: 8cb68751c115 ("can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once") Reported-by: syzbot+9bcb0c9409066696d3aa@syzkaller.appspotmail.com Tested-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Link: https://lore.kernel.org/r/20201103213906.24219-2-anant.thazhemadam@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2020-11-14	devlink: Add missing genlmsg_cancel() in devlink_nl_sb_port_pool_fill()	Wang Hai	1	-2/+4
	If sb_occ_port_pool_get() failed in devlink_nl_sb_port_pool_fill(), msg should be canceled by genlmsg_cancel(). Fixes: df38dafd2559 ("devlink: implement shared buffer occupancy monitoring interface") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://lore.kernel.org/r/20201113111622.11040-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14	net: stmmac: dwmac_lib: enlarge dma reset timeout	Jisheng Zhang	1	-1/+1
	If the phy enables power saving technology, the dwmac's software reset needs more time to complete, enlarge dma reset timeout to 200000us. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Link: https://lore.kernel.org/r/20201113090902.5c7aab1a@xhacker.debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14	lan743x: prevent entire kernel HANG on open, for some platforms	Sven Van Asbroeck	1	-1/+2
	On arm imx6, when opening the chip's netdev, the whole Linux kernel intermittently hangs/freezes. This is caused by a bug in the driver code which tests if pcie interrupts are working correctly, using the software interrupt: 1. open: enable the software interrupt 2. open: tell the chip to assert the software interrupt 3. open: wait for flag 4. ISR: acknowledge s/w interrupt, set flag 5. open: notice flag, disable the s/w interrupt, continue Unfortunately the ISR only acknowledges the s/w interrupt, but does not disable it. This will re-trigger the ISR in a tight loop. On some (lucky) platforms, open proceeds to disable the s/w interrupt even while the ISR is 'spinning'. On arm imx6, the spinning ISR does not allow open to proceed, resulting in a hung Linux kernel. Fix minimally by disabling the s/w interrupt in the ISR, which will prevent it from spinning. This won't break anything because the s/w interrupt is used as a one-shot interrupt. Note that this is a minimal fix, overlooking many possible cleanups, e.g.: - lan743x_intr_software_isr() is completely redundant and reads INT_STS twice for no apparent reason - disabling the s/w interrupt in lan743x_intr_test_isr() is now redundant, but harmless - waiting on software_isr_flag can be converted from a sleeping poll loop to wait_event_timeout() Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # arm imx6 lan7430 Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201112204741.12375-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14	lan743x: fix issue causing intermittent kernel log warnings	Sven Van Asbroeck	1	-5/+5
	When running this chip on arm imx6, we intermittently observe the following kernel warning in the log, especially when the system is under high load: [ 50.119484] ------------[ cut here ]------------ [ 50.124377] WARNING: CPU: 0 PID: 303 at kernel/softirq.c:169 __local_bh_enable_ip+0x100/0x184 [ 50.132925] IRQs not enabled as expected [ 50.159250] CPU: 0 PID: 303 Comm: rngd Not tainted 5.7.8 #1 [ 50.164837] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) [ 50.171395] [<c0111a38>] (unwind_backtrace) from [<c010be28>] (show_stack+0x10/0x14) [ 50.179162] [<c010be28>] (show_stack) from [<c05b9dec>] (dump_stack+0xac/0xd8) [ 50.186408] [<c05b9dec>] (dump_stack) from [<c0122e40>] (__warn+0xd0/0x10c) [ 50.193391] [<c0122e40>] (__warn) from [<c0123238>] (warn_slowpath_fmt+0x98/0xc4) [ 50.200892] [<c0123238>] (warn_slowpath_fmt) from [<c012b010>] (__local_bh_enable_ip+0x100/0x184) [ 50.209860] [<c012b010>] (__local_bh_enable_ip) from [<bf09ecbc>] (destroy_conntrack+0x48/0xd8 [nf_conntrack]) [ 50.220038] [<bf09ecbc>] (destroy_conntrack [nf_conntrack]) from [<c0ac9b58>] (nf_conntrack_destroy+0x94/0x168) [ 50.230160] [<c0ac9b58>] (nf_conntrack_destroy) from [<c0a4aaa0>] (skb_release_head_state+0xa0/0xd0) [ 50.239314] [<c0a4aaa0>] (skb_release_head_state) from [<c0a4aadc>] (skb_release_all+0xc/0x24) [ 50.247946] [<c0a4aadc>] (skb_release_all) from [<c0a4b4cc>] (consume_skb+0x74/0x17c) [ 50.255796] [<c0a4b4cc>] (consume_skb) from [<c081a2dc>] (lan743x_tx_release_desc+0x120/0x124) [ 50.264428] [<c081a2dc>] (lan743x_tx_release_desc) from [<c081a98c>] (lan743x_tx_napi_poll+0x5c/0x18c) [ 50.273755] [<c081a98c>] (lan743x_tx_napi_poll) from [<c0a6b050>] (net_rx_action+0x118/0x4a4) [ 50.282306] [<c0a6b050>] (net_rx_action) from [<c0101364>] (__do_softirq+0x13c/0x53c) [ 50.290157] [<c0101364>] (__do_softirq) from [<c012b29c>] (irq_exit+0x150/0x17c) [ 50.297575] [<c012b29c>] (irq_exit) from [<c0196a08>] (__handle_domain_irq+0x60/0xb0) [ 50.305423] [<c0196a08>] (__handle_domain_irq) from [<c05d44fc>] (gic_handle_irq+0x4c/0x90) [ 50.313790] [<c05d44fc>] (gic_handle_irq) from [<c0100ed4>] (__irq_usr+0x54/0x80) [ 50.321287] Exception stack(0xecd99fb0 to 0xecd99ff8) [ 50.326355] 9fa0: 1cf1aa74 00000001 00000001 00000000 [ 50.334547] 9fc0: 00000001 00000000 00000000 00000000 00000000 00000000 00004097 b6d17d14 [ 50.342738] 9fe0: 00000001 b6d17c60 00000000 b6e71f94 800b0010 ffffffff [ 50.349364] irq event stamp: 2525027 [ 50.352955] hardirqs last enabled at (2525026): [<c0a6afec>] net_rx_action+0xb4/0x4a4 [ 50.360892] hardirqs last disabled at (2525027): [<c0d6d2fc>] _raw_spin_lock_irqsave+0x1c/0x50 [ 50.369517] softirqs last enabled at (2524660): [<c01015b4>] __do_softirq+0x38c/0x53c [ 50.377446] softirqs last disabled at (2524693): [<c012b29c>] irq_exit+0x150/0x17c [ 50.385027] ---[ end trace c0b571db4bc8087d ]--- The driver is calling dev_kfree_skb() from code inside a spinlock, where h/w interrupts are disabled. This is forbidden, as documented in include/linux/netdevice.h. The correct function to use dev_kfree_skb_irq(), or dev_kfree_skb_any(). Fix by using the correct dev_kfree_skb_xxx() functions: in lan743x_tx_release_desc(): called by lan743x_tx_release_completed_descriptors() called by in lan743x_tx_napi_poll() which holds a spinlock called by lan743x_tx_release_all_descriptors() called by lan743x_tx_close() which can-sleep conclusion: use dev_kfree_skb_any() in lan743x_tx_xmit_frame(): which holds a spinlock conclusion: use dev_kfree_skb_irq() in lan743x_tx_close(): which can-sleep conclusion: use dev_kfree_skb() in lan743x_rx_release_ring_element(): called by lan743x_rx_close() which can-sleep called by lan743x_rx_open() which can-sleep conclusion: use dev_kfree_skb() Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201112185949.11315-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14	netlabel: fix an uninitialized warning in netlbl_unlabel_staticlist()	Paul Moore	1	-1/+1
	Static checking revealed that a previous fix to netlbl_unlabel_staticlist() leaves a stack variable uninitialized, this patches fixes that. Fixes: 866358ec331f ("netlabel: fix our progress tracking in netlbl_unlabel_staticlist()") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/r/160530304068.15651.18355773009751195447.stgit@sifl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14	sctp: change to hold/put transport for proto_unreach_timer	Xin Long	3	-5/+5
	A call trace was found in Hangbin's Codenomicon testing with debug kernel: [ 2615.981988] ODEBUG: free active (active state 0) object type: timer_list hint: sctp_generate_proto_unreach_event+0x0/0x3a0 [sctp] [ 2615.995050] WARNING: CPU: 17 PID: 0 at lib/debugobjects.c:328 debug_print_object+0x199/0x2b0 [ 2616.095934] RIP: 0010:debug_print_object+0x199/0x2b0 [ 2616.191533] Call Trace: [ 2616.194265] <IRQ> [ 2616.202068] debug_check_no_obj_freed+0x25e/0x3f0 [ 2616.207336] slab_free_freelist_hook+0xeb/0x140 [ 2616.220971] kfree+0xd6/0x2c0 [ 2616.224293] rcu_do_batch+0x3bd/0xc70 [ 2616.243096] rcu_core+0x8b9/0xd00 [ 2616.256065] __do_softirq+0x23d/0xacd [ 2616.260166] irq_exit+0x236/0x2a0 [ 2616.263879] smp_apic_timer_interrupt+0x18d/0x620 [ 2616.269138] apic_timer_interrupt+0xf/0x20 [ 2616.273711] </IRQ> This is because it holds asoc when transport->proto_unreach_timer starts and puts asoc when the timer stops, and without holding transport the transport could be freed when the timer is still running. So fix it by holding/putting transport instead for proto_unreach_timer in transport, just like other timers in transport. v1->v2: - Also use sctp_transport_put() for the "out_unlock:" path in sctp_generate_proto_unreach_event(), as Marcelo noticed. Fixes: 50b5d6ad6382 ("sctp: Fix a race between ICMP protocol unreachable and connect()") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/102788809b554958b13b95d33440f5448113b8d6.1605331373.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>