aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/ethernet/aquantia
AgeCommit message (Collapse)AuthorFilesLines
2022-11-10net: atlantic: macsec: clear encryption keys from the stackAntoine Tenart2-7/+13
Commit aaab73f8fba4 ("macsec: clear encryption keys from the stack after setting up offload") made sure to clean encryption keys from the stack after setting up offloading, but the atlantic driver made a copy and did not clear it. Fix this. [4 Fixes tags below, all part of the same series, no need to split this] Fixes: 9ff40a751a6f ("net: atlantic: MACSec ingress offload implementation") Fixes: b8f8a0b7b5cb ("net: atlantic: MACSec ingress offload HW bindings") Fixes: 27736563ce32 ("net: atlantic: MACSec egress offload implementation") Fixes: 9d106c6dd81b ("net: atlantic: MACSec egress offload HW bindings") Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-24atlantic: fix deadlock at aq_nic_stopÍñigo Huguet2-24/+74
NIC is stopped with rtnl_lock held, and during the stop it cancels the 'service_task' work and free irqs. However, if CONFIG_MACSEC is set, rtnl_lock is acquired both from aq_nic_service_task and aq_linkstate_threaded_isr. Then a deadlock happens if aq_nic_stop tries to cancel/disable them when they've already started their execution. As the deadlock is caused by rtnl_lock, it causes many other processes to stall, not only atlantic related stuff. Fix it by introducing a mutex that protects each NIC's macsec related data, and locking it instead of the rtnl_lock from the service task and the threaded IRQ. Before this patch, all macsec data was protected with rtnl_lock, but maybe not all of it needs to be protected. With this new mutex, further efforts can be made to limit the protected data only to that which requires it. However, probably it doesn't worth it because all macsec's data accesses are infrequent, and almost all are done from macsec_ops or ethtool callbacks, called holding rtnl_lock, so macsec_mutex won't never be much contended. The issue appeared repeteadly attaching and deattaching the NIC to a bond interface. Doing that after this patch I cannot reproduce the bug. Fixes: 62c1c2e606f6 ("net: atlantic: MACSec offload skeleton") Reported-by: Li Liang <liali@redhat.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28net: drop the weight argument from netif_napi_addJakub Kicinski2-4/+2
We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, drop the weight argument, those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-23net: atlantic: macsec: remove checks on the prepare phaseAntoine Tenart1-57/+0
Remove checks on the prepare phase as it is now unused by the MACsec core implementation. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-23net: atlantic: macsec: make the prepare phase a noopAntoine Tenart1-45/+45
In preparation for removing the MACsec h/w offloading preparation phase, make it a no-op in the Atlantic driver. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-3/+0
drivers/net/ethernet/freescale/fec.h 7b15515fc1ca ("Revert "fec: Restart PPS after link state change"") 40c79ce13b03 ("net: fec: add stop mode support for imx8 platform") https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/ drivers/pinctrl/pinctrl-ocelot.c c297561bc98a ("pinctrl: ocelot: Fix interrupt controller") 181f604b33cd ("pinctrl: ocelot: add ability to be used in a non-mmio configuration") https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/ tools/testing/selftests/drivers/net/bonding/Makefile bbb774d921e2 ("net: Add tests for bonding and team address list management") 152e8ec77640 ("selftests/bonding: add a test for bonding lladdr target") https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/ drivers/net/can/usb/gs_usb.c 5440428b3da6 ("can: gs_usb: gs_can_open(): fix race dev->can.state condition") 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") https://lore.kernel.org/all/84f45a7d-92b6-4dc5-d7a1-072152fab6ff@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21net: atlantic: fix potential memory leak in aq_ndev_close()Jianglei Nie1-3/+0
If aq_nic_stop() fails, aq_ndev_close() returns err without calling aq_nic_deinit() to release the relevant memory and resource, which will lead to a memory leak. We can fix it by deleting the if condition judgment and goto statement to call aq_nic_deinit() directly after aq_nic_stop() to fix the memory leak. Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-31net: ethernet: move from strlcpy with unused retval to strscpyWolfram Sang1-1/+1
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2 Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5} Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-09net: atlantic: fix aq_vec index out of range errorChia-Lin Kao (AceLan)1-13/+8
The final update statement of the for loop exceeds the array range, the dereference of self->aq_vec[i] is not checked and then leads to the index out of range error. Also fixed this kind of coding style in other for loop. [ 97.937604] UBSAN: array-index-out-of-bounds in drivers/net/ethernet/aquantia/atlantic/aq_nic.c:1404:48 [ 97.937607] index 8 is out of range for type 'aq_vec_s *[8]' [ 97.937608] CPU: 38 PID: 3767 Comm: kworker/u256:18 Not tainted 5.19.0+ #2 [ 97.937610] Hardware name: Dell Inc. Precision 7865 Tower/, BIOS 1.0.0 06/12/2022 [ 97.937611] Workqueue: events_unbound async_run_entry_fn [ 97.937616] Call Trace: [ 97.937617] <TASK> [ 97.937619] dump_stack_lvl+0x49/0x63 [ 97.937624] dump_stack+0x10/0x16 [ 97.937626] ubsan_epilogue+0x9/0x3f [ 97.937627] __ubsan_handle_out_of_bounds.cold+0x44/0x49 [ 97.937629] ? __scm_send+0x348/0x440 [ 97.937632] ? aq_vec_stop+0x72/0x80 [atlantic] [ 97.937639] aq_nic_stop+0x1b6/0x1c0 [atlantic] [ 97.937644] aq_suspend_common+0x88/0x90 [atlantic] [ 97.937648] aq_pm_suspend_poweroff+0xe/0x20 [atlantic] [ 97.937653] pci_pm_suspend+0x7e/0x1a0 [ 97.937655] ? pci_pm_suspend_noirq+0x2b0/0x2b0 [ 97.937657] dpm_run_callback+0x54/0x190 [ 97.937660] __device_suspend+0x14c/0x4d0 [ 97.937661] async_suspend+0x23/0x70 [ 97.937663] async_run_entry_fn+0x33/0x120 [ 97.937664] process_one_work+0x21f/0x3f0 [ 97.937666] worker_thread+0x4a/0x3c0 [ 97.937668] ? process_one_work+0x3f0/0x3f0 [ 97.937669] kthread+0xf0/0x120 [ 97.937671] ? kthread_complete_and_exit+0x20/0x20 [ 97.937672] ret_from_fork+0x22/0x30 [ 97.937676] </TASK> v2. fixed "warning: variable 'aq_vec' set but not used" v3. simplified a for loop Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code") Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Acked-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Link: https://lore.kernel.org/r/20220808081845.42005-1-acelan.kao@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-15/+8
include/net/sock.h 310731e2f161 ("net: Fix data-races around sysctl_mem.") e70f3c701276 ("Revert "net: set SK_MEM_QUANTUM to 4096"") https://lore.kernel.org/all/20220711120211.7c8b7cba@canb.auug.org.au/ net/ipv4/fib_semantics.c 747c14307214 ("ip: fix dflt addr selection for connected nexthop") d62607c3fe45 ("net: rename reference+tracking helpers") net/tls/tls.h include/net/tls.h 3d8c51b25a23 ("net/tls: Check for errors in tls_device_init") 587903142308 ("tls: create an internal header") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-14net: atlantic: remove aq_nic_deinit() when resumeChia-Lin Kao (AceLan)1-3/+0
aq_nic_deinit() has been called while suspending, so we don't have to call it again on resume. Actually, call it again leads to another hang issue when resuming from S3. Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992345] Call Trace: Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992346] <TASK> Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992348] aq_nic_deinit+0xb4/0xd0 [atlantic] Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992356] aq_pm_thaw+0x7f/0x100 [atlantic] Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992362] pci_pm_resume+0x5c/0x90 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992366] ? pci_pm_thaw+0x80/0x80 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992368] dpm_run_callback+0x4e/0x120 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992371] device_resume+0xad/0x200 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992373] async_resume+0x1e/0x40 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992374] async_run_entry_fn+0x33/0x120 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992377] process_one_work+0x220/0x3c0 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992380] worker_thread+0x4d/0x3f0 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992382] ? process_one_work+0x3c0/0x3c0 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992384] kthread+0x12a/0x150 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992386] ? set_kthread_struct+0x40/0x40 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992387] ret_from_fork+0x22/0x30 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992391] </TASK> Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992392] ---[ end trace 1ec8c79604ed5e0d ]--- Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992394] PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992397] atlantic 0000:02:00.0: PM: failed to resume async: error -110 Fixes: 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression") Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Link: https://lore.kernel.org/r/20220713111224.1535938-2-acelan.kao@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-14net: atlantic: remove deep parameter on suspend/resume functionsChia-Lin Kao (AceLan)1-14/+10
Below commit claims that atlantic NIC requires to reset the device on pm op, and had set the deep to true for all suspend/resume functions. commit 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression") So, we could remove deep parameter on suspend/resume functions without any functional change. Fixes: 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression") Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Link: https://lore.kernel.org/r/20220713111224.1535938-1-acelan.kao@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-27net: atlantic:fix repeated words in commentsJilin Yuan1-2/+2
Delete the redundant word 'the'. Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com> Link: https://lore.kernel.org/r/20220625071558.3852-1-yuanjilin@cdjrlc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-11/+20
No conflicts. Build issue in drivers/net/ethernet/sfc/ptp.c 54fccfdd7c66 ("sfc: efx_default_channel_type APIs can be static") 49e6123c65da ("net: sfc: fix memory leak due to ptp channel") https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-11net: atlantic: verify hw_head_ lies within TX buffer ringGrant Grundler1-0/+7
Bounds check hw_head index provided by NIC to verify it lies within the TX buffer ring. Reported-by: Aashay Shringarpure <aashay@google.com> Reported-by: Yi Chou <yich@google.com> Reported-by: Shervin Oloumi <enlightened@google.com> Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-11net: atlantic: add check for MAX_SKB_FRAGSGrant Grundler1-1/+5
Enforce that the CPU can not get stuck in an infinite loop. Reported-by: Aashay Shringarpure <aashay@google.com> Reported-by: Yi Chou <yich@google.com> Reported-by: Shervin Oloumi <enlightened@google.com> Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-11net: atlantic: reduce scope of is_rsc_completeGrant Grundler1-7/+6
Don't defer handling the err case outside the loop. That's pointless. And since is_rsc_complete is only used inside this loop, declare it inside the loop to reduce it's scope. Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-11net: atlantic: fix "frag[0] not initialized"Grant Grundler1-2/+1
In aq_ring_rx_clean(), if buff->is_eop is not set AND buff->len < AQ_CFG_RX_HDR_SIZE, then hdr_len remains equal to buff->len and skb_add_rx_frag(xxx, *0*, ...) is not called. The loop following this code starts calling skb_add_rx_frag() starting with i=1 and thus frag[0] is never initialized. Since i is initialized to zero at the top of the primary loop, we can just reference and post-increment i instead of hardcoding the 0 when calling skb_add_rx_frag() the first time. Reported-by: Aashay Shringarpure <aashay@google.com> Reported-by: Yi Chou <yich@google.com> Reported-by: Shervin Oloumi <enlightened@google.com> Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-10net: atlantic: always deep reset on pm op, fixing up my null deref regressionManuel Ullmann1-2/+2
The impact of this regression is the same for resume that I saw on thaw: the kernel hangs and nothing except SysRq rebooting can be done. Fixes regression in commit cbe6c3a8f8f4 ("net: atlantic: invert deep par in pm functions, preventing null derefs"), where I disabled deep pm resets in suspend and resume, trying to make sense of the atl_resume_common() deep parameter in the first place. It turns out, that atlantic always has to deep reset on pm operations. Even though I expected that and tested resume, I screwed up by kexec-rebooting into an unpatched kernel, thus missing the breakage. This fixup obsoletes the deep parameter of atl_resume_common, but I leave the cleanup for the maintainers to post to mainline. Suspend and hibernation were successfully tested by the reporters. Fixes: cbe6c3a8f8f4 ("net: atlantic: invert deep par in pm functions, preventing null derefs") Link: https://lore.kernel.org/regressions/9-Ehc_xXSwdXcvZqKD5aSqsqeNj5Izco4MYEwnx5cySXVEc9-x_WC4C3kAoCqNTi-H38frroUK17iobNVnkLtW36V6VWGSQEOHXhmVMm5iQ=@protonmail.com/ Reported-by: Jordan Leppert <jordanleppert@protonmail.com> Reported-by: Holger Hoffstaette <holger@applied-asynchrony.com> Tested-by: Jordan Leppert <jordanleppert@protonmail.com> Tested-by: Holger Hoffstaette <holger@applied-asynchrony.com> CC: <stable@vger.kernel.org> # 5.10+ Signed-off-by: Manuel Ullmann <labre@posteo.de> Link: https://lore.kernel.org/r/87bkw8dfmp.fsf@posteo.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-29eth: atlantic: remove a copy of the NAPI_POLL_WEIGHT defineJakub Kicinski3-4/+2
Defining local versions of NAPI_POLL_WEIGHT with the same values in the drivers just makes refactoring harder. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni1-4/+4
drivers/net/ethernet/microchip/lan966x/lan966x_main.c d08ed852560e ("net: lan966x: Make sure to release ptp interrupt") c8349639324a ("net: lan966x: Add FDMA functionality") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-20net: atlantic: Implement .ndo_xdp_xmit handlerTaehee Yoo3-0/+26
aq_xdp_xmit() is the callback function of .ndo_xdp_xmit. It internally calls aq_nic_xmit_xdpf() to send packet. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20net: atlantic: Implement xdp data planeTaehee Yoo6-9/+468
It supports XDP_PASS, XDP_DROP and multi buffer. The new function aq_nic_xmit_xdpf() is used to send packet with xdp_frame and internally it calls aq_nic_map_xdp(). AQC chip supports 32 multi-queues and 8 vectors(irq). there are two option 1. under 8 cores and 4 tx queues per core. 2. under 4 cores and 8 tx queues per core. Like ixgbe, these tx queues can be used only for XDP_TX, XDP_REDIRECT queue. If so, no tx_lock is needed. But this patchset doesn't use this strategy because getting hardware tx queue index cost is too high. So, tx_lock is used in the aq_nic_xmit_xdpf(). single-core, single queue, 80% cpu utilization. 30.75% bpf_prog_xxx_xdp_prog_tx [k] bpf_prog_xxx_xdp_prog_tx 10.35% [kernel] [k] aq_hw_read_reg <---------- here 4.38% [kernel] [k] get_page_from_freelist single-core, 8 queues, 100% cpu utilization, half PPS. 45.56% [kernel] [k] aq_hw_read_reg <---------- here 17.58% bpf_prog_xxx_xdp_prog_tx [k] bpf_prog_xxx_xdp_prog_tx 4.72% [kernel] [k] hw_atl_b0_hw_ring_rx_receive The new function __aq_ring_xdp_clean() is a xdp rx handler and this is called only when XDP is attached. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20net: atlantic: Implement xdp control planeTaehee Yoo10-36/+176
aq_xdp() is a xdp setup callback function for Atlantic driver. When XDP is attached or detached, the device will be restarted because it uses different headroom, tailroom, and page order value. If XDP enabled, it switches default page order value from 0 to 2. Because the default maximum frame size is still 2K and it needs additional area for headroom and tailroom. The total size(headroom + frame size + tailroom) is 2624. So, 1472Bytes will be always wasted for every frame. But when order-2 is used, these pages can be used 6 times with flip strategy. It means only about 106Bytes per frame will be wasted. Also, It supports xdp fragment feature. MTU can be 16K if xdp prog supports xdp fragment. If not, MTU can not exceed 2K - ETH_HLEN - ETH_FCS. And a static key is added and It will be used to call the xdp_clean handler in ->poll(). data plane implementation will be contained the followed patch. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18net: atlantic: invert deep par in pm functions, preventing null derefsManuel Ullmann1-4/+4
This will reset deeply on freeze and thaw instead of suspend and resume and prevent null pointer dereferences of the uninitialized ring 0 buffer while thawing. The impact is an indefinitely hanging kernel. You can't switch consoles after this and the only possible user interaction is SysRq. BUG: kernel NULL pointer dereference RIP: 0010:aq_ring_rx_fill+0xcf/0x210 [atlantic] aq_vec_init+0x85/0xe0 [atlantic] aq_nic_init+0xf7/0x1d0 [atlantic] atl_resume_common+0x4f/0x100 [atlantic] pci_pm_thaw+0x42/0xa0 resolves in aq_ring.o to ``` 0000000000000ae0 <aq_ring_rx_fill>: { /* ... */ baf: 48 8b 43 08 mov 0x8(%rbx),%rax buff->flags = 0U; /* buff is NULL */ ``` The bug has been present since the introduction of the new pm code in 8aaa112a57c1 ("net: atlantic: refactoring pm logic") and was hidden until 8ce84271697a ("net: atlantic: changes for multi-TC support"), which refactored the aq_vec_{free,alloc} functions into aq_vec_{,ring}_{free,alloc}, but is technically not wrong. The original functions just always reinitialized the buffers on S3/S4. If the interface is down before freezing, the bug does not occur. It does not matter, whether the initrd contains and loads the module before thawing. So the fix is to invert the boolean parameter deep in all pm function calls, which was clearly intended to be set like that. First report was on Github [1], which you have to guess from the resume logs in the posted dmesg snippet. Recently I posted one on Bugzilla [2], since I did not have an AQC device so far. #regzbot introduced: 8ce84271697a #regzbot from: koo5 <kolman.jindrich@gmail.com> #regzbot monitor: https://github.com/Aquantia/AQtion/issues/32 Fixes: 8aaa112a57c1 ("net: atlantic: refactoring pm logic") Link: https://github.com/Aquantia/AQtion/issues/32 [1] Link: https://bugzilla.kernel.org/show_bug.cgi?id=215798 [2] Cc: stable@vger.kernel.org Reported-by: koo5 <kolman.jindrich@gmail.com> Signed-off-by: Manuel Ullmann <labre@posteo.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08net: atlantic: Avoid out-of-bounds indexingKai-Heng Feng2-16/+16
UBSAN warnings are observed on atlantic driver: [ 294.432996] UBSAN: array-index-out-of-bounds in /build/linux-Qow4fL/linux-5.15.0/drivers/net/ethernet/aquantia/atlantic/aq_nic.c:484:48 [ 294.433695] index 8 is out of range for type 'aq_vec_s *[8]' The ring is dereferenced right before breaking out the loop, to prevent that from happening, only use the index in the loop to fix the issue. BugLink: https://bugs.launchpad.net/bugs/1958770 Tested-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Link: https://lore.kernel.org/r/20220408022204.16815-1-kai.heng.feng@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-24net: atlantic: Use the bitmap API instead of hand-writing itChristophe JAILLET1-4/+2
Simplify code by using bitmap_weight() and bitmap_zero() instead of hand-writing these functions. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+8
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c commit 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") commit 31108d142f36 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") commit 4390c6edc0fb ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/ net/smc/smc_wr.c commit 49dc9013e34b ("net/smc: Use the bitmap API when applicable") commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock") bitmap_zero()/memset() is removed by the fix Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-27atlantic: Fix buff_ring OOB in aq_ring_rx_cleanZekun Shen1-0/+8
The function obtain the next buffer without boundary check. We should return with I/O error code. The bug is found by fuzzing and the crash report is attached. It is an OOB bug although reported as use-after-free. [ 4.804724] BUG: KASAN: use-after-free in aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.805661] Read of size 4 at addr ffff888034fe93a8 by task ksoftirqd/0/9 [ 4.806505] [ 4.806703] CPU: 0 PID: 9 Comm: ksoftirqd/0 Tainted: G W 5.6.0 #34 [ 4.809030] Call Trace: [ 4.809343] dump_stack+0x76/0xa0 [ 4.809755] print_address_description.constprop.0+0x16/0x200 [ 4.810455] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.811234] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.813183] __kasan_report.cold+0x37/0x7c [ 4.813715] ? aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.814393] kasan_report+0xe/0x20 [ 4.814837] aq_ring_rx_clean+0x1e88/0x2730 [atlantic] [ 4.815499] ? hw_atl_b0_hw_ring_rx_receive+0x9a5/0xb90 [atlantic] [ 4.816290] aq_vec_poll+0x179/0x5d0 [atlantic] [ 4.816870] ? _GLOBAL__sub_I_65535_1_aq_pci_func_init+0x20/0x20 [atlantic] [ 4.817746] ? __next_timer_interrupt+0xba/0xf0 [ 4.818322] net_rx_action+0x363/0xbd0 [ 4.818803] ? call_timer_fn+0x240/0x240 [ 4.819302] ? __switch_to_asm+0x40/0x70 [ 4.819809] ? napi_busy_loop+0x520/0x520 [ 4.820324] __do_softirq+0x18c/0x634 [ 4.820797] ? takeover_tasklets+0x5f0/0x5f0 [ 4.821343] run_ksoftirqd+0x15/0x20 [ 4.821804] smpboot_thread_fn+0x2f1/0x6b0 [ 4.822331] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.823041] ? __kthread_parkme+0x80/0x100 [ 4.823571] ? smpboot_unregister_percpu_thread+0x160/0x160 [ 4.824301] kthread+0x2b5/0x3b0 [ 4.824723] ? kthread_create_on_node+0xd0/0xd0 [ 4.825304] ret_from_fork+0x35/0x40 Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEXHangbin Liu1-3/+0
Since commit 94dd016ae538 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP ioctl to active device") the user could get bond active interface's PHC index directly. But when there is a failover, the bond active interface will change, thus the PHC index is also changed. This may break the user's program if they did not update the PHC timely. This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX. When the user wants to get the bond active interface's PHC, they need to add this flag and be aware the PHC index may be changed. With the new flag. All flag checks in current drivers are removed. Only the checking in net_hwtstamp_validate() is kept. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski11-64/+197
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-02ethernet: aquantia: Try MAC address from device treeTianhao Chai1-10/+14
Apple M1 Mac minis (2020) with 10GE NICs do not have MAC address in the card, but instead need to obtain MAC addresses from the device tree. In this case the hardware will report an invalid MAC. Currently atlantic driver does not query the DT for MAC address and will randomly assign a MAC if the NIC doesn't have a permanent MAC burnt in. This patch causes the driver to perfer a valid MAC address from OF (if present) over HW self-reported MAC and only fall back to a random MAC address when neither of them is valid. Signed-off-by: Tianhao Chai <cth451@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlantic: Remove warn trace message.Sameer Saurabh1-3/+0
Remove the warn trace message - it's not a correct check here, because the function can still be called on the device in DOWN state Fixes: 508f2e3dce454 ("net: atlantic: split rx and tx per-queue stats") Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlantic: Fix statistics logic for production hardwareDmitry Bogdanov5-26/+138
B0 is the main and widespread device revision of atlantic2 HW. In the current state, driver will incorrectly fetch the statistics for this revision. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Dmitry Bogdanov <dbezrukov@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29Remove Half duplex mode speed capabilities.Sameer Saurabh1-4/+1
Since Half Duplex mode has been deprecated by the firmware, driver should not advertise Half Duplex speed in ethtool support link speed values. Fixes: 071a02046c262 ("net: atlantic: A2: half duplex support") Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlantic: Add missing DIDs and fix 115c.Nikita Danilov4-1/+27
At the late production stages new dev ids were introduced. These are now in production, so its important for the driver to recognize these. And also fix the board caps for AQC115C adapter. Fixes: b3f0c79cba206 ("net: atlantic: A2 hw_ops skeleton") Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlantic: Fix to display FW bundle version instead of FW mac version.Sameer Saurabh1-3/+3
The correct way to reflect firmware version is to use bundle version. Hence populating the same instead of MAC fw version. Fixes: c1be0bf092bd2 ("net: atlantic: common functions needed for basic A2 init/deinit hw_ops") Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlatnic: enable Nbase-t speeds with base-tNikita Danilov3-19/+13
When 2.5G is advertised, N-Base should be advertised against the T-base caps. N5G is out of use in baseline code and driver should treat both 5G and N5G (and also 2.5G and N2.5G) equally from user perspective. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-29atlantic: Increase delay for fw transactionsDmitry Bogdanov1-2/+5
The max waiting period (of 1 ms) while reading the data from FW shared buffer is too small for certain types of data (e.g., stats). There's a chance that FW could be updating buffer at the same time and driver would be unsuccessful in reading data. Firmware manual recommends to have 1 sec timeout to fix this issue. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Dmitry Bogdanov <dbezrukov@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+2
drivers/net/ipa/ipa_main.c 8afc7e471ad3 ("net: ipa: separate disabling setup from modem stop") 76b5fbcd6b47 ("net: ipa: kill ipa_modem_init()") Duplicated include, drop one. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-22ethtool: extend ringparam setting/getting API with rx_buf_lenHao Chen1-2/+6
Add two new parameters kernel_ringparam and extack for .get_ringparam and .set_ringparam to extend more ring params through netlink. Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-19atlantic: fix double-free in aq_ring_tx_cleanZekun Shen1-1/+2
We found this bug while fuzzing the device driver. Using and freeing the dangling pointer buff->skb would cause use-after-free and double-free. This bug is triggerable with compromised/malfunctioning devices. We found the bug with QEMU emulation and tested the patch by emulation. We did NOT test on a real device. Attached is the bug report. BUG: KASAN: double-free or invalid-free in consume_skb+0x6c/0x1c0 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? consume_skb+0x6c/0x1c0 kasan_report_invalid_free+0x61/0xa0 ? consume_skb+0x6c/0x1c0 __kasan_slab_free+0x15e/0x170 ? consume_skb+0x6c/0x1c0 kfree+0x8c/0x230 consume_skb+0x6c/0x1c0 aq_ring_tx_clean+0x5c2/0xa80 [atlantic] aq_vec_poll+0x309/0x5d0 [atlantic] ? _sub_I_65535_1+0x20/0x20 [atlantic] ? __next_timer_interrupt+0xba/0xf0 net_rx_action+0x363/0xbd0 ? call_timer_fn+0x240/0x240 ? __switch_to_asm+0x34/0x70 ? napi_busy_loop+0x520/0x520 ? net_tx_action+0x379/0x720 __do_softirq+0x18c/0x634 ? takeover_tasklets+0x5f0/0x5f0 run_ksoftirqd+0x15/0x20 smpboot_thread_fn+0x2f1/0x6b0 ? smpboot_unregister_percpu_thread+0x160/0x160 ? __kthread_parkme+0x80/0x100 ? smpboot_unregister_percpu_thread+0x160/0x160 kthread+0x2b5/0x3b0 ? kthread_create_on_node+0xd0/0xd0 ret_from_fork+0x22/0x40 Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu> Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-15atlantic: Fix OOB read and write in hw_atl_utils_fw_rpc_waitZekun Shen1-0/+10
This bug report shows up when running our research tools. The reports is SOOB read, but it seems SOOB write is also possible a few lines below. In details, fw.len and sw.len are inputs coming from io. A len over the size of self->rpc triggers SOOB. The patch fixes the bugs by adding sanity checks. The bugs are triggerable with compromised/malfunctioning devices. They are potentially exploitable given they first leak up to 0xffff bytes and able to overwrite the region later. The patch is tested with QEMU emulater. This is NOT tested with a real device. Attached is the log we found by fuzzing. BUG: KASAN: slab-out-of-bounds in hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] Read of size 4 at addr ffff888016260b08 by task modprobe/213 CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] __kasan_report.cold+0x37/0x7c ? aq_hw_read_reg_bit+0x60/0x70 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] kasan_report+0xe/0x20 hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic] hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic] hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic] ? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic] ? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic] hw_atl_utils_initfw+0x12a/0x1c8 [atlantic] aq_nic_ndev_register+0x88/0x650 [atlantic] ? aq_nic_ndev_init+0x235/0x3c0 [atlantic] aq_pci_probe+0x731/0x9b0 [atlantic] ? aq_pci_func_init+0xc0/0xc0 [atlantic] local_pci_probe+0xd3/0x160 pci_device_probe+0x23f/0x3e0 Reported-by: Brendan Dolan-Gavitt <brendandg@nyu.edu> Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16ethernet: aquantia: use eth_hw_addr_set()Jakub Kicinski1-2/+4
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Use an array on the stack, then call eth_hw_addr_set(). eth_hw_addr_set() is after error checking, this should be fine, error propagates all the way to failing probe. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-14ethernet: constify references to netdev->dev_addr in driversJakub Kicinski8-14/+14
This big patch sprinkles const on local variables and function arguments which may refer to netdev->dev_addr. Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Some of the changes here are not strictly required - const is sometimes cast off but pointer is not used for writing. It seems like it's still better to add the const in case the code changes later or relevant -W flags get enabled for the build. No functional changes. Link: https://lore.kernel.org/r/20211014142432.449314-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-02ethernet: use eth_hw_addr_set() instead of ether_addr_copy()Jakub Kicinski1-1/+1
Convert Ethernet from ether_addr_copy() to eth_hw_addr_set(): @@ expression dev, np; @@ - ether_addr_copy(dev->dev_addr, np) + eth_hw_addr_set(dev, np) Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-23atlantic: Fix issue in the pm resume flow.Sudarsana Reddy Kalluru1-2/+2
After fixing hibernation resume flow, another usecase was found which should be explicitly handled - resume when device is in "down" state. Invoke aq_nic_init jointly with aq_nic_start only if ndev was already up during suspend/hibernate. We still need to perform nic_deinit() if caller requests for it, to handle the freeze/resume scenarios. Fixes: 57f780f1c433 ("atlantic: Fix driver resume flow.") Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+3
include/linux/netdevice.h net/socket.c d0efb16294d1 ("net: don't unconditionally copy_from_user a struct ifreq for socket ioctls") 876f0bf9d0d5 ("net: socket: simplify dev_ifconf handling") 29c4964822aa ("net: socket: rework compat_ifreq_ioctl()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-29atlantic: Fix driver resume flow.Sudarsana Reddy Kalluru1-0/+3
Driver crashes when restoring from the Hibernate. In the resume flow, driver need to clean up the older nic/vec objects and re-initialize them. Fixes: 8aaa112a57c1d ("net: atlantic: refactoring pm logic") Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24ethtool: extend coalesce setting uAPI with CQE modeYufeng Mo1-2/+6
In order to support more coalesce parameters through netlink, add two new parameter kernel_coal and extack for .set_coalesce and .get_coalesce, then some extra info can return to user with the netlink API. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>