aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-12-02Merge tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linuxLinus Torvalds1-9/+10
Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers: "This is a core arm64 change. However, I was asked to take this because most uses of kernel-mode FPSIMD are in crypto or CRC code. In v6.8, the size of task_struct on arm64 increased by 528 bytes due to the new 'kernel_fpsimd_state' field. This field was added to allow kernel-mode FPSIMD code to be preempted. Unfortunately, 528 bytes is kind of a lot for task_struct. This regression in the task_struct size was noticed and reported. Recover that space by making this state be allocated on the stack at the beginning of each kernel-mode FPSIMD section. To make it easier for all the users of kernel-mode FPSIMD to do that correctly, introduce and use a 'scoped_ksimd' abstraction" * tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits) lib/crypto: arm64: Move remaining algorithms to scoped ksimd API lib/crypto: arm/blake2b: Move to scoped ksimd API arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack arm64/fpu: Enforce task-context only for generic kernel mode FPU net/mlx5: Switch to more abstract scoped ksimd guard API on arm64 arm64/xorblocks: Switch to 'ksimd' scoped guard API crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API crypto/arm64: polyval - Switch to 'ksimd' scoped guard API crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API raid6: Move to more abstract 'ksimd' guard API crypto: aegis128-neon - Move to more abstract 'ksimd' guard API crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API ...
2025-12-02Merge tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linuxLinus Torvalds1-4/+4
Pull 'at_least' array size update from Eric Biggers: "C supports lower bounds on the sizes of array parameters, using the static keyword as follows: 'void f(int a[static 32]);'. This allows the compiler to warn about a too-small array being passed. As discussed, this reuse of the 'static' keyword, while standard, is a bit obscure. Therefore, add an alias 'at_least' to compiler_types.h. Then, add this 'at_least' annotation to the array parameters of various crypto library functions" * tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: sha2: Add at_least decoration to fixed-size array params lib/crypto: sha1: Add at_least decoration to fixed-size array params lib/crypto: poly1305: Add at_least decoration to fixed-size array params lib/crypto: md5: Add at_least decoration to fixed-size array params lib/crypto: curve25519: Add at_least decoration to fixed-size array params lib/crypto: chacha: Add at_least decoration to fixed-size array params lib/crypto: chacha20poly1305: Statically check fixed array lengths compiler_types: introduce at_least parameter decoration pseudo keyword wifi: iwlwifi: trans: rename at_least variable to min_mode
2025-12-02Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linuxLinus Torvalds2-25/+25
Pull crypto library updates from Eric Biggers: "This is the main crypto library pull request for 6.19. It includes: - Add SHA-3 support to lib/crypto/, including support for both the hash functions and the extendable-output functions. Reimplement the existing SHA-3 crypto_shash support on top of the library. This is motivated mainly by the upcoming support for the ML-DSA signature algorithm, which needs the SHAKE128 and SHAKE256 functions. But even on its own it's a useful cleanup. This also fixes the longstanding issue where the architecture-optimized SHA-3 code was disabled by default. - Add BLAKE2b support to lib/crypto/, and reimplement the existing BLAKE2b crypto_shash support on top of the library. This is motivated mainly by btrfs, which supports BLAKE2b checksums. With this change, all btrfs checksum algorithms now have library APIs. btrfs is planned to start just using the library directly. This refactor also improves consistency between the BLAKE2b code and BLAKE2s code. And as usual, it also fixes the issue where the architecture-optimized BLAKE2b code was disabled by default. - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL support in crypto_shash. Reimplement HCTR2 on top of the library. This simplifies the code and improves HCTR2 performance. As usual, it also makes the architecture-optimized code be enabled by default. The generic implementation of POLYVAL is greatly improved as well. - Clean up the BLAKE2s code - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3" * tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits) fscrypt: Drop obsolete recommendation to enable optimized POLYVAL crypto: polyval - Remove the polyval crypto_shash crypto: hctr2 - Convert to use POLYVAL library lib/crypto: x86/polyval: Migrate optimized code into library lib/crypto: arm64/polyval: Migrate optimized code into library lib/crypto: polyval: Add POLYVAL library crypto: polyval - Rename conflicting functions lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value lib/crypto: x86/blake2s: Improve readability lib/crypto: x86/blake2s: Use local labels for data lib/crypto: x86/blake2s: Drop check for nblocks == 0 lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit lib/crypto: arm, arm64: Drop filenames from file comments lib/crypto: arm/blake2s: Fix some comments crypto: s390/sha3 - Remove superseded SHA-3 code crypto: sha3 - Reimplement using library API crypto: jitterentropy - Use default sha3 implementation lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions lib/crypto: sha3: Support arch overrides of one-shot digest functions ...
2025-11-27net: fec: do not register PPS event for PEROUTWei Fang1-2/+5
There are currently two situations that can trigger the PTP interrupt, one is the PPS event, the other is the PEROUT event. However, the irq handler fec_pps_interrupt() does not check the irq event type and directly registers a PPS event into the system, but the event may be a PEROUT event. This is incorrect because PEROUT is an output signal, while PPS is the input of the kernel PPS system. Therefore, add a check for the event type, if pps_enable is true, it means that the current event is a PPS event, and then the PPS event is registered. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-5-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: do not allow enabling PPS and PEROUT simultaneouslyWei Fang1-0/+12
In the current driver, PPS and PEROUT use the same channel to generate the events, so they cannot be enabled at the same time. Otherwise, the later configuration will overwrite the earlier configuration. Therefore, when configuring PPS, the driver will check whether PEROUT is enabled. Similarly, when configuring PEROUT, the driver will check whether PPS is enabled. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-4-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: do not update PEROUT if it is enabledWei Fang2-10/+34
If the previously set PEROUT is already active, updating it will cause the new PEROUT to start immediately instead of at the specified time. This is because fep->reload_period is updated whithout check whether the PEROUT is enabled, and the old PEROUT is not disabled. Therefore, the pulse period will be updated immediately in the pulse interrupt handler fec_pps_interrupt(). Currently, the driver does not support directly updating PEROUT and it will make the logic be more complicated. To fix the current issue, add a check before enabling the PEROUT, the driver will return an error if PEROUT is enabled. If users wants to update a new PEROUT, they should disable the old PEROUT first. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-3-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: cancel perout_timer when PEROUT is disabledWei Fang1-0/+2
The PEROUT allows the user to set a specified future time to output the periodic signal. If the future time is far from the current time, the FEC driver will use hrtimer to configure PEROUT one second before the future time. However, the hrtimer will not be canceled if the PEROUT is disabled before the hrtimer expires. So the PEROUT will be configured when the hrtimer expires, which is not as expected. Therefore, cancel the hrtimer in fec_ptp_pps_disable() to fix this issue. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-2-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-26Merge tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-canJakub Kicinski5-41/+124
Marc Kleine-Budde says: ==================== pull-request: can 2025-11-26 this is a pull request of 8 patches for net/main. Seungjin Bae provides a patch for the kvaser_usb driver to fix a potential infinite loop in the USB data stream command parser. Thomas Mühlbacher's patch for the sja1000 driver IRQ handler's max loop handling, that might lead to unhandled interrupts. 3 patches by me for the gs_usb driver fix handling of failed transmit URBs and add checking of the actual length of received URBs before accessing the data. The next patch is by me and is a port of Thomas Mühlbacher's patch (fix IRQ handler's max loop handling, that might lead to unhandled interrupts.) to the sun4i_can driver. Biju Das provides a patch for the rcar_canfd driver to fix the CAN-FD mode setting. The last patch is by Shaurya Rane for the em_canid filter to ensure that the complete CAN frame is present in the linear data buffer before accessing it. * tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: net/sched: em_canid: fix uninit-value in em_canid_match can: rcar_canfd: Fix CAN-FD mode as default can: sun4i_can: sun4i_can_interrupt(): fix max irq loop handling can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing data can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing header can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBs can: sja1000: fix max irq loop handling can: kvaser_usb: leaf: Fix potential infinite loop in command parsers ==================== Link: https://patch.msgid.link/20251126155713.217105-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: atlantic: fix fragment overflow handling in RX pathJiefeng Zhang1-0/+5
The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17) fragments when handling large multi-descriptor packets. This causes an out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic. The issue occurs because the driver doesn't check the total number of fragments before calling skb_add_rx_frag(). When a packet requires more than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds. Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE, then all fragments are accounted for. And reusing the existing check to prevent the overflow earlier in the code path. This crash occurred in production with an Aquantia AQC113 10G NIC. Stack trace from production environment: ``` RIP: 0010:skb_add_rx_frag_netmem+0x29/0xd0 Code: 90 f3 0f 1e fa 0f 1f 44 00 00 48 89 f8 41 89 ca 48 89 d7 48 63 ce 8b 90 c0 00 00 00 48 c1 e1 04 48 01 ca 48 03 90 c8 00 00 00 <48> 89 7a 30 44 89 52 3c 44 89 42 38 40 f6 c7 01 75 74 48 89 fa 83 RSP: 0018:ffffa9bec02a8d50 EFLAGS: 00010287 RAX: ffff925b22e80a00 RBX: ffff925ad38d2700 RCX: fffffffe0a0c8000 RDX: ffff9258ea95bac0 RSI: ffff925ae0a0c800 RDI: 0000000000037a40 RBP: 0000000000000024 R08: 0000000000000000 R09: 0000000000000021 R10: 0000000000000848 R11: 0000000000000000 R12: ffffa9bec02a8e24 R13: ffff925ad8615570 R14: 0000000000000000 R15: ffff925b22e80a00 FS: 0000000000000000(0000) GS:ffff925e47880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9258ea95baf0 CR3: 0000000166022004 CR4: 0000000000f72ef0 PKRU: 55555554 Call Trace: <IRQ> aq_ring_rx_clean+0x175/0xe60 [atlantic] ? aq_ring_rx_clean+0x14d/0xe60 [atlantic] ? aq_ring_tx_clean+0xdf/0x190 [atlantic] ? kmem_cache_free+0x348/0x450 ? aq_vec_poll+0x81/0x1d0 [atlantic] ? __napi_poll+0x28/0x1c0 ? net_rx_action+0x337/0x420 ``` Fixes: 6aecbba12b5c ("net: atlantic: add check for MAX_SKB_FRAGS") Changes in v4: - Add Fixes: tag to satisfy patch validation requirements. Changes in v3: - Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE, then all fragments are accounted for. Signed-off-by: Jiefeng Zhang <jiefeng.z.zhang@gmail.com> Link: https://patch.msgid.link/20251126032249.69358-1-jiefeng.z.zhang@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26virtio-net: avoid unnecessary checksum calculation on guest RXJon Kohler2-2/+3
Commit a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") inadvertently altered checksum offload behavior for guests not using UDP GSO tunneling. Before, tun_put_user called tun_vnet_hdr_from_skb, which passed has_data_valid = true to virtio_net_hdr_from_skb. After, tun_put_user began calling tun_vnet_hdr_tnl_from_skb instead, which passes has_data_valid = false into both call sites. This caused virtio hdr flags to not include VIRTIO_NET_HDR_F_DATA_VALID for SKBs where skb->ip_summed == CHECKSUM_UNNECESSARY. As a result, guests are forced to recalculate checksums unnecessarily. Restore the previous behavior by ensuring has_data_valid = true is passed in the !tnl_gso_type case, but only from tun side, as virtio_net_hdr_tnl_from_skb() is used also by the virtio_net driver, which in turn must not use VIRTIO_NET_HDR_F_DATA_VALID on tx. cc: stable@vger.kernel.org Fixes: a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") Signed-off-by: Jon Kohler <jon@nutanix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20251125222754.1737443-1-jon@nutanix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26eth: fbnic: Fix counter roll-over issueMohsin Bashir1-1/+1
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs() when calculating descriptor slots. The issue occurs when head - tail results in a large positive value (unsigned) and the compiler interprets head - tail - 1 as a signed value. Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation, which is a common way of avoiding this problem when dealing with these sort of ring space calculations. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20251125211704.3222413-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing trafficVladimir Oltean1-7/+0
When using the SGMII PCS as a fixed-link chip-to-chip connection, it is easy to miss the fact that traffic passes only at 1G, since that's what any normal such connection would use. When using the SGMII PCS connected towards an on-board PHY or an SFP module, it is immediately noticeable that when the link resolves to a speed other than 1G, traffic from the MAC fails to pass: TX counters increase, but nothing gets decoded by the other end, and no local RX counters increase either. Artificially lowering a fixed-link rate to speed = <100> makes us able to see the same issue as in the case of having an SGMII PHY. Some debugging shows that the XPCS configuration is A-OK, but that the MAC Configuration Table entry for the port has the SPEED bits still set to 1000Mbps, due to a special condition in the driver. Deleting that condition, and letting the resolved link speed be programmed directly into the MAC speed field, results in a functional link at all 3 speeds. This piece of evidence, based on testing on both generations with SGMII support (SJA1105S and SJA1110A) directly contradicts the statement from the blamed commit that "the MAC is fixed at 1 Gbps and we need to configure the PCS only (if even that)". Worse, that statement is not backed by any documentation, and no one from NXP knows what it might refer to. I am unable to recall sufficient context regarding my testing from March 2020 to understand what led me to draw such a braindead and factually incorrect conclusion. Yet, there is nothing of value regarding forcing the MAC speed, either for SGMII or 2500Base-X (introduced at a later stage), so remove all such logic. Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122111324.136761-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: wwan: mhi: Keep modem name match with Foxconn T99W640Slark Xiao1-1/+1
Correct it since M.2 device T99W640 has updated from T99W515. We need to align it with MHI side otherwise this modem can't get the network. Fixes: ae5a34264354 ("bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640") Signed-off-by: Slark Xiao <slark_xiao@163.com> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20251125070900.33324-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26can: rcar_canfd: Fix CAN-FD mode as defaultBiju Das1-22/+31
The commit 5cff263606a1 ("can: rcar_canfd: Fix controller mode setting") has aligned with the flow mentioned in the hardware manual for all SoCs except R-Car Gen3 and RZ/G2L SoCs. On R-Car Gen4 and RZ/G3E SoCs, due to the wrong logic in the commit[1] sets the default mode to FD-Only mode instead of CAN-FD mode. This patch sets the CAN-FD mode as the default for all SoCs by dropping the rcar_canfd_set_mode() as some SoC requires mode setting in global reset mode, and the rest of the SoCs in channel reset mode and update the rcar_canfd_reset_controller() to take care of these constraints. Moreover, the RZ/G3E and R-Car Gen4 SoCs support 3 modes compared to 2 modes on the R-Car Gen3. Use inverted logic in rcar_canfd_reset_controller() to simplify the code later to support FD-only mode. [1] commit 45721c406dcf ("can: rcar_canfd: Add support for r8a779a0 SoC") Fixes: 5cff263606a1 ("can: rcar_canfd: Fix controller mode setting") Cc: stable@vger.kernel.org Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://patch.msgid.link/20251118123926.193445-1-biju.das.jz@bp.renesas.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-25r8169: fix RTL8127 hang on suspend/shutdownHeiner Kallweit1-5/+14
There have been reports that RTL8127 hangs on suspend and shutdown, partially disappearing from lspci until power-cycling. According to Realtek disabling PLL's when switching to D3 should be avoided on that chip version. Fix this by aligning disabling PLL's with the vendor drivers, what in addition results in PLL's not being disabled when switching to D3hot on other chip versions. Fixes: f24f7b2f3af9 ("r8169: add support for RTL8127A") Tested-by: Fabio Baltieri <fabio.baltieri@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/d7faae7e-66bc-404a-a432-3a496600575f@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25net: sxgbe: fix potential NULL dereference in sxgbe_rx()Alexey Kodanev1-1/+3
Currently, when skb is null, the driver prints an error and then dereferences skb on the next line. To fix this, let's add a 'break' after the error message to switch to sxgbe_rx_refill(), which is similar to the approach taken by the other drivers in this particular case, e.g. calxeda with xgmac_rx(). Found during a code review. Fixes: 1edb9ca69e8a ("net: sxgbe: add basic framework for Samsung 10Gb ethernet driver") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251121123834.97748-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25team: Move team device type change at the end of team_port_addNikola Z. Ivanov1-8/+15
Attempting to add a port device that is already up will expectedly fail, but not before modifying the team device header_ops. In the case of the syzbot reproducer the gre0 device is already in state UP when it attempts to add it as a port device of team0, this fails but before that header_ops->create of team0 is changed from eth_header to ipgre_header in the call to team_dev_type_check_change. Later when we end up in ipgre_header() struct ip_tunnel* points to nonsense as the private data of the device still holds a struct team. Example sequence of iproute2 commands to reproduce the hang/BUG(): ip link add dev team0 type team ip link add dev gre0 type gre ip link set dev gre0 up ip link set dev gre0 master team0 ip link set dev team0 up ping -I team0 1.1.1.1 Move team_dev_type_check_change down where all other checks have passed as it changes the dev type with no way to restore it in case one of the checks that follow it fail. Also make sure to preserve the origial mtu assignment: - If port_dev is not the same type as dev, dev takes mtu from port_dev - If port_dev is the same type as dev, port_dev takes mtu from dev This is done by adding a conditional before the call to dev_set_mtu to prevent it from assigning port_dev->mtu = dev->mtu and instead letting team_dev_type_check_change assign dev->mtu = port_dev->mtu. The conditional is needed because the patch moves the call to team_dev_type_check_change past dev_set_mtu. Testing: - team device driver in-tree selftests - Add/remove various devices as slaves of team device - syzbot Reported-by: syzbot+a2a3b519de727b0f7903@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a2a3b519de727b0f7903 Fixes: 1d76efe1577b ("team: add support for non-ethernet devices") Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20251122002027.695151-1-zlatistiv@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25net/mlx5e: Fix validation logic in rate limitingDanielle Costantino1-1/+1
The rate limiting validation condition currently checks the output variable max_bw_value[i] instead of the input value maxrate->tc_maxrate[i]. This causes the validation to compare an uninitialized or stale value rather than the actual requested rate. The condition should check the input rate to properly validate against the upper limit: } else if (maxrate->tc_maxrate[i] <= upper_limit_gbps) { This aligns with the pattern used in the first branch, which correctly checks maxrate->tc_maxrate[i] against upper_limit_mbps. The current implementation can lead to unreliable validation behavior: - For rates between 25.5 Gbps and 255 Gbps, if max_bw_value[i] is 0 from initialization, the GBPS path may be taken regardless of whether the actual rate is within bounds - When processing multiple TCs (i > 0), max_bw_value[i] contains the value computed for the previous TC, affecting the validation logic - The overflow check for rates exceeding 255 Gbps may not trigger consistently depending on previous array values This patch ensures the validation correctly examines the requested rate value for proper bounds checking. Fixes: 43b27d1bd88a ("net/mlx5e: Fix wraparound in rate limiting for values above 255 Gbps") Signed-off-by: Danielle Costantino <dcostantino@meta.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20251124180043.2314428-1-dcostantino@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25net: lan966x: Fix the initialization of taprioHoratiu Vultur1-1/+4
To initialize the taprio block in lan966x, it is required to configure the register REVISIT_DLY. The purpose of this register is to set the delay before revisit the next gate and the value of this register depends on the system clock. The problem is that the we calculated wrong the value of the system clock period in picoseconds. The actual system clock is ~165.617754MHZ and this correspond to a period of 6038 pico seconds and not 15125 as currently set. Fixes: e462b2717380b4 ("net: lan966x: Add offload support for taprio") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251121061411.810571-1-horatiu.vultur@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: phy: mxl-gpy: fix link properties on USXGMII and internal PHYsDaniel Golle1-7/+11
gpy_update_interface() returns early in case the PHY is internal or connected via USXGMII. In this case the gigabit master/slave property as well as MDI/MDI-X status also won't be read which seems wrong. Always read those properties by moving the logic to retrieve them to gpy_read_status(). Fixes: fd8825cd8c6fc ("net: phy: mxl-gpy: Add PHY Auto/MDI/MDI-X set driver for GPY211 chips") Fixes: 311abcdddc00a ("net: phy: add support to get Master-Slave configuration") Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/71fccf3f56742116eb18cc070d2a9810479ea7f9.1763650701.git.daniel@makrotopia.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: dsa: microchip: Fix symetry in ksz_ptp_msg_irq_{setup/free}()Bastien Curutchet (Schneider Electric)1-11/+7
The IRQ numbers created through irq_create_mapping() are only assigned to ptpmsg_irq[n].num at the end of the IRQ setup. So if an error occurs between their creation and their assignment (for instance during the request_threaded_irq() step), we enter the error path and fail to release the newly created virtual IRQs because they aren't yet assigned to ptpmsg_irq[n].num. Move the mapping creation to ksz_ptp_msg_irq_setup() to ensure symetry with what's released by ksz_ptp_msg_irq_free(). In the error path, move the irq_dispose_mapping to the out_ptp_msg label so it will be called only on created IRQs. Cc: stable@vger.kernel.org Fixes: cc13ab18b201 ("net: dsa: microchip: ptp: enable interrupt for timestamping") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Link: https://patch.msgid.link/20251120-ksz-fix-v6-5-891f80ae7f8f@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: dsa: microchip: Free previously initialized ports on init failuresBastien Curutchet (Schneider Electric)1-12/+11
If a port interrupt setup fails after at least one port has already been successfully initialized, the gotos miss some resource releasing: - the already initialized PTP IRQs aren't released - the already initialized port IRQs aren't released if the failure occurs in ksz_pirq_setup(). Merge 'out_girq' and 'out_ptpirq' into a single 'port_release' label. Behind this label, use the reverse loop to release all IRQ resources for all initialized ports. Jump in the middle of the reverse loop if an error occurs in ksz_ptp_irq_setup() to only release the port IRQ of the current iteration. Cc: stable@vger.kernel.org Fixes: c9cd961c0d43 ("net: dsa: microchip: lan937x: add interrupt support for port phy link") Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Link: https://patch.msgid.link/20251120-ksz-fix-v6-4-891f80ae7f8f@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: dsa: microchip: Don't free uninitialized ksz_irqBastien Curutchet (Schneider Electric)1-1/+1
If something goes wrong at setup, ksz_irq_free() can be called on uninitialized ksz_irq (for example when ksz_ptp_irq_setup() fails). It leads to freeing uninitialized IRQ numbers and/or domains. Use dsa_switch_for_each_user_port_continue_reverse() in the error path to iterate only over the fully initialized ports. Cc: stable@vger.kernel.org Fixes: cc13ab18b201 ("net: dsa: microchip: ptp: enable interrupt for timestamping") Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Link: https://patch.msgid.link/20251120-ksz-fix-v6-3-891f80ae7f8f@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: dsa: microchip: ptp: Fix checks on irq_find_mapping()Bastien Curutchet (Schneider Electric)1-2/+2
irq_find_mapping() returns a positive IRQ number or 0 if no IRQ is found but it never returns a negative value. However, during the PTP IRQ setup, we verify that its returned value isn't negative. Fix the irq_find_mapping() check to enter the error path when 0 is returned. Return -EINVAL in such case. Cc: stable@vger.kernel.org Fixes: cc13ab18b201 ("net: dsa: microchip: ptp: enable interrupt for timestamping") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Link: https://patch.msgid.link/20251120-ksz-fix-v6-2-891f80ae7f8f@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: dsa: microchip: common: Fix checks on irq_find_mapping()Bastien Curutchet (Schneider Electric)1-4/+4
irq_find_mapping() returns a positive IRQ number or 0 if no IRQ is found but it never returns a negative value. However, on each irq_find_mapping() call, we verify that the returned value isn't negative. Fix the irq_find_mapping() checks to enter error paths when 0 is returned. Return -EINVAL in such cases. CC: stable@vger.kernel.org Fixes: c9cd961c0d43 ("net: dsa: microchip: lan937x: add interrupt support for port phy link") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com> Link: https://patch.msgid.link/20251120-ksz-fix-v6-1-891f80ae7f8f@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-25net: aquantia: Add missing descriptor cache invalidation on ATL2Kai-Heng Feng4-19/+25
ATL2 hardware was missing descriptor cache invalidation in hw_stop(), causing SMMU translation faults during device shutdown and module removal: [ 70.355743] arm-smmu-v3 arm-smmu-v3.5.auto: event 0x10 received: [ 70.361893] arm-smmu-v3 arm-smmu-v3.5.auto: 0x0002060000000010 [ 70.367948] arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000020000000000 [ 70.374002] arm-smmu-v3 arm-smmu-v3.5.auto: 0x00000000ff9bc000 [ 70.380055] arm-smmu-v3 arm-smmu-v3.5.auto: 0x0000000000000000 [ 70.386109] arm-smmu-v3 arm-smmu-v3.5.auto: event: F_TRANSLATION client: 0001:06:00.0 sid: 0x20600 ssid: 0x0 iova: 0xff9bc000 ipa: 0x0 [ 70.398531] arm-smmu-v3 arm-smmu-v3.5.auto: unpriv data write s1 "Input address caused fault" stag: 0x0 Commit 7a1bb49461b1 ("net: aquantia: fix potential IOMMU fault after driver unbind") and commit ed4d81c4b3f2 ("net: aquantia: when cleaning hw cache it should be toggled") fixed cache invalidation for ATL B0, but ATL2 was left with only interrupt disabling. This allowed hardware to write to cached descriptors after DMA memory was unmapped, triggering SMMU faults. Once cache invalidation is applied to ATL2, the translation fault can't be observed anymore. Add shared aq_hw_invalidate_descriptor_cache() helper and use it in both ATL B0 and ATL2 hw_stop() implementations for consistent behavior. Fixes: e54dcf4bba3e ("net: atlantic: basic A2 init/deinit hw_ops") Tested-by: Carol Soto <csoto@nvidia.com> Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251120041537.62184-1-kaihengf@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-23wifi: iwlwifi: trans: rename at_least variable to min_modeJason A. Donenfeld1-4/+4
The subsequent commit is going to add a macro that redefines `at_least` to mean something else. Given that the usage here in iwlwifi is the only use of that identifier in the whole kernel, just rename it to a more fitting name, `min_mode`. Cc: Miri Korenblit <miriam.rachel.korenblit@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20251123054819.2371989-1-Jason@zx2c4.com Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-21net: phy: mxl-gpy: fix bogus error on USXGMII and integrated PHYDaniel Golle1-1/+1
As the interface mode doesn't need to be updated on PHYs connected with USXGMII and integrated PHYs, gpy_update_interface() should just return 0 in these cases rather than -EINVAL which has wrongly been introduced by commit 7a495dde27ebc ("net: phy: mxl-gpy: Change gpy_update_interface() function return type"), as this breaks support for those PHYs. Fixes: 7a495dde27ebc ("net: phy: mxl-gpy: Change gpy_update_interface() function return type") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/f744f721a1fcc5e2e936428c62ff2c7d94d2a293.1763648168.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20veth: reduce XDP no_direct return section to fix raceJesper Dangaard Brouer1-4/+3
As explain in commit fa349e396e48 ("veth: Fix race with AF_XDP exposing old or uninitialized descriptors") for veth there is a chance after napi_complete_done() that another CPU can manage start another NAPI instance running veth_pool(). For NAPI this is correctly handled as the napi_schedule_prep() check will prevent multiple instances from getting scheduled, but for the remaining code in veth_pool() this can run concurrent with the newly started NAPI instance. The problem/race is that xdp_clear_return_frame_no_direct() isn't designed to be nested. Prior to commit 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") the temporary BPF net context bpf_redirect_info was stored per CPU, where this wasn't an issue. Since this commit the BPF context is stored in 'current' task_struct. When running veth in threaded-NAPI mode, then the kthread becomes the storage area. Now a race exists between two concurrent veth_pool() function calls one exiting NAPI and one running new NAPI, both using the same BPF net context. Race is when another CPU gets within the xdp_set_return_frame_no_direct() section before exiting veth_pool() calls the clear-function xdp_clear_return_frame_no_direct(). Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/176356963888.337072.4805242001928705046.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20be2net: pass wrb_params in case of OS2BMCAndrey Vatoropin1-3/+4
be_insert_vlan_in_pkt() is called with the wrb_params argument being NULL at be_send_pkt_to_bmc() call site.  This may lead to dereferencing a NULL pointer when processing a workaround for specific packet, as commit bc0c3405abbb ("be2net: fix a Tx stall bug caused by a specific ipv6 packet") states. The correct way would be to pass the wrb_params from be_xmit(). Fixes: 760c295e0e8d ("be2net: Support for OS2BMC.") Cc: stable@vger.kernel.org Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru> Link: https://patch.msgid.link/20251119105015.194501-1-a.vatoropin@crpt.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20Merge tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wirelessPaolo Abeni1-0/+7
Johannes Berg says: ==================== wireless-2025-11-20 A single fix for scanning on some rtw89 devices. * tag 'wireless-2025-11-20' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: rtw89: hw_scan: Don't let the operating channel be last ==================== Link: https://patch.msgid.link/20251120085433.8601-3-johannes@sipsolutions.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-20net: dsa: microchip: lan937x: Fix RGMII delay tuningOleksij Rempel1-0/+1
Correct RGMII delay application logic in lan937x_set_tune_adj(). The function was missing `data16 &= ~PORT_TUNE_ADJ` before setting the new delay value. This caused the new value to be bitwise-OR'd with the existing PORT_TUNE_ADJ field instead of replacing it. For example, when setting the RGMII 2 TX delay on port 4, the intended TUNE_ADJUST value of 0 (RGMII_2_TX_DELAY_2NS) was incorrectly OR'd with the default 0x1B (from register value 0xDA3), leaving the delay at the wrong setting. This patch adds the missing mask to clear the field, ensuring the correct delay value is written. Physical measurements on the RGMII TX lines confirm the fix, showing the delay changing from ~1ns (before change) to ~2ns. While testing on i.MX 8MP showed this was within the platform's timing tolerance, it did not match the intended hardware-characterized value. Fixes: b19ac41faa3f ("net: dsa: microchip: apply rgmii tx and rx delay in phylink mac config") Cc: stable@vger.kernel.org Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/20251114090951.4057261-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-19Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski2-3/+21
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-11-18 (idpf, ice) This series contains updates to idpf and ice drivers. Emil adds a check for NULL vport_config during removal to avoid NULL pointer dereference in idpf. Grzegorz fixes PTP teardown paths to account for some missed cleanups for ice driver. * '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: fix PTP cleanup on driver removal in error path idpf: fix possible vport_config NULL pointer deref in remove ==================== Link: https://patch.msgid.link/20251118235207.2165495-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20wifi: rtw89: hw_scan: Don't let the operating channel be lastBitterblue Smith1-0/+7
Scanning can be offloaded to the firmware. To that end, the driver prepares a list of channels to scan, including periodic visits back to the operating channel, and sends the list to the firmware. When the channel list is too long to fit in a single H2C message, the driver splits the list, sends the first part, and tells the firmware to scan. When the scan is complete, the driver sends the next part of the list and tells the firmware to scan. When the last channel that fit in the H2C message is the operating channel something seems to go wrong in the firmware. It will acknowledge receiving the list of channels but apparently it will not do anything more. The AP can't be pinged anymore. The driver still receives beacons, though. One way to avoid this is to split the list of channels before the operating channel. Affected devices: * RTL8851BU with firmware 0.29.41.3 * RTL8832BU with firmware 0.29.29.8 * RTL8852BE with firmware 0.29.29.8 The commit 57a5fbe39a18 ("wifi: rtw89: refactor flow that hw scan handles channel list") is found by git blame, but it is actually to refine the scan flow, but not a culprit, so skip Fixes tag. Reported-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Closes: https://lore.kernel.org/linux-wireless/0abbda91-c5c2-4007-84c8-215679e652e1@gmail.com/ Cc: stable@vger.kernel.org # 6.16+ Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/c1e61744-8db4-4646-867f-241b47d30386@gmail.com
2025-11-19net: phylink: add missing supported link modes for the fixed-linkWei Fang1-0/+3
Pause, Asym_Pause and Autoneg bits are not set when pl->supported is initialized, so these link modes will not work for the fixed-link. This leads to a TCP performance degradation issue observed on the i.MX943 platform. The switch CPU port of i.MX943 is connected to an ENETC MAC, this link is a fixed link and the link speed is 2.5Gbps. And one of the switch user ports is the RGMII interface, and its link speed is 1Gbps. If the flow-control of the fixed link is not enabled, we can easily observe the iperf performance of TCP packets is very low. Because the inbound rate on the CPU port is greater than the outbound rate on the user port, the switch is prone to congestion, leading to the loss of some TCP packets and requiring multiple retransmissions. Solving this problem should be as simple as setting the Asym_Pause and Pause bits. The reason why the Autoneg bit needs to be set, Russell has gave a very good explanation in the thread [1], see below. "As the advertising and lp_advertising bitmasks have to be non-empty, and the swphy reports aneg capable, aneg complete, and AN enabled, then for consistency with that state, Autoneg should be set. This is how it was prior to the blamed commit." Fixes: de7d3f87be3c ("net: phylink: Use phy_caps_lookup for fixed-link configuration") Link: https://lore.kernel.org/aRjqLN8eQDIQfBjS@shell.armlinux.org.uk # [1] Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20251117102943.1862680-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18net/mlx5: Clean up only new IRQ glue on request_irq() failurePradyumn Rahar1-4/+2
The mlx5_irq_alloc() function can inadvertently free the entire rmap and end up in a crash[1] when the other threads tries to access this, when request_irq() fails due to exhausted IRQ vectors. This commit modifies the cleanup to remove only the specific IRQ mapping that was just added. This prevents removal of other valid mappings and ensures precise cleanup of the failed IRQ allocation's associated glue object. Note: This error is observed when both fwctl and rds configs are enabled. [1] mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1 mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28 infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1 mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28 general protection fault, probably for non-canonical address 0xe277a58fde16f291: 0000 [#1] SMP NOPTI RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Call Trace: <TASK> ? show_trace_log_lvl+0x1d6/0x2f9 ? show_trace_log_lvl+0x1d6/0x2f9 ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] ? __die_body.cold+0x8/0xa ? die_addr+0x39/0x53 ? exc_general_protection+0x1c4/0x3e9 ? dev_vprintk_emit+0x5f/0x90 ? asm_exc_general_protection+0x22/0x27 ? free_irq_cpu_rmap+0x23/0x7d mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core] irq_pool_request_vector+0x7d/0x90 [mlx5_core] mlx5_irq_request+0x2e/0xe0 [mlx5_core] mlx5_irq_request_vector+0xad/0xf7 [mlx5_core] comp_irq_request_pci+0x64/0xf0 [mlx5_core] create_comp_eq+0x71/0x385 [mlx5_core] ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core] mlx5_comp_eqn_get+0x72/0x90 [mlx5_core] ? xas_load+0x8/0x91 mlx5_comp_irqn_get+0x40/0x90 [mlx5_core] mlx5e_open_channel+0x7d/0x3c7 [mlx5_core] mlx5e_open_channels+0xad/0x250 [mlx5_core] mlx5e_open_locked+0x3e/0x110 [mlx5_core] mlx5e_open+0x23/0x70 [mlx5_core] __dev_open+0xf1/0x1a5 __dev_change_flags+0x1e1/0x249 dev_change_flags+0x21/0x5c do_setlink+0x28b/0xcc4 ? __nla_parse+0x22/0x3d ? inet6_validate_link_af+0x6b/0x108 ? cpumask_next+0x1f/0x35 ? __snmp6_fill_stats64.constprop.0+0x66/0x107 ? __nla_validate_parse+0x48/0x1e6 __rtnl_newlink+0x5ff/0xa57 ? kmem_cache_alloc_trace+0x164/0x2ce rtnl_newlink+0x44/0x6e rtnetlink_rcv_msg+0x2bb/0x362 ? __netlink_sendskb+0x4c/0x6c ? netlink_unicast+0x28f/0x2ce ? rtnl_calcit.isra.0+0x150/0x146 netlink_rcv_skb+0x5f/0x112 netlink_unicast+0x213/0x2ce netlink_sendmsg+0x24f/0x4d9 __sock_sendmsg+0x65/0x6a ____sys_sendmsg+0x28f/0x2c9 ? import_iovec+0x17/0x2b ___sys_sendmsg+0x97/0xe0 __sys_sendmsg+0x81/0xd8 do_syscall_64+0x35/0x87 entry_SYSCALL_64_after_hwframe+0x6e/0x0 RIP: 0033:0x7fc328603727 Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48 RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727 RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc </TASK> ---[ end trace f43ce73c3c2b13a2 ]--- RIP: 0010:free_irq_cpu_rmap+0x23/0x7d Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00 74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31 f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff RSP: 0018:ff384881640eaca0 EFLAGS: 00010282 RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400 RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88 R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8 FS: 00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) kvm-guest: disable async PF for cpu 0 Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation") Signed-off-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Tested-by: Mohith Kumar Thummaluru<mohith.k.kumar.thummaluru@oracle.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1763381768-1234998-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-18ice: fix PTP cleanup on driver removal in error pathGrzegorz Nitka1-3/+19
Improve the cleanup on releasing PTP resources in error path. The error case might happen either at the driver probe and PTP feature initialization or on PTP restart (errors in reset handling, NVM update etc). In both cases, calls to PF PTP cleanup (ice_ptp_cleanup_pf function) and 'ps_lock' mutex deinitialization were missed. Additionally, ptp clock was not unregistered in the latter case. Keep PTP state as 'uninitialized' on init to distinguish between error scenarios and to avoid resource release duplication at driver removal. The consequence of missing ice_ptp_cleanup_pf call is the following call trace dumped when ice_adapter object is freed (port list is not empty, as it is required at this stage): [ T93022] ------------[ cut here ]------------ [ T93022] WARNING: CPU: 10 PID: 93022 at ice/ice_adapter.c:67 ice_adapter_put+0xef/0x100 [ice] ... [ T93022] RIP: 0010:ice_adapter_put+0xef/0x100 [ice] ... [ T93022] Call Trace: [ T93022] <TASK> [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] ? __warn.cold+0xb0/0x10e [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] ? report_bug+0xd8/0x150 [ T93022] ? handle_bug+0xe9/0x110 [ T93022] ? exc_invalid_op+0x17/0x70 [ T93022] ? asm_exc_invalid_op+0x1a/0x20 [ T93022] ? ice_adapter_put+0xef/0x100 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] [ T93022] pci_device_remove+0x42/0xb0 [ T93022] device_release_driver_internal+0x19f/0x200 [ T93022] driver_detach+0x48/0x90 [ T93022] bus_remove_driver+0x70/0xf0 [ T93022] pci_unregister_driver+0x42/0xb0 [ T93022] ice_module_exit+0x10/0xdb0 [ice 33d2647ad4f6d866d41eefff1806df37c68aef0c] ... [ T93022] ---[ end trace 0000000000000000 ]--- [ T93022] ice: module unloaded Fixes: e800654e85b5 ("ice: Use ice_adapter for PTP shared data instead of auxdev") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-11-18idpf: fix possible vport_config NULL pointer deref in removeEmil Tantilov1-0/+2
Attempting to remove the driver will cause a crash in cases where the vport failed to initialize. Following trace is from an instance where the driver failed during an attempt to create a VF: [ 1661.543624] idpf 0000:84:00.7: Device HW Reset initiated [ 1722.923726] idpf 0000:84:00.7: Transaction timed-out (op:1 cookie:2900 vc_op:1 salt:29 timeout:60000ms) [ 1723.353263] BUG: kernel NULL pointer dereference, address: 0000000000000028 ... [ 1723.358472] RIP: 0010:idpf_remove+0x11c/0x200 [idpf] ... [ 1723.364973] Call Trace: [ 1723.365475] <TASK> [ 1723.365972] pci_device_remove+0x42/0xb0 [ 1723.366481] device_release_driver_internal+0x1a9/0x210 [ 1723.366987] pci_stop_bus_device+0x6d/0x90 [ 1723.367488] pci_stop_and_remove_bus_device+0x12/0x20 [ 1723.367971] pci_iov_remove_virtfn+0xbd/0x120 [ 1723.368309] sriov_disable+0x34/0xe0 [ 1723.368643] idpf_sriov_configure+0x58/0x140 [idpf] [ 1723.368982] sriov_numvfs_store+0xda/0x1c0 Avoid the NULL pointer dereference by adding NULL pointer check for vport_config[i], before freeing user_config.q_coalesce. Fixes: e1e3fec3e34b ("idpf: preserve coalescing settings across resets") Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Chittim Madhu <madhu.chittim@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-11-18net: ps3_gelic_net: handle skb allocation failuresFlorian Fuchs2-11/+35
Handle skb allocation failures in RX path, to avoid NULL pointer dereference and RX stalls under memory pressure. If the refill fails with -ENOMEM, complete napi polling and wake up later to retry via timer. Also explicitly re-enable RX DMA after oom, so the dmac doesn't remain stopped in this situation. Previously, memory pressure could lead to skb allocation failures and subsequent Oops like: Oops: Kernel access of bad area, sig: 11 [#2] Hardware name: SonyPS3 Cell Broadband Engine 0x701000 PS3 NIP [c0003d0000065900] gelic_net_poll+0x6c/0x2d0 [ps3_gelic] (unreliable) LR [c0003d00000659c4] gelic_net_poll+0x130/0x2d0 [ps3_gelic] Call Trace: gelic_net_poll+0x130/0x2d0 [ps3_gelic] (unreliable) __napi_poll+0x44/0x168 net_rx_action+0x178/0x290 Steps to reproduce the issue: 1. Start a continuous network traffic, like scp of a 20GB file 2. Inject failslab errors using the kernel fault injection: echo -1 > /sys/kernel/debug/failslab/times echo 30 > /sys/kernel/debug/failslab/interval echo 100 > /sys/kernel/debug/failslab/probability 3. After some time, traces start to appear, kernel Oopses and the system stops Step 2 is not always necessary, as it is usually already triggered by the transfer of a big enough file. Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3") Signed-off-by: Florian Fuchs <fuchsfl@gmail.com> Link: https://patch.msgid.link/20251113181000.3914980-1-fuchsfl@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-18net: qlogic/qede: fix potential out-of-bounds read in qede_tpa_cont() and qede_tpa_end()Pavel Zhigulin1-2/+3
The loops in 'qede_tpa_cont()' and 'qede_tpa_end()', iterate over 'cqe->len_list[]' using only a zero-length terminator as the stopping condition. If the terminator was missing or malformed, the loop could run past the end of the fixed-size array. Add an explicit bound check using ARRAY_SIZE() in both loops to prevent a potential out-of-bounds access. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 55482edc25f0 ("qede: Add slowpath/fastpath support and enable hardware GRO") Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com> Link: https://patch.msgid.link/20251113112757.4166625-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-17net: airoha: Do not loopback traffic to GDM2 if it is available on the deviceLorenzo Bianconi1-1/+1
Airoha_eth driver forwards offloaded uplink traffic (packets received on GDM1 and forwarded to GDM{3,4}) to GDM2 in order to apply hw QoS. This is correct if the device does not support a dedicated GDM2 port. In this case, in order to enable hw offloading for uplink traffic, the packets should be sent to GDM{3,4} directly. Fixes: 9cd451d414f6 ("net: airoha: Add loopback support for GDM2") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20251113-airoha-hw-offload-gdm2-fix-v1-1-7e4ca300872f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-17can: sun4i_can: sun4i_can_interrupt(): fix max irq loop handlingMarc Kleine-Budde1-2/+2
Reading the interrupt register `SUN4I_REG_INT_ADDR` causes all of its bits to be reset. If we ever reach the condition of handling more than `SUN4I_CAN_MAX_IRQ` IRQs, we will have read the register and reset all its bits but without actually handling the interrupt inside of the loop body. This may, among other issues, cause us to never `netif_wake_queue()` again after a transmission interrupt. Fixes: 0738eff14d81 ("can: Allwinner A10/A20 CAN Controller support - Kernel module") Cc: stable@vger.kernel.org Co-developed-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20251116-sun4i-fix-loop-v1-1-3d76d3f81950@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-17can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing dataMarc Kleine-Budde1-5/+54
The URB received in gs_usb_receive_bulk_callback() contains a struct gs_host_frame. The length of the data after the header depends on the gs_host_frame hf::flags and the active device features (e.g. time stamping). Introduce a new function gs_usb_get_minimum_length() and check that we have at least received the required amount of data before accessing it. Only copy the data to that skb that has actually been received. Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-3-a29b42eacada@pengutronix.de [mkl: rename gs_usb_get_minimum_length() -> +gs_usb_get_minimum_rx_length()] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-16can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing headerMarc Kleine-Budde1-7/+19
The driver expects to receive a struct gs_host_frame in gs_usb_receive_bulk_callback(). Use struct_group to describe the header of the struct gs_host_frame and check that we have at least received the header before accessing any members of it. To resubmit the URB, do not dereference the pointer chain "dev->parent->hf_size_rx" but use "parent->hf_size_rx" instead. Since "urb->context" contains "parent", it is always defined, while "dev" is not defined if the URB it too short. Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-2-a29b42eacada@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-16can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBsMarc Kleine-Budde1-2/+15
The driver lacks the cleanup of failed transfers of URBs. This reduces the number of available URBs per error by 1. This leads to reduced performance and ultimately to a complete stop of the transmission. If the sending of a bulk URB fails do proper cleanup: - increase netdev stats - mark the echo_sbk as free - free the driver's context and do accounting - wake the send queue Closes: https://github.com/candle-usb/candleLight_fw/issues/187 Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Link: https://patch.msgid.link/20251114-gs_usb-fix-usb-callbacks-v1-1-a29b42eacada@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-16can: sja1000: fix max irq loop handlingThomas Mühlbacher1-2/+2
Reading the interrupt register `SJA1000_IR` causes all of its bits to be reset. If we ever reach the condition of handling more than `SJA1000_MAX_IRQ` IRQs, we will have read the register and reset all its bits but without actually handling the interrupt inside of the loop body. This may, among other issues, cause us to never `netif_wake_queue()` again after a transmission interrupt. Fixes: 429da1cc841b ("can: Driver for the SJA1000 CAN controller") Cc: stable@vger.kernel.org Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20251115153437.11419-1-tmuehlbacher@posteo.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-16can: kvaser_usb: leaf: Fix potential infinite loop in command parsersSeungjin Bae1-2/+2
The `kvaser_usb_leaf_wait_cmd()` and `kvaser_usb_leaf_read_bulk_callback` functions contain logic to zero-length commands. These commands are used to align data to the USB endpoint's wMaxPacketSize boundary. The driver attempts to skip these placeholders by aligning the buffer position `pos` to the next packet boundary using `round_up()` function. However, if zero-length command is found exactly on a packet boundary (i.e., `pos` is a multiple of wMaxPacketSize, including 0), `round_up` function will return the unchanged value of `pos`. This prevents `pos` to be increased, causing an infinite loop in the parsing logic. This patch fixes this in the function by using `pos + 1` instead. This ensures that even if `pos` is on a boundary, the calculation is based on `pos + 1`, forcing `round_up()` to always return the next aligned boundary. Fixes: 7259124eac7d ("can: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c") Signed-off-by: Seungjin Bae <eeodqql09@gmail.com> Reviewed-by: Jimmy Assarsson <extja@kvaser.com> Tested-by: Jimmy Assarsson <extja@kvaser.com> Link: https://patch.msgid.link/20251023162709.348240-1-eeodqql09@gmail.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-11-14veth: more robust handing of race to avoid txq getting stuckJesper Dangaard Brouer1-18/+20
Commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") introduced a race condition that can lead to a permanently stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra Max). The race occurs in veth_xmit(). The producer observes a full ptr_ring and stops the queue (netif_tx_stop_queue()). The subsequent conditional logic, intended to re-wake the queue if the consumer had just emptied it (if (__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a "lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and traffic halts. This failure is caused by an incorrect use of the __ptr_ring_empty() API from the producer side. As noted in kernel comments, this check is not guaranteed to be correct if a consumer is operating on another CPU. The empty test is based on ptr_ring->consumer_head, making it reliable only for the consumer. Using this check from the producer side is fundamentally racy. This patch fixes the race by adopting the more robust logic from an earlier version V4 of the patchset, which always flushed the peer: (1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier are removed. Instead, after stopping the queue, we unconditionally call __veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled, making it solely responsible for re-waking the TXQ. This handles the race where veth_poll() consumes all packets and completes NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue. The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule NAPI. (2) On the consumer side, the logic for waking the peer TXQ is moved out of veth_xdp_rcv() and placed at the end of the veth_poll() function. This placement is part of fixing the race, as the netif_tx_queue_stopped() check must occur after rx_notify_masked is potentially set to false during NAPI completion. This handles the race where veth_poll() consumes all packets, but haven't finished (rx_notify_masked is still true). The producer veth_xmit() stops the TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning not starting NAPI. Then veth_poll() change rx_notify_masked to false and stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake it up. Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14net: mlxsw: linecards: fix missing error check in mlxsw_linecard_devlink_info_get()Pavel Zhigulin1-0/+2
The call to devlink_info_version_fixed_put() in mlxsw_linecard_devlink_info_get() did not check for errors, although it is checked everywhere in the code. Add missed 'err' check to the mlxsw_linecard_devlink_info_get() Fixes: 3fc0c51905fb ("mlxsw: core_linecards: Expose device PSID over device info") Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251113161922.813828-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-14net: dsa: hellcreek: fix missing error handling in LED registrationPavel Zhigulin1-2/+12
The LED setup routine registered both led_sync_good and led_is_gm devices without checking the return values of led_classdev_register(). If either registration failed, the function continued silently, leaving the driver in a partially-initialized state and leaking a registered LED classdev. Add proper error handling Fixes: 7d9ee2e8ff15 ("net: dsa: hellcreek: Add PTP status LEDs") Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://patch.msgid.link/20251113135745.92375-1-Pavel.Zhigulin@kaspersky.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>