aboutsummaryrefslogtreecommitdiffstatshomepage
AgeCommit message (Collapse)AuthorFilesLines
2026-05-16wifi: iwlwifi: use correct function to read STEP_URM registerMoriya Itzchaki1-3/+3
CNVI_PMU_STEP_FLOW is a PRPH register, not a UMAC PRPH register. Use iwl_read_prph() instead of iwl_read_umac_prph() to read it correctly. Signed-off-by: Moriya Itzchaki <moriya.itzchaki@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20260515151352.3a69fa2dbda7.I8d96635a9c06a835b05a10b6d66c8a9299676246@changeid Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2026-05-16wifi: iwlwifi: mvm: fix driver-set TX rates on old devicesJohannes Berg2-19/+22
On old devices such as 7265D, rates are still encoded in version 1 format, which doesn't use the CCK/OFDM rate index (0-3/0-7) but rather their PLCP value (e.g. 10 for 1 Mbps CCK rate.) While introducing v3 rates, I changed the driver from internally handling v1 rates and converting to v2, to internally handling v3 and converting to v1 or v2 according to the firmware. I accordingly changed the code in iwl_mvm_mac80211_idx_to_hwrate() to no longer have different values for different APIs. This was correct. However, I later reverted this part of the change, because it was reported that I had broken beacon rates, causing a FW assert/crash. This caused TX_CMD rates to be set incorrectly, potentially causing a warning when reported back from the device as having been used. Fix this (hopefully correctly now) by handling beacon rates in the TX_CMD that's embedded in the beacon template command separately. Restore iwl_mvm_mac80211_idx_to_hwrate() to return only the rate index, not PLCP value, fixing the real TX_CMD. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20260515151351.7407e293dff7.I4ea1a17f8fe99c933d3f3e30d077cf4246125c3e@changeid Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2026-05-16wifi: iwlwifi: mld: don't dereference a pointer before NULL checking itMiri Korenblit1-7/+6
In iwl_mld_remove_link, the link->fw_id is saved at the beginning of the function so we have it after we freed the link. But the link pointer can be NULL, and is not checked when the fw_id is stored. Fix it by simply freeing the link at the end of the function. fFixes: 0e66a39f4f0e ("wifi: iwlwifi: fix potential use after free in iwl_mld_remove_link()") Reviewed-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20260515151351.371f40fc6711.I6a82cfe9655564e9c5731af91c36493b26b1208e@changeid Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2026-05-16wifi: iwlwifi: mld: stop TX during firmware restartSheroz Juraev1-0/+10
When iwlwifi firmware crashes (e.g., NMI_INTERRUPT_UNKNOWN on Intel BE201/Wi-Fi 7), iwl_mld_nic_error() sets mld->fw_status.in_hw_restart to true. However, iwl_mld_tx_from_txq() does not check this flag before dequeuing frames from mac80211 and pushing them to the transport layer. Since the firmware is dead, iwl_trans_tx() returns -EIO for each frame, which then gets freed immediately. Under high-throughput conditions (e.g., Tailscale UDP traffic or active SSH sessions), this creates a tight dequeue-send-fail-free loop that wastes CPU cycles and generates rapid skb allocation churn, leading to memory pressure from slab fragmentation. The RX path already has this guard (iwl_mld_rx_mpdu checks in_hw_restart at rx.c:1906), and so does the TXQ allocation worker (iwl_mld_add_txqs_wk at tx.c:156). Add the same guard to iwl_mld_tx_from_txq() to stop all TX during firmware restart. Frames left in mac80211's TXQs are naturally drained after restart completes, when queue reallocation triggers iwl_mld_tx_from_txq() via iwl_mld_add_txq_list(), or when new upper-layer traffic invokes wake_tx_queue. Tested on ASUS Zenbook 14 UX3405CA with Intel BE201 (Wi-Fi 7) on kernel 6.19.5 where the firmware crashes approximately every 10-15 minutes under Tailscale traffic. Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver") Cc: stable@vger.kernel.org Signed-off-by: Sheroz Juraev <goodmartiandev@gmail.com> Link: https://patch.msgid.link/20260315081221.2678478-1-goodmartiandev@gmail.com Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2026-05-16wifi: iwlwifi: mld: fix TSO segmentation explosion when AMSDU is disabledCole Leavitt1-1/+4
When the TLC notification disables AMSDU for a TID, the MLD driver sets max_tid_amsdu_len to the sentinel value 1. The TSO segmentation path in iwl_mld_tx_tso_segment() checks for zero but not for this sentinel, allowing it to reach the num_subframes calculation: num_subframes = (max_tid_amsdu_len + pad) / (subf_len + pad) = (1 + 2) / (1534 + 2) = 0 This zero propagates to iwl_tx_tso_segment() which sets: gso_size = num_subframes * mss = 0 Calling skb_gso_segment() with gso_size=0 creates over 32000 tiny segments from a single GSO skb. This floods the TX ring with ~1024 micro-frames (the rest are purged), creating a massive burst of TX completion events that can lead to memory corruption and a subsequent use-after-free in TCP's retransmit queue (refcount underflow in tcp_shifted_skb, NULL deref in tcp_rack_detect_loss). The MVM driver is immune because it checks mvmsta->amsdu_enabled before reaching the num_subframes calculation. The MLD driver has no equivalent bitmap check and relies solely on max_tid_amsdu_len, which does not catch the sentinel value. Fix this by detecting the sentinel value (max_tid_amsdu_len == 1) at the existing check and falling back to non-AMSDU TSO segmentation. Also add a WARN_ON_ONCE guard after the num_subframes division as defense-in-depth to catch any future code paths that produce zero through a different mechanism. Suggested-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com> Fixes: d1e879ec600f ("wifi: iwlwifi: add iwlmld sub-driver") Signed-off-by: Cole Leavitt <cole@unwrap.rs> Link: https://patch.msgid.link/20260405054145.1064152-3-cole@unwrap.rs Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2026-05-16batman-adv: fix batadv_skb_is_frag() kernel-docSven Eckelmann1-2/+2
The kernel-doc comment for batadv_skb_is_frag() contained two errors: * the function description referred to "gain a unicast packet" instead of "contains unicast fragment". * the Return section omitted "merged" from "newly skb", leaving the description grammatically incorrect and inconsistent with the function description. Fixes: bc62216dc8e2 ("batman-adv: frag: disallow unicast fragment in fragment") Signed-off-by: Sven Eckelmann <sven@narfation.org>
2026-05-16tracing: Fix desc in error path for the trace remote test moduleVincent Donnefort1-2/+2
During initialisation in remote_test_load(), if one of the simple_ring_buffer fails to initialise, the error path attempts to rollback initialised buffers. However, the rollback incorrectly uses the global pointer to the trace descriptor, which is only set upon successful load completion. Fix the error path by using the local pointer to the descriptor. Link: https://patch.msgid.link/20260515201616.337469-1-vdonnefort@google.com Fixes: ea908a2b79c8 ("tracing: Add a trace remote module for testing") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Vincent Donnefort <vdonnefort@google.com> base-commit: 5d6919055dec134de3c40167a490f33c74c12581 Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2026-05-16io_uring/waitid: clear waitid info before copying it to userspaceHeechan Kang1-0/+1
IORING_OP_WAITID stores its result fields in struct io_waitid::info and later copies them to userspace siginfo. The prep path initializes the request arguments, but it does not initialize info itself. If the wait operation completes without reporting a child event, the common wait code can return without writing wo_info. In that case io_waitid_finish() still copies iw->info to userspace, exposing stale bytes from the reused io_kiocb command storage. Clear the result storage during prep so the io_uring path matches the regular waitid syscall, which uses a zero-initialized struct waitid_info. Fixes: f31ecf671ddc ("io_uring: add IORING_OP_WAITID support") Cc: stable@vger.kernel.org # 6.7+ Signed-off-by: Heechan Kang <gganji11@naver.com> Link: https://patch.msgid.link/20260516184709.852814-1-gganji11@naver.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-05-16Merge tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds8-52/+27
Pull powerpc fixes from Madhavan Srinivasan: - Fix preempt count leak in sysfs show paths - Fix error handling in pika_dtm_thread - Remove pmac_low_i2c_{lock,unlock}() - Enable all windfarms by default - Fix dead default for GUEST_STATE_BUFFER_TEST - Remove redundant preempt_disable|enable() calls from arch_irq_work_raise() Thanks to Aboorva Devarajan, Ally Heev, Amit Machhiwal, Bart Van Assche, Christophe Leroy, Christophe Leroy (CS GROUP), Dan Carpenter, Gautam Menghani, Harsh Prateek Bora, Julian Braha, Krzysztof Kozlowski, Linus Walleij, Ma Ke, Ritesh Harjani (IBM), and Sayali Patil * tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/time: Remove redundant preempt_disable|enable() calls from arch_irq_work_raise() powerpc/hv-gpci: fix preempt count leak in sysfs show paths powerpc: fix dead default for GUEST_STATE_BUFFER_TEST powerpc/powermac: Remove pmac_low_i2c_{lock,unlock}() powerpc/warp: Fix error handling in pika_dtm_thread powerpc: 82xx: fix uninitialized pointers with free attribute powerpc/g5: Enable all windfarms by default
2026-05-16Merge tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds12-35/+99
Pull sound fixes from Takashi Iwai: "A collection of small fixes. All device-specific small changes: HD-audio: - Fix NULL pointer dereference in snd_hda_ctl_add() - ACPI and Kconfig fixes for Cirrus drivers - A regression fix CA0132 codec - Various device-specific quirks for HP, Lenovo, Samsung, Framework etc - Documentation path fix USB-audio: - Boundary checks for MIDI endpoint descriptors - Offload mapping error handling for Qualcomm - A new device quirk for TTGK Technology USB-C Audio - A fix for Focusrite Scarlett2 mixer" * tag 'sound-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/ca0132: Disable auto-detect on manual output select ALSA: hda/realtek: Add mute LED quirk for HP Pavilion Laptop 16-ag0xxx ALSA: hda/realtek: ALC269 fixup for Lenovo Yoga Pro 7 15ASH111 audio ALSA: hda: Fix NULL pointer dereference in snd_hda_ctl_add() ALSA: hda/realtek: Add quirk for Samsung Galaxy Book5 360 headphone ALSA: hda/cs35l56: Drop malformed default N from Kconfig ALSA: hda/realtek: fix mic boost on Framework PTL ALSA: hda/realtek: Limit mic boost on Positivo DN50E ALSA: doc: cs35l56: Update path to HDA driver source ALSA: usb-audio: qcom: Check offload mapping failures ALSA: hda/realtek: Fix Legion 7 16ITHG6 speaker amp binding ALSA: usb-audio: Add iface reset and delay quirk for TTGK Technology USB-C Audio ALSA: scarlett2: Add missing error check when initialise Autogain Status ALSA: hda: cs35l41: Put ACPI device on missing physical node ALSA: hda: cs35l56: Put ACPI device after setting companion ALSA: usb-audio: Bound MIDI 2.0 endpoint descriptor scans ALSA: usb-audio: Bound MIDI endpoint descriptor scans ALSA: hda/realtek: Add codec SSID quirk for Lenovo Yoga Pro 9 16IMH9 (17aa:38d5)
2026-05-16hwmon: (lm90) Add lock protection to lm90_alertGuenter Roeck1-0/+2
Sashiko reports: lm90_alert() executes in the smbus alert context and calls lm90_update_confreg() to disable the hardware alert line, without acquiring hwmon_lock. Concurrently, sysfs write operations (such as lm90_write_convrate) hold the hwmon_lock, temporarily modify data->config, and then restore it. If an alert interrupt occurs concurrently with a sysfs write, the sysfs path will overwrite the alert handler's modifications to data->config and the hardware register. This unintentionally re-enables the hardware alert line while the alarm is still active, causing an interrupt storm. Add the missing lock to lm90_alert() to solve the problem. Fixes: 7a1d220ccb0cc ("hwmon: (lm90) Introduce function to update configuration register") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-05-16hwmon: (lm90) Stop work before releasing hwmon deviceGuenter Roeck1-4/+20
Sashiko reports: In lm90_probe(), the devm action to cancel the alert_work and report_work (lm90_restore_conf) is registered in lm90_init_client() before devm_hwmon_device_register_with_info() is called. Because devm executes cleanup actions in reverse order during module unbind or probe failure, the hwmon device is unregistered and freed first. If lm90_alert_work() or lm90_report_alarms() runs in the window between the hwmon device being freed and the delayed works being cancelled, lm90_update_alarms() will dereference the freed data->hwmon_dev here. Fix the problem by canceling the workers separately after registering the hwmon device and before registering the interrupt handler. This ensures that the workers are canceled after interrupts are disabled and before the hwmon device is released. Add "shutdown" flag to indicate that device shutdown is in progress to prevent workers from being re-armed. Fixes: f6d0775119fb9 ("hwmon: (lm90) Rework alarm/status handling") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2026-05-16ALSA: pcm_drm_eld: rate-limit ELD parsing errorsFrancesco Saverio Pavone1-2/+2
Mirror of Mark Brown's ASoC: hdac_hdmi rate-limit patch (commit [lkml.kernel.org/lkml/2025/6/13/1380]) for the generic snd_parse_eld() helper used by ASoC hdmi-codec. When a HDMI sink is disconnected (e.g. a board with two HDMI outputs and only one cable), userspace audio servers like PipeWire keep probing the disconnected card and trigger: HDMI: Unknown ELD version 0 at every probe — easily 30+ messages per burst on rk3588. The same applies to malformed ELD (MNL out of range). Both conditions are expected when no sink is attached; rate-limit the dev_info() so the kernel ring buffer does not fill up. Signed-off-by: Francesco Saverio Pavone <pavone.lawyer@gmail.com> Link: https://patch.msgid.link/20260516141244.21801-1-pavone.lawyer@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-05-16drm/msm/snapshot: fix dumping of the unaligned regionsDmitry Baryshkov1-6/+18
The snapshotting code internally aligns data segment to 16 bytes. This works fine for DPU code (where most of the regions are aligned), but fails for snapshotting of the DSI data (because DSI data region is shifted by 4 bytes). Fix the code by removing length alignment and by accurately printing last registers in the region. While reworking the code also fix the 16x memory overallocation in msm_disp_state_dump_regs(). Fixes: 98659487b845 ("drm/msm: add support to take dpu snapshot") Reported-by: Salendarsingh Gaud <sgaud@qti.qualcomm.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/725449/ Message-ID: <20260516-msm-fix-dsi-dump-2-v2-1-9e49fb2d240e@oss.qualcomm.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
2026-05-16ALSA: hda: Avoid quirk matching with zero PCI SSIDTakashi Iwai1-2/+2
Heiko reported that BIOS on some recent machines doesn't set up PCI SSID properly but leave with zero (e.g. on HP Dragonfly Folio 13.5 inch G3 with SSID 103c:8a05/8a06), which confuses the quirk table matching and results in the non-functional state. Fix it by skipping the PCI SSID matching when either vendor or device ID is zero and falling back to the codec SSID that is supposed to be more stable for those cases. Reported-by: Heiko Schmid <heiko@future-machines.org> Tested-by: Heiko Schmid <heiko@future-machines.org> Closes: https://lore.kernel.org/20260514133110.12302-1-heiko@future-machines.org Link: https://patch.msgid.link/20260515105700.276420-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-05-16ALSA: hda/realtek: Add quirk for HP 250 G10 (103c:8b34)Sergio Boglione1-0/+1
HP 250 15.6 inch G10 Notebook PC uses the same ALC236 codec as the HP 255 15.6 inch G10 (103c:8b2f) and requires the same fixup to enable the internal speaker EAPD and microphone routing. Signed-off-by: Sergio Boglione <sboglione@gmail.com> Link: https://patch.msgid.link/20260516131651.143109-1-sboglione@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-05-16ALSA: hda/realtek: Use ALC287_FIXUP_TXNW2781_I2C for ASUS Strix Gxx5Eric Naim1-6/+6
These devices were incorrectly using the ALC287_FIXUP_TAS2781_I2C quirk leading to errors: [ 18.765990] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found [ 18.768153] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found [ 18.768476] Serial bus multi instantiate pseudo device driver TXNW2781:00: error -ENXIO: IRQ index 0 not found [ 18.768899] Serial bus multi instantiate pseudo device driver TXNW2781:00: Instantiated 3 I2C devices. Use the ALC287_FIXUP_TXNW2781_I2C quirk instead to fix this and restore speaker audio on affected devices. Fixes: 1e9c708dc3ae ("ALSA: hda/tas2781: Add new quirk for Lenovo, ASUS, Dell projects") Link: https://lore.kernel.org/59fd4aa4-76b9-4984-8db9-a60e55ec6e80@losource.net/ Closes: https://lore.kernel.org/CACB9z7kjs8rhLstEc8fV29BCTb5dd881JwGozoKdO5cwCb=YwQ@mail.gmail.com Signed-off-by: Eric Naim <dnaim@cachyos.org> Link: https://patch.msgid.link/20260516111532.111463-1-dnaim@cachyos.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-05-16ALSA: asihpi: Fix potential OOB array access at reading cacheTakashi Iwai1-0/+6
find_control() to retrieve a cached info accesses the array with the given index blindly, which may lead to an OOB array access. Add a sanity check for avoiding it. Link: https://sashiko.dev/#/patchset/20260511230121.28606-1-rosenp%40gmail.com Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20260515085606.242284-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-05-16netfilter: nf_queue: hold bridge skb->dev while queuedHaoze Xie3-1/+6
br_pass_frame_up() rewrites skb->dev from the ingress port to the bridge master before queueing bridge LOCAL_IN packets. NFQUEUE only holds references on state.in/out and bridge physdevs, so a queued bridge packet can retain a freed bridge master in skb->dev until reinjection. When the verdict is reinjected later, br_netif_receive_skb() re-enters the receive path with skb->dev still pointing at the freed bridge master, triggering a use-after-free. Store skb->dev in the queue entry, hold a reference on it for the queue lifetime, and use the saved device when dropping queued packets during NETDEV_DOWN handling. Fixes: ac2863445686 ("netfilter: bridge: add nf_afinfo to enable queuing to userspace") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Haoze Xie <royenheart@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: br_netfilter: Reallocate headroom if necessary in neigh_hh_bridge()Lorenzo Bianconi2-3/+11
neigh_hh_bridge() assumes the skb always has sufficient headroom to copy the aligned L2 header. This assumption can trigger the crash reported below using the following netfilter setup: $modprobe br_netfilter $sysctl -w net.bridge.bridge-nf-call-iptables=1 $root@OpenWrt:~# nft list ruleset table ip nat { chain prerouting { type nat hook prerouting priority dstnat; policy accept; ip daddr 192.168.83.123 dnat to 192.168.83.120 } } - iperf3 client (192.168.83.119) --> bridge (192.168.83.118) --> iperf3 server (192.168.83.120) the iperf3 client is sending packet for 192.168.83.123 to the bridge device. [ 1579.036575] Unable to handle kernel write to read-only memory at virtual address ffffff8004d76ffe [ 1579.045482] Mem abort info: [ 1579.048273] ESR = 0x000000009600004f [ 1579.052024] EC = 0x25: DABT (current EL), IL = 32 bits [ 1579.057363] SET = 0, FnV = 0 [ 1579.060417] EA = 0, S1PTW = 0 [ 1579.063550] FSC = 0x0f: level 3 permission fault [ 1579.068345] Data abort info: [ 1579.071224] ISV = 0, ISS = 0x0000004f, ISS2 = 0x00000000 [ 1579.076720] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 [ 1579.081770] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 1579.087092] swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000080dc4000 [ 1579.093794] [ffffff8004d76ffe] pgd=180000009ffff003, p4d=180000009ffff003, pud=180000009ffff003, pmd=180000009ffe3003, pte=0060000084d76787 [ 1579.106343] Internal error: Oops: 000000009600004f [#1] SMP [ 1579.193824] CPU: 0 UID: 0 PID: 235 Comm: napi/qdma_eth-3 Tainted: G O 6.12.57 #0 [ 1579.202614] Tainted: [O]=OOT_MODULE [ 1579.206102] Hardware name: Airoha AN7581 Evaluation Board (DT) [ 1579.211929] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1579.218889] pc : br_nf_pre_routing_finish_bridge+0x1ac/0xcc8 [br_netfilter] [ 1579.225859] lr : br_nf_pre_routing_finish_bridge+0x18c/0xcc8 [br_netfilter] [ 1579.232822] sp : ffffffc0817cba20 [ 1579.236128] x29: ffffffc0817cba20 x28: 0000000000000000 x27: ffffff8002b89000 [ 1579.243273] x26: ffffff8004d7700e x25: 0000000000000008 x24: 0000000000000000 [ 1579.250416] x23: ffffffc08179d4c0 x22: 0000000000000000 x21: ffffffc08179d4c0 [ 1579.257561] x20: ffffff8004d9b800 x19: ffffff8015010000 x18: 0000000000000014 [ 1579.264704] x17: ffffffbf9e930000 x16: ffffffc0817c8000 x15: 0000000000000070 [ 1579.271848] x14: 0000000000000080 x13: 0000000000000001 x12: 0000000000000000 [ 1579.278993] x11: ffffffc0798caae0 x10: ffffff8014db6fd8 x9 : 0000000000000000 [ 1579.286136] x8 : 0000000000000003 x7 : ffffffc08171f628 x6 : 000000001a3b83d3 [ 1579.293281] x5 : 0000000000000000 x4 : 1beb76f22fee0000 x3 : ffffff8004d7700e [ 1579.300425] x2 : 0000000000000000 x1 : ffffff8004d9b8bc x0 : ffffff80026ed000 [ 1579.307570] Call trace: [ 1579.310018] br_nf_pre_routing_finish_bridge+0x1ac/0xcc8 [br_netfilter] [ 1579.316632] br_nf_hook_thresh+0xd4/0x14bc [br_netfilter] [ 1579.322032] br_nf_hook_thresh+0x250/0x14bc [br_netfilter] [ 1579.327517] br_nf_hook_thresh+0x76c/0x14bc [br_netfilter] [ 1579.333003] br_handle_frame+0x180/0x480 [ 1579.336935] __netif_receive_skb_core.constprop.0+0x540/0xf40 [ 1579.342682] __netif_receive_skb_one_core+0x28/0x50 [ 1579.347561] process_backlog+0x98/0x1e0 [ 1579.351398] __napi_poll+0x34/0x1c4 [ 1579.354887] net_rx_action+0x178/0x330 [ 1579.358638] handle_softirqs+0x108/0x2d4 [ 1579.362560] __do_softirq+0x10/0x18 [ 1579.366051] ____do_softirq+0xc/0x20 [ 1579.369627] call_on_irq_stack+0x30/0x4c [ 1579.373550] do_softirq_own_stack+0x18/0x20 [ 1579.377734] do_softirq+0x4c/0x60 [ 1579.381050] __local_bh_enable_ip+0x88/0x98 [ 1579.385234] napi_threaded_poll_loop+0x188/0x21c [ 1579.389853] napi_threaded_poll+0x70/0x80 [ 1579.393863] kthread+0xd8/0xdc [ 1579.396918] ret_from_fork+0x10/0x20 [ 1579.400499] Code: 88dffc22 3707ffc2 f9406663 f9406684 (f81f0064) [ 1579.406589] ---[ end trace 0000000000000000 ]--- [ 1579.411209] Kernel panic - not syncing: Oops: Fatal exception in interrupt [ 1579.418083] SMP: stopping secondary CPUs [ 1579.422012] Kernel Offset: disabled Fix the issue reallocating the skb headroom if necessary in neigh_hh_bridge routine. Fixes: e179e6322ac33 ("netfilter: bridge-netfilter: Fix MAC header handling with IP DNAT") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ipset: annotate "pos" for concurrent readers/writersJozsef Kadlecsik1-24/+38
The "pos" structure member of struct hbucket stores the first free slot in the hash bucket of a hash type of set and there are concurrent readers/writers. Annotate accesses properly. Fixes: 18f84d41d34f ("netfilter: ipset: Introduce RCU locking in hash:* types") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ipset: Fix data race between add and dump in all hash typesJozsef Kadlecsik1-2/+5
When adding a new entry to the next position in the existing hash bucket, the position index was incremented too early and parallel dump could read it before the entry was populated with the value. Move the setting of the position index after populating the entry. v2: Position counting fixed, noticed by Florian Westphal. Fixes: 18f84d41d34f ("netfilter: ipset: Introduce RCU locking in hash:* types") Reported-by: syzbot+786c889f046e8b003ca6@syzkaller.appspotmail.com Reported-by: syzbot+1da17e4b41d795df059e@syzkaller.appspotmail.com Reported-by: syzbot+421c5f3ff8e9493084d9@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ipset: Fix data race between add and list header in all hash typesJozsef Kadlecsik1-2/+2
The "ipset list -terse" command is actually a dump operation which may run parallel with "ipset add" commands, which can trigger an internal resizing of the hash type of sets just being dumped. However, dumping just the header part of the set was not protected against underlying resizing. Fix it by protecting the header dumping part as well. Fixes: c4c997839cf9 ("netfilter: ipset: Fix parallel resizing and listing of the same set") Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ip6t_hbh: reject oversized option listsZhengchuan Liang1-0/+4
struct ip6t_opts stores at most IP6T_OPTS_OPTSNR option descriptors, but hbh_mt6_check() does not reject larger optsnr values supplied from userspace. Validate optsnr in the rule setup path so only match data that fits the fixed-size opts array can be installed. This follows the existing xtables pattern of rejecting invalid user-provided counts in checkentry() and keeps the packet matching path unchanged. `struct ip6t_opts` has a fixed `opts[IP6T_OPTS_OPTSNR]` array, where `IP6T_OPTS_OPTSNR` is 16, then off-by-one array access is possible: [ 137.924693][ T8692] UBSAN: array-index-out-of-bounds in ../net/ipv6/netfilter/ip6t_hbh.c:110:29 [ 137.926167][ T8692] index 16 is out of range for type '__u16 [16]' Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: nft_inner: release local_lock before re-enabling softirqsFlorian Westphal1-1/+1
Quoting sashiko: In the error path, local_bh_enable() is called before local_unlock_nested_bh(). Fixes: ba36fada9ab4 ("netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ipset: stop hash:* range iteration at endNan Li4-4/+17
The following hash set variants: hash:ip,mark hash:ip,port hash:ip,port,ip hash:ip,port,net iterate IPv4 ranges with a 32-bit iterator. The iterator must stop once the last address in the requested range has been processed. Advancing it once more can move the traversal state past the end of the request, so a later retry may continue from an unintended position. Handle the iterator increment explicitly at the end of the loop and stop once the upper bound has been processed. This keeps the existing retry behaviour intact for valid ranges while preventing traversal from continuing past the original boundary. Fixes: 48596a8ddc46 ("netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addresses") Cc: stable@kernel.org Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Nan Li <tonanli66@gmail.com> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: nft_inner: Fix IPv6 inner_thoff desyncYizhou Zhao1-1/+0
In nft_inner_parse_l2l3(), when processing inner IPv6 packets, ipv6_find_hdr() correctly computes the transport header offset traversing all extension headers, but the result is immediately overwritten with nhoff + sizeof(_ip6h) (40 bytes), which only accounts for the IPv6 base header. This creates a desync between inner_thoff (wrong — points to extension header start) and l4proto (correct — e.g., IPPROTO_TCP), enabling transport header forgery and potential firewall bypass. This issue affects stable versions from Linux 6.2. For comparison, the normal (non-inner) IPv6 path correctly preserves ipv6_find_hdr()'s result. Removing the incorrect overwrite ensures that ipv6_find_hdr()'s calculated transport header offset is preserved, thereby fixing the desynchronization. Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Cc: stable@vger.kernel.org Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn> Reported-by: Xuewei Feng <fengxw06@126.com> Reported-by: Qi Li <qli01@tsinghua.edu.cn> Reported-by: Ke Xu <xuke@tsinghua.edu.cn> Assisted-by: GLM:5.1 Z.ai Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: ipset: fix a potential dump-destroy raceJozsef Kadlecsik1-0/+1
When dumping sets in order to create the proper order for restore, the list type of sets dumped last. Therefore internally we run the dumping loop twice: first with all non-list type of sets and skipping the list type ones and then secondly for the list type of sets. Sashiko noticed that there's a potential race between dump and destroy if in the first loop the last set was a list type of set: its pointer remains unreferenced and a concurrent destroy can free it. Fix the issue by resetting the variable holding the pointer. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16ipvs: avoid possible loop in ip_vs_dst_event on resizingJulian Anastasov2-66/+124
Sashiko points out that unprivileged user can frequently call ip_vs_flush() or ip_vs_del_service() to trigger svc_table_changes updates that can lead to infinite loop in ip_vs_dst_event(). This can also happen if the user triggers frequent table resizing without deleting all services. We should also consider the possible effects if the user triggers many NETDEV_DOWN events. One way to solve it is to hold svc_resize_sem in ip_vs_dst_event() but this can block the dev notifier during the whole resizing process. Instead, use new rw_semaphore svc_replace_sem to protect just the svc_table replacement which is a short code section. Then hold svc_replace_sem in ip_vs_dst_event() to serialize with replacing the svc_table. As result, loop is avoided as there is no need to repeat the table walking from the start. By this way changes in svc_table_changes can happen only when all services are removed and all dev references dropped which allows us to abort the table walking. As IP_VS_WORK_SVC_NORESIZE is the flag used to stop the svc_resize_work under service_mutex, we should check only this flag often but not while under service_mutex. To remove the mutex_trylock() for service_mutex in the second phase where the resizer installs the new table after rehashing, we will avoid holding the service_mutex there. As result, the code in configuration context which is under service_mutex should access ipvs->svc_table under RCU because it can be replaced at anytime and released after a RCU grace period. As for ip_vs_zero_all(), it needs different solution as a table walker which can escape single RCU read-side critical section: to hold the svc_replace_sem to prevent table to be replaced. In ip_vs_status_show() prefer to hold svc_replace_sem to avoid many loops, just detect if the svc_table is removed. Prefer the newly attached table for the u_thresh/l_thresh checks to know when to grow/shrink while adding or deleting services because the new table size is based on the latest parameters. Link: https://sashiko.dev/#/patchset/20260505001648.360569-1-pablo%40netfilter.org Fixes: 840aac3d900d ("ipvs: use resizable hash table for services") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16netfilter: nf_conntrack_helper: fix possible null deref during error logFlorian Westphal1-5/+8
Reported by sashiko: there is a small race window. If a helper module is unloaded or a userspace-defined helper is removed, nf_conntrack_helper_unregister() sets ->helper to NULL. Handle this safely. This needs a second patch to close related race during nf_conntrack_helper_unregister(). Fixes: b20ab9cc63ca ("netfilter: nf_ct_helper: better logging for dropped packets") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2026-05-16xfrm: ah: use skb_to_full_sk in async output callbacksMichael Bommarito2-2/+2
When AH output is offloaded to an asynchronous crypto provider (hardware accelerators such as AMD CCP, or a forced-async software shim used for testing), the digest completion fires ah_output_done() / ah6_output_done() on a workqueue. The egress skb at that point may have been originated by a TCP listener sending a SYN-ACK, which sets skb->sk to a request_sock via skb_set_owner_edemux(); it may also have been originated by an inet_timewait_sock retransmit. Neither is a full struct sock, and passing the raw skb->sk to xfrm_output_resume() then forwards a non-full socket through the rest of the xfrm output chain. xfrm_output_resume() and its downstream consumers expect a full sk where they dereference at all. The natural egress path through ah_output_done() does not crash today because the consumers that read past sock_common are either gated by sk_fullsock() or short-circuit on flags that are clear on a fresh request_sock; an exhaustive walk of the 50 most plausible consumers under sch_fq, dev_queue_xmit, netfilter, tc-egress and cgroup-egress BPF found no current unguarded deref. The bug is still a real type confusion that future consumer changes could turn into a memory-corruption primitive. This is the same bug class fixed for ESP in commit 1620c88887b1 ("xfrm: Fix the usage of skb->sk"). Apply the analogous fix to AH: convert skb->sk to a full socket pointer (or NULL) via skb_to_full_sk() before handing it to xfrm_output_resume(). The same async AH callbacks were touched recently for an independent ESN-related ICV layout bug in commit ec54093e6a8f ("xfrm: ah: account for ESN high bits in async callbacks"); the sk type-confusion addressed here is orthogonal. This patch is part of an ongoing audit of the AH callback paths; an ah_output ihl-validation hardening series is also currently under review on netdev. Reproduced under UML + KASAN + lockdep with a forced-async hmac(sha1) shim that registers at priority 9999 and wraps the sync in-tree hmac-sha1-lib. With the shim loaded, ah_output_done runs on every SYN-ACK egress through a transport-mode AH SA and skb->sk arrives as a request_sock (TCP_NEW_SYN_RECV); after this patch, xfrm_output_resume() receives the listener (the result of sk_to_full_sk()) and consumer derefs land on full-sock fields as intended. Fixes: 9ab1265d5231 ("xfrm: Use actual socket sk instead of skb socket for xfrm_output_resume") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2026-05-16ASoC: amd: acp: Add DMI quirk for ASUS Zenbook S16 UM5606GAJasper Smet1-0/+7
The ASUS Zenbook S16 (UM5606GA) with AMD Ryzen AI 9 465 (Strix Point, ACP 7.0) has a BIOS that incorrectly sets the ACPI property 'acp-audio-config-flag' to 0x10 (FLAG_AMD_LEGACY_ONLY_DMIC) for the ACP device. This prevents snd_pci_ps from probing the SoundWire bus, resulting in no internal audio (dummy output only). The hardware uses a Cirrus Logic CS42L43 (headphone/jack) and four CS35L56 smart amplifiers (speakers), all on SoundWire link 1. The corresponding machine table entry (acp70_cs42l43_l1u0_cs35l56x4_l1u0123) already exists in amd-acp70-acpi-match.c and correctly describes this topology. Add a DMI quirk to override the flag to 0, consistent with the existing entry for the HN7306EA. Signed-off-by: Jasper Smet <josbeir@gmail.com> Link: https://patch.msgid.link/20260513052137.56703-1-josbeir@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-05-16spi: mtk-snfi: Fix resource leak in mtk_snand_read_page_cache()Felix Gu1-1/+1
When DMA read times out in mtk_snand_read_page_cache(), the original code erroneously jumped to cleanup label which skips DMA unmapping and ECC disable, causing a resource leak. Fixes: 764f1b748164 ("spi: add driver for MTK SPI NAND Flash Interface") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Link: https://patch.msgid.link/20260510-snfi-v1-1-bc375cf1af8e@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-05-16ASoC: amd: acp-sdw-legacy: check CPU DAI name before loggingCássio Gabriel1-1/+1
devm_kasprintf() can fail and return NULL. The legacy AMD SoundWire machine driver logs cpus->dai_name before checking the allocation result. Move the debug print after the NULL check, matching the ordering used by the SOF AMD SoundWire path after commit 5726b68473f7 ("ASoC: amd/sdw_utils: avoid NULL deref when devm_kasprintf() fails"). Fixes: 2981d9b0789c ("ASoC: amd: acp: add soundwire machine driver for legacy stack") Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com> Link: https://patch.msgid.link/20260511-asoc-amd-acp-sdw-legacy-dai-name-null-v1-1-dc6151b6da8a@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-05-16ASoC: qcom: q6apm-dai: Allocate an extra page for PCM buffersSrinivas Kandagatla1-1/+6
Some Old DSP firmware versions use 32-bit address arithmetic and size for validating the PCM buffer address range. If a buffer is allocated near the top of the 32-bit address space, arithmetic calculations involving the end address can overflow and fail checks. Work around this by increasing the preallocated PCM buffer size by one page. The DSP is still passed the usable buffer size, excluding the extra page, which prevents the firmware from seeing an end address that crosses the 32-bit boundary. This was not hit before because PCM buffer allocation and DSP-side mapping happened at different points, and the size mapped on the DSP was usually nperiods * period_size. Therefore the mapped size was unlikely to match the full preallocated buffer size exactly, although the issue was still possible. With early buffer mapping on the DSP, the full preallocated buffer is mapped during PCM creation, making the failure reproducible at boot. Fixes: 8ea6e25c8536 ("ASoC: qcom: q6apm: Add support for early buffer mapping on DSP") Cc: Stable@vger.kernel.org Reported-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Closes: https://lore.kernel.org/all/7f10abbd-fb78-4c3a-ab90-7ca78239891a@oldschoolsolutions.biz/ Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Tested-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Link: https://patch.msgid.link/20260514090607.2435484-1-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-05-15net: hsr: defer node table free until after RCU readersMichael Bommarito1-2/+2
HSR node-list and node-status generic-netlink operations run under rcu_read_lock(). They walk hsr->node_db through hsr_get_next_node() and hsr_get_node_data(), but RTM_DELLINK teardown removes the same node table with plain list_del() and frees each node immediately. That lets a generic-netlink reader hold a struct hsr_node pointer across hsr_dellink(). In a KASAN build, widening the reader window after hsr_get_next_node() obtains the node reproduces a slab-use-after-free when the reader copies node->macaddress_A; the freeing stack is hsr_del_nodes() from hsr_dellink(). Use list_del_rcu() and defer the free through the existing hsr_free_node_rcu() callback. This matches the lifetime rule used by the HSR prune paths, which already delete nodes with list_del_rcu() and call_rcu(). Fixes: b9a1e627405d ("hsr: implement dellink to clean up resources") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/20260513233838.3064715-2-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-16btrfs: swallow btrfs_record_squota_delta() ENOENTBoris Burkov1-2/+3
I thought that it was likely I could harden squota deletion to the point that it was impossible to end up with an extent accounted to a qgroup outliving its qgroup. Several recent bugs have made me re-consider that position. Ultimately, this is a tradeoff between short term stability and long term strictness, but I think given that there could be another layer of bugs behind the 2-3 I just fixed, I would feel much more confident in people using squotas if the risk was "your values can get a bit out of whack which you can fix by deleting stuff or disabling/re-enabling/repairing" vs "it will abort your filesystem". As the final nail in the coffin, the Meta production kernel was lacking earlier fixes from me and Qu regarding subvol qgroup lifetime, so this is what we have been testing at scale, so I think at least for now upstream should have the same extra layer of protection. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-16btrfs: clamp to avoid squota underflowBoris Burkov1-2/+13
Simple quota accounting can undercount metadata tree block allocations in certain scenarios. When an undercounted subvolume is deleted and its tree blocks freed, the free deltas decrement rfer/excl past zero, wrapping the u64 to a value near U64_MAX. Once wrapped, can_delete_squota_qgroup() sees non-zero rfer and refuses to delete the qgroup. The qgroup becomes permanently orphaned in the quota tree, since there is no subvolume left to generate frees that would bring the counter back to zero. While we ultimately want to fix any mis-accounting at the source, it is also helpful and worthwhile to mitigate the damage by clamping rfer and excl to zero on underflow rather than allowing the u64 to wrap. This at least allows us to clean up the messed up qgroups on subvol deletion. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-16btrfs: fix squota accounting during enable generationBoris Burkov2-4/+28
The first transaction that enables squotas is special and a bit tricky. We have to set BTRFS_FS_QUOTA_ENABLED after the transaction to avoid a deadlock, so any delayed refs that run before we set the bit are not squota accounted. For data this is fine, we don't get an owner_ref, so there is no real harm, it's as if the extent predated squotas. However for metadata, the tree block will have gen == enable_gen so when we free it later, we will decrement the squota accounting, which can result in an underflow. Before it is freed, btrfs check shows errors, as we have mismatched usage between the node generations/owners and the squota values. There are two angles to this fix: 1. For extents that come in delayed_refs that run during the enable_gen transaction, we must actually set enable_gen to the *next* transaction. That is the first transaction that we can really properly account in any way. 2. For extents that come in between the end of our transaction handle and the time we set the BTRFS_FS_QUOTA_ENABLED bit, we need an additional bit, BTRFS_FS_SQUOTA_ENABLING which only affects recording squota deltas, so we do pick up those extents. Otherwise, we would miss them, even for enable_gen + 1. Fixes: bd7c1ea3a302 ("btrfs: qgroup: check generation when recording simple quota delta") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-16btrfs: check for subvolume before deleting squota qgroupBoris Burkov1-25/+25
The invariant that we want to maintain with subvolume qgroups is that the qgroup can only be deleted if there is no root. With squotas, we thought that it was sufficient to just check the usage, because we assumed that deleting a subvolume will drive it's qgroups usage to 0, and thus 0 usage implies no subvolume. However, this is false, for two reasons: - A subvol whose extents are all from before squotas was enabled. - A subvol that was created in this transaction and for which we have not yet run any delayed refs. In both cases, deleting the qgroup breaks the desired invariant and we are left with a subvolume with no qgroup but squotas are enabled. Fix this by unifying the deletion check logic between full qgroups and squotas. Squotas do all the same checks *and* the additional usage == 0 check, which is the one extra rule peculiar to squotas. Link: https://lore.kernel.org/linux-btrfs/adnBhWfJQ1n3hZC8@merlins.org/ Fixes: a8df35619948 ("btrfs: forbid deleting live subvol qgroup") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-16btrfs: always drop root->inodes lock before cond_resched()Boris Burkov2-2/+6
find_first_inode() and find_first_inode_to_shrink() lock root->inodes, then loop over them, occasionally skipping some inodes. When they skip an inode, they attempt to share the cpu/lock with cond_resched_lock(). However, that has a subtle problem associated with it. cond_resched_lock() only drops the lock if it needs to actually call schedule(). With CONFIG_PREEMPT_NONE, this means the full timeslice as detected at ticks. With 8+ cpus and default tunables, this is 2.8ms. So regardless of HZ, we will run for at least 2.8ms in this loop without dropping the lock, assuming it finds no suitable inodes. If HZ is small enough, it might be even worse as the tick granularity becomes bigger than the timeslice. The knock-on effect of this is that callers to btrfs_del_inode_from_root() like kswapd trying to shrink the inode slab or userspace threads calling evict() will spin on xa_lock(&root->inodes) for 2.8ms, so the extent map shrinker dominates the lock even though ostensibly it is intending to share it. This produces memory pressure as there is only one kswapd and it runs sequentially so it can get stuck in the inode slab shrinking. To fix it, simply replace cond_resched_lock() with an open coded variant which unconditionally does unlock/lock around cond_resched. Sharing the lock is decoupled from sharing the CPU, and all the users of the lock now share it fairly. I was able to reproduce this on test systems by producing a lot of empty files (to make a big root->inodes xarray), then producing memory pressure by reading large files larger than ram, triggering kswapd and the extent_map shrinker. The lock contention is visible with perf or lockstat. This patch also relieved a user-apparent bottleneck on a production system from the original report. Tested-by: Rik van Riel <riel@surriel.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-16btrfs: mark file extent range dirty after converting prealloc extentsRobbie Ko1-3/+8
When writing into a preallocated extent, ordered extent completion calls btrfs_mark_extent_written() to convert the file extent item from the BTRFS_FILE_EXTENT_PREALLOC type to the BTRFS_FILE_EXTENT_REG type. If the preallocated extent was created beyond i_size with fallocate keep-size, and the inode is evicted and loaded again before the write, the inode's file_extent_tree is initialized only up to i_size. The beyond i_size prealloc extent is therefore not tracked there. After a write into that extent extends i_size, btrfs_mark_extent_written() updates the file extent item, but the corresponding range is not marked dirty in the inode's file_extent_tree. This can leave disk_i_size stale when the filesystem does not use the no-holes feature, so after remount the file size can go back to the old value. The following reproducer triggers the problem: $ cat test.sh #!/bin/bash DEV=/dev/sdi MNT=/mnt/sdi mkfs.btrfs -f -O ^no-holes $DEV mount $DEV $MNT touch $MNT/file fallocate -n -l 2M $MNT/file umount $MNT mount $DEV $MNT dd if=/dev/zero of=$MNT/file bs=1M count=1 conv=notrunc ls -lh $MNT/file umount $MNT mount $DEV $MNT ls -lh $MNT/file umount $MNT Running the reproducer gives the following result: $ ./test.sh (...) 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.000596024 s, 1.8 GB/s -rw-rw-r-- 1 root root 1.0M May 8 16:34 /mnt/sdi/file -rw-rw-r-- 1 root root 0 May 8 16:34 /mnt/sdi/file Fix this by marking the written range dirty in the inode's file_extent_tree after successfully converting the prealloc extent to a regular extent. Fixes: 9ddc959e802b ("btrfs: use the file extent tree infrastructure") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Robbie Ko <robbieko@synology.com> [ Minor change log updates ] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-15vsock/virtio: fix zerocopy completion for multi-skb sendsStefano Garzarella1-49/+34
When a large message is fragmented into multiple skbs, the zerocopy uarg is only allocated and attached to the last skb in the loop. Non-final skbs carry pinned user pages with no completion tracking, so the kernel has no way to notify userspace when those pages are safe to reuse. If the loop breaks early the uarg is never allocated at all, leaking pinned pages with no completion notification. Fix this by following the approach used by TCP: allocate the zerocopy uarg (if not provided by the caller) before the send loop and attach it to every skb via skb_zcopy_set(), which takes a reference per skb. Each skb's completion properly decrements the refcount, and the notification only fires after the last skb is freed. On failure, if no data was sent, the uarg is cleanly aborted via net_zcopy_put_abort(). This issue was initially discovered by sashiko while reviewing commit 1cb36e252211 ("vsock/virtio: fix MSG_ZEROCOPY pinned-pages accounting") but was pre-existing. Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support") Closes: https://sashiko.dev/#/patchset/20260420132051.217589-1-sgarzare%40redhat.com Reported-by: Maher Azzouzi <maherazz04@gmail.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Link: https://patch.msgid.link/20260514092948.268720-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-15octeontx2-af: CGX: add bounds check to cgx_speed_mbps indexSam Daly1-1/+6
cgx_speed_mbps has 13 elements but RESP_LINKSTAT_SPEED can yield values 0-15. If it returns a value >= 13, this causes an out-of-bounds array access. Add a bounds check and default to speed 0 if the index is out of range. Fixes: 61071a871ea6 ("octeontx2-af: Forward CGX link notifications to PFs") Cc: Sunil Goutham <sgoutham@marvell.com> Cc: Linu Cherian <lcherian@marvell.com> Cc: Geetha sowjanya <gakula@marvell.com> Cc: hariprasad <hkelam@marvell.com> Cc: Subbaraya Sundeep <sbhatta@marvell.com> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: stable <stable@kernel.org> Signed-off-by: Sam Daly <sam@samdaly.ie> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/2026051352-refined-demise-e88d@gregkh Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-15IB/IPoIB: ndo_set_rx_mode_async conversionDragos Tatulea1-3/+5
The commit in the fixes tag added a warning for devices that are netdev ops locked that they should be converted to .ndo_set_rx_mode_async. IPoIB for mlx5 is such a driver which was missed during the conversion because the flow is more complex: - mlx5 part of IPoIB device was converted to ops-lock in commit [1]. - ipoib_intf_init() then overrides netdev_ops with ipoib_netdev_ops_{pf,vf}, which still wired ndo_set_rx_mode to the legacy sync path -- tripping the new warning on every probe. So now we have the following splat: netdevice: ib0 (uninitialized): ops-locked drivers should use ndo_set_rx_mode_async WARNING: net/core/dev.c:11366 at register_netdevice+0x83c/0x21d0 ... register_netdev+0x1f/0x40 ipoib_add_one+0x35c/0x880 [ib_ipoib] This patch implements .ndo_set_rx_mode_async but it simply schedules the multicast restart task like before. This is done to maintain the assumption that this task and others [2] must run on the same order workqueue to avoid racing with themselves. The race between ipoib_mcast_join_task() and ipoib_mcast_restart_task() would be the most obvious example. [1] 8f7b00307bf1, "net/mlx5e: Convert mlx5 netdevs to instance locking") [2] ipoib_mcast_join_task, ipoib_mcast_restart_task, ipoib_mcast_carrier_on_task, ipoib_reap_ah, ipoib_reap_neigh Fixes: 3cbd22938877 ("net: warn ops-locked drivers still using ndo_set_rx_mode") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://patch.msgid.link/20260513124519.3357165-1-dtatulea@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-15Merge tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds32-492/+537
Pull drm fixes from Dave Airlie: "Weekly fixes pull, small and all over fixes, mostly xe and amdgpu, with some ttm and a core fix for the handle change pain. core: - fix for the fix for the handle change race ttm: - avoid infinite loop in swap out - avoid infinite loop in BO shrinking - convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC bridge: - imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup i915: - Skip __i915_request_skip() for already signaled requests - Fix VSC dynamic range signaling for RGB formats [dp] xe: - Madvise fix around purgeability tracking - Restore engine mask for specific blitter style - Couple UAF fixes - Drop unused ggtt_balloon field amdgpu: - Userq fixes - DCN 3.2 fix - RAS fix - GC 12 fix gma500: - oaktrail_lvds: fix i2c handling loongson: - use managed cleanup for connector polling panfrost: - handle results from reservation locking correctly qaic: - check for integer overflows in mmap logic rocket: - handle results from reservation locking correctly" * tag 'drm-fixes-2026-05-16' of https://gitlab.freedesktop.org/drm/kernel: (26 commits) drm: Replace old pointer to new idr drm/loongson: Use managed KMS polling drm/ttm: Fix ttm_bo_shrink() infinite LRU walk on backup failure drm/ttm: Convert -EAGAIN from dmem_cgroup_try_charge to -ENOSPC drm/gma500/oaktrail_lvds: fix i2c adapter leaks on init drm/gma500/oaktrail_lvds: fix hang on init failure drm/gma500/oaktrail_hdmi: fix i2c adapter leak on setup drm/xe: Drop unused ggtt_balloon field accel/qaic: Add overflow check to remap_pfn_range during mmap drm/i915/dp: Fix VSC dynamic range signaling for RGB formats drm/i915: skip __i915_request_skip() for already signaled requests drm/bridge: imx8qxp-pxl2dpi: avoid ERR_PTR with device_node cleanup drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12 drm/amd/ras: Fix CPER ring debugfs read overflow drm/amd/display: Wrap DCN32 phantom-plane allocation in DC_RUN_WITH_PREEMPTION_ENABLED drm/amdgpu: fix userq hang detection and reset drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues drm/amdgpu: rework amdgpu_userq_signal_ioctl v3 drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset drm/xe/dma-buf: fix UAF with retry loop ...
2026-05-16drm: Replace old pointer to new idrEdward Adam Davis1-5/+2
Commit 5e28b7b94408 introduced a logical error by failing to replace the newly generated IDR pointer to old id's pointer at the correct location within the "change handle" logic; this resulted in the issue reported by syzbot [1]. Specifically, the new IDR object pointer is intended to replace the original id's pointer during the normal execution flow. Additionally, an unnecessary conditional check for the ret exit path has been removed. [1] !RB_EMPTY_ROOT(&prime_fpriv->dmabufs) WARNING: drivers/gpu/drm/drm_prime.c:224 at drm_prime_destroy_file_private+0x48/0x60 drivers/gpu/drm/drm_prime.c:224, CPU#0: syz.0.17/5833 Call Trace: drm_file_free.part.0+0x7e6/0xcc0 drivers/gpu/drm/drm_file.c:269 drm_file_free drivers/gpu/drm/drm_file.c:237 [inline] drm_close_helper.isra.0+0x186/0x200 drivers/gpu/drm/drm_file.c:290 drm_release+0x1ab/0x360 drivers/gpu/drm/drm_file.c:438 Fixes: 5e28b7b94408 ("drm: Set old handle to NULL before prime swap in change_handle") Reported-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d7c9eed171647e421013 Cc: stable@vger.kernel.org Tested-by: syzbot+d7c9eed171647e421013@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patch.msgid.link/tencent_C267296443AAA4567771176886DFF364A305@qq.com
2026-05-15ipv4: raw: reject IP_HDRINCL packets with ihl < 5Michael Bommarito1-1/+1
raw_send_hdrinc() validates that the caller-supplied IPv4 header fits within the message length: iphlen = iph->ihl * 4; err = -EINVAL; if (iphlen > length) goto error_free; if (iphlen >= sizeof(*iph)) { /* fix up saddr, tot_len, id, csum, transport_header */ } It does not, however, reject ihl < 5. For such a packet the "if (iphlen >= sizeof(*iph))" branch is skipped, leaving the crafted iphdr untouched, but the packet is still handed to __ip_local_out() and onward. Downstream consumers that read iph->ihl assume a sane value: net/ipv4/ah4.c:ah_output() in particular subtracts sizeof(struct iphdr) from top_iph->ihl * 4 and passes the (signed-int-negative, then cast to size_t) result to memcpy(), producing an OOB access of length close to SIZE_MAX and a host kernel panic. An IPv4 header with ihl < 5 is malformed by definition (RFC 791: "Internet Header Length is the length of the internet header in 32 bit words ... Note that the minimum value for a correct header is 5."). The kernel should not be willing to inject such a packet into its own output path. Reject "iphlen < sizeof(*iph)" alongside the existing "iphlen > length" check. This matches the principle that locally constructed packets that re-enter the IP stack must pass the same basic sanity tests that a foreign packet would be subjected to. Once this lands, the "if (iphlen >= sizeof(*iph))" wrapper around the fixup branch becomes redundant; left in place to keep the patch minimal and backport-friendly. A follow-up can unwrap it. Note that commit 86f4c90a1c5c ("ipv4, ipv6: ensure raw socket message is big enough to hold an IP header") ensures the message buffer is large enough to hold an iphdr, but does not constrain the self-reported iph->ihl. Reachability: the malformed packet source is any caller with CAP_NET_RAW, including an unprivileged process in a user+net namespace on a kernel with CONFIG_USER_NS=y. The reproduced AH crash also requires a matching xfrm AH policy on the outgoing route; a container granted CAP_NET_ADMIN can install that state and policy in its netns. Loopback bypasses xfrm_output, so the trigger uses a real netdev. Reproduced on UML + KASAN: kernel-mode fault at addr 0x0 with memcpy_orig at the crash site. Same shape reproduces inside a rootless Docker container with --cap-add NET_ADMIN on a stock distro kernel. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://patch.msgid.link/77ec2b5e8111961c2c39883c92e8aa2709039c17.1778614451.git.michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-05-15Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds2-43/+40
Pull arm64 MPAM fixes from Catalin Marinas: - Fix NULL dereference and a false-positive warning when the driver probes hardware with surprising version numbers - Fix writing values to the wrong registers when probing cache-utilisation counters. Replace 'NRDY' probing with a version that is robust for platforms where the bit is writeable by both hardware and software * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm_mpam: Check whether the config array is allocated before destroying it arm_mpam: Fix false positive assert failure during mpam_disable() arm_mpam: Improve check for whether or not NRDY is hardware managed arm_mpam: Pretend that NRDY is always hardware managed arm_mpam: Fix monitor instance selection when checking for hardware NRDY
2026-05-15Merge tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linuxLinus Torvalds7-110/+255
Pull iommu fixes from Joerg Roedel: "This is probably the largest fixes pull-request ever sent for IOMMU. I partially blame it on AI code review which found some issues but there is also some rework in here to fix issues in the iommu parts of PCI device reset. AMD-Vi: - Add bounds checks to debugfs and table lookups Intel VT-d: - Apply an existing quirk for Q35 graphic device - Skip dev_pasid teardown for the blocked domain to avoid out-of-bounds access - Return early if dev_pasid is missing to prevent NULL dereference or UAF Core: - Fix bugs and corner cases in pci_dev_reset_iommu_prepare/done() - Fix various issues found by AI in iommupt code MAINTAINERS email address update for RISCV IOMMU" * tag 'iommu-fixes-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: MAINTAINERS: update Tomasz Jeznach's email address iommupt: Fix the end_index calculation in __map_range_leaf() iommupt: Check for missing PAGE_SIZE in the pgsize_bitmap iommu: Handle unmap error when iommu_debug is enabled iommu: Fix up map/unmap debugging for iommupt domains iommu: Fix loss of errno on map failure for classic ops iommu/vt-d: Avoid NULL pointer dereference or refcount corruption iommu/vt-d: Fix oops due to out of scope access iommu/vt-d: Disable DMAR for Intel Q35 IGFX iommu: Warn on premature unblock during DMA aliased sibling reset iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset iommu: Fix ATS invalidation timeouts during __iommu_remove_group_pasid() iommu: Fix nested pci_dev_reset_iommu_prepare/done() iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done() iommu: Replace per-group resetting_domain with per-gdev blocked flag iommu: Fix kdocs of pci_dev_reset_iommu_done() iommu: Fix NULL group->domain dereference in pci_dev_reset_iommu_done() iommu/amd: Bounds-check devid in __rlookup_amd_iommu() iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs