wireguard-linux - WireGuard for the Linux kernel

Age	Commit message (Collapse)	Author	Files	Lines
2025-06-14	ALSA: hda/realtek: Add quirk for Asus GA605K	Simon Trimmer	1	-0/+17
	The GA605K has similar audio hardware to the GA403U so apply the same quirk. Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com> Tested-by: Timofey Titovets <nefelim4ag@gmail.com> Link: https://github.com/alsa-project/alsa-ucm-conf/issues/578 Link: https://patch.msgid.link/20250613145251.397500-1-simont@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-06-14	ALSA: hda/realtek: enable headset mic on Latitude 5420 Rugged	Jonathan Lane	1	-0/+1
	Like many Dell laptops, the 3.5mm port by default can not detect a combined headphones+mic headset or even a pure microphone. This change enables the port's functionality. Signed-off-by: Jonathan Lane <jon@borg.moe> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250611193124.26141-2-jon@borg.moe Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-06-13	rust: devres: do not dereference to the internal Revocable	Danilo Krummrich	1	-11/+16
	We can't expose direct access to the internal Revocable, since this allows users to directly revoke the internal Revocable without Devres having the chance to synchronize with the devres callback -- we have to guarantee that the internal Revocable has been fully revoked before the device is fully unbound. Hence, remove the corresponding Deref implementation and, instead, provide indirect accessors for the internal Revocable. Note that we can still support Devres::revoke() by implementing the required synchronization (which would be almost identical to the synchronization in Devres::drop()). Fixes: 76c01ded724b ("rust: add devres abstraction") Reviewed-by: Benno Lossin <lossin@kernel.org> Link: https://lore.kernel.org/r/20250611174827.380555-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	rust: devres: fix race in Devres::drop()	Danilo Krummrich	1	-8/+29
	In Devres::drop() we first remove the devres action and then drop the wrapped device resource. The design goal is to give the owner of a Devres object control over when the device resource is dropped, but limit the overall scope to the corresponding device being bound to a driver. However, there's a race that was introduced with commit 8ff656643d30 ("rust: devres: remove action in `Devres::drop`"), but also has been (partially) present from the initial version on. In Devres::drop(), the devres action is removed successfully and subsequently the destructor of the wrapped device resource runs. However, there is no guarantee that the destructor of the wrapped device resource completes before the driver core is done unbinding the corresponding device. If in Devres::drop(), the devres action can't be removed, it means that the devres callback has been executed already, or is still running concurrently. In case of the latter, either Devres::drop() wins revoking the Revocable or the devres callback wins revoking the Revocable. If Devres::drop() wins, we (again) have no guarantee that the destructor of the wrapped device resource completes before the driver core is done unbinding the corresponding device. CPU0 CPU1 ------------------------------------------------------------------------ Devres::drop() { Devres::devres_callback() { self.data.revoke() { this.data.revoke() { is_available.swap() == true is_available.swap == false } } // [...] // device fully unbound drop_in_place() { // release device resource } } } Depending on the specific device resource, this can potentially lead to user-after-free bugs. In order to fix this, implement the following logic. In the devres callback, we're always good when we get to revoke the device resource ourselves, i.e. Revocable::revoke() returns true. If Revocable::revoke() returns false, it means that Devres::drop(), concurrently, already drops the device resource and we have to wait for Devres::drop() to signal that it finished dropping the device resource. Note that if we hit the case where we need to wait for the completion of Devres::drop() in the devres callback, it means that we're actually racing with a concurrent Devres::drop() call, which already started revoking the device resource for us. This is rather unlikely and means that the concurrent Devres::drop() already started doing our work and we just need to wait for it to complete it for us. Hence, there should not be any additional overhead from that. (Actually, for now it's even better if Devres::drop() does the work for us, since it can bypass the synchronize_rcu() call implied by Revocable::revoke(), but this goes away anyways once I get to implement the split devres callback approach, which allows us to first flip the atomics of all registered Devres objects of a certain device, execute a single synchronize_rcu() and then drop all revocable objects.) In Devres::drop() we try to revoke the device resource. If that is not successful, it means that the devres callback already did and we're good. Otherwise, we try to remove the devres action, which, if successful, means that we're good, since the device resource has just been revoked by us before we removed the devres action successfully. If the devres action could not be removed, it means that the devres callback must be running concurrently, hence we signal that the device resource has been revoked by us, using the completion. This makes it safe to drop a Devres object from any task and at any point of time, which is one of the design goals. Fixes: 76c01ded724b ("rust: add devres abstraction") Reported-by: Alice Ryhl <aliceryhl@google.com> Closes: https://lore.kernel.org/lkml/aD64YNuqbPPZHAa5@google.com/ Reviewed-by: Benno Lossin <lossin@kernel.org> Link: https://lore.kernel.org/r/20250612121817.1621-4-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	rust: revocable: indicate whether `data` has been revoked already	Danilo Krummrich	1	-4/+14
	Return a boolean from Revocable::revoke() and Revocable::revoke_nosync() to indicate whether the data has been revoked already. Return true if the data hasn't been revoked yet (i.e. this call revoked the data), false otherwise. This is required by Devres in order to synchronize the completion of the revoke process. Reviewed-by: Benno Lossin <lossin@kernel.org> Acked-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20250612121817.1621-3-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	rust: completion: implement initial abstraction	Danilo Krummrich	5	-0/+124
	Implement a minimal abstraction for the completion synchronization primitive. This initial abstraction only adds complete_all() and wait_for_completion(), since that is what is required for the subsequent Devres patch. Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Benno Lossin <lossin@kernel.org> Acked-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20250612121817.1621-2-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	io_uring: run local task_work from ring exit IOPOLL reaping	Jens Axboe	1	-0/+3
	In preparation for needing to shift NVMe passthrough to always use task_work for polled IO completions, ensure that those are suitably run at exit time. See commit: 9ce6c9875f3e ("nvme: always punt polled uring_cmd end_io work to task_work") for details on why that is necessary. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	nvme: always punt polled uring_cmd end_io work to task_work	Jens Axboe	1	-14/+7
	Currently NVMe uring_cmd completions will complete locally, if they are polled. This is done because those completions are always invoked from task context. And while that is true, there's no guarantee that it's invoked under the right ring context, or even task. If someone does NVMe passthrough via multiple threads and with a limited number of poll queues, then ringA may find completions from ringB. For that case, completing the request may not be sound. Always just punt the passthrough completions via task_work, which will redirect the completion, if needed. Cc: stable@vger.kernel.org Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	PCI/PM: Set up runtime PM even for devices without PCI PM	Mario Limonciello	1	-2/+3
	4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing") intended to put PCI devices into D0, but in doing so unintentionally changed runtime PM initialization not to occur on devices that don't support PCI PM. This caused a regression in vfio-pci due to an imbalance with its use. Adjust the logic in pci_pm_init() so that even if PCI PM isn't supported runtime PM is still initialized. Fixes: 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing") Reported-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Closes: https://lore.kernel.org/linux-pci/20250424043232.1848107-1-superm1@kernel.org/T/#m7e8929d6421690dc8bd6dc639d86c2b4db27cbc4 Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Closes: https://lore.kernel.org/linux-pci/20250424043232.1848107-1-superm1@kernel.org/T/#m40d277dcdb9be64a1609a82412d1aa906263e201 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Cc: Alex Williamson <alex.williamson@redhat.com> Link: https://patch.msgid.link/20250611233117.61810-1-superm1@kernel.org
2025-06-13	posix-cpu-timers: fix race between handle_posix_cpu_timers() and posix_cpu_timer_del()	Oleg Nesterov	1	-0/+9
	If an exiting non-autoreaping task has already passed exit_notify() and calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent or debugger right after unlock_task_sighand(). If a concurrent posix_cpu_timer_del() runs at that moment, it won't be able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or lock_task_sighand() will fail. Add the tsk->exit_state check into run_posix_cpu_timers() to fix this. This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because exit_task_work() is called before exit_notify(). But the check still makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail anyway in this case. Cc: stable@vger.kernel.org Reported-by: Benoît Sevens <bsevens@google.com> Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-13	ASoC: amd: yc: update quirk data for HP Victus	Raven Black	1	-0/+7
	Make the internal microphone work on HP Victus laptops. Signed-off-by: Raven Black <ravenblack@gmail.com> Link: https://patch.msgid.link/20250613-support-hp-victus-microphone-v1-1-bebc4c3a2041@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13	io_uring/kbuf: don't truncate end buffer for multiple buffer peeks	Jens Axboe	1	-1/+4
	If peeking a bunch of buffers, normally io_ring_buffers_peek() will truncate the end buffer. This isn't optimal as presumably more data will be arriving later, and hence it's better to stop with the last full buffer rather than truncate the end buffer. Cc: stable@vger.kernel.org Fixes: 35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers") Reported-by: Christian Mazakas <christian.mazakas@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	powerpc: Fix struct termio related ioctl macros	Madhavan Srinivasan	1	-4/+4
	Since termio interface is now obsolete, include/uapi/asm/ioctls.h has some constant macros referring to "struct termio", this caused build failure at userspace. In file included from /usr/include/asm/ioctl.h:12, from /usr/include/asm/ioctls.h:5, from tst-ioctls.c:3: tst-ioctls.c: In function 'get_TCGETA': tst-ioctls.c:12:10: error: invalid application of 'sizeof' to incomplete type 'struct termio' 12 \| return TCGETA; \| ^~~~~~ Even though termios.h provides "struct termio", trying to juggle definitions around to make it compile could introduce regressions. So better to open code it. Reported-by: Tulio Magno <tuliom@ascii.art.br> Suggested-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> Closes: https://lore.kernel.org/linuxppc-dev/8734dji5wl.fsf@ascii.art.br/ Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250517142237.156665-1-maddy@linux.ibm.com
2025-06-13	spi: tegra210-qspi: Remove cache operations	Thierry Reding	1	-14/+0
	The DMA memory for this driver is allocated using dma_alloc_coherent(), which ends up mapping the allocated memory as uncached. Performing the various dma_sync_() operations on this memory causes issues during SPI flashing: [ 7.818017] pc : dcache_inval_poc+0x40/0x58 [ 7.822128] lr : arch_sync_dma_for_cpu+0x2c/0x4c [ 7.826854] sp : ffff80008193bcf0 [ 7.830267] x29: ffff80008193bcf0 x28: ffffa3fe5ff1e908 x27: ffffa3fe627bb130 [ 7.837528] x26: ffff000086952180 x25: ffff00008015c8ac x24: ffff000086c9b480 [ 7.844878] x23: ffff00008015c800 x22: 0000000000000002 x21: 0000000000010000 [ 7.852229] x20: 0000000106dae000 x19: ffff000080112410 x18: 0000000000000001 [ 7.859580] x17: ffff000080159400 x16: ffffa3fe607a9bd8 x15: ffff0000eac1b180 [ 7.866753] x14: 000000000000000c x13: 0000000000000001 x12: 000000000000025a [ 7.874104] x11: 0000000000000000 x10: 7f73e96357f6a07f x9 : db1fc8072a7f5e3a [ 7.881365] x8 : ffff000086c9c588 x7 : ffffa3fe607a9bd8 x6 : ffff80008193bc28 [ 7.888630] x5 : 000000000000ffff x4 : 0000000000000009 x3 : 000000000000003f [ 7.895892] x2 : 0000000000000040 x1 : ffff000086dbe000 x0 : ffff000086db0000 [ 7.903155] Call trace: [ 7.905606] dcache_inval_poc+0x40/0x58 (P) [ 7.909804] iommu_dma_sync_single_for_cpu+0xb4/0xb8 [ 7.914617] __dma_sync_single_for_cpu+0x158/0x194 [ 7.919428] __this_module+0x5b020/0x5baf8 [spi_tegra210_quad] [ 7.925291] irq_thread_fn+0x2c/0xc0 [ 7.928966] irq_thread+0x16c/0x318 [ 7.932467] kthread+0x12c/0x214 Fix this by removing all calls to the dma_sync_() functions. This isn't ideal because DMA is used only for relatively large (> 64 words or 256 bytes) and using uncached memory for this might be slow. Reworking this to use cached memory for faster access and reintroducing the cache maintenance calls is probably worth a follow-up patch. Reported-by: Brad Griffis <bgriffis@nvidia.com> Fixes: 017f1b0bae08 ("spi: tegra210-quad: Add support for internal DMA") Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20250613123037.2082788-1-thierry.reding@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13	Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists	Bagas Sanjaya	1	-0/+2
	Stephen Rothwell reports htmldocs warning on ublk docs: Documentation/block/ublk.rst:414: ERROR: Unexpected indentation. [docutils] Fix the warning by separating sublists of auto buffer registration fallback behavior from their appropriate parent list item. Fixes: ff20c516485e ("ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250612132638.193de386@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250613023857.15971-1-bagasdotme@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	iommu/tegra: Fix incorrect size calculation	Jason Gunthorpe	1	-2/+2
	This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32 in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-13	Documentation: nouveau: Update GSP message queue kernel-doc reference	Bagas Sanjaya	1	-1/+1
	GSP message queue docs has been moved following RPC handling split in commit 8a8b1ec5261f20 ("drm/nouveau/gsp: split rpc handling out on its own"), before GSP-RM implementation is versioned in commit c472d828348caf ("drm/nouveau/gsp: move subdev/engine impls to subdev/gsp/rm/r535/"). However, the kernel-doc reference in nouveau docs is left behind, which triggers htmldocs warnings: ERROR: Cannot find file ./drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c WARNING: No kernel-doc for file ./drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c Update the reference. Fixes: c472d828348c ("drm/nouveau/gsp: move subdev/engine impls to subdev/gsp/rm/r535/") Fixes: 8a8b1ec5261f ("drm/nouveau/gsp: split rpc handling out on its own") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20250611020805.22418-2-bagasdotme@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	drm/nouveau/bl: increase buffer size to avoid truncate warning	Jacob Keller	1	-1/+1
	The nouveau_get_backlight_name() function generates a unique name for the backlight interface, appending an id from 1 to 99 for all backlight devices after the first. GCC 15 (and likely other compilers) produce the following -Wformat-truncation warning: nouveau_backlight.c: In function ‘nouveau_backlight_init’: nouveau_backlight.c:56:69: error: ‘%d’ directive output may be truncated writing between 1 and 10 bytes into a region of size 3 [-Werror=format-truncation=] 56 \| snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); \| ^~ In function ‘nouveau_get_backlight_name’, inlined from ‘nouveau_backlight_init’ at nouveau_backlight.c:351:7: nouveau_backlight.c:56:56: note: directive argument in the range [1, 2147483647] 56 \| snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); \| ^~~~~~~~~~~~~~~~ nouveau_backlight.c:56:17: note: ‘snprintf’ output between 14 and 23 bytes into a destination of size 15 56 \| snprintf(backlight_name, BL_NAME_SIZE, "nv_backlight%d", nb); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The warning started appearing after commit ab244be47a8f ("drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name()") This fix for the ida usage removed the explicit value check for ids larger than 99. The compiler is unable to intuit that the ida_alloc_max() limits the returned value range between 0 and 99. Because the compiler can no longer infer that the number ranges from 0 to 99, it thinks that it could use as many as 11 digits (10 + the potential - sign for negative numbers). The warning has gone unfixed for some time, with at least one kernel test robot report. The code breaks W=1 builds, which is especially frustrating with the introduction of CONFIG_WERROR. The string is stored temporarily on the stack and then copied into the device name. Its not a big deal to use 11 more bytes of stack rounding out to an even 24 bytes. Increase BL_NAME_SIZE to 24 to avoid the truncation warning. This fixes the W=1 builds that include this driver. Compile tested only. Fixes: ab244be47a8f ("drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name()") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312050324.0kv4PnfZ-lkp@intel.com/ Suggested-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20250610-jk-nouveua-drm-bl-snprintf-fix-v2-1-7fdd4b84b48e@intel.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	drm/nouveau: fix a use-after-free in r535_gsp_rpc_push()	Zhi Wang	1	-5/+12
	The RPC container is released after being passed to r535_gsp_rpc_send(). When sending the initial fragment of a large RPC and passing the caller's RPC container, the container will be freed prematurely. Subsequent attempts to send remaining fragments will therefore result in a use-after-free. Allocate a temporary RPC container for holding the initial fragment of a large RPC when sending. Free the caller's container when all fragments are successfully sent. Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Zhi Wang <zhiw@nvidia.com> Link: https://lore.kernel.org/r/20250527163712.3444-1-zhiw@nvidia.com [ Rebase onto Blackwell changes. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-06-13	drm/nouveau/gsp: Fix potential integer overflow on integer shifts	Colin Ian King	1	-1/+1
	The left shift int 32 bit integer constants 1 is evaluated using 32 bit arithmetic and then assigned to a 64 bit unsigned integer. In the case where the shift is 32 or more this can lead to an overflow. Avoid this by shifting using the BIT_ULL macro instead. Fixes: 6c3ac7bcfcff ("drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250522131512.2768310-1-colin.i.king@gmail.com
2025-06-13	genirq/irq_sim: Initialize work context pointers properly	Gyeyoung Baek	1	-1/+1
	Initialize `ops` member's pointers properly by using kzalloc() instead of kmalloc() when allocating the simulation work context. Otherwise the pointers contain random content leading to invalid dereferencing. Signed-off-by: Gyeyoung Baek <gye976@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250612124827.63259-1-gye976@gmail.com
2025-06-13	genirq/cpuhotplug: Restore affinity even for suspended IRQ	Brian Norris	1	-7/+0
	Commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") tried to make managed shutdown/startup properly reference counted, but it missed the fact that the unplug and hotplug code has an intentional imbalance by skipping IRQS_SUSPENDED interrupts on the "restore" path. This means that if a managed-affinity interrupt was both suspended and managed-shutdown (such as may happen during system suspend / S3), resume skips calling irq_startup_managed(), and would again have an unbalanced depth this time, with a positive value (i.e., remaining unexpectedly masked). This IRQS_SUSPENDED check was introduced in commit a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity") for essentially the same reason as commit 788019eb559f, to prevent that irq_startup() would unconditionally re-enable an interrupt too early. Because irq_startup_managed() now respsects the disable-depth count, the IRQS_SUSPENDED check is not longer needed, and instead, it causes harm. Thus, drop the IRQS_SUSPENDED check, and restore balance. This effectively reverts commit a60dd06af674 ("genirq/cpuhotplug: Skip suspended interrupts when restoring affinity"), because it is replaced by commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug"). Fixes: 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") Reported-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Link: https://lore.kernel.org/all/20250612183303.3433234-3-briannorris@chromium.org Closes: https://lore.kernel.org/lkml/24ec4adc-7c80-49e9-93ee-19908a97ab84@gmail.com/
2025-06-13	genirq/cpuhotplug: Rebalance managed interrupts across multi-CPU hotplug	Brian Norris	1	-0/+8
	Commit 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") intended to only decrement the disable depth once per managed shutdown, but instead it decrements for each CPU hotplug in the affinity mask, until its depth reaches a point where it finally gets re-started. For example, consider: 1. Interrupt is affine to CPU {M,N} 2. disable_irq() -> depth is 1 3. CPU M goes offline -> interrupt migrates to CPU N / depth is still 1 4. CPU N goes offline -> irq_shutdown() / depth is 2 5. CPU N goes online -> irq_restore_affinity_of_irq() -> irqd_is_managed_and_shutdown()==true -> irq_startup_managed() -> depth is 1 6. CPU M goes online -> irq_restore_affinity_of_irq() -> irqd_is_managed_and_shutdown()==true -> irq_startup_managed() -> depth is 0 * BUG: driver expects the interrupt is still disabled * -> irq_startup() -> irqd_clr_managed_shutdown() 7. enable_irq() -> depth underflow / unbalanced enable_irq() warning This should clear the managed-shutdown flag at step 6, so that further hotplugs don't cause further imbalance. Note: It might be cleaner to also remove the irqd_clr_managed_shutdown() invocation from __irq_startup_managed(). But this is currently not possible because of irq_update_affinity_desc() as it sets IRQD_MANAGED_SHUTDOWN and expects irq_startup() to clear it. Fixes: 788019eb559f ("genirq: Retain disable depth for managed interrupts across CPU hotplug") Reported-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Aleksandrs Vinarskis <alex.vinarskis@gmail.com> Link: https://lore.kernel.org/all/20250612183303.3433234-2-briannorris@chromium.org
2025-06-13	ata: ahci: Disallow LPM for ASUSPRO-D840SA motherboard	Niklas Cassel	1	-1/+18
	A user has bisected a regression which causes graphical corruptions on his screen to commit 7627a0edef54 ("ata: ahci: Drop low power policy board type"). Simply reverting commit 7627a0edef54 ("ata: ahci: Drop low power policy board type") makes the graphical corruptions on his screen to go away. (Note: there are no visible messages in dmesg that indicates a problem with AHCI.) The user also reports that the problem occurs regardless if there is an HDD or an SSD connected via AHCI, so the problem is not device related. The devices also work fine on other motherboards, so it seems specific to the ASUSPRO-D840SA motherboard. While enabling low power modes for AHCI is not supposed to affect completely unrelated hardware, like a graphics card, it does however allow the system to enter deeper PC-states, which could expose ACPI issues that were previously not visible (because the system never entered these lower power states before). There are previous examples where enabling LPM exposed serious BIOS/ACPI bugs, see e.g. commit 240630e61870 ("ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS"). Since there hasn't been any BIOS update in years for the ASUSPRO-D840SA motherboard, disable LPM for this board, in order to avoid entering lower PC-states, which triggers graphical corruptions. Cc: stable@vger.kernel.org Reported-by: Andy Yang <andyybtc79@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220111 Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type") Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hans de Goede <hansg@kernel.org> Link: https://lore.kernel.org/r/20250612141750.2108342-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org>
2025-06-13	block: Fix bvec_set_folio() for very large folios	Matthew Wilcox (Oracle)	1	-2/+5
	Similarly to 26064d3e2b4d ("block: fix adding folio to bio"), if we attempt to add a folio that is larger than 4GB, we'll silently truncate the offset and len. Widen the parameters to size_t, assert that the length is less than 4GB and set the first page that contains the interesting data rather than the first page of the folio. Fixes: 26db5ee15851 (block: add a bvec_set_folio helper) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP	Matthew Wilcox (Oracle)	1	-1/+1
	It is possible for physically contiguous folios to have discontiguous struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not. This is correctly handled by folio_page_idx(), so remove this open-coded implementation. Fixes: 640d1930bef4 (block: Add bio_for_each_folio_all()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13	Revert "platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1"	Kurt Borja	1	-1/+1
	This reverts commit 5ff79cabb23a2f14d2ed29e9596aec908905a0e6. Although the Alienware m16 R1 AMD model supports G-Mode, it actually has a lower power ceiling than plain "performance" profile, which results in lower performance. Reported-by: Cihan Ozakca <cozakca@outlook.com> Cc: stable@vger.kernel.org # 6.15.x Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250611-m16-rev-v1-1-72d13bad03c9@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-13	platform/x86/amd/pmc: Add PCSpecialist Lafite Pro V 14M to 8042 quirks list	Mario Limonciello	1	-0/+9
	Every other s2idle cycle fails to reach hardware sleep when keyboard wakeup is enabled. This appears to be an EC bug, but the vendor refuses to fix it. It was confirmed that turning off i8042 wakeup avoids ths issue (albeit keyboard wakeup is disabled). Take the lesser of two evils and add it to the i8042 quirk list. Reported-by: Raoul <ein4rth@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220116 Tested-by: Raoul <ein4rth@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250611203341.3733478-1-superm1@kernel.org Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-06-13	spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine	Thangaraj Samynathan	1	-1/+1
	Removes MSI-X from the interrupt request path, as the DMA engine used by the SPI controller does not support MSI-X interrupts. Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com> Link: https://patch.msgid.link/20250612023059.71726-1-thangaraj.s@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13	ASoC: apple: mca: Drop default ARCH_APPLE in Kconfig	Sven Peter	1	-1/+0
	When the first driver for Apple Silicon was upstreamed we accidentally included `default ARCH_APPLE` in its Kconfig which then spread to almost every subsequent driver. As soon as ARCH_APPLE is set to y this will pull in many drivers as built-ins which is not what we want. Thus, drop `default ARCH_APPLE` from Kconfig. Signed-off-by: Sven Peter <sven@kernel.org> Reviewed-by: Janne Grunau <j@jannau.net> Link: https://patch.msgid.link/20250612-apple-kconfig-defconfig-v1-10-0e6f9cb512c1@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13	crypto: testmgr - reinstate kconfig control over full self-tests	Eric Biggers	4	-10/+38
	Commit 698de822780f ("crypto: testmgr - make it easier to enable the full set of tests") removed support for building kernels that run only the "fast" set of crypto self-tests by default. This assumed that nearly everyone actually wanted the full set of tests, if they had already chosen to enable the tests at all. Unfortunately, it turns out that both Debian and Fedora intentionally have the crypto self-tests enabled in their production kernels. And for production kernels we do need to keep the testing time down, which implies just running the "fast" tests, not the full set of tests. For Fedora, a reason for enabling the tests in production is that they are being (mis)used to meet the FIPS 140-3 pre-operational testing requirement. However, the other reason for enabling the tests in production, which applies to both distros, is that they provide some value in protecting users from buggy drivers. Unfortunately, the crypto/ subsystem has many buggy and untested drivers for off-CPU hardware accelerators on rare platforms. These broken drivers get shipped to users, and there have been multiple examples of the tests preventing these buggy drivers from being used. So effectively, the tests are being relied on in production kernels. I think this is kind of crazy (untested drivers should just not be enabled at all), but that seems to be how things work currently. Thus, reintroduce a kconfig option that controls the level of testing. Call it CRYPTO_SELFTESTS_FULL instead of the original name CRYPTO_MANAGER_EXTRA_TESTS, which was slightly misleading. Moreover, given the "production kernel" use case, make CRYPTO_SELFTESTS depend on EXPERT instead of DEBUG_KERNEL. I also haven't reinstated all the #ifdefs in crypto/testmgr.c. Instead, just rely on the compiler to optimize out unused code. Fixes: 40b9969796bf ("crypto: testmgr - replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS") Fixes: 698de822780f ("crypto: testmgr - make it easier to enable the full set of tests") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-06-13	ALSA: usb-audio: Rename ALSA kcontrol PCM and PCM1 for the KTMicro sound card	wangdicheng	1	-0/+12
	PCM1 not in Pulseaudio's control list; standardize control to "Speaker" and "Headphone". Signed-off-by: wangdicheng <wangdicheng@kylinos.cn> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250613063636.239683-1-wangdich9700@163.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-06-13	perf/x86/intel: Fix crash in icl_update_topdown_event()	Kan Liang	1	-1/+1
	The perf_fuzzer found a hard-lockup crash on a RaptorLake machine: Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000 CPU: 23 UID: 0 PID: 0 Comm: swapper/23 Tainted: [W]=WARN Hardware name: Dell Inc. Precision 9660/0VJ762 RIP: 0010:native_read_pmc+0x7/0x40 Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ... RSP: 000:fffb03100273de8 EFLAGS: 00010046 .... Call Trace: <TASK> icl_update_topdown_event+0x165/0x190 ? ktime_get+0x38/0xd0 intel_pmu_read_event+0xf9/0x210 __perf_event_read+0xf9/0x210 CPUs 16-23 are E-core CPUs that don't support the perf metrics feature. The icl_update_topdown_event() should not be invoked on these CPUs. It's a regression of commit: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read") The bug introduced by that commit is that the is_topdown_event() function is mistakenly used to replace the is_topdown_count() call to check if the topdown functions for the perf metrics feature should be invoked. Fix it. Fixes: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read") Closes: https://lore.kernel.org/lkml/352f0709-f026-cd45-e60c-60dfd97f73f3@maine.edu/ Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Cc: stable@vger.kernel.org # v6.15+ Link: https://lore.kernel.org/r/20250612143818.2889040-1-kan.liang@linux.intel.com
2025-06-13	powerpc: dts: mpc8315erdb: Add GPIO controller node	J. Neuschäfer	1	-0/+10
	The MPC8315E SoC and variants have a GPIO controller at IMMR + 0xc00. This node was previously missing from the device tree. Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250611-mpc-gpio-v1-1-02d1f75336e2@posteo.net
2025-06-13	powerpc/microwatt: Fix model property in device tree	J. Neuschäfer	1	-1/+1
	The standard property for the model name is called "model". Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250611-microwatt-v2-1-80847bbc5f9c@posteo.net
2025-06-13	powerpc/eeh: Fix missing PE bridge reconfiguration during VFIO EEH recovery	Narayana Murty N	1	-0/+2
	VFIO EEH recovery for PCI passthrough devices fails on PowerNV and pseries platforms due to missing host-side PE bridge reconfiguration. In the current implementation, eeh_pe_configure() only performs RTAS or OPAL-based bridge reconfiguration for native host devices, but skips it entirely for PEs managed through VFIO in guest passthrough scenarios. This leads to incomplete EEH recovery when a PCI error affects a passthrough device assigned to a QEMU/KVM guest. Although VFIO triggers the EEH recovery flow through VFIO_EEH_PE_ENABLE ioctl, the platform-specific bridge reconfiguration step is silently bypassed. As a result, the PE's config space is not fully restored, causing subsequent config space access failures or EEH freeze-on-access errors inside the guest. This patch fixes the issue by ensuring that eeh_pe_configure() always invokes the platform's configure_bridge() callback (e.g., pseries_eeh_phb_configure_bridge) even for VFIO-managed PEs. This ensures that RTAS or OPAL calls to reconfigure the PE bridge are correctly issued on the host side, restoring the PE's configuration space after an EEH event. This fix is essential for reliable EEH recovery in QEMU/KVM guests using VFIO PCI passthrough on PowerNV and pseries systems. Tested with: - QEMU/KVM guest using VFIO passthrough (IBM Power9,(lpar)Power11 host) - Injected EEH errors with pseries EEH errinjct tool on host, recovery verified on qemu guest. - Verified successful config space access and CAP_EXP DevCtl restoration after recovery Fixes: 212d16cdca2d ("powerpc/eeh: EEH support for VFIO PCI device") Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250508062928.146043-1-nnmlinux@linux.ibm.com
2025-06-13	powerpc/vdso: Fix build of VDSO32 with pcrel	Christophe Leroy	2	-2/+2
	Building vdso32 on power10 with pcrel leads to following errors: VDSO32A arch/powerpc/kernel/vdso/gettimeofday-32.o arch/powerpc/kernel/vdso/gettimeofday.S: Assembler messages: arch/powerpc/kernel/vdso/gettimeofday.S:40: Error: syntax error; found `@', expected `,' arch/powerpc/kernel/vdso/gettimeofday.S:71: Info: macro invoked from here arch/powerpc/kernel/vdso/gettimeofday.S:40: Error: junk at end of line: `@notoc' arch/powerpc/kernel/vdso/gettimeofday.S:71: Info: macro invoked from here ... make[2]: * [arch/powerpc/kernel/vdso/Makefile:85: arch/powerpc/kernel/vdso/gettimeofday-32.o] Error 1 make[1]: * [arch/powerpc/Makefile:388: vdso_prepare] Error 2 Once the above is fixed, the following happens: VDSO32C arch/powerpc/kernel/vdso/vgettimeofday-32.o cc1: error: '-mpcrel' requires '-mcmodel=medium' make[2]: * [arch/powerpc/kernel/vdso/Makefile:89: arch/powerpc/kernel/vdso/vgettimeofday-32.o] Error 1 make[1]: * [arch/powerpc/Makefile:388: vdso_prepare] Error 2 make: *** [Makefile:251: __sub-make] Error 2 Make sure pcrel version of CFUNC() macro is used only for powerpc64 builds and remove -mpcrel for powerpc32 builds. Fixes: 7e3a68be42e1 ("powerpc/64: vmlinux support building with PCREL addresing") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/1fa3453f07d42a50a70114da9905bf7b73304fca.1747073669.git.christophe.leroy@csgroup.eu
2025-06-13	drm/mgag200: Do not include <linux/export.h>	Thomas Zimmermann	1	-1/+0
	Fix the compile-time warning drivers/gpu/drm/mgag200/mgag200_ddc.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250612085308.203861-1-tzimmermann@suse.de
2025-06-13	drm/ast: Do not include <linux/export.h>	Thomas Zimmermann	1	-1/+0
	Fix the compile-time warning drivers/gpu/drm/ast/ast_mode.c: warning: EXPORT_SYMBOL() is not used, but #include <linux/export.h> is present Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://lore.kernel.org/r/20250612084257.200907-1-tzimmermann@suse.de
2025-06-12	mm: add mmap_prepare() compatibility layer for nested file systems	Lorenzo Stoakes	5	-3/+107
	Nested file systems, that is those which invoke call_mmap() within their own f_op->mmap() handlers, may encounter underlying file systems which provide the f_op->mmap_prepare() hook introduced by commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). We have a chicken-and-egg scenario here - until all file systems are converted to using .mmap_prepare(), we cannot convert these nested handlers, as we can't call f_op->mmap from an .mmap_prepare() hook. So we have to do it the other way round - invoke the .mmap_prepare() hook from an .mmap() one. in order to do so, we need to convert VMA state into a struct vm_area_desc descriptor, invoking the underlying file system's f_op->mmap_prepare() callback passing a pointer to this, and then setting VMA state accordingly and safely. This patch achieves this via the compat_vma_mmap_prepare() function, which we invoke from call_mmap() if f_op->mmap_prepare() is specified in the passed in file pointer. We place the fundamental logic into mm/vma.h where VMA manipulation belongs. We also update the VMA userland tests to accommodate the changes. The compat_vma_mmap_prepare() function and its associated machinery is temporary, and will be removed once the conversion of file systems is complete. We carefully place this code so it can be used with CONFIG_MMU and also with cutting edge nommu silicon. [akpm@linux-foundation.org: export compat_vma_mmap_prepare tp fix build] [lorenzo.stoakes@oracle.com: remove unused declarations] Link: https://lkml.kernel.org/r/ac3ae324-4c65-432a-8c6d-2af988b18ac8@lucifer.local Link: https://lkml.kernel.org/r/20250609165749.344976-1-lorenzo.stoakes@oracle.com Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/ Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-12	ionic: Prevent driver/fw getting out of sync on devcmd(s)	Brett Creeley	1	-1/+2
	Some stress/negative firmware testing around devcmd(s) returning EAGAIN found that the done bit could get out of sync in the firmware when it wasn't cleared in a retry case. While here, change the type of the local done variable to a bool to match the return type from ionic_dev_cmd_done(). Fixes: ec8ee714736e ("ionic: stretch heartbeat detection") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250609212827.53842-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12	SUNRPC: Cleanup/fix initial rq_pages allocation	Benjamin Coddington	1	-5/+1
	While investigating some reports of memory-constrained NUMA machines failing to mount v3 and v4.0 nfs mounts, we found that svc_init_buffer() was not attempting to retry allocations from the bulk page allocator. Typically, this results in a single page allocation being returned and the mount attempt fails with -ENOMEM. A retry would have allowed the mount to succeed. Additionally, it seems that the bulk allocation in svc_init_buffer() is redundant because svc_alloc_arg() will perform the required allocation and does the correct thing to retry the allocations. The call to allocate memory in svc_alloc_arg() drops the preferred node argument, but I expect we'll still allocate on the preferred node because the allocation call happens within the svc thread context, which chooses the node with memory closest to the current thread's execution. This patch cleans out the bulk allocation in svc_init_buffer() to allow svc_alloc_arg() to handle the allocation/retry logic for rq_pages. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: ed603bcf4fea ("sunrpc: Replace the rq_pages array with dynamically-allocated memory") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-06-12	NFSD: Avoid corruption of a referring call list	Chuck Lever	1	-0/+1
	The new code neglects to remove a freshly-allocated RCL from the callback's referring call list when no matching referring call is found. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202505171002.cE46sdj5-lkp@intel.com/ Fixes: 4f3c8d8c9e10 ("NFSD: Implement CB_SEQUENCE referring call lists") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-06-12	smb: improve directory cache reuse for readdir operations	Bharath SM	2	-17/+19
	Currently, cached directory contents were not reused across subsequent 'ls' operations because the cache validity check relied on comparing the ctx pointer, which changes with each readdir invocation. As a result, the cached dir entries was not marked as valid and the cache was not utilized for subsequent 'ls' operations. This change uses the file pointer, which remains consistent across all readdir calls for a given directory instance, to associate and validate the cache. As a result, cached directory contents can now be correctly reused, improving performance for repeated directory listings. Performance gains with local windows SMB server: Without the patch and default actimeo=1: 1000 directory enumeration operations on dir with 10k files took 135.0s With this patch and actimeo=0: 1000 directory enumeration operations on dir with 10k files took just 5.1s Signed-off-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-12	smb: client: fix perf regression with deferred closes	Paulo Alcantara	1	-3/+6
	Customer reported that one of their applications started failing to open files with STATUS_INSUFFICIENT_RESOURCES due to NetApp server hitting the maximum number of opens to same file that it would allow for a single client connection. It turned out the client was failing to reuse open handles with deferred closes because matching ->f_flags directly without masking off O_CREAT\|O_EXCL\|O_TRUNC bits first broke the comparision and then client ended up with thousands of deferred closes to same file. Those bits are already satisfied on the original open, so no need to check them against existing open handles. Reproducer: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <fcntl.h> #include <pthread.h> #define NR_THREADS 4 #define NR_ITERATIONS 2500 #define TEST_FILE "/mnt/1/test/dir/foo" static char buf[64]; static void worker(void arg) { int i, j; int fd; for (i = 0; i < NR_ITERATIONS; i++) { fd = open(TEST_FILE, O_WRONLY\|O_CREAT\|O_APPEND, 0666); for (j = 0; j < 16; j++) write(fd, buf, sizeof(buf)); close(fd); } } int main(int argc, char *argv[]) { pthread_t t[NR_THREADS]; int fd; int i; fd = open(TEST_FILE, O_WRONLY\|O_CREAT\|O_TRUNC, 0666); close(fd); memset(buf, 'a', sizeof(buf)); for (i = 0; i < NR_THREADS; i++) pthread_create(&t[i], NULL, worker, NULL); for (i = 0; i < NR_THREADS; i++) pthread_join(t[i], NULL); return 0; } Before patch: $ mount.cifs //srv/share /mnt/1 -o ... $ mkdir -p /mnt/1/test/dir $ gcc repro.c && ./a.out ... number of opens: 1391 After patch: $ mount.cifs //srv/share /mnt/1 -o ... $ mkdir -p /mnt/1/test/dir $ gcc repro.c && ./a.out ... number of opens: 1 Cc: linux-cifs@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Jay Shin <jaeshin@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Fixes: b8ea3b1ff544 ("smb: enable reuse of deferred file handles for write operations") Acked-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-12	arm64/ptrace: Fix stack-out-of-bounds read in regs_get_kernel_stack_nth()	Tengda Wu	1	-1/+1
	KASAN reports a stack-out-of-bounds read in regs_get_kernel_stack_nth(). Call Trace: [ 97.283505] BUG: KASAN: stack-out-of-bounds in regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.284677] Read of size 8 at addr ffff800089277c10 by task 1.sh/2550 [ 97.285732] [ 97.286067] CPU: 7 PID: 2550 Comm: 1.sh Not tainted 6.6.0+ #11 [ 97.287032] Hardware name: linux,dummy-virt (DT) [ 97.287815] Call trace: [ 97.288279] dump_backtrace+0xa0/0x128 [ 97.288946] show_stack+0x20/0x38 [ 97.289551] dump_stack_lvl+0x78/0xc8 [ 97.290203] print_address_description.constprop.0+0x84/0x3c8 [ 97.291159] print_report+0xb0/0x280 [ 97.291792] kasan_report+0x84/0xd0 [ 97.292421] __asan_load8+0x9c/0xc0 [ 97.293042] regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.293835] process_fetch_insn+0x770/0xa30 [ 97.294562] kprobe_trace_func+0x254/0x3b0 [ 97.295271] kprobe_dispatcher+0x98/0xe0 [ 97.295955] kprobe_breakpoint_handler+0x1b0/0x210 [ 97.296774] call_break_hook+0xc4/0x100 [ 97.297451] brk_handler+0x24/0x78 [ 97.298073] do_debug_exception+0xac/0x178 [ 97.298785] el1_dbg+0x70/0x90 [ 97.299344] el1h_64_sync_handler+0xcc/0xe8 [ 97.300066] el1h_64_sync+0x78/0x80 [ 97.300699] kernel_clone+0x0/0x500 [ 97.301331] __arm64_sys_clone+0x70/0x90 [ 97.302084] invoke_syscall+0x68/0x198 [ 97.302746] el0_svc_common.constprop.0+0x11c/0x150 [ 97.303569] do_el0_svc+0x38/0x50 [ 97.304164] el0_svc+0x44/0x1d8 [ 97.304749] el0t_64_sync_handler+0x100/0x130 [ 97.305500] el0t_64_sync+0x188/0x190 [ 97.306151] [ 97.306475] The buggy address belongs to stack of task 1.sh/2550 [ 97.307461] and is located at offset 0 in frame: [ 97.308257] __se_sys_clone+0x0/0x138 [ 97.308910] [ 97.309241] This frame has 1 object: [ 97.309873] [48, 184) 'args' [ 97.309876] [ 97.310749] The buggy address belongs to the virtual mapping at [ 97.310749] [ffff800089270000, ffff800089279000) created by: [ 97.310749] dup_task_struct+0xc0/0x2e8 [ 97.313347] [ 97.313674] The buggy address belongs to the physical page: [ 97.314604] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14f69a [ 97.315885] flags: 0x15ffffe00000000(node=1\|zone=2\|lastcpupid=0xfffff) [ 97.316957] raw: 015ffffe00000000 0000000000000000 dead000000000122 0000000000000000 [ 97.318207] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 [ 97.319445] page dumped because: kasan: bad access detected [ 97.320371] [ 97.320694] Memory state around the buggy address: [ 97.321511] ffff800089277b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.322681] ffff800089277b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.323846] >ffff800089277c00: 00 00 f1 f1 f1 f1 f1 f1 00 00 00 00 00 00 00 00 [ 97.325023] ^ [ 97.325683] ffff800089277c80: 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 [ 97.326856] ffff800089277d00: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 This issue seems to be related to the behavior of some gcc compilers and was also fixed on the s390 architecture before: commit d93a855c31b7 ("s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()") As described in that commit, regs_get_kernel_stack_nth() has confirmed that `addr` is on the stack, so reading the value at `addr` should be allowed. Use READ_ONCE_NOCHECK() helper to silence the KASAN check for this case. Fixes: 0a8ea52c3eb1 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature") Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> Link: https://lore.kernel.org/r/20250604005533.1278992-1-wutengda@huaweicloud.com [will: Use 'addr' as the argument to READ_ONCE_NOCHECK()] Signed-off-by: Will Deacon <will@kernel.org>
2025-06-12	arm64/gcs: Don't call gcs_free() during flush_gcs()	Mark Brown	1	-1/+3
	Currently we call gcs_free() during flush_gcs() to reset the thread state for GCS. This includes unmapping any kernel allocated GCS, but this is redundant when doing a flush_thread() since we are reinitialising the thread memory too. Inline the reinitialisation of the thread struct. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20250611-arm64-gcs-flush-thread-v1-1-cc26feeddabd@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2025-06-12	drm/xe/lrc: Use a temporary buffer for WA BB	Lucas De Marchi	1	-4/+20
	In case the BO is in iomem, we can't simply take the vaddr and write to it. Instead, prepare a separate buffer that is later copied into io memory. Right now it's just a few words that could be using xe_map_write32(), but the intention is to grow the WA BB for other uses. Fixes: 617d824c5323 ("drm/xe: Add WA BB to capture active context utilization") Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://lore.kernel.org/r/20250604-wa-bb-fix-v1-1-0dfc5dafcef0@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit ef48715b2d3df17c060e23b9aa636af3d95652f8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-12	selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context	Gal Pressman	1	-1/+58
	Add test_rss_default_context_rule() to verify that ntuple rules can correctly direct traffic to the default RSS context (context 0). The test creates two ntuple rules with explicit location priorities: - A high-priority rule (loc 0) directing specific port traffic to context 0. - A low-priority rule (loc 1) directing all other TCP traffic to context 1. This validates that: 1. Rules targeting the default context function properly. 2. Traffic steering works as expected when mixing default and additional RSS contexts. The test was written by AI, and reviewed by humans. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250612071958.1696361-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12	net: ethtool: Don't check if RSS context exists in case of context 0	Gal Pressman	1	-1/+2
	Context 0 (default context) always exists, there is no need to check whether it exists or not when adding a flow steering rule. The existing check fails when creating a flow steering rule for context 0 as it is not stored in the rss_ctx xarray. For example: $ ethtool --config-ntuple eth2 flow-type tcp4 dst-ip 194.237.147.23 dst-port 19983 context 0 loc 618 rmgr: Cannot insert RX class rule: Invalid argument Cannot insert classification rule An example usecase for this could be: - A high-priority rule (loc 0) directing specific port traffic to context 0. - A low-priority rule (loc 1) directing all other TCP traffic to context 1. This is a user-visible regression that was caught in our testing environment, it was not reported by a user yet. Fixes: de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist") Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250612071958.1696361-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>