2025-03-08  locking/lockdep: Add kasan_check_byte() check in lock_acquire()  (Waiman Long, 1 file, +9/-0)
KASAN instrumentation of lockdep has been disabled, as we don't need KASAN to check the validity of lockdep's internal data structures and incur unnecessary performance overhead. However, the lockdep_map pointer passed in externally may not be valid (e.g. use-after-free) and we run the risk of using garbage data, resulting in false lockdep reports. Add a kasan_check_byte() call in lock_acquire() for non-kernel-core data objects to catch an invalid lockdep_map and print out a KASAN report before any lockdep splat, if any. Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20250214195242.2480920-1-longman@redhat.com Link: https://lore.kernel.org/r/20250307232717.1759087-7-boqun.feng@gmail.com
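A minimal sketch of the added check; the helper name is hypothetical (the actual patch places the check inside lock_acquire()) and the exact condition may differ:

  /* Sketch: reject an invalid (e.g. use-after-free) lockdep_map early.
   * kasan_check_byte() returns false and prints a KASAN report if the
   * first byte of *lock is poisoned; bailing out here avoids feeding
   * garbage into lockdep and producing a misleading splat. */
  static bool lockdep_map_is_valid(struct lockdep_map *lock)
  {
          /* Kernel-core (static) objects cannot be freed, so only
           * externally supplied, dynamically allocated maps are checked. */
          if (static_obj(lock))
                  return true;
          return kasan_check_byte(lock);
  }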
2025-03-08  locking/lockdep: Disable KASAN instrumentation of lockdep.c  (Waiman Long, 1 file, +2/-1)
Both KASAN and LOCKDEP are commonly enabled when building a debug kernel, and each can significantly slow it down. Enabling KASAN instrumentation of the LOCKDEP code slows things down further. Since LOCKDEP is a high-overhead debugging tool, it will never be enabled in a production kernel. The LOCKDEP code is also pretty mature and unlikely to see major changes. There is also a possibility of recursion similar to KCSAN.

To evaluate the performance impact of disabling KASAN instrumentation of lockdep.c, the time to do a parallel build of the Linux defconfig kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2) and an arm64 system were used as test beds. Two sets of non-RT and RT kernels with similar configurations, differing mainly in CONFIG_PREEMPT_RT, were used for evaluation.

For the Skylake system:

  Kernel                                       Run time          Sys time
  ------                                       --------          --------
  Non-debug kernel (baseline)                  0m47.642s         4m19.811s
  [CONFIG_KASAN_INLINE=y]
  Debug kernel                                 2m11.108s (x2.8)  38m20.467s (x8.9)
  Debug kernel (patched)                       1m49.602s (x2.3)  31m28.501s (x7.3)
  Debug kernel (patched + mitigations=off)     1m30.988s (x1.9)  26m41.993s (x6.2)
  RT kernel (baseline)                         0m54.871s         7m15.340s
  [CONFIG_KASAN_INLINE=n]
  RT debug kernel                              6m07.151s (x6.7)  135m47.428s (x18.7)
  RT debug kernel (patched)                    3m42.434s (x4.1)  74m51.636s (x10.3)
  RT debug kernel (patched + mitigations=off)  2m40.383s (x2.9)  57m54.369s (x8.0)
  [CONFIG_KASAN_INLINE=y]
  RT debug kernel                              3m22.155s (x3.7)  77m53.018s (x10.7)
  RT debug kernel (patched)                    2m36.700s (x2.9)  54m31.195s (x7.5)
  RT debug kernel (patched + mitigations=off)  2m06.110s (x2.3)  45m49.493s (x6.3)

For the Zen 2 system:

  Kernel                                       Run time           Sys time
  ------                                       --------           --------
  Non-debug kernel (baseline)                  1m42.806s          39m48.714s
  [CONFIG_KASAN_INLINE=y]
  Debug kernel                                 4m04.524s (x2.4)   125m35.904s (x3.2)
  Debug kernel (patched)                       3m56.241s (x2.3)   127m22.378s (x3.2)
  Debug kernel (patched + mitigations=off)     2m38.157s (x1.5)   92m35.680s (x2.3)
  RT kernel (baseline)                         1m51.500s          14m56.322s
  [CONFIG_KASAN_INLINE=n]
  RT debug kernel                              16m04.962s (x8.7)  244m36.463s (x16.4)
  RT debug kernel (patched)                    9m09.073s (x4.9)   129m28.439s (x8.7)
  RT debug kernel (patched + mitigations=off)  3m31.662s (x1.9)   51m01.391s (x3.4)

For the arm64 system:

  Kernel                        Run time           Sys time
  ------                        --------           --------
  Non-debug kernel (baseline)   1m56.844s          8m47.150s
  Debug kernel                  3m54.774s (x2.0)   92m30.098s (x10.5)
  Debug kernel (patched)        3m32.429s (x1.8)   77m40.779s (x8.8)
  RT kernel (baseline)          4m01.641s          18m16.777s
  [CONFIG_KASAN_INLINE=n]
  RT debug kernel               19m32.977s (x4.9)  304m23.965s (x16.7)
  RT debug kernel (patched)     16m28.354s (x4.1)  234m18.149s (x12.8)

Turning the mitigations off doesn't seem to have any noticeable impact on the performance of the arm64 system, so the mitigations=off entries aren't included. For the x86 CPUs, CPU mitigations have a much bigger impact on performance, especially for the RT debug kernel with CONFIG_KASAN_INLINE=n. The SRSO mitigation on Zen 2 has an especially big impact on the debug kernel, and it accounts for the majority of the slowdown with mitigations on: the patched RET instruction slows down function returns, and a lot of helper functions that are normally compiled out or inlined may become real function calls in the debug kernel.

With !CONFIG_KASAN_INLINE, the KASAN instrumentation inserts a lot of __asan_loadX*() and __kasan_check_read() function calls into the memory-access portions of the code. The lockdep __lock_acquire() function, for instance, gains 66 __asan_loadX*() and 6 __kasan_check_read() calls with KASAN instrumentation. Of course, the actual numbers may vary depending on the compiler used and the exact version of the lockdep code.

With the Skylake test system, the parallel kernel build time reductions for the RT debug kernel with this patch are:

  CONFIG_KASAN_INLINE=n: -37%
  CONFIG_KASAN_INLINE=y: -22%

The time reduction is smaller with CONFIG_KASAN_INLINE=y, but it is still significant. Setting CONFIG_KASAN_INLINE=y can result in a significant performance improvement. The major drawback is a significant increase in the size of the kernel text: in the case of vmlinux, the text size increases from 45997948 to 67606807 bytes, a 47% size increase (about 21 Mbytes). The size increase of other kernel modules should be similar.

With the newly added rtmutex and lockdep lock events, the relevant event counts for the test runs with the Skylake system were:

  Event type        Debug kernel    RT debug kernel
  ----------        ------------    ---------------
  lockdep_acquire   1,968,663,277   5,425,313,953
  rtlock_slowlock   -               401,701,156
  rtmutex_slowlock  -               139,672

The __lock_acquire() calls in the RT debug kernel are 2.8 times those of the non-RT debug kernel with the same workload. Since the __lock_acquire() function is a big hitter in terms of performance slowdown, this makes the RT debug kernel much slower than the non-RT one. The average lock nesting depth is also likely to be higher in the RT debug kernel, leading to longer execution time in the __lock_acquire() function.

As the small advantage of enabling KASAN instrumentation to catch potential memory access errors in the lockdep debugging tool is probably not worth the drawback of further slowing down a debug kernel, disable KASAN instrumentation in the lockdep code to allow the debug kernels to regain some performance, especially the RT debug kernels. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250307232717.1759087-6-boqun.feng@gmail.com
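The kbuild knob for this kind of per-file opt-out is the KASAN_SANITIZE variable; a sketch of what the Makefile side of the change plausibly looks like (the exact file and lines are an assumption):

  # kernel/locking/Makefile (sketch)
  # Skip KASAN instrumentation of the hot, mature lockdep code; the
  # explicit kasan_check_byte() in lock_acquire() still validates the
  # externally supplied lockdep_map.
  KASAN_SANITIZE_lockdep.o := n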
2025-03-08  locking/lock_events: Add locking events for lockdep  (Waiman Long, 2 files, +14/-1)
Add some lock events to lockdep to profile its behavior. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307232717.1759087-5-boqun.feng@gmail.com
2025-03-08  locking/lock_events: Add locking events for rtmutex slow paths  (Waiman Long, 2 files, +45/-5)
Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock(). Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250307232717.1759087-4-boqun.feng@gmail.com
2025-03-08  locking/semaphore: Use wake_q to wake up processes outside lock critical section  (Waiman Long, 1 file, +9/-4)
A circular lock dependency splat has been seen involving down_trylock():

  ======================================================
  WARNING: possible circular locking dependency detected
  6.12.0-41.el10.s390x+debug
  ------------------------------------------------------
  dd/32479 is trying to acquire lock:
   0015a20accd0d4f8 ((console_sem).lock){-.-.}-{2:2}, at: down_trylock+0x26/0x90

  but task is already holding lock:
   000000017e461698 (&zone->lock){-.-.}-{2:2}, at: rmqueue_bulk+0xac/0x8f0

  the existing dependency chain (in reverse order) is:
   -> #4 (&zone->lock){-.-.}-{2:2}:
   -> #3 (hrtimer_bases.lock){-.-.}-{2:2}:
   -> #2 (&rq->__lock){-.-.}-{2:2}:
   -> #1 (&p->pi_lock){-.-.}-{2:2}:
   -> #0 ((console_sem).lock){-.-.}-{2:2}:

The console_sem -> pi_lock dependency is due to calling try_to_wake_up() while holding the console_sem raw_spinlock. This dependency can be broken by using wake_q to do the wakeup instead of calling try_to_wake_up() under the console_sem lock. This will also make the semaphore's raw_spinlock become a terminal lock without taking any further locks underneath it.

The hrtimer_bases.lock is a raw_spinlock while zone->lock is a spinlock. The hrtimer_bases.lock -> zone->lock dependency happens via the debug_objects_fill_pool() helper function in the debugobjects code:

   -> #4 (&zone->lock){-.-.}-{2:2}:
      __lock_acquire+0xe86/0x1cc0
      lock_acquire.part.0+0x258/0x630
      lock_acquire+0xb8/0xe0
      _raw_spin_lock_irqsave+0xb4/0x120
      rmqueue_bulk+0xac/0x8f0
      __rmqueue_pcplist+0x580/0x830
      rmqueue_pcplist+0xfc/0x470
      rmqueue.isra.0+0xdec/0x11b0
      get_page_from_freelist+0x2ee/0xeb0
      __alloc_pages_noprof+0x2c2/0x520
      alloc_pages_mpol_noprof+0x1fc/0x4d0
      alloc_pages_noprof+0x8c/0xe0
      allocate_slab+0x320/0x460
      ___slab_alloc+0xa58/0x12b0
      __slab_alloc.isra.0+0x42/0x60
      kmem_cache_alloc_noprof+0x304/0x350
      fill_pool+0xf6/0x450
      debug_object_activate+0xfe/0x360
      enqueue_hrtimer+0x34/0x190
      __run_hrtimer+0x3c8/0x4c0
      __hrtimer_run_queues+0x1b2/0x260
      hrtimer_interrupt+0x316/0x760
      do_IRQ+0x9a/0xe0
      do_irq_async+0xf6/0x160

Normally a raw_spinlock to spinlock dependency is not legitimate and will trigger a warning if CONFIG_PROVE_RAW_LOCK_NESTING is enabled, but debug_objects_fill_pool() is an exception as it explicitly allows this dependency for non-PREEMPT_RT kernels without causing a PROVE_RAW_LOCK_NESTING lockdep splat. As a result, this dependency is legitimate and not a bug.

Anyway, the semaphore is the only locking primitive left that still uses try_to_wake_up() to do wakeups inside the critical section; all the other locking primitives have been migrated to use wake_q to do wakeups outside of the critical section. It is also possible that other circular locking dependencies involving printk/console_sem or other existing/new semaphores are lurking somewhere and may show up in the future. Let's just do the migration to wake_q now to avoid headaches like this. Reported-by: syzbot+ed801a886dfdbfe7136d@syzkaller.appspotmail.com Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250307232717.1759087-3-boqun.feng@gmail.com
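A minimal sketch of the wake_q pattern the semaphore code migrates to, shown on a simplified up() path (the __up() signature change is an assumption for illustration):

  #include <linux/sched/wake_q.h>

  /* Sketch: wake the sleeper only after dropping the semaphore's raw
   * spinlock, so (console_sem).lock becomes a terminal lock with no
   * try_to_wake_up() -> pi_lock/rq->__lock chain underneath it. */
  void up(struct semaphore *sem)
  {
          unsigned long flags;
          DEFINE_WAKE_Q(wake_q);

          raw_spin_lock_irqsave(&sem->lock, flags);
          if (likely(list_empty(&sem->wait_list)))
                  sem->count++;
          else
                  __up(sem, &wake_q);     /* queue the waiter, no wakeup yet */
          raw_spin_unlock_irqrestore(&sem->lock, flags);

          wake_up_q(&wake_q);             /* actual wakeup, lock already dropped */
  }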
2025-03-08  locking/rtmutex: Use the 'struct' keyword in kernel-doc comment  (Randy Dunlap, 1 file, +2/-2)
Add the "struct" keyword to prevent a kernel-doc warning: rtmutex_common.h:67: warning: cannot understand function prototype: 'struct rt_wake_q_head ' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20250307232717.1759087-2-boqun.feng@gmail.com
2025-03-08  rust: lockdep: Remove support for dynamically allocated LockClassKeys  (Mitchell Levy, 1 file, +4/-12)
Currently, dynamically allocated LockClassKeys can be used from the Rust side without having been registered. This is a soundness issue, so remove support for them. Fixes: 6ea5aa08857a ("rust: sync: introduce `LockClassKey`") Suggested-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Mitchell Levy <levymitchell0@gmail.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250307232717.1759087-11-boqun.feng@gmail.com
2025-03-07  x86/split_lock: Fix the delayed detection logic  (Maksim Davydov, 1 file, +16/-4)
If the warning mode is used with the mitigation mode disabled, then on each CPU where a split lock occurs, detection will be disabled in order to make progress, and delayed work will be scheduled to re-enable detection later. It turns out that all CPUs use one global delayed work structure. This leads to the fact that if a split lock occurs on several CPUs at the same time (within 2 jiffies), only one CPU will schedule the delayed work, but the rest will not. The return value of schedule_delayed_work_on() would have shown this, but it is not checked in the code.

A diagram that can help to understand the bug reproduction:

- sld_update_msr() enables/disables SLD on both CPUs of the same core
- schedule_delayed_work_on() internally checks WORK_STRUCT_PENDING_BIT. If the work has the 'pending' status, then schedule_delayed_work_on() will return an error code and, most importantly, the work will not be placed in the workqueue.

Let's say we have a multicore system on which split_lock_mitigate=0 and a multithreaded application is running that triggers split locks in multiple threads. Due to the fact that sld_update_msr() affects the entire core (both CPUs), we will consider 2 CPUs from different cores. Let the 2 threads of this application be scheduled to CPU 0 (core 0) and to CPU 2 (core 1), then:

  |                                  ||                                    |
  |          CPU 0 (core 0)          ||          CPU 2 (core 1)            |
  |__________________________________||____________________________________|
  |                                  ||                                    |
  | 1) SPLIT LOCK occurred           ||                                    |
  |                                  ||                                    |
  | 2) split_lock_warn()             ||                                    |
  |                                  ||                                    |
  | 3) sysctl_sld_mitigate == 0      ||                                    |
  |    (work = &sl_reenable)         ||                                    |
  |                                  ||                                    |
  | 4) schedule_delayed_work_on()    ||                                    |
  |    (reenable will be called      ||                                    |
  |    after 2 jiffies on CPU 0)     ||                                    |
  |                                  ||                                    |
  | 5) disable SLD for core 0        ||                                    |
  |                                  ||                                    |
  |    -------------------------     ||                                    |
  |                                  || 6) SPLIT LOCK occurred             |
  |                                  ||                                    |
  |                                  || 7) split_lock_warn()               |
  |                                  ||                                    |
  |                                  || 8) sysctl_sld_mitigate == 0        |
  |                                  ||    (work = &sl_reenable,           |
  |                                  ||    the same address as in 3) )     |
  |                                  ||                                    |
  |            2 jiffies             || 9) schedule_delayed_work_on()      |
  |                                  ||    fails because the work is in    |
  |                                  ||    the pending state since 4).     |
  |                                  ||    The work wasn't placed to the   |
  |                                  ||    workqueue. reenable won't be    |
  |                                  ||    called on CPU 2                 |
  |                                  ||                                    |
  |                                  || 10) disable SLD for core 1         |
  |                                  ||                                    |
  |                                  ||    From now on SLD will            |
  |                                  ||    never be reenabled on core 1    |
  |                                  ||                                    |
  |    -------------------------     ||                                    |
  |                                  ||                                    |
  | 11) enable SLD for core 0 by     ||                                    |
  |     __split_lock_reenable        ||                                    |

If the application threads can be scheduled to all processor cores, then over time there will be only one core left on which SLD is enabled and split locks can still be detected; on all other cores SLD will be disabled all the time. Most likely, this bug has not been noticed for so long because the sysctl_sld_mitigate default value is 1, and in this case a semaphore is used that does not allow 2 different cores to have SLD disabled at the same time; that is, strictly only one work item is placed in the workqueue.

In order to fix the warning mode with the mitigation mode disabled, the delayed work has to be per-CPU. Implement it. Fixes: 727209376f49 ("x86/split_lock: Add sysctl to control the misery mode") Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20250115131704.132609-1-davydov-max@yandex-team.ru
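A sketch of the shape of the fix: give each CPU its own delayed work so a pending re-enable on one CPU no longer swallows the re-enable of another (details condensed, placement assumed):

  /* Sketch: per-CPU work replaces the single global sl_reenable work, so
   * schedule_delayed_work_on() on CPU 2 no longer sees CPU 0's work as
   * already pending. */
  static DEFINE_PER_CPU(struct delayed_work, sl_reenable);

  static void split_lock_warn(unsigned long ip)
  {
          struct delayed_work *work = this_cpu_ptr(&sl_reenable);
          int cpu = smp_processor_id();

          /* ... warn, then disable SLD for this core to make progress ... */
          schedule_delayed_work_on(cpu, work, 2); /* re-enable in 2 jiffies */
  }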
2025-03-06  fs/pipe: add simpler helpers for common cases  (Linus Torvalds, 7 files, +49/-23)
The fix to atomically read the pipe head and tail state when not holding the pipe mutex has caused a number of headaches due to the size change of the involved types. It turns out that we don't have _that_ many places that access these fields directly and were affected, but we have more than we strictly should have, because our low-level helper functions have been designed to have intimate knowledge of how the pipes work. And as a result, that random noise of direct 'pipe->head' and 'pipe->tail' accesses makes it harder to pinpoint any actual potential problem spots remaining.

For example, we didn't have a "is the pipe full" helper function, but instead had a "given these pipe buffer indexes and this pipe size, is the pipe full". That's because some low-level pipe code does actually want that much more complicated interface. But most other places literally just want a "is the pipe full" helper, and not having it meant that those places ended up being unnecessarily much too aware of this all. It would have been much better if only the very core pipe code that cared had been the one aware of this all.

So let's fix it - better late than never. This just introduces the trivial wrappers for "is this pipe full or empty" and to get how many pipe buffers are used, so that instead of writing

  if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))

the places that literally just want to know if a pipe is full can just say

  if (pipe_is_full(pipe))

instead. The existing trivial cases were converted with a 'sed' script. This cuts down on the places that access pipe->head and pipe->tail directly outside of the pipe code (and core splice code) quite a lot. The splice code in particular still revels in doing the direct low-level accesses, and the fuse fuse_dev_splice_write() code also seems a bit unnecessarily eager to go very low-level, but it's at least a bit better than it used to be. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
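A sketch of the trivial wrappers being introduced (names taken from the message; the exact set added by the commit may differ slightly):

  /* Sketch: let callers ask the question they mean instead of passing
   * head/tail/max_usage around themselves. */
  static inline unsigned int pipe_buf_usage(const struct pipe_inode_info *pipe)
  {
          return pipe_occupancy(pipe->head, pipe->tail);
  }

  static inline bool pipe_is_empty(const struct pipe_inode_info *pipe)
  {
          return pipe_empty(pipe->head, pipe->tail);
  }

  static inline bool pipe_is_full(const struct pipe_inode_info *pipe)
  {
          return pipe_full(pipe->head, pipe->tail, pipe->max_usage);
  }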
2025-03-06  block: Name the RQF flags enum  (Breno Leitao, 1 file, +1/-1)
Commit 5f89154e8e9e3445f9b59 ("block: Use enum to define RQF_x bit indexes") converted the RQF flags to an anonymous enum, which was a beneficial change. This patch goes one step further by naming the enum "rqf_flags". This naming enables exporting these flags to BPF clients, eliminating the need to duplicate these flags in BPF code. Instead, BPF clients can now access the same kernel-side values through CO-RE (Compile Once - Run Everywhere), as shown in this example:

  rqf_stats = bpf_core_enum_value(enum rqf_flags, __RQF_STATS)

Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20250306-rqf_flags-v1-1-bbd64918b406@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
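A short BPF-side sketch of what the named enum enables; the surrounding program scaffolding is illustrative, not from the patch:

  /* Sketch: with the enum named "rqf_flags", the bit index is resolved
   * against the running kernel's BTF at load time via CO-RE instead of
   * being duplicated (and possibly stale) in the BPF source. */
  #include "vmlinux.h"
  #include <bpf/bpf_core_read.h>

  static __always_inline bool rq_has_stats(struct request *rq)
  {
          u64 stats_bit = bpf_core_enum_value(enum rqf_flags, __RQF_STATS);

          return BPF_CORE_READ(rq, rq_flags) & (1ULL << stats_bit);
  }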
2025-03-06  bcachefs: copygc now skips non-rw devices  (Kent Overstreet, 1 file, +12/-13)
There's no point in doing copygc on non-rw devices: the fragmentation doesn't matter if we're not writing to them, and we may not have anywhere to put the data on our other devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  bcachefs: Fix bch2_dev_journal_alloc() spuriously failing  (Kent Overstreet, 1 file, +32/-27)
Previously, we fixed journal resize spuriously failing with -BCH_ERR_open_buckets_empty, but initial journal allocation was missed because it didn't invoke the "block on allocator" loop at all. Factor out the "loop on allocator" code to fix that. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-06  x86/boot: Sanitize boot params before parsing command line  (Ard Biesheuvel, 1 file, +2/-0)
The 5-level paging code parses the command line to look for the 'no5lvl' string, and does so very early, before sanitize_boot_params() has been called and has been given the opportunity to wipe bogus data from the fields in boot_params that are not covered by struct setup_header, and are therefore supposed to be initialized to zero by the bootloader. This triggers an early boot crash when using syslinux-efi to boot a recent kernel built with CONFIG_X86_5LEVEL=y and CONFIG_EFI_STUB=n, as the 0xff padding that now fills the unused PE/COFF header is copied into boot_params by the bootloader, and interpreted as the top half of the command line pointer. Fix this by sanitizing the boot_params before use. Note that there is no harm in calling this more than once; subsequent invocations are able to spot that the boot_params have already been cleaned up. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> # v6.1+ Link: https://lore.kernel.org/r/20250306155915.342465-2-ardb+git@google.com Closes: https://lore.kernel.org/all/202503041549.35913.ulrich.gemkow@ikr.uni-stuttgart.de
2025-03-06  fs/pipe: fix pipe buffer index use in FUSE  (Linus Torvalds, 1 file, +6/-7)
This was another case that Rasmus pointed out where the direct access to the pipe head and tail pointers broke on 32-bit configurations due to the type changes. As with the pipe FIONREAD case, fix it by using the appropriate helper functions that deal with the right pipe index sizing. Reported-by: Rasmus Villemoes <ravi@prevas.dk> Link: https://lore.kernel.org/all/878qpi5wz4.fsf@prevas.dk/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06  fs/pipe: do not open-code pipe head/tail logic in FIONREAD  (Linus Torvalds, 1 file, +3/-4)
Rasmus points out that we do indeed have other cases of breakage from the type changes that were introduced on 32-bit targets in order to read the pipe head and tail values atomically (commit 3d252160b818: "fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex"). Fix it up by using the proper helper functions that now deal with the pipe buffer index types properly. This makes the code simpler and more obvious. The compiler does the CSE and loop hoisting of the pipe ring size masking that we used to do manually, so open-coding this was never a good idea. Reported-by: Rasmus Villemoes <ravi@prevas.dk> Link: https://lore.kernel.org/all/87cyeu5zgk.fsf@prevas.dk/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-06  fs/pipe: express 'pipe_empty()' in terms of 'pipe_occupancy()'  (Linus Torvalds, 1 file, +6/-6)
That's what 'pipe_full()' does, so it's more consistent. But more importantly it gets the type limits right when the pipe head and tail are no longer necessarily 'unsigned int'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
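A sketch of the resulting definitions, with both predicates funnelling through pipe_occupancy():

  /* Sketch: pipe_occupancy() does the index arithmetic with the right
   * types, so the head/tail width is handled in exactly one place. */
  static inline bool pipe_empty(unsigned int head, unsigned int tail)
  {
          return !pipe_occupancy(head, tail);
  }

  static inline bool pipe_full(unsigned int head, unsigned int tail,
                               unsigned int limit)
  {
          return pipe_occupancy(head, tail) >= limit;
  }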
2025-03-06  gpio: rcar: Fix missing of_node_put() call  (Fabrizio Castro, 1 file, +6/-1)
of_parse_phandle_with_fixed_args() requires its caller to call into of_node_put() on the node pointer from the output structure, but such a call is currently missing. Call into of_node_put() to rectify that. Fixes: 159f8a0209af ("gpio-rcar: Add DT support") Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20250305163753.34913-2-fabrizio.castro.jz@renesas.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
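The general shape of the fix, sketched with an illustrative caller (the property name and cell count here are assumptions, not necessarily the driver's actual values):

  /* Sketch: of_parse_phandle_with_fixed_args() takes a reference on
   * args.np; the caller owns it and must drop it with of_node_put(). */
  struct of_phandle_args args;
  int ret;

  ret = of_parse_phandle_with_fixed_args(np, "gpio-ranges", 3, 0, &args);
  if (ret)
          return ret;

  /* ... use args.np and args.args[] ... */

  of_node_put(args.np);   /* the previously missing reference drop */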
2025-03-06  btrfs: fix a leaked chunk map issue in read_one_chunk()  (Haoxiang Li, 1 file, +1/-0)
Add btrfs_free_chunk_map() to free the memory allocated by btrfs_alloc_chunk_map() if btrfs_add_chunk_map() fails. Fixes: 7dc66abb5a47 ("btrfs: use a dedicated data structure for chunk maps") CC: stable@vger.kernel.org Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Signed-off-by: David Sterba <dsterba@suse.com>
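A sketch of the fixed error path in read_one_chunk(), with the surrounding code elided and exact signatures per the btrfs tree:

  /* Sketch: btrfs_add_chunk_map() does not take ownership of the map on
   * failure, so the allocation must be freed by the caller. */
  map = btrfs_alloc_chunk_map(num_stripes, GFP_NOFS);
  if (!map)
          return -ENOMEM;

  /* ... fill in the chunk map from the on-disk chunk item ... */

  ret = btrfs_add_chunk_map(fs_info, map);
  if (ret) {
          btrfs_free_chunk_map(map);      /* previously leaked */
          return ret;
  }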
2025-03-06  net: ipv6: fix missing dst ref drop in ila lwtunnel  (Justin Iurman, 1 file, +1/-0)
Add missing skb_dst_drop() to drop reference to the old dst before adding the new dst to the skb. Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250305081655.19032-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
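A sketch of the pattern in the tunnel output path (the surrounding function is condensed):

  /* Sketch: release the skb's old dst before attaching the new one;
   * skb_dst_set() overwrites the pointer and would otherwise leak the
   * old dst's reference. */
  skb_dst_drop(skb);
  skb_dst_set(skb, dst);

  return dst_output(net, sk, skb);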
2025-03-06  net: ipv6: fix dst ref loop in ila lwtunnel  (Justin Iurman, 1 file, +2/-1)
This patch follows commit 92191dd10730 ("net: ipv6: fix dst ref loops in rpl, seg6 and ioam6 lwtunnels") and, on second thought, the same fix is also needed for ila (even though the config that triggered the issue was pathological; still, we don't want that to happen). Fixes: 79ff2fc31e0f ("ila: Cache a route to translated address") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20250304181039.35951-1-justin.iurman@uliege.be Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-06  mctp i3c: handle NULL header address  (Matt Johnston, 1 file, +3/-0)
daddr can be NULL if there is no neighbour table entry present, in that case the tx packet should be dropped. saddr will usually be set by MCTP core, but check for NULL in case a packet is transmitted by a different protocol. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Fixes: c8755b29b58e ("mctp i3c: MCTP I3C driver") Link: https://patch.msgid.link/20250304-mctp-i3c-null-v1-1-4416bbd56540@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
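A sketch of the added guard; its placement in the header-create/tx path and the return handling are assumptions:

  /* Sketch: no neighbour entry means daddr is NULL, and an skb handed
   * over by another protocol may lack saddr; bail out rather than
   * dereference a NULL pointer. */
  if (!daddr || !saddr)
          return -EINVAL;         /* caller drops the packet */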
2025-03-06  sched/rt: Update limit of sched_rt sysctl in documentation  (Shrikanth Hegde, 1 file, +3/-0)
By default the fair_server dl_server allocates 5% of the bandwidth to the root domain. Because of this, writing any value less than 5% fails with -EBUSY:

  $ cat /proc/sys/kernel/sched_rt_period_us
  1000000
  $ echo 49999 > /proc/sys/kernel/sched_rt_runtime_us
  -bash: echo: write error: Device or resource busy
  $ echo 50000 > /proc/sys/kernel/sched_rt_runtime_us
  $

Since sched_rt_runtime_us allows -1 as the minimum, put this restriction in the documentation. One should check the average of runtime/period in /sys/kernel/debug/sched/fair_server/cpuX/* for the exact value. Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20250306052954.452005-3-sshegde@linux.ibm.com
2025-03-06  sched/deadline: Use online cpus for validating runtime  (Shrikanth Hegde, 1 file, +1/-1)
The ftrace selftest reported a failure because writing -1 to sched_rt_runtime_us returns -EBUSY. This happens when the possible CPUs are different from the active CPUs. Active CPUs are part of one root domain, while the remaining CPUs are part of def_root_domain. Since the active cpumask is being used, this results in cpus=0 when a non-active CPU is used in the loop. Fix it by looping over the online CPUs instead when validating the bandwidth calculations. Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20250306052954.452005-2-sshegde@linux.ibm.com
2025-03-06  pid: Do not set pid_max in new pid namespaces  (Michal Koutný, 1 file, +1/-1)
It is already difficult for users to troubleshoot which of multiple pid limits restricts their workload. The per-(hierarchical-)NS pid_max would contribute to the confusion. Also, the implementation copies the limit from the parent upon creation; this pattern proved cumbersome with some attributes in legacy cgroup controllers -- it is subject to a race condition between the parent's limit modification and child creation, and once copied, the limit must be changed in the descendant. Let's do what other places do (ucounts or cgroup limits) -- create new pid namespaces without any limit at all. The global limit (actually any ancestor's limit) is still effectively in place, we avoid the set/unshare race, and bumps of the global (ancestral) limit have the desired effect on pid namespaces that do not care. Link: https://lore.kernel.org/r/20240408145819.8787-1-mkoutny@suse.com/ Link: https://lore.kernel.org/r/20250221170249.890014-1-mkoutny@suse.com/ Fixes: 7863dcc72d0f4 ("pid: allow pid_max to be set per pid namespace") Signed-off-by: Michal Koutný <mkoutny@suse.com> Link: https://lore.kernel.org/r/20250305145849.55491-1-mkoutny@suse.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-06  drm/bochs: Fix DPMS regression  (Takashi Iwai, 1 file, +3/-2)
The recent rewrite with the use of regular atomic helpers broke the DPMS unblanking on X11. Fix it by moving the call of bochs_hw_blank(false) from CRTC mode_set_nofb() to atomic_enable(). Fixes: 2037174993c8 ("drm/bochs: Use regular atomic helpers") Link: https://bugzilla.suse.com/show_bug.cgi?id=1238209 Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250304134203.20534-1-tiwai@suse.de
2025-03-05  net: dsa: mt7530: Fix traffic flooding for MMIO devices  (Lorenzo Bianconi, 1 file, +2/-6)
On MMIO devices (e.g. MT7988 or EN7581), unicast traffic received on a lanX port is flooded to all other user ports if the DSA switch is configured without VLAN support, since the PORT_MATRIX in the PCR registers contains all user ports. Similar to MDIO devices (e.g. MT7530 and MT7531), fix the issue by defining default VLAN-ID 0 for MT7530 MMIO devices. Fixes: 110c18bfed414 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/20250304-mt7988-flooding-fix-v1-1-905523ae83e9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-05  io_uring/rw: ensure reissue path is correctly handled for IOPOLL  (Jens Axboe, 1 file, +3/-4)
The IOPOLL path posts CQEs when the io_kiocb is marked as completed, so it cannot rely on the usual retry that non-IOPOLL requests do for read/write requests. If -EAGAIN is received and the request should be retried, go through the normal completion path and let the normal flush logic catch it and reissue it, like what is done for !IOPOLL reads or writes. Fixes: d803d123948f ("io_uring/rw: handle -EAGAIN retry at IO completion time") Reported-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/io-uring/2b43ccfa-644d-4a09-8f8f-39ad71810f41@oracle.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-05  drm/xe/userptr: Unmap userptrs in the mmu notifier  (Thomas Hellström, 4 files, +52/-9)
If userptr pages are freed after a call to the xe mmu notifier, the device will not be blocked out from theoretically accessing these pages unless they are also unmapped from the iommu, and this violates some aspects of the iommu-imposed security. Ensure that userptrs are unmapped in the mmu notifier to mitigate this. A naive attempt would try to free the sg table, but the sg table itself may be accessed by a concurrent bind operation, so settle for only unmapping. v3: - Update lockdep asserts. - Fix a typo (Matthew Auld) Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Cc: Oak Zeng <oak.zeng@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-4-thomas.hellstrom@linux.intel.com (cherry picked from commit ba767b9d01a2c552d76cf6f46b125d50ec4147a6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe/hmm: Don't dereference struct page pointers without notifier lock  (Thomas Hellström, 1 file, +86/-26)
The pfns that we obtain from hmm_range_fault() point to pages that we don't have a reference on, and the guarantee that they are still in the CPU page-tables is that the notifier lock must be held and the notifier seqno must still be valid. So while building the sg table and marking the pages accessed / dirty we need to hold this lock with a validated seqno. However, the lock is reclaim tainted, which makes sg_alloc_table_from_pages_segment() unusable, since it internally allocates memory. Instead build the sg-table manually. For the non-iommu case this might lead to fewer coalesces, but if that's a problem it can be fixed up later in the resource cursor code. For the iommu case, the whole sg-table may still be coalesced to a single contiguous device va region. This avoids marking pages that we don't own dirty and accessed, and it also avoids dereferencing struct pages that we don't own. v2: - Use assert to check whether hmm pfns are valid (Matthew Auld) - Take into account that large pages may cross range boundaries (Matthew Auld) v3: - Don't unnecessarily check for a non-freed sg-table. (Matthew Auld) - Add a missing up_read() in an error path. (Matthew Auld) Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Cc: Oak Zeng <oak.zeng@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-3-thomas.hellstrom@linux.intel.com (cherry picked from commit ea3e66d280ce2576664a862693d1da8fd324c317) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe/hmm: Style- and include fixes  (Thomas Hellström, 2 files, +8/-6)
Add a proper #ifndef guard around the xe_hmm.h header, proper spacing, and since the documentation mostly follows kerneldoc format, make it kerneldoc. Also prepare for upcoming -stable fixes. Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Cc: Oak Zeng <oak.zeng@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250304173342.22009-2-thomas.hellstrom@linux.intel.com (cherry picked from commit bbe2b06b55bc061c8fcec034ed26e88287f39143) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe: Add staging tree for VM binds  (Matthew Brost, 3 files, +46/-19)
Concurrent VM bind staging and zapping of PTEs from a userptr notifier do not work because the view of PTEs is not stable. VM binds cannot acquire the notifier lock during staging, as memory allocations are required. To resolve this race condition, use a staging tree for VM binds that is committed only under the userptr notifier lock during the final step of the bind. This ensures a consistent view of the PTEs in the userptr notifier. A follow-up may use staging only for VMs in fault mode, as this is the only mode in which the above race exists. v3: - Drop zap PTE change (Thomas) - s/xe_pt_entry/xe_pt_entry_staging (Thomas) Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: <stable@vger.kernel.org> Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Fixes: a708f6501c69 ("drm/xe: Update PT layer with better error handling") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-5-thomas.hellstrom@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> (cherry picked from commit 6f39b0c5ef0385eae586760d10b9767168037aa5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe: Fix fault mode invalidation with unbind  (Thomas Hellström, 4 files, +75/-60)
Fix fault mode invalidation racing with unbind leading to the PTE zapping potentially traversing an invalid page-table tree. Do this by holding the notifier lock across PTE zapping. This might transfer any contention waiting on the notifier seqlock read side to the notifier lock read side, but that shouldn't be a major problem. At the same time get rid of the open-coded invalidation in the bind code by relying on the notifier even when the vma bind is not yet committed. Finally let userptr invalidation call a dedicated xe_vm function performing a full invalidation. Fixes: e8babb280b5e ("drm/xe: Convert multiple bind ops into single job") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-4-thomas.hellstrom@linux.intel.com (cherry picked from commit 100a5b8dadfca50d91d9a4c9fc01431b42a25cab) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe/vm: Fix a misplaced #endif  (Thomas Hellström, 1 file, +1/-1)
Fix a (harmless) misplaced #endif leading to declarations appearing multiple times. Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-3-thomas.hellstrom@linux.intel.com (cherry picked from commit fcc20a4c752214b3e25632021c57d7d1d71ee1dd) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  drm/xe/vm: Validate userptr during gpu vma prefetching  (Thomas Hellström, 1 file, +10/-1)
If a userptr vma subject to prefetching was already invalidated or invalidated during the prefetch operation, the operation would repeatedly return -EAGAIN which would typically cause an infinite loop. Validate the userptr to ensure this doesn't happen. v2: - Don't fallthrough from UNMAP to PREFETCH (Matthew Brost) Fixes: 5bd24e78829a ("drm/xe/vm: Subclass userptr vmas") Fixes: 617eebb9c480 ("drm/xe: Fix array of binds") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-2-thomas.hellstrom@linux.intel.com (cherry picked from commit 03c346d4d0d85d210d549d43c8cfb3dfb7f20e0a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-05  ALSA: hda/realtek: Add support for ASUS Zenbook UM3406KA Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +1/-0)
Laptop uses 2 CS35L41 Amps with HDA, using External boost with I2C Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-8-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for ASUS B5405 and B5605 Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +4/-0)
Add support for ASUS B5605CCA and B5405CCA. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-7-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for ASUS B3405 and B3605 Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +4/-0)
Add support for ASUS B3405CCA / P3405CCA, B3605CCA / P3605CCA, B3405CCA, B3605CCA. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-6-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for various ASUS Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +4/-0)
Add support for ASUS B3405CVA, B5405CVA, B5605CVA, B3605CVA. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with SPI Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-5-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for ASUS ROG Strix G614 Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +2/-0)
Add support for ASUS G614PH/PM/PP and G614FH/FM/FP. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-4-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for ASUS ROG Strix GA603 Laptops using CS35L41 HDA  (Stefan Binding, 1 file, +2/-0)
Add support for ASUS GA603KP, GA603KM and GA603KH. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-3-sbinding@opensource.cirrus.com
2025-03-05  ALSA: hda/realtek: Add support for ASUS ROG Strix G814 Laptop using CS35L41 HDA  (Stefan Binding, 1 file, +2/-0)
Add support for ASUS G814PH/PM/PP and G814FH/FM/FP. Laptops use 2 CS35L41 Amps with HDA, using Internal boost, with I2C. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20250305170714.755794-2-sbinding@opensource.cirrus.com
2025-03-05  nvme-tcp: fix signedness bug in nvme_tcp_init_connection()  (Dan Carpenter, 1 file, +3/-3)
The kernel_recvmsg() function returns an int, which can be either a negative error code or the number of bytes received. The problem is that in the condition:

  if (ret < sizeof(*icresp)) {

the comparison is type-promoted to unsigned long, so negative values are treated as high positive values, i.e. success, when they should be treated as failure. Handle invalid positive returns separately from negative error codes to avoid this problem. Fixes: 578539e09690 ("nvme-tcp: fix connect failure on receiving partial ICResp PDU") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
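A sketch of the corrected check (surrounding receive setup condensed):

  /* Sketch: test for negative errors before the size comparison. Comparing
   * a plain int against sizeof() promotes it to an unsigned type, which
   * turns negative error codes into huge "successful" byte counts. */
  ret = kernel_recvmsg(queue->sock, &msg, &iov, 1, iov.iov_len,
                       msg.msg_flags);
  if (ret < 0)
          return ret;
  if (ret < sizeof(*icresp))
          return -ECONNRESET;     /* short read: partial ICResp PDU */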
2025-03-05  fs/pipe: remove buggy and unused 'helper' function  (Linus Torvalds, 1 file, +0/-9)
While looking for incorrect users of the pipe head/tail fields (see commit c27c66afc449: "fs/pipe: Fix pipe_occupancy() with 16-bit indexes"), I found a bug in pipe_discard_from() that looked entirely broken. However, the fix is trivial: this buggy function isn't actually called by anything, so let's just remove it ASAP. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-05  drm/amd/pm: always allow ih interrupt from fw  (Kenneth Feng, 1 file, +1/-11)
Always allow the IH interrupt from the firmware on SMU v14, based on the interface requirement. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a3199eba46c54324193607d9114a1e321292d7a1) Cc: stable@vger.kernel.org # 6.12.x
2025-03-05  drm/radeon: Fix rs400_gpu_init for ATI mobility radeon Xpress 200M  (Richard Thier, 3 files, +19/-3)
num_gb_pipes was set to a wrong value by r420_pipe_config. This has led to HyperZ glitches on fast Z clearing. Closes: https://bugs.freedesktop.org/show_bug.cgi?id=110897 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Richard Thier <u9vata@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 044e59a85c4d84e3c8d004c486e5c479640563a6) Cc: stable@vger.kernel.org
2025-03-05  include/linux/pipe_fs_i: Add htmldoc annotation for "head_tail" member  (K Prateek Nayak, 1 file, +1/-0)
Add htmldoc annotation for the newly introduced "head_tail" member describing it to be a union of the pipe_inode_info's @head and @tail members. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/lkml/20250305204609.5e64768e@canb.auug.org.au/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-05  fs/pipe: Fix pipe_occupancy() with 16-bit indexes  (Linus Torvalds, 1 file, +1/-1)
The pipe_occupancy() logic implicitly relied on the natural unsigned modulo arithmetic in C, but that doesn't work for the new 'pipe_index_t' case, since any arithmetic will be done in 'int' (and here we had also made it 'unsigned int' due to the function call boundary). So make the modulo arithmetic explicit by casting the result to the proper type. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/all/CAHk-=wjyHsGLx=rxg6PKYBNkPYAejgo7=CbyL3=HGLZLsAaJFQ@mail.gmail.com/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
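A sketch of the fixed helper:

  /* Sketch: with a 16-bit pipe_index_t, 'head - tail' is computed as a
   * plain int after integer promotion, so the wraparound modulo must be
   * made explicit by casting the difference back to pipe_index_t. */
  static inline unsigned int pipe_occupancy(unsigned int head, unsigned int tail)
  {
          return (pipe_index_t)(head - tail);
  }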
2025-03-05  drm/amdkfd: Fix NULL Pointer Dereference in KFD queue  (Andrew Martin, 1 file, +2/-2)
Through KFD ioctl fuzzing we encountered a NULL pointer dereference when calling kfd_queue_acquire_buffers(). Fixes: 629568d25fea ("drm/amdkfd: Validate queue cwsr area and eop buffer size") Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 049e5bf3c8406f87c3d8e1958e0a16804fa1d530) Cc: stable@vger.kernel.org
2025-03-05  drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params  (Ma Ke, 1 file, +2/-1)
A null pointer dereference could occur when pipe_ctx->plane_state is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not null before accessing it. This prevents a null pointer dereference. Found by code review. Fixes: 3be5262e353b ("drm/amd/display: Rename more dc_surface stuff to plane_state") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 63e6a77ccf239337baa9b1e7787cde9fa0462092) Cc: stable@vger.kernel.org
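A sketch of the added guard (the function's real signature and body are condensed):

  /* Sketch: bail out before any dereference of plane_state. */
  bool resource_build_scaling_params(struct pipe_ctx *pipe_ctx)
  {
          const struct dc_plane_state *plane_state = pipe_ctx->plane_state;

          if (!plane_state)
                  return false;

          /* ... build scaling parameters from plane_state ... */
          return true;
  }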
2025-03-05  sched/fair: Fix potential memory corruption in child_cfs_rq_on_list  (Zecheng Li, 1 file, +4/-2)
child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq. This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list, making the conversion invalid and potentially leading to memory corruption. Depending on the relative positions of leaf_cfs_rq_list and the task group (tg) pointer within the struct, this can cause a memory fault or access garbage data. The issue arises in list_add_leaf_cfs_rq, where both cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list. This patch adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main conditional in child_cfs_rq_on_list. This ensures that the container_of operation will convert a correct cfs_rq struct. The check is sufficient because only cfs_rqs on the same CPU are added to the list, so verifying the 'prev' pointer against the current rq's list head is enough. This fixes a potential memory corruption issue that, due to the current struct layout, might not manifest as a crash, but could lead to unpredictable behavior when the layout changes. Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling") Signed-off-by: Zecheng Li <zecheng@google.com> Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20250304214031.2882646-1-zecheng@google.com
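A sketch of the checked conversion described above, condensed from the description (details are assumptions):

  /* Sketch: 'prev' can be &rq->leaf_cfs_rq_list itself -- a bare list
   * head embedded in struct rq -- in which case container_of() would
   * produce a bogus cfs_rq pointer. Detect that before converting. */
  static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
  {
          struct cfs_rq *prev_cfs_rq;
          struct list_head *prev;
          struct rq *rq = rq_of(cfs_rq);

          if (cfs_rq->on_list)
                  prev = cfs_rq->leaf_cfs_rq_list.prev;
          else
                  prev = rq->tmp_alone_branch;

          /* the added check: 'prev' is the rq's list head, not a cfs_rq */
          if (prev == &rq->leaf_cfs_rq_list)
                  return false;

          prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
          return (prev_cfs_rq->tg->parent == cfs_rq->tg);
  }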