wireguard-linux - WireGuard for the Linux kernel

Age	Commit message	Author	Files	Lines
2024-05-24	kasan, fortify: properly rename memintrinsics	Andrey Konovalov	1	-4/+18
	After commit 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") and the follow-up fixes, with CONFIG_FORTIFY_SOURCE enabled, even though the compiler instruments memintrinsics by generating calls to __asan/__hwasan_ prefixed functions, FORTIFY_SOURCE still uses uninstrumented memset/memmove/memcpy as the underlying functions. As a result, KASAN cannot detect bad accesses in memset/memmove/memcpy. This also makes KASAN tests corrupt kernel memory and cause crashes. To fix this, use __asan_/__hwasan_memset/memmove/memcpy as the underlying functions whenever appropriate. Do this only for the instrumented code (as indicated by __SANITIZE_ADDRESS__). Link: https://lkml.kernel.org/r/20240517130118.759301-1-andrey.konovalov@linux.dev Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") Fixes: 51287dcb00cc ("kasan: emit different calls for instrumentable memintrinsics") Fixes: 36be5cba99f6 ("kasan: treat meminstrinsic as builtins in uninstrumented files") Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Reported-by: Erhard Furtner <erhard_f@mailbox.org> Reported-by: Nico Pache <npache@redhat.com> Closes: https://lore.kernel.org/all/20240501144156.17e65021@outsider.home/ Reviewed-by: Marco Elver <elver@google.com> Tested-by: Nico Pache <npache@redhat.com> Acked-by: Nico Pache <npache@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-24	lib: add version into /proc/allocinfo output	Suren Baghdasaryan	2	-17/+35
	Add version string and a header at the beginning of /proc/allocinfo to allow later format changes. Example output: > head /proc/allocinfo allocinfo - version: 1.0 # <size> <calls> <tag info> 0 0 init/main.c:1314 func:do_initcalls 0 0 init/do_mounts.c:353 func:mount_nodev_root 0 0 init/do_mounts.c:187 func:mount_root_generic 0 0 init/do_mounts.c:158 func:do_mount_root 0 0 init/initramfs.c:493 func:unpack_to_rootfs 0 0 init/initramfs.c:492 func:unpack_to_rootfs 0 0 init/initramfs.c:491 func:unpack_to_rootfs 512 1 arch/x86/events/rapl.c:681 func:init_rapl_pmus 128 1 arch/x86/events/rapl.c:571 func:rapl_cpu_online [akpm@linux-foundation.org: remove stray newline from struct allocinfo_private] Link: https://lkml.kernel.org/r/20240514163128.3662251-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
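Each data line in the example output above has the form "<size> <calls> <tag info>", preceded by a version line and a '#' header. A minimal userspace sketch of parsing one such line, assuming the version 1.0 layout quoted in the commit message; struct allocinfo_entry and parse_allocinfo_line() are hypothetical names chosen for illustration, not kernel or library APIs.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one /proc/allocinfo data line. */
struct allocinfo_entry {
    unsigned long long size;   /* value of the <size> column */
    unsigned long long calls;  /* value of the <calls> column */
    char tag[256];             /* rest of the line, e.g. "init/main.c:1314 func:do_initcalls" */
};

/*
 * Parse one "<size> <calls> <tag info>" line.
 * Returns 0 on success, -1 for the version line, the '#' header,
 * or any line that does not match the expected layout.
 */
int parse_allocinfo_line(const char *line, struct allocinfo_entry *out)
{
    if (line[0] == '#' || strncmp(line, "allocinfo", 9) == 0)
        return -1;
    if (sscanf(line, "%llu %llu %255[^\n]",
               &out->size, &out->calls, out->tag) != 3)
        return -1;
    return 0;
}
```

A reader built this way would check the version line first and refuse formats it does not know, which is the kind of later format change the version string is meant to allow.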
2024-05-24	mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL	Hailong.Liu	1	-3/+2
	Commit a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc") includes support for __GFP_NOFAIL, but it conflicts with commit dd544141b9eb ("vmalloc: back off when the current task is OOM-killed"). A possible scenario is as follows: process-a __vmalloc_node_range(GFP_KERNEL | __GFP_NOFAIL) __vmalloc_area_node() vm_area_alloc_pages() --> oom-killer send SIGKILL to process-a if (fatal_signal_pending(current)) break; --> return NULL; To fix this, do not check fatal_signal_pending() in vm_area_alloc_pages() if __GFP_NOFAIL is set. This issue occurred during OPLUS KASAN TEST. Below is part of the log -> oom-killer sends signal to process [65731.222840] [ T1308] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/apps/uid_10198,task=gs.intelligence,pid=32454,uid=10198 [65731.259685] [T32454] Call trace: [65731.259698] [T32454] dump_backtrace+0xf4/0x118 [65731.259734] [T32454] show_stack+0x18/0x24 [65731.259756] [T32454] dump_stack_lvl+0x60/0x7c [65731.259781] [T32454] dump_stack+0x18/0x38 [65731.259800] [T32454] mrdump_common_die+0x250/0x39c [mrdump] [65731.259936] [T32454] ipanic_die+0x20/0x34 [mrdump] [65731.260019] [T32454] atomic_notifier_call_chain+0xb4/0xfc [65731.260047] [T32454] notify_die+0x114/0x198 [65731.260073] [T32454] die+0xf4/0x5b4 [65731.260098] [T32454] die_kernel_fault+0x80/0x98 [65731.260124] [T32454] __do_kernel_fault+0x160/0x2a8 [65731.260146] [T32454] do_bad_area+0x68/0x148 [65731.260174] [T32454] do_mem_abort+0x151c/0x1b34 [65731.260204] [T32454] el1_abort+0x3c/0x5c [65731.260227] [T32454] el1h_64_sync_handler+0x54/0x90 [65731.260248] [T32454] el1h_64_sync+0x68/0x6c [65731.260269] [T32454] z_erofs_decompress_queue+0x7f0/0x2258 --> be->decompressed_pages = kvcalloc(be->nr_pages, sizeof(struct page *), GFP_KERNEL | __GFP_NOFAIL); kernel panic by NULL pointer dereference. erofs assumes kvmalloc with __GFP_NOFAIL never returns NULL. [65731.260293] [T32454] z_erofs_runqueue+0xf30/0x104c [65731.260314] [T32454] z_erofs_readahead+0x4f0/0x968 [65731.260339] [T32454] read_pages+0x170/0xadc [65731.260364] [T32454] page_cache_ra_unbounded+0x874/0xf30 [65731.260388] [T32454] page_cache_ra_order+0x24c/0x714 [65731.260411] [T32454] filemap_fault+0xbf0/0x1a74 [65731.260437] [T32454] __do_fault+0xd0/0x33c [65731.260462] [T32454] handle_mm_fault+0xf74/0x3fe0 [65731.260486] [T32454] do_mem_abort+0x54c/0x1b34 [65731.260509] [T32454] el0_da+0x44/0x94 [65731.260531] [T32454] el0t_64_sync_handler+0x98/0xb4 [65731.260553] [T32454] el0t_64_sync+0x198/0x19c Link: https://lkml.kernel.org/r/20240510100131.1865-1-hailong.liu@oppo.com Fixes: 9376130c390a ("mm/vmalloc: add support for __GFP_NOFAIL") Signed-off-by: Hailong.Liu <hailong.liu@oppo.com> Acked-by: Michal Hocko <mhocko@suse.com> Suggested-by: Barry Song <21cnbao@gmail.com> Reported-by: Oven <liyangouwen1@oppo.com> Reviewed-by: Barry Song <baohua@kernel.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Chao Yu <chao@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Gao Xiang <xiang@kernel.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
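The scenario in this commit message can be modelled in a few lines of userspace C. This is only a sketch of the control flow described in the text, not the kernel's vm_area_alloc_pages(); model_alloc_pages() and its parameters are hypothetical, with 'nofail' standing in for __GFP_NOFAIL being set and 'fatal_signal' for fatal_signal_pending(current) returning true.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of a page-allocation loop; not kernel code.
 * Returns how many of the nr_pages requested pages were obtained.
 */
size_t model_alloc_pages(size_t nr_pages, bool nofail, bool fatal_signal)
{
    size_t allocated = 0;

    while (allocated < nr_pages) {
        /*
         * Back off for an OOM-killed task, but only when the caller
         * did not ask for a no-fail allocation (the behaviour the
         * commit message describes after the fix).
         */
        if (!nofail && fatal_signal)
            break;
        allocated++; /* pretend one page was allocated */
    }
    return allocated;
}
```

With nofail set, the model always returns the full count even when a fatal signal is pending, so a caller that never checks for NULL, as erofs does in the trace above, is not handed a short allocation.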
2024-05-22	mm: simplify and improve print_vma_addr() output	Linus Torvalds	1	-13/+6
	Use '%pD' to print out the filename, and print out the actual offset within the file too, rather than just what the virtual address of the mapping is (which doesn't tell you anything about any mapping offsets). Also, use the exact vma_lookup() instead of find_vma() - the latter looks up any vma _after_ the address, which is of questionable value (yes, maybe you fell off the beginning, but you'd be more likely to fall off the end). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-22	x86: improve bitop code generation with clang	Linus Torvalds	1	-5/+5
	This uses the new ASM_INPUT_RM macro to avoid the bad code generation issue that clang has with more generic asm inputs. This ends up avoiding generating code like this: mov %r10,(%rsp) tzcnt (%rsp),%rcx which now becomes just tzcnt %r10,%rcx and in the process ends up also removing a few unnecessary stack frames when the only use was that pointless "asm uses memory location off stack". Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-22	x86: improve array_index_mask_nospec() code generation	Linus Torvalds	1	-14/+10
	Don't force the inputs to be 'unsigned long', when the comparison can easily be done in 32-bit if that's more appropriate. Note that while we can look at the inputs to choose an appropriate size for the compare instruction, the output is fixed at 'unsigned long'. That's not technically optimal either, since a 32-bit 'sbbl' would often be sufficient. But for the outgoing mask we don't know how the mask ends up being used (ie we have uses that have an incoming 32-bit array index, but end up using the mask for other things). That said, it only costs the extra REX prefix to always generate the 64-bit mask. [ A 'sbbl' also always technically generates a 64-bit mask, but with the upper 32 bits clear: that's fine for when the incoming index that will be masked is already 32-bit, but not if you use the mask to mask a pointer afterwards, like the file table lookup does ] Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
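The bracketed remark about 32-bit versus 64-bit masks can be illustrated in portable C. The sketch below only shows the arithmetic of the resulting mask values; it is not the kernel's array_index_mask_nospec(), it uses an ordinary comparison, and it offers no protection against speculation. Both function names are hypothetical.

```c
#include <stdint.h>

/* All ones when index < size, zero otherwise (full 64-bit mask). */
uint64_t mask64_sketch(uint64_t index, uint64_t size)
{
    return (uint64_t)0 - (uint64_t)(index < size);
}

/*
 * Models a mask computed in 32 bits and then zero-extended to
 * 64 bits: the upper 32 bits of the result are always clear.
 */
uint64_t mask32_sketch(uint32_t index, uint32_t size)
{
    uint32_t m = (uint32_t)0 - (uint32_t)(index < size);

    return (uint64_t)m;
}
```

ANDing a 32-bit index with either mask gives the same result, but ANDing a 64-bit pointer value with the zero-extended mask clears its upper half, which is why the commit keeps the outgoing mask at 'unsigned long'.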
2024-05-22	clang: work around asm input constraint problems	Linus Torvalds	2	-0/+19
	Work around clang problems with asm constraints that have multiple possibilities, particularly "g" and "rm". Clang seems to turn inputs like that into the most generic form, which is the memory input - but to make matters worse, clang won't even use a possible original memory location, but will spill the value to stack, and use the stack for the asm input. See https://github.com/llvm/llvm-project/issues/20571#issuecomment-980933442 for some explanation of why clang has this strange behavior, but the end result is that "g" and "rm" really end up generating horrid code. Link: https://github.com/llvm/llvm-project/issues/20571 Cc: Peter Zijlstra <peterz@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-22	vfs: Delete the associated dentry when deleting a file	Yafang Shao	1	-8/+7
	Our applications, built on Elasticsearch[0], frequently create and delete files. These applications operate within containers, some with a memory limit exceeding 100GB. Over prolonged periods, the accumulation of negative dentries within these containers can amount to tens of gigabytes. Upon container exit, directories are deleted. However, due to the numerous associated dentries, this process can be time-consuming. Our users have expressed frustration with this prolonged exit duration, which constitutes our first issue. Simultaneously, other processes may attempt to access the parent directory of the Elasticsearch directories. Since the task responsible for deleting the dentries holds the inode lock, processes attempting directory lookup experience significant delays. This issue, our second problem, is easily demonstrated: - Task 1 generates negative dentries: $ pwd ~/test $ mkdir es && cd es/ && ./create_and_delete_files.sh [ After generating tens of GB dentries ] $ cd ~/test && rm -rf es [ It will take a long duration to finish ] - Task 2 attempts to lookup the 'test/' directory $ pwd ~/test $ ls The 'ls' command in Task 2 experiences prolonged execution as Task 1 is deleting the dentries. We've devised a solution to address both issues by deleting associated dentry when removing a file. Interestingly, we've noted that a similar patch was proposed years ago[1], although it was rejected citing the absence of tangible issues caused by negative dentries. Given our current challenges, we're resubmitting the proposal. All relevant stakeholders from previous discussions have been included for reference. Some alternative solutions are also under discussion[2][3], such as shrinking child dentries outside of the parent inode lock or even asynchronously shrinking child dentries. However, given the straightforward nature of the current solution, I believe this approach is still necessary. [ NOTE! 
This is a pretty fundamental change in how we deal with unlinking dentries, and it doesn't change the fact that you can have lots of negative dentries from just doing negative lookups. But the kernel test robot is at least initially happy with this from a performance angle, so I'm applying this ASAP just to get more testing and as a "known fix for an issue people hit in real life". Put another way: we should still look at the alternatives, and this patch may get reverted if somebody finds a performance regression on some other load. - Linus ] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://github.com/elastic/elasticsearch [0] Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [1] Link: https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [2] Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ [3] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Waiman Long <longman@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Wangkai <wangkai86@huawei.com> Cc: Colin Walters <walters@verbum.org> Tested-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-21	gpiolib: acpi: Fix failed in acpi_gpiochip_find() by adding parent node match	Devyn Liu	1	-1/+18
	A previous patch changed the criterion used by acpi_gpiochip_find() to match device nodes, using the device node set in gc->gpiodev->dev instead of gc->parent. However, there is a situation in gpio-dwapb where the GPIO device driver will set gc->fwnode for each port corresponding to a child node under a GPIO device, so gc->gpiodev->dev will be assigned the value of each child node in gpiochip_add_data(). gpio-dwapb.c: 128,31 static int dwapb_gpio_add_port(struct dwapb_gpio gpio, struct dwapb_port_property pp, unsigned int offs); port->gc.fwnode = pp->fwnode; 693,39 static int dwapb_gpio_probe; err = dwapb_gpio_add_port(gpio, &pdata->properties[i], i); When other drivers request GPIO pin resources through the GPIO device node provided by ACPI (corresponding to the parent node), the change of the matching object to gc->gpiodev->dev in acpi_gpiochip_find() only allows finding the value of each port (child node), resulting in a failed request. Reapplying the condition of using gc->parent for the match in acpi_gpiochip_find() makes the code compatible with gpio-dwapb, and will not affect the two cases mentioned in the patch: 1. There is no setting for gc->fwnode. 2. The case that depends on using gc->fwnode for match. Fixes: 5062e4c14b75 ("gpiolib: acpi: use the fwnode in acpi_gpiochip_find()") Fixes: 067dbc1ea5ce ("gpiolib: acpi: Don't use GPIO chip fwnode in acpi_gpiochip_find()") Signed-off-by: Devyn Liu <liudingyuan@huawei.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Benjamin Tissoires <bentiss@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-05-21	gpiolib: acpi: Move ACPI device NULL check to acpi_can_fallback_to_crs()	Laura Nao	1	-3/+7
	Following the relocation of the function call outside of __acpi_find_gpio(), move the ACPI device NULL check to acpi_can_fallback_to_crs(). Signed-off-by: Laura Nao <laura.nao@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reported-by: kernelci.org bot <bot@kernelci.org> Closes: https://lore.kernel.org/all/20240426154208.81894-1-laura.nao@collabora.com/ Fixes: 49c02f6e901c ("gpiolib: acpi: Move acpi_can_fallback_to_crs() out of __acpi_find_gpio()") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2024-05-21	fs/pidfs: make 'lsof' happy with our inode changes	Linus Torvalds	1	-4/+24
	pidfs started using much saner inodes in commit b28ddcc32d8f ("pidfs: convert to path_from_stashed() helper"), but that exposed the fact that lsof had some knowledge of just how odd our old anon_inode usage was. For example, legacy anon_inodes hadn't even initialized the inode type in the inode mode, so everything had a type of zero. So sane tools like 'stat' would report these files as "weird file", but 'lsof' instead used that (together with the name of the link in proc) to notice that it's an anonymous inode, and used it to detect pidfd files. Let's keep our internal new sane inode model, but mask the file type bits at 'stat()' time in the getattr() function we already have, and by making the dentry name match what lsof expects too. This keeps our internal models sane, but should make user space see the same old odd behavior. Reported-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/all/a15b1050-4b52-4740-a122-a4d055c17f11@kernel.org/ Link: https://github.com/lsof-org/lsof/issues/317 Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Seth Forshee <sforshee@kernel.org> Cc: Tycho Andersen <tycho@tycho.pizza> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-21	KEYS: trusted: Do not use WARN when encode fails	Jarkko Sakkinen	1	-1/+2
	When asn1_encode_sequence() fails, WARN is not the correct solution. 1. asn1_encode_sequence() is not an internal function (located in lib/asn1_encode.c). 2. Location is known, which makes the stack trace useless. 3. Results in a crash if panic_on_warn is set. It is also noteworthy that the use of WARN is undocumented, and it should be avoided unless there is a carefully considered rationale to use it. Replace WARN with pr_err, and print the return value instead, which is the only useful piece of information. Cc: stable@vger.kernel.org # v5.13+ Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-21	KEYS: trusted: Fix memory leak in tpm2_key_encode()	Jarkko Sakkinen	1	-6/+18
	'scratch' is never freed. Fix this by calling kfree() in the success, and in the error case. Cc: stable@vger.kernel.org # +v5.13 Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
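Freeing a temporary buffer on both the success path and the error path is often done with a single exit label. The userspace sketch below shows that pattern with malloc()/free() standing in for kmalloc()/kfree(); it is one common way to structure such a fix and is not taken from the patch itself. encode_step(), encode_key_sketch() and the live_buffers counter are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical counter so the sketch can show that nothing leaks. */
int live_buffers;

/* Stand-in for an encoding step that may fail. */
static int encode_step(int should_fail)
{
    return should_fail ? -EINVAL : 0;
}

int encode_key_sketch(int should_fail)
{
    unsigned char *scratch;
    int ret;

    scratch = malloc(256);
    if (!scratch)
        return -ENOMEM;
    live_buffers++;

    ret = encode_step(should_fail);
    if (ret < 0)
        goto out; /* error path still reaches the free below */

    /* ... further use of scratch would go here ... */
    ret = 0;
out:
    free(scratch); /* runs for both success and failure */
    live_buffers--;
    return ret;
}
```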
2024-05-20	arch: Fix name collision with ACPI's video.o	Thomas Zimmermann	4	-2/+2
	Commit 2fd001cd3600 ("arch: Rename fbdev header and source files") renames the video source files under arch/ such that they do not refer to fbdev any longer. The new files named video.o conflict with ACPI's video.ko module. Modprobing the ACPI module can then fail with warnings about missing symbols, as shown below. (i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_unregister (err -2) (i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_register_backlight (err -2) (i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol __acpi_video_get_backlight_type (err -2) (i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_register (err -2) Fix the issue by renaming the architecture's video.o to video-common.o. Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/intel-gfx/9dcac6e9-a3bf-4ace-bbdc-f697f767f9e0@suse.de/T/#t Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 2fd001cd3600 ("arch: Rename fbdev header and source files") Reviewed-by: Hans de Goede <hdegoede@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: linux-arch@vger.kernel.org Cc: linux-fbdev@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-20	dm: always manage discard support in terms of max_hw_discard_sectors	Mike Snitzer	9	-13/+9
	Commit 4f563a64732d ("block: add a max_user_discard_sectors queue limit") changed block core to set max_discard_sectors to: min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors) Since commit 1c0e720228ad ("dm: use queue_limits_set") it was reported dm-thinp was failing in a few fstests (generic/347 and generic/405) with the first WARN_ON_ONCE in dm_cell_key_has_valid_range() being reported, e.g.: WARNING: CPU: 1 PID: 30 at drivers/md/dm-bio-prison-v1.c:128 dm_cell_key_has_valid_range+0x3d/0x50 blk_set_stacking_limits() sets max_user_discard_sectors to UINT_MAX, so given how block core now sets max_discard_sectors (detailed above) it follows that blk_stack_limits() stacks up the underlying device's max_hw_discard_sectors and max_discard_sectors is set to match it. If max_hw_discard_sectors exceeds dm's BIO_PRISON_MAX_RANGE, then dm_cell_key_has_valid_range() will trigger the warning with: WARN_ON_ONCE(key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE) Aside from this warning, the discard will fail. Fix this and other DM issues by governing discard support in terms of max_hw_discard_sectors instead of max_discard_sectors. Reported-by: Theodore Ts'o <tytso@mit.edu> Fixes: 1c0e720228ad ("dm: use queue_limits_set") Signed-off-by: Mike Snitzer <snitzer@kernel.org>
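The two conditions quoted in this commit message can be written out as a small sketch. The functions below are hypothetical userspace helpers, not block-layer or dm code; max_range stands in for BIO_PRISON_MAX_RANGE, whose value is not given in the text, and the numbers used with them are illustrative only.

```c
#include <limits.h>
#include <stdbool.h>

/* min(max_hw_discard_sectors, max_user_discard_sectors), as quoted above. */
unsigned int effective_max_discard(unsigned int max_hw, unsigned int max_user)
{
    return max_hw < max_user ? max_hw : max_user;
}

/*
 * Mirrors the quoted WARN_ON_ONCE() condition:
 * key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE.
 */
bool range_exceeds_limit(unsigned long long block_begin,
                         unsigned long long block_end,
                         unsigned long long max_range)
{
    return block_end - block_begin > max_range;
}
```

When the user limit is UINT_MAX, as the text says blk_set_stacking_limits() sets it, effective_max_discard() simply returns the hardware limit, so nothing caps the discard size below the range dm can handle.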
2024-05-20	dm-integrity: set discard_granularity to logical block size	Mikulas Patocka	1	-0/+1
	dm-integrity could set discard_granularity lower than the logical block size. This could result in failures when sending discard requests to dm-integrity. This fix is needed for kernels prior to 6.10. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Eric Wheeler <linux-integrity@lists.ewheeler.net> Cc: stable@vger.kernel.org # <= 6.9 Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2024-05-20	Revert "fanotify: remove unneeded sub-zero check for unsigned value"	Linus Torvalds	1	-1/+1
	This reverts commit e6595224464b692ddae193d783402130d1625147. These kinds of patches are only making the code worse. Compilers don't care about the unnecessary check, but removing it makes the code less obvious to a human. The declaration of 'len' is more than 80 lines earlier, so a human won't easily see that 'len' is of an unsigned type, so to a human the range check that checks against zero is much more explicit and obvious. Any tool that complains about a range check like this just because the variable is unsigned is actively detrimental, and should be ignored. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-20	Coccinelle: pm_runtime: Fix grammar in comment	Thorsten Blum	1	-1/+1
	s/does not use unnecessary/do not unnecessarily use/ Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
2024-05-20	coccinelle: misc: minmax: Suppress reports for err returns	Ricardo Ribalda	1	-16/+16
	Most people prefer: return ret < 0 ? ret : 0; to: return min(ret, 0); Let's tweak the cocci file to ignore those lines completely. Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
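The two spellings compared in this commit message compute the same value. A minimal sketch, using a local MIN_SKETCH() macro as a stand-in for the kernel's min() (the macro and both function names are hypothetical):

```c
/* Local stand-in for the kernel's min() macro, for illustration only. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* The form most people prefer: the error intent is explicit. */
int err_or_zero_ternary(int ret)
{
    return ret < 0 ? ret : 0;
}

/* The form a minmax check would suggest: same result, less obvious intent. */
int err_or_zero_min(int ret)
{
    return MIN_SKETCH(ret, 0);
}
```

Both return the negative error code unchanged and collapse any non-negative value to 0; the difference is purely one of readability, which is why the report is suppressed rather than acted on.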
2024-05-20	Revert "selftests/cgroup: Drop define _GNU_SOURCE"	Shuah Khan	7	-0/+15
	This reverts commit c1457d9aad5ee2feafcf85aa9a58ab50500159d2. The framework change to add -D_GNU_SOURCE to KHDR_INCLUDES in Makefile, lib.mk, and kselftest_harness.h is being reverted because it causes build failures and warnings. Revert this change too, as it depends on the framework change. Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-20	Revert "selftests/sgx: Include KHDR_INCLUDES in Makefile"	Shuah Khan	2	-1/+2
	This reverts commit 2c3b8f8f37c6c0c926d584cf4158db95e62b960c. The framework change to add -D_GNU_SOURCE to KHDR_INCLUDES in Makefile, lib.mk, and kselftest_harness.h is being reverted because it causes build failures and warnings. Revert this change too, as it depends on the framework change. Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-20	Revert "selftests: Compile kselftest headers with -D_GNU_SOURCE"	Shuah Khan	3	-4/+4
	This reverts commit daef47b89efd0b745e8478d69a3ad724bd8b4dc6. This framework change to add -D_GNU_SOURCE to KHDR_INCLUDES in Makefile, lib.mk, and kselftest_harness.h is causing build failures and warnings. Revert this change. Reported-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2024-05-19	dt-bindings: mailbox: qcom-ipcc: Document the SDX75 IPCC	Rohit Agarwal	1	-0/+1
	Document the Inter-Processor Communication Controller on the SDX75 Platform. Signed-off-by: Rohit Agarwal <quic_rohiagar@quicinc.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	dt-bindings: mailbox: qcom: Add MSM8974 APCS compatible	Luca Weiss	1	-0/+1
	Add compatible for the Qualcomm MSM8974 APCS block. Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: Convert from tasklet to BH workqueue	Allen Pais	2	-18/+19
	The only generic interface to execute asynchronously in the BH context is tasklet; however, it's marked deprecated and has some design flaws. To replace tasklets, BH workqueue support was recently added. A BH workqueue behaves similarly to regular workqueues except that the queued work items are executed in the BH context. Based on the work done by Tejun Heo <tj@kernel.org> Branch: https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-6.10 Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: mtk-cmdq: Fix pm_runtime_get_sync() warning in mbox shutdown	Jason-JH.Lin	1	-1/+1
	pm_runtime_get_sync() in cmdq_mbox_shutdown() returns 1 when the runtime PM state is already active, and we don't want a warning message in that case. So change the WARN_ON() condition to trigger only when the return value is < 0. Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend") Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
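The difference between the two checks can be sketched with plain predicates. Here 'ret' models the value returned by pm_runtime_get_sync(): negative on error, 0 on success, and 1 when the device was already active. Both function names are hypothetical, and the "any nonzero" form is an assumption about what the earlier check amounted to, not a quote from the driver.

```c
#include <stdbool.h>

/* Assumed earlier behaviour: any nonzero return triggers the warning. */
bool warn_on_any_nonzero(int ret)
{
    return ret != 0;
}

/* Behaviour described by the commit: warn only on a real error. */
bool warn_on_negative(int ret)
{
    return ret < 0;
}
```

For ret == 1 the first predicate fires and the second does not, which is exactly the spurious warning the commit removes.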
2024-05-19	mailbox: mtk-cmdq-mailbox: fix module autoloading	Krzysztof Kozlowski	1	-0/+1
	Add MODULE_DEVICE_TABLE(), so this module could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: zynqmp: handle SGI for shared IPI	Tanmay Shah	1	-7/+152
	At least one IPI is used in TF-A for communication with PMC firmware. If this IPI needs to be used by other agents such as the RPU, then the IPI system interrupt can't be generated in the mailbox driver. In such a case, TF-A generates an SGI to the mailbox driver for IPI notification. Signed-off-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@amd.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: arm_mhuv3: Add driver	Cristian Marussi	4	-0/+1126
	Add support for ARM MHUv3 mailbox controller. Support is limited to the MHUv3 Doorbell extension using only the PBX/MBX combined interrupts. Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	dt-bindings: mailbox: arm,mhuv3: Add bindings	Cristian Marussi	2	-0/+237
	Add bindings for the ARM MHUv3 Mailbox controller. Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Cristian Marussi <cristian.marussi@arm.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove kernel FIFO message queuing	Andrew Davis	2	-111/+5
	The kernel FIFO queue has a couple issues. The biggest issue is that it causes extra latency in a path that can be used in real-time tasks, such as communication with real-time remote processors. The whole FIFO idea itself looks to be a leftover from before the unified mailbox framework. The current mailbox framework expects mbox_chan_received_data() to be called with data immediately as it arrives. Remove the FIFO and pass the messages to the mailbox framework directly as part of a threaded IRQ handler. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Reverse FIFO busy check logic	Andrew Davis	1	-17/+16
	It is much clearer to check if the hardware FIFO is full and return EBUSY if true. This also allows us to remove one level of indentation from the core of this function. It also makes the similarities between omap_mbox_chan_send_noirq() and omap_mbox_chan_send() more obvious. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove mbox_chan_to_omap_mbox()	Andrew Davis	1	-11/+3
	This function only checks that mbox_chan *chan is not NULL, but chan can never be NULL here, and even if it were, returning NULL (which the callers never check) would not save us. The second check for chan->con_priv is completely redundant: if it were NULL we would return NULL just the same. Simply dereference con_priv directly and remove this function. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Use mbox_controller channel list directly	Andrew Davis	1	-31/+11
	The driver stores a list of omap_mbox structs so it can later use it to lookup the mailbox names in of_xlate. This same information is already available in the mbox_controller passed into of_xlate. Simply use that data and remove the extra allocation and storage of the omap_mbox list. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Use function local struct mbox_controller	Andrew Davis	1	-9/+12
	The mbox_controller struct is only needed in the probe function. Make it a local variable instead of storing a copy in omap_mbox_device to simplify that struct. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Merge mailbox child node setup loops	Andrew Davis	1	-73/+46
	Currently the driver loops through all mailbox child nodes twice, once to read in data from each node, and again to make use of this data. Instead read the data and make use of it in one pass. This removes the need for several temporary data structures and reduces the complexity of this main loop in probe. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Use devm_pm_runtime_enable() helper	Andrew Davis	1	-15/+3
	Use device life-cycle managed runtime enable function to simplify probe and exit paths. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove device class	Andrew Davis	1	-87/+2
	The driver currently creates a new device class "mbox". Then for each mailbox it adds a device to that class. This class provides no file operations for any userspace users of this device class. It may have been extended to be functional in our vendor tree at some point, but that is not the case anymore, nor does it matter for the upstream tree. Remove this device class and related functions and variables. This also allows us to switch to module_platform_driver() as there is nothing left to do in module_init(). Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove unneeded header omap-mailbox.h	Andrew Davis	1	-5/+2
	The type of message sent using omap-mailbox is always u32. The definition of mbox_msg_t is uintptr_t which is wrong as that type changes based on the architecture (32bit vs 64bit). This type should have been defined as u32. Instead of making that change here, simply remove the header usage and fix the last couple users of the same in this driver. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Move fifo size check to point of use	Andrew Davis	1	-5/+5
	The mbox_kfifo_size can be changed at runtime, so the sanity check on its value should be done when it is used, not only once at init time. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Move omap_mbox_irq_t into driver	Andrew Davis	2	-4/+5
	This is only used internally by the driver, so move it out of the public header and into the driver file. While we are here, this is not used as a bitwise type, so drop that and make it a simple enum type. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove unused omap_mbox_request_channel() function	Andrew Davis	2	-42/+0
	This function is not used, so remove it. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	mailbox: omap: Remove unused omap_mbox_{enable,disable}_irq() functions	Andrew Davis	2	-35/+10
	These functions are not used, so remove them. While here, remove the leading _ from the driver-internal functions that do the same thing as the removed functions. Signed-off-by: Andrew Davis <afd@ti.com> Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
2024-05-19	usercopy: Don't use "proxy" headers	Andy Shevchenko	1	-2/+6
	Update header inclusions to follow IWYU (Include What You Use) principle. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19	bitops: Move aligned_byte_mask() to wordpart.h	Andy Shevchenko	3	-7/+8
	bitops.h is for bit-related operations. aligned_byte_mask() is about byte (or part of the machine word) operations, for which we have a separate header. Move the macro to wordpart.h to consolidate similar operations. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19	MAINTAINERS: add BITOPS API record	Yury Norov	1	-0/+14
	The bitops API is very basic and widely used by the kernel, but the corresponding files are not maintained. Bitmaps actively use bit operations, and a big share of bitops material already moves through the bitmap branch. I would like to take a closer look at bitops. This patch creates a BITOPS API record in MAINTAINERS, and adds Rasmus as a reviewer and myself as a maintainer of those files. CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-05-19	mm/page-owner: use gfp_nested_mask() instead of open coded masking	Dave Chinner	1	-6/+1
	The page-owner tracking code records stack traces during page allocation. To do this, it must do a memory allocation for the stack information from inside an existing memory allocation context. This internal allocation must obey the high level caller allocation constraints to avoid generating false positive warnings that have nothing to do with the code they are instrumenting/tracking (e.g. through lockdep reclaim state tracking). We also don't want recording stack traces to deplete emergency memory reserves - debug code is useless if it creates new issues that can't be replicated when the debug code is disabled. Switch the stack tracking allocation masking to use gfp_nested_mask() to address these issues. gfp_nested_mask() naturally strips GFP_ZONEMASK, too, which greatly simplifies this code. Link: https://lkml.kernel.org/r/20240430054604.4169568-4-david@fromorbit.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19	stackdepot: use gfp_nested_mask() instead of open coded masking	Dave Chinner	1	-9/+2
	The stackdepot code is used by KASAN and lockdep for recording stack traces. Both of these track allocation context information, and so their internal allocations must obey the caller allocation contexts to avoid generating their own false positive warnings that have nothing to do with the code they are instrumenting/tracking. We also don't want recording stack traces to deplete emergency memory reserves - debug code is useless if it creates new issues that can't be replicated when the debug code is disabled. Switch the stackdepot allocation masking to use gfp_nested_mask() to address these issues. gfp_nested_mask() also strips GFP_ZONEMASK naturally, so that greatly simplifies this code. Link: https://lkml.kernel.org/r/20240430054604.4169568-3-david@fromorbit.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19	mm: lift gfp_kmemleak_mask() to gfp.h	Dave Chinner	2	-8/+29
	Patch series "mm: fix nested allocation context filtering". This patchset is the followup to the comment I made earlier today: https://lore.kernel.org/linux-xfs/ZjAyIWUzDipofHFJ@dread.disaster.area/ Tl;dr: Memory allocations that are done inside the public memory allocation API need to obey the reclaim recursion constraints placed on the allocation by the original caller, including the "don't track recursion for this allocation" case defined by __GFP_NOLOCKDEP. These nested allocations are generally in debug code that is tracking something about the allocation (kmemleak, KASAN, etc) and so are allocating private kernel objects that only that debug system will use. Neither the page-owner code nor the stack depot code gets this right. They also clear GFP_ZONEMASK as a separate operation, which is completely redundant because the constraint filter applied immediately after guarantees that GFP_ZONEMASK bits are cleared. kmemleak gets this filtering right. It preserves the allocation constraints for deadlock prevention and clears all other context flags whilst also ensuring that the nested allocation will fail quickly, silently and without depleting emergency kernel reserves if there is no memory available. This can be made much more robust, immune to whack-a-mole games and the code greatly simplified by lifting gfp_kmemleak_mask() to include/linux/gfp.h and using that everywhere. Also document it so that there is no excuse for not knowing about it when writing new debug code that nests allocations. Tested with lockdep, KASAN + page_owner=on and kmemleak=on over multiple fstests runs with XFS. This patch (of 3): Any "internal" nested allocation done from within an allocation context needs to obey the high level allocation gfp_mask constraints. This is necessary for debug code like KASAN, kmemleak, lockdep, etc that allocate memory for saving stack traces and other information during memory allocation.
If they don't obey things like __GFP_NOLOCKDEP or __GFP_NOWARN, they produce false positive failure detections. kmemleak gets this right by using gfp_kmemleak_mask() to pass through the relevant context flags to the nested allocation to ensure that the allocation follows the constraints of the caller context. KASAN recently was found to be missing __GFP_NOLOCKDEP due to stack depot allocations, and even more recently the page owner tracking code was also found to be missing __GFP_NOLOCKDEP support. We also don't want KASAN or lockdep to drive the system into OOM kill territory by exhausting emergency reserves. This is something that kmemleak also gets right by adding (__GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN) to the allocation mask. Hence it is clear that we need to define a common nested allocation filter mask for these sorts of third party nested allocations used in debug code. So to start this process, lift gfp_kmemleak_mask() to gfp.h and rename it to gfp_nested_mask(), and convert the kmemleak callers to use it. Link: https://lkml.kernel.org/r/20240430054604.4169568-1-david@fromorbit.com Link: https://lkml.kernel.org/r/20240430054604.4169568-2-david@fromorbit.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
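The filtering described in the message above can be sketched in user-space C as follows. This is an illustration under stated assumptions, not the kernel's definition: every `X_GFP_*` name and bit value is hypothetical, as is the function name `sketch_gfp_nested_mask()`. The set of caller constraints that is passed through (the kernel/atomic context bits plus the no-lockdep bit) is an assumption modelled on the message; only the three added flags are taken directly from it.

```c
#include <stdint.h>

typedef uint32_t gfp_t;

/* Hypothetical bit assignments for illustration only; the real values
 * are defined in the kernel headers and differ from these. */
#define X_GFP_DMA            (1u << 0)  /* stands in for a zone-selector bit */
#define X_GFP_HIGH           (1u << 1)
#define X_GFP_IO             (1u << 2)
#define X_GFP_FS             (1u << 3)
#define X_GFP_DIRECT_RECLAIM (1u << 4)
#define X_GFP_KSWAPD_RECLAIM (1u << 5)
#define X_GFP_NOWARN         (1u << 6)
#define X_GFP_NORETRY        (1u << 7)
#define X_GFP_NOMEMALLOC     (1u << 8)
#define X_GFP_NOLOCKDEP      (1u << 9)
#define X_GFP_ZERO           (1u << 10) /* an unrelated caller flag */

/* Assumed composition of the two common allocation contexts. */
#define X_GFP_KERNEL (X_GFP_DIRECT_RECLAIM | X_GFP_KSWAPD_RECLAIM | \
		      X_GFP_IO | X_GFP_FS)
#define X_GFP_ATOMIC (X_GFP_HIGH | X_GFP_KSWAPD_RECLAIM)

/*
 * Derive the mask for a nested (debug-code) allocation from the caller's mask:
 *  - keep only the caller's reclaim/context constraints and the
 *    no-lockdep request, which also drops zone-selector bits;
 *  - add flags so the nested allocation fails quickly and silently
 *    without touching emergency reserves.
 */
static inline gfp_t sketch_gfp_nested_mask(gfp_t flags)
{
	return (flags & (X_GFP_KERNEL | X_GFP_ATOMIC | X_GFP_NOLOCKDEP)) |
	       (X_GFP_NORETRY | X_GFP_NOMEMALLOC | X_GFP_NOWARN);
}
```

Because the pass-through mask contains no zone bits, a separate step to clear the zone selector is unnecessary, which matches the simplification the message describes.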
2024-05-19	nilfs2: make block erasure safe in nilfs_finish_roll_forward()	Ryusuke Konishi	1	-0/+4
	The implementation of writing a zero-fill block in nilfs_finish_roll_forward() is not safe. The buffer is being cleared without acquiring a lock or setting the uptodate flag, so theoretically, between the time the buffer's data is cleared and the time it is written back to the block device using sync_dirty_buffer(), that zero data can be undone by concurrent block device reads. Since this buffer points to a location that has been read from disk once, the uptodate flag will most likely remain, but since it was obtained with __getblk(), that is not guaranteed. In other words, this is exceptional, and this function itself is not normally called (only once when mounting after a specific pattern of unclean shutdown), so it is highly unlikely that this will actually cause a problem. Anyway, eliminate this potential race issue by protecting the clearing of buffer data with a buffer lock and setting the buffer's uptodate flag within the protected section. Link: https://lkml.kernel.org/r/20240511002942.9608-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>