aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2025-12-02Merge tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-4/+4
Pull scoped user access updates from Thomas Gleixner: "Scoped user mode access and related changes: - Implement the missing u64 user access function on ARM when CONFIG_CPU_SPECTRE=n. This makes it possible to access a 64bit value in generic code with [unsafe_]get_user(). All other architectures and ARM variants provide the relevant accessors already. - Ensure that ASM GOTO jump label usage in the user mode access helpers always goes through a local C scope label indirection inside the helpers. This is required because compilers are not supporting that a ASM GOTO target leaves a auto cleanup scope. GCC silently fails to emit the cleanup invocation and CLANG fails the build. [ Editor's note: gcc-16 will have fixed the code generation issue in commit f68fe3ddda4 ("eh: Invoke cleanups/destructors in asm goto jumps [PR122835]"). But we obviously have to deal with clang and older versions of gcc, so.. - Linus ] This provides generic wrapper macros and the conversion of affected architecture code to use them. - Scoped user mode access with auto cleanup Access to user mode memory can be required in hot code paths, but if it has to be done with user controlled pointers, the access is shielded with a speculation barrier, so that the CPU cannot speculate around the address range check. Those speculation barriers impact performance quite significantly. This cost can be avoided by "masking" the provided pointer so it is guaranteed to be in the valid user memory access range and otherwise to point to a guaranteed unpopulated address space. This has to be done without branches so it creates an address dependency for the access, which the CPU cannot speculate ahead. This results in repeating and error prone programming patterns: if (can_do_masked_user_access()) from = masked_user_read_access_begin((from)); else if (!user_read_access_begin(from, sizeof(*from))) return -EFAULT; unsafe_get_user(val, from, Efault); user_read_access_end(); return 0; Efault: user_read_access_end(); return -EFAULT; which can be replaced with scopes and automatic cleanup: scoped_user_read_access(from, Efault) unsafe_get_user(val, from, Efault); return 0; Efault: return -EFAULT; - Convert code which implements the above pattern over to scope_user.*.access(). This also corrects a couple of imbalanced masked_*_begin() instances which are harmless on most architectures, but prevent PowerPC from implementing the masking optimization. - Add a missing speculation barrier in copy_from_user_iter()" * tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required scm: Convert put_cmsg() to scoped user access iov_iter: Add missing speculation barrier to copy_from_user_iter() iov_iter: Convert copy_from_user_iter() to masked user access select: Convert to scoped user access x86/futex: Convert to scoped user access futex: Convert to get/put_user_inline() uaccess: Provide put/get_user_inline() uaccess: Provide scoped user access regions arm64: uaccess: Use unsafe wrappers for ASM GOTO s390/uaccess: Use unsafe wrappers for ASM GOTO riscv/uaccess: Use unsafe wrappers for ASM GOTO powerpc/uaccess: Use unsafe wrappers for ASM GOTO x86/uaccess: Use unsafe wrappers for ASM GOTO uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user() ARM: uaccess: Implement missing __get_user_asm_dword()
2025-12-01Merge tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-6/+6
Pull bug handling infrastructure updates from Ingo Molnar: "Core updates: - Improve WARN(), which has vararg printf like arguments, to work with the x86 #UD based WARN-optimizing infrastructure by hiding the format in the bug_table and replacing this first argument with the address of the bug-table entry, while making the actual function that's called a UD1 instruction (Peter Zijlstra) - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch (Ingo Molnar, s390 support by Heiko Carstens) Fixes and cleanups: - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens) - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS (Peter Zijlstra)" * tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERS x86/bug: Fix BUG_FORMAT vs KASLR x86_64/bug: Inline the UD1 x86/bug: Implement WARN_ONCE() x86_64/bug: Implement __WARN_printf() x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED x86/bug: Add BUG_FORMAT basics bug: Allow architectures to provide __WARN_printf() bug: Implement WARN_ON() using __WARN_FLAGS() bug: Add report_bug_entry() bug: Add BUG_FORMAT_ARGS infrastructure bug: Clean up CONFIG_GENERIC_BUG_RELATIVE_POINTERS bug: Add BUG_FORMAT infrastructure x86: Rework __bug_table helpers bugs/s390: Remove private WARN_ON() implementation bugs/core: Reorganize fields in the first line of WARNING output, add ->comm[] output bugs/sh: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output bugs/parisc: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output bugs/riscv: Concatenate 'cond_str' with '__FILE__' in __BUG_FLAGS(), to extend WARN_ON/BUG_ON output bugs/riscv: Pass in 'cond_str' to __BUG_FLAGS() ...
2025-12-01Merge tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds4-106/+32
Pull fd prepare updates from Christian Brauner: "This adds the FD_ADD() and FD_PREPARE() primitive. They simplify the common pattern of get_unused_fd_flags() + create file + fd_install() that is used extensively throughout the kernel and currently requires cumbersome cleanup paths. FD_ADD() - For simple cases where a file is installed immediately: fd = FD_ADD(O_CLOEXEC, vfio_device_open_file(device)); if (fd < 0) vfio_device_put_registration(device); return fd; FD_PREPARE() - For cases requiring access to the fd or file, or additional work before publishing: FD_PREPARE(fdf, O_CLOEXEC, sync_file->file); if (fdf.err) { fput(sync_file->file); return fdf.err; } data.fence = fd_prepare_fd(fdf); if (copy_to_user((void __user *)arg, &data, sizeof(data))) return -EFAULT; return fd_publish(fdf); The primitives are centered around struct fd_prepare. FD_PREPARE() encapsulates all allocation and cleanup logic and must be followed by a call to fd_publish() which associates the fd with the file and installs it into the caller's fdtable. If fd_publish() isn't called, both are deallocated automatically. FD_ADD() is a shorthand that does fd_publish() immediately and never exposes the struct to the caller. I've implemented this in a way that it's compatible with the cleanup infrastructure while also being usable separately. IOW, it's centered around struct fd_prepare which is aliased to class_fd_prepare_t and so we can make use of all the basica guard infrastructure" * tag 'vfs-6.19-rc1.fd_prepare.fs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits) io_uring: convert io_create_mock_file() to FD_PREPARE() file: convert replace_fd() to FD_PREPARE() vfio: convert vfio_group_ioctl_get_device_fd() to FD_ADD() tty: convert ptm_open_peer() to FD_ADD() ntsync: convert ntsync_obj_get_fd() to FD_PREPARE() media: convert media_request_alloc() to FD_PREPARE() hv: convert mshv_ioctl_create_partition() to FD_ADD() gpio: convert linehandle_create() to FD_PREPARE() pseries: port papr_rtas_setup_file_interface() to FD_ADD() pseries: convert papr_platform_dump_create_handle() to FD_ADD() spufs: convert spufs_gang_open() to FD_PREPARE() papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE() spufs: convert spufs_context_open() to FD_PREPARE() net/socket: convert __sys_accept4_file() to FD_ADD() net/socket: convert sock_map_fd() to FD_ADD() net/kcm: convert kcm_ioctl() to FD_PREPARE() net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE() secretmem: convert memfd_secret() to FD_ADD() memfd: convert memfd_create() to FD_ADD() bpf: convert bpf_token_create() to FD_PREPARE() ...
2025-12-01Merge tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-0/+1
Pull namespace updates from Christian Brauner: "This contains substantial namespace infrastructure changes including a new system call, active reference counting, and extensive header cleanups. The branch depends on the shared kbuild branch for -fms-extensions support. Features: - listns() system call Add a new listns() system call that allows userspace to iterate through namespaces in the system. This provides a programmatic interface to discover and inspect namespaces, addressing longstanding limitations: Currently, there is no direct way for userspace to enumerate namespaces. Applications must resort to scanning /proc/*/ns/ across all processes, which is: - Inefficient - requires iterating over all processes - Incomplete - misses namespaces not attached to any running process but kept alive by file descriptors, bind mounts, or parent references - Permission-heavy - requires access to /proc for many processes - No ordering or ownership information - No filtering per namespace type The listns() system call solves these problems: ssize_t listns(const struct ns_id_req *req, u64 *ns_ids, size_t nr_ns_ids, unsigned int flags); struct ns_id_req { __u32 size; __u32 spare; __u64 ns_id; struct /* listns */ { __u32 ns_type; __u32 spare2; __u64 user_ns_id; }; }; Features include: - Pagination support for large namespace sets - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.) - Filtering by owning user namespace - Permission checks respecting namespace isolation - Active Reference Counting Introduce an active reference count that tracks namespace visibility to userspace. A namespace is visible in the following cases: - The namespace is in use by a task - The namespace is persisted through a VFS object (namespace file descriptor or bind-mount) - The namespace is a hierarchical type and is the parent of child namespaces The active reference count does not regulate lifetime (that's still done by the normal reference count) - it only regulates visibility to namespace file handles and listns(). This prevents resurrection of namespaces that are pinned only for internal kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB references on idle CPUs, etc.) which should not be accessible via (1)-(3). - Unified Namespace Tree Introduce a unified tree structure for all namespaces with: - Fixed IDs assigned to initial namespaces - Lookup based solely on inode number - Maintained list of owned namespaces per user namespace - Simplified rbtree comparison helpers Cleanups - Header Reorganization: - Move namespace types into separate header (ns_common_types.h) - Decouple nstree from ns_common header - Move nstree types into separate header - Switch to new ns_tree_{node,root} structures with helper functions - Use guards for ns_tree_lock - Initial Namespace Reference Count Optimization - Make all reference counts on initial namespaces a nop to avoid pointless cacheline ping-pong for namespaces that can never go away - Drop custom reference count initialization for initial namespaces - Add NS_COMMON_INIT() macro and use it for all namespaces - pid: rely on common reference count behavior - Miscellaneous Cleanups - Rename exit_task_namespaces() to exit_nsproxy_namespaces() - Rename is_initial_namespace() and make argument const - Use boolean to indicate anonymous mount namespace - Simplify owner list iteration in nstree - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly - nsfs: use inode_just_drop() - pidfs: raise DCACHE_DONTCACHE explicitly - pidfs: simplify PIDFD_GET__NAMESPACE ioctls - libfs: allow to specify s_d_flags - cgroup: add cgroup namespace to tree after owner is set - nsproxy: fix free_nsproxy() and simplify create_new_namespaces() Fixes: - setns(pidfd, ...) race condition Fix a subtle race when using pidfds with setns(). When the target task exits after prepare_nsset() but before commit_nsset(), the namespace's active reference count might have been dropped. If setns() then installs the namespaces, it would bump the active reference count from zero without taking the required reference on the owner namespace, leading to underflow when later decremented. The fix resurrects the ownership chain if necessary - if the caller succeeded in grabbing passive references, the setns() should succeed even if the target task exits or gets reaped. - Return EFAULT on put_user() error instead of success - Make sure references are dropped outside of RCU lock (some namespaces like mount namespace sleep when putting the last reference) - Don't skip active reference count initialization for network namespace - Add asserts for active refcount underflow - Add asserts for initial namespace reference counts (both passive and active) - ipc: enable is_ns_init_id() assertions - Fix kernel-doc comments for internal nstree functions - Selftests - 15 active reference count tests - 9 listns() functionality tests - 7 listns() permission tests - 12 inactive namespace resurrection tests - 3 threaded active reference count tests - commit_creds() active reference tests - Pagination and stress tests - EFAULT handling test - nsid tests fixes" * tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits) pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls nstree: fix kernel-doc comments for internal functions nsproxy: fix free_nsproxy() and simplify create_new_namespaces() selftests/namespaces: fix nsid tests ns: drop custom reference count initialization for initial namespaces pid: rely on common reference count behavior ns: add asserts for initial namespace active reference counts ns: add asserts for initial namespace reference counts ns: make all reference counts on initial namespace a nop ipc: enable is_ns_init_id() assertions fs: use boolean to indicate anonymous mount namespace ns: rename is_initial_namespace() ns: make is_initial_namespace() argument const nstree: use guards for ns_tree_lock nstree: simplify owner list iteration nstree: switch to new structures nstree: add helper to operate on struct ns_tree_{node,root} nstree: move nstree types into separate header nstree: decouple from ns_common header ns: move namespace types into separate header ...
2025-12-01Merge tag 'vfs-6.19-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-1/+2
Pull misc vfs updates from Christian Brauner: "Features: - Cheaper MAY_EXEC handling for path lookup. This elides MAY_WRITE permission checks during path lookup and adds the IOP_FASTPERM_MAY_EXEC flag so filesystems like btrfs can avoid expensive permission work. - Hide dentry_cache behind runtime const machinery. - Add German Maglione as virtiofs co-maintainer. Cleanups: - Tidy up and inline step_into() and walk_component() for improved code generation. - Re-enable IOCB_NOWAIT writes to files. This refactors file timestamp update logic, fixing a layering bypass in btrfs when updating timestamps on device files and improving FMODE_NOCMTIME handling in VFS now that nfsd started using it. - Path lookup optimizations extracting slowpaths into dedicated routines and adding branch prediction hints for mntput_no_expire(), fd_install(), lookup_slow(), and various other hot paths. - Enable clang's -fms-extensions flag, requiring a JFS rename to avoid conflicts. - Remove spurious exports in fs/file_attr.c. - Stop duplicating union pipe_index declaration. This depends on the shared kbuild branch that brings in -fms-extensions support which is merged into this branch. - Use MD5 library instead of crypto_shash in ecryptfs. - Use largest_zero_folio() in iomap_dio_zero(). - Replace simple_strtol/strtoul with kstrtoint/kstrtouint in init and initrd code. - Various typo fixes. Fixes: - Fix emergency sync for btrfs. Btrfs requires an explicit sync_fs() call with wait == 1 to commit super blocks. The emergency sync path never passed this, leaving btrfs data uncommitted during emergency sync. - Use local kmap in watch_queue's post_one_notification(). - Add hint prints in sb_set_blocksize() for LBS dependency on THP" * tag 'vfs-6.19-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits) MAINTAINERS: add German Maglione as virtiofs co-maintainer fs: inline step_into() and walk_component() fs: tidy up step_into() & friends before inlining orangefs: use inode_update_timestamps directly btrfs: fix the comment on btrfs_update_time btrfs: use vfs_utimes to update file timestamps fs: export vfs_utimes fs: lift the FMODE_NOCMTIME check into file_update_time_flags fs: refactor file timestamp update logic include/linux/fs.h: trivial fix: regualr -> regular fs/splice.c: trivial fix: pipes -> pipe's fs: mark lookup_slow() as noinline fs: add predicts based on nd->depth fs: move mntput_no_expire() slowpath into a dedicated routine fs: remove spurious exports in fs/file_attr.c watch_queue: Use local kmap in post_one_notification() fs: touch up predicts in path lookup fs: move fd_install() slowpath into a dedicated routine and provide commentary fs: hide dentry_cache behind runtime const machinery fs: touch predicts in do_dentry_open() ...
2025-11-28pseries: port papr_rtas_setup_file_interface() to FD_ADD()Christian Brauner1-22/+5
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-36-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28pseries: convert papr_platform_dump_create_handle() to FD_ADD()Christian Brauner1-22/+8
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-35-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28spufs: convert spufs_gang_open() to FD_PREPARE()Christian Brauner1-16/+5
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-34-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()Christian Brauner1-30/+9
Fixes a UAF for src_info as well. Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-33-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28spufs: convert spufs_context_open() to FD_PREPARE()Christian Brauner1-16/+5
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-32-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-21Merge branch 'objtool/core'Peter Zijlstra196-2210/+2661
Bring in the UDB and objtool data annotations to avoid conflicts while further extending the bug exceptions. Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2025-11-15mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlbDavid Hildenbrand (Red Hat)2-1/+1
In the past, CONFIG_ARCH_HAS_GIGANTIC_PAGE indicated that we support runtime allocation of gigantic hugetlb folios. In the meantime it evolved into a generic way for the architecture to state that it supports gigantic hugetlb folios. In commit fae7d834c43c ("mm: add __dump_folio()") we started using CONFIG_ARCH_HAS_GIGANTIC_PAGE to decide MAX_FOLIO_ORDER: whether we could have folios larger than what the buddy can handle. In the context of that commit, we started using MAX_FOLIO_ORDER to detect page corruptions when dumping tail pages of folios. Before that commit, we assumed that we cannot have folios larger than the highest buddy order, which was obviously wrong. In commit 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate"), we used MAX_FOLIO_ORDER to detect inconsistencies, and in fact, we found some now. Powerpc allows for configs that can allocate gigantic folio during boot (not at runtime), that do not set CONFIG_ARCH_HAS_GIGANTIC_PAGE and can exceed PUD_ORDER. To fix it, let's make powerpc select CONFIG_ARCH_HAS_GIGANTIC_PAGE with hugetlb on powerpc, and increase the maximum folio size with hugetlb to 16 GiB on 64bit (possible on arm64 and powerpc) and 1 GiB on 32 bit (powerpc). Note that on some powerpc configurations, whether we actually have gigantic pages depends on the setting of CONFIG_ARCH_FORCE_MAX_ORDER, but there is nothing really problematic about setting it unconditionally: we just try to keep the value small so we can better detect problems in __dump_folio() and inconsistencies around the expected largest folio in the system. Ideally, we'd have a better way to obtain the maximum hugetlb folio size and detect ourselves whether we really end up with gigantic folios. Let's defer bigger changes and fix the warnings first. While at it, handle gigantic DAX folios more clearly: DAX can only end up creating gigantic folios with HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD. Add a new Kconfig option HAVE_GIGANTIC_FOLIOS to make both cases clearer. In particular, worry about ARCH_HAS_GIGANTIC_PAGE only with HUGETLB_PAGE. Note: with enabling CONFIG_ARCH_HAS_GIGANTIC_PAGE on powerpc, we will now also allow for runtime allocations of folios in some more powerpc configs. I don't think this is a problem, but if it is we could handle it through __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED. While __dump_page()/__dump_folio was also problematic (not handling dumping of tail pages of such gigantic folios correctly), it doesn't seem critical enough to mark it as a fix. Link: https://lkml.kernel.org/r/20251114214920.2550676-1-david@kernel.org Fixes: 7b4f21f5e038 ("mm/hugetlb: check for unreasonable folio sizes when registering hstate") Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Closes: https://lore.kernel.org/r/3e043453-3f27-48ad-b987-cc39f523060a@csgroup.eu/ Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/r/94377f5c-d4f0-4c0f-b0f6-5bf1cd7305b1@linux.ibm.com/ Signed-off-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Donet Tom <donettom@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-11Merge branch 'kbuild-6.19.fms.extension'Christian Brauner1-1/+2
Bring in the shared branch with the kbuild tree to enable '-fms-extensions' for 6.19. Further namespace cleanup work requires this extension. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-10Merge patch "kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS"Christian Brauner1-1/+2
Nathan Chancellor <nathan@kernel.org> says: Shared branch between Kbuild and other trees for enabling '-fms-extensions' for 6.19. * tag 'kbuild-ms-extensions-6.19' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS Kbuild: enable -fms-extensions jfs: Rename _inline to avoid conflict with clang's '-fms-extensions' Link: https://patch.msgid.link/20251101-kbuild-ms-extensions-dedicated-cflags-v1-1-38004aba524b@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-03arch: hookup listns() system callChristian Brauner1-0/+1
Add the listns() system call to all architectures. Link: https://patch.msgid.link/20251029-work-namespace-nstree-listns-v4-20-2e6f823ebdc0@kernel.org Tested-by: syzbot@syzkaller.appspotmail.com Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-03powerpc/uaccess: Use unsafe wrappers for ASM GOTOThomas Gleixner1-4/+4
ASM GOTO is miscompiled by GCC when it is used inside a auto cleanup scope: bool foo(u32 __user *p, u32 val) { scoped_guard(pagefault) unsafe_put_user(val, p, efault); return true; efault: return false; } It ends up leaking the pagefault disable counter in the fault path. clang at least fails the build. Rename unsafe_*_user() to arch_unsafe_*_user() which makes the generic uaccess header wrap it with a local label that makes both compilers emit correct code. Same for the kernel_nofault() variants. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251027083745.356628509@linutronix.de
2025-10-30kbuild: Add '-fms-extensions' to areas with dedicated CFLAGSNathan Chancellor1-1/+2
This is a follow up to commit c4781dc3d1cf ("Kbuild: enable -fms-extensions") but in a separate change due to being substantially different from the initial submission. There are many places within the kernel that use their own CFLAGS instead of the main KBUILD_CFLAGS, meaning code written with the main kernel's use of '-fms-extensions' in mind that may be tangentially included in these areas will result in "error: declaration does not declare anything" messages from the compiler. Add '-fms-extensions' to all these areas to ensure consistency, along with -Wno-microsoft-anon-tag to silence clang's warning about use of the extension that the kernel cares about using. parisc does not build with clang so it does not need this warning flag. LoongArch does not need it either because -W flags from KBUILD_FLAGS are pulled into cflags-vdso. Reported-by: Christian Brauner <brauner@kernel.org> Closes: https://lore.kernel.org/20251030-meerjungfrau-getrocknet-7b46eacc215d@brauner/ Reviewed-by: Christian Brauner <brauner@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-10-13powerpc/fadump: skip parameter area allocation when fadump is disabledSourabh Jain1-0/+3
Fadump allocates memory to pass additional kernel command-line argument to the fadump kernel. However, this allocation is not needed when fadump is disabled. So avoid allocating memory for the additional parameter area in such cases. Fixes: f4892c68ecc1 ("powerpc/fadump: allocate memory for additional parameters early") Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Fixes: f4892c68ecc1 ("powerpc/fadump: allocate memory for additional parameters early") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251008032934.262683-1-sourabhjain@linux.ibm.com
2025-10-13powerpc, ocxl: Fix extraction of struct xive_irq_dataNam Cao3-10/+6
Commit cc0cc23babc9 ("powerpc/xive: Untangle xive from child interrupt controller drivers") changed xive_irq_data to be stashed to chip_data instead of handler_data. However, multiple places are still attempting to read xive_irq_data from handler_data and get a NULL pointer deference bug. Update them to read xive_irq_data from chip_data. Non-XIVE files which touch xive_irq_data seem quite strange to me, especially the ocxl driver. I think there ought to be an alternative platform-independent solution, instead of touching XIVE's data directly. Therefore, I think this whole thing should be cleaned up. But perhaps I just misunderstand something. In any case, this cleanup would not be trivial; for now, just get things working again. Fixes: cc0cc23babc9 ("powerpc/xive: Untangle xive from child interrupt controller drivers") Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Closes: https://lore.kernel.org/linuxppc-dev/68e48df8.170a0220.4b4b0.217d@mx.google.com/ Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251008081359.1382699-1-namcao@linutronix.de
2025-10-13powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardownNam Cao1-2/+1
pseries_msi_ops_teardown() reads pci_dev* from msi_alloc_info_t. However, pseries_msi_ops_prepare() does not populate this structure, thus it is all zeros. Consequently, pseries_msi_ops_teardown() triggers a NULL pointer dereference crash. struct pci_dev is available in struct irq_domain. Read it there instead. Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/linuxppc-dev/878d7651-433a-46fe-a28b-1b7e893fcbe0@linux.ibm.com/ Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20251010120307.3281720-1-namcao@linutronix.de
2025-10-06Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-1/+15
Pull x86 kvm updates from Paolo Bonzini: "Generic: - Rework almost all of KVM's exports to expose symbols only to KVM's x86 vendor modules (kvm-{amd,intel}.ko and PPC's kvm-{pr,hv}.ko x86: - Rework almost all of KVM x86's exports to expose symbols only to KVM's vendor modules, i.e. to kvm-{amd,intel}.ko - Add support for virtualizing Control-flow Enforcement Technology (CET) on Intel (Shadow Stacks and Indirect Branch Tracking) and AMD (Shadow Stacks). It is worth noting that while SHSTK and IBT can be enabled separately in CPUID, it is not really possible to virtualize them separately. Therefore, Intel processors will really allow both SHSTK and IBT under the hood if either is made visible in the guest's CPUID. The alternative would be to intercept XSAVES/XRSTORS, which is not feasible for performance reasons - Fix a variety of fuzzing WARNs all caused by checking L1 intercepts when completing userspace I/O. KVM has already committed to allowing L2 to to perform I/O at that point - Emulate PERF_CNTR_GLOBAL_STATUS_SET for PerfMonV2 guests, as the MSR is supposed to exist for v2 PMUs - Allow Centaur CPU leaves (base 0xC000_0000) for Zhaoxin CPUs - Add support for the immediate forms of RDMSR and WRMSRNS, sans full emulator support (KVM should never need to emulate the MSRs outside of forced emulation and other contrived testing scenarios) - Clean up the MSR APIs in preparation for CET and FRED virtualization, as well as mediated vPMU support - Clean up a pile of PMU code in anticipation of adding support for mediated vPMUs - Reject in-kernel IOAPIC/PIT for TDX VMs, as KVM can't obtain EOI vmexits needed to faithfully emulate an I/O APIC for such guests - Many cleanups and minor fixes - Recover possible NX huge pages within the TDP MMU under read lock to reduce guest jitter when restoring NX huge pages - Return -EAGAIN during prefault if userspace concurrently deletes/moves the relevant memslot, to fix an issue where prefaulting could deadlock with the memslot update x86 (AMD): - Enable AVIC by default for Zen4+ if x2AVIC (and other prereqs) is supported - Require a minimum GHCB version of 2 when starting SEV-SNP guests via KVM_SEV_INIT2 so that invalid GHCB versions result in immediate errors instead of latent guest failures - Add support for SEV-SNP's CipherText Hiding, an opt-in feature that prevents unauthorized CPU accesses from reading the ciphertext of SNP guest private memory, e.g. to attempt an offline attack. This feature splits the shared SEV-ES/SEV-SNP ASID space into separate ranges for SEV-ES and SEV-SNP guests, therefore a new module parameter is needed to control the number of ASIDs that can be used for VMs with CipherText Hiding vs. how many can be used to run SEV-ES guests - Add support for Secure TSC for SEV-SNP guests, which prevents the untrusted host from tampering with the guest's TSC frequency, while still allowing the the VMM to configure the guest's TSC frequency prior to launch - Validate the XCR0 provided by the guest (via the GHCB) to avoid bugs resulting from bogus XCR0 values - Save an SEV guest's policy if and only if LAUNCH_START fully succeeds to avoid leaving behind stale state (thankfully not consumed in KVM) - Explicitly reject non-positive effective lengths during SNP's LAUNCH_UPDATE instead of subtly relying on guest_memfd to deal with them - Reload the pre-VMRUN TSC_AUX on #VMEXIT for SEV-ES guests, not the host's desired TSC_AUX, to fix a bug where KVM was keeping a different vCPU's TSC_AUX in the host MSR until return to userspace KVM (Intel): - Preparation for FRED support - Don't retry in TDX's anti-zero-step mitigation if the target memslot is invalid, i.e. is being deleted or moved, to fix a deadlock scenario similar to the aforementioned prefaulting case - Misc bugfixes and minor cleanups" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits) KVM: x86: Export KVM-internal symbols for sub-modules only KVM: x86: Drop pointless exports of kvm_arch_xxx() hooks KVM: x86: Move kvm_intr_is_single_vcpu() to lapic.c KVM: Export KVM-internal symbols for sub-modules only KVM: s390/vfio-ap: Use kvm_is_gpa_in_memslot() instead of open coded equivalent KVM: VMX: Make CR4.CET a guest owned bit KVM: selftests: Verify MSRs are (not) in save/restore list when (un)supported KVM: selftests: Add coverage for KVM-defined registers in MSRs test KVM: selftests: Add KVM_{G,S}ET_ONE_REG coverage to MSRs test KVM: selftests: Extend MSRs test to validate vCPUs without supported features KVM: selftests: Add support for MSR_IA32_{S,U}_CET to MSRs test KVM: selftests: Add an MSR test to exercise guest/host and read/write KVM: x86: Define AMD's #HV, #VC, and #SX exception vectors KVM: x86: Define Control Protection Exception (#CP) vector KVM: x86: Add human friendly formatting for #XM, and #VE KVM: SVM: Enable shadow stack virtualization for SVM KVM: SEV: Synchronize MSR_IA32_XSS from the GHCB when it's valid KVM: SVM: Pass through shadow stack MSRs as appropriate KVM: SVM: Update dump_vmcb with shadow stack save area additions KVM: nSVM: Save/load CET Shadow Stack state to/from vmcb12/vmcb02 ...
2025-10-06Merge tag 'pci-v6.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pciLinus Torvalds1-1/+1
Pull pci updates from Bjorn Helgaas: "Enumeration: - Add PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() macros that take config space accessor functions. Implement pci_find_capability(), pci_find_ext_capability(), and dwc, dwc endpoint, and cadence capability search interfaces with them (Hans Zhang) - Leave parent unit address 0 in 'interrupt-map' so that when we build devicetree nodes to describe PCI functions that contain multiple peripherals, we can build this property even when interrupt controllers lack 'reg' properties (Lorenzo Pieralisi) - Add a Xeon 6 quirk to disable Extended Tags and limit Max Read Request Size to 128B to avoid a performance issue (Ilpo Järvinen) - Add sysfs 'serial_number' file to expose the Device Serial Number (Matthew Wood) - Fix pci_acpi_preserve_config() memory leak (Nirmoy Das) Resource management: - Align m68k pcibios_enable_device() with other arches (Ilpo Järvinen) - Remove sparc pcibios_enable_device() implementations that don't do anything beyond what pci_enable_resources() does (Ilpo Järvinen) - Remove mips pcibios_enable_resources() and use pci_enable_resources() instead (Ilpo Järvinen) - Clean up bridge window sizing and assignment (Ilpo Järvinen), including: - Leave non-claimed bridge windows disabled - Enable bridges even if a window wasn't assigned because not all windows are required by downstream devices - Preserve bridge window type when releasing the resource, since the type is needed for reassignment - Consolidate selection of bridge windows into two new interfaces, pbus_select_window() and pbus_select_window_for_type(), so this is done consistently - Compute bridge window start and end earlier to avoid logging stale information MSI: - Add quirk to disable MSI on RDC PCI to PCIe bridges (Marcos Del Sol Vives) Error handling: - Align AER with EEH by allowing drivers to request a Bus Reset on Non-Fatal Errors (in addition to the reset on Fatal Errors that we already do) (Lukas Wunner) - If error recovery fails, emit FAILED_RECOVERY uevents for the devices, not for the bridge leading to them. This makes them correspond to BEGIN_RECOVERY uevents (Lukas Wunner) - Align AER with EEH by calling err_handler.error_detected() callbacks to notify drivers if error recovery fails (Lukas Wunner) - Align AER with EEH by restoring device error_state to pci_channel_io_normal before the err_handler.slot_reset() callback. This is earlier than before the err_handler.resume() callback (Lukas Wunner) - Emit a BEGIN_RECOVERY uevent when driver's err_handler.error_detected() requests a reset, as well as when it says recovery is complete or can be done without a reset (Niklas Schnelle) - Align s390 with AER and EEH by emitting uevents during error recovery (Niklas Schnelle) - Align EEH with AER and s390 by emitting BEGIN_RECOVERY, SUCCESSFUL_RECOVERY, or FAILED_RECOVERY uevents depending on the result of err_handler.error_detected() (Niklas Schnelle) - Fix a NULL pointer dereference in aer_ratelimit() when ACPI GHES error information identifies a device without an AER Capability (Breno Leitao) - Update error decoding and TLP Log printing for new errors in current PCIe base spec (Lukas Wunner) - Update error recovery documentation to match the current code and use consistent nomenclature (Lukas Wunner) ASPM: - Enable all ClockPM and ASPM states for devicetree platforms, since there's typically no firmware that enables ASPM This is a risky change that may uncover hardware or configuration defects at boot-time rather than when users enable ASPM via sysfs later. Booting with "pcie_aspm=off" prevents this enabling (Manivannan Sadhasivam) - Remove the qcom code that enabled ASPM (Manivannan Sadhasivam) Power management: - If a device has already been disconnected, e.g., by a hotplug removal, don't bother trying to resume it to D0 when detaching the driver. This avoids annoying "Unable to change power state from D3cold to D0" messages (Mario Limonciello) - Ensure devices are powered up before config reads for 'max_link_width', 'current_link_speed', 'current_link_width', 'secondary_bus_number', and 'subordinate_bus_number' sysfs files. This prevents using invalid data (~0) in drivers or lspci and, depending on how the PCIe controller reports errors, may avoid error interrupts or crashes (Brian Norris) Virtualization: - Add rescan/remove locking when enabling/disabling SR-IOV, which avoids list corruption on s390, where disabling SR-IOV also generates hotplug events (Niklas Schnelle) Peer-to-peer DMA: - Free struct p2p_pgmap, not a member within it, in the pci_p2pdma_add_resource() error path (Sungho Kim) Endpoint framework: - Document sysfs interface for BAR assignment of vNTB endpoint functions (Jerome Brunet) - Fix array underflow in endpoint BAR test case (Dan Carpenter) - Skip endpoint IRQ test if the IRQ is out of range to avoid false errors (Christian Bruel) - Fix endpoint test case for controllers with fixed-size BARs smaller than requested by the test (Marek Vasut) - Restore inbound translation when disabling doorbell so the endpoint doorbell test case can be run more than once (Niklas Cassel) - Avoid a NULL pointer dereference when releasing DMA channels in endpoint DMA test case (Shin'ichiro Kawasaki) - Convert tegra194 interrupt number to MSI vector to fix endpoint Kselftest MSI_TEST test case (Niklas Cassel) - Reset tegra194 BARs when running in endpoint mode so the BAR tests don't overwrite the ATU settings in BAR4 (Niklas Cassel) - Handle errors in tegra194 BPMP transactions so we don't mistakenly skip future PERST# assertion (Vidya Sagar) AMD MDB PCIe controller driver: - Update DT binding example to separate PERST# to a Root Port stanza to make multiple Root Ports possible in the future (Sai Krishna Musham) - Add driver support for PERST# being described in a Root Port stanza, falling back to the host bridge if not found there (Sai Krishna Musham) Freescale i.MX6 PCIe controller driver: - Enable the 3.3V Vaux supply if available so devices can request wakeup with either Beacon or WAKE# (Richard Zhu) MediaTek PCIe Gen3 controller driver: - Add optional sys clock ready time setting to avoid sys_clk_rdy signal glitching in MT6991 and MT8196 (AngeloGioacchino Del Regno) - Add DT binding and driver support for MT6991 and MT8196 (AngeloGioacchino Del Regno) NVIDIA Tegra PCIe controller driver: - When asserting PERST#, disable the controller instead of mistakenly disabling the PLL twice (Nagarjuna Kristam) - Convert struct tegra_msi mask_lock to raw spinlock to avoid a lock nesting error (Marek Vasut) Qualcomm PCIe controller driver: - Select PCI Power Control Slot driver so slot voltage rails can be turned on/off if described in Root Port devicetree node (Qiang Yu) - Parse only PCI bridge child nodes in devicetree, skipping unrelated nodes such as OPP (Operating Performance Points), which caused probe failures (Krishna Chaitanya Chundru) - Add 8.0 GT/s and 32.0 GT/s equalization settings (Ziyue Zhang) - Consolidate Root Port 'phy' and 'reset' properties in struct qcom_pcie_port, regardless of whether we got them from the Root Port node or the host bridge node (Manivannan Sadhasivam) - Fetch and map the ELBI register space in the DWC core rather than in each driver individually (Krishna Chaitanya Chundru) - Enable ECAM mechanism in DWC core by setting up iATU with 'CFG Shift Feature' and use this in the qcom driver (Krishna Chaitanya Chundru) - Add SM8750 compatible to qcom,pcie-sm8550.yaml (Krishna Chaitanya Chundru) - Update qcom,pcie-x1e80100.yaml to allow fifth PCIe host on Qualcomm Glymur, which is compatible with X1E80100 but doesn't have the cnoc_sf_axi clock (Qiang Yu) Renesas R-Car PCIe controller driver: - Fix a typo that prevented correct PHY initialization (Marek Vasut) - Add a missing 1ms delay after PWR reset assertion as required by the V4H manual (Marek Vasut) - Assure reset has completed before DBI access to avoid SError (Marek Vasut) - Fix inverted PHY initialization check, which sometimes led to timeouts and failure to start the controller (Marek Vasut) - Pass the correct IRQ domain to generic_handle_domain_irq() to fix a regression when converting to msi_create_parent_irq_domain() (Claudiu Beznea) - Drop the spinlock protecting the PMSR register - it's no longer required since pci_lock already serializes accesses (Marek Vasut) - Convert struct rcar_msi mask_lock to raw spinlock to avoid a lock nesting error (Marek Vasut) SOPHGO PCIe controller driver: - Check for existence of struct cdns_pcie.ops before using it to allow Cadence drivers that don't need to supply ops (Chen Wang) - Add DT binding and driver for the SOPHGO SG2042 PCIe controller (Chen Wang) STMicroelectronics STM32MP25 PCIe controller driver: - Update pinctrl documentation of initial states and use in runtime suspend/resume (Christian Bruel) - Add pinctrl_pm_select_init_state() for use by stm32 driver, which needs it during resume (Christian Bruel) - Add devicetree bindings and drivers for the STMicroelectronics STM32MP25 in host and endpoint modes (Christian Bruel) Synopsys DesignWare PCIe controller driver: - Add support for x16 in devicetree 'num-lanes' property (Konrad Dybcio) - Verify that if DT specifies a single IRQ for all eDMA channels, it is named 'dma' (Niklas Cassel) TI J721E PCIe driver: - Add MODULE_DEVICE_TABLE() so driver can be autoloaded (Siddharth Vadapalli) - Power controller off before configuring the glue layer so the controller latches the correct values on power-on (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Use devm_request_irq() so 'ks-pcie-error-irq' is freed when driver exits with error (Siddharth Vadapalli) - Add Peripheral Virtualization Unit (PVU), which restricts DMA from PCIe devices to specific regions of host memory, to the ti,am65 binding (Jan Kiszka) Xilinx NWL PCIe controller driver: - Clear bootloader E_ECAM_CONTROL before merging in the new driver value to avoid writing invalid values (Jani Nurminen)" * tag 'pci-v6.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (141 commits) PCI/AER: Avoid NULL pointer dereference in aer_ratelimit() MAINTAINERS: Add entry for ST STM32MP25 PCIe drivers PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25 dt-bindings: PCI: Add STM32MP25 PCIe Endpoint bindings PCI: stm32: Add PCIe host support for STM32MP25 PCI: xilinx-nwl: Fix ECAM programming PCI: j721e: Fix incorrect error message in probe() PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit dt-bindings: PCI: qcom,pcie-x1e80100: Set clocks minItems for the fifth Glymur PCIe Controller PCI: dwc: Support 16-lane operation PCI: Add lockdep assertion in pci_stop_and_remove_bus_device() PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlock PCI: tegra194: Rename 'root_bus' to 'root_port_bus' in tegra_pcie_downstream_dev_to_D0() PCI: tegra: Convert struct tegra_msi mask_lock into raw spinlock PCI: rcar-gen4: Fix inverted break condition in PHY initialization PCI: rcar-gen4: Assure reset occurs before DBI access PCI: rcar-gen4: Add missing 1ms delay after PWR reset assertion PCI: Set up bridge resources earlier PCI: rcar-host: Drop PMSR spinlock ...
2025-10-03Merge tag 'dma-mapping-6.18-2025-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linuxLinus Torvalds1-2/+2
Pull dma-mapping updates from Marek Szyprowski: - Refactoring of DMA mapping API to physical addresses as the primary interface instead of page+offset parameters This gets much closer to Matthew Wilcox's long term wish for struct-pageless IO to cacheable DRAM and is supporting memdesc project which seeks to substantially transform how struct page works. An advantage of this approach is the possibility of introducing DMA_ATTR_MMIO, which covers existing 'dma_map_resource' flow in the common paths, what in turn lets to use recently introduced dma_iova_link() API to map PCI P2P MMIO without creating struct page Developped by Leon Romanovsky and Jason Gunthorpe - Minor clean-up by Petr Tesarik and Qianfeng Rong * tag 'dma-mapping-6.18-2025-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: kmsan: fix missed kmsan_handle_dma() signature conversion mm/hmm: properly take MMIO path mm/hmm: migrate to physical address-based DMA mapping API dma-mapping: export new dma_*map_phys() interface xen: swiotlb: Open code map_resource callback dma-mapping: implement DMA_ATTR_MMIO for dma_(un)map_page_attrs() kmsan: convert kmsan_handle_dma to use physical addresses dma-mapping: convert dma_direct_*map_page to be phys_addr_t based iommu/dma: implement DMA_ATTR_MMIO for iommu_dma_(un)map_phys() iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys dma-debug: refactor to use physical addresses for page mapping iommu/dma: implement DMA_ATTR_MMIO for dma_iova_link(). dma-mapping: introduce new DMA attribute to indicate MMIO memory swiotlb: Remove redundant __GFP_NOWARN dma-direct: clean up the logic in __dma_direct_alloc_pages()
2025-10-02Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds15-38/+23
Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page*() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...
2025-10-02Merge tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linuxLinus Torvalds1-5/+0
Pull block updates from Jens Axboe: - NVMe pull request via Keith: - FC target fixes (Daniel) - Authentication fixes and updates (Martin, Chris) - Admin controller handling (Kamaljit) - Target lockdep assertions (Max) - Keep-alive updates for discovery (Alastair) - Suspend quirk (Georg) - MD pull request via Yu: - Add support for a lockless bitmap. A key feature for the new bitmap are that the IO fastpath is lockless. If a user issues lots of write IO to the same bitmap bit in a short time, only the first write has additional overhead to update bitmap bit, no additional overhead for the following writes. By supporting only resync or recover written data, means in the case creating new array or replacing with a new disk, there is no need to do a full disk resync/recovery. - Switch ->getgeo() and ->bios_param() to using struct gendisk rather than struct block_device. - Rust block changes via Andreas. This series adds configuration via configfs and remote completion to the rnull driver. The series also includes a set of changes to the rust block device driver API: a few cleanup patches, and a few features supporting the rnull changes. The series removes the raw buffer formatting logic from `kernel::block` and improves the logic available in `kernel::string` to support the same use as the removed logic. - floppy arch cleanups - Reduce the number of dereferencing needed for ublk commands - Restrict supported sockets for nbd. Mostly done to eliminate a class of issues perpetually reported by syzbot, by using nonsensical socket setups. - A few s390 dasd block fixes - Fix a few issues around atomic writes - Improve DMA interation for integrity requests - Improve how iovecs are treated with regards to O_DIRECT aligment constraints. We used to require each segment to adhere to the constraints, now only the request as a whole needs to. - Clean up and improve p2p support, enabling use of p2p for metadata payloads - Improve locking of request lookup, using SRCU where appropriate - Use page references properly for brd, avoiding very long RCU sections - Fix ordering of recursively submitted IOs - Clean up and improve updating nr_requests for a live device - Various fixes and cleanups * tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits) s390/dasd: enforce dma_alignment to ensure proper buffer validation s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request ublk: remove redundant zone op check in ublk_setup_iod() nvme: Use non zero KATO for persistent discovery connections nvmet: add safety check for subsys lock nvme-core: use nvme_is_io_ctrl() for I/O controller check nvme-core: do ioccsz/iorcsz validation only for I/O controllers nvme-core: add method to check for an I/O controller blk-cgroup: fix possible deadlock while configuring policy blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path blk-mq: Fix more tag iteration function documentation selftests: ublk: fix behavior when fio is not installed ublk: don't access ublk_queue in ublk_unmap_io() ublk: pass ublk_io to __ublk_complete_rq() ublk: don't access ublk_queue in ublk_need_complete_req() ublk: don't access ublk_queue in ublk_check_commit_and_fetch() ublk: don't pass ublk_queue to ublk_fetch() ublk: don't access ublk_queue in ublk_config_io_buf() ublk: don't access ublk_queue in ublk_check_fetch_buf() ublk: pass q_id and tag to __ublk_check_and_get_req() ...
2025-10-01Merge tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linuxLinus Torvalds1-4/+0
Pull Kbuild updates from Nathan Chancellor: - Extend modules.builtin.modinfo to include module aliases from MODULE_DEVICE_TABLE for builtin modules so that userspace tools (such as kmod) can verify that a particular module alias will be handled by a builtin module - Bump the minimum version of LLVM for building the kernel to 15.0.0 - Upgrade several userspace API checks in headers_check.pl to errors - Unify and consolidate CONFIG_WERROR / W=e handling - Turn assembler and linker warnings into errors with CONFIG_WERROR / W=e - Respect CONFIG_WERROR / W=e when building userspace programs (userprogs) - Enable -Werror unconditionally when building host programs (hostprogs) - Support copy_file_range() and data segment alignment in gen_init_cpio to improve performance on filesystems that support reflinks such as btrfs and XFS - Miscellaneous small changes to scripts and configuration files * tag 'kbuild-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (47 commits) modpost: Initialize builtin_modname to stop SIGSEGVs Documentation: kbuild: note CONFIG_DEBUG_EFI in reproducible builds kbuild: vmlinux.unstripped should always depend on .vmlinux.export.o modpost: Create modalias for builtin modules modpost: Add modname to mod_device_table alias scsi: Always define blogic_pci_tbl structure kbuild: extract modules.builtin.modinfo from vmlinux.unstripped kbuild: keep .modinfo section in vmlinux.unstripped kbuild: always create intermediate vmlinux.unstripped s390: vmlinux.lds.S: Reorder sections KMSAN: Remove tautological checks objtool: Drop noinstr hack for KCSAN_WEAK_MEMORY lib/Kconfig.debug: Drop CLANG_VERSION check from DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT riscv: Remove ld.lld version checks from many TOOLCHAIN_HAS configs riscv: Unconditionally use linker relaxation riscv: Remove version check for LTO_CLANG selects powerpc: Drop unnecessary initializations in __copy_inst_from_kernel_nofault() mips: Unconditionally select ARCH_HAS_CURRENT_STACK_POINTER arm64: Remove tautological LLVM Kconfig conditions ARM: Clean up definition of ARM_HAS_GROUP_RELOCS ...
2025-10-01Merge tag 'soc-drivers-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds1-1/+0
Pull SoC driver updates from Arnd Bergmann: "Lots of platform specific updates for Qualcomm SoCs, including a new TEE subsystem driver for the Qualcomm QTEE firmware interface. Added support for the Apple A11 SoC in drivers that are shared with the M1/M2 series, among more updates for those. Smaller platform specific driver updates for Renesas, ASpeed, Broadcom, Nvidia, Mediatek, Amlogic, TI, Allwinner, and Freescale SoCs. Driver updates in the cache controller, memory controller and reset controller subsystems. SCMI firmware updates to add more features and improve robustness. This includes support for having multiple SCMI providers in a single system. TEE subsystem support for protected DMA-bufs, allowing hardware to access memory areas that managed by the kernel but remain inaccessible from the CPU in EL1/EL0" * tag 'soc-drivers-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (139 commits) soc/fsl/qbman: Use for_each_online_cpu() instead of for_each_cpu() soc: fsl: qe: Drop legacy-of-mm-gpiochip.h header from GPIO driver soc: fsl: qe: Change GPIO driver to a proper platform driver tee: fix register_shm_helper() pmdomain: apple: Add "apple,t8103-pmgr-pwrstate" dt-bindings: spmi: Add Apple A11 and T2 compatible serial: qcom-geni: Load UART qup Firmware from linux side spi: geni-qcom: Load spi qup Firmware from linux side i2c: qcom-geni: Load i2c qup Firmware from linux side soc: qcom: geni-se: Add support to load QUP SE Firmware via Linux subsystem soc: qcom: geni-se: Cleanup register defines and update copyright dt-bindings: qcom: se-common: Add QUP Peripheral-specific properties for I2C, SPI, and SERIAL bus Documentation: tee: Add Qualcomm TEE driver tee: qcom: enable TEE_IOC_SHM_ALLOC ioctl tee: qcom: add primordial object tee: add Qualcomm TEE driver tee: increase TEE_MAX_ARG_SIZE to 4096 tee: add TEE_IOCTL_PARAM_ATTR_TYPE_OBJREF tee: add TEE_IOCTL_PARAM_ATTR_TYPE_UBUF tee: add close_context to TEE driver operation ...
2025-09-30Merge tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-2/+0
Pull VDSO updates from Thomas Gleixner: - Further consolidation of the VDSO infrastructure and the common data store - Simplification of the related Kconfig logic - Improve the VDSO selftest suite * tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests: vDSO: Drop vdso_test_clock_getres selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64() selftests: vDSO: vdso_test_abi: Test CPUTIME clocks selftests: vDSO: vdso_test_abi: Use explicit indices for name array selftests: vDSO: vdso_test_abi: Drop clock availability tests selftests: vDSO: vdso_test_abi: Use ksft_finished() selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper vdso: Add struct __kernel_old_timeval forward declaration to gettime.h vdso: Gate VDSO_GETRANDOM behind HAVE_GENERIC_VDSO vdso: Drop Kconfig GENERIC_VDSO_TIME_NS vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE vdso: Drop kconfig GENERIC_COMPAT_VDSO vdso: Drop kconfig GENERIC_VDSO_32 riscv: vdso: Untangle Kconfig logic time: Build generic update_vsyscall() only with generic time vDSO vdso/gettimeofday: Remove !CONFIG_TIME_NS stubs vdso: Move ENABLE_COMPAT_VDSO from core to arm64 ARM: VDSO: Remove cntvct_ok global variable vdso/datastore: Gate time data behind CONFIG_GENERIC_GETTIMEOFDAY
2025-09-30KVM: Export KVM-internal symbols for sub-modules onlySean Christopherson2-1/+15
Rework the vast majority of KVM's exports to expose symbols only to KVM submodules, i.e. to x86's kvm-{amd,intel}.ko and PPC's kvm-{pr,hv}.ko. With few exceptions, KVM's exported APIs are intended (and safe) for KVM- internal usage only. Keep kvm_get_kvm(), kvm_get_kvm_safe(), and kvm_put_kvm() as normal exports, as they are needed by VFIO, and are generally safe for external usage (though ideally even the get/put APIs would be KVM-internal, and VFIO would pin a VM by grabbing a reference to its associated file). Implement a framework in kvm_types.h in anticipation of providing a macro to restrict KVM-specific kernel exports, i.e. to provide symbol exports for KVM if and only if KVM is built as one or more modules. Link: https://lore.kernel.org/r/20250919003303.1355064-3-seanjc@google.com Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-09-30Merge tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds4-24/+17
Pull scheduler updates from Ingo Molnar: "Core scheduler changes: - Make migrate_{en,dis}able() inline, to improve performance (Menglong Dong) - Move STDL_INIT() functions out-of-line (Peter Zijlstra) - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra) Fair scheduling: - Defer throttling to when tasks exit to user-space, to reduce the chance & impact of throttle-preemption with held locks and other resources (Aaron Lu, Valentin Schneider) - Get rid of sched_domains_curr_level hack for tl->cpumask(), as the warning was getting triggered on certain topologies (Peter Zijlstra) Misc cleanups & fixes: - Header cleanups (Menglong Dong) - Fix race in push_dl_task() (Harshit Agarwal)" * tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix some typos in include/linux/preempt.h sched: Make migrate_{en,dis}able() inline rcu: Replace preempt.h with sched.h in include/linux/rcupdate.h arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c sched/fair: Do not balance task to a throttled cfs_rq sched/fair: Do not special case tasks in throttled hierarchy sched/fair: update_cfs_group() for throttled cfs_rqs sched/fair: Propagate load for throttled cfs_rq sched/fair: Get rid of throttled_lb_pair() sched/fair: Task based throttle time accounting sched/fair: Switch to task based throttle model sched/fair: Implement throttle task work and related helpers sched/fair: Add related data structure for task based throttle sched: Unify the SCHED_{SMT,CLUSTER,MC} Kconfig sched: Move STDL_INIT() functions out-of-line sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask() sched/deadline: Fix race in push_dl_task()
2025-09-29Merge tag 'powerpc-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds140-722/+2429
Pull powerpc updates from Madhavan Srinivasan: - powerpc support for BPF arena and arena atomics - Patches to switch to msi parent domain (per-device MSI domains) - Add a lock contention tracepoint in the queued spinlock slowpath - Fixes for underflow in pseries/powernv msi and pci paths - Switch from legacy-of-mm-gpiochip dependency to platform driver - Fixes for handling TLB misses - Introduce support for powerpc papr-hvpipe - Add vpa-dtl PMU driver for pseries platform - Misc fixes and cleanups Thanks to Aboorva Devarajan, Aditya Bodkhe, Andrew Donnellan, Athira Rajeev, Cédric Le Goater, Christophe Leroy, Erhard Furtner, Gautam Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joe Lawrence, Kajol Jain, Kienan Stewart, Linus Walleij, Mahesh Salgaonkar, Nam Cao, Nicolas Schier, Nysal Jan K.A., Ritesh Harjani (IBM), Ruben Wauters, Saket Kumar Bhaskar, Shashank MS, Shrikanth Hegde, Tejas Manhas, Thomas Gleixner, Thomas Huth, Thorsten Blum, Tyrel Datwyler, and Venkat Rao Bagalkote. * tag 'powerpc-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (49 commits) powerpc/pseries: Define __u{8,32} types in papr_hvpipe_hdr struct genirq/msi: Remove msi_post_free() powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is needed powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL data docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs event format entries for vpa_dtl pmu powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf powerpc/time: Expose boot_tb via accessor powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failure powerpc/fprobe: fix updated fprobe for function-graph tracer powerpc/ftrace: support CONFIG_FUNCTION_GRAPH_RETVAL powerpc64/modules: replace stub allocation sentinel with an explicit counter powerpc64/modules: correctly iterate over stubs in setup_ftrace_ool_stubs powerpc/ftrace: ensure ftrace record ops are always set for NOPs powerpc/603: Really copy kernel PGD entries into all PGDIRs powerpc/8xx: Remove left-over instruction and comments in DataStoreTLBMiss handler powerpc/pseries: HVPIPE changes to support migration powerpc/pseries: Enable hvpipe with ibm,set-system-parameter RTAS powerpc/pseries: Enable HVPIPE event message interrupt ...
2025-09-29Merge tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds1-2/+2
Pull ffs const-attribute cleanups from Kees Cook: "While working on various hardening refactoring a while back we encountered inconsistencies in the application of __attribute_const__ on the ffs() family of functions. This series fixes this across all archs and adds KUnit tests. Notably, this found a theoretical underflow in PCI (also fixed here) and uncovered an inefficiency in ARC (fixed in the ARC arch PR). I kept the series separate from the general hardening PR since it is a stand-alone "topic". - PCI: Fix theoretical underflow in use of ffs(). - Universally apply __attribute_const__ to all architecture's ffs()-family of functions. - Add KUnit tests for ffs() behavior and const-ness" * tag 'ffs-const-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: KUnit: ffs: Validate all the __attribute_const__ annotations sparc: Add __attribute_const__ to ffs()-family implementations xtensa: Add __attribute_const__ to ffs()-family implementations s390: Add __attribute_const__ to ffs()-family implementations parisc: Add __attribute_const__ to ffs()-family implementations mips: Add __attribute_const__ to ffs()-family implementations m68k: Add __attribute_const__ to ffs()-family implementations openrisc: Add __attribute_const__ to ffs()-family implementations riscv: Add __attribute_const__ to ffs()-family implementations hexagon: Add __attribute_const__ to ffs()-family implementations alpha: Add __attribute_const__ to ffs()-family implementations sh: Add __attribute_const__ to ffs()-family implementations powerpc: Add __attribute_const__ to ffs()-family implementations x86: Add __attribute_const__ to ffs()-family implementations csky: Add __attribute_const__ to ffs()-family implementations bitops: Add __attribute_const__ to generic ffs()-family implementations KUnit: Introduce ffs()-family tests PCI: Test for bit underflow in pcie_set_readrq()
2025-09-29Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linuxLinus Torvalds8-1332/+0
Pull crypto library updates from Eric Biggers: - Add a RISC-V optimized implementation of Poly1305. This code was written by Andy Polyakov and contributed by Zhihang Shao. - Migrate the MD5 code into lib/crypto/, and add KUnit tests for MD5. Yes, it's still the 90s, and several kernel subsystems are still using MD5 for legacy use cases. As long as that remains the case, it's helpful to clean it up in the same way as I've been doing for other algorithms. Later, I plan to convert most of these users of MD5 to use the new MD5 library API instead of the generic crypto API. - Simplify the organization of the ChaCha, Poly1305, BLAKE2s, and Curve25519 code. Consolidate these into one module per algorithm, and centralize the configuration and build process. This is the same reorganization that has already been successful for SHA-1 and SHA-2. - Remove the unused crypto_kpp API for Curve25519. - Migrate the BLAKE2s and Curve25519 self-tests to KUnit. - Always enable the architecture-optimized BLAKE2s code. * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (38 commits) crypto: md5 - Implement export_core() and import_core() wireguard: kconfig: simplify crypto kconfig selections lib/crypto: tests: Enable Curve25519 test when CRYPTO_SELFTESTS lib/crypto: curve25519: Consolidate into single module lib/crypto: curve25519: Move a couple functions out-of-line lib/crypto: tests: Add Curve25519 benchmark lib/crypto: tests: Migrate Curve25519 self-test to KUnit crypto: curve25519 - Remove unused kpp support crypto: testmgr - Remove curve25519 kpp tests crypto: x86/curve25519 - Remove unused kpp support crypto: powerpc/curve25519 - Remove unused kpp support crypto: arm/curve25519 - Remove unused kpp support crypto: hisilicon/hpre - Remove unused curve25519 kpp support lib/crypto: tests: Add KUnit tests for BLAKE2s lib/crypto: blake2s: Consolidate into single C translation unit lib/crypto: blake2s: Move generic code into blake2s.c lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code lib/crypto: blake2s: Remove obsolete self-test lib/crypto: x86/blake2s: Reduce size of BLAKE2S_SIGMA2 lib/crypto: chacha: Consolidate into single module ...
2025-09-29Merge tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-2/+2
Pull vfs async directory updates from Christian Brauner: "This contains further preparatory changes for the asynchronous directory locking scheme: - Add lookup_one_positive_killable() which allows overlayfs to perform lookup that won't block on a fatal signal - Unify the mount idmap handling in struct renamedata as a rename can only happen within a single mount - Introduce kern_path_parent() for audit which sets the path to the parent and returns a dentry for the target without holding any locks on return - Rename kern_path_locked() as it is only used to prepare for the removal of an object from the filesystem: kern_path_locked() => start_removing_path() kern_path_create() => start_creating_path() user_path_create() => start_creating_user_path() user_path_locked_at() => start_removing_user_path_at() done_path_create() => end_creating_path() NA => end_removing_path()" * tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: debugfs: rename start_creating() to debugfs_start_creating() VFS: rename kern_path_locked() and related functions. VFS/audit: introduce kern_path_parent() for audit VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata VFS: discard err2 in filename_create() VFS/ovl: add lookup_one_positive_killable()
2025-09-29Merge tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-1/+1
Pull copy_process updates from Christian Brauner: "This contains the changes to enable support for clone3() on nios2 which apparently is still a thing. The more exciting part of this is that it cleans up the inconsistency in how the 64-bit flag argument is passed from copy_process() into the various other copy_*() helpers" [ Fixed up rv ltl_monitor 32-bit support as per Sasha Levin in the merge ] * tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: nios2: implement architecture-specific portion of sys_clone3 arch: copy_thread: pass clone_flags as u64 copy_process: pass clone_flags as u64 across calltree copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
2025-09-29Merge tag 'vfs-6.18-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-1/+1
Pull vfs inode updates from Christian Brauner: "This contains a series I originally wrote and that Eric brought over the finish line. It moves out the i_crypt_info and i_verity_info pointers out of 'struct inode' and into the fs-specific part of the inode. So now the few filesytems that actually make use of this pay the price in their own private inode storage instead of forcing it upon every user of struct inode. The pointer for the crypt and verity info is simply found by storing an offset to its address in struct fsverity_operations and struct fscrypt_operations. This shrinks struct inode by 16 bytes. I hope to move a lot more out of it in the future so that struct inode becomes really just about very core stuff that we need, much like struct dentry and struct file, instead of the dumping ground it has become over the years. On top of this are a various changes associated with the ongoing inode lifetime handling rework that multiple people are pushing forward: - Stop accessing inode->i_count directly in f2fs and gfs2. They simply should use the __iget() and iput() helpers - Make the i_state flags an enum - Rework the iput() logic Currently, if we are the last iput, and we have the I_DIRTY_TIME bit set, we will grab a reference on the inode again and then mark it dirty and then redo the put. This is to make sure we delay the time update for as long as possible We can rework this logic to simply dec i_count if it is not 1, and if it is do the time update while still holding the i_count reference Then we can replace the atomic_dec_and_lock with locking the ->i_lock and doing atomic_dec_and_test, since we did the atomic_add_unless above - Add an icount_read() helper and convert everyone that accesses inode->i_count directly for this purpose to use the helper - Expand dump_inode() to dump more information about an inode helping in debugging - Add some might_sleep() annotations to iput() and associated helpers" * tag 'vfs-6.18-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: add might_sleep() annotation to iput() and more fs: expand dump_inode() inode: fix whitespace issues fs: add an icount_read helper fs: rework iput logic fs: make the i_state flags an enum fs: stop accessing ->i_count directly in f2fs and gfs2 fsverity: check IS_VERITY() in fsverity_cleanup_inode() fs: remove inode::i_verity_info btrfs: move verity info pointer to fs-specific part of inode f2fs: move verity info pointer to fs-specific part of inode ext4: move verity info pointer to fs-specific part of inode fsverity: add support for info in fs-specific part of inode fs: remove inode::i_crypt_info ceph: move crypt info pointer to fs-specific part of inode ubifs: move crypt info pointer to fs-specific part of inode f2fs: move crypt info pointer to fs-specific part of inode ext4: move crypt info pointer to fs-specific part of inode fscrypt: add support for info in fs-specific part of inode fscrypt: replace raw loads of info pointer with helper function
2025-09-25arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.cMenglong Dong1-0/+1
The include/generated/asm-offsets.h is generated in Kbuild during compiling from arch/SRCARCH/kernel/asm-offsets.c. When we want to generate another similar offset header file, circular dependency can happen. For example, we want to generate a offset file include/generated/test.h, which is included in include/sched/sched.h. If we generate asm-offsets.h first, it will fail, as include/sched/sched.h is included in asm-offsets.c and include/generated/test.h doesn't exist; If we generate test.h first, it can't success neither, as include/generated/asm-offsets.h is included by it. In x86_64, the macro COMPILE_OFFSETS is used to avoid such circular dependency. We can generate asm-offsets.h first, and if the COMPILE_OFFSETS is defined, we don't include the "generated/test.h". And we define the macro COMPILE_OFFSETS for all the asm-offsets.c for this purpose. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-09-24Merge tag 'soc_fsl-6.18-1' of https://github.com/chleroy/linux into soc/driversArnd Bergmann1-1/+0
FSL SOC Changes for 6.18: - Use for_each_online_cpu() instead of for_each_cpu() in qbman - Update FSL QUICC ENGINE GPIO driver to a standard platform driver and stop using legacy-of-mm-gpiochip.h header - Misc fixes on bus/fsl-mc * tag 'soc_fsl-6.18-1' of https://github.com/chleroy/linux: soc/fsl/qbman: Use for_each_online_cpu() instead of for_each_cpu() soc: fsl: qe: Drop legacy-of-mm-gpiochip.h header from GPIO driver soc: fsl: qe: Change GPIO driver to a proper platform driver bus: fsl-mc: Replace snprintf and sprintf with sysfs_emit in sysfs show functions bus: fsl-mc: Check return value of platform_get_resource() Link: https://lore.kernel.org/r/26615a15-3494-435f-b0c1-861122b4b5e1@csgroup.eu Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-23VFS: rename kern_path_locked() and related functions.NeilBrown1-2/+2
kern_path_locked() is now only used to prepare for removing an object from the filesystem (and that is the only credible reason for wanting a positive locked dentry). Thus it corresponds to kern_path_create() and so should have a corresponding name. Unfortunately the name "kern_path_create" is somewhat misleading as it doesn't actually create anything. The recently added simple_start_creating() provides a better pattern I believe. The "start" can be matched with "end" to bracket the creating or removing. So this patch changes names: kern_path_locked -> start_removing_path kern_path_create -> start_creating_path user_path_create -> start_creating_user_path user_path_locked_at -> start_removing_user_path_at done_path_create -> end_creating_path and also introduces end_removing_path() which is identical to end_creating_path(). __start_removing_path (which was __kern_path_locked) is enhanced to call mnt_want_write() for consistency with the start_creating_path(). Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-23powerpc/pseries: Define __u{8,32} types in papr_hvpipe_hdr structHaren Myneni1-4/+4
Fix the the following build errors with CONFIG_UAPI_HEADER_TEST: ./usr/include/asm/papr-hvpipe.h:16:9: error: unknown type name 'u8' 16 | u8 version; ./usr/include/asm/papr-hvpipe.h:17:9: error: unknown type name 'u8' 17 | u8 reserved[3]; ./usr/include/asm/papr-hvpipe.h:18:9: error: unknown type name 'u32' 18 | u32 flags; ./usr/include/asm/papr-hvpipe.h:19:9: error: unknown type name 'u8' 19 | u8 reserved2[40]; Fixes: 043439ad1a23c ("powerpc/pseries: Define papr-hvpipe ioctl") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250922091108.1483970-1-haren@linux.ibm.com
2025-09-22soc: fsl: qe: Drop legacy-of-mm-gpiochip.h header from GPIO driverChristophe Leroy1-1/+0
Remove legacy-of-mm-gpiochip.h header file. The above mentioned file provides an OF API that's deprecated. There is no agnostic alternatives to it and we have to open code the logic which was hidden behind of_mm_gpiochip_add_data(). Note, most of the GPIO drivers are using their own labeling schemas and resource retrieval that only a few may gain of the code deduplication, so whenever alternative is appear we can move drivers again to use that one. As a side effect this change fixes a potential memory leak on an error path, if of_mm_gpiochip_add_data() fails. [Text copied from commit 34064c8267a6 ("powerpc/8xx: Drop legacy-of-mm-gpiochip.h header")] Suggested-by: Bartosz Golaszewski <brgl@bgdev.pl> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/e8a5d2c5b72233bd36da7fecc0a551ca54d39478.1758212309.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
2025-09-22powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is neededAthira Rajeev1-2/+52
Handle the case when the aux buffer is going to be full and data needs to be written to the data file. perf_aux_output_begin() function checks if there is enough space depending on the values of aux_wakeup and aux_watermark which is part of "struct perf_buffer". Inorder to maintain where to write to aux buffer, add two fields to "struct vpa_pmu_buf". Field "threshold" to indicate total possible DTL entries that can be contained in aux buffer and field "full" to indicate anytime when buffer is full. In perf_aux_output_end, there is check to see if wake up is needed based on aux head value. In vpa_dtl_capture_aux(), check if there is enough space to contain the DTL data. If not, save the data for available memory and set full to true. Set head of private aux to zero when buffer is full so that next data will be copied to beginning of the buffer. The address used for copying to aux is "aux_copy_buf + buf->head". So once buffer is full, set head to zero, so that next time it will be written from start of the buffer. Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250915102947.26681-7-atrajeev@linux.ibm.com
2025-09-22powerpc/perf/vpa-dtl: Add support to capture DTL data in aux bufferAthira Rajeev1-1/+130
vpa dtl pmu has one hrtimer added per vpa-dtl pmu thread. When the hrtimer expires, in the timer handler, code is added to save the DTL data to perf event record via vpa_dtl_capture_aux() function. The DTL (Dispatch Trace Log) contains information about dispatch/preempt, enqueue time etc. We directly copy the DTL buffer data as part of auxiliary buffer. Data will be written to disk only when the allocated buffer is full. By this approach, all the DTL data will be present as-is in the perf.data. The data will be post-processed in perf tools side when doing perf report/perf script and this will avoid time taken to create samples in the kernel space. To corelate each DTL entry with other events across CPU's, we need to map timebase from "struct dtl_entry" which phyp provides with boot timebase. This also needs timebase frequency. Define "struct boottb_freq" to save these details. Added changes to capture the Dispatch Trace Log details to AUX buffer in vpa_dtl_dump_sample_data(). Boot timebase and frequency needs to be saved only at once, added field to indicate this as part of "vpa_pmu_buf" structure. perf_aux_output_begin: This function is called before writing to AUX area. This returns the pointer to aux area private structure, ie "struct vpa_pmu_buf". The function obtains the output handle (used in perf_aux_output_end). when capture completes in vpa_dtl_capture_aux(), call perf_aux_output_end() to commit the recorded data. perf_aux_output_end() is called to move the aux->head of "struct perf_buffer" to indicate size of data in aux buffer. aux_tail will be moved in perf tools side when writing the data from aux buffer to perf.data file in disk. It is responsiblity of PMU driver to make sure data is copied between perf_aux_output_begin and perf_aux_output_end. Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250915102947.26681-6-atrajeev@linux.ibm.com
2025-09-22powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL dataAthira Rajeev1-0/+77
vpa dtl pmu has one hrtimer added per vpa-dtl pmu thread. When the hrtimer expires, in the timer handler, code is added to save the DTL data to perf event record. DTL (Dispatch Trace Log) contains information about dispatch/preempt, enqueue time etc. We directly copy the DTL buffer data as part of auxiliary buffer and it will be postprocessed later. To enable the support for aux buffer, add the PMU callbacks for setup_aux and free_aux. In setup_aux, set up pmu-private data structures for an AUX area. rb_alloc_aux uses "alloc_pages_node" and returns pointer to each page address. Map these pages to contiguous space using vmap and use that as base address. The aux private data structure ie, "struct vpa_pmu_buf" mainly saves: 1. buf->base: aux buffer base address 2. buf->head: offset from base address where data will be written to. 3. buf->size: Size of allocated memory free_aux will free pmu-private AUX data structures. Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250915102947.26681-5-atrajeev@linux.ibm.com
2025-09-22powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perfKajol Jain2-1/+341
The pseries Shared Processor Logical Partition(SPLPAR) machines can retrieve a log of dispatch and preempt events from the hypervisor using data from Disptach Trace Log(DTL) buffer. With this information, user can retrieve when and why each dispatch & preempt has occurred. Added an interface to expose the Virtual Processor Area(VPA) DTL counters via perf. The following events are available and exposed in sysfs: vpa_dtl/dtl_cede/ - Trace voluntary (OS initiated) virtual processor waits vpa_dtl/dtl_preempt/ - Trace time slice preempts vpa_dtl/dtl_fault/ - Trace virtual partition memory page faults. vpa_dtl/dtl_all/ - Trace all (dtl_cede/dtl_preempt/dtl_fault) Added interface defines supported event list, config fields for the event attributes and their corresponding bit values which are exported via sysfs. User could use the standard perf tool to access perf events exposed via vpa-dtl pmu. The VPA DTL PMU counters do not interrupt on overflow or generate any PMI interrupts. Therefore, the kernel needs to poll the counters, added hrtimer code to do that. The timer interval can be provided by user via sample_period field in nano seconds. There is one hrtimer added per vpa-dtl pmu thread. To ensure there are no other conflicting dtl users (example: debugfs dtl or /proc/powerpc/vcpudispatch_stats), interface added code to use "down_write_trylock" call to take the dtl_access_lock. The dtl_access_lock is defined in dtl.h file. Also added global reference count variable called "dtl_global_refc", to ensure dtl data can be captured per-cpu. Code also added global lock called "dtl_global_lock" to avoid race condition. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250915102947.26681-3-atrajeev@linux.ibm.com
2025-09-22powerpc/time: Expose boot_tb via accessorAboorva Devarajan2-1/+11
- Define accessor function get_boot_tb() to safely return boot_tb value, this is only needed when running in SPLPAR environments, so the accessor is built conditionally under CONFIG_PPC_SPLPAR. - Tag boot_tb as __ro_after_init since it is written once at initialized and never updated afterwards. Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250915102947.26681-2-atrajeev@linux.ibm.com
2025-09-21powerpc: stop calling page_address() in free_pages()Vishal Moola (Oracle)1-1/+1
free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. Link: https://lkml.kernel.org/r/20250903185921.1785167-6-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Justin Sanders <justin@coraid.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-21kasan: introduce ARCH_DEFER_KASAN and unify static key across modesSabyrzhan Tasbolatov5-19/+4
Patch series "kasan: unify kasan_enabled() and remove arch-specific implementations", v6. This patch series addresses the fragmentation in KASAN initialization across architectures by introducing a unified approach that eliminates duplicate static keys and arch-specific kasan_arch_is_ready() implementations. The core issue is that different architectures have inconsistent approaches to KASAN readiness tracking: - PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready() - Only HW_TAGS mode had a unified static key (kasan_flag_enabled) - Generic and SW_TAGS modes relied on arch-specific solutions or always-on behavior This patch (of 2): Introduce CONFIG_ARCH_DEFER_KASAN to identify architectures [1] that need to defer KASAN initialization until shadow memory is properly set up, and unify the static key infrastructure across all KASAN modes. [1] PowerPC, UML, LoongArch selects ARCH_DEFER_KASAN. The core issue is that different architectures haveinconsistent approaches to KASAN readiness tracking: - PowerPC, LoongArch, and UML arch, each implement own kasan_arch_is_ready() - Only HW_TAGS mode had a unified static key (kasan_flag_enabled) - Generic and SW_TAGS modes relied on arch-specific solutions or always-on behavior This patch addresses the fragmentation in KASAN initialization across architectures by introducing a unified approach that eliminates duplicate static keys and arch-specific kasan_arch_is_ready() implementations. Let's replace kasan_arch_is_ready() with existing kasan_enabled() check, which examines the static key being enabled if arch selects ARCH_DEFER_KASAN or has HW_TAGS mode support. For other arch, kasan_enabled() checks the enablement during compile time. Now KASAN users can use a single kasan_enabled() check everywhere. Link: https://lkml.kernel.org/r/20250810125746.1105476-1-snovitoll@gmail.com Link: https://lkml.kernel.org/r/20250810125746.1105476-2-snovitoll@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217049 Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> #powerpc Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: David Gow <davidgow@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Marco Elver <elver@google.com> Cc: Qing Zhang <zhangqing@loongson.cn> Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-19Merge 6.17-rc6 into kbuild-nextNathan Chancellor11-42/+35
Commit bd7c2312128e ("pinctrl: meson: Fix typo in device table macro") is needed in kbuild-next to avoid a build error with a future change. While at it, address the conflict between commit 41f9049cff32 ("riscv: Only allow LTO with CMODEL_MEDANY") and commit 6578a1ff6aa4 ("riscv: Remove version check for LTO_CLANG selects"), as reported by Stephen Rothwell [1]. Link: https://lore.kernel.org/20250908134913.68778b7b@canb.auug.org.au/ [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-16powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failureChristophe Leroy3-15/+3
PAGE_KERNEL_TEXT is an old macro that is used to tell kernel whether kernel text has to be mapped read-only or read-write based on build time options. But nowadays, with functionnalities like jump_labels, static links, etc ... more only less all kernels need to be read-write at some point, and some combinations of configs failed to work due to innacurate setting of PAGE_KERNEL_TEXT. On the other hand, today we have CONFIG_STRICT_KERNEL_RWX which implements a more controlled access to kernel modifications. Instead of trying to keep PAGE_KERNEL_TEXT accurate with all possible options that may imply kernel text modification, always set kernel text read-write at startup and rely on CONFIG_STRICT_KERNEL_RWX to provide accurate protection. Do this by passing PAGE_KERNEL_X to map_kernel_page() in __maping_ram_chunk() instead of passing PAGE_KERNEL_TEXT. Once this is done, the only remaining user of PAGE_KERNEL_TEXT is mmu_mark_initmem_nx() which uses it in a call to setibat(). As setibat() ignores the RW/RO, we can seamlessly replace PAGE_KERNEL_TEXT by PAGE_KERNEL_X here as well and get rid of PAGE_KERNEL_TEXT completely. Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://lore.kernel.org/all/342b4120-911c-4723-82ec-d8c9b03a8aef@mailbox.org/ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/8e2d793abf87ae3efb8f6dce10f974ac0eda61b8.1757412205.git.christophe.leroy@csgroup.eu