aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2026-04-05folio_batch: rename pagevec.h to folio_batch.hTal Zussman5-8/+8
struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct pagevec"). Rename include/linux/pagevec.h to reflect reality and update includes tree-wide. Add the new filename to MAINTAINERS explicitly, as it no longer matches the "include/linux/page[-_]*" pattern in MEMORY MANAGEMENT - CORE. Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-3-716868cc2d11@columbia.edu Signed-off-by: Tal Zussman <tz2294@columbia.edu> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Chris Li <chrisl@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: remove stray references to struct pagevecTal Zussman2-3/+1
Patch series "mm: Remove stray references to pagevec", v2. struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct pagevec"). Remove any stray references to it and rename relevant files and macros accordingly. While at it, remove unnecessary #includes of pagevec.h (now folio_batch.h) in .c files. There are probably more of these that could be removed in .h files, but those are more complex to verify. This patch (of 4): struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct pagevec"). Remove remaining forward declarations and change __folio_batch_release()'s declaration to match its definition. Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-0-716868cc2d11@columbia.edu Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-1-716868cc2d11@columbia.edu Signed-off-by: Tal Zussman <tz2294@columbia.edu> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Chris Li <chrisl@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: introduce vm_mmap_shadow_stack() as a helper for VM_SHADOW_STACK mappingsCatalin Marinas1-0/+2
Patch series "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE", v2. A series to extract the common shadow stack mmap into a separate helper for arm64, riscv and x86. This patch (of 5): arm64, riscv and x86 use a similar pattern for mapping the user shadow stack (cloned from x86). Extract this into a helper to facilitate code reuse. The call to do_mmap() from the new helper uses PROT_READ|PROT_WRITE prot bits instead of the PROT_READ with an explicit VM_WRITE vm_flag. The x86 intent was to avoid PROT_WRITE implying normal write since the shadow stack is not writable by normal stores. However, from a kernel perspective, the vma is writeable. Functionally there is no difference. Link: https://lkml.kernel.org/r/20260225161404.3157851-1-catalin.marinas@arm.com Link: https://lkml.kernel.org/r/20260225161404.3157851-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Deepak Gupta <debug@rivosinc.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: memcontrol: switch to native NR_VMALLOC vmstat counterJohannes Weiner1-1/+0
Eliminates the custom memcg counter and results in a single, consolidated accounting call in vmalloc code. Link: https://lkml.kernel.org/r/20260223160147.3792777-2-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: vmalloc: streamline vmalloc memory accountingJohannes Weiner2-3/+1
Use a vmstat counter instead of a custom, open-coded atomic. This has the added benefit of making the data available per-node, and prepares for cleaning up the memcg accounting as well. Link: https://lkml.kernel.org/r/20260223160147.3792777-1-hannes@cmpxchg.org Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05kho: adopt radix tree for preserved memory trackingJason Miu2-15/+199
Patch series "Make KHO Stateless", v9. This series transitions KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel. The key motivations for this change are to: - Eliminate the need for data serialization before kexec. - Remove the KHO finalize state. - Pass preservation metadata more directly to the next kernel via the FDT. The new approach uses a radix tree to mark preserved pages. A page's physical address and its order are encoded into a single value. The tree is composed of multiple levels of page-sized tables, with leaf nodes being bitmaps where each set bit represents a preserved page. The physical address of the radix tree's root is passed in the FDT, allowing the next kernel to reconstruct the preserved memory map. This series is broken down into the following patches: 1. kho: Adopt radix tree for preserved memory tracking: Replaces the xarray-based tracker with the new radix tree implementation and increments the ABI version. 2. kho: Remove finalize state and clients: Removes the now-obsolete kho_finalize() function and its usage from client code and debugfs. This patch (of 2): Introduce a radix tree implementation for tracking preserved memory pages and switch the KHO memory tracking mechanism to use it. This lays the groundwork for a stateless KHO implementation that eliminates the need for serialization and the associated "finalize" state. This patch introduces the core radix tree data structures and constants to the KHO ABI. It adds the radix tree node and leaf structures, along with documentation for the radix tree key encoding scheme that combines a page's physical address and order. To support broader use by other kernel subsystems, such as hugetlb preservation, the core radix tree manipulation functions are exported as a public API. The xarray-based memory tracking is replaced with this new radix tree implementation. The core KHO preservation and unpreservation functions are wired up to use the radix tree helpers. On boot, the second kernel restores the preserved memory map by walking the radix tree whose root physical address is passed via the FDT. The ABI `compatible` version is bumped to "kho-v2" to reflect the structural changes in the preserved memory map and sub-FDT property names. This includes renaming "fdt" to "preserved-data" to better reflect that preserved state may use formats other than FDT. [ran.xiaokai@zte.com.cn: fix child node parsing for debugfs in/sub_fdts] Link: https://lkml.kernel.org/r/20260309033530.244508-1-ranxiaokai627@163.com Link: https://lkml.kernel.org/r/20260206021428.3386442-1-jasonmiu@google.com Link: https://lkml.kernel.org/r/20260206021428.3386442-2-jasonmiu@google.com Signed-off-by: Jason Miu <jasonmiu@google.com> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Changyuan Lyu <changyuanl@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Ran Xiaokai <ran.xiaokai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: add folio_test_lazyfree helperVernon Yang1-0/+5
Add folio_test_lazyfree() function to identify lazy-free folios to improve code readability. Link: https://lkml.kernel.org/r/20260221093918.1456187-4-vernon2gm@gmail.com Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lance Yang <lance.yang@linux.dev> Reviewed-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Barry Song <baohua@kernel.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: cache struct page for empty_zero_page and return it from ZERO_PAGE()Mike Rapoport (Microsoft)1-3/+8
For most architectures every invocation of ZERO_PAGE() does virt_to_page(empty_zero_page). But empty_zero_page is in BSS and it is enough to get its struct page once at initialization time and then use it whenever a zero page should be accessed. Add yet another __zero_page variable that will be initialized as virt_to_page(empty_zero_page) for most architectures in a weak arch_setup_zero_pages() function. For architectures that use colored zero pages (MIPS and s390) rename their setup_zero_pages() to arch_setup_zero_pages() and make it global rather than static. For architectures that cannot use virt_to_page() for BSS (arm64 and sparc64) add override of arch_setup_zero_pages(). Link: https://lkml.kernel.org/r/20260211103141.3215197-5-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05arch, mm: consolidate empty_zero_pageMike Rapoport (Microsoft)1-0/+10
Reduce 22 declarations of empty_zero_page to 3 and 23 declarations of ZERO_PAGE() to 4. Every architecture defines empty_zero_page that way or another, but for the most of them it is always a page aligned page in BSS and most definitions of ZERO_PAGE do virt_to_page(empty_zero_page). Move Linus vetted x86 definition of empty_zero_page and ZERO_PAGE() to the core MM and drop these definitions in architectures that do not implement colored zero page (MIPS and s390). ZERO_PAGE() remains a macro because turning it to a wrapper for a static inline causes severe pain in header dependencies. For the most part the change is mechanical, with these being noteworthy: * alpha: aliased empty_zero_page with ZERO_PGE that was also used for boot parameters. Switching to a generic empty_zero_page removes the aliasing and keeps ZERO_PGE for boot parameters only * arm64: uses __pa_symbol() in ZERO_PAGE() so that definition of ZERO_PAGE() is kept intact. * m68k/parisc/um: allocated empty_zero_page from memblock, although they do not support zero page coloring and having it in BSS will work fine. * sparc64 can have empty_zero_page in BSS rather allocate it, but it can't use virt_to_page() for BSS. Keep it's definition of ZERO_PAGE() but instead of allocating it, make mem_map_zero point to empty_zero_page. * sh: used empty_zero_page for boot parameters at the very early boot. Rename the parameters page to boot_params_page and let sh use the generic empty_zero_page. * hexagon: had an amusing comment about empty_zero_page /* A handy thing to have if one has the RAM. Declared in head.S */ that unfortunately had to go :) Link: https://lkml.kernel.org/r/20260211103141.3215197-4-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Helge Deller <deller@gmx.de> [parisc] Tested-by: Helge Deller <deller@gmx.de> [parisc] Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Magnus Lindholm <linmag7@gmail.com> [alpha] Acked-by: Dinh Nguyen <dinguyen@kernel.org> [nios2] Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc] Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: David S. Miller <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: rename my_zero_pfn() to zero_pfn()Mike Rapoport (Microsoft)1-8/+20
my_zero_pfn() is a silly name. Rename zero_pfn variable to zero_page_pfn and my_zero_pfn() function to zero_pfn(). While on it, move extern declarations of zero_page_pfn outside the functions that use it and add a comment about what ZERO_PAGE is. Link: https://lkml.kernel.org/r/20260211103141.3215197-3-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: don't special case !MMU for is_zero_pfn() and my_zero_pfn()Mike Rapoport (Microsoft)1-13/+1
Patch series "arch, mm: consolidate empty_zero_page", v3. These patches cleanup handling of ZERO_PAGE() and zero_pfn. This patch (of 4): nommu architectures have empty_zero_page and define ZERO_PAGE() and although they don't really use it to populate page tables, there is no reason to hardwire !MMU implementation of is_zero_pfn() and my_zero_pfn() to 0. Drop #ifdef CONFIG_MMU around implementations of is_zero_pfn() and my_zero_pfn() and remove !MMU version. While on it, make zero_pfn __ro_after_init. Link: https://lkml.kernel.org/r/20260211103141.3215197-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20260211103141.3215197-2-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: name the anonymous MMOP enum as enum mmopGregory Price2-9/+10
Give the MMOP enum (MMOP_OFFLINE, MMOP_ONLINE, etc) a proper type name so the compiler can help catch invalid values being assigned to variables of this type. Leave the existing functions returning int alone to allow for value-or-error pattern to remain unchanged without churn. mmop_default_online_type is left as int because it uses the -1 sentinal value to signal it hasn't been initialized yet. Keep the uint8_t buffer in offline_and_remove_memory() as-is for space efficiency, with an explicit cast when we consume the value. Move the enum definition before the CONFIG_MEMORY_HOTPLUG guard so it is unconditionally available for struct memory_block in memory.h. No functional change. Link: https://lore.kernel.org/linux-mm/3424eba7-523b-4351-abd0-3a888a3e5e61@kernel.org/ Link: https://lkml.kernel.org/r/20260211215447.2194189-1-gourry@gourry.net Signed-off-by: Gregory Price <gourry@gourry.net> Suggested-by: Jonathan Cameron <jonathan.cameron@huawei.com> Suggested-by: "David Hildenbrand (arm)" <david@kernel.org> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Acked-by: David Hildenbrand (Arm) <david@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: zswap: add per-memcg stat for incompressible pagesJiayuan Chen1-0/+1
Patch series "mm: zswap: add per-memcg stat for incompressible pages", v3. In containerized environments, knowing which cgroup is contributing incompressible pages to zswap is essential for effective resource management. This series adds a new per-memcg stat 'zswap_incomp' to track incompressible pages, along with a selftest. This patch (of 2): The global zswap_stored_incompressible_pages counter was added in commit dca4437a5861 ("mm/zswap: store <PAGE_SIZE compression failed page as-is") to track how many pages are stored in raw (uncompressed) form in zswap. However, in containerized environments, knowing which cgroup is contributing incompressible pages is essential for effective resource management [1]. Add a new memcg stat 'zswap_incomp' to track incompressible pages per cgroup. This helps administrators and orchestrators to: 1. Identify workloads that produce incompressible data (e.g., encrypted data, already-compressed media, random data) and may not benefit from zswap. 2. Make informed decisions about workload placement - moving incompressible workloads to nodes with larger swap backing devices rather than relying on zswap. 3. Debug zswap efficiency issues at the cgroup level without needing to correlate global stats with individual cgroups. While the compression ratio can be estimated from existing stats (zswap / zswapped * PAGE_SIZE), this doesn't distinguish between "uniformly poor compression" and "a few completely incompressible pages mixed with highly compressible ones". The zswap_incomp stat provides direct visibility into the latter case. Link: https://lkml.kernel.org/r/20260213071827.5688-1-jiayuan.chen@linux.dev Link: https://lkml.kernel.org/r/20260213071827.5688-2-jiayuan.chen@linux.dev Link: https://lore.kernel.org/linux-mm/CAF8kJuONDFj4NAksaR4j_WyDbNwNGYLmTe-o76rqU17La=nkOw@mail.gmail.com/ [1] Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Acked-by: Nhat Pham <nphamcs@gmail.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm/damon: remove unused target param of get_scheme_score()Asier Gutierrez1-2/+1
damon_target is not used by get_scheme_score operations, nor with virtual neither with physical addresses. Link: https://lkml.kernel.org/r/20260213145032.1740407-1-gutierrez.asier@huawei-partners.com Signed-off-by: Asier Gutierrez <gutierrez.asier@huawei-partners.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Quanmin Yan <yanquanmin1@huawei.com> Cc: ze zuo <zuoze1@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: memfd_luo: preserve file sealsPratyush Yadav (Google)1-1/+17
File seals are used on memfd for making shared memory communication with untrusted peers safer and simpler. Seals provide a guarantee that certain operations won't be allowed on the file such as writes or truncations. Maintaining these guarantees across a live update will help keeping such use cases secure. These guarantees will also be needed for IOMMUFD preservation with LUO. Normally when IOMMUFD maps a memfd, it pins all its pages to make sure any truncation operations on the memfd don't lead to IOMMUFD using freed memory. This doesn't work with LUO since the preserved memfd might have completely different pages after a live update, and mapping them back to the IOMMUFD will cause all sorts of problems. Using and preserving the seals allows IOMMUFD preservation logic to trust the memfd. Since the uABI defines seals as an int, preserve them by introducing a new u32 field. There are currently only 6 possible seals, so the extra bits are unused and provide room for future expansion. Since the seals are uABI, it is safe to use them directly in the ABI. While at it, also add a u32 flags field. It makes sure the struct is nicely aligned, and can be used later to support things like MFD_CLOEXEC. Since the serialization structure is changed, bump the version number to "memfd-v2". It is important to note that the memfd-v2 version only supports seals that existed when this version was defined. This set is defined by MEMFD_LUO_ALL_SEALS. Any new seal might bring a completely different semantic with it and the parser for memfd-v2 cannot be expected to deal with that. If there are any future seals added, they will need another version bump. Link: https://lkml.kernel.org/r/20260216185946.1215770-3-pratyush@kernel.org Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org> Tested-by: Samiullah Khawaja <skhawaja@google.com> Cc: Alexander Graf <graf@amazon.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05memfd: export memfd_{add,get}_seals()Pratyush Yadav (Google)1-0/+12
Patch series "mm: memfd_luo: preserve file seals", v2. This series adds support for preserving file seals when preserving a memfd using LUO. Patch 1 exports some memfd seal manipulation functions and patch 2 adds support for preserving them. Since it makes changes to the serialized data structure for memfd, it also bumps the version number. This patch (of 2): Support for preserving file seals will be added to memfd preservation using the Live Update Orchestrator (LUO). Export memfd_{add,get}_seals)() so memfd_luo can use them to manipulate the seals. Link: https://lkml.kernel.org/r/20260216185946.1215770-1-pratyush@kernel.org Link: https://lkml.kernel.org/r/20260216185946.1215770-2-pratyush@kernel.org Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: Samiullah Khawaja <skhawaja@google.com> Cc: Alexander Graf <graf@amazon.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm, swap: use the swap table to track the swap countKairui Song1-25/+3
Now all the infrastructures are ready, switch to using the swap table only. This is unfortunately a large patch because the whole old counting mechanism, especially SWP_CONTINUED, has to be gone and switch to the new mechanism together, with no intermediate steps available. The swap table is capable of holding up to SWP_TB_COUNT_MAX - 1 counts in the higher bits of each table entry, so using that, the swap_map can be completely dropped. swap_map also had a limit of SWAP_CONT_MAX. Any value beyond that limit will require a COUNT_CONTINUED page. COUNT_CONTINUED is a bit complex to maintain, so for the swap table, a simpler approach is used: when the count goes beyond SWP_TB_COUNT_MAX - 1, the cluster will have an extend_table allocated, which is a swap cluster-sized array of unsigned int. The counting is basically offloaded there until the count drops below SWP_TB_COUNT_MAX again. Both the swap table and the extend table are cluster-based, so they exhibit good performance and sparsity. To make the switch from swap_map to swap table clean, this commit cleans up and introduces a new set of functions based on the swap table design, for manipulating swap counts: - __swap_cluster_dup_entry, __swap_cluster_put_entry, __swap_cluster_alloc_entry, __swap_cluster_free_entry: Increase/decrease the count of a swap slot, or alloc / free a swap slot. This is the internal routine that does the counting work based on the swap table and handles all the complexities. The caller will need to lock the cluster before calling them. All swap count-related update operations are wrapped by these four helpers. - swap_dup_entries_cluster, swap_put_entries_cluster: Increase/decrease the swap count of one or a set of swap slots in the same cluster range. These two helpers serve as the common routines for folio_dup_swap & swap_dup_entry_direct, or folio_put_swap & swap_put_entries_direct. And use these helpers to replace all existing callers. This helps to simplify the count tracking by a lot, and the swap_map is gone. [ryncsn@gmail.com: fix build] Link: https://lkml.kernel.org/r/aZWuLZi-vYi3vAWe@KASONG-MC4 Link: https://lkml.kernel.org/r/20260218-swap-table-p3-v3-9-f4e34be021a7@tencent.com Signed-off-by: Kairui Song <kasong@tencent.com> Suggested-by: Chris Li <chrisl@kernel.org> Acked-by: Chris Li <chrisl@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kairui Song <ryncsn@gmail.com> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: kernel test robot <lkp@intel.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05mm: move pgscan, pgsteal, pgrefill to node statsJP Kobryn (Meta)2-13/+13
There are situations where reclaim kicks in on a system with free memory. One possible cause is a NUMA imbalance scenario where one or more nodes are under pressure. It would help if we could easily identify such nodes. Move the pgscan, pgsteal, and pgrefill counters from vm_event_item to node_stat_item to provide per-node reclaim visibility. With these counters as node stats, the values are now displayed in the per-node section of /proc/zoneinfo, which allows for quick identification of the affected nodes. /proc/vmstat continues to report the same counters, aggregated across all nodes. But the ordering of these items within the readout changes as they move from the vm events section to the node stats section. Memcg accounting of these counters is preserved. The relocated counters remain visible in memory.stat alongside the existing aggregate pgscan and pgsteal counters. However, this change affects how the global counters are accumulated. Previously, the global event count update was gated on !cgroup_reclaim(), excluding memcg-based reclaim from /proc/vmstat. Now that mod_lruvec_state() is being used to update the counters, the global counters will include all reclaim. This is consistent with how pgdemote counters are already tracked. Finally, the virtio_balloon driver is updated to use global_node_page_state() to fetch the counters, as they are no longer accessible through the vm_events array. Link: https://lkml.kernel.org/r/20260219235846.161910-1-jp.kobryn@linux.dev Signed-off-by: JP Kobryn <jp.kobryn@linux.dev> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@kernel.org> Cc: Eugenio Pérez <eperezma@redhat.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Wei Xu <weixugc@google.com> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: start using maple copy node for destinationLiam R. Howlett1-0/+14
Stop using the maple subtree state and big node in favour of using three destinations in the maple copy node. That is, expand the way leaves were handled to all levels of the tree and use the maple copy node to track the new nodes. Extract out the sibling init into the data calculation since this is where the insufficient data can be detected. The remainder of the sibling code to shift the next iteration is moved to the spanning_ascend() function, since it is not always needed. Next introduce the dst_setup() function which will decide how many nodes are needed to contain the data at this level. Using the destination count, populate the copy node's dst array with the new nodes and set d_count to the correct value. Note that this can be tricky in the case of a leaf node with exactly enough room because of the rule against NULLs at the end of leaves. Once the destinations are ready, copy the data by altering the cp_data_write() function to copy from the sources to the destinations directly. This eliminates the use of the big node in this code path. On node completion, node_finalise() will zero out the remaining area and set the metadata, if necessary. spanning_ascend() is used to decide if the operation is complete. It may create a new root, converge into one destination, or continue upwards by ascending the left and right write maple states. One test case setup needed to be tweaked so that the targeted node was surrounded by full nodes. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20260130205935.2559335-18-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: add gap support, slot and pivot sizes for maple copyLiam R. Howlett1-0/+1
Add plumbing work for using maple copy as a normal node for a source of copy operations. This is needed later. Link: https://lkml.kernel.org/r/20260130205935.2559335-17-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: change initial big node setup in mas_wr_spanning_rebalance()Liam R. Howlett1-0/+1
Instead of copying the data into the big node and finding out that the data may need to be moved or appended to, calculate the data space up front (in the maple copy node) and set up another source for the copy. The additional copy source is tracked in the maple state sib (short for sibling), and is put into the maple write states for future operations after the data is in the big node. To facilitate the newly moved node, some initial setup of the maple subtree state are relocated after the potential shift caused by the new way of rebalancing against a sibling. Link: https://lkml.kernel.org/r/20260130205935.2559335-15-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05maple_tree: introduce maple_copy node and use it in mas_spanning_rebalance()Liam R. Howlett1-0/+26
Introduce an internal-memory only node type called maple_copy to facilitate internal copy operations. Use it in mas_spanning_rebalance() for just the leaf nodes. Initially, the maple_copy node is used to configure the source nodes and copy the data into the big_node. The maple_copy contains a list of source entries with start and end offsets. One of the maple_copy entries can be itself with an offset of 0 to 2, representing the data where the store partially overwrites entries, or fully overwrites the entry. The side effect is that the source nodes no longer have to worry about partially copying the existing offset if it is not fully overwritten. This is in preparation of removal of the maple big_node, but for the time being the data is copied to the big node to limit the change size. Link: https://lkml.kernel.org/r/20260130205935.2559335-12-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrew Ballance <andrewjballance@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05Merge tag 'char-misc-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds2-2/+14
Pull char/misc/iio driver fixes from Greg KH: "Here are a relativly large number of small char/misc/iio and other driver fixes for 7.0-rc7. There's a bunch, but overall they are all small fixes for issues that people have been having that I finally caught up with getting merged due to delays on my end. The "largest" change overall is just some documentation updates to the security-bugs.rst file to hopefully tell the AI tools (and any users that actually read the documentation), how to send us better security bug reports as the quantity of reports these past few weeks has increased dramatically due to tools getting better at "finding" things. Included in here are: - lots of small IIO driver fixes for issues reported in 7.0-rc - gpib driver fixes - comedi driver fixes - interconnect driver fix - nvmem driver fixes - mei driver fix - counter driver fix - binder rust driver fixes - some other small misc driver fixes All of these have been in linux-next this week with no reported issues" * tag 'char-misc-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (63 commits) Documentation: fix two typos in latest update to the security report howto Documentation: clarify the mandatory and desirable info for security reports Documentation: explain how to find maintainers addresses for security reports Documentation: minor updates to the security contacts .get_maintainer.ignore: add myself nvmem: zynqmp_nvmem: Fix buffer size in DMA and memcpy nvmem: imx: assign nvmem_cell_info::raw_len misc: fastrpc: check qcom_scm_assign_mem() return in rpmsg_probe misc: fastrpc: possible double-free of cctx->remote_heap comedi: dt2815: add hardware detection to prevent crash comedi: runflags cannot determine whether to reclaim chanlist comedi: Reinit dev->spinlock between attachments to low-level drivers comedi: me_daq: Fix potential overrun of firmware buffer comedi: me4000: Fix potential overrun of firmware buffer comedi: ni_atmio16d: Fix invalid clean-up after failed attach gpib: fix use-after-free in IO ioctl handlers gpib: lpvo_usb: fix memory leak on disconnect gpib: Fix fluke driver s390 compile issue lis3lv02d: Omit IRQF_ONESHOT if no threaded handler is provided lis3lv02d: fix kernel-doc warnings ...
2026-04-05Merge tag 'usb-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds1-2/+8
Pull USB/Thunderbolt fixes from Greg KH: "Here are a bunch of USB and Thunderbolt fixes (most all are USB) for 7.0-rc7. More than I normally like this late in the release cycle, partly due to my recent travels, and partly due to people banging away on the USB gadget interfaces and apis more than normal (big shoutout to Android for getting the vendors to actually work upstream on this, that's a huge win overall for everyone here) Included in here are: - Small thunderbolt fix - new USB serial driver ids added - typec driver fixes - gadget driver fixes for some disconnect issues - other usb gadget driver fixes for reported problems with binding and unbinding devices as happens when a gadget device connects / disconnects from a system it is plugged into (or it switches device mode at a user's request, these things are complex little beasts...) - usb offload fixes (where USB audio tunnels through the controller while the main CPU is asleep) for when EMP spikes hit the system causing disconnects to happen (as often happens with static electricity in the winter months). This has been much reported by at least one vendor, and resolves the issues they have been seeing with this codepath. Can't wait for the "formal methods are the answer!" people to try to model that one properly... - Other small usb driver fixes for issues reported. All of these have been in linux-next this week, and before, with no reported issues, and I've personally been stressing these harder than normal on my systems here with no problems" * tag 'usb-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (39 commits) usb: gadget: f_hid: move list and spinlock inits from bind to alloc usb: host: xhci-sideband: delegate offload_usage tracking to class drivers usb: core: use dedicated spinlock for offload state usb: cdns3: gadget: fix state inconsistency on gadget init failure usb: dwc3: imx8mp: fix memory leak on probe failure path usb: gadget: f_uac1_legacy: validate control request size usb: ulpi: fix double free in ulpi_register_interface() error path usb: misc: usbio: Fix URB memory leak on submit failure USB: core: add NO_LPM quirk for Razer Kiyo Pro webcam usb: cdns3: gadget: fix NULL pointer dereference in ep_queue usb: core: phy: avoid double use of 'usb3-phy' USB: serial: option: add MeiG Smart SRM825WN usb: gadget: f_rndis: Fix net_device lifecycle with device_move usb: gadget: f_subset: Fix net_device lifecycle with device_move usb: gadget: f_eem: Fix net_device lifecycle with device_move usb: gadget: f_ecm: Fix net_device lifecycle with device_move usb: gadget: u_ncm: Add kernel-doc comments for struct f_ncm_opts usb: gadget: f_rndis: Protect RNDIS options with mutex usb: gadget: f_subset: Fix unbalanced refcnt in geth_free dt-bindings: connector: add pd-disable dependency ...
2026-04-04prctl: rename branch landing pad implementation functions to be more explicitPaul Walmsley1-3/+3
Per Linus' comments about the unreadability of abbreviations such as "indir_br_lp", rename the three prctl() implementation functions to be more explicit. This involves renaming "indir_br_lp_status" in the function names to "branch_landing_pad_state". While here, add _prctl_ into the function names, following the speculation control prctl implementation functions. Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/ Cc: Deepak Gupta <debug@rivosinc.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04Merge tag 'v7.0-rc6' into irq/coreThomas Gleixner38-145/+305
to be able to merge the hyper-v patch related to randomness.
2026-04-04Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linuxRafael J. Wysocki1-2/+2
Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello: "Add support for new features: * CPPC performance priority * Dynamic EPP * Raw EPP * New unit tests for new features Fixes for: * PREEMPT_RT * sysfs files being present when HW missing * Broken/outdated documentation" * tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits) MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer cpufreq: Pass the policy to cpufreq_driver->adjust_perf() cpufreq/amd-pstate: Pass the policy to amd_pstate_update() cpufreq/amd-pstate-ut: Add a unit test for raw EPP cpufreq/amd-pstate: Add support for raw EPP writes cpufreq/amd-pstate: Add support for platform profile class cpufreq/amd-pstate: add kernel command line to override dynamic epp cpufreq/amd-pstate: Add dynamic energy performance preference Documentation: amd-pstate: fix dead links in the reference section cpufreq/amd-pstate: Cache the max frequency in cpudata Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count} Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file amd-pstate-ut: Add a testcase to validate the visibility of driver attributes amd-pstate-ut: Add module parameter to select testcases amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2() amd-pstate: Add sysfs support for floor_freq and floor_count amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF x86/cpufeatures: Add AMD CPPC Performance Priority feature. amd-pstate: Make certain freq_attrs conditionally visible ...
2026-04-04bus: fsl-mc: use generic driver_override infrastructureDanilo Krummrich1-4/+0
When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus") Link: https://patch.msgid.link/20260324005919.2408620-3-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-04block: remove unused BVEC_ITER_ALL_INITCaleb Sander Mateos1-9/+0
This macro no longer has any users, so remove it. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Link: https://patch.msgid.link/20260403184852.2140919-1-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-04-04Merge back earlier cpufreq material for 7.1Rafael J. Wysocki1-3/+4
2026-04-04PCI: endpoint: Add reserved region type for MSI-X Table and PBAManikanta Maddireddy1-0/+4
Add PCI_EPC_BAR_RSVD_MSIX_TBL_RAM and PCI_EPC_BAR_RSVD_MSIX_PBA_RAM to enum pci_epc_bar_rsvd_region_type so that Endpoint controllers can describe hardware-owned MSI-X Table and PBA (Pending Bit Array) regions behind a BAR_RESERVED BAR. Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260324080857.916263-2-mmaddireddy@nvidia.com
2026-04-04get rid of busy-waiting in shrink_dcache_tree()Al Viro1-3/+15
If shrink_dcache_tree() runs into a potential victim that is already dying, it must wait for that dentry to go away. To avoid busy-waiting we need some object to wait on and a way for dentry_unlist() to see that we need to be notified. The obvious place for the object to wait on would be on our stack frame. We will store a pointer to that object (struct completion_list) in victim dentry; if there's more than one thread wanting to wait for the same dentry to finish dying, we'll have their instances linked into a list, with reference in dentry pointing to the head of that list. * new object - struct completion_list. A pair of struct completion and pointer to the next instance. That's what shrink_dcache_tree() will wait on if needed. * add a new member (->waiters, opaque pointer to struct completion_list) to struct dentry. It is defined for negative live dentries that are not in-lookup ones and it will remain NULL for almost all of them. It does not conflict with ->d_rcu (defined for killed dentries), ->d_alias (defined for positive dentries, all live) or ->d_in_lookup_hash (defined for in-lookup dentries, all live negative). That allows to colocate all four members. * make sure that all places where dentry enters the state where ->waiters is defined (live, negative, not-in-lookup) initialize ->waiters to NULL. * if select_collect2() runs into a dentry that is already dying, have its caller insert a local instance of struct completion_list into the head of the list hanging off dentry->waiters and wait for completion. * if dentry_unlist() sees non-NULL ->waiters, have it carefully walk through the completion_list instances in that list, calling complete() for each. For now struct completion_list is local to fs/dcache.c; it's obviously dentry-agnostic, and it can be trivially lifted into linux/completion.h if somebody finds a reason to do so... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2026-04-03dpll: add frequency monitoring callback opsIvan Vecera1-0/+10
Add new callback operations for a dpll device: - freq_monitor_get(..) - to obtain current state of frequency monitor feature from dpll device, - freq_monitor_set(..) - to allow feature configuration. Add new callback operation for a dpll pin: - measured_freq_get(..) - to obtain the measured frequency in mHz. Obtain the feature state value using the get callback and provide it to the user if the device driver implements callbacks. The measured_freq_get pin callback is only invoked when the frequency monitor is enabled. The freq_monitor_get device callback is required when measured_freq_get is provided by the driver. Execute the set callback upon user requests. Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Link: https://patch.msgid.link/20260402184057.1890514-3-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-04vdpa: use generic driver_override infrastructureDanilo Krummrich1-4/+0
When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 539fec78edb4 ("vdpa: add driver_override support") Acked-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20260324005919.2408620-9-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-04platform/wmi: use generic driver_override infrastructureDanilo Krummrich1-4/+0
When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 12046f8c77e0 ("platform/x86: wmi: Add driver_override support") Reviewed-by: Armin Wolf <W_Armin@gmx.de> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20260324005919.2408620-7-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-04PCI: use generic driver_override infrastructureDanilo Krummrich1-6/+0
When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Williamson <alex@shazbot.org> Tested-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260324005919.2408620-6-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-03net: always inline some skb helpersEric Dumazet1-20/+26
Some performance critical helpers from include/linux/skbuff.h are not inlined by clang. Use __always_inline hint for: - __skb_fill_netmem_desc() - __skb_fill_page_desc() - skb_fill_netmem_desc() - skb_fill_page_desc() - __skb_pull() - pskb_may_pull_reason() - pskb_may_pull() - pskb_pull() - pskb_trim() - skb_orphan() - skb_postpull_rcsum() - skb_header_pointer() - skb_clear_delivery_time() - skb_tstamp_cond() - skb_warn_if_lro() This increases performance and saves ~1200 bytes of text. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 4/24 grow/shrink: 66/12 up/down: 4104/-5306 (-1202) Function old new delta ip_multipath_l3_keys - 303 +303 tcp_sendmsg_locked 4560 4848 +288 xfrm_input 6240 6455 +215 esp_output_head 1516 1711 +195 skb_try_coalesce 696 866 +170 bpf_prog_test_run_skb 1951 2091 +140 tls_strp_read_copy 528 667 +139 gue_udp_recv 738 871 +133 __ip6_append_data 4159 4279 +120 __bond_xmit_hash 1019 1122 +103 ip6_multipath_l3_keys 394 495 +101 bpf_lwt_seg6_action 1096 1197 +101 input_action_end_dx2 344 442 +98 vxlan_remcsum 487 581 +94 udpv6_queue_rcv_skb 393 480 +87 udp_queue_rcv_skb 385 471 +86 gue_remcsum 453 539 +86 udp_lib_checksum_complete 84 168 +84 vxlan_xmit 2777 2857 +80 nf_reset_ct 456 532 +76 igmp_rcv 1902 1978 +76 mpls_forward 1097 1169 +72 tcp_add_backlog 1226 1292 +66 nfulnl_log_packet 3091 3156 +65 tcp_rcv_established 1966 2026 +60 __strp_recv 1547 1603 +56 eth_type_trans 357 411 +54 bond_flow_ip 392 444 +52 __icmp_send 1584 1630 +46 ip_defrag 1636 1681 +45 tpacket_rcv 2793 2837 +44 refcount_add 132 176 +44 nf_ct_frag6_gather 1959 2003 +44 napi_skb_free_stolen_head 199 240 +41 __pskb_trim - 41 +41 napi_reuse_skb 319 358 +39 icmpv6_rcv 1877 1916 +39 br_handle_frame_finish 1672 1711 +39 ip_rcv_core 841 879 +38 ip_check_defrag 377 415 +38 br_stp_rcv 909 947 +38 qdisc_pkt_len_segs_init 366 399 +33 mld_query_work 2945 2975 +30 bpf_sk_assign_tcp_reqsk 607 637 +30 udp_gro_receive 1657 1686 +29 ip6_rcv_core 1170 1193 +23 ah_input 1176 1197 +21 tun_get_user 5174 5194 +20 llc_rcv 815 834 +19 __pfx_udp_lib_checksum_complete 16 32 +16 __pfx_refcount_add 48 64 +16 __pfx_nf_reset_ct 96 112 +16 __pfx_ip_multipath_l3_keys - 16 +16 __pfx___pskb_trim - 16 +16 packet_sendmsg 5771 5781 +10 esp_output_tail 1460 1470 +10 alloc_skb_with_frags 433 443 +10 xsk_generic_xmit 3477 3486 +9 mptcp_sendmsg_frag 2250 2259 +9 __ip_append_data 4166 4175 +9 __ip6_tnl_rcv 1159 1168 +9 skb_zerocopy 1215 1220 +5 gre_parse_header 1358 1362 +4 __iptunnel_pull_header 405 407 +2 skb_vlan_untag 692 693 +1 psp_dev_rcv 701 702 +1 netkit_xmit 1263 1264 +1 gre_rcv 2776 2777 +1 gre_gso_segment 1521 1522 +1 bpf_skb_net_hdr_pop 535 536 +1 udp6_ufo_fragment 888 884 -4 br_multicast_rcv 9154 9148 -6 snap_rcv 312 305 -7 skb_copy_ubufs 1841 1834 -7 __pfx_skb_tstamp_cond 16 - -16 __pfx_skb_clear_delivery_time 16 - -16 __pfx_pskb_trim 16 - -16 __pfx_pskb_pull 16 - -16 ipv6_gso_segment 1400 1383 -17 ipv6_frag_rcv 2511 2492 -19 erspan_xmit 1221 1190 -31 __pfx_skb_warn_if_lro 32 - -32 __pfx___skb_fill_page_desc 32 - -32 skb_tstamp_cond 42 - -42 pskb_trim 46 - -46 __pfx_skb_postpull_rcsum 48 - -48 tcp_gso_segment 1524 1475 -49 skb_clear_delivery_time 54 - -54 __pfx_skb_fill_page_desc 64 - -64 __pfx_skb_header_pointer 80 - -80 pskb_pull 91 - -91 skb_warn_if_lro 110 - -110 tcp_v6_rcv 3288 3170 -118 __pfx___skb_pull 128 - -128 __pfx_skb_orphan 144 - -144 __pfx_pskb_may_pull 160 - -160 tcp_v4_rcv 3334 3153 -181 __skb_fill_page_desc 231 - -231 udp_rcv 1809 1553 -256 skb_postpull_rcsum 318 - -318 skb_header_pointer 367 - -367 fib_multipath_hash 3399 3018 -381 skb_orphan 513 - -513 skb_fill_page_desc 534 - -534 __skb_pull 568 - -568 pskb_may_pull 604 - -604 Total: Before=29652698, After=29651496, chg -0.00% Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260402152654.1720627-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-03lsm: add backing_file LSM hooksPaul Moore6-3/+44
Stacked filesystems such as overlayfs do not currently provide the necessary mechanisms for LSMs to properly enforce access controls on the mmap() and mprotect() operations. In order to resolve this gap, a LSM security blob is being added to the backing_file struct and the following new LSM hooks are being created: security_backing_file_alloc() security_backing_file_free() security_mmap_backing_file() The first two hooks are to manage the lifecycle of the LSM security blob in the backing_file struct, while the third provides a new mmap() access control point for the underlying backing file. It is also expected that LSMs will likely want to update their security_file_mprotect() callback to address issues with their mprotect() controls, but that does not require a change to the security_file_mprotect() LSM hook. There are a three other small changes to support these new LSM hooks: * Pass the user file associated with a backing file down to alloc_empty_backing_file() so it can be included in the security_backing_file_alloc() hook. * Add getter and setter functions for the backing_file struct LSM blob as the backing_file struct remains private to fs/file_table.c. * Constify the file struct field in the LSM common_audit_data struct to better support LSMs that need to pass a const file struct pointer into the common LSM audit code. Thanks to Arnd Bergmann for identifying the missing EXPORT_SYMBOL_GPL() and supplying a fixup. Cc: stable@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-unionfs@vger.kernel.org Cc: linux-erofs@lists.ozlabs.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2026-04-03kernel: ksysfs: initialize kernel_kobj earlierBartosz Golaszewski1-0/+8
Software nodes depend on kernel_kobj which is initialized pretty late into the boot process - as a core_initcall(). Ahead of moving the software node initialization to driver_init() we must first make kernel_kobj available before it. Make ksysfs_init() visible in a new header - ksysfs.h - and call it in do_basic_setup() right before driver_init(). Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20260402-nokia770-gpio-swnodes-v5-1-d730db3dd299@oss.qualcomm.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2026-04-03KVM: SEV: Disallow LAUNCH_FINISH if vCPUs are actively being createdSean Christopherson1-0/+7
Reject LAUNCH_FINISH for SEV-ES and SNP VMs if KVM is actively creating one or more vCPUs, as KVM needs to process and encrypt each vCPU's VMSA. Letting userspace create vCPUs while LAUNCH_FINISH is in-progress is "fine", at least in the current code base, as kvm_for_each_vcpu() operates on online_vcpus, LAUNCH_FINISH (all SEV+ sub-ioctls) holds kvm->mutex, and fully onlining a vCPU in kvm_vm_ioctl_create_vcpu() is done under kvm->mutex. I.e. there's no difference between an in-progress vCPU and a vCPU that is created entirely after LAUNCH_FINISH. However, given that concurrent LAUNCH_FINISH and vCPU creation can't possibly work (for any reasonable definition of "work"), since userspace can't guarantee whether a particular vCPU will be encrypted or not, disallow the combination as a hardening measure, to reduce the probability of introducing bugs in the future, and to avoid having to reason about the safety of future changes related to LAUNCH_FINISH. Cc: Jethro Beekman <jethro@fortanix.com> Closes: https://lore.kernel.org/all/b31f7c6e-2807-4662-bcdd-eea2c1e132fa@fortanix.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260310234829.2608037-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-04-03Merge tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linuxLinus Torvalds2-10/+8
Pull gpio fixes from Bartosz Golaszewski: - fix kerneldocs for gpio-timberdale and gpio-nomadik - clear the "requested" flag in error path in gpiod_request_commit() - call of_xlate() if provided when setting up shared GPIOs - handle pins shared by child firmware nodes of consumer devices - fix return value check in gpio-qixis-fpga - fix suspend on gpio-mxc - fix gpio-microchip DT bindings * tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: dt-bindings: gpio: fix microchip #interrupt-cells gpio: shared: shorten the critical section in gpiochip_setup_shared() gpio: mxc: map Both Edge pad wakeup to Rising Edge gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio() gpio: shared: handle pins shared by child nodes of devices gpio: shared: call gpio_chip::of_xlate() if set gpiolib: clear requested flag if line is invalid gpio: nomadik: repair some kernel-doc comments gpio: timberdale: repair kernel-doc comments gpio: Fix resource leaks on errors in gpiochip_add_data_with_key()
2026-04-03bpf: Add helper and kfunc stack access size resolutionAlexei Starovoitov1-0/+6
The static stack liveness analysis needs to know how many bytes a helper or kfunc accesses through a stack pointer argument, so it can precisely mark the affected stack slots as stack 'def' or 'use'. Add bpf_helper_stack_access_bytes() and bpf_kfunc_stack_access_bytes() which resolve the access size for a given call argument. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260403024422.87231-7-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03bpf: Move verifier helpers to headerAlexei Starovoitov1-0/+28
Move several helpers to header as preparation for the subsequent stack liveness patches. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260403024422.87231-6-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03bpf: Add bpf_compute_const_regs() and bpf_prune_dead_branches() passesAlexei Starovoitov1-0/+23
Add two passes before the main verifier pass: bpf_compute_const_regs() is a forward dataflow analysis that tracks register values in R0-R9 across the program using fixed-point iteration in reverse postorder. Each register is tracked with a six-state lattice: UNVISITED -> CONST(val) / MAP_PTR(map_index) / MAP_VALUE(map_index, offset) / SUBPROG(num) -> UNKNOWN At merge points, if two paths produce the same state and value for a register, it stays; otherwise it becomes UNKNOWN. The analysis handles: - MOV, ADD, SUB, AND with immediate or register operands - LD_IMM64 for plain constants, map FDs, map values, and subprogs - LDX from read-only maps: constant-folds the load by reading the map value directly via bpf_map_direct_read() Results that fit in 32 bits are stored per-instruction in insn_aux_data and bitmasks. bpf_prune_dead_branches() uses the computed constants to evaluate conditional branches. When both operands of a conditional jump are known constants, the branch outcome is determined statically and the instruction is rewritten to an unconditional jump. The CFG postorder is then recomputed to reflect new control flow. This eliminates dead edges so that subsequent liveness analysis doesn't propagate through dead code. Also add runtime sanity check to validate that precomputed constants match the verifier's tracked state. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260403024422.87231-5-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03bpf: Sort subprogs in topological order after check_cfg()Alexei Starovoitov1-0/+2
Add a pass that sorts subprogs in topological order so that iterating subprog_topo_order[] walks leaf subprogs first, then their callers. This is computed as a DFS post-order traversal of the CFG. The pass runs after check_cfg() to ensure the CFG has been validated before traversing and after postorder has been computed to avoid walking dead code. Reviewed-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20260403024422.87231-3-alexei.starovoitov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf 7.0-rc6+Alexei Starovoitov19-46/+141
Cross-merge BPF and other fixes after downstream PR. Minor conflict in kernel/bpf/verifier.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-04-03SUNRPC: Add svc_rqst_page_release() helperChuck Lever1-0/+15
svc_rqst_replace_page() releases displaced pages through a per-rqst folio batch, but exposes the add-or-flush sequence directly. svc_tcp_restore_pages() releases displaced pages individually with put_page(). Introduce svc_rqst_page_release() to encapsulate the batched release mechanism. Convert svc_rqst_replace_page() and svc_tcp_restore_pages() to use it. The latter now benefits from the same batched release that svc_rqst_replace_page() already uses. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-04-03sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy return-migrationJohn Stultz1-2/+49
As we add functionality to proxy execution, we may migrate a donor task to a runqueue where it can't run due to cpu affinity. Thus, we must be careful to ensure we return-migrate the task back to a cpu in its cpumask when it becomes unblocked. Peter helpfully provided the following example with pictures: "Suppose we have a ww_mutex cycle: ,-+-* Mutex-1 <-. Task-A ---' | | ,-- Task-B `-> Mutex-2 *-+-' Where Task-A holds Mutex-1 and tries to acquire Mutex-2, and where Task-B holds Mutex-2 and tries to acquire Mutex-1. Then the blocked_on->owner chain will go in circles. Task-A -> Mutex-2 ^ | | v Mutex-1 <- Task-B We need two things: - find_proxy_task() to stop iterating the circle; - the woken task to 'unblock' and run, such that it can back-off and re-try the transaction. Now, the current code [without this patch] does: __clear_task_blocked_on(); wake_q_add(); And surely clearing ->blocked_on is sufficient to break the cycle. Suppose it is Task-B that is made to back-off, then we have: Task-A -> Mutex-2 -> Task-B (no further blocked_on) and it would attempt to run Task-B. Or worse, it could directly pick Task-B and run it, without ever getting into find_proxy_task(). Now, here is a problem because Task-B might not be runnable on the CPU it is currently on; and because !task_is_blocked() we don't get into the proxy paths, so nobody is going to fix this up. Ideally we would have dequeued Task-B alongside of clearing ->blocked_on, but alas, [the lock ordering prevents us from getting the task_rq_lock() and] spoils things." Thus we need more than just a binary concept of the task being blocked on a mutex or not. So allow setting blocked_on to PROXY_WAKING as a special value which specifies the task is no longer blocked, but needs to be evaluated for return migration *before* it can be run. This will then be used in a later patch to handle proxy return-migration. Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260324191337.1841376-7-jstultz@google.com
2026-04-03locking: Add task::blocked_lock to serialize blocked_on stateJohn Stultz1-31/+17
So far, we have been able to utilize the mutex::wait_lock for serializing the blocked_on state, but when we move to proxying across runqueues, we will need to add more state and a way to serialize changes to this state in contexts where we don't hold the mutex::wait_lock. So introduce the task::blocked_lock, which nests under the mutex::wait_lock in the locking order, and rework the locking to use it. Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://patch.msgid.link/20260324191337.1841376-5-jstultz@google.com
2026-04-03Revert "usb: cdnsp: Add support for device-only configuration"Greg Kroah-Hartman1-1/+0
This reverts commit 7b7f2dd913829e06705035dfc41ca25fa6ec68d3. There was some problems with an earlier cdns3 change, so this one needs to be backed out as well. Cc: Pawel Laszczak <pawell@cadence.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Reported-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/ac+LEWMCQpLSnfoD@nchen-desktop Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>