| Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 0843ba458439f38efdc14aa359c14ad0127edb01 ]
Fadump allocates memory to pass additional kernel command-line argument
to the fadump kernel. However, this allocation is not needed when fadump
is disabled. So avoid allocating memory for the additional parameter
area in such cases.
Fixes: f4892c68ecc1 ("powerpc/fadump: allocate memory for additional parameters early")
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Fixes: f4892c68ecc1 ("powerpc/fadump: allocate memory for additional parameters early")
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251008032934.262683-1-sourabhjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 3443ff3be6e59b80d74036bb39f5b6409eb23cc9 upstream.
pseries_irq_domain_alloc() allocates interrupts at parent's interrupt
domain. If it fails in the progress, all allocated interrupts are
freed.
The number of successfully allocated interrupts so far is stored
"i". However, "i - 1" interrupts are freed. This is broken:
- One interrupt is not be freed
- If "i" is zero, "i - 1" wraps around
Correct the number of freed interrupts to 'i'.
Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/a980067f2b256bf716b4cd713bc1095966eed8cd.1754300646.git.namcao@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a39087905af9ffecaa237a918a2c03a04e479934 upstream.
pnv_irq_domain_alloc() allocates interrupts at parent's interrupt
domain. If it fails in the progress, all allocated interrupts are
freed.
The number of successfully allocated interrupts so far is stored
"i". However, "i - 1" interrupts are freed. This is broken:
- One interrupt is not be freed
- If "i" is zero, "i - 1" wraps around
Correct the number of freed interrupts to "i".
Fixes: 0fcfe2247e75 ("powerpc/powernv/pci: Add MSI domains")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/70f8debe8688e0b467367db769b71c20146a836d.1754300646.git.namcao@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 661f951e371cc134ea31c84238dbdc9a898b8403 ]
Leon [1] and Vinicius [2] noted a topology_span_sane() warning during
their testing starting from v6.16-rc1. Debug that followed pointed to
the tl->mask() for the NODE domain being incorrectly resolved to that of
the highest NUMA domain.
tl->mask() for NODE is set to the sd_numa_mask() which depends on the
global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
set to the "tl->numa_level" during tl traversal in build_sched_domains()
calling sd_init() but was not reset before topology_span_sane().
Since "tl->numa_level" still reflected the old value from
build_sched_domains(), topology_span_sane() for the NODE domain trips
when the span of the last NUMA domain overlaps.
Instead of replicating the "sched_domains_curr_level" hack, get rid of
it entirely and instead, pass the entire "sched_domain_topology_level"
object to tl->cpumask() function to prevent such mishap in the future.
sd_numa_mask() now directly references "tl->numa_level" instead of
relying on the global "sched_domains_curr_level" hack to index into
sched_domains_numa_masks[].
The original warning was reproducible on the following NUMA topology
reported by Leon:
$ sudo numactl -H
available: 5 nodes (0-4)
node 0 cpus: 0 1
node 0 size: 2927 MB
node 0 free: 1603 MB
node 1 cpus: 2 3
node 1 size: 3023 MB
node 1 free: 3008 MB
node 2 cpus: 4 5
node 2 size: 3023 MB
node 2 free: 3007 MB
node 3 cpus: 6 7
node 3 size: 3023 MB
node 3 free: 3002 MB
node 4 cpus: 8 9
node 4 size: 3022 MB
node 4 free: 2718 MB
node distances:
node 0 1 2 3 4
0: 10 39 38 37 36
1: 39 10 38 37 36
2: 38 38 10 37 36
3: 37 37 37 10 36
4: 36 36 36 36 10
The above topology can be mimicked using the following QEMU cmd that was
used to reproduce the warning and test the fix:
sudo qemu-system-x86_64 -enable-kvm -cpu host \
-m 20G -smp cpus=10,sockets=10 -machine q35 \
-object memory-backend-ram,size=4G,id=m0 \
-object memory-backend-ram,size=4G,id=m1 \
-object memory-backend-ram,size=4G,id=m2 \
-object memory-backend-ram,size=4G,id=m3 \
-object memory-backend-ram,size=4G,id=m4 \
-numa node,cpus=0-1,memdev=m0,nodeid=0 \
-numa node,cpus=2-3,memdev=m1,nodeid=1 \
-numa node,cpus=4-5,memdev=m2,nodeid=2 \
-numa node,cpus=6-7,memdev=m3,nodeid=3 \
-numa node,cpus=8-9,memdev=m4,nodeid=4 \
-numa dist,src=0,dst=1,val=39 \
-numa dist,src=0,dst=2,val=38 \
-numa dist,src=0,dst=3,val=37 \
-numa dist,src=0,dst=4,val=36 \
-numa dist,src=1,dst=0,val=39 \
-numa dist,src=1,dst=2,val=38 \
-numa dist,src=1,dst=3,val=37 \
-numa dist,src=1,dst=4,val=36 \
-numa dist,src=2,dst=0,val=38 \
-numa dist,src=2,dst=1,val=38 \
-numa dist,src=2,dst=3,val=37 \
-numa dist,src=2,dst=4,val=36 \
-numa dist,src=3,dst=0,val=37 \
-numa dist,src=3,dst=1,val=37 \
-numa dist,src=3,dst=2,val=37 \
-numa dist,src=3,dst=4,val=36 \
-numa dist,src=4,dst=0,val=36 \
-numa dist,src=4,dst=1,val=36 \
-numa dist,src=4,dst=2,val=36 \
-numa dist,src=4,dst=3,val=36 \
...
[ prateek: Moved common functions to include/linux/sched/topology.h,
reuse the common bits for s390 and ppc, commit message ]
Closes: https://lore.kernel.org/lkml/20250610110701.GA256154@unreal/ [1]
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") # ce29a7da84cd, f55dac1dafb3
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Valentin Schneider <vschneid@redhat.com> # x86
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> # powerpc
Link: https://lore.kernel.org/lkml/a3de98387abad28592e6ab591f3ff6107fe01dc1.1755893468.git.tim.c.chen@linux.intel.com/ [2]
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f6b4df37ebfeb47e50e27780500d2d06b4d211bd ]
CONFIG_PPC_FTRACE_OUT_OF_LINE introduced setup_ftrace_ool_stubs() to
extend the ppc64le module .stubs section with an array of
ftrace_ool_stub structures for each patchable function.
Fix its ppc64_stub_entry stub reservation loop to properly write across
all of the num_stubs used and not just the first entry.
Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250912142740.3581368-3-joe.lawrence@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5337609a314828aa2474ac359db615f475c4a4d2 ]
When an ftrace call site is converted to a NOP, its corresponding
dyn_ftrace record should have its ftrace_ops pointer set to
ftrace_nop_ops.
Correct the powerpc implementation to ensure the
ftrace_rec_set_nop_ops() helper is called on all successful NOP
initialization paths. This ensures all ftrace records are consistent
before being handled by the ftrace core.
Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Suggested-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250912142740.3581368-2-joe.lawrence@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f2863371f017eb03c230addc253783fa4c7e90f5 ]
Commit 82ef440f9a38 ("powerpc/603: Copy kernel PGD entries into all
PGDIRs and preallocate execmem page tables") was supposed to extend
to powerpc 603 the copy of kernel PGD entries into all PGDIRs
implemented in a previous patch on the 8xx. But 603 is book3s/32 and
uses a duplicate of pgd_alloc() defined in another header.
So really do the copy at the correct place for the 603.
Fixes: 82ef440f9a38 ("powerpc/603: Copy kernel PGD entries into all PGDIRs and preallocate execmem page tables")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/752ab7514cae089a2dd7cc0f3d5e35849f76adb9.1755757797.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d9e46de4bf5c5f987075afd5f240bb2a8a5d71ed ]
Commit ac9f97ff8b32 ("powerpc/8xx: Inconditionally use task PGDIR in
DTLB misses") removed the test that needed the valeur in SPRN_EPN but
failed to remove the read.
Remove it.
And remove related comments, including the very same comment
in InstructionTLBMiss that should have been removed by
commit 33c527522f39 ("powerpc/8xx: Inconditionally use task PGDIR in
ITLB misses").
Also update the comment about absence of a second level table which
has been handled implicitely since commit 5ddb75cee5af ("powerpc/8xx:
remove tests on PGDIR entry validity").
Fixes: ac9f97ff8b32 ("powerpc/8xx: Inconditionally use task PGDIR in DTLB misses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/5811c8d1d6187f280ad140d6c0ad6010e41eeaeb.1755361995.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 46104a7d3ccd2acfe508e661393add0615c27a22 ]
In the upstream commit 214c0eea43b2ea66bcd6467ea57e47ce8874191b
("kbuild: add $(objtree)/ prefix to some in-kernel build artifacts")
artifacts required for building out-of-tree kernel modules had
$(objtree) prepended to them to prepare for building in other
directories.
When building external modules for powerpc,
arch/powerpc/lib/crtsavres.o is required for certain
configurations. This artifact is missing the prepended $(objtree).
Fixes: 13b25489b6f8 ("kbuild: change working directory to external module directory with M=")
Acked-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250218-buildfix-extmod-powerpc-v2-1-1e78fcf12b56@efficios.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bbc46b23af5bb934cd1cf066ef4342cee457a24e ]
With the introduction of clone3 in commit 7f192e3cd316 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit, with a new type of u64 for the flags.
However, for most consumers of clone_flags the interface was not
changed from the previous type of unsigned long.
While this works fine as long as none of the new 64-bit flag bits
(CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still
undesirable in terms of the principle of least surprise.
Thus, this commit fixes all relevant interfaces of the copy_thread
function that is called from copy_process to consistently pass
clone_flags as u64, so that no truncation to 32-bit integers occurs on
32-bit architectures.
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com
Fixes: c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread")
Acked-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org>
Acked-by: Andreas Larsson <andreas@gaisler.com> # sparc
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
This reverts commit 8401a108a63302a5a198c7075d857895ca624851.
I got a report from an (anonymous) Sundance user:
Ethernet controller: Sundance Technology Inc / IC Plus Corp IC Plus IP100A Integrated 10/100 Ethernet MAC + PHY (rev 31)
Revert the driver back in. Make following changes:
- update Denis's email address in MAINTAINERS
- adjust to timer API renames:
- del_timer_sync() -> timer_delete_sync()
- from_timer() -> timer_container_of()
Fixes: 8401a108a633 ("eth: remove the DLink/Sundance (ST201) driver")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250901210818.1025316-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix shellcheck warning such as
"Double quote to prevent globbing and word splitting." and
Use $(...) notation instead of legacy backticks `...`.
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250518044107.39928-4-maddy@linux.ibm.com
|
|
Fix "Double quote to prevent globbing and word splitting."
warning from shellcheck
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Stephen Rothwell <sfr@cab.auug.org.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250518044107.39928-3-maddy@linux.ibm.com
|
|
When compiling for pseries or powernv defconfig with "make C=1",
these warning were reported bu sparse tool in powerpc/kernel/kvm.c
arch/powerpc/kernel/kvm.c:635:9: warning: switch with no cases
arch/powerpc/kernel/kvm.c:646:9: warning: switch with no cases
Currently #ifdef were added after the switch case which are specific
for BOOKE and PPC_BOOK3S_32. These are not enabled in pseries/powernv
defconfig. Fix it by moving the #ifdef before switch(){}
Fixes: cbe487fac7fc0 ("KVM: PPC: Add mtsrin PV code")
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250518044107.39928-1-maddy@linux.ibm.com
|
|
There are two CONFIG_POWERPC64_CPU entries in the "CPU selection"
choice block.
I guess the intent is to display a different prompt depending on
CPU_LITTLE_ENDIAN: "Generic (POWER5 and PowerPC 970 and above)" for big
endian, and "Generic (POWER8 and above)" for little endian.
I stumbled on this tricky use case, and worked around it on Kconfig with
commit 4d46b5b623e0 ("kconfig: fix infinite loop in sym_calc_choice()").
However, I doubt that supporting multiple entries with the same symbol
in a choice block is worth the complexity - this is the only such case
in the kernel tree.
This commit merges the two entries. Once this cleanup is accepted in
the powerpc subsystem, I will proceed to refactor the Kconfig parser.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250607102005.1965409-1-masahiroy@kernel.org
|
|
The extra-y syntax is planned for deprecation because it is similar
to always-y.
When building the boot wrapper, always-y and extra-y are equivalent.
Use always-y instead.
In arch/powerpc/kernel/Makefile, I added ifdef KBUILD_BUILTIN to
keep the current behavior: prom_init_check is skipped when building
only modular objects.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250602163302.478765-1-masahiroy@kernel.org
|
|
Simplify the code to enhance readability and maintain a consistent
coding style.
Signed-off-by: Xichao Zhao <zhao.xichao@vivo.com>
Acked-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250801035908.370463-1-zhao.xichao@vivo.com
|
|
irq_domain_create_simple() takes fwnode as the first argument. It can be
extracted from the struct device using dev_fwnode() helper instead of
using of_node with of_fwnode_handle().
So use the dev_fwnode() helper.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/all/4bc0e1ca-a523-424a-8759-59e353317fba@kernel.org/
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250723062842.1831271-1-jirislaby@kernel.org
|
|
Until commit 6c85f52b10fd ("kvm/ppc: IRQ disabling cleanup"),
kvmppc_prepare_to_enter() was called with interrupts already disabled by
the caller, which was documented in the comment above the function.
Post-cleanup, the function is now called with interrupts enabled, and
disables interrupts itself.
Fix the comment to reflect the current behaviour.
Fixes: 6c85f52b10fd ("kvm/ppc: IRQ disabling cleanup")
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Reviewed-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Reviewed-by: Gautam Menghani <gautam@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
[Fixed the double colon in Reviewed-by line]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250806055607.17081-1-ajd@linux.ibm.com
|
|
The conversion of all GPIO drivers to using the .set_rv() and
.set_multiple_rv() callbacks from struct gpio_chip (which - unlike their
predecessors - return an integer and allow the controller drivers to
indicate failures to users) is now complete and the legacy ones have
been removed. Rename the new callbacks back to their original names in
one sweeping change.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Pull powerpc fixes from Madhavan Srinivasan:
- Fixes for several issues in the powernv PCI hotplug path
- Fix htmldoc generation for htm.rst in toctree
- Add jit support for load_acquire and store_release in ppc64 bpf jit
Thanks to Bjorn Helgaas, Hari Bathini, Puranjay Mohan, Saket Kumar
Bhaskar, Shawn Anastasio, Timothy Pearson, and Vishal Parmar
* tag 'powerpc-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc64/bpf: Add jit support for load_acquire and store_release
docs: powerpc: add htm.rst to toctree
PCI: pnv_php: Enable third attention indicator state
PCI: pnv_php: Fix surprise plug detection and recovery
powerpc/eeh: Make EEH driver device hotplug safe
powerpc/eeh: Export eeh_unfreeze_pe()
PCI: pnv_php: Work around switches with broken presence detection
PCI: pnv_php: Clean up allocated IRQs on unplug
|
|
Pull non-MM updates from Andrew Morton:
"Significant patch series in this pull request:
- "squashfs: Remove page->mapping references" (Matthew Wilcox) gets
us closer to being able to remove page->mapping
- "relayfs: misc changes" (Jason Xing) does some maintenance and
minor feature addition work in relayfs
- "kdump: crashkernel reservation from CMA" (Jiri Bohac) switches
us from static preallocation of the kdump crashkernel's working
memory over to dynamic allocation. So the difficulty of a-priori
estimation of the second kernel's needs is removed and the first
kernel obtains extra memory
- "generalize panic_print's dump function to be used by other
kernel parts" (Feng Tang) implements some consolidation and
rationalization of the various ways in which a failing kernel
splats information at the operator
* tag 'mm-nonmm-stable-2025-08-03-12-47' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (80 commits)
tools/getdelays: add backward compatibility for taskstats version
kho: add test for kexec handover
delaytop: enhance error logging and add PSI feature description
samples: Kconfig: fix spelling mistake "instancess" -> "instances"
fat: fix too many log in fat_chain_add()
scripts/spelling.txt: add notifer||notifier to spelling.txt
xen/xenbus: fix typo "notifer"
net: mvneta: fix typo "notifer"
drm/xe: fix typo "notifer"
cxl: mce: fix typo "notifer"
KVM: x86: fix typo "notifer"
MAINTAINERS: add maintainers for delaytop
ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below()
ucount: fix atomic_long_inc_below() argument type
kexec: enable CMA based contiguous allocation
stackdepot: make max number of pools boot-time configurable
lib/xxhash: remove unused functions
init/Kconfig: restore CONFIG_BROKEN help text
lib/raid6: update recov_rvv.c zero page usage
docs: update docs after introducing delaytop
...
|
|
Pull MM updates from Andrew Morton:
"As usual, many cleanups. The below blurbiage describes 42 patchsets.
21 of those are partially or fully cleanup work. "cleans up",
"cleanup", "maintainability", "rationalizes", etc.
I never knew the MM code was so dirty.
"mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
mapped VMAs were not eligible for merging with existing adjacent
VMAs.
"mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
adds a new kernel module which simplifies the setup and usage of
DAMON in production environments.
"stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
is a cleanup to the writeback code which removes a couple of
pointers from struct writeback_control.
"drivers/base/node.c: optimization and cleanups" (Donet Tom)
contains largely uncorrelated cleanups to the NUMA node setup and
management code.
"mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
does some maintenance work on the userfaultfd code.
"Readahead tweaks for larger folios" (Ryan Roberts)
implements some tuneups for pagecache readahead when it is reading
into order>0 folios.
"selftests/mm: Tweaks to the cow test" (Mark Brown)
provides some cleanups and consistency improvements to the
selftests code.
"Optimize mremap() for large folios" (Dev Jain)
does that. A 37% reduction in execution time was measured in a
memset+mremap+munmap microbenchmark.
"Remove zero_user()" (Matthew Wilcox)
expunges zero_user() in favor of the more modern memzero_page().
"mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
addresses some warts which David noticed in the huge page code.
These were not known to be causing any issues at this time.
"mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD" (SeongJae Park)
provides some cleanup and consolidation work in DAMON.
"use vm_flags_t consistently" (Lorenzo Stoakes)
uses vm_flags_t in places where we were inappropriately using other
types.
"mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
increases the reliability of large page allocation in the memfd
code.
"mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
removes several now-unneeded PFN_* flags.
"mm/damon: decouple sysfs from core" (SeongJae Park)
implememnts some cleanup and maintainability work in the DAMON
sysfs layer.
"madvise cleanup" (Lorenzo Stoakes)
does quite a lot of cleanup/maintenance work in the madvise() code.
"madvise anon_name cleanups" (Vlastimil Babka)
provides additional cleanups on top or Lorenzo's effort.
"Implement numa node notifier" (Oscar Salvador)
creates a standalone notifier for NUMA node memory state changes.
Previously these were lumped under the more general memory
on/offline notifier.
"Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
cleans up the pageblock isolation code and fixes a potential issue
which doesn't seem to cause any problems in practice.
"selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
adds additional drgn- and python-based DAMON selftests which are
more comprehensive than the existing selftest suite.
"Misc rework on hugetlb faulting path" (Oscar Salvador)
fixes a rather obscure deadlock in the hugetlb fault code and
follows that fix with a series of cleanups.
"cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
rationalizes and cleans up the highmem-specific code in the CMA
allocator.
"mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
provides cleanups and future-preparedness to the migration code.
"mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
adds some tracepoints to some DAMON auto-tuning code.
"mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
does that.
"mm/damon: misc cleanups" (SeongJae Park)
also does what it claims.
"mm: folio_pte_batch() improvements" (David Hildenbrand)
cleans up the large folio PTE batching code.
"mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
facilitates dynamic alteration of DAMON's inter-node allocation
policy.
"Remove unmap_and_put_page()" (Vishal Moola)
provides a couple of page->folio conversions.
"mm: per-node proactive reclaim" (Davidlohr Bueso)
implements a per-node control of proactive reclaim - beyond the
current memcg-based implementation.
"mm/damon: remove damon_callback" (SeongJae Park)
replaces the damon_callback interface with a more general and
powerful damon_call()+damos_walk() interface.
"mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
implements a number of mremap cleanups (of course) in preparation
for adding new mremap() functionality: newly permit the remapping
of multiple VMAs when the user is specifying MREMAP_FIXED. It still
excludes some specialized situations where this cannot be performed
reliably.
"drop hugetlb_free_pgd_range()" (Anthony Yznaga)
switches some sparc hugetlb code over to the generic version and
removes the thus-unneeded hugetlb_free_pgd_range().
"mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
augments the present userspace-requested update of DAMON sysfs
monitoring files. Automatic update is now provided, along with a
tunable to control the update interval.
"Some randome fixes and cleanups to swapfile" (Kemeng Shi)
does what is claims.
"mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
provides (and uses) a means by which debug-style functions can grab
a copy of a pageframe and inspect it locklessly without tripping
over the races inherent in operating on the live pageframe
directly.
"use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
addresses the large contention issues which can be triggered by
reads from that procfs file. Latencies are reduced by more than
half in some situations. The series also introduces several new
selftests for the /proc/pid/maps interface.
"__folio_split() clean up" (Zi Yan)
cleans up __folio_split()!
"Optimize mprotect() for large folios" (Dev Jain)
provides some quite large (>3x) speedups to mprotect() when dealing
with large folios.
"selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
does some cleanup work in the selftests code.
"tools/testing: expand mremap testing" (Lorenzo Stoakes)
extends the mremap() selftest in several ways, including adding
more checking of Lorenzo's recently added "permit mremap() move of
multiple VMAs" feature.
"selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
extends the DAMON sysfs interface selftest so that it tests all
possible user-requested parameters. Rather than the present minimal
subset"
* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
MAINTAINERS: add missing headers to mempory policy & migration section
MAINTAINERS: add missing file to cgroup section
MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
MAINTAINERS: add missing zsmalloc file
MAINTAINERS: add missing files to page alloc section
MAINTAINERS: add missing shrinker files
MAINTAINERS: move memremap.[ch] to hotplug section
MAINTAINERS: add missing mm_slot.h file THP section
MAINTAINERS: add missing interval_tree.c to memory mapping section
MAINTAINERS: add missing percpu-internal.h file to per-cpu section
mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
selftests/damon: introduce _common.sh to host shared function
selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
selftests/damon/sysfs.py: test non-default parameters runtime commit
selftests/damon/sysfs.py: generalize DAMON context commit assertion
selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
selftests/damon/sysfs.py: test DAMOS filters commitment
selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
selftests/damon/sysfs.py: test DAMOS destinations commitment
...
|
|
Pull ftrace updates from Steven Rostedt:
- Keep track of when fgraph_ops are registered or not
Keep accounting of when fgraph_ops are registered as if a fgraph_ops
is registered twice it can mess up the accounting and it will not
work as expected later. Trigger a warning if something registers it
twice as to catch bugs before they are found by things just not
working as expected.
- Make DYNAMIC_FTRACE always enabled for architectures that support it
As static ftrace (where all functions are always traced) is very
expensive and only exists to help architectures support ftrace, do
not make it an option. As soon as an architecture supports
DYNAMIC_FTRACE make it use it. This simplifies the code.
- Remove redundant config HAVE_FTRACE_MCOUNT_RECORD
The CONFIG_HAVE_FTRACE_MCOUNT was added to help simplify the
DYNAMIC_FTRACE work, but now every architecture that implements
DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT set too, making it
redundant with the HAVE_DYNAMIC_FTRACE.
- Make pid_ptr string size match the comment
In print_graph_proc() the pid_ptr string is of size 11, but the
comment says /* sign + log10(MAX_INT) + '\0' */ which is actually 12.
* tag 'ftrace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Remove redundant config HAVE_FTRACE_MCOUNT_RECORD
ftrace: Make DYNAMIC_FTRACE always enabled for architectures that support it
fgraph: Keep track of when fgraph_ops are registered or not
fgraph: Make pid_str size match the comment
|
|
Pull bpf updates from Alexei Starovoitov:
- Remove usermode driver (UMD) framework (Thomas Weißschuh)
- Introduce Strongly Connected Component (SCC) in the verifier to
detect loops and refine register liveness (Eduard Zingerman)
- Allow 'void *' cast using bpf_rdonly_cast() and corresponding
'__arg_untrusted' for global function parameters (Eduard Zingerman)
- Improve precision for BPF_ADD and BPF_SUB operations in the verifier
(Harishankar Vishwanathan)
- Teach the verifier that constant pointer to a map cannot be NULL
(Ihor Solodrai)
- Introduce BPF streams for error reporting of various conditions
detected by BPF runtime (Kumar Kartikeya Dwivedi)
- Teach the verifier to insert runtime speculation barrier (lfence on
x86) to mitigate speculative execution instead of rejecting the
programs (Luis Gerhorst)
- Various improvements for 'veristat' (Mykyta Yatsenko)
- For CONFIG_DEBUG_KERNEL config warn on internal verifier errors to
improve bug detection by syzbot (Paul Chaignon)
- Support BPF private stack on arm64 (Puranjay Mohan)
- Introduce bpf_cgroup_read_xattr() kfunc to read xattr of cgroup's
node (Song Liu)
- Introduce kfuncs for read-only string opreations (Viktor Malik)
- Implement show_fdinfo() for bpf_links (Tao Chen)
- Reduce verifier's stack consumption (Yonghong Song)
- Implement mprog API for cgroup-bpf programs (Yonghong Song)
* tag 'bpf-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (192 commits)
selftests/bpf: Migrate fexit_noreturns case into tracing_failure test suite
selftests/bpf: Add selftest for attaching tracing programs to functions in deny list
bpf: Add log for attaching tracing programs to functions in deny list
bpf: Show precise rejected function when attaching fexit/fmod_ret to __noreturn functions
bpf: Fix various typos in verifier.c comments
bpf: Add third round of bounds deduction
selftests/bpf: Test invariants on JSLT crossing sign
selftests/bpf: Test cross-sign 64bits range refinement
selftests/bpf: Update reg_bound range refinement logic
bpf: Improve bounds when s64 crosses sign boundary
bpf: Simplify bounds refinement from s32
selftests/bpf: Enable private stack tests for arm64
bpf, arm64: JIT support for private stack
bpf: Move bpf_jit_get_prog_name() to core.c
bpf, arm64: Fix fp initialization for exception boundary
umd: Remove usermode driver framework
bpf/preload: Don't select USERMODE_DRIVER
selftests/bpf: Fix test dynptr/test_dynptr_memset_xdp_chunks failure
selftests/bpf: Fix test dynptr/test_dynptr_copy_xdp failure
selftests/bpf: Increase xdp data size for arm64 64K page size
...
|
|
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workque
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
|
|
Pull powerpc updates from Madhavan Srinivasan:
- CONFIG_HZ changes to move the base_slice from 10ms to 1ms
- Patchset to move some of the mutex handling to lock guard
- Expose secvars relevant to the key management mode
- Misc cleanups and fixes
Thanks to Ankit Chauhan, Christophe Leroy, Donet Tom, Gautam Menghani,
Haren Myneni, Johan Korsnes, Madadi Vineeth Reddy, Paul Mackerras,
Shrikanth Hegde, Srish Srinivasan, Thomas Fourier, Thomas Huth, Thomas
Weißschuh, Souradeep, Amit Machhiwal, R Nageswara Sastry, Venkat Rao
Bagalkote, Andrew Donnellan, Greg Kroah-Hartman, Mimi Zohar, Mukesh
Kumar Chaurasiya, Nayna Jain, Ritesh Harjani (IBM), Sourabh Jain, Srikar
Dronamraju, Stefan Berger, Tyrel Datwyler, and Kowshik Jois.
* tag 'powerpc-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (23 commits)
arch/powerpc: Remove .interp section in vmlinux
powerpc: Drop GPL boilerplate text with obsolete FSF address
powerpc: Don't use %pK through printk
arch: powerpc: defconfig: Drop obsolete CONFIG_NET_CLS_TCINDEX
misc: ocxl: Replace scnprintf() with sysfs_emit() in sysfs show functions
integrity/platform_certs: Allow loading of keys in the static key management mode
powerpc/secvar: Expose secvars relevant to the key management mode
powerpc/pseries: Correct secvar format representation for static key management
(powerpc/512) Fix possible `dma_unmap_single()` on uninitialized pointer
powerpc: floppy: Add missing checks after DMA map
book3s64/radix : Optimize vmemmap start alignment
book3s64/radix : Handle error conditions properly in radix_vmemmap_populate
powerpc/pseries/dlpar: Search DRC index from ibm,drc-indexes for IO add
KVM: PPC: Book3S HV: Add H_VIRT mapping for tracing exits
powerpc: sysdev: use lock guard for mutex
powerpc: powernv: ocxl: use lock guard for mutex
powerpc: book3s: vas: use lock guard for mutex
powerpc: fadump: use lock guard for mutex
powerpc: rtas: use lock guard for mutex
powerpc: eeh: use lock guard for mutex
...
|
|
Pull scheduler updates from Ingo Molnar:
"Core scheduler changes:
- Better tracking of maximum lag of tasks in presence of different
slices duration, for better handling of lag in the fair scheduler
(Vincent Guittot)
- Clean up and standardize #if/#else/#endif markers throughout the
entire scheduler code base (Ingo Molnar)
- Make SMP unconditional: build the SMP scheduler's data structures
and logic on UP kernel too, even though they are not used, to
simplify the scheduler and remove around 200 #ifdef/[#else]/#endif
blocks from the scheduler (Ingo Molnar)
- Reorganize cgroup bandwidth control interface handling for better
interfacing with sched_ext (Tejun Heo)
Balancing:
- Bump sd->max_newidle_lb_cost when newidle balance fails (Chris
Mason)
- Remove sched_domain_topology_level::flags to simplify the code
(Prateek Nayak)
- Simplify and clean up build_sched_topology() (Li Chen)
- Optimize build_sched_topology() on large machines (Li Chen)
Real-time scheduling:
- Add initial version of proxy execution: a mechanism for
mutex-owning tasks to inherit the scheduling context of higher
priority waiters.
Currently limited to a single runqueue and conditional on
CONFIG_EXPERT, and other limitations (John Stultz, Peter Zijlstra,
Valentin Schneider)
- Deadline scheduler (Juri Lelli):
- Fix dl_servers initialization order (Juri Lelli)
- Fix DL scheduler's root domain reinitialization logic (Juri
Lelli)
- Fix accounting bugs after global limits change (Juri Lelli)
- Fix scalability regression by implementing less agressive
dl_server handling (Peter Zijlstra)
PSI:
- Improve scalability by optimizing psi_group_change() cpu_clock()
usage (Peter Zijlstra)
Rust changes:
- Make Task, CondVar and PollCondVar methods inline to avoid
unnecessary function calls (Kunwu Chan, Panagiotis Foliadis)
- Add might_sleep() support for Rust code: Rust's "#[track_caller]"
mechanism is used so that Rust's might_sleep() doesn't need to be
defined as a macro (Fujita Tomonori)
- Introduce file_from_location() (Boqun Feng)
Debugging & instrumentation:
- Make clangd usable with scheduler source code files again (Peter
Zijlstra)
- tools: Add root_domains_dump.py which dumps root domains info (Juri
Lelli)
- tools: Add dl_bw_dump.py for printing bandwidth accounting info
(Juri Lelli)
Misc cleanups & fixes:
- Remove play_idle() (Feng Lee)
- Fix check_preemption_disabled() (Sebastian Andrzej Siewior)
- Do not call __put_task_struct() on RT if pi_blocked_on is set (Luis
Claudio R. Goncalves)
- Correct the comment in place_entity() (wang wei)"
* tag 'sched-core-2025-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (84 commits)
sched/idle: Remove play_idle()
sched: Do not call __put_task_struct() on rt if pi_blocked_on is set
sched: Start blocked_on chain processing in find_proxy_task()
sched: Fix proxy/current (push,pull)ability
sched: Add an initial sketch of the find_proxy_task() function
sched: Fix runtime accounting w/ split exec & sched contexts
sched: Move update_curr_task logic into update_curr_se
locking/mutex: Add p->blocked_on wrappers for correctness checks
locking/mutex: Rework task_struct::blocked_on
sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
sched/topology: Remove sched_domain_topology_level::flags
x86/smpboot: avoid SMT domain attach/destroy if SMT is not enabled
x86/smpboot: moves x86_topology to static initialize and truncate
x86/smpboot: remove redundant CONFIG_SCHED_SMT
smpboot: introduce SDTL_INIT() helper to tidy sched topology setup
tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info
tools/sched: Add root_domains_dump.py which dumps root domains info
sched/deadline: Fix accounting after global limits change
sched/deadline: Reset extra_bw to max_bw when clearing root domains
sched/deadline: Initialize dl_servers after SMP
...
|
|
Pull driver core updates from Danilo Krummrich:
"debugfs:
- Remove unneeded debugfs_file_{get,put}() instances
- Remove last remnants of debugfs_real_fops()
- Allow storing non-const void * in struct debugfs_inode_info::aux
sysfs:
- Switch back to attribute_group::bin_attrs (treewide)
- Switch back to bin_attribute::read()/write() (treewide)
- Constify internal references to 'struct bin_attribute'
Support cache-ids for device-tree systems:
- Add arch hook arch_compact_of_hwid()
- Use arch_compact_of_hwid() to compact MPIDR values on arm64
Rust:
- Device:
- Introduce CoreInternal device context (for bus internal methods)
- Provide generic drvdata accessors for bus devices
- Provide Driver::unbind() callbacks
- Use the infrastructure above for auxiliary, PCI and platform
- Implement Device::as_bound()
- Rename Device::as_ref() to Device::from_raw() (treewide)
- Implement fwnode and device property abstractions
- Implement example usage in the Rust platform sample driver
- Devres:
- Remove the inner reference count (Arc) and use pin-init instead
- Replace Devres::new_foreign_owned() with devres::register()
- Require T to be Send in Devres<T>
- Initialize the data kept inside a Devres last
- Provide an accessor for the Devres associated Device
- Device ID:
- Add support for ACPI device IDs and driver match tables
- Split up generic device ID infrastructure
- Use generic device ID infrastructure in net::phy
- DMA:
- Implement the dma::Device trait
- Add DMA mask accessors to dma::Device
- Implement dma::Device for PCI and platform devices
- Use DMA masks from the DMA sample module
- I/O:
- Implement abstraction for resource regions (struct resource)
- Implement resource-based ioremap() abstractions
- Provide platform device accessors for I/O (remap) requests
- Misc:
- Support fallible PinInit types in Revocable
- Implement Wrapper<T> for Opaque<T>
- Merge pin-init blanket dependencies (for Devres)
Misc:
- Fix OF node leak in auxiliary_device_create()
- Use util macros in device property iterators
- Improve kobject sample code
- Add device_link_test() for testing device link flags
- Fix typo in Documentation/ABI/testing/sysfs-kernel-address_bits
- Hint to prefer container_of_const() over container_of()"
* tag 'driver-core-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (84 commits)
rust: io: fix broken intra-doc links to `platform::Device`
rust: io: fix broken intra-doc link to missing `flags` module
rust: io: mem: enable IoRequest doc-tests
rust: platform: add resource accessors
rust: io: mem: add a generic iomem abstraction
rust: io: add resource abstraction
rust: samples: dma: set DMA mask
rust: platform: implement the `dma::Device` trait
rust: pci: implement the `dma::Device` trait
rust: dma: add DMA addressing capabilities
rust: dma: implement `dma::Device` trait
rust: net::phy Change module_phy_driver macro to use module_device_table macro
rust: net::phy represent DeviceId as transparent wrapper over mdio_device_id
rust: device_id: split out index support into a separate trait
device: rust: rename Device::as_ref() to Device::from_raw()
arm64: cacheinfo: Provide helper to compress MPIDR value into u32
cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
cacheinfo: Set cache 'id' based on DT data
container_of: Document container_of() is not to be used in new code
driver core: auxiliary bus: fix OF node leak
...
|
|
Pull tty / serial driver updates from Greg KH:
"Here is the big set of TTY and Serial driver updates for 6.17-rc1.
Included in here is the following types of changes:
- another cleanup round from Jiri for the 8250 serial driver and some
other tty drivers, things are slowly getting better with our apis
thanks to this work. This touched many tty drivers all over the
tree.
- qcom_geni_serial driver update for new platforms and devices
- 8250 quirk handling fixups
- dt serial binding updates for different boards/platforms
- other minor cleanups and fixes
All of these have been in linux-next with no reported issues"
* tag 'tty-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (79 commits)
dt-bindings: serial: snps-dw-apb-uart: Allow use of a power-domain
serial: 8250: fix panic due to PSLVERR
dt-bindings: serial: samsung: add samsung,exynos2200-uart compatible
vt: defkeymap: Map keycodes above 127 to K_HOLE
vt: keyboard: Don't process Unicode characters in K_OFF mode
serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms
serial: qcom-geni: Enable PM runtime for serial driver
serial: qcom-geni: move clock-rate logic to separate function
serial: qcom-geni: move resource control logic to separate functions
serial: qcom-geni: move resource initialization to separate function
soc: qcom: geni-se: Enable QUPs on SA8255p Qualcomm platforms
dt-bindings: qcom: geni-se: describe SA8255p
dt-bindings: serial: describe SA8255p
serial: 8250_dw: Fix typo "notifer"
dt-bindings: serial: 8250: spacemit: set clocks property as required
dt-bindings: serial: renesas: Document RZ/V2N SCIF
serial: 8250_ce4100: Fix CONFIG_SERIAL_8250=n build
tty: omit need_resched() before cond_resched()
serial: 8250_ni: Reorder local variables
serial: 8250_ni: Fix build warning
...
|
|
Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.17. The main focus
this cycle is on reorganizing the SHA-1 and SHA-2 code, providing
high-quality library APIs for SHA-1 and SHA-2 including HMAC support,
and establishing conventions for lib/crypto/ going forward:
- Migrate the SHA-1 and SHA-512 code (and also SHA-384 which shares
most of the SHA-512 code) into lib/crypto/. This includes both the
generic and architecture-optimized code. Greatly simplify how the
architecture-optimized code is integrated. Add an easy-to-use
library API for each SHA variant, including HMAC support. Finally,
reimplement the crypto_shash support on top of the library API.
- Apply the same reorganization to the SHA-256 code (and also SHA-224
which shares most of the SHA-256 code). This is a somewhat smaller
change, due to my earlier work on SHA-256. But this brings in all
the same additional improvements that I made for SHA-1 and SHA-512.
There are also some smaller changes:
- Move the architecture-optimized ChaCha, Poly1305, and BLAKE2s code
from arch/$(SRCARCH)/lib/crypto/ to lib/crypto/$(SRCARCH)/. For
these algorithms it's just a move, not a full reorganization yet.
- Fix the MIPS chacha-core.S to build with the clang assembler.
- Fix the Poly1305 functions to work in all contexts.
- Fix a performance regression in the x86_64 Poly1305 code.
- Clean up the x86_64 SHA-NI optimized SHA-1 assembly code.
Note that since the new organization of the SHA code is much simpler,
the diffstat of this pull request is negative, despite the addition of
new fully-documented library APIs for multiple SHA and HMAC-SHA
variants.
These APIs will allow further simplifications across the kernel as
users start using them instead of the old-school crypto API. (I've
already written a lot of such conversion patches, removing over 1000
more lines of code. But most of those will target 6.18 or later)"
* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (67 commits)
lib/crypto: arm64/sha512-ce: Drop compatibility macros for older binutils
lib/crypto: x86/sha1-ni: Convert to use rounds macros
lib/crypto: x86/sha1-ni: Minor optimizations and cleanup
crypto: sha1 - Remove sha1_base.h
lib/crypto: x86/sha1: Migrate optimized code into library
lib/crypto: sparc/sha1: Migrate optimized code into library
lib/crypto: s390/sha1: Migrate optimized code into library
lib/crypto: powerpc/sha1: Migrate optimized code into library
lib/crypto: mips/sha1: Migrate optimized code into library
lib/crypto: arm64/sha1: Migrate optimized code into library
lib/crypto: arm/sha1: Migrate optimized code into library
crypto: sha1 - Use same state format as legacy drivers
crypto: sha1 - Wrap library and add HMAC support
lib/crypto: sha1: Add HMAC support
lib/crypto: sha1: Add SHA-1 library functions
lib/crypto: sha1: Rename sha1_init() to sha1_init_raw()
crypto: x86/sha1 - Rename conflicting symbol
lib/crypto: sha2: Add hmac_sha*_init_usingrawkey()
lib/crypto: arm/poly1305: Remove unneeded empty weak function
lib/crypto: x86/poly1305: Fix performance regression on short messages
...
|
|
Pull CRC updates from Eric Biggers:
- Reorganize the architecture-optimized CRC code
It now lives in lib/crc/$(SRCARCH)/ rather than arch/$(SRCARCH)/lib/,
and it is no longer artificially split into separate generic and arch
modules. This allows better inlining and dead code elimination
The generic CRC code is also no longer exported, simplifying the API.
(This mirrors the similar changes to SHA-1 and SHA-2 in lib/crypto/,
which can be found in the "Crypto library updates" pull request)
- Improve crc32c() performance on newer x86_64 CPUs on long messages by
enabling the VPCLMULQDQ optimized code
- Simplify the crypto_shash wrappers for crc32_le() and crc32c()
Register just one shash algorithm for each that uses the (fully
optimized) library functions, instead of unnecessarily providing
direct access to the generic CRC code
- Remove unused and obsolete drivers for hardware CRC engines
- Remove CRC-32 combination functions that are no longer used
- Add kerneldoc for crc32_le(), crc32_be(), and crc32c()
- Convert the crc32() macro to an inline function
* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (26 commits)
lib/crc: x86/crc32c: Enable VPCLMULQDQ optimization where beneficial
lib/crc: x86: Reorganize crc-pclmul static_call initialization
lib/crc: crc64: Add include/linux/crc64.h to kernel-api.rst
lib/crc: crc32: Change crc32() from macro to inline function and remove cast
nvmem: layouts: Switch from crc32() to crc32_le()
lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c()
lib/crc: Explicitly include <linux/export.h>
lib/crc: Remove ARCH_HAS_* kconfig symbols
lib/crc: x86: Migrate optimized CRC code into lib/crc/
lib/crc: sparc: Migrate optimized CRC code into lib/crc/
lib/crc: s390: Migrate optimized CRC code into lib/crc/
lib/crc: riscv: Migrate optimized CRC code into lib/crc/
lib/crc: powerpc: Migrate optimized CRC code into lib/crc/
lib/crc: mips: Migrate optimized CRC code into lib/crc/
lib/crc: loongarch: Migrate optimized CRC code into lib/crc/
lib/crc: arm64: Migrate optimized CRC code into lib/crc/
lib/crc: arm: Migrate optimized CRC code into lib/crc/
lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/
lib/crc: Move files into lib/crc/
lib/crc32: Remove unused combination support
...
|
|
Pull hardening updates from Kees Cook:
- Introduce and start using TRAILING_OVERLAP() helper for fixing
embedded flex array instances (Gustavo A. R. Silva)
- mux: Convert mux_control_ops to a flex array member in mux_chip
(Thorsten Blum)
- string: Group str_has_prefix() and strstarts() (Andy Shevchenko)
- Remove KCOV instrumentation from __init and __head (Ritesh Harjani,
Kees Cook)
- Refactor and rename stackleak feature to support Clang
- Add KUnit test for seq_buf API
- Fix KUnit fortify test under LTO
* tag 'hardening-v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
sched/task_stack: Add missing const qualifier to end_of_stack()
kstack_erase: Support Clang stack depth tracking
kstack_erase: Add -mgeneral-regs-only to silence Clang warnings
init.h: Disable sanitizer coverage for __init and __head
kstack_erase: Disable kstack_erase for all of arm compressed boot code
x86: Handle KCOV __init vs inline mismatches
arm64: Handle KCOV __init vs inline mismatches
s390: Handle KCOV __init vs inline mismatches
arm: Handle KCOV __init vs inline mismatches
mips: Handle KCOV __init vs inline mismatch
powerpc/mm/book3s64: Move kfence and debug_pagealloc related calls to __init section
configs/hardening: Enable CONFIG_INIT_ON_FREE_DEFAULT_ON
configs/hardening: Enable CONFIG_KSTACK_ERASE
stackleak: Split KSTACK_ERASE_CFLAGS from GCC_PLUGINS_CFLAGS
stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth
stackleak: Rename STACKLEAK to KSTACK_ERASE
seq_buf: Introduce KUnit tests
string: Group str_has_prefix() and strstarts()
kunit/fortify: Add back "volatile" for sizeof() constants
acpi: nfit: intel: avoid multiple -Wflex-array-member-not-at-end warnings
...
|
|
Pull execve updates from Kees Cook:
- Introduce regular REGSET note macros arch-wide (Dave Martin)
- Remove arbitrary 4K limitation of program header size (Yin Fengwei)
- Reorder function qualifiers for copy_clone_args_from_user() (Dishank Jogi)
* tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (25 commits)
fork: reorder function qualifiers for copy_clone_args_from_user
binfmt_elf: remove the 4k limitation of program header size
binfmt_elf: Warn on missing or suspicious regset note names
xtensa: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
um: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
x86/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sparc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
sh: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
s390/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
riscv: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
powerpc/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
parisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
openrisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
nios2: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
MIPS: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
m68k: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
LoongArch: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
hexagon: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
csky: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
arm64: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names
...
|
|
Pull fileattr updates from Christian Brauner:
"This introduces the new file_getattr() and file_setattr() system calls
after lengthy discussions.
Both system calls serve as successors and extensible companions to
the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR system calls which have
started to show their age in addition to being named in a way that
makes it easy to conflate them with extended attribute related
operations.
These syscalls allow userspace to set filesystem inode attributes on
special files. One of the usage examples is the XFS quota projects.
XFS has project quotas which could be attached to a directory. All new
inodes in these directories inherit project ID set on parent
directory.
The project is created from userspace by opening and calling
FS_IOC_FSSETXATTR on each inode. This is not possible for special
files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
with empty project ID. Those inodes then are not shown in the quota
accounting but still exist in the directory. This is not critical but
in the case when special files are created in the directory with
already existing project quota, these new inodes inherit extended
attributes. This creates a mix of special files with and without
attributes. Moreover, special files with attributes don't have a
possibility to become clear or change the attributes. This, in turn,
prevents userspace from re-creating quota project on these existing
files.
In addition, these new system calls allow the implementation of
additional attributes that we couldn't or didn't want to fit into the
legacy ioctls anymore"
* tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: tighten a sanity check in file_attr_to_fileattr()
tree-wide: s/struct fileattr/struct file_kattr/g
fs: introduce file_getattr and file_setattr syscalls
fs: prepare for extending file_get/setattr()
fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
selinux: implement inode_file_[g|s]etattr hooks
lsm: introduce new hooks for setting/getting inode fsxattr
fs: split fileattr related helpers into separate file
|
|
Pull simple_recursive_removal() update from Al Viro:
"Removing subtrees of kernel filesystems is done in quite a few places;
unfortunately, it's easy to get wrong. A number of open-coded attempts
are out there, with varying amount of bogosities.
simple_recursive_removal() had been introduced for doing that with all
precautions needed; it does an equivalent of rm -rf, with sufficient
locking, eviction of anything mounted on top of the subtree, etc.
This series converts a bunch of open-coded instances to using that"
* tag 'pull-simple_recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
functionfs, gadgetfs: use simple_recursive_removal()
kill binderfs_remove_file()
fuse_ctl: use simple_recursive_removal()
pstore: switch to locked_recursive_removal()
binfmt_misc: switch to locked_recursive_removal()
spufs: switch to locked_recursive_removal()
add locked_recursive_removal()
better lockdep annotations for simple_recursive_removal()
simple_recursive_removal(): saner interaction with fsnotify
|
|
Add JIT support for the load_acquire and store_release instructions. The
implementation is similar to the kernel where:
load_acquire => plain load -> lwsync
store_release => lwsync -> plain store
To test the correctness of the implementation, following selftests were
run:
[fedora@linux-kernel bpf]$ sudo ./test_progs -a \
verifier_load_acquire,verifier_store_release,atomics
#11/1 atomics/add:OK
#11/2 atomics/sub:OK
#11/3 atomics/and:OK
#11/4 atomics/or:OK
#11/5 atomics/xor:OK
#11/6 atomics/cmpxchg:OK
#11/7 atomics/xchg:OK
#11 atomics:OK
#519/1 verifier_load_acquire/load-acquire, 8-bit:OK
#519/2 verifier_load_acquire/load-acquire, 8-bit @unpriv:OK
#519/3 verifier_load_acquire/load-acquire, 16-bit:OK
#519/4 verifier_load_acquire/load-acquire, 16-bit @unpriv:OK
#519/5 verifier_load_acquire/load-acquire, 32-bit:OK
#519/6 verifier_load_acquire/load-acquire, 32-bit @unpriv:OK
#519/7 verifier_load_acquire/load-acquire, 64-bit:OK
#519/8 verifier_load_acquire/load-acquire, 64-bit @unpriv:OK
#519/9 verifier_load_acquire/load-acquire with uninitialized
src_reg:OK
#519/10 verifier_load_acquire/load-acquire with uninitialized src_reg
@unpriv:OK
#519/11 verifier_load_acquire/load-acquire with non-pointer src_reg:OK
#519/12 verifier_load_acquire/load-acquire with non-pointer src_reg
@unpriv:OK
#519/13 verifier_load_acquire/misaligned load-acquire:OK
#519/14 verifier_load_acquire/misaligned load-acquire @unpriv:OK
#519/15 verifier_load_acquire/load-acquire from ctx pointer:OK
#519/16 verifier_load_acquire/load-acquire from ctx pointer @unpriv:OK
#519/17 verifier_load_acquire/load-acquire with invalid register R15:OK
#519/18 verifier_load_acquire/load-acquire with invalid register R15
@unpriv:OK
#519/19 verifier_load_acquire/load-acquire from pkt pointer:OK
#519/20 verifier_load_acquire/load-acquire from flow_keys pointer:OK
#519/21 verifier_load_acquire/load-acquire from sock pointer:OK
#519 verifier_load_acquire:OK
#556/1 verifier_store_release/store-release, 8-bit:OK
#556/2 verifier_store_release/store-release, 8-bit @unpriv:OK
#556/3 verifier_store_release/store-release, 16-bit:OK
#556/4 verifier_store_release/store-release, 16-bit @unpriv:OK
#556/5 verifier_store_release/store-release, 32-bit:OK
#556/6 verifier_store_release/store-release, 32-bit @unpriv:OK
#556/7 verifier_store_release/store-release, 64-bit:OK
#556/8 verifier_store_release/store-release, 64-bit @unpriv:OK
#556/9 verifier_store_release/store-release with uninitialized
src_reg:OK
#556/10 verifier_store_release/store-release with uninitialized src_reg
@unpriv:OK
#556/11 verifier_store_release/store-release with uninitialized
dst_reg:OK
#556/12 verifier_store_release/store-release with uninitialized dst_reg
@unpriv:OK
#556/13 verifier_store_release/store-release with non-pointer
dst_reg:OK
#556/14 verifier_store_release/store-release with non-pointer dst_reg
@unpriv:OK
#556/15 verifier_store_release/misaligned store-release:OK
#556/16 verifier_store_release/misaligned store-release @unpriv:OK
#556/17 verifier_store_release/store-release to ctx pointer:OK
#556/18 verifier_store_release/store-release to ctx pointer @unpriv:OK
#556/19 verifier_store_release/store-release, leak pointer to stack:OK
#556/20 verifier_store_release/store-release, leak pointer to stack
@unpriv:OK
#556/21 verifier_store_release/store-release, leak pointer to map:OK
#556/22 verifier_store_release/store-release, leak pointer to map
@unpriv:OK
#556/23 verifier_store_release/store-release with invalid register
R15:OK
#556/24 verifier_store_release/store-release with invalid register R15
@unpriv:OK
#556/25 verifier_store_release/store-release to pkt pointer:OK
#556/26 verifier_store_release/store-release to flow_keys pointer:OK
#556/27 verifier_store_release/store-release to sock pointer:OK
#556 verifier_store_release:OK
Summary: 3/55 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250717202935.29018-2-puranjay@kernel.org
|
|
The existing PowerNV hotplug code did not handle surprise plug events
correctly, leading to a complete failure of the hotplug system after device
removal and a required reboot to detect new devices.
This comes down to two issues:
1) When a device is surprise removed, often the bridge upstream
port will cause a PE freeze on the PHB. If this freeze is not
cleared, the MSI interrupts from the bridge hotplug notification
logic will not be received by the kernel, stalling all plug events
on all slots associated with the PE.
2) When a device is removed from a slot, regardless of surprise or
programmatic removal, the associated PHB/PE ls left frozen.
If this freeze is not cleared via a fundamental reset, skiboot
is unable to clear the freeze and cannot retrain / rescan the
slot. This also requires a reboot to clear the freeze and redetect
the device in the slot.
Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[bhelgaas: tidy comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/171044224.1359864.1752615546988.JavaMail.zimbra@raptorengineeringinc.com
|
|
Multiple race conditions existed between the PCIe hotplug driver and the
EEH driver, leading to a variety of kernel oopses of the same general
nature:
<pcie device unplug>
<eeh driver trigger>
<hotplug removal trigger>
<pcie tree reconfiguration>
<eeh recovery next step>
<oops in EEH driver bus iteration loop>
A second class of oops is also seen when the underlying bus disappears
during device recovery.
Refactor the EEH module to be PCI rescan and remove safe. Also clean
up a few minor formatting / readability issues.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1334208367.1359861.1752615503144.JavaMail.zimbra@raptorengineeringinc.com
|
|
The PowerNV hotplug driver needs to be able to clear any frozen PE(s)
on the PHB after suprise removal of a downstream device.
Export the eeh_unfreeze_pe() symbol to allow implementation of this
functionality in the php_nv module.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/1778535414.1359858.1752615454618.JavaMail.zimbra@raptorengineeringinc.com
|
|
Ftrace is tightly coupled with architecture specific code because it
requires the use of trampolines written in assembly. This means that when
a new feature or optimization is made, it must be done for all
architectures. To simplify the approach, CONFIG_HAVE_FTRACE_* configs are
added to denote which architecture has the new enhancement so that other
architectures can still function until they too have been updated.
The CONFIG_HAVE_FTRACE_MCOUNT was added to help simplify the
DYNAMIC_FTRACE work, but now every architecture that implements
DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT set too, making it redundant
with the HAVE_DYNAMIC_FTRACE.
Remove the HAVE_FTRACE_MCOUNT config and use DYNAMIC_FTRACE directly where
applicable.
Link: https://lore.kernel.org/all/20250703154916.48e3ada7@gandalf.local.home/
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/20250704104838.27a18690@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
This patch enables batched RX buffer replenishment in ibmveth by
using the new firmware-supported h_add_logical_lan_buffers() hcall
to submit up to 8 RX buffers in a single call, instead of repeatedly
calling the single-buffer h_add_logical_lan_buffer() hcall.
During the probe, with the patch, the driver queries ILLAN attributes
to detect IBMVETH_ILLAN_RX_MULTI_BUFF_SUPPORT bit. If the attribute is
present, rx_buffers_per_hcall is set to 8, enabling batched replenishment.
Otherwise, it defaults to 1, preserving the original upstream behavior
with no change in code flow for unsupported systems.
The core rx replenish logic remains the same. But when batching
is enabled, the driver aggregates up to 8 fully prepared descriptors
into a single h_add_logical_lan_buffers() hypercall. If any allocation
or DMA mapping fails while preparing a batch, only the successfully
prepared buffers are submitted, and the remaining are deferred for
the next replenish cycle.
If at runtime the firmware stops accepting the batched hcall—e,g,
after a Live Partition Migration (LPM) to a host that does not
support h_add_logical_lan_buffers(), the hypercall returns H_FUNCTION.
In that case, the driver transparently disables batching, resets
rx_buffers_per_hcall to 1, and falls back to the single-buffer hcall
in next future replenishments to take care of these and future buffers.
Test were done on systems with firmware that both supports and
does not support the new h_add_logical_lan_buffers hcall.
On supported firmware, this reduces hypercall overhead significantly
over multiple buffers. SAR measurements showed about a 15% improvement
in packet processing rate under moderate RX load, with heavier traffic
seeing gains more than 30%
Signed-off-by: Mingming Cao <mmc@linux.ibm.com>
Reviewed-by: Brian King <bjking1@linux.ibm.com>
Reviewed-by: Haren Myneni <haren@linux.ibm.com>
Reviewed-by: Dave Marquardt <davemarq@linux.ibm.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250719091356.57252-1-mmc@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When building with CONFIG_RELOCATABLE, there is a .interp section
which contains the name of the expected ELF interpreter:
Contents of section .interp:
c0000000021c1bac 2f757372 2f6c6962 2f6c642e 736f2e31 /usr/lib/ld.so.1
c0000000021c1bbc 00 .
That information is useless and even likely wrong. Remove it.
Link: https://github.com/linuxppc/issues/issues/434
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/eeaf8fd6628a75d19872ab31cf7e7179e2baef5e.1751366959.git.christophe.leroy@csgroup.eu
|
|
The FSF does not reside in the Franklin street anymore, so we should not
request the people to write to this address. Fortunately, these header
files already contain a proper SPDX license identifier, so it should be
fine to simply drop all of this license boilerplate code here.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250711072553.198777-1-thuth@redhat.com
|
|
In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.
Switch to the regular pointer formatting which is safer and
easier to reason about.
Link: https://lore.kernel.org/lkml/20250113171731-dc10e3c1-da64-4af0-b767-7c7070468023@linutronix.de/
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250718-restricted-pointers-powerpc-v2-1-fd7bddd809f3@linutronix.de
|
|
Move a few kfence and debug_pagealloc related functions in hash_utils.c
and radix_pgtable.c to __init sections since these are only invoked once
by an __init function during system initialization.
i.e.
- hash_debug_pagealloc_alloc_slots()
- hash_kfence_alloc_pool()
- hash_kfence_map_pool()
The above 3 functions only gets called by __init htab_initialize().
- alloc_kfence_pool()
- map_kfence_pool()
The above 2 functions only gets called by __init radix_init_pgtable()
This should also help fix warning msgs like:
>> WARNING: modpost: vmlinux: section mismatch in reference:
hash_debug_pagealloc_alloc_slots+0xb0 (section: .text) ->
memblock_alloc_try_nid (section: .init.text)
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202504190552.mnFGs5sj-lkp@intel.com/
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20250717232519.2984886-8-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Patch series "kdump: crashkernel reservation from CMA", v5.
This series implements a way to reserve additional crash kernel memory
using CMA.
Currently, all the memory for the crash kernel is not usable by the 1st
(production) kernel. It is also unmapped so that it can't be corrupted by
the fault that will eventually trigger the crash. This makes sense for
the memory actually used by the kexec-loaded crash kernel image and initrd
and the data prepared during the load (vmcoreinfo, ...). However, the
reserved space needs to be much larger than that to provide enough
run-time memory for the crash kernel and the kdump userspace. Estimating
the amount of memory to reserve is difficult. Being too careful makes
kdump likely to end in OOM, being too generous takes even more memory from
the production system. Also, the reservation only allows reserving a
single contiguous block (or two with the "low" suffix). I've seen systems
where this fails because the physical memory is fragmented.
By reserving additional crashkernel memory from CMA, the main crashkernel
reservation can be just large enough to fit the kernel and initrd image,
minimizing the memory taken away from the production system. Most of the
run-time memory for the crash kernel will be memory previously available
to userspace in the production system. As this memory is no longer
wasted, the reservation can be done with a generous margin, making kdump
more reliable. Kernel memory that we need to preserve for dumping is
normally not allocated from CMA, unless it is explicitly allocated as
movable. Currently this is only the case for memory ballooning and zswap.
Such movable memory will be missing from the vmcore. User data is
typically not dumped by makedumpfile. When dumping of user data is
intended this new CMA reservation cannot be used.
There are five patches in this series:
The first adds a new ",cma" suffix to the recenly introduced generic
crashkernel parsing code. parse_crashkernel() takes one more argument to
store the cma reservation size.
The second patch implements reserve_crashkernel_cma() which performs the
reservation. If the requested size is not available in a single range,
multiple smaller ranges will be reserved.
The third patch updates Documentation/, explicitly mentioning the
potential DMA corruption of the CMA-reserved memory.
The fourth patch adds a short delay before booting the kdump kernel,
allowing pending DMA transfers to finish.
The fifth patch enables the functionality for x86 as a proof of
concept. There are just three things every arch needs to do:
- call reserve_crashkernel_cma()
- include the CMA-reserved ranges in the physical memory map
- exclude the CMA-reserved ranges from the memory available
through /proc/vmcore by excluding them from the vmcoreinfo
PT_LOAD ranges.
Adding other architectures is easy and I can do that as soon as this
series is merged.
With this series applied, specifying
crashkernel=100M craskhernel=1G,cma
on the command line will make a standard crashkernel reservation
of 100M, where kexec will load the kernel and initrd.
An additional 1G will be reserved from CMA, still usable by the production
system. The crash kernel will have 1.1G memory available. The 100M can
be reliably predicted based on the size of the kernel and initrd.
The new cma suffix is completely optional. When no
crashkernel=size,cma is specified, everything works as before.
This patch (of 5):
Add a new cma_size parameter to parse_crashkernel(). When not NULL, call
__parse_crashkernel to parse the CMA reservation size from
"crashkernel=size,cma" and store it in cma_size.
Set cma_size to NULL in all calls to parse_crashkernel().
Link: https://lkml.kernel.org/r/aEqnxxfLZMllMC8I@dwarf.suse.cz
Link: https://lkml.kernel.org/r/aEqoQckgoTQNULnh@dwarf.suse.cz
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Philipp Rudo <prudo@redhat.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Instead of having the core code guess the note name for each regset,
use USER_REGSET_NOTE_TYPE() to pick the correct name from elf.h.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Link: https://lore.kernel.org/r/20250701135616.29630-16-Dave.Martin@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, instead just implement the
sha1_blocks() library function. This is much simpler, it makes the
SHA-1 library functions be powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default. SHA-1 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Avoid merge conflicts
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
|