aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
8 daysMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds14-30/+110
Pull kvm fixes from Paolo Bonzini: "arm64: - Fix ITS EventID sanitisation when restoring an interrupt translation table. - Fix PPI memory leak when failing to initialise a vcpu. - Correctly return an error when the validation of a hypervisor trace descriptor fails, and limit this validation to protected mode only. RISC-V: - Fix invalid HVA warning in steal-time recording - Return SBI_ERR_FAILURE to guest upon OOM in pmu_event_info() and pmu_snapshot_set_shmem() - Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler - Fix sign extension of value for MMIO loads s390: - Fix bugs in vSIE (nested virtualization) and UCONTROL, caused by the page table rewrite. x86: - Apply erratum #1235 workaround (disable AVIC IPI virtualization) on Hygon Family 18h, just like on AMD Family 17h. - When KVM_CAP_X86_APIC_BUS_CYCLES_NS is queried on a specific VM, return the VM's configured APIC bus frequency instead of the default. This is less confusing (read: not wrong) and makes it easier to fill in CPUID information that communicates the APIC bus frequency to the guest. Selftests: - Do not include glibc-internal <bits/endian.h>; it worked by chance and broke building KVM selftests with musl" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Disable AVIC IPI virtualization on Hygon Family 18h (erratum #1235) KVM: selftests: Verify that KVM returns the configured APIC cycle length KVM: x86: Return the VM's configured APIC bus frequency when queried KVM: selftests: elf: Include <endian.h> instead of <bits/endian.h> KVM: s390: Properly reset zero bit in PGSTE KVM: s390: vsie: Fix redundant rmap entries KVM: s390: vsie: Fix unshadowing logic KVM: s390: Fix leaking kvm_s390_mmu_cache in case of errors KVM: s390: vsie: Fix memory leak when unshadowing KVM: arm64: Fix nVHE/pKVM hyp tracing error on invalid desc KVM: arm64: vgic: Free private_irqs when init fails after allocation KVM: arm64: vgic-its: Reject restored DTE with out-of-range num_eventid_bits RISC-V: KVM: Fix sign extension for MMIO loads RISC-V: KVM: Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler riscv: kvm: return SBI_ERR_FAILURE for pmu_event_info() when OOM riscv: kvm: return SBI_ERR_FAILURE for pmu_snapshot_set_shmem() when OOM RISC-V: KVM: Fix invalid HVA warning in steal-time recording
8 daysMerge tag 'x86-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds14-69/+134
Pull x86 fixes from Ingo Molnar: - On SEV guests, handle set_memory_{encrypted,decrypted}() failures more conservatively by assuming that all affected pages are unencrypted (Carlos López) - Disable broadcast TLB flush when PCID is disabled (Tom Lendacky) - Fix VMX vs. hrtimer_rearm_deferred() regression (Peter Zijlstra) - Move IRQ/NMI dispatch code from KVM into x86 core, to prepare for a KVM x2apic fix (Peter Zijlstra) - Fix incorrect munmap() size on map_vdso() failure (Guilherme Giacomo Simoes) * tag 'x86-urgent-2026-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt: sev-guest: Explicitly leak pages in unknown state x86/mm: Disable broadcast TLB flush when PCID is disabled x86/kvm/vmx: Fix VMX vs hrtimer_rearm_deferred() x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core x86/vdso: Fix incorrect size in munmap() on map_vdso() failure
9 daysMerge tag 'nios2_updates_for_v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linuxLinus Torvalds1-0/+2
Pull nios2 fixes from Dinh Nguyen: - Implement _THIS_IP_ for inline asm - Add Simon Schuster as a maintainer and mark the NIOS2 as Supported * tag 'nios2_updates_for_v7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: Implement _THIS_IP_ using inline asm MAINTAINERS: arch/nios2: Add Simon Schuster as co-maintainer
9 daysMerge tag 'loongarch-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongsonLinus Torvalds6-15/+68
Pull LoongArch fixes from Huacai Chen: "Rework KASLR to avoid initrd overlap, remove some unused code to avoid a build warning, fix some bugs in kprobes and KVM" * tag 'loongarch-fixes-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Move some variable declarations to paravirt.h LoongArch: kprobes: Fix handling of fatal unrecoverable recursions LoongArch: kprobes: Use larch_insn_text_copy() to patch instructions LoongArch: Remove unused code to avoid build warning LoongArch: Avoid initrd overlap during kernel relocation LoongArch: Skip relocation-time KASLR if already applied efi/loongarch: Randomize kernel preferred address for KASLR
9 daysKVM: SVM: Disable AVIC IPI virtualization on Hygon Family 18h (erratum #1235)Tina Zhang1-5/+7
Hygon Family 18h CPUs are derived from AMD Family 17h (Zen1) silicon and share the same erratum #1235: hardware may read a stale IsRunning=1 bit during ICR write emulation and silently fail to generate an AVIC_IPI_FAILURE_TARGET_NOT_RUNNING VM-Exit on the sending vCPU. The absence of the VM-Exit causes KVM to miss the required wakeup of blocking target vCPUs, leading to hung vCPUs and unbounded delays in guest execution. Extend the existing AMD Family 17h erratum #1235 workaround to also cover Hygon Family 18h. With IPI virtualization disabled, KVM never sets IsRunning=1 in the Physical ID table, so every non-self IPI generates a VM-Exit and is correctly emulated. Fixes: 8de4a1c8164e ("KVM: SVM: Disable (x2)AVIC IPI virtualization if CPU has erratum #1235") Cc: <stable@vger.kernel.org> Signed-off-by: Tina Zhang <zhang_wei@open-hieco.net> Message-ID: <20260522040014.3380201-1-zhang_wei@open-hieco.net>
9 daysKVM: x86: Return the VM's configured APIC bus frequency when queriedSean Christopherson1-1/+1
When KVM_CAP_X86_APIC_BUS_CYCLES_NS is queried on a specific VM, return the VM's configured APIC bus frequency, not KVM's default. Aside from the fact that returning the default frequency is blatantly wrong if userspace has changed the frequency, returning the configured frequency means userspace can blindly trust the result, e.g. when filling PV CPUID information that communicates the APIC bus frequency to the guest. Fixes: 6fef518594bc ("KVM: x86: Add a capability to configure bus frequency for APIC timer") Reported-by: David Woodhouse <dwmw2@infradead.org> Closes: https://lore.kernel.org/all/ab84153e33fbe7c25667f595c56b310d4d5a93ef.camel@infradead.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20260522173526.3539407-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
9 daysMerge tag 'kvm-riscv-fixes-7.1-1' of https://github.com/kvm-riscv/linux into HEADPaolo Bonzini4-10/+15
KVM/riscv fixes for 7.1, take #1 - Fix invalid HVA warning in steal-time recording - Return SBI_ERR_FAILURE to guest upon OOM in pmu_event_info() and pmu_snapshot_set_shmem() - Fix NULL pointer dereference in SBI v0.1 SEND_IPI handler - Fix sign extension of value for MMIO loads
9 daysMerge tag 'kvm-s390-master-7.1-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEADPaolo Bonzini48-102/+177
KVM: s390: some vSIE and UCONTROL fixes Fix some memory issues and some hangs in vSIE.
9 daysMerge tag 'kvmarm-fixes-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEADPaolo Bonzini3-3/+14
KVM/arm64 fixes for 7.1, take #3 - Fix ITS EventID sanitisation when restoring an interrupt translation table. - Fix PPI memory leak when failing to initialise a vcpu. - Correctly return an error when the validation of a hypervisor trace descriptor fails, and limit this validation to protected mode only.
10 daysMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds2-2/+3
Pull arm64 fixes from Catalin Marinas: - Handle probe on hinted conditional branch instructions. BC.cond instructions can be simulated in the same way as B.cond instructions, so extend the decode mask for B.cond to cover BC.cond - Flush the walk cache when unsharing PMD tables. Recent changes to huge_pmd_unshare() introduced mmu_gather::unshared_tables but the arm64 code was still treating the TLB flushing as only targeting leaf entries (TLBI VALE1IS). Fix it by using non-leaf-only instructions (TLBI VAE1IS) when tlb->unshared_tables is set * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: tlb: Flush walk cache when unsharing PMD tables arm64: probes: Handle probes on hinted conditional branch instructions
10 daysMerge tag 's390-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds2-13/+28
Pull s390 fixes from Alexander Gordeev: - Fix PAI NNPA mismatch between counting and recording, where sampling reports twice the value - Fix loss of PAI counter increments during recording on systems with many CPUs under heavy load, while counting is not affected - On some supported machines, CHSC cannot access memory outside the DMA zone, causing CHSC command failures. Restore GFP_DMA flag when allocating memory for CHSC control blocks - Align the numbering scheme for higher-level topology structures like socket, book, drawer with other hardware identifiers e.g. in sysfs, procfs and tools like lscpu * tag 's390-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/topology: Use zero-based numbering for containing entities s390/cio: Restore GFP_DMA for CHSC allocation s390/pai: Fix missing PAI counter increments under heavy load s390/pai: Disable duplicate read of kernel PAI counter value
10 daysarm64: tlb: Flush walk cache when unsharing PMD tablesZeng Heng1-1/+2
When huge_pmd_unshare() is called to unshare a PMD table, the tlb_unshare_pmd_ptdesc() function sets tlb->unshared_tables=true but the aarch64 tlb_flush() only checked tlb->freed_tables to determine whether to use TLBF_NONE (vae1is, invalidates walk cache) or TLBF_NOWALKCACHE (vale1is, leaf-only). This caused the stale PMD page table entry to remain in the walk cache after unshare, potentially leading to incorrect page table walks. Fix by including unshared_tables in the check, so that when unsharing tables, TLBF_NONE is used and the walk cache is properly invalidated. Here is the detailed distinction between vae1is and vale1is: | Instruction Combination | Actual Invalidation Scope | | ------------------------ | --------------------------------------------------| | `VAE1IS` + TTL=`0` | All entries at all levels (full invalidation) | | `VAE1IS` + TTL=`2` (L2) | Non-leaf at Level 0/1 + leaf at Level 2 | | `VALE1IS` + TTL=`0` | Leaf entries at all levels (non-leaf not cleared) | | `VALE1IS` + TTL=`2` (L2) | Leaf entry at Level 2 only | Signed-off-by: Zeng Heng <zengheng4@huawei.com> Fixes: 8ce720d5bd91 ("mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather") Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
10 daysKVM: s390: Properly reset zero bit in PGSTEClaudio Imbrenda1-0/+1
In case of memory pressure, it's possible that a guest page gets freed and then almost immediately reused by the guest. If CMMA is enabled, _essa_clear_cbrl() will discard all pages that are either unused or zero. If a discarded page is reused before _essa_clear_cbrl() is called, and the pgste.zero bit is not cleared, the page will be discarded despite not being unused. When calling _gmap_ptep_xchg(), always clear the pgste.zero bit. This prevents the page from being accidentally discarded when not unused. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
10 daysKVM: s390: vsie: Fix redundant rmap entriesClaudio Imbrenda1-1/+3
The address passed to the gmap rmap was not being masked. As a consequence several different (but functionally equivalent) rmap entries were being created for each shadowed table. Fix this by properly masking the address depending on the table level. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
10 daysKVM: s390: vsie: Fix unshadowing logicClaudio Imbrenda5-5/+63
In some cases (i.e. under extreme memory pressure on the host), attempting to shadow memory will result in the same memory being unshadowed, causing a loop. Add a PGSTE bit to distinguish between shadowed memory and shadowed DAT tables, fix the unshadowing logic in _gmap_ptep_xchg() to prevent unnecessary unshadowing and perform better checks. Also fix the unshadowing logic in _gmap_crstep_xchg_atomic() which did not unshadow properly when the large page would become unprotected. Opportunistically add a check in gmap_protect_rmap() to make sure it won't be called with level == TABLE_TYPE_PAGE_TABLE. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
10 daysKVM: s390: Fix leaking kvm_s390_mmu_cache in case of errorsClaudio Imbrenda1-4/+3
Fix a memory leak that can happen if gmap_ucas_map_one() or kvm_s390_mmu_cache_topup() return error values. Also fix a similar issue in gmap_set_limit(). Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Reported-by: Jiaxin Fan <jiaxin.fan@ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
10 daysKVM: s390: vsie: Fix memory leak when unshadowingClaudio Imbrenda1-1/+3
When performing a partial unshadowing, the rmap was being leaked. Add the missing kfree(). Fixes: a2c17f9270cc ("KVM: s390: New gmap code") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
11 daysLoongArch: KVM: Move some variable declarations to paravirt.hBibo Mao2-4/+7
Some variables relative with paravirt feature are declared in the header file asm/qspinlock.h, however this file can be included only when option CONFIG_SMP is on. There is compiling warnings if CONFIG_SMP is off since variables are not declared. Move these variable declarations to header file asm/paravirt.h to avoid compiling warnings. Fixes: c43dce6f13fb ("LoongArch: KVM: Make vcpu_is_preempted() as a macro rather than function") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202605061313.O8Hswm2b-lkp@intel.com/ Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysLoongArch: kprobes: Fix handling of fatal unrecoverable recursionsTiezhu Yang1-2/+2
KPROBE_HIT_SS and KPROBE_REENTER are two types of fatal recursions that can not be safely recovered in kprobes. KPROBE_HIT_SS means that a kprobe is hit during single-stepping. At this point, the architecture-specific single-step context is already active. Nested single-stepping would corrupt the state, as the kprobe control block (kcb) and hardware registers cannot safely store multiple levels of stepping state. KPROBE_REENTER means that a third-level recursion occurs when a probe is hit while the system is already handling a nested probe (second- level). The kcb only provides a single slot (prev_kprobe) to backup the state. When a third probe is hit, there is no more space to save the state without corrupting the first-level backup. Kprobes work by replacing instructions with breakpoints. In order to execute the original instruction and continue, it must be moved to a temporary "single-step" slot. Since there is no backup space left to set up this slot safely, the CPU would be forced to return to the same original breakpoint address, triggering an endless loop. Currently, the code only prints a warning and returns. This leads to an infinite re-entry loop as the CPU repeatedly hits the same trap and a "stuck" CPU core because preemption was disabled at the start of the handler and never re-enabled in this early return path. Fix the logic by: 1. Merging KPROBE_HIT_SS and KPROBE_REENTER cases, as both represent fatal recursions that cannot be safely recovered. 2. Replacing WARN_ON_ONCE() with BUG() to terminate the system. This aligns LoongArch with other architectures (x86, arm64, riscv) and prevents stack overflow while providing diagnostic information. Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysLoongArch: kprobes: Use larch_insn_text_copy() to patch instructionsTiezhu Yang1-4/+6
On SMP systems, kprobe handlers would occasionally fail to execute on certain CPU cores. The issue is hard to reproduce and typically occurs randomly under high system load. The root cause is a software-side instruction hazard. According to the LoongArch Reference Manual, while the cache coherency is maintained by hardware, software must explicitly use the "IBAR" instruction to ensure the instruction fetch unit (IFU) observes the effects of recent stores. The current arch_arm_kprobe() and arch_disarm_kprobe() only execute the "IBAR" barrier (via flush_insn_slot -> local_flush_icache_range) on the local CPU. This leaves a vulnerable window where remote CPU cores may continue executing stale instructions from their pipelines or prefetch buffers, as they have not executed an "IBAR" since the code modification. Switch to larch_insn_text_copy() to fix this: 1. Synchronization: It uses stop_machine_cpuslocked() to synchronize all online CPUs, ensuring no CPU is executing the target code area during modification. 2. Visibility: By passing cpu_online_mask to stop_machine_cpuslocked(), the callback text_copy_cb() is executed on all online cores. Each CPU core invokes local_flush_icache_range() to execute "IBAR", clearing instruction hazards system-wide and ensuring the "break" instruction is visible to the fetch units of all cores. 3. Robustness: It properly manages memory write permissions (ROX/RW) for the kernel text segment during patching, ensuring compatibility with CONFIG_STRICT_KERNEL_RWX. Cc: <stable@vger.kernel.org> # 6.18+ Fixes: 6d4cc40fb5f5 ("LoongArch: Add kprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysnios2: Implement _THIS_IP_ using inline asmMarco Elver1-0/+2
Both GCC [1] and Clang [2] consider the generic version of _THIS_IP_ to be broken: #define _THIS_IP_ ({ __label__ __here; __here: (unsigned long)&&__here; }) In particular, the address of a label is only expected to be used with a computed goto. While the generic version more or less works today, it is known to be brittle and may break with current and future optimizations. For example, Clang -O2 always returns 1 when this function is inlined: static inline unsigned long get_ip(void) { return ({ __label__ __here; __here: (unsigned long)&&__here; }); } Fix it by overriding _THIS_IP_ in <asm/linkage.h> (which is included by <linux/instruction_pointer.h>) using an architecture-specific inline asm version. Additionally, avoiding taking the address of a label prevents compilers from emitting spurious indirect branch targets (e.g. ENDBR or BTI) under control-flow integrity schemes. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120071 [1] Link: https://github.com/llvm/llvm-project/issues/138272 [2] Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
11 daysMerge tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-0/+1
Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, wireless and netfilter. Craziness continues with no end in sight. Even discounting the driver revert this is a pretty huge PR for standards of the previous era. I'd speculate - we haven't seen the worst of it, yet. Good news, I guess, is that so far we haven't seen many (any?) cases of "AI reported a bug, we fixed it and a real user regressed". Current release - fix to a fix: - Bluetooth: btmtk: accept too short WMT FUNC_CTRL events - vsock/virtio: relax the recently added memory limit a little Current release - regressions: - IB/IPoIB: make sure IB drivers always use async set_rx_mode since some (mlx5) are now required to use it due to locking changes Previous releases - regressions: - udp: fix UDP length on last GSO_PARTIAL segment - af_unix: fix UAF read of tail->len in unix_stream_data_wait() - tcp: fix stale per-CPU tcp_tw_isn leak enabling ISN prediction - mlx5e: fix unlocked writing to ICOSQ, breaking AF_XDP Previous releases - always broken: - tap: fix stack info leak in tap_ioctl() SIOCGIFHWADDR - ipv4: raw: reject IP_HDRINCL packets with ihl < 5 - Bluetooth: a lot of locking and concurrency fixes (as always) - batman-adv (mesh wireless networking): a lot of random fixes for issues reported by security researchers and Sashiko - netfilter: same thing, a lot of small security-ish fixes all over the place, nothing really stands out Misc: - bring back the old 3c509 driver, Maciej wants to maintain it" * tag 'net-7.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (187 commits) net: enetc: avoid VF->PF mailbox timeout during SR-IOV teardown net: enetc: fix init and teardown order to prevent use of unsafe resources net: enetc: fix unbounded loop and interrupt handling in VF-to-PF messaging net: enetc: fix DMA write to freed memory in enetc_msg_free_mbx() net: enetc: fix race condition in VF MAC address configuration net: enetc: fix TOCTOU race and validate VF MAC address net: enetc: add ratelimiting to VF mailbox error messages net: enetc: fix missing error code when pf->vf_state allocation fails net: enetc: fix incorrect mailbox message status returned to VFs net: bridge: prevent too big nested attributes in br_fill_linkxstats() l2tp: use list_del_rcu in l2tp_session_unhash net: bcmgenet: keep RBUF EEE/PM disabled ethernet: 3c509: Fix most coding style issues ethernet: 3c509: Update documentation to match MAINTAINERS ethernet: 3c509: Add GPL 2.0 SPDX license identifier ethernet: 3c509: Fix AUI transceiver type selection Revert "drivers: net: 3com: 3c509: Remove this driver" tools: ynl: support listening on all nsids net: gro: don't merge zcopy skbs pds_core: ensure null-termination for firmware version strings ...
11 daysMerge tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds21-0/+30
Pull ring-buffer fixes from Steven Rostedt: - Fix reporting MISSED EVENTS in trace iterator When the "trace" file is read with tracing enabled, if the writer were to pass the iterator reader, it resets, sets a "missed_events" flag and continues. The tracing output checks for missed events and if there are some, it prints out "[LOST EVENTS]" to let the user know events were dropped. But the clearing of the missed_events happened when the tracing system queried the ring buffer iterator about missed events. This was premature as the ring buffer is per CPU, and the tracing code reads all the CPU buffers and checks for missed events when it is read. If the CPU iterator that had missed events isn't printed next, the output for the LOST EVENTS is lost. Clear the missed_events flag when the iterator moves to the next event and not when the missed_events flag is queried. Also clear it on reset. - Flush and stop the persistent ring buffer on panic On panic the persistent ring buffer is used to debug what caused the panic. But on some architectures, it requires flushing the memory from cache, otherwise, the ring buffer persistent memory may not have the last events and this could also cause the ring buffer to be corrupted on the next boot. - Fix nr_subbufs initialization in simple_ring_buffer_init_mm The remote simple ring buffer meta data nr_subbufs is initialized too early and gets cleared later on, making it zero and not reflect the actual number of sub-buffers. - Fix unload_page for simple_ring_buffer init rollback On error, the pages loaded need to be unloaded. To unload a page it is expected that: page = load_page(va); -> unload_page(page). But the code was doing: unload_page(va) and not unload_page(page). - Create output file from cmd_check_undefined The check for undefined symbols checks if the file *.o.checked exists and if so it skips doing the work. But the *.o.checked file never was created making every build do the work even when it was already done previously. * tag 'trace-ringbuffer-v7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Create output file from cmd_check_undefined tracing: Fix unload_page for simple_ring_buffer init rollback tracing: Fix nr_subbufs initialization in simple_ring_buffer_init_mm() ring-buffer: Flush and stop persistent ring buffer on panic ring-buffer: Fix reporting of missed events in iterator
11 daysMerge tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds19-86/+65
Pull SoC fixes from Arnd Bergmann: - The ff-a firmware driver gets 11 individual bugfixes for a number of issues with robustness to buggy firmware or client implementations. Another firmware fix address suspend to RAM via PSCI firmware. - The final code change is for the old Arm Integrator reference platform that recently started exposing an old NULL pointer dereference bug. - The MAINTAINERS file gets two updates, notably James Tai and Yu-Chun Lin are stepping up as co-maintainers for the Realtek platform. - The remaining patches are all for devicetree files. Two of these are for riscv boards, the rest are all for enesas Arm platforms, addressing build time checking issues as well as minor configuration problems. * tag 'soc-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits) firmware: psci: Set pm_set_resume/suspend_via_firmware() for SYSTEM_SUSPEND ARM: realtek: MAINTAINERS: Include pin controller drivers MAINTAINERS: Add maintainers for ARM/REALTEK ARCHITECTURE ARM: integrator: Fix early initialization firmware: arm_ffa: Fix sched-recv callback partition lookup firmware: arm_ffa: Snapshot notifier callbacks under lock firmware: arm_ffa: Align RxTx buffer size before mapping firmware: arm_ffa: Validate framework notification message layout firmware: arm_ffa: Keep framework RX release under lock firmware: arm_ffa: Bound PARTITION_INFO_GET_REGS copies firmware: arm_ffa: Unregister bus notifier on teardown for FF-A v1.0 firmware: arm_ffa: Fix per-vcpu self notifications handling in workqueue firmware: arm_ffa: Avoid collapsing NPI work from different CPUs firmware: arm_ffa: Skip free_pages on RX buffer alloc failure firmware: arm_ffa: Check for NULL FF-A ID table while driver registration riscv: dts: microchip: fix icicle i2c pinctrl configuration riscv: dts: starfive: jh7110: Drop CAMSS node arm64: dts: renesas: r9a09g056: Add #mux-state-cells to usb20phyrst arm64: dts: renesas: r9a09g057: Add #mux-state-cells to usb2{0,1}phyrst ARM: dts: renesas: rskrza1: Drop superfluous cells ...
11 daysRevert "drivers: net: 3com: 3c509: Remove this driver"Maciej W. Rozycki1-0/+1
This reverts commit 91f3a27ae9f66d81a5906461762c37c8a2bcab06. Contrary to the assumption stated with the original commit description this driver is in use and I'm going to maintain it for the foreseeable future. Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://patch.msgid.link/alpine.DEB.2.21.2605201204260.1450@angie.orcam.me.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysLoongArch: Remove unused code to avoid build warningHuacai Chen1-4/+0
After commit feee6b2989165631b1 ("mm/memory_hotplug: shrink zones when offlining memory"), __remove_pages() doesn't need the "zone" parameter so the "page" variable is also unused. Remove the unused code to avoid such build warning: arch/loongarch/mm/init.c: In function 'arch_remove_memory': arch/loongarch/mm/init.c:134:22: warning: variable 'page' set but not used [-Wunused-but-set-variable=] 134 | struct page *page = pfn_to_page(start_pfn); Cc: <stable@vger.kernel.org> Reviewed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysLoongArch: Avoid initrd overlap during kernel relocationWANG Rui1-0/+38
Validate the relocation address against the initrd region specified via "initrd=" or "initrdmem=" on the command line. Reject relocation targets that overlap the initrd to prevent memory corruption during early boot. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysLoongArch: Skip relocation-time KASLR if already appliedWANG Rui1-0/+12
When the kernel is relocated during early boot (efistub or kexec_file), a randomized load address may has already been selected and applied. In this case, performing KASLR again in relocate.c is unnecessary. Note: strictly-defined KASLR means the kernel's final runtime address has a random offset from the kernel's load address, which is implemented in relocate.c; broadly-defined KALSR means the kernel's final runtime address has a random offset from the kernel's link address (a.k.a. VMLINUX_LOAD_ADDRESS), which also include the efistlub implementation, kexec_file implementation and QEMU direct kernel boot. kaslr_disabled() return true only means strictly-defined KASLR is disabled. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysefi/loongarch: Randomize kernel preferred address for KASLRWANG Rui1-1/+3
Introduce efi_get_kimg_kaslr_address() helper to compute the preferred kernel image load address dynamically when CONFIG_RANDOMIZE_BASE is enabled. The function derives a random offset by using the EFI-provided randomness combined with the timer tick value, and constrains it within CONFIG_RANDOMIZE_BASE_MAX_OFFSET. Update EFI_KIMG_PREFERRED_ADDRESS to call this helper so that the EFI stub can select a randomized load address when KASLR is active, while preserving the original base address behavior when KASLR is disabled or "nokaslr" is specified. Note: LoongArch can't KASLR for hibernation, so set efi_nokaslr to true if "resume=<devname>" is explicitly specified in cmdline. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: WANG Rui <wangrui@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
11 daysring-buffer: Flush and stop persistent ring buffer on panicMasami Hiramatsu (Google)21-0/+30
On real hardware, panic and machine reboot may not flush hardware cache to memory. This means the persistent ring buffer, which relies on a coherent state of memory, may not have its events written to the buffer and they may be lost. Moreover, there may be inconsistency with the counters which are used for validation of the integrity of the persistent ring buffer which may cause all data to be discarded. To avoid this issue, stop recording of the ring buffer on panic and flush the cache of the ring buffer's memory. Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance") Cc: stable@vger.kernel.org Cc: Will Deacon <will@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ian Rogers <irogers@google.com> Link: https://patch.msgid.link/177751969602.2136606.12031934362587643488.stgit@mhiramat.tok.corp.google.com Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
12 daysx86/mm: Disable broadcast TLB flush when PCID is disabledTom Lendacky1-0/+1
Booting with "nopcid" clears X86_FEATURE_PCID and keeps CR4.PCIDE from being set to one. On AMD CPUs that support INVLPGB, broadcast TLB flushing remains enabled. There are two checks that decide whether the global ASID code runs, mm_global_asid() and consider_global_asid(), that key off of the X86_FEATURE_INVLPGB feature. Once an mm becomes active on more than three CPUs, consider_global_asid() assigns it a global ASID, after which flush_tlb_mm_range() takes the broadcast_tlb_flush() path using a non-zero PCID. Issuing an INVLPGB with a non-zero PCID while CR4.PCIDE is not set results in a #GP: Oops: general protection fault, kernel NULL pointer dereference 0x1: 0000 [#1] SMP NOPTI CPU: 158 UID: 0 PID: 3119 Comm: snap Not tainted 7.1.0-rc3 #1 PREEMPT(full) Hardware name: ... RIP: 0010:broadcast_tlb_flush Code: ... 89 da 48 83 c8 07 <0f> 01 fe eb 08 cc cc cc ... Call Trace: <TASK> flush_tlb_mm_range ptep_clear_flush wp_page_copy ? _raw_spin_unlock __handle_mm_fault handle_mm_fault do_user_addr_fault exc_page_fault asm_exc_page_fault All processors that support broadcast TLB invalidation also have PCID support, so it is only the "nopcid" scenario that is of concern. In this situation just disable the broadcast TLB support using the CPUID dependency support by making X86_FEATURE_INVLPGB dependent on X86_FEATURE_PCID. [ bp: Massage commit message. ] Fixes: 4afeb0ed1753 ("x86/mm: Enable broadcast TLB invalidation for multi-threaded processes") Suggested-by: Dave Hansen <dave.hansen@intel.com> Assisted-by: Claude:claude-opus-4.7 Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Rik van Riel <riel@surriel.com> Cc: <stable@kernel.org> Link: https://patch.msgid.link/b915acfd63e8b2a094fdeb8dc608738072518764.1779296450.git.thomas.lendacky@amd.com
13 dayss390/topology: Use zero-based numbering for containing entitiesAlexandra Winter1-3/+7
Start the numbering scheme for higher-level topology structures (like socket, book, drawer) at zero, matching the convention for other hardware identifiers like e.g. CPU numbers. Hardware documentation, the Hardware Management Console and other tools like zmemtopo also use zero-based numbering for these containing entities. Aligning the numbering in sysfs, procfs, and tools like lscpu improves user experience by making it easier to correlate topology information across different interfaces. If available, Linux on s390 derives this physical topology information from the stsi function code 15 store_topology instruction, which is defined to start at 1 for the lowest numbered container id. Subtract one, so drawer_id, book_id and socket_id in cpu_topology[] start with 0 for the lowest numbered entity; and /proc/cpuinfo and tools like 'lscpu -ye' display the expected values. Display only, no functional change intended. Example: In a partition with 3 cores in a system with 8 cores per socket; 2 sockets per book; 4 books per dawer; and 4 drawers: Before this fix: $ lscpu -ye CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2 ONLINE CONFIGURED POLARIZATION ADDRESS 0 0 2 4 1 0 0:0:0 yes yes vert-high 0 1 0 2 4 1 0 1:1:1 yes yes vert-high 1 2 0 2 4 1 1 2:2:2 yes yes vert-medium 2 3 0 2 4 1 1 3:3:3 yes yes vert-medium 3 4 0 2 4 2 3 4:4:4 yes yes vert-low 4 5 0 2 4 2 3 5:5:5 yes yes vert-low 5 After this fix: $ lscpu -ye CPU NODE DRAWER BOOK SOCKET CORE L1d:L1i:L2 ONLINE CONFIGURED POLARIZATION ADDRESS 0 0 1 3 0 0 0:0:0 yes yes vert-high 0 1 0 1 3 0 0 1:1:1 yes yes vert-high 1 2 0 1 3 0 1 2:2:2 yes yes vert-medium 2 3 0 1 3 0 1 3:3:3 yes yes vert-medium 3 4 0 1 3 1 3 4:4:4 yes yes vert-low 4 5 0 1 3 1 3 5:5:5 yes yes vert-low 5 For KVM guests, qemu emulates the stsi FC15 store_topology instruction. This emulation currently erroneously starts id numbering at 0. A qemu fix is proposed that makes this emulation compliant to the stsi architecture. In case a guest with this patch is running on a qemu without the other fix, it can happen that ids of 255 are displayed erroneously. z/VM currently does not provide or emulate physical topology information to its guests. So this patch does not change anything for z/VM guests. Fixes: 10d385895055 ("[S390] topology: expose core identifier") Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
13 daysKVM: arm64: Fix nVHE/pKVM hyp tracing error on invalid descVincent Donnefort1-2/+7
pKVM must validate the host-provided tracing buffer descriptor. However, if an error is found, the hypervisor would just return 0 to the host. Fix the return value on validation failure. While at it, rename the function to hyp_trace_desc_is_valid() and skip validation for the nVHE mode as we trust host-provided data in that case. Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Fixes: 680a04c333fa ("KVM: arm64: Add tracing capability for the nVHE/pKVM hyp") Link: https://lore.kernel.org/r/20260514162624.3477857-1-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
13 daysKVM: arm64: vgic: Free private_irqs when init fails after allocationMichael Bommarito1-1/+3
Companion to commit 250f25367b58 ("KVM: arm64: Tear down vGIC on failed vCPU creation"), which added the missing kvm_vgic_vcpu_destroy() call to the kvm_share_hyp() failure path in kvm_arch_vcpu_create(). The kvm_vgic_vcpu_init() failure path immediately above it has the same shape and still needs the same cleanup. Call kvm_vgic_vcpu_destroy() when kvm_vgic_vcpu_init() fails so private IRQs allocated before a redistributor iodev registration failure are released before the failed vCPU is freed. Fixes: 03b3d00a70b5 ("KVM: arm64: vgic: Allocate private interrupts on demand") Cc: stable@vger.kernel.org Cc: Will Deacon <will@kernel.org> Reviewed-by: Yuan Yao <yaoyuan@linux.alibaba.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://lore.kernel.org/r/20260519135042.2219239-1-michael.bommarito@gmail.com Signed-off-by: Marc Zyngier <maz@kernel.org>
13 daysKVM: arm64: vgic-its: Reject restored DTE with out-of-range num_eventid_bitsMichael Bommarito1-0/+4
Userspace can restore an ITS Device Table Entry whose Size field encodes more EventID bits than the virtual ITS supports. The live MAPD path rejects that state, but vgic_its_restore_dte() accepts it and stores the out-of-range value in dev->num_eventid_bits. Reject restored DTEs with num_eventid_bits > VITS_TYPER_IDBITS before allocating the device. This mirrors the MAPD check and prevents the restored state from reaching vgic_its_restore_itt(), where the unchecked value can be converted into an oversized scan_its_table() range. Fixes: 57a9a117154c ("KVM: arm64: vgic-its: Device table save/restore") Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Link: https://lore.kernel.org/r/20260519132519.2142458-1-michael.bommarito@gmail.com Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
13 daysx86/kvm/vmx: Fix VMX vs hrtimer_rearm_deferred()Peter Zijlstra1-0/+13
Vishal reported that KVM unit test 'x2apic' started failing after commit 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming"). The reason is that KVM/VMX is injecting interrupts while it has interrupts disabled, for a context that will enable interrupts, this means that regs->flags.X86_EFLAGS_IF == 0 and irqentry_exit() will not do the right thing. Notably, irqentry_exit() must not call hrtimer_rearm_deferred() when the return context does not have IF set, because this will cause problems vs NMIs. Therefore, fix up the state after the injection. Fixes: 0e98eb14814e ("entry: Prepare for deferred hrtimer rearming") Reported-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Suggested-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Link: https://patch.msgid.link/20260423155936.957351833@infradead.org Closes: https://lore.kernel.org/r/70cd3e97fbb796e2eb2ff8cd4b7614ada05a5f24.camel%40intel.com
13 daysx86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 corePeter Zijlstra12-68/+119
Move the VMX interrupt dispatch magic into the x86 core code. This isolates KVM from the FRED/IDT decisions and reduces the amount of EXPORT_SYMBOL_FOR_KVM(). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Tested-by: "Verma, Vishal L" <vishal.l.verma@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Zhao Liu <zhao1.liu@intel.com> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Binbin Wu <binbin.wu@linxu.intel.com> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://patch.msgid.link/20260508091829.GO3126523@noisy.programming.kicks-ass.net
13 daysMerge tag 'mm-hotfixes-stable-2026-05-18-21-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds2-5/+8
Pull misc fixes from Andrew Morton: "14 hotfixes. 9 are for MM. 10 are cc:stable and the remainder are for post-7.1 issues or aren't deemed suitable for backporting. There's a two-patch MAINTAINERS series from Mike Rapoport which updates us for the new KEXEC/KDUMP/crash/LUO/etc arrangements. And another two-patch series from Muchun Song to fix a couple of memory-hotplug issues. Otherwise singletons, please see the changelogs for details" * tag 'mm-hotfixes-stable-2026-05-18-21-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/memory: fix spurious warning when unmapping device-private/exclusive pages mm: fix __vm_normal_page() to handle missing support for pmd_special()/pud_special() drivers/base/memory: fix memory block reference leak in poison accounting mm/memory_hotplug: fix memory block reference leak on remove lib: kunit_iov_iter: fix test fail on powerpc mm/page_alloc: fix initialization of tags of the huge zero folio with init_on_free MAINTAINERS: add kexec@ list to LIVE UPDATE ENTRY MAINTAINERS: add tree for KDUMP and KEXEC selftests/mm: run_vmtests.sh: fix destructive tests invocation scripts/gdb: slab: update field names of struct kmem_cache scripts/gdb: mm: cast untyped symbols in x86_page_ops mm/damon: fix damos_stat tracepoint format for sz_applied mm/damon/sysfs-schemes: call missing mem_cgroup_iter_break() mm/migrate_device: fix spinlock leak in migrate_vma_insert_huge_pmd_page
13 daysx86/vdso: Fix incorrect size in munmap() on map_vdso() failureGuilherme Giacomo Simoes1-1/+1
In map_vdso(), if a failure occurs during the installation of the VVAR mappings, the error path attempts to clean up previously allocated mappings using do_munmap(). However, the cleanup for the VVAR mapping is incorrectly using image->size (the size of the vDSO text) instead of the actual size allocated for the VVAR area. Replace the incorrect do_munmap() image->size parameter with the constant VDSO_NR_PAGES * PAGE_SIZE. Ensure the unmap size exactly matches the size used during the vdso_install_vvar_mapping() phase to provide a symmetrical and complete teardown of the memory region. Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping") Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20260503191609.551817-1-trintaeoitogc@gmail.com
13 daysarm64: probes: Handle probes on hinted conditional branch instructionsVladimir Murzin1-1/+1
BC.cond instructions introduced by FEAT_HBC cannot be executed out-of-line, like other branch instructions. However, they can be simulated in the same way as B.cond instructions. Extend the B.cond decoder mask to match BC.cond instructions as well, and handle them using the existing B.cond simulation path. Fixes: 7f86d128e437 ("arm64: add HWCAP for FEAT_HBC (hinted conditional branches)") Cc: <stable@vger.kernel.org> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2026-05-18RISC-V: KVM: Fix sign extension for MMIO loadsJiakai Xu1-5/+4
The kvm_riscv_vcpu_mmio_return() function handles MMIO read results by writing the data back to the guest register. For signed load instructions (LB, LH, LW on RV64), the value needs sign-extension from a smaller integer to unsigned long. The current code uses: (ulong)data << shift >> shift but (ulong) makes the right shift a logical shift (zero-extend) rather than an arithmetic shift (sign-extend), causing incorrect results when the MMIO device returns a negative value. For example, LB reading 0x80 would return 128 instead of -128. Fix this by casting to (long) after the left shift so that the subsequent right shift is arithmetic and correctly propagates the sign bit: (long)((ulong)data << shift) >> shift Additionally, remove the unnecessary shift assignment for LBU (unsigned byte load) since it does not need sign extension. This makes LBU consistent with LHU and LWU which already keep shift = 0. Fixes: b91f0e4cb8a3 ("RISC-V: KVM: Factor-out instruction emulation into separate sources") Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Assisted-by: OpenClaw:DeepSeek-V3.2 Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260514081752.472987-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-05-18RISC-V: KVM: Fix NULL pointer dereference in SBI v0.1 SEND_IPI handlerJiakai Xu1-0/+2
The SBI v0.1 SEND_IPI handler iterates over the hart mask and calls kvm_get_vcpu_by_id() to find the target vcpu for each set bit. When a guest provides a hart mask containing bits for non-existent vcpu_ids, kvm_get_vcpu_by_id() returns NULL, which is then unconditionally dereferenced by kvm_riscv_vcpu_set_interrupt(), causing a kernel crash. Fix this by adding a NULL check before dereferencing the return value. If the target vcpu is not found, skip it and continue processing the remaining valid harts. Fixes: a046c2d8578c ("RISC-V: KVM: Reorganize SBI code by moving SBI v0.1 to its own file") Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Assisted-by: OpenClaw:DeepSeek-V3.2 Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260517124414.420919-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-05-18riscv: kvm: return SBI_ERR_FAILURE for pmu_event_info() when OOMOsama Abdelkader1-2/+4
kvm_riscv_vcpu_pmu_event_info() returned -ENOMEM from the SBI extension handler, which caused kvm_riscv_vcpu_sbi_ecall() to abort KVM_RUN and surface the error to userspace instead of completing the ECALL with a negative SBI error in a0. Use SBI_ERR_FAILURE and the normal retdata path, matching other PMU handlers and kvm_sbi_ext_pmu_handler comment. Fixes: e309fd113b9f ("RISC-V: KVM: Implement get event info function") Cc: stable@vger.kernel.org Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260514173642.41448-2-osama.abdelkader@gmail.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-05-18riscv: kvm: return SBI_ERR_FAILURE for pmu_snapshot_set_shmem() when OOMOsama Abdelkader1-2/+4
kvm_riscv_vcpu_pmu_snapshot_set_shmem() returned -ENOMEM from the SBI extension handler, which caused kvm_riscv_vcpu_sbi_ecall() to abort KVM_RUN and surface the error to userspace instead of ompleting the ECALL with a negative SBI error in a0. Use SBI_ERR_FAILURE and the normal retdata path, matching other PMU handlers and kvm_sbi_ext_pmu_handler comment. Fixes: c2f41ddbcdd7 ("RISC-V: KVM: Implement SBI PMU Snapshot feature") Cc: stable@vger.kernel.org Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260514173642.41448-1-osama.abdelkader@gmail.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-05-18RISC-V: KVM: Fix invalid HVA warning in steal-time recordingJiakai Xu1-1/+1
kvm_riscv_vcpu_record_steal_time() assumes that the steal-time shared memory GPA (vcpu->arch.sta.shmem) is always backed by a valid guest memory slot. However, this assumption is not guaranteed by the KVM userspace ABI. A malicious or buggy userspace can set the STA shared memory GPA via KVM_SET_ONE_REG without establishing a corresponding memory region via KVM_SET_USER_MEMORY_REGION. In such cases, the GPA cannot be translated to a valid HVA and kvm_vcpu_gfn_to_hva() returns an error address. The current implementation incorrectly treats this as a kernel warning using WARN_ON(), which may escalate to a kernel panic when panic_on_warn is enabled. This is not a kernel bug condition but a normal invalid configuration from userspace, and should be handled gracefully. Fix it by removing WARN_ON() and treating invalid HVA as a normal failure case, resetting the STA shared memory state. Fixes: e9f12b5fff8ad0 ("RISC-V: KVM: Implement SBI STA extension") Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Assisted-by: OpenClaw:DeepSeek-V3.2 Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Link: https://lore.kernel.org/r/20260415075216.2757427-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-05-17Merge tag 'x86-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-0/+8
Pull x86 fix from Ingo Molnar: - Fix x86 boot crash for non-kjump kexecs (David Woodhouse) * tag 'x86-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Push kjump return address even for non-kjump kexec
2026-05-17Merge tag 'sched-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-7/+24
Pull scheduler fix from Ingo Molnar: - Fix ARM64-specific rseq regressions (Mark Rutland) * tag 'sched-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/entry: Fix arm64-specific rseq brokenness
2026-05-17Merge tag 'ras-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds1-28/+5
Pull MCE fix from Ingo Molnar: - Fix an MCE polling interval adjustment regression (Borislav Petkov) * tag 'ras-urgent-2026-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Restore MCA polling interval halving
2026-05-17Merge tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linuxLinus Torvalds10-22/+72
Pull RISC-V fixes from Paul Walmsley: "Relatively low-impact fixes. Probably the most notable one is that we no longer ask the monitor-mode firmware to delegate misaligned access handling to the kernel by default, since the kernel code needs significant improvement to match the functionality of the firmware. This change avoids functional problems at some cost in performance, but shouldn't affect any system with misaligned access handling in hardware. - Disable satp register probing when no5lvl is specified on the kernel command line - Fix a CFI-related issue with the misaligned access speed measurement code - Reduce the CFI shadow stack size limit from 4GB to 2GB (following ARM64 GCS) - Prevent the kernel from requesting delegation of misaligned access faults unless a new Kconfig option, RISCV_SBI_FWFT_DELEGATE_MISALIGNED, is enabled. This will depend on CONFIG_NONPORTABLE until the deficiencies of the kernel misaligned access fixup code are fixed - Fix some potential uninitialized memory accesses in error paths in compat_riscv_gpr_set() and compat_restore_sigcontext() - Fix a bug in the RISC-V MIPS vendor errata patching code where a logical-and was used in place of a bitwise-and - Drop some unnecessary code in riscv_fill_hwcap_from_isa_string() - Use macros for isa2hwcap indices in riscv_fill_hwcap(), rather than open-coding them - Fix some documentation typos (one affecting 'make htmldocs')" * tag 'riscv-for-linus-7.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: misaligned: Make enabling delegation depend on NONPORTABLE riscv: Docs: fix unmatched quote warning riscv: cfi: reduce shadow stack size limit from 4GB to 2GB riscv: cpufeature: Use pre-defined ISA ext macros to index isa2hwcap riscv: mm: Fixup no5lvl failure when vaddr is invalid riscv: Fix register corruption from uninitialized cregs on error riscv: errata: Fix bitwise vs logical AND in MIPS errata patching Documentation: riscv: cmodx: fix typos riscv: cpufeature: Drop this_hwcap clear in T-Head vector workaround riscv: Define __riscv_copy_{,vec_}{words,bytes}_unaligned() using SYM_TYPED_FUNC_START
2026-05-16Merge tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds8-52/+27
Pull powerpc fixes from Madhavan Srinivasan: - Fix preempt count leak in sysfs show paths - Fix error handling in pika_dtm_thread - Remove pmac_low_i2c_{lock,unlock}() - Enable all windfarms by default - Fix dead default for GUEST_STATE_BUFFER_TEST - Remove redundant preempt_disable|enable() calls from arch_irq_work_raise() Thanks to Aboorva Devarajan, Ally Heev, Amit Machhiwal, Bart Van Assche, Christophe Leroy, Christophe Leroy (CS GROUP), Dan Carpenter, Gautam Menghani, Harsh Prateek Bora, Julian Braha, Krzysztof Kozlowski, Linus Walleij, Ma Ke, Ritesh Harjani (IBM), and Sayali Patil * tag 'powerpc-7.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/time: Remove redundant preempt_disable|enable() calls from arch_irq_work_raise() powerpc/hv-gpci: fix preempt count leak in sysfs show paths powerpc: fix dead default for GUEST_STATE_BUFFER_TEST powerpc/powermac: Remove pmac_low_i2c_{lock,unlock}() powerpc/warp: Fix error handling in pika_dtm_thread powerpc: 82xx: fix uninitialized pointers with free attribute powerpc/g5: Enable all windfarms by default