Age | Commit message (Collapse) | Author | Files | Lines |
|
On POWER8 systems that don't have ibm,power-rng available, a guest that
ignores the KVM_CAP_PPC_HWRNG flag and calls H_RANDOM anyway will
dereference a NULL pointer. And on machines with darn instead of
ibm,power-rng, H_RANDOM won't work at all.
This patch kills two birds with one stone, by routing H_RANDOM calls to
ppc_md.get_random_seed, and doing the real mode check inside of it.
Cc: stable@vger.kernel.org # v4.1+
Cc: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Fixes: e928e9cb3601 ("KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The preferred nomenclature is pnv_, not powernv_, but rng.c used
powernv_ for some reason, which isn't consistent with the rest. A recent
commit added a few pnv_ functions to rng.c, making the file a bit of a
mishmash. This commit just replaces the rest of them.
Cc: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Fixes: f3eac426657 ("powerpc/powernv: wire up rng during setup_arch")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
When running with return thunks enabled under 32-bit EFI, the system
crashes with:
kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle page fault for address: 000000005bc02900
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0011) - permissions violation
PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063
Oops: 0011 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:0x5bc02900
Code: Unable to access opcode bytes at RIP 0x5bc028d6.
RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048
RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b
RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28
R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710
R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 0000000080050033
CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0
Call Trace:
? efi_set_virtual_address_map+0x9c/0x175
efi_enter_virtual_mode+0x4a6/0x53e
start_kernel+0x67c/0x71e
x86_64_start_reservations+0x24/0x2a
x86_64_start_kernel+0xe9/0xf4
secondary_startup_64_no_verify+0xe5/0xeb
That's because it cannot jump to the return thunk from the 32-bit code.
Using a naked RET and marking it as safe allows the system to proceed
booting.
Fixes: aa3d480315ba ("x86: Use return-thunk in asm code")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: <stable@vger.kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove a superfluous ' in the mitigation string.
Fixes: e8ec1b6e08a2 ("x86/bugs: Enable STIBP for JMP2RET")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Instead of doing complicated calculations to find the size of the subroutines
(which are even more complicated because they need to be stringified into
an asm statement), just hardcode to 16.
It is less dense for a few combinations of IBT/SLS/retbleed, but it has
the advantage of being really simple.
Cc: stable@vger.kernel.org # 5.15.x: 84e7051c0bc1: x86/kvm: fix FASTOP_SIZE when return thunks are enabled
Cc: stable@vger.kernel.org
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Clang warns:
arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection]
DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
^
arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here
extern u64 x86_spec_ctrl_current;
^
1 error generated.
The declaration should be using DECLARE_PER_CPU instead so all
attributes stay in sync.
Cc: stable@vger.kernel.org
Fixes: fc02735b14ff ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The UML function names to_virt() and to_phys() are exposed by UML
headers, and are very generic and may be defined by drivers. As it
turns out, commit 9409c9b6709e ("pmem: refactor pmem_clear_poison()")
did exactly that.
This results in build errors such as the following when trying to build
um:allmodconfig:
drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’
105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
| ^~~~~~~
Use less generic function names for the um specific to_phys() and
to_virt() functions to fix the problem and to avoid similar problems in
the future.
Fixes: 9409c9b6709e ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left
uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally
not needed for APIC_DM_REMRD, they're still referenced by
__apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize
the structure to avoid consuming random stack memory.
Fixes: a183b638b61c ("KVM: x86: make apic_accept_irq tracepoint more generic")
Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220708125147.593975-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
LKP reports a build issue on Clang, related to a literal load of
__current issued through the ldr_va macro. This turns out to be due to
the fact that group relocations are disabled when CONFIG_COMPILE_TEST=y,
which means that the ldr_va macro resolves to a pair of LDR
instructions, the first one being a literal load issued too far from its
literal pool.
Due to the introduction of a couple of new uses of this macro in commit
508074607c7b95b2 ("ARM: 9195/1: entry: avoid explicit literal loads"),
the literal pools end up getting rearranged in a way that causes the
literal for __current to go out of range. Let's fix this up by putting a
.ltorg directive in a suitable place in the code.
Link: https://lore.kernel.org/all/202205290805.1vZLAr36-lkp@intel.com/
Fixes: 508074607c7b95b2 ("ARM: 9195/1: entry: avoid explicit literal loads")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Some of the statistics values exported by KVM are always only 0 or 1.
It can be useful to export this fact to userspace so that it can track
them specially (for example by polling the value every now and then to
compute a % of time spent in a specific state).
Therefore, add "boolean value" as a new "unit". While it is not exactly
a unit, it walks and quacks like one. In particular, using the type
would be wrong because boolean values could be instantaneous or peak
values (e.g. "is the rmap allocated?") or even two-bucket histograms
(e.g. "number of posted vs. non-posted interrupt injections").
Suggested-by: Amneesh Singh <natto@weirdnatto.in>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.
Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:
[ 36.091116] int3: 0000 [#1] SMP NOPTI
[ 36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44
[ 36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[ 36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[ 36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[ 36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[ 36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[ 36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[ 36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[ 36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[ 36.091191] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[ 36.091192] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[ 36.091195] PKRU: 55555554
[ 36.091195] Call Trace:
[ 36.091197] <TASK>
[ 36.091198] ? fastop+0x5a/0xa0 [kvm]
[ 36.091222] x86_emulate_insn+0x7b8/0xe90 [kvm]
[ 36.091244] x86_emulate_instruction+0x2f4/0x630 [kvm]
[ 36.091263] ? kvm_arch_vcpu_load+0x7c/0x230 [kvm]
[ 36.091283] ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel]
[ 36.091290] complete_emulated_mmio+0x297/0x320 [kvm]
[ 36.091310] kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm]
[ 36.091330] kvm_vcpu_ioctl+0x29e/0x6d0 [kvm]
[ 36.091344] ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm]
[ 36.091357] ? __fget_files+0x86/0xc0
[ 36.091362] ? __fget_files+0x86/0xc0
[ 36.091363] __x64_sys_ioctl+0x92/0xd0
[ 36.091366] do_syscall_64+0x59/0xc0
[ 36.091369] ? syscall_exit_to_user_mode+0x27/0x50
[ 36.091370] ? do_syscall_64+0x69/0xc0
[ 36.091371] ? syscall_exit_to_user_mode+0x27/0x50
[ 36.091372] ? __x64_sys_writev+0x1c/0x30
[ 36.091374] ? do_syscall_64+0x69/0xc0
[ 36.091374] ? exit_to_user_mode_prepare+0x37/0xb0
[ 36.091378] ? syscall_exit_to_user_mode+0x27/0x50
[ 36.091379] ? do_syscall_64+0x69/0xc0
[ 36.091379] ? do_syscall_64+0x69/0xc0
[ 36.091380] ? do_syscall_64+0x69/0xc0
[ 36.091381] ? do_syscall_64+0x69/0xc0
[ 36.091381] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 36.091384] RIP: 0033:0x7efdfe6d1aff
[ 36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ 36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff
[ 36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
[ 36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff
[ 36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000
[ 36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[ 36.091396] </TASK>
[ 36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover
[ 36.123271] ---[ end trace db3c0ab5a48fabcc ]---
[ 36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[ 36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[ 36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[ 36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[ 36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[ 36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[ 36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[ 36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[ 36.123323] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[ 36.123324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[ 36.123327] PKRU: 55555554
[ 36.123328] Kernel panic - not syncing: Fatal exception in interrupt
[ 36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Fixes: aa3d480315ba ("x86: Use return-thunk in asm code")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Message-Id: <20220713171241.184026-1-cascardo@canonical.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
hang upon boot or shortly after when a non-default TSC frequency was
set for L1. The issue is observed on a host where TSC scaling is
supported. The problem appears to be that Windows doesn't use TSC
frequency for its guests even when the feature is advertised and KVM
filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
L1's. This leads to L2 running with the default frequency (matching
host's) while L1 is running with an altered one.
Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
it was set for L1. TSC_MULTIPLIER is already correctly computed and
written by prepare_vmcs02().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220712135009.952805-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Implement apply_returns() stub for UM, just like all the other patching
routines.
Fixes: 15e67227c49a ("x86: Undo return-thunk damage")
Reported-by: Randy Dunlap <rdunlap@infradead.org)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/Ys%2Ft45l%2FgarIrD0u@worktop.programming.kicks-ass.net
|
|
UNTRAIN_RET is not needed in native_irq_return_ldt because RET
untraining has already been done at this point.
In addition, when the RETBleed mitigation is IBPB, UNTRAIN_RET clobbers
several registers (AX, CX, DX) so here it trashes user values which are
in these registers.
Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/35b0d50f-12d1-10c3-f5e8-d6c140486d4a@oracle.com
|
|
This symbol is not used outside of bugs.c, so mark it static.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220714072939.71162-1-jiapeng.chong@linux.alibaba.com
|
|
When commit 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all
and when CPPC_LIB is supported") was introduced, we found collateral
damage that a number of AMD systems that supported CPPC but
didn't advertise support in _OSC stopped having a functional
amd-pstate driver. The _OSC was only enforced on Intel systems at that
time.
This was fixed for the MSR based designs by commit 8b356e536e69f
("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
but some shared memory based designs also support CPPC but haven't
advertised support in the _OSC. Add support for those designs as well by
hardcoding the list of systems.
Fixes: 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported")
Fixes: 8b356e536e69f ("ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported")
Link: https://lore.kernel.org/all/3559249.JlDtxWtqDm@natalenko.name/
Cc: 5.18+ <stable@vger.kernel.org> # 5.18+
Reported-and-tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Commit 4efd417f298b ("s390: raise minimum supported machine generation
to z10") removed the usage of alternatives and lowcore in expolines
macros. Remove unneeded header includes as well.
With that, expoline.S doesn't require asm-offsets.h and
expoline_prepare target dependency could be removed.
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Link: https://lore.kernel.org/r/patch-2.thread-d13b6c.git-d13b6c96fb5f.your-ad-here.call-01656331067-ext-4899@work.hours
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
When CONFIG_EXPOLINE_EXTERN is used expoline thunks are generated
from arch/s390/lib/expoline.S and postlinked into every module.
This is also true for external modules. Add expoline.o build to
the modules_prepare target.
Fixes: 1d2ad084800e ("s390/nospec: add an option to use thunk-extern")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: C. Erastus Toe <ctoe@redhat.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Link: https://lore.kernel.org/r/patch-1.thread-d13b6c.git-a2387a74dc49.your-ad-here.call-01656331067-ext-4899@work.hours
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
x86_has_pat_wp() is using a wrong test, as it relies on the normal
PAT configuration used by the kernel. In case the PAT MSR has been
setup by another entity (e.g. Xen hypervisor) it might return false
even if the PAT configuration is allowing WP mappings. This due to the
fact that when running as Xen PV guest the PAT MSR is setup by the
hypervisor and cannot be changed by the guest. This results in the WP
related entry to be at a different position when running as Xen PV
guest compared to the bare metal or fully virtualized case.
The correct way to test for WP support is:
1. Get the PTE protection bits needed to select WP mode by reading
__cachemode2pte_tbl[_PAGE_CACHE_MODE_WP] (depending on the PAT MSR
setting this might return protection bits for a stronger mode, e.g.
UC-)
2. Translate those bits back into the real cache mode selected by those
PTE bits by reading __pte2cachemode_tbl[__pte2cm_idx(prot)]
3. Test for the cache mode to be _PAGE_CACHE_MODE_WP
Fixes: f88a68facd9a ("x86/mm: Extend early_memremap() support with additional attrs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14
Link: https://lore.kernel.org/r/20220503132207.17234-1-jgross@suse.com
|
|
The build on x86_32 currently fails after commit
9bb2ec608a20 (objtool: Update Retpoline validation)
with:
arch/x86/kernel/../../x86/xen/xen-head.S:35: Error: no such instruction: `annotate_unret_safe'
ANNOTATE_UNRET_SAFE is defined in nospec-branch.h. And head_32.S is
missing this include. Fix this.
Fixes: 9bb2ec608a20 ("objtool: Update Retpoline validation")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/63e23f80-033f-f64e-7522-2816debbc367@kernel.org
|
|
__static_call_fixup() invokes __static_call_transform() without holding
text_mutex, which causes lockdep to complain in text_poke_bp().
Adding the proper locking cures that, but as this is either used during
early boot or during module finalizing, it's not required to use
text_poke_bp(). Add an argument to __static_call_transform() which tells
it to use text_poke_early() for it.
Fixes: ee88d363d156 ("x86,static_call: Use alternative RET encoding")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The kvm_riscv_check_vcpu_requests() is called with SRCU read lock held
and for KVM_REQ_SLEEP request it will block the VCPU without releasing
SRCU read lock. This causes KVM ioctls (such as KVM_IOEVENTFD) from
other VCPUs of the same Guest/VM to hang/deadlock if there is any
synchronize_srcu() or synchronize_srcu_expedited() in the path.
To fix the above in kvm_riscv_check_vcpu_requests(), we should do SRCU
read unlock before blocking the VCPU and do SRCU read lock after VCPU
wakeup.
Fixes: cce69aff689e ("RISC-V: KVM: Implement VCPU interrupts and requests handling")
Reported-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Tested-by: Bin Meng <bmeng.cn@gmail.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
There are a bunch of functions that use the PFN from a page table entry
that end up with the svpbmt upper-bits because they are missing the newly
introduced PAGE_PFN_MASK which leads to wrong addresses conversions and
then crash: fix this by adding this mask.
Fixes: 100631b48ded ("riscv: Fix accessing pfn bits in PTEs for non-32bit variants")
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
Commit in Fixes forgot to change the SETUP_TYPE_MAX definition which
contains the highest valid setup data type.
Correct that.
Fixes: 5ea98e01ab52 ("x86/boot: Add Confidential Computing type to setup_data")
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/ddba81dd-cc92-699c-5274-785396a17fb5@zytor.com
|
|
Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.
Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.
A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).
For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
All the invocations unroll to __x86_return_thunk and this file
must be PIC independent.
This fixes kexec on 64-bit AMD boxes.
[ bp: Fix 32-bit build. ]
Reported-by: Edward Tran <edward.tran@oracle.com>
Reported-by: Awais Tanveer <awais.tanveer@oracle.com>
Suggested-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Add spin-table enable-method and cpu-release-addr properties for
cpu0 node. This is required by all ARMv8 SoC. Otherwise some
bootloader like u-boot can not update cpu-release-addr and linux
fails to start up secondary cpus.
Fixes: 2961f69f151c ("arm64: dts: broadcom: add BCM4908 and Asus GT-AC5300 early DTS files")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
|
|
The cpu mask value in interrupt property inherits from bcm4908.dtsi
which sets to four cpus. Correct the value to two cpus for dual core
BCM4906 SoC.
Fixes: c8b404fb05dc ("arm64: dts: broadcom: bcm4908: add BCM4906 Netgear R8000P DTS files")
Signed-off-by: William Zhang <william.zhang@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
|
|
The device tree should include generic "jedec,spi-nor" compatible, and a
manufacturer-specific one.
The macronix part is what is shipped on the boards that come with a
flash chip.
Fixes: 45857ae95478 ("ARM: dts: orange-pi-zero: add node for SPI NOR")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220708174529.3360-1-msuchanek@suse.de
|
|
Fix typo in i2s1 causing errors in dt binding validation.
Change assigned-parrents to assigned-clock-parents
to match i2s0 node formatting.
Fixes: 1ca81883c557 ("ARM: dts: at91: sama5d2: add nodes for I2S controllers")
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
[claudiu.beznea: use imperative addressing in commit description, remove
blank line after fixes tag, fix typo in commit message]
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220707215812.193008-1-Ryan.Wanner@microchip.com
|
|
There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.
If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.
Fixes: 3ebc17006888 ("x86/bugs: Add retbleed=ibpb")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
The driver use the coma-mode pins as open-drain. Flag them in the device
tree accordingly. This avoids the following error:
[ 14.114180] gpio-2007 (coma-mode): enforced open drain please flag it properly in DT/ACPI DSDT/board file
Fixes: 46a9556d977e ("ARM: dts: kswitch-d10: enable networking")
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220704150808.1104295-1-michael@walle.cc
|
|
A pin controlled by the iomuxc-snvs pin controller must be
specified under the dtb's iomuxc-snvs node.
Move the one and only pin of that category from the iomuxc node
and set the pinctrl-0 using it accordingly.
Fixes: 2aa9d6201949 ("ARM: dts: imx6ull-colibri: add touchscreen device nodes")
Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|
|
The SiFive errata code contains code checking applicable erratas
vs. actually applied erratas to suggest missing erratas to the
user when their Kconfig options are not enabled.
In the main kernel image one can be quite sure that all available
erratas appear at least once, so that check will succeed.
On the other hand modules can very well not use any errata-relevant
code, so the newly added module-alternative support may also patch
the module code, but not touch SiFive-specific erratas at all.
So to restore the original behaviour don't warn when patching
modules. This will keep the warning if necessary for the main kernel
image but prevent spurious warnings for modules.
Of course having such a vendor-specific warning may not be needed at
all, as CONFIG_ERRATA_SIFIVE is selected by CONFIG_SOC_SIFIVE and the
individual erratas are default-y so disabling them requires
deliberate action anyway. But for now just restore the old behaviour.
Fixes: a8e910168bba ("riscv: implement module alternatives")
Reported-by: Ron Economos <re@w6rz.net>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Ron Economos <re@w6rz.net>
Link: https://lore.kernel.org/r/20220608120849.1695191-1-heiko@sntech.de
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Commit
ee774dac0da1 ("x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()")
moved PUSH_AND_CLEAR_REGS out of error_entry, into its own function, in
part to avoid calling error_entry() for XenPV.
However, commit
7c81c0c9210c ("x86/entry: Avoid very early RET")
had to change that because the 'ret' was too early and moved it into
idtentry, bloating the text size, since idtentry is expanded for every
exception vector.
However, with the advent of xen_error_entry() in commit
d147553b64bad ("x86/xen: Add UNTRAIN_RET")
it became possible to remove PUSH_AND_CLEAR_REGS from idtentry, back
into *error_entry().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
Cannon lake is also affected by RETBleed, add it to the list.
Fixes: 6ad0ad2bf8a6 ("x86/bugs: Report Intel retbleed vulnerability")
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
|
|
init_numa_memory() is annotated __init and not used by any module,
thus don't export it.
Remove not needed EXPORT_SYMBOL for init_numa_memory() to fix the
following section mismatch warning:
MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference
from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()
The symbol init_numa_memory is exported and annotated __init
Fix this by removing the __init annotation of init_numa_memory or drop the export.
This is build on Linux 5.19-rc4.
Fixes: d4b6f1562a3c ("LoongArch: Add Non-Uniform Memory Access (NUMA) support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Building loongarch:tinyconfig fails with the following error.
./arch/loongarch/include/asm/page.h: In function 'pfn_valid':
./arch/loongarch/include/asm/page.h:42:32: error: 'PHYS_OFFSET' undeclared
Add the missing include file and fix succeeding vdso errors.
Fixes: 09cfefb7fa70 ("LoongArch: Add memory management")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
The `vcsr` only exists in the old hardware design, it isn't used in any
shipped hardware from Loongson-3A5000 on. Both scalar FP and LSX/LASX
instructions use the `fcsr` as their control and status registers now.
For example, the RM control bit in fcsr0 is shared by FP, LSX and LASX
instructions.
Particularly, fcsr16 to fcsr31 are reserved for LSX/LASX now, access to
these registers has no visible effect if LSX/LASX is enabled, and will
cause SXD/ASXD exceptions if LSX/LASX is not enabled.
So, mentions of vcsr are obsolete in the first place (it was just used
for debugging), let's remove them.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Qi Hu <huqi@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Commit fa96b57c1490 ("LoongArch: Add build infrastructure") adds the new
file arch/loongarch/Kconfig.
As the work on LoongArch was probably quite some time under development,
various config symbols have changed and disappeared from the time of
initial writing of the Kconfig file and its inclusion in the repository.
The following four commits:
commit c126a53c2760 ("arch: remove GENERIC_FIND_FIRST_BIT entirely")
commit 140c8180eb7c ("arch: remove HAVE_COPY_THREAD_TLS")
commit aca52c398389 ("mm: remove CONFIG_HAVE_MEMBLOCK")
commit 3f08a302f533 ("mm: remove CONFIG_HAVE_MEMBLOCK_NODE_MAP option")
remove the mentioned config symbol, and enable the intended setup by
default without configuration.
Drop these obsolete selects in loongarch's Kconfig.
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
After emulating a misaligned load or store issued in Thumb mode, we have
to advance the IT state by hand, or it will get out of sync with the
actual instruction stream, which means we'll end up applying the wrong
condition code to subsequent instructions. This might corrupt the
program state rather catastrophically.
So borrow the it_advance() helper from the probing code, and use it on
CPSR if the emulated instruction is Thumb.
Cc: <stable@vger.kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Print the message about disabled Spectre workarounds only once. The
message is printed each time CPU goes out from idling state on NVIDIA
Tegra boards, causing storm in KMSG that makes system unusable.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
After the removal of set_fs() the reference to set_fs() is stale.
Alter the helptext to reflect what the config option really does.
Fixes: 8ac6f5d7f84b ("ARM: 9113/1: uaccess: remove set_fs() implementation")
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
This function/macro isn't used anywhere in the kernel.
The only user was set_fs() and was deleted in the set_fs()
removal patch set.
Fixes: 8ac6f5d7f84b ("ARM: 9113/1: uaccess: remove set_fs() implementation")
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
commit 7a1be318f579 ("ARM: 9012/1: move device tree mapping out of linear
region") use FDT_FIXED_BASE to map the whole FDT_FIXED_SIZE memory area
which contains fdt. But it only reserves the exact physical memory that
fdt occupied. Unfortunately, this mapping is non-shareable. An illegal or
speculative read access can bring the RAM content from non-fdt zone into
cache, PIPT makes it to be hit by subsequently read access through
shareable mapping(such as linear mapping), and the cache consistency
between cores is lost due to non-shareable property.
|<---------FDT_FIXED_SIZE------>|
| |
-------------------------------
| <non-fdt> | <fdt> | <non-fdt> |
-------------------------------
1. CoreA read <non-fdt> through MT_ROM mapping, the old data is loaded
into the cache.
2. CoreB write <non-fdt> to update data through linear mapping. CoreA
received the notification to invalid the corresponding cachelines, but
the property non-shareable makes it to be ignored.
3. CoreA read <non-fdt> through linear mapping, cache hit, the old data
is read.
To eliminate this risk, add a new memory type MT_MEMORY_RO. Compared to
MT_ROM, it is shareable and non-executable.
Here's an example:
list_del corruption. prev->next should be c0ecbf74, but was c08410dc
kernel BUG at lib/list_debug.c:53!
... ...
PC is at __list_del_entry_valid+0x58/0x98
LR is at __list_del_entry_valid+0x58/0x98
psr: 60000093
sp : c0ecbf30 ip : 00000000 fp : 00000001
r10: c08410d0 r9 : 00000001 r8 : c0825e0c
r7 : 20000013 r6 : c08410d0 r5 : c0ecbf74 r4 : c0ecbf74
r3 : c0825d08 r2 : 00000000 r1 : df7ce6f4 r0 : 00000044
... ...
Stack: (0xc0ecbf30 to 0xc0ecc000)
bf20: c0ecbf74 c0164fd0 c0ecbf70 c0165170
bf40: c0eca000 c0840c00 c0840c00 c0824500 c0825e0c c0189bbc c088f404 60000013
bf60: 60000013 c0e85100 000004ec 00000000 c0ebcdc0 c0ecbf74 c0ecbf74 c0825d08
... ... < next prev >
(__list_del_entry_valid) from (__list_del_entry+0xc/0x20)
(__list_del_entry) from (finish_swait+0x60/0x7c)
(finish_swait) from (rcu_gp_kthread+0x560/0xa20)
(rcu_gp_kthread) from (kthread+0x14c/0x15c)
(kthread) from (ret_from_fork+0x14/0x24)
The faulty list node to be deleted is a local variable, its address is
c0ecbf74. The dumped stack shows that 'prev' = c0ecbf74, but its value
before lib/list_debug.c:53 is c08410dc. A large amount of printing results
in swapping out the cacheline containing the old data(MT_ROM mapping is
read only, so the cacheline cannot be dirty), and the subsequent dump
operation obtains new data from the DDR.
Fixes: 7a1be318f579 ("ARM: 9012/1: move device tree mapping out of linear region")
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Jon reports that the Spectre-BHB init code is filling up the kernel log
with spurious notifications about which mitigation has been enabled,
every time any CPU comes out of a low power state.
Given that Spectre-BHB mitigations are system wide, only a single
mitigation can be enabled, and we already print an error if two types of
CPUs coexist in a single system that require different Spectre-BHB
mitigations.
This means that the pr_info() that describes the selected mitigation
does not need to be emitted for each CPU anyway, and so we can simply
emit it only once.
In order to clarify the above in the log message, update it to describe
that the selected mitigation will be enabled on all CPUs, including ones
that are unaffected. If another CPU comes up later that is affected and
requires a different mitigation, we report an error as before.
Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
The decompressed kernel initially relies on the identity map set up by
the boot/compressed kernel for accessing things like boot_params. With
the recent introduction of SEV-SNP support, the decompressed kernel
also needs to access the setup_data entries pointed to by
boot_params->hdr.setup_data.
This can lead to a crash in the kexec kernel during early boot due to
these entries not currently being included in the initial identity map,
see thread at Link below.
Include mappings for the setup_data entries in the initial identity map.
[ bp: Massage commit message and use a helper var for better readability. ]
Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup")
Reported-by: Jun'ichi Nomura <junichi.nomura@nec.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/TYCPR01MB694815CD815E98945F63C99183B49@TYCPR01MB6948.jpnprd01.prod.outlook.com
|
|
commit 72f2ecb7ece7 ("ACPI: bus: Set CPPC _OSC bits for all and
when CPPC_LIB is supported") added support for claiming to
support CPPC in _OSC on non-Intel platforms.
This unfortunately caused a regression on a vartiety of AMD
platforms in the field because a number of AMD platforms don't set
the `_OSC` bit 5 or 6 to indicate CPPC or CPPC v2 support.
As these AMD platforms already claim CPPC support via a dedicated
MSR from `X86_FEATURE_CPPC`, use this enable this feature rather
than requiring the `_OSC` on platforms with a dedicated MSR.
If there is additional breakage on the shared memory designs also
missing this _OSC, additional follow up changes may be needed.
Fixes: 72f2ecb7ece7 ("Set CPPC _OSC bits for all and when CPPC_LIB is supported")
Reported-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The initial PolarFire SoC devicetree must have been forked off from
the fu540 one prior to the addition of l2cache controller support being
added there. When the controller node was added to mpfs.dtsi, it was
not hooked up to the CPUs & thus sysfs reports an incorrect cache
configuration. Hook it up.
Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE board")
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Daire McNamara <daire.mcnamara@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Device-tree incorrectly used "ngpio" which caused the driver to
fallback to 32 ngpios.
This platform has 62 GPIO registers.
Fixes: 9ff8e9fccef9 ("ARM: dts: TS-7970: add basic device tree")
Signed-off-by: Kris Bahnsen <kris@embeddedTS.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
|