aboutsummaryrefslogtreecommitdiffstats
path: root/arch/powerpc
AgeCommit message (Collapse)AuthorFilesLines
2022-09-28powerpc/64s: Allow double call of kernel_[un]map_linear_page()Christophe Leroy1-1/+7
If the page is already mapped resp. already unmapped, bail out. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-3-nicholas@linux.ibm.com
2022-09-28powerpc/64s: Remove unneeded #ifdef CONFIG_DEBUG_PAGEALLOC in hash_utilsChristophe Leroy1-7/+2
debug_pagealloc_enabled() is always defined and constant folds to 'false' when CONFIG_DEBUG_PAGEALLOC is not enabled. Remove the #ifdefs, the code and associated static variables will be optimised out by the compiler when CONFIG_DEBUG_PAGEALLOC is not defined. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-2-nicholas@linux.ibm.com
2022-09-28powerpc/64s: Add DEBUG_PAGEALLOC for radixNicholas Miehlbradt1-4/+14
There is support for DEBUG_PAGEALLOC on hash but not on radix. Add support on radix. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-1-nicholas@linux.ibm.com
2022-09-28powerpc/64s: update cpu selection optionsNicholas Piggin2-6/+7
Update the 64s GENERIC_CPU option. POWER4 support has been dropped, so make that clear in the option name. The POWER5_CPU option is dropped because it's uncommon, and GENERIC_CPU covers it. -mtune= before power8 is dropped because the minimum gcc version supports power8, and tuning is made consistent between big and little endian. A 970 option is added for PowerPC 970 / G5 because they still have a user base, and -mtune=power8 does not generate good code for the 970. This also updates the ISA versions document to add Power4/Power4+ because I didn't realise Power4+ used 2.01. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921014103.587954-2-npiggin@gmail.com
2022-09-28powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5Nicholas Piggin1-1/+1
Big-endian GENERIC_CPU supports 970, but builds with -mcpu=power5. POWER5 is ISA v2.02 whereas 970 is v2.01 plus Altivec. 2.02 added the popcntb instruction which a compiler might use. Use -mcpu=power4. Fixes: 471d7ff8b51b ("powerpc/64s: Remove POWER4 support") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921014103.587954-1-npiggin@gmail.com
2022-09-28powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loopsNicholas Piggin2-4/+22
We want to move away from using SMT priority updates for cpu_relax, and use a 'wait' instruction which is similar to x86. As well as being a much better fit for what everybody else uses and tests with, priority nops are stateful which is nasty (interrupts have to consider they might be taken at a different priority), and they're expensive to execute, similar to a mtSPR which can effect other threads in the pipe. This has shown to give results that are less affected by code alignment on benchmarks that cause a lot of spin waiting (e.g., rwsem contention on unixbench filesystem benchmarks) on POWER10. QEMU TCG only supports this instruction correctly since v7.1, versions without the fix may cause hangs whne running POWER10 CPUs. Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix checkpatch warnings RE the macros] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220920122259.363092-2-npiggin@gmail.com
2022-09-28powerpc: add ISA v3.0 / v3.1 wait opcode macroNicholas Piggin2-3/+6
The wait instruction encoding changed between ISA v2.07 and ISA v3.0. In v3.1 the instruction gained a new field. Update the PPC_WAIT macro to the current encoding. Rename the older incompatible one with a _v203 suffix as it was introduced in v2.03 (the WC field was introduced in v2.07 but the kernel only uses WC=0). Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220920122259.363092-1-npiggin@gmail.com
2022-09-28powerpc/time: avoid programming DEC at the start of the timer interruptNicholas Piggin1-14/+15
Setting DEC to maximum at the start of the timer interrupt is not necessary and can be avoided for performance when MSR[EE] is not enabled during the handler as explained in commit 0faf20a1ad16 ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use"), where this change was first attempted. The idea is that the timer interrupt runs with MSR[EE]=0, and at the end of the interrupt DEC is programmed to the next timer interval, so there is no need to clear the decrementer exception before then. When the above commit was merged, that was not quite true. The low res timer subsystem had some cases in the oneshot timer code where if the tick was to be stopped and no timers active, the clock device would not get the ->set_state_oneshot_stopped() call, so DEC would not get reprogrammed, and this would hang taking continual timer interrupts. So this was reverted in commit d2b9be1f4af5 ("powerpc/time: Always set decrementer in timer_interrupt()"), which was a partial revert of the above commit. Commit 62c1256d5447 ("timers/nohz: Switch to ONESHOT_STOPPED in the low-res handler when the tick is stopped") was later merged to fix this missing case in the timer subsystem, so now the behaviour can be restored. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220909142457.278032-1-npiggin@gmail.com
2022-09-28powerpc: Add support for early debugging via Serial 16550 consolePali Rohár4-0/+51
Currently powerpc early debugging contains lot of platform specific options, but does not support standard UART / serial 16550 console. Later legacy_serial.c code supports registering UART as early debug console from device tree but it is not early during booting, but rather later after machine description code finishes. So for real early debugging via UART is current code unsuitable. Add support for new early debugging option CONFIG_PPC_EARLY_DEBUG_16550 which enable Serial 16550 console on address defined by new option CONFIG_PPC_EARLY_DEBUG_16550_PHYSADDR and by stride by option CONFIG_PPC_EARLY_DEBUG_16550_STRIDE. With this change it is possible to debug powerpc machine descriptor code. For example this early debugging code can print on serial console also "No suitable machine description found" error which is done before legacy_serial.c code. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220822231501.16827-1-pali@kernel.org
2022-09-28powerpc/64/kdump: Limit kdump base to 512MBHari Bathini1-3/+3
Since commit e641eb03ab2b0 ("powerpc: Fix up the kdump base cap to 128M") memory for kdump kernel has been reserved at an offset of 128MB. This held up well for a long time before running into boot failure on LPARs having a lot of cores. Commit 7c5ed82b800d8 ("powerpc: Set crashkernel offset to mid of RMA region") fixed this boot failure by moving the offset to mid of RMA region. This change meant the offset is either 256MB or 512MB on LPARs as ppc64_rma_size was 512MB or 1024MB owing to commit 103a8542cb35b ("powerpc/book3s64/ radix: Fix boot failure with large amount of guest memory"). But ppc64_rma_size can be larger as well with newer f/w. So, limit crashkernel reservation offset to 512MB to avoid running into boot failures during kdump kernel boot, due to RTAS or other allocation restrictions. Also, while here, use SZ_128M instead of opening coding it. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220912065031.57416-1-hbathini@linux.ibm.com
2022-09-28powerpc: Provide syscall wrapperRohan McLure7-18/+105
Implement syscall wrapper as per s390, x86, arm64. When enabled cause handlers to accept parameters from a stack frame rather than from user scratch register state. This allows for user registers to be safely cleared in order to reduce caller influence on speculation within syscall routine. The wrapper is a macro that emits syscall handler symbols that call into the target handler, obtaining its parameters from a struct pt_regs on the stack. As registers are already saved to the stack prior to calling system_call_exception, it appears that this function is executed more efficiently with the new stack-pointer convention than with parameters passed by registers, avoiding the allocation of a stack frame for this method. On a 32-bit system, we see >20% performance increases on the null_syscall microbenchmark, and on a Power 8 the performance gains amortise the cost of clearing and restoring registers which is implemented at the end of this series, seeing final result of ~5.6% performance improvement on null_syscall. Syscalls are wrapped in this fashion on all platforms except for the Cell processor as this commit does not provide SPU support. This can be quickly fixed in a successive patch, but requires spu_sys_callback to allocate a pt_regs structure to satisfy the wrapped calling convention. Co-developed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmai.com> [mpe: Make incompatible with COMPAT to retain clearing of high bits of args] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-22-rmclure@linux.ibm.com
2022-09-28powerpc: Change system_call_exception calling conventionRohan McLure4-24/+23
Change system_call_exception arguments to pass a pointer to a stack frame container caller state, as well as the original r0, which determines the number of the syscall. This has been observed to yield improved performance to passing them by registers, circumventing the need to allocate a stack frame. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Retain clearing of high bits of args for compat tasks] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-21-rmclure@linux.ibm.com
2022-09-28powerpc: Use common syscall handler typeRohan McLure6-12/+19
Cause syscall handlers to be typed as follows when called indirectly throughout the kernel. This is to allow for better type checking. typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); Since both 32 and 64-bit abis allow for at least the first six machine-word length parameters to a function to be passed by registers, even handlers which admit fewer than six parameters may be viewed as having the above type. Coercing syscalls to syscall_fn requires a cast to void* to avoid -Wcast-function-type. Fixup comparisons in VDSO to avoid pointer-integer comparison. Introduce explicit cast on systems with SPUs. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-19-rmclure@linux.ibm.com
2022-09-28powerpc: Enable compile-time check for syscall handlersRohan McLure1-21/+11
The table of syscall handlers and registered compatibility syscall handlers has in past been produced using assembly, with function references resolved at link time. This moves link-time errors to compile-time, by rewriting systbl.S in C, and including the linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for prototypes. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-18-rmclure@linux.ibm.com
2022-09-28powerpc: Include all arch-specific syscall prototypesRohan McLure4-27/+80
Forward declare all syscall handler prototypes where a generic prototype is not provided in either linux/syscalls.h or linux/compat.h in asm/syscalls.h. This is required for compile-time type-checking for syscall handlers, which is implemented later in this series. 32-bit compatibility syscall handlers are expressed in terms of types in ppc32.h. Expose this header globally. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Use standard include guard naming for syscalls_32.h] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-17-rmclure@linux.ibm.com
2022-09-28powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlersRohan McLure4-34/+53
Arch-specific implementations of syscall handlers are currently used over generic implementations for the following reasons: 1. Semantics unique to powerpc 2. Compatibility syscalls require 'argument padding' to comply with 64-bit argument convention in ELF32 abi. 3. Parameter types or order is different in other architectures. These syscall handlers have been defined prior to this patch series without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with custom input and output types. We remove every such direct definition in favour of the aforementioned macros. Also update syscalls.tbl in order to refer to the symbol names generated by each of these macros. Since ppc64_personality can be called by both 64 bit and 32 bit binaries through compatibility, we must generate both both compat_sys_ and sys_ symbols for this handler. As an aside: A number of architectures including arm and powerpc agree on an alternative argument order and numbering for most of these arch-specific handlers. A future patch series may allow for asm/unistd.h to signal through its defines that a generic implementation of these syscall handlers with the correct calling convention be emitted, through the __ARCH_WANT_COMPAT_SYS_... convention. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-16-rmclure@linux.ibm.com
2022-09-28powerpc: Provide do_ppc64_personality helperRohan McLure1-1/+5
Avoid duplication in future patch that will define the ppc64_personality syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, by extracting the common body of ppc64_personality into a helper function. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-15-rmclure@linux.ibm.com
2022-09-28powerpc: Remove direct call to mmap2 syscall handlersRohan McLure3-13/+14
Syscall handlers should not be invoked internally by their symbol names, as these symbols defined by the architecture-defined SYSCALL_DEFINE macro. Move the compatibility syscall definition for mmap2 to syscalls.c, so that all mmap implementations can share a helper function. Remove 'inline' on static mmap helper. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix compat_sys_mmap2() prototype and offset handling] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-14-rmclure@linux.ibm.com
2022-09-28KVM: PPC: Book3S HV P9: Restore stolen time logging in dtlNicholas Piggin1-4/+45
Stolen time logging in dtl was removed from the P9 path, so guests had no stolen time accounting. Add it back in a simpler way that still avoids locks and per-core accounting code. Fixes: ecb6a7207f92 ("KVM: PPC: Book3S HV P9: Remove most of the vcore logic") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220908132545.4085849-4-npiggin@gmail.com
2022-09-28KVM: PPC: Book3S HV: Update guest state entry/exit accounting to new APINicholas Piggin1-21/+10
Update the guest state and timing entry/exit accounting to use the new API, which was introduced following issues found[1]. KVM HV does possibly call instrumented code inside the guest context, and it does call srcu inside the guest context which is fragile at best. Switch to the new API, moving the guest context inside the srcu_read_lock/unlock region. [1] https://lore.kernel.org/lkml/20220201132926.3301912-1-mark.rutland@arm.com/ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220908132545.4085849-3-npiggin@gmail.com
2022-09-28KVM: PPC: Book3S HV P9: Fix irq disabling in tick accountingNicholas Piggin1-2/+2
kvmhv_run_single_vcpu() disables PMIs as well as Linux irqs, however the tick time accounting code enables and disables irqs and not PMIs within this region. By chance this might not actually cause a bug, but it is clearly an incorrect use of the APIs. Fixes: 2251fbe76395e ("KVM: PPC: Book3S HV P9: Improve mtmsrd scheduling by delaying MSR[EE] disable") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220908132545.4085849-2-npiggin@gmail.com
2022-09-28KVM: PPC: Book3S HV P9: Clear vcpu cpu fields before enabling host irqsNicholas Piggin1-3/+3
On guest entry, vcpu->cpu and vcpu->arch.thread_cpu are set after disabling host irqs. On guest exit there is a window whre tick time accounting briefly enables irqs before these fields are cleared. Move them up to ensure they are cleared before host irqs are run. This is possibly not a problem, but is more symmetric and makes the fields less surprising. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220908132545.4085849-1-npiggin@gmail.com
2022-09-28KVM: PPC: Book3S HV: Fix decrementer migrationFabiano Rosas2-3/+16
We used to have a workaround[1] for a hang during migration that was made ineffective when we converted the decrementer expiry to be relative to guest timebase. The point of the workaround was that in the absence of an explicit decrementer expiry value provided by userspace during migration, KVM needs to initialize dec_expires to a value that will result in an expired decrementer after subtracting the current guest timebase. That stops the vcpu from hanging after migration due to a decrementer that's too large. If the dec_expires is now relative to guest timebase, its initialization needs to be guest timebase-relative as well, otherwise we end up with a decrementer expiry that is still larger than the guest timebase. 1- https://git.kernel.org/torvalds/c/5855564c8ab2 Fixes: 3c1a4322bba7 ("KVM: PPC: Book3S HV: Change dec_expires to be relative to guest timebase") Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220816222517.1916391-1-farosas@linux.ibm.com
2022-09-26powerpc: remove mmap linked list walksMatthew Wilcox (Oracle)3-19/+11
Use the VMA iterator instead. Link: https://lkml.kernel.org/r/20220906194824.2110408-34-Liam.Howlett@oracle.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26Merge tag 'mm-hotfixes-stable-2022-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mmLinus Torvalds1-9/+0
Pull last (?) hotfixes from Andrew Morton: "26 hotfixes. 8 are for issues which were introduced during this -rc cycle, 18 are for earlier issues, and are cc:stable" * tag 'mm-hotfixes-stable-2022-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits) x86/uaccess: avoid check_object_size() in copy_from_user_nmi() mm/page_isolation: fix isolate_single_pageblock() isolation behavior mm,hwpoison: check mm when killing accessing process mm/hugetlb: correct demote page offset logic mm: prevent page_frag_alloc() from corrupting the memory mm: bring back update_mmu_cache() to finish_fault() frontswap: don't call ->init if no ops are registered mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() mm: fix madivse_pageout mishandling on non-LRU page powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush mm: gup: fix the fast GUP race against THP collapse mm: fix dereferencing possible ERR_PTR vmscan: check folio_test_private(), not folio_get_private() mm: fix VM_BUG_ON in __delete_from_swap_cache() tools: fix compilation after gfp_types.h split mm/damon/dbgfs: fix memory leak when using debugfs_lookup() mm/migrate_device.c: copy pte dirty bit to page mm/migrate_device.c: add missing flush_cache_page() mm/migrate_device.c: flush TLB while holding PTL x86/mm: disable instrumentations of mm/pgprot.c ...
2022-09-26Merge branch 'mm-hotfixes-stable' into mm-stableAndrew Morton1-9/+0
2022-09-26powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flushYang Shi1-9/+0
The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will move to use RCU instead of disabling local interrupts in fast-GUP. Using an IPI is the old-styled way of serializing against fast-GUP although it still works as expected now. And fast-GUP now fixed the potential race with THP collapse by checking whether PMD is changed or not. So IPI broadcast in radix pmd collapse flush is not necessary anymore. But it is still needed for hash TLB. Link: https://lkml.kernel.org/r/20220907180144.555485-2-shy828301@gmail.com Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Yang Shi <shy828301@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26KVM: remove KVM_REQ_UNHALTPaolo Bonzini4-4/+0
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return value of kvm_vcpu_block/kvm_vcpu_halt. Remove it. No functional change intended. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20220921003201.1441511-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-26powerpc: Remove direct call to personality syscall handlerRohan McLure1-1/+1
Syscall handlers should not be invoked internally by their symbol names, as these symbols defined by the architecture-defined SYSCALL_DEFINE macro. Fortunately, in the case of ppc64_personality, its call to sys_personality can be replaced with an invocation to the equivalent ksys_personality inline helper in <linux/syscalls.h>. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-13-rmclure@linux.ibm.com
2022-09-26powerpc/32: Remove powerpc select specialisationRohan McLure3-20/+1
Syscall #82 has been implemented for 32-bit platforms in a unique way on powerpc systems. This hack will in effect guess whether the caller is expecting new select semantics or old select semantics. It does so via a guess, based off the first parameter. In new select, this parameter represents the length of a user-memory array of file descriptors, and in old select this is a pointer to an arguments structure. The heuristic simply interprets sufficiently large values of its first parameter as being a call to old select. The following is a discussion on how this syscall should be handled. As discussed in this thread, the existence of such a hack suggests that for whatever powerpc binaries may predate glibc, it is most likely that they would have taken use of the old select semantics. x86 and arm64 both implement this syscall with oldselect semantics. Remove the powerpc implementation, and update syscall.tbl to refer to emit a reference to sys_old_select and compat_sys_old_select for 32-bit binaries, in keeping with how other architectures support syscall #82. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a1a0@csgroup.eu/ Link: https://lore.kernel.org/r/20220921065605.1051927-12-rmclure@linux.ibm.com
2022-09-26powerpc: Use generic fallocate compatibility syscallRohan McLure3-9/+1
The powerpc fallocate compat syscall handler is identical to the generic implementation provided by commit 59c10c52f573f ("riscv: compat: syscall: Add compat_sys_call_table implementation"), and as such can be removed in favour of the generic implementation. A future patch series will replace more architecture-defined syscall handlers with generic implementations, dependent on introducing generic implementations that are compatible with powerpc and arm's parameter reorderings. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-11-rmclure@linux.ibm.com
2022-09-26powerpc: Fix fallocate and fadvise64_64 compat parameter combinationRohan McLure3-15/+15
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate compatibility handlers assume parameters are passed with 32-bit big-endian ABI. This affects the assignment of odd-even parameter pairs to the high or low words of a 64-bit syscall parameter. Fix fadvise64_64 fallocate compat handlers to correctly swap upper/lower 32 bits conditioned on endianness. A future patch will replace the arch-specific compat fallocate with an asm-generic implementation. This patch is intended for ease of back-port. [1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80583@www.fastmail.com/ Fixes: 57f48b4b74e7 ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-9-rmclure@linux.ibm.com
2022-09-26powerpc/64s: Fix comment on interrupt handler prologueRohan McLure1-2/+2
Interrupt handlers on 64s systems will often need to save register state from the interrupted process to make space for loading special purpose registers or for internal state. Fix a comment documenting a common code path macro in the beginning of interrupt handlers where r10 is saved to the PACA to afford space for the value of the CFAR. Comment is currently written as if r10-r12 are saved to PACA, but in fact only r10 is saved, with r11-r12 saved much later. The distance in code between these saves has grown over the many revisions of this macro. Fix this by signalling with a comment where r11-r12 are saved to the PACA. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-8-rmclure@linux.ibm.com
2022-09-26powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRSRohan McLure1-16/+12
The common interrupt handler prologue macro and the bad_stack trampolines include consecutive sequences of register saves, and some register clears. Neaten such instances by expanding use of the SAVE_GPRS macro and employing the ZEROIZE_GPR macro when appropriate. Also simplify an invocation of SAVE_GPRS targetting all non-volatile registers to SAVE_NVGPRS. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-7-rmclure@linux.ibm.com
2022-09-26powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.SRohan McLure1-20/+13
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads to the kernel stack frame. Issue the REST_GPR{,S} macros to clearly signal when this is happening, and bunch together restores at the end of the interrupt handler where the saved value is not consumed earlier in the handler code. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-6-rmclure@linux.ibm.com
2022-09-26powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlersRohan McLure1-34/+9
Use the convenience macros for saving/clearing/restoring gprs in keeping with syscall calling conventions. The plural variants of these macros can store a range of registers for concision. This works well when the user gpr value we are hoping to save is still live. In the syscall interrupt handlers, user register state is sometimes juggled between registers. Hold-off from issuing the SAVE_GPR macro for applicable neighbouring lines to highlight the delicate register save logic. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-5-rmclure@linux.ibm.com
2022-09-26powerpc: Add ZEROIZE_GPRS macros for register clearsRohan McLure1-0/+22
Provide register zeroing macros, following the same convention as existing register stack save/restore macros, to be used in later change to concisely zero a sequence of consecutive gprs. The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping with the naming of the accompanying restore and save macros, and usage of zeroize to describe this operation elsewhere in the kernel. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-4-rmclure@linux.ibm.com
2022-09-26powerpc: Save caller r3 prior to system_call_exceptionRohan McLure3-1/+3
This reverts commit 8875f47b7681 ("powerpc/syscall: Save r3 in regs->orig_r3 "). Save caller's original r3 state to the kernel stackframe before entering system_call_exception. This allows for user registers to be cleared by the time system_call_exception is entered, reducing the influence of user registers on speculation within the kernel. Prior to this commit, orig_r3 was saved at the beginning of system_call_exception. Instead, save orig_r3 while the user value is still live in r3. Also replicate this early save in 32-bit. A similar save was removed in commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") when 32-bit adopted system_call_exception. Revert its removal of orig_r3 saves. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-3-rmclure@linux.ibm.com
2022-09-26powerpc: Remove asmlinkage from syscall handler definitionsRohan McLure2-12/+12
The asmlinkage macro has no special meaning in powerpc, and prior to this patch is used sporadically on some syscall handler definitions. On architectures that do not define asmlinkage, it resolves to extern "C" for C++ compilers and a nop otherwise. The current invocations of asmlinkage provide far from complete support for C++ toolchains, and so the macro serves no purpose in powerpc. Remove all invocations of asmlinkage in arch/powerpc. These incidentally only occur in syscall definitions and prototypes. Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-2-rmclure@linux.ibm.com
2022-09-26powerpc/irq: Refactor irq_soft_mask_{set,or}_return()Christophe Leroy1-22/+4
This partialy reapply commit ef5b570d3700 ("powerpc/irq: Don't open code irq_soft_mask helpers") which was reverted by commit 684c68d92e2e ("Revert "powerpc/irq: Don't open code irq_soft_mask helpers"") irq_soft_mask_set_return() and irq_soft_mask_or_return() are overset of irq_soft_mask_set(). Have them use irq_soft_mask_set() instead of duplicating it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/18473da42362ee8f07bce36b9caef8ba77d7633f.1663656054.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Remove impossible mmu_psize_defs[] on nohashChristophe Leroy1-49/+15
Today there is: if e500 or 8xx if e500 mmu_psize_defs[] = else if 8xx mmu_psize_defs[] = else mmu_psize_defs[] = endif endif The else leg is dead definition. Drop that else leg and rewrite as: if e500 mmu_psize_defs[] = endif if 8xx mmu_psize_defs[] = endif Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/030a843449f348c0b709ca5349640624f36a016f.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Cleanup idle for e500Christophe Leroy6-19/+4
e500 idle setup is a bit messy. e500_idle() is used for PPC32 while book3e_idle() is used for PPC64. As they are mutually exclusive, call them all e500_idle(). Use CONFIG_MPC_85xx instead of PPC32 + E500 in Makefile and rename idle_e500.c to idle_85xx.c . Rename idle_book3e.c to idle_64e.c and remove #ifdef PPC64 in as it's only built on PPC64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8039301334e948974c85ec5ef2db37751075185b.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Simplify redundant Kconfig testsChristophe Leroy3-3/+3
PPC_85xx implies PPC32 so no need to check PPC32 in addition. PPC64 && !PPC_BOOK3E_64 means PPC_BOOK3S_64. PPC_BOOK3E_64 implies PPC_E500. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/244cce3e603f2b79796314c0c1c46cab927b9adc.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Replace PPC_85xx || PPC_BOOKE_64 by PPC_E500Christophe Leroy3-3/+3
PPC_E500 is the same as PPC_85xx || PPC_BOOKE_64 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/af79696f8cb8536fb4e20c0d98a6bf159a9e371b.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Remove CONFIG_PPC_BOOK3E_MMUChristophe Leroy10-18/+14
CONFIG_PPC_BOOK3E_MMU is redundant with CONFIG_PPC_E500. Remove it. Also rename mmu-book3e.h to mmu-e500.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c5549cd59a131204ff94ab909cad2e2dad4ddf2f.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Remove CONFIG_PPC_FSL_BOOK3EChristophe Leroy32-66/+58
CONFIG_PPC_FSL_BOOK3E is redundant with CONFIG_PPC_E500. Remove it. And rename five files accordingly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Rename include guards to match new file names] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/795cb93b88c9a0279289712e674f39e3b108a1b4.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Change CONFIG_E500 to CONFIG_PPC_E500Christophe Leroy18-38/+38
It will be used outside arch/powerpc, make it clear its a powerpc configuration item. And we already have CONFIG_PPC_E500MC, so that will make it more consistent. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e63b22083c11c4300f4a82d3123a46e5fdd54fa6.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Remove redundant selection of E500 and E500MCChristophe Leroy1-3/+0
PPC_85xx and PPC_BOOK3E_64 already select E500 so no need to select it again by PPC_QEMU_E500 and CORENET_GENERIC as they depend on PPC_85xx || PPC_BOOK3E_64. PPC_BOOK3E_64 already selects E500MC so no need to select it again by PPC_QEMU_E500 if PPC64, PPC_BOOK3E_64 is the only way into PPC_QEMU_E500 with PPC64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/44f03fa1506892fabf626dceb2f47a049908b6af.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc: Remove CONFIG_PPC_BOOK3EChristophe Leroy20-70/+66
CONFIG_PPC_BOOK3E is redundant with CONFIG_PPC_BOOK3E_64. The later is more explicit about the fact that it's a 64 bits target. Remove CONFIG_PPC_BOOK3E. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5d0891490813c19cdcfc04678f512ea68cba3e64.1663606876.git.christophe.leroy@csgroup.eu
2022-09-26powerpc/cputable: Split cpu_specs[] for mpc85xx and e500mcChristophe Leroy3-66/+65
e500v1/v2 and e500mc are said to be mutually exclusive in Kconfig. Split e500 cpu_specs[] and then restrict the non e500mc to PPC32 which is then 85xx. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Tweak formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/553b901ea91e393df231103da4b018e9b251b0e9.1663606876.git.christophe.leroy@csgroup.eu