aboutsummaryrefslogtreecommitdiffstats
path: root/arch/s390/kernel/entry.S (follow)
Commit message (Collapse)AuthorAgeFilesLines
* s390/idle: Rewrite psw_idle() in CSven Schnelle2024-05-141-23/+0
| | | | | | | | | To ease maintenance and further enhancements, convert the psw_idle() function to C. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390/mm: Fix NULL pointer dereferenceSven Schnelle2024-04-171-1/+2
| | | | | | | | | | | | | | | | | The recently added check to figure out if a fault happened on gmap ASCE dereferences the gmap pointer in lowcore without checking that it is not NULL. For all non-KVM processes the pointer is NULL, so that some value from lowcore will be read. With the current layouts of struct gmap and struct lowcore the read value (aka ASCE) is zero, so that this doesn't lead to any observable bug; at least currently. Fix this by adding the missing NULL pointer check. Fixes: 64c3431808bd ("s390/entry: compare gmap asce to determine guest/host fault") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390/entry: align system call table on 8 bytesSumanth Korikkar2024-04-031-0/+1
| | | | | | | | | | | | Align system call table on 8 bytes. With sys_call_table entry size of 8 bytes that eliminates the possibility of a system call pointer crossing cache line boundary. Cc: stable@kernel.org Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/entry: compare gmap asce to determine guest/host faultSven Schnelle2024-03-171-16/+15
| | | | | | | | | | | | With the current implementation, there are some cornercases where a host fault would be treated as a guest fault, for example when the sie instruction causes a program check. Therefore store the gmap asce in ptregs, and use that to compare the primary asce from the fault instead of matching instruction addresses. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: remove OUTSIDE macroSven Schnelle2024-03-171-25/+4
| | | | | | | | | | With only one OUTSIDE user left, remove the macro and move the code directly to the machine check handler. This has the advantage that it is much easier to determine which registers are used. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: add CIF_SIE flag and remove sie64a() address checkSven Schnelle2024-03-171-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | When a program check, interrupt or machine check is triggered, the PSW address is compared to a certain range of the sie64a() function to figure out whether SIE was interrupted and a cleanup of SIE is needed. This doesn't work with kprobes: If kprobes probes an instruction, it copies the instruction to the kprobes instruction page and overwrites the original instruction with an undefind instruction (Opcode 00). When this instruction is hit later, kprobes single-steps the instruction on the kprobes_instruction page. However, if this instruction is a relative branch instruction it will now point to a different location in memory due to being moved to the kprobes instruction page. If the new branch target points into sie64a() the kernel assumes it interrupted SIE when processing the breakpoint and will crash trying to access the SIE control block. Instead of comparing the address, introduce a new CIF_SIE flag which indicates whether SIE was interrupted. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/kvm: convert to regular kernel fpu userHeiko Carstens2024-02-161-2/+0
| | | | | | | | | | | | | | | KVM modifies the kernel fpu's regs pointer to its own area to implement its custom version of preemtible kernel fpu context. With general support for preemptible kernel fpu context there is no need for the extra complexity in KVM code anymore. Therefore convert KVM to a regular kernel fpu user. In particular this means that all TIF_FPU checks can be removed, since the fpu register context will never be changed by other kernel fpu users, and also the fpu register context will be restored if a thread is preempted. Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/fpu: convert FPU CIF flag to regular TIF flagHeiko Carstens2024-02-161-1/+1
| | | | | | | | | | | | | The FPU state, as represented by the CIF_FPU flag reflects the FPU state of a task, not the CPU it is running on. Therefore convert the flag to a regular TIF flag. This removes the magic in switch_to() where a save_fpu_regs() call for the currently (previous) running task sets the per-cpu CIF_FPU flag, which is required to restore FPU register contents of the next task, when it returns to user space. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/fpu: move, rename, and merge header filesHeiko Carstens2024-02-161-1/+1
| | | | | | | | | | | | | | | Move, rename, and merge the fpu and vx header files. This way fpu header files have a consistent naming scheme (fpu*.h). Also get rid of the fpu subdirectory and move header files to asm directory, so that all fpu and vx header files can be found at the same location. Merge internal.h header file into other header files, since the internal helpers are used at many locations. so those helper functions are really not internal. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/nmi: remove register validation codeHeiko Carstens2024-02-161-5/+0
| | | | | | | | | Remove the historic machine check handler code which validates registers. Registers are automatically validated as part of the machine check handling sequence (see Principles of Operation, Machine-Check Handling chapter, Validation). Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/switch_to: use generic header fileHeiko Carstens2024-02-121-5/+5
| | | | | | | | | | | | | | Move the switch_to() implementation to process.c and use the generic switch_to.h header file instead, like some other architectures. This addresses also the oddity that the old switch_to() implementation assigns the return value of __switch_to() to 'prev' instead of 'last', like it should. Remove also all includes of switch_to.h from C files, except process.c. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390: replace #include <asm/export.h> with #include <linux/export.h>Masahiro Yamada2023-08-091-1/+1
| | | | | | | | | | | | | | Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost") deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>. Replace #include <asm/export.h> with #include <linux/export.h>. After all the <asm/export.h> lines are converted, <asm/export.h> and <asm-generic/export.h> will be removed. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20230806151641.394720-2-masahiroy@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: remove mcck clockSven Schnelle2023-07-031-1/+0
| | | | | | | | | | | In the past machine checks where accounted as irq time. With the conversion to generic entry, it was decided to account machine checks to the current context. The stckf at the beginning of the machine check handler and the lowcore member is no longer required, therefore remove it. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390/entry: rework entering DAT-on mode on CPU restartAlexander Gordeev2023-07-031-3/+8
| | | | | | | | | | | | | | Instead of enforcing PSW_MASK_DAT bit on previously stored in lowcore restart_psw.mask use the PSW_KERNEL_BITS mask (which contains PSW_MASK_DAT) directly. As result, the PSW mask stored in lowcore is only used to enter the CPU restart routine, while PSW_KERNEL_BITS is used to enter the kernel code - similarily to commit 64ea2977add2 ("s390/mm: start kernel with DAT enabled"). Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390: consistently use .balign instead of .alignHeiko Carstens2023-06-281-1/+1
| | | | | | | | | | | | The .align directive has inconsistent behavior across architectures. Use .balign instead everywhere. This is a no-op for s390, but with this there is no mix in using .align and .balign anymore. Future code is supposed to use only .balign. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390/entry: use SYM* macros instead of ENTRY(), etc.Heiko Carstens2023-04-191-35/+34
| | | | | | | | Consistently use the SYM* family of macros instead of the deprecated ENTRY(), ENDPROC(), etc. family of macros. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: enable HAVE_ARCH_STACKLEAKHeiko Carstens2023-04-041-0/+10
| | | | | | | | | | | | | Add support for the stackleak feature. Whenever the kernel returns to user space the kernel stack is filled with a poison value. Enabling this feature is quite expensive: e.g. after instrumenting the getpid() system call function to have a 4kb stack the result is an increased runtime of the system call by a factor of 3. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/stack: use STACK_INIT_OFFSET where possibleHeiko Carstens2023-04-041-8/+4
| | | | | | | | | Make STACK_INIT_OFFSET also available for assembler code, and use it everywhere instead of open-coding it at several places. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/entry: rely on long-displacement facilityVasily Gorbik2023-04-041-5/+3
| | | | | | | | | | | | | | | | | Since commit 4efd417f298b ("s390: raise minimum supported machine generation to z10"), the long-displacement facility is assumed and required for the kernel. Clean up a couple of places in the entry code, where long-displacement could be used directly instead of using a base register. However, there are still a few other places where a base register has to be used to extend short-displacement for the second lowcore page access. Notably, boot/head.S still has to be built for z900, and in mcck_int_handler, spt and lbear, which don't have long-displacements, but need to access save areas at the second lowcore page. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/bp: remove __bpon()Heiko Carstens2023-03-131-12/+6
| | | | | | | | | | There is no point in changing branch prediction state of a cpu shortly before it enters stop state. Therefore remove __bpon(). Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/bp: remove TIF_ISOLATE_BPHeiko Carstens2023-03-131-23/+13
| | | | | | | | | | | TIF_ISOLATE_BP is unused since it was introduced with commit 6b73044b2b00 ("s390: run user space and KVM guests with modified branch prediction"). Given that there is no use case remove it again. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/bp: add missing BPENTER to program check handlerHeiko Carstens2023-03-131-0/+1
| | | | | | | | | | When leaving interpretive execution because of a program check BPENTER should be called like it is done on interrupt exit as well. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/mcck: cleanup user process termination pathAlexander Gordeev2023-02-281-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a machine check interrupt hits while user process is running __s390_handle_mcck() helper function is called directly from the interrupt handler and terminates the current process by calling make_task_dead() routine. The make_task_dead() is not allowed to be called from interrupt context which forces the machine check handler switch to the kernel stack and enable local interrupts first. The __s390_handle_mcck() could also be called to service pending work, but this time from the external interrupts handler. It is the machine check handler that establishes the work and schedules the external interrupt, therefore the machine check interrupt itself should be disabled while reading out the corresponding variable: local_mcck_disable(); mcck = *this_cpu_ptr(&cpu_mcck); memset(this_cpu_ptr(&cpu_mcck), 0, sizeof(mcck)); local_mcck_enable(); However, local_mcck_disable() does not have effect when __s390_handle_mcck() is called directly form the machine check handler, since the machine check interrupt is still disabled. Therefore, it is not the opening bracket to the following local_mcck_enable() function. Simplify the user process termination flow by scheduling the external interrupt and killing the affected process from the interrupt context. Assume a kernel-generated signal is always delivered and ignore a value returned by do_send_sig_info() funciton. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: remove toolchain dependent micro-optimizationHeiko Carstens2023-02-141-6/+0
| | | | | | | | | | | Get rid of CONFIG_AS_IS_LLVM in entry.S to make the code a bit more readable. This removes a micro-optimization, but given that the llvm IAS limitation will likely stay, just use the version that works with llvm. See commit 4c25f0ff6336 ("s390/entry: workaround llvm's IAS limitations") for further details. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-12-151-11/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...
| * s390/entry: sort out physical vs virtual pointers usage in sie64aNico Boehr2022-10-261-11/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix virtual vs physical address confusion (which currently are the same). sie_block is accessed in entry.S and passed it to hardware, which is why both its physical and virtual address are needed. To avoid every caller having to do the virtual-physical conversion, add a new function sie64a() which converts the virtual address to physical. Signed-off-by: Nico Boehr <nrb@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20221020143159.294605-3-nrb@linux.ibm.com Message-Id: <20221020143159.294605-3-nrb@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
* | s390/nmi: move storage error checking back to C, enter with DAT onHeiko Carstens2022-12-061-30/+4
|/ | | | | | | | | | | | | | Checking for storage errors in machine check entry code was done in order to handle also storage errors on kernel page tables. However this is extremely unlikely and some basic assumptions what works on machine check entry are necessary anyway. In order to simplify machine check handling delay checking for storage errors to C code. With this also change the machine check new PSW to have DAT on, which simplifies the entry code even further. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
* s390/mcck: isolate SIE instruction when setting CIF_MCCK_GUEST flagAlexander Gordeev2022-06-011-1/+5
| | | | | | | | | | | | | | | | | | | Commit d768bd892fc8 ("s390: add options to change branch prediction behaviour for the kernel") introduced .Lsie_exit label - supposedly to fence off SIE instruction. However, the corresponding address range length .Lsie_crit_mcck_length was not updated, which led to BPON code potentionally marked with CIF_MCCK_GUEST flag. Both .Lsie_exit and .Lsie_crit_mcck_length were removed with commit 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S"), but the issue persisted - currently BPOFF and BPENTER macros might get wrongly considered by the machine check handler as a guest. Fixes: d768bd892fc8 ("s390: add options to change branch prediction behaviour for the kernel") Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390: generate register offsets into pt_regs automaticallyHeiko Carstens2022-05-251-17/+0
| | | | | | | | Use asm offsets method to generate register offsets into pt_regs, instead of open-coding at several places. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: workaround llvm's IAS limitationsHeiko Carstens2022-05-171-2/+12
| | | | | | | | | | | | | | | | | llvm's integrated assembler cannot handle immediate values which are calculated with two local labels: <instantiation>:3:13: error: invalid operand for instruction clgfi %r14,.Lsie_done - .Lsie_gmap Workaround this by adding clang specific code which reads the specific value from memory. Since this code is within the hot paths of the kernel and adds an additional memory reference, keep the original code, and add ifdef'ed code. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/r/20220511120532.2228616-5-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/alternatives: provide identical sized orginal/alternative sequencesHeiko Carstens2022-05-171-10/+10
| | | | | | | | | | | | | | Explicitly provide identical sized original/alternative instruction sequences. This way there is no need for the s390 specific alternatives infrastructure to generate padding sequences. The code which generates such sequences will be removed with a follow on patch. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20220511120532.2228616-2-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/pai: add support for cryptography countersThomas Richter2022-05-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PMU device driver perf_pai_crypto supports Processor Activity Instrumentation (PAI), available with IBM z16: - maps a full page to lowcore address 0x1500. - uses CR0 bit 13 to turn PAI crypto counting on and off. - creates a sample with raw data on each context switch out when at context switch some mapped counters have a value of nonzero. This device driver only supports CPU wide context, no task context is allowed. Support for counting: - one or more counters can be specified using perf stat -e pai_crypto/xxx/ where xxx stands for the counter event name. Multiple invocation of this command is possible. The counter names are listed in /sys/devices/pai_crypto/events directory. - one special counters can be specified using perf stat -e pai_crypto/CRYPTO_ALL/ which returns the sum of all incremented crypto counters. - one event pai_crypto/CRYPTO_ALL/ is reserved for sampling. No multiple invocations are possible. The event collects data at context switch out and saves them in the ring buffer. Add qpaci assembly instruction to query supported memory mapped crypto counters. It returns the number of counters (no holes allowed in that range). The PAI crypto counter events are system wide and can not be executed in parallel. Therefore some restrictions documented in function paicrypt_busy apply. In particular event CRYPTO_ALL for sampling must run exclusive. Only counting events can run in parallel. PAI crypto counter events can not be created when a CPU hot plug add is processed. This means a CPU hot plug add does not get the necessary PAI event to record PAI cryptography counter increments on the newly added CPU. CPU hot plug remove removes the event and terminates the counting of PAI counters immediately. Co-developed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20220504062351.2954280-3-tmricht@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/entry: remove broken and not needed codeHeiko Carstens2022-05-061-4/+1
| | | | | | | | | | | | | | | | | | | | | | | LLVM's integrated assembler reports the following error when compiling entry.S: <instantiation>:38:5: error: unknown token in expression tm %r8,0x0001 # coming from user space? The correct instruction would have been tmhh instead of tm. The current code is doing nothing, since (with gas) it get's translated to a tm instruction which reads from real address 8, which again contains always zero, and therefore the conditional code is never executed. Note that due to the missing displacement gas translates "%r8" into "8(%r0)". Also code inspection reveals that this conditional code is not needed. Therefore remove it. Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/alternatives: use insn format for new instructionsHeiko Carstens2022-03-271-5/+5
| | | | | | | | | | | Use insn format with instruction format specifier instead of plain longs. This way it is also more obvious that code instead of data is generated. The generated code is identical. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: convert ".insn" encoding to instruction namesVasily Gorbik2022-03-101-2/+2
| | | | | | | | | | | | | With z10 as minimum supported machine generation many ".insn" encodings could be now converted to instruction names. There are couple of exceptions - stfle is used from the als code built for z900 and cannot be converted - few ".insn" directives encode unsupported instruction formats The generated code is identical before/after this change. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: assume stckf is always presentVasily Gorbik2022-03-101-8/+3
| | | | | | | | | | With z10 as minimum supported machine generation the store-clock-fast facility (25) is always present and checked in als code. Drop alternatives and always use stckf. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/extable: move EX_TABLE define to asm-extable.hHeiko Carstens2022-03-081-0/+1
| | | | | | | | | | | | | | Follow arm64 and riscv and move the EX_TABLE define to asm-extable.h which is a lot less generic than the current linkage.h. Also make sure that all files which contain EX_TABLE usages actually include the new header file. This should make sure that the files always compile and there won't be any random compile breakage due to other header file dependencies. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/entry: remove unused expoline thunkVasily Gorbik2022-03-011-1/+0
| | | | | | | Remove __s390_indirect_jump_r13use_r14 expoline thunk unused since commit fbbdfca5c553 ("s390/entry.S: factor out SIEEXIT macro"). Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: remove invalid email address of Heiko CarstensHeiko Carstens2022-02-061-1/+0
| | | | | | | | | | | Remove my old invalid email address which can be found in a couple of files. Instead of updating it, just remove my contact data completely from source files. We have git and other tools which allow to figure out who is responsible for what with recent contact data. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390: add support for BEAR enhancement facilitySven Schnelle2021-10-261-8/+37
| | | | | | | | | | | | | | | | | | | The Breaking-Event-Address-Register (BEAR) stores the address of the last breaking event instruction. Breaking events are usually instructions that change the program flow - for example branches, and instructions that modify the address in the PSW like lpswe. This is useful for debugging wild branches, because one could easily figure out where the wild branch was originating from. What is problematic is that lpswe is considered a breaking event, and therefore overwrites BEAR on kernel exit. The BEAR enhancement facility adds new instructions that allow to save/restore BEAR and also an lpswey instruction that doesn't cause a breaking event. So we can save BEAR on kernel entry and restore it on exit to user space. Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/entry: make oklabel within CHKSTG macro localHeiko Carstens2021-08-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Make the oklabel within the CHKSTG macro local. This makes sure that tools like objdump and the crash debugging tool still disassemble full functions where the macro has been used instead of stopping half way where such a global label is used and one has to guess how to disassemble the rest of such a function: E.g.: 0000000000cb0270 <mcck_int_handler>: cb0270: b2 05 03 20 stck 800 ... cb0354: a7 74 00 97 jne cb0482 <oklabel270+0xe2> 0000000000cb0358 <oklabel243>: cb0358: c0 e0 00 22 4e 8f larl %r14,10fa076 <opcode+0x2558> ... Fixes: d35925b34996 ("s390/mcck: move storage error checks to assembler") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/smp: enable DAT before CPU restart callback is calledAlexander Gordeev2021-08-261-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The restart interrupt is triggered whenever a secondary CPU is brought online, a remote function call dispatched from another CPU or a manual PSW restart is initiated and causes the system to kdump. The handling routine is always called with DAT turned off. It then initializes the stack frame and invokes a callback. The existing callbacks handle DAT as follows: * __do_restart() and __machine_kexec() turn in on upon entry; * __ipl_run(), __reipl_run() and __dump_run() do not turn it right away, but all of them call diag308() - which turns DAT on, but only if kasan is enabled; In addition to the described complexity all callbacks (and the functions they call) should avoid kasan instrumentation while DAT is off. This update enables DAT in the assembler restart handler and relieves any callbacks (which are mostly C functions) from dealing with DAT altogether. There are four types of CPU restart that initialize control registers in different ways: 1. Start of secondary CPU on boot - control registers are inherited from the IPL CPU; 2. Restart of online CPU - control registers of the CPU being restarted are kept; 3. Hotplug of offline CPU - control registers are inherited from the starting CPU; 4. Start of offline CPU triggered by manual PSW restart - the control registers are read from the absolute lowcore and contain the boot time IPL CPU values updated with all follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit() routines; In first three cases contents of the control registers is the most recent. In the latter case control registers are good enough to facilitate successful completion of kdump operation. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
* s390/mcck: move register validation to C codeAlexander Gordeev2021-07-051-38/+1
| | | | | | | | | | | | | | | | | | | | | | | | This update partially reverts commit 3037a52f9846 ("s390/nmi: do register validation as early as possible"). Storage error checks and control registers validation are left in the assembler code, since correct ASCEs and page tables are required to enable DAT - which is done before the C handler is entered. System damage, kernel instruction address and PSW MWP checks are left in the assembler code as well, since there is no way to proceed if one of these checks is failed. The getcpu vdso syscall reads CPU number from the programmable field of the TOD clock. Disregard the TOD programmable register validity bit and load the CPU number into the TOD programmable field unconditionally. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/mcck: move storage error checks to assemblerAlexander Gordeev2021-07-051-11/+32
| | | | | | | | | | | | | | | | | | The current storage errors tackling is wrong - the DAT is enabled in assembler code before the actual storage checks in C half are executed. In case the page tables themselves are damaged such approach is not going to work. With this update unrecoverable storage errors are not passed to C code for handling, but rather the machine is stopped right away. The only exception to this flow is when a machine check occurred in KVM guest - in this case the errors are reinjected by the handler. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/mcck: always enter C handler with DAT enabledAlexander Gordeev2021-07-051-3/+32
| | | | | | | | | | | | | | | | | | | | | | | The machine check handler must be entered with DAT disabled in case control registers are corrupted or a storage error happened and we can not tell if such error corresponds to a page table. Both of described conditions end up in stopping all CPUs and entering the disabled wait in C half of the handler. However, the storage errors are still checked after the DAT is enabled and C code is entered. In case a page table is damaged such flow is not expected to work. This update paves the way for moving the storage error checks from C to assembler half. All fatal errors that can only be handled with DAT disabled are handled in assembler half also. As result, the C half is only entered if the DAT is secured. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/mcck: optimize user mode check in case of !CONFIG_KVMAlexander Gordeev2021-07-051-2/+4
| | | | | | | | | | In case of the !CONFIG_KVM use "jz" instead of "jnz" when detecting user mode and get rid of unnecessary jump as result. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Christia Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* s390/entry.S: factor out SIEEXIT macroAlexander Gordeev2021-07-051-16/+12
| | | | | | | | | | Factor out SIEEXIT macro and use it instead of cleanup_sie routine. As a side effect %r13 and %r14 are spared. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Christia Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
* Merge tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds2021-07-041-30/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull s390 updates from Vasily Gorbik: - Rework inline asm to get rid of error prone "register asm" constructs, which are problematic especially when code instrumentation is enabled. In particular introduce and use register pair union to allocate even/odd register pairs. Unfortunately this breaks compatibility with older clang compilers and minimum clang version for s390 has been raised to 13. https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/ - Fix gcc 11 warnings, which triggered various minor reworks all over the code. - Add zstd kernel image compression support. - Rework boot CPU lowcore handling. - De-duplicate and move kernel memory layout setup logic earlier. - Few fixes in preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for mem functions. - Remove broken and unused power management support leftovers in s390 drivers. - Disable stack-protector for decompressor and purgatory to fix buildroot build. - Fix vt220 sclp console name to match the char device name. - Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in zPCI code. - Remove some implausible WARN_ON_ONCEs and remove arch specific counter transaction call backs in favour of default transaction handling in perf code. - Extend/add new uevents for online/config/mode state changes of AP card / queue device in zcrypt. - Minor entry and ccwgroup code improvements. - Other small various fixes and improvements all over the code. * tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (91 commits) s390/dasd: use register pair instead of register asm s390/qdio: get rid of register asm s390/ioasm: use symbolic names for asm operands s390/ioasm: get rid of register asm s390/cmf: get rid of register asm s390/lib,string: get rid of register asm s390/lib,uaccess: get rid of register asm s390/string: get rid of register asm s390/cmpxchg: use register pair instead of register asm s390/mm,pages-states: get rid of register asm s390/lib,xor: get rid of register asm s390/timex: get rid of register asm s390/hypfs: use register pair instead of register asm s390/zcrypt: Switch to flexible array member s390/speculation: Use statically initialized const for instructions virtio/s390: get rid of open-coded kvm hypercall s390/pci: add zpci_set_irq()/zpci_clear_irq() scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390 s390/ipl: use register pair instead of register asm s390/mem_detect: fix tprot() program check new psw handling ...
| * s390/entry.S: factor out OUTSIDE macroAlexander Gordeev2021-06-161-23/+25
| | | | | | | | | | | | | | | | | | Introduce OUTSIDE macro that checks whether an instruction address is inside or outside of a block of instructions. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
| * s390/mcck: cleanup use of cleanup_sie_mcckAlexander Gordeev2021-06-071-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | cleanup_sie_mcck label is called from a single location only and thus does not need to be a subroutine. Move the labelled code to the caller - by doing that the SIE critical section checks appear next to each other and the SIE cleanup becomes bit more readable. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>