aboutsummaryrefslogtreecommitdiffstats
path: root/arch/s390/include/asm/processor.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2017-07-03Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds1-2/+4
Pull s390 updates from Martin Schwidefsky: "The bulk of the s390 patches for 4.13. Some new things but mostly bug fixes and cleanups. Noteworthy changes: - The SCM block driver is converted to blk-mq - Switch s390 to 5 level page tables. The virtual address space for a user space process can now have up to 16EB-4KB. - Introduce a ELF phdr flag for qemu to avoid the global vm.alloc_pgste which forces all processes to large page tables - A couple of PCI improvements to improve error recovery - Included is the merge of the base support for proper machine checks for KVM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) s390/dasd: Fix faulty ENODEV for RO sysfs attribute s390/pci: recognize name clashes with uids s390/pci: provide more debug information s390/pci: fix handling of PEC 306 s390/pci: improve pci hotplug s390/pci: introduce clp_get_state s390/pci: improve error handling during fmb (de)registration s390/pci: improve unreg_ioat error handling s390/pci: improve error handling during interrupt deregistration s390/pci: don't cleanup in arch_setup_msi_irqs KVM: s390: Backup the guest's machine check info s390/nmi: s390: New low level handling for machine check happening in guest s390/fpu: export save_fpu_regs for all configs s390/kvm: avoid global config of vm.alloc_pgste=1 s390: rename struct psw_bits members s390: rename psw_bits enums s390/mm: use correct address space when enabling DAT s390/cio: introduce io_subchannel_type s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL s390/dumpstack: remove raw stack dump ...
2017-06-28arch: remove unused macro/function thread_saved_pc()Tobias Klauser1-5/+0
The only user of thread_saved_pc() in non-arch-specific code was removed in commit 8243d5597793 ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-28Merge tag 'nmiforkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into featuresMartin Schwidefsky1-0/+2
Pull kvm patches from Christian Borntraeger: "s390,kvm: provide plumbing for machines checks when running guests" This provides the basic plumbing for handling machine checks when running guests
2017-06-27s390/nmi: s390: New low level handling for machine check happening in guestQingFeng Hao1-0/+2
Add the logic to check if the machine check happens when the guest is running. If yes, set the exit reason -EINTR in the machine check's interrupt handler. Refactor s390_do_machine_check to avoid panicing the host for some kinds of machine checks which happen when guest is running. Reinject the instruction processing damage's machine checks including Delayed Access Exception instead of damaging the host if it happens in the guest because it could be caused by improper update on TLB entry or other software case and impacts the guest only. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2017-06-12s390/mm: implement 5 level pages tablesMartin Schwidefsky1-2/+2
Add the logic to upgrade the page table for a 64-bit process to five levels. This increases the TASK_SIZE from 8PB to 16EB-4K. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-25s390/mm: make TASK_SIZE independent from the number of page table levelsMartin Schwidefsky1-4/+5
The TASK_SIZE for a process should be maximum possible size of the address space, 2GB for a 31-bit process and 8PB for a 64-bit process. The number of page table levels required for a given memory layout is a consequence of the mapped memory areas and their location. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-03-22s390: add a system call for guarded storageMartin Schwidefsky1-0/+5
This adds a new system call to enable the use of guarded storage for user space processes. The system call takes two arguments, a command and pointer to a guarded storage control block: s390_guarded_storage(int command, struct gs_cb *gs_cb); The second argument is relevant only for the GS_SET_BC_CB command. The commands in detail: 0 - GS_ENABLE Enable the guarded storage facility for the current task. The initial content of the guarded storage control block will be all zeros. After the enablement the user space code can use load-guarded-storage-controls instruction (LGSC) to load an arbitrary control block. While a task is enabled the kernel will save and restore the current content of the guarded storage registers on context switch. 1 - GS_DISABLE Disables the use of the guarded storage facility for the current task. The kernel will cease to save and restore the content of the guarded storage registers, the task specific content of these registers is lost. 2 - GS_SET_BC_CB Set a broadcast guarded storage control block. This is called per thread and stores a specific guarded storage control block in the task struct of the current task. This control block will be used for the broadcast event GS_BROADCAST. 3 - GS_CLEAR_BC_CB Clears the broadcast guarded storage control block. The guarded- storage control block is removed from the task struct that was established by GS_SET_BC_CB. 4 - GS_BROADCAST Sends a broadcast to all thread siblings of the current task. Every sibling that has established a broadcast guarded storage control block will load this control block and will be enabled for guarded storage. The broadcast guarded storage control block is used up, a second broadcast without a refresh of the stored control block with GS_SET_BC_CB will not have any effect. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-24s390: TASK_SIZE for kernel threadsMartin Schwidefsky1-1/+2
Return a sensible value if TASK_SIZE if called from a kernel thread. This gets us around an issue with copy_mount_options that does a magic size calculation "TASK_SIZE - (unsigned long)data" while in a kernel thread and data pointing to kernel space. Cc: <stable@vger.kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-23s390: restore address space when returning to user spaceHeiko Carstens1-4/+8
Unbalanced set_fs usages (e.g. early exit from a function and a forgotten set_fs(USER_DS) call) may lead to a situation where the secondary asce is the kernel space asce when returning to user space. This would allow user space to modify kernel space at will. This would only be possible with the above mentioned kernel bug, however we can detect this and fix the secondary asce before returning to user space. Therefore a new TIF_ASCE_SECONDARY which is used within set_fs. When returning to user space check if TIF_ASCE_SECONDARY is set, which would indicate a bug. If it is set print a message to the console, fixup the secondary asce, and then return to user space. This is similar to what is being discussed for x86 and arm: "[RFC] syscalls: Restore address limit after a syscall". Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-23s390: rename CIF_ASCE to CIF_ASCE_PRIMARYHeiko Carstens1-2/+2
This is just a preparation patch in order to keep the "restore address space after syscall" patch small. Rename CIF_ASCE to CIF_ASCE_PRIMARY to be unique and specific when introducing a second CIF_ASCE_SECONDARY CIF flag. Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-22Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds1-2/+2
Pull s390 updates from Martin Schwidefsky: - New entropy generation for the pseudo random number generator. - Early boot printk output via sclp to help debug crashes on boot. This needs to be enabled with a kernel parameter. - Add proper no-execute support with a bit in the page table entry. - Bug fixes and cleanups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (65 commits) s390/syscall: fix single stepped system calls s390/zcrypt: make ap_bus explicitly non-modular s390/zcrypt: Removed unneeded debug feature directory creation. s390: add missing "do {} while (0)" loop constructs to multiline macros s390/mm: add cond_resched call to kernel page table dumper s390: get rid of MACHINE_HAS_PFMF and MACHINE_HAS_HPAGE s390/mm: make memory_block_size_bytes available for !MEMORY_HOTPLUG s390: replace ACCESS_ONCE with READ_ONCE s390: Audit and remove any remaining unnecessary uses of module.h s390: mm: Audit and remove any unnecessary uses of module.h s390: kernel: Audit and remove any unnecessary uses of module.h s390/kdump: Use "LINUX" ELF note name instead of "CORE" s390: add no-execute support s390: report new vector facilities s390: use correct input data address for setup_randomness s390/sclp: get rid of common response code handling s390/sclp: don't add new lines to each printed string s390/sclp: make early sclp code readable s390/sclp: disable early sclp code as soon as the base sclp driver is active s390/sclp: move early printk code to drivers ...
2017-02-17s390: add missing "do {} while (0)" loop constructs to multiline macrosHeiko Carstens1-2/+2
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-14sched/cputime, s390: Implement delayed accounting of system timeMartin Schwidefsky1-0/+3
The account_system_time() function is called with a cputime that occurred while running in the kernel. The function detects which context the CPU is currently running in and accounts the time to the correct bucket. This forces the arch code to account the cputime for hardirq and softirq immediately. Such accounting function can be costly and perform unwelcome divisions and multiplications, among others. The arch code can delay the accounting for system time. For s390 the accounting is done once per timer tick and for each task switch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> [ Rebase against latest linus tree and move account_system_index_scaled(). ] Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1483636310-6557-10-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-12-13Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds1-0/+6
Pull s390 updates from Martin Schwidefsky: "The main bulk of the s390 patches for the 4.10 merge window: - Add support for the contiguous memory allocator. - The recovery for I/O errors in the dasd device driver is improved, the driver will now remove channel paths that are not working properly. - Additional fields are added to /proc/sysinfo, the extended partition name and the partition UUID. - New naming for PCI devices with system defined UIDs. - The last few remaining alloc_bootmem calls are converted to memblock. - The thread_info structure is stripped down and moved to the task_struct. The only field left in thread_info is the flags field. - Rework of the arch topology code to fix a fake numa issue. - Refactoring of the atomic primitives and add a new preempt_count implementation. - Clocksource steering for the STP sync check offsets. - The s390 specific headers are changed to make them usable with CLANG. - Bug fixes and cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits) s390/cpumf: Use configuration level indication for sampling data s390: provide memmove implementation s390: cleanup arch/s390/kernel Makefile s390: fix initrd corruptions with gcov/kcov instrumented kernels s390: exclude early C code from gcov profiling s390/dasd: channel path aware error recovery s390/dasd: extend dasd path handling s390: remove unused labels from entry.S s390/vmlogrdr: fix IUCV buffer allocation s390/crypto: unlock on error in prng_tdes_read() s390/sysinfo: show partition extended name and UUID if available s390/numa: pin all possible cpus to nodes early s390/numa: establish cpu to node mapping early s390/topology: use cpu_topology array instead of per cpu variable s390/smp: initialize cpu_present_mask in setup_arch s390/topology: always use s390 specific sched_domain_topology_level s390/smp: use smp_get_base_cpu() helper function s390/numa: always use logical cpu and core ids s390: Remove VLAIS in ptff() and clear_table() s390: fix machine check panic stack switch ...
2016-11-17locking/core: Provide common cpu_relax_yield() definitionChristian Borntraeger1-0/+1
No need to duplicate the same define everywhere. Since the only user is stop-machine and the only provider is s390, we can use a default implementation of cpu_relax_yield() in sched.h. Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-s390 <linux-s390@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16locking/core, arch: Remove cpu_relax_lowlatency()Christian Borntraeger1-1/+0
As there are no users left, we can remove cpu_relax_lowlatency() implementations from every architecture. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Cc: <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16locking/core, s390: Make cpu_relax() a barrier againChristian Borntraeger1-1/+1
stop_machine() seemed to be the only important place for yielding during cpu_relax(). This was fixed by using cpu_relax_yield(). Therefore, we can now redefine cpu_relax() to be a barrier instead on s390, making s390 identical to all other architectures. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-4-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16locking/core: Introduce cpu_relax_yield()Christian Borntraeger1-1/+2
For spinning loops people do often use barrier() or cpu_relax(). For most architectures cpu_relax and barrier are the same, but on some architectures cpu_relax can add some latency. For example on power,sparc64 and arc, cpu_relax can shift the CPU towards other hardware threads in an SMT environment. On s390 cpu_relax does even more, it uses an hypercall to the hypervisor to give up the timeslice. In contrast to the SMT yielding this can result in larger latencies. In some places this latency is unwanted, so another variant "cpu_relax_lowlatency" was introduced. Before this is used in more and more places, lets revert the logic and provide a cpu_relax_yield that can be called in places where yielding is more important than latency. By default this is the same as cpu_relax on all architectures. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-15s390: move sys_call_table and last_break from thread_info to thread_structMartin Schwidefsky1-0/+2
Move the last two architecture specific fields from the thread_info structure to the thread_struct. All that is left in thread_info is the flags field. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11s390: move cputime accounting fields from thread_info to thread_structMartin Schwidefsky1-0/+2
The user_timer and system_timer fields are used for the per-thread cputime accounting code. The access to these values is simpler if they are moved to the thread_struct as the task_thread_info(tsk) indirection is not needed anymore. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11s390: move system_call field from thread_info to thread_structMartin Schwidefsky1-0/+2
The system_call field in thread_info structure is used by the signal code to store the number of the current system call while the debugger interacts with its inferior. A better location for the system_call field is with the other debugger related information in the thread_struct. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17s390/dumpstack: restore reliable indicator for call tracesHeiko Carstens1-1/+1
Before merging all different stack tracers the call traces printed had an indicator if an entry can be considered reliable or not. Unreliable entries were put in braces, reliable not. Currently all lines contain these extra braces. This patch restores the old behaviour by adding an extra "reliable" parameter to the callback functions. Only show_trace makes currently use of it. Before: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0) After: [ 0.804751] Call Trace: [ 0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0) [ 0.804756] [<0000000000161d64>] create_worker+0x174/0x1c0 Fixes: 758d39ebd3d5 ("s390/dumpstack: merge all four stack tracers") Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+2
Pull KVM updates from Paolo Bonzini: - ARM: GICv3 ITS emulation and various fixes. Removal of the old VGIC implementation. - s390: support for trapping software breakpoints, nested virtualization (vSIE), the STHYI opcode, initial extensions for CPU model support. - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups, preliminary to this and the upcoming support for hardware virtualization extensions. - x86: support for execute-only mappings in nested EPT; reduced vmexit latency for TSC deadline timer (by about 30%) on Intel hosts; support for more than 255 vCPUs. - PPC: bugfixes. * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits) KVM: PPC: Introduce KVM_CAP_PPC_HTM MIPS: Select HAVE_KVM for MIPS64_R{2,6} MIPS: KVM: Reset CP0_PageMask during host TLB flush MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX() MIPS: KVM: Sign extend MFC0/RDHWR results MIPS: KVM: Fix 64-bit big endian dynamic translation MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase MIPS: KVM: Use 64-bit CP0_EBase when appropriate MIPS: KVM: Set CP0_Status.KX on MIPS64 MIPS: KVM: Make entry code MIPS64 friendly MIPS: KVM: Use kmap instead of CKSEG0ADDR() MIPS: KVM: Use virt_to_phys() to get commpage PFN MIPS: Fix definition of KSEGX() for 64-bit KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD kvm: x86: nVMX: maintain internal copy of current VMCS KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures KVM: arm64: vgic-its: Simplify MAPI error handling KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers KVM: arm64: vgic-its: Turn device_id validation into generic ID validation ...
2016-06-20s390/mm: remember the int code for the last gmap faultDavid Hildenbrand1-0/+1
For nested virtualization, we want to know if we are handling a protection exception, because these can directly be forwarded to the guest without additional checks. Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20s390/mm: add shadow gmap supportMartin Schwidefsky1-0/+1
For a nested KVM guest the outer KVM host needs to create shadow page tables for the nested guest. This patch adds the basic support to the guest address space (gmap) code. For each guest address space the inner KVM host creates, the first outer KVM host needs to create shadow page tables. The address space is identified by the ASCE loaded into the control register 1 at the time the inner SIE instruction for the second nested KVM guest is executed. The outer KVM host creates the shadow tables starting with the table identified by the ASCE on a on-demand basis. The outer KVM host will get repeated faults for all the shadow tables needed to run the second KVM guest. While a shadow page table for the second KVM guest is active the access to the origin region, segment and page tables needs to be restricted for the first KVM guest. For region and segment and page tables the first KVM guest may read the memory, but write attempt has to lead to an unshadow. This is done using the page invalid and read-only bits in the page table of the first KVM guest. If the first guest re-accesses one of the origin pages of a shadow, it gets a fault and the affected parts of the shadow page table hierarchy needs to be removed again. PGSTE tables don't have to be shadowed, as all interpretation assist can't deal with the invalid bits in the shadow pte being set differently than the original ones provided by the first KVM guest. Many bug fixes and improvements by David Hildenbrand. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-13s390/cpuinfo: show dynamic and static cpu mhzHeiko Carstens1-1/+16
Show the dynamic and static cpu mhz of each cpu. Since these values are per cpu this requires a fundamental extension of the format of /proc/cpuinfo. Historically we had only a single line per cpu and a summary at the top of the file. This format is hardly extendible if we want to add more per cpu information. Therefore this patch adds per cpu blocks at the end of /proc/cpuinfo: cpu : 0 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 1 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 2 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 cpu : 3 cpu Mhz dynamic : 5504 cpu Mhz static : 5504 Right now each block contains only the dynamic and static cpu mhz, but it can be easily extended like on every other architecture. This extension is supposed to be compatible with the old format. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Sascha Silbe <silbe@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-05-18Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds1-3/+6
Pull s390 updates from Martin Schwidefsky: "The s390 patches for the 4.7 merge window have the usual bug fixes and cleanups, and the following new features: - An interface for dasd driver to query if a volume is online to another operating system - A new ioctl for the dasd driver to verify the format for a range of tracks - Following the example of x86 the struct fpu is now allocated with the task_struct - The 'report_error' interface for the PCI bus to send an adapter-error notification from user space to the service element of the machine" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits) s390/vmem: remove unused function parameter s390/vmem: fix identity mapping s390: add missing include statements s390: add missing declarations s390: make couple of variables and functions static s390/cache: remove superfluous locking s390/cpuinfo: simplify locking and skip offline cpus early s390/3270: hangup the 3270 tty after a disconnect s390/3270: handle reconnect of a tty with a different size s390/3270: avoid endless I/O loop with disconnected 3270 terminals s390/3270: fix garbled output on 3270 tty view s390/3270: fix view reference counting s390/3270: add missing tty_kref_put s390/dumpstack: implement and use return_address() s390/cpum_sf: Remove superfluous SMP function call s390/cpum_cf: Remove superfluous SMP function call s390/Kconfig: make z196 the default processor type s390/sclp: avoid compile warning in sclp_pci_report s390/fpu: allocate 'struct fpu' with the task_struct s390/crypto: cleanup and move the header with the cpacf definitions ...
2016-04-21s390/fpu: allocate 'struct fpu' with the task_structMartin Schwidefsky1-3/+6
Analog to git commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a "x86/fpu, sched: Dynamically allocate 'struct fpu'" move the struct fpu to the end of the struct thread_struct, set CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and add the setup_task_size() function to calculate the correct size fo the task struct. For the performance_defconfig this increases the size of struct task_struct from 7424 bytes to 7936 bytes (MACHINE_HAS_VX==1) or 7552 bytes (MACHINE_HAS_VX==0). The dynamic allocation of the struct fpu is removed. The slab cache uses an 8KB block for the task struct in all cases, there is enough room for the struct fpu. For MACHINE_HAS_VX==1 each task now needs 512 bytes less memory. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-04-21s390/mm: fix asce_bits handling with dynamic pagetable levelsGerald Schaefer1-1/+1
There is a race with multi-threaded applications between context switch and pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a pagetable upgrade on another thread in crst_table_upgrade() could already have set new asce_bits, but not yet the new mm->pgd. This would result in a corrupt user_asce in switch_mm(), and eventually in a kernel panic from a translation exception. Fix this by storing the complete asce instead of just the asce_bits, which can then be read atomically from switch_mm(), so that it either sees the old value or the new value, but no mixture. Both cases are OK. Having the old value would result in a page fault on access to the higher level memory, but the fault handler would see the new mm->pgd, if it was a valid access after the mmap on the other thread has completed. So as worst-case scenario we would have a page fault loop for the racing thread until the next time slice. Also remove dead code and simplify the upgrade/downgrade path, there are no upgrades from 2 levels, and only downgrades from 3 levels for compat tasks. There are also no concurrent upgrades, because the mmap_sem is held with down_write() in do_mmap, so the flush and table checks during upgrade can be removed. Reported-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-23s390/dumpstack: merge all four stack tracersHeiko Carstens1-0/+4
We have four different stack tracers of which three had bugs. So it's time to merge them to a single stack tracer which allows to specify a call back function which will be called for each step. This patch changes behavior a bit: - the "nosched" and "in_sched_functions" check within save_stack_trace_tsk did work only for the last stack frame within a context. Now it considers the check for each stack frame like it should. - both the oprofile variant and the perf_events variant did save a return address twice if a zero back chain was detected, which indicates an interrupt frame. The new dump_trace function will call the oprofile and perf_events backends with the psw address that is contained within the corresponding pt_regs structure instead. - the original show_trace and save_context_stack functions did already use the psw address of the pt_regs structure if a zero back chain was detected. However now we ignore the psw address if it is a user space address. After all we trace the kernel stack and not the user space stack. This way we also get rid of the garbage user space address in case of warnings and / or panic call traces. So this should make life easier since now there is only one stack tracer left which we can break. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-23s390: add current_stack_pointer() helper functionHeiko Carstens1-0/+8
Implement current_stack_pointer() helper function and use it everywhere, instead of having several different inline assembly variants. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-19s390: remove all usages of PSW_ADDR_AMODEHeiko Carstens1-2/+2
This is a leftover from the 31 bit area. For CONFIG_64BIT the usual operation "y = x | PSW_ADDR_AMODE" is a nop. Therefore remove all usages of PSW_ADDR_AMODE and make the code a bit less confusing. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2016-01-11s390: rename struct _lowcore to struct lowcoreHeiko Carstens1-1/+1
Finally get rid of the leading underscore. I tried this already two or three years ago, however Michael Holzheu objected since this would break the crash utility (again). However Michael integrated support for the new name into the crash utility back then, so it doesn't break if the name will be changed now. So finally get rid of the ever confusing leading underscore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-27s390/spinlock: do not yield to a CPU in udelay/mdelayMartin Schwidefsky1-0/+12
It does not make sense to try to relinquish the time slice with diag 0x9c to a CPU in a state that does not allow to schedule the CPU. The scenario where this can happen is a CPU waiting in udelay/mdelay while holding a spin-lock. Add a CIF bit to tag a CPU in enabled wait and use it to detect that the yield of a CPU will not be successful and skip the diagnose call. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-03s390: remove runtime instrumentation interruptsMartin Schwidefsky1-1/+0
The external interrupts for runtime instrumentation buffer-full and runtime instrumentation halted are unused and have no current user. Remove the support and ignore the second parameter of the s390_runtime_instr system call from now on. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-27s390: don't store registers on disabled wait anymoreHeiko Carstens1-41/+5
The current disabled wait code stores register contents into their save areas, however it is (at least) missing the new vector registers. Given the fact that the whole exercise seems to be rather pointless simply don't save any registers anymore. In a "live" system it is always possible to inspect register contents, and in case of a dump the register contents will be stored by the dump mechanism. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-27s390: get rid of __set_psw_mask()Heiko Carstens1-16/+11
With the removal of 31 bit code we can always assume that the epsw instruction is available. Therefore use the __extract_psw() function to disable and enable machine checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-16s390/fpu: split fpu-internal.h into fpu internals, api, and type headersHendrik Brueckner1-2/+3
Split the API and FPU type definitions into separate header files similar to "x86/fpu: Rename fpu-internal.h to fpu/internal.h" (78f7f1e54b). The new header files and their meaning are: asm/fpu/types.h: FPU related data types, needed for 'struct thread_struct' and 'struct task_struct'. asm/fpu/api.h: FPU related 'public' functions for other subsystems and device drivers. asm/fpu/internal.h: FPU internal functions mainly used to convert FPU register contents in signal handling. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/flags: use _BITUL macroHeiko Carstens1-5/+7
The defines that are used in entry.S have been partially converted to use the _BITUL macro (setup.h). This patch converts the rest. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/flags: fix flag handlingHeiko Carstens1-3/+3
The cpu flags and pt_regs flags fields are each 64 bits in size. A flag can be set with helper functions like set_cpu_flags(). These functions create a mask using "1U << flag". This doesn't work if flag is larger than 31, since 1U << 32 == 0. So fix this in case we ever will have such flag numbers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/udelay: make udelay have busy loop semanticsHeiko Carstens1-0/+2
When using systemtap it was observed that our udelay implementation is rather suboptimal if being called from a kprobe handler installed by systemtap. The problem observed when a kprobe was installed on lock_acquired(). When the probe was hit the kprobe handler did call udelay, which set up an (internal) timer and reenabled interrupts (only the clock comparator interrupt) and waited for the interrupt. This is an optimization to avoid that the cpu is busy looping while waiting that enough time passes. The problem is that the interrupt handler still does call irq_enter()/irq_exit() which then again can lead to a deadlock, since some accounting functions may take locks as well. If one of these locks is the same, which caused lock_acquired() to be called, we have a nice deadlock. This patch reworks the udelay code for the interrupts disabled case to immediately leave the low level interrupt handler when the clock comparator interrupt happens. That way no C code is being called and the deadlock cannot happen anymore. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14s390/fpu: add static FPU save area for init_taskHendrik Brueckner1-0/+2
Previously, the init task did not have an allocated FPU save area and saving an FPU state was not possible. Now if the vector extension is always enabled, provide a static FPU save area to save FPU states of vector instructions that can be executed quite early. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-07-29s390/sclp: convert early sclp console code to CMartin Schwidefsky1-0/+11
The 31-bit assembler code for the early sclp console is error prone as git commit fde24b54d976cc123506695c17db01438a11b673 "s390/sclp: clear upper register halves in _sclp_print_early" has shown. Convert the assembler code to C. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-07-22s390/kernel: lazy restore fpu registersHendrik Brueckner1-0/+2
Improve the save and restore behavior of FPU register contents to use the vector extension within the kernel. The kernel does not use floating-point or vector registers and, therefore, saving and restoring the FPU register contents are performed for handling signals or switching processes only. To prepare for using vector instructions and vector registers within the kernel, enhance the save behavior and implement a lazy restore at return to user space from a system call or interrupt. To implement the lazy restore, the save_fpu_regs() sets a CPU information flag, CIF_FPU, to indicate that the FPU registers must be restored. Saving and setting CIF_FPU is performed in an atomic fashion to be interrupt-safe. When the kernel wants to use the vector extension or wants to change the FPU register state for a task during signal handling, the save_fpu_regs() must be called first. The CIF_FPU flag is also set at process switch. At return to user space, the FPU state is restored. In particular, the FPU state includes the floating-point or vector register contents, as well as, vector-enablement and floating-point control. The FPU state restore and clearing CIF_FPU is also performed in an atomic fashion. For KVM, the restore of the FPU register state is performed when restoring the general-purpose guest registers before the SIE instructions is started. Because the path towards the SIE instruction is interruptible, the CIF_FPU flag must be checked again right before going into SIE. If set, the guest registers must be reloaded again by re-entering the outer SIE loop. This is the same behavior as if the SIE critical section is interrupted. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-07-22s390/kernel: introduce fpu-internal.h with fpu helper functionsHendrik Brueckner1-2/+2
Introduce a new structure to manage FP and VX registers. Refactor the save and restore of floating point and vector registers with a set of helper functions in fpu-internal.h. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-07-22s390/kernel: move EX_TABLE macros to linkage.h header fileHendrik Brueckner1-19/+0
Move the EX_TABLE macro definitions from the processor.h to the linkage.h header file. It helps to reduce circular header file dependencies. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-25s390: remove 31 bit supportHeiko Carstens1-65/+1
Remove the 31 bit support in order to reduce maintenance cost and effectively remove dead code. Since a couple of years there is no distribution left that comes with a 31 bit kernel. The 31 bit kernel also has been broken since more than a year before anybody noticed. In addition I added a removal warning to the kernel shown at ipl for 5 minutes: a960062e5826 ("s390: add 31 bit warning message") which let everybody know about the plan to remove 31 bit code. We didn't get any response. Given that the last 31 bit only machine was introduced in 1999 let's remove the code. Anybody with 31 bit user space code can still use the compat mode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-29s390: reintroduce diag 44 calls for cpu_relax()Heiko Carstens1-4/+1
Christian Borntraeger reported that the now missing diag 44 calls (voluntary time slice end) does cause a performance regression for stop_machine() calls if a machine has more virtual cpus than the host has physical cpus. This patch mainly reverts 57f2ffe14fd125c2 ("s390: remove diag 44 calls from cpu_relax()") with the exception that we still do not issue diag 44 calls if running with smt enabled. Due to group scheduling algorithms when running in LPAR this would lead to significant latencies. However, when running in LPAR we do not have more virtual than physical cpus. Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-11-28s390: remove diag 44 calls from cpu_relax()Heiko Carstens1-2/+0
Simplify cpu_relax() to a simple barrier(). Performance wise this doesn't seem to make any big difference anymore, since nearly all lock variants have directed yield semantics in the meantime. Also this makes s390 behave like all other architectures. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-09s390: add support for vector extensionMartin Schwidefsky1-0/+1
The vector extension introduces 32 128-bit vector registers and a set of instruction to operate on the vector registers. The kernel can control the use of vector registers for the problem state program with a bit in control register 0. Once enabled for a process the kernel needs to retain the content of the vector registers on context switch. The signal frame is extended to include the vector registers. Two new register sets NT_S390_VXRS_LOW and NT_S390_VXRS_HIGH are added to the regset interface for the debugger and core dumps. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>