path: root/arch/s390/kernel/head64.S
Age | Commit message | Author | Files | Lines
2022-02-06 | s390: remove invalid email address of Heiko Carstens | Heiko Carstens | 1 | -1/+0
Remove my old invalid email address which can be found in a couple of files. Instead of updating it, just remove my contact data completely from source files. We have git and other tools which allow to figure out who is responsible for what with recent contact data. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 | s390/boot: initialize control registers in decompressor | Alexander Gordeev | 1 | -18/+0
Partially revert commit 4555b9f34296 ("s390/boot: move dma sections from decompressor to decompressed kernel"). This is a prerequisite to allow initialization of virtual memory in decompressor and avoid overwriting of ASCEs in the decompressed kernel otherwise. Since the control registers 2, 5 and 15 are reinitialized in the decompressed kernel again, this change does not prevent relocating of amode31 section in any way. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-07-27 | s390/boot: move dma sections from decompressor to decompressed kernel | Alexander Egorenkov | 1 | -0/+17
This change simplifies the task of making the decompressor relocatable. The decompressor's image contains special DMA sections between _sdma and _edma. This DMA segment is loaded at boot as part of the decompressor and then simply handed over to the decompressed kernel. The decompressor itself never uses it in any way. The primary reason for this is the need to keep the aforementioned DMA segment below 2GB which is required by architecture, and because the decompressor is always loaded at a fixed low physical address, it is guaranteed that the DMA region will not cross the 2GB memory limit. If the DMA region had been placed in the decompressed kernel, then KASLR would make this guarantee impossible to fulfill or it would be restricted to the first 2GB of memory address space. This commit moves all DMA sections between _sdma and _edma from the decompressor's image to the decompressed kernel's image. The complete DMA region is placed in the init section of the decompressed kernel and immediately relocated below 2GB at start-up before it is needed by other parts of the decompressed kernel. The relocation of the DMA region happens even if the decompressed kernel is already located below 2GB in order to keep the first implementation simple. The relocation should not have any noticeable impact on boot time because the DMA segment is only a couple of pages. After relocating the DMA sections, the kernel has to fix all references which point into it. In order to automate this, place all variables pointing into the DMA sections in a special .dma.refs section. All such variables must be defined using the new __dma_ref macro. Only variables containing addresses within the DMA sections must be placed in the new .dma.refs section. Furthermore, move the initialization of control registers from the decompressor to the decompressed kernel because some control registers reference tables that must be placed in the DMA data section to guarantee that their addresses are below 2G. 
Because the decompressed kernel relocates the DMA sections at startup, the content of control registers CR2, CR5 and CR15 must be updated with new addresses after the relocation. The decompressed kernel initializes all control registers early at boot and then updates the content of CR2, CR5 and CR15 as soon as the DMA relocation has occurred. This practically reverts the commit a80313ff91ab ("s390/kernel: introduce .dma sections"). Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
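The reference fix-up described in this commit can be pictured with a small model. This is only a sketch in Python, not the kernel's implementation: the function name `relocate_dma_refs`, the dictionary of references and all addresses are hypothetical. The only facts taken from the commit message are that every entry in .dma.refs holds an address inside the region between _sdma and _edma, and that the relocated region has to stay below 2 GB.

```python
TWO_GB = 1 << 31


def relocate_dma_refs(refs, sdma, edma, new_base):
    """Return a copy of refs with every address moved to the relocated region.

    refs     -- hypothetical mapping of variable name -> address in [sdma, edma)
    sdma     -- old start of the DMA region (stands in for _sdma)
    edma     -- old end of the DMA region, exclusive (stands in for _edma)
    new_base -- start of the relocated region
    """
    if new_base + (edma - sdma) > TWO_GB:
        raise ValueError("relocated DMA region must end below 2 GB")
    offset = new_base - sdma
    fixed = {}
    for name, addr in refs.items():
        # Only addresses inside the DMA region belong in the reference list.
        if not sdma <= addr < edma:
            raise ValueError(f"{name} does not point into the DMA region")
        fixed[name] = addr + offset
    return fixed
```

In this model the values that end up in CR2, CR5 and CR15 would simply be further entries in `refs`, which is why they have to be reloaded once the region has moved.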
2020-11-09 | s390/early: rewrite program parameter setup in C | Vasily Gorbik | 1 | -6/+1
And move it earlier in the decompressor. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2019-11-30 | s390/head64: correct init_task stack setup | Vasily Gorbik | 1 | -1/+1
Add missing allocation of pt_regs at the bottom of the stack. This makes it consistent with other stack setup cases and also what stack unwinder expects. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 | s390/early: move access registers setup in C code | Vasily Gorbik | 1 | -8/+2
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 | s390/head64: remove unnecessary vdso_per_cpu_data setup | Vasily Gorbik | 1 | -2/+0
vdso_per_cpu_data lowcore value is only needed for fully functional exception handlers, which are activated in setup_lowcore_dat_off. The same function does init vdso_per_cpu_data via vdso_alloc_boot_cpu. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-11-20 | s390/early: move control registers setup in C code | Vasily Gorbik | 1 | -6/+0
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-17 | Merge tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux | Linus Torvalds | 1 | -5/+3
Pull s390 updates from Vasily Gorbik:
- Add support for IBM z15 machines.
- Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring.
- Move to arch_stack_walk infrastructure for the stack unwinder.
- Various kasan fixes and improvements.
- Various command line parsing fixes.
- Improve decompressor phase debuggability.
- Lift no bss usage restriction for the early code.
- Use refcount_t for reference counters for couple of places in mm code.
- Logging improvements and return code fix in vfio-ccw code.
- Couple of zpci fixes and minor refactoring.
- Remove some outdated documentation.
- Fix secure boot detection.
- Other various minor code clean ups.
* tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390: remove pointless drivers-y in drivers/s390/Makefile
  s390/cpum_sf: Fix line length and format string
  s390/pci: fix MSI message data
  s390: add support for IBM z15 machines
  s390/crypto: Support for SHA3 via CPACF (MSA6)
  s390/startup: add pgm check info printing
  s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding
  vfio-ccw: fix error return code in vfio_ccw_sch_init()
  s390: vfio-ap: fix warning reset not completed
  s390/base: remove unused s390_base_mcck_handler
  s390/sclp: Fix bit checked for has_sipl
  s390/zcrypt: fix wrong handling of cca cipher keygenflags
  s390/kasan: add kdump support
  s390/setup: avoid using strncmp with hardcoded length
  s390/sclp: avoid using strncmp with hardcoded length
  s390/module: avoid using strncmp with hardcoded length
  s390/pci: avoid using strncmp with hardcoded length
  s390/kaslr: reserve memory for kasan usage
  s390/mem_detect: provide single get_mem_detect_end
  s390/cmma: reuse kstrtobool for option value parsing
  ...
2019-08-21 | s390: clean .bss before running uncompressed kernel | Vasily Gorbik | 1 | -5/+3
Clean uncompressed kernel .bss section in the startup code before the uncompressed kernel is executed. At this point of time initrd and certificates have been already rescued. Uncompressed kernel .bss size is known from vmlinux_info. It is also taken into consideration during uncompressed kernel positioning by kaslr (so it is safe to clean it). With that uncompressed kernel is starting with .bss section zeroed and no .bss section usage restrictions apply. Which makes chkbss checks for uncompressed kernel objects obsolete and they can be removed. early_nobss.c is also not needed anymore. Parts of it which are still relevant are moved to early.c. Kasan initialization code is now called directly from head64 (early.c is instrumented and should not be executed before kasan shadow memory is set up). Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-08-06 | s390/head64: cleanup unused labels | Vasily Gorbik | 1 | -7/+0
Cleanup labels in head64 some of which are not being used since git recorded history. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-04-29 | s390/kernel: introduce .dma sections | Gerald Schaefer | 1 | -26/+0
With a relocatable kernel that could reside at any place in memory, code and data that has to stay below 2 GB needs special handling. This patch introduces .dma sections for such text, data and ex_table. The sections will be part of the decompressor kernel, so they will not be relocated and stay below 2 GB. Their location is passed over to the decompressed / relocated kernel via the .boot.preserved.data section. The duald and aste for control register setup also need to stay below 2 GB, so move the setup code from arch/s390/kernel/head64.S to arch/s390/boot/head.S. The duct and linkage_stack could reside above 2 GB, but their content has to be preserved for the decompressed kernel, so they are also moved into the .dma section. The start and end address of the .dma sections is added to vmcoreinfo, for crash support, to help debugging in case the kernel crashed there. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-02-07 | s390: remove dead code | Gerald Schaefer | 1 | -2/+0
Remove some dead code from head64.S, which was left over since commit da292bbe1f62 ("[S390] eliminate ipl_device from lowcore") removed ipl_device from lowcore. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 | s390: add support for virtually mapped kernel stacks | Martin Schwidefsky | 1 | -3/+1
With virtually mapped kernel stacks the kernel stack overflow detection is now fault based, every stack has a guard page in the vmalloc space. The panic_stack is renamed to nodat_stack and is used for all functions that need to run without DAT, e.g. memcpy_real or do_start_kdump. The main effect is a reduction in the kernel image size as with vmap stacks the old style overflow checking that adds two instructions per function is not needed anymore. Result from bloat-o-meter:
add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042)
In regard to performance the micro-benchmark for fork has a hit of a few microseconds, allocating 4 pages in vmalloc space is more expensive compared to an order-2 page allocation. But with real workload I could not find a noticeable difference. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-09-20 | s390: clean up stacks setup | Vasily Gorbik | 1 | -3/+3
Replace hard coded stack frame overhead values with STACK_FRAME_OVERHEAD definition. Avoid unnecessary arithmetic instructions. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-02 | s390: correct _stext offset | Vasily Gorbik | 1 | -25/+15
Avoid unnecessary rewrite of psw and merge _stext into startup_continue. This allows moving the _stext definition to vmlinux.lds.S, where _etext is also defined, and setting _stext to the actual beginning of .text at 0x100000. This fixes the problem with setting the last .text page as not-executable due to vmem_map_init relying on page aligned _stext and _etext. Fixes: bd79d6632958 ("s390/decompressor: trim the kernel image up to 1M") Reported-by: Nils Hoppmann <niho@de.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-02 | s390: get rid of the first mb of uncompressed image | Vasily Gorbik | 1 | -1/+0
Instead of generating uncompressed kernel image starting at 0, filling first mb with zeros (with ".org 0x100000") and then trimming it off from vmlinux.bin before compression, simply generate a kernel image starting from 0x100000. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-07-02 | s390: remove unused _ehead symbol | Vasily Gorbik | 1 | -2/+0
Since startup code now reserves memory ranges [0, PARMAREA_END] and [_stext, <end of kernel>] _ehead symbol is not used and could be cleaned up. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-06-25 | s390/boot: make head.S and als.c be part of the decompressor only | Vasily Gorbik | 1 | -1/+1
Since the uncompressed kernel image does not have to be bootable anymore, move head.S, head_kdump.S and als.c to the boot/ folder and compile them only into the decompressor. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-06-25 | s390/decompressor: trim the kernel image up to 1M | Vasily Gorbik | 1 | -1/+1
Move head64.S main kernel entry point "startup_continue" to 0x100000 and trim everything which is below 1M during build, so that the decompressor unpacks the main kernel image, moves it to 0x100000 and jumps to startup_continue. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-05-09 | s390/early: move functions which may not access bss section to extra file | Heiko Carstens | 1 | -2/+6
Move functions which may not access bss section to extra file. This makes it easier to verify that all early functions which may not rely on an initialized bss section are not accessing it. Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-11-14 | s390: remove all code using the access register mode | Martin Schwidefsky | 1 | -1/+1
The vdso code for the getcpu() and the clock_gettime() call use the access register mode to access the per-CPU vdso data page with the current code. An alternative to the complicated AR mode is to use the secondary space mode. This makes the vdso faster and quite a bit simpler. The downside is that the uaccess code has to be changed quite a bit.
Which instructions are used depends on the machine and what kind of uaccess operation is requested. The instruction dictates which ASCE value needs to be loaded into %cr1 and %cr7. The different cases:
* User copy with MVCOS for z10 and newer machines
  The MVCOS instruction can copy between the primary space (aka user) and the home space (aka kernel) directly. For set_fs(KERNEL_DS) the kernel ASCE is loaded into %cr1. For set_fs(USER_DS) the user space is already loaded in %cr1.
* User copy with MVCP/MVCS for older machines
  To be able to execute the MVCP/MVCS instructions the kernel needs to switch to primary mode. The control register %cr1 has to be set to the kernel ASCE and %cr7 to either the kernel ASCE or the user ASCE dependent on set_fs(KERNEL_DS) vs set_fs(USER_DS).
* Data access in the user address space for strnlen / futex
  To use "normal" instruction with data from the user address space the secondary space mode is used. The kernel needs to switch to primary mode, %cr1 has to contain the kernel ASCE and %cr7 either the user ASCE or the kernel ASCE, dependent on set_fs.
To load a new value into %cr1 or %cr7 is an expensive operation, the kernel tries to be lazy about it. E.g. for multiple user copies in a row with MVCP/MVCS the replacement of the vdso ASCE in %cr7 with the user ASCE is done only once. On return to user space a CPU bit is checked that loads the vdso ASCE again.
To enable and disable the data access via the secondary space two new functions are added, enable_sacf_uaccess and disable_sacf_uaccess.
The fact that a context is in secondary space uaccess mode is stored in the mm_segment_t value for the task. The code of an interrupt may use set_fs as long as it returns to the previous state it got with get_fs with another call to set_fs. The code in finish_arch_post_lock_switch simply has to do a set_fs with the current mm_segment_t value for the task.
For CPUs with MVCOS:
CPU running in                        | %cr1 ASCE | %cr7 ASCE |
--------------------------------------|-----------|-----------|
user space                            |  user     |  vdso     |
kernel, USER_DS, normal-mode          |  user     |  vdso     |
kernel, USER_DS, normal-mode, lazy    |  user     |  user     |
kernel, USER_DS, sacf-mode            |  kernel   |  user     |
kernel, KERNEL_DS, normal-mode        |  kernel   |  vdso     |
kernel, KERNEL_DS, normal-mode, lazy  |  kernel   |  kernel   |
kernel, KERNEL_DS, sacf-mode          |  kernel   |  kernel   |
For CPUs without MVCOS:
CPU running in                        | %cr1 ASCE | %cr7 ASCE |
--------------------------------------|-----------|-----------|
user space                            |  user     |  vdso     |
kernel, USER_DS, normal-mode          |  user     |  vdso     |
kernel, USER_DS, normal-mode, lazy    |  kernel   |  user     |
kernel, USER_DS, sacf-mode            |  kernel   |  user     |
kernel, KERNEL_DS, normal-mode        |  kernel   |  vdso     |
kernel, KERNEL_DS, normal-mode, lazy  |  kernel   |  kernel   |
kernel, KERNEL_DS, sacf-mode          |  kernel   |  kernel   |
The lines with "lazy" refer to the state after a copy via the secondary space with a delayed reload of %cr1 and %cr7.
There are three hardware address spaces that can cause a DAT exception, primary, secondary and home space. The exception can be related to four different fault types: user space fault, vdso fault, kernel fault, and the gmap faults.
Dependent on the set_fs state and normal vs. sacf mode there are a number of fault combinations:
1) user address space fault via the primary ASCE
2) gmap address space fault via the primary ASCE
3) kernel address space fault via the primary ASCE for machines with MVCOS and set_fs(KERNEL_DS)
4) vdso address space faults via the secondary ASCE with an invalid address while running in secondary space in problem state
5) user address space fault via the secondary ASCE for user-copy based on the secondary space mode, e.g. futex_ops or strnlen_user
6) kernel address space fault via the secondary ASCE for user-copy with secondary space mode with set_fs(KERNEL_DS)
7) kernel address space fault via the primary ASCE for user-copy with secondary space mode with set_fs(USER_DS) on machines without MVCOS.
8) kernel address space fault via the home space ASCE
Replace user_space_fault() with a new function get_fault_type() that can distinguish all four different fault types.
With these changes the futex atomic ops from the kernel and the strnlen_user will get a little bit slower, as well as the old style uaccess with MVCP/MVCS. All user accesses based on MVCOS will be as fast as before. On the positive side, the user space vdso code is a lot faster and Linux ceases to use the complicated AR mode. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
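The two %cr1/%cr7 tables in this commit message differ in a single row, which is easier to see when they are written as a lookup. The following Python sketch only transcribes the tables; the function name `asce_for` and the use of plain strings for states and ASCEs are assumptions made for illustration and do not correspond to kernel code.

```python
# Rows that are identical for CPUs with and without MVCOS,
# as (%cr1 ASCE, %cr7 ASCE), transcribed from the commit message.
_COMMON = {
    "user space": ("user", "vdso"),
    "kernel, USER_DS, normal-mode": ("user", "vdso"),
    "kernel, USER_DS, sacf-mode": ("kernel", "user"),
    "kernel, KERNEL_DS, normal-mode": ("kernel", "vdso"),
    "kernel, KERNEL_DS, normal-mode, lazy": ("kernel", "kernel"),
    "kernel, KERNEL_DS, sacf-mode": ("kernel", "kernel"),
}

# The only row where the two tables disagree.
_LAZY_USER_DS = "kernel, USER_DS, normal-mode, lazy"


def asce_for(state, has_mvcos):
    """Return (%cr1 ASCE, %cr7 ASCE) for a CPU state from the tables."""
    if state == _LAZY_USER_DS:
        # With MVCOS %cr1 still holds the user ASCE after a lazy copy;
        # without MVCOS the kernel ASCE stays loaded in %cr1.
        return ("user", "user") if has_mvcos else ("kernel", "user")
    return _COMMON[state]
```

For example, `asce_for("kernel, USER_DS, sacf-mode", has_mvcos=True)` gives `("kernel", "user")`, the same answer as for a machine without MVCOS.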
2017-11-02 | License cleanup: add SPDX GPL-2.0 license identifier to files with no license | Greg Kroah-Hartman | 1 | -0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne.
How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases:
- file had no licensing information in it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5 lines of source
- File already had some variant of a license header in it (even if <5 lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license identifiers to apply.
- when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied.
  For non */uapi/* files that summary was:
  SPDX license identifier                            # files
  ---------------------------------------------------|-------
  GPL-2.0                                              11139
  and resulted in the first patch in this series.
  If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
  SPDX license identifier                            # files
  ---------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                        930
  and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary:
  SPDX license identifier                            # files
  ---------------------------------------------------|------
  GPL-2.0 WITH Linux-syscall-note                       270
  GPL-2.0+ WITH Linux-syscall-note                      169
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
  LGPL-2.1+ WITH Linux-syscall-note                      15
  GPL-1.0+ WITH Linux-syscall-note                       14
  ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
  LGPL-2.0+ WITH Linux-syscall-note                       4
  LGPL-2.1 WITH Linux-syscall-note                        3
  ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
  ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
  and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became the concluded license(s).
- when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time.
In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there were new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with:
- a full scancode scan run, collecting the matched texts, detected license ids and scores
- reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-26 | s390/time: add support for the TOD clock epoch extension | Martin Schwidefsky | 1 | -2/+2
The TOD epoch extension adds 8 epoch bits to the TOD clock to provide a continuous clock after 2042/09/17. The store-clock-extended (STCKE) instruction will store the epoch index in the first byte of the 16 bytes stored by the instruction. The read_boot_clock64 and the read_persistent_clock64 functions need to take the additional bits into account to give the correct result after 2042/09/17. The clock-comparator register will stay 64 bit wide. The comparison of the clock-comparator with the TOD clock is limited to bytes 1 to 8 of the extended TOD format. To deal with the overflow problem due to an epoch change the clock-comparator sign control in CR0 can be used to switch the comparison of the 64-bit TOD clock with the clock-comparator to a signed comparison. The decision between the signed vs. unsigned clock-comparator comparisons is done at boot time. Only if the TOD clock is in the second half of a 142 year epoch the signed comparison is used. This solves the epoch overflow issue as long as the machine is booted at least once in an epoch. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
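The boot-time decision described here comes down to looking at the most significant bit of the 64-bit TOD value. The Python sketch below models that, together with the STCKE layout stated in the commit message (epoch index in byte 0, 64-bit TOD value in bytes 1 to 8). The function names are hypothetical, and reading "second half of an epoch" as "bit 0 of the 64-bit TOD clock is set" is an interpretation of the commit text, not a quote from the kernel source.

```python
def split_stcke(buf):
    """Split a 16-byte STCKE result into (epoch_index, tod64).

    Byte 0 holds the epoch index; bytes 1..8 hold the 64-bit TOD value
    that the clock comparator is compared against.
    """
    if len(buf) != 16:
        raise ValueError("STCKE stores exactly 16 bytes")
    epoch_index = buf[0]
    tod64 = int.from_bytes(buf[1:9], "big")
    return epoch_index, tod64


def use_signed_clock_comparator(tod64):
    """Pick the signed comparison only in the second half of an epoch.

    The second half starts when the most significant bit of the 64-bit
    TOD value becomes one.
    """
    return bool(tod64 & (1 << 63))
```

A machine booted in the first half of an epoch keeps the unsigned comparison, one booted in the second half switches to the signed comparison, so a wrap of the 64-bit value is handled as long as the machine is rebooted at least once per epoch.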
2017-04-05 | s390/cpumf: simplify detection of guest samples | Martin Schwidefsky | 1 | -1/+1
There are three different code levels in regard to the identification of guest samples. They differ in the way the LPP instruction is used. 1) Old kernels without the LPP instruction. The guest program parameter is always zero. 2) Newer kernels load the process pid into the program parameter with LPP. The guest program parameter is non-zero if the guest executes in a process != idle. 3) The latest kernels load ((1UL << 31) | pid) with LPP to make the value non-zero even for the idle task. The guest program parameter is non-zero if the guest is running. All kernels load the process pid to CR4 on context switch. The CPU sampling code uses the value in CR4 to decide between guest and host samples in case the guest program parameter is zero. The three cases: 1) CR4==pid, gpp==0 2) CR4==pid, gpp==pid 3) CR4==pid, gpp==((1UL << 31) | pid) The load-control instruction to load the pid into CR4 is expensive and the goal is to remove it. To distinguish the host CR4 from the guest pid for the idle process the maximum value 0xffff for the PASN is used. This adds a fourth case for a guest OS with an updated kernel: 4) CR4==0xffff, gpp==((1UL << 31) | pid) The host kernel will have CR4==0xffff and will use (gpp!=0 || CR4!=0xffff) to identify guest samples. This works nicely with all 4 cases, the only possible issue would be a guest with an old kernel (gpp==0) and a process pid of 0xffff. Well, don't do that.. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
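The predicate from this commit message can be checked against the four cases it lists. The Python sketch below is only a model: `is_guest_sample`, `HOST_PASN` and `LPP_MAGIC` are illustrative names chosen here, while the values 0xffff and (1UL << 31) and the expression itself are taken from the commit text.

```python
HOST_PASN = 0xFFFF   # value the host kernel keeps in CR4
LPP_MAGIC = 1 << 31  # bit set by the latest kernels when issuing LPP


def is_guest_sample(cr4, gpp):
    """Model of (gpp != 0 || CR4 != 0xffff) from the commit message."""
    return gpp != 0 or cr4 != HOST_PASN


pid = 1234
cases = {
    "1) old guest, no LPP": (pid, 0),
    "2) guest loads pid with LPP": (pid, pid),
    "3) guest loads magic | pid": (pid, LPP_MAGIC | pid),
    "4) updated guest, CR4 == 0xffff": (HOST_PASN, LPP_MAGIC | pid),
}
for label, (cr4, gpp) in cases.items():
    print(label, is_guest_sample(cr4, gpp))
```

The one combination that is misclassified is the corner case the message itself points out: an old guest kernel (gpp == 0) running a process whose pid is 0xffff looks exactly like the host.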
2016-11-23 | s390/thread_info: get rid of THREAD_ORDER define | Heiko Carstens | 1 | -1/+1
We have the s390 specific THREAD_ORDER define and the THREAD_SIZE_ORDER define which is also used in common code. Both have exactly the same semantics. Therefore get rid of THREAD_ORDER and always use THREAD_SIZE_ORDER instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 | s390: move thread_info into task_struct | Heiko Carstens | 1 | -3/+2
This is the s390 variant of commit 15f4eae70d36 ("x86: Move thread_info into task_struct"). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-08 | s390/cpumf: Fix lpp detection | Christian Borntraeger | 1 | -1/+1
We have to check bit 40 of the facility list before issuing LPP and not bit 48. Otherwise a guest running on a system with "The decimal-floating-point zoned-conversion facility" and without "The set-program-parameters facility" might crash on an lpp instruction. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples") Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
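To make the difference between the two bit numbers concrete, here is a small Python model of a facility-list test. Facility bits are counted from the leftmost bit of the list, so bit 40 is the top bit of byte 5 and bit 48 is the top bit of byte 6. The helper name `has_facility` is made up for this sketch; the fix itself lives in the assembler startup code, which is not reproduced here.

```python
def has_facility(facility_list, nr):
    """Return True if facility bit nr is set.

    Bits are numbered from the most significant bit of byte 0,
    so bit nr lives in byte nr // 8 under mask 0x80 >> (nr % 8).
    """
    byte_index = nr >> 3
    if byte_index >= len(facility_list):
        return False
    return bool(facility_list[byte_index] & (0x80 >> (nr & 7)))


# A machine with the DFP zoned-conversion facility (bit 48)
# but without the set-program-parameters facility (bit 40).
facilities = bytearray(8)
facilities[6] |= 0x80

print(has_facility(facilities, 48))  # the check used before the fix
print(has_facility(facilities, 40))  # the check used after the fix
```

On such a machine the old check wrongly allows LPP to be issued, which is the crash scenario the commit message describes.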
2015-12-18 | s390/facilities: always use lowcore's stfle field for storing facility bits | Heiko Carstens | 1 | -1/+1
head.S contains an stfle instruction which stores its result at the storage location that is assigned to the stfl instruction. This is currently no problem, since we only care about one double word. However if the number of double words in the ALS bitfield grows the current code is not very stable. E.g. before issuing the stfle command the memory to which it stores must be cleared, since the instruction may or may not clear memory contents where no bits are set. In order to simplify the code a bit always use the storage location that we reserved for the stfle result. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14 | s390/cpumf: rework program parameter setting to detect guest samples | Christian Borntraeger | 1 | -1/+6
The program parameter can be used to mark hardware samples with some token. Previously, it was used to mark guest samples only. Improve the program parameter doubleword by combining two parts, the leftmost LPP part and the rightmost PID part. Set the PID part for processes by using the task PID. To distinguish host and guest samples for the kernel (PID part is zero), the guest must always set the program parameter to a non-zero value. Use the leftmost bit in the LPP part of the program parameter to be able to detect guest kernel samples. [brueckner@linux.vnet.ibm.com]: Split __LC_CURRENT and introduced __LC_LPP. Corrected __LC_CURRENT users and adjusted assembler parts. And updated the commit message accordingly. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-05s390: fix kernel crash due to linkage stack instructionsMartin Schwidefsky1-2/+5
The kernel currently crashes with a low-address-protection exception if a user space process executes an instruction that tries to use the linkage stack. Set the base-ASTE origin and the subspace-ASTE origin of the dispatchable-unit-control-table to point to a dummy ASTE. Set up control register 15 to point to an empty linkage stack with no room left. A user space process with a linkage stack instruction will still crash but with a different exception which is correctly translated to a segmentation fault instead of a kernel oops. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-10-09s390/mm: let kernel text section always begin at 1MBHeiko Carstens1-3/+0
Let the kernel text section always begin at 1MB. This allows to always have a large frame in the identity mapping of the kernel image for beginning of the text section, if the machine has EDAT1 support. Moving the beginning from 64K to 1MB doesn't cost any memory, since we make the memory between 64K and 1MB available for the page allocator. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-20s390/comments: unify copyright messages and remove file namesHeiko Carstens1-3/+1
Remove the file name from the comment at top of many files. In most cases the file name was wrong anyway, so it's rather pointless. Also unify the IBM copyright statement. We did have a lot of slightly different statements and wanted to change them one after another whenever a file gets touched. However that never happened. Instead people start to take the old/"wrong" statements to use as a template for new files. So unify all of them in one go. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-10-30[S390] smp: external call vs. emergency signalMartin Schwidefsky1-1/+1
Use a sigp sense running to decide which signal processor order to use for an ipi. If the target cpu is running use external call, if the target cpu is not running use emergency signal. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] initial cr0 bitsMartin Schwidefsky1-1/+1
Remove outdated bits from the initial cr0 register. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] iucv cr0 enablement bitMartin Schwidefsky1-1/+1
Do not set the cr0 enablement bit for iucv by default in head[31|64].S, move the enablement to iucv_init in the iucv base layer. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-07-24[S390] fix s390 assembler code alignmentsJan Glauber1-6/+5
The alignment is missing for various global symbols in s390 assembly code. With a recent gcc and an instruction like stgrl this can lead to a specification exception if the instruction uses such a mis-aligned address. Specify the alignment explicitly and, while at it, define __ALIGN for s390 and use the ENTRY define to save some lines of code. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-05-12[S390] correct address of _stext with CONFIG_SHARED_KERNEL=yMartin Schwidefsky1-1/+1
As of git commit 1844c9bc0b2fed3023551c1affe033ab38e90b9a head64.S/head31.S are not included in head.S anymore but built as an extra object. This breaks shared kernel support because the .org statement in head64.S/head31.S for CONFIG_SHARED_KERNEL=y will have a different effect. The end address of the head.text section in head.o will be added to the .org value; to compensate for this, subtract 0x11000 to get the required value of 0x100000 again. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
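The address arithmetic in this message can be written out as a small C sketch. The function names are hypothetical, and the sketch only models the statement that the end address of head.text is added to the .org value; it is not taken from the kernel build.

```c
#include <stdint.h>

/* Address that results when the end of head.text is added to the .org value. */
static uint64_t effective_address(uint64_t head_text_end, uint64_t org_value)
{
    return head_text_end + org_value;
}

/* .org value needed so that the effective address equals the target again. */
static uint64_t compensated_org(uint64_t target, uint64_t head_text_end)
{
    return target - head_text_end;
}
```

For the numbers in the message, `compensated_org(0x100000, 0x11000)` gives 0xef000, and adding 0x11000 back yields the required 0x100000.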
2010-03-24[S390] fix boot failures with compressed kernelsMartin Schwidefsky1-2/+0
Fix two bugs with the kernel image compression: 1) reset the bss section of the compressed vmlinux 2) clear the high half of the registers for 64 bit early enough for the decompression step Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-02-26[S390] add support for compressed kernelsMartin Schwidefsky1-12/+12
Add the "bzImage" compile target and the necessary code to generate compressed kernel images. The old style uncompressed "image" target is preserved, a simple make will build them both. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-02-26[S390] zfcpdump: remove cross arch dump supportHeiko Carstens1-69/+1
Remove support to be able to dump 31 bit systems with a 64 bit dumper. This is mostly useless since no distro ships 31 bit kernels together with a 64 bit dumper. We also get rid of a bit of hacky code. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07[S390] s390: clear high-order bits of registers after sam64Hendrik Brueckner1-0/+3
When the kernel is IPLed without the CLEAR option and switches to 64-bit, the high-order half of the registers might contain random values. This can cause addressing exceptions and the kernel enters an interrupt loop. Initialize the high-order half of the general purpose registers with zeros after switching to 64-bit mode. Cc: <stable@kernel.org> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11[S390] Limit cpu detection to 256 physical cpus.Heiko Carstens1-5/+3
Saves us more than 65k pointless IPIs. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11[S390] Initialize __LC_THREAD_INFO early.Heiko Carstens1-0/+1
"lockdep: Fix backtraces" reveals a bug in early setup code: when lockdep tries to save a stack backtrace before setup_arch has been called the lowcore pointer for the current thread info pointer isn't initialized yet. However our save stack backtrace code relies on it. If the pointer isn't initialized the saved backtrace will have zero entries. lockdep however relies (correctly) on the fact that that cannot happen. A write access to some random memory region is the result. Fix this by initializing the thread info pointer early. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-03-26[S390] eliminate ipl_device from lowcoreMartin Schwidefsky1-1/+0
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-31[PATCH] fast vdso implementation for CLOCK_THREAD_CPUTIME_IDMartin Schwidefsky1-0/+2
The extract cpu time (ectg) instruction allows the user process to get the current thread cputime without calling into the kernel. The code that uses the instruction needs to switch to the access registers mode to get access to the per-cpu info page that contains the two base values that are needed to calculate the current cputime from the CPU timer with the ectg instruction. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25[S390] Remove initial kernel stack backchain initialization.Heiko Carstens1-1/+0
Early init code clears the backchain of the initial kernel stack frame. This is not necessary since it is pre initialized with zeros. Plus it was broken on 64 bit since it cleared only four of eight bytes. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25[S390] Add processor type march=z10 and a processor type safety check.Martin Schwidefsky1-23/+0
This patch adds the code generation option for IBM System z10 and adds a check in head[31,64].S to prevent the execution of a kernel compiled for a new processor type on an old machine. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-04-30[S390] System z large page support.Gerald Schaefer1-1/+1
This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-04-30[S390] Convert machine feature detection code to C.Heiko Carstens1-62/+0
From: Heiko Carstens <heiko.carstens@de.ibm.com> From: Carsten Otte <cotte@de.ibm.com> This lets us use defines for the magic bits in machine flags instead of using plain numbers all over the place. In addition on newer machines features/facilities are indicated by the result of the stfl instruction. So we use these bits instead of trying to execute new instructions and check whether we get an exception or not. Also the mvpg instruction is always available when in zArch mode, whereas the idte instruction is only available in zArch mode. This results in some minor optimizations. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>