aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm64/kernel/head.S (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-10-06Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds1-5/+5
Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...
2022-09-09arm64/sysreg: Standardise naming for ID_AA64MMFR2_EL1.VARangeMark Brown1-2/+2
The kernel refers to ID_AA64MMFR2_EL1.VARange as LVA. In preparation for automatic generation of defines for the system registers bring the naming used by the kernel in sync with that of DDI0487H.a. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-12-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09arm64/sysreg: Add _EL1 into ID_AA64MMFR2_EL1 definition namesMark Brown1-2/+2
Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64MMFR2_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09arm64/sysreg: Add _EL1 into ID_AA64MMFR0_EL1 definition namesMark Brown1-3/+3
Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64MMFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-01arm64: head: Ignore bogus KASLR displacement on non-relocatable kernelsArd Biesheuvel1-0/+2
Even non-KASLR kernels can be built as relocatable, to work around broken bootloaders that violate the rules regarding physical placement of the kernel image - in this case, the physical offset modulo 2 MiB is used as the KASLR offset, and all absolute symbol references are fixed up in the usual way. This workaround is enabled by default. CONFIG_RELOCATABLE can also be disabled entirely, in which case the relocation code and the code that captures the offset are omitted from the build. However, since commit aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR"), this code got out of sync, and we still add the offset to the kernel virtual address before populating the page tables even though we never capture it. This means we add a bogus value instead, breaking the boot entirely. Fixes: aacd149b6238 ("arm64: head: avoid relocating the kernel twice for KASLR") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Link: https://lore.kernel.org/r/20220827070904.2216989-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-20arm64: fix KASAN_INLINEMark Rutland1-2/+3
Since commit: a004393f45d9a55e ("arm64: idreg-override: use early FDT mapping in ID map") Kernels built with KASAN_INLINE=y die early in boot before producing any console output. This is because the accesses made to the FDT (e.g. in generic string processing functions) are instrumented with KASAN, and with KASAN_INLINE=y any access to an address in TTBR0 results in a bogus shadow VA, resulting in a data abort. This patch fixes this by reverting commits: 7559d9f97581654f ("arm64: setup: drop early FDT pointer helpers") bd0c3fa21878b6d0 ("arm64: idreg-override: use early FDT mapping in ID map") ... and using the TTBR1 fixmap mapping of the FDT. Note that due to a later commit: b65e411d6cc2f12a ("arm64: Save state of HCR_EL2.E2H before switch to EL1") ... which altered the prototype of init_feature_override() (and invocation from head.S), commit bd0c3fa21878b6d0 does not revert cleanly, and I've fixed that up manually. Fixes: a004393f45d9 ("arm64: idreg-override: use early FDT mapping in ID map") Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220713140949.45440-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Allow sticky E2H when entering EL1Marc Zyngier1-24/+10
For CPUs that have the unfortunate mis-feature to be stuck in VHE mode, we perform a funny dance where we completely shortcut the normal boot process to enable VHE and run the kernel at EL2, and only then start booting the kernel. Not only this is pretty ugly, but it means that the EL2 finalisation occurs before we have processed the sysreg override. Instead, start executing the kernel as if it was an EL1 guest and rely on the normal EL2 finalisation to go back to EL2. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-4-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Save state of HCR_EL2.E2H before switch to EL1Marc Zyngier1-2/+5
As we're about to switch the way E2H-stuck CPUs boot, save the boot CPU E2H state as a flag tied to the boot mode that can then be checked by the idreg override code. This allows us to replace the is_kernel_in_hyp_mode() check with a simple comparison with this state, even when running at EL1. Note that this flag isn't saved in __boot_cpu_mode, and is only kept in a register in the assembly code. Use with caution. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Rename the VHE switch to "finalise_el2"Marc Zyngier1-3/+3
as we are about to perform a lot more in 'mutate_to_vhe' than we currently do, this function really becomes the point where we finalise the basic EL2 configuration. Reflect this into the code by renaming a bunch of things: - HVC_VHE_RESTART -> HVC_FINALISE_EL2 - switch_to_vhe --> finalise_el2 - mutate_to_vhe -> __finalise_el2 No functional changes. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: mm: fix booting with 52-bit address spaceArd Biesheuvel1-0/+18
Joey reports that booting 52-bit VA capable builds on 52-bit VA capable CPUs is broken since commit 0d9b1ffefabe ("arm64: mm: make vabits_actual a build time constant if possible"). This is due to the fact that the primary CPU reads the vabits_actual variable before it has been assigned. The reason for deferring the assignment of vabits_actual was that we try to perform as few stores to memory as we can with the MMU and caches off, due to the cache coherency issues it creates. Since __cpu_setup() [which is where the read of vabits_actual occurs] is also called on the secondary boot path, we cannot just read the CPU ID registers directly, given that the size of the VA space is decided by the capabilities of the primary CPU. So let's read vabits_actual only on the secondary boot path, and read the CPU ID registers directly on the primary boot path, by making it a function parameter of __cpu_setup(). To ensure that all users of vabits_actual (including kasan_early_init()) observe the correct value, move the assignment of vabits_actual back into asm code, but still defer it to after the MMU and caches have been enabled. Cc: Will Deacon <will@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 0d9b1ffefabe ("arm64: mm: make vabits_actual a build time constant if possible") Reported-by: Joey Gouly <joey.gouly@arm.com> Co-developed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220701111045.2944309-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-29arm64: head: remove __PHYS_OFFSETMark Rutland1-8/+3
It's very easy to confuse __PHYS_OFFSET and PHYS_OFFSET. To clarify things, let's remove __PHYS_OFFSET and use KERNEL_START directly, with comments to show that we're using physical address, as we do for other objects. At the same time, update the comment regarding the kernel entry address to mention __pa(KERNEL_START) rather than __pa(PAGE_OFFSET). There should be no functional change as a result of this patch. Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220629041207.1670133-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: setup: drop early FDT pointer helpersArd Biesheuvel1-2/+0
We no longer need to call into the kernel to map the FDT before calling into the kernel so let's drop the helpers we added for this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-22-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: avoid relocating the kernel twice for KASLRArd Biesheuvel1-52/+21
Currently, when KASLR is in effect, we set up the kernel virtual address space twice: the first time, the KASLR seed is looked up in the device tree, and the kernel virtual mapping is torn down and recreated again, after which the relocations are applied a second time. The latter step means that statically initialized global pointer variables will be reset to their initial values, and to ensure that BSS variables are not set to values based on the initial translation, they are cleared again as well. All of this is needed because we need the command line (taken from the DT) to tell us whether or not to randomize the virtual address space before entering the kernel proper. However, this code has expanded little by little and now creates global state unrelated to the virtual randomization of the kernel before the mapping is torn down and set up again, and the BSS cleared for a second time. This has created some issues in the past, and it would be better to avoid this little dance if possible. So instead, let's use the temporary mapping of the device tree, and execute the bare minimum of code to decide whether or not KASLR should be enabled, and what the seed is. Only then, create the virtual kernel mapping, clear BSS, etc and proceed as normal. This avoids the issues around inconsistent global state due to BSS being cleared twice, and is generally more maintainable, as it permits us to defer all the remaining DT parsing and KASLR initialization to a later time. This means the relocation fixup code runs only a single time as well, allowing us to simplify the RELR handling code too, which is not idempotent and was therefore required to keep track of the offset that was applied the first time around. Note that this means we have to clone a pair of FDT library objects, so that we can control how they are built - we need the stack protector and other instrumentation disabled so that the code can tolerate being called this early. Note that only the kernel page tables and the temporary stack are mapped read-write at this point, which ensures that the early code does not modify any global state inadvertently. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-21-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: record CPU boot mode after enabling the MMUArd Biesheuvel1-37/+13
In order to avoid having to touch memory with the MMU and caches disabled, and therefore having to invalidate it from the caches explicitly, just defer storing the value until after the MMU has been turned on, unless we are giving up with an error. While at it, move the associated variable definitions into C code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-19-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: populate kernel page tables with MMU and caches onArd Biesheuvel1-46/+16
Now that we can access the entire kernel image via the ID map, we can execute the page table population code with the MMU and caches enabled. The only thing we need to ensure is that translations via TTBR1 remain disabled while we are updating the page tables the second time around, in case KASLR wants them to be randomized. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-18-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: factor out TTBR1 assignment into a macroArd Biesheuvel1-4/+1
Create a macro load_ttbr1 to avoid having to repeat the same instruction sequence 3 times in a subsequent patch. No functional change intended. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-17-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: idreg-override: use early FDT mapping in ID mapArd Biesheuvel1-0/+1
Instead of calling into the kernel to map the FDT into the kernel page tables before even calling start_kernel(), let's switch to the initial, temporary mapping of the device tree that has been added to the ID map. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-16-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: create a temporary FDT mapping in the initial ID mapArd Biesheuvel1-1/+13
We need to access the DT very early to get at the command line and the KASLR seed, which currently means we rely on some hacks to call into the kernel before really calling into the kernel, which is undesirable. So instead, let's create a mapping for the FDT in the initial ID map, which is feasible now that it has been extended to cover more than a single page or block, and can be updated in place to remap other output addresses. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-15-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: use relative references to the RELA and RELR tablesArd Biesheuvel1-9/+4
Formerly, we had to access the RELA and RELR tables via the kernel mapping that was being relocated, and so deriving the start and end addresses using ADRP/ADD references was not possible, as the relocation code runs from the ID map. Now that we map the entire kernel image via the ID map, we can simplify this, and just load the entries via the ID map as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-14-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: cover entire kernel image in initial ID mapArd Biesheuvel1-10/+21
As a first step towards avoiding the need to create, tear down and recreate the kernel virtual mapping with MMU and caches disabled, start by expanding the ID map so it covers the page tables as well as all executable code. This will allow us to populate the page tables with the MMU and caches on, and call KASLR init code before setting up the virtual mapping. Since this ID map is only needed at boot, create it as a temporary set of page tables, and populate the permanent ID map after enabling the MMU and caches. While at it, switch to read-only attributes for the where possible, as writable permissions are only needed for the initial kernel page tables. Note that on 4k granule configurations, the permanent ID map will now be reduced to a single page rather than a 2M block mapping. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: add helper function to remap regions in early page tablesArd Biesheuvel1-0/+33
The asm macros used to create the initial ID map and kernel mappings don't support randomly remapping parts of the address space after it has been populated. What we can do, however, given that all block or page mappings are created at the final level, is take a subset of the mapped range and update its attributes or output address. This will permit us to make parts of these page tables read-only, or remap a part of it to cover the device tree. So add a helper that encapsulates this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-12-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: pass ID map root table address to __enable_mmu()Ard Biesheuvel1-6/+8
We will be adding an initial ID map that covers the entire kernel image, so we will pass the actual ID map root table to use to __enable_mmu(), rather than hard code it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-10-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: split off idmap creation codeArd Biesheuvel1-49/+52
Split off the creation of the ID map page tables, so that we can avoid running it again unnecessarily when KASLR is in effect (which only randomizes the virtual placement). This will permit us to drop some explicit cache maintenance to the PoC which was necessary because the cache invalidation being performed on some global variables might otherwise clobber unrelated variables that happen to share a cacheline. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-8-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: switch to map_memory macro for the extended ID mapArd Biesheuvel1-39/+37
In a future patch, we will start using an ID map that covers the entire image, rather than a single page. This means that we need to deal with the pathological case of an extended ID map where the kernel image does not fit neatly inside a single entry at the root level, which means we will need to create additional table entries and map additional pages for page tables. The existing map_memory macro already takes care of most of that, so let's just extend it to deal with this case as well. While at it, drop the conditional branch on the value of T0SZ: we don't set the variable anymore in the entry code, and so we can just let the map_memory macro deal with the case where the output address exceeds VA_BITS. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-7-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: simplify page table mapping macros (slightly)Ard Biesheuvel1-33/+22
Simplify the macros in head.S that are used to set up the early page tables, by switching to immediates for the number of bits that are interpreted as the table index at each level. This makes it much easier to infer from the instruction stream what is going on, and reduces the number of instructions emitted substantially. Note that the extended ID map for cases where no additional level needs to be configured now uses a compile time size as well, which means that we interpret up to 10 bits as the table index at the root level (for 52-bit physical addressing), without taking into account whether or not this is supported on the current system. However, those bits can only be set if we are executing the image from an address that exceeds the 48-bit PA range, and are guaranteed to be cleared otherwise, and given that we are dealing with a mapping in the lower TTBR0 range of the address space, the result is therefore the same as if we'd mask off only 6 bits. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-6-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: drop idmap_ptrs_per_pgdArd Biesheuvel1-4/+3
The assignment of idmap_ptrs_per_pgd lacks any cache invalidation, even though it is updated with the MMU and caches disabled. However, we never bother to read the value again except in the very next instruction, and so we can just drop the variable entirely. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220624150651.1358849-5-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: move assignment of idmap_t0sz to C codeArd Biesheuvel1-12/+1
Setting idmap_t0sz involves fiddling with the caches if done with the MMU off. Since we will be creating an initial ID map with the MMU and caches off, and the permanent ID map with the MMU and caches on, let's move this assignment of idmap_t0sz out of the startup code, and replace it with a macro that simply issues the three instructions needed to calculate the value wherever it is needed before the MMU is turned on. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-4-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: mm: make vabits_actual a build time constant if possibleArd Biesheuvel1-14/+1
Currently, we only support 52-bit virtual addressing on 64k pages configurations, and in all other cases, vabits_actual is guaranteed to equal VA_BITS (== VA_BITS_MIN). So get rid of the variable entirely in that case. While at it, move the assignment out of the asm entry code - it has no need to be there. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-3-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: move kimage_vaddr variable into C fileArd Biesheuvel1-7/+0
This variable definition does not need to be in head.S so move it out. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220624150651.1358849-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-09-30sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=yArd Biesheuvel1-1/+1
THREAD_INFO_IN_TASK moved the CPU field out of thread_info, but this causes some issues on architectures that define raw_smp_processor_id() in terms of this field, due to the fact that #include'ing linux/sched.h to get at struct task_struct is problematic in terms of circular dependencies. Given that thread_info and task_struct are the same data structure anyway when THREAD_INFO_IN_TASK=y, let's move it back so that having access to the type definition of struct thread_info is sufficient to reference the CPU number of the current task. Note that this requires THREAD_INFO_IN_TASK's definition of the task_thread_info() helper to be updated, as task_cpu() takes a pointer-to-const, whereas task_thread_info() (which is used to generate lvalues as well), needs a non-const pointer. So make it a macro instead. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2021-08-24arm64: head: avoid over-mapping in map_memoryMark Rutland1-5/+6
The `compute_indices` and `populate_entries` macros operate on inclusive bounds, and thus the `map_memory` macro which uses them also operates on inclusive bounds. We pass `_end` and `_idmap_text_end` to `map_memory`, but these are exclusive bounds, and if one of these is sufficiently aligned (as a result of kernel configuration, physical placement, and KASLR), then: * In `compute_indices`, the computed `iend` will be in the page/block *after* the final byte of the intended mapping. * In `populate_entries`, an unnecessary entry will be created at the end of each level of table. At the leaf level, this entry will map up to SWAPPER_BLOCK_SIZE bytes of physical addresses that we did not intend to map. As we may map up to SWAPPER_BLOCK_SIZE bytes more than intended, we may violate the boot protocol and map physical address past the 2MiB-aligned end address we are permitted to map. As we map these with Normal memory attributes, this may result in further problems depending on what these physical addresses correspond to. The final entry at each level may require an additional table at that level. As EARLY_ENTRIES() calculates an inclusive bound, we allocate enough memory for this. Avoid the extraneous mapping by having map_memory convert the exclusive end address to an inclusive end address by subtracting one, and do likewise in EARLY_ENTRIES() when calculating the number of required tables. For clarity, comments are updated to more clearly document which boundaries the macros operate on. For consistency with the other macros, the comments in map_memory are also updated to describe `vstart` and `vend` as virtual addresses. Fixes: 0370b31e4845 ("arm64: Extend early page table code to allow for larger kernels") Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210823101253.55567-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-06-24Merge branch 'for-next/mm' into for-next/coreWill Deacon1-3/+2
Lots of cleanup to our various page-table definitions, but also some non-critical fixes and removal of some unnecessary memory types. The most interesting change here is the reduction of ARCH_DMA_MINALIGN back to 64 bytes, since we're not aware of any machines that need a higher value with the way the code is structured (only needed for non-coherent DMA). * for-next/mm: arm64: tlb: fix the TTL value of tlb_get_level arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS arm64: head: fix code comments in set_cpu_boot_mode_flag arm64: mm: drop unused __pa(__idmap_text_start) arm64: mm: fix the count comments in compute_indices arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan arm64: mm: Pass original fault address to handle_mm_fault() arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK] arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT arm64/mm: Drop SWAPPER_INIT_MAP_SIZE arm64: mm: decode xFSC in mem_abort_decode() arm64: mm: Add is_el1_data_abort() helper arm64: cache: Lower ARCH_DMA_MINALIGN to 64 (L1_CACHE_BYTES) arm64: mm: Remove unused support for Normal-WT memory type arm64: acpi: Map EFI_MEMORY_WT memory as Normal-NC arm64: mm: Remove unused support for Device-GRE memory type arm64: mm: Use better bitmap_zalloc() arm64/mm: Make vmemmap_free() available only with CONFIG_MEMORY_HOTPLUG arm64/mm: Remove [PUD|PMD]_TABLE_BIT from [pud|pmd]_bad() arm64/mm: Validate CONFIG_PGTABLE_LEVELS
2021-06-24Merge branch 'for-next/caches' into for-next/coreWill Deacon1-8/+5
Big cleanup of our cache maintenance routines, which were confusingly named and inconsistent in their implementations. * for-next/caches: arm64: Rename arm64-internal cache maintenance functions arm64: Fix cache maintenance function comments arm64: sync_icache_aliases to take end parameter instead of size arm64: __clean_dcache_area_pou to take end parameter instead of size arm64: __clean_dcache_area_pop to take end parameter instead of size arm64: __clean_dcache_area_poc to take end parameter instead of size arm64: __flush_dcache_area to take end parameter instead of size arm64: dcache_by_line_op to take end parameter instead of size arm64: __inval_dcache_area to take end parameter instead of size arm64: Fix comments to refer to correct function __flush_icache_range arm64: Move documentation of dcache_by_line_op arm64: assembler: remove user_alt arm64: Downgrade flush_icache_range to invalidate arm64: Do not enable uaccess for invalidate_icache_range arm64: Do not enable uaccess for flush_icache_range arm64: Apply errata to swsusp_arch_suspend_exit arm64: assembler: add conditional cache fixups arm64: assembler: replace `kaddr` with `addr`
2021-06-15arm64: head: fix code comments in set_cpu_boot_mode_flagDong Aisheng1-1/+1
Up to here, the CPU boot mode can either be EL1 or EL2. Correct the code comments a bit. Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210518101405.1048860-5-aisheng.dong@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-15arm64: mm: drop unused __pa(__idmap_text_start)Dong Aisheng1-1/+0
x5 is not used in the following map_memory. Instead, __pa(__idmap_text_start) is stored in x3 which is used later. Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210518101405.1048860-4-aisheng.dong@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-15arm64: mm: fix the count comments in compute_indicesDong Aisheng1-1/+1
'count - 1' is confusing and not comply with the real code running. 'count' actually represents the extra entries required, no need minus 1. Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210518101405.1048860-3-aisheng.dong@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-27arm64: scs: Drop unused 'tmp' argument to scs_{load, save} asm macrosWill Deacon1-1/+1
The scs_load and scs_save asm macros don't make use of the mandatory 'tmp' register argument, so drop it and fix up the callers. Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20210527105529.21967-1-will@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: initialize cpu offset earlierMark Rutland1-6/+11
Now that we have a consistent place to initialize CPU context registers early in the boot path, let's also initialize the per-cpu offset here. This makes the primary and secondary boot paths more consistent, and allows for the use of per-cpu operations earlier, which will be necessary for instrumentation with KCSAN. Note that smp_prepare_boot_cpu() still needs to re-initialize CPU0's offset as immediately prior to this the per-cpu areas may be reallocated, and hence the boot-time offset may be stale. A comment is added to make this clear. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: unify task and sp setupMark Rutland1-18/+15
Once we enable the MMU, we have to initialize: * SP_EL0 to point at the active task * SP to point at the active task's stack * SCS_SP to point at the active task's shadow stack For all tasks (including init_task), this information can be derived from the task's task_struct. Let's unify __primary_switched and __secondary_switched to consistently acquire this information from the relevant task_struct. At the same time, let's fold this together with initializing a task's final frame. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-26arm64: smp: remove stack from secondary_dataMark Rutland1-3/+4
When we boot a secondary CPU, we pass it a task and a stack to use. As the stack is always the task's stack, which can be derived from the task, let's have the secondary CPU derive this itself and avoid passing redundant information. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210520115031.18509-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Rename arm64-internal cache maintenance functionsFuad Tabba1-4/+4
Although naming across the codebase isn't that consistent, it tends to follow certain patterns. Moreover, the term "flush" isn't defined in the Arm Architecture reference manual, and might be interpreted to mean clean, invalidate, or both for a cache. Rename arm64-internal functions to make the naming internally consistent, as well as making it consistent with the Arm ARM, by specifying whether it applies to the instruction, data, or both caches, whether the operation is a clean, invalidate, or both. Also specify which point the operation applies to, i.e., to the point of unification (PoU), coherency (PoC), or persistence (PoP). This commit applies the following sed transformation to all files under arch/arm64: "s/\b__flush_cache_range\b/caches_clean_inval_pou_macro/g;"\ "s/\b__flush_icache_range\b/caches_clean_inval_pou/g;"\ "s/\binvalidate_icache_range\b/icache_inval_pou/g;"\ "s/\b__flush_dcache_area\b/dcache_clean_inval_poc/g;"\ "s/\b__inval_dcache_area\b/dcache_inval_poc/g;"\ "s/__clean_dcache_area_poc\b/dcache_clean_poc/g;"\ "s/\b__clean_dcache_area_pop\b/dcache_clean_pop/g;"\ "s/\b__clean_dcache_area_pou\b/dcache_clean_pou/g;"\ "s/\b__flush_cache_user_range\b/caches_clean_inval_user_pou/g;"\ "s/\b__flush_icache_all\b/icache_inval_all_pou/g;" Note that __clean_dcache_area_poc is deliberately missing a word boundary check at the beginning in order to match the efistub symbols in image-vars.h. Also note that, despite its name, __flush_icache_range operates on both instruction and data caches. The name change here reflects that. No functional change intended. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-19-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: __inval_dcache_area to take end parameter instead of sizeFuad Tabba1-4/+1
To be consistent with other functions with similar names and functionality in cacheflush.h, cache.S, and cachetlb.rst, change to specify the range in terms of start and end, as opposed to start and size. Because the code is shared with __dma_inv_area, it changes the parameters for that as well. However, __dma_inv_area is local to cache.S, so no other users are affected. No functional change intended. Reported-by: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210524083001.2586635-11-tabba@google.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-25arm64: Implement stack trace termination recordMadhavan T. Venkataraman1-6/+19
Reliable stacktracing requires that we identify when a stacktrace is terminated early. We can do this by ensuring all tasks have a final frame record at a known location on their task stack, and checking that this is the final frame record in the chain. We'd like to use task_pt_regs(task)->stackframe as the final frame record, as this is already setup upon exception entry from EL0. For kernel tasks we need to consistently reserve the pt_regs and point x29 at this, which we can do with small changes to __primary_switched, __secondary_switched, and copy_process(). Since the final frame record must be at a specific location, we must create the final frame record in __primary_switched and __secondary_switched rather than leaving this to start_kernel and secondary_start_kernel. Thus, __primary_switched and __secondary_switched will now show up in stacktraces for the idle tasks. Since the final frame record is now identified by its location rather than by its contents, we identify it at the start of unwind_frame(), before we read any values from it. External debuggers may terminate the stack trace when FP == 0. In the pt_regs->stackframe, the PC is 0 as well. So, stack traces taken in the debugger may print an extra record 0x0 at the end. While this is not pretty, this does not do any harm. This is a small price to pay for having reliable stack trace termination in the kernel. That said, gdb does not show the extra record probably because it uses DWARF and not frame pointers for stack traces. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> [Mark: rebase, use ASM_BUG(), update comments, update commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210510110026.18061-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-04-08arm64: Cope with CPUs stuck in VHE modeMarc Zyngier1-3/+36
It seems that the CPUs part of the SoC known as Apple M1 have the terrible habit of being stuck with HCR_EL2.E2H==1, in violation of the architecture. Try and work around this deplorable state of affairs by detecting the stuck bit early and short-circuit the nVHE dance. Additional filtering code ensures that attempts at switching to nVHE from the command-line are also ignored. It is still unknown whether there are many more such nuggets to be found... Reported-by: Hector Martin <marcan@marcan.st> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210408131010.1109027-3-maz@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-03-11arm64: mm: use a 48-bit ID map when possible on 52-bit VA buildsArd Biesheuvel1-1/+1
52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anwyay, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one. This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well. Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable. Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: Mark Salter <msalter@redhat.com> Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-03-10arm64/mm: Fix __enable_mmu() for new TGRAN range valuesJames Morse1-2/+4
As per ARM ARM DDI 0487G.a, when FEAT_LPA2 is implemented, ID_AA64MMFR0_EL1 might contain a range of values to describe supported translation granules (4K and 16K pages sizes in particular) instead of just enabled or disabled values. This changes __enable_mmu() function to handle complete acceptable range of values (depending on whether the field is signed or unsigned) now represented with ID_AA64MMFR0_TGRAN_SUPPORTED_[MIN..MAX] pair. While here, also fix similar situations in EFI stub and KVM as well. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-efi@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1615355590-21102-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-02-24arm64: Add missing ISB after invalidating TLB in __primary_switchMarc Zyngier1-0/+1
Although there has been a bit of back and forth on the subject, it appears that invalidating TLBs requires an ISB instruction when FEAT_ETS is not implemented by the CPU. From the bible: | In an implementation that does not implement FEAT_ETS, a TLB | maintenance instruction executed by a PE, PEx, can complete at any | time after it is issued, but is only guaranteed to be finished for a | PE, PEx, after the execution of DSB by the PEx followed by a Context | synchronization event Add the missing ISB in __primary_switch, just in case. Fixes: 3c5e9f238bc4 ("arm64: head.S: move KASLR processing out of __enable_mmu()") Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210224093738.3629662-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: Defer enabling pointer authentication on boot coreSrinivas Ramana1-4/+0
Defer enabling pointer authentication on boot core until after its required to be enabled by cpufeature framework. This will help in controlling the feature dynamically with a boot parameter. Signed-off-by: Ajay Patil <pajay@qti.qualcomm.com> Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1610152163-16554-2-git-send-email-sramana@codeaurora.org Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-22-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: cpufeature: Add an early command-line cpufeature override facilityMarc Zyngier1-0/+1
In order to be able to override CPU features at boot time, let's add a command line parser that matches options of the form "cpureg.feature=value", and store the corresponding value into the override val/mask pair. No features are currently defined, so no expected change in functionality. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: David Brazdil <dbrazdil@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20210208095732.3267263-14-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2021-02-09arm64: Extract early FDT mapping from kaslr_early_init()Marc Zyngier1-1/+2
As we want to parse more options very early in the kernel lifetime, let's always map the FDT early. This is achieved by moving that code out of kaslr_early_init(). No functional change expected. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: David Brazdil <dbrazdil@google.com> Link: https://lore.kernel.org/r/20210208095732.3267263-13-maz@kernel.org [will: Ensue KASAN is enabled before running C code] Signed-off-by: Will Deacon <will@kernel.org>