aboutsummaryrefslogtreecommitdiffstats
path: root/arch/arm64/kernel
AgeCommit message (Collapse)AuthorFilesLines
2022-07-05arm64/idreg: Fix tab/space damageMark Brown1-7/+7
Quite a few of the overrides in idreg-override.c have a mix of tabs and spaces in their definitions, fix these. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-05arm64/cpuinfo: Remove references to reserved cache typeMark Brown1-8/+15
In 155433cb365ee466 ("arm64: cache: Remove support for ASID-tagged VIVT I-caches") we removed all the support fir AIVIVT cache types and renamed all references to the field to say "unknown" since support for AIVIVT caches was removed from the architecture. Some confusion has resulted since the corresponding change to the architecture left the value named as AIVIVT but documented it as reserved in v8, refactor the code so we don't define the constant instead. This will help with automatic generation of this register field since it means we care less about the correspondence with the ARM. No functional change, the value displayed to userspace is unchanged. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-2-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-04arm64: topology: Remove redundant setting of llc_id in CPU topologySudeep Holla1-14/+0
Since the cacheinfo LLC information is used directly in arch_topology, there is no need to parse and fetch the LLC ID information only for ACPI systems. Just drop the redundant parsing and setting of llc_id in CPU topology from ACPI PPTT. Link: https://lore.kernel.org/r/20220704101605.1318280-12-sudeep.holla@arm.com Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Ionela Voinescu <ionela.voinescu@arm.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-07-04arm64: fix oops in concurrently setting insn_emulation sysctlshaibinzhang (张海斌)1-4/+5
emulation_proc_handler() changes table->data for proc_dointvec_minmax and can generate the following Oops if called concurrently with itself: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010 | Internal error: Oops: 96000006 [#1] SMP | Call trace: | update_insn_emulation_mode+0xc0/0x148 | emulation_proc_handler+0x64/0xb8 | proc_sys_call_handler+0x9c/0xf8 | proc_sys_write+0x18/0x20 | __vfs_write+0x20/0x48 | vfs_write+0xe4/0x1d0 | ksys_write+0x70/0xf8 | __arm64_sys_write+0x20/0x28 | el0_svc_common.constprop.0+0x7c/0x1c0 | el0_svc_handler+0x2c/0xa0 | el0_svc+0x8/0x200 To fix this issue, keep the table->data as &insn->current_mode and use container_of() to retrieve the insn pointer. Another mutex is used to protect against the current_mode update but not for retrieving insn_emulation as table->data is no longer changing. Co-developed-by: hewenliang <hewenliang4@huawei.com> Signed-off-by: hewenliang <hewenliang4@huawei.com> Signed-off-by: Haibin Zhang <haibinzhang@tencent.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220128090324.2727688-1-hewenliang4@huawei.com Link: https://lore.kernel.org/r/9A004C03-250B-46C5-BF39-782D7551B00E@tencent.com Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Add an override for ID_AA64SMFR0_EL1.FA64Marc Zyngier2-7/+19
Add a specific override for ID_AA64SMFR0_EL1.FA64, which disables the full A64 streaming SVE mode. Note that no alias is provided for this, as this is already covered by arm64.nosme, and is only added as a debugging facility. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-10-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Add the arm64.nosve command line optionMarc Zyngier3-2/+43
In order to be able to completely disable SVE even if the HW seems to support it (most likely because the FW is broken), move the SVE setup into the EL2 finalisation block, and use a new idreg override to deal with it. Note that we also nuke id_aa64zfr0_el1 as a byproduct, and that SME also gets disabled, due to the dependency between the two features. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-9-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Add the arm64.nosme command line optionMarc Zyngier3-1/+61
In order to be able to completely disable SME even if the HW seems to support it (most likely because the FW is broken), move the SME setup into the EL2 finalisation block, and use a new idreg override to deal with it. Note that we also nuke id_aa64smfr0_el1 as a byproduct. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-8-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Expose a __check_override primitive for oddball featuresMarc Zyngier1-6/+11
In order to feal with early override of features that are not classically encoded in a standard ID register with a 4 bit wide field, add a primitive that takes a sysreg value as an input (instead of the usual sysreg name) as well as a bit field width (usually 4). No functional change. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-7-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Allow the idreg override to deal with variable field widthMarc Zyngier1-12/+16
Currently, the override mechanism can only deal with 4bit fields, which is the most common case. However, we now have a bunch of ID registers that have more diverse field widths, such as ID_AA64SMFR0_EL1, which has fields that are a single bit wide. Add the support for variable width, and a macro that encodes a feature width of 4 for all existing override. No functional change. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-6-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Factor out checking of a feature against the override into a macroMarc Zyngier1-14/+20
Checking for a feature being supported from assembly code is a bit tedious if we need to factor in the idreg override. Since we already have such code written for forcing nVHE, move the whole thing into a macro. This heavily relies on the override structure being called foo_override for foo_el1. No functional change. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-5-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Allow sticky E2H when entering EL1Marc Zyngier1-24/+10
For CPUs that have the unfortunate mis-feature to be stuck in VHE mode, we perform a funny dance where we completely shortcut the normal boot process to enable VHE and run the kernel at EL2, and only then start booting the kernel. Not only this is pretty ugly, but it means that the EL2 finalisation occurs before we have processed the sysreg override. Instead, start executing the kernel as if it was an EL1 guest and rely on the normal EL2 finalisation to go back to EL2. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-4-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Save state of HCR_EL2.E2H before switch to EL1Marc Zyngier2-5/+13
As we're about to switch the way E2H-stuck CPUs boot, save the boot CPU E2H state as a flag tied to the boot mode that can then be checked by the idreg override code. This allows us to replace the is_kernel_in_hyp_mode() check with a simple comparison with this state, even when running at EL1. Note that this flag isn't saved in __boot_cpu_mode, and is only kept in a register in the assembly code. Use with caution. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-3-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: Rename the VHE switch to "finalise_el2"Marc Zyngier3-15/+14
as we are about to perform a lot more in 'mutate_to_vhe' than we currently do, this function really becomes the point where we finalise the basic EL2 configuration. Reflect this into the code by renaming a bunch of things: - HVC_VHE_RESTART -> HVC_FINALISE_EL2 - switch_to_vhe --> finalise_el2 - mutate_to_vhe -> __finalise_el2 No functional changes. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220630160500.1536744-2-maz@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: mm: fix booting with 52-bit address spaceArd Biesheuvel1-0/+18
Joey reports that booting 52-bit VA capable builds on 52-bit VA capable CPUs is broken since commit 0d9b1ffefabe ("arm64: mm: make vabits_actual a build time constant if possible"). This is due to the fact that the primary CPU reads the vabits_actual variable before it has been assigned. The reason for deferring the assignment of vabits_actual was that we try to perform as few stores to memory as we can with the MMU and caches off, due to the cache coherency issues it creates. Since __cpu_setup() [which is where the read of vabits_actual occurs] is also called on the secondary boot path, we cannot just read the CPU ID registers directly, given that the size of the VA space is decided by the capabilities of the primary CPU. So let's read vabits_actual only on the secondary boot path, and read the CPU ID registers directly on the primary boot path, by making it a function parameter of __cpu_setup(). To ensure that all users of vabits_actual (including kasan_early_init()) observe the correct value, move the assignment of vabits_actual back into asm code, but still defer it to after the MMU and caches have been enabled. Cc: Will Deacon <will@kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 0d9b1ffefabe ("arm64: mm: make vabits_actual a build time constant if possible") Reported-by: Joey Gouly <joey.gouly@arm.com> Co-developed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220701111045.2944309-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: vdso32: Add DWARF_DEBUGNathan Chancellor1-0/+1
When building the 32-bit vDSO with LLVM 15 and CONFIG_DEBUG_INFO, there are the following orphan section warnings: ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_abbrev) is being placed in '.debug_abbrev' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_info) is being placed in '.debug_info' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_str_offsets) is being placed in '.debug_str_offsets' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_str) is being placed in '.debug_str' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_addr) is being placed in '.debug_addr' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_line) is being placed in '.debug_line' ld.lld: warning: arch/arm64/kernel/vdso32/note.o:(.debug_line_str) is being placed in '.debug_line_str' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_loclists) is being placed in '.debug_loclists' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_abbrev) is being placed in '.debug_abbrev' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_info) is being placed in '.debug_info' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_rnglists) is being placed in '.debug_rnglists' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_str_offsets) is being placed in '.debug_str_offsets' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_str) is being placed in '.debug_str' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_addr) is being placed in '.debug_addr' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_frame) is being placed in '.debug_frame' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_line) is being placed in '.debug_line' ld.lld: warning: arch/arm64/kernel/vdso32/vgettimeofday.o:(.debug_line_str) is being placed in '.debug_line_str' These are DWARF5 sections, as that is the implicit default version for clang-14 and newer when just '-g' is used. All DWARF sections are handled by the DWARF_DEBUG macro from include/asm-generic/vmlinux.lds.h so use that macro here to fix the warnings regardless of DWARF version. Fixes: 9d4775b332e1 ("arm64: vdso32: enable orphan handling for VDSO") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20220630153121.1317045-3-nathan@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: vdso32: Shuffle .ARM.exidx section above ELF_DETAILSNathan Chancellor1-1/+1
When building the 32-bit vDSO after commit 5c4fb60816ea ("arm64: vdso32: add ARM.exidx* sections"), ld.lld 11 fails to link: ld.lld: error: could not allocate headers ld.lld: error: unable to place section .text at file offset [0x2A0, 0xBB1]; check your linker script for overflows ld.lld: error: unable to place section .comment at file offset [0xBB2, 0xC8A]; check your linker script for overflows ld.lld: error: unable to place section .symtab at file offset [0xC8C, 0xE0B]; check your linker script for overflows ld.lld: error: unable to place section .strtab at file offset [0xE0C, 0xF1C]; check your linker script for overflows ld.lld: error: unable to place section .shstrtab at file offset [0xF1D, 0xFAA]; check your linker script for overflows ld.lld: error: section .ARM.exidx file range overlaps with .hash >>> .ARM.exidx range is [0x90, 0xCF] >>> .hash range is [0xB4, 0xE3] ld.lld: error: section .hash file range overlaps with .ARM.attributes >>> .hash range is [0xB4, 0xE3] >>> .ARM.attributes range is [0xD0, 0x10B] ld.lld: error: section .ARM.attributes file range overlaps with .dynsym >>> .ARM.attributes range is [0xD0, 0x10B] >>> .dynsym range is [0xE4, 0x133] ld.lld: error: section .ARM.exidx virtual address range overlaps with .hash >>> .ARM.exidx range is [0x90, 0xCF] >>> .hash range is [0xB4, 0xE3] ld.lld: error: section .ARM.exidx load address range overlaps with .hash >>> .ARM.exidx range is [0x90, 0xCF] >>> .hash range is [0xB4, 0xE3] This was fixed in ld.lld 12 with a change to match GNU ld's semantics of placing non-SHF_ALLOC sections after SHF_ALLOC sections. To workaround this issue, move the .ARM.exidx section before the .comment, .symtab, .strtab, and .shstrtab sections (ELF_DETAILS) so that those sections remain contiguous with the .ARM.attributes section. Fixes: 5c4fb60816ea ("arm64: vdso32: add ARM.exidx* sections") Link: https://github.com/llvm/llvm-project/commit/ec29538af2e0886a65f479d6a533956a1c478132 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220630153121.1317045-2-nathan@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-07-01arm64: compat: Move sigreturn32.S to .rodata sectionChen Zhongjin1-0/+1
Kuser code should be inside .rodata. sigreturn32.S is splited from kuser32.S, the code in .text section is never executed. Move it to .rodata. Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Link: https://lore.kernel.org/r/20220701035456.250877-1-chenzhongjin@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-29arm64/fpsimd: Remove duplicate SYS_SVCR readSchspa Shi1-1/+0
It seems to be a typo, remove the duplicate SYS_SVCR read. Signed-off-by: Schspa Shi <schspa@gmail.com> Link: https://lore.kernel.org/r/20220629051023.18173-1-schspa@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-29arm64: head: remove __PHYS_OFFSETMark Rutland1-8/+3
It's very easy to confuse __PHYS_OFFSET and PHYS_OFFSET. To clarify things, let's remove __PHYS_OFFSET and use KERNEL_START directly, with comments to show that we're using physical address, as we do for other objects. At the same time, update the comment regarding the kernel entry address to mention __pa(KERNEL_START) rather than __pa(PAGE_OFFSET). There should be no functional change as a result of this patch. Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220629041207.1670133-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-29arm64: lds: use PROVIDE instead of conditional definitionsArd Biesheuvel1-31/+30
Currently, a build with CONFIG_EFI=n and CONFIG_KASAN=y will not complete successfully because of missing symbols. This is due to the fact that the __pi_ prefixed aliases for __memcpy/__memmove were put inside a #ifdef CONFIG_EFI block inadvertently, and are therefore missing from the build in question. These definitions should only be provided when needed, as they will otherwise clutter up the symbol table, kallsyms etc for no reason. Fortunately, instead of using CPP conditionals, we can achieve the same result by using the linker's PROVIDE() directive, which only defines a symbol if it is required to complete the link. So let's use that for all symbols alias definitions. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220629083246.3729177-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28arm64: vdso*: place got/plt sections in .rodataJoey Gouly2-20/+15
The vDSO will not contain absolute relocations, so place these sections in .rodata. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/00abb0c5-6360-0004-353f-e7a88b3bd22c@arm.com/ Cc: Will Deacon <will@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20220628151307.35561-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-28arm64: vdso32: add ARM.exidx* sectionsJoey Gouly1-0/+1
These show up when building with clang+lld. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20220628151307.35561-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27arm64: mm: Convert to GENERIC_IOREMAPKefeng Wang1-1/+1
Add hook for arm64's special operation when ioremap(), then ioremap_wc/np/cache is converted to use ioremap_prot() from GENERIC_IOREMAP, update the Copyright and kill the unused inclusions. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20220607125027.44946-6-wangkefeng.wang@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27arm64: Copy the task argument to unwind_stateMadhavan T. Venkataraman1-13/+20
Copy the task argument passed to arch_stack_walk() to unwind_state so that it can be passed to unwind functions via unwind_state rather than as a separate argument. The task is a fundamental part of the unwind state. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220617180219.20352-3-madvenka@linux.microsoft.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27arm64: Split unwind_init()Madhavan T. Venkataraman1-11/+55
unwind_init() is currently a single function that initializes all of the unwind state. Split it into the following functions and call them appropriately: - unwind_init_from_regs() - initialize from regs passed by caller. - unwind_init_from_caller() - initialize for the current task from the caller of arch_stack_walk(). - unwind_init_from_task() - initialize from the saved state of a task other than the current task. In this case, the other task must not be running. This is done for two reasons: - the different ways of initializing are clear - specialized code can be added to each initializer in the future. Signed-off-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220617180219.20352-2-madvenka@linux.microsoft.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27arm64/signal: Clean up SVE/SME feature checking inconsistencyMark Brown1-8/+12
Currently when restoring signal state we check to see if SVE is supported in restore_sigframe() but check to see if SVE is supported inside restore_sve_fpsimd_context(). This makes no real difference since SVE is always supported in systems with SME but looks a bit untidy and makes things slightly harder to follow, move the SVE check next to the SME one in restore_sve_fpsimd_context(). Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220624172108.555000-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: setup: drop early FDT pointer helpersArd Biesheuvel2-17/+0
We no longer need to call into the kernel to map the FDT before calling into the kernel so let's drop the helpers we added for this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-22-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: avoid relocating the kernel twice for KASLRArd Biesheuvel6-140/+171
Currently, when KASLR is in effect, we set up the kernel virtual address space twice: the first time, the KASLR seed is looked up in the device tree, and the kernel virtual mapping is torn down and recreated again, after which the relocations are applied a second time. The latter step means that statically initialized global pointer variables will be reset to their initial values, and to ensure that BSS variables are not set to values based on the initial translation, they are cleared again as well. All of this is needed because we need the command line (taken from the DT) to tell us whether or not to randomize the virtual address space before entering the kernel proper. However, this code has expanded little by little and now creates global state unrelated to the virtual randomization of the kernel before the mapping is torn down and set up again, and the BSS cleared for a second time. This has created some issues in the past, and it would be better to avoid this little dance if possible. So instead, let's use the temporary mapping of the device tree, and execute the bare minimum of code to decide whether or not KASLR should be enabled, and what the seed is. Only then, create the virtual kernel mapping, clear BSS, etc and proceed as normal. This avoids the issues around inconsistent global state due to BSS being cleared twice, and is generally more maintainable, as it permits us to defer all the remaining DT parsing and KASLR initialization to a later time. This means the relocation fixup code runs only a single time as well, allowing us to simplify the RELR handling code too, which is not idempotent and was therefore required to keep track of the offset that was applied the first time around. Note that this means we have to clone a pair of FDT library objects, so that we can control how they are built - we need the stack protector and other instrumentation disabled so that the code can tolerate being called this early. Note that only the kernel page tables and the temporary stack are mapped read-write at this point, which ensures that the early code does not modify any global state inadvertently. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-21-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: kaslr: defer initialization to initcall where permittedArd Biesheuvel1-55/+40
The early KASLR init code runs extremely early, and anything that could be deferred until later should be. So let's defer the randomization of the module region until much later - this also simplifies the arithmetic, given that we no longer have to reason about the link time vs load time placement of the core kernel explicitly. Also get rid of the global status variable, and infer the status reported by the diagnostic print from other KASLR related context. While at it, get rid of the special case for KASAN without KASAN_VMALLOC, which never occurs in practice. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-20-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: record CPU boot mode after enabling the MMUArd Biesheuvel2-39/+15
In order to avoid having to touch memory with the MMU and caches disabled, and therefore having to invalidate it from the caches explicitly, just defer storing the value until after the MMU has been turned on, unless we are giving up with an error. While at it, move the associated variable definitions into C code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-19-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: populate kernel page tables with MMU and caches onArd Biesheuvel1-46/+16
Now that we can access the entire kernel image via the ID map, we can execute the page table population code with the MMU and caches enabled. The only thing we need to ensure is that translations via TTBR1 remain disabled while we are updating the page tables the second time around, in case KASLR wants them to be randomized. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-18-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: factor out TTBR1 assignment into a macroArd Biesheuvel1-4/+1
Create a macro load_ttbr1 to avoid having to repeat the same instruction sequence 3 times in a subsequent patch. No functional change intended. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-17-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: idreg-override: use early FDT mapping in ID mapArd Biesheuvel2-11/+7
Instead of calling into the kernel to map the FDT into the kernel page tables before even calling start_kernel(), let's switch to the initial, temporary mapping of the device tree that has been added to the ID map. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-16-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: create a temporary FDT mapping in the initial ID mapArd Biesheuvel1-1/+13
We need to access the DT very early to get at the command line and the KASLR seed, which currently means we rely on some hacks to call into the kernel before really calling into the kernel, which is undesirable. So instead, let's create a mapping for the FDT in the initial ID map, which is feasible now that it has been extended to cover more than a single page or block, and can be updated in place to remap other output addresses. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-15-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: use relative references to the RELA and RELR tablesArd Biesheuvel2-17/+8
Formerly, we had to access the RELA and RELR tables via the kernel mapping that was being relocated, and so deriving the start and end addresses using ADRP/ADD references was not possible, as the relocation code runs from the ID map. Now that we map the entire kernel image via the ID map, we can simplify this, and just load the entries via the ID map as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-14-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: cover entire kernel image in initial ID mapArd Biesheuvel2-12/+26
As a first step towards avoiding the need to create, tear down and recreate the kernel virtual mapping with MMU and caches disabled, start by expanding the ID map so it covers the page tables as well as all executable code. This will allow us to populate the page tables with the MMU and caches on, and call KASLR init code before setting up the virtual mapping. Since this ID map is only needed at boot, create it as a temporary set of page tables, and populate the permanent ID map after enabling the MMU and caches. While at it, switch to read-only attributes for the where possible, as writable permissions are only needed for the initial kernel page tables. Note that on 4k granule configurations, the permanent ID map will now be reduced to a single page rather than a 2M block mapping. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-13-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: add helper function to remap regions in early page tablesArd Biesheuvel1-0/+33
The asm macros used to create the initial ID map and kernel mappings don't support randomly remapping parts of the address space after it has been populated. What we can do, however, given that all block or page mappings are created at the final level, is take a subset of the mapped range and update its attributes or output address. This will permit us to make parts of these page tables read-only, or remap a part of it to cover the device tree. So add a helper that encapsulates this. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-12-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: mm: provide idmap pointer to cpu_replace_ttbr1()Ard Biesheuvel2-2/+2
In preparation for changing the way we initialize the permanent ID map, update cpu_replace_ttbr1() so we can use it with the initial ID map as well. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-11-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: pass ID map root table address to __enable_mmu()Ard Biesheuvel2-6/+9
We will be adding an initial ID map that covers the entire kernel image, so we will pass the actual ID map root table to use to __enable_mmu(), rather than hard code it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-10-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: kernel: drop unnecessary PoC cache clean+invalidateArd Biesheuvel1-11/+0
Some early boot code runs before the virtual placement of the kernel is finalized, and we used to go back to the very start and recreate the ID map along with the page tables describing the virtual kernel mapping, and this involved setting some global variables with the caches off. In order to ensure that global state created by the KASLR code is not corrupted by the cache invalidation that occurs in that case, we needed to clean those global variables to the PoC explicitly. This is no longer needed now that the ID map is created only once (and the associated global variable updates are no longer repeated). So drop the cache maintenance that is no longer necessary. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220624150651.1358849-9-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: split off idmap creation codeArd Biesheuvel1-49/+52
Split off the creation of the ID map page tables, so that we can avoid running it again unnecessarily when KASLR is in effect (which only randomizes the virtual placement). This will permit us to drop some explicit cache maintenance to the PoC which was necessary because the cache invalidation being performed on some global variables might otherwise clobber unrelated variables that happen to share a cacheline. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-8-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: switch to map_memory macro for the extended ID mapArd Biesheuvel1-39/+37
In a future patch, we will start using an ID map that covers the entire image, rather than a single page. This means that we need to deal with the pathological case of an extended ID map where the kernel image does not fit neatly inside a single entry at the root level, which means we will need to create additional table entries and map additional pages for page tables. The existing map_memory macro already takes care of most of that, so let's just extend it to deal with this case as well. While at it, drop the conditional branch on the value of T0SZ: we don't set the variable anymore in the entry code, and so we can just let the map_memory macro deal with the case where the output address exceeds VA_BITS. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-7-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: simplify page table mapping macros (slightly)Ard Biesheuvel1-33/+22
Simplify the macros in head.S that are used to set up the early page tables, by switching to immediates for the number of bits that are interpreted as the table index at each level. This makes it much easier to infer from the instruction stream what is going on, and reduces the number of instructions emitted substantially. Note that the extended ID map for cases where no additional level needs to be configured now uses a compile time size as well, which means that we interpret up to 10 bits as the table index at the root level (for 52-bit physical addressing), without taking into account whether or not this is supported on the current system. However, those bits can only be set if we are executing the image from an address that exceeds the 48-bit PA range, and are guaranteed to be cleared otherwise, and given that we are dealing with a mapping in the lower TTBR0 range of the address space, the result is therefore the same as if we'd mask off only 6 bits. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-6-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: drop idmap_ptrs_per_pgdArd Biesheuvel1-4/+3
The assignment of idmap_ptrs_per_pgd lacks any cache invalidation, even though it is updated with the MMU and caches disabled. However, we never bother to read the value again except in the very next instruction, and so we can just drop the variable entirely. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220624150651.1358849-5-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: move assignment of idmap_t0sz to C codeArd Biesheuvel1-12/+1
Setting idmap_t0sz involves fiddling with the caches if done with the MMU off. Since we will be creating an initial ID map with the MMU and caches off, and the permanent ID map with the MMU and caches on, let's move this assignment of idmap_t0sz out of the startup code, and replace it with a macro that simply issues the three instructions needed to calculate the value wherever it is needed before the MMU is turned on. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-4-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: mm: make vabits_actual a build time constant if possibleArd Biesheuvel1-14/+1
Currently, we only support 52-bit virtual addressing on 64k pages configurations, and in all other cases, vabits_actual is guaranteed to equal VA_BITS (== VA_BITS_MIN). So get rid of the variable entirely in that case. While at it, move the assignment out of the asm entry code - it has no need to be there. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220624150651.1358849-3-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: head: move kimage_vaddr variable into C fileArd Biesheuvel1-7/+0
This variable definition does not need to be in head.S so move it out. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20220624150651.1358849-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24arm64: entry: simplify trampoline data pageArd Biesheuvel2-32/+24
Get rid of some clunky open coded arithmetic on section addresses, by emitting the trampoline data variables into a separate, dedicated r/o data section, and putting it at the next page boundary. This way, we can access the literals via single LDR instruction. While at it, get rid of other, implicit literals, and use ADRP/ADD or MOVZ/MOVK sequences, as appropriate. Note that the latter are only supported for CONFIG_RELOCATABLE=n (which is usually the case if CONFIG_RANDOMIZE_BASE=n), so update the CPP conditionals to reflect this. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220622161010.3845775-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-06-24jump_label: make initial NOP patching the special caseArd Biesheuvel1-11/+0
Instead of defaulting to patching NOP opcodes at init time, and leaving it to the architectures to override this if this is not needed, switch to a model where doing nothing is the default. This is the common case by far, as only MIPS requires NOP patching at init time. On all other architectures, the correct encodings are emitted by the compiler and so no initial patching is needed. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615154142.1574619-4-ardb@kernel.org
2022-06-23arm64: trap implementation defined functionality in userspaceKristina Martsenko1-0/+18
The Arm v8.8 extension adds a new control FEAT_TIDCP1 that allows the kernel to disable all implementation-defined system registers and instructions in userspace. This can improve robustness against covert channels between processes, for example in cases where the firmware or hardware didn't disable that functionality by default. The kernel does not currently support any implementation-defined features, as there are no hwcaps for any such features, so disable all imp-def features unconditionally. Any use of imp-def instructions will result in a SIGILL being delivered to the process (same as for undefined instructions). Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220622115424.683520-1-kristina.martsenko@arm.com Signed-off-by: Will Deacon <will@kernel.org>