Age | Commit message (Collapse) | Author | Files | Lines |
|
Prior to commit:
14c127c957c1c607 ("arm64: mm: Flip kernel VA space")
... VA_START described the start of the TTBR1 address space for a given
VA size described by VA_BITS, where all kernel mappings began.
Since that commit, VA_START described a portion midway through the
address space, where the linear map ends and other kernel mappings
begin.
To avoid confusion, let's rename VA_START to PAGE_END, making it clear
that it's not the start of the TTBR1 address space and implying that
it's related to PAGE_OFFSET. Comments and other mnemonics are updated
accordingly, along with a typo fix in the decription of VMEMMAP_SIZE.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
VA_START used to be the start of the TTBR1 address space, but now it's a
point midway though. In a couple of places we still use VA_START to get
the start of the TTBR1 address space, so let's fix these up to use
PAGE_OFFSET instead.
Fixes: 14c127c957c1c607 ("arm64: mm: Flip kernel VA space")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Cleanup memory.h so that the indentation is consistent, remove pointless
line-wrapping and use consistent parameter names for different versions
of the same macro.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Commenting the #endif of a multi-statement #ifdef block with the
condition which guards it is useful and can save having to scroll back
through the file to figure out which set of Kconfig options apply to
a particular piece of code.
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There's no need for __tag_set() to be a complicated macro when
CONFIG_KASAN_SW_TAGS=y and a simple static inline otherwise. Rewrite
the thing as a common static inline function.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Rather than subtracting from -1 and then adding 1, we can simply
subtract from 0.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Build virt_to_page() on top of virt_to_pfn() so we can avoid the need
for explicit shifting.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The default implementations of page_to_virt() and virt_to_page() are
fairly confusing to read and the former evaluates its 'page' parameter
twice in the macro
Rewrite them so that the computation is expressed as 'base + index' in
both cases and the parameter is always evaluated exactly once.
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When converting a linear virtual address to a physical address, pfn or
struct page *, we must make sure that the tag bits are masked before the
calculation otherwise we end up with corrupt pointers when running with
CONFIG_KASAN_SW_TAGS=y:
| Unable to handle kernel paging request at virtual address 0037fe0007580d08
| [0037fe0007580d08] address between user and kernel address ranges
Mask out the tag in __virt_to_phys_nodebug() and virt_to_page().
Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9cb1c5ddd2c4 ("arm64: mm: Remove bit-masking optimisations for PAGE_OFFSET and VMEMMAP_START")
Signed-off-by: Will Deacon <will@kernel.org>
|
|
virt_addr_valid() is intended to test whether or not the passed address
is a valid linear map address. Unfortunately, it relies on
_virt_addr_is_linear() which is broken because it assumes the linear
map is at the top of the address space, which it no longer is.
Reimplement virt_addr_valid() using __is_lm_address() and remove
_virt_addr_is_linear() entirely. At the same time, ensure we evaluate
the macro parameter only once and move it within the __ASSEMBLY__ block.
Reported-by: Qian Cai <cai@lca.pw>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 14c127c957c1 ("arm64: mm: Flip kernel VA space")
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Pull in generic CPU topology changes from Paul Walmsley (RISC-V).
* tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
MAINTAINERS: Add an entry for generic architecture topology
base: arch_topology: update Kconfig help description
RISC-V: Parse cpu topology during boot.
arm: Use common cpu_topology structure and functions.
cpu-topology: Move cpu topology code to common code.
dt-binding: cpu-topology: Move cpu-map to a common binding.
Documentation: DT: arm: add support for sockets defining package boundaries
|
|
All instances of struct sys64_hook contain compile-time constant data,
and are never inentionally modified, so let's make them all const.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The aarch64_insn_encoding_class[] array contains compile-time constant
data, and is never intentionally modified, so let's mark it as const.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The icache_policy_str[] array contains compile-time constant data, and
is never intentionally modified, so let's mark it as const.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
GCC unescapes escaped string section names while Clang does not. Because
__section uses the `#` stringification operator for the section name, it
doesn't need to be escaped.
This antipattern was found with:
$ grep -e __section\(\" -e __section__\(\" -r
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
ACPI 6.3 adds a thread flag to represent if a CPU/PE is
actually a thread. Given that the MPIDR_MT bit may not
represent this information consistently on homogeneous machines
we should prefer the PPTT flag if its available.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Robert Richter <rrichter@marvell.com>
[will: made acpi_cpu_is_threaded() return 'bool']
Signed-off-by: Will Deacon <will@kernel.org>
|
|
ACPI 6.3 adds a flag to the CPU node to indicate whether
the given PE is a thread. Add a function to return that
information for a given linux logical CPU.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Robert Richter <rrichter@marvell.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Current PSCI code handles idle state entry through the
psci_cpu_suspend_enter() API, that takes an idle state index as a
parameter and convert the index into a previously initialized
power_state parameter before calling the PSCI.CPU_SUSPEND() with it.
This is unwieldly, since it forces the PSCI firmware layer to keep track
of power_state parameter for every idle state so that the
index->power_state conversion can be made in the PSCI firmware layer
instead of the CPUidle driver implementations.
Move the power_state handling out of drivers/firmware/psci
into the respective ACPI/DT PSCI CPUidle backends and convert
the psci_cpu_suspend_enter() API to get the power_state
parameter as input, which makes it closer to its firmware
interface PSCI.CPU_SUSPEND() API.
A notable side effect is that the PSCI ACPI/DT CPUidle backends
now can directly handle (and if needed update) power_state
parameters before handing them over to the PSCI firmware
interface to trigger PSCI.CPU_SUSPEND() calls.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Allow selection of the PSCI CPUidle in the kernel by updating
the respective Kconfig entry.
Remove PSCI callbacks from ARM/ARM64 generic CPU ops
to prevent the PSCI idle driver from clashing with the generic
ARM CPUidle driver initialization, that relies on CPU ops
to initialize and enter idle states.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
PSCI firmware is the standard power management control for
all ARM64 based platforms and it is also deployed on some
ARM 32 bit platforms to date.
Idle state entry in PSCI is currently achieved by calling
arm_cpuidle_init() and arm_cpuidle_suspend() in a generic
idle driver, which in turn relies on ARM/ARM64 CPUidle back-end
to relay the call into PSCI firmware if PSCI is the boot method.
Given that PSCI is the standard idle entry method on ARM64 systems
(which means that no other CPUidle driver are expected on ARM64
platforms - so PSCI is already a generic idle driver), in order to
simplify idle entry and code maintenance, it makes sense to have a PSCI
specific idle driver so that idle code that it is currently living in
drivers/firmware directory can be hoisted out of it and moved
where it belongs, into a full-fledged PSCI driver, leaving PSCI code
in drivers/firmware as a pure firmware interface, as it should be.
Implement a PSCI CPUidle driver. By default it is a silent Kconfig entry
which is left unselected, since it selection would clash with the
generic ARM CPUidle driver that provides a PSCI based idle driver
through the arm/arm64 arches back-ends CPU operations.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The PSCI checker currently relies on the generic ARM CPUidle
infrastructure to enter an idle state, which in turn creates
a dependency that is not really needed.
The PSCI checker code to test PSCI CPU suspend is built on
top of the CPUidle framework and can easily reuse the
struct cpuidle_state.enter() function (previously initialized
by an idle driver, with a PSCI back-end) to trigger an entry
into an idle state, decoupling the PSCI checker from the
generic ARM CPUidle infrastructure and simplyfing the code
in the process.
Convert the PSCI checker suspend entry function to use
the struct cpuidle_state.enter() function callback.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
CPUidle back-end operations are not implemented in some platforms
but this should not be considered an error serious enough to be
logged. Check the arm_cpuidle_init() return value to detect whether
the failure must be reported or not in the kernel log and do
not log it if the platform does not support CPUidle operations.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The generic ARM CPUidle driver includes <linux/topology.h> by mistake.
Remove the topology header include.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
untagged_addr() can be called with a '__user' pointer parameter and must
therefore use '__force' casts both when passing this parameter through
to sign_extend64() as a 'u64', but also when casting the 's64' return
value back to the '__user' pointer type.
Signed-off-by: Will Deacon <will@kernel.org>
|
|
_virt_addr_valid() is defined as the same value in two places and rolls
its own version of virt_to_pfn() in both cases.
Consolidate these definitions by inlining a simplified version directly
into virt_addr_valid().
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As the kernel no longer prints out the memory layout on boot, this patch
adds this information back to the memory document.
Also, as the 52-bit support introduces some subtle changes to the arm64
memory, the rationale behind these changes are also added to the memory
document.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Previous patches have enabled 52-bit kernel + user VAs and there is no
longer any scenario where user VA != kernel VA size.
This patch removes the, now redundant, vabits_user variable and replaces
usage with vabits_actual where appropriate.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Most of the machinery is now in place to enable 52-bit kernel VAs that
are detectable at boot time.
This patch adds a Kconfig option for 52-bit user and kernel addresses
and plumbs in the requisite CONFIG_ macros as well as sets TCR.T1SZ,
physvirt_offset and vmemmap at early boot.
To simplify things this patch also removes the 52-bit user/48-bit kernel
kconfig option.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In a later patch we will need to have a slightly larger VMEMMAP region
to accommodate boot time selection between 48/52-bit kernel VAs.
This patch modifies the formula for computing VMEMMAP_SIZE to depend
explicitly on the PAGE_OFFSET and start of kernel addressable memory.
(This allows for a slightly larger direct linear map in future).
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
vmemmap is a preprocessor definition that depends on a variable,
memstart_addr. In a later patch we will need to expand the size of
the VMEMMAP region and optionally modify vmemmap depending upon
whether or not hardware support is available for 52-bit virtual
addresses.
This patch changes vmemmap to be a variable. As the old definition
depended on a variable load, this should not affect performance
noticeably.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When running with a 52-bit userspace VA and a 48-bit kernel VA we offset
ttbr1_el1 to allow the kernel pagetables with a 52-bit PTRS_PER_PGD to
be used for both userspace and kernel.
Moving on to a 52-bit kernel VA we no longer require this offset to
ttbr1_el1 should we be running on a system with HW support for 52-bit
VAs.
This patch introduces conditional logic to offset_ttbr1 to query
SYS_ID_AA64MMFR2_EL1 whenever 52-bit VAs are selected. If there is HW
support for 52-bit VAs then the ttbr1 offset is skipped.
We choose to read a system register rather than vabits_actual because
offset_ttbr1 can be called in places where the kernel data is not
actually mapped.
Calls to offset_ttbr1 appear to be made from rarely called code paths so
this extra logic is not expected to adversely affect performance.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In order to support 52-bit kernel addresses detectable at boot time, one
needs to know the actual VA_BITS detected. A new variable vabits_actual
is introduced in this commit and employed for the KVM hypervisor layout,
KASAN, fault handling and phys-to/from-virt translation where there
would normally be compile time constants.
In order to maintain performance in phys_to_virt, another variable
physvirt_offset is introduced.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In order to support 52-bit kernel addresses detectable at boot time, the
kernel needs to know the most conservative VA_BITS possible should it
need to fall back to this quantity due to lack of hardware support.
A new compile time constant VA_BITS_MIN is introduced in this patch and
it is employed in the KASAN end address, KASLR, and EFI stub.
For Arm, if 52-bit VA support is unavailable the fallback is to 48-bits.
In other words: VA_BITS_MIN = min (48, VA_BITS)
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The kernel page table dumper assumes that the placement of VA regions is
constant and determined at compile time. As we are about to introduce
variable VA logic, we need to be able to determine certain regions at
boot time.
Specifically the VA_START and KASAN_SHADOW_START will depend on whether
or not the system is booted with 52-bit kernel VAs.
This patch adds logic to the kernel page table dumper s.t. these regions
can be computed at boot time.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
KASAN_SHADOW_OFFSET is a constant that is supplied to gcc as a command
line argument and affects the codegen of the inline address sanetiser.
Essentially, for an example memory access:
*ptr1 = val;
The compiler will insert logic similar to the below:
shadowValue = *(ptr1 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET)
if (somethingWrong(shadowValue))
flagAnError();
This code sequence is inserted into many places, thus
KASAN_SHADOW_OFFSET is essentially baked into many places in the kernel
text.
If we want to run a single kernel binary with multiple address spaces,
then we need to do this with KASAN_SHADOW_OFFSET fixed.
Thankfully, due to the way the KASAN_SHADOW_OFFSET is used to provide
shadow addresses we know that the end of the shadow region is constant
w.r.t. VA space size:
KASAN_SHADOW_END = ~0 >> KASAN_SHADOW_SCALE_SHIFT + KASAN_SHADOW_OFFSET
This means that if we increase the size of the VA space, the start of
the KASAN region expands into lower addresses whilst the end of the
KASAN region is fixed.
Currently the arm64 code computes KASAN_SHADOW_OFFSET at build time via
build scripts with the VA size used as a parameter. (There are build
time checks in the C code too to ensure that expected values are being
derived). It is sufficient, and indeed is a simplification, to remove
the build scripts (and build time checks) entirely and instead provide
KASAN_SHADOW_OFFSET values.
This patch removes the logic to compute the KASAN_SHADOW_OFFSET in the
arm64 Makefile, and instead we adopt the approach used by x86 to supply
offset values in kConfig. To help debug/develop future VA space changes,
the Makefile logic has been preserved in a script file in the arm64
Documentation folder.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In order to allow for a KASAN shadow that changes size at boot time, one
must fix the KASAN_SHADOW_END for both 48 & 52-bit VAs and "grow" the
start address. Also, it is highly desirable to maintain the same
function addresses in the kernel .text between VA sizes. Both of these
requirements necessitate us to flip the kernel address space halves s.t.
the direct linear map occupies the lower addresses.
This patch puts the direct linear map in the lower addresses of the
kernel VA range and everything else in the higher ranges.
We need to adjust:
*) KASAN shadow region placement logic,
*) KASAN_SHADOW_OFFSET computation logic,
*) virt_to_phys, phys_to_virt checks,
*) page table dumper.
These are all small changes, that need to take place atomically, so they
are bundled into this commit.
As part of the re-arrangement, a guard region of 2MB (to preserve
alignment for fixed map) is added after the vmemmap. Otherwise the
vmemmap could intersect with IS_ERR pointers.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently there are assumptions about the alignment of VMEMMAP_START
and PAGE_OFFSET that won't be valid after this series is applied.
These assumptions are in the form of bitwise operators being used
instead of addition and subtraction when calculating addresses.
This patch replaces these bitwise operators with addition/subtraction.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The ptrace trace SVE flags are prefixed with SVE_PT_*. Update the
comment accordingly.
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This change prints the hexadecimal EC value in mem_abort_decode(),
which makes it easier to lookup the corresponding EC in
the ARM Architecture Reference Manual.
The commit 1f9b8936f36f ("arm64: Decode information from ESR upon mem
faults") prints useful information when memory abort occurs. It would
be easier to lookup "0x25" instead of "DABT" in the document. Then we
can check the corresponding ISS.
For example:
Current info Document
EC Exception class
"CP15 MCR/MRC" 0x3 "MCR or MRC access to CP15a..."
"ASIMD" 0x7 "Access to SIMD or floating-point..."
"DABT (current EL)" 0x25 "Data Abort taken without..."
...
Before:
Unable to handle kernel paging request at virtual address 000000000000c000
Mem abort info:
ESR = 0x96000046
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000046
CM = 0, WnR = 1
After:
Unable to handle kernel paging request at virtual address 000000000000c000
Mem abort info:
ESR = 0x96000046
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000046
CM = 0, WnR = 1
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <Mark.rutland@arm.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The commit d5370f754875 ("arm64: prefetch: add alternative pattern for
CPUs without a prefetcher") introduced MIDR_IS_CPU_MODEL_RANGE() to be
used in has_no_hw_prefetch() with rv_min=0 which generates a compilation
warning from GCC,
In file included from ./arch/arm64/include/asm/cache.h:8,
from ./include/linux/cache.h:6,
from ./include/linux/printk.h:9,
from ./include/linux/kernel.h:15,
from ./include/linux/cpumask.h:10,
from arch/arm64/kernel/cpufeature.c:11:
arch/arm64/kernel/cpufeature.c: In function 'has_no_hw_prefetch':
./arch/arm64/include/asm/cputype.h:59:26: warning: comparison of
unsigned expression >= 0 is always true [-Wtype-limits]
_model == (model) && rv >= (rv_min) && rv <= (rv_max); \
^~
arch/arm64/kernel/cpufeature.c:889:9: note: in expansion of macro
'MIDR_IS_CPU_MODEL_RANGE'
return MIDR_IS_CPU_MODEL_RANGE(midr, MIDR_THUNDERX,
^~~~~~~~~~~~~~~~~~~~~~~
Fix it by converting MIDR_IS_CPU_MODEL_RANGE to a static inline
function.
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Commit 5cf896fb6be3 ("arm64: Add support for relocating the kernel with
RELR relocations") introduced CONFIG_TOOLS_SUPPORT_RELR, which checks
for RELR support in the toolchain as part of the kernel configuration.
During this procedure, "$(NM)" is invoked to see if it supports the new
relocation format, however PowerPC conditionally overrides this variable
in the architecture Makefile in order to pass '--synthetic' when
targetting PPC64.
This conditional override causes Kconfig to recurse forever, since
CONFIG_TOOLS_SUPPORT_RELR cannot be determined without $(NM) being
defined, but that in turn depends on CONFIG_PPC64:
$ make ARCH=powerpc CROSS_COMPILE=powerpc-linux-gnu-
scripts/kconfig/conf --syncconfig Kconfig
scripts/kconfig/conf --syncconfig Kconfig
scripts/kconfig/conf --syncconfig Kconfig
[...]
In this particular case, it looks like PowerPC may be able to pass
'--synthetic' unconditionally to nm or even drop it altogether. While
that is being resolved, let's just bodge the RELR check by picking up
$(NM) directly from the environment in whatever state it happens to be
in.
Cc: Peter Collingbourne <pcc@google.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Inspired by the commit 7cd01b08d35f ("powerpc: Add support for function
error injection"), this patch supports function error injection for
Arm64.
This patch mainly support two functions: one is regs_set_return_value()
which is used to overwrite the return value; the another function is
override_function_with_return() which is to override the probed
function returning and jump to its caller.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The function override_function_with_return() is defined separately for
each architecture and every architecture's definition is almost same
with each other. E.g. x86 and powerpc both define function in its own
asm/error-injection.h header and override_function_with_return() has
the same definition, the only difference is that x86 defines an extra
function just_return_func() but it is specific for x86 and is only used
by x86's override_function_with_return(), so don't need to export this
function.
This patch consolidates override_function_with_return() definition into
asm-generic/error-injection.h header, thus all architectures can use the
common definition. As result, the architecture specific headers are
removed; the include/linux/error-injection.h header also changes to
include asm-generic/error-injection.h header rather than architecture
header, furthermore, it includes linux/compiler.h for successful
compilation.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
This patch adds a simple test, that calls the uname syscall with a
tagged user pointer as an argument. Without the kernel accepting tagged
user pointers the test fails with EFAULT.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve(). A Kconfig
option allows the overall disabling of the relaxed ABI.
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This patch is a part of a series that extends kernel ABI to allow to pass
tagged user pointers (with the top byte set to something else other than
0x00) as syscall arguments.
copy_from_user (and a few other similar functions) are used to copy data
from user memory into the kernel memory or vice versa. Since a user can
provided a tagged pointer to one of the syscalls that use copy_from_user,
we need to correctly handle such pointers.
Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr,
before performing access validity checks.
Note, that this patch only temporarily untags the pointers to perform the
checks, but then passes them as is into the kernel internals.
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
[will: Add __force to casting in untagged_addr() to kill sparse warning]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
RELR is a relocation packing format for relative relocations.
The format is described in a generic-abi proposal:
https://groups.google.com/d/topic/generic-abi/bX460iggiKg/discussion
The LLD linker can be instructed to pack relocations in the RELR
format by passing the flag --pack-dyn-relocs=relr.
This patch adds a new config option, CONFIG_RELR. Enabling this option
instructs the linker to pack vmlinux's relative relocations in the RELR
format, and causes the kernel to apply the relocations at startup along
with the RELA relocations. RELA relocations still need to be applied
because the linker will emit RELA relative relocations if they are
unrepresentable in the RELR format (i.e. address not a multiple of 2).
Enabling CONFIG_RELR reduces the size of a defconfig kernel image
with CONFIG_RANDOMIZE_BASE by 3.5MB/16% uncompressed, or 550KB/5%
compressed (lz4).
Signed-off-by: Peter Collingbourne <pcc@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Some TIF_* flags are documented in the comment block at the top, some
next to their definitions, some in both places.
Move all documentation to the individual definitions for consistency,
and for easy lookup.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
We should free the initrd reserved memblock in an aligned manner,
because the initrd reserves the memblock in an aligned manner
in arm64_memblock_init().
Otherwise there are some fragments in memblock_reserved regions
after free_initrd_mem(). e.g.:
/sys/kernel/debug/memblock # cat reserved
0: 0x0000000080080000..0x00000000817fafff
1: 0x0000000083400000..0x0000000083ffffff
2: 0x0000000090000000..0x000000009000407f
3: 0x00000000b0000000..0x00000000b000003f
4: 0x00000000b26184ea..0x00000000b2618fff
The fragments like the ranges from b0000000 to b000003f and
from b26184ea to b2618fff should be freed.
And we can do free_reserved_area() after memblock_free(),
as free_reserved_area() calls __free_pages(), once we've done
that it could be allocated somewhere else,
but memblock and iomem still say this is reserved memory.
Fixes: 05c58752f9dc ("arm64: To remove initrd reserved area entry from memblock")
Signed-off-by: Junhua Huang <huang.junhua@zte.com.cn>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The arm64 implementation of the default I/O accessors requires barrier
instructions to satisfy the memory ordering requirements documented in
memory-barriers.txt [1], which are largely derived from the behaviour of
I/O accesses on x86.
Of particular interest are the requirements that a write to a device
must be ordered against prior writes to memory, and a read from a device
must be ordered against subsequent reads from memory. We satisfy these
requirements using various flavours of DSB: the most expensive barrier
we have, since it implies completion of prior accesses. This was deemed
necessary when we first implemented the accessors, since accesses to
different endpoints could propagate independently and therefore the only
way to enforce order is to rely on completion guarantees [2].
Since then, the Armv8 memory model has been retrospectively strengthened
to require "other-multi-copy atomicity", a property that requires memory
accesses from an observer to become visible to all other observers
simultaneously [3]. In other words, propagation of accesses is limited
to transitioning from locally observed to globally observed. It recently
became apparent that this change also has a subtle impact on our I/O
accessors for shared peripherals, allowing us to use the cheaper DMB
instruction instead.
As a concrete example, consider the following:
memcpy(dma_buffer, data, bufsz);
writel(DMA_START, dev->ctrl_reg);
A DMB ST instruction between the final write to the DMA buffer and the
write to the control register will ensure that the writes to the DMA
buffer are observed before the write to the control register by all
observers. Put another way, if an observer can see the write to the
control register, it can also see the writes to memory. This has always
been the case and is not sufficient to provide the ordering required by
Linux, since there is no guarantee that the master interface of the
DMA-capable device has observed either of the accesses. However, in an
other-multi-copy atomic world, we can infer two things:
1. A write arriving at an endpoint shared between multiple CPUs is
visible to all CPUs
2. A write that is visible to all CPUs is also visible to all other
observers in the shareability domain
Pieced together, this allows us to use DMB OSHST for our default I/O
write accessors and DMB OSHLD for our default I/O read accessors (the
outer-shareability is for handling non-cacheable mappings) for shared
devices. Memory-mapped, DMA-capable peripherals that are private to a
CPU (i.e. inaccessible to other CPUs) still require the DSB, however
these are few and far between and typically require special treatment
anyway which is outside of the scope of the portable driver API (e.g.
GIC, page-table walker, SPE profiler).
Note that our mandatory barriers remain as DSBs, since there are cases
where they are used to flush the store buffer of the CPU, e.g. when
publishing page table updates to the SMMU.
[1] https://git.kernel.org/linus/4614bbdee357
[2] https://www.youtube.com/watch?v=i6DayghhA8Q
[3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
|