linux-dev/arch/arm/kernel/unwind.c, branch linus/master

ARM: unwind: only permit stack switch when unwinding call_with_stack()

2022-03-11T13:01:00Z

Commit b6506981f880 ("ARM: unwind: support unwinding across multiple stacks") updated the logic in the ARM unwinder to widen the bounds within which SP is assumed to be valid, in order to allow the unwind to traverse from the IRQ stack to the task stack. This is necessary, as otherwise, unwinds started from the IRQ stack would terminate in the IRQ exception handler, making stacktraces substantially less useful.

This turns out to be a mistake, as it breaks asynchronous unwinding across exceptions, when the exception is taken before the stack frame is consistent with the unwind info. For instance, in the following backtrace:

  ...
  generic_handle_arch_irq from call_with_stack+0x18/0x20
  call_with_stack from __irq_svc+0x80/0x98
  Exception stack(0xc7093e20 to 0xc7093e68)
  3e20: b6a94a88 c7093ea0 00000008 00000000 c7093ea0 b7e127d0 00000051 c9220000
  3e40: b6a94a88 b6a94a88 00000004 0002b000 0036b570 c7093e70 c040ca2c c0994a90
  3e60: 20070013 ffffffff
  __irq_svc from __copy_to_user_std+0x20/0x378
  ...

we need to apply the following unwind directives:

  0xc099720c <__copy_to_user_std+0x1c>: @0xc295d1d4
    Compact model index: 1
    0x9b      vsp = r11
    0xb1 0x0d pop {r0, r2, r3}
    0x84 0x81 pop {r4, r11, r14}
    0xb0      finish

which tell us to switch to the frame pointer register R11 and proceed with the unwind from that.
However, having been interrupted 0x20 bytes into the function:

  c09971f0 <__copy_to_user_std>:
  c09971f0:  e59f3350  ldr     r3, [pc, #848]
  c09971f4:  e243c001  sub     ip, r3, #1
  c09971f8:  e05cc000  subs    ip, ip, r0
  c09971fc:  228cc001  addcs   ip, ip, #1
  c0997200:  205cc002  subscs  ip, ip, r2
  c0997204:  33a00000  movcc   r0, #0
  c0997208:  e320f014  csdb
  c099720c:  e3a03000  mov     r3, #0
  c0997210:  e92d481d  push    {r0, r2, r3, r4, fp, lr}
  c0997214:  e1a0b00d  mov     fp, sp
  c0997218:  e2522004  subs    r2, r2, #4

the value for R11 recovered from the previous frame (__irq_svc) will be a snapshot of its value before the exception was taken (0x0002b000), which occurred at address __copy_to_user_std+0x20 (0xc0997210), when R11 had not been assigned its value yet.

This means we can never assume that the SP values recovered from the stack or from the frame pointer are ever safe to use, given the need to do asynchronous unwinding, and the only robust approach is to revert to the previous approach, which is to derive bounds for SP based on the initial value, and never update them.

We can make an exception, though: now that the IRQ stack switch is guaranteed to occur in call_with_stack(), we can implement a special case for this function, and use a different set of bounds based on the knowledge that it will always unwind from R11 rather than SP. As call_with_stack() is a hand-rolled assembly routine, this is guaranteed to remain that way.

So let's do a partial revert of b6506981f880, and drop all manipulations for sp_low and sp_high based on the information collected during the unwind itself. To support call_with_stack(), set sp_low and sp_high explicitly to values derived from R11 when we unwind that function.

The only downside is that, while unwinding an overflow of the vmap'ed stack will work fine as before, we will no longer be able to produce a backtrace that unwinds the overflow stack itself across the exception that was raised due to the faulting access to the guard region.
However, this only affects exceptions caused by problems in the stack overflow handling code itself, in which case the remaining backtrace is not that relevant.

Fixes: b6506981f880 ("ARM: unwind: support unwinding across multiple stacks")
Signed-off-by: Ard Biesheuvel
Signed-off-by: Russell King (Oracle)
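The bounds policy described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the kernel's implementation: `THREAD_SIZE`, `struct sp_bounds`, `frame_bounds()` and `can_pop_word()` are hypothetical names, and the exact values the kernel derives from R11 may differ. The point it shows is that the bounds are fixed once per frame from the initial SP, except inside call_with_stack(), where R11 is used instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SIZE 8192u /* assumed stack size, for illustration only */

struct sp_bounds {
	uintptr_t sp_low;
	uintptr_t sp_high;
};

/* First THREAD_SIZE-aligned address strictly above addr. */
static uintptr_t stack_top(uintptr_t addr)
{
	return (addr & ~(uintptr_t)(THREAD_SIZE - 1)) + THREAD_SIZE;
}

/*
 * Derive the bounds once, before any unwind opcode runs. Normally they
 * come from the frame's initial SP; only when unwinding call_with_stack()
 * (hypothetical flag here) are they taken from the frame pointer R11.
 */
struct sp_bounds frame_bounds(uintptr_t sp, uintptr_t fp, bool in_call_with_stack)
{
	uintptr_t base = in_call_with_stack ? fp : sp;
	struct sp_bounds b = { base, stack_top(base) };

	return b;
}

/* A word may be popped only if it lies entirely inside the fixed bounds. */
bool can_pop_word(const struct sp_bounds *b, uintptr_t vsp)
{
	return vsp >= b->sp_low && vsp < b->sp_high &&
	       b->sp_high - vsp >= sizeof(uint32_t);
}
```

With the values from the backtrace above, bounds derived from an SP of 0xc7093e68 reject the stale R11 value 0x0002b000, so a `vsp = r11` directive applied too early fails the check instead of redirecting the unwind.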

ARM: Revert "unwind: dump exception stack from calling frame"

2022-03-11T13:00:55Z

After simplifying the stack switch code in the IRQ exception handler by deferring the actual stack switch to call_with_stack(), we no longer need to special case the way we dump the exception stack, since it will always be at the top of whichever stack was active when the exception was taken.

So revert this special handling for the ARM unwinder.

This reverts commit 4ab6827081c63b83011a18d8e27f621ed34b1194.

Signed-off-by: Ard Biesheuvel
Signed-off-by: Russell King (Oracle)

ARM: unwind: set frame.pc correctly for current-thread unwinding

2022-03-11T10:55:28Z

When e.g. a WARN_ON() is encountered, we attempt to unwind the current thread. To do this, we set frame.pc to unwind_backtrace, which means it points at the beginning of the function. However, the rest of the state is initialised from within the function, which means the function prologue has already been run.

This can be confusing, and with a recent patch from Ard, can result in the unwinder misbehaving if we want to be strict about the PC value.

If we correctly initialise the state so it is self-consistent (in other words, set frame.pc to the location where we are initialising it) then we eliminate this confusion, and avoid possible future issues.

Reviewed-by: Ard Biesheuvel
Signed-off-by: Russell King (Oracle)
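One way to make the captured state self-consistent is to take the PC from a label placed where the other fields are filled in. The sketch below assumes GCC or Clang, since it relies on the GNU C "labels as values" extension and the `__builtin_frame_address()` and `__builtin_return_address()` builtins. `struct stackframe` is reduced to three fields and `capture_current()` is a hypothetical name; SP is left out because reading it needs architecture-specific code.

```c
#include <stdint.h>

/* Simplified frame state for illustration; not the kernel's layout. */
struct stackframe {
	uintptr_t fp;
	uintptr_t lr;
	uintptr_t pc;
};

/*
 * Capture the state of the calling thread from inside this function.
 * The prologue has already run by the time fp and lr are read, so pc
 * is taken from a label in this block rather than from the function's
 * entry address.
 */
__attribute__((noinline))
void capture_current(struct stackframe *frame)
{
	frame->fp = (uintptr_t)__builtin_frame_address(0);
	frame->lr = (uintptr_t)__builtin_return_address(0);
here:
	frame->pc = (uintptr_t)&&here;
}
```

A PC taken this way falls inside the region whose unwind info matches the frame that has already been set up, which matters once the unwinder treats the first instruction of a function specially.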

ARM: 9183/1: unwind: avoid spurious warnings on bogus code addresses

2022-03-07T11:43:12Z

Corentin reports that since commit 538b9265c063 ("ARM: unwind: track location of LR value in stack frame"), numerous spurious warnings are emitted into the kernel log:

  [ 0.000000] unwind: Index not found c0f0c440
  [ 0.000000] unwind: Index not found 00000000
  [ 0.000000] unwind: Index not found c0f0c440
  [ 0.000000] unwind: Index not found 00000000

This is due to the fact that the commit in question removes a check whether the PC value in the unwound frame is actually a kernel text address, on the assumption that such an address will not be associated with valid unwind data to begin with, which is checked right after.

The reason for removing this check was that unwind_frame() will be called by the ftrace graph tracer code, which means that it can no longer be safely instrumented itself, or any code that it calls, as it could cause infinite recursion.

In order to prevent the spurious diagnostics, let's add back the call to kernel_text_address(), but this time, only call it if no unwind data could be found for the address in question. This is more efficient for the common successful case, and should avoid any unintended recursion, considering that kernel_text_address() will only be called if no unwind data was found.

Cc: Corentin Labbe
Fixes: 538b9265c063 ("ARM: unwind: track location of LR value in stack frame")
Signed-off-by: Ard Biesheuvel
Signed-off-by: Russell King (Oracle)
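The ordering described here, index lookup first and text-address check only on a miss, might look like the sketch below. Everything in it is a stand-in: `unwind_table`, `find_idx()`, `is_text_address()` (playing the role of kernel_text_address()), the address ranges and the `text_checks` counter are all hypothetical and exist only to make the control flow visible.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum lookup_result { LOOKUP_OK, LOOKUP_SILENT, LOOKUP_WARNED };

/* Hypothetical, simplified unwind index: one covered range per entry. */
struct unwind_idx {
	uint32_t start;
	uint32_t end;
};

static const struct unwind_idx unwind_table[] = {
	{ 0xc0100000u, 0xc0100040u },
	{ 0xc0100040u, 0xc0100100u },
};

unsigned int text_checks; /* counts calls to the text-address check */

static const struct unwind_idx *find_idx(uint32_t pc)
{
	for (size_t i = 0; i < sizeof(unwind_table) / sizeof(unwind_table[0]); i++)
		if (pc >= unwind_table[i].start && pc < unwind_table[i].end)
			return &unwind_table[i];
	return NULL;
}

/* Stand-in for kernel_text_address(); the range is made up. */
static bool is_text_address(uint32_t pc)
{
	text_checks++;
	return pc >= 0xc0100000u && pc < 0xc0900000u;
}

enum lookup_result lookup(uint32_t pc)
{
	if (find_idx(pc))
		return LOOKUP_OK; /* common case: no text check at all */

	/* Only warn when a genuine text address lacks unwind data. */
	if (pc && is_text_address(pc)) {
		fprintf(stderr, "unwind: Index not found %08" PRIx32 "\n", pc);
		return LOOKUP_WARNED;
	}
	return LOOKUP_SILENT;
}
```

In this sketch a bogus value such as 0x00000000 or 0xc0f0c440 fails the lookup quietly, while a successful lookup never reaches the text-address check.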

ARM: unwind: track location of LR value in stack frame

2022-02-09T08:13:43Z

The ftrace graph tracer needs to override the return address of an instrumented function, in order to install a hook that gets invoked when the function returns again.

Currently, we only support this when building for ARM using GCC with frame pointers, as in this case, it is guaranteed that the function will reload LR from [FP, #-4] in all cases, and we can simply pass that address to the ftrace code.

In order to support this for configurations that rely on the EABI unwinder, such as Thumb2 builds, make the unwinder keep track of the address from which LR was unwound, permitting ftrace to make use of this in a subsequent patch.

Drop the call to is_kernel_text_address(), which is problematic in terms of ftrace recursion, given that it may be instrumented itself. The call is redundant anyway, as no unwind directives will be found unless the PC points to memory that is known to contain executable code.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Nick Desaulniers
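Tracking where LR was unwound from could be sketched as below. The names `struct unwind_ctrl`, `lr_addr` and `pop_mask()` are illustrative assumptions rather than the kernel's definitions. The sketch pops the registers named in a bit mask from ascending stack addresses and remembers the slot that supplied r14.

```c
#include <stddef.h>
#include <stdint.h>

enum { FP = 11, SP = 13, LR = 14, PC = 15 };

/* Simplified unwinder state for illustration only. */
struct unwind_ctrl {
	unsigned long vrs[16];  /* virtual register set */
	unsigned long *lr_addr; /* stack slot LR was loaded from, if any */
};

/*
 * Pop every register whose bit is set in mask (bit n stands for r[n]),
 * lowest register first, and return the updated virtual stack pointer.
 */
unsigned long *pop_mask(struct unwind_ctrl *ctrl, unsigned long *vsp,
			unsigned int mask)
{
	for (unsigned int reg = 0; mask; reg++, mask >>= 1) {
		if (!(mask & 1))
			continue;
		ctrl->vrs[reg] = *vsp;
		if (reg == LR)
			ctrl->lr_addr = vsp; /* remember where LR lives */
		vsp++;
	}
	return vsp;
}
```

A caller that wants to hook the function's return can then write through `lr_addr`, which changes the value the function will reload into LR when it returns.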

ARM: implement support for vmap'ed stacks

2021-12-03T14:11:33Z

Wire up the generic support for managing task stack allocations via vmalloc, and implement the entry code that detects whether we faulted because of a stack overrun (or future stack overrun caused by pushing the pt_regs array).

While this adds a fair amount of tricky entry asm code, it should be noted that it only adds a TST + branch to the svc_entry path. The code implementing the non-trivial handling of the overflow stack is emitted out-of-line into the .text section.

Since on ARM, we rely on do_translation_fault() to keep PMD level page table entries that cover the vmalloc region up to date, we need to ensure that we don't hit such a stale PMD entry when accessing the stack. So we do a dummy read from the new stack while still running from the old one on the context switch path, and bump the vmalloc_seq counter when PMD level entries in the vmalloc range are modified, so that the MM switch fetches the latest version of the entries.

Note that we need to increase the per-mode stack by 1 word, to gain some space to stash a GPR until we know it is safe to touch the stack. However, due to the cacheline alignment of the struct, this does not actually increase the memory footprint of the struct stack array at all.

Signed-off-by: Ard Biesheuvel
Tested-by: Keith Packard
Tested-by: Marc Zyngier
Tested-by: Vladimir Murzin # ARMv7M
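A single TST can detect the overrun if stacks are aligned to twice their size, so that one fixed bit of any valid SP is always zero and becomes one as soon as SP drops into the guard region below the stack. The sketch below assumes that alignment scheme, which the message above does not spell out. `THREAD_SHIFT`, the 72-byte frame size and `stack_would_overflow()` are illustrative choices, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SHIFT 13u                   /* assumed: 8 KiB stacks */
#define THREAD_SIZE  (1u << THREAD_SHIFT)
#define FRAME_SIZE   72u                   /* assumed: 18 saved words */

/*
 * Assumption: every stack starts at an address aligned to 2 * THREAD_SIZE,
 * so bit THREAD_SHIFT is clear for any SP inside the stack and set for
 * any SP that has dropped below its base. The check is applied to the
 * SP value after room for the register frame has been reserved, which
 * also catches an overrun that the push itself would cause.
 */
bool stack_would_overflow(uint32_t sp, uint32_t frame_size)
{
	return ((sp - frame_size) & THREAD_SIZE) != 0;
}
```

For a stack based at the hypothetical address 0xf0810000, an SP of 0xf0811000 passes the check, while an SP of 0xf0810010 fails it because reserving 72 bytes would land at 0xf080ffc8, below the base.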

ARM: unwind: disregard unwind info before stack frame is set up

2021-12-03T14:11:32Z

When unwinding the stack from a stack overflow, we are likely to start from a stack push instruction, given that this is the most common way to grow the stack for compiler emitted code. This push instruction rarely appears anywhere else than at offset 0x0 of the function, and if it doesn't, the compiler tends to split up the unwind annotations, given that the stack frame layout is apparently not the same throughout the function.

This means that, in the general case, if the frame's PC points at the first instruction covered by a certain unwind entry, there is no way the stack frame that the unwind entry describes could have been created yet, and so we are still on the stack frame of the caller in that case. So treat this as a special case, and return with the new PC taken from the frame's LR, without applying the unwind transformations to the virtual register set.

This permits us to unwind the call stack on stack overflow when the overflow was caused by a stack push on function entry.

Signed-off-by: Ard Biesheuvel
Tested-by: Keith Packard
Tested-by: Marc Zyngier
Tested-by: Vladimir Murzin # ARMv7M
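The special case can be sketched as a check that runs before any unwind opcode is interpreted. The types and names below (`struct stackframe`, `enum step`, `step_at_entry()`) are hypothetical. The extra test for `pc == lr` is an assumption added so that the sketch cannot loop on a frame that makes no progress.

```c
#include <stdint.h>

enum step { STEP_APPLY_OPCODES, STEP_DONE, STEP_FAILED };

/* Simplified frame state for illustration only. */
struct stackframe {
	uint32_t sp;
	uint32_t lr;
	uint32_t pc;
};

/*
 * entry_start is the first address covered by the unwind entry that was
 * found for frame->pc. If the PC sits exactly there, the frame that the
 * entry describes cannot exist yet, so step to the caller via LR and
 * leave SP and the other registers untouched.
 */
enum step step_at_entry(struct stackframe *frame, uint32_t entry_start)
{
	if (frame->pc != entry_start)
		return STEP_APPLY_OPCODES; /* normal path: run the opcodes */

	if (frame->pc == frame->lr)
		return STEP_FAILED; /* assumed guard: no progress possible */

	frame->pc = frame->lr;
	return STEP_DONE;
}
```

Only the PC changes on the `STEP_DONE` path, which matches the idea that execution is still effectively in the caller's stack frame.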

ARM: unwind: dump exception stack from calling frame

2021-12-03T14:11:31Z

The existing code that dumps the contents of the pt_regs structure passed to __entry routines does so while unwinding the callee frame, and dereferences the stack pointer as a struct pt_regs*. This will no longer work when we enable support for IRQ or overflow stacks, because the struct pt_regs may live on the task stack, while we are executing from another stack.

The unwinder has access to this information, but only while unwinding the calling frame. So let's combine the exception stack dumping code with the handling of the calling frame as well. By printing it before dumping the caller/callee addresses, the output order is preserved.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Arnd Bergmann
Acked-by: Linus Walleij
Tested-by: Keith Packard
Tested-by: Marc Zyngier
Tested-by: Vladimir Murzin # ARMv7M

ARM: unwind: support unwinding across multiple stacks

2021-12-03T14:11:31Z

Implement support in the unwinder for dealing with multiple stacks. This will be needed once we add support for IRQ stacks, or for the overflow stack used by the vmap'ed stacks code.

This involves tracking the unwind opcodes that either update the virtual stack pointer from another virtual register, or perform an explicit subtract on the virtual stack pointer, and updating the low and high bounds that we use to sanitize the stack pointer accordingly.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Arnd Bergmann
Acked-by: Linus Walleij
Tested-by: Keith Packard
Tested-by: Marc Zyngier
Tested-by: Vladimir Murzin # ARMv7M
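The two opcode classes mentioned here are, in the ARM EHABI encoding, `01xxxxxx` (vsp = vsp - (xxxxxx << 2) - 4) and `1001nnnn` (vsp = r[nnnn], with nnnn other than 13 or 15). A rough sketch of updating the bounds as those opcodes are applied follows. The structure, the 8 KiB `THREAD_SIZE` and the rule used to recompute `sp_high` are assumptions for illustration, not a copy of the patch.

```c
#include <stdint.h>

#define THREAD_SIZE 8192u /* assumed stack size, for illustration only */

enum { SP = 13 };

/* Simplified unwinder state for illustration only. */
struct unwind_ctrl {
	uint32_t vrs[16]; /* virtual register set */
	uint32_t sp_low;  /* lowest address the unwind may read */
	uint32_t sp_high; /* first address past the current stack */
};

/*
 * Apply one opcode if it is one of the two kinds that can move the
 * virtual SP outside the current bounds, and adjust the bounds to match.
 * Returns 1 if the opcode was handled here, 0 otherwise.
 */
int apply_sp_opcode(struct unwind_ctrl *ctrl, uint8_t insn)
{
	if ((insn & 0xc0) == 0x40) {
		/* 01xxxxxx: explicit subtract on the virtual SP */
		ctrl->vrs[SP] -= ((uint32_t)(insn & 0x3f) << 2) + 4;
		ctrl->sp_low = ctrl->vrs[SP];
		return 1;
	}
	if ((insn & 0xf0) == 0x90 && (insn & 0x0d) != 0x0d) {
		/* 1001nnnn: load the virtual SP from r[nnnn] */
		ctrl->vrs[SP] = ctrl->vrs[insn & 0x0f];
		ctrl->sp_low = ctrl->vrs[SP];
		ctrl->sp_high = (ctrl->sp_low & ~(THREAD_SIZE - 1)) + THREAD_SIZE;
		return 1;
	}
	return 0;
}
```

Because the new bounds are taken from whatever value the source register holds, this scheme trusts that register; the later commit at the top of this log restricts the stack switch to call_with_stack() for that reason.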

ARM: 9026/1: unwind: remove old check for GCC <= 4.2

2020-12-08T10:13:59Z

Since commit 0bddd227f3dc ("Documentation: update for gcc 4.9 requirement") the minimum supported version of GCC is gcc-4.9. It's now safe to remove this code.

Link: https://github.com/ClangBuiltLinux/linux/issues/427
Signed-off-by: Nick Desaulniers
Reviewed-by: Nathan Chancellor
Signed-off-by: Russell King