linux-dev/arch/arm/kernel/smp.c, branch linus/master

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

2022-03-24T00:35:57Z

Pull ARM updates from Russell King: "Updates for IRQ stacks and virtually mapped stack support, and ftrace: - Support for IRQ and vmap'ed stacks This covers all the work related to implementing IRQ stacks and vmap'ed stacks for all 32-bit ARM systems that are currently supported by the Linux kernel, including RiscPC and Footbridge. It has been submitted for review in four different waves: - IRQ stacks support for v7 SMP systems [0] - vmap'ed stacks support for v7 SMP systems[1] - extending support for both IRQ stacks and vmap'ed stacks for all remaining configurations, including v6/v7 SMP multiplatform kernels and uniprocessor configurations including v7-M [2] - fixes and updates in [3] - ftrace fixes and cleanups Make all flavors of ftrace available on all builds, regardless of ISA choice, unwinder choice or compiler [4]: - use ADD not POP where possible - fix a couple of Thumb2 related issues - enable HAVE_FUNCTION_GRAPH_FP_TEST for robustness - enable the graph tracer with the EABI unwinder - avoid clobbering frame pointer registers to make Clang happy - Fixes for the above" [0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/ [1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/ [2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/ [3] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/ [4] https://lore.kernel.org/linux-arm-kernel/20220203082204.1176734-1-ardb@kernel.org/ * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (62 commits) ARM: fix building NOMMU ARMv4/v5 kernels ARM: unwind: only permit stack switch when unwinding call_with_stack() ARM: Revert "unwind: dump exception stack from calling frame" ARM: entry: fix unwinder problems caused by IRQ stacks ARM: unwind: set frame.pc correctly for current-thread unwinding ARM: 9184/1: return_address: disable again for CONFIG_ARM_UNWIND=y ARM: 9183/1: unwind: avoid spurious warnings on bogus code addresses Revert "ARM: 9144/1: forbid ftrace with clang and thumb2_kernel" ARM: mach-bcm: disable ftrace in SMC invocation routines ARM: cacheflush: avoid clobbering the frame pointer ARM: kprobes: treat R7 as the frame pointer register in Thumb2 builds ARM: ftrace: enable the graph tracer with the EABI unwinder ARM: unwind: track location of LR value in stack frame ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST ARM: ftrace: avoid unnecessary literal loads ARM: ftrace: avoid redundant loads or clobbering IP ARM: ftrace: use trampolines to keep .init.text in branching range ARM: ftrace: use ADD not POP to counter PUSH at entry ARM: ftrace: ensure that ADR takes the Thumb bit into account ARM: make get_current() and __my_cpu_offset() __always_inline ...

ARM: drop pointless SMP check on secondary startup path

2022-01-25T08:53:52Z

Only SMP systems use the secondary startup path by definition, so there is no need for SMP conditionals there. Signed-off-by: Ard Biesheuvel

ARM: 9158/1: leave it to core code to manage thread_info::cpu

2021-12-17T11:34:31Z

Since commit bcf9033e5449 ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y"), the CPU field in thread_info went back to being managed by the core code, so we no longer have to keep it in sync in arch code. While at it, mark THREAD_INFO_IN_TASK as done for ARM in the documentation. Reviewed-by: Linus Walleij Signed-off-by: Ard Biesheuvel Signed-off-by: Russell King (Oracle)

ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems

2021-12-06T11:49:17Z

On UP systems, only a single task can be 'current' at the same time, which means we can use a global variable to track it. This means we can also enable THREAD_INFO_IN_TASK for those systems, as in that case, thread_info is accessed via current rather than the other way around, removing the need to store thread_info at the base of the task stack. This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP systems as well. To partially mitigate the performance overhead of this arrangement, use a ADD/ADD/LDR sequence with the appropriate PC-relative group relocations to load the value of current when needed. This means that accessing current will still only require a single load as before, avoiding the need for a literal to carry the address of the global variable in each function. However, accessing thread_info will now require this load as well. Acked-by: Linus Walleij Acked-by: Nicolas Pitre Signed-off-by: Ard Biesheuvel Tested-by: Marc Zyngier Tested-by: Vladimir Murzin # ARMv7M

ARM: remove some dead code

2021-12-03T14:11:31Z

This code appears to be no longer used so let's get rid of it. Signed-off-by: Ard Biesheuvel Reviewed-by: Arnd Bergmann Acked-by: Linus Walleij Tested-by: Keith Packard Tested-by: Marc Zyngier Tested-by: Vladimir Murzin # ARMv7M

ARM: smp: Enable THREAD_INFO_IN_TASK

2021-09-27T14:54:02Z

Now that we no longer rely on thread_info living at the base of the task stack to be able to access the 'current' pointer, we can wire up the generic support for moving thread_info into the task struct itself. Note that this requires us to update the cpu field in thread_info explicitly, now that the core code no longer does so. Ideally, we would switch the percpu code to access the cpu field in task_struct instead, but this unleashes #include circular dependency hell. Co-developed-by: Keith Packard Signed-off-by: Keith Packard Signed-off-by: Ard Biesheuvel Reviewed-by: Linus Walleij Tested-by: Amit Daniel Kachhap

ARM: smp: Store current pointer in TPIDRURO register if available

2021-09-27T14:54:02Z

Now that the user space TLS register is assigned on every return to user space, we can use it to keep the 'current' pointer while running in the kernel. This removes the need to access it via thread_info, which is located at the base of the stack, but will be moved out of there in a subsequent patch. Use the __builtin_thread_pointer() helper when available - this will help GCC understand that reloading the value within the same function is not necessary, even when using the per-task stack protector (which also generates accesses via the TLS register). For example, the generated code below loads TPIDRURO only once, and uses it to access both the stack canary and the preempt_count fields. : e92d 41f0 stmdb sp!, {r4, r5, r6, r7, r8, lr} ee1d 4f70 mrc 15, 0, r4, cr13, cr0, {3} 4606 mov r6, r0 b094 sub sp, #80 ; 0x50 f8d4 34e8 ldr.w r3, [r4, #1256] ; 0x4e8 <- stack canary 9313 str r3, [sp, #76] ; 0x4c f8d4 8004 ldr.w r8, [r4, #4] <- preempt count Co-developed-by: Keith Packard Signed-off-by: Keith Packard Signed-off-by: Ard Biesheuvel Reviewed-by: Linus Walleij Tested-by: Amit Daniel Kachhap

ARM: smp: Pass task to secondary_start_kernel

2021-09-27T14:54:01Z

This avoids needing to compute the task pointer in this function, which will no longer be possible once we move thread_info off the stack. Signed-off-by: Keith Packard Signed-off-by: Ard Biesheuvel Reviewed-by: Linus Walleij Tested-by: Amit Daniel Kachhap

printk: remove NMI tracking

2021-07-26T13:09:44Z

All NMI contexts are handled the same as the safe context: store the message and defer printing. There is no need to have special NMI context tracking for this. Using in_nmi() is enough. There are several parts of the kernel that are manually calling into the printk NMI context tracking in order to cause general printk deferred printing: arch/arm/kernel/smp.c arch/powerpc/kexec/crash.c kernel/trace/trace.c For arm/kernel/smp.c and powerpc/kexec/crash.c, provide a new function pair printk_deferred_enter/exit that explicitly achieves the same objective. For ftrace, remove the printk context manipulation completely. It was added in commit 03fc7f9c99c1 ("printk/nmi: Prevent deadlock when accessing the main log buffer in NMI"). The purpose was to enforce storing messages directly into the ring buffer even in NMI context. It really should have only modified the behavior in NMI context. There is no need for a special behavior any longer. All messages are always stored directly now. The console deferring is handled transparently in vprintk(). Signed-off-by: John Ogness [pmladek@suse.com: Remove special handling in ftrace.c completely. Signed-off-by: Petr Mladek Link: https://lore.kernel.org/r/20210715193359.25946-5-john.ogness@linutronix.de

sched/core: Initialize the idle task with preemption disabled

2021-05-12T11:01:45Z

As pointed out by commit de9b8f5dcbd9 ("sched: Fix crash trying to dequeue/enqueue the idle thread") init_idle() can and will be invoked more than once on the same idle task. At boot time, it is invoked for the boot CPU thread by sched_init(). Then smp_init() creates the threads for all the secondary CPUs and invokes init_idle() on them. As the hotplug machinery brings the secondaries to life, it will issue calls to idle_thread_get(), which itself invokes init_idle() yet again. In this case it's invoked twice more per secondary: at _cpu_up(), and at bringup_cpu(). Given smp_init() already initializes the idle tasks for all *possible* CPUs, no further initialization should be required. Now, removing init_idle() from idle_thread_get() exposes some interesting expectations with regards to the idle task's preempt_count: the secondary startup always issues a preempt_disable(), requiring some reset of the preempt count to 0 between hot-unplug and hotplug, which is currently served by idle_thread_get() -> idle_init(). Given the idle task is supposed to have preemption disabled once and never see it re-enabled, it seems that what we actually want is to initialize its preempt_count to PREEMPT_DISABLED and leave it there. Do that, and remove init_idle() from idle_thread_get(). Secondary startups were patched via coccinelle: @begone@ @@ -preempt_disable(); ... cpu_startup_entry(CPUHP_AP_ONLINE_IDLE); Signed-off-by: Valentin Schneider Signed-off-by: Ingo Molnar Acked-by: Peter Zijlstra Link: https://lore.kernel.org/r/20210512094636.2958515-1-valentin.schneider@arm.com