From 7d92bda271ddcbb2d1be2f82733dcb9bf8378010 Mon Sep 17 00:00:00 2001 From: Douglas Anderson Date: Wed, 25 Sep 2019 16:47:45 -0700 Subject: kgdb: don't use a notifier to enter kgdb at panic; call directly Right now kgdb/kdb hooks up to debug panics by registering for the panic notifier. This works OK except that it means that kgdb/kdb gets called _after_ the CPUs in the system are taken offline. That means that if anything important was happening on those CPUs (like something that might have contributed to the panic) you can't debug them. Specifically I ran into a case where I got a panic because a task was "blocked for more than 120 seconds" which was detected on CPU 2. I nicely got shown stack traces in the kernel log for all CPUs including CPU 0, which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of __mmc_switch(). I then ended up at the kdb prompt where switched over to kgdb to try to look at local variables of the process on CPU 0. I found that I couldn't. Digging more, I found that I had no info on any tasks running on CPUs other than CPU 2 and that asking kdb for help showed me "Error: no saved data for this cpu". This was because all the CPUs were offline. Let's move the entry of kdb/kgdb to a direct call from panic() and stop using the generic notifier. Putting a direct call in allows us to order things more properly and it also doesn't seem like we're breaking any abstractions by calling into the debugger from the panic function. Daniel said: : This patch changes the way kdump and kgdb interact with each other. : However it would seem rather odd to have both tools simultaneously armed : and, even if they were, the user still has the option to use panic_timeout : to force a kdump to happen. Thus I think the change of order is : acceptable. Link: http://lkml.kernel.org/r/20190703170354.217312-1-dianders@chromium.org Signed-off-by: Douglas Anderson Reviewed-by: Daniel Thompson Cc: Jason Wessel Cc: Kees Cook Cc: Borislav Petkov Cc: Thomas Gleixner Cc: Feng Tang Cc: YueHaibing Cc: Sergey Senozhatsky Cc: "Steven Rostedt (VMware)" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/panic.c | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 057540b6eee9..d1ece4c363b9 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -219,6 +220,13 @@ void panic(const char *fmt, ...) dump_stack(); #endif + /* + * If kgdb is enabled, give it a chance to run before we stop all + * the other CPUs or else we won't be able to debug processes left + * running on them. + */ + kgdb_panic(buf); + /* * If we have crashed and we have a crash kernel loaded let it handle * everything else. -- cgit v1.3-6-gb490 From ee8711336c51708382627ebcaee5f2122b77dfef Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Wed, 25 Sep 2019 16:47:52 -0700 Subject: bug: refactor away warn_slowpath_fmt_taint() Patch series "Clean up WARN() "cut here" handling", v2. Christophe Leroy noticed that the fix for missing "cut here" in the WARN() case was adding explicit printk() calls instead of teaching the exception handler to add it. This refactors the bug/warn infrastructure to pass this information as a new BUGFLAG. Longer details repeated from the last patch in the series: bug: move WARN_ON() "cut here" into exception handler The original cleanup of "cut here" missed the WARN_ON() case (that does not have a printk message), which was fixed recently by adding an explicit printk of "cut here". This had the downside of adding a printk() to every WARN_ON() caller, which reduces the utility of using an instruction exception to streamline the resulting code. By making this a new BUGFLAG, all of these can be removed and "cut here" can be handled by the exception handler. This was very pronounced on PowerPC, but the effect can be seen on x86 as well. The resulting text size of a defconfig build shows some small savings from this patch: text data bss dec hex filename 19691167 5134320 1646664 26472151 193eed7 vmlinux.before 19676362 5134260 1663048 26473670 193f4c6 vmlinux.after This change also opens the door for creating something like BUG_MSG(), where a custom printk() before issuing BUG(), without confusing the "cut here" line. This patch (of 7): There's no reason to have specialized helpers for passing the warn taint down to __warn(). Consolidate and refactor helper macros, removing __WARN_printf() and warn_slowpath_fmt_taint(). Link: http://lkml.kernel.org/r/20190819234111.9019-2-keescook@chromium.org Signed-off-by: Kees Cook Cc: Christophe Leroy Cc: Peter Zijlstra Cc: Christophe Leroy Cc: Drew Davenport Cc: Arnd Bergmann Cc: "Steven Rostedt (VMware)" Cc: Feng Tang Cc: Petr Mladek Cc: Mauro Carvalho Chehab Cc: Borislav Petkov Cc: YueHaibing Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/asm-generic/bug.h | 13 ++++--------- kernel/panic.c | 18 +++--------------- 2 files changed, 7 insertions(+), 24 deletions(-) (limited to 'kernel/panic.c') diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h index 7357a3c942a0..c3a9c16a2b69 100644 --- a/include/asm-generic/bug.h +++ b/include/asm-generic/bug.h @@ -90,24 +90,19 @@ struct bug_entry { * Use the versions with printk format strings to provide better diagnostics. */ #ifndef __WARN_TAINT -extern __printf(3, 4) -void warn_slowpath_fmt(const char *file, const int line, - const char *fmt, ...); extern __printf(4, 5) -void warn_slowpath_fmt_taint(const char *file, const int line, unsigned taint, - const char *fmt, ...); +void warn_slowpath_fmt(const char *file, const int line, unsigned taint, + const char *fmt, ...); extern void warn_slowpath_null(const char *file, const int line); #define WANT_WARN_ON_SLOWPATH #define __WARN() warn_slowpath_null(__FILE__, __LINE__) -#define __WARN_printf(arg...) warn_slowpath_fmt(__FILE__, __LINE__, arg) #define __WARN_printf_taint(taint, arg...) \ - warn_slowpath_fmt_taint(__FILE__, __LINE__, taint, arg) + warn_slowpath_fmt(__FILE__, __LINE__, taint, arg) #else extern __printf(1, 2) void __warn_printk(const char *fmt, ...); #define __WARN() do { \ printk(KERN_WARNING CUT_HERE); __WARN_TAINT(TAINT_WARN); \ } while (0) -#define __WARN_printf(arg...) __WARN_printf_taint(TAINT_WARN, arg) #define __WARN_printf_taint(taint, arg...) \ do { __warn_printk(arg); __WARN_TAINT(taint); } while (0) #endif @@ -132,7 +127,7 @@ void __warn(const char *file, int line, void *caller, unsigned taint, #define WARN(condition, format...) ({ \ int __ret_warn_on = !!(condition); \ if (unlikely(__ret_warn_on)) \ - __WARN_printf(format); \ + __WARN_printf_taint(TAINT_WARN, format); \ unlikely(__ret_warn_on); \ }) #endif diff --git a/kernel/panic.c b/kernel/panic.c index d1ece4c363b9..1d89f5423426 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -600,20 +600,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint, } #ifdef WANT_WARN_ON_SLOWPATH -void warn_slowpath_fmt(const char *file, int line, const char *fmt, ...) -{ - struct warn_args args; - - args.fmt = fmt; - va_start(args.args, fmt); - __warn(file, line, __builtin_return_address(0), TAINT_WARN, NULL, - &args); - va_end(args.args); -} -EXPORT_SYMBOL(warn_slowpath_fmt); - -void warn_slowpath_fmt_taint(const char *file, int line, - unsigned taint, const char *fmt, ...) +void warn_slowpath_fmt(const char *file, int line, unsigned taint, + const char *fmt, ...) { struct warn_args args; @@ -622,7 +610,7 @@ void warn_slowpath_fmt_taint(const char *file, int line, __warn(file, line, __builtin_return_address(0), taint, NULL, &args); va_end(args.args); } -EXPORT_SYMBOL(warn_slowpath_fmt_taint); +EXPORT_SYMBOL(warn_slowpath_fmt); void warn_slowpath_null(const char *file, int line) { -- cgit v1.3-6-gb490 From f2f84b05e02b7710a201f0017b3272ad7ef703d1 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Wed, 25 Sep 2019 16:47:58 -0700 Subject: bug: consolidate warn_slowpath_fmt() usage Instead of having a separate helper for no printk output, just consolidate the logic into warn_slowpath_fmt(). Link: http://lkml.kernel.org/r/20190819234111.9019-4-keescook@chromium.org Signed-off-by: Kees Cook Cc: Arnd Bergmann Cc: Borislav Petkov Cc: Christophe Leroy Cc: Drew Davenport Cc: Feng Tang Cc: Mauro Carvalho Chehab Cc: Peter Zijlstra Cc: Petr Mladek Cc: "Steven Rostedt (VMware)" Cc: YueHaibing Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/asm-generic/bug.h | 3 +-- kernel/panic.c | 14 +++++++------- 2 files changed, 8 insertions(+), 9 deletions(-) (limited to 'kernel/panic.c') diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h index 2d35bbf687d0..598d7072602f 100644 --- a/include/asm-generic/bug.h +++ b/include/asm-generic/bug.h @@ -93,9 +93,8 @@ struct bug_entry { extern __printf(4, 5) void warn_slowpath_fmt(const char *file, const int line, unsigned taint, const char *fmt, ...); -extern void warn_slowpath_null(const char *file, const int line); #define WANT_WARN_ON_SLOWPATH -#define __WARN() warn_slowpath_null(__FILE__, __LINE__) +#define __WARN() __WARN_printf(TAINT_WARN, NULL) #define __WARN_printf(taint, arg...) \ warn_slowpath_fmt(__FILE__, __LINE__, taint, arg) #else diff --git a/kernel/panic.c b/kernel/panic.c index 1d89f5423426..79c153951b59 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -605,19 +605,19 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint, { struct warn_args args; + if (!fmt) { + pr_warn(CUT_HERE); + __warn(file, line, __builtin_return_address(0), taint, + NULL, NULL); + return; + } + args.fmt = fmt; va_start(args.args, fmt); __warn(file, line, __builtin_return_address(0), taint, NULL, &args); va_end(args.args); } EXPORT_SYMBOL(warn_slowpath_fmt); - -void warn_slowpath_null(const char *file, int line) -{ - pr_warn(CUT_HERE); - __warn(file, line, __builtin_return_address(0), TAINT_WARN, NULL, NULL); -} -EXPORT_SYMBOL(warn_slowpath_null); #else void __warn_printk(const char *fmt, ...) { -- cgit v1.3-6-gb490 From d38aba49a9f72b862f1220739ca837c886fdc319 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Wed, 25 Sep 2019 16:48:01 -0700 Subject: bug: lift "cut here" out of __warn() In preparation for cleaning up "cut here", move the "cut here" logic up out of __warn() and into callers that pass non-NULL args. For anyone looking closely, there are two callers that pass NULL args: one already explicitly prints "cut here". The remaining case is covered by how a WARN is built, which will be cleaned up in the next patch. Link: http://lkml.kernel.org/r/20190819234111.9019-5-keescook@chromium.org Signed-off-by: Kees Cook Cc: Arnd Bergmann Cc: Borislav Petkov Cc: Christophe Leroy Cc: Drew Davenport Cc: Feng Tang Cc: Mauro Carvalho Chehab Cc: Peter Zijlstra Cc: Petr Mladek Cc: "Steven Rostedt (VMware)" Cc: YueHaibing Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/panic.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 79c153951b59..a643e5464296 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -559,9 +559,6 @@ void __warn(const char *file, int line, void *caller, unsigned taint, { disable_trace_on_warning(); - if (args) - pr_warn(CUT_HERE); - if (file) pr_warn("WARNING: CPU: %d PID: %d at %s:%d %pS\n", raw_smp_processor_id(), current->pid, file, line, @@ -605,8 +602,9 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint, { struct warn_args args; + pr_warn(CUT_HERE); + if (!fmt) { - pr_warn(CUT_HERE); __warn(file, line, __builtin_return_address(0), taint, NULL, NULL); return; -- cgit v1.3-6-gb490 From 2da1ead4d5f7fa5f61e5805655de1e245d03a763 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Wed, 25 Sep 2019 16:48:08 -0700 Subject: bug: consolidate __WARN_FLAGS usage Instead of having separate tests for __WARN_FLAGS, merge the two #ifdef blocks and replace the synonym WANT_WARN_ON_SLOWPATH macro. Link: http://lkml.kernel.org/r/20190819234111.9019-7-keescook@chromium.org Signed-off-by: Kees Cook Cc: Arnd Bergmann Cc: Borislav Petkov Cc: Christophe Leroy Cc: Drew Davenport Cc: Feng Tang Cc: Mauro Carvalho Chehab Cc: Peter Zijlstra Cc: Petr Mladek Cc: "Steven Rostedt (VMware)" Cc: YueHaibing Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/asm-generic/bug.h | 18 +++++++----------- kernel/panic.c | 2 +- 2 files changed, 8 insertions(+), 12 deletions(-) (limited to 'kernel/panic.c') diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h index 4b18e09094cf..b4a2639130a0 100644 --- a/include/asm-generic/bug.h +++ b/include/asm-generic/bug.h @@ -61,16 +61,6 @@ struct bug_entry { #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0) #endif -#ifdef __WARN_FLAGS -#define WARN_ON_ONCE(condition) ({ \ - int __ret_warn_on = !!(condition); \ - if (unlikely(__ret_warn_on)) \ - __WARN_FLAGS(BUGFLAG_ONCE | \ - BUGFLAG_TAINT(TAINT_WARN)); \ - unlikely(__ret_warn_on); \ -}) -#endif - /* * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report * significant kernel issues that need prompt attention if they should ever @@ -91,7 +81,6 @@ struct bug_entry { extern __printf(4, 5) void warn_slowpath_fmt(const char *file, const int line, unsigned taint, const char *fmt, ...); -#define WANT_WARN_ON_SLOWPATH #define __WARN() __WARN_printf(TAINT_WARN, NULL) #define __WARN_printf(taint, arg...) \ warn_slowpath_fmt(__FILE__, __LINE__, taint, arg) @@ -105,6 +94,13 @@ extern __printf(1, 2) void __warn_printk(const char *fmt, ...); __warn_printk(arg); \ __WARN_FLAGS(BUGFLAG_TAINT(taint)); \ } while (0) +#define WARN_ON_ONCE(condition) ({ \ + int __ret_warn_on = !!(condition); \ + if (unlikely(__ret_warn_on)) \ + __WARN_FLAGS(BUGFLAG_ONCE | \ + BUGFLAG_TAINT(TAINT_WARN)); \ + unlikely(__ret_warn_on); \ +}) #endif /* used internally by panic.c */ diff --git a/kernel/panic.c b/kernel/panic.c index a643e5464296..47e8ebccc22b 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -596,7 +596,7 @@ void __warn(const char *file, int line, void *caller, unsigned taint, add_taint(taint, LOCKDEP_STILL_OK); } -#ifdef WANT_WARN_ON_SLOWPATH +#ifndef __WARN_FLAGS void warn_slowpath_fmt(const char *file, int line, unsigned taint, const char *fmt, ...) { -- cgit v1.3-6-gb490 From 20bb759a66be52cf4a9ddd17fddaf509e11490cd Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Sun, 6 Oct 2019 17:58:00 -0700 Subject: panic: ensure preemption is disabled during panic() Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the calling CPU in an infinite loop, but with interrupts and preemption enabled. From this state, userspace can continue to be scheduled, despite the system being "dead" as far as the kernel is concerned. This is easily reproducible on arm64 when booting with "nosmp" on the command line; a couple of shell scripts print out a periodic "Ping" message whilst another triggers a crash by writing to /proc/sysrq-trigger: | sysrq: Trigger a crash | Kernel panic - not syncing: sysrq triggered crash | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0x0/0x148 | show_stack+0x14/0x20 | dump_stack+0xa0/0xc4 | panic+0x140/0x32c | sysrq_handle_reboot+0x0/0x20 | __handle_sysrq+0x124/0x190 | write_sysrq_trigger+0x64/0x88 | proc_reg_write+0x60/0xa8 | __vfs_write+0x18/0x40 | vfs_write+0xa4/0x1b8 | ksys_write+0x64/0xf0 | __arm64_sys_write+0x14/0x20 | el0_svc_common.constprop.0+0xb0/0x168 | el0_svc_handler+0x28/0x78 | el0_svc+0x8/0xc | Kernel Offset: disabled | CPU features: 0x0002,24002004 | Memory Limit: none | ---[ end Kernel panic - not syncing: sysrq triggered crash ]--- | Ping 2! | Ping 1! | Ping 1! | Ping 2! The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise local interrupts are disabled in 'smp_send_stop()'. Disable preemption in 'panic()' before re-enabling interrupts. Link: http://lkml.kernel.org/r/20191002123538.22609-1-will@kernel.org Link: https://lore.kernel.org/r/BX1W47JXPMR8.58IYW53H6M5N@dragonstone Signed-off-by: Will Deacon Reported-by: Xogium Reviewed-by: Kees Cook Cc: Russell King Cc: Greg Kroah-Hartman Cc: Ingo Molnar Cc: Petr Mladek Cc: Feng Tang Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/panic.c | 1 + 1 file changed, 1 insertion(+) (limited to 'kernel/panic.c') diff --git a/kernel/panic.c b/kernel/panic.c index 47e8ebccc22b..f470a038b05b 100644 --- a/kernel/panic.c +++ b/kernel/panic.c @@ -180,6 +180,7 @@ void panic(const char *fmt, ...) * after setting panic_cpu) from invoking panic() again. */ local_irq_disable(); + preempt_disable_notrace(); /* * It's possible to come here directly from a panic-assertion and -- cgit v1.3-6-gb490