Skip to content

Commit

Permalink
kgdb: don't use a notifier to enter kgdb at panic; call directly
Browse files Browse the repository at this point in the history
Right now kgdb/kdb hooks up to debug panics by registering for the panic
notifier.  This works OK except that it means that kgdb/kdb gets called
_after_ the CPUs in the system are taken offline.  That means that if
anything important was happening on those CPUs (like something that might
have contributed to the panic) you can't debug them.

Specifically I ran into a case where I got a panic because a task was
"blocked for more than 120 seconds" which was detected on CPU 2.  I nicely
got shown stack traces in the kernel log for all CPUs including CPU 0,
which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of
__mmc_switch().

I then ended up at the kdb prompt where switched over to kgdb to try to
look at local variables of the process on CPU 0.  I found that I couldn't.
Digging more, I found that I had no info on any tasks running on CPUs
other than CPU 2 and that asking kdb for help showed me "Error: no saved
data for this cpu".  This was because all the CPUs were offline.

Let's move the entry of kdb/kgdb to a direct call from panic() and stop
using the generic notifier.  Putting a direct call in allows us to order
things more properly and it also doesn't seem like we're breaking any
abstractions by calling into the debugger from the panic function.

Daniel said:

: This patch changes the way kdump and kgdb interact with each other.
: However it would seem rather odd to have both tools simultaneously armed
: and, even if they were, the user still has the option to use panic_timeout
: to force a kdump to happen.  Thus I think the change of order is
: acceptable.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Daniel Thompson <[email protected]>
Cc: Jason Wessel <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Feng Tang <[email protected]>
Cc: YueHaibing <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: "Steven Rostedt (VMware)" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
dianders authored and torvalds committed Sep 26, 2019
1 parent ac7c3e4 commit 7d92bda
Show file tree
Hide file tree
Showing 3 changed files with 21 additions and 20 deletions.
2 changes: 2 additions & 0 deletions include/linux/kgdb.h
Original file line number Diff line number Diff line change
Expand Up @@ -326,8 +326,10 @@ extern atomic_t kgdb_active;
(raw_smp_processor_id() == atomic_read(&kgdb_active))
extern bool dbg_is_early;
extern void __init dbg_late_init(void);
extern void kgdb_panic(const char *msg);
#else /* ! CONFIG_KGDB */
#define in_dbg_master() (0)
#define dbg_late_init()
static inline void kgdb_panic(const char *msg) {}
#endif /* ! CONFIG_KGDB */
#endif /* _KGDB_H_ */
31 changes: 11 additions & 20 deletions kernel/debug/debug_core.c
Original file line number Diff line number Diff line change
Expand Up @@ -893,30 +893,25 @@ static struct sysrq_key_op sysrq_dbg_op = {
};
#endif

static int kgdb_panic_event(struct notifier_block *self,
unsigned long val,
void *data)
void kgdb_panic(const char *msg)
{
if (!kgdb_io_module_registered)
return;

/*
* Avoid entering the debugger if we were triggered due to a panic
* We don't want to get stuck waiting for input from user in such case.
* panic_timeout indicates the system should automatically
* We don't want to get stuck waiting for input from user if
* "panic_timeout" indicates the system should automatically
* reboot on panic.
*/
if (panic_timeout)
return NOTIFY_DONE;
return;

if (dbg_kdb_mode)
kdb_printf("PANIC: %s\n", (char *)data);
kdb_printf("PANIC: %s\n", msg);

kgdb_breakpoint();
return NOTIFY_DONE;
}

static struct notifier_block kgdb_panic_event_nb = {
.notifier_call = kgdb_panic_event,
.priority = INT_MAX,
};

void __weak kgdb_arch_late(void)
{
}
Expand Down Expand Up @@ -965,8 +960,6 @@ static void kgdb_register_callbacks(void)
kgdb_arch_late();
register_module_notifier(&dbg_module_load_nb);
register_reboot_notifier(&dbg_reboot_notifier);
atomic_notifier_chain_register(&panic_notifier_list,
&kgdb_panic_event_nb);
#ifdef CONFIG_MAGIC_SYSRQ
register_sysrq_key('g', &sysrq_dbg_op);
#endif
Expand All @@ -980,16 +973,14 @@ static void kgdb_register_callbacks(void)
static void kgdb_unregister_callbacks(void)
{
/*
* When this routine is called KGDB should unregister from the
* panic handler and clean up, making sure it is not handling any
* When this routine is called KGDB should unregister from
* handlers and clean up, making sure it is not handling any
* break exceptions at the time.
*/
if (kgdb_io_module_registered) {
kgdb_io_module_registered = 0;
unregister_reboot_notifier(&dbg_reboot_notifier);
unregister_module_notifier(&dbg_module_load_nb);
atomic_notifier_chain_unregister(&panic_notifier_list,
&kgdb_panic_event_nb);
kgdb_arch_exit();
#ifdef CONFIG_MAGIC_SYSRQ
unregister_sysrq_key('g', &sysrq_dbg_op);
Expand Down
8 changes: 8 additions & 0 deletions kernel/panic.c
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include <linux/debug_locks.h>
#include <linux/sched/debug.h>
#include <linux/interrupt.h>
#include <linux/kgdb.h>
#include <linux/kmsg_dump.h>
#include <linux/kallsyms.h>
#include <linux/notifier.h>
Expand Down Expand Up @@ -219,6 +220,13 @@ void panic(const char *fmt, ...)
dump_stack();
#endif

/*
* If kgdb is enabled, give it a chance to run before we stop all
* the other CPUs or else we won't be able to debug processes left
* running on them.
*/
kgdb_panic(buf);

/*
* If we have crashed and we have a crash kernel loaded let it handle
* everything else.
Expand Down

0 comments on commit 7d92bda

Please sign in to comment.