workqueue: fix state-dump console deadlock
Console drivers often queue work while holding locks also taken in their
console write paths, something which can lead to deadlocks on SMP when
dumping workqueue state (e.g. sysrq-t or on suspend failures).

For serial console drivers this could look like:

	CPU0				CPU1
	----				----

	show_workqueue_state();
	  lock(&pool->lock);		<IRQ>
	  				  lock(&port->lock);
					  schedule_work();
					    lock(&pool->lock);
	  printk();
	    lock(console_owner);
	    lock(&port->lock);

where workqueues are, for example, used to push data to the line
discipline, process break signals and handle modem-status changes. Line
disciplines and serdev drivers can also queue work on write-wakeup
notifications, etc.
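
The inversion in the diagram above can be sketched with a toy lock-order
recorder in userspace C. This is only an illustration loosely inspired by
what lockdep reports: lock_id, order and record_chain() are hypothetical
names, not kernel APIs, and only direct pairwise inversions are detected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the three locks in the diagram. */
enum lock_id { POOL_LOCK, PORT_LOCK, CONSOLE_OWNER, NR_LOCKS };

/* order[a][b] is true once b has been acquired while a was held. */
static bool order[NR_LOCKS][NR_LOCKS];

/*
 * Record one acquisition sequence (outermost lock first). Returns true
 * if any pair in it contradicts an order recorded earlier.
 */
bool record_chain(const enum lock_id *chain, size_t n)
{
	bool inversion = false;

	for (size_t i = 0; i < n; i++) {
		assert(chain[i] < NR_LOCKS);
		for (size_t j = i + 1; j < n; j++) {
			if (order[chain[j]][chain[i]])
				inversion = true;
			order[chain[i]][chain[j]] = true;
		}
	}
	return inversion;
}
```

Feeding it the CPU1 sequence (port lock, then pool lock) followed by the
CPU0 sequence (pool lock, console owner, port lock) flags the second
sequence as an inversion.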

Reworking every console driver to avoid queuing work while holding locks
also taken in their write paths would complicate drivers and is neither
desirable nor feasible.

Instead use the deferred-printk mechanism to avoid printing while
holding pool locks when dumping workqueue state.
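
As a rough userspace model of this idea (not the kernel implementation),
messages can be buffered while a deferral counter is raised and written to
the "console" only after the lock has been dropped. All names below
(model_printk, model_deferred_enter, model_dump_pool, and so on) are
hypothetical, and the explicit flush after unlocking merely stands in for
the deferred console output that the kernel performs later.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_BUF_SIZE 256

static char log_buf[LOG_BUF_SIZE];	/* stands in for the printk buffer */
static size_t log_len;
static char console_out[LOG_BUF_SIZE];	/* stands in for the console */
static size_t console_len;
static int defer_depth;			/* > 0: console output is deferred */
static int pool_locked;			/* models holding pool->lock */
static int console_writes_under_lock;	/* would-be deadlock opportunities */

/* Push buffered messages to the console, truncating if it is full. */
static void console_flush(void)
{
	if (pool_locked)
		console_writes_under_lock++;
	if (console_len + log_len >= sizeof(console_out))
		log_len = sizeof(console_out) - 1 - console_len;
	memcpy(console_out + console_len, log_buf, log_len);
	console_len += log_len;
	console_out[console_len] = '\0';
	log_len = 0;
}

/* Store the message; only touch the console when not deferred. */
static void model_printk(const char *msg)
{
	size_t n = strlen(msg);

	if (log_len + n >= LOG_BUF_SIZE)
		return;			/* drop the message on overflow */
	memcpy(log_buf + log_len, msg, n);
	log_len += n;
	if (defer_depth == 0)
		console_flush();
}

static void model_deferred_enter(void)
{
	defer_depth++;
}

static void model_deferred_exit(void)
{
	assert(defer_depth > 0);
	defer_depth--;
}

/* Dump one pool: print only into the buffer while the lock is held. */
void model_dump_pool(int id)
{
	char line[32];

	pool_locked = 1;		/* models taking pool->lock */
	model_deferred_enter();
	snprintf(line, sizeof(line), "pool %d\n", id);
	model_printk(line);
	model_deferred_exit();
	pool_locked = 0;		/* models releasing pool->lock */
	console_flush();		/* models the later, deferred flush */
}
```

In this model the console is never written while pool_locked is set, which
is the property the fix relies on.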

Note that there are a few WARN_ON() assertions in the workqueue code
which could potentially also trigger a deadlock. Hopefully the ongoing
printk rework will provide a general solution for this eventually.

This was originally reported after a lockdep splat when executing
sysrq-t with the imx serial driver.

Fixes: 3494fc3 ("workqueue: dump workqueues on sysrq-t")
Cc: [email protected]	# 4.0
Reported-by: Fabio Estevam <[email protected]>
Tested-by: Fabio Estevam <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Reviewed-by: John Ogness <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
jhovold authored and htejun committed Oct 11, 2021
1 parent 80f0a1f commit 57116ce
Showing 1 changed file with 16 additions and 2 deletions.
18 changes: 16 additions & 2 deletions kernel/workqueue.c
@@ -4830,8 +4830,16 @@ void show_workqueue_state(void)

 		for_each_pwq(pwq, wq) {
 			raw_spin_lock_irqsave(&pwq->pool->lock, flags);
-			if (pwq->nr_active || !list_empty(&pwq->inactive_works))
+			if (pwq->nr_active || !list_empty(&pwq->inactive_works)) {
+				/*
+				 * Defer printing to avoid deadlocks in console
+				 * drivers that queue work while holding locks
+				 * also taken in their write paths.
+				 */
+				printk_deferred_enter();
 				show_pwq(pwq);
+				printk_deferred_exit();
+			}
 			raw_spin_unlock_irqrestore(&pwq->pool->lock, flags);
 			/*
 			 * We could be printing a lot from atomic context, e.g.
@@ -4849,7 +4857,12 @@ void show_workqueue_state(void)
 		raw_spin_lock_irqsave(&pool->lock, flags);
 		if (pool->nr_workers == pool->nr_idle)
 			goto next_pool;
-
+		/*
+		 * Defer printing to avoid deadlocks in console drivers that
+		 * queue work while holding locks also taken in their write
+		 * paths.
+		 */
+		printk_deferred_enter();
 		pr_info("pool %d:", pool->id);
 		pr_cont_pool_info(pool);
 		pr_cont(" hung=%us workers=%d",
@@ -4864,6 +4877,7 @@ void show_workqueue_state(void)
 			first = false;
 		}
 		pr_cont("\n");
+		printk_deferred_exit();
 	next_pool:
 		raw_spin_unlock_irqrestore(&pool->lock, flags);
 		/*
