Skip to content

Commit

Permalink
powerpc/64s: Fix lost pending interrupt due to race causing lost upda…
Browse files Browse the repository at this point in the history
…te to irq_happened

force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.

Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.

This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.

Fixes: 1d607bb ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: [email protected] # v4.8+
Reported-by: Carol L. Soto <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
  • Loading branch information
npiggin authored and mpe committed Mar 22, 2018
1 parent e4b7990 commit ff6781f
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions arch/powerpc/kernel/irq.c
Original file line number Diff line number Diff line change
Expand Up @@ -476,6 +476,14 @@ void force_external_irq_replay(void)
*/
WARN_ON(!arch_irqs_disabled());

/*
* Interrupts must always be hard disabled before irq_happened is
* modified (to prevent lost update in case of interrupt between
* load and store).
*/
__hard_irq_disable();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;

/* Indicate in the PACA that we have an interrupt to replay */
local_paca->irq_happened |= PACA_IRQ_EE;
}
Expand Down

0 comments on commit ff6781f

Please sign in to comment.