Skip to content

Commit

Permalink
smp/hotplug: Move unparking of percpu threads to the control CPU
Browse files Browse the repository at this point in the history
commit 9cd4f1a upstream.

Vikram reported the following backtrace:

   BUG: scheduling while atomic: swapper/7/0/0x00000002
   CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.9.32-perf+ raspberrypi#680
   schedule
   schedule_hrtimeout_range_clock
   schedule_hrtimeout
   wait_task_inactive
   __kthread_bind_mask
   __kthread_bind
   __kthread_unpark
   kthread_unpark
   cpuhp_online_idle
   cpu_startup_entry
   secondary_start_kernel

He analyzed correctly that a parked cpu hotplug thread of an offlined CPU
was still on the runqueue when the CPU came back online and tried to unpark
it. This causes the thread which invoked kthread_unpark() to call
wait_task_inactive() and subsequently schedule() with preemption disabled.
His proposed workaround was to "make sure" that a parked thread has
scheduled out when the CPU goes offline, so the situation cannot happen.

But that's still wrong because the root cause is not the fact that the
percpu thread is still on the runqueue and neither that preemption is
disabled, which could be simply solved by enabling preemption before
calling kthread_unpark().

The real issue is that the calling thread is the idle task of the upcoming
CPU, which is not supposed to call anything which might sleep.  The moron,
who wrote that code, missed completely that kthread_unpark() might end up
in schedule().

The solution is simpler than expected. The thread which controls the
hotplug operation is waiting for the CPU to call complete() on the hotplug
state completion. So the idle task of the upcoming CPU can set its state to
CPUHP_AP_ONLINE_IDLE and invoke complete(). This in turn wakes the control
task on a different CPU, which then can safely do the unpark and kick the
now unparked hotplug thread of the upcoming CPU to complete the bringup to
the final target state.

Control CPU                     AP

bringup_cpu();
  __cpu_up()  ------------>
				bringup_ap();
  bringup_wait_for_ap()
    wait_for_completion();
                                cpuhp_online_idle();
                <------------    complete();
    unpark(AP->stopper);
    unpark(AP->hotplugthread);
                                while(1)
                                  do_idle();
    kick(AP->hotplugthread);
    wait_for_completion();	hotplug_thread()
				  run_online_callbacks();
				  complete();

Fixes: 8df3e07 ("cpu/hotplug: Let upcoming cpu bring itself fully up")
Reported-by: Vikram Mulukutla <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Sebastian Sewior <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1707042218020.2131@nanos
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
  • Loading branch information
KAGA-KOKO authored and gregkh committed Aug 7, 2017
1 parent 755f655 commit 7b4e4b1
Showing 1 changed file with 19 additions and 18 deletions.
37 changes: 19 additions & 18 deletions kernel/cpu.c
Original file line number Diff line number Diff line change
Expand Up @@ -410,11 +410,25 @@ static int notify_online(unsigned int cpu)
return 0;
}

static void __cpuhp_kick_ap_work(struct cpuhp_cpu_state *st);

static int bringup_wait_for_ap(unsigned int cpu)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);

/* Wait for the CPU to reach CPUHP_AP_ONLINE_IDLE */
wait_for_completion(&st->done);
BUG_ON(!cpu_online(cpu));

/* Unpark the stopper thread and the hotplug thread of the target cpu */
stop_machine_unpark(cpu);
kthread_unpark(st->thread);

/* Should we go further up ? */
if (st->target > CPUHP_AP_ONLINE_IDLE) {
__cpuhp_kick_ap_work(st);
wait_for_completion(&st->done);
}
return st->result;
}

Expand All @@ -437,9 +451,7 @@ static int bringup_cpu(unsigned int cpu)
cpu_notify(CPU_UP_CANCELED, cpu);
return ret;
}
ret = bringup_wait_for_ap(cpu);
BUG_ON(!cpu_online(cpu));
return ret;
return bringup_wait_for_ap(cpu);
}

/*
Expand Down Expand Up @@ -974,31 +986,20 @@ void notify_cpu_starting(unsigned int cpu)
}

/*
* Called from the idle task. We need to set active here, so we can kick off
* the stopper thread and unpark the smpboot threads. If the target state is
* beyond CPUHP_AP_ONLINE_IDLE we kick cpuhp thread and let it bring up the
* cpu further.
* Called from the idle task. Wake up the controlling task which brings the
* stopper and the hotplug thread of the upcoming CPU up and then delegates
* the rest of the online bringup to the hotplug thread.
*/
void cpuhp_online_idle(enum cpuhp_state state)
{
struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
unsigned int cpu = smp_processor_id();

/* Happens for the boot cpu */
if (state != CPUHP_AP_ONLINE_IDLE)
return;

st->state = CPUHP_AP_ONLINE_IDLE;

/* Unpark the stopper thread and the hotplug thread of this cpu */
stop_machine_unpark(cpu);
kthread_unpark(st->thread);

/* Should we go further up ? */
if (st->target > CPUHP_AP_ONLINE_IDLE)
__cpuhp_kick_ap_work(st);
else
complete(&st->done);
complete(&st->done);
}

/* Requires cpu_add_remove_lock to be held */
Expand Down

0 comments on commit 7b4e4b1

Please sign in to comment.