cpu/hotplug: Fix smpboot thread ordering

Commit 931ef163309e moved the smpboot thread park/unpark invocation to the
state machine. The move of the park invocation was premature as it depends
on work-in-progress patches.

As a result CPU down can fail, because the RCU synchronization in
takedown_cpu() eventually requires a functional softirq thread. I never
encountered the problem in testing, but 0day testing managed to provide a
reliable reproducer.

Remove the smpboot_park_threads() call from the state machine for now and
put it back into the original place after the RCU synchronization.
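
For illustration only, the ordering problem can be sketched as a small
userspace model. Everything below is hypothetical (struct cpu_model and the
model_*() helpers are not kernel APIs); it merely assumes that the RCU
synchronization can only complete while the softirq thread is still able to
run, and reports a failure where the real kernel would fail the CPU down
operation.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cpu_model {
	bool softirq_thread_running;	/* stands in for the smpboot threads */
};

/* Assumption: synchronization only completes if the softirq thread can run. */
static int model_synchronize_rcu(const struct cpu_model *c)
{
	return c->softirq_thread_running ? 0 : -1;
}

/* Models parking the smpboot threads of the outgoing CPU. */
static void model_park_threads(struct cpu_model *c)
{
	c->softirq_thread_running = false;
}

/* Broken ordering: threads are parked before the RCU synchronization. */
static int takedown_park_first(void)
{
	struct cpu_model c = { .softirq_thread_running = true };

	model_park_threads(&c);
	return model_synchronize_rcu(&c);
}

/* Restored ordering: synchronize first, park the threads afterwards. */
static int takedown_sync_first(void)
{
	struct cpu_model c = { .softirq_thread_running = true };
	int ret = model_synchronize_rcu(&c);

	model_park_threads(&c);
	return ret;
}
```

In this model takedown_park_first() returns -1 while takedown_sync_first()
returns 0, which mirrors why the park call has to stay after the
synchronization for now.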

I'm embarrassed as I knew about the dependency and still managed to get it
wrong. Hotplug induced brain melt seems to be the only sensible explanation
for that.

Fixes: 931ef163309e ("cpu/hotplug: Unpark smpboot threads from the state machine")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 373e831..bcee286 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -706,8 +706,9 @@
 	else
 		synchronize_rcu();
 
-	/* Park the hotplug thread */
+	/* Park the smpboot threads */
 	kthread_park(per_cpu_ptr(&cpuhp_state, cpu)->thread);
+	smpboot_park_threads(cpu);
 
 	/*
 	 * Prevent irq alloc/free while the dying cpu reorganizes the
@@ -1206,7 +1207,7 @@
 	[CPUHP_AP_SMPBOOT_THREADS] = {
 		.name = "smpboot:threads",
 		.startup = smpboot_unpark_threads,
-		.teardown = smpboot_park_threads,
+		.teardown = NULL,
 	},
 	[CPUHP_AP_NOTIFY_ONLINE] = {
 		.name = "notify:online",