workqueue: change cancel_work_sync() to clear work->data
In short: change cancel_work_sync(work) to mark this work as "never
queued" upon return.
When cancel_work_sync(work) succeeds, we know that this work can't be
queued or running, and since we own WORK_STRUCT_PENDING nobody can change
the bits in work->data under us. This means we can also clear the "cwq"
part along with _PENDING bit lockless before return, unless the work is
queued nobody can assume get_wq_data() is stable even under cwq->lock.
This change can speedup the subsequent cancel/flush requests, and as
Dmitry pointed out this simplifies the usage of work_struct's which
can be queued on different workqueues. Consider this pseudo code from
the input subsystem:
struct workqueue_struct *WQ;
struct work_struct *WORK;
for (;;) {
WQ = create_workqueue();
...
if (condition())
queue_work(WQ, WORK);
...
cancel_work_sync(WORK);
destroy_workqueue(WQ);
}
If condition() returns T and then F, cancel_work_sync() will crash the
kernel because WORK->data still points to the already destroyed workqueue.
With this patch the code like above becomes correct.
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0225fea8..77dabbf 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -229,6 +229,16 @@
atomic_long_set(&work->data, new);
}
+/*
+ * Clear WORK_STRUCT_PENDING and the workqueue on which it was queued.
+ */
+static inline void clear_wq_data(struct work_struct *work)
+{
+ unsigned long flags = *work_data_bits(work) &
+ (1UL << WORK_STRUCT_STATIC);
+ atomic_long_set(&work->data, flags);
+}
+
static inline
struct cpu_workqueue_struct *get_wq_data(struct work_struct *work)
{
@@ -671,7 +681,7 @@
wait_on_work(work);
} while (unlikely(ret < 0));
- work_clear_pending(work);
+ clear_wq_data(work);
return ret;
}