virtio: Fix ordering of virtio_queue__should_signal()

The guest programs used_event in the avail ring to let the host know when
it wants a notification from the device. The host notifies the guest when
the used ring index passes used_event. It is possible for the guest to
submit a buffer, and then go into uninterruptible sleep waiting for this
notification.

The virtio-blk guest driver, in the notification callback virtblk_done(),
increments the last known used ring index, then sets used_event to this
value, which means it will get a notification after the next buffer is
consumed by the host. virtblk_done() exits after the value of the used
ring idx has been propagated from the host thread.

On the host side, the virtio-blk device increments the used ring index,
then compares it to used_event to decide if a notification should be sent.

This is a common communication pattern between two threads, called store
buffer. Memory barriers are needed in order for the pattern to work
correctly, otherwise it is possible for the host to miss sending a required
notification.

Initial state: vring.used.idx = 2, vring.used_event = 1 (idx passes
used_event, which means kvmtool notifies the guest).

	GUEST (in virtblk_done())	| KVMTOOL (in virtio_blk_complete())
					|
(increment vq->last_used_idx = 2)	|
// virtqueue_enable_cb_prepare_split():	| // virt_queue__used_idx_advance():
write vring.used_event = 2		| write vring.used.idx = 3
// virtqueue_poll():			|
mb()					| wmb()
// virtqueue_poll_split():		| // virt_queue__should_signal():
read vring.used.idx = 2			| read vring.used_event = 1
// virtblk_done() exits.		| // No notification.

The write memory barrier on the host side is not enough to prevent
reordering of the read in the kvmtool thread, which can lead to the guest
thread waiting forever for IO to complete. Replace it with a full memory
barrier to get the correct store buffer pattern described in the Linux
litmus test SB+fencembonceonces.litmus, which forbids both threads reading
the initial values.

Also move the barrier in virtio_queue__should_signal(), because the barrier
is needed for notifications to work correctly, and it makes more sense to
have it in the function that determines if the host should notify the
guest.

Reported-by: Anvay Virkar <anvay.virkar@arm.com>
Suggested-by: Anvay Virkar <anvay.virkar@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200804145317.51633-1-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
1 file changed