afs: Fix speculative status fetch going out of order wrt to modifications When doing a lookup in a directory, the afs filesystem uses a bulk status fetch to speculatively retrieve the statuses of up to 48 other vnodes found in the same directory and it will then either update extant inodes or create new ones - effectively doing 'lookup ahead'. To avoid the possibility of deadlocking itself, however, the filesystem doesn't lock all of those inodes; rather just the directory inode is locked (by the VFS). When the operation completes, afs_inode_init_from_status() or afs_apply_status() is called, depending on whether the inode already exists, to commit the new status. A case exists, however, where the speculative status fetch operation may straddle a modification operation on one of those vnodes. What can then happen is that the speculative bulk status RPC retrieves the old status, and whilst that is happening, the modification happens - which returns an updated status, then the modification status is committed, then we attempt to commit the speculative status. This results in something like the following being seen in dmesg: kAFS: vnode modified {100058:861} 8->9 YFS.InlineBulkStatus showing that for vnode 861 on volume 100058, we saw YFS.InlineBulkStatus say that the vnode had data version 8 when we'd already recorded version 9 due to a local modification. This was causing the cache to be invalidated for that vnode when it shouldn't have been. If it happens on a data file, this might lead to local changes being lost. Fix this by ignoring speculative status updates if the data version doesn't match the expected value. Note that it is possible to get a DV regression if a volume gets restored from a backup - but we should get a callback break in such a case that should trigger a recheck anyway. It might be worth checking the volume creation time in the volsync info and, if a change is observed in that (as would happen on a restore), invalidate all caches associated with the volume. Fixes: 5cf9dd55a0ec ("afs: Prospectively look up extra files when doing a single lookup") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit: a9e5c87ca7443d09fb530fffa4d96ce1c76dbe4d [log] [tgz]
author: David Howells <dhowells@redhat.com> Sun Nov 22 13:13:45 2020 +0000
committer: Linus Torvalds <torvalds@linux-foundation.org> Sun Nov 22 11:27:03 2020 -0800
tree: da0ec0d0f11e772efbb5e1d5345ccd3883614fca
parent: a349e4c659609fd20e4beea89e5c4a4038e33a95 [diff]
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1bb5b9d..9068d55 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c

@@ -823,6 +823,7 @@
 				vp->cb_break_before = afs_calc_vnode_cb_break(vnode);
 				vp->vnode = vnode;
 				vp->put_vnode = true;
+				vp->speculative = true; /* vnode not locked */
 			}
 		}
 	}

diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 0fe8844..b0d7b89 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c

@@ -294,6 +294,13 @@
 			op->flags &= ~AFS_OPERATION_DIR_CONFLICT;
 		}
 	} else if (vp->scb.have_status) {
+		if (vp->dv_before + vp->dv_delta != vp->scb.status.data_version &&
+		    vp->speculative)
+			/* Ignore the result of a speculative bulk status fetch
+			 * if it splits around a modification op, thereby
+			 * appearing to regress the data version.
+			 */
+			goto out;
 		afs_apply_status(op, vp);
 		if (vp->scb.have_cb)
 			afs_apply_callback(op, vp);
@@ -305,6 +312,7 @@
 		}
 	}
 
+out:
 	write_sequnlock(&vnode->cb_lock);
 
 	if (vp->scb.have_status)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 14d5d75..0d150a2 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h

@@ -755,6 +755,7 @@
 	bool			update_ctime:1;	/* Need to update the ctime */
 	bool			set_size:1;	/* Must update i_size */
 	bool			op_unlinked:1;	/* True if file was unlinked by op */
+	bool			speculative:1;	/* T if speculative status fetch (no vnode lock) */
 };
 
 /*
commit	a9e5c87ca7443d09fb530fffa4d96ce1c76dbe4d	[log] [tgz]
author	David Howells <dhowells@redhat.com>	Sun Nov 22 13:13:45 2020 +0000
committer	Linus Torvalds <torvalds@linux-foundation.org>	Sun Nov 22 11:27:03 2020 -0800
tree	da0ec0d0f11e772efbb5e1d5345ccd3883614fca
parent	a349e4c659609fd20e4beea89e5c4a4038e33a95 [diff]