xfs: allocate xattr buffer on demand
When doing file lookups and checking for permissions, we end up in
xfs_get_acl() to see if there are any ACLs on the inode. This
requires and xattr lookup, and to do that we have to supply a buffer
large enough to hold an maximum sized xattr.
On workloads were we are accessing a wide range of cache cold files
under memory pressure (e.g. NFS fileservers) we end up spending a
lot of time allocating the buffer. The buffer is 64k in length, so
is a contiguous multi-page allocation, and if that then fails we
fall back to vmalloc(). Hence the allocation here is /expensive/
when we are looking up hundreds of thousands of files a second.
Initial numbers from a bpf trace show average time in xfs_get_acl()
is ~32us, with ~19us of that in the memory allocation. Note these
are average times, so there are going to be affected by the worst
case allocations more than the common fast case...
To avoid this, we could just do a "null" lookup to see if the ACL
xattr exists and then only do the allocation if it exists. This,
however, optimises the path for the "no ACL present" case at the
expense of the "acl present" case. i.e. we can halve the time in
xfs_get_acl() for the no acl case (i.e down to ~10-15us), but that
then increases the ACL case by 30% (i.e. up to 40-45us).
To solve this and speed up both cases, drive the xattr buffer
allocation into the attribute code once we know what the actual
xattr length is. For the no-xattr case, we avoid the allocation
completely, speeding up that case. For the common ACL case, we'll
end up with a fast heap allocation (because it'll be smaller than a
page), and only for the rarer "we have a remote xattr" will we have
a multi-page allocation occur. Hence the common ACL case will be
much faster, too.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 86c0697..96d7071 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -112,7 +112,7 @@ xfs_get_acl(struct inode *inode, int type)
{
struct xfs_inode *ip = XFS_I(inode);
struct posix_acl *acl = NULL;
- struct xfs_acl *xfs_acl;
+ struct xfs_acl *xfs_acl = NULL;
unsigned char *ea_name;
int error;
int len;
@@ -135,12 +135,8 @@ xfs_get_acl(struct inode *inode, int type)
* go out to the disk.
*/
len = XFS_ACL_MAX_SIZE(ip->i_mount);
- xfs_acl = kmem_zalloc_large(len, 0);
- if (!xfs_acl)
- return ERR_PTR(-ENOMEM);
-
- error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
- &len, ATTR_ROOT);
+ error = xfs_attr_get(ip, ea_name, (unsigned char **)&xfs_acl, &len,
+ ATTR_ALLOC | ATTR_ROOT);
if (error) {
/*
* If the attribute doesn't exist make sure we have a negative
@@ -151,8 +147,8 @@ xfs_get_acl(struct inode *inode, int type)
} else {
acl = xfs_acl_from_disk(xfs_acl, len,
XFS_ACL_MAX_ENTRIES(ip->i_mount));
+ kmem_free(xfs_acl);
}
- kmem_free(xfs_acl);
return acl;
}