| =========== |
| NFS LOCALIO |
| =========== |
| |
| Overview |
| ======== |
| |
| The LOCALIO auxiliary RPC protocol allows the Linux NFS client and |
| server to reliably handshake to determine if they are on the same |
| host. Select "NFS client and server support for LOCALIO auxiliary |
| protocol" in menuconfig to enable CONFIG_NFS_LOCALIO in the kernel |
| config (both CONFIG_NFS_FS and CONFIG_NFSD must also be enabled). |
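
For example, a kernel configuration with LOCALIO enabled might contain
the following (CONFIG_NFS_FS and CONFIG_NFSD may be built-in or
modular)::

  CONFIG_NFS_FS=m
  CONFIG_NFSD=m
  CONFIG_NFS_LOCALIO=y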
| |
| Once an NFS client and server handshake as "local", the client will |
| bypass the network RPC protocol for read, write and commit operations. |
Because these operations bypass XDR and RPC, they complete faster.
| |
| The LOCALIO auxiliary protocol's implementation, which uses the same |
| connection as NFS traffic, follows the pattern established by the NFS |
| ACL protocol extension. |
| |
| The LOCALIO auxiliary protocol is needed to allow robust discovery of |
| clients local to their servers. In a private implementation that |
preceded use of this LOCALIO protocol, a fragile sockaddr
network-address-based match against all local network interfaces was
attempted.
| But unlike the LOCALIO protocol, the sockaddr-based matching didn't |
| handle use of iptables or containers. |
| |
The robust handshake between local client and server is just the
beginning; the ultimate use case this locality makes possible is that
the client is able to open files and issue reads, writes and commits
directly to the server without having to go over the network. The
requirement is to perform these loopback NFS operations as efficiently
as possible; this is particularly useful for container use cases
(e.g. kubernetes) where it is possible to run an IO job local to the
server.
| |
| The performance advantage realized from LOCALIO's ability to bypass |
| using XDR and RPC for reads, writes and commits can be extreme, e.g.: |
| |
| fio for 20 secs with directio, qd of 8, 16 libaio threads: |
| - With LOCALIO: |
| 4K read: IOPS=979k, BW=3825MiB/s (4011MB/s)(74.7GiB/20002msec) |
| 4K write: IOPS=165k, BW=646MiB/s (678MB/s)(12.6GiB/20002msec) |
| 128K read: IOPS=402k, BW=49.1GiB/s (52.7GB/s)(982GiB/20002msec) |
| 128K write: IOPS=11.5k, BW=1433MiB/s (1503MB/s)(28.0GiB/20004msec) |
| |
| - Without LOCALIO: |
| 4K read: IOPS=79.2k, BW=309MiB/s (324MB/s)(6188MiB/20003msec) |
| 4K write: IOPS=59.8k, BW=234MiB/s (245MB/s)(4671MiB/20002msec) |
| 128K read: IOPS=33.9k, BW=4234MiB/s (4440MB/s)(82.7GiB/20004msec) |
| 128K write: IOPS=11.5k, BW=1434MiB/s (1504MB/s)(28.0GiB/20011msec) |
| |
| fio for 20 secs with directio, qd of 8, 1 libaio thread: |
| - With LOCALIO: |
| 4K read: IOPS=230k, BW=898MiB/s (941MB/s)(17.5GiB/20001msec) |
| 4K write: IOPS=22.6k, BW=88.3MiB/s (92.6MB/s)(1766MiB/20001msec) |
| 128K read: IOPS=38.8k, BW=4855MiB/s (5091MB/s)(94.8GiB/20001msec) |
| 128K write: IOPS=11.4k, BW=1428MiB/s (1497MB/s)(27.9GiB/20001msec) |
| |
| - Without LOCALIO: |
| 4K read: IOPS=77.1k, BW=301MiB/s (316MB/s)(6022MiB/20001msec) |
| 4K write: IOPS=32.8k, BW=128MiB/s (135MB/s)(2566MiB/20001msec) |
| 128K read: IOPS=24.4k, BW=3050MiB/s (3198MB/s)(59.6GiB/20001msec) |
| 128K write: IOPS=11.4k, BW=1430MiB/s (1500MB/s)(27.9GiB/20001msec) |
| |
| FAQ |
| === |
| |
| 1. What are the use cases for LOCALIO? |
| |
| a. Workloads where the NFS client and server are on the same host |
| realize improved IO performance. In particular, it is common when |
| running containerised workloads for jobs to find themselves |
| running on the same host as the knfsd server being used for |
| storage. |
| |
| 2. What are the requirements for LOCALIO? |
| |
| a. Bypass use of the network RPC protocol as much as possible. This |
| includes bypassing XDR and RPC for open, read, write and commit |
| operations. |
| b. Allow client and server to autonomously discover if they are |
| running local to each other without making any assumptions about |
| the local network topology. |
| c. Support the use of containers by being compatible with relevant |
| namespaces (e.g. network, user, mount). |
| d. Support all versions of NFS. NFSv3 is of particular importance |
| because it has wide enterprise usage and pNFS flexfiles makes use |
| of it for the data path. |
| |
3. Why doesn't LOCALIO just compare IP addresses or hostnames when
| deciding if the NFS client and server are co-located on the same |
| host? |
| |
| Since one of the main use cases is containerised workloads, we cannot |
| assume that IP addresses will be shared between the client and |
| server. This sets up a requirement for a handshake protocol that |
| needs to go over the same connection as the NFS traffic in order to |
| identify that the client and the server really are running on the |
| same host. The handshake uses a secret that is sent over the wire, |
| and can be verified by both parties by comparing with a value stored |
| in shared kernel memory if they are truly co-located. |
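
   A minimal sketch of the idea (the list, lock, struct and helper
   names here are hypothetical, not the upstream code): the client
   records its nonce in a list that lives in shared kernel memory, so
   the server only finds a match if both ends run in the same kernel::

     #include <linux/list.h>
     #include <linux/spinlock.h>
     #include <linux/uuid.h>

     static LIST_HEAD(nfs_uuids);            /* registered client nonces */
     static DEFINE_SPINLOCK(nfs_uuids_lock);

     struct nfs_uuid_entry {
             uuid_t uuid;
             struct list_head list;
     };

     static bool uuid_is_local(const uuid_t *wire_uuid)
     {
             struct nfs_uuid_entry *entry;
             bool local = false;

             spin_lock(&nfs_uuids_lock);
             list_for_each_entry(entry, &nfs_uuids, list) {
                     if (uuid_equal(&entry->uuid, wire_uuid)) {
                             local = true;   /* same kernel memory */
                             break;
                     }
             }
             spin_unlock(&nfs_uuids_lock);
             return local;
     }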
| |
| 4. Does LOCALIO improve pNFS flexfiles? |
| |
| Yes, LOCALIO complements pNFS flexfiles by allowing it to take |
   advantage of NFS client and server locality. Policy that initiates
   client IO as close as possible to the server where the data is stored
   naturally benefits from the data path optimization LOCALIO provides.
| |
| 5. Why not develop a new pNFS layout to enable LOCALIO? |
| |
| A new pNFS layout could be developed, but doing so would put the |
| onus on the server to somehow discover that the client is co-located |
| when deciding to hand out the layout. |
| There is value in a simpler approach (as provided by LOCALIO) that |
| allows the NFS client to negotiate and leverage locality without |
| requiring more elaborate modeling and discovery of such locality in a |
| more centralized manner. |
| |
| 6. Why is having the client perform a server-side file OPEN, without |
| using RPC, beneficial? Is the benefit pNFS specific? |
| |
| Avoiding the use of XDR and RPC for file opens is beneficial to |
| performance regardless of whether pNFS is used. Especially when |
   dealing with small files it's best to avoid going over the wire
| whenever possible, otherwise it could reduce or even negate the |
| benefits of avoiding the wire for doing the small file I/O itself. |
| Given LOCALIO's requirements the current approach of having the |
| client perform a server-side file open, without using RPC, is ideal. |
| If in the future requirements change then we can adapt accordingly. |
| |
| 7. Why is LOCALIO only supported with UNIX Authentication (AUTH_UNIX)? |
| |
| Strong authentication is usually tied to the connection itself. It |
| works by establishing a context that is cached by the server, and |
| that acts as the key for discovering the authorisation token, which |
| can then be passed to rpc.mountd to complete the authentication |
| process. On the other hand, in the case of AUTH_UNIX, the credential |
| that was passed over the wire is used directly as the key in the |
| upcall to rpc.mountd. This simplifies the authentication process, and |
| so makes AUTH_UNIX easier to support. |
| |
| 8. How do export options that translate RPC user IDs behave for LOCALIO |
   operations (e.g. root_squash, all_squash)?
| |
| Export options that translate user IDs are managed by nfsd_setuser() |
| which is called by nfsd_setuser_and_check_port() which is called by |
| __fh_verify(). So they get handled exactly the same way for LOCALIO |
| as they do for non-LOCALIO. |
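
   For example, with a hypothetical export entry such as the following,
   a LOCALIO client's root user is squashed exactly as it would be for
   NFS access over the wire::

     /srv/export  *(rw,root_squash)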
| |
| 9. How does LOCALIO make certain that object lifetimes are managed |
| properly given NFSD and NFS operate in different contexts? |
| |
| See the detailed "NFS Client and Server Interlock" section below. |
| |
| RPC |
| === |
| |
| The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" |
| RPC method that allows the Linux NFS client to verify the local Linux |
| NFS server can see the nonce (single-use UUID) the client generated and |
made available in nfs_common. This protocol isn't part of an IETF
standard, nor does it need to be, considering it is a Linux-to-Linux
auxiliary RPC protocol that amounts to an implementation detail.
| |
| The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of |
| the fixed UUID_SIZE (16 bytes). The fixed size opaque encode and decode |
| XDR methods are used instead of the less efficient variable sized |
| methods. |
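
As an illustration (not necessarily the exact upstream code), a
fixed-size encode of the uuid_t using the SUNRPC xdr_stream helpers
might look like the following; the function name and calling
convention here are hypothetical::

  #include <linux/sunrpc/xdr.h>
  #include <linux/uuid.h>

  /* Encode the 16-byte uuid_t as a fixed-size opaque (no length
   * prefix on the wire), per the protocol spec below. */
  static void localio_xdr_enc_uuid(struct xdr_stream *xdr,
                                   const uuid_t *uuid)
  {
          __be32 *p = xdr_reserve_space(xdr, UUID_SIZE);

          if (p)
                  xdr_encode_opaque_fixed(p, uuid, UUID_SIZE);
  }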
| |
| The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned |
| by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ): |
| Linux Kernel Organization 400122 nfslocalio |
| |
| The LOCALIO protocol spec in rpcgen syntax is:: |
| |
| /* raw RFC 9562 UUID */ |
| #define UUID_SIZE 16 |
| typedef u8 uuid_t<UUID_SIZE>; |
| |
| program NFS_LOCALIO_PROGRAM { |
| version LOCALIO_V1 { |
| void |
| NULL(void) = 0; |
| |
| void |
| UUID_IS_LOCAL(uuid_t) = 1; |
| } = 1; |
| } = 400122; |
| |
| LOCALIO uses the same transport connection as NFS traffic. As such, |
| LOCALIO is not registered with rpcbind. |
| |
| NFS Common and Client/Server Handshake |
| ====================================== |
| |
fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS
client to generate a nonce (single-use UUID) and an associated
short-lived nfs_uuid_t struct, and to register it with nfs_common for
subsequent lookup and verification by the NFS server; if matched, the
NFS server populates members in the nfs_uuid_t struct. The NFS client
then uses nfs_common to transfer the nfs_uuid_t from nfs_common's
uuids_list to the nn->nfsd_serv clients_list. See:
fs/nfs/localio.c:nfs_local_probe()
| |
nfs_common's nfs_uuids list is the basis for LOCALIO enablement; as
such it has members that point to nfsd memory for direct use by the
client (e.g. 'net' is the server's network namespace; through it the
client can access nn->nfsd_serv with proper RCU read access). It is
this client and server synchronization that enables advanced usage and
allows the lifetime of objects to span from the host kernel's nfsd to
per-container knfsd instances that are connected to NFS clients running
on the same local host.
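
A simplified sketch of the nfs_uuid_t described above (the
authoritative definition lives in include/linux/nfslocalio.h and
carries additional members)::

  /* Simplified for illustration; see include/linux/nfslocalio.h */
  typedef struct {
          uuid_t uuid;             /* single-use nonce for the handshake */
          struct list_head list;   /* nfs_common's uuids_list, then the
                                    * server's per-net clients_list */
          struct net __rcu *net;   /* NFSD's network namespace; cleared
                                    * by the net-ns "pre_exit" handler */
          struct auth_domain *dom; /* auth_domain established by NFSD */
  } nfs_uuid_t;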
| |
| NFS Client and Server Interlock |
| =============================== |
| |
| LOCALIO provides the nfs_uuid_t object and associated interfaces to |
| allow proper network namespace (net-ns) and NFSD object refcounting: |
| |
| We don't want to keep a long-term counted reference on each NFSD's |
| net-ns in the client because that prevents a server container from |
| completely shutting down. |
| |
| So we avoid taking a reference at all and rely on the per-cpu |
| reference to the server (detailed below) being sufficient to keep |
| the net-ns active. This involves allowing the NFSD's net-ns exit |
| code to iterate all active clients and clear their ->net pointers |
| (which are needed to find the per-cpu-refcount for the nfsd_serv). |
| |
| Details: |
| |
| - Embed nfs_uuid_t in nfs_client. nfs_uuid_t provides a list_head |
| that can be used to find the client. It does add the 16-byte |
| uuid_t to nfs_client so it is bigger than needed (given that |
| uuid_t is only used during the initial NFS client and server |
| LOCALIO handshake to determine if they are local to each other). |
| If that is really a problem we can find a fix. |
| |
| - When the nfs server confirms that the uuid_t is local, it moves |
| the nfs_uuid_t onto a per-net-ns list in NFSD's nfsd_net. |
| |
| - When each server's net-ns is shutting down - in a "pre_exit" |
| handler, all these nfs_uuid_t have their ->net cleared. There is |
| an rcu_synchronize() call between pre_exit() handlers and exit() |
| handlers so any caller that sees nfs_uuid_t ->net as not NULL can |
| safely manage the per-cpu-refcount for nfsd_serv. |
| |
| - The client's nfs_uuid_t is passed to nfsd_open_local_fh() so it |
| can safely dereference ->net in a private rcu_read_lock() section |
| to allow safe access to the associated nfsd_net and nfsd_serv. |
| |
| So LOCALIO required the introduction and use of NFSD's percpu_ref to |
| interlock nfsd_destroy_serv() and nfsd_open_local_fh(), to ensure each |
| nn->nfsd_serv is not destroyed while in use by nfsd_open_local_fh(), and |
| warrants a more detailed explanation: |
| |
  nfsd_open_local_fh() uses nfsd_serv_try_get() before opening its
  nfsd_file handle; the caller (the NFS client) must then drop the
  reference for the nfsd_file and associated nn->nfsd_serv using
  nfs_file_put_local() once it has completed its IO, as sketched
  below.
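
  A condensed sketch of this get/put interlock (illustrative only; the
  real nfsd_open_local_fh() in fs/nfsd/localio.c is more involved, and
  nfsd_file_acquire_local() here stands in for the actual file-open
  step)::

    static struct nfsd_file *localio_open_sketch(nfs_uuid_t *uuid,
                                                 struct nfs_fh *fh)
    {
            struct nfsd_file *nf;
            struct net *net;

            rcu_read_lock();
            net = rcu_dereference(uuid->net); /* NULL once pre_exit ran */
            if (!net || !nfsd_serv_try_get(net)) {
                    rcu_read_unlock();
                    return ERR_PTR(-ENXIO);   /* client must re-probe */
            }
            rcu_read_unlock(); /* percpu_ref now pins nn->nfsd_serv */

            nf = nfsd_file_acquire_local(net, uuid->dom, fh);
            if (IS_ERR(nf))
                    nfsd_serv_put(net);       /* drop the pin on failure */
            return nf;
    }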
| |
  For this interlock to work, nfsd_open_local_fh() must be able to
  safely deal with the possibility that the NFSD's net-ns (and
  nfsd_net by association) may have been destroyed by
  nfsd_destroy_serv() via nfsd_shutdown_net() -- which is only
  possible given the nfs_uuid_t ->net pointer management detailed
  above.
| |
All told, this elaborate interlock of the NFS client and server has
been verified to fix an easy-to-hit crash that would occur if an NFSD
instance running in a container, with a LOCALIO client mounted, was
shut down. Upon restart of the container and associated NFSD, the
client would crash with a NULL pointer dereference caused by the
LOCALIO client attempting nfsd_open_local_fh(), using nn->nfsd_serv,
without holding a proper reference on nn->nfsd_serv.
| |
| NFS Client issues IO instead of Server |
| ====================================== |
| |
| Because LOCALIO is focused on protocol bypass to achieve improved IO |
| performance, alternatives to the traditional NFS wire protocol (SUNRPC |
| with XDR) must be provided to access the backing filesystem. |
| |
| See fs/nfs/localio.c:nfs_local_open_fh() and |
| fs/nfsd/localio.c:nfsd_open_local_fh() for the interface that makes |
| focused use of select nfs server objects to allow a client local to a |
| server to open a file pointer without needing to go over the network. |
| |
| The client's fs/nfs/localio.c:nfs_local_open_fh() will call into the |
| server's fs/nfsd/localio.c:nfsd_open_local_fh() and carefully access |
| both the associated nfsd network namespace and nn->nfsd_serv in terms of |
| RCU. If nfsd_open_local_fh() finds that the client no longer sees valid |
| nfsd objects (be it struct net or nn->nfsd_serv) it returns -ENXIO |
| to nfs_local_open_fh() and the client will try to reestablish the |
| LOCALIO resources needed by calling nfs_local_probe() again. This |
| recovery is needed if/when an nfsd instance running in a container were |
| to reboot while a LOCALIO client is connected to it. |
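
From the client's perspective, that recovery reduces to the following
(a simplified sketch; the argument lists of nfs_local_open_fh() and
nfs_local_probe() here are illustrative, not the upstream signatures)::

  /* Re-run the LOCALIO handshake if the server-side objects vanished,
   * e.g. because a containerized NFSD was restarted. */
  static struct nfsd_file *localio_open_or_reprobe(struct nfs_client *clp,
                                                   struct nfs_fh *fh)
  {
          struct nfsd_file *nf = nfs_local_open_fh(clp, fh);

          if (IS_ERR(nf) && PTR_ERR(nf) == -ENXIO)
                  nfs_local_probe(clp); /* re-establish LOCALIO */
          return nf;
  }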
| |
| Once the client has an open nfsd_file pointer it will issue reads, |
| writes and commits directly to the underlying local filesystem (normally |
| done by the nfs server). As such, for these operations, the NFS client |
| is issuing IO to the underlying local filesystem that it is sharing with |
| the NFS server. See: fs/nfs/localio.c:nfs_local_doio() and |
| fs/nfs/localio.c:nfs_local_commit(). |
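
For instance, once the nfsd_file's underlying struct file is in hand,
a read can be driven through the local filesystem's ->read_iter method
without any SUNRPC/XDR involvement (a bare-bones sketch; the real
helpers also manage iocb completion, credentials and error handling)::

  #include <linux/fs.h>
  #include <linux/uio.h>

  /* Bare-bones illustration: issue a synchronous local read. */
  static ssize_t localio_read_sketch(struct file *filp,
                                     struct iov_iter *iter, loff_t pos)
  {
          struct kiocb kiocb;

          init_sync_kiocb(&kiocb, filp); /* synchronous kiocb on file */
          kiocb.ki_pos = pos;
          return call_read_iter(filp, &kiocb, iter);
  }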
| |
| Security |
| ======== |
| |
LOCALIO is only supported when UNIX-style authentication (AUTH_UNIX, aka
| AUTH_SYS) is used. |
| |
| Care is taken to ensure the same NFS security mechanisms are used |
| (authentication, etc) regardless of whether LOCALIO or regular NFS |
| access is used. The auth_domain established as part of the traditional |
| NFS client access to the NFS server is also used for LOCALIO. |
| |
| Relative to containers, LOCALIO gives the client access to the network |
| namespace the server has. This is required to allow the client to access |
| the server's per-namespace nfsd_net struct. With traditional NFS, the |
| client is afforded this same level of access (albeit in terms of the NFS |
| protocol via SUNRPC). No other namespaces (user, mount, etc) have been |
| altered or purposely extended from the server to the client. |
| |
| Testing |
| ======= |
| |
| The LOCALIO auxiliary protocol and associated NFS LOCALIO read, write |
| and commit access have proven stable against various test scenarios: |
| |
| - Client and server both on the same host. |
| |
| - All permutations of client and server support enablement for both |
| local and remote client and server. |
| |
| - Testing against NFS storage products that don't support the LOCALIO |
| protocol was also performed. |
| |
- Client on host, server within a container (for both v3 and v4.2).
  The container testing used podman-managed containers and included a
  successful container stop/restart scenario.
| |
- Formalizing these test scenarios in terms of existing test
  infrastructure is ongoing. Initial regular coverage is provided in
| terms of ktest running xfstests against a LOCALIO-enabled NFS loopback |
| mount configuration, and includes lockdep and KASAN coverage, see: |
| https://evilpiepirate.org/~testdashboard/ci?user=snitzer&branch=snitm-nfs-next |
| https://github.com/koverstreet/ktest |
| |
| - Various kdevops testing (in terms of "Chuck's BuildBot") has been |
| performed to regularly verify the LOCALIO changes haven't caused any |
| regressions to non-LOCALIO NFS use cases. |
| |
| - All of Hammerspace's various sanity tests pass with LOCALIO enabled |
| (this includes numerous pNFS and flexfiles tests). |