Merge tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker: "New Features: - Add support for 'dacl' and 'sacl' attributes Bugfixes and Cleanups: - Fixes for reporting mapping errors - Fixes for memory allocation errors - Improve warning message when locks are lost - Update documentation for the nfs4_unique_id parameter - Add an explanation of NFSv4 client identifiers - Ensure the i_size attribute is written to the fscache storage - Fix freeing uninitialized nfs4_labels - Better handling when xprtrdma bc_serv is NULL - Mark qualified async operations as MOVEABLE tasks" * tag 'nfs-for-5.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFSv4.1 mark qualified async operations as MOVEABLE tasks xprtrdma: treat all calls not a bcall when bc_serv is NULL NFSv4: Fix free of uninitialized nfs4_label on referral lookup. NFS: Pass i_size to fscache_unuse_cookie() when a file is released Documentation: Add an explanation of NFSv4 client identifiers NFS: update documentation for the nfs4_unique_id parameter NFS: Improve warning message when locks are lost. NFSv4.1: Enable access to the NFSv4.1 'dacl' and 'sacl' attributes NFSv4: Add encoders/decoders for the NFSv4.1 dacl and sacl attributes NFSv4: Specify the type of ACL to cache NFSv4: Don't hold the layoutget locks across multiple RPC calls pNFS/files: Fall back to I/O through the MDS on non-fatal layout errors NFS: Further fixes to the writeback error handling NFSv4/pNFS: Do not fail I/O when we fail to allocate the pNFS layout NFS: Memory allocation failures are not server fatal errors NFS: Don't report errors from nfs_pageio_complete() more than once NFS: Do not report flush errors in nfs_write_end() NFS: Don't report ENOSPC write errors twice NFS: fsync() should report filesystem errors over EINTR/ERESTARTSYS NFS: Do not report EINTR/ERESTARTSYS as mapping errors
author: Linus Torvalds 2022-05-31 16:58:24 -0700
committer: Linus Torvalds 2022-05-31 16:58:24 -0700
commit: 700170bf6b4d773e328fa54ebb70ba444007c702 (patch)
tree: 8ff0327b1622b670bf4ac30e69f176f59d66044e /Documentation
parent: 1501f707d2b24316b41d45bdc95a73bc8cc8dd49 (diff)
parent: 118f09eda21d392e1eeb9f8a4bee044958cccf20 (diff)
3 files changed, 227 insertions, 6 deletions
diff --git a/Documentation/admin-guide/nfs/nfs-client.rst b/Documentation/admin-guide/nfs/nfs-client.rst
index 6adb6457bc69..36760685dd34 100644
--- a/Documentation/admin-guide/nfs/nfs-client.rst
+++ b/Documentation/admin-guide/nfs/nfs-client.rst
@@ -36,10 +36,9 @@ administrative requirements that require particular behavior that does not
 work well as part of an nfs_client_id4 string.
 
 The nfs.nfs4_unique_id boot parameter specifies a unique string that can be
-used instead of a system's node name when an NFS client identifies itself to
-a server.  Thus, if the system's node name is not unique, or it changes, its
-nfs.nfs4_unique_id stays the same, preventing collision with other clients
-or loss of state during NFS reboot recovery or transparent state migration.
+used together with  a system's node name when an NFS client identifies itself to
+a server.  Thus, if the system's node name is not unique, its
+nfs.nfs4_unique_id can help prevent collisions with other clients.
 
 The nfs.nfs4_unique_id string is typically a UUID, though it can contain
 anything that is believed to be unique across all NFS clients.  An
@@ -53,8 +52,12 @@ outstanding NFSv4 state has expired, to prevent loss of NFSv4 state.
 
 This string can be stored in an NFS client's grub.conf, or it can be provided
 via a net boot facility such as PXE.  It may also be specified as an nfs.ko
-module parameter.  Specifying a uniquifier string is not support for NFS
-clients running in containers.
+module parameter.
+
+This uniquifier string will be the same for all NFS clients running in
+containers unless it is overridden by a value written to
+/sys/fs/nfs/net/nfs_client/identifier which will be local to the network
+namespace of the process which writes.
 
 
 The DNS resolver
diff --git a/Documentation/filesystems/nfs/client-identifier.rst b/Documentation/filesystems/nfs/client-identifier.rst
new file mode 100644
index 000000000000..5147e15815a1
--- /dev/null
+++ b/Documentation/filesystems/nfs/client-identifier.rst
@@ -0,0 +1,216 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================
+NFSv4 client identifier
+=======================
+
+This document explains how the NFSv4 protocol identifies client
+instances in order to maintain file open and lock state during
+system restarts. A special identifier and principal are maintained
+on each client. These can be set by administrators, scripts
+provided by site administrators, or tools provided by Linux
+distributors.
+
+There are risks if a client's NFSv4 identifier and its principal
+are not chosen carefully.
+
+
+Introduction
+------------
+
+The NFSv4 protocol uses "lease-based file locking". Leases help
+NFSv4 servers provide file lock guarantees and manage their
+resources.
+
+Simply put, an NFSv4 server creates a lease for each NFSv4 client.
+The server collects each client's file open and lock state under
+the lease for that client.
+
+The client is responsible for periodically renewing its leases.
+While a lease remains valid, the server holding that lease
+guarantees the file locks the client has created remain in place.
+
+If a client stops renewing its lease (for example, if it crashes),
+the NFSv4 protocol allows the server to remove the client's open
+and lock state after a certain period of time. When a client
+restarts, it indicates to servers that open and lock state
+associated with its previous leases is no longer valid and can be
+destroyed immediately.
+
+In addition, each NFSv4 server manages a persistent list of client
+leases. When the server restarts and clients attempt to recover
+their state, the server uses this list to distinguish amongst
+clients that held state before the server restarted and clients
+sending fresh OPEN and LOCK requests. This enables file locks to
+persist safely across server restarts.
+
+NFSv4 client identifiers
+------------------------
+
+Each NFSv4 client presents an identifier to NFSv4 servers so that
+they can associate the client with its lease. Each client's
+identifier consists of two elements:
+
+  - co_ownerid: An arbitrary but fixed string.
+
+  - boot verifier: A 64-bit incarnation verifier that enables a
+    server to distinguish successive boot epochs of the same client.
+
+The NFSv4.0 specification refers to these two items as an
+"nfs_client_id4". The NFSv4.1 specification refers to these two
+items as a "client_owner4".
+
+NFSv4 servers tie this identifier to the principal and security
+flavor that the client used when presenting it. Servers use this
+principal to authorize subsequent lease modification operations
+sent by the client. Effectively this principal is a third element of
+the identifier.
+
+As part of the identity presented to servers, a good
+"co_ownerid" string has several important properties:
+
+  - The "co_ownerid" string identifies the client during reboot
+    recovery, therefore the string is persistent across client
+    reboots.
+  - The "co_ownerid" string helps servers distinguish the client
+    from others, therefore the string is globally unique. Note
+    that there is no central authority that assigns "co_ownerid"
+    strings.
+  - Because it often appears on the network in the clear, the
+    "co_ownerid" string does not reveal private information about
+    the client itself.
+  - The content of the "co_ownerid" string is set and unchanging
+    before the client attempts NFSv4 mounts after a restart.
+  - The NFSv4 protocol places a 1024-byte limit on the size of the
+    "co_ownerid" string.
+
+Protecting NFSv4 lease state
+----------------------------
+
+NFSv4 servers utilize the "client_owner4" as described above to
+assign a unique lease to each client. Under this scheme, there are
+circumstances where clients can interfere with each other. This is
+referred to as "lease stealing".
+
+If distinct clients present the same "co_ownerid" string and use
+the same principal (for example, AUTH_SYS and UID 0), a server is
+unable to tell that the clients are not the same. Each distinct
+client presents a different boot verifier, so it appears to the
+server as if there is one client that is rebooting frequently.
+Neither client can maintain open or lock state in this scenario.
+
+If distinct clients present the same "co_ownerid" string and use
+distinct principals, the server is likely to allow the first client
+to operate normally but reject subsequent clients with the same
+"co_ownerid" string.
+
+If a client's "co_ownerid" string or principal are not stable,
+state recovery after a server or client reboot is not guaranteed.
+If a client unexpectedly restarts but presents a different
+"co_ownerid" string or principal to the server, the server orphans
+the client's previous open and lock state. This blocks access to
+locked files until the server removes the orphaned state.
+
+If the server restarts and a client presents a changed "co_ownerid"
+string or principal to the server, the server will not allow the
+client to reclaim its open and lock state, and may give those locks
+to other clients in the meantime. This is referred to as "lock
+stealing".
+
+Lease stealing and lock stealing increase the potential for denial
+of service and in rare cases even data corruption.
+
+Selecting an appropriate client identifier
+------------------------------------------
+
+By default, the Linux NFSv4 client implementation constructs its
+"co_ownerid" string starting with the words "Linux NFS" followed by
+the client's UTS node name (the same node name, incidentally, that
+is used as the "machine name" in an AUTH_SYS credential). In small
+deployments, this construction is usually adequate. Often, however,
+the node name by itself is not adequately unique, and can change
+unexpectedly. Problematic situations include:
+
+  - NFS-root (diskless) clients, where the local DCHP server (or
+    equivalent) does not provide a unique host name.
+
+  - "Containers" within a single Linux host.  If each container has
+    a separate network namespace, but does not use the UTS namespace
+    to provide a unique host name, then there can be multiple NFS
+    client instances with the same host name.
+
+  - Clients across multiple administrative domains that access a
+    common NFS server. If hostnames are not assigned centrally
+    then uniqueness cannot be guaranteed unless a domain name is
+    included in the hostname.
+
+Linux provides two mechanisms to add uniqueness to its "co_ownerid"
+string:
+
+    nfs.nfs4_unique_id
+      This module parameter can set an arbitrary uniquifier string
+      via the kernel command line, or when the "nfs" module is
+      loaded.
+
+    /sys/fs/nfs/client/net/identifier
+      This virtual file, available since Linux 5.3, is local to the
+      network namespace in which it is accessed and so can provide
+      distinction between network namespaces (containers) when the
+      hostname remains uniform.
+
+Note that this file is empty on name-space creation. If the
+container system has access to some sort of per-container identity
+then that uniquifier can be used. For example, a uniquifier might
+be formed at boot using the container's internal identifier:
+
+    sha256sum /etc/machine-id | awk '{print $1}' \\
+        > /sys/fs/nfs/client/net/identifier
+
+Security considerations
+-----------------------
+
+The use of cryptographic security for lease management operations
+is strongly encouraged.
+
+If NFS with Kerberos is not configured, a Linux NFSv4 client uses
+AUTH_SYS and UID 0 as the principal part of its client identity.
+This configuration is not only insecure, it increases the risk of
+lease and lock stealing. However, it might be the only choice for
+client configurations that have no local persistent storage.
+"co_ownerid" string uniqueness and persistence is critical in this
+case.
+
+When a Kerberos keytab is present on a Linux NFS client, the client
+attempts to use one of the principals in that keytab when
+identifying itself to servers. The "sec=" mount option does not
+control this behavior. Alternately, a single-user client with a
+Kerberos principal can use that principal in place of the client's
+host principal.
+
+Using Kerberos for this purpose enables the client and server to
+use the same lease for operations covered by all "sec=" settings.
+Additionally, the Linux NFS client uses the RPCSEC_GSS security
+flavor with Kerberos and the integrity QOS to prevent in-transit
+modification of lease modification requests.
+
+Additional notes
+----------------
+The Linux NFSv4 client establishes a single lease on each NFSv4
+server it accesses. NFSv4 mounts from a Linux NFSv4 client of a
+particular server then share that lease.
+
+Once a client establishes open and lock state, the NFSv4 protocol
+enables lease state to transition to other servers, following data
+that has been migrated. This hides data migration completely from
+running applications. The Linux NFSv4 client facilitates state
+migration by presenting the same "client_owner4" to all servers it
+encounters.
+
+========
+See Also
+========
+
+  - nfs(5)
+  - kerberos(7)
+  - RFC 7530 for the NFSv4.0 specification
+  - RFC 8881 for the NFSv4.1 specification.
diff --git a/Documentation/filesystems/nfs/index.rst b/Documentation/filesystems/nfs/index.rst
index 288d8ddb2bc6..8536134f31fd 100644
--- a/Documentation/filesystems/nfs/index.rst
+++ b/Documentation/filesystems/nfs/index.rst
@@ -6,6 +6,8 @@ NFS
 .. toctree::
    :maxdepth: 1
 
+   client-identifier
+   exporting
    pnfs
    rpc-cache
    rpc-server-gss
author	Linus Torvalds	2022-05-31 16:58:24 -0700
committer	Linus Torvalds	2022-05-31 16:58:24 -0700
commit	700170bf6b4d773e328fa54ebb70ba444007c702 (patch)
tree	8ff0327b1622b670bf4ac30e69f176f59d66044e /Documentation
parent	1501f707d2b24316b41d45bdc95a73bc8cc8dd49 (diff)
parent	118f09eda21d392e1eeb9f8a4bee044958cccf20 (diff)