| ===================================== |
| Filesystem-level encryption (fscrypt) |
| ===================================== |
| |
| Introduction |
| ============ |
| |
| fscrypt is a library which filesystems can hook into to support |
| transparent encryption of files and directories. |
| |
| Note: "fscrypt" in this document refers to the kernel-level portion, |
| implemented in ``fs/crypto/``, as opposed to the userspace tool |
| `fscrypt <https://github.com/google/fscrypt>`_. This document only |
| covers the kernel-level portion. For command-line examples of how to |
| use encryption, see the documentation for the userspace tool `fscrypt |
| <https://github.com/google/fscrypt>`_. Also, it is recommended to use |
| the fscrypt userspace tool, or other existing userspace tools such as |
| `fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key |
| management system |
| <https://source.android.com/security/encryption/file-based>`_, over |
| using the kernel's API directly. Using existing tools reduces the |
| chance of introducing your own security bugs. (Nevertheless, for |
| completeness this documentation covers the kernel's API anyway.) |
| |
| Unlike dm-crypt, fscrypt operates at the filesystem level rather than |
| at the block device level. This allows it to encrypt different files |
| with different keys and to have unencrypted files on the same |
| filesystem. This is useful for multi-user systems where each user's |
| data-at-rest needs to be cryptographically isolated from the others. |
| However, except for filenames, fscrypt does not encrypt filesystem |
| metadata. |
| |
| Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated |
| directly into supported filesystems --- currently ext4, F2FS, UBIFS, |
| and CephFS. This allows encrypted files to be read and written |
| without caching both the decrypted and encrypted pages in the |
| pagecache, thereby nearly halving the memory used and bringing it in |
| line with unencrypted files. Similarly, half as many dentries and |
| inodes are needed. eCryptfs also limits encrypted filenames to 143 |
| bytes, causing application compatibility issues; fscrypt allows the |
| full 255 bytes (NAME_MAX). Finally, unlike eCryptfs, the fscrypt API |
| can be used by unprivileged users, with no need to mount anything. |
| |
| fscrypt does not support encrypting files in-place. Instead, it |
| supports marking an empty directory as encrypted. Then, after |
| userspace provides the key, all regular files, directories, and |
| symbolic links created in that directory tree are transparently |
| encrypted. |
| |
| Threat model |
| ============ |
| |
| Offline attacks |
| --------------- |
| |
| Provided that userspace chooses a strong encryption key, fscrypt |
| protects the confidentiality of file contents and filenames in the |
| event of a single point-in-time permanent offline compromise of the |
| block device content. fscrypt does not protect the confidentiality of |
| non-filename metadata, e.g. file sizes, file permissions, file |
| timestamps, and extended attributes. Also, the existence and location |
| of holes (unallocated blocks which logically contain all zeroes) in |
| files is not protected. |
| |
| fscrypt is not guaranteed to protect confidentiality or authenticity |
| if an attacker is able to manipulate the filesystem offline prior to |
| an authorized user later accessing the filesystem. |
| |
| Online attacks |
| -------------- |
| |
| fscrypt (and storage encryption in general) can only provide limited |
| protection, if any at all, against online attacks. In detail: |
| |
| Side-channel attacks |
| ~~~~~~~~~~~~~~~~~~~~ |
| |
| fscrypt is only resistant to side-channel attacks, such as timing or |
| electromagnetic attacks, to the extent that the underlying Linux |
| Cryptographic API algorithms or inline encryption hardware are. If a |
| vulnerable algorithm is used, such as a table-based implementation of |
| AES, it may be possible for an attacker to mount a side channel attack |
| against the online system. Side channel attacks may also be mounted |
| against applications consuming decrypted data. |
| |
| Unauthorized file access |
| ~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| After an encryption key has been added, fscrypt does not hide the |
| plaintext file contents or filenames from other users on the same |
| system. Instead, existing access control mechanisms such as file mode |
| bits, POSIX ACLs, LSMs, or namespaces should be used for this purpose. |
| |
| (For the reasoning behind this, understand that while the key is |
| added, the confidentiality of the data, from the perspective of the |
| system itself, is *not* protected by the mathematical properties of |
| encryption but rather only by the correctness of the kernel. |
| Therefore, any encryption-specific access control checks would merely |
| be enforced by kernel *code* and therefore would be largely redundant |
| with the wide variety of access control mechanisms already available.) |
| |
| Kernel memory compromise |
| ~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| An attacker who compromises the system enough to read from arbitrary |
| memory, e.g. by mounting a physical attack or by exploiting a kernel |
| security vulnerability, can compromise all encryption keys that are |
| currently in use. |
| |
| However, fscrypt allows encryption keys to be removed from the kernel, |
| which may protect them from later compromise. |
| |
| In more detail, the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl (or the |
| FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl) can wipe a master |
| encryption key from kernel memory. If it does so, it will also try to |
| evict all cached inodes which had been "unlocked" using the key, |
| thereby wiping their per-file keys and making them once again appear |
| "locked", i.e. in ciphertext or encrypted form. |
| |
| However, these ioctls have some limitations: |
| |
| - Per-file keys for in-use files will *not* be removed or wiped. |
| Therefore, for maximum effect, userspace should close the relevant |
| encrypted files and directories before removing a master key, as |
| well as kill any processes whose working directory is in an affected |
| encrypted directory. |
| |
| - The kernel cannot magically wipe copies of the master key(s) that |
| userspace might have as well. Therefore, userspace must wipe all |
| copies of the master key(s) it makes as well; normally this should |
| be done immediately after FS_IOC_ADD_ENCRYPTION_KEY, without waiting |
| for FS_IOC_REMOVE_ENCRYPTION_KEY. Naturally, the same also applies |
| to all higher levels in the key hierarchy. Userspace should also |
| follow other security precautions such as mlock()ing memory |
| containing keys to prevent it from being swapped out. |
| |
| - In general, decrypted contents and filenames in the kernel VFS |
| caches are freed but not wiped. Therefore, portions thereof may be |
| recoverable from freed memory, even after the corresponding key(s) |
| were wiped. To partially solve this, you can set |
| CONFIG_PAGE_POISONING=y in your kernel config and add page_poison=1 |
| to your kernel command line. However, this has a performance cost. |
| |
| - Secret keys might still exist in CPU registers, in crypto |
| accelerator hardware (if used by the crypto API to implement any of |
| the algorithms), or in other places not explicitly considered here. |
| |
| Limitations of v1 policies |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| v1 encryption policies have some weaknesses with respect to online |
| attacks: |
| |
| - There is no verification that the provided master key is correct. |
| Therefore, a malicious user can temporarily associate the wrong key |
| with another user's encrypted files to which they have read-only |
| access. Because of filesystem caching, the wrong key will then be |
| used by the other user's accesses to those files, even if the other |
| user has the correct key in their own keyring. This violates the |
| meaning of "read-only access". |
| |
| - A compromise of a per-file key also compromises the master key from |
| which it was derived. |
| |
| - Non-root users cannot securely remove encryption keys. |
| |
| All the above problems are fixed with v2 encryption policies. For |
| this reason among others, it is recommended to use v2 encryption |
| policies on all new encrypted directories. |
| |
| Key hierarchy |
| ============= |
| |
| Master Keys |
| ----------- |
| |
| Each encrypted directory tree is protected by a *master key*. Master |
| keys can be up to 64 bytes long, and must be at least as long as the |
| greater of the security strength of the contents and filenames |
| encryption modes being used. For example, if any AES-256 mode is |
| used, the master key must be at least 256 bits, i.e. 32 bytes. A |
| stricter requirement applies if the key is used by a v1 encryption |
| policy and AES-256-XTS is used; such keys must be 64 bytes. |
| |
| To "unlock" an encrypted directory tree, userspace must provide the |
| appropriate master key. There can be any number of master keys, each |
| of which protects any number of directory trees on any number of |
| filesystems. |
| |
| Master keys must be real cryptographic keys, i.e. indistinguishable |
| from random bytestrings of the same length. This implies that users |
| **must not** directly use a password as a master key, zero-pad a |
| shorter key, or repeat a shorter key. Security cannot be guaranteed |
| if userspace makes any such error, as the cryptographic proofs and |
| analysis would no longer apply. |
| |
| Instead, users should generate master keys either using a |
| cryptographically secure random number generator, or by using a KDF |
| (Key Derivation Function). The kernel does not do any key stretching; |
| therefore, if userspace derives the key from a low-entropy secret such |
| as a passphrase, it is critical that a KDF designed for this purpose |
| be used, such as scrypt, PBKDF2, or Argon2. |
| |
| Key derivation function |
| ----------------------- |
| |
| With one exception, fscrypt never uses the master key(s) for |
| encryption directly. Instead, they are only used as input to a KDF |
| (Key Derivation Function) to derive the actual keys. |
| |
| The KDF used for a particular master key differs depending on whether |
| the key is used for v1 encryption policies or for v2 encryption |
| policies. Users **must not** use the same key for both v1 and v2 |
| encryption policies. (No real-world attack is currently known on this |
| specific case of key reuse, but its security cannot be guaranteed |
| since the cryptographic proofs and analysis would no longer apply.) |
| |
| For v1 encryption policies, the KDF only supports deriving per-file |
| encryption keys. It works by encrypting the master key with |
| AES-128-ECB, using the file's 16-byte nonce as the AES key. The |
| resulting ciphertext is used as the derived key. If the ciphertext is |
| longer than needed, then it is truncated to the needed length. |
| |
| For v2 encryption policies, the KDF is HKDF-SHA512. The master key is |
| passed as the "input keying material", no salt is used, and a distinct |
| "application-specific information string" is used for each distinct |
| key to be derived. For example, when a per-file encryption key is |
| derived, the application-specific information string is the file's |
| nonce prefixed with "fscrypt\\0" and a context byte. Different |
| context bytes are used for other types of derived keys. |
| |
| HKDF-SHA512 is preferred to the original AES-128-ECB based KDF because |
| HKDF is more flexible, is nonreversible, and evenly distributes |
| entropy from the master key. HKDF is also standardized and widely |
| used by other software, whereas the AES-128-ECB based KDF is ad-hoc. |
| |
| Per-file encryption keys |
| ------------------------ |
| |
| Since each master key can protect many files, it is necessary to |
| "tweak" the encryption of each file so that the same plaintext in two |
| files doesn't map to the same ciphertext, or vice versa. In most |
| cases, fscrypt does this by deriving per-file keys. When a new |
| encrypted inode (regular file, directory, or symlink) is created, |
| fscrypt randomly generates a 16-byte nonce and stores it in the |
| inode's encryption xattr. Then, it uses a KDF (as described in `Key |
| derivation function`_) to derive the file's key from the master key |
| and nonce. |
| |
| Key derivation was chosen over key wrapping because wrapped keys would |
| require larger xattrs which would be less likely to fit in-line in the |
| filesystem's inode table, and there didn't appear to be any |
| significant advantages to key wrapping. In particular, currently |
| there is no requirement to support unlocking a file with multiple |
| alternative master keys or to support rotating master keys. Instead, |
| the master keys may be wrapped in userspace, e.g. as is done by the |
| `fscrypt <https://github.com/google/fscrypt>`_ tool. |
| |
| DIRECT_KEY policies |
| ------------------- |
| |
| The Adiantum encryption mode (see `Encryption modes and usage`_) is |
| suitable for both contents and filenames encryption, and it accepts |
| long IVs --- long enough to hold both an 8-byte data unit index and a |
| 16-byte per-file nonce. Also, the overhead of each Adiantum key is |
| greater than that of an AES-256-XTS key. |
| |
| Therefore, to improve performance and save memory, for Adiantum a |
| "direct key" configuration is supported. When the user has enabled |
| this by setting FSCRYPT_POLICY_FLAG_DIRECT_KEY in the fscrypt policy, |
| per-file encryption keys are not used. Instead, whenever any data |
| (contents or filenames) is encrypted, the file's 16-byte nonce is |
| included in the IV. Moreover: |
| |
| - For v1 encryption policies, the encryption is done directly with the |
| master key. Because of this, users **must not** use the same master |
| key for any other purpose, even for other v1 policies. |
| |
| - For v2 encryption policies, the encryption is done with a per-mode |
| key derived using the KDF. Users may use the same master key for |
| other v2 encryption policies. |
| |
| IV_INO_LBLK_64 policies |
| ----------------------- |
| |
| When FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64 is set in the fscrypt policy, |
| the encryption keys are derived from the master key, encryption mode |
| number, and filesystem UUID. This normally results in all files |
| protected by the same master key sharing a single contents encryption |
| key and a single filenames encryption key. To still encrypt different |
| files' data differently, inode numbers are included in the IVs. |
| Consequently, shrinking the filesystem may not be allowed. |
| |
| This format is optimized for use with inline encryption hardware |
| compliant with the UFS standard, which supports only 64 IV bits per |
| I/O request and may have only a small number of keyslots. |
| |
| IV_INO_LBLK_32 policies |
| ----------------------- |
| |
| IV_INO_LBLK_32 policies work like IV_INO_LBLK_64, except that for |
| IV_INO_LBLK_32, the inode number is hashed with SipHash-2-4 (where the |
| SipHash key is derived from the master key) and added to the file data |
| unit index mod 2^32 to produce a 32-bit IV. |
| |
| This format is optimized for use with inline encryption hardware |
| compliant with the eMMC v5.2 standard, which supports only 32 IV bits |
| per I/O request and may have only a small number of keyslots. This |
| format results in some level of IV reuse, so it should only be used |
| when necessary due to hardware limitations. |
| |
| Key identifiers |
| --------------- |
| |
| For master keys used for v2 encryption policies, a unique 16-byte "key |
| identifier" is also derived using the KDF. This value is stored in |
| the clear, since it is needed to reliably identify the key itself. |
| |
| Dirhash keys |
| ------------ |
| |
| For directories that are indexed using a secret-keyed dirhash over the |
| plaintext filenames, the KDF is also used to derive a 128-bit |
| SipHash-2-4 key per directory in order to hash filenames. This works |
| just like deriving a per-file encryption key, except that a different |
| KDF context is used. Currently, only casefolded ("case-insensitive") |
| encrypted directories use this style of hashing. |
| |
| Encryption modes and usage |
| ========================== |
| |
| fscrypt allows one encryption mode to be specified for file contents |
| and one encryption mode to be specified for filenames. Different |
| directory trees are permitted to use different encryption modes. |
| |
| Supported modes |
| --------------- |
| |
| Currently, the following pairs of encryption modes are supported: |
| |
| - AES-256-XTS for contents and AES-256-CTS-CBC for filenames |
| - AES-256-XTS for contents and AES-256-HCTR2 for filenames |
| - Adiantum for both contents and filenames |
| - AES-128-CBC-ESSIV for contents and AES-128-CTS-CBC for filenames |
| - SM4-XTS for contents and SM4-CTS-CBC for filenames |
| |
| Authenticated encryption modes are not currently supported because of |
| the difficulty of dealing with ciphertext expansion. Therefore, |
| contents encryption uses a block cipher in `XTS mode |
| <https://en.wikipedia.org/wiki/Disk_encryption_theory#XTS>`_ or |
| `CBC-ESSIV mode |
| <https://en.wikipedia.org/wiki/Disk_encryption_theory#Encrypted_salt-sector_initialization_vector_(ESSIV)>`_, |
| or a wide-block cipher. Filenames encryption uses a |
| block cipher in `CTS-CBC mode |
| <https://en.wikipedia.org/wiki/Ciphertext_stealing>`_ or a wide-block |
| cipher. |
| |
| The (AES-256-XTS, AES-256-CTS-CBC) pair is the recommended default. |
| It is also the only option that is *guaranteed* to always be supported |
| if the kernel supports fscrypt at all; see `Kernel config options`_. |
| |
| The (AES-256-XTS, AES-256-HCTR2) pair is also a good choice that |
| upgrades the filenames encryption to use a wide-block cipher. (A |
| *wide-block cipher*, also called a tweakable super-pseudorandom |
| permutation, has the property that changing one bit scrambles the |
| entire result.) As described in `Filenames encryption`_, a wide-block |
| cipher is the ideal mode for the problem domain, though CTS-CBC is the |
| "least bad" choice among the alternatives. For more information about |
| HCTR2, see `the HCTR2 paper <https://eprint.iacr.org/2021/1441.pdf>`_. |
| |
| Adiantum is recommended on systems where AES is too slow due to lack |
| of hardware acceleration for AES. Adiantum is a wide-block cipher |
| that uses XChaCha12 and AES-256 as its underlying components. Most of |
| the work is done by XChaCha12, which is much faster than AES when AES |
| acceleration is unavailable. For more information about Adiantum, see |
| `the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_. |
| |
| The (AES-128-CBC-ESSIV, AES-128-CTS-CBC) pair exists only to support |
| systems whose only form of AES acceleration is an off-CPU crypto |
| accelerator such as CAAM or CESA that does not support XTS. |
| |
| The remaining mode pairs are the "national pride ciphers": |
| |
| - (SM4-XTS, SM4-CTS-CBC) |
| |
| Generally speaking, these ciphers aren't "bad" per se, but they |
| receive limited security review compared to the usual choices such as |
| AES and ChaCha. They also don't bring much new to the table. It is |
| suggested to only use these ciphers where their use is mandated. |
| |
| Kernel config options |
| --------------------- |
| |
| Enabling fscrypt support (CONFIG_FS_ENCRYPTION) automatically pulls in |
| only the basic support from the crypto API needed to use AES-256-XTS |
| and AES-256-CTS-CBC encryption. For optimal performance, it is |
| strongly recommended to also enable any available platform-specific |
| kconfig options that provide acceleration for the algorithm(s) you |
| wish to use. Support for any "non-default" encryption modes typically |
| requires extra kconfig options as well. |
| |
| Below, some relevant options are listed by encryption mode. Note, |
| acceleration options not listed below may be available for your |
| platform; refer to the kconfig menus. File contents encryption can |
| also be configured to use inline encryption hardware instead of the |
| kernel crypto API (see `Inline encryption support`_); in that case, |
| the file contents mode doesn't need to supported in the kernel crypto |
| API, but the filenames mode still does. |
| |
| - AES-256-XTS and AES-256-CTS-CBC |
| - Recommended: |
| - arm64: CONFIG_CRYPTO_AES_ARM64_CE_BLK |
| - x86: CONFIG_CRYPTO_AES_NI_INTEL |
| |
| - AES-256-HCTR2 |
| - Mandatory: |
| - CONFIG_CRYPTO_HCTR2 |
| - Recommended: |
| - arm64: CONFIG_CRYPTO_AES_ARM64_CE_BLK |
| - arm64: CONFIG_CRYPTO_POLYVAL_ARM64_CE |
| - x86: CONFIG_CRYPTO_AES_NI_INTEL |
| - x86: CONFIG_CRYPTO_POLYVAL_CLMUL_NI |
| |
| - Adiantum |
| - Mandatory: |
| - CONFIG_CRYPTO_ADIANTUM |
| - Recommended: |
| - arm32: CONFIG_CRYPTO_CHACHA20_NEON |
| - arm32: CONFIG_CRYPTO_NHPOLY1305_NEON |
| - arm64: CONFIG_CRYPTO_CHACHA20_NEON |
| - arm64: CONFIG_CRYPTO_NHPOLY1305_NEON |
| - x86: CONFIG_CRYPTO_CHACHA20_X86_64 |
| - x86: CONFIG_CRYPTO_NHPOLY1305_SSE2 |
| - x86: CONFIG_CRYPTO_NHPOLY1305_AVX2 |
| |
| - AES-128-CBC-ESSIV and AES-128-CTS-CBC: |
| - Mandatory: |
| - CONFIG_CRYPTO_ESSIV |
| - CONFIG_CRYPTO_SHA256 or another SHA-256 implementation |
| - Recommended: |
| - AES-CBC acceleration |
| |
| fscrypt also uses HMAC-SHA512 for key derivation, so enabling SHA-512 |
| acceleration is recommended: |
| |
| - SHA-512 |
| - Recommended: |
| - arm64: CONFIG_CRYPTO_SHA512_ARM64_CE |
| - x86: CONFIG_CRYPTO_SHA512_SSSE3 |
| |
| Contents encryption |
| ------------------- |
| |
| For contents encryption, each file's contents is divided into "data |
| units". Each data unit is encrypted independently. The IV for each |
| data unit incorporates the zero-based index of the data unit within |
| the file. This ensures that each data unit within a file is encrypted |
| differently, which is essential to prevent leaking information. |
| |
| Note: the encryption depending on the offset into the file means that |
| operations like "collapse range" and "insert range" that rearrange the |
| extent mapping of files are not supported on encrypted files. |
| |
| There are two cases for the sizes of the data units: |
| |
| * Fixed-size data units. This is how all filesystems other than UBIFS |
| work. A file's data units are all the same size; the last data unit |
| is zero-padded if needed. By default, the data unit size is equal |
| to the filesystem block size. On some filesystems, users can select |
| a sub-block data unit size via the ``log2_data_unit_size`` field of |
| the encryption policy; see `FS_IOC_SET_ENCRYPTION_POLICY`_. |
| |
| * Variable-size data units. This is what UBIFS does. Each "UBIFS |
| data node" is treated as a crypto data unit. Each contains variable |
| length, possibly compressed data, zero-padded to the next 16-byte |
| boundary. Users cannot select a sub-block data unit size on UBIFS. |
| |
| In the case of compression + encryption, the compressed data is |
| encrypted. UBIFS compression works as described above. f2fs |
| compression works a bit differently; it compresses a number of |
| filesystem blocks into a smaller number of filesystem blocks. |
| Therefore a f2fs-compressed file still uses fixed-size data units, and |
| it is encrypted in a similar way to a file containing holes. |
| |
| As mentioned in `Key hierarchy`_, the default encryption setting uses |
| per-file keys. In this case, the IV for each data unit is simply the |
| index of the data unit in the file. However, users can select an |
| encryption setting that does not use per-file keys. For these, some |
| kind of file identifier is incorporated into the IVs as follows: |
| |
| - With `DIRECT_KEY policies`_, the data unit index is placed in bits |
| 0-63 of the IV, and the file's nonce is placed in bits 64-191. |
| |
| - With `IV_INO_LBLK_64 policies`_, the data unit index is placed in |
| bits 0-31 of the IV, and the file's inode number is placed in bits |
| 32-63. This setting is only allowed when data unit indices and |
| inode numbers fit in 32 bits. |
| |
| - With `IV_INO_LBLK_32 policies`_, the file's inode number is hashed |
| and added to the data unit index. The resulting value is truncated |
| to 32 bits and placed in bits 0-31 of the IV. This setting is only |
| allowed when data unit indices and inode numbers fit in 32 bits. |
| |
| The byte order of the IV is always little endian. |
| |
| If the user selects FSCRYPT_MODE_AES_128_CBC for the contents mode, an |
| ESSIV layer is automatically included. In this case, before the IV is |
| passed to AES-128-CBC, it is encrypted with AES-256 where the AES-256 |
| key is the SHA-256 hash of the file's contents encryption key. |
| |
| Filenames encryption |
| -------------------- |
| |
| For filenames, each full filename is encrypted at once. Because of |
| the requirements to retain support for efficient directory lookups and |
| filenames of up to 255 bytes, the same IV is used for every filename |
| in a directory. |
| |
| However, each encrypted directory still uses a unique key, or |
| alternatively has the file's nonce (for `DIRECT_KEY policies`_) or |
| inode number (for `IV_INO_LBLK_64 policies`_) included in the IVs. |
| Thus, IV reuse is limited to within a single directory. |
| |
| With CTS-CBC, the IV reuse means that when the plaintext filenames share a |
| common prefix at least as long as the cipher block size (16 bytes for AES), the |
| corresponding encrypted filenames will also share a common prefix. This is |
| undesirable. Adiantum and HCTR2 do not have this weakness, as they are |
| wide-block encryption modes. |
| |
| All supported filenames encryption modes accept any plaintext length |
| >= 16 bytes; cipher block alignment is not required. However, |
| filenames shorter than 16 bytes are NUL-padded to 16 bytes before |
| being encrypted. In addition, to reduce leakage of filename lengths |
| via their ciphertexts, all filenames are NUL-padded to the next 4, 8, |
| 16, or 32-byte boundary (configurable). 32 is recommended since this |
| provides the best confidentiality, at the cost of making directory |
| entries consume slightly more space. Note that since NUL (``\0``) is |
| not otherwise a valid character in filenames, the padding will never |
| produce duplicate plaintexts. |
| |
| Symbolic link targets are considered a type of filename and are |
| encrypted in the same way as filenames in directory entries, except |
| that IV reuse is not a problem as each symlink has its own inode. |
| |
| User API |
| ======== |
| |
| Setting an encryption policy |
| ---------------------------- |
| |
| FS_IOC_SET_ENCRYPTION_POLICY |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an |
| empty directory or verifies that a directory or regular file already |
| has the specified encryption policy. It takes in a pointer to |
| struct fscrypt_policy_v1 or struct fscrypt_policy_v2, defined as |
| follows:: |
| |
| #define FSCRYPT_POLICY_V1 0 |
| #define FSCRYPT_KEY_DESCRIPTOR_SIZE 8 |
| struct fscrypt_policy_v1 { |
| __u8 version; |
| __u8 contents_encryption_mode; |
| __u8 filenames_encryption_mode; |
| __u8 flags; |
| __u8 master_key_descriptor[FSCRYPT_KEY_DESCRIPTOR_SIZE]; |
| }; |
| #define fscrypt_policy fscrypt_policy_v1 |
| |
| #define FSCRYPT_POLICY_V2 2 |
| #define FSCRYPT_KEY_IDENTIFIER_SIZE 16 |
| struct fscrypt_policy_v2 { |
| __u8 version; |
| __u8 contents_encryption_mode; |
| __u8 filenames_encryption_mode; |
| __u8 flags; |
| __u8 log2_data_unit_size; |
| __u8 __reserved[3]; |
| __u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE]; |
| }; |
| |
| This structure must be initialized as follows: |
| |
| - ``version`` must be FSCRYPT_POLICY_V1 (0) if |
| struct fscrypt_policy_v1 is used or FSCRYPT_POLICY_V2 (2) if |
| struct fscrypt_policy_v2 is used. (Note: we refer to the original |
| policy version as "v1", though its version code is really 0.) |
| For new encrypted directories, use v2 policies. |
| |
| - ``contents_encryption_mode`` and ``filenames_encryption_mode`` must |
| be set to constants from ``<linux/fscrypt.h>`` which identify the |
| encryption modes to use. If unsure, use FSCRYPT_MODE_AES_256_XTS |
| (1) for ``contents_encryption_mode`` and FSCRYPT_MODE_AES_256_CTS |
| (4) for ``filenames_encryption_mode``. For details, see `Encryption |
| modes and usage`_. |
| |
| v1 encryption policies only support three combinations of modes: |
| (FSCRYPT_MODE_AES_256_XTS, FSCRYPT_MODE_AES_256_CTS), |
| (FSCRYPT_MODE_AES_128_CBC, FSCRYPT_MODE_AES_128_CTS), and |
| (FSCRYPT_MODE_ADIANTUM, FSCRYPT_MODE_ADIANTUM). v2 policies support |
| all combinations documented in `Supported modes`_. |
| |
| - ``flags`` contains optional flags from ``<linux/fscrypt.h>``: |
| |
| - FSCRYPT_POLICY_FLAGS_PAD_*: The amount of NUL padding to use when |
| encrypting filenames. If unsure, use FSCRYPT_POLICY_FLAGS_PAD_32 |
| (0x3). |
| - FSCRYPT_POLICY_FLAG_DIRECT_KEY: See `DIRECT_KEY policies`_. |
| - FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64: See `IV_INO_LBLK_64 |
| policies`_. |
| - FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32: See `IV_INO_LBLK_32 |
| policies`_. |
| |
| v1 encryption policies only support the PAD_* and DIRECT_KEY flags. |
| The other flags are only supported by v2 encryption policies. |
| |
| The DIRECT_KEY, IV_INO_LBLK_64, and IV_INO_LBLK_32 flags are |
| mutually exclusive. |
| |
| - ``log2_data_unit_size`` is the log2 of the data unit size in bytes, |
| or 0 to select the default data unit size. The data unit size is |
| the granularity of file contents encryption. For example, setting |
| ``log2_data_unit_size`` to 12 causes file contents be passed to the |
| underlying encryption algorithm (such as AES-256-XTS) in 4096-byte |
| data units, each with its own IV. |
| |
| Not all filesystems support setting ``log2_data_unit_size``. ext4 |
| and f2fs support it since Linux v6.7. On filesystems that support |
| it, the supported nonzero values are 9 through the log2 of the |
| filesystem block size, inclusively. The default value of 0 selects |
| the filesystem block size. |
| |
| The main use case for ``log2_data_unit_size`` is for selecting a |
| data unit size smaller than the filesystem block size for |
| compatibility with inline encryption hardware that only supports |
| smaller data unit sizes. ``/sys/block/$disk/queue/crypto/`` may be |
| useful for checking which data unit sizes are supported by a |
| particular system's inline encryption hardware. |
| |
| Leave this field zeroed unless you are certain you need it. Using |
| an unnecessarily small data unit size reduces performance. |
| |
| - For v2 encryption policies, ``__reserved`` must be zeroed. |
| |
| - For v1 encryption policies, ``master_key_descriptor`` specifies how |
| to find the master key in a keyring; see `Adding keys`_. It is up |
| to userspace to choose a unique ``master_key_descriptor`` for each |
| master key. The e4crypt and fscrypt tools use the first 8 bytes of |
| ``SHA-512(SHA-512(master_key))``, but this particular scheme is not |
| required. Also, the master key need not be in the keyring yet when |
| FS_IOC_SET_ENCRYPTION_POLICY is executed. However, it must be added |
| before any files can be created in the encrypted directory. |
| |
| For v2 encryption policies, ``master_key_descriptor`` has been |
| replaced with ``master_key_identifier``, which is longer and cannot |
| be arbitrarily chosen. Instead, the key must first be added using |
| `FS_IOC_ADD_ENCRYPTION_KEY`_. Then, the ``key_spec.u.identifier`` |
| the kernel returned in the struct fscrypt_add_key_arg must |
| be used as the ``master_key_identifier`` in |
| struct fscrypt_policy_v2. |
| |
| If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY |
| verifies that the file is an empty directory. If so, the specified |
| encryption policy is assigned to the directory, turning it into an |
| encrypted directory. After that, and after providing the |
| corresponding master key as described in `Adding keys`_, all regular |
| files, directories (recursively), and symlinks created in the |
| directory will be encrypted, inheriting the same encryption policy. |
| The filenames in the directory's entries will be encrypted as well. |
| |
| Alternatively, if the file is already encrypted, then |
| FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption |
| policy exactly matches the actual one. If they match, then the ioctl |
| returns 0. Otherwise, it fails with EEXIST. This works on both |
| regular files and directories, including nonempty directories. |
| |
| When a v2 encryption policy is assigned to a directory, it is also |
| required that either the specified key has been added by the current |
| user or that the caller has CAP_FOWNER in the initial user namespace. |
| (This is needed to prevent a user from encrypting their data with |
| another user's key.) The key must remain added while |
| FS_IOC_SET_ENCRYPTION_POLICY is executing. However, if the new |
| encrypted directory does not need to be accessed immediately, then the |
| key can be removed right away afterwards. |
| |
| Note that the ext4 filesystem does not allow the root directory to be |
| encrypted, even if it is empty. Users who want to encrypt an entire |
| filesystem with one key should consider using dm-crypt instead. |
| |
| FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors: |
| |
| - ``EACCES``: the file is not owned by the process's uid, nor does the |
| process have the CAP_FOWNER capability in a namespace with the file |
| owner's uid mapped |
| - ``EEXIST``: the file is already encrypted with an encryption policy |
| different from the one specified |
| - ``EINVAL``: an invalid encryption policy was specified (invalid |
| version, mode(s), or flags; or reserved bits were set); or a v1 |
| encryption policy was specified but the directory has the casefold |
| flag enabled (casefolding is incompatible with v1 policies). |
| - ``ENOKEY``: a v2 encryption policy was specified, but the key with |
| the specified ``master_key_identifier`` has not been added, nor does |
| the process have the CAP_FOWNER capability in the initial user |
| namespace |
| - ``ENOTDIR``: the file is unencrypted and is a regular file, not a |
| directory |
| - ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory |
| - ``ENOTTY``: this type of filesystem does not implement encryption |
| - ``EOPNOTSUPP``: the kernel was not configured with encryption |
| support for filesystems, or the filesystem superblock has not |
| had encryption enabled on it. (For example, to use encryption on an |
| ext4 filesystem, CONFIG_FS_ENCRYPTION must be enabled in the |
| kernel config, and the superblock must have had the "encrypt" |
| feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O |
| encrypt``.) |
| - ``EPERM``: this directory may not be encrypted, e.g. because it is |
| the root directory of an ext4 filesystem |
| - ``EROFS``: the filesystem is readonly |
| |
| Getting an encryption policy |
| ---------------------------- |
| |
| Two ioctls are available to get a file's encryption policy: |
| |
| - `FS_IOC_GET_ENCRYPTION_POLICY_EX`_ |
| - `FS_IOC_GET_ENCRYPTION_POLICY`_ |
| |
| The extended (_EX) version of the ioctl is more general and is |
| recommended to use when possible. However, on older kernels only the |
| original ioctl is available. Applications should try the extended |
| version, and if it fails with ENOTTY fall back to the original |
| version. |
| |
| FS_IOC_GET_ENCRYPTION_POLICY_EX |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_GET_ENCRYPTION_POLICY_EX ioctl retrieves the encryption |
| policy, if any, for a directory or regular file. No additional |
| permissions are required beyond the ability to open the file. It |
| takes in a pointer to struct fscrypt_get_policy_ex_arg, |
| defined as follows:: |
| |
| struct fscrypt_get_policy_ex_arg { |
| __u64 policy_size; /* input/output */ |
| union { |
| __u8 version; |
| struct fscrypt_policy_v1 v1; |
| struct fscrypt_policy_v2 v2; |
| } policy; /* output */ |
| }; |
| |
| The caller must initialize ``policy_size`` to the size available for |
| the policy struct, i.e. ``sizeof(arg.policy)``. |
| |
| On success, the policy struct is returned in ``policy``, and its |
| actual size is returned in ``policy_size``. ``policy.version`` should |
| be checked to determine the version of policy returned. Note that the |
| version code for the "v1" policy is actually 0 (FSCRYPT_POLICY_V1). |
| |
| FS_IOC_GET_ENCRYPTION_POLICY_EX can fail with the following errors: |
| |
| - ``EINVAL``: the file is encrypted, but it uses an unrecognized |
| encryption policy version |
| - ``ENODATA``: the file is not encrypted |
| - ``ENOTTY``: this type of filesystem does not implement encryption, |
| or this kernel is too old to support FS_IOC_GET_ENCRYPTION_POLICY_EX |
| (try FS_IOC_GET_ENCRYPTION_POLICY instead) |
| - ``EOPNOTSUPP``: the kernel was not configured with encryption |
| support for this filesystem, or the filesystem superblock has not |
| had encryption enabled on it |
| - ``EOVERFLOW``: the file is encrypted and uses a recognized |
| encryption policy version, but the policy struct does not fit into |
| the provided buffer |
| |
| Note: if you only need to know whether a file is encrypted or not, on |
| most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl |
| and check for FS_ENCRYPT_FL, or to use the statx() system call and |
| check for STATX_ATTR_ENCRYPTED in stx_attributes. |
| |
| FS_IOC_GET_ENCRYPTION_POLICY |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_GET_ENCRYPTION_POLICY ioctl can also retrieve the |
| encryption policy, if any, for a directory or regular file. However, |
| unlike `FS_IOC_GET_ENCRYPTION_POLICY_EX`_, |
| FS_IOC_GET_ENCRYPTION_POLICY only supports the original policy |
| version. It takes in a pointer directly to struct fscrypt_policy_v1 |
| rather than struct fscrypt_get_policy_ex_arg. |
| |
| The error codes for FS_IOC_GET_ENCRYPTION_POLICY are the same as those |
| for FS_IOC_GET_ENCRYPTION_POLICY_EX, except that |
| FS_IOC_GET_ENCRYPTION_POLICY also returns ``EINVAL`` if the file is |
| encrypted using a newer encryption policy version. |
| |
| Getting the per-filesystem salt |
| ------------------------------- |
| |
| Some filesystems, such as ext4 and F2FS, also support the deprecated |
| ioctl FS_IOC_GET_ENCRYPTION_PWSALT. This ioctl retrieves a randomly |
| generated 16-byte value stored in the filesystem superblock. This |
| value is intended to used as a salt when deriving an encryption key |
| from a passphrase or other low-entropy user credential. |
| |
| FS_IOC_GET_ENCRYPTION_PWSALT is deprecated. Instead, prefer to |
| generate and manage any needed salt(s) in userspace. |
| |
| Getting a file's encryption nonce |
| --------------------------------- |
| |
| Since Linux v5.7, the ioctl FS_IOC_GET_ENCRYPTION_NONCE is supported. |
| On encrypted files and directories it gets the inode's 16-byte nonce. |
| On unencrypted files and directories, it fails with ENODATA. |
| |
| This ioctl can be useful for automated tests which verify that the |
| encryption is being done correctly. It is not needed for normal use |
| of fscrypt. |
| |
| Adding keys |
| ----------- |
| |
| FS_IOC_ADD_ENCRYPTION_KEY |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_ADD_ENCRYPTION_KEY ioctl adds a master encryption key to |
| the filesystem, making all files on the filesystem which were |
| encrypted using that key appear "unlocked", i.e. in plaintext form. |
| It can be executed on any file or directory on the target filesystem, |
| but using the filesystem's root directory is recommended. It takes in |
| a pointer to struct fscrypt_add_key_arg, defined as follows:: |
| |
| struct fscrypt_add_key_arg { |
| struct fscrypt_key_specifier key_spec; |
| __u32 raw_size; |
| __u32 key_id; |
| __u32 __reserved[8]; |
| __u8 raw[]; |
| }; |
| |
| #define FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR 1 |
| #define FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER 2 |
| |
| struct fscrypt_key_specifier { |
| __u32 type; /* one of FSCRYPT_KEY_SPEC_TYPE_* */ |
| __u32 __reserved; |
| union { |
| __u8 __reserved[32]; /* reserve some extra space */ |
| __u8 descriptor[FSCRYPT_KEY_DESCRIPTOR_SIZE]; |
| __u8 identifier[FSCRYPT_KEY_IDENTIFIER_SIZE]; |
| } u; |
| }; |
| |
| struct fscrypt_provisioning_key_payload { |
| __u32 type; |
| __u32 __reserved; |
| __u8 raw[]; |
| }; |
| |
| struct fscrypt_add_key_arg must be zeroed, then initialized |
| as follows: |
| |
| - If the key is being added for use by v1 encryption policies, then |
| ``key_spec.type`` must contain FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR, and |
| ``key_spec.u.descriptor`` must contain the descriptor of the key |
| being added, corresponding to the value in the |
| ``master_key_descriptor`` field of struct fscrypt_policy_v1. |
| To add this type of key, the calling process must have the |
| CAP_SYS_ADMIN capability in the initial user namespace. |
| |
| Alternatively, if the key is being added for use by v2 encryption |
| policies, then ``key_spec.type`` must contain |
| FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER, and ``key_spec.u.identifier`` is |
| an *output* field which the kernel fills in with a cryptographic |
| hash of the key. To add this type of key, the calling process does |
| not need any privileges. However, the number of keys that can be |
| added is limited by the user's quota for the keyrings service (see |
| ``Documentation/security/keys/core.rst``). |
| |
| - ``raw_size`` must be the size of the ``raw`` key provided, in bytes. |
| Alternatively, if ``key_id`` is nonzero, this field must be 0, since |
| in that case the size is implied by the specified Linux keyring key. |
| |
| - ``key_id`` is 0 if the raw key is given directly in the ``raw`` |
| field. Otherwise ``key_id`` is the ID of a Linux keyring key of |
| type "fscrypt-provisioning" whose payload is |
| struct fscrypt_provisioning_key_payload whose ``raw`` field contains |
| the raw key and whose ``type`` field matches ``key_spec.type``. |
| Since ``raw`` is variable-length, the total size of this key's |
| payload must be ``sizeof(struct fscrypt_provisioning_key_payload)`` |
| plus the raw key size. The process must have Search permission on |
| this key. |
| |
| Most users should leave this 0 and specify the raw key directly. |
| The support for specifying a Linux keyring key is intended mainly to |
| allow re-adding keys after a filesystem is unmounted and re-mounted, |
| without having to store the raw keys in userspace memory. |
| |
| - ``raw`` is a variable-length field which must contain the actual |
| key, ``raw_size`` bytes long. Alternatively, if ``key_id`` is |
| nonzero, then this field is unused. |
| |
| For v2 policy keys, the kernel keeps track of which user (identified |
| by effective user ID) added the key, and only allows the key to be |
| removed by that user --- or by "root", if they use |
| `FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS`_. |
| |
| However, if another user has added the key, it may be desirable to |
| prevent that other user from unexpectedly removing it. Therefore, |
| FS_IOC_ADD_ENCRYPTION_KEY may also be used to add a v2 policy key |
| *again*, even if it's already added by other user(s). In this case, |
| FS_IOC_ADD_ENCRYPTION_KEY will just install a claim to the key for the |
| current user, rather than actually add the key again (but the raw key |
| must still be provided, as a proof of knowledge). |
| |
| FS_IOC_ADD_ENCRYPTION_KEY returns 0 if either the key or a claim to |
| the key was either added or already exists. |
| |
| FS_IOC_ADD_ENCRYPTION_KEY can fail with the following errors: |
| |
| - ``EACCES``: FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR was specified, but the |
| caller does not have the CAP_SYS_ADMIN capability in the initial |
| user namespace; or the raw key was specified by Linux key ID but the |
| process lacks Search permission on the key. |
| - ``EDQUOT``: the key quota for this user would be exceeded by adding |
| the key |
| - ``EINVAL``: invalid key size or key specifier type, or reserved bits |
| were set |
| - ``EKEYREJECTED``: the raw key was specified by Linux key ID, but the |
| key has the wrong type |
| - ``ENOKEY``: the raw key was specified by Linux key ID, but no key |
| exists with that ID |
| - ``ENOTTY``: this type of filesystem does not implement encryption |
| - ``EOPNOTSUPP``: the kernel was not configured with encryption |
| support for this filesystem, or the filesystem superblock has not |
| had encryption enabled on it |
| |
| Legacy method |
| ~~~~~~~~~~~~~ |
| |
| For v1 encryption policies, a master encryption key can also be |
| provided by adding it to a process-subscribed keyring, e.g. to a |
| session keyring, or to a user keyring if the user keyring is linked |
| into the session keyring. |
| |
| This method is deprecated (and not supported for v2 encryption |
| policies) for several reasons. First, it cannot be used in |
| combination with FS_IOC_REMOVE_ENCRYPTION_KEY (see `Removing keys`_), |
| so for removing a key a workaround such as keyctl_unlink() in |
| combination with ``sync; echo 2 > /proc/sys/vm/drop_caches`` would |
| have to be used. Second, it doesn't match the fact that the |
| locked/unlocked status of encrypted files (i.e. whether they appear to |
| be in plaintext form or in ciphertext form) is global. This mismatch |
| has caused much confusion as well as real problems when processes |
| running under different UIDs, such as a ``sudo`` command, need to |
| access encrypted files. |
| |
| Nevertheless, to add a key to one of the process-subscribed keyrings, |
| the add_key() system call can be used (see: |
| ``Documentation/security/keys/core.rst``). The key type must be |
| "logon"; keys of this type are kept in kernel memory and cannot be |
| read back by userspace. The key description must be "fscrypt:" |
| followed by the 16-character lower case hex representation of the |
| ``master_key_descriptor`` that was set in the encryption policy. The |
| key payload must conform to the following structure:: |
| |
| #define FSCRYPT_MAX_KEY_SIZE 64 |
| |
| struct fscrypt_key { |
| __u32 mode; |
| __u8 raw[FSCRYPT_MAX_KEY_SIZE]; |
| __u32 size; |
| }; |
| |
| ``mode`` is ignored; just set it to 0. The actual key is provided in |
| ``raw`` with ``size`` indicating its size in bytes. That is, the |
| bytes ``raw[0..size-1]`` (inclusive) are the actual key. |
| |
| The key description prefix "fscrypt:" may alternatively be replaced |
| with a filesystem-specific prefix such as "ext4:". However, the |
| filesystem-specific prefixes are deprecated and should not be used in |
| new programs. |
| |
| Removing keys |
| ------------- |
| |
| Two ioctls are available for removing a key that was added by |
| `FS_IOC_ADD_ENCRYPTION_KEY`_: |
| |
| - `FS_IOC_REMOVE_ENCRYPTION_KEY`_ |
| - `FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS`_ |
| |
| These two ioctls differ only in cases where v2 policy keys are added |
| or removed by non-root users. |
| |
| These ioctls don't work on keys that were added via the legacy |
| process-subscribed keyrings mechanism. |
| |
| Before using these ioctls, read the `Kernel memory compromise`_ |
| section for a discussion of the security goals and limitations of |
| these ioctls. |
| |
| FS_IOC_REMOVE_ENCRYPTION_KEY |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_REMOVE_ENCRYPTION_KEY ioctl removes a claim to a master |
| encryption key from the filesystem, and possibly removes the key |
| itself. It can be executed on any file or directory on the target |
| filesystem, but using the filesystem's root directory is recommended. |
| It takes in a pointer to struct fscrypt_remove_key_arg, defined |
| as follows:: |
| |
| struct fscrypt_remove_key_arg { |
| struct fscrypt_key_specifier key_spec; |
| #define FSCRYPT_KEY_REMOVAL_STATUS_FLAG_FILES_BUSY 0x00000001 |
| #define FSCRYPT_KEY_REMOVAL_STATUS_FLAG_OTHER_USERS 0x00000002 |
| __u32 removal_status_flags; /* output */ |
| __u32 __reserved[5]; |
| }; |
| |
| This structure must be zeroed, then initialized as follows: |
| |
| - The key to remove is specified by ``key_spec``: |
| |
| - To remove a key used by v1 encryption policies, set |
| ``key_spec.type`` to FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR and fill |
| in ``key_spec.u.descriptor``. To remove this type of key, the |
| calling process must have the CAP_SYS_ADMIN capability in the |
| initial user namespace. |
| |
| - To remove a key used by v2 encryption policies, set |
| ``key_spec.type`` to FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER and fill |
| in ``key_spec.u.identifier``. |
| |
| For v2 policy keys, this ioctl is usable by non-root users. However, |
| to make this possible, it actually just removes the current user's |
| claim to the key, undoing a single call to FS_IOC_ADD_ENCRYPTION_KEY. |
| Only after all claims are removed is the key really removed. |
| |
| For example, if FS_IOC_ADD_ENCRYPTION_KEY was called with uid 1000, |
| then the key will be "claimed" by uid 1000, and |
| FS_IOC_REMOVE_ENCRYPTION_KEY will only succeed as uid 1000. Or, if |
| both uids 1000 and 2000 added the key, then for each uid |
| FS_IOC_REMOVE_ENCRYPTION_KEY will only remove their own claim. Only |
| once *both* are removed is the key really removed. (Think of it like |
| unlinking a file that may have hard links.) |
| |
| If FS_IOC_REMOVE_ENCRYPTION_KEY really removes the key, it will also |
| try to "lock" all files that had been unlocked with the key. It won't |
| lock files that are still in-use, so this ioctl is expected to be used |
| in cooperation with userspace ensuring that none of the files are |
| still open. However, if necessary, this ioctl can be executed again |
| later to retry locking any remaining files. |
| |
| FS_IOC_REMOVE_ENCRYPTION_KEY returns 0 if either the key was removed |
| (but may still have files remaining to be locked), the user's claim to |
| the key was removed, or the key was already removed but had files |
| remaining to be the locked so the ioctl retried locking them. In any |
| of these cases, ``removal_status_flags`` is filled in with the |
| following informational status flags: |
| |
| - ``FSCRYPT_KEY_REMOVAL_STATUS_FLAG_FILES_BUSY``: set if some file(s) |
| are still in-use. Not guaranteed to be set in the case where only |
| the user's claim to the key was removed. |
| - ``FSCRYPT_KEY_REMOVAL_STATUS_FLAG_OTHER_USERS``: set if only the |
| user's claim to the key was removed, not the key itself |
| |
| FS_IOC_REMOVE_ENCRYPTION_KEY can fail with the following errors: |
| |
| - ``EACCES``: The FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR key specifier type |
| was specified, but the caller does not have the CAP_SYS_ADMIN |
| capability in the initial user namespace |
| - ``EINVAL``: invalid key specifier type, or reserved bits were set |
| - ``ENOKEY``: the key object was not found at all, i.e. it was never |
| added in the first place or was already fully removed including all |
| files locked; or, the user does not have a claim to the key (but |
| someone else does). |
| - ``ENOTTY``: this type of filesystem does not implement encryption |
| - ``EOPNOTSUPP``: the kernel was not configured with encryption |
| support for this filesystem, or the filesystem superblock has not |
| had encryption enabled on it |
| |
| FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS is exactly the same as |
| `FS_IOC_REMOVE_ENCRYPTION_KEY`_, except that for v2 policy keys, the |
| ALL_USERS version of the ioctl will remove all users' claims to the |
| key, not just the current user's. I.e., the key itself will always be |
| removed, no matter how many users have added it. This difference is |
| only meaningful if non-root users are adding and removing keys. |
| |
| Because of this, FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS also requires |
| "root", namely the CAP_SYS_ADMIN capability in the initial user |
| namespace. Otherwise it will fail with EACCES. |
| |
| Getting key status |
| ------------------ |
| |
| FS_IOC_GET_ENCRYPTION_KEY_STATUS |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| The FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl retrieves the status of a |
| master encryption key. It can be executed on any file or directory on |
| the target filesystem, but using the filesystem's root directory is |
| recommended. It takes in a pointer to |
| struct fscrypt_get_key_status_arg, defined as follows:: |
| |
| struct fscrypt_get_key_status_arg { |
| /* input */ |
| struct fscrypt_key_specifier key_spec; |
| __u32 __reserved[6]; |
| |
| /* output */ |
| #define FSCRYPT_KEY_STATUS_ABSENT 1 |
| #define FSCRYPT_KEY_STATUS_PRESENT 2 |
| #define FSCRYPT_KEY_STATUS_INCOMPLETELY_REMOVED 3 |
| __u32 status; |
| #define FSCRYPT_KEY_STATUS_FLAG_ADDED_BY_SELF 0x00000001 |
| __u32 status_flags; |
| __u32 user_count; |
| __u32 __out_reserved[13]; |
| }; |
| |
| The caller must zero all input fields, then fill in ``key_spec``: |
| |
| - To get the status of a key for v1 encryption policies, set |
| ``key_spec.type`` to FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR and fill |
| in ``key_spec.u.descriptor``. |
| |
| - To get the status of a key for v2 encryption policies, set |
| ``key_spec.type`` to FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER and fill |
| in ``key_spec.u.identifier``. |
| |
| On success, 0 is returned and the kernel fills in the output fields: |
| |
| - ``status`` indicates whether the key is absent, present, or |
| incompletely removed. Incompletely removed means that removal has |
| been initiated, but some files are still in use; i.e., |
| `FS_IOC_REMOVE_ENCRYPTION_KEY`_ returned 0 but set the informational |
| status flag FSCRYPT_KEY_REMOVAL_STATUS_FLAG_FILES_BUSY. |
| |
| - ``status_flags`` can contain the following flags: |
| |
| - ``FSCRYPT_KEY_STATUS_FLAG_ADDED_BY_SELF`` indicates that the key |
| has added by the current user. This is only set for keys |
| identified by ``identifier`` rather than by ``descriptor``. |
| |
| - ``user_count`` specifies the number of users who have added the key. |
| This is only set for keys identified by ``identifier`` rather than |
| by ``descriptor``. |
| |
| FS_IOC_GET_ENCRYPTION_KEY_STATUS can fail with the following errors: |
| |
| - ``EINVAL``: invalid key specifier type, or reserved bits were set |
| - ``ENOTTY``: this type of filesystem does not implement encryption |
| - ``EOPNOTSUPP``: the kernel was not configured with encryption |
| support for this filesystem, or the filesystem superblock has not |
| had encryption enabled on it |
| |
| Among other use cases, FS_IOC_GET_ENCRYPTION_KEY_STATUS can be useful |
| for determining whether the key for a given encrypted directory needs |
| to be added before prompting the user for the passphrase needed to |
| derive the key. |
| |
| FS_IOC_GET_ENCRYPTION_KEY_STATUS can only get the status of keys in |
| the filesystem-level keyring, i.e. the keyring managed by |
| `FS_IOC_ADD_ENCRYPTION_KEY`_ and `FS_IOC_REMOVE_ENCRYPTION_KEY`_. It |
| cannot get the status of a key that has only been added for use by v1 |
| encryption policies using the legacy mechanism involving |
| process-subscribed keyrings. |
| |
| Access semantics |
| ================ |
| |
| With the key |
| ------------ |
| |
| With the encryption key, encrypted regular files, directories, and |
| symlinks behave very similarly to their unencrypted counterparts --- |
| after all, the encryption is intended to be transparent. However, |
| astute users may notice some differences in behavior: |
| |
| - Unencrypted files, or files encrypted with a different encryption |
| policy (i.e. different key, modes, or flags), cannot be renamed or |
| linked into an encrypted directory; see `Encryption policy |
| enforcement`_. Attempts to do so will fail with EXDEV. However, |
| encrypted files can be renamed within an encrypted directory, or |
| into an unencrypted directory. |
| |
| Note: "moving" an unencrypted file into an encrypted directory, e.g. |
| with the `mv` program, is implemented in userspace by a copy |
| followed by a delete. Be aware that the original unencrypted data |
| may remain recoverable from free space on the disk; prefer to keep |
| all files encrypted from the very beginning. The `shred` program |
| may be used to overwrite the source files but isn't guaranteed to be |
| effective on all filesystems and storage devices. |
| |
| - Direct I/O is supported on encrypted files only under some |
| circumstances. For details, see `Direct I/O support`_. |
| |
| - The fallocate operations FALLOC_FL_COLLAPSE_RANGE and |
| FALLOC_FL_INSERT_RANGE are not supported on encrypted files and will |
| fail with EOPNOTSUPP. |
| |
| - Online defragmentation of encrypted files is not supported. The |
| EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with |
| EOPNOTSUPP. |
| |
| - The ext4 filesystem does not support data journaling with encrypted |
| regular files. It will fall back to ordered data mode instead. |
| |
| - DAX (Direct Access) is not supported on encrypted files. |
| |
| - The maximum length of an encrypted symlink is 2 bytes shorter than |
| the maximum length of an unencrypted symlink. For example, on an |
| EXT4 filesystem with a 4K block size, unencrypted symlinks can be up |
| to 4095 bytes long, while encrypted symlinks can only be up to 4093 |
| bytes long (both lengths excluding the terminating null). |
| |
| Note that mmap *is* supported. This is possible because the pagecache |
| for an encrypted file contains the plaintext, not the ciphertext. |
| |
| Without the key |
| --------------- |
| |
| Some filesystem operations may be performed on encrypted regular |
| files, directories, and symlinks even before their encryption key has |
| been added, or after their encryption key has been removed: |
| |
| - File metadata may be read, e.g. using stat(). |
| |
| - Directories may be listed, in which case the filenames will be |
| listed in an encoded form derived from their ciphertext. The |
| current encoding algorithm is described in `Filename hashing and |
| encoding`_. The algorithm is subject to change, but it is |
| guaranteed that the presented filenames will be no longer than |
| NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and |
| will uniquely identify directory entries. |
| |
| The ``.`` and ``..`` directory entries are special. They are always |
| present and are not encrypted or encoded. |
| |
| - Files may be deleted. That is, nondirectory files may be deleted |
| with unlink() as usual, and empty directories may be deleted with |
| rmdir() as usual. Therefore, ``rm`` and ``rm -r`` will work as |
| expected. |
| |
| - Symlink targets may be read and followed, but they will be presented |
| in encrypted form, similar to filenames in directories. Hence, they |
| are unlikely to point to anywhere useful. |
| |
| Without the key, regular files cannot be opened or truncated. |
| Attempts to do so will fail with ENOKEY. This implies that any |
| regular file operations that require a file descriptor, such as |
| read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden. |
| |
| Also without the key, files of any type (including directories) cannot |
| be created or linked into an encrypted directory, nor can a name in an |
| encrypted directory be the source or target of a rename, nor can an |
| O_TMPFILE temporary file be created in an encrypted directory. All |
| such operations will fail with ENOKEY. |
| |
| It is not currently possible to backup and restore encrypted files |
| without the encryption key. This would require special APIs which |
| have not yet been implemented. |
| |
| Encryption policy enforcement |
| ============================= |
| |
| After an encryption policy has been set on a directory, all regular |
| files, directories, and symbolic links created in that directory |
| (recursively) will inherit that encryption policy. Special files --- |
| that is, named pipes, device nodes, and UNIX domain sockets --- will |
| not be encrypted. |
| |
| Except for those special files, it is forbidden to have unencrypted |
| files, or files encrypted with a different encryption policy, in an |
| encrypted directory tree. Attempts to link or rename such a file into |
| an encrypted directory will fail with EXDEV. This is also enforced |
| during ->lookup() to provide limited protection against offline |
| attacks that try to disable or downgrade encryption in known locations |
| where applications may later write sensitive data. It is recommended |
| that systems implementing a form of "verified boot" take advantage of |
| this by validating all top-level encryption policies prior to access. |
| |
| Inline encryption support |
| ========================= |
| |
| By default, fscrypt uses the kernel crypto API for all cryptographic |
| operations (other than HKDF, which fscrypt partially implements |
| itself). The kernel crypto API supports hardware crypto accelerators, |
| but only ones that work in the traditional way where all inputs and |
| outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can |
| take advantage of such hardware, but the traditional acceleration |
| model isn't particularly efficient and fscrypt hasn't been optimized |
| for it. |
| |
| Instead, many newer systems (especially mobile SoCs) have *inline |
| encryption hardware* that can encrypt/decrypt data while it is on its |
| way to/from the storage device. Linux supports inline encryption |
| through a set of extensions to the block layer called *blk-crypto*. |
| blk-crypto allows filesystems to attach encryption contexts to bios |
| (I/O requests) to specify how the data will be encrypted or decrypted |
| in-line. For more information about blk-crypto, see |
| :ref:`Documentation/block/inline-encryption.rst <inline_encryption>`. |
| |
| On supported filesystems (currently ext4 and f2fs), fscrypt can use |
| blk-crypto instead of the kernel crypto API to encrypt/decrypt file |
| contents. To enable this, set CONFIG_FS_ENCRYPTION_INLINE_CRYPT=y in |
| the kernel configuration, and specify the "inlinecrypt" mount option |
| when mounting the filesystem. |
| |
| Note that the "inlinecrypt" mount option just specifies to use inline |
| encryption when possible; it doesn't force its use. fscrypt will |
| still fall back to using the kernel crypto API on files where the |
| inline encryption hardware doesn't have the needed crypto capabilities |
| (e.g. support for the needed encryption algorithm and data unit size) |
| and where blk-crypto-fallback is unusable. (For blk-crypto-fallback |
| to be usable, it must be enabled in the kernel configuration with |
| CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK=y.) |
| |
| Currently fscrypt always uses the filesystem block size (which is |
| usually 4096 bytes) as the data unit size. Therefore, it can only use |
| inline encryption hardware that supports that data unit size. |
| |
| Inline encryption doesn't affect the ciphertext or other aspects of |
| the on-disk format, so users may freely switch back and forth between |
| using "inlinecrypt" and not using "inlinecrypt". |
| |
| Direct I/O support |
| ================== |
| |
| For direct I/O on an encrypted file to work, the following conditions |
| must be met (in addition to the conditions for direct I/O on an |
| unencrypted file): |
| |
| * The file must be using inline encryption. Usually this means that |
| the filesystem must be mounted with ``-o inlinecrypt`` and inline |
| encryption hardware must be present. However, a software fallback |
| is also available. For details, see `Inline encryption support`_. |
| |
| * The I/O request must be fully aligned to the filesystem block size. |
| This means that the file position the I/O is targeting, the lengths |
| of all I/O segments, and the memory addresses of all I/O buffers |
| must be multiples of this value. Note that the filesystem block |
| size may be greater than the logical block size of the block device. |
| |
| If either of the above conditions is not met, then direct I/O on the |
| encrypted file will fall back to buffered I/O. |
| |
| Implementation details |
| ====================== |
| |
| Encryption context |
| ------------------ |
| |
| An encryption policy is represented on-disk by |
| struct fscrypt_context_v1 or struct fscrypt_context_v2. It is up to |
| individual filesystems to decide where to store it, but normally it |
| would be stored in a hidden extended attribute. It should *not* be |
| exposed by the xattr-related system calls such as getxattr() and |
| setxattr() because of the special semantics of the encryption xattr. |
| (In particular, there would be much confusion if an encryption policy |
| were to be added to or removed from anything other than an empty |
| directory.) These structs are defined as follows:: |
| |
| #define FSCRYPT_FILE_NONCE_SIZE 16 |
| |
| #define FSCRYPT_KEY_DESCRIPTOR_SIZE 8 |
| struct fscrypt_context_v1 { |
| u8 version; |
| u8 contents_encryption_mode; |
| u8 filenames_encryption_mode; |
| u8 flags; |
| u8 master_key_descriptor[FSCRYPT_KEY_DESCRIPTOR_SIZE]; |
| u8 nonce[FSCRYPT_FILE_NONCE_SIZE]; |
| }; |
| |
| #define FSCRYPT_KEY_IDENTIFIER_SIZE 16 |
| struct fscrypt_context_v2 { |
| u8 version; |
| u8 contents_encryption_mode; |
| u8 filenames_encryption_mode; |
| u8 flags; |
| u8 log2_data_unit_size; |
| u8 __reserved[3]; |
| u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE]; |
| u8 nonce[FSCRYPT_FILE_NONCE_SIZE]; |
| }; |
| |
| The context structs contain the same information as the corresponding |
| policy structs (see `Setting an encryption policy`_), except that the |
| context structs also contain a nonce. The nonce is randomly generated |
| by the kernel and is used as KDF input or as a tweak to cause |
| different files to be encrypted differently; see `Per-file encryption |
| keys`_ and `DIRECT_KEY policies`_. |
| |
| Data path changes |
| ----------------- |
| |
| When inline encryption is used, filesystems just need to associate |
| encryption contexts with bios to specify how the block layer or the |
| inline encryption hardware will encrypt/decrypt the file contents. |
| |
| When inline encryption isn't used, filesystems must encrypt/decrypt |
| the file contents themselves, as described below: |
| |
| For the read path (->read_folio()) of regular files, filesystems can |
| read the ciphertext into the page cache and decrypt it in-place. The |
| folio lock must be held until decryption has finished, to prevent the |
| folio from becoming visible to userspace prematurely. |
| |
| For the write path (->writepage()) of regular files, filesystems |
| cannot encrypt data in-place in the page cache, since the cached |
| plaintext must be preserved. Instead, filesystems must encrypt into a |
| temporary buffer or "bounce page", then write out the temporary |
| buffer. Some filesystems, such as UBIFS, already use temporary |
| buffers regardless of encryption. Other filesystems, such as ext4 and |
| F2FS, have to allocate bounce pages specially for encryption. |
| |
| Filename hashing and encoding |
| ----------------------------- |
| |
| Modern filesystems accelerate directory lookups by using indexed |
| directories. An indexed directory is organized as a tree keyed by |
| filename hashes. When a ->lookup() is requested, the filesystem |
| normally hashes the filename being looked up so that it can quickly |
| find the corresponding directory entry, if any. |
| |
| With encryption, lookups must be supported and efficient both with and |
| without the encryption key. Clearly, it would not work to hash the |
| plaintext filenames, since the plaintext filenames are unavailable |
| without the key. (Hashing the plaintext filenames would also make it |
| impossible for the filesystem's fsck tool to optimize encrypted |
| directories.) Instead, filesystems hash the ciphertext filenames, |
| i.e. the bytes actually stored on-disk in the directory entries. When |
| asked to do a ->lookup() with the key, the filesystem just encrypts |
| the user-supplied name to get the ciphertext. |
| |
| Lookups without the key are more complicated. The raw ciphertext may |
| contain the ``\0`` and ``/`` characters, which are illegal in |
| filenames. Therefore, readdir() must base64url-encode the ciphertext |
| for presentation. For most filenames, this works fine; on ->lookup(), |
| the filesystem just base64url-decodes the user-supplied name to get |
| back to the raw ciphertext. |
| |
| However, for very long filenames, base64url encoding would cause the |
| filename length to exceed NAME_MAX. To prevent this, readdir() |
| actually presents long filenames in an abbreviated form which encodes |
| a strong "hash" of the ciphertext filename, along with the optional |
| filesystem-specific hash(es) needed for directory lookups. This |
| allows the filesystem to still, with a high degree of confidence, map |
| the filename given in ->lookup() back to a particular directory entry |
| that was previously listed by readdir(). See |
| struct fscrypt_nokey_name in the source for more details. |
| |
| Note that the precise way that filenames are presented to userspace |
| without the key is subject to change in the future. It is only meant |
| as a way to temporarily present valid filenames so that commands like |
| ``rm -r`` work as expected on encrypted directories. |
| |
| Tests |
| ===== |
| |
| To test fscrypt, use xfstests, which is Linux's de facto standard |
| filesystem test suite. First, run all the tests in the "encrypt" |
| group on the relevant filesystem(s). One can also run the tests |
| with the 'inlinecrypt' mount option to test the implementation for |
| inline encryption support. For example, to test ext4 and |
| f2fs encryption using `kvm-xfstests |
| <https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md>`_:: |
| |
| kvm-xfstests -c ext4,f2fs -g encrypt |
| kvm-xfstests -c ext4,f2fs -g encrypt -m inlinecrypt |
| |
| UBIFS encryption can also be tested this way, but it should be done in |
| a separate command, and it takes some time for kvm-xfstests to set up |
| emulated UBI volumes:: |
| |
| kvm-xfstests -c ubifs -g encrypt |
| |
| No tests should fail. However, tests that use non-default encryption |
| modes (e.g. generic/549 and generic/550) will be skipped if the needed |
| algorithms were not built into the kernel's crypto API. Also, tests |
| that access the raw block device (e.g. generic/399, generic/548, |
| generic/549, generic/550) will be skipped on UBIFS. |
| |
| Besides running the "encrypt" group tests, for ext4 and f2fs it's also |
| possible to run most xfstests with the "test_dummy_encryption" mount |
| option. This option causes all new files to be automatically |
| encrypted with a dummy key, without having to make any API calls. |
| This tests the encrypted I/O paths more thoroughly. To do this with |
| kvm-xfstests, use the "encrypt" filesystem configuration:: |
| |
| kvm-xfstests -c ext4/encrypt,f2fs/encrypt -g auto |
| kvm-xfstests -c ext4/encrypt,f2fs/encrypt -g auto -m inlinecrypt |
| |
| Because this runs many more tests than "-g encrypt" does, it takes |
| much longer to run; so also consider using `gce-xfstests |
| <https://github.com/tytso/xfstests-bld/blob/master/Documentation/gce-xfstests.md>`_ |
| instead of kvm-xfstests:: |
| |
| gce-xfstests -c ext4/encrypt,f2fs/encrypt -g auto |
| gce-xfstests -c ext4/encrypt,f2fs/encrypt -g auto -m inlinecrypt |