| .. SPDX-License-Identifier: GPL-2.0 |
| |
| .. _inline_encryption: |
| |
| ================= |
| Inline Encryption |
| ================= |
| |
| Background |
| ========== |
| |
| Inline encryption hardware sits logically between memory and the disk, and can |
| en/decrypt data as it goes in/out of the disk. Inline encryption hardware has a |
| fixed number of "keyslots" - slots into which encryption contexts (i.e. the |
| encryption key, encryption algorithm, data unit size) can be programmed by the |
| kernel at any time. Each request sent to the disk can be tagged with the index |
| of a keyslot (and also a data unit number to act as an encryption tweak), and |
| the inline encryption hardware will en/decrypt the data in the request with the |
| encryption context programmed into that keyslot. This is very different from |
| full disk encryption solutions like self-encrypting drives/TCG OPAL/ATA |
| Security standards, since with inline encryption, any block on disk could be |
| encrypted with any encryption context the kernel chooses. |
| |
| |
| Objective |
| ========= |
| |
| We want to support inline encryption (IE) in the kernel. |
| To allow for testing, we also want a crypto API fallback when actual |
| IE hardware is absent. We also want IE to work with layered devices |
| like dm and loopback (i.e. we want to be able to use the IE hardware |
| of the underlying devices if present, or else fall back to crypto API |
| en/decryption). |
| |
| |
| Constraints and notes |
| ===================== |
| |
| - IE hardware has a limited number of "keyslots" that can be programmed |
| with an encryption context (key, algorithm, data unit size, etc.) at any time. |
| One can specify a keyslot in a data request made to the device, and the |
| device will en/decrypt the data using the encryption context programmed into |
| that specified keyslot. When possible, we want to make multiple requests with |
| the same encryption context share the same keyslot. |
| |
| - We need a way for upper layers like filesystems to specify an encryption |
| context to use for en/decrypting a struct bio, and a device driver (like UFS) |
| needs to be able to use that encryption context when it processes the bio. |
| |
| - We need a way for device drivers to expose their inline encryption |
| capabilities in a unified way to the upper layers. |
| |
| |
| Design |
| ====== |
| |
| We add a struct bio_crypt_ctx to struct bio that can |
| represent an encryption context, because we need to be able to pass this |
| encryption context from the upper layers (like the fs layer) to the |
| device driver to act upon. |
| |
| While IE hardware works on the notion of keyslots, the FS layer has no |
| knowledge of keyslots - it simply wants to specify an encryption context to |
| use while en/decrypting a bio. |
| |
| We introduce a keyslot manager (KSM) that handles the translation from |
| encryption contexts specified by the FS to keyslots on the IE hardware. |
| This KSM also serves as the way IE hardware can expose its capabilities to |
| upper layers. The generic mode of operation is: each device driver that wants |
| to support IE will construct a KSM and set it up in its struct request_queue. |
| Upper layers that want to use IE on this device can then use this KSM in |
| the device's struct request_queue to translate an encryption context into |
| a keyslot. The presence of the KSM in the request queue shall be used to mean |
| that the device supports IE. |
| |
| The KSM uses refcounts to track which keyslots are idle (either they have no |
| encryption context programmed, or there are no in-flight struct bios |
| referencing that keyslot). When a new encryption context needs a keyslot, it |
| tries to find a keyslot that has already been programmed with the same |
| encryption context, and if there is no such keyslot, it evicts the least |
| recently used idle keyslot and programs the new encryption context into that |
| one. If no idle keyslots are available, then the caller will sleep until there |
| is at least one. |
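
The keyslot selection policy described above can be sketched as a small
userspace model. This is only an illustration under stated assumptions, not the
kernel implementation: every ``toy_*`` name is hypothetical, the encryption
context is reduced to a fixed-size key plus two integers, and returning ``-1``
stands in for the point where a real caller would sleep until a keyslot
becomes idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_KEY_SIZE 16

/* Hypothetical stand-in for an encryption context. */
struct toy_ctx {
	unsigned char key[TOY_KEY_SIZE];
	int mode;
	unsigned int data_unit_size;
};

struct toy_slot {
	struct toy_ctx ctx;
	bool programmed;         /* does the slot hold a context? */
	unsigned int refcount;   /* number of in-flight users */
	unsigned long last_idle; /* logical time the slot last became idle */
};

struct toy_ksm {
	struct toy_slot *slots;
	int num_slots;
	unsigned long clock;     /* logical clock used for LRU ordering */
};

static bool toy_ctx_equal(const struct toy_ctx *a, const struct toy_ctx *b)
{
	return a->mode == b->mode &&
	       a->data_unit_size == b->data_unit_size &&
	       memcmp(a->key, b->key, TOY_KEY_SIZE) == 0;
}

/* Returns a slot index, or -1 where a real caller would sleep. */
static int toy_ksm_get_slot(struct toy_ksm *ksm, const struct toy_ctx *ctx)
{
	int victim = -1;

	/* 1. Reuse a slot already programmed with the same context. */
	for (int i = 0; i < ksm->num_slots; i++) {
		struct toy_slot *s = &ksm->slots[i];

		if (s->programmed && toy_ctx_equal(&s->ctx, ctx)) {
			s->refcount++;
			return i;
		}
	}

	/* 2. Otherwise evict the least recently used idle slot. */
	for (int i = 0; i < ksm->num_slots; i++) {
		struct toy_slot *s = &ksm->slots[i];

		if (s->refcount != 0)
			continue;
		if (victim < 0 || s->last_idle < ksm->slots[victim].last_idle)
			victim = i;
	}
	if (victim < 0)
		return -1;

	/* "Program" the new context into the chosen slot. */
	ksm->slots[victim].ctx = *ctx;
	ksm->slots[victim].programmed = true;
	ksm->slots[victim].refcount = 1;
	return victim;
}

/* Drop one reference; the slot becomes idle when the count hits zero. */
static void toy_ksm_put_slot(struct toy_ksm *ksm, int idx)
{
	assert(idx >= 0 && idx < ksm->num_slots);
	assert(ksm->slots[idx].refcount > 0);

	if (--ksm->slots[idx].refcount == 0)
		ksm->slots[idx].last_idle = ++ksm->clock;
}
```

In this model a slot keeps its context after it becomes idle, so a later
request with the same context can reuse it without reprogramming. Each
successful ``toy_ksm_get_slot`` is expected to be paired with one
``toy_ksm_put_slot`` when the request completes.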
| |
| |
| blk-mq changes, other block layer changes and blk-crypto-fallback |
| ================================================================= |
| |
| We add a pointer to a ``bi_crypt_context`` and ``keyslot`` to |
| struct request. These will be referred to as the ``crypto fields`` |
| for the request. This ``keyslot`` is the keyslot into which the |
| ``bi_crypt_context`` has been programmed in the KSM of the ``request_queue`` |
| that this request is being sent to. |
| |
| We introduce ``block/blk-crypto-fallback.c``, which allows upper layers to remain |
| blissfully unaware of whether or not real inline encryption hardware is present |
| underneath. When a bio is submitted with a target ``request_queue`` that doesn't |
| support the encryption context specified with the bio, the block layer will |
| en/decrypt the bio with the blk-crypto-fallback. |
| |
| If the bio is a ``WRITE`` bio, a bounce bio is allocated, and the data in the bio |
| is encrypted and stored in the bounce bio - blk-mq will then proceed to process the |
| bounce bio as if it were not encrypted at all (except when blk-integrity is |
| concerned). ``blk-crypto-fallback`` sets the bounce bio's ``bi_end_io`` to an |
| internal function that cleans up the bounce bio and ends the original bio. |
| |
| If the bio is a ``READ`` bio, the bio's ``bi_end_io`` (and also ``bi_private``) |
| is saved and overwritten by ``blk-crypto-fallback`` to |
| ``bio_crypto_fallback_decrypt_bio``. The bio's ``bi_crypt_context`` is also |
| overwritten with ``NULL``, so that to the rest of the stack, the bio looks |
| as if it was a regular bio that never had an encryption context specified. |
| ``bio_crypto_fallback_decrypt_bio`` will decrypt the bio, restore the original |
| ``bi_end_io`` (and also ``bi_private``) and end the bio again. |
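
The save-and-restore of the completion callback on the ``READ`` path can be
illustrated with a userspace sketch. All ``toy_*`` names are hypothetical, the
structures are heavily simplified stand-ins for the real ones, and the XOR
"cipher" is a placeholder that provides no security; it is there only so that
the decryption step has a visible effect.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_bio;
typedef void (*toy_end_io_t)(struct toy_bio *bio);

/* Simplified stand-in for a bio. */
struct toy_bio {
	unsigned char *data;
	size_t len;
	const unsigned char *crypt_key; /* NULL means no encryption context */
	toy_end_io_t end_io;
	void *private_data;
};

/* State the fallback stashes away while it owns the callback. */
struct toy_fallback_ctx {
	toy_end_io_t saved_end_io;
	void *saved_private;
	unsigned char key;
};

/* Placeholder "cipher": XOR with one byte. Not real cryptography. */
static void toy_xor(unsigned char *buf, size_t len, unsigned char key)
{
	for (size_t i = 0; i < len; i++)
		buf[i] ^= key;
}

/* Runs when the device completes the read: decrypt, restore, end again. */
static void toy_fallback_decrypt_bio(struct toy_bio *bio)
{
	struct toy_fallback_ctx *f = bio->private_data;

	toy_xor(bio->data, bio->len, f->key);
	bio->end_io = f->saved_end_io;
	bio->private_data = f->saved_private;
	free(f);
	bio->end_io(bio); /* now invokes the original owner's callback */
}

/* Called at submission time for a READ bio the hardware cannot handle. */
static int toy_fallback_prep_read(struct toy_bio *bio)
{
	struct toy_fallback_ctx *f = malloc(sizeof(*f));

	if (!f)
		return -1;
	f->saved_end_io = bio->end_io;
	f->saved_private = bio->private_data;
	f->key = *bio->crypt_key;

	bio->crypt_key = NULL; /* lower layers see a plain bio */
	bio->end_io = toy_fallback_decrypt_bio;
	bio->private_data = f;
	return 0;
}

/* Example owner callback: counts completions via private_data. */
static void toy_count_completion(struct toy_bio *bio)
{
	(*(int *)bio->private_data)++;
}
```

The owner of the bio only ever observes its own callback running once, on
plaintext data, which is what lets upper layers stay unaware of whether the
fallback was involved.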
| |
| Regardless of whether real inline encryption hardware is used or the |
| blk-crypto-fallback is used, the ciphertext written to disk (and hence the |
| on-disk format of data) will be the same (assuming the hardware's implementation |
| of the algorithm being used adheres to spec and functions correctly). |
| |
| If a ``request_queue``'s inline encryption hardware claimed to support the |
| encryption context specified with a bio, then it will not be handled by the |
| ``blk-crypto-fallback``. We will eventually reach a point in blk-mq when a |
| struct request needs to be allocated for that bio. At that point, |
| blk-mq tries to program the encryption context into the ``request_queue``'s |
| keyslot_manager, and obtain a keyslot, which it stores in its newly added |
| ``keyslot`` field. This keyslot is released when the request is completed. |
| |
| When the first bio is added to a request, ``blk_crypto_rq_bio_prep`` is called, |
| which sets the request's ``crypt_ctx`` to a copy of the bio's |
| ``bi_crypt_context``. ``bio_crypt_do_front_merge`` is called whenever a subsequent |
| bio is merged to the front of the request, which updates the ``crypt_ctx`` of |
| the request so that it matches the newly merged bio's ``bi_crypt_context``. In |
| particular, the request keeps a copy of the ``bi_crypt_context`` of the first |
| bio in its bio-list (blk-mq needs to be careful to maintain this invariant |
| during bio and request merges). |
| |
| To make it possible for inline encryption to work with request queue based |
| layered devices, when a request is cloned, its ``crypto fields`` are cloned as |
| well. When the cloned request is submitted, blk-mq programs the |
| ``bi_crypt_context`` of the request into the clone's request_queue's keyslot |
| manager, and stores the returned keyslot in the clone's ``keyslot``. |
| |
| |
| API presented to users of the block layer |
| ========================================= |
| |
| ``struct blk_crypto_key`` represents a crypto key (the raw key, size of the |
| key, the crypto algorithm to use, the data unit size to use, and the number of |
| bytes required to represent data unit numbers that will be specified with the |
| ``bi_crypt_context``). |
| |
| ``blk_crypto_init_key`` allows upper layers to initialize such a |
| ``blk_crypto_key``. |
| |
| ``bio_crypt_set_ctx`` should be called on any bio that a user of |
| the block layer wants en/decrypted via inline encryption (or the |
| blk-crypto-fallback, if hardware support isn't available for the desired |
| crypto configuration). This function takes the ``blk_crypto_key`` and the |
| data unit number (DUN) to use when en/decrypting the bio. |
| |
| ``blk_crypto_config_supported`` allows upper layers to query whether or not |
| an encryption context passed to a request queue can be handled by blk-crypto |
| (either by real inline encryption hardware, or by the blk-crypto-fallback). |
| This is useful e.g. when blk-crypto-fallback is disabled, and the upper layer |
| wants to use an algorithm that may not be supported by hardware - this function |
| lets the upper layer know ahead of time that the algorithm isn't supported, |
| and the upper layer can fall back to something else if appropriate. |
| |
| ``blk_crypto_start_using_key`` - Upper layers must call this function on a |
| ``blk_crypto_key`` and a ``request_queue`` before using the key with any bio |
| headed for that ``request_queue``. This function ensures that either the |
| hardware supports the key's crypto settings, or the crypto API fallback has |
| transforms for the needed mode allocated and ready to go. Note that this |
| function may allocate an ``skcipher``, and must not be called from the data |
| path, since allocating ``skciphers`` from the data path can deadlock. |
| |
| ``blk_crypto_evict_key`` *must* be called by upper layers before a |
| ``blk_crypto_key`` is freed. Further, it *must* be called only once |
| there are no more in-flight requests that use that ``blk_crypto_key``. |
| ``blk_crypto_evict_key`` will ensure that a key is removed from any keyslots in |
| inline encryption hardware that the key might have been programmed into (or |
| from the blk-crypto-fallback). |
| |
| API presented to device drivers |
| =============================== |
| |
| A :c:type:`struct blk_keyslot_manager` should be set up by device drivers in |
| the ``request_queue`` of the device. The device driver needs to call |
| ``blk_ksm_init`` (or its resource-managed variant ``devm_blk_ksm_init``) on the |
| ``blk_keyslot_manager``, while specifying the number of keyslots supported by |
| the hardware. |
| |
| The device driver also needs to tell the KSM how to actually manipulate the |
| IE hardware in the device to do things like programming a crypto key into |
| a particular keyslot of the IE hardware. All this is achieved through the |
| struct blk_ksm_ll_ops field in the KSM that the device driver |
| must fill in after initializing the ``blk_keyslot_manager``. |
| |
| The KSM also handles runtime power management for the device when applicable |
| (e.g. when it wants to program a crypto key into the IE hardware, the device |
| must be runtime powered on) - so the device driver must also set the ``dev`` |
| field in the KSM to point to the ``struct device`` for the KSM to use for runtime |
| power management. |
| |
| ``blk_ksm_reprogram_all_keys`` can be called by device drivers if the device |
| needs each and every one of its keyslots to be reprogrammed with the key it |
| "should have" at the point in time when the function is called. This is useful |
| e.g. if a device loses all its keys on runtime power down/up. |
| |
| If the driver used ``blk_ksm_init`` instead of ``devm_blk_ksm_init``, then |
| ``blk_ksm_destroy`` should be called to free up all resources used by a |
| ``blk_keyslot_manager`` once it is no longer needed. |
| |
| Layered Devices |
| =============== |
| |
| Request queue based layered devices like dm-rq that wish to support IE need to |
| create their own keyslot manager for their request queue, and expose whatever |
| functionality they choose. When a layered device wants to pass a clone of a |
| request to another ``request_queue``, blk-crypto will initialize and prepare the |
| clone as necessary - see ``blk_crypto_insert_cloned_request`` in |
| ``blk-crypto.c``. |
| |
| |
| Future Optimizations for layered devices |
| ======================================== |
| |
| Creating a keyslot manager for a layered device uses up memory for each |
| keyslot, and in general, a layered device merely passes the request on to a |
| "child" device, so the keyslots in the layered device itself are completely |
| unused, and don't need any refcounting or keyslot programming. We can instead |
| define a new type of KSM, the "passthrough KSM", that layered devices can use |
| to advertise an unlimited number of keyslots, and support for any encryption |
| algorithms they choose, while not actually using any memory for each keyslot. |
| Another use case for the "passthrough KSM" is for IE devices that do not have a |
| limited number of keyslots. |
| |
| |
| Interaction between inline encryption and blk integrity |
| ======================================================= |
| |
| At the time of this patch, there is no real hardware that supports both these |
| features. However, these features do interact with each other, and it's not |
| completely trivial to make them both work together properly. In particular, |
| when a WRITE bio wants to use inline encryption on a device that supports both |
| features, the bio will have an encryption context specified, after which |
| its integrity information is calculated (using the plaintext data, since |
| the encryption will happen while data is being written), and the data and |
| integrity info are sent to the device. Obviously, the integrity info must be |
| verified before the data is encrypted. After the data is encrypted, the device |
| must not store the integrity info that it received with the plaintext data |
| since that might reveal information about the plaintext data. As such, it must |
| re-generate the integrity info from the ciphertext data and store that on disk |
| instead. Another issue with storing the integrity info of the plaintext data is |
| that it changes the on-disk format depending on whether hardware inline |
| encryption support is present or the kernel crypto API fallback is used (since |
| if the fallback is used, the device will receive the integrity info of the |
| ciphertext, not that of the plaintext). |
| |
| Because there isn't any real hardware yet, it seems prudent to assume that |
| hardware implementations might not implement both features together correctly, |
| and disallow the combination for now. Whenever a device supports integrity, the |
| kernel will pretend that the device does not support hardware inline encryption |
| (by essentially setting the keyslot manager in the request_queue of the device |
| to NULL). When the crypto API fallback is enabled, this means that all bios with |
| an encryption context will use the fallback, and IO will complete as usual. |
| When the fallback is disabled, a bio with an encryption context will be failed. |
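
The resulting policy can be summarized as a small decision function. This is a
minimal sketch under stated assumptions: the ``toy_*`` names and the route enum
are hypothetical, and ``has_ksm`` collapses "the driver registered a keyslot
manager that supports the bio's encryption context" into a single flag.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a submitted bio. */
enum toy_route {
	TOY_ROUTE_PLAIN,    /* no encryption context: nothing to do */
	TOY_ROUTE_HW,       /* handled by inline encryption hardware */
	TOY_ROUTE_FALLBACK, /* handled by the crypto API fallback */
	TOY_ROUTE_FAIL,     /* bio is failed */
};

/* Hypothetical summary of what a request queue advertises. */
struct toy_queue {
	bool has_ksm;            /* driver set up a suitable KSM */
	bool supports_integrity; /* device supports blk-integrity */
};

static enum toy_route toy_route_bio(const struct toy_queue *q,
				    bool bio_has_crypt_ctx,
				    bool fallback_enabled)
{
	/* A device with integrity support is treated as having no KSM. */
	bool hw_usable = q->has_ksm && !q->supports_integrity;

	if (!bio_has_crypt_ctx)
		return TOY_ROUTE_PLAIN;
	if (hw_usable)
		return TOY_ROUTE_HW;
	return fallback_enabled ? TOY_ROUTE_FALLBACK : TOY_ROUTE_FAIL;
}
```

Note that integrity support only changes the outcome for bios that carry an
encryption context; bios without one are unaffected in this model.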