| .. SPDX-License-Identifier: GPL-2.0 |
| |
| ===================== |
| AMD Memory Encryption |
| ===================== |
| |
| Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) are |
| features found on AMD processors. |
| |
| SME provides the ability to mark individual pages of memory as encrypted using |
| the standard x86 page tables. A page that is marked encrypted will be |
| automatically decrypted when read from DRAM and encrypted when written to |
| DRAM. SME can therefore be used to protect the contents of DRAM from physical |
| attacks on the system. |
| |
| SEV enables running encrypted virtual machines (VMs) in which the code and data |
| of the guest VM are secured so that a decrypted version is available only |
| within the VM itself. SEV guest VMs have the concept of private and shared |
| memory. Private memory is encrypted with the guest-specific key, while shared |
| memory may be encrypted with the hypervisor key. When SME is enabled, the hypervisor |
| key is the same key which is used in SME. |
| |
| A page is encrypted when a page table entry has the encryption bit set (see |
| below on how to determine its position). The encryption bit can also be |
| specified in the cr3 register, allowing the PGD table to be encrypted. Each |
| successive level of page tables can also be encrypted by setting the encryption |
| bit in the page table entry that points to the next table. This allows the full |
| page table hierarchy to be encrypted. Note that setting the encryption bit in |
| cr3 does not imply that the full hierarchy is encrypted; each page table entry |
| in the hierarchy must have the encryption bit set to achieve that. For example, |
| the encryption bit could be set in cr3 so that the PGD is encrypted, yet left |
| clear in the PGD entry for a PUD, in which case the PUD pointed to by that |
| entry is not encrypted. |
| |
| When SEV is enabled, instruction pages and guest page tables are always treated |
| as private. All the DMA operations inside the guest must be performed on shared |
| memory. The guest OS controls the memory encryption bit only when it is |
| operating in 64-bit or 32-bit PAE mode; in all other modes the SEV hardware |
| forces the memory encryption bit to 1. |
| |
| Support for SME and SEV can be determined through the CPUID instruction. The |
| CPUID function 0x8000001f reports information related to SME:: |
| |
| 0x8000001f[eax]: |
| Bit[0] indicates support for SME |
| Bit[1] indicates support for SEV |
| 0x8000001f[ebx]: |
| Bits[5:0] pagetable bit number used to activate memory |
| encryption |
| Bits[11:6] reduction in physical address space, in bits, when |
| memory encryption is enabled (this only affects |
| system physical addresses, not guest physical |
| addresses) |
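
The CPUID fields listed above can be decoded with a few shifts and masks. The following is a minimal Python sketch, assuming the caller has already obtained the EAX and EBX values of CPUID function 0x8000001f by some other means; the names ``MemEncryptInfo`` and ``decode_cpuid_8000001f`` are hypothetical and are not kernel interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemEncryptInfo:
    sme_supported: bool        # 0x8000001f[eax] Bit[0]
    sev_supported: bool        # 0x8000001f[eax] Bit[1]
    c_bit_position: int        # 0x8000001f[ebx] Bits[5:0]
    phys_addr_reduction: int   # 0x8000001f[ebx] Bits[11:6]

    @property
    def encryption_mask(self) -> int:
        """Mask with only the page table encryption bit set."""
        return 1 << self.c_bit_position


def decode_cpuid_8000001f(eax: int, ebx: int) -> MemEncryptInfo:
    """Decode the fields described above from caller-supplied register values."""
    return MemEncryptInfo(
        sme_supported=bool(eax & (1 << 0)),
        sev_supported=bool(eax & (1 << 1)),
        c_bit_position=ebx & 0x3F,
        phys_addr_reduction=(ebx >> 6) & 0x3F,
    )
```

For instance, with the made-up values ``eax=0x3`` and ``ebx=0x16f``, the sketch reports both features as supported, an encryption bit position of 47 and a physical address space reduction of 5 bits.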
| |
| If support for SME is present, MSR 0xc0010010 (MSR_AMD64_SYSCFG) can be used to |
| determine if SME is enabled and/or to enable memory encryption:: |
| |
| 0xc0010010: |
| Bit[23] 0 = memory encryption features are disabled |
| 1 = memory encryption features are enabled |
| |
| If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if |
| SEV is active:: |
| |
| 0xc0010131: |
| Bit[0] 0 = memory encryption is not active |
| 1 = memory encryption is active |
| |
| Linux relies on BIOS to set bit 23 of MSR_AMD64_SYSCFG if BIOS has determined |
| that the reduction in the physical address space as a result of enabling memory |
| encryption (see CPUID information above) will not conflict with the address |
| space resource requirements for the system. If this bit is not set upon Linux |
| startup then Linux itself will not set it and memory encryption will not be |
| possible. |
| |
| The state of SME in the Linux kernel can be described as follows: |
| |
| - Supported: |
| The CPU supports SME (determined through CPUID instruction). |
| |
| - Enabled: |
| Supported and bit 23 of MSR_AMD64_SYSCFG is set. |
| |
| - Active: |
| Supported, Enabled and the Linux kernel is actively applying |
| the encryption bit to page table entries (the SME mask in the |
| kernel is non-zero). |
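
The three states above build on one another, which can be sketched as a small classifier. This is an illustrative Python sketch only: ``sme_state`` is a hypothetical helper, and its inputs (the EAX value of CPUID 0x8000001f, the MSR_AMD64_SYSCFG value and the kernel's SME mask) are assumed to have been read elsewhere.

```python
SYSCFG_MEM_ENCRYPT_BIT = 23  # Bit[23] of MSR_AMD64_SYSCFG


def sme_state(cpuid_8000001f_eax: int, syscfg: int, sme_mask: int) -> str:
    """Return the most advanced SME state reached for the given inputs."""
    supported = bool(cpuid_8000001f_eax & (1 << 0))
    enabled = supported and bool(syscfg & (1 << SYSCFG_MEM_ENCRYPT_BIT))
    active = enabled and sme_mask != 0

    if active:
        return "active"
    if enabled:
        return "enabled"
    if supported:
        return "supported"
    return "unsupported"
```

Because each state requires the previous one, a set MSR bit on a CPU that does not report SME support is still classified as "unsupported".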
| |
| SME can also be enabled and activated in the BIOS. If SME is enabled and |
| activated in the BIOS, then all memory accesses will be encrypted and it |
| will not be necessary to activate the Linux memory encryption support. |
| |
| If the BIOS merely enables SME (sets bit 23 of MSR_AMD64_SYSCFG), then memory |
| encryption can be activated by supplying mem_encrypt=on on the kernel command |
| line. However, if BIOS does not enable SME, then Linux will not be able to |
| activate memory encryption, even if it is configured to do so by default or |
| the mem_encrypt=on command line parameter is specified. |
| |
| Secure Nested Paging (SNP) |
| ========================== |
| |
| SEV-SNP introduces new features (SEV_FEATURES[1:63]) which can be enabled |
| by the hypervisor for security enhancements. Some of these features need a |
| guest-side implementation to function correctly. The table below lists the |
| expected guest behavior for the possible combinations of guest and hypervisor |
| SNP feature support. |
| |
| +-----------------+---------------+---------------+------------------+ |
| | Feature Enabled | Guest needs | Guest has | Guest boot | |
| | by the HV | implementation| implementation| behaviour | |
| +=================+===============+===============+==================+ |
| | No | No | No | Boot | |
| | | | | | |
| +-----------------+---------------+---------------+------------------+ |
| | No | Yes | No | Boot | |
| | | | | | |
| +-----------------+---------------+---------------+------------------+ |
| | No | Yes | Yes | Boot | |
| | | | | | |
| +-----------------+---------------+---------------+------------------+ |
| | Yes | No | No | Boot with | |
| | | | | feature enabled | |
| +-----------------+---------------+---------------+------------------+ |
| | Yes | Yes | No | Graceful boot | |
| | | | | failure | |
| +-----------------+---------------+---------------+------------------+ |
| | Yes | Yes | Yes | Boot with | |
| | | | | feature enabled | |
| +-----------------+---------------+---------------+------------------+ |
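
The table above reduces to two checks, sketched below in Python. The function name ``snp_guest_boot_behaviour`` is hypothetical. The two input combinations the table does not list (a guest that has an implementation it does not need) are handled by the same two rules, which is an assumption of this sketch rather than something the table states.

```python
def snp_guest_boot_behaviour(enabled_by_hv: bool,
                             guest_needs_impl: bool,
                             guest_has_impl: bool) -> str:
    """Map one row of the SNP feature table to the expected boot behaviour."""
    if not enabled_by_hv:
        # A feature the hypervisor did not enable never blocks the boot.
        return "boot"
    if guest_needs_impl and not guest_has_impl:
        # Enabled feature that requires guest support the guest lacks.
        return "graceful boot failure"
    return "boot with feature enabled"
```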
| |
| More details in the AMD64 APM Vol 2, section "15.34.10 SEV_STATUS MSR", |
| docID: 24593. |
| |
| Reverse Map Table (RMP) |
| ======================= |
| |
| The RMP is a structure in system memory that is used to ensure a one-to-one |
| mapping between system physical addresses and guest physical addresses. Each |
| page of memory that is potentially assignable to guests has one entry within |
| the RMP. |
| |
| The RMP table can be either contiguous in memory or a collection of segments |
| in memory. |
| |
| Contiguous RMP |
| -------------- |
| |
| Support for this form of the RMP is present when support for SEV-SNP is |
| present, which can be determined using the CPUID instruction:: |
| |
| 0x8000001f[eax]: |
| Bit[4] indicates support for SEV-SNP |
| |
| The location of the RMP is identified to the hardware through two MSRs:: |
| |
| 0xc0010132 (RMP_BASE): |
| System physical address of the first byte of the RMP |
| |
| 0xc0010133 (RMP_END): |
| System physical address of the last byte of the RMP |
| |
| Hardware requires that RMP_BASE and (RMP_END + 1) be 8KB aligned, but SEV |
| firmware raises the alignment requirement to 1MB. |
| |
| The RMP consists of a 16KB region used for processor bookkeeping followed |
| by the RMP entries, which are 16 bytes in size. The size of the RMP |
| determines the range of physical memory that the hypervisor can assign to |
| SEV-SNP guests. The RMP covers the system physical address range from:: |
| |
| 0 to ((RMP_END + 1 - RMP_BASE - 16KB) / 16B) x 4KB. |
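
The coverage formula above can be evaluated directly. The following Python sketch uses hypothetical helper names and assumes the RMP_BASE and RMP_END values have already been read from the MSRs; it computes how many bytes of system physical memory a contiguous RMP covers and checks the 1MB alignment required by SEV firmware.

```python
KB = 1 << 10
MB = 1 << 20
GB = 1 << 30

RMP_BOOKKEEPING_SIZE = 16 * KB  # processor bookkeeping region
RMP_ENTRY_SIZE = 16             # bytes per RMP entry
PAGE_SIZE = 4 * KB              # memory covered by one RMP entry


def rmp_covered_bytes(rmp_base: int, rmp_end: int) -> int:
    """Bytes of system physical memory, starting at 0, covered by the RMP."""
    entries = (rmp_end + 1 - rmp_base - RMP_BOOKKEEPING_SIZE) // RMP_ENTRY_SIZE
    return entries * PAGE_SIZE


def rmp_is_aligned(rmp_base: int, rmp_end: int, alignment: int = MB) -> bool:
    """Check that RMP_BASE and (RMP_END + 1) meet the given alignment."""
    return rmp_base % alignment == 0 and (rmp_end + 1) % alignment == 0
```

As a made-up example, a 17MB RMP placed at the 4GB boundary (``rmp_base = 4 * GB``, ``rmp_end = rmp_base + 17 * MB - 1``) is 1MB aligned and covers 4348MB of memory, since 16KB of it is bookkeeping and every remaining 16 bytes describe one 4KB page.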
| |
| The current Linux support relies on BIOS to allocate/reserve the memory for |
| the RMP and to set RMP_BASE and RMP_END appropriately. Linux uses the MSR |
| values to locate the RMP and determine the size of the RMP. The RMP must |
| cover all of system memory in order for Linux to enable SEV-SNP. |
| |
| Segmented RMP |
| ------------- |
| |
| Segmented RMP support is a new way of representing the layout of an RMP. |
| Initial RMP support required the RMP table to be contiguous in memory. |
| RMP accesses from a NUMA node on which the RMP doesn't reside |
| can take longer than accesses from a NUMA node on which the RMP resides. |
| Segmented RMP support allows the RMP entries to be located on the same |
| node as the memory the RMP is covering, potentially reducing latency |
| associated with accessing an RMP entry associated with the memory. Each |
| RMP segment covers a specific range of system physical addresses. |
| |
| Support for this form of the RMP can be determined using the CPUID |
| instruction:: |
| |
| 0x8000001f[eax]: |
| Bit[23] indicates support for segmented RMP |
| |
| If supported, segmented RMP attributes can be found using the CPUID |
| instruction:: |
| |
| 0x80000025[eax]: |
| Bits[5:0] minimum supported RMP segment size |
| Bits[11:6] maximum supported RMP segment size |
| |
| 0x80000025[ebx]: |
| Bits[9:0] number of cacheable RMP segment definitions |
| Bit[10] indicates if the number of cacheable RMP segments |
| is a hard limit |
| |
| To enable a segmented RMP, a new MSR is available:: |
| |
| 0xc0010136 (RMP_CFG): |
| Bit[0] indicates if segmented RMP is enabled |
| Bits[13:8] contains the size of memory covered by an RMP |
| segment (expressed as a power of 2) |
| |
| The RMP segment size defined in the RMP_CFG MSR applies to all segments |
| of the RMP. Therefore each RMP segment covers a specific range of system |
| physical addresses. For example, if the RMP_CFG MSR value is 0x2401, then |
| the RMP segment coverage value is 0x24 => 36, meaning the size of memory |
| covered by an RMP segment is 64GB (1 << 36). So the first RMP segment |
| covers physical addresses from 0 to 0xF_FFFF_FFFF, the second RMP segment |
| covers physical addresses from 0x10_0000_0000 to 0x1F_FFFF_FFFF, etc. |
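
The worked example above can be reproduced with a short Python sketch. The helper names are hypothetical, and the RMP_CFG value is assumed to have been read from MSR 0xc0010136 elsewhere.

```python
def decode_rmp_cfg(rmp_cfg: int) -> tuple[bool, int]:
    """Return (segmented RMP enabled, segment coverage as a power of 2)."""
    enabled = bool(rmp_cfg & (1 << 0))      # Bit[0]
    segment_shift = (rmp_cfg >> 8) & 0x3F   # Bits[13:8]
    return enabled, segment_shift


def rmp_segment_range(rmp_cfg: int, index: int) -> tuple[int, int]:
    """First and last system physical address covered by segment `index`."""
    _, segment_shift = decode_rmp_cfg(rmp_cfg)
    size = 1 << segment_shift
    start = index * size
    return start, start + size - 1
```

With ``rmp_cfg = 0x2401`` this yields a coverage value of 36, a first segment spanning 0 to 0xF_FFFF_FFFF and a second segment spanning 0x10_0000_0000 to 0x1F_FFFF_FFFF, matching the text.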
| |
| When a segmented RMP is enabled, RMP_BASE points to the RMP bookkeeping |
| area (16KB in size), as it does for a contiguous RMP. However, instead of |
| RMP entries beginning immediately after the bookkeeping area, there is a 4KB |
| RMP segment table (RST). Each entry in the RST is 8 bytes in size and |
| represents an RMP segment:: |
| |
| Bits[19:0] mapped size (in GB) |
| The mapped size can be less than the defined segment size. |
| A value of zero indicates that no RMP exists for the range |
| of system physical addresses associated with this segment. |
| Bits[51:20] segment physical address |
| This field is shifted left by 20 bits (or the entry is just |
| masked when read) to form the physical address of the |
| segment (1MB alignment). |
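
An RST entry with the layout above can be decoded as follows. This Python sketch uses the hypothetical name ``decode_rst_entry`` and takes the "just masked when read" route: because the address field already sits at bit 20, masking the raw 8-byte entry gives the same result as extracting the field and shifting it left by 20 bits.

```python
RST_MAPPED_SIZE_MASK = (1 << 20) - 1                      # Bits[19:0]
RST_SEGMENT_ADDR_MASK = ((1 << 52) - 1) & ~((1 << 20) - 1)  # Bits[51:20]


def decode_rst_entry(entry: int) -> tuple[int, int]:
    """Return (mapped size in GB, segment physical address) for an RST entry.

    A mapped size of zero means no RMP exists for the segment's address range.
    """
    mapped_size_gb = entry & RST_MAPPED_SIZE_MASK
    segment_pa = entry & RST_SEGMENT_ADDR_MASK
    return mapped_size_gb, segment_pa
```

For example, the made-up entry value ``0x1_2340_0040`` decodes to a mapped size of 64GB and a segment located at physical address ``0x1_2340_0000``.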
| |
| The RST can hold 512 segment entries but can be limited in size to the number |
| of cacheable RMP segments (CPUID 0x80000025_EBX[9:0]) if the number of cacheable |
| RMP segments is a hard limit (CPUID 0x80000025_EBX[10]). |
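
The limit described above can be expressed as a small helper. This is a Python sketch with the hypothetical name ``rst_max_entries``; it assumes the EBX value of CPUID function 0x80000025 has been obtained elsewhere and derives the table capacity from the 4K RST size and the 8-byte entry size.

```python
RST_SIZE = 4096       # bytes in the RMP segment table
RST_ENTRY_SIZE = 8    # bytes per RST entry


def rst_max_entries(cpuid_80000025_ebx: int) -> int:
    """Number of usable RST entries given CPUID 0x80000025[ebx]."""
    table_capacity = RST_SIZE // RST_ENTRY_SIZE      # 512 entries
    cacheable = cpuid_80000025_ebx & 0x3FF           # Bits[9:0]
    hard_limit = bool(cpuid_80000025_ebx & (1 << 10))  # Bit[10]
    return min(table_capacity, cacheable) if hard_limit else table_capacity
```

If 128 cacheable segments are reported and the hard-limit bit is set, only 128 entries are usable; without the hard-limit bit the full 512 entries remain available.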
| |
| The current Linux support relies on BIOS to allocate/reserve the memory for |
| the segmented RMP (the bookkeeping area, RST, and all segments), build the RST |
| and to set RMP_BASE, RMP_END, and RMP_CFG appropriately. Linux uses the MSR |
| values to locate the RMP and determine the size and location of the RMP |
| segments. The RMP must cover all of system memory in order for Linux to enable |
| SEV-SNP. |
| |
| More details in the AMD64 APM Vol 2, section "15.36.3 Reverse Map Table", |
| docID: 24593. |
| |
| Secure VM Service Module (SVSM) |
| =============================== |
| |
| SNP provides a feature called Virtual Machine Privilege Levels (VMPL) which |
| defines four privilege levels at which guest software can run. The most |
| privileged level is 0, and numerically higher levels have fewer privileges. |
| More details in the AMD64 APM Vol 2, section "15.35.7 Virtual Machine |
| Privilege Levels", docID: 24593. |
| |
| When using that feature, different services can run at different protection |
| levels, apart from the guest OS but still within the secure SNP environment. |
| They can provide services to the guest, like a vTPM, for example. |
| |
| When a guest is not running at VMPL0, it needs to communicate with the software |
| running at VMPL0 to perform privileged operations or to interact with secure |
| services. An example of such a privileged operation is PVALIDATE, which is |
| *required* to be executed at VMPL0. |
| |
| In this scenario, the software running at VMPL0 is usually called a Secure VM |
| Service Module (SVSM). Discovery of an SVSM and the API used to communicate |
| with it is documented in "Secure VM Service Module for SEV-SNP Guests", docID: |
| 58019. |
| |
| (Latest versions of the above-mentioned documents can be found by using |
| a search engine like duckduckgo.com and typing in: |
| |
| site:amd.com "Secure VM Service Module for SEV-SNP Guests", docID: 58019 |
| |
| for example.) |