| =============================================== |
| Guarded Control Stack support for AArch64 Linux |
| =============================================== |
| |
| This document outlines briefly the interface provided to userspace by Linux in |
| order to support use of the ARM Guarded Control Stack (GCS) feature. |
| |
| This is an outline of the most important features and issues only and not |
| intended to be exhaustive. |
| |
| |
| |
| 1. General |
| ----------- |
| |
| * GCS is an architecture feature intended to provide greater protection |
| against return oriented programming (ROP) attacks and to simplify the |
| implementation of features that need to collect stack traces such as |
| profiling. |
| |
| * When GCS is enabled a separate guarded control stack is maintained by the |
| PE which is writeable only through specific GCS operations. This |
| stores the call stack only, when a procedure call instruction is |
| performed the current PC is pushed onto the GCS and on RET the |
| address in the LR is verified against that on the top of the GCS. |
| |
| * When active the current GCS pointer is stored in the system register |
| GCSPR_EL0. This is readable by userspace but can only be updated |
| via specific GCS instructions. |
| |
| * The architecture provides instructions for switching between guarded |
| control stacks with checks to ensure that the new stack is a valid |
| target for switching. |
| |
| * The functionality of GCS is similar to that provided by the x86 Shadow |
| Stack feature, due to sharing of userspace interfaces the ABI refers to |
| shadow stacks rather than GCS. |
| |
| * Support for GCS is reported to userspace via HWCAP_GCS in the aux vector |
| AT_HWCAP2 entry. |
| |
| * GCS is enabled per thread. While there is support for disabling GCS |
| at runtime this should be done with great care. |
| |
| * GCS memory access faults are reported as normal memory access faults. |
| |
| * GCS specific errors (those reported with EC 0x2d) will be reported as |
| SIGSEGV with a si_code of SEGV_CPERR (control protection error). |
| |
| * GCS is supported only for AArch64. |
| |
| * On systems where GCS is supported GCSPR_EL0 is always readable by EL0 |
| regardless of the GCS configuration for the thread. |
| |
| * The architecture supports enabling GCS without verifying that return values |
| in LR match those in the GCS, the LR will be ignored. This is not supported |
| by Linux. |
| |
| |
| |
| 2. Enabling and disabling Guarded Control Stacks |
| ------------------------------------------------- |
| |
| * GCS is enabled and disabled for a thread via the PR_SET_SHADOW_STACK_STATUS |
| prctl(), this takes a single flags argument specifying which GCS features |
| should be used. |
| |
| * When set PR_SHADOW_STACK_ENABLE flag allocates a Guarded Control Stack |
| and enables GCS for the thread, enabling the functionality controlled by |
| GCSCRE0_EL1.{nTR, RVCHKEN, PCRSEL}. |
| |
| * When set the PR_SHADOW_STACK_PUSH flag enables the functionality controlled |
| by GCSCRE0_EL1.PUSHMEn, allowing explicit GCS pushes. |
| |
| * When set the PR_SHADOW_STACK_WRITE flag enables the functionality controlled |
| by GCSCRE0_EL1.STREn, allowing explicit stores to the Guarded Control Stack. |
| |
| * Any unknown flags will cause PR_SET_SHADOW_STACK_STATUS to return -EINVAL. |
| |
| * PR_LOCK_SHADOW_STACK_STATUS is passed a bitmask of features with the same |
| values as used for PR_SET_SHADOW_STACK_STATUS. Any future changes to the |
| status of the specified GCS mode bits will be rejected. |
| |
| * PR_LOCK_SHADOW_STACK_STATUS allows any bit to be locked, this allows |
| userspace to prevent changes to any future features. |
| |
| * There is no support for a process to remove a lock that has been set for |
| it. |
| |
| * PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS affect only the |
| thread that called them, any other running threads will be unaffected. |
| |
| * New threads inherit the GCS configuration of the thread that created them. |
| |
| * GCS is disabled on exec(). |
| |
| * The current GCS configuration for a thread may be read with the |
| PR_GET_SHADOW_STACK_STATUS prctl(), this returns the same flags that |
| are passed to PR_SET_SHADOW_STACK_STATUS. |
| |
| * If GCS is disabled for a thread after having previously been enabled then |
| the stack will remain allocated for the lifetime of the thread. At present |
| any attempt to reenable GCS for the thread will be rejected, this may be |
| revisited in future. |
| |
| * It should be noted that since enabling GCS will result in GCS becoming |
| active immediately it is not normally possible to return from the function |
| that invoked the prctl() that enabled GCS. It is expected that the normal |
| usage will be that GCS is enabled very early in execution of a program. |
| |
| |
| |
| 3. Allocation of Guarded Control Stacks |
| ---------------------------------------- |
| |
| * When GCS is enabled for a thread a new Guarded Control Stack will be |
| allocated for it of half the standard stack size or 2 gigabytes, |
| whichever is smaller. |
| |
| * When a new thread is created by a thread which has GCS enabled then a |
| new Guarded Control Stack will be allocated for the new thread with |
| half the size of the standard stack. |
| |
| * When a stack is allocated by enabling GCS or during thread creation then |
| the top 8 bytes of the stack will be initialised to 0 and GCSPR_EL0 will |
| be set to point to the address of this 0 value, this can be used to |
| detect the top of the stack. |
| |
| * Additional Guarded Control Stacks can be allocated using the |
| map_shadow_stack() system call. |
| |
| * Stacks allocated using map_shadow_stack() can optionally have an end of |
| stack marker and cap placed at the top of the stack. If the flag |
| SHADOW_STACK_SET_TOKEN is specified a cap will be placed on the stack, |
| if SHADOW_STACK_SET_MARKER is not specified the cap will be the top 8 |
| bytes of the stack and if it is specified then the cap will be the next |
| 8 bytes. While specifying just SHADOW_STACK_SET_MARKER by itself is |
| valid since the marker is all bits 0 it has no observable effect. |
| |
| * Stacks allocated using map_shadow_stack() must have a size which is a |
| multiple of 8 bytes larger than 8 bytes and must be 8 bytes aligned. |
| |
| * An address can be specified to map_shadow_stack(), if one is provided then |
| it must be aligned to a page boundary. |
| |
| * When a thread is freed the Guarded Control Stack initially allocated for |
| that thread will be freed. Note carefully that if the stack has been |
| switched this may not be the stack currently in use by the thread. |
| |
| |
| 4. Signal handling |
| -------------------- |
| |
| * A new signal frame record gcs_context encodes the current GCS mode and |
| pointer for the interrupted context on signal delivery. This will always |
| be present on systems that support GCS. |
| |
| * The record contains a flag field which reports the current GCS configuration |
| for the interrupted context as PR_GET_SHADOW_STACK_STATUS would. |
| |
| * The signal handler is run with the same GCS configuration as the interrupted |
| context. |
| |
| * When GCS is enabled for the interrupted thread a signal handling specific |
| GCS cap token will be written to the GCS, this is an architectural GCS cap |
| with the token type (bits 0..11) all clear. The GCSPR_EL0 reported in the |
| signal frame will point to this cap token. |
| |
| * The signal handler will use the same GCS as the interrupted context. |
| |
| * When GCS is enabled on signal entry a frame with the address of the signal |
| return handler will be pushed onto the GCS, allowing return from the signal |
| handler via RET as normal. This will not be reported in the gcs_context in |
| the signal frame. |
| |
| |
| 5. Signal return |
| ----------------- |
| |
| When returning from a signal handler: |
| |
| * If there is a gcs_context record in the signal frame then the GCS flags |
| and GCSPR_EL0 will be restored from that context prior to further |
| validation. |
| |
| * If there is no gcs_context record in the signal frame then the GCS |
| configuration will be unchanged. |
| |
| * If GCS is enabled on return from a signal handler then GCSPR_EL0 must |
| point to a valid GCS signal cap record, this will be popped from the |
| GCS prior to signal return. |
| |
| * If the GCS configuration is locked when returning from a signal then any |
| attempt to change the GCS configuration will be treated as an error. This |
| is true even if GCS was not enabled prior to signal entry. |
| |
| * GCS may be disabled via signal return but any attempt to enable GCS via |
| signal return will be rejected. |
| |
| |
| 6. ptrace extensions |
| --------------------- |
| |
| * A new regset NT_ARM_GCS is defined for use with PTRACE_GETREGSET and |
| PTRACE_SETREGSET. |
| |
| * The GCS mode, including enable and disable, may be configured via ptrace. |
| If GCS is enabled via ptrace no new GCS will be allocated for the thread. |
| |
| * Configuration via ptrace ignores locking of GCS mode bits. |
| |
| |
| 7. ELF coredump extensions |
| --------------------------- |
| |
| * NT_ARM_GCS notes will be added to each coredump for each thread of the |
| dumped process. The contents will be equivalent to the data that would |
| have been read if a PTRACE_GETREGSET of the corresponding type were |
| executed for each thread when the coredump was generated. |
| |
| |
| |
| 8. /proc extensions |
| -------------------- |
| |
| * Guarded Control Stack pages will include "ss" in their VmFlags in |
| /proc/<pid>/smaps. |