Diffstat (limited to 'Documentation/x86/sgx.rst')
-rw-r--r-- Documentation/x86/sgx.rst 287
1 files changed, 0 insertions, 287 deletions
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
deleted file mode 100644
index 265568a9292c..000000000000
--- a/Documentation/x86/sgx.rst
+++ /dev/null
@@ -1,287 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===============================
-Software Guard eXtensions (SGX)
-===============================
-
-Overview
-========
-
-Software Guard eXtensions (SGX) hardware enables user space applications
-to set aside private memory regions of code and data:
-
-* Privileged (ring-0) ENCLS functions orchestrate the construction of the
-  regions.
-* Unprivileged (ring-3) ENCLU functions allow an application to enter and
-  execute inside the regions.
-
-These memory regions are called enclaves. An enclave can only be entered at a
-fixed set of entry points. Each entry point can hold a single hardware thread
-at a time. While the enclave is loaded from a regular binary file by using
-ENCLS functions, only the threads inside the enclave can access its memory. The
-CPU denies outside access to the region and encrypts its contents before they
-leave the last-level cache (LLC).
-
-SGX support can be determined by running
-
-  ``grep sgx /proc/cpuinfo``
-
-SGX must both be supported in the processor and enabled by the BIOS. If SGX
-appears to be unsupported on a system which has hardware support, ensure
-support is enabled in the BIOS. If a BIOS presents a choice between "Enabled"
-and "Software Enabled" modes for SGX, choose "Enabled".
-
-Enclave Page Cache
-==================
-
-SGX utilizes an *Enclave Page Cache (EPC)* to store pages that are associated
-with an enclave. It is contained in a BIOS-reserved region of physical memory.
-Unlike regular memory pages, EPC pages can only be accessed from outside of the
-enclave during enclave construction, using special, limited SGX instructions.
-
-Only a CPU executing inside an enclave can directly access enclave memory.
-However, a CPU executing inside an enclave may access normal memory outside the
-enclave.
-
-The kernel manages enclave memory similarly to how it treats device memory.
-
-Enclave Page Types
-------------------
-
-**SGX Enclave Control Structure (SECS)**
-   The enclave's address range, attributes and other global data are defined
-   by this structure.
-
-**Regular (REG)**
-   Regular EPC pages contain the code and data of an enclave.
-
-**Thread Control Structure (TCS)**
-   Thread Control Structure pages define the entry points to an enclave and
-   track the execution state of an enclave thread.
-
-**Version Array (VA)**
-   Version Array pages contain 512 slots, each of which can contain a version
-   number for a page evicted from the EPC.
-
-Enclave Page Cache Map
-----------------------
-
-The processor tracks EPC pages in a hardware metadata structure called the
-*Enclave Page Cache Map (EPCM)*. The EPCM contains an entry for each EPC page
-which describes the owning enclave, access rights and page type, among other
-things.
-
-EPCM permissions are separate from the normal page tables. This prevents the
-kernel from, for instance, allowing writes to data which an enclave wishes to
-remain read-only. EPCM permissions may only impose additional restrictions on
-top of normal x86 page permissions.
-
-For all intents and purposes, the SGX architecture allows the processor to
-invalidate all EPCM entries at will. This requires that software be prepared to
-handle an EPCM fault at any time. In practice, this can happen on events like
-power transitions when the ephemeral key that encrypts enclave memory is lost.
-
-Application interface
-=====================
-
-Enclave build functions
------------------------
-
-In addition to the traditional compiler and linker build process, SGX has a
-separate enclave “build” process. Enclaves must be built before they can be
-executed (entered). The first step in building an enclave is opening the
-**/dev/sgx_enclave** device. Since enclave memory is protected from direct
-access, special privileged instructions are then used to copy data into enclave
-pages and establish enclave page permissions.
-
-.. kernel-doc:: arch/x86/kernel/cpu/sgx/ioctl.c
-   :functions: sgx_ioc_enclave_create
-               sgx_ioc_enclave_add_pages
-               sgx_ioc_enclave_init
-               sgx_ioc_enclave_provision
-
-Enclave vDSO
-------------
-
-Entering an enclave can only be done through SGX-specific EENTER and ERESUME
-functions, and is a non-trivial process. Because of the complexity of
-transitioning to and from an enclave, enclaves typically utilize a library to
-handle the actual transitions. This is roughly analogous to how glibc
-implementations are used by most applications to wrap system calls.
-
-Another crucial characteristic of enclaves is that they can generate exceptions
-as part of their normal operation that need to be handled in the enclave or are
-unique to SGX.
-
-Instead of the traditional signal mechanism to handle these exceptions, SGX
-can leverage special exception fixup provided by the vDSO. The kernel-provided
-vDSO function wraps low-level transitions to/from the enclave like EENTER and
-ERESUME. The vDSO function intercepts exceptions that would otherwise generate
-a signal and returns the fault information directly to its caller. This avoids
-the need to juggle signal handlers.
-
-.. kernel-doc:: arch/x86/include/uapi/asm/sgx.h
-   :functions: vdso_sgx_enter_enclave_t
-
-ksgxd
-=====
-
-SGX support includes a kernel thread called *ksgxd*.
-
-EPC sanitization
-----------------
-
-ksgxd is started when SGX initializes. Enclave memory is typically ready
-for use when the processor powers on or resets. However, if SGX has been in
-use since the reset, enclave pages may be in an inconsistent state. This might
-occur after a crash and kexec() cycle, for instance. At boot, ksgxd
-reinitializes all enclave pages so that they can be allocated and re-used.
-
-The sanitization is done by going through EPC address space and applying the
-EREMOVE function to each physical page. Some enclave pages like SECS pages have
-hardware dependencies on other pages which prevents EREMOVE from functioning.
-Executing two EREMOVE passes removes the dependencies.
-
-Page reclaimer
---------------
-
-Similar to the core kswapd, ksgxd is responsible for managing the
-overcommitment of enclave memory. If the system runs out of enclave memory,
-*ksgxd* “swaps” enclave memory to normal memory.
-
-Launch Control
-==============
-
-SGX provides a launch control mechanism. After all enclave pages have been
-copied, the kernel executes the EINIT function, which initializes the enclave.
-Only after this can the CPU execute inside the enclave.
-
-The EINIT function takes an RSA-3072 signature of the enclave measurement. It
-checks that the measurement is correct and that the signature was created with
-the key whose SHA256 hash is held in the four
-**IA32_SGXLEPUBKEYHASH{0, 1, 2, 3}** MSRs.
-
-Those MSRs can be configured by the BIOS to be either read-only or writable.
-Linux supports only the writable configuration in order to give the kernel full
-control over launch control policy. Before calling the EINIT function, the
-driver sets the MSRs to match the enclave's signing key.
-
-Encryption engines
-==================
-
-In order to conceal the enclave data while it is out of the CPU package, the
-memory controller has an encryption engine to transparently encrypt and decrypt
-enclave memory.
-
-In CPUs prior to Ice Lake, the Memory Encryption Engine (MEE) is used to
-encrypt pages leaving the CPU caches. MEE uses an n-ary Merkle tree with root in
-SRAM to maintain integrity of the encrypted data. This provides integrity and
-anti-replay protection but does not scale to large memory sizes because the time
-required to update the Merkle tree grows logarithmically in relation to the
-memory size.
-
-CPUs starting from Ice Lake use Total Memory Encryption (TME) in place of
-MEE. TME-based SGX implementations do not have an integrity Merkle tree, which
-means integrity and replay attacks are not mitigated. But it includes
-additional changes to prevent cipher text from being returned and SW memory
-aliases from being created.
-
-DMA to enclave memory is blocked by range registers on both MEE and TME systems
-(SDM section 41.10).
-
-Usage Models
-============
-
-Shared Library
---------------
-
-Sensitive data and the code that acts on it are partitioned from the application
-into a separate library. The library is then linked as a DSO which can be loaded
-into an enclave. The application can then make individual function calls into
-the enclave through special SGX instructions. A run-time within the enclave is
-configured to marshal function parameters into and out of the enclave and to
-call the correct library function.
-
-Application Container
----------------------
-
-An application may be loaded into a container enclave which is specially
-configured with a library OS and run-time which permits the application to run.
-The enclave run-time and library OS work together to execute the application
-when a thread enters the enclave.
-
-Impact of Potential Kernel SGX Bugs
-===================================
-
-EPC leaks
----------
-
-When EPC page leaks happen, a WARNING like this is shown in dmesg:
-
-"EREMOVE returned ... and an EPC page was leaked. SGX may become unusable..."
-
-This is effectively a kernel use-after-free of an EPC page, and due
-to the way SGX works, the bug is detected at freeing. Rather than
-adding the page back to the pool of available EPC pages, the kernel
-intentionally leaks the page to avoid additional errors in the future.
-
-When this happens, the kernel will likely soon leak more EPC pages, and
-SGX will likely become unusable because the memory available to SGX is
-limited. However, while this may be fatal to SGX, the rest of the kernel
-is unlikely to be impacted and should continue to work.
-
-As a result, when this happens, the user should stop running any new
-SGX workloads (or just any new workloads) and migrate all valuable
-workloads. Although a machine reboot can recover all EPC memory, the bug
-should be reported to Linux developers.
-
-
-Virtual EPC
-===========
-
-The implementation also has a virtual EPC driver to support SGX enclaves
-in guests. Unlike the SGX driver, an EPC page allocated by the virtual
-EPC driver doesn't have a specific enclave associated with it. This is
-because KVM doesn't track how a guest uses EPC pages.
-
-As a result, the SGX core page reclaimer doesn't support reclaiming EPC
-pages allocated to KVM guests through the virtual EPC driver. If the
-user wants to deploy SGX applications both on the host and in guests
-on the same machine, the user should reserve enough EPC (by taking out
-total virtual EPC size of all SGX VMs from the physical EPC size) for
-host SGX applications so they can run with acceptable performance.
-
-The architectural behavior is to restore all EPC pages to an uninitialized
-state after a guest reboot as well. Because this state can be reached only
-through the privileged ``ENCLS[EREMOVE]`` instruction, ``/dev/sgx_vepc``
-provides the ``SGX_IOC_VEPC_REMOVE_ALL`` ioctl to execute the instruction
-on all pages in the virtual EPC.
-
-``EREMOVE`` can fail for three reasons. Userspace must pay attention
-to expected failures and handle them as follows:
-
-1. Page removal will always fail when any thread is running in the
-   enclave to which the page belongs. In this case the ioctl will
-   return ``EBUSY`` independent of whether it has successfully removed
-   some pages; userspace can avoid these failures by preventing execution
-   of any vcpu which maps the virtual EPC.
-
-2. Page removal will cause a general protection fault if two calls to
-   ``EREMOVE`` happen concurrently for pages that refer to the same
-   "SECS" metadata pages. This can happen if there are concurrent
-   invocations to ``SGX_IOC_VEPC_REMOVE_ALL``, or if a ``/dev/sgx_vepc``
-   file descriptor in the guest is closed at the same time as
-   ``SGX_IOC_VEPC_REMOVE_ALL``; it will also be reported as ``EBUSY``.
-   This can be avoided in userspace by serializing calls to the ioctl()
-   and to close(), but in general it should not be a problem.
-
-3. Finally, page removal will fail for SECS metadata pages which still
-   have child pages. Child pages can be removed by executing
-   ``SGX_IOC_VEPC_REMOVE_ALL`` on all ``/dev/sgx_vepc`` file descriptors
-   mapped into the guest. This means that the ioctl() must be called
-   twice: an initial set of calls to remove child pages and a subsequent
-   set of calls to remove SECS pages. The second set of calls is only
-   required for those mappings that returned a nonzero value from the
-   first call. It indicates a bug in the kernel or the userspace client
-   if any of the second round of ``SGX_IOC_VEPC_REMOVE_ALL`` calls has
-   a return code other than 0.
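
The two-pass procedure described in item 3 of the removed text can be sketched in C as follows. This is a minimal illustration under stated assumptions, not code from the kernel tree or from any VMM. The names `vepc_remove_all_fn`, `vepc_reset_all()` and `fake_*` are hypothetical. The callback stands in for an `ioctl(fd, SGX_IOC_VEPC_REMOVE_ALL)` call on one `/dev/sgx_vepc` descriptor, and the simulated backend exists only so the sketch runs without SGX hardware. All vCPUs are assumed to be stopped and calls are assumed to be serialized, so the `EBUSY` cases of items 1 and 2 do not arise.

```c
#include <stddef.h>

/*
 * Hypothetical callback type: issues SGX_IOC_VEPC_REMOVE_ALL on one
 * /dev/sgx_vepc file descriptor and returns the ioctl() result
 * (0 when every page was removed, nonzero otherwise).
 */
typedef int (*vepc_remove_all_fn)(int fd);

/*
 * Two-pass reset of all virtual EPC mappings of one guest.
 * "retry" is caller-provided scratch space with nfds entries.
 * Returns 0 on success, -1 if any second-pass call is nonzero,
 * which the text describes as a kernel or userspace bug.
 */
static int vepc_reset_all(const int *fds, size_t nfds,
                          vepc_remove_all_fn remove_all,
                          unsigned char *retry)
{
    size_t i;
    int ret = 0;

    /* Pass 1: remove child pages; SECS pages with children may remain. */
    for (i = 0; i < nfds; i++)
        retry[i] = remove_all(fds[i]) != 0;

    /* Pass 2: only for mappings whose first call returned nonzero. */
    for (i = 0; i < nfds; i++) {
        if (retry[i] && remove_all(fds[i]) != 0)
            ret = -1;
    }
    return ret;
}

/*
 * Simulated backend, for illustration only. The "fd" indexes a table
 * holding the value the first call reports for that mapping; later
 * calls report 0 because the child pages are gone by then.
 */
static int fake_left[2] = { 1, 0 };
static int fake_calls[2];

static int fake_remove_all(int fd)
{
    int left = fake_left[fd];

    fake_calls[fd]++;
    fake_left[fd] = 0;
    return left;
}
```

With the simulated backend, `vepc_reset_all()` invokes the callback twice for the first descriptor and once for the second, and returns 0. A real caller would replace `fake_remove_all()` with a wrapper around the ioctl and inspect `errno` when a call fails.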