In this article, I recall the concept of separation of duties in computer systems and its application to enclaved programs. Operating systems such as Linux and Windows implement this concept to give applications different privileges. Protection rings are a widely deployed approach to separating privileges: a lower-numbered ring holds all the privileges of the higher-numbered rings, plus additional ones. Rings also play a central role in determining whether a process can access a given memory area (e.g., the memory of another process running concurrently). Although x86 processors provide four rings (0 to 3), Windows 7 and Windows Server 2008 (and their predecessors) use only two of them, with ring 0 corresponding to kernel mode and ring 3 to user mode, because earlier versions of Windows ran on processors that supported only two protection levels.
With the introduction of enclave technologies, in which a process runs in fully encrypted memory, a new kind of ring 3 execution has appeared. It is worth noting that enclaves run in user mode.
A process is nothing more than a set of machine instructions and data placed at certain virtual addresses (segments). A CPU based on the Intel x86 architecture is built to run multiple instances of application software, called processes. The operating system (OS) allocates the computer's resources to the running processes.
In cloud computing, multiple operating systems can run at the same time with the help of a hypervisor. In this scenario, the hypervisor acts like the kernel of an operating system and manages the hardware resources shared among the virtualized operating systems (also known as virtual machines). Each virtualized operating system can be thought of as “another process”: it allocates memory and requests CPU resources like any other process.
Isolation is a key feature for all software. It frees developers from having to worry about interactions with other software. To this end, operating systems rely on the concept of virtual memory addresses. From the software’s point of view, the whole virtual address space appears to be available for its own use. To implement the virtual memory abstraction, every process gets its own virtual address space, which references only the memory allocated to that process. Address translation uses a mapping defined by page tables, which are managed by the system software, to turn a virtual address into a physical address.
As illustrated below, every process gets its own virtual address space, and it is the task of the operating system to multiplex the system’s DRAM between the processes, while from the application developer’s point of view each process appears to have access to the computer’s whole DRAM.
This achieves isolation between processes and, at the same time, prevents application code from accessing memory-mapped devices directly. Address translation is carried out by dedicated hardware in the CPU, the so-called memory management unit (MMU).
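The translation step can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: a single-level page table stored as a dictionary, whereas real x86 CPUs walk multi-level tables in hardware. The names `translate`, `PageFault`, `page_table_a` and `page_table_b` are hypothetical and chosen for illustration.

```python
PAGE_SIZE = 4096  # bytes per page; 4 KiB is the common x86 page size


class PageFault(Exception):
    """Raised when a virtual address has no mapping in the page table."""


def translate(page_table, virtual_address):
    """Translate a virtual address using a {virtual page: physical frame} map."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    if page_number not in page_table:
        raise PageFault(hex(virtual_address))
    return page_table[page_number] * PAGE_SIZE + offset


# Hypothetical page tables that the OS set up for two processes.
page_table_a = {0: 5, 1: 9}
page_table_b = {0: 7}

# The same virtual address resolves to a different physical address per process.
physical_a = translate(page_table_a, 0x0042)  # frame 5 -> 0x5042
physical_b = translate(page_table_b, 0x0042)  # frame 7 -> 0x7042
```

Because each process only ever sees its own table, neither process can name a physical frame that the OS did not map for it. An access to an unmapped page raises `PageFault` in this model, which mirrors the fault that the MMU reports to the OS.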
Another key feature of virtualization is the distinction between software privilege levels, which is enforced by the CPU. Privilege separation implemented in hardware guarantees that a piece of software cannot damage other software indirectly by interfering with the system software that manages it.
Privilege levels are hierarchical: Ring 0 is the most privileged, and each higher-numbered ring is less privileged than the one before it. This is why code at a more privileged level can manipulate code at less privileged levels, but not vice versa.
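The hierarchy boils down to a single numeric comparison, sketched below in Python. This is a deliberate simplification: the real x86 checks also involve segment and page attributes, and the function name `may_access` is a hypothetical one used only for this illustration.

```python
KERNEL_RING = 0  # most privileged
USER_RING = 3    # least privileged


def may_access(current_ring, required_ring):
    """Allow access only if the caller is at least as privileged as required.

    Lower ring numbers mean more privilege, so the check is a numeric <=.
    """
    return current_ring <= required_ring


kernel_touches_user = may_access(KERNEL_RING, USER_RING)  # True
user_touches_kernel = may_access(USER_RING, KERNEL_RING)  # False
```

The asymmetry of the two results is the whole point: ring 0 code passes every check that ring 3 code passes, but not the other way round.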
For system designers, it is proven practice to divide the operating system into a kernel (high privilege level, Ring 0) and a user mode (lower privilege level, Ring 3) to achieve a safe environment for users. The kernel allocates the hardware resources to the other system components (e.g., drivers and less privileged processes) and exposes its services through system calls (syscalls). The lowest privilege level is used by standard applications such as web browsers and other user applications, and is therefore called user mode in UNIX environments. In Windows environments, kernel mode is sometimes described as unprotected mode, because the kernel can access the whole memory space, and user mode as protected mode, because limited access to memory reduces the damage a user can do to the system. This informal usage should not be confused with the x86 protected mode, which is the CPU operating mode in which the rings themselves are available.
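To illustrate the idea of the kernel as a controlled entry point, here is a minimal Python sketch of a system call dispatcher. It is a toy model, not how any real kernel is implemented: the classes `ToyCPU` and `ToyKernel`, the system call number 0 and the page counter are all assumptions made for this example.

```python
class ToyCPU:
    """Tracks only the current privilege ring (hypothetical model)."""

    def __init__(self):
        self.ring = 3  # start in user mode


class ToyKernel:
    def __init__(self, cpu):
        self.cpu = cpu
        self.free_pages = 8  # a resource that only the kernel may manage
        # System call table: number -> handler.
        self.syscall_table = {0: self._sys_alloc_page}

    def _sys_alloc_page(self):
        assert self.cpu.ring == 0, "handlers run in kernel mode only"
        if self.free_pages == 0:
            return -1
        self.free_pages -= 1
        return 0

    def syscall(self, number):
        """Single controlled entry point from user mode into the kernel."""
        handler = self.syscall_table.get(number)
        if handler is None:
            return -1  # unknown system call
        self.cpu.ring = 0  # enter kernel mode
        try:
            return handler()
        finally:
            self.cpu.ring = 3  # always return to user mode


cpu = ToyCPU()
kernel = ToyKernel(cpu)
result = kernel.syscall(0)    # user code requests one page
unknown = kernel.syscall(99)  # not in the table, so it is rejected
```

User code never modifies `free_pages` itself. It can only ask, and the kernel decides, which is the property that makes the split between kernel mode and user mode useful.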
Bringing Enclaves into this Picture
Enclave execution always happens in protected mode, at ring 3, and uses the address translation set up by the OS kernel and hypervisor. To avoid leaking private data, a CPU that is executing enclave code does not directly service an interrupt, a fault (e.g., a page fault), or a VM exit. Instead, the CPU first performs an Asynchronous Enclave Exit (AEX) to switch from enclave code to ring 3 code, and only then services the interrupt, fault, or VM exit.
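The ordering matters more than the details, so here is a toy Python model of it. Everything in it is an assumption made for illustration: the class `EnclaveCPU`, the two registers, the `SYNTHETIC` placeholder values and the dictionary that stands in for a save area inside enclave memory. It does not reproduce the real SGX state layout.

```python
class EnclaveCPU:
    """Hypothetical CPU model with just enough state for the example."""

    def __init__(self):
        self.in_enclave = False
        self.registers = {}
        self.seen_by_handler = None


SYNTHETIC = {"rax": 0, "rbx": 0}  # placeholder values, not the real SGX ones


def asynchronous_exit(cpu, save_area):
    """Save enclave state inside the enclave, then scrub the registers."""
    save_area.update(cpu.registers)  # the save area lives in enclave memory
    cpu.registers = dict(SYNTHETIC)
    cpu.in_enclave = False


def deliver_interrupt(cpu, save_area, handler):
    """Leave the enclave first; only then let the handler run."""
    if cpu.in_enclave:
        asynchronous_exit(cpu, save_area)
    handler(cpu)


def untrusted_handler(cpu):
    # Records what OS-level code can observe when it services the event.
    cpu.seen_by_handler = dict(cpu.registers)


cpu = EnclaveCPU()
cpu.in_enclave = True
cpu.registers = {"rax": 0xC0FFEE, "rbx": 42}  # enclave secrets
enclave_save_area = {}
deliver_interrupt(cpu, enclave_save_area, untrusted_handler)
```

After the call, `cpu.seen_by_handler` holds only the synthetic values, while the real register contents sit in `enclave_save_area`, ready to be restored when the enclave is resumed.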
The CPU performs an AEX by saving the CPU state into a predefined area inside the enclave and transferring control to a pre-specified instruction outside the enclave, replacing the CPU registers with synthetic values in the process.
The allocation of enclave page cache (EPC) pages to enclaves is delegated to the OS kernel (or hypervisor). The OS communicates its allocation decisions to the SGX implementation via special ring 0 CPU instructions. The OS can also evict EPC pages into untrusted DRAM and later load them back, using dedicated CPU instructions. SGX uses cryptographic protections to ensure the confidentiality, integrity and freshness of the evicted EPC pages while they are stored in untrusted memory.
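The three guarantees for evicted pages can be illustrated with a small Python sketch. It is not the SGX algorithm: I assume a toy SHA-256 keystream for confidentiality, an HMAC for integrity, and a per-slot nonce kept in trusted storage for freshness. The names `evict`, `load`, `version_slots`, `ENC_KEY` and `MAC_KEY` are hypothetical, and the cipher must not be used for real data.

```python
import hashlib
import hmac
import os

# Stand-ins for keys that never leave the CPU (assumption of this model).
ENC_KEY = os.urandom(32)
MAC_KEY = os.urandom(32)


def _keystream(nonce, length):
    """Toy SHA-256 counter-mode keystream (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = ENC_KEY + nonce + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def evict(page, version_slots, slot):
    """Encrypt and authenticate a page; keep its version in trusted storage."""
    nonce = os.urandom(16)
    version_slots[slot] = nonce  # trusted copy, used for the freshness check
    ciphertext = _xor(page, _keystream(nonce, len(page)))
    tag = hmac.new(MAC_KEY, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag  # this blob goes to untrusted DRAM


def load(blob, version_slots, slot):
    """Verify integrity and freshness, then decrypt."""
    nonce, ciphertext, tag = blob
    expected = hmac.new(MAC_KEY, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    if version_slots.get(slot) != nonce:
        raise ValueError("stale page (replay)")
    version_slots[slot] = None  # a blob can be loaded back only once
    return _xor(ciphertext, _keystream(nonce, len(ciphertext)))


slots = {}
page = b"enclave secret".ljust(64, b"\0")
blob = evict(page, slots, slot=0)
restored = load(blob, slots, slot=0)
```

Loading `blob` a second time fails with "stale page (replay)", because the trusted slot no longer holds its nonce. That is the freshness property: the untrusted OS stores the data, but it cannot feed the enclave an old copy of a page.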