

SmashEx: Smashing SGX Enclaves Using Exceptions

Jinhua Cui∗†
National University of Defense Technology
Changsha, China
jhcui.gid@gmail.com

Jason Zhijingcheng Yu∗
National University of Singapore
Singapore
yu.zhi@comp.nus.edu.sg

Shweta Shinde
ETH Zürich
Zürich, Switzerland
shweta.shivajishinde@inf.ethz.ch

Prateek Saxena
National University of Singapore
Singapore
prateeks@comp.nus.edu.sg

Zhiping Cai
National University of Defense Technology
Changsha, China
zpcai@nudt.edu.cn

ABSTRACT
Exceptions are a commodity hardware functionality which is central to multi-tasking OSes as well as event-driven user applications. Normally, the OS assists the user application by lifting the semantics of exceptions received from hardware to program-friendly user signals and exception handling interfaces. However, can exception handlers work securely in user enclaves, such as those enabled by Intel SGX, where the OS is not trusted by the enclave code?

In this paper, we introduce a new attack called SmashEx which exploits the OS-enclave interface for asynchronous exceptions in SGX. It demonstrates the importance of a fundamental property of safe atomic execution that is required on this interface. In the absence of atomicity, we show that asynchronous exception handling in SGX enclaves is complicated and prone to re-entrancy vulnerabilities. Our attacks do not assume any memory errors in the enclave code, side channels, or application-specific logic flaws. We concretely demonstrate exploits that cause arbitrary disclosure of enclave private memory and code-reuse (ROP) attacks in the enclave. We show reliable exploits on two widely-used SGX runtimes, Intel SGX SDK and Microsoft Open Enclave, running OpenSSL and cURL libraries respectively. We tested a total of 14 frameworks, including Intel SGX SDK and Microsoft Open Enclave, 10 of which are vulnerable. We discuss how the vulnerability manifests on both SGX1-based and SGX2-based platforms. We present potential mitigation and long-term defenses for SmashEx.

CCS CONCEPTS
• Security and privacy → Trusted computing; Software security engineering.

∗Both authors contributed equally to this research.
†The author carried out this research while visiting National University of Singapore.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CCS ’21, November 15–19, 2021, Virtual Event, Republic of Korea
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8454-4/21/11...$15.00
https://doi.org/10.1145/3460120.3484821

KEYWORDS
TEE; Intel SGX; Exception handling; Atomicity; Re-entrancy vulnerability; Code-reuse attack

ACM Reference Format:
Jinhua Cui, Jason Zhijingcheng Yu, Shweta Shinde, Prateek Saxena, and Zhiping Cai. 2021. SmashEx: Smashing SGX Enclaves Using Exceptions. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS ’21), November 15–19, 2021, Virtual Event, Republic of Korea. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3460120.3484821

1 INTRODUCTION
Exceptions are a basic functionality available on modern processors and are ubiquitously used by the OS and real-world applications. The OS makes use of exceptions for multiplexing processes and resources, e.g., via timer interrupts and page faults. Applications use programmatic constructs, such as exception and signal handling, to deal with dynamic events or runtime errors. The underlying OS is in charge of monitoring and delivering hardware-generated exceptions to a user process. This design allows application developers to focus on what to do when an event occurs.

Recently, a new form of hardware isolation has been enabled by enclaves such as those provided by Intel SGX. SGX allows user applications to be partitioned into hardware-isolated compartments called enclaves, which are protected from privileged system software (e.g., the hypervisor and the OS). The main guarantee provided by enclaves is protecting the confidentiality and integrity of code running in them. Enclaves are an important step, for example, towards reducing the dependence on privileged OSes and towards confidential computation [3, 4]. This presents a unique security model: a trusted enclave running alongside an untrusted OS. This paper studies how exceptions are handled on SGX, a platform where the OS and user enclave do not trust each other.

Exceptions are events that hardware generates and software handles. There are two design choices for enabling exceptions for enclaves. The trusted hardware can directly deliver the exceptions to the enclave code. Alternatively, the hardware can deliver them to the OS, as in non-SGX systems. The current SGX implementation takes the second approach. In such a design, the OS can route an exception to the enclave along with the description of the exception event. Once the exception is delivered to the enclave by either mechanism, the enclave can execute the exception handler. Since the enclave does not trust the OS, this interface requires careful design to ensure security. There are three entities interacting: the user enclave, the trusted SGX hardware, and the untrusted OS.

Many exceptions are synchronous in the sense that the enclave code can control when these exceptions are raised. However, certain exceptions are asynchronous, i.e., they can be triggered outside the control of the enclave. The OS has the power to trigger such exceptions at any time. The design of any hardware enclave abstraction that supports asynchronous exceptions needs to provide at least three security properties on enclave-OS context switches:

• Register state save/restore. When an exception interrupts an enclave, the integrity and confidentiality of the requisite enclave register state should be preserved in the presence of a malicious OS.

• Safe control resumption. After an exception, the execution resumes either at the point of interruption or at the start of an enclave-defined handler.

• Atomicity. The hardware must support sufficient mechanisms for the enclave to prevent exception handling when it is executing inside certain critical sections.

The SGX hardware provides the first two properties, but not the third. An enclave can turn off delivery of certain programmer-defined exceptions throughout its execution by statically setting its hardware configuration. However, if the enclave does not statically disable exceptions (which is useful for signal handling), SGX does not allow the enclave to selectively mask exceptions at certain times during execution. This effectively means that the SGX hardware does not provide any explicit runtime primitives for ensuring atomicity of critical sections in the enclave when exceptions are statically enabled. The lack of such a primitive opens up enclaves to re-entrancy vulnerabilities, which can in turn lead to serious exploits.

To demonstrate this clearly, we introduce a powerful attack called SmashEx, which does not assume any side channels or pre-existing memory safety bugs in the enclave application code. We successfully execute the attack to compromise both confidentiality and integrity guarantees for enclave applications on SGX. Our attack on SSL implementations, for instance, can allow a malicious OS to extract secret keys residing in private memory. To demonstrate the full power of SmashEx, we leverage the re-entrancy vulnerability to effect code-reuse (e.g., ROP [52]) and arbitrary memory disclosure attacks on enclaves. We construct end-to-end PoC exploits for two widely-used SGX runtimes: Intel SGX SDK [17] and Microsoft Open Enclave [47]. We target an OpenSSL implementation based on Intel SGX SDK and the cURL application based on Open Enclave respectively. The attacks are demonstrated on the latest SGX2 hardware, but also extend to SGX1 runtimes that have asynchronous exception handling enabled.

In this paper, we explain why the root re-entrancy vulnerability exploited by SmashEx is fundamental: if we want to support asynchronous exception handling on SGX, careful re-entrant design in the enclave is critical. In total, we survey 14 SGX runtime frameworks and deem that the vulnerability affects 10 of them on SGX2. While the exploits do not immediately carry over to 4 of the runtimes, we point out that this comes at the cost of a limit to their exception handling functionality or extra complexity in their design and implementation. We discuss the effectiveness of various software mitigations for SmashEx. We recommend potential hardware abstractions for exposing atomic execution primitives to enclaves to simplify defenses. These may be of independent interest to future enclave designs.

Figure 1: SGX enclave interfaces and memory protection.

Contribution. Much prior attention has been devoted to safe data and control exchange at the enclave-OS interface (e.g., for Iago attacks [36]). Our main contribution is to highlight a third missing defense primitive at the enclave-OS interface: ensuring atomicity in re-entrant enclave code. When enclaves support the standard programming model of asynchronous exceptions, re-entrancy concerns arise. Our SmashEx attack makes this issue concrete for study.

2 BACKGROUND
Intel SGX introduces the notion of enclaves: hardware-isolated memory regions for sensitive execution. We refer to the code that executes inside an enclave as enclave software. The code and the data of enclave software, including its stack (enclave private stack), are stored inside enclave memory and protected by the SGX hardware. In the SGX trust model, only the hardware and the enclave software are trusted. All the other software on the system, including privileged software such as the OS, is considered untrusted. This includes the user process in charge of creating and interacting with the enclave, which we refer to as its host process. The SGX hardware does not allow the untrusted software to access enclave memory. However, enclave software can read or write memory regions outside the enclave boundary, which are also accessible to the host process. We refer to such a shared virtual address space accessible to both an enclave and its host process as the public memory.

Enclave software requires mechanisms to request services from the non-enclave/OS code as well as to receive notifications (e.g., signals) from it. The SGX hardware has two kinds of interfaces, synchronous and asynchronous, for switching between the OS and an enclave. Figure 1 depicts such interfaces alongside the protected memory region for an SGX enclave.

Synchronous Entry/Exits. Synchronous entry/exits are needed in enclaves to interface with the host application and the OS for synchronous or blocking communication. To help safeguard the interface, the SGX hardware strictly restricts the transfer of control between enclave and non-enclave code. Two specific instructions, EENTER and EEXIT, are used to synchronously enter and exit an enclave respectively. The EENTER instruction jumps to a fixed enclave entry point that is pre-configured during enclave creation. An enclave can specify a public memory location and exit to it via the EEXIT instruction. EENTER and EEXIT do not scrub or replace the register state during context switches. Instead, the hardware keeps most of the registers unchanged. It is the responsibility of the enclave software to prepare the register state for enclave execution after EENTER and to prevent leaking secrets through the register state after EEXIT. This is necessary for normal functionality, for example, to propagate any data arguments between the enclave and the OS. An enclave can provide functions for untrusted software to invoke in a so-called ecall, which consists of a paired EENTER and a subsequent EEXIT. In addition, synchronous entry/exits can also be used to support ocalls, where enclaves request to invoke functions provided by untrusted software (shown in Figure 2).

Figure 2: Synchronous entry/exit in an ocall on SGX.
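The division of labor around an ocall can be illustrated with a small model. This is a minimal sketch, not SGX SDK code: the class `ToyEnclaveThread`, its register set, and the choice of `rax` as the return-value register are assumptions made only for illustration. It shows the enclave saving its context on the private stack, clearing registers before the exit, and restoring the context on return instead of trusting what the host left in the registers.

```python
from dataclasses import dataclass, field

# Hypothetical, reduced register set used only by this toy model.
REGISTERS = ("rax", "rbx", "rdi", "rsi", "rsp", "rbp")


@dataclass
class ToyEnclaveThread:
    regs: dict = field(default_factory=lambda: dict.fromkeys(REGISTERS, 0))
    private_stack: list = field(default_factory=list)

    def ocall_exit(self, args):
        """Software steps before EEXIT: save context, clear registers, pass arguments."""
        self.private_stack.append(dict(self.regs))   # context kept in private memory
        self.regs = dict.fromkeys(REGISTERS, 0)      # nothing secret may remain
        self.regs.update(args)                       # only ocall arguments leave
        return dict(self.regs)                       # what untrusted code observes

    def ocall_return(self, host_regs):
        """Software steps after EENTER: take the result, restore the saved context."""
        result = host_regs["rax"]                    # assumed return-value register
        self.regs = self.private_stack.pop()         # host-chosen rsp etc. are dropped
        self.regs["rax"] = result
        return dict(self.regs)
```

In this model the hardware contributes nothing to register hygiene, which mirrors the statement that EENTER and EEXIT leave most registers unchanged; every protective step is enclave software.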

Since EEXIT and EENTER do not take care of the register state, the enclave code has to save the enclave CPU context on its private stack and restore it when returning from the ocall later. The ocall interface has been the subject of much scrutiny in prior work, largely due to the risk of Iago attacks [36, 44, 58, 61].

Asynchronous Entry/Exits. In addition to synchronous exits, an enclave can exit asynchronously as a result of exceptions (e.g., timer interrupts, page faults, division-by-zero). During such an AEX (Asynchronous Enclave eXit), the enclave stores the current enclave execution context in a special data structure called the State Save Area (SSA) located inside the enclave private memory. Asynchronous entry/exits are different from synchronous events because they can arise at any time during the enclave execution, interrupting it involuntarily. To ensure safe enclave-OS transitions, the SGX hardware implements the following mechanisms:

• Safe control resumption. At an AEX, the hardware automatically stores the current instruction pointer (rip) in the SSA. The untrusted host process may execute an ERESUME instruction later to transfer control back to the enclave. At this point, SGX hardware enforces that the enclave resumes execution from the rip value stored inside the SSA.

• Register save/restore. In addition to rip, the hardware saves the remaining enclave execution context (e.g., general-purpose registers) in the SSA. Before exit, the hardware scrubs the register values to prevent data leakage through them. On ERESUME, the hardware restores the register values from SSA.
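The two mechanisms above can be summarized in a few lines of Python. This is only a sketch of the described behavior under simplifying assumptions: the names `ToyEnclave`, `aex`, and `eresume` are hypothetical, the SSA is modeled as a single dictionary, and scrubbed registers are represented by zero.

```python
SCRUBBED = 0  # assumption: stand-in for whatever value the OS sees after scrubbing


class ToyEnclave:
    """Holds the State Save Area, which lives in enclave private memory."""

    def __init__(self):
        self.ssa = None


def aex(regs, enclave):
    """Model of an asynchronous exit: save rip and registers, then scrub."""
    enclave.ssa = dict(regs)                    # full context, including rip
    return {name: SCRUBBED for name in regs}    # what untrusted software can read


def eresume(enclave, host_regs):
    """Model of ERESUME: the context comes from the SSA, never from the host."""
    restored = dict(enclave.ssa)                # host_regs is deliberately ignored
    enclave.ssa = None
    return restored
```

The point of the sketch is that `eresume` never reads `host_regs`: the resumption point and register values are taken from enclave-private state, which is why a plain AEX followed by ERESUME gives the OS no influence over the enclave context.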

Asynchronous Exception Handling in SGX Enclaves. The simple mechanisms above are sufficient to protect an enclave while allowing exceptions to interrupt its execution. However, in order to also allow the enclave to handle exceptions (including deciding the resumption point by modifying the SSA content), a more complex mechanism is designed (shown in Figure 3) in SGX runtimes. Instead of resuming the enclave immediately via ERESUME, the untrusted host process re-enters the enclave using EENTER and passes relevant information about the exception. Note that this new flow starts with a normal EENTER which leaves the rsp value uninitialized, so the enclave has to set up its private stack before executing any real exception handler. In both Intel SGX SDK and Microsoft Open Enclave (among others; see Section 8), the enclave loads the stack pointer from the saved rsp in the SSA, effectively reusing the same stack of the interrupted thread. After the enclave finishes handling the exception in the SDK, it uses EEXIT to return control to the untrusted software, which then resumes the enclave execution via ERESUME.

Key Observation. To perform exception handling, the enclave needs to be re-entered and operate on a context that overlaps with that of the interrupted thread. Therefore, the above design for in-enclave handlers requires the enclave to be re-entrant.

SGX Runtimes. Since the SGX enclave programming model is significantly different from a traditional one, it can be cumbersome for developers to use the low-level SGX interfaces. Therefore, enclave developers usually use frameworks that provide high-level abstractions to hide away the details of exception handling, ocall, ecall, and so on. Such software frameworks, referred to as SGX enclave runtimes in this paper, execute inside enclaves. Since runtimes are a part of the enclave trusted computing base, their design and implementation are crucial to the security of enclave applications. We survey 14 runtimes (Table 1) and find that they have varying degrees of exception handling support.

Figure 3: Exception handling mechanism in Intel SGX SDK. The SGX hardware performs an AEX and transfers the control to the OS when an exception occurs (①②). The OS delivers a corresponding signal to the host process, which then re-enters the enclave via EENTER (③). The enclave performs in-enclave exception handling (④) and exits to the host process via EEXIT (⑤), which then resumes the enclave execution via ERESUME (⑥). During this process, the CPU state of the interrupted enclave is first saved into the SSA upon the AEX, from which it is then copied to the enclave private stack during in-enclave exception handling.
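The round trip of Figure 3 can be traced with a short model. The sketch below is illustrative only: `exception_round_trip`, `FRAME_SIZE`, and the dictionary-based memory are hypothetical names and simplifications, not part of any SGX runtime. It captures the two facts the text relies on: the handler derives its stack location from the rsp saved in the SSA, and it may change the resumption point by editing the SSA.

```python
FRAME_SIZE = 0x100  # hypothetical size of the exception-info frame


def exception_round_trip(interrupted_regs, memory, handler):
    """Trace AEX -> EENTER -> in-enclave handler -> EEXIT -> ERESUME."""
    trace = []

    ssa = dict(interrupted_regs)               # AEX: hardware saves the context
    trace.append("AEX")

    trace.append("EENTER")                     # host re-enters after the signal
    frame_address = ssa["rsp"] - FRAME_SIZE    # stack pointer derived from the SSA
    memory[frame_address] = dict(ssa)          # context copied to the private stack
    new_rip = handler(memory[frame_address])   # user-level handler sees the copy
    if new_rip is not None:
        ssa["rip"] = new_rip                   # handler picks the resumption point

    trace.append("EEXIT")
    trace.append("ERESUME")                    # hardware restores from the SSA
    return dict(ssa), trace
```

For example, a handler for a faulting instruction could return an address past that instruction, and the enclave would resume there. The frame is written at an address computed from `ssa["rsp"]`, so the design is only safe if that saved rsp is a valid enclave stack pointer at the moment of interruption.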


Software                      Version   Vulnerable to SmashEx?   Exception handling
Intel SGX SDK [17]            2.13      Yes                      Yes
Microsoft Open Enclave [48]   0.15.0    Yes                      Yes
RedHat Enarx [13]             02dab73   Yes                      Yes
Graphene-SGX [35]             1.1       No                       Yes
Apache Teaclave [1]           0.2.0     Yes                      Yes
Google Asylo [31]             0.6.2     Yes                      Yes
Fortanix Rust EDP [41]        3341ce1   No                       No
Alibaba Inclavare [15]        0.6.0     No                       No
Ratel [26]                    1.1       No                       Yes
SGX-LKL [51]                  b6e838e   Yes                      Yes
EdgelessRT [12]               8a6f11f   Yes                      Yes
Rust SGX SDK [64]             1.1.3     Yes                      Yes
CoSMIX [50]                   4e67f55   Yes                      Yes
Veracruz [28]                 cbf01a9   Yes                      Yes

Table 1: Summary of different enclave runtime designs. In the "Vulnerable to SmashEx?" column, Yes denotes that the enclave runtime is deemed exploitable by SmashEx and No denotes that the enclave runtime is deemed unexploitable by SmashEx under any enclave settings. In the "Exception handling" column, Yes denotes that handling asynchronous exceptions is supported in enclaves and No denotes no support for handling exceptions in enclaves.
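The survey totals quoted in the text can be cross-checked against Table 1 with a few lines of Python. The dictionary below simply transcribes the two marks per runtime from the table as booleans (vulnerable, supports in-enclave exception handling); the variable names are illustrative.

```python
# (vulnerable to SmashEx, supports in-enclave exception handling), per Table 1
RUNTIMES = {
    "Intel SGX SDK": (True, True),
    "Microsoft Open Enclave": (True, True),
    "RedHat Enarx": (True, True),
    "Graphene-SGX": (False, True),
    "Apache Teaclave": (True, True),
    "Google Asylo": (True, True),
    "Fortanix Rust EDP": (False, False),
    "Alibaba Inclavare": (False, False),
    "Ratel": (False, True),
    "SGX-LKL": (True, True),
    "EdgelessRT": (True, True),
    "Rust SGX SDK": (True, True),
    "CoSMIX": (True, True),
    "Veracruz": (True, True),
}

total = len(RUNTIMES)
vulnerable = sum(1 for vuln, _ in RUNTIMES.values() if vuln)
with_handling = sum(1 for _, handles in RUNTIMES.values() if handles)
# Runtimes that handle exceptions in the enclave yet are deemed unexploitable.
handling_but_safe = sorted(
    name for name, (vuln, handles) in RUNTIMES.items() if handles and not vuln
)

print(total, vulnerable, with_handling, handling_but_safe)
```

The tally gives 14 runtimes, 10 deemed vulnerable, and 12 with in-enclave exception handling, leaving Graphene-SGX and Ratel as the two runtimes that support exception handling but are deemed unexploitable.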

3 ATTACK OVERVIEW
An enclave must be re-entrant to safely handle exceptions by itself. However, an important primitive, namely atomicity, is missing. We illustrate the need for atomicity by outlining SmashEx, a novel attack that exploits a re-entrancy vulnerability present in many SGX frameworks.

3.1 Threat Model
Assumptions. The security-sensitive part of the application executes, together with any runtime libraries, in the victim enclave. The enclave interacts with the external environment, including the OS and the host process, which are assumed to be arbitrarily malicious. The SGX hardware is trusted. The enclave code is assumed to be benign and we do not assume that it has any auxiliary bugs or side channels that aid the attacker. We assume that the enclave code, however, is running without ASLR (Address Space Layout Randomization), i.e., certain critical addresses are deterministic and known to the attacker. This is the default setup on nearly all SGX platform runtimes we have surveyed. We discuss how to extend our attacks when ASLR or other auxiliary defenses are enabled in Section 9. The enclave is assumed to have enabled the asynchronous exception handling interface.

Attack Goal & Scope. Our attack goal is to break the basic memory protection guarantees offered by SGX enclaves. Specifically, we aim to break confidentiality by enabling an arbitrary memory disclosure attack by which the attacker can reveal the victim enclave memory contents in full. To break integrity, our attack aims to enable ROP attacks in the enclave. Most runtimes that support in-enclave exception handling are susceptible to our attack. In our survey summarized in Table 1, 12/14 runtimes support this functionality. Among the 12 runtimes, we find that 10 are susceptible to the SmashEx attack. We present proof-of-concept (PoC) exploits for two of the most popular runtimes, Intel SGX SDK and Open Enclave, as case studies in Section 7. Intel SGX SDK is widely used in multiple other runtimes and Open Enclave is part of the Microsoft Confidential Computing Framework (CCF). Since SmashEx arises directly on the enclave-OS interface, any application that uses a vulnerable runtime is exploitable. We have also constructed PoC exploits for all but RedHat Enarx, for which we have confirmed the exploitability through code inspection (see Section 8).

Figure 4: State diagram depicting the re-entrancy vulnerability. The clear boxes denote enclave states, and the gray boxes denote OS states. The solid black arrows show the ocall execution flow where the dotted black box denotes the critical section. The dashed red arrows show the in-enclave exception handling flow, injected by the OS when the enclave is in the critical section, thus corrupting the enclave state (i.e., stack).

3.2 The Re-entrancy Vulnerability
To demonstrate the re-entrancy issue clearly, we outline the flow of exception handling on Intel SGX SDK for SGX2, both under normal execution and under a SmashEx attack. It executes similarly on SGX1 and extends to other runtimes (see Section 8).

Consider the flow that handles returning from an ocall. Figure 4 shows this flow with black solid arrows inside the dotted box, which executes logic labeled A → B → C in that sequence. This flow, however, can be interrupted when asynchronous exceptions arrive. The dashed red arrows in Figure 4 show the execution flow when handling in-enclave exceptions corresponding to the EENTER → EEXIT → ERESUME previously highlighted in Section 2. Both flows are benign, but operate on overlapping enclave contexts. This clearly highlights that such ocall return flow and the exception handling flow should be written with care to ensure that they interleave safely. Specifically, when the ocall return flow is interrupted, the enclave should be in a consistent state for the exception handling flow to progress correctly, and when the exception handling flow completes, the enclave state should also be ready for the enclave to resume. This adds considerable complexity when in-enclave exceptions are to be supported, regardless of whether the OS is acting maliciously.

In this example, the main vulnerability point that enables SmashEx is in the ocall return flow, which requires atomicity for executing certain critical sections that update the enclave context (shown in the dotted black box in Figure 4). The state transitions for the ocall return flow must check and clean up the register values returned by the OS and then use them to set up the enclave private stack. Before this is finished, the enclave is in an inconsistent state, with register values (in particular, the stack pointer rsp) provided by the untrusted OS.

Figure 5: Sub-components of SmashEx. (a) Injecting an AEX at the fixed entry point; (b) Corrupting the in-enclave memory (corrupted data is depicted in red); (c) Returning to the target that the attacker has specified in the controlled anchor (in blue).

It is important, therefore, that the enclave should not be interrupted to perform exception handling when it is still executing in such an inconsistent state. However, the SGX enclave abstraction does not provide primitives for ensuring atomic execution of critical sections in the enclave. For example, it is not possible to selectively mask interrupts for certain critical sections. The enclave must either statically disable all user-defined asynchronous exception handling¹ or risk being interrupted arbitrarily if exception handling is statically allowed. The lack of atomicity results in a powerful attack vector which SmashEx leverages.
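The role of the missing primitive can be made concrete with a toy interleaving model. Everything here is an assumption made for illustration: the three-step breakdown of the ocall return flow, the function name `handler_stack_pointer`, and above all the `can_mask` flag, which stands for a hypothetical ability to defer exception handling during a critical section that, as the text explains, SGX does not offer.

```python
# Simplified steps of the ocall return flow (an assumed breakdown).
STEPS = ("clean_registers", "restore_stack_pointer", "return_to_caller")


def handler_stack_pointer(host_rsp, saved_rsp, exception_before, can_mask):
    """Return the rsp an in-enclave exception handler would find in the SSA.

    exception_before: index of the step before which the exception arrives.
    can_mask: hypothetical primitive that defers handling until the enclave
              context is consistent again.
    """
    rsp = host_rsp          # EENTER leaves the host-chosen rsp in place
    consistent = False      # becomes True once the private stack is restored
    pending = False
    observed = None
    for index, step in enumerate(STEPS):
        if index == exception_before:
            pending = True
        if pending and (consistent or not can_mask):
            observed = rsp  # AEX saves the current rsp; the handler reuses it
            pending = False
        if step == "restore_stack_pointer":
            rsp = saved_rsp
            consistent = True
    if pending:
        observed = rsp
    return observed
```

With `can_mask=False`, an exception arriving before the first step makes the handler operate on the host-chosen `rsp`. With the hypothetical masking primitive, the same exception is handled only after the stack pointer has been restored, so the handler sees the enclave's own value.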

3.3 High-level Attack Steps
As illustrated in Figure 5, SmashEx starts with the attacker triggering an exception immediately after the enclave is entered (via EENTER) to return from an ocall (Figure 5a). The hardware copies the attacker-controlled registers into the SSA region, which the enclave exception handler in turn uses to determine the stack address and the data to later use. This gives the attacker the capability of corrupting the stack content of the enclave (Figure 5b). By carefully crafting the register values, the attacker can exploit this capability to corrupt an enclave stack location that the enclave will later use to load a return address (we call this location an anchor), thereby hijacking the control flow of the enclave (Figure 5c). Figure 6 describes the detailed steps SmashEx follows, which we will discuss in Sections 4, 5, and 6.
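Why a handler that trusts the saved rsp can overwrite an anchor is easiest to see in an abstract memory model. The sketch below is a conceptual illustration only, with no relation to real enclave layouts: memory is a dictionary, `FRAME_REGISTERS` is a hypothetical two-register frame, and all addresses are made up.

```python
FRAME_REGISTERS = ("r8", "r9")   # hypothetical layout of the copied context
WORD = 8


def handler_copy(memory, ssa):
    """Copy saved registers to the location named by the saved rsp."""
    sp = ssa["rsp"]
    for offset, name in enumerate(FRAME_REGISTERS):
        memory[sp + WORD * offset] = ssa[name]


def load_return_address(memory, anchor):
    """Model of a later `ret` that reads its target from the anchor."""
    return memory[anchor]
```

If the saved rsp points at ordinary stack space, the anchor is untouched and the later return goes to the original address. If the saved context was captured while registers still held values chosen outside the enclave, the first copied word lands wherever that rsp points, including on the anchor itself.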

3.4 Difference to Prior Attacks
SmashEx is the first attack that demonstrates the exception handling attack vector in Intel SGX. At first glance, SmashEx may appear conceptually close to known prior attacks on enclave memory safety, synchronization, and scheduling. However, SmashEx is significantly different. Briefly, previous attacks assume much more than SmashEx:

• AsyncShock [2] assumes that the synchronization logic (e.g.,using mutexes) between two or more enclave threads isbuggy. In contrast, the vulnerability SmashEx exploits arisesdue to atomicity violation in the enclave-OS interaction, towhich thread synchronization is irrelevant.

1In SGX, the runtime can disable all in-enclave exception handling by setting theTCS.NSSA enclave configuration to one.

Figure 6: SmashEx Overview. The attacker directly controls the untrusted software, including the OS and the host process. Data that the attacker can control is in red.

• Game of Threads [63] shows that by manipulating thread scheduling, a malicious OS is able to stably exploit faulty thread synchronization logic in enclave applications. For example, for machine learning training workloads without frequent thread synchronization, malicious scheduling can degrade accuracy and bias the model.

• The Guard’s Dilemma [33] assumes the existence of memory vulnerabilities in the enclave application and uses them to demonstrate code-reuse attacks such as ROP. In SmashEx, we do not assume any such pre-existing memory errors.

4 ARBITRARY WRITE CAPABILITY

The SmashEx attack starts with enabling the attacker to perform an arbitrary write to an attacker-specified anchor location (Figure 5a).

Step 1: Preparing Malicious Register Values. The attacker loads malicious values into registers right before EENTER to ensure that the attacker-specified register state is preserved.

Step 2: Injecting an Exception at the Precise Time/Location. SmashEx requires that the AEX event occurs shortly after enclave entry, before the enclave cleans up the register state. There are at least two ways to achieve this:


(a) Page faults. We identify the page that contains the first enclave instruction and pre-set its access permissions to be non-executable. The malicious OS is still in charge of the enclave page permissions and can trivially do this. In particular, the OS knows the exact page address because it has to set up the enclave memory layout before launching an enclave. Note that this address cannot be hidden from the attacker (e.g., via randomization [33, 54]) because the hardware has to know the exact entry point for the enclave. As described in Section 3, we use this mechanism to trigger a page fault when the enclave attempts to execute the first enclave instruction. When the page fault is triggered, the hardware delivers it to the attacker-controlled OS, which then forwards it to the enclave for handling.

(b) Timer interrupts. We can also interrupt the enclave execution via the APIC timer. Prior work [62] has shown that the OS can invoke the APIC timer interface to precisely interrupt the enclave execution at any desired point. The remaining challenge is to inject the timer interrupt at the exact moment. Before returning from an ocall, we set the APIC timer to the one-shot mode. We set the timer count such that the interrupt will occur immediately after EENTER. In order to stably achieve this, we execute the enclave in debug mode and tune the timer count to interrupt at the right time. In our experiments, when we reuse this same timer count in production mode, we can reliably inject the interrupt.

Step 3: Re-entering the Enclave for Exception Handling. After the untrusted OS gains control because of the AEX, it re-enters the enclave (via EENTER) for handling the exception. This time, the attacker allows the enclave to progress after the enclave entry by reverting the access protection to the original permissions, if page faults are used in Step 2.

Step 4: Tricking the Enclave into Using Malicious Values. To handle the AEX, the enclave first needs to prepare the enclave stack for the in-enclave handler by loading the rsp register from the SSA (see Section 2). It then copies the SSA content onto this stack for the handler to use as function arguments. Since the attacker can control the values of rsp and the other saved registers in the SSA, it has gained the capability of tricking the enclave into writing an attacker-specified value to an attacker-specified stack location. To hijack the enclave control flow, the attacker uses this capability to control an anchor, i.e., a location later used by the enclave to retrieve a return address. More specifically, the attacker can choose the stack location that stores the return address for the current ocall as the anchor.

Achieved Capability. Using the above steps, the attacker has the capability to write an arbitrary value to a particular location. This write capability is the first part for scaffolding a control-flow hijacking attack. In Sections 5 and 6, we explain how to leverage this capability to effect powerful end-to-end attacks.
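The write primitive obtained in Steps 1 to 4 can be illustrated with a small, self-contained model. The sketch below is not runtime code: the types `ssa_t` and `enclave_t`, the functions `aex_save` and `exception_handler`, and the use of an array index in place of a real stack address are all assumptions made for illustration. It only shows why a handler that trusts the saved rsp ends up writing an attacker-chosen value to an attacker-chosen stack slot, even when the slot passes a range check.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified model; all names are illustrative. */
#define STACK_WORDS 64

typedef struct {
    uint64_t rsp; /* saved stack pointer, modeled as an index into the stack */
    uint64_t r8;  /* one saved general-purpose register */
} ssa_t;

typedef struct {
    uint64_t stack[STACK_WORDS]; /* enclave private stack */
    ssa_t ssa;                   /* state saved by the hardware on an AEX */
} enclave_t;

/* Models the AEX: the hardware saves whatever the registers hold, and right
 * after EENTER they still hold the values chosen by the attacker. */
static void aex_save(enclave_t *e, uint64_t rsp, uint64_t r8) {
    e->ssa.rsp = rsp;
    e->ssa.r8 = r8;
}

/* Models the vulnerable handler: it derives its stack slot from ssa.rsp and
 * copies saved register state there. The range check passes for any slot
 * that lies on the enclave stack, including the anchor. */
static int exception_handler(enclave_t *e) {
    uint64_t sp = e->ssa.rsp;
    if (sp >= STACK_WORDS) {
        return -1; /* not on the enclave stack: rejected */
    }
    e->stack[sp] = e->ssa.r8; /* attacker value written to attacker slot */
    return 0;
}
```

In this model, placing a legitimate return address in `stack[10]`, calling `aex_save(&e, 10, value)` and then `exception_handler(&e)` overwrites that slot with `value`, which mirrors the corruption of the anchor described above.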

5 SETTING UP THE STACK

So far, the attack has only corrupted one anchor location on the stack. Our final goal is to demonstrate a powerful ROP attack [52]. To this end, our next step is to escalate the attacker’s capability to:

• Point the stack pointer to an attacker-controlled region;
• Trick the enclave into using the value stored at the anchor for a control-flow transfer.

Figure 7: Enclave memory corruption by SmashEx during exception handling in Intel SGX SDK. The attacker is able to control the stack pointer in the SSA (in red) and the anchor on the enclave private stack (in blue). The other data that the attacker can control in SmashEx is shown in gray.

Step 5: Pointing the Stack Pointer to Attacker Memory. A ROP attack requires the attacker to control data on the victim’s stack so gadgets can be strung together through a series of return addresses. As illustrated in Figure 7, in the exception handling process (Section 2), the in-enclave exception handler copies the register state stored in the SSA region into a region of the enclave private stack (say M). This SSA state consists of a group of registers which the SGX hardware has saved during the AEX event, and which, as explained earlier, are attacker-controlled. The attacker may therefore use M to store the gadget addresses. To make the enclave use this region as its stack, the attacker needs to point the enclave rsp value to it after the ocall return is completed. This can be easily done by setting the anchor value to the address of a gadget that moves the value of a register (which the attacker already controls through returning from ocall) into rsp.

SmashEx does not require the memory region for preparing the gadget addresses to be the same as the region M. Since the SGX hardware allows an enclave to use a stack inside the public memory, the attacker can simply set up the gadget addresses in a buffer in the host process (located in the public memory), and use the same method to point rsp to it. We use this strategy for our exploit on Open Enclave and the earlier one for Intel SGX SDK.

Step 6: Effecting a Control Transfer Using the Anchor. After Step 5, the region M itself is the stack with attacker-controlled values and the stack pointer (rsp) points to it. This is part of what we need to start a ROP attack. It remains to cause a control-flow transfer with the corrupted anchor value (Figure 5c). The exception handler does not immediately use the anchor value after the copy logic. Several control transfers and context switches² happen before the anchor is used, but the content of M remains unchanged.

Note that although it is possible to set the anchor to any value, pointing it to the public memory will merely crash the enclave (hence falling short of a code-reuse attack) since SGX enclaves cannot execute code from public memory. For this reason, we must confine the anchor value to private memory addresses.

Achieved Capability. At the end of Steps 5 and 6, the enclave starts executing with the stack content controlled by the attacker. By carefully crafting the stack content, the attacker is able to convert this capability to a full-blown ROP exploit.

² The exception handler performs several other operations and exits the enclave using an EEXIT instruction to the untrusted code, which then performs an ERESUME instruction to transfer control back to the enclave to complete the exception handling and the interrupted ocall return flow. This part of the logic uses the anchor in a control-transfer instruction, which is why we chose the anchor to be the return address used in resuming after the last ocall.

Figure 8: A chain of ROP gadgets in the malicious stack to invoke memcpy with attacker-controlled arguments.

6 ROP EXPLOITS

At this point, the attacker can already control both the enclave instruction pointer (rip) and the enclave stack content. Next, we escalate the attacker’s capability to being able to execute a sequence of ROP gadgets that exist in the enclave code [52]. We discuss the ways to achieve this for two different goals: to steal enclave secrets and to execute desirable ROP gadgets.

Goal 1. Compromising Enclave Confidentiality. Enclave runtimes (e.g., Intel SGX SDK, Open Enclave SDK) usually implement their own memcpy function for in-enclave operations. Such a function performs memory copy on any accessible memory location regardless of the enclave boundary, and accepts three arguments that specify the source and the destination as well as the size of the data to copy. The three arguments are passed in registers rdi, rsi, and rdx. We can use this function in our chain of gadgets. First, we set up arbitrary values into registers using memory-to-register move gadgets. Then we chain a gadget to invoke the memcpy function. This allows us to move arbitrary regions of memory to arbitrary locations. For example, we can point the source address argument to the start of the enclave and destination address to a public memory region. Such a gadget will dump the entire enclave memory. Alternatively, we can point the source to sensitive data (e.g., SSL keys, enclave ephemeral keys) to selectively leak secrets. To compromise enclave confidentiality, we use ROP gadgets to manipulate the enclave to execute a memcpy function. Through manual inspection, we find and locate memcpy implementations in the trusted runtime code of both Intel SGX SDK and Microsoft Open Enclave. We also find three pop reg; ret gadgets in the runtimes that allow the attacker to populate the three registers with values from the attacker-controlled external stack. As illustrated in Figure 8, the attacker can perform the desired memcpy and steal enclave secrets by chaining the ROP gadgets on the external stack.

Goal 2. Arbitrary Code Injection in Enclave. Similarly, the attacker can copy arbitrary code to the enclave memory and execute it. It points the source to a malicious code payload outside the enclave and the destination to an enclave page. Specifically, if the attacker sets the source to a prepared shellcode in the public memory and the destination to an executable and writable region in the enclave private memory, it will be able to inject the shellcode to the enclave. The attacker can then point the subsequent return address on the external stack to the injected shellcode in the enclave to execute it. Since certain applications such as JIT compilers need executable and writable enclave memory regions, SmashEx can inject and execute arbitrary code. On an SGX2 platform where dynamic adjustment of enclave page permissions is supported, SmashEx can use ROP gadgets to make certain pages executable and writable before injecting and executing the shellcode.

Other Desirable ROP Gadgets. In addition to the ROP gadget chains discussed above, the attacker may also use others that serve a wide range of goals. Prior work [33] has concluded that a ROP attack in an SGX enclave can be very expressive. More specifically, special gadgets available in an SGX runtime (e.g., Intel SGX SDK) enable the attacker to control the entire register file if it already controls rsp, rdi and rsi. SmashEx meets this criterion required by previous attacks. We can therefore reproduce any attacks shown by this prior work³ but without their assumptions of common memory vulnerabilities such as buffer overflow in the enclave code.
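To make the layout of the memcpy chain from Goal 1 concrete, the following sketch builds the sequence of stack words and interprets it the way successive ret instructions would. It is a toy model under stated assumptions: the gadget identifiers (`G_POP_RDI` and so on) stand in for real gadget addresses, and `build_chain` and `run_chain` are hypothetical helpers, not part of any SGX runtime. The argument order (rdi = destination, rsi = source, rdx = length) follows Figure 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical gadget identifiers standing in for gadget addresses. */
enum { G_POP_RDI = 1, G_POP_RSI, G_POP_RDX, G_MEMCPY, G_HALT };

typedef struct { uintptr_t rdi, rsi, rdx; } regs_t;

/* Lays out: pop rdi; dst; pop rsi; src; pop rdx; len; memcpy; halt. */
static size_t build_chain(uintptr_t *out, void *dst, const void *src, size_t len) {
    size_t i = 0;
    out[i++] = G_POP_RDI; out[i++] = (uintptr_t)dst;
    out[i++] = G_POP_RSI; out[i++] = (uintptr_t)src;
    out[i++] = G_POP_RDX; out[i++] = (uintptr_t)len;
    out[i++] = G_MEMCPY;
    out[i++] = G_HALT;
    return i;
}

/* Interprets the chain: each "ret" pops the next gadget, and each
 * "pop reg" gadget pops one operand word from the same stack. */
static int run_chain(const uintptr_t *stack, size_t n, regs_t *r) {
    size_t sp = 0;
    while (sp < n) {
        uintptr_t gadget = stack[sp++];
        switch (gadget) {
        case G_POP_RDI: if (sp >= n) return -1; r->rdi = stack[sp++]; break;
        case G_POP_RSI: if (sp >= n) return -1; r->rsi = stack[sp++]; break;
        case G_POP_RDX: if (sp >= n) return -1; r->rdx = stack[sp++]; break;
        case G_MEMCPY:
            memcpy((void *)r->rdi, (const void *)r->rsi, (size_t)r->rdx);
            break;
        case G_HALT: return 0;
        default: return -1; /* unknown word: the real enclave would crash */
        }
    }
    return -1; /* ran off the end of the chain */
}
```

Calling `build_chain` with a destination buffer standing for public memory and a source buffer standing for an in-enclave secret, then `run_chain` on the result, copies the secret out. The same layout serves Goal 2 when the source and destination roles are swapped.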

7 ATTACKING REAL SYSTEMS

We present the low-level implementation challenges in executing end-to-end attacks on two applications as case studies: an OpenSSL port with Intel SGX SDK and a cURL port with Open Enclave.

We run all the victim enclaves and the SmashEx exploits on an Intel NUC Kit NUC7PJYH with SGX2 support, 8 GB DRAM, 128 MB EPC, and an Ubuntu 18.04 installation (Linux kernel 5.4.0-72). For SGX enclave runtimes, we use Intel SGX SDK 2.13, SGX driver 2.11, and Open Enclave 0.15.0 [48], since these are the latest versions available at the time of our experiments.

7.1 Intel SGX SDK

Case Study: OpenSSL v1.1.1i. Intel SGX SSL [18] is a cryptographic library that uses OpenSSL [23] to provide general-purpose cryptographic services (e.g., key generation, encryption/decryption operations, decision-making statements) for SGX enclave applications. For our end-to-end attack, we target a test program bundled with Intel SGX SSL, where the enclave generates a public/private RSA key pair. By leaking this private key, we show that SmashEx can breach the Intel SGX protections.

We have to locate the target in-enclave secret before we can launch SmashEx. For this purpose, we disable ASLR system-wide and pre-run the enclave once to record the addresses. In our attack run, we wait for the enclave to invoke a specific ocall that reports the result to the user after finishing the computation. We choose to start our attack after this ocall, because by this point, the enclave has created the private key in its private memory. To copy the secret key to the public memory, we use the memcpy gadget chain described in Section 6. The attack causes the enclave to copy the 1024-bit key to the public memory.

³ For example, the attacker can chain the asm_oret and continue_execution gadgets in Intel SGX SDK or oe_longjmp and oe_continue_execution in Open Enclave to control a wider range of registers [33].


Implementation Challenges. We encountered the following two main challenges when launching SmashEx against Intel SGX SDK.

Bypassing overrun and alignment checks. During exception handling, the SGX runtime performs security checks to sanitize and ensure consistency of certain enclave states. For instance, the in-enclave exception handler (see Listing 1) derives the stack pointer from the SSA. Then it checks that the stack pointer is a valid enclave stack address and satisfies a pre-defined alignment requirement.⁴ We set the malicious stack pointer to a legitimate enclave stack address that obeys the required alignment to pass those checks.

    ... // check validity of thread_data, tcs, stack canary, enclave state,
        // exception flag, ssa region
    ssa_gpr = reinterpret_cast<ssa_gpr_t *>(thread_data->first_ssa_gpr);
    sp = ssa_gpr->REG(sp);
    ... // check stack overrun
    info = (sgx_exception_info_t *)sp;
    // exception handlers are not allowed to be called in a non-exception state
    if(ssa_gpr->exit_info.valid != 1) {
        goto default_handler;
    }
    ...
    info->cpu_context.r8 = ssa_gpr->r8;
    ...
    info->cpu_context.r15 = ssa_gpr->r15;
    ... // alignment will be checked after exception is handled

Listing 1: Operations and security checks during exception handling in Intel SGX SDK.

The ocall return logic (see Listing 2) also includes important checks that our attack has to circumvent. For instance, before restoring the ocall context,⁵ the enclave sanitizes the ocall context pointer to ensure that it is on the enclave stack (Lines 3 and 5). In addition, the enclave checks the validity of part of the ocall context content (Lines 7 and 9). However, those checks do not cover the data that SmashEx needs to overwrite. We craft a legitimate stack pointer value to control the anchor without corrupting the checked memory region. In this way, our attack bypasses the checks.

     1  uintptr_t last_sp = thread_data->last_sp;
     2  ocall_context_t *context = reinterpret_cast<ocall_context_t*>(thread_data->last_sp);
     3  if(0 == last_sp || last_sp <= (uintptr_t)&context)
     4      return SGX_ERROR_UNEXPECTED;
     5  if(last_sp > thread_data->stack_base_addr - 30 * sizeof(size_t))
     6      return SGX_ERROR_UNEXPECTED;
     7  if(context->ocall_flag != ocall_flag)
     8      return SGX_ERROR_UNEXPECTED;
     9  if(context->pre_last_sp > thread_data->stack_base_addr ||
    10     context->pre_last_sp <= (uintptr_t)context)
    11      return SGX_ERROR_UNEXPECTED;
    12  thread_data->last_sp = context->pre_last_sp;
    13  asm_oret(last_sp, ms);

Listing 2: Security checks before restoring the ocall context in Intel SGX SDK.

⁴ In Intel SGX SDK 2.13, the is_valid_sp function performs such checks.

⁵ The ocall context is a data structure on the enclave stack that stores the context of the enclave before an ocall.

Restoring host process stack after AEX. Recall that in Step 1 of SmashEx, we prepare the rsp with a malicious address that points to the enclave private memory. When we trigger an AEX in Step 2, the hardware retains the rsp even after exiting the enclave. Additionally, the hardware transfers control to the OS for kernel exception handling. The kernel generates a corresponding signal for the exception and wants to deliver the signal to the host process. The kernel attempts to do this by using the rsp to place the signal-related information on the host process stack. Since the rsp still points to an in-enclave address, the kernel cannot perform this operation. However, for Step 3 of SmashEx, it is necessary that the attacker handle this signal in the host process. We ensure that the rsp points to a host stack location when the OS accesses it by using the sigaltstack() system call, which allows a user process to specify a separate signal handling stack. Alternatively, when the attacker moves malicious values to rsp in Step 1, we can save the current rsp in the host process. After the AEX, when the control comes to the kernel, we restore the saved rsp value to the rsp register. Note that the malicious rsp value has already been stored in the SSA at this point. Therefore, we can safely change the rsp.
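The sigaltstack() mechanism mentioned above is a standard POSIX facility and can be shown in isolation. The sketch below is a generic host-side example, not code from our PoC: the helper `install_alt_stack_handler`, the handler `on_signal`, and the 64 KiB stack size are illustrative choices. It registers an alternate signal stack and records whether the handler actually ran on it, which is the property needed when the interrupted rsp cannot be used by the kernel.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static volatile sig_atomic_t handled_on_alt_stack = 0;
static char *alt_base;   /* start of the alternate signal stack */
static size_t alt_size;  /* its size in bytes */

/* The handler checks whether one of its own locals lives inside the
 * alternate stack, i.e. whether the kernel ignored the interrupted rsp. */
static void on_signal(int signo) {
    char marker;
    uintptr_t here = (uintptr_t)&marker;
    (void)signo;
    handled_on_alt_stack =
        here >= (uintptr_t)alt_base && here < (uintptr_t)alt_base + alt_size;
}

/* Allocates an alternate stack and installs a handler that must use it. */
static int install_alt_stack_handler(int signo) {
    stack_t ss;
    struct sigaction sa;

    alt_size = 64 * 1024; /* illustrative size, well above the minimum */
    alt_base = malloc(alt_size);
    if (alt_base == NULL) {
        return -1;
    }
    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_base;
    ss.ss_size = alt_size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        return -1;
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sa.sa_flags = SA_ONSTACK; /* deliver this signal on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

After `install_alt_stack_handler(SIGUSR1)` succeeds, a subsequent `raise(SIGUSR1)` runs `on_signal` on the alternate stack and sets `handled_on_alt_stack`.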

Our PoC integrates those two mechanisms to overcome the implementation quirks of Intel SGX SDK.

7.2 Open Enclave SDK

Case Study: cURL v7.67.1. The cURL library implements a wide range of application-layer network protocols, including HTTP, HTTPS, SMTP, and so on. Open Enclave provides an official port of cURL [49] to allow applications that require secure network protocols (e.g., HTTPS) to benefit from the protection of SGX. The enclave private memory contains several pieces of sensitive information such as secure channel keys, enclave private keys, and HTTPS plaintext responses.

We run SmashEx on an Open Enclave cURL test program and dump the whole enclave private memory to the public memory. This will allow us to extract all the secrets inside the enclave private memory. We obtain the virtual address ranges of the enclave private memory regions by consulting the untrusted library of Open Enclave. The library is responsible for creating the enclave, and is therefore aware of the enclave address space layout. To ensure that the enclave private memory contains secret data at the time of our attack, we wait for the enclave to finish sufficient ocalls before launching the attack. In our experiment, we wait until right after the 150th ocall to start SmashEx, where we use the gadgets from Section 6 to dump the enclave content. The dumped data in our experiment includes secrets such as plaintext HTTPS responses.

Configuring the APIC Timer. We use the APIC timer to trigger the AEX at the precise moment [62]. Typically, only the OS kernel can configure the APIC timer. However, in order to trigger the exception at precisely the first instruction of the enclave, we want to shorten the time gap between configuring the APIC timer and entering the enclave. Therefore, instead of configuring the APIC timer inside the kernel space, we map the interface of the APIC timer (memory-mapped I/O) directly to the address space of the host process, and configure the APIC timer in the user space shortly before entering the enclave via EENTER.

8 ATTACK EXTENSIBILITY

The ramifications of unsafe re-entrancy in enclave handlers go beyond the target platform configurations (hardware version and runtime) we used for our end-to-end attacks.


8.1 Extensibility to SGX1

SGX1 and SGX2 have the same exception handling mechanism. The main difference is that in SGX2, the enclave can request the hardware to notify the enclave about certain exceptions, such as page faults. This allows enclaves to dynamically manage memory.

The attack steps described so far assume SGX2, but they largely also apply to SGX1. Unlike SGX2, SGX1 does not support reporting page faults to the enclave. When such an event occurs, the SGX1 hardware performs an AEX, but with one difference to SGX2: it does so without setting SSA.EXITINFO.valid, a field in the SSA region, to 1. Both Intel SGX SDK and Open Enclave perform a validity check on this field in their exception handlers and only execute the handler if SSA.EXITINFO.valid is 1. In Open Enclave, by the time this check is done, SmashEx can already corrupt the anchor. Therefore, SmashEx works on Open Enclave with SGX1. In Intel SGX SDK, SmashEx needs to bypass the above check to succeed. We examined the 8 exceptions supported in SGX1 (the unconditionally supported exceptions [16]). None can be used to trigger the AEX at the enclave entry. As a result, we are not able to exploit Intel SGX SDK on SGX1 with SmashEx.
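The effect of where the validity check sits relative to the copy can be sketched with two toy handlers. This is an illustrative model only: `ssa_t`, `handler_check_then_copy`, and `handler_copy_then_check` are hypothetical names, the stack is an array indexed by a small integer, and neither function reproduces the actual code of Intel SGX SDK or Open Enclave. The first ordering corresponds to a handler that tests SSA.EXITINFO.valid before touching the stack; the second corresponds to a handler in which the write has already happened by the time the check runs.

```c
#include <stdint.h>

#define STACK_WORDS 32

/* Hypothetical, reduced view of the saved state in the SSA. */
typedef struct {
    uint32_t exit_info_valid; /* models SSA.EXITINFO.valid */
    uint64_t rsp;             /* saved rsp, modeled as a stack index */
    uint64_t r8;              /* one saved register */
} ssa_t;

/* Check first, then copy: an AEX that leaves valid != 1 never reaches
 * the write, so the anchor stays intact. */
static int handler_check_then_copy(const ssa_t *ssa, uint64_t *stack) {
    if (ssa->exit_info_valid != 1) return -1;
    if (ssa->rsp >= STACK_WORDS) return -1;
    stack[ssa->rsp] = ssa->r8;
    return 0;
}

/* Copy first, then check: the handler still refuses to proceed, but the
 * attacker-controlled write has already been performed. */
static int handler_copy_then_check(const ssa_t *ssa, uint64_t *stack) {
    if (ssa->rsp >= STACK_WORDS) return -1;
    stack[ssa->rsp] = ssa->r8;
    if (ssa->exit_info_valid != 1) return -1;
    return 0;
}
```

With `exit_info_valid` set to 0, both handlers return an error, but only the second one leaves the targeted stack slot modified.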

However, this safety comes with a trade-off in functionality for Intel SGX SDK: with this check in force, Intel SGX SDK disables asynchronous events on SGX1 and does not support programming primitives for user-defined signal handlers. The root vulnerability (i.e., the lack of atomicity) is fundamental. We hypothesized that it would affect SGX1 if Intel SGX SDK allowed execution of in-enclave exception handlers, and confirmed our hypothesis by removing the validity check in the Intel SGX SDK and repeating our attack. Our PoC works successfully on Intel SGX SDK for SGX1 with the one-line validity check removed.

8.2 Extensibility to Other Enclave Runtimes

Apart from Intel SGX SDK and Open Enclave, we survey 12 other enclave runtimes to understand how the vulnerability impacts them. We report that 8 of them are vulnerable to SmashEx, and have verified this by constructing SmashEx PoC exploits against them.

Derivatives of Intel SGX SDK & Open Enclave. In our survey, 8 enclave runtimes are based on either Intel SGX SDK or Open Enclave SDK. Among them, 6 runtimes use Intel SGX SDK or Open Enclave as it is, without any modification to the exception handling logic. Those include Apache Teaclave [1], Rust SGX SDK [64], CoSMIX [50], and Veracruz [28], which are based on Intel SGX SDK, and SGX-LKL [51] and EdgelessRT [12], which are based on Open Enclave. Since all the relevant interfaces are still exposed and unchanged, such runtimes inherit the vulnerability from the runtime they are based on. The other 2 runtimes, Google Asylo [31] and Ratel [26], use a modified Intel SGX SDK. They have altered the behaviors of exception handling or other enclave interfaces relevant to SmashEx and hence need to be examined individually. Google Asylo [31] keeps the original exception handling interface and as a result is vulnerable to SmashEx. However, it also provides an alternative exception handling interface which uses a dedicated stack and cannot be exploited by SmashEx. Ratel [26] is immune to SmashEx because it uses a separate pre-allocated enclave stack for exception handling. We discuss the dedicated-stack design in detail in Section 9.1.

Independent Runtimes. RedHat Enarx [13] has its own SGX runtime independent of Intel SGX SDK and Open Enclave. Listing 3 shows how it sets up the exception handler stack shortly after the enclave is re-entered for exception handling. Similarly to its counterpart in Open Enclave, the code loads the saved rsp register from the SSA region, shifts it by a fixed offset, and starts storing untrusted register values at that location. We therefore conclude, through our best-effort code inspection, that SmashEx would work successfully on RedHat Enarx. Though open-source, RedHat Enarx does not have a fully functioning code base yet [14]. Thus, we were not able to experimentally demonstrate and confirm that SmashEx works on it.

    shl $12, %rax           # %rax = CSSA * 4096
    mov %rcx, %r11          # %r11 = &Layout
    add %rax, %r11          # %r11 = &aex[CSSA - 1]

    mov RSP(%r11), %r10     # %r10 = aex[CSSA - 1].gpr.rsp
    sub $128, %r10          # Skip the red zone
    and $~0xf, %r10         # Align

    mov SRSP(%r11), %rax    # %rax = syscall return stack pointer

    xchg %r10, %rsp         # Swap to trusted stack
    pushq $0                # Align stack
    push %r10               # Save untrusted %rsp
    savep                   # Save untrusted preserved registers

Listing 3: Exception handler stack setup in RedHat Enarx.

The other three runtimes developed independently of Intel SGX SDK and Open Enclave SDK (Alibaba Inclavare [15], Fortanix Rust EDP [41], and Graphene-SGX [35, 60]) are deemed immune to SmashEx through manual inspection. Alibaba Inclavare [15] and Fortanix Rust EDP [41] both simply disable all in-enclave exception handling, which limits the enclave functionality. Graphene-SGX [35, 60] introduces software-based atomicity to safely handle exceptions, which we elaborate on in Section 9.3.

9 DEFENDING AGAINST SMASHEX

The proof-of-concept exploits for SmashEx are viable because the enclave runtimes (a) use the common program stack for exception handling; and (b) lack software- or hardware-enforced atomicity. An ideal solution would be to defeat both (a) and (b). However, we discuss the mitigations for these two issues separately. We summarize how certain design choices render enclave runtimes immune to SmashEx, by disabling either requirement (a) or (b):

• Use a dedicated stack for exception handling (e.g., Ratel [26] and the alternative mechanism in Google Asylo [31]);

• Disable exception handling (e.g., Fortanix Rust EDP [41] and Alibaba Inclavare [15]);

• Program the exception handler in a re-entrant way (e.g., Graphene-SGX [35]).

However, those designs come with significant downsides either by limiting the enclave functionality or by introducing complexity.

9.1 Dedicated Exception Handler Stack

Unlike the original exception handler in Intel SGX SDK, the exception handling interfaces in Google Asylo and Ratel use a dedicated stack separate from the one used by the interrupted thread. They therefore avoid relying on the rsp value in the SSA region which SmashEx exploits to control the anchor. However, since both of them reserve only one separate stack, exception handling through such interfaces cannot be nested. In other words, if during the handling of an exception, another exception occurs, this new exception cannot be handled inside the enclave. This limits the compatibility between Google Asylo or Ratel and traditional programming models where signals can be nested. One can adapt these runtimes to support nested exceptions by reserving an individual stack for each level of nested exceptions. However, the fixed memory size required by each reserved stack may limit its scalability.
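The dedicated-stack idea and its nesting limitation can be sketched in a few lines. The model below is an assumption-laden illustration rather than the Asylo or Ratel implementation: `enclave_t`, `handle_on_dedicated_stack`, `finish_handling`, and the busy flag are hypothetical. The point is that the saved state is written to a fixed, pre-allocated location, so the untrusted saved rsp never selects the destination, and that a single reserved stack cannot serve a second, nested exception.

```c
#include <stdint.h>

#define STACK_WORDS 64
#define EXC_STACK_WORDS 16

/* Hypothetical, simplified enclave thread state. */
typedef struct {
    uint64_t thread_stack[STACK_WORDS];  /* stack of the interrupted thread */
    uint64_t exc_stack[EXC_STACK_WORDS]; /* pre-allocated handler stack */
    int exc_stack_busy;                  /* set while a handler is running */
    uint64_t ssa_rsp;                    /* untrusted saved rsp (never used
                                            as a write destination here) */
    uint64_t ssa_r8;                     /* untrusted saved register */
} enclave_t;

/* Copies saved state onto the dedicated stack. Returns -1 for a nested
 * exception, because only one handler stack has been reserved. */
static int handle_on_dedicated_stack(enclave_t *e) {
    if (e->exc_stack_busy) {
        return -1;
    }
    e->exc_stack_busy = 1;
    e->exc_stack[0] = e->ssa_r8; /* fixed destination chosen by the enclave */
    return 0;
}

/* Marks the dedicated stack as free once handling has completed. */
static void finish_handling(enclave_t *e) {
    e->exc_stack_busy = 0;
}
```

Whatever value the attacker places in `ssa_rsp`, the thread stack is left untouched. Supporting nesting in this model would require an array of handler stacks, one per nesting level, which is the memory cost noted above.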

9.2 Disabling Exception Handling

Both Fortanix Rust EDP [41] and Alibaba Inclavare [15] are immune to SmashEx because they do not support any in-enclave exception handling. They configure enclaves so that the SGX hardware forbids the untrusted software from re-entering the enclave via EENTER after an AEX. Specifically, the configuration parameter, TCS.NSSA, when set to 1, implies that the hardware can store at most 1 AEX context inside the SSA at any time. Whenever the untrusted software attempts to re-enter the enclave via EENTER following an AEX, the hardware disallows it because of the insufficient AEX context slots to hold another potential AEX after the re-entry. Without the possibility of a re-entry, in-enclave exception handling is effectively disabled. Making an enclave thread execution fully synchronous this way simplifies the reasoning about re-entrancy. However, this design choice limits the functionality of the enclave software. For example, the try-catch exception handling primitive widely used in modern high-level programming languages cannot leverage hardware exception support inside an enclave, making it inefficient and cumbersome to enable. It hinders the implementation of signal handling mechanisms commonly provided by modern OSes such as Linux, which are important to the functioning of user applications. Such limitations degrade the compatibility of Alibaba Inclavare and Fortanix Rust EDP with traditional programming models.
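The slot-counting argument behind TCS.NSSA = 1 can be written down as a tiny state machine. The sketch below is a simplified model built on assumptions: `tcs_t`, `eenter`, `aex`, and `eresume` are illustrative stand-ins for the hardware behavior, and the counter `cssa` models the number of SSA frames currently in use. It captures only the rule stated above, namely that entry is refused when no free frame remains to hold a further AEX.

```c
#include <stdint.h>

/* Hypothetical reduced view of a thread control structure. */
typedef struct {
    uint32_t nssa; /* number of SSA frames available (TCS.NSSA) */
    uint32_t cssa; /* number of SSA frames currently occupied */
} tcs_t;

/* Models EENTER: allowed only if a free SSA frame remains for a
 * potential AEX after the entry. */
static int eenter(const tcs_t *t) {
    return (t->cssa < t->nssa) ? 0 : -1;
}

/* Models an AEX: the interrupted context occupies one SSA frame. */
static void aex(tcs_t *t) {
    t->cssa++;
}

/* Models ERESUME: restores the most recent frame and frees it. */
static int eresume(tcs_t *t) {
    if (t->cssa == 0) {
        return -1; /* nothing to resume */
    }
    t->cssa--;
    return 0;
}
```

With `nssa` set to 1, the sequence `eenter`, `aex`, `eenter` fails at the second entry, so the only way forward is `eresume`; with `nssa` set to 2, the second entry succeeds and an in-enclave handler can run.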

9.3 Re-entrant Exception Handling

An SGX runtime software may attempt to provide atomic primitives for re-entrant exception handling. One example is Graphene-SGX.

    movq SGX_GPR_RIP(%rbx), %rax
    leaq .Locall_about_to_eexit_begin(%rip), %r11
    cmpq %r11, %rax
    jb .Lhandle_interrupted_ocall_case_c
    leaq .Locall_about_to_eexit_end(%rip), %r11
    cmpq %r11, %rax
    jae .Lhandle_interrupted_ocall_case_c

    // ...

    .Lhandle_interrupted_ocall_case_c:
    movq %rdi, SGX_GPR_RSI(%rbx) # external event for .Lreturn_from_ocall
    leaq .Lreturn_from_ocall_after_stack_restore(%rip), %rax
    movq %rax, SGX_GPR_RIP(%rbx)
    movq %rsi, SGX_GPR_RSP(%rbx)
    movq $0, %gs:SGX_PRE_OCALL_STACK
    andq $(~(RFLAGS_DF | RFLAGS_AC)), SGX_GPR_RFLAGS(%rbx)
    jmp .Leexit_exception

Listing 4: Emulation of part of the sanitization logic at enclave entry in the exception handler of Graphene-SGX.

Graphene-SGX. It uses the same stack from the interrupted thread for exception handling. However, in our investigation, we find that Graphene-SGX does not blindly load the stack pointer from the SSA region. Instead, it examines the location of the AEX (the rip register value inside the SSA region), and handles it differently in different cases. For example, when Graphene-SGX finds that the AEX occurred within the sanitization logic at the enclave entry, it will emulate the unfinished sanitization logic. Instead of operating on real registers as in normal execution, it operates on the register values stored in the SSA region (see Listing 4). This separates the execution of the sanitization logic and the exception handler. Thus, when the enclave starts the post-sanitization processing of the AEX, the stack has already been correctly set up and is no longer controlled by the untrusted software.

    leaq .Ltmp_rip_saved0(%rip), %rax
    cmpq %rax, SGX_GPR_RIP(%rbx)
    je .Lemulate_tmp_rip_saved0

    leaq .Ltmp_rip_saved1(%rip), %rax
    cmpq %rax, SGX_GPR_RIP(%rbx)
    je .Lemulate_tmp_rip_saved1

    leaq .Ltmp_rip_saved2(%rip), %rax
    cmpq %rax, SGX_GPR_RIP(%rbx)
    je .Lemulate_tmp_rip_saved2

    jmp .Lemulate_tmp_rip_end

    .Lemulate_tmp_rip_saved0:
    # emulate movq SGX_CPU_CONTEXT_R15 - SGX_CPU_CONTEXT_RIP(%rsp), %r15
    movq SGX_GPR_RSP(%rbx), %rax
    movq SGX_CPU_CONTEXT_R15 - SGX_CPU_CONTEXT_RIP(%rax), %rax
    movq %rax, SGX_GPR_R15(%rbx)
    .Lemulate_tmp_rip_saved1:
    # emulate movq SGX_CPU_CONTEXT_RSP - SGX_CPU_CONTEXT_RIP(%rsp), %rsp
    movq SGX_GPR_RSP(%rbx), %rax
    movq SGX_CPU_CONTEXT_RSP - SGX_CPU_CONTEXT_RIP(%rax), %rax
    movq %rax, SGX_GPR_RSP(%rbx)
    .Lemulate_tmp_rip_saved2:
    # emulate jmp *%gs:SGX_TMP_RIP
    movq %gs:SGX_TMP_RIP, %rax
    movq %rax, SGX_GPR_RIP(%rbx)
    .Lemulate_tmp_rip_end:
    movq SGX_GPR_RSP(%rbx), %rsi
    // ...

Listing 5: Emulation of part of the enclave context restoration code in the exception handler of Graphene-SGX.

    .Ltmp_rip_saved0:
    movq SGX_CPU_CONTEXT_R15 - SGX_CPU_CONTEXT_RIP(%rsp), %r15
    .Ltmp_rip_saved1:
    movq SGX_CPU_CONTEXT_RSP - SGX_CPU_CONTEXT_RIP(%rsp), %rsp
    .Ltmp_rip_saved2:
    jmp *%gs:SGX_TMP_RIP

Listing 6: Part of the enclave context restoration code. Graphene-SGX emulates its execution instruction by instruction in the exception handler (see Listing 5).

Besides emulating the sanitization logic and register setup at the enclave entry point, Graphene-SGX emulates the execution of the interrupted thread whenever the AEX occurs in other critical regions where the enclave state is unsafe for exception handling. Listings 5 and 6 show more examples. After the emulation, the AEX appears to have occurred outside the critical regions. By doing this, Graphene-SGX effectively makes the enclave exception handler safely re-entrant.
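The fall-through structure of Listing 5 can be illustrated with a small Python model. This is only a sketch under stated assumptions: the label addresses, the two offsets, and the dictionary standing in for the SSA region are hypothetical and do not reflect the actual Graphene-SGX layout.

```python
# Toy model: finish an interrupted critical section using only saved state.
# Addresses, offsets, and the dict-based "SSA" are illustrative assumptions.

SAVED0, SAVED1, SAVED2 = 0x400, 0x408, 0x410  # hypothetical label addresses
OFF_R15, OFF_RSP = -16, -8  # stand-ins for SGX_CPU_CONTEXT_* - SGX_CPU_CONTEXT_RIP


def emulate_rest(ssa, memory, tmp_rip):
    """Return a copy of `ssa` as if the critical section had run to completion."""
    ssa = dict(ssa)
    steps = [SAVED0, SAVED1, SAVED2]
    if ssa["rip"] not in steps:
        return ssa  # the AEX happened outside this critical section
    # Start at the interrupted instruction and fall through, as in Listing 5.
    for label in steps[steps.index(ssa["rip"]):]:
        if label == SAVED0:    # emulate: movq OFF_R15(%rsp), %r15
            ssa["r15"] = memory[ssa["rsp"] + OFF_R15]
        elif label == SAVED1:  # emulate: movq OFF_RSP(%rsp), %rsp
            ssa["rsp"] = memory[ssa["rsp"] + OFF_RSP]
        else:                  # emulate: jmp *%gs:SGX_TMP_RIP
            ssa["rip"] = tmp_rip
    return ssa


memory = {0x7000 + OFF_R15: 0xAAAA, 0x7000 + OFF_RSP: 0x9000}
interrupted = {"rip": SAVED1, "rsp": 0x7000, "r15": 0}
resumed = emulate_rest(interrupted, memory, tmp_rip=0x5000)
```

In this model, an AEX at the second instruction leaves `r15` untouched, because the first instruction had already executed on the real registers, and only the stack pointer and instruction pointer are updated in the saved state.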

This design is significantly more nuanced and complex. We note that it has been the result of years of patching and revising. The signal and exception handling design for Graphene-SGX has undergone several iterations in the past three years [20, 24, 25].

Alternative Implementation Strategies. The design adopted by Graphene-SGX is not the only possible implementation strategy and can be generalized by addressing two key questions.

Q1. How can the enclave identify whether an exception occurred inside a critical section?

Q2. What should the enclave do when the untrusted OS attempts to deliver an exception during a critical section to the enclave?

Tracking Critical Sections. The enclave can track its own critical sections explicitly. Specifically, the enclave runtime software can maintain the location information (e.g., code address ranges) about all its critical sections. When required, it can check if the code address where the exception occurred falls within the range of known critical sections. This is the option chosen by Graphene-SGX. Alternatively, the enclave software can maintain a per-thread 1-bit flag in its private memory. It sets the flag whenever the enclave is about to enter a critical section, and clears it immediately after exiting a critical section. To check if an exception happened in a critical section, the enclave exception handler can check this flag.

Handling Exceptions in Critical Sections. If the enclave is interrupted midway in a critical section, one approach is to emulate the rest of the critical section. More precisely, the enclave exception handler can identify the point of interruption, look up the critical section, and emulate the remaining part of the interrupted critical section. After that, the handler can perform the real (non-emulated) exception handling. With this mechanism, the enclave gets an illusion that the exception occurred immediately after the critical section. This design works only in cases where the enclave runtime has sufficient information about the critical sections to emulate them. This is the option chosen by Graphene-SGX.
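The address-range variant of critical-section tracking, which is also the lookup step that precedes emulation, can be sketched as follows. The ranges and the function name are hypothetical; a real runtime would derive the ranges from its own code layout and would read the interrupted `rip` from the SSA region rather than from untrusted input.

```python
import bisect

# Hypothetical [start, end) code address ranges of critical sections,
# kept sorted by start address so that lookup is a binary search.
CRITICAL_RANGES = [(0x1000, 0x1040), (0x2000, 0x2010), (0x3000, 0x3100)]
_STARTS = [start for start, _ in CRITICAL_RANGES]


def find_critical_section(rip):
    """Return the (start, end) range containing `rip`, or None if there is none."""
    index = bisect.bisect_right(_STARTS, rip) - 1
    if index >= 0:
        start, end = CRITICAL_RANGES[index]
        if start <= rip < end:
            return (start, end)
    return None
```

A handler built on this sketch would call `find_critical_section` first and fall back to ordinary exception handling only when it returns `None`.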

A second way is to postpone the exception handling. Instead of immediately invoking the exception handler, the enclave runtime software can choose to execute the current critical section. Once the section ends, the runtime executes the handler. This mechanism requires the enclave runtime to maintain the received exceptions (e.g., via setting a per-thread pending flag) and add logic at the end of each critical section to handle pending exceptions.
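A minimal sketch of the postponement strategy is shown below. The class and method names are hypothetical, and the model is purely sequential: it shows the bookkeeping only and does not model the re-entrancy of the pending queue itself.

```python
class DeferredExceptionRuntime:
    """Queues exceptions that arrive inside a critical section."""

    def __init__(self, handler):
        self._handler = handler
        self._in_critical = False
        self._pending = []

    def enter_critical(self):
        self._in_critical = True

    def deliver(self, exception):
        """Handle now if safe, otherwise record for later. Returns True if handled."""
        if self._in_critical:
            self._pending.append(exception)
            return False
        self._handler(exception)
        return True

    def exit_critical(self):
        """Leave the critical section, then run handlers for postponed exceptions."""
        self._in_critical = False
        while self._pending:
            self._handler(self._pending.pop(0))


handled = []
runtime = DeferredExceptionRuntime(handled.append)
runtime.enter_critical()
deferred = runtime.deliver("page-fault")  # arrives mid-section, is postponed
runtime.exit_critical()                   # postponed exception is handled here
```

Postponed exceptions are handled in arrival order once the section ends, which matches the "pending exceptions" logic the paragraph describes at the end of each critical section.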

Finally, a straightforward way is to ignore any exceptions that arrive when a critical section is being executed. However, it is important to ensure functional correctness when the OS is cooperative. For instance, an exception should not be lost when a cooperative OS delivers it while the enclave is in a critical section, unless the enclave exposes sufficient information (e.g., by setting OS-visible critical section flags) to allow the OS to avoid delivering exceptions during critical sections.

Caveats. Although the above design options are conceptually simple, there are several implementation details that need careful attention. The first issue arises when an enclave has to maintain a data structure to track pending exceptions (e.g., a bitmap that records the postponed exception types). After the critical section, the enclave needs to read such data structures to process the postponed exceptions. Should the code that does this be included as part of the same critical section? If it is part of the critical section, the data structure operations must be made re-entrant. This is because, in a critical section, a delivered exception will trigger write operations to the data structures. If it is outside the critical section, one must either ensure the same re-entrancy property or ensure that no exception handling code contains critical sections, since handling an exception inside such a section may involve write operations to the data structures.

The second issue concerns the use of a critical flag. One must ensure that the flag covers all locations where exceptions should not be handled. One location particularly prone to negligence is immediately after an enclave entry (EENTER). As demonstrated by SmashEx, exceptions immediately after the enclave entry, when the untrusted OS still controls the register values, cannot be handled directly. If the enclave relies on an instruction to set the flag after the enclave entry, this leaves a window of time between enclave entry and when the flag is set. To avoid this problem, an enclave must ensure that the flag is set before an EEXIT.
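The window described above can be made concrete with a toy model that walks through the entry steps and records where an exception would find the flag still clear. The step names and the function are hypothetical and greatly simplify a real entry sequence.

```python
def exposed_entry_points(flag_at_entry, entry_steps):
    """Indices of entry steps before which an exception would find the flag clear."""
    flag = flag_at_entry
    exposed = []
    for index, step in enumerate(entry_steps):
        if not flag:
            exposed.append(index)  # an exception arriving here is handled directly
        if step == "set_flag":
            flag = True
    return exposed


# Design A: the flag is clear at EENTER and the first instruction sets it.
set_on_entry = exposed_entry_points(False, ["set_flag", "sanitize", "switch_stack"])

# Design B: the flag was already set before the previous EEXIT.
set_before_exit = exposed_entry_points(True, ["sanitize", "switch_stack"])
```

In this model, design A leaves one exposed point (before the flag-setting instruction executes), while design B leaves none.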

A third issue stems from the requirement of exception-free handler implementations. Although one can carefully implement handler logic to be free of certain programming-oriented exceptions (e.g., divide by zero), OS-induced page faults are difficult to avoid. For instance, if an enclave uses custom page fault handlers on SGX2, delivery of page faults to the enclave cannot be delayed or ignored, especially for faults on pages accessed within a critical section of the exception handler itself.

In summary, while there are software-based strategies for achieving re-entrant exception handling, they introduce considerable complexity to ensure desired functionality and security.

9.4 Impact of Other Memory Defenses

A second line of defenses aims at thwarting the code-reuse attack steps (Steps 5 and 6) of SmashEx.

Bypassing ASLR. The SmashEx attack requires the attacker to know the exact address of the anchor. Since the OS allocates the virtual memory range for the enclave and sets the page permissions, the attacker knows that the enclave stack will be within a certain range. However, the enclave may randomize the base address of its stack (e.g., Asylo [31]) to prevent the attacker from predicting the anchor location accurately. The attacker can adapt SmashEx in the following ways to overcome this hurdle.

The first strategy follows the observation that Steps 1–4 overwrite more than one memory location. For example, in our Google Asylo [31] PoC exploit (elided here), we can overwrite 152 bytes, out of which 64 bytes are freely controllable by the attacker. The attacker can therefore set all those locations to the desired value for the anchor in the hope of hitting the actual anchor. Given that Google Asylo initializes the stack base address by advancing the stack by a random amount between 1 and 2048, this strategy has an attack success rate of 3.125% per trial.
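The 3.125% figure follows directly from the two numbers quoted above. The short calculation below reproduces it under a simplifying assumption that is ours, not the paper's: the random offset is uniform at byte granularity, and a trial succeeds exactly when the anchor falls within the 64 attacker-controlled bytes.

```python
from fractions import Fraction

controllable_bytes = 64  # bytes freely controlled per trial (from the text)
offset_range = 2048      # stack advanced by a random amount in 1..2048 (from the text)

# Assumed model: uniform offset, success when the anchor lands in the controlled bytes.
success_probability = Fraction(controllable_bytes, offset_range)
success_percent = float(success_probability) * 100
```

Here `success_probability` evaluates to 1/32, i.e., 3.125% per trial.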

One of the issues with using ASLR defenses in our context is that a failed trial results in an invalid memory access which in turn creates another exception. This gives the attacker additional opportunities to reenter the enclave. To concretely illustrate the issue, we implemented a multi-round proof-of-concept attack variant of SmashEx specialized for Google Asylo. A multi-round attack trial has multiple rounds, where each round executes Steps 1–4 of the SmashEx procedure to corrupt one location. Note that the attacker-corrupted location is used in Step 5. If we mispredict the anchor address as the corruption value, the enclave will potentially crash in the subsequent steps. What remains, therefore, is to keep iterating Steps 1–4 while making sure that the enclave does not progress to Step 5, i.e., to resume the return from ocall. To achieve this, the attacker uses an invalid ecall number⁶ to enter the enclave in Step 1, instead of the one that requests a return from an ocall. When the enclave resumes execution after AEX at the entry point, it checks the ecall command. Since the ecall command is invalid, the enclave forces an EEXIT. As a result, Steps 5–6 will not take place and instead Steps 1–4 repeat. The attacker can then keep repeating this procedure of using bad ecall numbers, reliably corrupting one new location each time. Finally, after controlling sufficient locations, the attacker performs an EENTER with the correct ecall number. In this last iteration, the enclave performs Steps 5–6 and uses a bad stack value, thus corrupting the anchor reliably even in the presence of ASLR.

Bypassing Stack Canary. The stack canary is a widely-deployed defense against buffer overflow exploits [39]. In such exploits, the attacker has to corrupt unintended stack locations as a side effect of the memory corruption. While SmashEx does not involve a buffer overflow, it does corrupt unintended locations beyond the anchor itself. Therefore, it is conceptually possible to mitigate SmashEx with stack canaries. However, the stack canary supported in common modern C compilers (e.g., with -fstack-protector-all in GCC) does not help protect against SmashEx. Unlike the return address of a function call, the stack canary automatically generated by the compiler does not protect the saved ocall context, and hence the anchor, that SmashEx aims to control. Moreover, even if all code pointers on the stack have been carefully protected by stack canaries, the attacker can adapt SmashEx to launch a data-oriented attack [43] without controlling code pointers. In addition, due to a lack of checks on the stack pointer, on some SGX runtimes (e.g., Open Enclave) SmashEx has the option to control non-stack locations, including where the secret stack canary value is stored. This makes existing stack canary defenses ineffective against SmashEx.

9.5 Better Hardware Support for Atomicity

While it is possible to implement the enclave software in a safely re-entrant way, doing this entails a fairly complicated design which is difficult to reason about. This motivates us to propose strategies of enabling atomicity support in hardware, which SGX currently lacks. We start by examining the atomicity support available to the OS and traditional user applications.

Atomicity in the OS. Since the OS can configure hardware interrupt and exception sources (e.g., interrupt controllers), it can simply disable interrupts and exceptions whenever it desires atomicity. For SGX enclave software, however, an untrusted party (i.e., the OS) can trigger an exception at any time. The enclave has no way of controlling or predicting when it will be interrupted.

Atomicity in Traditional User Applications. Traditional user applications rely on the OS to control when they can be interrupted and re-entered in the midst of their execution (e.g., for signal handling). For example, POSIX-compliant OSes define the set of scenarios where they can deliver signals to a user process [22]. User processes can use the sigprocmask system call to dynamically enable or disable the delivery of a certain signal during runtime. For SGX enclaves, the OS or the host process is still in charge of invoking in-enclave exception handlers, but it is not trusted and should not be relied on to decide when to perform a re-entry.

⁶ The ecall number is an integer that the untrusted software passes to the enclave upon an ecall to indicate which enclave function to execute.

Enabling Atomicity Primitives in Hardware. We point out that in both the above cases, the atomicity guarantee is provided by a different but trusted entity (the hardware or the OS) through disabling either interrupts or re-entry upon an interrupt. For an SGX enclave, only the hardware can be such a trusted entity. We discuss the potential changes in the SGX hardware abstraction to enable atomicity primitives for enclaves. Following the inspiration from the atomicity primitives available to the OS and traditional user applications, we discuss two directions: disabling exceptions and disabling re-entry.
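As a concrete reference point for the OS-provided atomicity available to traditional user applications, the sketch below uses Python's `signal.pthread_sigmask`, which manipulates the calling thread's signal mask in the same way as the sigprocmask facility mentioned above. It assumes a Unix-like system and that the code runs in the main thread; the handler and variable names are illustrative.

```python
import signal
import threading

received = []


def on_usr1(signum, frame):
    received.append(signum)


signal.signal(signal.SIGUSR1, on_usr1)

# Enter the "critical section": ask the OS to hold back SIGUSR1.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

# The signal arrives mid-section but stays pending instead of running the handler.
signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)
pending_inside = signal.SIGUSR1 in signal.sigpending()
handled_inside = bool(received)

# Leave the critical section: the pending signal is now delivered.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})
```

The application can rely on this behavior only because it trusts the OS to honor the mask, which is precisely the assumption that does not hold for an enclave.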

Direction 1: Temporarily Disabling Interrupts and Exceptions. Intel SGX can be adapted to allow enclaves to dynamically enable or disable interrupts and exceptions from hardware, similarly to the primitives OSes use to achieve atomicity. The SGX hardware can protect the enclave execution from being interrupted at the request of the enclave. A naïve design that allows the enclave to use this primitive without restrictions will enable an enclave to fully occupy a hardware thread for an arbitrarily long period, thus launching denial-of-service (DoS) attacks against the OS. To avoid DoS attacks, the SGX hardware can let the OS decide whether to accept or deny such requests from enclaves. The hardware relays the decision of the OS to the enclave, which in turn can make an informed decision about executing critical code. For example, the OS may base such decisions on a pre-exchanged quota for interrupt disabling: it may permit the enclave to run with interrupts disabled for (say) 100 cycles in every 10K cycles executed in the enclave. In such a case, corresponding support for counting and limiting the enclave execution cycles will need to be available inside the hardware. Such an enclave-OS contract facilitated and enforced by the hardware, if designed carefully, can guarantee atomicity while preventing DoS.
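The quota idea can be sketched as a simple accounting model. Everything here is hypothetical: the class name, the fixed-window policy, and the cycle counter are our own illustration of the "100 cycles in every 10K cycles" example and do not correspond to any existing SGX feature.

```python
class InterruptDisableQuota:
    """At most `budget` interrupt-disabled cycles per `window` enclave cycles."""

    def __init__(self, budget=100, window=10_000):
        self.budget = budget
        self.window = window
        self._window_index = 0
        self._used = 0

    def request(self, now, cycles):
        """Grant or deny masking interrupts for `cycles`, starting at enclave cycle `now`."""
        index = now // self.window
        if index != self._window_index:
            # A new accounting window has started, so the budget is refreshed.
            self._window_index = index
            self._used = 0
        if self._used + cycles > self.budget:
            return False  # the request is denied and the enclave is told so
        self._used += cycles
        return True


quota = InterruptDisableQuota()
first = quota.request(now=0, cycles=60)     # granted
second = quota.request(now=500, cycles=40)  # granted, budget now exhausted
third = quota.request(now=900, cycles=1)    # denied within the same window
```

A fixed-window policy is the simplest choice but lets an enclave spend two budgets back to back across a window boundary; a sliding window would close that gap at the cost of more state.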

Direction 2: Temporarily Disabling Enclave Re-entry. Instead of blocking interrupts or exceptions, another option is to allow the enclave to disable enclave re-entry during runtime. In this design, the enclave can still be interrupted when it has disabled enclave re-entry, leaving untrusted software the chance to perform exception handling and manage resources accordingly. However, the SGX hardware only allows it to resume the enclave execution via ERESUME, but disallows re-entry into the enclave via EENTER. Immediately after enclave entry, since the enclave software needs to perform crucial operations to complete a context switch into the enclave, the enclave hardware should preferably automatically mask enclave re-entry by default to ensure its atomicity.
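The intended behavior of a re-entry mask can be captured in a small state machine. The class and its methods are a hypothetical model of the proposed abstraction, not of current SGX hardware, and details such as SSA frame management are omitted.

```python
class EnclaveThread:
    """Toy model of an enclave thread with a hardware re-entry mask."""

    def __init__(self):
        self.inside = False
        self.interrupted = False
        self.reentry_masked = False

    def eenter(self):
        if self.reentry_masked:
            raise PermissionError("EENTER rejected: re-entry is masked")
        self.inside = True
        self.reentry_masked = True  # masked by default right after entry

    def unmask_reentry(self):
        """Called by enclave software once the context switch is complete."""
        self.reentry_masked = False

    def aex(self):
        """Asynchronous exit: interrupts are still allowed while masked."""
        self.inside = False
        self.interrupted = True

    def eresume(self):
        if not self.interrupted:
            raise RuntimeError("nothing to resume")
        self.inside = True
        self.interrupted = False


thread = EnclaveThread()
thread.eenter()
thread.aex()  # interrupted while re-entry is still masked
try:
    thread.eenter()
    reentry_rejected = False
except PermissionError:
    reentry_rejected = True
thread.eresume()  # resuming the interrupted execution is still permitted
```

In this model, untrusted software keeps the ability to interrupt and resume the enclave, but it cannot start a new in-enclave handler until the enclave itself calls `unmask_reentry`.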

Both directions pose the risk of opening a new side channel. The attacker may learn whether an enclave is inside a critical code region simply by attempting to deliver an exception into it. Nevertheless, we believe addressing atomicity on the OS-enclave interface through a carefully designed hardware abstraction is promising future work. We hope these directions offer a starting point.

10 RELATED WORK

The SmashEx attack is targeted at Intel SGX enclaves. It stems from unsafe re-entrancy at the OS-enclave interface. We have demonstrated that, if exploited, it can lead to code-reuse attacks. In this section we examine prominent work on securing host-enclave interfaces on Intel SGX, code-reuse attacks targeting SGX enclaves, and re-entrancy vulnerabilities in non-enclave settings.

Security of Synchronous SGX Interfaces. Since the introduction of Intel SGX, there has been abundant work on the security of the synchronous interfaces between untrusted software and SGX enclaves. Previous work has discovered that an attacker can compromise the confidentiality and integrity of an enclave by providing malicious system call return values, referred to as Iago attacks [36]. Eliminating such threats requires enclave software to carefully scrutinize system call return values passed into an enclave [30, 57, 60], with the aid of formal verification [58] or software testing techniques [40]. In addition, enclave runtimes may forget to clean certain registers after a context switch into an enclave, thus opening up the enclave to attacks [29, 61]. The synchronous interface has been a subject of comprehensive survey and categorization of attacks [44]. Unlike these lines of work, our paper examines the security of asynchronous OS-enclave interfaces.

Security of Asynchronous SGX Interfaces. Existing attacks have shown that timer interrupts or page faults can be leveraged to leak enclave secrets through side channels [34, 62, 65, 66]. Defending against such side-channel attacks is non-trivial [37, 55, 56]. Previous work has examined the attack avenue of enclave thread scheduling. In the SGX threat model, the attacker can control the scheduling and influence the enclave logic. Such manipulations can compromise enclave confidentiality and integrity if the enclave logic is influenced by scheduling. For example, the attacker can affect the enclave behavior by exploiting existing synchronization bugs [2] or breaking assumptions made by the enclave application regarding the thread scheduling algorithm [63]. However, the security implications of the asynchronous interfaces of SGX enclaves have not been comprehensively studied. Specifically, to our knowledge, our work is the first to study the security implications at the OS-enclave interface for asynchronous exceptions on SGX.

Code-reuse Attacks on SGX. The prevalence of code-reuse attacks in non-enclave applications is well studied [59]. Although enclaves reduce the size of the trusted computing base, they are susceptible to corruption if the enclave code has unsafe memory usage. Thus, enclaves are not immune to code-reuse attacks [38]. Dark-ROP [45] demonstrates a ROP attack even when the enclave binary is end-to-end encrypted [32, 53] such that the attacker cannot inspect it. They assume a fixed enclave address space layout, which allows the attacker to probe the locations of useful gadgets through trial and error. This assumption is justified by the difficulty in applying defense techniques such as ASLR to SGX enclaves due to the constraints imposed by the Intel SGX design. SGX-Shield [54] proposes a strategy to enable ASLR in SGX enclaves and prevent code-reuse attacks. However, as shown in the subsequent work, SGX-Shield does not randomize the code inside the trusted runtime of the enclave. This allows the attacker to exploit memory-unsafe enclave code and launch powerful ROP attacks [33]. SmashEx demonstrates a code-reuse attack on enclaves. However, unlike the existing work, we do not assume a pre-existing memory vulnerability in the enclave software.

Re-entrancy Vulnerabilities & Defenses. Traditional asynchronous interfaces, such as the signal handler, are prone to re-entrancy challenges [67]. Such vulnerabilities are common in several other systems [5–7, 10, 11, 42, 46]. SmashEx is the first attack that exploits re-entrancy vulnerabilities in the context of Intel SGX. Preventing re-entrancy bugs in general involves introducing a notion of atomicity. For instance, when the code is operating in a critical section, the user application can request the OS to mask certain signals (i.e., to pause their delivery) [27]. Our work makes the first attempt to compare and contrast exception handling in Intel SGX versus traditional systems. Our findings highlight the need for better hardware abstractions to enable safely re-entrant enclave code.

SmashEx brings attention to a new avenue of powerful attacks on Intel SGX. It can serve as a motivation to further scrutinize and fortify the enclave asynchronous interface.

11 RESPONSIBLE DISCLOSURE

We informed the affected parties (Intel for Intel SGX SDK and Microsoft for Open Enclave SDK) on 3 May 2021. Intel and Microsoft acknowledged the attack and assigned CVE-2021-0186 and CVE-2021-33767 respectively [8, 9]. After due process, Intel and Microsoft released patches for SmashEx on 13 July 2021. In addition, they released public advisories on 13 July 2021 [21] and 12 October 2021 [19]. Further, we have assisted Intel and Microsoft in coordinating responsible disclosures to the affected runtimes listed in Table 1, where it was requested.

12 CONCLUSION

Asynchronous exception handling is a commodity functionality for real-world applications today, which are increasingly utilizing enclaves. In this work, we show the importance of providing atomicity guarantees at the OS-enclave interface for such exceptions. We have introduced a new attack called SmashEx in this work, which exploits the inherent re-entrancy interface required in exception handling on SGX. Our exploits demonstrate the issue concretely on popular SGX runtime frameworks. We hope our work initiates careful consideration for asynchronous exception handling in existing SGX frameworks as well as in future enclave designs.

AVAILABILITY

We maintain further information regarding SmashEx, including how to acquire the proof-of-concept exploits for educational purposes, at https://jasonyu1996.github.io/SmashEx/.

ACKNOWLEDGMENTS

We thank the anonymous CCS reviewers for their valuable suggestions. We also thank Ivan Puddu for help with SGX-Step. This work was supported by Crystal Center at National University of Singapore. Zhiping Cai’s work was funded by National Natural Science Foundation of China (62072465). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors only.

REFERENCES

[1] [n.d.]. Apache Teaclave: A Universal Secure Computing Platform. https://teaclave.apache.org/.

[2] [n.d.]. AsyncShock: Exploiting Synchronisation Bugs in Intel SGX Enclaves.

[3] [n.d.]. Confidential Computing Consortium - Open Source Community. https://confidentialcomputing.io/.

[4] [n.d.]. Confidential Consortium Framework - Microsoft Research. https://www.microsoft.com/en-us/research/project/confidential-consortium-framework/.

[5] [n.d.]. CVE - CVE-2011-1285: The regular-expression functionality in Google Chrome before 10.0.648.127 does not properly implement reentrancy, which allows remote attackers to cause a denial of service (memory corruption) or possibly have unspecified other impact via unknown vectors. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2011-1285.

[6] [n.d.]. CVE - CVE-2016-5185: Blink in Google Chrome prior to 54.0.2840.59 for Windows, Mac, and Linux; 54.0.2840.85 for Android incorrectly allowed reentrance of FrameView::updateLifecyclePhasesInternal(), which allowed a remote attacker to perform an out of bounds memory read via crafted HTML pages. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-5185.

[7] [n.d.]. CVE - CVE-2018-16065: A Javascript reentrancy issues that caused a use-after-free in V8 in Google Chrome prior to 69.0.3497.81 allowed a remote attacker to execute arbitrary code inside a sandbox via a crafted HTML page. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-16065.

[8] [n.d.]. CVE-2021-0186 - Intel® SGX SDK Escalation of Privilege Vulnerability. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-0186.

[9] [n.d.]. CVE-2021-33767 - Security Update Guide - Microsoft - Open Enclave SDK Elevation of Privilege Vulnerability. https://msrc.microsoft.com/update-guide/vulnerability/CVE-2021-33767.

[10] [n.d.]. CWE - CWE-1265: Unintended Reentrant Invocation of Non-reentrant Code Via Nested Calls (4.4). https://cwe.mitre.org/data/definitions/1265.html.

[11] [n.d.]. CWE - CWE-479: Signal Handler Use of a Non-reentrant Function (4.4). https://cwe.mitre.org/data/definitions/479.html.

[12] [n.d.]. Edgeless RT: an SDK and a runtime for Intel SGX. https://github.com/edgelesssys/edgelessrt.

[13] [n.d.]. Enarx: an application deployment system enabling applications to run within TEE. https://github.com/enarx/enarx/wiki/Enarx-Introduction.

[14] [n.d.]. Enarx FAQ · enarx/enarx Wiki. https://github.com/enarx/enarx/wiki/Enarx-FAQ.

[15] [n.d.]. Inclavare Containers. https://inclavare-containers.io/.

[16] [n.d.]. Intel 64 and IA-32 Architectures Software Developer Manuals.

[17] [n.d.]. Intel Software Guard Extensions SDK - Documentation | Intel Software. https://software.intel.com/en-us/sgx-sdk/documentation.

[18] [n.d.]. Intel Software Guard Extensions SSL cryptographic library. https://github.com/intel/intel-sgx-ssl.

[19] [n.d.]. Intel® SGX SDK Advisory. https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00548.html.

[20] [n.d.]. [LibOS] Rework signal handling by boryspoplawski · Pull Request #2090 · oscarlab/graphene. https://github.com/oscarlab/graphene/pull/2090.

[21] [n.d.]. Open Enclave SDK Elevation of Privilege Vulnerability · Advisory · openenclave/openenclave. https://github.com/openenclave/openenclave/security/advisories/GHSA-mj87-466f-jq42.

[22] [n.d.]. The Open Group Base Specifications Issue 7, 2018 edition. https://pubs.opengroup.org/onlinepubs/9699919799/.

[23] [n.d.]. OpenSSL: Cryptography and SSL/TLS Toolkit. https://www.openssl.org/.

[24] [n.d.]. [Pal/Linux-SGX] enclave_entry.S fix stack on exception and ocall by yamahata · Pull Request #696 · oscarlab/graphene. https://github.com/oscarlab/graphene/pull/696.

[25] [n.d.]. [Pal/Linux-SGX] Fix super-subtle bugs in SGX enter/exit/AEX flows by dimakuv · Pull Request #1663 · oscarlab/graphene. https://github.com/oscarlab/graphene/pull/1663.

[26] [n.d.]. Ratel: a general framework for instruction-level interposition on enclaved applications. https://ratel-enclave.github.io/.

[27] [n.d.]. sigprocmask(2) - Linux manual page. https://man7.org/linux/man-pages/man2/sigprocmask.2.html.

[28] [n.d.]. Veracruz: privacy-preserving collaborative compute. https://github.com/veracruz-project/veracruz.

[29] Fritz Alder, Jo Van Bulck, David F. Oswald, and Frank Piessens. 2020. Faulty Point Unit: ABI Poisoning Attacks on Intel SGX. In ACSAC ’20: Annual Computer Security Applications Conference, Virtual Event / Austin, TX, USA, 7-11 December, 2020. ACM, 415–427. https://doi.org/10.1145/3427228.3427270

[30] Sergei Arnautov, Bohdan Trach, Franz Gregor, Thomas Knauth, Andre Martin, Christian Priebe, Joshua Lind, Divya Muthukumaran, Daniel O’Keeffe, Mark L Stillwell, David Goltzsche, Dave Eyers, Rüdiger Kapitza, Peter Pietzuch, and Christof Fetzer. 2016. SCONE: Secure Linux Containers with Intel SGX. In OSDI.

[31] asylo 2019. Google Asylo: An open and flexible framework for enclave applications. https://asylo.dev/.

[32] Andrew Baumann, Marcus Peinado, and Galen Hunt. 2014. Shielding Applications from an Untrusted Cloud with Haven. In OSDI.

[33] Andrea Biondo, Mauro Conti, Lucas Davi, Tommaso Frassetto, and Ahmad-Reza Sadeghi. 2018. The Guard’s Dilemma: Efficient Code-Reuse Attacks against Intel SGX. In Proceedings of the 27th USENIX Conference on Security Symposium (Baltimore, MD, USA) (SEC’18). USENIX Association, USA, 1213–1227.

[34] Jo Van Bulck, Nico Weichbrodt, Rüdiger Kapitza, Frank Piessens, and Raoul Strackx. 2017. Telling Your Secrets without Page Faults: Stealthy Page Table-Based Attacks on Enclaved Execution. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 1041–1056. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/van-bulck

[35] Chia che Tsai, Donald E. Porter, and Mona Vij. 2017. Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX. In 2017 USENIX Annual Technical Conference (USENIX ATC 17). USENIX Association, Santa Clara, CA, 645–658. https://www.usenix.org/conference/atc17/technical-sessions/presentation/tsai

[36] Stephen Checkoway and Hovav Shacham. 2013. Iago Attacks: Why the System Call API is a Bad Untrusted RPC Interface. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’13).

[37] Guoxing Chen, Wenhao Wang, Tianyu Chen, Sanchuan Chen, Yinqian Zhang, XiaoFeng Wang, Ten-Hwang Lai, and Dongdai Lin. 2018. Racing in Hyperspace: Closing Hyper-Threading Side Channels on SGX with Contrived Data Races. In 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, 21-23 May 2018, San Francisco, California, USA. IEEE Computer Society, 178–194. https://doi.org/10.1109/SP.2018.00024

[38] Tobias Cloosters, Michael Rodler, and Lucas Davi. 2020. TeeRex: Discovery and Exploitation of Memory Corruption Vulnerabilities in SGX Enclaves. In 29th USENIX Security Symposium. http://tubiblio.ulb.tu-darmstadt.de/119694/

[39] Crispan Cowan. 1998. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks. In Proceedings of the 7th USENIX Security Symposium, San Antonio, TX, USA, January 26-29, 1998, Aviel D. Rubin (Ed.). USENIX Association. https://www.usenix.org/conference/7th-usenix-security-symposium/stackguard-automatic-adaptive-detection-and-prevention

[40] Rongzhen Cui, Lianying Zhao, and David Lie. 2021. Emilia: Catching Iago in Legacy Code. https://doi.org/10.14722/ndss.2021.24328

[41] fortanix-rust-sgx [n.d.]. fortanix/rust-sgx: The Fortanix Rust Enclave Development Platform. https://github.com/fortanix/rust-sgx.

[42] Shelly Grossman, Ittai Abraham, Guy Golan-Gueta, Yan Michalevsky, Noam Rinetzky, Mooly Sagiv, and Yoni Zohar. 2017. Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts. POPL (2017).

[43] Hong Hu, Shweta Shinde, Sendroiu Adrian, Zheng Leong Chua, Prateek Saxena, and Zhenkai Liang. 2016. Data-Oriented Programming: On the Expressiveness of Non-control Data Attacks. In IEEE Symposium on Security and Privacy, SP 2016, San Jose, CA, USA, May 22-26, 2016. 969–986. https://doi.org/10.1109/SP.2016.62

[44] Mustakimur Rahman Khandaker, Yueqiang Cheng, Zhi Wang, and Tao Wei. 2020. COIN Attacks: On Insecurity of Enclave Untrusted Interfaces in SGX. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’20).

[45] Jaehyuk Lee, Jinsoo Jang, Yeongjin Jang, Nohyun Kwak, Yeseul Choi, Changho Choi, Taesoo Kim, Marcus Peinado, and Brent ByungHoon Kang. 2017. Hacking in Darkness: Return-oriented Programming against Secure Enclaves. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 523–539. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/lee-jaehyuk

[46] Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. (2008).

[47] open-enclave [n.d.]. Open Enclave SDK. https://openenclave.io/sdk/.

[48] openenclave [n.d.]. OpenEnclave. https://github.com/openenclave/openenclave/tree/v0.15.0.

[49] openenclave-curl [n.d.]. OpenEnclave Curl. https://github.com/openenclave/openenclave-curl.

[50] Meni Orenbach, Yan Michalevsky, Christof Fetzer, and Mark Silberstein. 2019. CoSMIX: a compiler-based system for secure memory instrumentation and execution in enclaves. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). 555–570.

[51] Christian Priebe, Divya Muthukumaran, Joshua Lind, Huanzhou Zhu, Shujie Cui, Vasily A Sartakov, and Peter Pietzuch. 2019. SGX-LKL: Securing the Host OS Interface for Trusted Execution. arXiv preprint arXiv:1908.11143 (2019).

[52] Ryan Roemer, Erik Buchanan, Hovav Shacham, and Stefan Savage. 2012. Return-Oriented Programming: Systems, Languages, and Applications. ACM Trans. Inf. Syst. Secur. 15, 1, Article 2 (March 2012), 34 pages. https://doi.org/10.1145/2133375.2133377

[53] Felix Schuster, Manuel Costa, Cedric Fournet, Christos Gkantsidis, Marcus Peinado, Gloria Mainar-Ruiz, and Mark Russinovich. 2015. VC3: Trustworthy Data Analytics in the Cloud. In IEEE S&P.

[54] Jaebaek Seo, Byoungyoung Lee, Seong Min Kim, Ming-Wei Shih, Insik Shin, Dongsu Han, and Taesoo Kim. 2017. SGX-Shield: Enabling Address Space Layout Randomization for SGX Programs. In NDSS.

[55] Ming-Wei Shih, Sangho Lee, Taesoo Kim, and Marcus Peinado. 2017. T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs. In NDSS. Internet Society.

[56] Shweta Shinde, Zheng Leong Chua, Viswesh Narayanan, and Prateek Saxena. 2016. Preventing Page Faults from Telling Your Secrets. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (ASIA CCS ’16).

[57] Shweta Shinde, Dat Le Tien, Shruti Tople, and Prateek Saxena. 2017. Panoply: Low-TCB Linux Applications With SGX Enclaves. In 24th Annual Network and Distributed System Security Symposium, NDSS.

[58] Shweta Shinde, Shengyi Wang, Pinghai Yuan, Aquinas Hobor, Abhik Roychoud-hury, and Prateek Saxena. 2020. BesFS: A POSIX Filesystem for Enclaves with aMechanized Safety Proof. In 29th USENIX Security Symposium (USENIX Security20).

[59] László Szekeres, Mathias Payer, Tao Wei, and Dawn Song. 2013. SoK: EternalWar in Memory. In 2013 IEEE Symposium on Security and Privacy. 48–62. https://doi.org/10.1109/SP.2013.13

[60] Chia-Che Tsai, Kumar SaurabhArora, Nehal Bandi, Bhushan Jain,William Jannen,Jitin John, Harry A. Kalodner, Vrushali Kulkarni, Daniela Oliveira, and Donald E.Porter. 2014. Cooperation and Security Isolation of Library OSes forMulti-ProcessApplications. In EuroSys.

[61] Jo Van Bulck, David Oswald, Eduard Marin, Abdulla Aldoseri, Flavio D. Garcia,and Frank Piessens. 2019. A Tale of Two Worlds: Assessing the Vulnerability ofEnclave Shielding Runtimes. In Proceedings of the 2019 ACM SIGSAC Conferenceon Computer and Communications Security (CCS ’19).

[62] Jo Van Bulck, Frank Piessens, and Raoul Strackx. 2017. SGX-Step: A PracticalAttack Framework for Precise Enclave Execution Control. In 2nd Workshop on

System Software for Trusted Execution (SysTEX). ACM, 4:1–4:6.[63] Jose Rodrigo Sanchez Vicarte, Benjamin Schreiber, Riccardo Paccagnella, and

Christopher W. Fletcher. 2020. Game of Threads: Enabling Asynchronous Poison-ing Attacks. In ASPLOS ’20: Architectural Support for Programming Languages andOperating Systems, Lausanne, Switzerland, March 16-20, 2020, James R. Larus,Luis Ceze, and Karin Strauss (Eds.). ACM, 35–52. https://doi.org/10.1145/3373376.3378462

[64] Huibo Wang, Pei Wang, Yu Ding, Mingshen Sun, Yiming Jing, Ran Duan, Long Li,Yulong Zhang, Tao Wei, and Zhiqiang Lin. 2019. Towards Memory Safe EnclaveProgramming with Rust-SGX. In CCS.

[65] Wenhao Wang, Guoxing Chen, Xiaorui Pan, Yinqian Zhang, XiaoFeng Wang,Vincent Bindschaedler, Haixu Tang, and Carl A. Gunter. 2017. Leaky Cauldronon the Dark Land: Understanding Memory Side-Channel Hazards in SGX. CoRRabs/1705.07289 (2017). arXiv:1705.07289 http://arxiv.org/abs/1705.07289

[66] Yuanzhong Xu, Weidong Cui, and Marcus Peinado. 2015. Controlled-ChannelAttacks: Deterministic Side Channels for Untrusted Operating Systems. In 2015IEEE Symposium on Security and Privacy. 640–656. https://doi.org/10.1109/SP.2015.45

[67] Michal Zalewski. 2001. Delivering Signals for Fun and Profit: Understanding,exploiting and preventing signal-handling related vulnerabilities. (2001). https://lcamtuf .coredump.cx/signals.txt

15