
Retrofitting Privacy into Operating Systems

A dissertation presented in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in the field of

Information Assurance

by

Kaan Onarlioglu

Committee Members

Engin Kirda, Northeastern University
William Robertson, Northeastern University
Christo Wilson, Northeastern University
Manuel Egele, Boston University

Northeastern University
College of Computer and Information Science

Boston, Massachusetts
December, 2016


Abstract

Changing technology and adversarial models often render existing privacy defenses obsolete, or lead to the emergence of new privacy-sensitive computer resources. This requires the security community to develop novel defenses that address contemporary privacy threats.

The operating system is a natural platform to design these novel privacy defenses on, as it allows for enforcing correct, strong security properties on applications, and scales to the entire user space. However, developing and deploying a secure operating system from scratch is impractical.

In this thesis, we instead argue that extending existing operating systems with novel privacy defenses is the preferable approach to addressing emerging privacy needs, while also sidestepping issues of cost and usability. We show that such an approach is both feasible and effective.

To support this claim, we examine four distinct privacy scenarios: (1) keeping privacy-sensitive data produced during short-lived program execution sessions confidential, (2) securely deleting sensitive data persisted to modern storage media for long-term use, (3) hiding the existence of encrypted sensitive data on disk, and finally, (4) providing a user-driven access control model for non-traditional privacy-sensitive resources on a computer. We discuss the contemporary privacy threats pertaining to each scenario, and present four privacy-enhancing techniques to address these issues. We demonstrate that our solutions can be retrofitted into existing operating systems to provide general, application-transparent, and usable defenses.


Acknowledgments

Pursuing a PhD can be a scarring experience. I came through unscathed, and it is largely thanks to Prof. Engin Kirda who mentored and championed me throughout these long years. I am indebted to him for the many opportunities he has opened up for me. Likewise, I owe a special debt of gratitude to Prof. William Robertson for guiding me through the inception and formulation of my research. I have learned a lot from Engin and Wil, and enjoyed countless scholarly moments together with them∗. It was an honor to be part of the team.

I thank my committee members Prof. Christo Wilson and Prof. Manuel Egele for patiently reading this thesis, and for their valuable feedback.

I am grateful to Prof. Ali Aydin Selcuk who set things in motion by converting me to security research. I understand now that the alternative path would have led to a world of pain and misery. It was a close call.

I thank all the SecLab members I had the pleasure of working and spending time with: Ahmet, Ahmet2, Amin, Andrea, Beri, Can, Collin, Erik, Matt, Michael, Patrick, Sajjad, and Sevtap. Also Amirali, even though I am still not certain why he was hanging around with us. A special shout-out to Tobias for sticking with me since the very beginning. It was a fun ride.

I thank my parents for their endless “love and support” – without them I never could have afforded the rent in Boston. Lastly, I thank Defne for bearing with me through some extremely difficult times and making life easier on all fronts.

∗Excluding that one time in Japan. That was not very scholarly.


Contents

List of Tables

List of Figures

1 Introduction
  1.1 Motivation for Novel Defenses
  1.2 Constraints for Practical Defenses
  1.3 Our Approach
  1.4 Contributions
  1.5 Research Goals

2 PrivExec: Private Execution as an Operating System Service
  2.1 Overview
  2.2 Threat Model
  2.3 Design
    2.3.1 Security Properties
    2.3.2 File System
    2.3.3 Swap Space
    2.3.4 Inter-Process Communication
    2.3.5 Memory Isolation
    2.3.6 Discussion
  2.4 Implementation
    2.4.1 Private Process Management
    2.4.2 Private Disk I/O
    2.4.3 Private Swap Space
    2.4.4 Private Inter-Process Communication
    2.4.5 Launching Private Applications
  2.5 Evaluation
    2.5.1 Running Popular Applications
    2.5.2 Disk I/O and File System Benchmarks
    2.5.3 Real-World Application Performance
  2.6 Limitations
  2.7 Related Work
  2.8 Summary

3 Eraser: Secure Deletion on Blackbox Hardware
  3.1 Overview
  3.2 Background & Related Work
    3.2.1 Related Work
    3.2.2 Flash Translation Layers
    3.2.3 Motivation
  3.3 Threat Model
  3.4 Design
    3.4.1 Naïve Approach
    3.4.2 File Key Trees
  3.5 Implementation
    3.5.1 Alternative Solutions & Our Philosophy
    3.5.2 Prototype Overview
    3.5.3 I/O Manipulation
    3.5.4 Intercepting File Deletions
    3.5.5 Key Storage & Management
    3.5.6 Master Key Vault
    3.5.7 Encrypting Non-File System Blocks
    3.5.8 Managing Eraser Partitions
  3.6 Evaluation
    3.6.1 I/O Benchmarks
    3.6.2 Tests with Many Small Files
    3.6.3 Discussion of Results
  3.7 Discussion
  3.8 Summary

4 HiVE: Hidden Volume Encryption
  4.1 Overview
  4.2 Threat Model
  4.3 Design
    4.3.1 Model
    4.3.2 Generic Hidden Volume Encryption
    4.3.3 Write-Only ORAM Construction
    4.3.4 Choosing the Parameter k
    4.3.5 Write-Only ORAM Optimizations
    4.3.6 Hidden Volume Encryption with HiVE
  4.4 Implementation
  4.5 Evaluation
  4.6 Related Work
  4.7 Summary

5 Overhaul: Input-Driven Access Control on Traditional Operating Systems
  5.1 Overview
  5.2 Threat Model
  5.3 Design
    5.3.1 Security Properties
    5.3.2 Trusted Input & Output Paths
    5.3.3 Permission Adjustments
    5.3.4 Sensitive Resource Protection
    5.3.5 Interaction Across Process Boundaries
    5.3.6 Discussion
  5.4 Implementation
    5.4.1 Enhancements to X Window System
    5.4.2 Enhancements to the Linux Kernel
  5.5 Evaluation
    5.5.1 Performance Measurements
    5.5.2 Usability Experiments
    5.5.3 Applicability & False Positives Assessment
    5.5.4 Empirical Experiments
  5.6 Related Work
  5.7 Summary

6 Conclusions

List of Tables

2.1 Disk I/O and file system performance of PrivExec.
2.2 Runtime performance overhead of PrivExec for two popular web browsers.
2.3 Runtime performance overhead of PrivExec for various desktop and console applications.
3.1 Disk I/O and file system performance of Eraser compared to full disk encryption with dm-crypt.
3.2 Timed experiments with the Linux kernel source code directory to compare the small-file performance of Eraser to full disk encryption with dm-crypt.
4.1 Disk I/O and file system performance of Hive.
5.1 Performance overhead of Overhaul.

List of Figures

2.1 An overview of PrivExec’s design.
2.2 An overview of the Linux block I/O layers.
2.3 Setting up the secure storage container and overlaying it on the root file system.
3.1 Structure of an n-ary FKT.
3.2 Secure deletion using an FKT.
3.3 An overview of Eraser’s design.
4.1 Hive stores volumes interleaved on disk.
5.1 Dynamic access control over privacy-sensitive hardware devices.
5.2 Protecting copy & paste operations against clipboard sniffing.
5.3 A program launcher executing a screen capture program.
5.4 A multi-process browser, components of which communicate via shared memory IPC.
5.5 Sample visual alerts shown by Overhaul.
5.6 Protocol diagram for the X11 copy & paste operation.

Chapter 1

Introduction

The ever-increasing significance of computers in our daily lives, whether to work, to communicate, or to entertain, has resulted in the present situation where our computers process and store immense amounts of private information. For instance, personal and sensitive information such as records of online communications or financial transactions, bank account or credit card details, and user credentials such as passwords to Internet-based services is often kept on the computer’s disk.

Various defenses are readily available on modern computer systems to alleviate potential threats to user privacy. For example, a large pool of file and disk encryption utilities makes it possible for users to secure their confidential data against unauthorized access. Similarly, many modern operating systems have powerful access control mechanisms in place to isolate multiple users sharing a computer from each other, or to enforce security policies on applications that attempt to access privacy-sensitive system resources.

On the one hand, such well-established defenses have proven to be effective at mitigating many common attacks on user privacy in the past. On the other hand, the privacy requirements of modern-day computer users are still rapidly evolving as new applications of privacy-sensitive information emerge, all while attacks that compromise privacy become more complex and lucrative. This situation requires security professionals to revise, update, and adapt existing defenses, or to devise novel solutions that adequately address contemporary threats to privacy.

1.1 Motivation for Novel Defenses

Two substantial factors that motivate the design of novel privacy defenses and underline the shortcomings of existing ones are changing technology and changing adversarial models. Here, we take a closer look at these factors and illustrate how they impact current privacy solutions.

Rapid technological advancements give rise to a need for novel privacy defenses by exposing new types of privacy-sensitive resources to attackers. One prominent example is the proliferation of mobile devices such as smartphones and tablets. These devices are equipped with various physical sensors such as cameras and microphones, which are often not sufficiently protected by traditional access control mechanisms. As a result, the potential for their abuse to spy on users has become a growing concern.

Another instance of a typical system resource that has previously received little attention from a privacy standpoint is the clipboard. As the use of online commerce and other Internet-based services has soared over the years, digital wallets and password managers have become commonplace. These tools often use the clipboard to automatically copy users’ secret credentials into Web forms. Thus, clipboard contents have become attractive targets for attackers.

In a similar manner, other technologies render previously working defenses obsolete. For example, while securely erasing sensitive persistent data was previously possible by overwriting the contents of the corresponding disk sectors, modern storage technologies such as journaling file systems and solid state drives (SSDs) make it difficult to prevent data remanence.

A second pertinent factor that calls for revisions to existing privacy defenses is the changing roles and capabilities of adversaries. Contemporary attacks against full disk encryption technologies clearly illustrate this problem. Disk encryption is a well-established, technically sound solution to restrict access to stored private information in the case of device theft or forensic inspection of the disk. Nevertheless, evolving adversarial models have positioned government authorities, such as law enforcement agencies, as powerful threats to user privacy [1]. The fact that such authorities can force users to disclose their disk encryption keys and render any encryption ineffective necessitates novel defenses that are resistant to coercion attacks.

Even with the use of techniques such as hidden encrypted disk partitions that would allow a user to conceal the existence of encrypted data on disk, sufficiently motivated adversaries can still exploit shortcomings of known defenses and gain access to the sensitive data. For example, the hidden partition scheme provided by TrueCrypt, a popular disk encryption tool, can be bypassed by inspecting multiple snapshots taken from the same disk [2].

1.2 Constraints for Practical Defenses

Generalizing from the above examples, we observe that as the accumulation of private information on computers increases and threats to privacy keep evolving, defenses need to be revised to address changing security requirements. However, resolving these emerging privacy issues in an effective way also requires solution designers to adhere to various constraints.

First, given the scale of sensitive information being processed and stored on computers today, a great many applications would benefit from additional privacy features offered by novel solutions. However, implementing and maintaining application-specific privacy features would both come at a high software development cost and be more likely to introduce security-crippling bugs. This issue is exemplified by a prevalent application-specific defense: the private browsing modes offered by modern web browsers, which promise to erase the user’s tracks after each browsing session ends. Indeed, recent research has demonstrated that all major browsers have flawed private browsing mode implementations that leave behind traces of the terminated private session [3].

Consequently, we argue that designing more general solutions that target large categories of applications, without requiring modifications to every single application instance, is a better approach to implementing correct privacy features at scale. We point out that this requirement also implies that practical defenses need to be transparent to existing applications; in other words, they need to work out of the box, without disrupting the functionality of, or requiring drastic modifications to, common software running on the system.

Next, we note that a common pattern in security technologies is that low-complexity, low-overhead solutions are easily adopted (e.g., ASLR [4], NOEXEC [5]). In contrast, solutions that require complex setup procedures or are resource-intensive are not immediately or widely utilized, even if they are effective defenses (e.g., full-scale Control Flow Integrity [6], CSP [7]). Therefore, defenses that are easy to deploy and less resource-intensive are more likely to have a significant practical impact.

Finally, we stress that the human element is another important factor that determines the success of novel security features. Defenses that are not well-understood by users or are unfamiliar to them are more likely to be misused or completely abandoned in practice. Therefore, designing high-usability privacy defenses also warrants significant attention.

1.3 Our Approach

In light of the constraints and requirements discussed above, we argue that the operating system is a natural and suitable place to implement novel privacy defenses against emerging threats.

First and foremost, thanks to its supervisor role, the operating system is able to introspect on all user space, providing strong security guarantees to applications and enforcing privacy policies. Furthermore, due to its position as a common platform that all applications run on, the operating system is capable of offering general privacy services to programs in an application-agnostic manner. This also enables easy verification of the implemented defenses to ensure correctness.

While it could be possible to build a secure operating system from the ground up with this philosophy in mind, such an approach would have numerous negative implications for the effectiveness and practicality of any proposed privacy features. To begin with, a new operating system would likely require extensive changes to the user space. In other words, maintaining compatibility with the large pool of existing, well-established software systems would require modifying, or in the worst case, rewriting each one of these programs. In addition, replacing universally deployed and popular operating systems such as Windows, Mac OS X, or Linux, and familiarizing their users with a brand new platform, would be costly. All in all, a new secure operating system would be highly unlikely to gain widespread adoption.


Thesis statement

In this thesis, we argue that extending existing operating systems with novel defenses is the preferable approach to addressing emerging privacy needs. We show that retrofitting privacy-enhancing technologies into existing operating systems is both feasible and effective.

1.4 Contributions

In this work, we examine the issue of user privacy from four perspectives, each representing a distinct threat scenario to sensitive information captured and processed by, or stored on, a computer. We present four systems to retrofit novel privacy-enhancing techniques into existing operating systems to address these problems. A summary of the topics we cover is as follows.

• In Chapter 2, we look at the problem of keeping short-lived program execution sessions private, or in other words, securely erasing all traces left behind by a running application in persistent storage. We describe PrivExec, an operating system service that provides a private browsing mode-like private execution platform for arbitrary applications.

• In Chapter 3, we shift our focus to the challenges of irreversibly deleting data that is persisted to storage media for long-term use, after it is no longer needed. We present Eraser, a technique that can guarantee secure deletion independent of the characteristics of the underlying storage medium.

• In Chapter 4, we tackle the problem of hiding the existence of sensitive information on a computer. We propose Hive, a hidden volume encryption scheme that offers plausibly deniable disk encryption against strong adversaries with multiple disk snapshot capabilities.

• In Chapter 5, we discuss access control for other types of privacy-sensitive system resources, such as hardware sensors attached to a computer, and virtual resources including the clipboard and screen contents. We present Overhaul, a dynamic, input-driven access control architecture, where access to privacy-sensitive resources is mediated based on the temporal proximity of user interactions to access requests, and requests are communicated back to the user via visual alerts.

In each of the following chapters, we first present operating system-independent designs for the aforementioned systems, and describe how they solve the privacy issues in our focus. Next, we describe concrete Linux implementations for all four systems, and demonstrate that retrofitting privacy features into a standard, prevalent operating system is both feasible and practical.

1.5 Research Goals

Before we venture into the discussion of these individual systems that we propose, we lay out a set of criteria, or in other words, research goals, as a guideline to designing effective privacy systems with high practical impact. These are presented below.

(G1) Compatibility. The privacy threats that we have described and that we aim to address in this research affect many users today. Therefore, the solutions we propose must be retrofittable into existing operating systems. This must be achieved without drastic modifications to well-established operating system design and implementation principles, but instead by reusing tried and tested technologies already available.

(G2) Generality. Providing application-specific privacy features to target ever-evolving threats and adversaries does not scale well with the plethora of applications available to users today. Such an approach would require high development and maintenance effort, and be prone to developer errors. Instead, our approach must provide security features as general operating system services to all applications running on the system, making it possible for users to avail themselves of the defenses as necessary, regardless of the application they are running.

(G3) Transparency. Directly following from the above, techniques that would require patching or rewriting individual applications to make them conform to the restrictions of the introduced defense mechanisms are cumbersome to deploy, and are especially impractical for use with legacy binary applications without source code. Thus, the techniques we propose must be transparent to the software running above the operating system level, and must not require any modifications to existing applications. In the same way, the systems we propose must not break, or interfere with, the normal functionality of applications.

(G4) Usability. Security solutions that are resource-intensive, difficult to set up or use, or that drastically change the user experience of interacting with a computer are less likely to enjoy widespread adoption. Therefore, the systems we propose must provide low-overhead, unintrusive defenses while preserving the user experience computer users are familiar with.

At the end of each of the following chapters, we will revisit these criteria to assess and discuss the effectiveness of the proposed systems.


Chapter 2

PrivExec: Private Execution as an Operating System Service

2.1 Overview

Approaches to preserving user privacy on the client side often involve preventing sensitive data handled by applications from being exposed in the clear in persistent storage. Web browsers serve as a canonical example of such an approach. As part of their normal execution, browsers store a large amount of personal information that could potentially be damaging were it to be disclosed, such as the browsing history, bookmarks, cache, cookie store, or local storage contents. In recognition of the fact that users might not want to leave traces of particularly sensitive browsing sessions, browsers now typically offer a private browsing mode that attempts to prevent persistent modifications to storage that could provide some indication of the user’s activities during such a session. In this mode, sensitive user data that would normally be persisted to disk is instead only stored temporarily, if at all, and when a private browsing session ends, this data is discarded.


Private browsing mode has come to be a widely-used feature of major browsers. However, its implementation as an application-specific feature has significant disadvantages that are important to recognize. First, implementing a privacy-preserving execution mode is difficult to get right. For instance, prior work by Aggarwal et al. [3] demonstrates that all of the major browsers leave traces of sensitive user data on disk despite the use of private browsing mode. Second, if any sensitive data does reach stable storage, it is difficult for user-level applications to guarantee that this data would not be recoverable via forensic analysis. For example, modern journaling file systems make disk wiping techniques unreliable, and applications must be careful to prevent sensitive data from being swapped to disk through judicious use of system calls such as mlock on Linux.
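To make the latter point concrete, here is a minimal user-space sketch, not taken from PrivExec, of how an application can pin a sensitive buffer in RAM with mlock so that it is never written to a swap device; the buffer and its use are hypothetical.

    /* Sketch: pin a sensitive buffer in RAM with mlock(2) so that it is
     * never swapped to disk. The buffer and its contents are hypothetical. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        static char secret[4096];

        if (mlock(secret, sizeof(secret)) != 0) {  /* pin the pages in RAM */
            perror("mlock");
            return EXIT_FAILURE;
        }

        /* ... place key material or other secrets in the buffer ... */

        memset(secret, 0, sizeof(secret));  /* wipe before unpinning; a real
                                               program would prefer
                                               explicit_bzero to defeat
                                               dead-store elimination */
        munlock(secret, sizeof(secret));
        return EXIT_SUCCESS;
    }

Such care, however, must be replicated correctly in every application that handles secrets, which is precisely the scaling problem that motivates an operating system-level solution.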

One way to avoid leaving traces of sensitive user data in persistent storage is to use cryptographic techniques such as full disk encryption. Here, the idea is to ensure that all application disk writes are encrypted prior to storage. Therefore, regardless of the nature of the data that is saved to disk, users without knowledge of the corresponding secret key will not be able to recover any information. While this is a powerful and realizable technique, it nevertheless has the significant disadvantage that users can be coerced, through legal or other means, into disclosing their keys, at which point the encryption becomes useless.

These concerns suggest that private execution is a feature that is best provided by the operating system, where strong privacy guarantees can be provided to any application and analyzed for correctness. Standard cryptographic techniques such as disk encryption do not satisfactorily solve the problem.

In this chapter, we present the design and implementation of PrivExec, a novel operating system service for private execution. PrivExec provides strong, general guarantees of private execution, allowing any application to execute in a mode where storage writes, either to the file system or to swap, will not be recoverable by others during or after execution. PrivExec achieves this by binding an ephemeral private execution key to groups of processes that wish to execute privately. This key is used to encrypt all data stored to file systems, as well as process memory pages written to swap devices, and is never exposed outside of kernel memory or persisted to storage. Once a private execution session has ended, the private execution key is securely wiped from volatile memory. In addition, inter-process communication (IPC) restrictions enforced by PrivExec prevent inadvertent leaks of sensitive data to public processes that might circumvent the system’s private storage mechanisms.

PrivExec does not require application support; any unmodified, legacy binary application can execute privately using our system. Due to the design of this approach, users cannot be coerced into disclosing information from a private execution. We also demonstrate that our prototype implementation of PrivExec, which we construct using existing, well-tested technologies as a foundation, incurs minimal performance overhead. For many popular application scenarios, PrivExec has no discernible negative impact on the user experience.

2.2 Threat Model

Our primary motivation for designing PrivExec is to prevent disclosure of sensitive data produced in short-lived private execution sessions. The model for these private execution sessions is similar to private browsing modes implemented in most modern browsers, but generalized to any user-level application.

We divide the threat model we assume for this work into two scenarios, one for the duration of a targeted private execution session, and another for after a session has ended.


For the first scenario, we assume that an adversary can have remote access to the target system as a normal user. Due to normal process-based isolation, the attacker cannot inspect physical memory, kernel virtual memory, or process virtual memory for processes labeled with a different user ID.

The threat model for the second scenario corresponds to a technically sophisticated adversary with physical access to a target system after a private execution session has ended. In this scenario, the adversary has complete access to the contents of any local storage such as hard disks, as well as the system memory. It is assumed that the adversary has access to sophisticated forensics tools that can retrieve insecurely deleted data from a file system, or process memory pages from swap devices.

Common to both scenarios is the assumption of a “benign-but-buggy”, or perhaps “benign-but-privacy-unaware”, application. In particular, our threat model does not include applications that maliciously transmit private information to remote parties, or users that do the same. As we describe in the next section, PrivExec aims to avoid inadvertent disclosure of private information.

2.3 Design

In this section, we first outline the security guarantees that our system aims to provide, and then elaborate on the privacy policies that a PrivExec-enabled system must enforce for file system, swap space, IPC, and memory isolation.

2.3.1 Security Properties

PrivExec provides private execution as a generic operating system service by creating a logical distinction between public processes and private processes. While public processes execute with the usual semantics regarding access to shared system resources, private processes are subject to special restrictions to prevent disclosure of sensitive data resulting from private execution. In the PrivExec model, private processes might execute within the same logical privacy context, where resource access restrictions between processes sharing a context are relaxed. We refer to private processes related in this way as private process groups.

[Figure 2.1: An overview of PrivExec’s design. Public processes behave as normal applications, with read-write access to public file systems and unrestricted IPC, in that they can write to all other processes. Private processes, however, have read-only access to public file systems. All private process writes are redirected to a dedicated temporary secure storage container that persists only for the lifetime of the process and is irrevocably discarded at process exit. Data stored in this container is encrypted with a protected, process-specific private execution key (PEK) that is never revealed. Private process swap is conceptually handled in a similar fashion. Finally, private processes cannot write data to public processes or unrelated private processes via IPC channels.]

The concrete security properties that our system provides are the following:

(S1) Data explicitly written to storage must never be recoverable without knowledge of a secret bound to an application for the duration of its private execution.

(S2) Application memory that is swapped to disk must never be recoverable without knowledge of the application secret.

(S3) Data produced during a private execution must never be passed to processes outside the private process group via IPC channels.

(S4) Application secrets must never be persisted, and never be exposed outside of protected volatile memory.

(S5) Once a private execution has terminated, application secrets and data must be securely discarded.

Together, (S1), (S2) and (S3) guarantee that data resulting from a private execution cannot be disclosed without access to the corresponding secret. (S4) ensures that users cannot be coerced into divulging their personal information, as they do not know the requisite secret, and hence, cannot provide it. (S5) implies that once a private execution has ended, it is computationally infeasible to recover the data produced during that execution. Figure 2.1 depicts an overview of the design of PrivExec.

2.3.2 File System

Public processes have the expected read-write access to public file systems. Private processes, on the other hand, are short-lived and write to temporary secure storage containers. Each container is allocated only for the lifetime of a private execution and is accessible only to the private process group it is associated with.

Each private process group is bound to a private execution key (PEK) that is the basis for uniquely identifying a privacy context. This PEK is randomly generated at private process creation, protected by the operating system, never stored in non-volatile memory, and never disclosed to the user or any other process. The PEK is used to encrypt all data produced during a private execution before it is written to persistent storage within the secure container. In this way, PrivExec ensures that sensitive data resulting from private process computation cannot be accessed through the file system by any process that does not share the associated privacy context. Furthermore, when a private execution terminates, PrivExec securely wipes its PEK, and hence makes it computationally infeasible to recover the encrypted contents of the associated storage container.

Although all new files created by a private process must be stored in its secure container, applications often also need to access files that already exist in the normal file system in order to function correctly. For instance, many applications load shared libraries and read configuration files as part of their normal operation. The operating system needs to ensure that such read requests are directed to the public file system. An even more complicated situation arises when a private process attempts to modify existing files. In that case, we need to create a separate private copy of the file in the process’ secure container, and redirect all subsequent read and write requests for that file to the new copy. PrivExec ensures that private processes can only write to the secure storage container while they still have a read-only view of the public file systems by enforcing the following copy-on-write policy (sketched in code after the list):

For a write operation,

• if the destination file does not exist in the file system or in the secure container, a new file is created in the container;

• if the file exists in the file system, but not in the container, a new copy of the file is created in the container and the write is performed on this new copy;

• if the file exists in the container, the process directly modifies it regardless of whether it exists in the file system.

For a read operation,

• if the file exists in the container, it is read from there regardless of whether it also exists in the file system;

• if the file exists in the file system but not in the container, the file is read from the file system;

• if the file exists neither in the file system nor in the container, the read operation fails.
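The user-space sketch below restates these rules in C purely for illustration; PrivExec enforces them inside the kernel (see Section 2.4.2), and the container path and the copy_to_container helper here are hypothetical.

    /* Illustration of the copy-on-write lookup rules above. The container
     * path and copy_to_container() are hypothetical. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    #define CONTAINER "/private/container"  /* plaintext view of the container */

    static bool exists(const char *dir, const char *rel)
    {
        char path[4096];
        snprintf(path, sizeof(path), "%s/%s", dir, rel);
        return access(path, F_OK) == 0;
    }

    static void copy_to_container(const char *rel)
    {
        (void)rel;  /* stub: copy the public file into the container */
    }

    /* Returns the directory a read of `rel` is served from, or NULL. */
    const char *resolve_read(const char *rel)
    {
        if (exists(CONTAINER, rel))
            return CONTAINER;   /* private copy shadows any public file */
        if (exists("/", rel))
            return "/";         /* fall through to the read-only public view */
        return NULL;            /* file exists nowhere: the read fails */
    }

    /* All writes land in the container; a public file is copied there on
     * first write so the public file system is never modified. */
    const char *resolve_write(const char *rel)
    {
        if (!exists(CONTAINER, rel) && exists("/", rel))
            copy_to_container(rel);
        return CONTAINER;
    }

    int main(void)
    {
        const char *src = resolve_read("etc/hosts");
        printf("read from: %s\n", src ? src : "(not found)");
        printf("write to:  %s\n", resolve_write("etc/hosts"));
        return 0;
    }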

2.3.3 Swap Space

In addition to protecting data written to file systems by a private process, PrivExec must also preserve the privacy of virtual memory pages swapped to disk. This is different from existing approaches to swap encryption, which use a single key to encrypt the entire swap device, and thus fail to meet our security requirements for the same reason full disk encryption does. Since swap space is shared between processes with different user principals, PrivExec encrypts each private process memory page that is swapped to disk with the PEK of the corresponding process, as in the file system case, and thus imposes a per-application partitioning of the system swap.
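As an illustration of this per-page scheme, the sketch below encrypts a single page under a per-process key before it would be written to the swap device. It uses OpenSSL in user space as a stand-in for the in-kernel crypto API, and all names are ours; a real implementation would also derive a unique IV for each page, for example from the swap slot index.

    /* Stand-in illustration: encrypt one page with a per-process PEK
     * before it is written to swap (compile with -lcrypto). */
    #include <openssl/evp.h>
    #include <stdint.h>
    #include <string.h>

    #define PAGE_SIZE 4096

    int encrypt_swap_page(uint8_t page[PAGE_SIZE],
                          const uint8_t pek[32],  /* per-process key */
                          const uint8_t iv[16])   /* must be unique per page */
    {
        EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
        uint8_t out[PAGE_SIZE];
        int len = 0;
        int ok = ctx != NULL
              && EVP_EncryptInit_ex(ctx, EVP_aes_256_ctr(), NULL, pek, iv)
              && EVP_EncryptUpdate(ctx, out, &len, page, PAGE_SIZE);

        if (ok)
            memcpy(page, out, PAGE_SIZE);  /* page now holds ciphertext */
        EVP_CIPHER_CTX_free(ctx);
        return ok ? 0 : -1;
    }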

2.3.4 Inter-Process Communication

The private storage mechanisms described in the previous sections effectively prevent sensitive data resulting from private computation from being persisted in the clear. However, applications frequently make use of a number of IPC channels during their normal operation. Without any restrictions in place, private processes might use these channels to inadvertently leak sensitive data to a public process. If that public process in turn persists that data, it would circumvent the protections PrivExec attempts to enforce. Therefore, PrivExec must also enforce restrictions on IPC to prevent such scenarios from occurring.

Specifically, PrivExec ensures that a private process can write data via IPC only to the other members of its group that share the same privacy context. In other words, a private process cannot write data to a public process or to an unrelated private process.

As usual, public processes can freely exchange data with other public processes. Note that public processes can also write data to private processes, since data flow from a public process to a private process does not violate the security properties of PrivExec.
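This policy reduces to a simple predicate, sketched below; struct proc is a hypothetical stand-in for the kernel’s process descriptor, with the privacy context reduced to an identifier.

    /* Sketch of the IPC write rule; types and names are illustrative. */
    #include <stdbool.h>

    struct proc {
        bool     is_private;
        unsigned ctx;  /* privacy context shared by a private process group */
    };

    /* May `src` write data to `dst` over an IPC channel? */
    bool ipc_write_allowed(const struct proc *src, const struct proc *dst)
    {
        if (!src->is_private)
            return true;  /* public processes may write to anyone */

        /* private processes may write only within their own group */
        return dst->is_private && src->ctx == dst->ctx;
    }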


2.3.5 Memory Isolation

Enforcing strong memory isolation is essential to our private execution model, not only for protecting the virtual address space of a private process, but also for preventing the disclosure of PEKs. To this end, PrivExec takes measures to enforce process and kernel isolation boundaries against unprivileged users for private processes, in particular by disallowing standard exceptions to system isolation policies. This includes disabling features such as debugging facilities, or blocking unprivileged access to devices that expose the kernel virtual memory or physical memory.

2.3.6 Discussion

The design we describe satisfies the goals we enumerate in Section 2.3.1. The PEK serves as the application secret that ensures confidentiality of data produced during private execution (S1), (S2). The PrivExec-enabled operating system is responsible for protecting the confidentiality of the PEK, ensures that the user cannot be expected to know the value of individual PEKs, and prevents private processes from inadvertently leaking sensitive data via IPC channels to other processes (S3), (S4). Destroying the PEK after a private execution has ended ensures that any data produced cannot feasibly be recovered by anyone, including the user (S5).

2.4 Implementation

In the following, we describe our prototype implementation of PrivExec as a set of modifications to the Linux kernel and a user-level helper application. We center this discussion around five main technical challenges: managing private processes, constructing secure storage containers, implementing private application swap, enforcing restrictions on IPC channels, and running applications privately at the user level.

2.4.1 Private Process Management

The first requirement for implementing PrivExec is to enable the operating system to support a private execution mode for processes. The operating system must be able to launch an application as a private process upon request from the user, generate the PEK, store it in an easily accessible context associated with that process, mark the process and track it during its lifetime, and finally destroy the PEK when the private process terminates. The operating system must also expose a simple interface for user-level applications to request private execution without requiring modifications to existing application code.

The Linux kernel represents every process on the system using a process descriptor. The process descriptor contains all the information required to execute the process, including information used for scheduling, virtual address space management, and accounting. A new process, or child, is created by copying an existing process, or parent, through the clone system call, which allocates a new process descriptor for the child, initializes it, and prepares it for scheduling. clone offers fine-grained control over which system resources the parent and child share through a set of clone flags passed as an argument. When a process is ready to terminate, the exit system call deallocates resources associated with that process.

To implement our system, we first defined a new private execution flag that is passed to clone to signal that a private process is to be created. We also defined a similar flag that is set in the process descriptor to indicate that its corresponding process is executing privately. We further extended the process descriptor to store the PEK and a pre-allocated cryptographic transform structure that is used for swap encryption.

To handle private process creation, we modified clone to check for the presence of our private execution flag. If present, we mark the newly cloned process descriptor as private and generate a fresh PEK using a cryptographically-secure PRNG. As previously discussed, the PEK is stored inside the process descriptor, resides in the kernel virtual address space, and is never disclosed to the user. For private process termination, we adapted exit to check whether the terminating process is executing privately, and if so, to deallocate the swap cryptographic transform and securely wipe the PEK from memory. Since the Linux kernel handles processes and threads in the same way, this approach also allows for creating and terminating private threads without any additional implementation effort.

Note that applications might spawn additional children for creating subprocesses or threads during their course of execution. This can lead to two critical issues with multi-process and multi-threaded applications running under PrivExec. First, public children of a private process could cause privacy leaks. Second, public children cannot access the parent’s secure container, which could potentially break the application. In order to prevent these problems, our notion of a private execution should include the full set of application processes and threads, despite the fact that the Linux kernel represents them with separate process descriptors. Therefore, we modified clone to ensure that all children of a private process inherit the parent’s private status and privacy context, including the PEK and the secure storage container. Reference counting is used to ensure that resources are properly disposed of when the entire private process group exits.
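A condensed, hypothetical sketch of these changes follows. The flag value, structure, and hook names are illustrative rather than the actual Linux symbols, although get_random_bytes and memzero_explicit are the real kernel helpers for random key generation and secure wiping.

    /* Hypothetical sketch of the process descriptor and clone()/exit()
     * changes; names mirror, but are not, the actual Linux symbols. */
    #include <linux/random.h>   /* get_random_bytes() */
    #include <linux/string.h>   /* memcpy(), memzero_explicit() */

    #define PF_PRIVEXEC  0x00000001UL  /* illustrative private execution flag */
    #define PEK_BYTES    32

    struct task {                      /* stand-in for task_struct */
        unsigned long flags;
        unsigned char pek[PEK_BYTES];  /* private execution key */
        /* ... pre-allocated crypto transform for swap encryption ... */
    };

    static void privexec_on_clone(struct task *child,
                                  const struct task *parent,
                                  unsigned long clone_flags)
    {
        if (clone_flags & PF_PRIVEXEC) {          /* new private execution */
            child->flags |= PF_PRIVEXEC;
            get_random_bytes(child->pek, PEK_BYTES);  /* fresh PEK, CSPRNG */
        } else if (parent->flags & PF_PRIVEXEC) {
            /* children inherit the parent's privacy context and PEK */
            child->flags |= PF_PRIVEXEC;
            memcpy(child->pek, parent->pek, PEK_BYTES);
        }
    }

    static void privexec_on_exit(struct task *t)
    {
        if (t->flags & PF_PRIVEXEC)
            memzero_explicit(t->pek, PEK_BYTES);  /* securely wipe the PEK */
    }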

Also note that our implementation exposes PrivExec to user applications through a new flag that is passed to clone. As a result, when the private execution flag is not passed to the system call, the original semantics of the system call are preserved, maintaining full compatibility with existing applications. Likewise, applications that are not aware of the newly implemented PrivExec interface to clone can be made private by simply wrapping their executables with a program that spawns them using the private execution flag. We explain how existing applications run under PrivExec without modifications in Section 2.4.5.

[Figure 2.2: An overview of the Linux block I/O layers, from applications in user space through the VFS, concrete file systems (e.g., eCryptfs, Ext4, ReiserFS), the page cache, and dm-crypt, down to the block drivers and hardware.]


2.4.2 Private Disk I/O

PrivExec requires the operating system to provide every private application with a dedicated secure storage container, to which all application data writes must be directed. Upon launching a private application, the operating system must construct this container, intercept and redirect I/O operations performed by the private application, and encrypt writes and decrypt reads on the fly.

Although the Linux file I/O API consists of simple system calls such as read and write, the corresponding kernel execution path crosses many different layers and subsystems before the actual physical device is accessed. Block I/O requests initiated by a system call first pass through the virtual file system (VFS), which provides a unifying abstraction layer over different underlying file systems. After a particular concrete file system processes the I/O request, the kernel caches it in the page cache and eventually inserts the request into the target device driver’s request queue. The driver periodically services queued requests by initiating asynchronous I/O on the physical device, and then notifies the operating system when the operation is complete. We refer the reader to Figure 2.2 for a graphical overview of these kernel subsystems.

The choice of where to integrate PrivExec into the file I/O subsystems requires careful consideration. In particular, in order to build a generic solution that is independent of the underlying file system and physical device, we should avoid modifying the individual file systems or the drivers for the physical storage devices. One option is to intercept I/O requests between the page cache and the device’s request queue. However, this results in sensitive data being stored as plaintext in the page cache, a location that is accessible to the rest of the system. Thus, this is not an acceptable solution. Likewise, encrypting the data as it enters the page cache is insufficient, since direct I/O operations that bypass the page cache would not be intercepted by our system. In addition, a second major implementation question is how to handle the redirection of I/O requests made by private processes per our copy-on-write policy.

In order to build a generic system that addresses all of the above challenges, we leverage stackable file systems. A stackable file system resides between the VFS and any underlying file system as a separate layer. It does not store data by itself, but instead interposes on I/O requests, allowing for controlled modifications to these requests before passing them to the file system it wraps. Since stackable file systems usually do not need to know the workings of the underlying file system, they are often used as a generic technique for introducing additional features to existing file systems. PrivExec uses a combination of two stackable file systems to achieve its goals: a version of eCryptfs [8] with our modifications to provide the secure storage containers, and Overlayfs [9] to overlay these secure containers on top of the root file system. In the following, we explain their use in PrivExec and our modifications to eCryptfs in detail.
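For intuition, the user-space sketch below assembles a comparable two-layer stack with plain mount(2) calls. It only approximates what PrivExec does in the kernel: the paths are hypothetical, the eCryptfs option string is abbreviated, and a real eCryptfs mount additionally needs key material, which PrivExec supplies internally rather than through the user keyring, as described below.

    /* Approximate user-space assembly of the PrivExec storage stack: an
     * eCryptfs container overlaid on a read-only view of the root file
     * system. Paths and options are hypothetical and abbreviated. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* 1. Mount the encrypted container: ciphertext lives in the lower
         *    directory, the mount point exposes the plaintext view. */
        if (mount("/container/lower", "/container/plain", "ecryptfs", 0,
                  "ecryptfs_cipher=aes,ecryptfs_key_bytes=32") != 0) {
            perror("ecryptfs mount");
            return EXIT_FAILURE;
        }

        /* 2. Overlay the container on the root file system: reads fall
         *    through to the read-only lower layer ("/"), while writes go
         *    to the container, yielding the copy-on-write behavior of
         *    Section 2.3.2. */
        if (mount("overlay", "/private-root", "overlay", 0,
                  "lowerdir=/,upperdir=/container/plain/upper,"
                  "workdir=/container/plain/work") != 0) {
            perror("overlay mount");
            return EXIT_FAILURE;
        }
        return EXIT_SUCCESS;
    }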

Secure Storage Containers

eCryptfs is a stackable cryptographic file system distributed with the Linux kernel, and it provides the basis of PrivExec’s secure storage containers. eCryptfs provides file system-level encryption, meaning that each file is encrypted separately and all cryptographic metadata is stored inside the encrypted files. While this is likely to be less efficient compared to block-level encryption (e.g., the approach taken by dm-crypt [10]), eCryptfs does not require a full device or partition allocated for it, which allows us to easily create any number of secure containers on the existing file systems as demand necessitates.


Containers are structured as an upper directory and a lower directory. All I/O

operations are actually performed on the lower directory where files are stored in

encrypted form. The upper directory provides applications with a private view of the

plaintext contents.

The lower directory is provided by eCryptfs using AES-256 to encrypt both file

contents and directory entries. However, while its cryptographic capabilities are pow-

erful, eCryptfs has a number of shortcomings that make it unsuitable for use in

PrivExec on its own. First, once an encrypted directory is mounted and a de-

crypted view is made available at the upper directory, all users and applications with

sufficient permissions can access the decrypted content. Second, eCryptfs expects to

find the secret key in the Linux kernel keyring associated with the user before the file

system can be mounted. This makes it possible for other applications running under

the same user account to access the keyring, dump the key, and access data belonging

to another private application. Therefore, we modified eCryptfs in order to address

these issues and restrict access to private process data in line with our system design.

Our first set of modifications aim to uniquely associate mounted eCryptfs contain-

ers with a single privacy context. In Linux, each file system allocates and initializes

a super block structure when it is mounted. We extended this structure used by

eCryptfs to include a private execution token (PET) that serves as a secret that iden-

tifies the privacy context associated with the mounted eCryptfs container. We then

modified the file system mount routine of eCryptfs to check whether the mount oper-

ation is requested by a private process. Since this function runs in the process context

inside the kernel, we can bind a container to a privacy context by simply checking for

the presence of the private execution flag we introduced in Section 2.4.1 inside the

process descriptor. If the flag is set, we populate the PET with a value derived from

the PEK. These extensions allow us to use the PET as a unique identifier in order


to determine whether a process performing eCryptfs operations is the owner of the

container. Of course, we securely wipe the PET from memory when the container is

unmounted.
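
The sketch below shows the general shape of this extension; the structure and helper names (private_exec, pek, derive_pet) are placeholders we use for illustration, not identifiers from the actual patch.

    #include <linux/sched.h>   /* current: the task performing the mount */

    struct privexec_sb_info {          /* stands in for eCryptfs's sb info */
            /* ... existing eCryptfs per-mount fields ... */
            u8   pet[32];              /* private execution token (PET) */
            bool private_mount;        /* set when mounted by a private process */
    };

    /* Called from the eCryptfs mount routine, in the mounting process's
     * context, so the process descriptor can be consulted directly. */
    static void bind_privacy_context(struct privexec_sb_info *sbi)
    {
            if (current->private_exec) {                 /* assumed task flag */
                    derive_pet(sbi->pet, current->pek);  /* PET derived from PEK */
                    sbi->private_mount = true;
            }
    }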

To enforce access control on containers, we modified the cryptographic functions

of eCryptfs to check the identity of the requesting process using PET. If the process

is not the owner of the container, the I/O request is blocked. Otherwise, if the

private process is the owner of the container, we fetch the PEK from the current

process descriptor and use it as the cryptographic key. This ensures that the PEK

never appears in the user’s kernel keyring, and is never exposed outside of the private

process group.
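
In rough outline, the added check behaves like the following sketch, again with placeholder names:

    /* Executed on the eCryptfs crypto path before servicing an I/O request. */
    static int privexec_get_key(struct privexec_sb_info *sbi, u8 **key_out)
    {
            if (!sbi->private_mount)
                    return 0;                    /* plain eCryptfs: use the keyring */
            if (!pet_matches(sbi->pet, current)) /* is the requester the owner? */
                    return -EACCES;              /* block the I/O request */
            *key_out = current->pek;             /* PEK comes from the task, never
                                                    from the user's kernel keyring */
            return 0;
    }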

Although these extensions to eCryptfs address the root cause of the aforemen-

tioned privacy issues, one last problem remains: Once an encrypted file is accessed

by an authorized private process, eCryptfs caches the decrypted content and directly

serves subsequent I/O requests made by other processes from the cache, bypassing our

privacy measures. Therefore, we perform a final PET verification during file access

permission checks, and ensure that access to the eCryptfs upper directory is denied

to the rest of the system regardless of the directory’s UNIX permissions.

All in all, our modified eCryptfs layer provides a secure storage container that is

only accessible to a single private process group. Also note that all of the security

checks we inserted only trigger if eCryptfs is mounted by a private process in the

first place. This guarantees that normal applications can still use eCryptfs as before

without being restricted by our additional privacy requirements.

Overlaying Secure Storage Containers

Once a dedicated secure container has been constructed for a private process group,

we need to redirect I/O operations to that container. We achieve this through the




use of a stackable union file system. Union file systems are used to overlay several

different file system trees – sometimes referred to as branches – in a unified hierarchy,

and merge their contents as if they together formed a single file system. Although every

implementation supports different unioning capabilities, in theory, a union file system

can be used to overlay any number of branches in a defined order, with specific read

and write policies for each branch.

Overlayfs is an implementation of this idea distributed with the Linux kernel, and

we leverage it as part of our prototype. We use Overlayfs to layer secure storage

containers on top of the root file system tree. The root file system is mounted as

a read-only lower branch, while the secure container is made the read-write upper

branch. In this way, through an Overlayfs mount point, a private process has a

complete view of the root file system, while all write operations are actually performed

on the secure container. Overlayfs also supports copy-on-write by default. In other

words, when an application attempts to write to a file in the lower read-only root file

system, Overlayfs first makes a copy of the file in the writable secure container and

performs the write on the copy. The files in an upper branch take precedence over

and shadow the same files in the lower branch, which also ensures that all subsequent

read and write operations are redirected to the new encrypted copies.

The entire process of setting up a secure container for a private process P and

overlaying it on the root file system is illustrated in Figure 2.3. Note that the given

path names are only examples; PrivExec uses random paths to support multiple

private execution sessions that run simultaneously. Before launching a private pro-

cess, in step one, PrivExec creates a secure container using our modified version

of eCryptfs and mounts it on ~/private. In step two, Overlayfs is used to overlay the

container on the root file system, and this new view is mounted on /tmp/fakeroot.

In the final step, the private process is launched in a chroot environment with its


root file system the Overlayfs mount point. In this way, the private process still has a

complete view of the original file system and full read-write access; however, all writes

are transparently redirected to the secure container. When the private process ter-

minates, PrivExec destroys the secure container and PEK, rendering the encrypted

data in ~/private irrecoverable.

Figure 2.3: Setting up the secure storage container and overlaying it on the root file system.
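
As a rough user-space illustration of these three steps, the setup might look like the sketch below. The paths are the example paths from Figure 2.3; the eCryptfs option string shows standard mount options (a stock mount would additionally require a key signature, which our modifications replace with the in-kernel PEK), and the Overlayfs option string assumes a mainline Overlayfs that takes lowerdir, upperdir, and workdir.

    #include <unistd.h>
    #include <sys/mount.h>

    /* Sketch only: error reporting and PrivExec's key plumbing are omitted. */
    static int setup_private_root(void)
    {
            /* Step one: mount the (modified) eCryptfs container on ~/private. */
            if (mount("/home/user/private", "/home/user/private", "ecryptfs", 0,
                      "ecryptfs_cipher=aes,ecryptfs_key_bytes=32") < 0)
                    return -1;

            /* Step two: overlay the container on the root file system. */
            if (mount("overlay", "/tmp/fakeroot", "overlay", 0,
                      "lowerdir=/,upperdir=/home/user/private,"
                      "workdir=/home/user/.overlay-work") < 0)
                    return -1;

            /* Step three: confine the private process to the overlaid view. */
            if (chroot("/tmp/fakeroot") < 0 || chdir("/") < 0)
                    return -1;
            return 0;
    }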

2.4.3 Private Swap Space

Since the Linux kernel handles swap devices separately from file system I/O, PrivExec

must also interpose on these operations in order to preserve the privacy of virtual

memory pages swapped to disk. To this end, each page written to a swap device

must be encrypted with the PEK of the corresponding private process.

We implemented per-application swap encryption as a patch to the kernel’s swap

page-out routine. First, a check is performed to determine whether a page to be

written belongs to a private process. If so, the pre-allocated cipher transform in the

process descriptor is initialized with a page-specific IV, and the page is encrypted

with PEK prior to scheduling an asynchronous write operation.

For page-in, the situation is more complex. The kernel swap daemon (kswapd)

is responsible for scanning memory to perform page replacement, and operates in

a kernel thread context. Therefore, once a page has been selected for replacement,

process virtual memory structures must be traversed to locate a process descriptor

that owns the swap page. Once this has been done, however, the inverse of page-out

can be performed. Specifically, once the asynchronous read of the page from the swap

device has completed, a check is performed to determine whether the owning process

is in private execution mode. If so, the process cipher transform is initialized with


the page-specific IV, and the page is decrypted with the PEK prior to resumption of

the user process.
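
The two directions of this patch can be summarized by the following sketch; the helper names (make_page_iv, encrypt_page, and so on) are placeholders for the corresponding kernel crypto operations.

    /* Page-out: encrypt a private process's page before it reaches the
     * swap device. Runs before the asynchronous write is scheduled. */
    static void privexec_swap_out(struct page *page, struct task_struct *owner)
    {
            if (owner->private_exec) {
                    u8 iv[16];
                    make_page_iv(iv, page);   /* page-specific IV */
                    encrypt_page(owner->privexec_tfm, owner->pek, iv, page);
            }
    }

    /* Page-in: invoked once the asynchronous read from the swap device has
     * completed, before the owning user process is resumed. */
    static void privexec_swap_in_done(struct page *page, struct task_struct *owner)
    {
            if (owner->private_exec) {
                    u8 iv[16];
                    make_page_iv(iv, page);   /* same IV as at page-out */
                    decrypt_page(owner->privexec_tfm, owner->pek, iv, page);
            }
    }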

2.4.4 Private Inter-Process Communication

PrivExec also imposes restrictions on private process IPC to prevent data leaks

from a privacy context. In general, our approach with respect to private IPC is to

modify each IPC facility available to Linux applications as follows.

Similarly to secure storage containers, we embedded a PET in the kernel struc-

tures corresponding to IPC resources. We then modified the kernel IPC functions to

perform a check to compare the tokens of the endpoint processes at the time of chan-

nel establishment, or before read and write operations, augmenting the usual UNIX

permission checks as appropriate. The policy we implemented ensures that private

processes with the same token can freely exchange data, while private processes with

different tokens are prevented from communicating with a “permission denied” er-

ror. In addition, private processes are allowed to read from public processes, but

prevented from writing data to them. Of course, IPC semantics for communication

between public processes remains unchanged.
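
Viewed as a data-flow check between a writing and a reading endpoint, the policy reduces to the following sketch (placeholder names again):

    /* Decide whether data may flow from `writer` to `reader`. */
    static int privexec_flow_allowed(struct task_struct *writer,
                                     struct task_struct *reader)
    {
            if (writer->private_exec && reader->private_exec)
                    /* both private: allowed only within the same privacy
                     * context, i.e., when the PETs match */
                    return same_pet(writer, reader) ? 0 : -EACCES;

            if (writer->private_exec && !reader->private_exec)
                    return -EACCES;  /* no data leaks from private to public */

            return 0;  /* public->public and public->private are unchanged */
    }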

The specific Linux IPC facilities that we modified to conform to the policy de-

scribed above include UNIX SysV shared memory and message queues, POSIX shared

memory and message queues, FIFO queues, and UNIX domain sockets. We omit de-

tails of the specific changes as they are similar in nature to those described for the

case of secure storage containers.


2.4.5 Launching Private Applications

While PrivExec-aware applications can directly spawn private subprocesses or threads

as they require by passing the private execution flag to the clone system call, we

implemented a PrivExec wrapper as the primary method for running existing ap-

plications in private mode.

The PrivExec wrapper first creates a private copy of itself by invoking clone

with the private execution flag. Then, this private process creates an empty secure

storage container and mounts it in a user-specified location. Recall that, as explained

in Section 2.4.2, our modifications to eCryptfs ensure that only this specific private

process and its children can access the container from this point on. The wrapper

then creates the file system overlay, and finally loads the target application executable

in a chroot environment, changing the application’s root file system to our overlay. As

explained in Section 2.4.1, the application inherits the PEK of the wrapper, and starts

its private execution. When the application terminates, the PrivExec wrapper

cleans up the mounted overlay and exits.
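
A compressed sketch of this control flow follows; CLONE_PRIVEXEC is a hypothetical name (and value) for the private execution flag, and setup_private_root stands for the container and overlay setup sketched in Section 2.4.2.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define CLONE_PRIVEXEC 0x02000000   /* hypothetical flag value */

    static int private_child(void *arg)
    {
            char **argv = arg;
            if (setup_private_root() == 0)   /* container + overlay + chroot */
                    execv(argv[0], argv);
            return 1;
    }

    static int run_private(char **argv, char *child_stack_top)
    {
            pid_t pid = clone(private_child, child_stack_top,
                              CLONE_PRIVEXEC | SIGCHLD, argv);
            if (pid < 0)
                    return -1;
            waitpid(pid, NULL, 0);           /* then clean up the mounts */
            return 0;
    }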

Note that the final destruction of the container is simply for user convenience.

Even if the wrapper or the private application itself crashes or is killed, leaving the

container and the overlay mounted, the container is accessible only to the processes

that have the corresponding PEK (i.e., the private application that created it). Since

that application and its PEK are guaranteed to be destroyed by the kernel, the private

data remains inaccessible even if the container remains mounted.

2.5 Evaluation

The primary objective of our evaluation is to demonstrate that PrivExec is prac-

tical for real-world applications that often deal with sensitive information, without


detracting from the user experience. To this end, we first tested whether our system

works correctly, without breaking program functionality, by manually running pop-

ular applications with PrivExec. Next, we tested PrivExec’s performance using

standard disk I/O and file system benchmarks. Finally, we ran performance exper-

iments with well-known desktop and console applications that are representative of

the use cases PrivExec targets.

All tests were run on a standard desktop computer with an Intel i7-930 CPU, 9 GB

RAM, running Arch Linux x86-64 with kernel version 3.12.0-rc2. Disk benchmarks

were performed on a Samsung Spinpoint F3 HD502HJ mechanical hard disk.

2.5.1 Running Popular Applications

To demonstrate that our approach is applicable to and compatible with a wide va-

riety of software, we manually tested 50 popular applications with PrivExec. We

selected our test set from the top rated applications list reported by Ubuntu Software

Center. Specifically, we selected the top 50 applications, excluding all non-free or

Ubuntu-specific software. The tested applications include software in many differ-

ent categories such as developer tools (e.g., Eclipse, Emacs, Geany), graphics (e.g.,

Blender, Gimp, Inkscape), Internet (e.g., Chromium, FileZilla, Thunderbird), office

(e.g., LibreOffice), sound and video (e.g., Audacity, MPlayer), and games (e.g., Battle

for Wesnoth, Teeworlds). We launched each application with PrivExec, exercised

their core features, and checked whether they worked as intended.

This experiment revealed two important limitations of PrivExec regarding our

measures to block IPC channels. First, private X applications failed to start because

they could not communicate with the public X server through UNIX domain sockets.

This led us to modify our system to launch these applications in a new, private X



session, which resolved the issue. Alternatively, the IPC protection for stream type

UNIX domain sockets could be disabled as a trade-off in order to run private and

public applications in the same X session.

Second, a number of X applications that utilized the MIT Shared Memory Exten-

sion (MIT-SHM) to draw to the X display failed to render correctly since SysV shared

memory writes to the public X server were blocked. This issue was also resolved by

running a private X session, or simply by disabling the MIT-SHM extension in the X

server configuration file.

Once the above problems were dealt with, all 50 applications worked correctly

without exhibiting any unusual behavior or noticeable performance issues.

2.5.2 Disk I/O and File System Benchmarks

In order to evaluate the disk I/O and file system performance of PrivExec, we

used Bonnie++ [11], a well-known I/O and file system benchmark tool for UNIX-like

operating systems.

We first configured Bonnie++ to use 10 × 1 GB files to test the throughput of

block write and read operations. Next, we benchmarked file system operations by

configuring Bonnie++ to create and delete 102,400 files in a single directory, each

containing 512 bytes of data. We ran Bonnie++ as a normal process and then using


PrivExec for comparison, repeated all the experiments 10 times, and calculated the

average scores to get the final results. We present our findings in Table 2.1.

           Original              eCryptfs-only                     PrivExec
           Performance           Performance       Overhead       Performance       Overhead
Write      110694.60 KB/s        97536.83 KB/s      13.49 %       97979.47 KB/s      12.98 %
Read       111217.67 KB/s       107134.53 KB/s       3.81 %      106293.73 KB/s       4.63 %
Create     13906.73 files/s      8312.73 files/s    67.29 %       8181.10 files/s    69.99 %
Delete     42012.87 files/s     25232.67 files/s    66.50 %      23017.00 files/s    82.53 %

Table 2.1: Disk I/O and file system performance of PrivExec. eCryptfs-only performance is also shown for comparison.

These results show that PrivExec performs reasonably well when doing regular writes and reads, incurring overheads of 12.98% and 4.63%, respectively. However, private applications can experience slowdowns ranging from 70% to 85% when dealing with large numbers of small files in a single directory. In fact, unoptimized file system performance with large numbers of files is a known deficiency of eCryptfs, which could explain this performance hit.1

benchmarks to decrease the number of files used, or when we configured Bonnie++

to distribute the files evenly to a number of subdirectories, the performance gap

decreased drastically.

To see the impact of eCryptfs on PrivExec’s performance in general, we repeated

the measurements by running Bonnie++ on an eCryptfs-only partition. The results,

also shown in Table 2.1 for comparison, indicate that a significant part of PrivExec’s

disk I/O and file system overhead is introduced by the eCryptfs layer. This suggests

that a more optimized encrypting file system, or the use of block-level encryption

via dm-crypt (despite its various disadvantages such as the requirement to create

separate partitions of fixed size to be utilized by PrivExec) could greatly increase

PrivExec’s disk I/O and file system performance. We report the worst-case figures

in this section and leave the evaluation of these alternative techniques for future work.

While these results clearly indicate that PrivExec might not be suitable for

workloads involving many small files, such as running scientific computation appli-

cations or compiling large software projects, we must stress that such workloads do

not represent the use cases PrivExec is designed to target. In the next section

1 See an eCryptfs developer's response to a similar performance-related issue at http://superuser.com/questions/397252/ecryptfs-and-many-many-small-files-bad-performance, also linked from the official eCryptfs web page.


we demonstrate that these benchmark scores do not translate to decreased perfor-

mance when executing real-world applications with concrete privacy requirements

using PrivExec.

2.5.3 Real-World Application Performance

In a final set of experiments, we measured the overhead incurred by various common

desktop and console applications when running them with PrivExec. Specifically,

we identified 12 applications that are representative of the privacy-related scenarios

and concerns that PrivExec aims to address, and designed various automated tests

to stress those applications. We ran each application first as a normal process, then

with PrivExec, and compared the elapsed times under each configuration.

Note that designing custom test cases and benchmarks in this way requires careful

consideration of factors that might influence our runtime measurements. In particular,

a major challenge we faced was automating the testing of desktop applications with

graphical user interfaces. Although several GUI automation and testing frameworks

exist for Linux, most of them rely on recording and issuing X server events without

any understanding of the tested application’s state. As a result, the test developer

is often expected to insert fixed delays between each step of the test in order to give

the application enough time to respond to the issued events. For instance, consider a

test that involves opening a menu by clicking on it with the mouse, and then clicking

on a menu item. When performing this task automatically using a tool that issues

X events, the developer must insert a delay between the two automated click events.

After the first click on the menu, the second click must be delayed until the tested

application can open and display the menu on the screen. This technique works

well for simple automation tasks, but for runtime measurements, long delays can




easily mask the incurred overhead and lead to inaccurate results. Taking this into

consideration, in our tests, we refrained from using any artificial delays, or employing

tools that operate in this way.

First, we tested PrivExec with two popular web browsers, Firefox and Chromium.

We designed four test cases that represent different browsing scenarios.

Alexa. In this test, we directed the browsers to visit the top 50 Alexa domains.

While some of these sites were relatively simple (e.g., www.google.com), others in-

cluded advertisement banners, embedded Flash, multimedia content, JavaScript, and

pop-ups (e.g., www.bbc.co.uk).

Wikipedia. In this test, we visited 50 Wikipedia articles. As is typical of

Wikipedia, these web pages mostly included text and images.

CNN. In this test, we navigated within the CNN web site by clicking on different

news categories and articles. We cycled 5 times through 10 CNN pages with many

embedded images, videos, and Flash content in order to exercise the browser’s cache.

Gmail. In this test, we navigated to and logged into Gmail, composed and sent

5 emails, and then logged out of the web site.

To execute these tests, we used Selenium WebDriver [12], a popular browser au-

tomation framework. Selenium commands browsers natively through browser-specific

drivers, and is able to detect when the page elements are fully loaded without requiring

the user to introduce fixed delays. We repeated each test 10 times, and calculated the

average runtime over all the runs. We present a summary of the results in Table 2.2.

                        Firefox                                      Chromium
             Orig.        PrivExec                        Orig.        PrivExec
             Runtime (s)  Runtime (s)  Overhead           Runtime (s)  Runtime (s)  Overhead
Alexa           98.43       103.56      5.21 %               91.63        94.69      3.34 %
Wikipedia       37.80        39.96      5.71 %               39.25        40.12      2.22 %
CNN             66.61        69.15      3.81 %               49.21        50.83      3.29 %
Gmail           58.43        61.36      5.02 %               30.61        30.98      1.21 %

Table 2.2: Runtime performance overhead of PrivExec for two popular web browsers.

Next, we tested 10 popular Linux applications, including media players, an email

client, an instant messenger, and an office suite. These applications and their corre-

sponding test cases are described below.


               Orig. Runtime (s)   PrivExec Runtime (s)   Overhead
Audacious            61.27                 62.30            1.68 %
Feh                  51.86                 52.52            1.27 %
FFmpeg              105.47                111.31            5.54 %
grep                245.37                253.82            3.44 %
ImageMagick          96.16                101.41            5.46 %
LibreOffice          99.64                100.62            0.98 %
MPlayer             122.98                129.39            5.21 %
Pidgin              116.49                117.87            1.19 %
Thunderbird          75.45                 78.78            4.41 %
Wget                 71.48                 71.89            0.57 %

Table 2.3: Runtime performance overhead of PrivExec for various desktop and console applications.

Audacious. We configured Audacious, a desktop audio player, to iterate through

a playlist of 2500 MP3 audio files totaling 15 GB, load each file, and immediately

skip to the next file without playing them.

Feh. Feh is a console-based image viewer. We configured Feh to load and cycle

through 1000 JPEG images totaling 1.5 GB.

FFmpeg. FFmpeg, a video and audio converter, was configured together with

libmp3lame to convert 25 AAC formatted audio files to the MP3 format.

grep. grep is the standard Linux command-line utility for searching files for

matching regular expressions. We used grep to search the entire root file system for

a fixed string, and dumped the matching lines into a text file. This process resulted

in 16186 matching lines leading to a 3 MB dump.

ImageMagick. ImageMagick is a software suite for creating, editing and viewing

various image formats. Using ImageMagick’s convert utility we converted 150 JPEG

images to PNG images.

LibreOffice. LibreOffice is a comprehensive office software suite. We used Libre-

Office to open 10 documents and print them to PostScript files.


MPlayer. We configured MPlayer, a console and desktop movie player, to iterate

through a playlist of 100 Matroska files totaling 30 GB containing videos in various

formats, load each file, and immediately skip to the next one without displaying the

content.

Pidgin. Pidgin is a multi-protocol instant-messaging client. Using Pidgin we sent

500 short text messages between two Gtalk accounts.

Thunderbird. Thunderbird is a desktop email client. We composed and sent 5

emails with 1 MB attachments in our test.

Wget. Wget is a console-based network downloader. We used Wget to download

10 small video clips from the Internet, each sized 10-25 MB.

To carry out these tests, we utilized the synchronous command line interfaces

provided by the applications themselves, and also used xdotool [13], an X automation

tool that can simulate mouse and keyboard events. We stress that we only used

xdotool for simple tasks such as bootstrapping some of the GUI applications for

testing, and never included any artificial delays. Similar to the previous experiments,

we repeated each test 10 times, and we present the average runtimes in Table 2.3. Note

that in the tests above, we had the option to supply inputs to the applications from

the secure storage containers or from the public file systems. For each application,

we tested both and have reported the worse case. Also note that PrivExec would

normally prevent us from writing to the secure container from outside the private

process. Therefore, we implemented a backdoor in PrivExec during the evaluation

phase in order to copy the test data to the secure container.

In our experiments, the overhead of private execution was under 6% in every

test case, and private applications took only 3.31% longer to complete their tasks on

average. These results suggest that PrivExec is efficient and that it does not detract

from the user experience when used with popular applications that deal with sensitive


data. Finally, these experiments support our claim in Section 2.5.2 that the Bonnie++

benchmark results do not necessarily indicate poor performance for common desktop

and console applications. On the contrary, PrivExec can demonstrably provide a

private execution environment for real applications without a significant performance

impact. Still, we must stress that if a user runs PrivExec with a primarily I/O

bound workload, lower performance should be expected as indicated by the Bonnie++

benchmarks.

2.6 Limitations

While our prototype aims to provide a complete implementation of private execution

for Linux, there are some important limitations to be aware of.

One limitation is that the current prototype does not attempt to address system

hibernation, which entails that the contents of physical memory are persisted to

disk. As a result, if a hibernation event occurs while private processes are executing,

sensitive information could be written to disk as plaintext in violation of system

design goals. We note that this is not a fundamental limitation, as hibernation could

be handled in much the same manner as per-process encrypted swap. However, we

defer the implementation of private execution across hibernation events to a future

release.

By design, PrivExec relies upon memory isolation to protect both private process

memory as well as the corresponding PEK that resides in kernel memory. If malicious

code runs as a privileged user, such as root on UNIX-like systems, then that code

could potentially bypass PrivExec’s protection mechanisms. One example of this

would be for a malicious user to load a kernel module that directly reads out PEKs,

or simply introspects on a private process to access its memory directly. For this


reason, we explicitly consider privileged malicious users or code as outside the scope

of PrivExec’s threat model.

As previously discussed in Section 2.5, certain X clients do not interact well with

the current prototype implementation of stream-based UNIX domain socket and SysV

shared memory IPC privacy restrictions. In the former case, UNIX domain socket re-

strictions must be relaxed for X applications, while disabling the MIT-SHM extension

is sufficient to work around the second case. A related limitation is the possibility for

malicious code to extract sensitive data by capturing screenshots of private graphi-

cal elements through standard user interface facilities. However, we again note that

these are not fundamental limitations of the approach, and they can be addressed

with additional engineering effort.

2.7 Related Work

While other work has attempted to protect application privacy to varying degrees,

we believe that PrivExec strikes the right balance between security guarantees,

system integration effort, and performance with its operating system-level interface

for protecting generic binaries. In this section, we relate existing work to PrivExec,

including work on privacy attacks and defenses, file system and disk encryption, and

sensitive information leakage in various contexts.

Privacy as an Operating System Service

To the best of our knowledge, Ioannidis et al. [14] provide the first academic work

that proposes the idea of deploying privacy mechanisms in an application-independent

manner as an operating system service, but without providing a concrete system

design or implementation.


In a recent work, Lacuna [15] enables private execution for virtual machines,

which – like PrivExec’s private process groups – are used to confine the secrets of

sets of processes. By leveraging QEMU, modifications to the host operating system,

and hardware support, Lacuna not only ensures privacy for storage and swap space,

but also eliminates leaks into operating system drivers via its ephemeral channel

abstraction. PrivExec provides a subset of the security guarantees of Lacuna, but

at a lower engineering cost and with greater usability.

Similar to PrivExec, Kywe et al. [16] implement a general private execution

mode on Android by leveraging the platform’s existing sandboxing capabilities.

Djoko et al. [17] extend PrivExec to use a memory-backed secure storage con-

tainer, and report improved performance over our original implementation.

Privacy Leaks in Web Browsers

Privacy attacks and defenses have been studied extensively specifically in the context

of web browsers. For example, Felten and Schneider [18] introduce the first privacy

attacks exploiting DNS and browser cache timing. In other works, Clover et al. [19]

demonstrate a technique for stealing browsing history using CSS visited styles, and

Janc and Olejnik [20] show the real-world impact of this attack. On the defense side,

solutions have been proposed for preventing sniffing attacks and session tracking

(e.g., [21, 22, 23, 24]). These works are largely orthogonal to ours in that they target

information leaks on the web, while PrivExec addresses the problem of privacy leaks

in persistent storage.

Aggarwal et al. [3] and Said et al. [25] analyze the private browsing modes of

various browsers, and reveal weaknesses that would allow a local attacker to recover

sensitive data saved on the disk. The former study also shows that poorly designed

browser plug-ins and extensions could undermine well-intended privacy protection


measures. These studies underline the value of PrivExec as our approach aims to

mitigate the attacks described in these papers.

Xu et al. [26] present similar findings, and propose a universal private browsing

framework for web browsers, which utilizes a temporary sandbox file system to con-

tain, and later discard, data produced during a private browsing session. In contrast,

PrivExec is designed as a generic solution that is not only limited to protecting web

browsers. In other words, our approach can be used to run any arbitrary application

in private sessions, including browsers that already have private browsing modes and

that have been shown to be vulnerable.

Privacy Leaks in Volatile Memory

Studies have demonstrated that it is possible to recover sensitive data, such as disk

encryption keys, from volatile memory [27], and many others have proposed solutions

to address this problem. While PrivExec stores PEKs in memory, we are careful to

wipe them after the associated process has ended. Anti-cold boot measures can also

be deployed to complement PrivExec if so desired by users.

Secure hardware architectures such as XOM [28] and AEGIS [29] extensively

study memory encryption techniques to prevent information leakage, and support

tamper-resistant software and processing. Alternatively, Cryptkeeper [30] proposes

a software-encrypted virtual memory manager that works on commodity hardware

by partitioning the memory into a small plaintext working set and a large encrypted

area.

Likewise, secure deallocation [31] aims to reduce the lifetime of sensitive data in

the memory by zeroing memory promptly after deallocation. Provos [32] proposes

encrypting swapped-out memory pages in order to prevent data leaks from memory

to disk.


In contrast, PrivExec is designed as an operating system service that guarantees

storage writes to the file system or to swap cannot be recovered during or after a pri-

vate execution session. As such, encrypted memory is complementary to PrivExec’s

private processes. Furthermore, PrivExec works on commodity hardware and does

not necessitate architectural changes to existing systems.

Disk and File System Encryption

Many encrypted file systems (e.g., CFS [33], Cryptfs [34], eCryptfs [8], EncFS [35]),

and full disk encryption technologies (e.g., dm-crypt [10], BitLocker [36]) have been

proposed to protect the confidentiality of data stored on disk. In a recent study,

CleanOS [37] extends this idea to a new Android-based operating system that protects

the data on mobile devices against device loss or theft by encrypting local flash and

storing keys in the cloud. Borders et al. [38] propose a system that takes a system

checkpoint, stores confidential information in encrypted file containers called storage

capsules, and finally restores the previous state to discard all operations that the

sensitive data was exposed to.

Although many of these solutions provide confidentiality while the encrypted

drives or partitions are locked, once they are unlocked, sensitive data may become

exposed to privacy attacks. Moreover, encryption keys can be retrieved by exploiting

insecure key storage, or through malware infections. Approaches that may be resilient

to such attacks (e.g., storage capsules) remain open to key retrieval via coercion (e.g.,

through a subpoena issued by a court). In contrast, PrivExec destroys encryption

keys promptly after a process terminates, guaranteeing that recovery of sensitive data

on the disk is computationally infeasible. Furthermore, it can be applied selectively

to specific processes on demand, as opposed to encrypting an entire device or par-


tition. Finally, PrivExec is a flexible solution that can work with any file system

supported by the kernel.

Secure File Deletion

The idea of securely deleting files using ephemeral encryption keys was introduced by

Boneh and Lipton [39], and was later used in various other systems (e.g., [40, 41, 42]).

We borrow this idea and apply it to a new context.

Other more general secure wiping solutions, including user space tools such as

shred [43] and kernel approaches [44, 45] provide only on-demand secure removal of

files. In contrast, PrivExec provides operating system support for automatically

rendering all files created and modified by a private process irrecoverable, and does

not require users to manually identify files that contain sensitive data for deletion.

Ritzdorf et al. [46] describe a technique to automatically identify related content

upon file deletion. While this work does not consider secure deletion per se, in

principle, the proposed system can be combined with other secure deletion techniques

to automatically remove all traces of a private execution session.

We present an in-depth discussion of other secure deletion techniques in Chapter 3.

Application-Level Isolation

Various mechanisms have been proposed to sandbox applications and undo the effects

of their execution. For example, Alcatraz [47] and Solitude [48] provide secure exe-

cution environments that sandbox applications while allowing them to observe their

hosts using copy-on-write file systems. Li et al. [49] propose a two-way sandbox for

x86 native code that protects applications and the operating system from each other.

Other works utilize techniques such as system transactions, monitoring, and logging

to roll back the host to a previous state (e.g., [50, 51]). Unlike PrivExec, these


systems are primarily concerned with executing untrusted applications and recovery

after a compromise; they do not provide privacy guarantees.

2.8 Summary

Preventing sensitive data handled by applications from being exposed in persistent

storage is a common privacy goal. To achieve this, web browsers often support private

browsing modes that discard users’ traces after a browsing session ends. However,

as evidenced by the security flaws found in many popular browsers, implementing

private execution features in an application-specific manner is bug prone and can be

costly.

In this chapter, we presented PrivExec, an operating system service for private

execution of arbitrary applications. PrivExec leverages the short-lived nature of

the private execution model to associate protected, ephemeral private execution keys

with processes that can be securely wiped after use so that they cannot be recovered

by a user or adversary.

The proposed design and implementation satisfies all of the research goals we laid

out in Section 1.5. (G1) We provided an abstract design of PrivExec independent of

the underlying operating system, and demonstrated that it can be applied to Linux-

like systems with lightweight modifications to existing operating system structures,

and by reusing already deployed technologies such as eCryptfs and Overlayfs. (G2)

PrivExec works with any application, and provides strong, general guarantees of

private execution. (G3) It does not require explicit application support, recompila-

tion, or any other preconditions. (G4) Finally, our evaluation shows that PrivExec

is applicable to a wide variety of popular applications, and that it incurs a minimal

performance overhead in practice, when running real-world applications.


Chapter 3

Eraser: Secure Deletion on

Blackbox Hardware

3.1 Overview

Secure deletion of data from non-volatile storage is a well-recognized and heavily

studied problem. To date, researchers and developers have proposed a plethora of

techniques for securely erasing data from physical media, often employing methods

such as overwriting files containing sensitive data in-place, encrypting data with tem-

porary keys that are later discarded, or using hardware features that scrub storage blocks.

Despite these extensive efforts, advances in storage technologies and character-

istics of modern hardware still pose significant difficulties to achieving irreversible

data deletion in prevailing computing environments. For instance, Solid State Drives

(SSDs) often utilize hardware controllers inaccessible to the outside world. These con-

trollers can redirect I/O operations performed on logical device blocks to arbitrary

memory cells in order to implement wear leveling and minimize the effects of write

amplification. Similarly, journaling file systems may keep traces of I/O operations


that include sensitive data in their logs. As a result, many secure deletion methods

that base their security on behavioral assumptions regarding older file systems or me-

chanical disk drives are rendered ineffective because tracking and removing sensitive

data in these settings is often infeasible, or sometimes impossible.

In the face of these emerging challenges, recent research has adapted secure dele-

tion technologies to new applications. For example, Reardon et al. [42] present an

encrypting file system that guarantees secure erasure on raw flash memory used in

smartphones. However, secure deletion remains a challenge on blackbox devices such

as the aforementioned SSDs, which only allow access to their storage through opaque

hardware controllers that translate I/O blocks in an unpredictable manner.

In this chapter, we present a technique that provides secure deletion guarantees at

file granularity, independent of the characteristics of the underlying storage medium.

Our approach is based on the general observation, made in previous work, that secure

deletion cannot be guaranteed on a blackbox storage medium with unknown behavior.

Therefore, we instead bootstrap secure deletion using a minimal master key vault

under the user’s control, such as a Trusted Platform Module chip or a smartcard.

Our approach is an evolution of the first cryptographic erasure technique proposed

by Boneh and Lipton [39]. At an abstract level, we encrypt every file on an insecure

medium with a unique key, which can later be discarded to cryptographically render

a file’s data irrecoverable. Note that while these keys would need to be persisted to

keep the files accessible in the future, they cannot be stored on the same medium

together with the files since that would in turn prevent us from securely deleting the

keys.

To address this problem, we compress the keys into a single master key that is

never persisted to insecure storage, but instead is evicted to the master key vault.

To this end, we utilize a key store organized as an n-ary tree (i.e., a tree where


each node has up to n children), where every node represents a unique encryption

key. We term this key store a file key tree (FKT). Keys corresponding to leaf nodes

each encrypt a single file stored on the blackbox medium, and in turn parent nodes

encrypt their children nodes. This tree hierarchy compresses the master secret to a

single encryption key, the root node, which is never persisted to the blackbox storage

but is instead easily evicted to the master key vault. In contrast, the rest of the tree

nodes (i.e., encrypted keys) are stored together with the files on the insecure device.

In this model, securely deleting a file from an FKT of capacity |F| involves decrypting n·log_n |F| nodes, regenerating log_n |F| keys, and re-encrypting the n·log_n |F|

nodes with the new keys. During this process, the master key is also securely wiped

from the vault and replaced with a fresh one. In this way, the previous path leading

to the deleted file will be rendered irrecoverable.
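
For instance, with n = 16 and a capacity of 16^5 (roughly one million) files, a deletion touches log_n |F| = 5 levels and re-encrypts 16 × 5 = 80 nodes. A minimal sketch of this path update follows; the node layout and helper names (wipe_key, gen_random_key, rewrap_child, install_key) are assumptions for illustration, not our actual implementation.

    #define FKT_N   16   /* tree fan-out, illustrative */
    #define KEY_LEN 32   /* e.g., an AES-256 key */

    struct fkt_node {
            struct fkt_node *parent;
            struct fkt_node *child[FKT_N];      /* NULL for unused slots */
            unsigned char wrapped_key[KEY_LEN]; /* encrypted under the parent's key */
    };

    /* Securely delete one file: discard its leaf key, then re-key every node
     * on the leaf-to-root path, re-wrapping all children at each level. */
    static void fkt_secure_delete(struct fkt_node *leaf)
    {
            struct fkt_node *node = leaf->parent;

            wipe_key(leaf);                         /* the file key is gone */
            while (node) {
                    unsigned char fresh[KEY_LEN];
                    gen_random_key(fresh);          /* fresh key for this node */
                    for (int i = 0; i < FKT_N; i++)
                            if (node->child[i])
                                    rewrap_child(node->child[i], fresh);
                    install_key(node, fresh);       /* the root's new key replaces
                                                       the master key in the vault */
                    node = node->parent;
            }
    }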

We implemented this technique in an unconventional prototype, a file-aware stack-

able block device, which can be deployed as a stand-alone Linux kernel module that

does not require any modification to the operating system architecture. As the name

implies, our implementation exposes a virtual block device on top of an existing phys-

ical device installed on the computer. Users can format this drive with any file system

and interact with it as they would normally do with a physical disk. Our block level

implementation is able to capture higher-level file system information to identify file

blocks while providing I/O performance significantly better than a file system-level

solution.

3.2 Background & Related Work

Secure deletion of data from physical storage is a well-studied and complicated prob-

lem. Regardless, it remains unsolved in the general case. In the following, we briefly


outline related work on various forms of secure deletion, highlight their shortcomings,

and motivate our approach.

3.2.1 Related Work

Secure deletion approaches have been investigated at several different layers of ab-

straction and using a variety of techniques. We refer readers to a comprehensive

classification of prior approaches [52], while in the following we summarize relevant

related work.

Hardware Techniques

The lowest point at which secure deletion can be performed is at the physical layer.

In the most direct interpretation, secure deletion can be performed through physical

destruction of the storage medium. Scenarios where these methods apply are out of

scope for this paper.

Secure deletion can also be performed at the hardware controller. For magnetic

media, SCSI and ATA controllers provide a Secure Erase command that overwrites

every physical block. Some solid-state drives also provide such a command. However,

this is a coarse-grained approach to secure deletion that is difficult to improve upon

since, without knowledge of the file system, controllers cannot easily distinguish data

to be preserved from data to be deleted. Furthermore, prior work has shown that

hardware-level secure deletion is not always implemented correctly [53].


File System-based Solutions

The next layer of abstraction above the physical controller is at the file system.

Here, secure deletion approaches can take advantage of file system semantics, but are

potentially restricted by the device driver interface.

One class of techniques is aimed at devices for which the operating system can reli-

ably perform in-place updates (e.g., magnetic hard drives). Many specific techniques

have been proposed, including queuing freed blocks for explicit overwrite [44, 45, 54]

as well as intercepting unlink and truncation events for user space scrubbing [45].

Another class of techniques is intended for devices such as raw flash memory, where

there is asymmetry between the minimum sizes of read or write and erase operations.

One notable example is DNEFS [42], which modifies the file system to encrypt each

data block with a unique key and co-locates keys in a dedicated storage area. Secure

deletion is implemented by erasing the current key storage area and replacing it with

a new version. During this replacement, keys corresponding to deleted data are not

included in the new version.

However, a fundamental underlying assumption of these approaches, that the OS

has the ability to directly read or write physical blocks, as in the case of magnetic hard

drives or raw flash memory, is not valid for modern storage devices such as SSDs as

we describe below.

User-level Tools

User space is the highest layer of abstraction from which secure deletion can be at-

tempted. These approaches are restricted to the file system API exposed by the op-

erating system to accomplish their task (e.g., the POSIX API for a POSIX-compliant

system). One example of such an approach is Secure Erase [55], an application that


simply invokes the Secure Erase command on a storage controller. However, as dis-

cussed above, this is not a reliable secure deletion mechanism.

User-level tools can also attempt to explicitly overwrite data to be securely deleted [56],

a popular approach first proposed by Gutmann [57]. However, these approaches as-

sume that overwriting a block using the interface provided by the operating system

guarantees that all copies of that data on physical storage will be overwritten on the

underlying physical medium.

A third user space secure deletion approach is to fill the free space of a file sys-

tem [58, 59]. The motivation for this approach is to proactively overwrite remnants

of potentially sensitive data on storage left in the free block pool. However, this

approach is also limited by the operating system actually providing the capability

to overwrite all free blocks on storage, as well as the system’s ability to expose all

physical blocks to user space. We discuss below that this may not always be the case

with modern SSDs.

Cryptographic Erasure

Along a different axis than abstraction layer, there are also techniques that make use

of cryptographic erasure as a fundamental primitive. Put simply, these techniques

reduce secure deletion of data to secure deletion of a key encrypting that data. Under

computational hardness assumptions, encrypted data without the corresponding key

is infeasible for an attacker to recover. Prominent examples of this include Boneh’s se-

cure deletion approach for offline data such as tape archives [39], Lee’s secure deletion

approach for YAFFS [60], DNEFS [42], and TrueErase [61, 62]. While these works

present various secure deletion techniques for certain solid state storage types, they

are not compatible with flash translation layers implemented in opaque, hardware

controllers, excluding them from use on typical SSDs.


Another approach to cryptographic erasure is proposed by Tang et al. [37]. In

CleanOS, sensitive data on mobile devices is encrypted and the corresponding key is

evicted to the cloud. The fundamental assumption underlying this work is that the

cloud is more trustworthy than the user’s device, which is not always the case.

Yet another example of cryptographic erasure is proposed by Swanson et al. [63],

this time at the controller level. Here, a cryptographic key is used to encrypt all data

stored on the physical device, and this key is stored within a dedicated memory also

located on the device. Secure deletion is performed by replacing this key, resulting in

a coarse-grained secure deletion of all data on storage.

Reardon et al. [64] also present a graph theoretic approach to analyzing and prov-

ing the security of any tree-like approach to secure deletion involving encryption and

key wrapping. They provide an implementation of an instance of this class of ap-

proaches as a B-tree that can provide file-level deletion granularity, and exhibits the

potential for good performance when combined with a suitable caching policy. This

work is closely related to ours, and therefore, we defer a direct comparison between

them to the discussion in Section 3.7.

3.2.2 Flash Translation Layers

Raw flash memory is a common storage technology due to its low power consump-

tion, density, and efficient random-access characteristics. In a significant departure

from classical storage technologies such as magnetic hard disks, flash memory pos-

sesses an asymmetry between the sizes of read and write operations versus the size

of erasure operations. In particular, data is read and written at page granularity

(e.g., 4 KB blocks), but is erased at an erase block granularity (e.g., 256 KB chunks).

Furthermore, flash memory cannot be written to unless the page, and its enclosing


erase block, has first been erased. Since this operation incurs significant wear, wear

leveling is performed wherein erasure operations are evenly distributed across flash

erase blocks in order to maximize the device’s service lifetime. This leads to the

phenomenon of write amplification, where one logical I/O operation leads to multiple

physical I/O operations.

For raw flash devices intended to be directly exposed to an operating system,

wear leveling is expected to be performed by the device driver. However, devices such

as solid-state drives (SSDs) do not expose this low-level interface. Instead, a flash

translation layer (FTL) is interposed to provide a traditional sector-based interface to

the operating system much as a magnetic hard disk would provide. For an SSD, the

FTL is implemented within the hardware controller, and in such cases the operating

system does not have direct access to physical flash pages, erase blocks, or visibility

into the wear leveling process. In fact, in order to accommodate expected wear,

account for failed erase blocks, and improve performance, modern SSDs are typically

over-provisioned by 25%.

Since FTLs obscure physical flash erase blocks and wear leveling leads to write

amplification that results in significant amounts of duplicated data, existing secure

deletion techniques are incompatible with such devices.

3.2.3 Motivation

To summarize, while prior secure deletion approaches work under certain circum-

stances, they do not address common cases where the operating system cannot guar-

antee that physical blocks are not duplicated on storage, or that logical blocks map

directly to physical blocks, as in the case of FTL-based devices such as SSDs. Those


approaches that remain, such as whole-device secure erase commands or cryptographic

erasure [63], only operate at the coarsest granularity possible.

Our work aims to fill this important gap for arbitrary storage devices by satisfying

the following design goals:

• Secure deletion must not rely on the assumption that blocks are not duplicated

without its knowledge.

• Secure deletion must not rely on the assumption that logical block addresses

map one-to-one to physical block addresses.

• Secure deletion must operate at a useful level of granularity – in our case, at

the file level.

3.3 Threat Model

The threat model we consider in this work is essentially a notion of forensic security.

That is, while the system computes over sensitive data, an adversary is not present

on the system and cannot examine or tamper with this data. We assume that an

adversary can later gain a high level of access to the system, including physical access,

and attempt to forensically recover deleted files that previously contained privacy-

sensitive data. The secure deletion approach we describe in this chapter guarantees

that attackers cannot recover data that has been deleted during prior computation.

We assume a trusted computing base (TCB) composed of a subset of the system’s

software that includes the kernel and a small set of high-privilege user space utilities.

The TCB also includes a subset of the underlying firmware and hardware, in partic-

ular a secure storage area, described later in this chapter, such as a Trusted Platform

Module (TPM) chip or smartcard. However, storage controllers are considered to be


untrusted, and no assumptions are made as to the kind of physical medium used in

the system (e.g., magnetic hard disk, SSD, tape, optical drive).

3.4 Design

Before describing Eraser, we first present a naïve approach to secure file deletion,

and discuss its drawbacks to motivate the actual design of Eraser. We then analyze

the theoretical storage and time bounds for Eraser.

3.4.1 Naïve Approach

A straightforward approach to secure file deletion using cryptographic erasure is to

simply generate a unique encryption key for each file. Any data written to storage

would be encrypted with its associated file key, and decrypted when read from storage.

Securely deleting a file is then reduced to securely deleting the corresponding file

key (i.e., under computational hardness assumptions, it should be infeasible for an

attacker to recover the file without the key).

This approach, however, has an important flaw: file keys must also be persisted

to storage across system reboots or failures, and as a result, there would be no way

to assure that file keys themselves are securely deleted. To address this recursive

problem, we encrypt the file keys with a master key and rely upon a trusted element

to serve as secure storage for this master key. We term this master key secure storage

the master key vault, which must satisfy the following properties:

• The vault must be large enough to store a master key.

• The vault must allow the system to perform encryption and decryption opera-

tions using the stored master key.


• The vault must allow the system to update the stored master key with a new

key.

Unfortunately, this leads to a second problem: the simple two-level hierarchy

described above implies that deleting a single file requires re-encrypting all file keys.

To understand why this is the case, consider that on modern storage devices, block

data might be persisted to multiple physical locations due to phenomena such as

flash wear leveling, and that such processes are completely outside the control of an

operating system kernel. Therefore, in order to ensure that file data is irrecoverable,

the master key must itself be rotated, and the old key securely deleted from the vault

such that there is no computationally feasible way for an attacker to decrypt block

data recovered from physical storage. Since the master key must be rotated, all file

keys must be re-encrypted before being persisted to disk, leading to a phenomenon we

term encryption amplification. This is an expensive operation that should be avoided

for any practical system.

3.4.2 File Key Trees

To address the above problems identified in the naïve approach, Eraser's design

incorporates two key elements: (i) a master key vault, and (ii) a file key tree (FKT).

The master key vault has the properties described above, which allows for master

keys to be rotated with secure deletion of the old key. The FKT, on the other hand,

avoids the problem of encryption amplification by bounding the number of keys that

must be re-encrypted each time the master key is rotated.

An FKT is an n-ary tree (i.e., a tree where each node has up to n children) of

height m. At the root is the master key, which is stored in the master key vault, is

never released from the system TCB, and is never persisted to other storage in any


Figure 3.1: Structure of an n-ary FKT with m = 2. The root node is represented by a master key M stored in a secure master key vault. Each internal node contains a key encrypted by the parent key. Leaf nodes correspond to file encryption keys.

form. Internal nodes of the tree correspond to randomly-generated encryption keys.

Each node key encrypts the keys of its children. Leaves of the tree correspond to file

encryption keys. An example of an n-ary FKT with m = 2 is shown in Figure 3.1.
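To make the layout concrete, the following is a minimal C sketch of one possible on-disk representation of an FKT node, assuming the 256-bit keys and 128-bit IVs used by our prototype later in this chapter; the type and field names are illustrative, not taken from the actual implementation:

    #include <stdint.h>

    #define FKT_KEY_LEN 32  /* 256-bit node/file encryption key */
    #define FKT_IV_LEN  16  /* 128-bit IV */

    /* One on-disk FKT node (48 bytes): its key material is stored
     * encrypted under the parent node's key; only the root (master)
     * key lives in the external vault and never reaches the disk. */
    struct fkt_node {
        uint8_t key[FKT_KEY_LEN];
        uint8_t iv[FKT_IV_LEN];
    };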

FKT Space Complexity

To represent |F| files, an FKT with at least |F| leaves must be created. Therefore, the size of an FKT is bounded by

$O\left(n^{\lfloor \log_n |F| \rfloor} + |F|\right).$

This is simply the number of internal nodes required to represent |F| leaves in an

n-ary tree plus the leaves themselves. In practice, the root key will be evicted to the

dedicated master key vault, while the remaining levels of the FKT will be persisted

to disk.


Figure 3.2: Secure deletion using an FKT with n = 2, |F| = 4. Step 1: The initial state of the tree contains encryption keys for four files. Step 2: When the user decides to delete file 3, a traversal of the FKT from the corresponding leaf node for file 3 to M is performed. Starting below the master node, each node’s key is decrypted using the parent’s key. Additionally, all other direct children of the current node are decrypted. Decrypted nodes are shown here in bold. Step 3: Keys along the direct path from file 3’s leaf node to the master key node are randomly regenerated. These nodes have a dotted outline. The old master key M is securely deleted from the vault, and a new master key M′ is stored. Step 4: Keys at direct children of nodes on the path from Step 3 are re-encrypted to obtain the new FKT, which is persisted to disk. Nodes from the pruned branch as it existed at Step 1 might remain on insecure storage, but since M has been erased it is computationally infeasible for an attacker to decrypt data along that path.


FKT Operations and Time Complexity

Accessing a file encrypted using an FKT involves collecting a chain of encryption keys

from the corresponding FKT leaf node to the master key and performing a series of

decryption operations to recover the file encryption key. Therefore, the number of

decryption operations to obtain access to a file is bounded by

$O\left(\lceil \log_n |F| \rceil\right).$

Deleting a file, similarly to file access, first requires collecting a chain of encryption

keys from the corresponding FKT leaf node to the master key. However, the next step

of this process is to: (i) randomly generate new encryption keys for each node along

the path to, and including, the master key node; and, (ii) re-encrypt the existing keys

at direct children (i.e., non-recursively) for each node along the previously identified

path in the FKT. Therefore, this operation’s time complexity is bounded by

$O\left(n \lceil \log_n |F| \rceil\right).$

This process is explained in a concrete example in Figure 3.2 for n = 2, |F| = 4.
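As a rough illustration of the deletion procedure, the sketch below restates the steps above in C-like pseudocode; all types and helper functions (fkt_leaf_for, fkt_parent, fkt_regenerate_key, fkt_reencrypt_children, vault_swap_master_key) are hypothetical stand-ins for the logic described in the text:

    #include <stdint.h>

    #define FKT_MAX_HEIGHT 8   /* illustrative bound on tree height */

    void fkt_secure_delete(struct fkt *t, uint64_t file_id)
    {
        struct fkt_node *path[FKT_MAX_HEIGHT];
        int depth = 0;

        /* Collect the key chain from the file's leaf up to the root,
         * decrypting each node and its direct siblings with the parent key. */
        for (struct fkt_node *n = fkt_leaf_for(t, file_id); n; n = fkt_parent(t, n))
            path[depth++] = n;

        /* Randomly regenerate every key on the path; the master key is
         * replaced inside the vault, which securely erases the old one. */
        for (int i = 0; i < depth; i++)
            fkt_regenerate_key(t, path[i]);
        vault_swap_master_key(t->vault);

        /* Re-encrypt the direct children (non-recursively) of each node on
         * the path under the fresh keys, then persist the nodes to disk:
         * O(n * ceil(log_n |F|)) work in total. */
        for (int i = 0; i < depth; i++)
            fkt_reencrypt_children(t, path[i]);
    }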

3.5 Implementation

We implemented the general secure deletion approach presented in the previous sec-

tion in a prototype tool called Eraser, which operates at the block I/O layer of

Linux but provides secure deletion guarantees at file granularity. Eraser does not

require modifications to the rest of the Linux architecture and can be deployed as a

stand-alone kernel module. Our prototype utilizes commodity TPM chips that are


present on many modern motherboards and can easily be extended from user space

to support other types of secure external storage (e.g., smartcards).

3.5.1 Alternative Solutions & Our Philosophy

As we discussed in Section 3.2, there is a myriad of tools and techniques that imple-

ment secure deletion capabilities at various layers of a computer system. To assess

the advantages and drawbacks of each of these options, we first discuss various imple-

mentation alternatives to realize our approach, and explain our decision to choose the

block I/O layer for our prototype. We refer readers to Section 2.4.2 and Figure 2.2

for an overview of the various Linux I/O subsystems.

Unsatisfactory Solutions

While it is relatively easy to develop user space solutions instead of trying to un-

derstand and modify operating system internals, the I/O-related system calls offer

minimal control over how data blocks are processed and stored at lower levels, lim-

iting the effectiveness of such solutions for our purposes. Likewise, in this work, we

refrain from directly modifying concrete file system implementations or specific de-

vice drivers. While implementing our approach at those layers is possible, choosing

any specific instance to adapt to our needs would limit the usefulness of our system.

Conversely, modifying and maintaining every single file system or driver available in

Linux would be a high-effort and bug-prone affair.

In spite of the issues mentioned above, the file system layer is still the most natural

place to enforce secure deletion of files. By definition, the file system is already aware

of all the data blocks corresponding to any given file and also has full control over

file metadata, all of which significantly eases development burden. One solution that


alleviates the issues tied to working with a specific file system, while also leveraging

the advantages of the file system layer, is utilizing a stackable file system. These

special file systems reside on top of another underlying file system and transparently

interpose on the passing I/O requests, presenting a viable option to implement secure

deletion. For instance, eCryptfs [8] is a stackable encrypting file system distributed

with Linux, and could easily be adapted to our approach.

Unfortunately, stackable file systems often come with a significant performance

overhead. In fact, during our evaluation of PrivExec in Section 2.5 we already wit-

nessed the performance issues with eCryptfs, and benchmarks also show that eCryptfs

performs considerably worse than block-layer encryption [65]. Since one of our im-

plementation goals is to build a performant system that could be used on everyday

computers, we chose to employ a different strategy for our prototype.

Another seemingly viable alternative is to implement our system as part of the

Linux page cache. However, examining the kernel internals reveals that some of

the critical page cache functions that manipulate file blocks (e.g., readpage and writepage) are actually required to be provided by file systems. Furthermore, Linux

gives applications the ability to perform direct I/O operations that bypass the page

cache. As a result, we conclude that the page cache is not a suitable layer to implement

secure deletion.

Our Solution

In light of the above considerations, we decided to implement our approach at the

block device level in a stackable block device driver. Similar to how stackable file

systems operate, stackable drivers intercept block I/O requests before they reach the

underlying drivers and allow us to manipulate them as necessary. The main advantage

of a block-layer approach is its performance (e.g., compare dm-crypt’s performance


to eCryptfs [65]). However, at a first glance, it is not clear how file-level information

could be gathered at the block layer or, in other words, how physical sectors on a

device could be matched to logical file blocks.

Our prototype Eraser closes this semantic gap between the file system and block

device layers by leveraging the Linux kernel’s property that, regardless of the file

system implementation, every file system object is represented by a common data

structure provided by the VFS: the inode. In this way, we can avail ourselves of the

performance benefits of operating on low-level device blocks, while still retaining a

high-level understanding of the file system. At the same time, Eraser works under

any Linux-native file system and is compatible with any physical block device.

3.5.2 Prototype Overview

We implemented Eraser using device-mapper [66] as a stackable block device driver,

also referred to as a device-mapper target in Linux parlance. Device-mapper is a

standard Linux kernel framework that allows users to create stackable drivers, and is

used in technologies such as dm-crypt, LVM, software RAID, and Docker. It maps

existing physical block devices onto new ones and exposes these virtual devices to

user space via new device nodes, often found under /dev/mapper/*. Users can then

interact with these device nodes in the usual way, formatting them with a file system

of their choice and storing data in them.
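For readers unfamiliar with device-mapper, the following is a minimal sketch of how a target such as Eraser plugs into the framework, using the standard target_type interface of the Linux kernel; the eraser_* names and the bare-bones constructor are illustrative, not the actual Eraser source:

    #include <linux/module.h>
    #include <linux/device-mapper.h>
    #include <linux/slab.h>
    #include <linux/bio.h>

    struct eraser_dev { struct dm_dev *dev; };  /* per-instance state */

    static int eraser_ctr(struct dm_target *ti, unsigned int argc, char **argv)
    {
        struct eraser_dev *ed = kzalloc(sizeof(*ed), GFP_KERNEL);

        if (!ed)
            return -ENOMEM;
        /* argv[0]: underlying device path; the real driver would also
         * unlock the master key and load the FKT header here. */
        if (dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), &ed->dev)) {
            kfree(ed);
            return -EINVAL;
        }
        ti->private = ed;
        return 0;
    }

    static void eraser_dtr(struct dm_target *ti)
    {
        struct eraser_dev *ed = ti->private;

        dm_put_device(ti, ed->dev);
        kfree(ed);
    }

    static int eraser_map(struct dm_target *ti, struct bio *bio)
    {
        struct eraser_dev *ed = ti->private;

        /* Inspect, encrypt, or stall the bio here, then send it downward. */
        bio_set_dev(bio, ed->dev->bdev);
        return DM_MAPIO_REMAPPED;
    }

    static struct target_type eraser_target = {
        .name    = "eraser",
        .version = {1, 0, 0},
        .module  = THIS_MODULE,
        .ctr     = eraser_ctr,
        .dtr     = eraser_dtr,
        .map     = eraser_map,
    };

    static int __init eraser_init(void)
    {
        return dm_register_target(&eraser_target);
    }

    static void __exit eraser_exit(void)
    {
        dm_unregister_target(&eraser_target);
    }

    module_init(eraser_init);
    module_exit(eraser_exit);
    MODULE_LICENSE("GPL");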

A high-level view of the system is illustrated in Figure 3.3. Eraser organizes file

encryption keys in an FKT as described and stores them in a reserved section of the

underlying storage device. The master key, however, is never persisted to this device.

Instead, it is confined to an external secure store. Specifically, in our implementation,

we store it in the NVRAM area of a TPM chip installed on the machine. Eraser


Figure 3.3: An overview of Eraser’s design. Our prototype implementation utilizes a TPM chip as its external secure store to preserve the master encryption key.


then intercepts all block I/O operations in flight, identifies which files those I/O blocks

belong to, and retrieves the appropriate keys to encrypt or decrypt the file contents on

the fly. When a file is deleted, its associated key is discarded as described previously.

Finally, the newly generated keys are written to the key store and a fresh master key

is synced to the TPM chip, overwriting the obsolete key. We will now discuss these

components in more detail.

3.5.3 I/O Manipulation

The kernel represents and tracks in-flight block I/O operations with a data structure

called a bio. Through the device-mapper framework, each bio destined for an un-

derlying physical device is first handed to Eraser where we can freely manipulate

them before passing them on to the next device driver in the stack.

Identifying Files

The first task Eraser needs to be able to perform is detecting whether a bio corre-

sponds to a file system operation. Thanks to the way Linux handles pages of a file

during I/O and the VFS layer which necessitates that every file system object have

a corresponding inode object associated with it, this task is possible without explic-

itly modifying the upper kernel layers or attempting to propagate this information

downwards.

Since there is a one-to-one mapping between inodes and files, our implementation

uses inode numbers to uniquely identify files and find their corresponding encryption

keys. Whenever Eraser receives a bio, it first iterates over all of the memory pages

(i.e., data buffers in volatile memory) it points to. Linux provides another related

object per file, called an address space, that describes the mapping between physical


blocks on a disk, pages in memory, and the inode owning these. By walking through

this structure Eraser is able to match every page in a bio to a specific inode,

check whether the inode at hand corresponds to a file, and subsequently identify the

encryption keys to be used based on the inode number. Otherwise, if the pages are

found to have no corresponding inode or an inode that represents a file system object

other than a file, that bio is simply remapped and sent to the actual underlying

device without further processing.
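A hedged sketch of this lookup is shown below; it follows the mainline kernel structures (page->mapping->host) but omits error handling and locking, and the function name is illustrative:

    #include <linux/bio.h>
    #include <linux/fs.h>
    #include <linux/pagemap.h>

    /* Return the inode owning the pages of a bio, or NULL if the bio does
     * not belong to a regular file and should be passed through as-is. */
    static struct inode *eraser_bio_inode(struct bio *bio)
    {
        struct bio_vec bv;
        struct bvec_iter iter;

        bio_for_each_segment(bv, bio, iter) {
            struct address_space *mapping = page_mapping(bv.bv_page);

            if (mapping && mapping->host && S_ISREG(mapping->host->i_mode))
                return mapping->host;   /* key lookup keys off i_ino */
        }
        return NULL;
    }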

Writing Files

Once a bio corresponding to a file write operation is identified, Eraser needs to

retrieve the appropriate key, encrypt the contents, and perform the write to the

underlying device. However, simply iterating over the memory pages pointed to by

a bio and encrypting them in-place is not the correct approach. This is because

the same memory pages representing the write buffers are often also present in the

page cache. Thus, directly encrypting them would result in ciphertext being served

to user space from the cache with future I/O requests, without our driver having an

opportunity to decrypt them. Even if we could attempt to intercept cache hits, it

would be sub-optimal to decrypt the same contents with every individual read.

To address this issue, Eraser makes a clone of the original bio and all of its

pages, and instead encrypts the copied pages. Next, the cloned bio is asynchronously

submitted to the underlying device, while the original I/O request is being stalled.

Once Eraser receives notification of a completed disk write through a callback, it

marks the original bio as completed as well, which automatically signals the upper

layers of a successful disk write. In this way, the cached data remains untouched and

subsequent file reads that result in cache hits do not require repeatedly decrypting

the same data buffers.
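The following sketch outlines this write path; eraser_clone_bio_with_pages, eraser_encrypt_pages, and eraser_free_cloned_pages are hypothetical helpers standing in for the cloning and per-page encryption logic described above:

    /* Completion callback for the encrypted clone: release the copied
     * pages and only then complete the stalled original bio. */
    static void eraser_write_end_io(struct bio *clone)
    {
        struct bio *orig = clone->bi_private;

        eraser_free_cloned_pages(clone);
        bio_put(clone);
        bio_endio(orig);   /* signals a successful write to upper layers */
    }

    static void eraser_handle_write(struct eraser_dev *ed, struct bio *orig)
    {
        struct bio *clone = eraser_clone_bio_with_pages(ed, orig);

        eraser_encrypt_pages(ed, clone);  /* plaintext page cache untouched */
        clone->bi_private = orig;
        clone->bi_end_io = eraser_write_end_io;
        submit_bio(clone);                /* original bio remains stalled */
    }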


Reading Files

Handling of bio objects that represent file reads is similar, with a single difference.

When Eraser first intercepts the bio, the pages it points to are empty, ready to be

filled with data read from the physical disk. Therefore, Eraser first needs to initiate

the actual disk read, and decrypt the data only once the operation is complete.

This is achieved by, once again, cloning the original bio, and submitting the

clone for I/O to the underlying device while the original operation is being stalled.

However, this time, it is not necessary to allocate separate memory pages for the clone;

instead, the clone points to the original memory pages. Once we receive notification

of a completed read operation, Eraser retrieves the appropriate key, decrypts the

contents in place, and finally signals completion of the original bio.
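A corresponding sketch of the read completion path is shown below; here the clone shares the original pages, and eraser_decrypt_pages is again a hypothetical stand-in for the in-place decryption step:

    /* Completion callback for the cloned read: the shared pages now hold
     * ciphertext freshly read from disk, so decrypt in place before
     * completing the original request. */
    static void eraser_read_end_io(struct bio *clone)
    {
        struct bio *orig = clone->bi_private;

        eraser_decrypt_pages(clone);
        bio_put(clone);
        bio_endio(orig);
    }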

Cryptographic Operations

Eraser uses AES-256 in CBC mode to encrypt file blocks. Every file is also given a

unique IV stored together with the keys in the FKT. Since encryption is performed

on a page-by-page basis and pages of a file could be read or written in any arbitrary

order, a page IV is derived from the file’s unique IV and the file offset of the processed

page. Finally, all random data used by Eraser to regenerate encryption keys and IV

after a file deletion is generated using AES-256 in CTR mode. This random stream

is seeded by a key from the kernel’s cryptographically secure random byte pool and

the cipher is reseeded after every 1 MB of data output.
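The exact page IV derivation is not spelled out above, so the sketch below shows one plausible construction that satisfies the description, XORing the little-endian page index into the file’s IV; this is an assumption for illustration, not necessarily what Eraser implements:

    #include <linux/kernel.h>
    #include <linux/string.h>
    #include <crypto/algapi.h>   /* crypto_xor() */

    #define ERASER_IV_LEN 16

    /* Derive a per-page IV from the file's unique IV and the page's
     * position within the file, so pages can be processed in any order. */
    static void eraser_page_iv(const u8 *file_iv, u64 page_index,
                               u8 out[ERASER_IV_LEN])
    {
        __le64 idx = cpu_to_le64(page_index);

        memcpy(out, file_iv, ERASER_IV_LEN);
        crypto_xor(out, (u8 *)&idx, sizeof(idx));
    }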

3.5.4 Intercepting File Deletions

When an Eraser virtual device is first initialized, its FKT is filled with randomly

generated keys and IVs for every file, and the system is ready for use. Consequently,


our system does not need to track file creation events. Instead, we only monitor file

deletions, discard the appropriate keys in the FKT as discussed in Section 3.4, and

immediately generate new keys for the freed inodes later to be used by the next file

that is assigned the same inode number.

While this approach simplifies our implementation, intercepting file deletion events

from the block layer is still not a trivial task. In particular, because a file deletion

often only involves changes to file system indices and metadata, and no I/O operations

are performed on the actual file blocks, the block I/O layer remains oblivious to this

file system modification.

Eraser addresses this challenge with the help of another Linux kernel framework,

Kernel Probes (kprobes) [67]. Kprobes allow users to hook into code addresses inside

the kernel, and access or manipulate system state. We utilize this capability to trap

execution at the entry point of the vfs_unlink function, a choke point inside the

VFS for all deletion operations. Next, in our hook function, we access the original

function’s arguments from the CPU registers, retrieve a pointer to the deleted inode,

and check whether it represents a file object. Note that since vfs_unlink is called

from all file systems available on the machine, we also need to check here that the

inode actually resides on an Eraser partition and not some other device. Once it is

confirmed that a file on a relevant device is being deleted, we then trigger the secure

deletion process and generate fresh keys for the freed inode.
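A minimal sketch of this kprobe hook is given below, assuming an x86-64 calling convention where vfs_unlink’s second argument (the dentry) arrives in %rsi; eraser_owns_inode and eraser_queue_secure_delete are hypothetical placeholders for the device check and key regeneration described above:

    #include <linux/kprobes.h>
    #include <linux/fs.h>
    #include <linux/ptrace.h>

    static int eraser_unlink_pre(struct kprobe *p, struct pt_regs *regs)
    {
        /* On x86-64, vfs_unlink's second argument (the dentry) is in %rsi. */
        struct dentry *dentry = (struct dentry *)regs->si;
        struct inode *inode = dentry ? dentry->d_inode : NULL;

        /* Trigger secure deletion only for regular files that reside on
         * an Eraser-managed partition. */
        if (inode && S_ISREG(inode->i_mode) && eraser_owns_inode(inode))
            eraser_queue_secure_delete(inode->i_ino);
        return 0;
    }

    static struct kprobe eraser_unlink_kp = {
        .symbol_name = "vfs_unlink",
        .pre_handler = eraser_unlink_pre,
    };

    /* register_kprobe(&eraser_unlink_kp) at module load,
     * unregister_kprobe(&eraser_unlink_kp) at unload. */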

3.5.5 Key Storage & Management

While Eraser’s key organization is based on the high-level FKT design presented in

Section 3.4, we also employ a number of optimizations specific to our implementation.


Due to the potentially large size of an FKT, the majority of the keys are stored

on the disk at any given time and are only accessed when required for a file access

or secure deletion. Because of this, the parameter n of the FKT should be chosen

to optimize disk I/O performance. In our implementation, every node of the tree

contains a 256-bit encryption key and 128-bit IV, for a total size of 48 bytes. To

ensure that we perform disk I/O operations on block boundaries, we set n to the

maximum number of tree nodes that can fit into a single block (i.e., $\lfloor 4096/48 \rfloor = 85$ for a

system with 4 KB logical blocks). In this way, we can perform disk I/O on all children

of a node directly with a single block access. This also has the desirable side effect

of allowing us to perform cryptographic operations on the blocks with a single pass,

because page sizes are often equal to or multiples of block sizes. Of course, other

configurations are also possible as long as n is chosen so that nodes fall within block

boundaries.

Note that the structure of an FKT can be estimated fairly accurately at the time of

system initialization, and the tree structure will remain static throughout the system’s

life. In our approach, there is a leaf node corresponding to each file. The number of

files is limited by either the number of inodes a file system can support on a device of

given capacity or, in the case of file systems that allocate inode indices dynamically,

by the space reserved on the device for the key store. With this knowledge of how

many leaves are going to be available in the FKT at any given time, we can further

optimize the tree structure for space efficiency.

We do this by first calculating the minimum tree height required based on the

number of inodes we need to support, and then decreasing the fan-out of the root

node to a value smaller than n in order to cull unused, empty subtrees of the root.

For instance, an Ext4 file system created on a device with a 100 GB capacity would

default to allocating 6,553,600 inodes. To create a tree with 6,553,600 leaves working


back towards the root (with n = 64 to simplify calculations), we would need 6553600/64 = 102400, 102400/64 = 1600, and 1600/64 = 25 nodes at each level. Consequently, a fan-out of 25

for the root would be sufficient for this configuration. The recurrence relationship for

calculating the total number of tree nodes required to support |F| files is given below, excluding the root node stored externally. As a result, our implementation reserves 48R(|F|) bytes of storage space for a file system that can represent a maximum of |F| files.

$R(|F|) =
  \begin{cases}
    |F| + R\left(\lceil |F| / n \rceil\right), & \text{if } |F| > n \\
    |F|, & \text{if } |F| \leq n
  \end{cases}$
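For concreteness, the recurrence translates directly into the following small C helper; the function name is illustrative, and the ceiling division reflects that a partially filled parent level still occupies whole nodes:

    #include <stdint.h>

    /* R(|F|) from the recurrence above. */
    static uint64_t fkt_total_nodes(uint64_t files, uint64_t n)
    {
        if (files <= n)
            return files;
        return files + fkt_total_nodes((files + n - 1) / n, n);
    }

    /* Example from the text: fkt_total_nodes(6553600, 64) sums
     * 6553600 + 102400 + 1600 + 25 nodes, so the key store reserves
     * 48 * fkt_total_nodes(|F|, 64) bytes on disk. */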

Recall that our approach requires $\log_n |F|$ disk accesses (i.e., the height of the

tree) to retrieve or discard the required keys with each file access and deletion. Our

approach to mitigate the I/O overhead caused by this necessity is twofold. First,

we always keep the decrypted nodes in levels 1 and 2 in memory. For example,

following the previous example with 6,553,600 leaves and n = 64, this would require

less than 7 MB of memory. Modified nodes are written to disk periodically. Next, we

employ a caching strategy for the leaf nodes so that the keys for frequently accessed

files are available in memory for quick access. Similarly, a dedicated kernel thread

periodically synchronizes dirty cache entries to their disk blocks and evicts old cache

entries. We should point out that we experimented with various caching strategies and

data structures for searching the cache efficiently. Our performance measurements

show that having a cache, as opposed to always reading the keys from the disk, results

in a significant performance gain. However, fine-tuning the cache organization had

no discernible impact on performance. This indicates that Eraser’s performance is


primarily I/O bound as expected, and that cache searches are overshadowed by I/O

operations.

3.5.6 Master Key Vault

In our prototype implementation we store the master key inside the NVRAM area

of a TPM chip. This enables us to reliably discard (i.e., overwrite) an obsolete key

when the master key needs to be regenerated after a file deletion, and also provides

a strong defense against unauthorized retrieval of the master key.

While it would also be possible to interact with the TPM chip directly from

within the kernel, our implementation instead utilizes a user space helper application

to read from and write to the NVRAM. This is a conscious design choice to make it

possible to extend the system to support different secure storage modules in the future

without requiring modifications to the kernel core. Eraser coordinates with this

helper application using the Linux kernel’s netlink facility, a standard mechanism

for kernel-to-user space communication. Note that the master key is further protected

by an encryption key derived from a user password, configured when setting up a new

Eraser instance. Therefore, the master key is always encrypted when residing inside

the TPM chip, accessed by the user space helper, or in transit through the netlink

channel.
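A minimal sketch of the kernel side of such a netlink channel is shown below, using the standard netlink_kernel_create interface; the protocol number NETLINK_ERASER and the eraser_handle_vault_reply helper are hypothetical:

    #include <linux/netlink.h>
    #include <net/netlink.h>
    #include <net/sock.h>
    #include <net/net_namespace.h>

    #define NETLINK_ERASER 31   /* hypothetical protocol number */

    static struct sock *eraser_nl_sock;

    /* Called for each datagram from the user space vault helper, e.g., a
     * password-encrypted master key fetched from the TPM's NVRAM. */
    static void eraser_nl_recv(struct sk_buff *skb)
    {
        struct nlmsghdr *nlh = nlmsg_hdr(skb);

        eraser_handle_vault_reply(nlmsg_data(nlh), nlmsg_len(nlh));
    }

    static int eraser_netlink_init(void)
    {
        struct netlink_kernel_cfg cfg = { .input = eraser_nl_recv };

        eraser_nl_sock = netlink_kernel_create(&init_net, NETLINK_ERASER, &cfg);
        return eraser_nl_sock ? 0 : -ENOMEM;
    }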

3.5.7 Encrypting Non-File System Blocks

Our discussion of cryptography in this chapter focused on the secure deletion of

files. The approach we present also provides security guarantees similar to ordinary

file encryption tools as a side benefit, provided that the external master key cannot

be read by others or is otherwise further protected with another secret such as a


password. However, the approach we described so far does not provide full disk

encryption capabilities, since non-file blocks on the disk (e.g., the file system’s internal

data or free blocks) are not encrypted. This would require a user desiring both full

disk encryption and secure deletion at the same time to run Eraser on top of yet

another disk encryption solution, such as dm-crypt. This redundancy could hurt I/O

performance.

To address this limitation, we extended Eraser to provide full disk encryption

for non-file blocks as well. In short, Eraser operates in file encryption mode, as

described in Section 3.5.3, if the I/O request is for a file block. In all other cases,

it performs regular disk sector encryption using a fixed key generated on system

initialization and protected with a user password. The IVs in this mode are also

derived from disk sector numbers using the “encrypted salt-sector initialization vector

(ESSIV)” method [68]. In this way, Eraser becomes a full replacement option for

other disk encryption solutions, offering secure deletion guarantees on top of the usual

confidentiality characteristics of disk encryption.
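As a sketch of the ESSIV construction referenced above: the sector number is encrypted under a key derived by hashing the disk key, so sector IVs are unpredictable without that key. The helper below assumes essiv_tfm was keyed with a hash (e.g., SHA-256) of the disk key at setup time; the names are illustrative:

    #include <linux/crypto.h>
    #include <linux/string.h>

    /* Generate an ESSIV-style IV for one disk sector: encrypt the
     * little-endian sector number with the hashed-key cipher. */
    static void eraser_sector_iv(struct crypto_cipher *essiv_tfm,
                                 u64 sector, u8 iv[16])
    {
        __le64 s = cpu_to_le64(sector);

        memset(iv, 0, 16);
        memcpy(iv, &s, sizeof(s));
        crypto_cipher_encrypt_one(essiv_tfm, iv, iv);
    }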

3.5.8 Managing Eraser Partitions

Lastly, users interact with Eraser through a user space application that allows them

to format physical devices to create the required headers and internal metadata.

During this setup process, users are required to configure a password from which

encryption keys are derived for securing the master key while it is being transported

from the TPM chip to the kernel, and also to encrypt non-file system blocks.

Later, Eraser partitions can be activated with this tool to expose the securely-

deleting virtual device node by supplying the correct password. In the same way,

users can view active instances of Eraser, made available by the driver through a


/proc node, and deactivate them when no longer needed. Through this application

users can also configure Eraser to use any of the supported vault devices for master

key storage.

3.6 Evaluation

While the approach we presented in this chapter was primarily designed to provide

strong secure deletion guarantees, many of our implementation choices were also

geared toward achieving good I/O performance on commodity computer systems so

that Eraser could have a practical impact. In this section, we present two sets

of experiments to evaluate the performance overhead of Eraser and compare it to

ordinary full disk encryption.

All experiments described in this section were performed on a regular desktop

computer with an Intel i7-930 2.2GHz CPU, 9 GB of RAM, running Arch Linux x86-

64 with an unmodified 4.17.0 kernel. The storage device used was a Samsung 950

PRO solid state drive with 1TB capacity, formatted with Ext4 using the default file

system settings.

For all tests, the results presented were averaged over five runs. The maximum

relative standard deviation we observed was below 2% for the I/O benchmarks we

describe first, and below 5% for the real-life small file tests we discuss next.

3.6.1 I/O Benchmarks

To understand how Eraser impacts the I/O performance of the underlying storage

device, we first put our system under stress using the popular disk and file system

benchmarking tool Bonnie++. For file I/O tests, we configured Bonnie++ to write

and read 20×1 GB files. This size was chosen to be more than twice the system


    Bonnie++ Tests   No Encryption      dm-crypt           Overhead vs.   Eraser             Overhead vs.   Overhead vs.
                     Performance        Performance        No Enc.        Performance        No Enc.        dm-crypt

    Write            255300.00 KB/s     254990.00 KB/s     0.12 %         253530.20 KB/s     0.70 %         0.58 %
    Read             213778.00 KB/s     142174.20 KB/s     50.36 %        141747.60 KB/s     50.82 %        0.30 %
    Create           37183.60 files/s   35850.80 files/s   3.72 %         34266.00 files/s   8.52 %         4.63 %
    Delete           59418.80 files/s   59098.00 files/s   0.54 %         49230.80 files/s   20.69 %        20.04 %

Table 3.1: Disk I/O and file system performance of Eraser compared to full disk encryption with dm-crypt. Benchmark results on an unencrypted device are also presented as a baseline.


RAM, following the benchmark tool’s recommendation. Next, file creation and dele-

tion tests were performed with 512×1024 small files each containing 512 bytes of

data, distributed among 10 directories. These tests were also repeated on the same

test environment, but instead using dm-crypt, the standard Linux subsystem that

provides full disk encryption. While our discussion will primarily focus on compar-

ing Eraser’s performance to dm-crypt, we also provide benchmark results obtained

without running either encryption tool as a baseline. The results are shown in Ta-

ble 3.1.¹

Bonnie++ benchmarks reveal that when performing read and write operations

on a small number of large files, Eraser exhibits very similar performance to dm-

crypt, with the overhead staying below 1%. This is not surprising, because once

Eraser obtains the encryption key for the processed file with a negligible, one-time

performance hit, the remaining task of encrypting and decrypting the file blocks in-

flight is nearly identical to how dm-crypt performs disk block encryption. However,

in file creation tests, Eraser incurs a more noticeable performance impact. This is

likely due to the fact that Eraser now needs to perform a larger number of additional

I/O operations to repeatedly access its key store, and decrypt the corresponding FKT

nodes to obtain keys corresponding to each newly created file.

Finally, the most significant performance impact is observed during file deletions,

where Eraser falls behind dm-crypt by about 20%. Once again, this outcome is

in line with our expectations since a file deletion is the most expensive operation

Eraser performs: Eraser first intercepts the unlink system call, then performs

several accesses to the FKT, and finally replaces all involved keys with freshly gen-

¹ We point out that in all tests performed with and without Eraser we measured higher write speeds than read speeds. While this was unanticipated, unofficial Internet discussions indicate that this is an issue observed with SSDs produced by this vendor, most likely due to a firmware quirk. Notwithstanding the reasons, we would like to point out that this issue does not have any bearing on our experimental results.


    Tests        No Encryption   dm-crypt                   Eraser
                 Time (s)        Time (s)   Overhead vs.    Time (s)   Overhead vs.   Overhead vs.
                                            No Enc.                    No Enc.        dm-crypt

    Unpack       10.60           10.84      2.26 %          11.39      7.45 %         5.07 %
    Copy         11.44           23.59      106.21 %        22.61      97.64 %        −4.15 %
    Remove       3.26            4.17       27.91 %         5.04      54.60 %         20.86 %
    Grep         11.11           25.18      126.64 %        24.12      117.10 %       −4.21 %
    MD5 Hash     10.39           24.20      132.92 %        22.20      113.67 %       −8.27 %
    Compile      1564.13         1564.15    < 0.01 %        1568.13    0.26 %         0.26 %

Table 3.2: Timed experiments with the Linux kernel source code directory to compare the small-file performance of Eraser to full disk encryption with dm-crypt. Test results on an unencrypted device are also presented as a baseline.

erated ones, also encrypting and writing them back to the key store if there is cache

contention. However, regardless of this drawback, the actual number of files pro-

cessed per second by Eraser remains considerably high. As a result, we next test

how Eraser performs with real-life tasks that heavily involve small file operations

and explore this behavior in more detail.

3.6.2 Tests with Many Small Files

Prompted by Eraser’s relatively high performance overhead observed when dealing

with large numbers of small files under benchmark conditions, we next investigated

how it would perform in more realistic scenarios. To this end, we chose six tasks

involving a large directory tree – namely, the Linux kernel source code – and measured

the time elapsed to complete each task. Once again, the tests were performed first

with Eraser and then dm-crypt. Measurements on a vanilla system with no disk

encryption are also provided as a baseline.

Our tests included the following tasks: (i) Unpacking the XZ-compressed source

code archive, (ii) making a copy of the directory tree, (iii) deleting the directory tree,

(iv) grepping the entire directory for a fixed string, (v) computing an MD5 hash over


all the files, and finally, (vi) compiling the kernel. All tasks were chosen to include

a large number of file operations, including reads, writes, deletions, and new file

creations. Furthermore, certain tasks such as kernel compilation and MD5 hashing

combined small file I/O operations with a CPU-bound component to cover different

scenarios. The results are presented in Table 3.2. Note that all operating system

caches were dropped between tests to ensure that measurements were not affected by

prior runs.

On the one hand, these results confirm our findings from the Bonnie++ bench-

marks that Eraser has a noticeable file deletion overhead, this time manifesting

itself at 21% during the directory removal task. On the other hand, in terms of the

time elapsed, the real-life impact of this performance loss is measured in a few sec-

onds. In all other tasks, Eraser performed comparably to dm-crypt, and surpassed

it in certain cases. However, this should not be taken to mean that Eraser is faster

than dm-crypt. Instead, we conclude that they perform similarly in real-life tasks.

The small differences in our measurements are likely due to natural variations in how

the underlying operating system and hardware perform.

3.6.3 Discussion of Results

In light of our evaluation, we confirm that the performance overhead of Eraser is

directly correlated with the number of files it handles at any given time. I/O per-

formed in big chunks and on a small number of files incurs no significant overhead.

In contrast, accessing a new file or deleting an existing one triggers additional I/O

operations to retrieve the corresponding keys from the FKT, or to rebuild branches

of the FKT with fresh keys. Therefore, repeatedly accessing large numbers of small

files results in a noticeable loss of throughput compared to ordinary full disk encryp-


tion. However, in comparison to ordinary full disk encryption, Eraser guarantees

secure data deletion and is useful in scenarios where privacy guarantees are of utmost

importance.

In addition, our tests also show that this reduction in throughput does not always

translate negatively to realistic workloads such as manipulating or working with very

large directory trees. In our tests, the performance loss is often measured in merely

seconds. In fact, in many workloads, including those with processor-heavy

components, Eraser matches dm-crypt in performance. We find these results very

encouraging, especially considering that dm-crypt is a standard, well-optimized sub-

system of the Linux kernel. We conclude that in most practical use cases Eraser

offers performance comparable to regular full disk encryption with the added benefit

of guaranteed secure deletion.

We should point out that we refrained from comparing our prototype to a system

running without any disk encryption. As shown in Tables 3.1 and 3.2 a vanilla

system offers significantly higher I/O performance than both Eraser and dm-crypt.

However, we believe that such a comparison between encrypted and unencrypted

storage is not very meaningful in this context. First, the observed performance loss is

a direct result of disk encryption, and thus, it is not directly related to secure deletion.

Moreover, we believe that this downside of full disk encryption is a well-understood

and accepted trade-off in the face of modern privacy threats.

3.7 Discussion

Prior Tree-based Secure Deletion Work

As mentioned in Section 3.2, Reardon et al. [64] implemented a B-tree-based approach

to secure file deletion that also made use of cryptographic erasure and key wrapping.


This work is highly related; however, a significant difference between their prototype

and Eraser lies in our focus on developing a high performance secure deletion tech-

nique, and subsequently, presenting a practical and usable system that can act as a

viable substitute for existing, well-established full disk encryption tools.

First, while Reardon’s B-tree prototype shows promising performance character-

istics when combined with a suitable caching policy, our evaluation of Eraser shows

that an FKT implementation can closely approach the performance characteristics

of a heavily used and optimized production-level full disk encryption implementation

(i.e., dm-crypt). We stress that we are not the first to propose tree-based crypto-

graphic erasure using key wrapping. However, we believe that FKTs and our pro-

totype implementation are the first to show that it can be performant for everyday

use.

Next, Reardon’s work leverages the Linux kernel’s network block device facil-

ity [69], which routes block I/O requests over a TCP connection, and is typically used

for accessing remote storage devices. The authors utilize this technique to present a

proof-of-concept implementation of their approach for their experiments. In contrast,

one of our primary goals when developing Eraser was to provide a robust, practical,

and usable system that could easily be adopted for everyday use, on a typical Linux

system. As a result, we were faced with a different set of design and implementation

challenges to fulfill our specific requirements.

Implementation Limitations

As we have shown, Eraser makes it possible to maintain a file-level secure deletion

granularity even when operating at the block device layer. However, this design

choice does pose a major difficulty for securely deleting file metadata, as matching file

system-specific metadata to inodes is a non-trivial (but not impossible) task. Our


current implementation does not perform secure deletion of metadata, and we leave

tackling this implementation challenge to future work.

Eraser uses inode numbers to uniquely match encryption keys to files. This

is an intuitive solution when dealing with Linux-native file systems, such as Ext4,

which internally represent files directly using inodes. However, it should be noted

that “foreign” file systems that are ported to work under Linux (e.g., FAT, ZFS)

do not necessarily have the concept of an inode. Instead, they construct inodes

in memory as files are accessed, and map their own internal representation of files

onto these in-memory structures as this is required to interface with the VFS. This

peculiar technical detail does not currently pose any difficulty to our implementation.

However, in theory, it could be possible to implement a file system that does not have

a fixed inode number-to-file mapping, but rather assigns arbitrary inode numbers

to files every time the file system is mounted. Our Eraser prototype would not

be compatible with such a file system, and addressing this limitation would require

Eraser to employ a different method to uniquely identify files on that file system.

Swap Space & Hibernation Considerations

The secure deletion guarantees provided by our approach require that file keys are

never written to physical storage without first being encrypted by a parent key. Like-

wise, the master key must never be persisted outside its designated secure vault.

These conditions could easily be satisfied by keeping the keys in volatile memory

protected by the kernel while in use. However, we stress that implementations should

also take the necessary precautions to prevent inadvertent leakage of keys in case the

system goes into hibernation, or when memory pages are swapped to non-volatile

storage. Specifically, sensitive memory areas containing key caches should be marked


non-swappable, and before entering hibernation, all key caches must be written back

to persistent storage and their corresponding memory regions sanitized.

Unavailability of Master Key Vault

As part of Eraser’s normal operation it would be necessary to frequently rotate the

master key stored in the external vault. However, should the vault become inacces-

sible for any reason (e.g., a removable storage device acting as the vault, such as a

smartcard, could be unplugged by the user), Eraser needs to take the appropriate

actions to prevent inadvertent loss of data on disk. One way to deal with such sit-

uations is to delay the master key rotation until the vault becomes available once

again.

Even if an Eraser partition needs to be taken offline under these conditions, the

direct children of an FKT could be encrypted with the old master key and persisted

to disk, which would temporarily forgo secure deletion. Later, the next time the vault

could be accessed, the master key would be rotated and all its direct children in the

FKT immediately re-encrypted to securely erase all previously deleted files. Note

that even in this scenario, an offline Eraser partition cannot be accessed again until

the vault becomes available, because the master key is required to unlock the FKT

on disk before the file system could be mounted.

Alternatively, in different threat environments that involve highly-sensitive files,

it could be preferable to rotate the master key as soon as files are deleted regardless of

the vault’s availability, and opt for having the file system become inaccessible should

the system be taken offline before the new key could be written to the vault. Such a

policy would instead sacrifice data integrity in favor of guaranteed secure deletion.


Users’ Perception of Secure Deletion

Finally, we should point out that Eraser is designed to securely delete files only when

a system call explicitly requests removal of the file inode in question. For instance, our

prototype implementation considers the unlink family of system calls as the trigger

for secure deletion. Of course, this could trivially be extended to cover other related

system calls, such as truncate, by intercepting their corresponding entry points as

well.

However, file system implementations may not always explicitly destroy inodes

even when, from a user’s perspective, it may appear that a file’s contents are being

deleted. For example, consider a scenario under Linux and Ext4 where a directory

contains two files X and Y. When a user executes the command “mv X Y” to replace the latter file with the former, the file system does not actually unlink Y. Instead, its

inode is reused, and only the data blocks of Y are overwritten. In other words,

Eraser would not consider this a file deletion event and would not securely delete

the contents of Y until the user later executes another command such as “rm Y”,

at which point all current and old data pointed to by that inode is securely deleted.

Therefore, users of Eraser should be aware of this semantic gap and limitation of

the system, and explicitly execute file deletion operations when secure deletion is

desired.

3.8 Summary

Even though the problem of irrevocably deleting data from non-volatile storage has

been explored by many researchers, flash-based storage media with opaque on-board

controllers, and journaling file systems with data replication features still make it a

challenging task to provide strong secure deletion guarantees on modern computers.


At the same time, previously practical secure deletion tools and techniques are rapidly

becoming obsolete, and are rendered ineffective.

In this chapter, we leveraged the well-known concept of cryptographic erasure to

design a novel, effective secure deletion technique called Eraser. Our work is dis-

tinct from the myriad of existing literature in this field in that Eraser can guarantee

secure deletion of files on storage media regardless of the underlying hardware’s char-

acteristics, treating storage devices as blackboxes. We achieve this by bootstrapping

cryptographic erasure with the help of an external, secure storage vault, which could

be implemented in practice using cheap, commodity hardware such as a TPM chip,

or a smartcard.

Eraser’s design and implementation fulfills all of our research goals laid out in

Section 1.5. (G1) We presented a practical implementation of Eraser, realized as a

stand-alone Linux block device driver that can be deployed and used on a commodity

computer with a TPM chip. (G2) Eraser partitions are exposed to user space

as virtual devices that behave identically to ordinary block storage media; they are

supported on any block-based hardware and can also be formatted with any file

system. (G3) Eraser requires explicit cooperation neither from applications, nor

users; it performs secure deletion automatically whenever a file is removed. (G4)

Finally, our implementation exhibits similar performance characteristics to dm-crypt,

and thus offers users a viable alternative disk encryption solution with the added

benefit of secure file deletion.


Chapter 4

HiVE: Hidden Volume Encryption

4.1 Overview

Full disk encryption is a common security technology used for protecting sensitive

information saved in a computer’s persistent storage. Today, many major operating

systems offer basic disk encryption solutions out of the box, and there also exists a

large pool of free and commercial tools that provide disk encryption technologies to

suit different security and privacy needs.

While disk encryption is a well-studied technology that is known to provide strong

security when implemented correctly, it is nevertheless vulnerable in the face of chang-

ing adversarial models of the modern day. Specifically, against powerful adversaries,

such as government and law-enforcement agencies, which may have the authority to

force users into disclosing their keys, basic disk encryption techniques become inef-

fective regardless of how strong the underlying cryptographic algorithms are.

To address this problem, certain disk encryption tools provide advanced features

that offer plausible deniability to their users. For example, TrueCrypt, a popular disk

encryption tool, allows users to create a second hidden volume inside an ordinary

83

Page 94: Retrofitting privacy into operating systems › files › neu:cj...privacy-sensitive data produced during short-lived program execution sessions con dential, (2) securely deleting

encrypted disk partition, using a separate key for encryption. In this scheme, data

blocks of the hidden volume are stored inside the seemingly-free blocks of the first

volume. Then, if the user is coerced into disclosing her encryption keys, she can reveal

only the key to the first partition and withhold the key to the hidden volume. Even

with full access to the primary volume, the adversary cannot tell whether a second

hidden volume exists, or more specifically, he cannot distinguish actual free blocks

from data blocks that are part of a potential hidden partition.

Unfortunately, as recognized by Czeskis et al. [2], this hidden volume scheme has

an important flaw. Namely, an adversary that has the ability to inspect multiple

snapshots of the disk at different times can guess with a high probability of success

whether a hidden volume exists. This is an important shortcoming since it is com-

mon for users to lose possession of their encrypted devices on multiple occasions, for

instance, while traveling (e.g., checking bags for multiple flights, border inspections

when entering and leaving a foreign country, leaving the device in a hotel room unat-

tended). The reason behind this vulnerability is the fact that TrueCrypt does not

make any attempts to hide disk access patterns. To explain intuitively, an adversary

can compare two disk snapshots and attempt to determine whether an improbably large

number of “free” disk blocks have been modified in between, which would give away

activity in a hidden volume.

In this chapter, we present a hidden volume encryption scheme that is secure

against adversaries with multiple-snapshot capabilities. We achieve this by using an

Oblivious RAM (ORAM) as a building block to hide disk access patterns, and then

refine our basic construction in several steps to present a final scheme we call Hive.

We demonstrate that our design can be implemented as a standard block device driver

on Linux, allowing users and applications to interact with Hive volumes in the exact

same manner as they would with ordinary disk partitions.


4.2 Threat Model

The primary threat we target with this work is a coercion attack, whereby an adver-

sary aims to defeat disk encryption and gain unauthorized access to privacy-sensitive

data by forcing the disk’s owner to willfully reveal her encryption keys. Typical exam-

ples of such adversaries include government and law-enforcement agencies that often

carry the authority to compel users to reveal their secret keys, for instance, through

a court order.

Following the standard adversarial model for this attack, and as a precursor to

coercion, we assume that an attacker has the capability to inspect a disk of interest

in order to identify any encrypted data stored on it. However, we further assume that

the attackers may access and inspect the same disk more than once, and can compare

multiple disk snapshots taken at different times.

In the rest of this chapter, we refer to such scenarios as multiple-snapshot attacks.

As we describe in the next section, Hive aims to enable users to create hidden

encrypted partitions on their disk, and plausibly deny their existence in the face of

multiple-snapshot attackers.

4.3 Design

A key observation in our design of a hidden volume scheme resistant to multiple-

snapshot adversaries is that access patterns to encrypted volumes need to be hidden.

To this end, we first present a naïve, generic scheme using a standard ORAM. ORAM

is a block-based oblivious data structure; in other words, it does not reveal any

information about the sequence of read and write operations performed on its data

store. Thus, it is a natural fit for our purposes. ORAM specifics have been widely


studied by computer science researchers and we refer the readers to the large body of

previous work for more details (e.g., [70, 71, 72]).

In the following, we first present a basic, generic hidden volume scheme that

is resistant to multiple-snapshot attacks, but performs poorly due to the inclusion of ORAMs as its building blocks. We then discuss

refinements and optimizations to this scheme in iterations, and finally present Hive,

a practical hidden volume encryption scheme.

4.3.1 Model

For the hidden volume encryption schemes presented in this chapter, we assume the following model. The scheme gives a user access to max encrypted volumes Vi, of which the user can choose to set up and use any number l < max. Each Vi is encrypted with a key Ki derived from a password Pi, and consists of ni blocks of B bytes each. The total size of the disk is N blocks. The hidden volume scheme works in such a way that, given that an adversary has access to fewer passwords than the total number of volumes present, and can inspect multiple snapshots taken from the disk, he will be uncertain about the real value of l.

4.3.2 Generic Hidden Volume Encryption

Our generic scheme uses max ORAMs as its storage units, each holding the data

for a corresponding encrypted volume Vi. The volume read and write operations are

performed as described in Algorithms 1 and 2. Intuitively, a write into Vi writes the

actual data into ORAMi, and then executes, for each of the remaining ORAMs, a dummy write that does not change the data stored in the volume. Similarly, a read operation

for Vi reads the requested data from ORAMi, and then performs dummy writes to all


ORAMs. Note that when not all max volumes are in use, and consequently there are no corresponding passwords for the unused ones, those unused ORAMs are replaced by a simulator S which executes dummy operations that look identical to real operations to an adversary.

Algorithm 1 Generic Hidden Volume Write
Input: volume v, block b, data d, keys <K1, . . . , Kmax>
for i ← 1 to max do
    if i = v then
        ORAMi.write(b, d, Ki)
    else
        r ← random({1, . . . , ni})
        dummy ← ORAMi.read(r, Ki)
        ORAMi.write(r, dummy, Ki)
    end if
end for

Algorithm 2 Generic Hidden Volume Read
Input: volume v, block b, keys <K1, . . . , Kmax>
Output: data d
d ← ORAMv.read(b, Kv)
for i ← 1 to max do
    r ← random({1, . . . , ni})
    dummy ← ORAMi.read(r, Ki)
    ORAMi.write(r, dummy, Ki)
end for
return d
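To make the flow of Algorithms 1 and 2 concrete, the following Python sketch outlines the generic scheme; the ORAM objects, with hypothetical read(b, key) and write(b, d, key) methods, stand in for any standard ORAM construction and are not part of the actual implementation.

    import random

    def hidden_write(orams, keys, n_blocks, v, b, d):
        # The real write goes to ORAM v; every other ORAM receives a dummy
        # write that re-encrypts a random block without changing its data.
        for i, oram in enumerate(orams):
            if i == v:
                oram.write(b, d, keys[i])
            else:
                r = random.randrange(n_blocks[i])
                dummy = oram.read(r, keys[i])
                oram.write(r, dummy, keys[i])

    def hidden_read(orams, keys, n_blocks, v, b):
        # Read the requested block, then dummy-write every ORAM so that,
        # on disk, reads are indistinguishable from writes.
        d = orams[v].read(b, keys[v])
        for i, oram in enumerate(orams):
            r = random.randrange(n_blocks[i])
            dummy = oram.read(r, keys[i])
            oram.write(r, dummy, keys[i])
        return d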

This construction has two important properties:

• Write patterns to encrypted volumes are hidden. This is guaranteed by definition through the use of ORAMs to represent encrypted volumes.

• Writes to hidden volumes can be plausibly denied. This is possible because any

read operation results in dummy writes to all volumes, which covers for a write

into a hidden volume. In effect, with the described scheme, all disk operations look the same to an adversary, regardless of which volume is being used and what operation is being performed.


Algorithm 3 Write-Only ORAM Write
Input: block b, data d, key K
S ← random_subset({1, . . . , N}, k)
β ← random(S), where β is a free block
Disk.write(β, EncK(d))
Map[b] ← β
for all β′ in (S − β) do
    if β′ is free then
        Disk.write(β′, random_bytes(B))
    else                        ▷ β′ holds data
        d ← DecK(Disk.read(β′))
        Disk.write(β′, EncK(d))
    end if
end for

Algorithm 4 Write-Only ORAM Read
Input: block b, key K
Output: data d
β ← Map[b]
d ← DecK(Disk.read(β))
return d

4.3.3 Write-Only ORAM Construction

Note that an adversary inspecting snapshots of a disk would not be able to observe

read operations, as reads do not leave any trace on the disk. Consequently, hiding disk read patterns to an encrypted volume does not provide any additional security; hiding the write patterns alone is sufficient. This means that the ORAMs we

use in the described generic hidden volume scheme are more powerful than required.

Therefore, in this section, we describe a more efficient write-only ORAM construction

to use with our hidden volume scheme.

This write-only ORAM construction uses a data structure Map to map a virtual

block b in the ORAM to a physical sector β on the disk. Map is kept in volatile

memory, and thus, is not visible to an adversary inspecting a disk snapshot. We also


assume that the disk has at least twice the amount of space allocated for the ORAM

(i.e., N ≥ 2n). Finally, we require that the cryptographic operations utilized realize

IND$-CPA encryption, meaning that the ciphertext produced is indistinguishable from

random strings [73].

The write-only ORAM write and read operations are defined in Algorithms 3 and

4. To perform a write, we first pick a set S of k random disk sectors (we discuss the

considerations for choosing a concrete value for k in the following sections). Then

we choose a random sector β from S, where β is free, meaning that it is not mapped

to any ORAM block b. The data is encrypted and written to β on disk, and Map

is updated to reflect this. The remaining k − 1 sectors in S are either overwritten

with randomized strings if they are free, or are re-encrypted and written back if they

contain data. Because the k sectors are chosen uniformly at random and the ciphertext

on disk is indistinguishable from a random string, this construction does not reveal

any information about b or d to the adversary.

As this construction does not attempt to hide read patterns, the ORAM read

operation is trivial. It only involves resolving the requested logical ORAM block b

to the corresponding disk sector β through a Map lookup, and then performing the

read normally.
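The following Python sketch condenses Algorithms 3 and 4. The enc and dec helpers below are insecure, illustrative stand-ins for an IND$-CPA cipher (a real implementation would use, e.g., AES in CTR mode with fresh random nonces), and the class layout is ours, not the actual Hive code.

    import hashlib, os, random

    def keystream(key, nonce, length):
        # Toy hash-based keystream; illustration only, not secure.
        out, ctr = b'', 0
        while len(out) < length:
            out += hashlib.sha256(key + nonce + ctr.to_bytes(8, 'big')).digest()
            ctr += 1
        return out[:length]

    def enc(key, data):
        nonce = os.urandom(16)  # fresh randomness makes ciphertexts look random
        return nonce + bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))

    def dec(key, blob):
        nonce, ct = blob[:16], blob[16:]
        return bytes(a ^ b for a, b in zip(ct, keystream(key, nonce, len(ct))))

    class WriteOnlyORAM:
        def __init__(self, key, N, k, B):
            self.key, self.N, self.k, self.B = key, N, k, B
            self.disk = [os.urandom(B + 16) for _ in range(N)]  # randomized free sectors
            self.map = {}      # logical block -> physical sector (kept in memory)
            self.used = set()  # sectors currently holding mapped data

        def write(self, b, d):
            S = random.sample(range(self.N), self.k)
            beta = next(s for s in S if s not in self.used)  # assumes a free sector; see 4.3.4
            old = self.map.get(b)
            if old is not None:
                self.used.discard(old)          # the stale copy of b becomes free
            self.disk[beta] = enc(self.key, d)
            self.map[b] = beta
            self.used.add(beta)
            for s in S:
                if s == beta:
                    continue
                if s in self.used:              # re-encrypt resident data in place
                    self.disk[s] = enc(self.key, dec(self.key, self.disk[s]))
                else:                           # overwrite free sector with randomness
                    self.disk[s] = os.urandom(self.B + 16)

        def read(self, b):
            return dec(self.key, self.disk[self.map[b]])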

4.3.4 Choosing the Parameter k

In order to successfully execute a disk write operation, the choice of k must ensure

that there exists at least one free block in S into which we can write the data.

Recall our assumption that N ≥ 2n; that is, at least half of the ORAM’s underly-

ing storage disk is left empty. Then, the probability of a randomly chosen block from

the disk being empty is at least 1/2. Let X be a random variable that, when selecting


k blocks uniformly from the N available, describes the number of free blocks among those k. As N is typically large compared to k, we approximate X with a binomial distribution. Then, P[X ≥ 1] = 1 − P[X = 0] ≈ 1 − C(k,0) · (n/N)^k = 1 − 2^(−k) for N = 2n. In other words, the probability of not finding any free block is a negligibly small 2^(−k), at the

cost of doubling the disk space requirement.
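As a quick sanity check of this bound, the following snippet (with arbitrary parameters of our choosing) empirically estimates the probability that a random k-subset of a half-full disk contains no free sector:

    import random

    def p_no_free(N, n, k, trials=100_000):
        used = set(range(n))   # n of N sectors hold data
        fails = sum(1 for _ in range(trials)
                    if all(s in used for s in random.sample(range(N), k)))
        return fails / trials

    print(p_no_free(N=1 << 16, n=1 << 15, k=3))   # close to 2**-3 = 0.125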

4.3.5 Write-Only ORAM Optimizations

There are two important limitations of the write-only ORAM construction we pre-

sented above. First, we need to perform k disk operations for each ORAM write,

which would impact performance for large values of k. Second, storing Map in mem-

ory could be excessively expensive with large disks.

To address the first problem we introduce a stash optimization. Recall that the

probability of selecting a random free block on disk is at least 1/2. This means that,

for a single write operation, our ORAM scheme will pick k/2 free blocks in expec-

tation; however, we need only one to write our data block into. Our optimization

technique exploits this fact to allow for very small values of k (e.g., our Linux imple-

mentation uses k = 3). Specifically, we extend our construction with an in-memory

data block queue, or stash. During a write operation, if there is no free block among

the k selected, we instead temporarily store the data in the stash. Otherwise, if there

are multiple free blocks in our selection, we write the pending blocks from the stash

into the excess free blocks.
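Building on the WriteOnlyORAM sketch from Section 4.3.3 (and reusing its enc and dec stand-ins), the stash optimization can be expressed roughly as follows; the deque-based stash is our illustrative choice, not the actual implementation's data structure.

    import os, random
    from collections import deque

    def write_with_stash(oram, stash, b, d):
        # Queue the incoming block, then drain pending blocks into whatever
        # free sectors the random selection S happens to contain.
        stash.append((b, d))
        S = random.sample(range(oram.N), oram.k)
        for s in S:
            if s in oram.used:
                # Sector holds data: re-encrypt it in place.
                oram.disk[s] = enc(oram.key, dec(oram.key, oram.disk[s]))
            elif stash:
                # Free sector and pending data: place the oldest stashed block.
                blk, data = stash.popleft()
                old = oram.map.pop(blk, None)
                if old is not None:
                    oram.used.discard(old)      # invalidate the stale copy
                oram.disk[s] = enc(oram.key, data)
                oram.map[blk] = s
                oram.used.add(s)
            else:
                # Free sector, empty stash: overwrite with fresh randomness.
                oram.disk[s] = os.urandom(oram.B + 16)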

To bound the size of the stash, and thus, memory use, we use a standard queuing

argument. We model our stash as a D/M/1 queue with a deterministic arrival rate

of γ = 1 and service times exponentially distributed with a rate parameter µ = k/2.

Then, as shown by [74], the steady state probability P of having i items in the stash


at any time is P = (1 − δ) · δ^i, where δ is the root of the equation δ = e^(−µγ(1−δ)) with the smallest absolute value. If µ is larger than 1, then δ < 1, and the steady state probability of having i blocks in the stash will be O(2^(−i)). As a result, we can set k to a small constant, for example, k = 3, to find δ = 0.41718, and we can bound the probability of overflowing the stash at 2^(−64) using a stash size of only 50 blocks.
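The constants above can be reproduced numerically; the following snippet solves the fixed-point equation by iteration for k = 3 (so µ = 1.5 and γ = 1) and evaluates the overflow bound for a 50-block stash:

    import math

    mu, gamma = 1.5, 1.0
    delta = 0.5
    for _ in range(200):                        # fixed-point iteration converges
        delta = math.exp(-mu * gamma * (1 - delta))
    print(round(delta, 5))                      # 0.41718
    # P[more than 50 blocks in the stash] = delta**51:
    print(math.log2(delta ** 51))               # about -64, i.e., ~2**-64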

We address the second problem of large Map sizes by adopting a standard tech-

nique that involves storing the mapping recursively in smaller ORAMs [71, 75, 76]. If

our block size B is at least χ · log N for some constant χ > 2, then we are guaranteed

that another ORAM recursively holding our map will have its own map no greater

than half the size of the original. After O(log n) recursive ORAMs, we will have a

constant size map that can be stored in memory. Of course, this slightly increases the

communication complexity since we now have to access O(log n) recursive ORAMs

to map ORAM blocks to disk sectors.

4.3.6 Hidden Volume Encryption with HiVE

While the combination of our write-only ORAM with the generic scheme we pre-

sented previously provides a secure hidden volume encryption technique, it has one

final practical limitation. Namely, our generic scheme uses max separate ORAMs to

support max volumes. In turn, each I/O on any given volume performs additional I/O

operations on all ORAMs, resulting in a complexity dependent on the value of max.

Our final refinement of the hidden volume encryption scheme, which we call Hive,

addresses this problem by storing all volumes interleaved on the disk inside a single

ORAM. Blocks of all volumes are mapped to random disk sectors, and mappings are

updated randomly using our ORAM mechanism every time a block is written to (see

Figure 4.1).


Algorithm 5 Hive Write
Input: volume v, block b, data d, keys <K1, . . . , Kmax>
Stashv.enqueue(b, d)
S ← random_subset({1, . . . , N}, k)
for all β in S do
    for i ← 1 to max do
        if β is block b′ in Vi then
            d′ ← DecKi(Disk.read(β))
            Stashi.enqueue(b′, d′)
            break
        end if
    end for
end for
i ← 1
for all β in S do
    while i ≤ max and Stashi = ∅ do
        i ← i + 1
    end while
    if i > max then
        break
    end if
    (b′, d′) ← Stashi.dequeue()
    Disk.write(β, EncKi(d′))
    Mapi[b′] ← β
    S ← (S − β)
end for
for all β in S do
    Disk.write(β, random_bytes(B))
    Mapi.dummy_write()
end for


Algorithm 6 Hive Read
Input: volume v, block b, keys <K1, . . . , Kmax>
Output: data d
if b in Stashv then
    d ← Stashv.dequeue()
    return d
end if
β ← Mapv[b]
d ← DecKv(Disk.read(β))
Hive.dummy_write()
return d

[Figure 4.1: Hive stores volumes interleaved on disk. A write to block b of a volume updates its mapping Map[b] from one randomly chosen disk sector β to a new one β′.]


Details of the Hive write and read operations are shown in Algorithms 5 and

6. Note that, in line with our recursive block translation map design, the Map

structures referred to in the algorithms are actually separate Hive instances as well.

Consequently, they must also be accessed with the presented Hive read and write

routines in a recursive manner. For brevity, however, we use the simple array-like

notation in the algorithm listings.

We point out that combining all encrypted volumes together in a single backing

ORAM has an important implication on security: writes to volumes may influence

each other. To illustrate, when we pick k random blocks to form S for a write to a given volume, this set can contain both free blocks and blocks used by other volumes. In this case, we cannot simply choose to use a free block, as that would create a write pattern that deliberately avoids certain used blocks, and consequently, would undermine the security of the scheme. To address this problem,

Hive utilizes separate stashes for each volume. When performing a write, the data

to be written, and then all used blocks among the randomly chosen k blocks are read

from disk, and enqueued to their corresponding stashes. As a result, k blocks are now

freed on the disk. Next, these k free spaces are filled with the pending blocks in the

stashes, with stashes for lower volumes taking priority. This ensures that writes to

higher volumes cannot influence writes to lower volumes, solving the aforementioned

problem of leaking write patterns. Finally, if all stashes are empty and any of the k

freed blocks remain unprocessed, they are filled with randomized blocks. Note that

when randomizing blocks in this way we still need to perform a dummy update on

the corresponding Map, because the maps are recursive instances of Hive as well.

During the Hive read operation, the requested block is read as usual. However,

as a final step, a dummy write should be performed according to Algorithm 5. Once


again, this ensures that reads and writes look identical to an adversary, and also

provides an opportunity for writing stashed blocks to disk.
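The following Python sketch condenses Algorithms 5 and 6, reusing the enc and dec stand-ins from the earlier write-only ORAM example. The per-volume maps here are plain dictionaries rather than recursive Hive instances, and the owner dictionary is an illustrative in-memory analogue of the reverse maps used for detecting whether a sector is free; none of these names come from the actual implementation.

    import os, random
    from collections import deque

    class HiveSketch:
        def __init__(self, keys, N, k, B):
            self.keys, self.N, self.k, self.B = keys, N, k, B
            self.max = len(keys)
            self.disk = [os.urandom(B + 16) for _ in range(N)]
            self.map = [dict() for _ in range(self.max)]     # per-volume block maps
            self.stash = [deque() for _ in range(self.max)]  # per-volume stashes
            self.owner = {}   # sector -> (volume, block) reverse map

        def write(self, v, b, d):
            self.stash[v].append((b, d))
            S = random.sample(range(self.N), self.k)
            # Evict every resident block in S into its owner volume's stash.
            for beta in S:
                if beta in self.owner:
                    i, blk = self.owner.pop(beta)
                    del self.map[i][blk]
                    # Skip the stale copy if a newer version is already pending.
                    if not any(qb == blk for qb, _ in self.stash[i]):
                        self.stash[i].append((blk, dec(self.keys[i], self.disk[beta])))
            # Refill S from the stashes, lower-numbered volumes first; any
            # leftover sectors are overwritten with randomness.
            for beta in S:
                i = next((j for j in range(self.max) if self.stash[j]), None)
                if i is None:
                    self.disk[beta] = os.urandom(self.B + 16)
                else:
                    blk, data = self.stash[i].popleft()
                    old = self.map[i].pop(blk, None)    # drop any stale mapping
                    if old is not None:
                        self.owner.pop(old, None)
                    self.disk[beta] = enc(self.keys[i], data)
                    self.map[i][blk] = beta
                    self.owner[beta] = (i, blk)

        def read(self, v, b):
            for qb, qd in reversed(self.stash[v]):  # newest pending copy first
                if qb == b:
                    return qd
            d = dec(self.keys[v], self.disk[self.map[v][b]])
            # A full implementation would now perform a dummy write (Algorithm 6).
            return d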

4.4 Implementation

We implemented Hive for Linux as a combination of a kernel module and a user space

helper application. The kernel module allows us to expose Hive as a virtual Linux

block device that behaves and can be used like any physical block device installed on

a machine. The helper application allows users to create and manage Hive instances

on any system partition.

HiVE Volumes

Similar to our Eraser prototype previously discussed in Chapter 3, Hive’s im-

plementation makes use of the device-mapper framework provided by the kernel.

Device-mapper allows us to map any part of a hardware block device onto virtual

block devices, which the users then interact with. As a result, Hive sits between the

Linux block I/O layer and the underlying hardware, intercepts the I/O requests in

flight, and modifies or redirects them to different disk sectors as necessary to imple-

ment the hidden volume scheme we described. Note that our implementation is not

tied to the low-level block device drivers, and works on any underlying block device,

including hard disks and USB sticks. Likewise, since device-mapper resides below

the Linux virtual file system (VFS), Hive volumes can be formatted with any file system available to the user.

The cryptographic algorithms we use in our implementation are AES-256 in CBC

mode for volume encryption, AES-256 in CTR mode for efficiently generating ran-


domized blocks, and PBKDF2 for deriving the volume encryption keys from user-selected passwords.
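For illustration, a per-volume key derivation with PBKDF2 looks as follows in Python; the iteration count and salt handling here are placeholders of our choosing, not the parameters of the actual implementation.

    import hashlib, os

    def derive_volume_key(password: bytes, salt: bytes) -> bytes:
        # PBKDF2-HMAC-SHA256, producing a 32-byte key for AES-256.
        return hashlib.pbkdf2_hmac('sha256', password, salt, 100_000, dklen=32)

    salt = os.urandom(16)   # would be stored in the volume metadata
    key = derive_volume_key(b'volume passphrase', salt)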

Disk I/O Optimizations

Recall that each data block sent to a Hive volume is written to a uniformly randomly chosen disk sector. This is an inherent characteristic of our write-only

ORAM construction that is required to satisfy the property that write patterns cannot

be observed in a disk snapshot. Unfortunately, this also means that all I/O operations

performed on Hive volumes result in random disk accesses, which can be significantly

slower than typical sequential access patterns.

We perform two optimizations to alleviate this problem. First, we set the block

size of Hive volumes to 4 KB (i.e., the maximum allowed by Linux on the x86

architecture) regardless of the underlying storage medium’s physical sector size (which

is usually 512 B). This forces the kernel to issue I/O requests in larger chunks, reduces

the overall number of block writes and random disk seeks necessary during operation,

and greatly reduces the I/O overhead.

Next, we disable all kernel I/O reordering and scheduling optimizations for Hive

volumes. Because Hive strictly performs random access to the disk, it cannot benefit

from these disk access pattern anticipation features. On the contrary, disabling them

improves performance by eliminating the overhead of these unnecessary optimization

routines.

Managing HiVE Partitions

The Hive user space tool is a management interface that allows users to create Hive

instances on top of their hardware devices. The tool automatically computes the size

requirements for each volume and the recursive map Hive instances, partitions the


device accordingly, sets up the required metadata such as encryption IVs and reverse maps used for detecting whether a given disk sector is free, and allows users to set their passwords.

                Write     Read      Create      Stat        Delete
                (MB/s)    (MB/s)    (files/s)   (files/s)   (files/s)
    Raw disk    216.04    221.74    82290       201180      105100
    Hive          0.97      0.99     1570         3230        1790

Table 4.1: Disk I/O and file system performance of Hive. System parameters chosen as l = 2, k = 3.

Once a Hive instance is created, the management tool enables users to create

virtual block devices that represent Hive volumes by supplying their passwords,

and then format, mount, and interact with them as usual. Users can also view

active instances of Hive and their volumes; this information is made available to the

management tool by the Hive driver through a /proc node.

The management tool performs all of its functionality automatically by commu-

nicating with the device-mapper framework through the appropriate ioctl calls.

4.5 Evaluation

We tested our implementation on a standard desktop computer with an Intel i7-930

CPU, 9 GB RAM, running Arch Linux x86-64 with kernel version 3.13.6. As the

underlying block device, we used an off-the-shelf Samsung 840 EVO SSD. For the

evaluation, we used Bonnie++, a standard disk and file system benchmarking tool.

We first tested an Ext4 file system with 4 KB blocks on the raw disk to get a baseline.

We then created 2 hidden volumes on our disk and set l = 2 and k = 3. We repeated

the experiments by running Bonnie++ on an Ext4 file system created on top of the


Hive volume. Table 4.1 presents the results averaged over 5 runs with a relative

standard deviation of < 6%.

These results show that I/O operations (i.e., writes and reads) were slower by

a factor of ≈ 200, while file system operations (i.e., create, stat, and delete) were

slower by a factor of 50 to 60. Random seek performance was not measurable on the

raw SSD (i.e., Bonnie++ reported that the tests completed too quickly to measure

reliable timings), whereas Hive achieved 1200 seeks/s. The Hive-induced CPU uti-

lization was low (< 1%) during measurements, indicating that random access I/O

constitutes the main bottleneck.

We conclude that the measured slowdown is certainly significant, and that Hive is

not a suitable substitute for general-purpose full disk encryption solutions. However, a

throughput of 1 MB/s on an off-the-shelf disk would be acceptable in many high-risk

scenarios that involve highly-sensitive data, rendering Hive practical for the intended

real-world cases.

4.6 Related Work

The concept of deniable encryption was first explored in detail by Canetti et al. [77]. There exists a large body of free and commercial disk encryption tools

that provide various plausible deniability solutions (e.g., [78, 79, 80, 81]). These tools

do not hide access patterns to the disk, and as a result, repeated writes to hidden

volumes can be detected by a multiple-snapshot adversary.

Anderson et al. [82] present StegFS, and describe two techniques to hide data on

a disk. The first one utilizes a set of randomized cover files that are later modified to hide data in. The plaintext can then be retrieved as a linear combination of these files. This


technique does not offer deniability if an attacker has knowledge of some of the files

on the disk, and it is not secure against multiple-snapshot adversaries.

The second technique, and its extension [83], hides files among randomly-filled

disk blocks, where data blocks are stored in locations derived from the file name and

a password. To avoid possible collisions, [84] instead iterates over the disk blocks following the initial one, and writes to the first free block found. This approach sidesteps the problem of known files, but it is still vulnerable to multiple-snapshot attacks.

Other research focuses on providing plausibly deniable encryption specifically on

mobile devices [85, 86, 87, 88]. Once again, these techniques are not designed to be

resilient in the face of multiple-snapshot attacks.

Paterson and Strefler [89] describe a practical attack on an earlier implementation

of our work. Specifically, Hive’s previous use of the RC4 stream cipher to generate

random blocks was vulnerable, allowing an attacker to distinguish between random-

ized dummy blocks and actual encrypted data on disk, and consequently, to break

Hive’s security guarantees. The current Hive implementation instead uses AES-256

in CTR mode for generating randomized blocks, and is no longer vulnerable to this

attack.

4.7 Summary

Even though full disk encryption is recognized as an effective way to guarantee con-

fidentiality of sensitive information on a computer’s disk, this basic cryptographic

technique is bound to fail in the face of adversaries with the ability to coerce users

into revealing their secret keys. Advanced tools offer plausibly deniable disk encryp-

tion schemes to address this weakness; however, more powerful adversaries that can


inspect multiple snapshots of a disk taken at different times can still successfully

detect hidden encrypted partitions.

In this chapter, we presented Hive, a practical hidden volume encryption scheme

resistant against multiple-snapshot adversaries. Our performance evaluation indicates

that Hive may not be suitable for everyday use, or for replacing regular disk encryp-

tion technologies. However, it provides strong privacy guarantees and reasonable per-

formance in scenarios involving highly-sensitive data, where a security-to-performance

trade-off may be acceptable.

The proposed design and implementation satisfy all of our research goals stated

in Section 1.5. (G1) Hive is realized on a standard Linux system using a well-known

kernel framework; the concept of virtual block devices is applicable to other major

operating systems. (G2) Hive volumes function exactly like ordinary block devices;

they are supported on any block-based storage hardware, and can be formatted with

any file system. (G3) Likewise, both applications and users access Hive volumes just

like any other disk partition available on a system; encryption and privacy features

are offered transparently. (G4) Finally, both the presented write-only ORAM con-

struction, and the final Hive scheme perform reasonably well for the intended use

cases, and are suitable for practical use.


Chapter 5

Overhaul:

Input-Driven Access Control on

Traditional Operating Systems

5.1 Overview

The prevailing security model for traditional operating systems focuses on protect-

ing users from each other. For instance, the UNIX access control model provides a

framework for isolating users from each other through a combination of user iden-

tifiers, group identifiers, and process-based protection domains. The fundamental

assumption underlying this approach to security is that the primary threat to user

data originates from other users of a shared computing system.

The traditional user-based security model makes sense in the context of timeshar-

ing systems, where many users share access to a common pool of computing resources.

However, the modern proliferation of inexpensive and powerful computing devices has

resulted in the common scenario where one user has sole access to a set of resources.


Unfortunately, there exists a significant impedance mismatch between user-based ac-

cess control and the primary security threat in the single-user scenario, where users

inadvertently execute malicious programs that operate with that user’s privilege and

have full access to all of the user’s sensitive computing resources. As such, user-based

access control is not well-suited to preventing attacks against user confidentiality. In

particular, malicious programs can access privacy-sensitive hardware devices such as

the microphone or camera, or access virtual resources such as the system clipboard

and display contents of other programs.

In response to the changing computing landscape, much effort has been invested

in extending the user-based access control model to enable dynamic, user-driven

security. For instance, modern operating systems for smartphone and tablet devices

have taken the opportunity provided by these new platforms to introduce permission

systems as an extension to the underlying UNIX security model that remains in use

on these systems. For example, iOS gives users the ability to approve or deny access

to sensitive resources during runtime via popup prompts. Research operating systems

have also proven a fertile milieu for experimenting with security models that address

the needs of modern computing systems. For instance, Roesner et al. [90] present

an extension to ServiceOS where gadgets are embedded into applications that allow

users to grant or deny access to sensitive resources.

In each of the preceding examples, determining legitimate user intent and translat-

ing that intent into appropriate security policies is a central feature of their respective

security models. For each system, security decisions as to whether to allow or deny

access to sensitive resources for individual programs are delegated to the user, and the

system is responsible for establishing trusted input and output paths to capture user

intent such that malicious programs cannot influence this process by either spoofing

or intercepting user inputs.


We fundamentally agree with this approach to securing modern computing de-

vices, since users are often the only party capable of classifying program actions as privacy

violations or other inappropriate uses of their resources. However, one drawback of

these efforts is that applications and operating systems must be written with this

security model in mind. This requirement largely excludes traditional operating sys-

tems such as Windows, Linux, and OS X, which remain in wide use, from enjoying

the benefits of user-driven access control.

In this chapter, we show that providing a user-driven security model for protect-

ing privacy-sensitive computing resources can be realized on traditional operating

systems, as an extension to the traditional user-based security model. In particular,

our security model is based on the observation that a legitimate application usually

accesses privacy-sensitive devices immediately after the user interacts with that ap-

plication (e.g., by clicking on a button to turn on the camera, or pressing the key

combination for a copy & paste operation). We call this security model input-driven

access control, and demonstrate how it can be enforced by correlating user input

events with security-sensitive operations based on their temporal proximity, making

access control policy decisions automatically based on this information, and notify-

ing the user of resource accesses in an unintrusive manner. We achieve this by using

lightweight and generic techniques to augment the operating system and display man-

ager with trusted input and output paths, which we collectively call Overhaul, and

demonstrate our approach by implementing a prototype for Linux and the X Window System.

In contrast to prior work, we show that capturing user interaction as a basis for

security decisions involving sensitive resources can be performed in an application-

transparent manner, obviating the requirement that applications be rewritten to con-

form to special APIs or with a more refined security model in mind. Using our


approach, we demonstrate how dynamic access control can be transparently achieved

for common resources such as the microphone, camera, clipboard, and display con-

tents. Finally, we show that this can be achieved without a discernible performance

impact, and without utilizing intrusive prompts or other changes to the way users

interact with traditional operating systems.

5.2 Threat Model

Input-driven access control primarily addresses two privacy breach scenarios. The first

one covers programs that stealthily run in the background and access privacy-sensitive

resources without the user’s knowledge, behavior typical of malware [91, 92, 93, 94].

Overhaul ensures that such attempts are automatically blocked.

The second scenario involves “benign-but-buggy” or misbehaving applications

that access protected resources without the user's knowledge. Due to the trade-offs Overhaul makes in order to transparently retrofit dynamic access control into existing systems, unlike previous work [90], it is not possible to match each input

event to a precise user intent. Therefore, in this scenario, Overhaul instead visu-

ally notifies the user to alert her of the undesired resource access.

For this work, we assume that the trusted computing base includes the display

manager, OS kernel, and underlying software and hardware stack. Therefore, we

assume that these components of the system are free of malicious code, and that

normal user-based access control prevents attackers from running malicious code with

super user privileges. On the other hand, we assume that the user can install and

execute programs from arbitrary untrusted sources, and therefore, that malicious

code can execute with the privileges of the user. We assume that complementary


preventive security mechanisms are in place to prevent privilege escalation attacks,

such as ASLR or DEP.

We note that all forms of user-driven security are fundamentally vulnerable to full

mimicry attacks. For instance, if a user could be tricked into knowingly installing,

executing, and granting privileges to a malicious application that imitates a well-

known legitimate application, user-driven security models would fail to provide any

protection. Hence, our threat model does not include this third scenario.

5.3 Design

The architecture of an Overhaul-enhanced system requires modifications to and

close interaction between several components of the operating system and display

manager. In this section, we describe the abstract design of Overhaul, independent

of the underlying operating system, and present the challenges involved in monitoring

and tracking user input across process boundaries.

Note that our work assumes a user space display manager (i.e., a design similar

to that of the X Window System), an approach employed by popular commodity

operating systems. Different OS designs can allow display managers integrated into

the kernel, which would alleviate the need for some of the components we describe

below, such as a separate trusted communication channel between the kernel and the

display manager. Our design can be applied to that case in a straightforward manner.

5.3.1 Security Properties

The primary security goals Overhaul aims to achieve through input-driven access

control are the following:


(S1) Overhaul must allow an application to access privacy-sensitive resources only

if the user has explicitly interacted with that application through physical, hard-

ware input devices, immediately before the access request. Resources include

hardware devices such as cameras, microphones, and other sensors, or virtual

resources such as the system clipboard and display contents of user programs.

(S2) Overhaul must prevent programs from forging input events or mimicking user

interaction to escalate their (or other applications’) privileges.

(S3) Overhaul must ensure that legitimate user interaction events cannot be hi-

jacked by malicious applications, such that users should not mistakenly grant

permissions to a malicious program that were intended for a legitimate program.

(S4) Overhaul must notify users of successful accesses to protected resources via

a trusted output path that cannot be obscured or interfered with by other

applications.

5.3.2 Trusted Input & Output Paths

In order to realize any of the aforementioned security goals, Overhaul must es-

tablish a trusted path for user input. By a trusted path, we refer to the property

that input events should be authenticated as legitimately issued by a real user with

a hardware input device, as opposed to synthetic input events that can be issued

programmatically. This capability serves as a generally useful primitive that could

be exposed to higher layers of the software stack. However, in this chapter we focus

on illustrating its use for transparently securing access to system-wide resources.

The display manager of the system is often responsible for receiving all low-level

input events, including mouse clicks and key presses, from device drivers and deliv-

ering them to their target application windows. Consequently, Overhaul utilizes a


display manager with an enhanced input dispatching mechanism that can detect and

filter out synthetically generated inputs to fulfill the trusted input path requirement.

Likewise, Overhaul is tasked with establishing a trusted output path to alert

users whenever a sensitive resource access request is granted. We achieve this through

visual notifications that appear on the screen. Since the display manager is in control

of the screen contents, Overhaul extends it with an overlay notification mecha-

nism that is always stacked on top of the screen contents, and cannot be obscured,

interrupted, or interfered with by other processes.

5.3.3 Permission Adjustments

The kernel is responsible for dynamically adjusting the privilege level of user pro-

grams in response to permission granting actions, or in other words, authentic user

input events. In order to accomplish this task, the kernel first needs to establish

a secure communication channel to the display manager. The display manager can

then use this channel to send the kernel interaction notifications each time the user

interacts with an application. Since the display manager is often a regular user space

process, the kernel is able to authenticate the communication endpoint and ignore

communication attempts by other processes in a straightforward manner.

The kernel keeps a history of these interaction notifications, which include the

identity of the application that received the interaction and a timestamp, inside a

permission monitor. Once this information is stored, the permission monitor can

respond to permission queries and adjustment requests, originating either from the

user space display manager through the already established secure communication

channel, or from within the kernel, any time a permission decision is to be made.

This decision process involves comparing a timestamp issued together with the query


with the stored interaction timestamp corresponding to the target application, and in

this way correlating privileged operations with input events based on their temporal

proximity.
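A minimal Python sketch of this correlation logic follows; the class name, the in-memory history, and the threshold value are illustrative, not Overhaul's actual kernel code.

    import time

    DELTA = 1.0   # correlation threshold (seconds); placeholder value

    class PermissionMonitor:
        def __init__(self):
            self.last_input = {}   # pid -> timestamp of last authentic input

        def notify_interaction(self, pid, t=None):
            # Called when the display manager reports an authentic input event.
            self.last_input[pid] = time.monotonic() if t is None else t

        def query(self, pid, t_request):
            # Grant only if an authentic input event immediately precedes the
            # privileged operation (temporal proximity within DELTA).
            t_input = self.last_input.get(pid)
            if t_input is not None and 0 <= t_request - t_input < DELTA:
                return 'grant'
            return 'deny'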

Finally, the kernel also uses the secure communication channel to request from

the display manager that it display a visual alert when a resource access is granted.

5.3.4 Sensitive Resource Protection

An important class of system resources that Overhaul aims to protect is privacy-

sensitive hardware devices. These devices could include arbitrary sensors attached to

the system; typical examples on desktop operating systems include the camera and

microphone. In order to implement dynamic access control over hardware resources,

the kernel is responsible for mediating accesses to these privacy-sensitive hardware

devices.

However, note that the kernel does not interpose on all privacy-sensitive resources;

representative examples include the system clipboard and program display contents.

The operating system often has no immediate visibility into such resources. Instead,

these resources are controlled by the system’s display manager. Applying dynamic

access control over these resources requires the display manager to query the kernel

permission monitor, and grant or deny the action based on the response.

To illustrate the enhancements required to the kernel and the display manager,

and how sensitive resources are protected, we present two scenarios that build upon

the components described above.


For the following discussion, we let:

    opt be a privileged operation at time t, where op ∈ {copy, paste, scr, mic, cam},
    EA,t be an input event sent to application A at time t,
    NA,t be an interaction notification corresponding to EA,t,
    QA,t be a permission query for application A at time t,
    RA,t be a response ∈ {grant, deny} for QA,t,
    VA,op be a visual alert request, indicating that A performs op.

Hardware Resources

Figure 5.1 presents an example interaction involving an application’s request to access

the system microphone. In an unmodified system, the request would succeed so long

as application A holds the permission to access the microphone device at t + n.

Overhaul introduces the following changes: First, the system ensures that for all

applications the permission to access the microphone is denied by default. (1) When

the user clicks on a button in application A to turn on the microphone at time t,

the display manager receives the input event EA,t and verifies that it is generated by

a hardware input device through user interaction. (2) If EA,t is authentic, then the

display manager first sends the kernel permission monitor an interaction notification

NA,t through the secure communication channel. The permission monitor records

this notification, indicating that A received authentic user input at t. (3) The display

manager then forwards EA,t to its destination A.

[Figure 5.1: Dynamic access control over privacy-sensitive hardware devices.]

(4) Upon receiving the event, A attempts to turn on the microphone. The permission monitor intercepts A's request

mict+n to access the device. It compares A’s latest interaction time t with the device

access request time t + n to correlate the input event with the privileged operation,

based on a preconfigured threshold δ. (5) Access to the device is granted to A only if

the privileged operation could successfully be correlated with a preceding input event

(i.e., if (t + n) − t = n < δ holds). (6) Finally, the kernel sends VA,mic to the display

manager to request that the user be alerted. This step is necessary because the display

manager may not have adequate information to identify the process that actually

accessed the resource (e.g., due to IPC mechanisms, as explained in Section 5.3.5).

The verification of user input authenticity provides the property that sensitive

device access operations can only be performed in response to legitimate user input.


Note that, in this scenario, no permission query from the display manager to the

permission monitor is necessary. Since the kernel has full mediation over hardware

resources, the permission monitor can implicitly adjust the permissions of A when

necessary. This entire process is transparent to the application.
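Using the monitor sketch from Section 5.3.3, the microphone scenario plays out as in this toy example (the PIDs and timestamps are made up):

    pm = PermissionMonitor()
    pm.notify_interaction(pid=4242, t=10.00)     # step 2: user clicked in A
    print(pm.query(pid=4242, t_request=10.25))   # step 5: 'grant', since n < DELTA
    print(pm.query(pid=1337, t_request=10.25))   # no interaction record: 'deny'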

Display Resources

Figure 5.2 shows an example interaction for a clipboard paste operation between the

display manager and an application A. The baseline protocol consists of A requesting

the clipboard contents from the display manager, and receiving back the copied data.

Overhaul revokes all clipboard access permissions by default, and modifies the

protocol in the following way:

(1) First the user inputs the keystrokes to paste some text, (2) the display manager

verifies that the input EA,t is authentic and notifies the kernel permission monitor with

NA,t, (3) and forwards the key event to A. (4) After receiving the command from the

user, A issues a clipboard paste request pastet+n to the display manager. (5) Instead

of immediately serving the request, the display manager sends a permission query

QA,t+n to the kernel permission monitor through the secure communication channel.

(6) As before, the permission monitor compares the interaction time t in its records

for A with the privileged operation request time t+n issued together with the query.

If the correlation of the input event with the operation request is successful based

on the temporal proximity threshold δ (i.e., n < δ), the permission monitor replies

with a grant response RA,t+n; otherwise RA,t+n is a deny response. (7) If and only if

RA,t+n is a permission grant does the display manager return to A the data; or else

A is blocked from accessing the clipboard. In this scenario an explicit visual alert

request from the kernel is not necessary, because the display manager can successfully

identify the requesting process without kernel assistance.


[Figure 5.2: Protecting copy & paste operations against clipboard sniffing.]

Here, the secure communication channel between the kernel and the display man-

ager is used both for sending interaction notifications to the permission monitor, and

for querying it whether to allow the privileged operation.

As before, the verification of user input authenticity provides the property that

copy & paste operations can only be performed in response to actual inputs. This

provides protection against malicious programs that attempt to capture sensitive data

from the system clipboard, such as passwords pasted from a password manager. We

note that because permission queries are implicitly generated along with the copy &

paste requests, this protection is transparent to the application.

Note that, in this scenario, first sending input notifications to the permission mon-

itor and later querying it for the same information could seem unnecessary. Instead,

one could store input notifications inside the display manager to avoid kernel commu-

nication. However, in the next section, we show that our design is necessary for the


kernel to track interactions across process boundaries through process spawns and inter-process communication (IPC) channels.

[Figure 5.3: A program launcher executing a screen capture program, illustrating the need for interposing on process spawn mechanisms to propagate interaction information.]

5.3.5 Interaction Across Process Boundaries

Real-life applications often consist of multiple processes or threads, and communicate

with each other using application-specific protocols via IPC facilities provided by the

OS. This significantly complicates the task of associating user input with privileged

operations requested by an application, because the process receiving the input event

could be different from the actual process that accesses a sensitive resource. We

illustrate this challenge Overhaul needs to address with the examples below.

Figure 5.3 presents a scenario where an application Shot attempts to capture a

screen image. Since the screen content is also a resource controlled by the display

manager, this example is similar to the previous copy & paste example. However, here,

the user first executes a program launcher Run, types in the name of the program Shot, and the application launcher executes Shot on the user's behalf. In other words, (1–3) the user actually interacts with Run, which the kernel permission monitor records; (4) but Run creates a new process Shot, (5) and the screen capture request scrt+n is made by this different process for which there exists no interaction record.

[Figure 5.4: A multi-process browser, components of which communicate via shared memory IPC. This example illustrates the need for interposing on IPC endpoints to propagate interaction information.]

In another scenario, Figure 5.4 depicts how a multi-process Internet browser that

uses separate processes for each browser tab (i.e., similar to Chromium) would run

a web-based video conferencing application. (1–3) When the user commands the

browser to launch a video conference session, she actually interacts with the main

browser window Browser, and the permission monitor is notified of this. However,


Browser opens the web application in a separate process Tab and (4) commands it

to turn on the camera via shared memory IPC. As a result, (5) Tab requests camt+n

without a corresponding interaction record in the permission monitor.

The ubiquity of multi-process application architectures, applications that launch

third-party programs, and IPC use make it necessary for Overhaul to correctly han-

dle cases similar to those exemplified above. Therefore, our design requires Over-

haul to interpose on all process and thread spawns, as well as the entire range of IPC

mechanisms provided by the OS (e.g., (4) in Figure 5.3 and Figure 5.4). Specifically,

Overhaul needs to propagate interaction notifications between processes according

to the following policy:

• Interaction notifications of a parent process must be propagated to a newly-

spawned child process. In other words, whenever a process X creates a new

process Y , all interaction notifications NX,t recorded in the permission monitor

must be duplicated as NY,t.

• In an IPC channel established between two (or more) processes, interaction

notifications of a message sender process must be propagated to the receiver

process. That is, Overhaul must monitor all established IPC endpoints, and

whenever process X sends a message to process Y , interaction notifications NX,t

recorded in the permission monitor must be duplicated as NY,t.

In this way, Overhaul can support process spawns and IPC chains of arbitrary

length and complexity, and remain transparent to the applications and oblivious to

the application-level communication protocols.
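A sketch of this propagation policy, extending the PermissionMonitor example from Section 5.3.3, is shown below; the hook names are illustrative and would correspond to kernel interposition points on process creation and IPC sends.

    class PropagatingMonitor(PermissionMonitor):
        def on_spawn(self, parent_pid, child_pid):
            # Duplicate the parent's interaction record for the new child.
            if parent_pid in self.last_input:
                self.last_input[child_pid] = self.last_input[parent_pid]

        def on_ipc_send(self, sender_pid, receiver_pid):
            # Duplicate the sender's interaction record for the receiver,
            # keeping the receiver's own record if it is more recent.
            t = self.last_input.get(sender_pid)
            if t is not None:
                prev = self.last_input.get(receiver_pid)
                if prev is None or t > prev:
                    self.last_input[receiver_pid] = t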


5.3.6 Discussion

Our approach fulfills the design goals enumerated in Section 5.3.1. Overhaul pro-

vides a trusted input path between the user and kernel, a display manager that

authenticates hardware-generated input events and interposes on display resources,

and a kernel permission monitor that mediates access to sensitive hardware (S1), (S2).

The display manager also enforces appropriate visibility requirements on application

windows to prevent hijacking of authentic user interaction (S3), and ensures that

resource accesses are communicated to the user via visual alerts (S4).

We point out that Overhaul inherently shares the limitations of other user-

driven security approaches. In particular, because the user’s perception of malice

and their interaction with applications are central to this security model, Overhaul

cannot provide protection against malware that can trick a user into voluntarily installing and using it, for example, by mimicking the appearance and functionality

of well-known legitimate applications. Additionally, Overhaul does not support

running scheduled tasks, or persistent non-interactive programs that need access to

the protected sensitive devices (e.g., a cron job or daemon that periodically takes

screen captures). We stress that these issues are fundamental to any user-driven ac-

cess control model, and despite its limitations Overhaul provides important security

benefits complementing the standard access control models employed in commodity

operating systems, without any significant detriments to performance or user experi-

ence.

The trade-offs Overhaul makes between backwards compatibility with legacy

programs and defending against on-system malware result in a system that pro-

vides strictly weaker security guarantees than prior work on user-driven access con-

trol [90], where a stronger connection between user intent and program behavior can


quired for malware to interact with a user interface on the user’s behalf so long as

the hardware is considered to be free of embedded malicious functionality.

As a result, Overhaul focuses on distinguishing between hardware and software-

generated input events. We identified two facilities provided by X11 for generating

and injecting synthetic events to the event queue: the SendEvent [95] and XTest-

FakeInput [96] requests. SendEvent is a core X11 protocol request that allows a client

to send events to other clients connected to an X server. In particular, this interface

could allow malware to inject keystrokes or mouse events on other windows. However,

events sent using this interface must have a flag set that indicates that the event is

synthetic. As such, filtering such input events within the X server is a matter of

checking for the presence of this flag.

The second request, XTestFakeInput, is part of the XTest extension, which is used

for providing a GUI testing framework. In this case, it is not possible to implement

a flag check since no indicator flag is used with XTest requests. Therefore, it was

necessary to modify the X server to tag events with the extension or driver that

generated the event. While this is more onerous than checking for the existence

of a flag, it is also a method for determining the provenance of input events that

generalizes to future modifications to the X Window System.
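
To make this concrete, the following minimal C sketch, written in the style of the X server's event-handling code, combines the two checks described above. The synthetic-event marker (the most significant bit of the event type) is defined by the core protocol; the provenance tag and the ovh_* names are hypothetical illustrations of our modification, not the exact prototype code.

    #include <X11/Xproto.h>   /* xEvent and core protocol definitions */

    /* Hypothetical provenance tag attached to queued events by the
     * modified server (not part of stock X11). */
    enum ovh_event_source {
        OVH_SRC_HARDWARE,   /* produced by a hardware input driver */
        OVH_SRC_SENDEVENT,  /* injected via the SendEvent request  */
        OVH_SRC_XTEST,      /* injected via XTestFakeInput         */
    };

    struct ovh_tagged_event {
        xEvent ev;
        enum ovh_event_source source;
    };

    /* Returns 1 iff the event may generate an interaction notification. */
    static int ovh_event_is_authentic(const struct ovh_tagged_event *te)
    {
        /* Core protocol rule: events injected with SendEvent carry the
         * most significant bit of the event type as a synthetic marker. */
        if (te->ev.u.u.type & 0x80)
            return 0;

        /* XTest events carry no such flag, so the server must have
         * tagged them at the point of injection. */
        return te->source == OVH_SRC_HARDWARE;
    }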

With the ability to distinguish hardware-generated input from synthetic input,

the X server was modified to connect to a secure communication channel upon ini-

tialization (as we will explain in Section 5.4.2), and send interaction notifications to

the kernel permission monitor every time the user interacts with an X client. These

notifications are labeled with the PID of the process that received the event and a

timestamp. The PID serves as an unforgeable binding between a window belonging

to a process and events, as the mapping between X client sockets and the PID is

retrieved from the kernel.


We note that the trusted input path described so far remains vulnerable to click-

jacking attacks [97]. For instance, a malicious X client may place transparent overlays

on the screen, or periodically display a previously invisible window over other appli-

cations in an attempt to trick users into clicking on them and stealing authentic input

events. To prevent this, Overhaul only generates interaction notifications if the X

client receiving the event has a valid mapped window that has remained visible for longer than a predefined time threshold.
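
A minimal sketch of this visibility check is given below; the per-window bookkeeping structure and the threshold value are hypothetical stand-ins for the prototype's internal state.

    #include <time.h>

    #define OVH_VISIBILITY_THRESHOLD_SEC 1  /* hypothetical value */

    /* Hypothetical per-window state maintained by the modified server. */
    struct ovh_window_state {
        int mapped;            /* window is currently viewable */
        time_t visible_since;  /* when it last became viewable */
    };

    /* Suppress interaction notifications for windows that only just
     * appeared, defeating transparent-overlay and flashed-window
     * clickjacking attempts. */
    static int ovh_window_eligible(const struct ovh_window_state *w,
                                   time_t now)
    {
        return w->mapped &&
               (now - w->visible_since) >= OVH_VISIBILITY_THRESHOLD_SEC;
    }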

Trusted Output

As described before, the trusted output path that Overhaul utilizes is a visual alert

shown on the screen whenever a sensitive resource is accessed. Since the X Window

System controls the entire display contents, Overhaul ensures that displayed alerts

are rendered on top of all other windows, and cannot be blocked, obscured, or manip-

ulated by other X clients. We designed the alert messages to be displayed for a few

seconds at the top of the screen, at a reasonably large size, to be easily noticeable.

Since resource accesses can only be granted immediately following user input, the user

is highly likely to be present and interacting with the computer, making it difficult

for her to miss an alert. In addition, the alerts make use of a visual shared secret set

by the user to prevent malicious applications from forging fake alerts. Two example

alerts are shown in Figure 5.5.

Note that, compared to popup prompts that require explicit policy decisions from

the user during runtime (e.g., Windows User Account Control or iOS permission di-

alogues), alerting the users with visual notifications inherently establishes a looser

association between user actions and the application behavior. Indeed, we imple-

mented and verified that Overhaul’s security primitives can be used to support

such a security model in a trivial manner, where the trusted output path would be


Figure 5.5: Sample visual alerts shown by Overhaul. The cat image is used as a visual shared secret to indicate that the alert is authentic.

used for displaying an unforgeable prompt, and the trusted input path to verify user

interaction with it. However, it has been shown that popup prompts have severe

usability issues that conflict with their security properties, and that they are often

ignored by users, or disabled completely [98]. Therefore, we believe the non-intrusive,

transparent approach we took with Overhaul is a worthwhile trade-off between

security and usability, and would be a more effective security solution in a real-life

setting. We do not explore the popup prompt approach further in this chapter.

Display Contents

The X Window System allows any client program to access the contents of the root

window (i.e., the entire screen), or any specific window through the GetImage core

protocol request [95], or the XShmGetImage request provided by the MIT shared mem-

ory extension [99]. These interfaces can be used to retrieve the displayed contents for

any purpose such as taking screenshots or recording the desktop.


In order to mediate accesses to the display contents of X clients, our modified

X server intercepts these requests, and queries the kernel permission monitor via the

secure communication channel with a message containing the PID of the requesting

process and a timestamp. Based on the response, access is either granted, or the

screen capture request is dropped. This way, Overhaul can enforce that display

contents can only be accessed in response to user input.

The X Window System also provides two additional core protocol requests, CopyArea

and CopyPlane, which are used for copying a representation of display contents be-

tween two buffer areas. These requests could be used as an alternative approach to

capture the screen contents, and therefore, Overhaul must also interpose on them.

However, unlike the previous GetImage, these requests are not specifically designed

for capturing display contents, and they are regularly used by X clients for various

other purposes. Therefore, in this case, Overhaul first needs to inspect the owners

of the source and destination buffers specified in the copy request. If the owners of

both buffers are identical, in other words, a client is copying a portion of its own win-

dow, the request is allowed to proceed. However, if a client is requesting the display

contents owned by a different client (or the root window), Overhaul applies its user

input-based access control as before, and allows or blocks the request accordingly.
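
The following minimal C sketch captures this decision logic; the client identifiers, the drawable-owner lookup, and the bridge to the kernel permission monitor are hypothetical names rather than the actual prototype interfaces.

    /* Hypothetical helpers: map a drawable to the owning client, and
     * ask the kernel permission monitor for a decision (1 = grant). */
    typedef int ovh_client_id;
    extern ovh_client_id ovh_drawable_owner(unsigned long drawable_id);
    extern int ovh_query_permission_monitor(int pid);

    static int ovh_allow_copy(unsigned long src_drawable,
                              unsigned long dst_drawable, int client_pid)
    {
        /* A client copying within its own windows is always permitted. */
        if (ovh_drawable_owner(src_drawable) ==
            ovh_drawable_owner(dst_drawable))
            return 1;

        /* Cross-client copies are treated like screen captures: allowed
         * only if the requester recently received authentic user input. */
        return ovh_query_permission_monitor(client_pid);
    }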

Clipboard Contents

The X Window System does not provide a central clipboard space, but instead defines

the copy & paste operations as an inter-client communication protocol [100] outlined

in Figure 5.6. The steps to copy data from a source client to a target client are as

follows:

(1) A copy operation is initiated by user input received via an X input driver.

(2) The source client asserts ownership of a selection object by issuing to the X server


Figure 5.6: Protocol diagram for the X11 copy & paste operation between the copy source (App A), the X server, and the paste target (App B). Modified steps are highlighted in bold.


a SetSelection request. In (3) and (4) the source client confirms with the X server

that it has successfully acquired the selection. This concludes the copy operation;

note that no data has actually been copied at this stage.

(5) The paste event is initiated by user input. (6) The target client sends a

ConvertSelection request to the X server, (7) which, in turn, issues a SelectionRequest to the selection owner (i.e., the source client) to notify it of the request for

the copied data. (8) The source client sends the data to the X server to be stored

as a property using a ChangeProperty request, (9) and then requests from the server

that the target client be sent a SelectionNotify event, using a SendEvent request.

(10) The paste target is notified that the copied data is available. (11) The target

client responds with a GetProperty request, (12) retrieves the data, (13) and finally,

removes it from the server.

In Figure 5.6, the protocol steps that were modified in Overhaul are highlighted

in bold. In particular, steps (1) and (5) are events that are verified as authentic

user input from a hardware input device. The X server notifies the kernel permission

monitor of these events as previously described. In steps (2) and (6), before serving the

SetSelection or ConvertSelection requests received from the clients, the X server

first queries the kernel permission monitor via the secure communication channel to

confirm that the copy or paste request is preceded by corresponding user interaction.

The operation is allowed to proceed only if the permission monitor responds with a

permission grant message; otherwise, the client is sent back an error message.
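
A minimal sketch of this gate is shown below. The request opcodes come from the core protocol headers (where the SetSelection request appears as X_SetSelectionOwner); the permission-monitor bridge is a hypothetical name.

    #include <X11/Xproto.h>  /* core protocol request opcodes */

    /* Hypothetical bridge to the kernel permission monitor (1 = grant). */
    extern int ovh_query_permission_monitor(int pid);

    /* Gate protocol steps (2) and (6) on recent user interaction. */
    static int ovh_gate_clipboard_request(int opcode, int client_pid)
    {
        switch (opcode) {
        case X_SetSelectionOwner:  /* copy, step (2)  */
        case X_ConvertSelection:   /* paste, step (6) */
            return ovh_query_permission_monitor(client_pid);
        default:
            return 1;  /* not a clipboard-related request */
        }
    }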

Note that this copy & paste protocol is followed merely by convention, and the

given interaction sequence is not enforced by the X server. As a result, a malicious

X client may attempt to skip certain steps of the protocol to bypass Overhaul’s

checks. One possible attack vector is the SendEvent request which allows an X client

to command the X server to send an X11 event on behalf of the client. By exploiting


this mechanism, a malicious client can directly send SelectionRequest events to

other clients and receive the copied data from the selection owner. To prevent such

attacks, our implementation also interposes on the SendEvent requests, and blocks

the sending of events that can break the copy & paste protocol. Other examples

of possible attacks include subscribing to events generated by the X server when

properties are created and updated to retrieve the pasted data stored in them before

the actual paste target could remove it. Overhaul ensures that such events are only

delivered to the paste target while the clipboard data is in flight. We omit details of

these low-level implementation issues.

5.4.2 Enhancements to the Linux Kernel

As shown in Section 5.3, our implementation augments the Linux kernel with a per-

mission monitor that establishes a secure communication link to the X Window Sys-

tem, mediates sensitive hardware accesses, adjusts per-application privileges in re-

sponse to interaction notifications, and responds to permission queries from the X

server for access to display resources.

Secure Communication Channel

The first property that our kernel must support is establishing and authenticating the

communication channel to the X Server. In our prototype, we used the Linux netlink

facility to provide this channel [101]. Netlink was originally designed to exchange

networking information between the kernel and user space, but it serves as a robust

general communication channel across this boundary.

Netlink, however, does not solve the authentication problem. That is, the kernel

and X server must ensure that no malicious program is interposing on the channel.


While using a standard mutual authentication protocol is possible, our prototype

instead relies on the fact that the kernel operates in supervisor mode and can intro-

spect on the user space X process. Once the kernel establishes the netlink channel

and receives a connection request from X during server initialization, it examines the

virtual memory maps to check whether the process it is communicating with is indeed

the X server. In particular, it checks whether the executable code mapped into the

process is loaded from the well-known, and super user-owned, file system path for the

X binaries. If so, it considers the remote party to be authenticated as the legitimate

X server and, due to the kernel’s supervisor privileges, the X server trusts that the

kernel will perform this procedure correctly.
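
The sketch below shows the kernel side of such a channel using the standard netlink API; the protocol number and the ovh_* names are hypothetical (our prototype also builds the monitor into the kernel rather than packaging it separately), and the authentication step is summarized as a comment.

    #include <linux/init.h>
    #include <linux/netlink.h>
    #include <net/netlink.h>
    #include <net/net_namespace.h>

    /* Hypothetical private netlink protocol number; a real deployment
     * must pick a number unused by the running kernel. */
    #define NETLINK_OVERHAUL 31

    static struct sock *ovh_nl_sock;

    /* Invoked for every datagram received over the channel. */
    static void ovh_nl_input(struct sk_buff *skb)
    {
        struct nlmsghdr *nlh = nlmsg_hdr(skb);

        /* Authentication: before trusting the peer, inspect the sending
         * process's memory maps and accept it only if its executable is
         * loaded from the super user-owned X server binary path. Then
         * parse interaction notifications and permission queries. */
        (void)nlh;
    }

    /* Runs during kernel initialization in this sketch. */
    static int __init ovh_nl_init(void)
    {
        struct netlink_kernel_cfg cfg = { .input = ovh_nl_input };

        ovh_nl_sock = netlink_kernel_create(&init_net, NETLINK_OVERHAUL,
                                            &cfg);
        return ovh_nl_sock ? 0 : -ENOMEM;
    }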

Device Mediation

Overhaul must interpose on all accesses to sensitive hardware devices. To this

end, it suffices on Linux to monitor open system call invocations on device nodes

exposed in the file system. Therefore, our prototype implements an augmented open

system call that, in addition to normal UNIX access control checks, looks up the

interaction notification records received from the X server for the running process

to allow or deny access to the device accordingly. Note that it is usually considered

better practice to implement kernel-side security checks using the Linux Security

Modules (LSM) framework [102] instead of modifying system calls directly. However,

as of this writing, LSM does not officially support stacking multiple security modules.

Since Overhaul is not a replacement for other security modules, we implemented

our prototype in this way as a conscious design choice.
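
The following minimal sketch illustrates the added check in the open() path. The ovh_is_protected_device() lookup and the ovh_last_interaction field (a hypothetical addition to the process descriptor, discussed below) are illustrative names, and the 2-second window anticipates the threshold derived later in this section.

    #include <linux/fs.h>
    #include <linux/sched.h>
    #include <linux/timekeeping.h>

    #define OVH_INTERACTION_WINDOW_SECS 2

    /* Hypothetical lookup against the kernel's protected-device table. */
    extern bool ovh_is_protected_device(const struct inode *inode);

    /* Extra check performed by the augmented open system call, after the
     * normal UNIX access control checks have passed. */
    static int ovh_check_device_open(const struct inode *inode)
    {
        if (!ovh_is_protected_device(inode))
            return 0;  /* not a protected device: allow as usual */

        /* Grant access only if this process received authentic user
         * input within the permission window. */
        if (ktime_get_seconds() - current->ovh_last_interaction
                <= OVH_INTERACTION_WINDOW_SECS)
            return 0;

        return -EACCES;
    }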

An important implementation detail of our prototype deals with accurately map-

ping sensitive devices to their file system paths. In particular, modern Linux dis-

tributions often make use of dynamic device name assignments at runtime using


frameworks such as udev. Therefore, our prototype relies on a trusted helper applica-

tion, owned by the super user and protected against unauthorized modification using

normal user-based access control, to manage this mapping. It is invoked in response

to changes in the device file system mounted by convention at /dev, and propagates

these changes to the kernel via an authenticated netlink channel.

Process Permission Management

The kernel permission monitor receives interaction notifications from the X server,

each of which includes a PID and a timestamp, and needs to record this information in an

easily accessible context associated with each process. Our prototype stores this infor-

mation inside the process descriptor, the data structure Linux uses to represent a

process. Every process descriptor is implicitly associated with a unique process; there-

fore, this procedure only requires us to locate the process descriptor corresponding to

the PID reported in the interaction notification, and save the interaction timestamp

inside this structure.

To perform a permission check, the permission monitor first receives the PID of the

process that requests access to the sensitive resource, either internally from the device

mediation layer, or from the X server via the netlink channel. Next, it retrieves the

correct process descriptor and compares the timestamp recorded there (i.e., the most

recent user interaction time) with the privileged operation's timestamp. If the two timestamps are within a pre-configured threshold of each other, permission is granted

(or a positive response is sent back to the X server). We empirically determined that

setting a threshold of less than 1 second can lead to falsely revoked permissions, but

2 seconds is sufficient to prevent incorrectly denying access to legitimate processes.

In our long-term experiments with this configuration, described in Section 5.5.4, we

did not encounter any broken functionality or unusual program behavior.
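
A minimal sketch of this lookup and comparison is given below; the ovh_last_interaction field is the same hypothetical process descriptor addition as before, and timestamps are simplified to seconds.

    #include <linux/pid.h>
    #include <linux/rcupdate.h>
    #include <linux/sched.h>

    #define OVH_INTERACTION_WINDOW_SECS 2  /* empirically chosen above */

    /* Permission check: map the reported PID to its process descriptor
     * and compare the recorded interaction time with the operation time. */
    static bool ovh_permission_check(pid_t pid, u64 op_time_secs)
    {
        struct task_struct *task;
        bool granted = false;

        rcu_read_lock();
        task = pid_task(find_vpid(pid), PIDTYPE_PID);
        if (task)
            granted = (op_time_secs - task->ovh_last_interaction)
                          <= OVH_INTERACTION_WINDOW_SECS;
        rcu_read_unlock();

        return granted;
    }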


Process Creation & IPC

As previously explained, Overhaul must be able to track interaction information

across process boundaries for any meaningful real-life use. Recall from Section 2.4.1

that, in Linux, a new process (i.e., the child) is created by duplicating an existing

process (i.e., the parent), using the clone system call. This operation duplicates

the process descriptor of the parent to be used for the child process, which includes

the interaction timestamp stored in the same data structure. In other words, our

implementation ensures that the parent’s interaction information is passed down to

a newly-created child automatically, without additional modification to the kernel.

This property also extends to the threads of a process, because Linux does not have a

strict distinction between processes and threads and uses a separate process descriptor

for each.

In contrast, tracking interaction information across IPC channels requires further

modifications to the kernel for each IPC facility provided by the OS. Our implemen-

tation supports POSIX shared memory and message queues, System V shared memory and message queues, FIFOs, anonymous pipes, and UNIX domain sockets.

Higher-level IPC mechanisms that are built on these OS primitives (e.g., D-Bus) are

also automatically covered. These IPC mechanisms are modified in a similar man-

ner to propagate interaction information between the two endpoint processes, which

works as follows:

(1) When an IPC channel is first established, we embed an expired interaction

timestamp inside the kernel data structures that correspond to the IPC resource.

(2) When a process wants to send data through an IPC link, it first embeds its own

interaction timestamp inside the IPC resource, unless the structure already contains a

more recent timestamp. (3) When the receiving process reads data from the channel,


it compares its own interaction timestamp with the one embedded inside the IPC

resource. If the IPC channel has a more up-to-date timestamp, the process saves it

in its own process descriptor.
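
The three steps map onto a small pair of hooks, sketched below in minimal C; the embedded timestamp structure, the ovh_* names, and the process descriptor field are hypothetical illustrations of the design rather than the exact prototype code.

    #include <linux/sched.h>
    #include <linux/types.h>

    /* (1) Embedded in each IPC object and initialized to an expired
     * (zero) timestamp when the channel is established. */
    struct ovh_ipc_stamp {
        u64 last_interaction;
    };

    /* (2) Sender side: leave the freshest timestamp in the channel. */
    static void ovh_ipc_on_send(struct ovh_ipc_stamp *ch)
    {
        if (current->ovh_last_interaction > ch->last_interaction)
            ch->last_interaction = current->ovh_last_interaction;
    }

    /* (3) Receiver side: adopt the channel's timestamp if more recent. */
    static void ovh_ipc_on_recv(struct ovh_ipc_stamp *ch)
    {
        if (ch->last_interaction > current->ovh_last_interaction)
            current->ovh_last_interaction = ch->last_interaction;
    }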

Implementation of this protocol requires adding a timestamp field inside the IPC

data structures, and inserting checks inside the corresponding send and receive func-

tions for each IPC facility. However, a notable exception is POSIX and SysV shared

memory, which must be handled differently. Specifically, once the kernel allocates and

maps a shared memory region with the mmap system call, writes and reads to these

regions are regular memory operations that cannot be intercepted above the hardware

level. We overcome this obstacle by taking a different approach. We interpose on vir-

tual memory mapping operations inside the kernel, check whether the mapped area

is flagged as shared (indicated by a flag inside the corresponding vm_area_struct),

and if so, revoke read and write permissions for that memory area. This causes

subsequent accesses to that memory region to generate access violations, and allows

Overhaul to capture the IPC attempt inside the page fault handler. We then run

the interaction propagation protocol described above, and temporarily restore the

memory access permissions to their original values to allow the memory operation to

succeed on the next try. Clearly, repeating this process for every memory access could

lead to severe performance overhead; therefore, after every access violation, we put

the corresponding vm_area_struct on a wait list before its permissions are revoked

once again. This allows memory accesses that immediately follow the first page fault

to proceed uninterrupted. This wait duration must be sufficiently shorter than the 2

second interaction expiration time, since we would miss shared memory IPC attempts

and fail to propagate interaction timestamps during this period. We configured this

duration to 500 ms, which yielded a good performance-usability trade-off as shown

in Section 5.5.
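
The fault-based interception can be summarized by the following high-level sketch, reusing the ovh_ipc_* hooks from the previous sketch; vm_area_struct and VM_SHARED are real kernel definitions, while every ovh_* hook is a hypothetical name standing in for the prototype's modifications to the mmap path and the page fault handler.

    #include <linux/jiffies.h>
    #include <linux/mm.h>

    /* Hypothetical hooks implemented by the prototype. */
    extern void ovh_revoke_access(struct vm_area_struct *vma);
    extern void ovh_restore_access(struct vm_area_struct *vma);
    extern bool ovh_is_revoked(const struct vm_area_struct *vma);
    extern struct ovh_ipc_stamp *ovh_stamp_of(struct vm_area_struct *vma);
    extern void ovh_defer_revocation(struct vm_area_struct *vma,
                                     unsigned long delay_jiffies);

    /* mmap path: force subsequent accesses to shared mappings to fault. */
    static void ovh_on_mmap(struct vm_area_struct *vma)
    {
        if (vma->vm_flags & VM_SHARED)
            ovh_revoke_access(vma);
    }

    /* Page fault handler: a shared memory access was observed. */
    static void ovh_on_fault(struct vm_area_struct *vma)
    {
        if (!ovh_is_revoked(vma))
            return;

        /* Run the propagation protocol in both directions, since either
         * endpoint may be reading or writing the shared region. */
        ovh_ipc_on_send(ovh_stamp_of(vma));
        ovh_ipc_on_recv(ovh_stamp_of(vma));

        /* Restore permissions and defer re-revocation for 500 ms so that
         * back-to-back accesses proceed without repeated faults. */
        ovh_restore_access(vma);
        ovh_defer_revocation(vma, msecs_to_jiffies(500));
    }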


Command Line Interface Interactions

A final implementation requirement arises from the fact that Linux systems often

make extensive use of the command line interface. On graphical desktops, this is

achieved by running a terminal emulator (e.g., xterm) that communicates with a

command line shell (e.g., bash) via a pair of pseudo terminal devices. If the user

was to type in the name of a command line application inside a terminal emulator

(as opposed to using a graphical application launcher), the terminal emulator would

receive the input events, and communicate the command to launch to the shell via the

pseudo terminal devices. Any subsequent device access requests would be made by a

program launched by the shell process, which has not received any direct interaction.

In fact, the shell usually is not even an X client and, thus, cannot receive X11 input

events.

To enable command line tools that access the protected sensitive devices to func-

tion correctly under Overhaul, we implemented an interaction timestamp propaga-

tion protocol analogous to the one described for IPC channels above. Here, the modifi-

cations are made inside the pseudo terminal device driver. Whenever a process writes

to a terminal endpoint, that process embeds its timestamp into the kernel data struc-

ture representing the pseudo terminal device. Subsequently, when another process

reads from the corresponding terminal endpoint, that process copies the embedded

timestamp to its process descriptor, unless it already has a more recent timestamp.

Process Isolation and Introspection

Overhaul does not require sandboxing of individual user applications, or any ad-

vanced process isolation mechanism beyond the kernel and process memory isolation

that commodity operating systems provide. In particular, all interaction notifica-


tions in our design are managed by the OS; they are never exposed to user space

applications. This prevents malicious applications from tampering with legitimate

interaction notifications to mount denial-of-service attacks, or hijacking interaction

notifications of other processes. Similarly, since each interaction notification is bound

to a specific process, malicious applications that run in the background and receive

no user interaction cannot hijack the permissions granted to another application.

However, process introspection and debugging facilities offered by operating sys-

tems need attention, because they might make it possible to inject malicious code

into legitimate applications that are expected to have access to sensitive resources.

In Linux, this threat is somewhat contained since the Linux debugging facilities, such

as ptrace and /proc/{PID}/mem (which also uses ptrace internally), do not allow

attaching to processes that are not direct descendants of the debugging process. In

other words, even if two unrelated processes run with identical (but non-super user)

credentials, they cannot manipulate each other’s state.

In our implementation, we provide even stricter security by temporarily disabling

all permissions for a debugged process with a trivial patch to the ptrace system

call. This also prevents parent processes from tracing their own children, which, in

turn, subverts attacks where a malicious program could launch another legitimate

executable, and then inject code into it. Overhaul enables this protection by de-

fault, but it can be toggled off by the super user through a proc file system node to

facilitate legitimate debugging tasks.
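
A sketch of this hardening follows; the procfs-backed toggle and the descriptor field are hypothetical names for our additions.

    #include <linux/sched.h>

    /* Toggled by the super user through a proc file system node. */
    extern int ovh_ptrace_protection;

    /* Invoked from the ptrace attach path: expire the tracee's interaction
     * record so it loses access to protected devices while being traced. */
    static void ovh_on_ptrace_attach(struct task_struct *tracee)
    {
        if (ovh_ptrace_protection)
            tracee->ovh_last_interaction = 0;  /* never interacted */
    }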

5.5 Evaluation

Our evaluation of Overhaul consists of measuring its performance impact on the

system, and testing the usability and security properties of the implementation.


Benchmark        Baseline         Overhaul         Overhead
Device access    45.20 s          46.18 s          2.17 %
Clipboard        116.48 s         119.93 s         2.96 %
Screen capture   68.26 s          69.86 s          2.34 %
Shared memory    234.86 s         236.33 s         0.63 %
Bonnie++         47319 files/s    47265 files/s    0.11 %

Table 5.1: Performance overhead of Overhaul.

5.5.1 Performance Measurements

Since Overhaul is an input-driven system that only impacts the operations per-

formed on privacy-sensitive resources, we expect its performance overhead to be

overshadowed by human-reaction times and I/O processing delays. Indeed, in our

experiments with the prototype implementation, we did not observe a discernible

performance drop compared to normal system operation. Consequently, in order to

obtain measurable performance indicators to characterize the overhead of Over-

haul, we created micro-benchmarks that exercise the critical performance paths of

our system. We also used a standard file system benchmarking utility to measure

the impact of our modified open system call on regular file system operations. We

explain each of these benchmarks in more detail below.

Device access. In this benchmark, we measured the time to open the file sys-

tem device node corresponding to the microphone installed on our testing system 10

million times.

Clipboard operations. We designed this benchmark to measure the runtime for

performing 100,000 clipboard operations. Since in the X Window System a paste is

significantly more costly than a copy, we configured our benchmark to only perform

pastes for this test and report the worst-case results.


Screen capture. This benchmark takes 1,000 screen captures using the imlib2

library and measures the total runtime. The time to save the image files to disk is

not included.

Shared memory IPC. Although Overhaul interposes on every IPC mecha-

nism, our preliminary measurements indicated that the shared memory communica-

tion incurred the highest overhead due to the necessity for intercepting page faults,

changing virtual memory access permissions, and invalidating page tables. Conse-

quently, to measure the worst-case performance impact, in this benchmark we mea-

sured the runtime for performing 10 billion write operations on a shared memory

area. We repeated this benchmark with different shared memory sizes (i.e., from 1

to 10,000 pages, with a page size of 4096 bytes), and experimented with sequential and

random write patterns. We found no correlation between these parameters and the

performance impact; the overhead was near-identical in all runs. Here, we present

the results for a shared memory size of 10,000 pages, and random writes.

File system. To measure the performance impact of Overhaul on regular file

system operations, we ran Bonnie++ configured to create, stat and delete 102,400

empty files in a single directory. Since Overhaul does not interpose on stat or

unlink system calls, we were unable to reliably measure any overhead for file access

or deletion operations, as expected. Therefore, we only report the runtime overhead

for file creation.

For the purpose of this evaluation we temporarily modified Overhaul’s permis-

sion monitor to grant access to resources even when there is no user interaction,

in order to exercise the entire execution path of the benchmarked operation. We

repeated all tests on a Linux system with Overhaul, and on a system with an un-

modified kernel and X server, five times each, and compared the average results when

calculating the overhead.


Experiments were performed on a computer with an Intel i7-930 2.2 GHz CPU and 9 GB of RAM, running Arch Linux x86-64. We present the results of our experiments in Table 5.1.

Our measurements show that Overhaul performs efficiently, with the highest

overhead observed being below 3%. Note that these experiments artificially stress

each operation under unusual workloads, and the overhead for a single operation is

on the order of milliseconds in the worst case, and ranging down to below a nanosec-

ond. Hence, the overhead is often not noticeable by the user. Moreover, the Bon-

nie++ benchmark demonstrates that Overhaul does not significantly impact the

performance of regular file open operations.

5.5.2 Usability Experiments

We conducted a user study with 46 participants to test the usability of Overhaul.

The participants were computer science students at the author’s institution, recruited

by asking for volunteers to help test a “defensive security system”. In order to avoid

the effects of priming, participants were not informed about the specific functionality

of Overhaul. The only recruitment requirement was that the participants be famil-

iar with using Skype and web browsing so that they could perform the given tasks

correctly. No personal information was collected from the participants at any point.

The participants were asked to perform two tasks to test different aspects of our

system. The first task presented them with a Skype instance on our test machine

running Overhaul, logged into a test account. They were asked to perform a call to

a second test account, while Overhaul performed its security checks without their

knowledge. Once complete, an experimenter asked the participants to compare this


process with their previous experience of using Skype. Specifically, they were asked

to rate the difficulty involved in interacting with the test setup on a 5-point Likert

scale, where a score of 1 indicated that their experience was almost identical, and 5

indicated that the test setup posed significant difficulty.

In the next task, the participants were asked to perform a specific search on the

Internet on an Overhaul-enabled machine. While they were occupied with the task,

a hidden background process that attempted to access the camera was triggered at

a random time, and was subsequently blocked by Overhaul, causing a visual alert

to be displayed. Once the task was complete, the participants were asked to explain

whether they had noticed anything unusual while performing their tasks.

At the end of the first phase of the experiment, all 46 participants found the

experience to be identical to using Skype on an unmodified system. This empiri-

cally confirms that Overhaul is transparent to the users. In the second phase, 24

participants immediately interrupted the task when the Overhaul notification was

displayed, and alerted the experiment observer to the blocked camera access. An-

other 16 noticed the alert but continued the task, and reported the unexpected

camera activity after being prompted by the observer. Only 6 users reported not

having noticed anything unusual. These results confirm that Overhaul alerts are

able to draw most users’ attention while they are occupied with other tasks, and are

effective security notifications.

5.5.3 Applicability & False Positives Assessment

To understand whether Overhaul interferes with the normal functionality of ap-

plications, or produces false alerts due to incorrectly blocked legitimate programs,

we tested the system on common applications. To compile the application pool for


this task, we first manually inspected the descriptions of all Top Rated packages in

the Ubuntu Software Center, and identified those that access the resources Over-

haul is designed to protect. Next, we searched the official and community package

repositories of Arch Linux, our experiment environment, with relevant keywords (e.g.,

webcam, microphone, screenshot, capture, record), and added the hits to the pool.

After eliminating the packages that do not work (e.g., due to missing dependencies)

we ended up with 58 applications consisting of video conferencing tools (e.g., Skype,

Jitsi), audio/video editors (e.g., Audacity, Kwave), audio/video recorders (Cheese,

ZArt), screenshot utilities (Shutter, GNOME Screenshot), and screencasting tools

(e.g., Istanbul, recordMyDesktop). The pool also included the popular web browsers

Firefox and Chromium; in those cases we tested them with various web-based video

chat applications. Note that the application pool contained both GUI and console pro-

grams. We manually experimented with each application to verify that they work as

expected, observed whether Overhaul alerts were displayed correctly, and whether

there were false alarms.

In our experiments, we encountered a single application that produced what could

be considered a spurious alert. Specifically, we observed that Skype attempted to

access the camera as soon as the program was launched, before the user logs into the

application. When Skype was configured to automatically start on boot, this situation

led to a camera access without user interaction, and consequently, Overhaul blocked

the access and produced an alert. This did not cause subsequent video calls to fail,

and we argue that blocking such unanticipated device accesses is the desired behavior

in order to achieve Overhaul’s security properties.

While we did not encounter any malfunctioning application, this experiment also

revealed a peculiar limitation of Overhaul. Specifically, some of the screenshot tools

we tested included an option to delay the shot by a user-specified time. By design,


Overhaul does not support this functionality since the interaction notifications associated with the application expire before the screen can be captured.

To test Overhaul’s clipboard protection mechanism we used an additional set

of 50 applications including popular office programs, text and media editors, web

browsers, email clients, and terminal emulators. Since Overhaul does not display

alerts for clipboard accesses due to usability reasons, we instead verified correct func-

tionality by inspecting the logs produced by our system. In these tests we did not

encounter any false positives or incorrect program behavior.

We note that Overhaul does not support running scheduled tasks, or persistent

non-interactive programs that access the protected devices (e.g., a cron job that

periodically takes screenshots). While we did not encounter such applications in our

tests, this remains a fundamental limitation of our system.

5.5.4 Empirical Experiments

Due to ethical concerns, and the necessity of installing a custom kernel and malware-

like applications on users’ machines, it is a difficult task to design a large-scale user

study to test the long-term security and usability properties of Overhaul. There-

fore, we instead experimented with Overhaul on our personal home and work com-

puters. Below, we present the anecdotal insights gained in the process.

For this experiment, we implemented a malware-like application that runs in the

background during the computer’s normal operation and spies on the user. In par-

ticular, it periodically retrieves clipboard contents, takes screenshots, and records

sound samples from the microphone. For privacy reasons, our sample did not record

camera images. Since the test was performed on actual, personal machines used on

a daily basis, we only stored the captured information on disk, while real malware


would exfiltrate it to a remote host. We stress that our spy application was created

to mimic the behavior of real information-stealing malware [91, 92, 93, 94], exploiting

the standard interfaces to the sensitive resources exposed by the operating system.

No functionality was artificially added or removed that would ease its detection. We

installed this spy application on two of our computers, and enabled Overhaul on

one of the machines, while the other was left running unmodified, without protection.

We left the malware running for 21 days. Both computers were actively used everyday

for work and personal use.

At the end of the experiment we confirmed that the malware running on the Over-

haul-protected system could not collect any information, as expected. We checked

Overhaul’s logs and verified that attempts to access the protected resources were

detected and blocked. The malware on the vulnerable computer, on the other hand,

was able to successfully spy on the user. We manually investigated the collected data

and found sensitive information including screenshots of bank account information

displayed on an e-banking site and email exchanges. The data sampled from the clip-

board included passwords copied from the password manager, phone numbers, and

excerpts from emails. The malware was also able to collect voice recordings from the

headset microphone.

We also investigated Overhaul’s logs to see which applications were granted

access to the protected resources. The camera and microphone were used by two video

conferencing applications. The screen was captured by the system's default screenshot

tool, and by a desktop recording application. Clipboard accesses were logged for

a large number of applications. During the testing period of 21 days, we did not

encounter any cases of legitimate applications being incorrectly blocked.

These observations show that spying malware can be severely damaging, and that

Overhaul is effective at improving user privacy in the face of attacks. Conducting a


similar long-term study at a larger scale, in a more scientific framework, is a difficult

yet promising future research direction.

5.6 Related Work

Previous work has studied capturing user intent to implement user-driven access con-

trol. Roesner et al. [90] present an approach in which permission granting is built

into user interactions with permission-granting GUI elements called access control

gadgets (ACG). The authors extend ServiceOS to provide this capability to applica-

tion developers, and require that applications be modified to use ACGs. This work

captures user intent at a fine granularity and provides stronger security guarantees

than Overhaul as each action is precisely mapped to a permission. However, our

goal is to propose an architecture that can be retrofitted into traditional OSes trans-

parently. In our work, we encountered a different set of challenges stemming from

the fact that we are dealing with traditional systems (i.e., Linux) that do not provide

the features that ServiceOS does.

Ringer et al. [103] take a different approach, and provide access control gadgets via

a secure user space library, combined with static and dynamic analyses. Then, they

port various Android applications to work with their system. For similar reasons as

above, this work provides stronger security guarantees than Overhaul. In contrast,

Overhaul treats user space applications in a blackbox manner, and does not require

modifications to them.

Gyrus [104] is a virtualization-based system that displays editable UI field entries

in text-based networked applications back to the user through a trusted output chan-

nel, and guarantees that this is the information sent over the network. BLADE [105]


infers the authenticity of browser-based file downloads based on user behavior. While

sharing similar goals with Overhaul, these systems address different security problems.

Systems that use timing information to capture user intent include BINDER [106]

and Not-a-Bot [107]. BINDER associates outbound network connections with input

events to build a host-based IDS. However, its design does not address the challenges

of IPC, making it unsuitable for use with certain applications that Overhaul targets.

Not-a-Bot uses TPM-backed attestations to tag user-generated network traffic on

the host, and a verifier on the server that checks them to implement DDoS, spam,

and clickjacking mitigation measures. These systems target network-based attacks,

whereas Overhaul aims to control access to privacy-sensitive devices.

Some systems that advocate user-authentic gestures for secure copy & paste be-

tween domains are the EROS Window System (EWS) [108], Qubes OS [109], and

Tahoma [110]. Similarly, in this chapter, we also address the problem of secure copy

& paste so that malicious applications cannot intercept these requests. There has also

been much work in the domain of trusted computing. For example, Terra [111], Over-

shadow [112], and vTPM [113] use virtual machine technology for enabling trusted

computing. In contrast to the above, Overhaul does not require use of virtualization

or explicit user cooperation.

Several operating systems and applications employ popup prompts to defer privacy

policy decisions to users [114, 115, 116, 117]. However, this approach to user-driven

access control has been shown to suffer from usability issues; for instance, Motiee et

al. [98] demonstrate that Windows users often find User Account Control prompts

distracting, dismiss them without due diligence, or disable them completely. Over-

haul sidesteps these concerns by taking a transparent, non-intrusive approach. Flash

Player employs a mechanism that only allows clipboard operations initiated by user

input [118]. Overhaul generalizes this application-specific defense to the entire sys-


tem and other sensitive resources, and provides the additional security property that

user input cannot be generated synthetically.

Quire [119] is an extension to Android that enables applications to propagate

call chain context to downstream callees. Hence, applications can verify the sources

of user interactions, and make policy decisions accordingly. There has also been

much work that aims to enforce install time application permissions within Android

(e.g., Kirin [120], Saint [121], Apex [122]). These approaches enable the user to

define policies for protecting themselves against malicious applications. Overhaul

is orthogonal to the smartphone platform security work.

SELinux [123] enables MAC policies on Linux, and is mostly used to restrict

daemons such as database engines or web servers that have clearly defined data access

rights. While SELinux does not address the problem of dynamic access control,

Overhaul could in principle make use of SELinux for enforcing access control, as

an implementation choice.

5.7 Summary

Security models for traditional operating systems center on multiplexed computation

on timesharing systems, where multiple users share access to a single set of computing

resources. However, the shift towards dedicated devices with single users has resulted

in a fundamental impedance mismatch between the traditional model of users, groups,

and processes and the needs of modern systems. In particular, contemporary threats

often take the form of malicious programs that execute with the full privileges of

the user, rendering user-based security models largely ineffective. Mobile operating

systems such as iOS and Android, as well as research systems such as ServiceOS [90],

have promoted the concept of user-driven, dynamic access control to address the


shortcomings of traditional access control models. Here, permissions to access sensi-

tive resources are granted by users on-demand. However, operating systems for the

desktop and server have been largely neglected by these advances, since prior work

has required that applications be designed with dynamic access control in mind.

In this chapter we presented Overhaul, a general architecture for retrofitting a

dynamic, input-driven access control model into traditional operating systems in a

transparent manner. In our access control model, access to privacy-sensitive resources

is mediated based on the temporal proximity of user interactions to access requests.

We built upon this architecture to demonstrate how input-driven access control can be

implemented to protect privacy-sensitive resources such as the microphone, camera,

clipboard, and display contents.

The proposed design and implementation satisfies all of the research goals we laid

out in Section 1.5. (G1) We presented an abstract design of Overhaul indepen-

dent of the underlying operating system, and described a practical implementation

for Linux and X Window System. (G2) Overhaul is applicable to any software that

requires access to the privacy-sensitive resources covered by the architecture. (G3)

Overhaul necessitates no explicit effort to make applications conform to the pro-

posed input-driven access control model; existing applications can remain oblivious

to the presence of Overhaul and still benefit from Overhaul’s security properties.

(G4) Overhaul is demonstrably performant, usable, and applicable to most appli-

cations; in particular, it requires no changes to the traditional computing interface.


Chapter 6

Conclusions

Together, the changing technology and adversarial models gradually render existing

privacy defenses obsolete, or otherwise lead to the emergence of previously unexplored

privacy challenges. This evolving nature of privacy threats requires the security community to continuously innovate and develop novel defenses. A significant aspect

of these efforts is to ensure that new defenses can easily enter into practice and achieve

widespread adoption. To this end, factors such as correctness of the solutions, de-

ployment and maintenance costs, scalability, and usability are of utmost importance.

From a technical standpoint, the operating system is a natural and convenient

platform to develop novel defenses on. In particular, an operating system-based

defense would allow the enforcement of strong security properties at scale, on all user

space applications. However, despite these advantages, such an approach would suffer

on the cost and usability front. Rolling out a new operating system in lieu of the

already widely established, popular alternatives is unlikely to find support in practice.

In this thesis, in light of the above considerations, we argued that retrofitting

novel privacy solutions into existing operating systems is the preferable approach. In

this way, solution developers can leverage the technical advantages of working at the


operating system level, while also sidestepping many of the cost and usability related

concerns. We illustrated that our proposed approach is not only feasible, but also

effective, by discussing four contemporary privacy threats, and presenting solutions

to address them.

In Chapter 2, we looked at keeping privacy-sensitive data produced during short-

lived program execution sessions, and persisted to disk, confidential. We then pre-

sented PrivExec as a means of providing an application-agnostic private execution

service inside the operating system.

In Chapter 3, we explored the challenges of securely discarding long-term persis-

tent data from modern storage hardware, once it is no longer needed. To address this

issue, we presented Eraser as a technique that can perform secure file deletion on

any blackbox storage medium.

In Chapter 4, we examined existing plausibly deniable disk encryption techniques

that allow users to hide the existence of privacy-sensitive data on their disk, and

pointed out their vulnerability to multiple-snapshot attacks. Next, we presented

Hive, a hidden volume encryption scheme that remains secure even in the face of

such attacks.

In Chapter 5, we discussed the shortcomings of traditional access control mech-

anisms in securing emerging, unconventional types of privacy-sensitive system re-

sources. We presented a user-driven access control model and operating system

architecture, collectively called Overhaul, to address this problem on traditional

desktop operating systems.

As evidenced by the operating system-independent designs, concrete Linux imple-

mentations, and evaluation of each of these systems, retrofitting novel privacy defenses

into existing operating systems is indeed possible. Furthermore, all four techniques

satisfy our research goals of (G1) designing solutions compatible with prevalent op-


erating systems, (G2) offering general privacy guarantees to entire classes of appli-

cations, (G3) requiring no modifications to user space applications, and finally, (G4)

providing good performance, and familiar interfaces to users.

Future Research Directions

In this thesis, we primarily focused on extending operating systems with defenses

that provide strong confidentiality of privacy-sensitive data on a host computer. In

principle, the general philosophy we presented in this thesis could also be applied

to techniques that address various other types of privacy issues. For instance, an

operating system service could be designed to transparently provide any application

with end-to-end secure communication capabilities over insecure network connections.

In the following, however, we turn our attention to ideas related to

the specific privacy defenses we presented, and discuss their implications on possible

future research directions.

All four techniques we discussed in this thesis strictly conform to our overarch-

ing research goals of achieving compatibility, generality, transparency, and usability.

While, in an ideal world, all privacy defenses could greatly benefit from following

these principles, it is also important to recognize that relaxing some of these require-

ments can allow for a wider range of novel privacy techniques. One promising avenue

to investigate would be to relax the transparency requirement, and explore whether

cooperative applications (i.e., applications that are aware of the operating system’s

privacy services, and explicitly request or interact with them) could benefit from ad-

ditional, or stronger privacy guarantees. For example, PrivExec could be extended

with a fine-grained private execution API that would allow for more control over the

degree or types of privacy an application would like to provide to users.


Another promising research direction would be to explore ways to automatically

capture user intent, or in other words, link users’ intents to their actions when in-

teracting with a computer. One immediate application of such a capability would

be to address the limitations of Overhaul, and close the gap between white-box

approaches [90] that require applications to be written with user-driven access con-

trol and the black-box approach adopted here. For instance, it could be possible to

leverage static and dynamic program analyses to more precisely link user intent, user

input, and device accesses, all without requiring modifications to existing programs.

Of course, such a capability would also serve as a strong general primitive that

could be applied to different contexts and threats, and assist the security community

in designing defenses that target human vulnerabilities, or otherwise significantly

improve the usability of security tools.

Acknowledgments

The research presented in Chapter 2 is based on the author's previously published work:

Kaan Onarlioglu, Collin Mulliner, William Robertson, and Engin Kirda. PrivExec:

Private Execution as an Operating System Service. In IEEE Symposium on Security

and Privacy, 2013.

The research presented in Chapter 4 is based on the author's previously published

work: Erik-Oliver Blass, Travis Mayberry, Guevara Noubir, and Kaan Onarlioglu.

Toward Robust Hidden Volumes using Write-Only Oblivious RAM. In ACM Confer-

ence on Computer and Communications Security, 2014.

The research presented in Chapter 5 is based on the author's previously published

work: Kaan Onarlioglu, William Robertson, and Engin Kirda. Overhaul: Input-

Driven Access Control for Better Privacy on Traditional Operating Systems. In

IEEE/IFIP International Conference on Dependable Systems and Networks, 2016.


Bibliography

[1] The New York Times. Apple Fights Order to Unlock San Bernardino Gunman's iPhone. http://www.nytimes.com/2016/02/18/technology/apple-timothy-cook-fbi-san-bernardino.html, 2016.

[2] A. Czeskis, D.J. St. Hilaire, K. Koscher, S.D. Gribble, T. Kohno, and B. Schneier. Defeating Encrypted and Deniable File Systems: TrueCrypt v5.1a and the Case of the Tattling OS and Applications. In USENIX Summit on Hot Topics in Security, 2008.

[3] Gaurav Aggarwal, Elie Bursztein, Collin Jackson, and Dan Boneh. An Analysis of Private Browsing Modes in Modern Browsers. In USENIX Security Symposium, 2010.

[4] The PaX Team. PaX Address Space Layout Randomization (ASLR). http://pax.grsecurity.net/docs/aslr.txt, 2003.

[5] The PaX Team. PaX Non-Executable Pages (NOEXEC). http://pax.grsecurity.net/docs/noexec.txt, 2003.

[6] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. Control-Flow Integrity. In ACM Conference on Computer and Communications Security, 2005.

[7] Michael Weissbacher, Tobias Lauinger, and William Robertson. Why is CSP Failing? Trends and Challenges in CSP Adoption. In International Symposium on Research in Attacks, Intrusions and Defenses, 2014.

[8] eCryptfs. https://launchpad.net/ecryptfs.

[9] Overlayfs Filesystem. https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt.

[10] dm-crypt. http://code.google.com/p/cryptsetup/wiki/DMCrypt.

[11] Bonnie++. http://www.coker.com.au/bonnie++/.

[12] Selenium – Web Browser Automation. http://seleniumhq.org/.


[13] xdotool. http://www.semicomplete.com/projects/xdotool/xdotool.xhtml.

[14] Sotiris Ioannidis, Stelios Sidiroglou, and Angelos D. Keromytis. Privacy as an Operating System Service. In USENIX Summit on Hot Topics in Security, 2006.

[15] Alan M. Dunn, Michael Z. Lee, Suman Jana, Sangman Kim, Mark Silberstein, Yuanzhong Xu, Vitaly Shmatikov, and Emmett Witchel. Eternal Sunshine of the Spotless Machine: Protecting Privacy with Ephemeral Channels. In USENIX Conference on Operating Systems Design and Implementation, 2012.

[16] Su Mon Kywe, Christopher Landis, Yutong Pei, Justin Satterfield, Yuan Tian, and Patrick Tague. PrivateDroid: Private Browsing Mode for Android. In IEEE International Conference on Trust, Security and Privacy in Computing and Communications, 2014.

[17] Judicael Briand Djoko, Brandon Jennings, and Adam J. Lee. TPRIVEXEC: Private Execution in Virtual Memory. In ACM Conference on Data and Application Security and Privacy, 2016.

[18] Edward W. Felten and Michael A. Schneider. Timing Attacks on Web Privacy. In ACM Conference on Computer and Communications Security, 2000.

[19] Andrew Clover. CSS visited pages disclosure. http://seclists.org/bugtraq/2002/Feb/271, 2002.

[20] Artur Janc and Lukasz Olejnik. Web Browser History Detection as a Real-world Privacy Threat. In European Symposium on Research in Computer Security, 2010.

[21] Adil Alsaid and David Martin. Detecting Web Bugs with Bugnosis: Privacy Advocacy through Education. In Privacy Enhancing Technologies, 2003.

[22] Collin Jackson, Andrew Bortz, Dan Boneh, and John C. Mitchell. Protecting Browser State from Web Privacy Attacks. In World Wide Web Conference, 2006.

[23] Markus Jakobsson and Sid Stamm. Invasive Browser Sniffing and Countermeasures. In World Wide Web Conference, 2006.

[24] Umesh Shankar and Chris Karlof. Doppelganger: Better Browser Privacy Without the Bother. In ACM Conference on Computer and Communications Security, 2006.


[25] Huwida Said, Noora Al Mutawa, Ibtesam Al Awadhi, and Mario Guimaraes. Forensic Analysis of Private Browsing Artifacts. In IEEE Innovations in Information Technology, 2011.

[26] Meng Xu, Yeongjin Jang, Xinyu Xing, Taesoo Kim, and Wenke Lee. UCognito: Private Browsing Without Tears. In ACM Conference on Computer and Communications Security, 2015.

[27] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman, Jacob Appelbaum, and Edward W. Felten. Lest We Remember: Cold Boot Attacks on Encryption Keys. In USENIX Security Symposium, 2008.

[28] David Lie, Chandramohan Thekkath, Mark Mitchell, Patrick Lincoln, Dan Boneh, John Mitchell, and Mark Horowitz. Architectural Support for Copy and Tamper Resistant Software. In ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2000.

[29] G. Edward Suh, Dwaine Clarke, Blaise Gassend, Marten van Dijk, and Srinivas Devadas. AEGIS: Architecture for Tamper-Evident and Tamper-Resistant Processing. In International Conference on Supercomputing, 2003.

[30] Peter A. H. Peterson. Cryptkeeper: Improving Security with Encrypted RAM. In IEEE International Conference on Technologies for Homeland Security, 2010.

[31] Jim Chow, Ben Pfaff, Tal Garfinkel, and Mendel Rosenblum. Shredding Your Garbage: Reducing Data Lifetime through Secure Deallocation. In USENIX Security Symposium, 2005.

[32] Niels Provos. Encrypting Virtual Memory. In USENIX Security Symposium, 2000.

[33] Matt Blaze. A Cryptographic File System for UNIX. In ACM Conference on Computer and Communications Security, 1993.

[34] Erez Zadok, Ion Badulescu, and Alex Shender. Cryptfs: A Stackable Vnode Level Encryption File System. Technical report, Computer Science Department, Columbia University, 1998.

[35] EncFS. www.arg0.net/encfs.

[36] BitLocker. http://windows.microsoft.com/en-US/windows7/products/features/bitlocker.


[37] Yang Tang, Phillip Ames, Sravan Bhamidipati, Ashish Bijlani, Roxana Geambasu, and Nikhil Sarda. CleanOS: Limiting Mobile Data Exposure with Idle Eviction. In USENIX Conference on Operating Systems Design and Implementation, 2012.

[38] Kevin Borders, Eric Vander Weele, Billy Lau, and Atul Prakash. Protecting Confidential Data on Personal Computers with Storage Capsules. In USENIX Security Symposium, 2009.

[39] Dan Boneh and Richard J. Lipton. A Revocable Backup System. In USENIX Security Symposium, 1996.

[40] Radia Perlman. The Ephemerizer: Making Data Disappear. Technical report, Sun Microsystems, Inc., 2005.

[41] Zachary N. J. Peterson, Randal Burns, Joe Herring, Adam Stubblefield, and Aviel D. Rubin. Secure Deletion for a Versioning File System. In USENIX Conference on File and Storage Technologies, 2005.

[42] Joel Reardon, Srdjan Capkun, and David Basin. Data Node Encrypted File System: Efficient Secure Deletion for Flash Memory. In USENIX Security Symposium, 2012.

[43] shred(1) – Linux man page. http://www.gnu.org/software/coreutils/.

[44] Steven Bauer and Nissanka B. Priyantha. Secure Data Deletion for Linux File Systems. In USENIX Security Symposium, 2001.

[45] Nikolai Joukov, Harry Papaxenopoulos, and Erez Zadok. Secure Deletion Myths, Issues, and Solutions. In ACM Workshop on Storage Security and Survivability, 2006.

[46] Hubert Ritzdorf, Nikolaos Karapanos, and Srdjan Capkun. Assisted Deletion of Related Content. In Annual Computer Security Applications Conference, 2014.

[47] Zhenkai Liang, Weiqing Sun, V. N. Venkatakrishnan, and R. Sekar. Alcatraz: An Isolated Environment for Experimenting with Untrusted Software. ACM Transactions on Information and System Security, 12(3):14:1–14:37, 2009.

[48] Shvetank Jain, Fareha Shafique, Vladan Djeric, and Ashvin Goel. Application-Level Isolation and Recovery with Solitude. In European Conference on Computer Systems, 2008.

[49] Yanlin Li, Jonathan McCune, James Newsome, Adrian Perrig, Brandon Baker, and Will Drewry. MiniBox: A Two-Way Sandbox for x86 Native Code. In USENIX Annual Technical Conference, 2014.


[50] Francis Hsu, Hao Chen, Thomas Ristenpart, Jason Li, and Zhendong Su. Back to the Future: A Framework for Automatic Malware Removal and System Repair. In Annual Computer Security Applications Conference, 2006.

[51] Suman Jana, Donald E. Porter, and Vitaly Shmatikov. TxBox: Building Secure, Efficient Sandboxes with System Transactions. In IEEE Symposium on Security and Privacy, 2011.

[52] Joel Reardon, David Basin, and Srdjan Capkun. SoK: Secure Data Deletion. In IEEE Symposium on Security and Privacy, 2013.

[53] Michael Wei, Laura M. Grupp, Frederick E. Spada, and Steven Swanson. Reliably Erasing Data from Flash-Based Solid State Drives. In USENIX Conference on File and Storage Technologies, 2011.

[54] Nikolai Joukov and Erez Zadok. Adding Secure Deletion to Your Favorite File System. In IEEE International Security in Storage Workshop, 2005.

[55] Gordon F. Hughes, Tom Coughlin, and Daniel M. Commins. Disposal of Disk and Tape Data by Secure Sanitization. In IEEE Symposium on Security and Privacy, 2009.

[56] Berke Durak. Wipe. https://github.com/berke/wipe, 2009.

[57] Peter Gutmann. Secure Deletion of Data from Magnetic and Solid-State Memory. In USENIX Security Symposium, 1996.

[58] Apple, Inc. Mac OS X: About Disk Utility's erase free space feature. https://support.apple.com/kb/HT3680, 2016.

[59] Jim Garlick. diskscrub. https://code.google.com/archive/p/diskscrub/, 2008.

[60] Jaeheung Lee, Sangho Yi, Junyoung Heo, Hyungbae Park, Sung Y. Shin, and Yookun Cho. An Efficient Secure Deletion Scheme for Flash File Systems. Journal of Information Science and Engineering, 2010.

[61] Sarah Diesburg, Christopher Meyers, Mark Stanovich, Michael Mitchell, Justin Marshall, Julia Gould, An-I Andy Wang, and Geoff Kuenning. TrueErase: Per-file Secure Deletion for the Storage Data Path. In Annual Computer Security Applications Conference, 2012.

[62] Sarah Diesburg, Christopher Meyers, Mark Stanovich, An-I Andy Wang, and Geoff Kuenning. TrueErase: Leveraging an Auxiliary Data Path for Per-File Secure Deletion. ACM Transactions on Storage, 12(4):18:1–18:37, 2016.


[63] Steven Swanson and Michael Wei. SAFE: Fast, Verifiable Sanitization for SSDs. Technical report, University of California, San Diego, 2010.

[64] Joel Reardon, Hubert Ritzdorf, David Basin, and Srdjan Capkun. Secure Data Deletion from Persistent Media. In ACM Conference on Computer and Communications Security, 2013.

[65] Michael Larabel. Phoronix – The Performance Impact Of Linux Disk Encryption On Ubuntu 14.04 LTS. http://www.phoronix.com/scan.php?page=article&item=ubuntu_1404_encryption.

[66] Device-mapper – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/device-mapper/.

[67] Kernel Probes (Kprobes) – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/kprobes.txt.

[68] Clemens Fruhwirth. New Methods in Hard Disk Encryption. http://clemens.endorphin.org/cryptography, 2005.

[69] Network Block Device (TCP version) – Linux Kernel Documentation. https://www.kernel.org/doc/Documentation/blockdev/nbd.txt.

[70] Oded Goldreich and Rafail Ostrovsky. Software Protection and Simulation on Oblivious RAMs. Journal of the ACM, 43(3):431–473, 1996.

[71] Elaine Shi, T.-H. Hubert Chan, Emil Stefanov, and Mingfei Li. Oblivious RAM with O(log^3(N)) Worst-Case Cost. In International Conference on the Theory and Applications of Cryptology and Information Security, 2011.

[72] Emil Stefanov, Marten van Dijk, Elaine Shi, Christopher Fletcher, Ling Ren, Xiangyao Yu, and Srinivas Devadas. Path ORAM: An Extremely Simple Oblivious RAM Protocol. In ACM Conference on Computer and Communications Security, 2013.

[73] Phillip Rogaway. Nonce-Based Symmetric Encryption. In Fast Software Encryption, 2004.

[74] Birger Jansson. Choosing a Good Appointment System – A Study of Queues of the Type (D, M, 1). Operations Research, 14(2):292–312, 1966.

[75] Lichun Li and Anwitaman Datta. Write-Only Oblivious RAM-based Privacy-Preserved Access of Outsourced Data. International Journal of Information Security, 2016.


[76] Travis Mayberry, Erik-Oliver Blass, and Agnes Hui Chan. Efficient Private File Retrieval by Combining ORAM and PIR. In Network and Distributed System Security Symposium, 2014.

[77] Ran Canetti, Cynthia Dwork, Moni Naor, and Rafail Ostrovsky. Deniable Encryption. In Advances in Cryptology, 1997.

[78] TrueCrypt. Free Open-Source On-the-Fly Encryption. http://www.truecrypt.org/.

[79] Adam Skillen and Mohammad Mannan. On Implementing Deniable Storage Encryption for Mobile Devices. In Network and Distributed System Security Symposium, 2013.

[80] Sarah Dean. FreeOTFE, 2010. Archive available at https://web.archive.org/web/20130531062457/http://freeotfe.org/.

[81] Julian Assange, Ralf Philipp Weinmann, and Suelette Dreyfus. Rubberhose File System, 2001. Archive available at http://web.archive.org/web/20120716034441/http://marutukku.org/.

[82] Ross Anderson, Roger Needham, and Adi Shamir. The Steganographic File System. In Information Hiding, 1998.

[83] Andrew D. McDonald and Markus G. Kuhn. StegFS: A Steganographic File System for Linux. In Information Hiding, 1999.

[84] Hwee Hwa Pang, Kian-Lee Tan, and Xuan Zhou. StegFS: A Steganographic File System. In International Conference on Data Engineering, 2003.

[85] Adam Skillen and Mohammad Mannan. On Implementing Deniable Storage Encryption for Mobile Devices. In Network and Distributed System Security Symposium, 2013.

[86] Adam Skillen and Mohammad Mannan. Mobiflage: Deniable Storage Encryption for Mobile Devices. IEEE Transactions on Dependable and Secure Computing, 11(3):224–237, 2014.

[87] Xingjie Yu, Bo Chen, Zhan Wang, Bing Chang, Wen Tao Zhu, and Jiwu Jing. MobiHydra: Pragmatic and Multi-level Plausibly Deniable Encryption Storage for Mobile Devices. In Information Security, 2014.

[88] Bing Chang, Zhan Wang, Bo Chen, and Fengwei Zhang. MobiPluto: File System Friendly Deniable Storage for Mobile Devices. In Annual Computer Security Applications Conference, 2015.


[89] Kenneth G. Paterson and Mario Strefler. A Practical Attack Against the Use of RC4 in the HIVE Hidden Volume Encryption System. In ACM Symposium on Information, Computer and Communications Security, 2015.

[90] Franziska Roesner, Tadayoshi Kohno, Alexander Moshchuk, Bryan Parno, Helen J. Wang, and Crispin Cowan. User-Driven Access Control: Rethinking Permission Granting in Modern Operating Systems. In IEEE Symposium on Security and Privacy, 2012.

[91] CERT Polska – Slave, Banatrix and Ransomware. http://www.cert.pl/news/10358.

[92] Dell SonicWALL Security Center – Malware switches users Bank Account Number with that of the attacker. https://www.mysonicwall.com/sonicalert/searchresults.aspx?ev=article&id=614.

[93] Alexander Gostev. The Flame: Questions and Answers. http://securelist.com/blog/incidents/34344/the-flame-questions-and-answers-51/.

[94] Trojan-Spy:W32/Zbot. http://www.f-secure.com/v-descs/trojan-spy_w32_zbot.shtml.

[95] Robert W. Scheifler. X Window System Protocol. http://www.x.org/releases/X11R7.7/doc/xproto/x11protocol.html.

[96] Kieron Drake. XTEST Extension Protocol. http://www.x.org/releases/X11R7.7/doc/xextproto/xtest.html.

[97] Lin-Shung Huang, Alex Moshchuk, Helen J. Wang, Stuart Schechter, and Collin Jackson. Clickjacking: Attacks and Defenses. In USENIX Security Symposium, 2012.

[98] Sara Motiee, Kirstie Hawkey, and Konstantin Beznosov. Do Windows Users Follow the Principle of Least Privilege? Investigating User Account Control Practices. In Symposium on Usable Privacy and Security, 2010.

[99] Jonathan Corbet. MIT-SHM (The MIT Shared Memory Extension). http://www.x.org/releases/X11R7.7/doc/xextproto/shm.html.

[100] David Rosenthal. Inter-Client Communication Conventions Manual. http://www.x.org/releases/X11R7.7/doc/xorg-docs/icccm/icccm.html.

[101] J. Salim, H. Khosravi, A. Kleen, and A. Kuznetsov. Linux Netlink as an IP Services Protocol. http://www.ietf.org/rfc/rfc3549.txt, 2003.


[102] Chris Wright, Crispin Cowan, James Morris, Stephen Smalley, and Greg Kroah-Hartman. Linux Security Modules: General Security Support for the Linux Kernel. In USENIX Security Symposium, 2002.

[103] Talia Ringer, Dan Grossman, and Franziska Roesner. AUDACIOUS: User-Driven Access Control with Unmodified Operating Systems. In ACM Conference on Computer and Communications Security, 2016.

[104] Yeongjin Jang, Simon P. Chung, Bryan D. Payne, and Wenke Lee. Gyrus: A Framework for User-Intent Monitoring of Text-Based Networked Applications. In Network and Distributed System Security Symposium, 2014.

[105] Long Lu, Vinod Yegneswaran, Phillip Porras, and Wenke Lee. BLADE: An Attack-agnostic Approach for Preventing Drive-by Malware Infections. In ACM Conference on Computer and Communications Security, 2010.

[106] Weidong Cui, Randy H. Katz, and Wai-tian Tan. Design and Implementation of an Extrusion-based Break-In Detector for Personal Computers. In Annual Computer Security Applications Conference, 2005.

[107] Ramakrishna Gummadi, Hari Balakrishnan, Petros Maniatis, and Sylvia Ratnasamy. Not-a-Bot: Improving Service Availability in the Face of Botnet Attacks. In USENIX Symposium on Networked Systems Design and Implementation, 2009.

[108] Jonathan S. Shapiro, John Vanderburgh, Eric Northup, and David Chizmadia. Design of the EROS Trusted Window System. In USENIX Security Symposium, 2004.

[109] The Qubes OS Project. http://www.qubes-os.org/trac.

[110] Richard S. Cox, Steven D. Gribble, Henry M. Levy, and Jacob Gorm Hansen. A Safety-Oriented Platform for Web Applications. In IEEE Symposium on Security and Privacy, 2006.

[111] Tal Garfinkel, Ben Pfaff, Jim Chow, Mendel Rosenblum, and Dan Boneh. Terra: A Virtual Machine-based Platform for Trusted Computing. In ACM Symposium on Operating Systems Principles, 2003.

[112] Xiaoxin Chen, Tal Garfinkel, E. Christopher Lewis, Pratap Subrahmanyam, Carl A. Waldspurger, Dan Boneh, Jeffrey Dwoskin, and Dan R.K. Ports. Overshadow: A Virtualization-based Approach to Retrofitting Protection in Commodity Operating Systems. ACM SIGOPS Operating Systems Review, 42(2), 2008.


[113] Stefan Berger, Ramon Caceres, Kenneth A. Goldman, Ronald Perez, Reiner Sailer, and Leendert van Doorn. vTPM: Virtualizing the Trusted Platform Module. In USENIX Security Symposium, 2006.

[114] OS X Mountain Lion: Prompted for access to contacts when opening an application. http://support.apple.com/en-us/HT202531.

[115] Flash Player Help – Privacy settings. http://www.macromedia.com/support/documentation/en/flashplayer/help/help09.html.

[116] iOS Developer Library – Getting the User's Location. https://developer.apple.com/library/ios/documentation/UserExperience/Conceptual/LocationAwarenessPG/CoreLocation/CoreLocation.html.

[117] Windows Help – What is User Account Control? http://windows.microsoft.com/en-us/windows/what-is-user-account-control.

[118] Ian Melven. User-initiated action requirements in Flash Player 10. http://www.adobe.com/devnet/flashplayer/articles/fplayer10_uia_requirements.html.

[119] Michael Dietz, Shashi Shekhar, Yuliy Pisetsky, Anhei Shu, and Dan S. Wallach. Quire: Lightweight Provenance for Smart Phone Operating Systems. In USENIX Security Symposium, 2011.

[120] William Enck, Machigar Ongtang, and Patrick McDaniel. On Lightweight Mobile Phone Application Certification. In ACM Conference on Computer and Communications Security, 2009.

[121] Machigar Ongtang, Stephen McLaughlin, William Enck, and Patrick McDaniel. Semantically Rich Application-centric Security in Android. In Annual Computer Security Applications Conference, 2009.

[122] Mohammad Nauman, Sohail Khan, and Xinwen Zhang. Apex: Extending Android Permission Model and Enforcement with User-defined Runtime Constraints. In ACM Symposium on Information, Computer and Communications Security, 2010.

[123] SELinux Project. http://selinuxproject.org/page/Main_Page.
