25
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis Avgerinos, David Brumley 8/16/2010 Carnegie Mellon University 1 (Yes, we were trying to overflow the title length field on the submission server)

All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

All You Ever Wanted to Know AboutDynamic Taint Analysis

&Forward Symbolic Execution

(but might have been afraid to ask)

Edward J. Schwartz, Thanassis Avgerinos, David Brumley

8/16/2010 Carnegie Mellon University 1

(Yes, we were trying to overflow the title length fieldon the submission server)

Page 2: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

A Few Things You Need to Know AboutDynamic Taint Analysis

&Forward Symbolic Execution

(but might have been afraid to ask)

Edward J. Schwartz, Thanassis Avgerinos, David Brumley

8/16/2010 Carnegie Mellon University 2

Page 3: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

The Root of All Evil

8/16/2010 Carnegie Mellon University 3

Humans write programs

This Talk:Computers Analyzing Programs Dynamically at

Runtime

Page 4: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Two Essential Runtime Analyses

8/16/2010 Carnegie Mellon University 4

Dynamic Taint Analysis:What values are derived from user input?

Detect Exploits [Costa2005,Crandall2005,Newsome2005,Suh2004]

Detectpacking in malware

[Bayer2009,Yin2007]

Forward Symbolic Execution:What input will make execution reach this line of code?

Input Filter Generation [Costa2007,Brumley2008]

Automated Test Case Generation

[Cadar2008,Godefroid2005,Sen2005]

Page 5: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Our Contributions

1: Turn English descriptions into an algorithm

– Operational Semantics

2: Algorithm highlights caveats, issues, and unsolved problems that are deceptively hard

8/16/2010 Carnegie Mellon University 5

Dynamic Taint Analysis:Is this value affected by user input?

Forward Symbolic Execution:What input will make execution

reach this line of code?

Computers Analyzing Programs Dynamically at Runtime

Page 6: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Our Contributions (cont’d)

3: Systematize recurring themes in a wealth of previous work

8/16/2010 Carnegie Mellon University 6

Page 7: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

1. How it works – example

2. Desired properties

3. Example issue. Paper has many more.

8/16/2010 Carnegie Mellon University 7

Dynamic Taint Analysis:What values are derived from user input?

Page 8: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 8

x = get_input( )

y = x + 42

goto yInput is tainted

untaintedtainted

x 7

ΔVar Val

Tx

Tainted?Var

τ

Inputt = IsUntrusted(src)get_input(src)↓ t

Taint Introduction

Page 9: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 9

x = get_input( )

y = x + 42

goto yData derived from

user input is tainted

untaintedtainted

y 49

ΔVar Val

x 7

Ty

Tainted?

T

Var

x

τ

BinOpt1 = τ[x1] , t2 = τ[x2]

x1 + x2 ↓ t1 v t2

Taint Propagation

Page 10: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 10

Policy ViolationDetected

x = get_input( )

y = x + 42

goto y

untaintedtainted ΔVar Val

x 7

y 49

Tainted?

T

T

Var

x

y

τTaint Checking

Pgoto(ta) = ¬ ta

(Must be true to execute)

Page 11: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 11

x = get_input( )

y = …

goto y

…strcpy(buffer,argv[1]) ;…return ;

Jumping to overwritten

return address

Real Use:Exploit Detection

Different Use:Program Control

Page 12: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Memory Load

8/16/2010 Carnegie Mellon University 12

Variables Memory

ΔVar Val

x 7

Tainted?

T

Var

x

τ

μAddr Val

7 42

Tainted?

F

Addr

7

τμ

Page 13: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Problem: Memory Addresses

8/16/2010 Carnegie Mellon University 13

x = get_input( )y = load( x )… goto y

All values derived from user input

are tainted??

7 42μ

Addr Val

Tainted?

F

Addr

7τμ

x 7Δ

Var Val

Page 14: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

μAddr Val

x = get_input( )y = load( x )… goto y

Jump target could be any untainted

memory cell value

Policy 1:

8/16/2010 Carnegie Mellon University 14

Loadv = Δ*x] , t = τμ[v]

load(x) ↓ t

Taint depends only on the memory cell

Taint Propagation

7 42

Tainted?

F

Addr

7τμ

x 7Δ

Var Val

UndertaintingFailing to identify tainted values

- e.g., missing exploits

Page 15: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

jmp_table

Policy Violation?

8/16/2010 Carnegie Mellon University 15

x = get_input( )y = load(jmp_table + x % 2 )…goto y

Policy 2:

Memory

printa

printb

Address expression is tainted

Load v = Δ*x] , t = τμ[v], ta = τ[x]load(x) ↓ t v ta

If either the address or the memory cell is tainted, then the value is tainted

Taint Propagation

OvertaintingUnaffected values are tainted

- e.g., exploits on safe inputs

Page 16: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Research ChallengeState-of-the-Art is not perfect for all

programs

8/16/2010 Carnegie Mellon University 16

Undertainting:Policy may miss taint

Overtainting:Policy may wrongly

detect taint

Page 17: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

• How it works – example

• Inherent problems of symbolic execution

• Proposed solutions

8/16/2010 Carnegie Mellon University 17

Forward Symbolic Execution:What input will make execution reach this line of code?

Page 18: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

The Challenge

8/16/2010 Carnegie Mellon University 18

0x12345678

232 possible inputs

packet_len(int header, char *packet)char buf*2048+ = “…”;if (header < 0)

return 0;if (header == 0x12345678)

strcpy(buf, packet); return strlen(buf);

Forward Symbolic Execution:What input will make execution

reach this line of code?

Page 19: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

f t

f t

A Simple Example

8/16/2010 Carnegie Mellon University 19

header < 0

header symboliccan have any

value

packet_len(int header, …)

If (header < 0)

If header == 0x12345678 return 0;

strcpy(buf,packet);return strlen(buf);

Interpreter

InterpreterInterpreter

InterpreterInterpreter

header ≥ 0 Λheader != 0x12345678

header ≥ 0 Λheader == 0x12345678

header ≥ 0

What input will make execution reach this line of

code?

Page 20: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 20

One Problem: Exponential Blowup Due to Branches

Branch 2

Branch 3

Branch 1

Exponential Number of Interpreters/formulas in # of branches

Interpreter

Page 21: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 21

Path Selection Heuristics

Symbolic Execution Tree

• Depth-First Search (bounded) ,Random Search [Cadar2008]

• Concolic Testing [Sen2005,Godefroid2008]…

However, these are heuristics. In the worst case all create an exponential number of formulas in the tree height.

Page 22: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Symbolic Execution is not Easy

• Exponential number of interpreters/formulas

• Exponentially-sized formulas

• Solving a formula is NP-Complete!

8/16/2010 Carnegie Mellon University 22

branching

substitutions + s + s + s +s + s + s + s == 42

Page 23: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Other Important Issues

8/16/2010 Carnegie Mellon University 23

Formalization

Π = (s + s + s + s +s + s + s + s) ==

42

More complex policies

Page 24: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

Conclusion

• Dynamic taint analysis and forward symbolic execution used extensively in literature

– Formal algorithm and what is done for each possible step of execution often not emphasized

• We provided a formal definition and summarized

– Critical issues

– State-of-the-art solutions

– Common tradeoffs

8/16/2010 Carnegie Mellon University 24

Page 25: All You Ever Wanted to Know About Dynamic Taint …aavgerin/slides/WOOT10.pdfAll You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been

8/16/2010 Carnegie Mellon University 25

Questions?

Thank [email protected]