Checking the World's Software
David Brumley Carnegie Mellon University [email protected] http://security.ece.cmu.edu/
Black Hat
format c:
White Hat
vs.
2
An epic battle
Black Hat
format c:
White Hat
Bug
3
Exploit bugs
Black Hat
format c:
White Hat
Bug Fixed!
4
inp=`perl -e '{print "A"x8000}'`
for program in /usr/bin/*; do
  for opt in {a..z} {A..Z}; do
    timeout -s 9 1s $program -$opt $inp
  done
done
1009 Linux programs. 13 minutes. 52 new bugs in 29 programs.
5
Fact: Windows, Mac, and Linux all have 100,000s of bugs
6
Which bugs are exploitable?
7
Highly Trained Experts
8
CMU creates experts
9
10
11
DEFCON 2013 DEFCON 2014
12
The US is lagging.
13
...
picoCTF: Teaching 10,000 High School Students to Hack
+
15
16
17
> 12 hours of work on average
$25k scholarships
~10,000 each year
Largest cyber exercise ever
~2000 teams
57 problems
18
Will you encourage your students to compete next year?
[Survey chart: Yes / No]
Impact beyond high school

Promote awareness:
1. Cisco
2. Project Lead the Way
3. HS-CTF
4. BoilerQuest
5. ACTF
6. BCACTF
7. ...

Foundation for cyber:
1. US Naval Academy
2. US Air Force Academy
3. US Military Academy
4. US Coast Guard Academy
19
20
Which bugs are exploitable?
Even the best people are not enough...
DEF CON 2012 scoreboard
21
[Chart: CMU's score over time (3 days total)]
Does not scale
22
I skate to where the puck is going to be, not where it has been. --- Wayne Gretzky, Hockey Hall of Fame
23
The Vision: Automatically Check the World's Software for Exploitable Bugs
24
Checking the world's software
25
Program Analysis
[Diagram: program analysis classifies each program as safe, buggy, or exploitable.]
26
We owned the machine in seconds
27
28
Program Analysis
[Diagram: binary programs are fed to program analysis, which classifies each as safe, buggy, or exploitable.]

Symbolic Execution [Boyer75, Howden75, King76]

Verification takes a program and a correctness property: a correct program yields a safe path, an incorrect one a counterexample. The twist: replace the correctness property with an un-exploitability property, so an incorrect path is an exploitable one.
29
Verification, but with a twist
x = input()
if (x > 42)
  if (x*x == MAXINT)
    jmp stack[x]
30
Basic symbolic execution
x = input()    -- x can be anything
if (x > 42)
  if (x*x == MAXINT)
    jmp stack[x]

Each branch forks execution and adds a constraint:
- false branch of the first test: !(x > 42)
- false branch of the second test: (x > 42) ∧ (x*x != MAXINT)
- true branch of the second test: (x > 42) ∧ (x*x == MAXINT), reaching jmp stack[x]
31
The constraint accumulated along a path is its path formula: it is true exactly for the inputs that take that path. For example, the inputs that take the true branch of x > 42 and the false branch of x*x == MAXINT are exactly those satisfying (x > 42) ∧ (x*x != MAXINT).
32
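The branch-and-constraint idea above can be sketched in a few lines of Python. This is an illustrative toy, not the engine from the talk: a brute-force search over small inputs stands in for the SMT solver, and MAXINT is taken as 2^32 - 1.

```python
# Toy sketch of symbolic execution for the example program.
# A real engine hands each path formula to an SMT solver; here a
# brute-force search over small inputs stands in for the solver.

MAXINT = 2**32 - 1

# Each path through the program is a conjunction of branch constraints.
paths = {
    "!(x > 42)":                     lambda x: not (x > 42),
    "(x > 42) and (x*x != MAXINT)":  lambda x: x > 42 and x * x != MAXINT,
    "(x > 42) and (x*x == MAXINT)":  lambda x: x > 42 and x * x == MAXINT,  # jmp stack[x]
}

def solve(formula, bound=1000):
    """Return an input satisfying the formula (a concrete test case), or None."""
    for x in range(bound):
        if formula(x):
            return x
    return None

for name, formula in paths.items():
    print(name, "->", solve(formula))
```

For the second path the search finds x = 43, matching the satisfiable-formula-to-test-case step on the next slide.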
Basic symbolic execution
Feed each path formula to an SMT (satisfiability modulo theories) solver. If the formula is satisfiable, the satisfying assignment (e.g., x = 43 for (x > 42) ∧ (x*x != MAXINT)) is a concrete test case that drives execution down that path.
Control flow hijack
attacker gains control of execution
33
Same principle, different mechanism:
- buffer overflow
- format string attack
- heap metadata overwrite
- use-after-free
- ...
Basic execution semantics of compiled code
34
[Diagram: process memory holds code, data, heap, and stack. The processor fetches, decodes, and executes instructions, reading and writing memory. EIP, the instruction pointer, points to the next instruction to execute.]

Control Flow Hijack: EIP = attacker code
Checking non-exploitability
35
Un-exploitability property: EIP != user input

Conjoin the path formula with the negated property, e.g. (x > 42) ∧ (x*x == MAXINT) ∧ (EIP == user input) on the path reaching jmp stack[x], and hand it to the SMT solver: satisfiable means the satisfying input is an exploit; unsatisfiable means the path is safe.
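The un-exploitability check can be sketched the same way. Everything here is illustrative: brute force stands in for the SMT solver, and TOY_MAXINT is a made-up constant (not the real MAXINT) chosen so the dangerous path is actually reachable.

```python
# Toy sketch: negate the un-exploitability property (EIP != user input)
# and conjoin it with the path formula for the path reaching jmp stack[x].
# A satisfying input is a concrete exploit; unsatisfiable means safe.
# TOY_MAXINT is a hypothetical constant chosen so the path is reachable.

TOY_MAXINT = 43 * 43

def find_exploit(bound=10_000):
    for x in range(bound):
        path_formula = x > 42 and x * x == TOY_MAXINT  # reaches jmp stack[x]
        property_violated = True  # at jmp stack[x], EIP is taken from input x
        if path_formula and property_violated:
            return x  # this input hijacks EIP
    return None  # unsatisfiable: the path is safe

print(find_exploit())  # -> 43
```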
checking Debian for exploitable bugs
36
37,391 programs
209,000,000 test cases
2,606,506 crashes
13,875 unique bugs
152 new hijack exploits
3 years CPU time
16 billion verification queries
* [ARCB, ICSE 2014, ACM Distinguished Paper], [ACRSWB, CACM 2014]
~$0.28/confirmed bug ~$21/exploit
mining data
Q: How long do queries take on average?
A: 3.67 ms on average, with 0.34 variance.

Q: Should I optimize hard or easy formulae?
A: 99.99% take less than 1 second and account for 78% of total time.

Q: Do queries get harder?
A: Good question...
37
optimize fast queries
38
Size is not strongly correlated with hardness
[Scatter plot: formula size vs. solver time, SAT and UNSAT instances]
39
500 sec timeout
No clear upward trend
40
Only 39 programs create hard formulas
a/10 replaced with (a*0xcccccccd) >> 3
[Bar chart: per-program percentage, 0%–100%; 50% average]
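The a/10 example deserves a closer look. Compilers strength-reduce division into a multiply by a magic constant plus shifts; the slide's (a*0xcccccccd) >> 3 is the shift applied to the high 32 bits of the 64-bit product, i.e. a total shift of 35. A quick sanity check of that identity (my reconstruction, not code from the talk):

```python
# Compilers replace a/10 with a multiply-and-shift using the magic
# constant 0xCCCCCCCD; for 32-bit unsigned a, the 64-bit form is
# (a * 0xCCCCCCCD) >> 35.  Non-linear arithmetic like this is what
# makes the resulting SMT formulas hard.
import random

MAGIC = 0xCCCCCCCD

def div10(a):
    # no division instruction: multiply plus shift
    return (a * MAGIC) >> 35

samples = [0, 1, 9, 10, 42, 2**32 - 1] + [random.randrange(2**32) for _ in range(10_000)]
assert all(div10(a) == a // 10 for a in samples)
print("a//10 == (a*0xCCCCCCCD) >> 35 for all sampled 32-bit a")
```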
Symbolic execution is not perfect. It won't find all problems.
But each report is actionable.
41
42
Program Analysis
[Diagram: program analysis takes binary programs and classifies each as safe, buggy, or exploitable.]
Our Vision: Automatically Check the World's Software for Exploitable Bugs
43
A Story
44
Program Analysis
[Diagram: program analysis takes binary programs and classifies each as safe, buggy, or exploitable.]
Billions of programs, potentially many analyses (AEG, fuzzing, ...)

analysis campaigns

An analysis campaign comprises a sequence of epochs:
1. it takes a list of (program, analysis) pairs as input
2. at the beginning of each epoch, it picks one (program, analysis) pair to run, based on data collected from previous epochs
45
[Figure: campaign timeline; each epoch runs one pair, e.g. (P1, A1), (P2, A2), (P3, A3), (P1, A1), (P2, A3), ...]
Explore vs. Exploit

Two competing goals during a fuzz campaign:
- Explore: run each (pi, si) pair sufficiently often to identify pairs that can yield new bugs
- Exploit: fuzz the (pi, si) pairs that are likely to yield new bugs more

Good news: this is clearly a Multi-Armed Bandit (MAB) problem!
Research opportunity: it is not a traditional MAB; it needs new algorithms.
46
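As a concrete (and deliberately simplified) illustration of the explore/exploit trade-off, here is an epsilon-greedy bandit scheduler over (program, analysis) arms. The arm names and the run_epoch callback are hypothetical; real fuzzing rewards are non-stationary, which is exactly why traditional MAB algorithms are not enough.

```python
# Epsilon-greedy multi-armed-bandit sketch for an analysis campaign.
# Each arm is a (program, analysis) pair; the reward for an epoch is
# the number of new bugs it finds.
import random

def run_campaign(arms, run_epoch, epochs=100, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    pulls = {arm: 0 for arm in arms}
    reward = {arm: 0.0 for arm in arms}
    total_bugs = 0.0
    for _ in range(epochs):
        if min(pulls.values()) == 0 or rng.random() < epsilon:
            choice = min(arms, key=lambda a: pulls[a])              # explore
        else:
            choice = max(arms, key=lambda a: reward[a] / pulls[a])  # exploit
        bugs = run_epoch(choice)  # run one epoch of this (program, analysis) pair
        pulls[choice] += 1
        reward[choice] += bugs
        total_bugs += bugs
    return total_bugs, pulls
```

For instance, with two hypothetical arms whose bug rates are 0 and 1 per epoch, the scheduler quickly concentrates on the productive pair while still exploring occasionally.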
Multi-Armed Bandits
47
48
Theoretical foundation
Evaluated 26 new and existing randomized scheduling algorithms with respect to fuzzing
1.55x more bugs than current industry best (CERT BFF)
48,000 hours of fuzzing to get ground truth
This talk:
- CMU security program: growing the security community
- Automatic exploit generation
- Maximizing ROI
- ... and there is always more
49
Research
Move beyond publications

Tools for:
- developers, to check their final builds (a best practice!)
- businesses, IT departments, and compliance testers, to check software they use but didn't develop
- everyone, to assess the weighted security score of that new app they see, as well as all apps installed on their system
50
forallsecure.com
The Vision: Automatically Check the World's Software for Exploitable Bugs
51
It seems wrong to not try.
Thank You! Questions?
52
END