Checking the World's Software
David Brumley Carnegie Mellon University [email protected] http://security.ece.cmu.edu/
Black Hat
format c:
White Hat
vs.
2
An epic battle
Black Hat
format c:
White Hat
Bug
3
Exploit bugs
Black Hat
format c:
White Hat
Bug Fixed!
4
inp=`perl -e '{print "A"x8000}'`
for program in /usr/bin/*; do
  for opt in {a..z} {A..Z}; do
    timeout -s 9 1s $program -$opt $inp
  done
done
1009 Linux programs. 13 minutes. 52 new bugs in 29 programs.
5
Fact: Windows, Mac, and Linux all have 100,000s of bugs
6
Which bugs are exploitable?
7
Highly Trained Experts
8
CMU creates experts
9
10
11
DEFCON 2013 DEFCON 2014
12
The US is lagging.
13
...
picoCTF: Teaching 10,000 High School Students to Hack
+
15
16
17
> 12 hours of work on average
$25k scholarships
~10,000 each year
Largest cyber exercise ever
~2000 teams
57 problems
18
Will you encourage your students to compete next year?
[Survey chart: Yes / No]
Impact beyond high school

Promote awareness:
1. Cisco
2. Project Lead the Way
3. HS-CTF
4. BoilerQuest
5. ACTF
6. BCACTF
7. ...

Foundation for cyber:
1. US Naval Academy
2. US Air Force Academy
3. US Military Academy
4. US Coast Guard Academy
19
20
Which bugs are exploitable?
Even the best people are not enough...
DEF CON 2012 scoreboard
21
[Chart: CMU's score over time (3 days total)]
Does not scale
22
I skate to where the puck is going to be, not where it has been. --- Wayne Gretzky, Hockey Hall of Fame
23
The Vision: Automatically Check the World's Software for Exploitable Bugs
24
Checking the world's software
25
Program Analysis
[Diagram: program analysis classifies each program as safe, buggy, or exploitable.]
26
We owned the machine in seconds
27
28
Program Analysis
[Diagram: binary programs are fed to program analysis, which classifies each as safe, buggy, or exploitable.]

Symbolic Execution [Boyer75, Howden75, King76]

Verification takes a program and a correctness property: a correct program yields a safe path, an incorrect one a counterexample. The twist: replace the correctness property with an un-exploitability property, so an incorrect path is an exploitable one.
29
Verification, but with a twist
x = input()
if (x > 42)
  if (x*x == MAXINT)
    jmp stack[x]
30
Basic symbolic execution
x = input()    -- x can be anything
if (x > 42)
  if (x*x == MAXINT)
    jmp stack[x]

Each branch forks execution and adds a constraint:
- false branch of the first test: !(x > 42)
- false branch of the second test: (x > 42) ∧ (x*x != MAXINT)
- true branch of the second test: (x > 42) ∧ (x*x == MAXINT), reaching jmp stack[x]
31
The constraint accumulated along a path is its path formula: it is true exactly for the inputs that take that path. For example, the inputs that take the true branch of x > 42 and the false branch of x*x == MAXINT are exactly those satisfying (x > 42) ∧ (x*x != MAXINT).
32
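The branch-and-constraint idea above can be sketched in a few lines of Python. This is an illustrative toy, not the engine from the talk: a brute-force search over small inputs stands in for the SMT solver, and MAXINT is taken as 2^32 - 1.

```python
# Toy sketch of symbolic execution for the example program.
# A real engine hands each path formula to an SMT solver; here a
# brute-force search over small inputs stands in for the solver.

MAXINT = 2**32 - 1

# Each path through the program is a conjunction of branch constraints.
paths = {
    "!(x > 42)":                     lambda x: not (x > 42),
    "(x > 42) and (x*x != MAXINT)":  lambda x: x > 42 and x * x != MAXINT,
    "(x > 42) and (x*x == MAXINT)":  lambda x: x > 42 and x * x == MAXINT,  # jmp stack[x]
}

def solve(formula, bound=1000):
    """Return an input satisfying the formula (a concrete test case), or None."""
    for x in range(bound):
        if formula(x):
            return x
    return None

for name, formula in paths.items():
    print(name, "->", solve(formula))
```

For the second path the search finds x = 43, matching the satisfiable-formula-to-test-case step on the next slide.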
Basic symbolic execution
Feed each path formula to an SMT (satisfiability modulo theories) solver. If the formula is satisfiable, the satisfying assignment (e.g., x = 43 for (x > 42) ∧ (x*x != MAXINT)) is a concrete test case that drives execution down that path.
Control flow hijack
attacker gains control of execution
33
Same principle, different mechanism:
- buffer overflow
- format string attack
- heap metadata overwrite
- use-after-free
- ...
Basic execution semantics of compiled code
34
[Diagram: process memory holds code, data, heap, and stack. The processor fetches, decodes, and executes instructions, reading and writing memory. EIP, the instruction pointer, points to the next instruction to execute.]

Control Flow Hijack: EIP = attacker code
Checking non-exploitability
35
Un-exploitability property: EIP != user input

Conjoin the path formula with the negated property, e.g. (x > 42) ∧ (x*x == MAXINT) ∧ (EIP == user input) on the path reaching jmp stack[x], and hand it to the SMT solver: satisfiable means the satisfying input is an exploit; unsatisfiable means the path is safe.
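The un-exploitability check can be sketched the same way. Everything here is illustrative: brute force stands in for the SMT solver, and TOY_MAXINT is a made-up constant (not the real MAXINT) chosen so the dangerous path is actually reachable.

```python
# Toy sketch: negate the un-exploitability property (EIP != user input)
# and conjoin it with the path formula for the path reaching jmp stack[x].
# A satisfying input is a concrete exploit; unsatisfiable means safe.
# TOY_MAXINT is a hypothetical constant chosen so the path is reachable.

TOY_MAXINT = 43 * 43

def find_exploit(bound=10_000):
    for x in range(bound):
        path_formula = x > 42 and x * x == TOY_MAXINT  # reaches jmp stack[x]
        property_violated = True  # at jmp stack[x], EIP is taken from input x
        if path_formula and property_violated:
            return x  # this input hijacks EIP
    return None  # unsatisfiable: the path is safe

print(find_exploit())  # -> 43
```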
checking Debian for exploitable bugs
36
37,391 programs
209,000,000 test cases
2,606,506 crashes
13,875 unique bugs
152 new hijack exploits
3 years CPU time
16 billion verification queries
* [ARCB, ICSE 2014, ACM Distinguished Paper], [ACRSWB, CACM 2014]
~$0.28/confirmed bug ~$21/exploit
mining data
Q: How long do queries take on average?
A: 3.67 ms on average, with 0.34 variance.

Q: Should I optimize hard or easy formulae?
A: 99.99% take less than 1 second and account for 78% of total time.

Q: Do queries get harder?
A: Good question...
37
optimize fast queries
38
Size is not strongly correlated with hardness
[Scatter plot: formula size vs. solver time, SAT and UNSAT instances]
39
500 sec timeout
No clear upward trend
40
Only 39 programs create hard formulas
a/10 replaced with (a*0xcccccccd) >> 3
[Bar chart: per-program percentage, 0%–100%; 50% average]
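The a/10 example deserves a closer look. Compilers strength-reduce division into a multiply by a magic constant plus shifts; the slide's (a*0xcccccccd) >> 3 is the shift applied to the high 32 bits of the 64-bit product, i.e. a total shift of 35. A quick sanity check of that identity (my reconstruction, not code from the talk):

```python
# Compilers replace a/10 with a multiply-and-shift using the magic
# constant 0xCCCCCCCD; for 32-bit unsigned a, the 64-bit form is
# (a * 0xCCCCCCCD) >> 35.  Non-linear arithmetic like this is what
# makes the resulting SMT formulas hard.
import random

MAGIC = 0xCCCCCCCD

def div10(a):
    # no division instruction: multiply plus shift
    return (a * MAGIC) >> 35

samples = [0, 1, 9, 10, 42, 2**32 - 1] + [random.randrange(2**32) for _ in range(10_000)]
assert all(div10(a) == a // 10 for a in samples)
print("a//10 == (a*0xCCCCCCCD) >> 35 for all sampled 32-bit a")
```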
Symbolic execution is not perfect. It won't find all problems.
But each report is actionable.
41
42
Program Analysis
[Diagram: program analysis takes binary programs and classifies each as safe, buggy, or exploitable.]
Our Vision: Automatically Check the World's Software for Exploitable Bugs
43
A Story
44
Program Analysis
[Diagram: program analysis takes binary programs and classifies each as safe, buggy, or exploitable.]
Billions of programs, potentially many analyses (AEG, fuzzing, ...)

analysis campaigns

An analysis campaign comprises a sequence of epochs:
1. it takes a list of (program, analysis) pairs as input
2. at the beginning of each epoch, it picks one (program, analysis) pair to run, based on data collected from previous epochs
45
[Figure: campaign timeline; each epoch runs one pair, e.g. (P1, A1), (P2, A2), (P3, A3), (P1, A1), (P2, A3), ...]
Explore vs. Exploit

Two competing goals during a fuzz campaign:
- Explore: run each (pi, si) pair sufficiently often to identify pairs that can yield new bugs
- Exploit: fuzz the (pi, si) pairs that are likely to yield new bugs more

Good news: this is clearly a Multi-Armed Bandit (MAB) problem!
Research opportunity: it is not a traditional MAB; it needs new algorithms.
46
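As a concrete (and deliberately simplified) illustration of the explore/exploit trade-off, here is an epsilon-greedy bandit scheduler over (program, analysis) arms. The arm names and the run_epoch callback are hypothetical; real fuzzing rewards are non-stationary, which is exactly why traditional MAB algorithms are not enough.

```python
# Epsilon-greedy multi-armed-bandit sketch for an analysis campaign.
# Each arm is a (program, analysis) pair; the reward for an epoch is
# the number of new bugs it finds.
import random

def run_campaign(arms, run_epoch, epochs=100, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    pulls = {arm: 0 for arm in arms}
    reward = {arm: 0.0 for arm in arms}
    total_bugs = 0.0
    for _ in range(epochs):
        if min(pulls.values()) == 0 or rng.random() < epsilon:
            choice = min(arms, key=lambda a: pulls[a])              # explore
        else:
            choice = max(arms, key=lambda a: reward[a] / pulls[a])  # exploit
        bugs = run_epoch(choice)  # run one epoch of this (program, analysis) pair
        pulls[choice] += 1
        reward[choice] += bugs
        total_bugs += bugs
    return total_bugs, pulls
```

For instance, with two hypothetical arms whose bug rates are 0 and 1 per epoch, the scheduler quickly concentrates on the productive pair while still exploring occasionally.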
Multi-Armed Bandits
47
48
Theoretical foundation
Evaluated 26 new and existing randomized scheduling algorithms with respect to fuzzing
1.55x more bugs than current industry best (CERT BFF)
48,000 hours of fuzzing to get ground truth
This talk:
- CMU security program: growing the security community
- Automatic exploit generation
- Maximizing ROI
- ... and there is always more
49
Research
Move beyond publications

Tools for:
- developers, to check their final builds (a best practice!)
- businesses, IT departments, and compliance testers, to check software they use but didn't develop
- everyone, to assess the weighted security score of that new app they see, as well as all apps installed on their system
50
forallsecure.com
The Vision: Automatically Check the World's Software for Exploitable Bugs
51
It seems wrong to not try.
Thank You! Questions?
52
END