Convicting Exploitable Software Vulnerabilities: An Efficient Input Provenance Based Approach

Zhiqiang Lin, Xiangyu Zhang, Dongyan Xu

Purdue University

June 27th, 2008

The 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks

Motivation

Internet worms (CodeRed, Slammer)

Denial of Service (DoS)

Viruses, Trojan horses, bots (botnets)

Vulnerability in software

Accidental breaches in security

Related Work

Dynamic analysis: Program shepherding (V. Kiriansky et al.), TaintCheck (J. Newsome et al.), Control Flow Integrity (M. Abadi et al.), Data Flow Integrity (M. Castro et al.), …

Drawback: run-time overhead, and waiting for an attack

Static analysis: BOON (D. Wagner et al.), Splint (D. Larochelle et al.), Archer (Y. Xie et al.), RATS, Flawfinder

Drawback: false positives

Recent automated multi-path exploration: DART (P. Godefroid et al.), CUTE (K. Sen et al.), EXE (C. Cadar et al.), SAGE (P. Godefroid et al.)

Drawback: low efficiency

Problem Statement and Our Technique

How can software vulnerabilities be discovered and convicted more efficiently?

An efficient input provenance based approach:

Conservative static analysis => suspect

Dynamic analysis => convicting the suspect and pruning false positives

Random mutation is avoided

No symbolic execution (can handle long executions)

Key idea: data lineage tracing (input provenance)

Basic Idea

An image viewer: Zgv-5.8/readgif.c (integer overflow)

231      fread(&imagehed,sizeof(imagehed),1,in);
         ...
245      width=(imagehed.wide_lo+256*imagehed.wide_hi);
246      height=(imagehed.high_lo+256*imagehed.high_hi);
         ...
494-498  if((...(byte *)malloc(width*height))...) { fclose(in); return(_PICERR_NOMEM); }
         ...

Input a.gif (256x128): xx ... 0x00 0x01 0x80 0x00 ...

Input data label (offset): 6 7 8 9
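
The arithmetic behind this example can be sketched in C. The helper names (le16, dims_overflow) are hypothetical and do not come from zgv. The sketch assumes the allocation size is held in a signed 32-bit integer, and it computes the product in 64 bits only so that the overflow can be observed safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode lo + 256*hi, mirroring the slide's
   width and height expressions. */
static uint32_t le16(const unsigned char *p)
{
    return (uint32_t)p[0] + 256u * (uint32_t)p[1];
}

/* Reads width from input offsets 6-7 and height from offsets 8-9.
   Returns 1 when width*height exceeds INT32_MAX, i.e. when the size
   would overflow a signed 32-bit integer (an assumption about the
   integer type in use), and 0 otherwise. */
int dims_overflow(const unsigned char *input, size_t len,
                  uint32_t *width, uint32_t *height)
{
    assert(input != NULL && len >= 10);
    *width = le16(input + 6);
    *height = le16(input + 8);
    uint64_t product = (uint64_t)*width * (uint64_t)*height;
    return product > (uint64_t)INT32_MAX;
}
```

With the bytes 0x00 0x01 0x80 0x00 at offsets 6 to 9, the sketch decodes 256x128 and reports no overflow. With all four bytes set to 0xff, the product is 65535*65535 = 4294836225, which exceeds 2147483647, so the overflow is reported.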

Architecture

Components: Static Front End, Input Lineage Tracer, Input Mutator, Run-time Detector

Data flowing between components: Program/binary, Suspect, Program Input, Lineage, New Input, Evidence

Suspect: an instruction that may be exploitable to trigger the vulnerability

Component 1. Input Lineage Tracer

Label the input stream (using the offset)

Track the propagation of the labels

mov 0xfffffffc(%ebp),%eax

mov %eax, 0xfffffff8(%ebp)

add %eax, %ecx

mov %ecx, %edx
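
A minimal sketch of how labels could follow these four instructions, assuming a label set is stored as a 64-bit mask in which bit k stands for input offset k. The shadow locations and function names are hypothetical; the actual tracer works on instrumented binaries and is not shown in the slides.

```c
#include <assert.h>
#include <stdint.h>

/* A label set is modeled as a 64-bit mask: bit k set means the value
   depends on input offset k (an assumption; offsets >= 64 are not
   representable in this sketch). */
typedef uint64_t labelset;

/* Hypothetical shadow locations for the four instructions above:
   the two stack slots and the three registers they touch. */
enum loc { MEM_FC, MEM_F8, EAX, ECX, EDX, NUM_LOCS };

/* Label for a single input byte at the given offset. */
labelset label_of_offset(unsigned offset)
{
    assert(offset < 64);
    return (labelset)1 << offset;
}

/* mov src,dst: the destination takes over the source's labels. */
void shadow_mov(labelset shadow[NUM_LOCS], enum loc src, enum loc dst)
{
    shadow[dst] = shadow[src];
}

/* add src,dst: the result depends on both operands. */
void shadow_add(labelset shadow[NUM_LOCS], enum loc src, enum loc dst)
{
    shadow[dst] |= shadow[src];
}
```

If the stack slot at 0xfffffffc(%ebp) carries label 6 and %ecx carries label 7, replaying the four instructions leaves %edx with labels 6 and 7.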

Component 1. Input Lineage Tracer

Key concepts

Data dependency (direct propagation)

b=a;

mov 0xfffffffc(%ebp),%eax
mov %eax,0xfffffff8(%ebp)       # b=a

Control dependency (indirect propagation)

if (a==1)
  b=1;
else
  c=0;

cmpl $0x1,0xfffffffc(%ebp)      # a==1
jne 804832d <main+0x25>
movl $0x1,0xfffffff8(%ebp)      # b=1
jmp 8048334 <main+0x2c>
movl $0x0,0xfffffff4(%ebp)      # c=0
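
Indirect propagation can be sketched as a stack of predicate labels: a value defined inside a branch inherits the labels of the predicates that control it. The structure and function names below are hypothetical, and label sets are again assumed to be 64-bit masks over input offsets.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t labelset;   /* bit k set: depends on input offset k */

#define MAX_NESTING 16

/* Stack of label sets for the predicates that currently control
   execution (hypothetical structure for this sketch). */
typedef struct {
    labelset pred[MAX_NESTING];
    int depth;
} ctrl_stack;

/* Entering a region controlled by a predicate such as (a==1). */
void ctrl_enter(ctrl_stack *cs, labelset predicate_labels)
{
    assert(cs->depth < MAX_NESTING);
    cs->pred[cs->depth++] = predicate_labels;
}

/* Leaving the region, e.g. at the join point after the if/else. */
void ctrl_leave(ctrl_stack *cs)
{
    assert(cs->depth > 0);
    cs->depth--;
}

/* Labels of a value defined now: its operands' labels (direct
   propagation) plus the labels of every enclosing predicate
   (indirect propagation). */
labelset define_value(const ctrl_stack *cs, labelset operand_labels)
{
    labelset result = operand_labels;
    for (int i = 0; i < cs->depth; i++)
        result |= cs->pred[i];
    return result;
}
```

For the if/else above, b=1 assigns a constant with no labels of its own, yet inside the region controlled by a==1 it receives the labels of a.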

Component 1. Data Lineage Tracer

DL(s_i) = DL(def@s_i)

DL(def@s_i) = get_new_id(), if def is an input value

DL(def@s_i) = U_x DL(use_x@s_i), otherwise

Input data tracking (each input byte is labeled with its offset in the input stream)

DL representation: reduced ordered Binary Decision Diagram (roBDD)

Component 1. Data Lineage Tracer

An Example

231      fread(&imagehed,sizeof(imagehed),1,in);
         ...
245      width=(imagehed.wide_lo+256*imagehed.wide_hi);
246      height=(imagehed.high_lo+256*imagehed.high_hi);
         ...
494-498  if((...(byte *)malloc(width*height))...) { fclose(in); return(_PICERR_NOMEM); }
         ...

READ(buf, size, ...): for 0 <= i < size, DL(buf[i]@pc231) = get_new_id()

DL(wide_lo@pc245) = DL(buf[6]@pc231) = {6}

DL(wide_hi@pc245) = DL(buf[7]@pc231) = {7}

DL(width@245) = DL(wide_hi@pc245) U DL(wide_lo@pc245) = {6; 7}

DL(height@246) = DL(high_hi@pc246) U DL(high_lo@pc246) = {8; 9}

DL((width*height)@494) = {6; 7; 8; 9}
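
The derivation above can be replayed with a small sketch. It assumes a 64-bit mask as a stand-in for the roBDD representation, and the function names (dl_input, dl_union, dl_contains, dl_malloc_size) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the roBDD: bit k set means input offset k is in the
   lineage set (an assumption; only offsets below 64 fit). */
typedef uint64_t labelset;

/* get_new_id() for an input byte: the label is its offset. */
labelset dl_input(unsigned offset)
{
    assert(offset < 64);
    return (labelset)1 << offset;
}

/* DL of a definition computed from n used values: their union. */
labelset dl_union(const labelset *uses, size_t n)
{
    labelset result = 0;
    for (size_t i = 0; i < n; i++)
        result |= uses[i];
    return result;
}

/* Returns 1 if the given input offset is in the lineage set. */
int dl_contains(labelset dl, unsigned offset)
{
    return offset < 64 && ((dl >> offset) & 1u) != 0;
}

/* Replays the example: width uses bytes 6 and 7, height uses bytes
   8 and 9, and the malloc argument uses width and height. */
labelset dl_malloc_size(void)
{
    labelset wide[2] = { dl_input(6), dl_input(7) };
    labelset high[2] = { dl_input(8), dl_input(9) };
    labelset dims[2] = { dl_union(wide, 2), dl_union(high, 2) };
    return dl_union(dims, 2);
}
```

The set returned by dl_malloc_size contains exactly the offsets 6, 7, 8 and 9, which are the input bytes a mutator would need to change to influence the allocation size.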

Component 2. Input Mutator

Inputs to the mutator: Program Input, Data Lineage, Evidence, Suspect

Heuristic #1: buffer overflow mutation (double the buffer size, …)

Heuristic #2: format string mutation (replace %s in the format string argument)

Heuristic #3: integer overflow mutation (boundary integer values: 0xffffffff, 0, 0x0fffffff)

…
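
A minimal sketch of the integer overflow heuristic, assuming the lineage of the suspect's operand is available as a list of input offsets. The function name mutate_integer and the little-endian byte order are assumptions of this sketch; the slides do not say how boundary values are mapped onto input bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Boundary values listed for the integer overflow heuristic. */
const uint32_t boundary_values[3] = { 0xffffffffu, 0u, 0x0fffffffu };

/* Overwrites the input bytes named by the lineage offsets with the
   bytes of value, least significant byte first (an assumed order).
   At most four offsets are used. Returns the number of bytes written. */
size_t mutate_integer(unsigned char *input, size_t len,
                      const size_t *offsets, size_t n, uint32_t value)
{
    size_t written = 0;
    assert(input != NULL && offsets != NULL);
    for (size_t i = 0; i < n && i < 4; i++) {
        if (offsets[i] >= len)
            continue;               /* lineage offset outside this input */
        input[offsets[i]] = (unsigned char)((value >> (8 * i)) & 0xffu);
        written++;
    }
    return written;
}
```

Only the bytes named by the lineage change, which is how the approach avoids random mutation of the whole input. Writing 0xffffffff at offsets 2 to 5 yields the byte pattern ff ff ff ff shown for pktlen in the GnuPG evaluation slide.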

Implementation

Diablo (http://diablo.elis.ugent.be/): control flow graph; statically generates control dependencies to facilitate Valgrind instrumentation

Valgrind (http://valgrind.org/): lineage tracing

roBDD (reduced ordered Binary Decision Diagram) to represent the data lineage

Evaluation - Effectiveness

Known vulnerabilities (SO: stack overflow, HO: heap overflow, IO: integer overflow):

CVE-2001-1413 (ncompress 4.2.4, SO)

CVE-2001-1228 (gzip 1.2.4, SO)

CVE-2002-1496 (Nullhttpd 0.50, HO)

CVE-2002-1549 (lhttpd 0.1, SO)

CVE-2000-0573 (wu-ftpd-2.6.0, Format String)

CVE-2001-0609 (cfingerd-1.4.3, Format String)

CVE-2005-0226 (ngircd-0.8.2, Format String)

CVE-2004-0904 (xzgv-0.8, IO & HO)

CVE-2006-3082 (GnuPG 1.4.3, IO & HO)

Unknown vulnerabilities, with RATS as the static detector (extended to catch buffer overflows and integer overflows): ipgrab-0.99, epstool-3.3, dcraw-7.94

Evaluation - CVE-2006-3082 (GnuPG 1.4.3)

GnuPG Parse_User_ID Remote Buffer Overflow Vulnerability

pktlen=in[2,3,4,5]=0x ff ff ff ff

Evaluation - CVE-2001-0609 (Cfingerd-1.4.3)

syslog(LOG_NOTICE, "%s", (char *) syslog_str);

Evaluation - Ipgrab-0.99 (a new vulnerability)

Evaluation – Performance (Lineage Tracing)

Platform: two 2.13 GHz Pentium processors and 2 GB RAM, running Linux kernel 2.6.15

Evaluation - Performance

Evaluation - Space

Summary

An input lineage tracing and mutation system:

Capable of convicting known and unknown vulnerabilities.

Has reasonable overhead for the scenario of offline vulnerability conviction.

(Architecture figure: Static Front End, Data Lineage Tracer, Input Mutator, Run-time Detector; data labels: Program/binary, Suspect, Program Input, Lineage, New Input, Evidence)

Thank you

For more information:

{zlin, xyzhang, dxu}@cs.purdue.edu

Q & A
