February 2001 OASIS --- Norfolk 1

Dependence Graphs for Information Assurance

Paul Anderson

paul@grammatech.com

GrammaTech, Inc.

Ithaca, NY

http://www.grammatech.com

Tim Teitelbaum

(Cornell)

• Problem
  – Understanding information flows
    • important for solving security problems
  – But tool support for understanding information flows is poor
    • too abstract / not code based / research languages
    • too imprecise / don’t scale

• Opportunity
  – Dependence analysis
    • sound and tractable basis for understanding information flows
  – Theory of dependence graphs and program slicing
    • mature theory for compilers and software-engineering tools

• Objective
  – Effective information-flow analysis tools based on dependence graphs
    • modest goal [less expressive than type-based approaches]
    • ambitious goal [C/C++; interprocedural precision; scalable]
  – Apply to security problems

Sample application 1: Covert Channel Analysis

The “chop” between read_high and write_low shows possible information flows from HI to LO

[Figure: the NRL Pump sits between HI and LO. Its operations are read_high, write_high, read_low, and write_low.]
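As a rough illustration of the chop query, the sketch below computes it on a single flat dependence graph as the intersection of forward and backward reachability. The node names, the edges in `PUMP_SUCC`, and the function names `flat_forward`, `flat_backward`, and `flat_chop` are all hypothetical and are not taken from the pump's code. The intersection is only adequate when calling context is ignored; a precise interprocedural chop needs more than this Boolean combination.

```c
#include <assert.h>

#define BIT(v) (1u << (v))

/* Hypothetical node numbering for a toy model of the pump. */
enum { READ_LOW, BUFFER, WRITE_HIGH, READ_HIGH, ACK_DELAY, WRITE_LOW, PUMP_NODES };

/* PUMP_SUCC[v] = set of nodes that depend on v (assumed edges, for illustration). */
const unsigned PUMP_SUCC[PUMP_NODES] = {
    [READ_LOW]   = BIT(BUFFER),
    [BUFFER]     = BIT(WRITE_HIGH) | BIT(ACK_DELAY),
    [WRITE_HIGH] = 0,
    [READ_HIGH]  = BIT(ACK_DELAY),
    [ACK_DELAY]  = BIT(WRITE_LOW),
    [WRITE_LOW]  = 0,
};

/* All nodes reachable from src along dependence edges (src included). */
unsigned flat_forward(const unsigned succ[], int n, int src)
{
    unsigned set = BIT(src), old;
    int v;
    assert(n <= 32 && src >= 0 && src < n);
    do {
        old = set;
        for (v = 0; v < n; v++)
            if (set & BIT(v))
                set |= succ[v];
    } while (set != old);
    return set;
}

/* All nodes from which dst is reachable (dst included). */
unsigned flat_backward(const unsigned succ[], int n, int dst)
{
    unsigned set = BIT(dst), old;
    int v;
    assert(n <= 32 && dst >= 0 && dst < n);
    do {
        old = set;
        for (v = 0; v < n; v++)
            if (succ[v] & set)
                set |= BIT(v);
    } while (set != old);
    return set;
}

/* Context-insensitive chop: nodes on some path from src to dst. */
unsigned flat_chop(const unsigned succ[], int n, int src, int dst)
{
    return flat_forward(succ, n, src) & flat_backward(succ, n, dst);
}
```

On this toy graph, `flat_chop(PUMP_SUCC, PUMP_NODES, READ_HIGH, WRITE_LOW)` contains `READ_HIGH`, `ACK_DELAY`, and `WRITE_LOW`, which is the kind of HI-to-LO path an analyst would then inspect. A non-empty chop shows a possible flow, not a proven leak.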

Covert Channel Analysis, continued

Sample application 2: Buffer Overrun Analysis

[Figure: external strings (“foo”, “bar”) and internal strings (“internal string”) flow into unbounded string copies (strcpy, strcpy) and bounded string copies (strncpy).]

The “chop” between external strings and unbounded copies shows possible exploitable overruns
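To make the two kinds of sink concrete, here is a minimal sketch of a bounded copy built on `strncpy`. The helper name `copy_bounded` is hypothetical. An unbounded `strcpy(dst, src)` is safe only when the caller can guarantee that `src` fits in `dst`, which is exactly what cannot be assumed when `src` is an external string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bounded copy (hypothetical helper): never writes more than dst_size bytes
 * and always NUL-terminates. Returns 1 if src fit entirely, 0 if truncated.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return 0;
    strncpy(dst, src, dst_size - 1);  /* copies at most dst_size - 1 bytes */
    dst[dst_size - 1] = '\0';         /* strncpy does not terminate on truncation */
    return strlen(src) < dst_size;
}
```

With a 4-byte buffer, copying "foo" succeeds, while copying "foobar" is truncated to "foo" and reported as such instead of overrunning the buffer.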

Sample Application 3: Analysis for Efficient IRM Insertion

• Sample policy

[Figure: security automaton with states 1 and 2 and edge labels ¬read, read, and ¬send.]

  – Simple implementation [Schneider]
    • inline the security automaton everywhere
    • partial evaluation
  – Efficient implementation
    • exploit dependence information

• Sends
  – never preceded by a read => necessarily in state 1
    • omit code to check state
  – always preceded by a read => necessarily in state 2
    • replace code with error

• Reads
  – always preceded by a read => necessarily in state 2
    • omit code to change state
  – never followed by a send => no subsequent violation possible
    • omit code to change state
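The simple implementation can be sketched as a two-state monitor whose checks are inlined before every read and send. This assumes the sample policy is the familiar "no send after a read"; the type `Monitor` and the functions `monitor_init`, `on_read`, and `on_send` are hypothetical names. The comments mark where dependence information would let an optimizer drop the inserted code.

```c
#include <assert.h>
#include <stddef.h>

/* State 1: no read has happened yet. State 2: a read has happened. */
typedef struct { int state; } Monitor;

void monitor_init(Monitor *m)
{
    assert(m != NULL);
    m->state = 1;
}

/*
 * Inserted before every read. Can be omitted when the read is always
 * preceded by another read, or is never followed by a send.
 */
void on_read(Monitor *m)
{
    m->state = 2;
}

/*
 * Inserted before every send. Returns 1 if the send may proceed, 0 if it
 * would violate the policy. The check can be omitted when the send is never
 * preceded by a read, and replaced by an unconditional error when it is
 * always preceded by one.
 */
int on_send(const Monitor *m)
{
    return m->state == 1;
}
```

A send issued right after `monitor_init` is allowed; once `on_read` has run, every later `on_send` reports a violation.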

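Control Flow Graphs

    #include <stdio.h>

    static int add(int a, int b);   /* declared before use */

    int main(void)
    {
        int sum, i;
        sum = 0;
        i = 1;
        while (i<11) {
            sum = add(sum,i);
            i = add(i, 1);
        }
        printf("sum=%d\n", sum);
        printf("i=%d\n", i);
    }

    static int add(int a, int b)
    {
        return (a+b);
    }

[Figure: control flow graph of the program. Nodes of main: entry main, sum = 0, i = 1, while i < 11, call add (twice), print sum, print i. Parameter nodes at the first call: ain = sum, bin = i, sum = ret. Parameter nodes at the second call: ain = i, bin = 1, i = ret. Nodes of add: entry add, a = ain, b = bin, result = a + b, ret = result.]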

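Dependence Graphs

    #include <stdio.h>

    static int add(int a, int b);   /* declared before use */

    int main(void)
    {
        int sum, i;
        sum = 0;
        i = 1;
        while (i<11) {
            sum = add(sum,i);
            i = add(i, 1);
        }
        printf("sum=%d\n", sum);
        printf("i=%d\n", i);
    }

    static int add(int a, int b)
    {
        return (a+b);
    }

[Figure: dependence graph of the program. Nodes of main: entry main, sum = 0, i = 1, while i < 11, call add (twice), print sum, print i. Parameter nodes at the first call: ain = sum, bin = i, sum = ret. Parameter nodes at the second call: ain = i, bin = 1, i = ret. Nodes of add: entry add, a = ain, b = bin, result = a + b, ret = result. Legend: control edges, data edges.]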

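Dependence Graphs [ and Slicing ]

    #include <stdio.h>

    static int add(int a, int b);   /* declared before use */

    int main(void)
    {
        int sum, i;
        sum = 0;
        i = 1;
        while (i<11) {
            sum = add(sum,i);
            i = add(i, 1);
        }
        printf("sum=%d\n", sum);
        printf("i=%d\n", i);
    }

    static int add(int a, int b)
    {
        return (a+b);
    }

[Figure: dependence graph of the program with a slice highlighted. Nodes of main: entry main, sum = 0, i = 1, while i < 11, call add (twice), print sum, print i. Parameter nodes at the first call: ain = sum, bin = i, sum = ret. Parameter nodes at the second call: ain = i, bin = 1, i = ret. Nodes of add: entry add, a = ain, b = bin, result = a + b, ret = result. Legend: control edges, data edges.]

The “backward slice” from a statement shows all influences on that statement.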
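A backward slice is backward reachability over dependence edges. The sketch below works on a hand-built, simplified dependence graph of the example program in which each call to `add` is collapsed into a single statement, so the parameter nodes and the body of `add` are left out. The node names, the table `SUM_DEPS`, and the function `slice_backward` are assumptions made for illustration, not the tool's API.

```c
#include <assert.h>

#define BIT(v) (1u << (v))

/* One node per statement of main; calls to add are treated as plain statements. */
enum { ENTRY, SUM0, I1, WHILE, SUM_ADD, I_ADD, PRINT_SUM, PRINT_I, N_NODES };

/* SUM_DEPS[v] = nodes that v is control- or data-dependent on (hand-built). */
const unsigned SUM_DEPS[N_NODES] = {
    [ENTRY]     = 0,
    [SUM0]      = BIT(ENTRY),
    [I1]        = BIT(ENTRY),
    [WHILE]     = BIT(ENTRY) | BIT(I1) | BIT(I_ADD),
    [SUM_ADD]   = BIT(WHILE) | BIT(SUM0) | BIT(SUM_ADD) | BIT(I1) | BIT(I_ADD),
    [I_ADD]     = BIT(WHILE) | BIT(I1) | BIT(I_ADD),
    [PRINT_SUM] = BIT(ENTRY) | BIT(SUM0) | BIT(SUM_ADD),
    [PRINT_I]   = BIT(ENTRY) | BIT(I1) | BIT(I_ADD),
};

/* Set of nodes that can influence `criterion`, including the criterion itself. */
unsigned slice_backward(const unsigned deps[], int n, int criterion)
{
    unsigned slice = BIT(criterion), old;
    int v;
    assert(n <= 32 && criterion >= 0 && criterion < n);
    do {
        old = slice;
        for (v = 0; v < n; v++)
            if (slice & BIT(v))
                slice |= deps[v];   /* pull in everything v depends on */
    } while (slice != old);
    return slice;
}
```

Slicing from `PRINT_I` yields the entry node, `i = 1`, the loop test, and `i = add(i, 1)`; the statements that only compute `sum` stay out of the slice. Slicing from `PRINT_SUM` pulls in everything except `PRINT_I`.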

Wide-Spectrum Program Representation

[Figure: block diagram with front ends FE1 ... FEm, Pre-IR, Builder, IR, API, analysis operations, synthesis operations, Scheme scripts, C client code, and GUI. Color code: done; current; designed; prototyped by others; IASET / OASIS.]

• Front ends
  – EDG
    • ANSI C
    • other C
    • C++
    • Java
  – assembler / binaries
  – UML (Rose/RT)
  – Verilog
  – VHDL
  – Jovial

• Pre-IR
  – ASTs
  – symbol table
  – local def, use, conditional kill, pointer ref-deref info
  – CFGs
  – source positions

• Support for: make, libraries and archives, loader

• IR: Pre-IR +
  – points-to sets
  – call multigraph
  – GMOD / GREF
  – summary edges
  – PDGs / SDG

• Analysis operations
  – Precise interprocedural queries
    • predecessors and successors
    • slice
    • chop
    • model checking
  – Sets of PDG nodes
    • Boolean operations
    • persistent storage

Pointer Analysis

• Flow insensitive, context insensitive
  – Andersen
  – Steensgaard / Das

• Improvements
  – Structure fields
  – Context sensitive

• PCC-like application for IRM

Andersen Pointer Analysis

• normalized statements of program (base facts)

    p = &q;
    p = q;
    p = *q;
    *p = q;

• program-independent rules

[Figure: one points-to graph fragment per statement form, drawn over the nodes p, q, r1, r2, s1, s2, and s3.]

• iterate to a fixpoint
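The scheme on this slide can be sketched in a few lines: the four normalized statement forms are the input, the inclusion rules are fixed, and the rules are reapplied until no points-to set grows. This is a naive illustration with bitset points-to sets limited to 32 variables, not GrammaTech's implementation; the names `AndStmt`, `AND_VARS`, and `andersen` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { AND_VARS = 32 };   /* one bit per variable in each points-to set */

typedef enum { AND_ADDR_OF, AND_COPY, AND_LOAD, AND_STORE } AndKind;

/* AND_ADDR_OF: p = &q;  AND_COPY: p = q;  AND_LOAD: p = *q;  AND_STORE: *p = q; */
typedef struct { AndKind kind; int p, q; } AndStmt;

/* Add src to *dst; return 1 if *dst grew. */
static int and_merge(unsigned *dst, unsigned src)
{
    unsigned old = *dst;
    *dst |= src;
    return *dst != old;
}

/* Fill pts[v] with the set of variables v may point to. */
void andersen(const AndStmt *s, size_t n, unsigned pts[AND_VARS])
{
    int changed, r;
    size_t k;
    for (r = 0; r < AND_VARS; r++)
        pts[r] = 0;
    do {                                   /* iterate to a fixpoint */
        changed = 0;
        for (k = 0; k < n; k++) {
            int p = s[k].p, q = s[k].q;
            assert(p >= 0 && p < AND_VARS && q >= 0 && q < AND_VARS);
            switch (s[k].kind) {
            case AND_ADDR_OF:              /* q is in pts(p) */
                changed |= and_merge(&pts[p], 1u << q);
                break;
            case AND_COPY:                 /* pts(q) is a subset of pts(p) */
                changed |= and_merge(&pts[p], pts[q]);
                break;
            case AND_LOAD:                 /* for r in pts(q): pts(r) subset of pts(p) */
                for (r = 0; r < AND_VARS; r++)
                    if ((pts[q] >> r) & 1u)
                        changed |= and_merge(&pts[p], pts[r]);
                break;
            case AND_STORE:                /* for r in pts(p): pts(q) subset of pts(r) */
                for (r = 0; r < AND_VARS; r++)
                    if ((pts[p] >> r) & 1u)
                        changed |= and_merge(&pts[r], pts[q]);
                break;
            }
        }
    } while (changed);
}
```

For the statements `p = &a; q = &b; *p = q; t = *p;` the result has `a` pointing to `b` and `t` pointing to `b`, because the store through `p` writes into `a` and the load through `p` reads it back. Repeating whole passes like this is easy to follow but slower than a worklist formulation.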

Steensgaard / Das

• Andersen
  – time: cubic in # variables

• Steensgaard
  – time: almost linear in # variables
    • keep at most one out edge; form unions on p=q;
  – precision: << Andersen

• Das
  – time: ~Steensgaard
  – precision (size of points-to sets): ~Andersen

• GrammaTech implementation of Steensgaard / Das
  – tentative finding
    • performance gain often substantial
    • precision loss often unacceptable
  – current plan: continue improving Andersen

[Figure: unification on p = q, drawn over the nodes p, q, R, and S.]
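The "at most one out edge, form unions on p=q" idea can be sketched with a union-find forest in which every equivalence class has a single pointee class. The sketch handles only `p = &q` and `p = q`; the names `st_init`, `st_addr_of`, `st_copy`, and `st_points_to` are hypothetical, and the fixed table size is an assumption made for brevity.

```c
#include <assert.h>

enum { ST_MAX = 64 };            /* variables plus placeholder nodes */

static int st_parent[ST_MAX];    /* union-find forest */
static int st_pointee[ST_MAX];   /* the single out edge of a class, or -1 */
static int st_count;             /* nodes in use */

/* Variables are numbered 0 .. n_vars-1; higher numbers are placeholders. */
void st_init(int n_vars)
{
    int i;
    assert(n_vars >= 0 && n_vars <= ST_MAX);
    for (i = 0; i < ST_MAX; i++) {
        st_parent[i] = i;
        st_pointee[i] = -1;
    }
    st_count = n_vars;
}

static int st_find(int x)
{
    while (st_parent[x] != x) {
        st_parent[x] = st_parent[st_parent[x]];   /* path halving */
        x = st_parent[x];
    }
    return x;
}

/* Class that x points to, creating an empty placeholder if there is none yet. */
static int st_target(int x)
{
    x = st_find(x);
    if (st_pointee[x] < 0) {
        assert(st_count < ST_MAX);
        st_pointee[x] = st_count++;
    }
    return st_find(st_pointee[x]);
}

/* Merge two classes; their pointees are merged too, keeping one out edge. */
static void st_unify(int a, int b)
{
    a = st_find(a);
    b = st_find(b);
    if (a == b)
        return;
    st_parent[a] = b;
    if (st_pointee[a] < 0)
        return;
    if (st_pointee[b] < 0)
        st_pointee[b] = st_pointee[a];
    else
        st_unify(st_pointee[a], st_pointee[b]);
}

void st_addr_of(int p, int q)    /* p = &q */
{
    st_unify(st_target(p), q);
}

void st_copy(int p, int q)       /* p = q */
{
    int tp = st_target(p);
    int tq = st_target(q);
    st_unify(tp, tq);
}

/* 1 if q is in the class that p points to. */
int st_points_to(int p, int q)
{
    int c = st_find(p);
    return st_pointee[c] >= 0 && st_find(st_pointee[c]) == st_find(q);
}
```

After `a = &x; b = &y; a = b;` the targets `x` and `y` fall into one class, so `b` is reported as possibly pointing to `x`. An inclusion-based analysis would keep `b` pointing to `y` only, which is the precision gap the slide refers to.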

Need for Discrimination by Structure Field

[Figure: code example annotated “backward slice from here”.]

Which assignments through p should be in the slice?

• Current release: all fields participate in every operation on any field
• Must discriminate among fields
• Must consider unions and casts
• Offsets
  – Cannot use for portable analysis
  – Should use for precise platform-dependent analysis

Need for Context-Sensitive Pointer Analysis

[Figure: code example containing the assignments buf = c; and buf = d;, annotated “points-to(p) = {c, d}”, “backward slice here”, and “shouldn’t be in slice”.]

PCC-like Pointer Analysis for IRM

• Generate pointer analysis for object code
• Ship B and FP with code
• Receiver verifies
  – B corresponds to the program
  – B ⊆ FP
  – FP is a fixpoint
• Receiver avoids the iteration

[Figure: subset lattice of points-to info. Base facts B are derived from the object code, and iteration climbs from B to the fixpoint solution FP.]
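The receiver's side of this scheme can be sketched as a single pass over the statements: every base fact must already be in the shipped solution, and applying each inclusion rule once must add nothing. The sketch assumes inclusion-based rules over four statement forms (`p = &q`, `p = q`, `p = *q`, `*p = q`) and bitset points-to sets; the names `PccStmt` and `pcc_verify` are hypothetical. It does not cover the separate step of checking that B corresponds to the program.

```c
#include <assert.h>
#include <stddef.h>

enum { PCC_VARS = 32 };   /* one bit per variable in each points-to set */

typedef enum { PCC_ADDR_OF, PCC_COPY, PCC_LOAD, PCC_STORE } PccKind;

/* PCC_ADDR_OF: p = &q;  PCC_COPY: p = q;  PCC_LOAD: p = *q;  PCC_STORE: *p = q; */
typedef struct { PccKind kind; int p, q; } PccStmt;

static int pcc_subset(unsigned a, unsigned b)
{
    return (a & ~b) == 0;
}

/*
 * Return 1 if fp contains the base facts and is closed under the rules,
 * 0 otherwise. One pass over the statements; no iteration is needed.
 */
int pcc_verify(const PccStmt *s, size_t n, const unsigned fp[PCC_VARS])
{
    size_t k;
    int r;
    for (k = 0; k < n; k++) {
        int p = s[k].p, q = s[k].q;
        assert(p >= 0 && p < PCC_VARS && q >= 0 && q < PCC_VARS);
        switch (s[k].kind) {
        case PCC_ADDR_OF:                    /* base fact: q in fp(p) */
            if (!pcc_subset(1u << q, fp[p]))
                return 0;
            break;
        case PCC_COPY:                       /* fp(q) subset of fp(p) */
            if (!pcc_subset(fp[q], fp[p]))
                return 0;
            break;
        case PCC_LOAD:                       /* for r in fp(q): fp(r) subset of fp(p) */
            for (r = 0; r < PCC_VARS; r++)
                if (((fp[q] >> r) & 1u) && !pcc_subset(fp[r], fp[p]))
                    return 0;
            break;
        case PCC_STORE:                      /* for r in fp(p): fp(q) subset of fp(r) */
            for (r = 0; r < PCC_VARS; r++)
                if (((fp[p] >> r) & 1u) && !pcc_subset(fp[q], fp[r]))
                    return 0;
            break;
        }
    }
    return 1;
}
```

A solution that passes is safe to use but is not necessarily the least one: a sender could ship larger points-to sets than needed, which costs precision but not soundness.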

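PCC-like Pointer Analysis for IRM, continued

[Figure: two lattice diagrams. In the first, base facts B are derived from the source code and iteration reaches the fixpoint solution FP, which is shipped with the object code. In the second, base facts B’ are derived from the object code, a point combining FP and B’ is marked “start here”, and iteration continues from there to the fixpoint solution.]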

Need for Variable-Based Queries

• Security analysts need information flows per variable

• Point-based backward slice (implicitly w.r.t. p, a, and b)

    a = 1;
    b = 2;
    if ( … ) p = &a; else p = &b;
    x = *p;

• Point-and-variable backward slice w.r.t. a

    a = 1;
    b = 2;
    if ( … ) p = &a; else p = &b;
    x = *p;

Non-Structured Control Constructs [Horwitz]

• Unstructured jumps are a source of imprecision in slicing based on dependence graphs

• Importance
  – in C / C++
    • switch, break, continue, goto
  – in assembler and binaries

• Solution
  – distinguish between
    • transitive closure of direct control dependence
    • generalized control dependence

Model Checking

• Boolean combinations of primitive queries are not sufficient
  – e.g., chop(p,q) ≠ forward-slice(p) ∩ backward-slice(q)

• Need a language for posing generalized path queries

• Example: Find tell-tale signs of a Trojan horse in a login shell

[Figure: graph with the nodes start, authenticate, and exec.]

• Approach: use model checking on CFGs and dependence graphs
  – Model checker for CTL (interprocedurally imprecise) prototyped
  – Model checker for Modal Mu Calculus (interprocedurally precise)
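One plausible reading of the login-shell example is the path query "can exec be reached from start without passing through authenticate", which in CTL is E[¬authenticate U exec]. The sketch below checks that query on a small flat graph. The graphs `LOGIN_CLEAN` and `LOGIN_TROJAN`, the node names, and the function `reaches_avoiding` are hypothetical, and the check ignores calling context.

```c
#include <assert.h>

#define BIT(v) (1u << (v))

/* Hypothetical control-flow nodes of a login shell. */
enum { START, AUTHENTICATE, EXEC, BACKDOOR, LOGIN_NODES };

/* Successor sets: every path to EXEC goes through AUTHENTICATE. */
const unsigned LOGIN_CLEAN[LOGIN_NODES] = {
    [START]        = BIT(AUTHENTICATE),
    [AUTHENTICATE] = BIT(EXEC),
    [EXEC]         = 0,
    [BACKDOOR]     = 0,
};

/* Same program with an extra branch that skips authentication. */
const unsigned LOGIN_TROJAN[LOGIN_NODES] = {
    [START]        = BIT(AUTHENTICATE) | BIT(BACKDOOR),
    [AUTHENTICATE] = BIT(EXEC),
    [EXEC]         = 0,
    [BACKDOOR]     = BIT(EXEC),
};

/* 1 if some path from start reaches target without first visiting avoid. */
int reaches_avoiding(const unsigned succ[], int start, int target, int avoid)
{
    unsigned seen = BIT(start);   /* nodes discovered so far */
    unsigned work = seen;         /* discovered but not yet expanded */
    unsigned fresh;
    int v;
    assert(start >= 0 && start < 32);
    while (work) {
        v = 0;
        while (!((work >> v) & 1u))   /* pick the lowest pending node */
            v++;
        work &= ~BIT(v);
        if (v == target)
            return 1;
        if (v == avoid)
            continue;                 /* paths may not pass through avoid */
        fresh = succ[v] & ~seen;
        seen |= fresh;
        work |= fresh;
    }
    return 0;
}
```

The query fails on `LOGIN_CLEAN` and succeeds on `LOGIN_TROJAN`, where the `BACKDOOR` branch reaches `EXEC` directly. A query of this shape cannot be written as a Boolean combination of slices, which is the motivation for a path-query language.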

Summary

• Extensive static-analysis infrastructure
  – constructing dependence graphs
  – performing precise interprocedural information-flow queries
  – inspecting flows

• Applicable today to real C programs [demo]

• Limitations being addressed now
  – precision
    • pointer analysis
    • non-structured control constructs
    • variable-based queries
  – performance
  – query expressiveness

• Upcoming inquiry
  – verifiable static analysis for efficient IRM insertion