Upload
jasmine-mccormick
View
218
Download
0
Tags:
Embed Size (px)
Citation preview
Static Program Analysisvia Three-Valued Logic
Mooly Sagiv (Tel Aviv), Thomas Reps (Madison),
Reinhard Wilhelm (Saarbrücken)
• A sailor on the U.S.S. Yorktown entered a 0 into a data field in a kitchen-inventory program. That caused the database to overflow and crash all LAN consoles and miniature remote terminal units.
• The Yorktown was dead in the water for about two hours and 45 minutes.
Analysis musttrack
numericinformation
Full Employmentfor Verification Experts
x = 3;y = 1/(x-3);
x = 3;px = &x;y = 1/(*px-3);
x = 3;p = (int*)malloc(sizeof int);*p = x;q = p;y = 1/(*q-3);
need to track valuesother than 0
need to track pointers
need to track heap-allocatedstorage
Flow-SensitivePoints-To Analysis
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
3
2
1
4
a = &e
5
c = &f
*b = c
b = a
d = *a
p = &q;
p = q;
p = *q;
*p = q;
p q
pr1
r2
q
r1
r2
qs1
s2
s3
p
ps1
s2
qr1
r2
p q
pr1
r2
q
r1
r2
qs1
s2
s3
p
ps1
s2
qr1
r2
Flow-InsensitivePoints-ToAnalysis
3
2
1
4
a = &e
5
c = &f
*b = c
b = a
d = *a
3
2
1
4
a = &e
5
c = &f
*b = c
b = a
d = *a
3
21
4
5
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
bc
f
e
a
d
b
cf
e
Shape Analysis [Jones and Muchnick 1981]
• Characterize dynamically allocated data– Identify may-alias relationships
– x points to an acyclic list, cyclic list, tree, dag, …
– “disjointedness” properties• x and y point to structures that do not share cells
– show that data-structure invariants hold
• Account for destructive updates through pointers
Applications: Software Tools
• Static detection of memory errors– dereferencing NULL pointers– dereferencing dangling pointers– memory leaks
• Static detection of logical errors– Is a data-structure invariant restored?
Applications: Code Optimization
• Parallelization– Operate in parallel on disjoint
structures
• Software prefetching
• “Compile-time garbage collection”– Insert storage-reclamation operations
• Eliminate or move “checking code”
Why is Shape Analysis Difficult?
• Destructive updating through pointers– p next = q– Produces complicated aliasing relationships
• Dynamic storage allocation– No bound on the size of run-time data structures
• Data-structure invariants typically only hold at the beginning and end of operations– Want to verify that data-structure invariants are
re-established
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
1 2 3 NULL
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Materialization
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Example: In-Situ List Reversal
List reverse (List x) { List y, t; y = NULL; while (x != NULL) { t = y; y = x; x = x next; y next = t; } return y;}
typedef struct list_cell { int val; struct list_cell *next;} *List;
x
yt
NULL
Idea for a List Abstraction
represents x
yt
NULL
xyt
NULL
xyt
NULL
xyt
NULL
x
yt
NULL
x
yt
NULL
x
yt
return y
t = y
ynext = t
y = x
x = xnext
x != NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
x
yt
x
yt
Properties of reverse(x)• On entry, x points to an acyclic list• On each iteration, x & y point to disjoint acyclic
lists• All the pointer dereferences are safe• No memory leaks• On exit, y points to an acyclic list• On exit, x = = NULL• All cells reachable from y on exit were reachable from x on entry, and vice versa• On exit, the order between neighbors in the y-list is opposite to their order in the x-list on entry
A ‘Yacc’ for Shape Analysis: TVLA
• Parametric framework– Some instantiations known analyses– Other instantiations new analyses
• Applications beyond shape analysis– Partial correctness of sorting algorithms– Safety of mobile code– Deadlock detection in multi-threaded
programs– Partial correctness of mark-and-sweep gc alg.– Correct usage of Java iterators
A ‘Yacc’ for Static Analysis: TVLA
• Parametric framework– Some instantiations known analyses– Other instantiations new analyses
• Applications beyond shape analysis– Partial correctness of sorting algorithms– Safety of mobile code– Deadlock detection in multi-threaded
programs– Partial correctness of mark-and-sweep gc alg.– Correct usage of Java iterators
Formalizing “. . .”Informal:
x
Formal:
xSummary
node
Using Relations to Represent Linked Lists
Relation Intended Meaning
x(v) Does pointer variable x point to cell v?
y(v) Does pointer variable y point to cell v?
t(v) Does pointer variable t point to cell v?
n(v1,v2) Does the n field of v1 point to v2?
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
x(u) y(u) t(u)u1 1 1 0u2 0 0 0u3 0 0 0u4 0 0 0
u1 u2 u3 u4
xy
Using Relations to Represent Linked Lists
Formulas:Queries for Observing Properties
Are x and y pointer aliases?v: x(v) y(v)
xy u1 u2 u3 u4
Are x and y Pointer Aliases?
v: x(v) y(v)
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
x(u) y(u) t(u)u1 1 1 0u2 0 0 0u3 0 0 0u4 0 0 0
xy u1
1
Yes
Predicate-Update Formulas for “y
= x”
•x’(v) = x(v)•y’(v) = x(v)•t’(v) = t(v)
•n’(v1,v2) = n(v1,v2)
x(u) y(u) t(u)u1 1 0 0u2 0 0 0u3 0 0 0u4 0 0 0
x
u1 u2 u3 u4
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
y’(v) = x(v)
1000
y
Predicate-Update Formulas for “y
= x”
Predicate-Update Formulas for “x = x
n”
•x’(v) = v1: x(v1) n(v1,v)
•y’(v) = y(v)•t’(v) = t(v)
•n’(v1, v2) = n(v1, v2)
x
u1 u2 u3 u4
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
x(u) y(u) t(u) u1 1 1 0 u2 0 0 0 u3 0 0 0 u4 0 0 0
y
x’(v) = v1: x(v1) n(v1,v)
x
Predicate-Update Formulas for “x = x
n”
Why is Shape Analysis Difficult?
• Destructive updating through pointers– pnext = q– Produces complicated aliasing relationships
• Dynamic storage allocation– No bound on the size of run-time data
structures• Data-structure invariants typically only
hold at the beginning and end of operations– Need to verify that data-structure invariants
are re-established
Two- vs. Three-Valued Logic
0 1
Two-valued logic
{0,1}
{0} {1}
Three-valued logic
{0} {0,1}{1} {0,1}
Two- vs. Three-Valued Logic
Two-valued logic
1 01 1 00 0 0
1 01 1 10 1 0
Three-valued logic
{1} {0,1} {0}
{1} {1} {0,1} {0}{0,1} {0,1} {0,1} {0}{0} {0} {0} {0}
{1} {0,1} {0}
{1} {1} {1} {1}{0,1} {1} {0,1} {0,1}{0} {1} {0,1} {0}
Two- vs. Three-Valued Logic
0 1
Two-valued logic
{0} {1}
Three-valued logic
{0,1}
Two- vs. Three-Valued Logic
0 1
Two-valued logic
½
0 1
Three-valued logic
0 ½
1 ½
Boolean Connectives [Kleene]
0 1/2 1
0 0 0 01/2 0 1/2 1/21 0 1/2 1
0 1/2 1
0 0 1/2 11/2 1/2 1/2 11 1 1 1
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
Canonical Abstraction
u1 u2 u3 u4
xu1
xu234
x(u) y(u)u1 1 0u2 0 0u3 0 0u4 0 0
n u1 u234
u1 0
u234 0 1/2
x(u) y(u)u1 1 0
u234 0 0
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
Canonical Abstraction
u1 u2 u3 u4
xu1
xu234
x(u) y(u)u1 1 0u2 0 0u3 0 0u4 0 0
n u1 u234
u1 0
u234 0 1/2
x(u) y(u)u1 1 0
u234 0 0
= u1 u2 u3 u4 u1 1 0 0 0 u2 0 1 0 0 u3 0 0 1 0 u4 0 0 0 1
Canonical Abstraction
u1 u2 u3 u4
xu1
xu234
x(u) y(u)u1 1 0u2 0 0u3 0 0u4 0 0
= u1 u234 u1 1
u234 0 1/2
x(u) y(u)u1 1 0
u234 0 0
Canonical Abstraction
•Partition the individuals into equivalence classes based on the values of their unary predicates
•Collapse other predicates via
Property-Extraction Principle
• Questions about a family of two-valued stores can be answered conservatively by evaluating a formula in a three-valued store
• Formula evaluates to 1 formula holds in every store in the family
• Formula evaluates to 0 formula does not hold in any store in the family
• Formula evaluates to 1/2 formula may hold in some; not hold in others
Are x and y Pointer Aliases?
u1 u
xy
v: x(v) y(v)
Yes
1
Is Cell u Heap-Shared?
v1,v2: n(v1,u) n(v2,u) v1 v2
u
Yes
1 1
1
1
MaybeIs Cell u Heap-Shared?
v1,v2: n(v1,u) n(v2,u) v1 v2
u1 u
xy
1/21/2 1
1/2
The Embedding Theorem
y
x
u1 u34u2
y
x
u1 u234
y
x
u1 u3u2 u4
x
yu1234
v: x(v) y(v)
Maybe
No
No
No
Embedding
u1 u2 u3 u4
xu5 u6
u12 u34 u56
x
u123 u456
x
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
Canonical Abstraction:An Embedding Whose Result is of Bounded Size
u1 u2 u3 u4
xu1
xu234
x(u) y(u)u1 1 0u2 0 0u3 0 0u4 0 0
n u1 u2 u3 u4
u1 0 1 0 0u2 0 0 1 0u3 0 0 0 1u4 0 0 0 0
n u1 u234
u1 0
u234 0 1/2
x(u) y(u)u1 1 0u2 0 0u3 0 0u4 0 0
x(u) y(u)u1 1 0
u234 0 0
x
yt
NULL
return y
t = y
ynext = t
y = x
x = xnext
x != NULL
x
yt
NULL
x
yt
NULL
u1 u
x
x(u) y(u) t(u)u1 1 0 0u 0 0 0
n u1 u
u1 0 1/2
u 0 1/2
yu1 u
x
x(u) y(u) t(u) u1 1 1 0 u 0 0 0
n u1 u
u1 0 1/2
u 0 1/2
y’(v) = x(v)
10
x
yt
NULL
x
yt
NULL
x
yt
return y
t = y
ynext = t
y = x
x = xnext
x != NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
NULL
x
yt
x
yt
x
yt
How Are We Doing?
• Conservative • Convenient • But not very precise
– Advancing a pointer down a list loses precision– Cannot distinguish an acyclic list from a cyclic
list
The Instrumentation Principle
• Increase precision by storing the truth-value of some chosen formulas
Is Cell u Heap-Shared?
v1,v2: n(v1,u) n(v2,u) v1 v2
u
is = 0 is = 0 is = 0 is = 0
Example: Heap Sharing
x 31 71 91
is(v) = v1,v2: n(v1,v) n(v2,v) v1 v2
u1 ux
u1 ux
is = 0 is = 0
Example: Heap Sharing
x 31 71 91
is(v) = v1,v2: n(v1,v) n(v2,v) v1 v2
is = 0 is = 0 is = 0 is = 0is = 1
u1 ux
u1 ux
is = 0 is = 0is = 1
Example: Cyclicity
x 31 71 91
c(v) = v1: n(v,v1) n*(v1,v)
c = 0 is = 0 c = 1 c = 1c = 1
u1 ux
u1 ux
c = 0 c = 1
Is Cell u Heap-Shared?
v1,v2: n(v1,u) n(v2,u) v1 v2
u1 u
xy
is = 0 is = 0
No!
1/21/2 1
1/2 Maybe
Formalizing “. . .”Informal:
x
y
Formal:x
y
Formalizing “. . .”Informal:
x
y
t2
t1
Formal:x
y t2
t1
Formalizing “. . .”Informal:
x
y
Formal:x
y
reachable fromvariable x
reachable fromvariable y
r[x]
r[y]
r[x]
r[y]
Formalizing “. . .”Informal:
x
y
t2
t1
Formal:
t2
t1
r[x],r[t1]
r[y],r[t2]
r[x],r[t1]
r[y],r[t2]
x
yr[y]
r[x] r[x]
r[y]
• doubly-linked(v)• reachable-from-variable-x(v)• acyclic-along-dimension-d(v)• tree(v)• dag(v)• AVL trees:
– balanced(v), left-heavy(v), right-heavy(v)
– . . . but not via height arithmetic
Useful Instrumentation Predicates
NeedFO + TC
Materialization
x = xn
Informal:
xy y
x
x = xn
Formal:
xy
x
y
Formal:
xy x = xn y
x
Naïve Transformer (x = x n)
xy
Evaluateupdate
formulas
y
x
Best Transformer (x = x n)
xy
y
x
y
x
yx
yx
...Evaluateupdateformulas
y
x
y
x
...
“Focus”-Based Transformer (x = x n)
xy
yx
yx
Focus(x n)
“Partial ”
y
x
y
x
Evaluateupdateformulas
y
x
y
x
Why is Shape Analysis Difficult?
• Destructive updating through pointers– pnext = q– Produces complicated aliasing relationships– Track aliasing using 3-valued structures
• Dynamic storage allocation– No bound on the size of run-time data structures– Canonical abstraction finite-sized 3-valued
structures
• Data-structure invariants typically only hold at the beginning and end of operations– Need to verify that data-structure invariants are re-
established– Query the 3-valued structures that arise at the exit
What to Take Away
• A ‘yacc’ for static analysis based on logic
• Broad scope of potential applicability– Not just linkage properties:
predicates are not restricted to be links!– Discrete systems in which a relational (+
numeric) structure evolves– Transition: evolution of relational + numeric state
Canonical Abstraction
A family of abstractions for use in
logic
Questions?