HAVOC: A precise and scalable verifier for systems software
Shaz Qadeer, Microsoft Research
Collaborators
• Researchers: Jeremy Condit, Shuvendu Lahiri
• Interns: Shaunak Chatterjee, Brian Hackett, Zvonimir Rakamaric, Ian Wehrman, Thomas Wies
HAVOC
• Modular verifier for C programs
  – Verifies each procedure separately
  – Requires contracts: preconditions, postconditions, modifies clauses, loop invariants
• Features
  – Accurate heap model
  – Expressive annotation language
  – Efficient checking using SMT solvers
• Precise and efficient reasoning for loop-free and call-free code
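The kinds of contract listed above can be pictured on a small C procedure. The sketch below is not HAVOC's annotation syntax: it states each contract in a comment and mirrors the precondition, loop invariant, and postcondition with runtime `assert` calls. The function `sum_prefix` and its contract are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not HAVOC syntax.
 * requires: a != NULL, and every a[i] with i < n lies in [0, 1000]
 * ensures:  0 <= result <= 1000 * n
 * modifies: nothing
 */
long sum_prefix(const int *a, size_t n) {
    assert(a != NULL);                              /* precondition */
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* loop invariant: 0 <= total <= 1000 * i */
        assert(total >= 0 && total <= 1000 * (long)i);
        assert(a[i] >= 0 && a[i] <= 1000);          /* precondition on elements */
        total += a[i];
    }
    assert(total >= 0 && total <= 1000 * (long)n);  /* postcondition */
    return total;
}
```

A modular verifier would check the body once against this contract, and callers would rely on the contract alone rather than on the body.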
[Figure: HAVOC tool pipeline. Annotated C program → Visual C front end → control flow graph → C-to-BoogiePL translator (with memory model) → Boogie program → Boogie VC generator → verification condition → Z3 SMT solver → Verified / Warning]
Challenges for HAVOC
• Concise and precise expression of non-aliasing and disjointness of heap values
• Properties of unbounded collections
  – Lists, Arrays, …
• Enable such reasoning for low-level software
  – pointer arithmetic
  – interior pointers
  – nested structures and unions
  – …
But will programmers ever write contracts?
• In some cases, they might
  – security properties: thousands of buffer annotations in Windows code
  – maintenance of critical legacy code: the Windows NT file system
• Automatic annotation inference
  – precise and efficient checking of annotated programs is a crucial first step
Roadmap
• Novel features of the specification language
• Dealing with low-level features of C
• Concluding remarks
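[Figure: doubly linked list of LinkNode cells, each with next, prev, and data fields, running from log_list.head to log_list.tail. Each data field points to a struct _logentry with fields channel_name and file_name (char *) and logtype.]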
[muh: Internet Relay Chat (IRC) bouncer]
    LinkNode *iter = log_list.head;
    while (iter != NULL) {
        struct _logentry *entry = iter->data;
        free (entry->channel_name);
        free (entry->file_name);
        free (entry);
        entry = NULL;
        iter = iter->next;
    }
For every node x in the list between log_list.head and null:
  x->data is a unique pointer, and
  x->data->channel_name is a unique pointer, and
  x->data->file_name is a unique pointer.
Universal quantification
Reachability predicate
Data structure invariant
Ensure absence of double free
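To make the first conjunct of the invariant concrete, the sketch below checks at run time that no two nodes of a list share a data pointer, which is what rules out freeing the same entry twice in the loop. The type definitions are hypothetical stand-ins for the muh types (the type of `logtype` in particular is a guess), and a verifier would establish this property statically rather than by traversal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the muh types named on the slide. */
struct _logentry { char *channel_name; char *file_name; int logtype; };
typedef struct LinkNode {
    struct LinkNode *next, *prev;
    struct _logentry *data;
} LinkNode;

/* Returns true when no two nodes reachable from head share a data pointer,
 * i.e. every x->data is unique within the list. Quadratic, for clarity. */
bool data_pointers_unique(const LinkNode *head) {
    for (const LinkNode *x = head; x != NULL; x = x->next)
        for (const LinkNode *y = x->next; y != NULL; y = y->next)
            if (x->data == y->data)
                return false;
    return true;
}
```

The same pairwise check could be repeated for `channel_name` and `file_name` to cover the remaining conjuncts.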
Limitations of SMT solvers
• No support for precise reasoning with reachability predicate
  – Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
  – Complexity: NP-complete (ground) → undecidable
  – Leads to unpredictable behavior of verifiers
    • Proof times, proof success rate
  – Requires user ingenuity to craft axioms/invariants with quantifiers
Contribution
• Expressive and efficient logic for precise reasoning about reachability, unique pointers, and restricted quantification
• A decision procedure for the logic built over an SMT solver
Simple Java-like memory model
• Heap consists of a set of objects (obj)
• Each field “f” is a mutable map
  – $f : \mathit{obj} \to \mathit{obj}$
  – $g : \mathit{obj} \to \mathit{int}$
  – $h : \mathit{obj} \to \mathit{bool}$
• The sort obj may be refined into a collection of sorts
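One way to picture this memory model is to represent objects as small integers and each field as its own array indexed by object. The sketch below is only an illustration under that assumption: `NOBJ`, `struct heap`, and the helper names are hypothetical.

```c
#include <assert.h>

#define NOBJ 8            /* hypothetical heap size; object 0 plays the role of null */
typedef int obj;

/* Each field is its own map from objects to values. */
struct heap {
    obj next[NOBJ];       /* f : obj -> obj  */
    int data[NOBJ];       /* g : obj -> int  */
    int locked[NOBJ];     /* h : obj -> bool */
};

/* The statement x->next = y becomes an update of the map next at x. */
void set_next(struct heap *h, obj x, obj y) { h->next[x] = y; }

/* The expression x->data becomes a read of the map data at x. */
int get_data(const struct heap *h, obj x) { return h->data[x]; }
```

Because each field is a separate map, an update to `next` cannot change any value of `data`, which is the non-aliasing that field-based models provide for free.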
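Reachability predicate: $\mathit{Btwn}_f$
[Figure: doubly linked list with nodes x and y marked; the nodes from x to y along next form $\mathit{Btwn}_{next}(x, y)$, and the same nodes read from y to x along prev form $\mathit{Btwn}_{prev}(y, x)$.]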
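The slide does not spell out the semantics of the predicate, so the sketch below assumes one reading: t is in Btwn_f(x, y) when t lies on the path x, f(x), f(f(x)), ... up to and including the first occurrence of y, and the set is empty when y is never reached. The field f is modeled as an array over a small hypothetical heap of `NOBJ` objects, with object 0 standing for null and f(null) = null.

```c
#include <assert.h>
#include <stdbool.h>

#define NOBJ 8   /* hypothetical heap size; object 0 plays the role of null */

/* Returns true when t lies in Btwn_f(x, y) under the assumed reading:
 * t is on the path x, f(x), f(f(x)), ... up to and including the first
 * occurrence of y. If y is never reached, the set is empty.
 * All entries of f must be in the range [0, NOBJ). */
bool in_btwn(const int f[NOBJ], int x, int y, int t) {
    bool seen_t = false;
    int cur = x;
    for (int steps = 0; steps < NOBJ; steps++) {
        if (cur == t) seen_t = true;
        if (cur == y) return seen_t;   /* reached y: t is in the set iff already seen */
        cur = f[cur];
    }
    return false;                      /* y is not reachable from x */
}
```

The loop is bounded by `NOBJ` because a path over `NOBJ` objects either reaches y within that many steps or has entered a cycle that does not contain y.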
Inverse of a function: $f^{-1}$
[Figure: doubly linked list in which the data fields of nodes x and y both point to the same object w.]
$\mathit{data}^{-1}(w) = \{x, y\}$
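Over a finite heap the inverse image can be computed by a scan. The sketch below assumes the field is stored as an array indexed by object, as in a map-based memory model; `NOBJ` and `inverse` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NOBJ 8   /* hypothetical heap size */

/* Collects f^{-1}(w) = { x | f(x) = w } into out and returns its size.
 * out must have room for NOBJ entries. */
size_t inverse(const int f[NOBJ], int w, int out[NOBJ]) {
    size_t n = 0;
    for (int x = 0; x < NOBJ; x++)
        if (f[x] == w)
            out[n++] = x;
    return n;
}
```

A pointer stored in field f at x is unique exactly when `inverse` applied to f(x) returns the single element x.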
Data structure invariant
For every node x in the list between log_list.head and null:
  x->data is a unique pointer, and ….
$\forall x \in \mathit{Btwn}_f(\mathit{log\_list.head}, \mathit{null}) \setminus \{\mathit{null}\}.\ \mathit{data}^{-1}(\mathit{data}(x)) = \{x\} \wedge \ldots$
Expressive logic
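• Express properties of collections
  $\forall x \in \mathit{Btwn}_f(f(hd), hd).\ \mathit{state}(x) = \mathit{LOCKED}$  // cyclic
• Arithmetic reasoning on data (e.g. sortedness)
  $\forall x \in \mathit{Btwn}_f(hd, null) \setminus \{null\}.\ \forall y \in \mathit{Btwn}_f(x, null) \setminus \{null\}.\ d(x) \le d(y)$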
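The sortedness formula reads directly as two nested traversals. The sketch below evaluates it at run time on an acyclic, NULL-terminated list; the node type and the function name are hypothetical, with `f` as the link field and `d` as the data field to match the formula.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type: f is the link field, d the data field. */
struct node { struct node *f; int d; };

/* Checks: for every x in the list from hd, and every y from x onward,
 * d(x) <= d(y). Assumes the list is acyclic and ends in NULL. */
bool is_sorted(const struct node *hd) {
    for (const struct node *x = hd; x != NULL; x = x->f)
        for (const struct node *y = x; y != NULL; y = y->f)
            if (x->d > y->d)
                return false;
    return true;
}
```

The outer loop plays the role of the quantifier over Btwn_f(hd, null) \ {null}, and the inner loop the quantifier over Btwn_f(x, null) \ {null}.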
Precise
• Given the Floyd-Hoare triple X = {P} S {Q}
  – P and Q are expressed in our logic
  – S is a loop-free call-free program
• We can construct a formula Y in our logic
  – Y is linear in the size of X
  – X is valid iff Y is valid
Need annotations/abstractions only at procedure/loop boundaries
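As a small worked instance of turning a triple into a formula, consider a single assignment over the integers. The example uses the textbook weakest-precondition construction, which is an assumption here; the talk does not say how Y is built, and the triple itself is hypothetical.

```latex
\[
X \;=\; \{\, x \ge 0 \,\}\;\; y := x + 1 \;\;\{\, y > 0 \,\}
\]
\[
wp(y := x + 1,\; y > 0) \;=\; (x + 1 > 0)
\]
\[
Y \;=\; \big(\, x \ge 0 \;\Rightarrow\; x + 1 > 0 \,\big)
\]
```

Here Y is valid over the mathematical integers, so the triple X is valid, and Y has the same order of size as X.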
Efficient
• Decision problem is NP-complete
  – Can’t expect any better with propositional logic!
  – Retains the complexity of current SMT logics
• Provide a decision procedure for the logic on top of state-of-the-art Z3 SMT solver
  – Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)
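Ground Logic
(Term and GFormula form the ground fragment of the logic.)
$t \in \mathit{Term} ::= c \mid x \mid t_1 + t_2 \mid t_1 - t_2 \mid f(t)$
$G \in \mathit{GFormula} ::= t = t' \mid t < t' \mid t \in \mathit{Btwn}_f(t_1, t_2) \mid \neg G$
$S \in \mathit{Set} ::= f^{-1}(t) \mid \mathit{Btwn}_f(t_1, t_2)$
$F \in \mathit{Formula} ::= G \mid F_1 \wedge F_2 \mid F_1 \vee F_2 \mid \forall x \in S.\ F$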
Ground decision procedure
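• Provide a set of 10 rewrite rules for $\mathit{Btwn}_f$
  – Sound, complete and terminating
• E.g. Transitivity3
  $$\frac{t_1 \in \mathit{Btwn}_f(t_0, t_2) \qquad t \in \mathit{Btwn}_f(t_0, t_1)}{t \in \mathit{Btwn}_f(t_0, t_2), \quad t_1 \in \mathit{Btwn}_f(t, t_2)}$$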
Logic: bounded quantification over interpreted sets
$S \in \mathit{Set} ::= f^{-1}(t) \mid \mathit{Btwn}_f(t_1, t_2)$
$F \in \mathit{Formula} ::= G \mid F_1 \wedge F_2 \mid F_1 \vee F_2 \mid \forall x \in S.\ F$
Lazy quantifier instantiation
• Instantiation rule
  $$\frac{t \in S \qquad \forall x \in S.\ F}{F[t/x]}$$
• Lazy instantiation
  – Instantiate only when a term t belongs to the set S
  – Substantially reduces the number of terms to instantiate a quantified fact
• Terminates if $\forall x \in S.\ F$ is sort-restricted
  – sort(x) is less than sort(t[x]) for any term t[x] in F
Experience
• Compared with an earlier implementation
  – Unrestricted quantifiers, incomplete axiomatization of reachability, no $f^{-1}$
  – Small to medium sized benchmarks
• Greatly improved the predictability of HAVOC
  – Reduced runtimes (2X – 100X)
  – Eliminated the need for carefully crafted axioms and invariants
  – Can handle newer examples
Roadmap
• Novel features of the specification language
• Dealing with low-level features of C
• Concluding remarks
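[Figure: memory layout of a record: data1, next, prev, data2. p points to the embedded list node (the next field); q points to the start of the enclosing record.]
q = CONTAINER(p, record, node)
  = (record *) ((int *) p - (int) (&(((record *)0)->node)))
  = (record *) ((int *) p - 1)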
    struct list {
        list *next;
        list *prev;
    };

    struct record {
        int data1;
        list node;
        int data2;
    };
    void init_all_records(list *p) {
        while (p != NULL) {
            init_record(p);
            p = p->next;
        }
    }
• Type safety requires nontrivial reasoning
  – the container of every element in list has type record*
• Use of memory model with field abstraction is unsound
• Field abstraction is crucial to all property checkers
  – &a->data1 is not aliased to &b->data2
  – init_all_records(p) preserves the assertion a->data1 == 0
    void init_record(list *p) {
        record *r = CONTAINER(p, record, node);
        r->data2 = 42;
    }
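The container idiom can be compiled and run as follows. This is a sketch rather than the code from the talk: it adds typedefs so the slide's shorthand compiles, and it defines `CONTAINER` with the standard `offsetof` macro and byte arithmetic instead of the word arithmetic shown on the slide.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list { struct list *next; struct list *prev; } list;
typedef struct record { int data1; list node; int data2; } record;

/* Portable variant of the slide's macro: subtract the byte offset of the
 * member from the member's address to recover the enclosing structure. */
#define CONTAINER(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Writes data2 of the record that embeds the list node p. */
void init_record(list *p) {
    record *r = CONTAINER(p, record, node);
    r->data2 = 42;
}

/* Walks the embedded list and initializes every enclosing record. */
void init_all_records(list *p) {
    while (p != NULL) {
        init_record(p);
        p = p->next;
    }
}
```

The write through `r->data2` reaches memory outside the `list` object that `p` points to, which is why a model that treats `list` fields and `record` fields as unrelated needs the container invariant to stay sound. The write never touches `data1`, so an assertion such as `a->data1 == 0` is preserved.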
Unify type checking and property checking
• Harness the power of constraint solvers to enhance type checking
  – type safety often depends on program-specific invariants
• Harness the strong guarantees provided by the type invariant to enhance property checking
  – non-aliasing, field abstraction
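[Figure: memory as a map from integer addresses (e.g. 100, 101, 102) to integer values (e.g. 100, 99), alongside a map from addresses to types drawn from Int, Ptr(Int), List, Ptr(List), Record, Ptr(Record).]
Mem: $\mathit{int} \to \mathit{int}$ (mutable)
Type: $\mathit{int} \to \mathit{type}$ (immutable)
Type invariant: $\forall a{:}\mathit{int}.\ \mathit{HasType}(\mathit{Mem}(a), \mathit{Type}(a))$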
    void init_record(list *p) {
        record *r = CONTAINER(p, record, node);
        r->data2 = 42;
    }

    requires ∀a:int. HasType(Mem(a), Type(a))
    requires HasType(p, Ptr(List))
    ensures ∀a:int. HasType(Mem(a), Type(a))
    void init_record(int p) {
        var r:int;
        r := p-1;
        assert HasType(r, Ptr(Record));
        Mem(r+3) := 42;
        assert ∀a:int. HasType(Mem(a), Type(a));
    }
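$\mathit{Match}(a, \mathit{Int}) \Leftrightarrow \mathit{Type}(a) = \mathit{Int}$
$\mathit{Match}(a, \mathit{Ptr}(t)) \Leftrightarrow \mathit{Type}(a) = \mathit{Ptr}(t)$
$\mathit{Match}(a, \mathit{List}) \Leftrightarrow \mathit{Match}(a, \mathit{Ptr}(\mathit{List})) \wedge \mathit{Match}(a+1, \mathit{Ptr}(\mathit{List}))$
$\mathit{Match}(a, \mathit{Record}) \Leftrightarrow \mathit{Match}(a, \mathit{Int}) \wedge \mathit{Match}(a+1, \mathit{List}) \wedge \mathit{Match}(a+3, \mathit{Int})$
$\mathit{HasType}(v, \mathit{Int}) \Leftrightarrow \mathit{true}$
$\mathit{HasType}(v, \mathit{Ptr}(t)) \Leftrightarrow v = 0 \vee (v > 0 \wedge \mathit{Match}(v, t))$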
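The definitions above can be executed over a tiny word-addressed memory. In this sketch, `MEMSIZE`, the `T_` enum names, and `type_invariant` are hypothetical, pointer types are limited to one level, and HasType on List or Record (which the slides leave undefined) is treated as false. The last function mirrors the translated `init_record`, with its two assertions checked at run time rather than proved.

```c
#include <assert.h>
#include <stdbool.h>

#define MEMSIZE 16   /* hypothetical word-addressed memory; address 0 is null */

typedef enum { T_INT, T_LIST, T_RECORD,
               T_PTR_INT, T_PTR_LIST, T_PTR_RECORD } type;

int  Mem[MEMSIZE];    /* mutable map: address -> value */
type Type[MEMSIZE];   /* immutable map: address -> type of the word stored there */

/* Match(a, t): the words starting at address a are laid out as a value of type t. */
bool match(int a, type t) {
    if (a < 0 || a >= MEMSIZE) return false;
    switch (t) {
    case T_LIST:   return match(a, T_PTR_LIST) && match(a + 1, T_PTR_LIST);
    case T_RECORD: return match(a, T_INT) && match(a + 1, T_LIST) && match(a + 3, T_INT);
    default:       return Type[a] == t;          /* Int and Ptr(t) */
    }
}

/* HasType(v, t): the word value v is a legal inhabitant of type t. */
bool has_type(int v, type t) {
    switch (t) {
    case T_INT:        return true;
    case T_PTR_INT:    return v == 0 || (v > 0 && match(v, T_INT));
    case T_PTR_LIST:   return v == 0 || (v > 0 && match(v, T_LIST));
    case T_PTR_RECORD: return v == 0 || (v > 0 && match(v, T_RECORD));
    default:           return false;  /* List, Record: undefined on the slides (assumption) */
    }
}

/* Type invariant: every word in memory has the type recorded for its address. */
bool type_invariant(void) {
    for (int a = 0; a < MEMSIZE; a++)
        if (!has_type(Mem[a], Type[a])) return false;
    return true;
}

/* Run-time mirror of the translated init_record. */
void init_record(int p) {
    int r = p - 1;
    assert(has_type(r, T_PTR_RECORD));
    Mem[r + 3] = 42;
    assert(type_invariant());
}
```

With a record laid out at addresses 4 to 7, the list node sits at address 5, so calling `init_record(5)` recovers r = 4 and writes the data2 word at address 7.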
Additional precondition for init_record:

    requires HasType(p-1, Ptr(Record)) ∧ p - 1 ≠ 0
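$\mathit{Match}(a, \mathit{Int}) \Leftrightarrow \mathit{Type}(a) = \mathit{Int}$
$\mathit{Match}(a, \mathit{Data1}) \Leftrightarrow \mathit{Type}(a) = \mathit{Data1}$
$\mathit{Match}(a, \mathit{Data2}) \Leftrightarrow \mathit{Type}(a) = \mathit{Data2}$
$\mathit{Match}(a, \mathit{Ptr}(t)) \Leftrightarrow \mathit{Type}(a) = \mathit{Ptr}(t)$
$\mathit{Match}(a, \mathit{List}) \Leftrightarrow \mathit{Match}(a, \mathit{Ptr}(\mathit{List})) \wedge \mathit{Match}(a+1, \mathit{Ptr}(\mathit{List}))$
$\mathit{Match}(a, \mathit{Record}) \Leftrightarrow \mathit{Match}(a, \mathit{Data1}) \wedge \mathit{Match}(a+1, \mathit{List}) \wedge \mathit{Match}(a+3, \mathit{Data2})$
$\mathit{HasType}(v, \mathit{Int}) \Leftrightarrow \mathit{true}$
$\mathit{HasType}(v, \mathit{Data1}) \Leftrightarrow \mathit{true}$
$\mathit{HasType}(v, \mathit{Data2}) \Leftrightarrow \mathit{true}$
$\mathit{HasType}(v, \mathit{Ptr}(t)) \Leftrightarrow v = 0 \vee (v > 0 \wedge \mathit{Match}(v, t))$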
Other highlights
• Decision procedure for type safety
  – suffices to instantiate the type invariant and the definitions of Match and HasType on a few terms
• Extensions
  – unions
  – function pointers
  – parametric polymorphism
  – user-defined types
  – sub-word accesses (char, short)
Experience
• Property checking on small benchmarks
  – list manipulation: insertion, removal, multiple lists each with a different container type
  – sorting: bubble sort, merge sort, quick sort
  – intuitive and concise annotations
• Type checking of four WDK drivers
  – cancel, event, kbfiltr, vserial
  – ~1 min to check each driver
  – ~5KLOC, ~225 annotations
Roadmap
• Novel features of the specification language
• Dealing with low-level features of C
• Concluding remarks
Other case studies with HAVOC
• Synchronization protocols protecting critical data structures in the NT file system (Brian Hackett)
  – ~300KLOC, 1500 procedures
  – reference count usage, lock usage, data races, teardown races
  – 45 confirmed bugs (out of 125 warnings)
  – most bugs fixed
• Spin lock usage in Windows device drivers (Juan Pablo Galeotti, Thomas Wies)
  – flpydisk, kbdclass, daytona, serial (~50KLOC)
HAVOC is available
• Download: http://research.microsoft.com/projects/HAVOC
Future directions
• Unified decision procedure for reachability, inverse, arrays, and types for the low-level memory model
• Exploiting type invariant for property checking on device drivers
• Annotation inference
Questions