Benefits of Bounded Model Checking at an Industrial Setting
F. Copty, L. Fix, R. Fraer, E. Giunchiglia*, G. Kamhi, A. Tacchella*, M.Y. Vardi**
Intel Corp., Haifa, Israel
*Università di Genova, Genova, Italy
**Rice University, Houston (TX), USA
intel
Technical framework
Symbolic Model Checking (MC)
• Over 10 years of successful application in formal verification of hardware and protocols
• Traditionally based on reduced ordered Binary Decision Diagrams (BDDs)
Symbolic Bounded Model Checking (BMC)
• Introduced recently, but shown to be extremely effective for falsification (bug hunting)
• Based on propositional satisfiability (SAT) solvers
Open points
Why is BMC effective?
• Because the search is bounded, and/or...
• ...because it uses SAT solvers instead of BDDs?
What is the impact of BMC on industrial-size verification test-cases?
• Traditional measures: performance and capacity
• A new perspective: productivity
Our contribution
Apples-to-apples comparison
• Expert tuning on both the BDD and SAT sides
    optimal setting for SAT by tuning search heuristics
• BDD-based BMC vs. SAT-based BMC
    using SAT (rather than bounding) is a win
A new perspective of BMC on industrial test-cases
• BMC performance and capacity
    SAT capacity reaches far beyond BDDs
• SAT-based BMC productivity
    greater capacity + optimal setting = productivity boost
Agenda
BMC techniques
• Implementing BDD-based BMC
• SAT-based BMC: algorithm, solver and strategies
Evaluating BMC at an industrial setting
• BMC tools: Forecast (BDDs) and Thunder (SAT)
• Measuring performance and capacity
    In search of an optimal setting for Thunder and Forecast
    Thunder vs. Forecast
    Thunder capacity boost
• Measuring productivity
Witnessed benefits of BMC
[Figure: BFS traversal from the initial states to the buggy states, with the counterexample trace]
From BDD-based MC to BMC
Adapting state-of-the-art BDD techniques to BMC
Bounded prioritized traversal
• When the BDD size reaches a certain threshold...
• ... split the frontier into balanced partitions, and...
• ... prioritize the partitions according to some criterion
• Ensure bound is not exceeded
Bounded lazy traversal
• Works backwards
• Application of bounded cone of influence
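The effect of the bound on a forward traversal can be sketched with explicit sets of states standing in for BDDs. This is only an illustration of the "ensure bound is not exceeded" step, not of the prioritized or lazy strategies; the function name `bounded_bfs` and the toy counter model are assumptions made for the example.

```python
def bounded_bfs(init, successors, is_buggy, bound):
    """Breadth-first traversal that gives up after `bound` steps.

    Returns a counterexample trace (list of states) or None.
    """
    parent = {s: None for s in init}       # also serves as the visited set
    frontier = list(init)
    for depth in range(bound + 1):
        for s in frontier:
            if is_buggy(s):
                trace = []
                while s is not None:       # walk back to an initial state
                    trace.append(s)
                    s = parent[s]
                return trace[::-1]
        if depth == bound:
            break                          # the bound must not be exceeded
        next_frontier = []
        for s in frontier:
            for t in successors(s):
                if t not in parent:
                    parent[t] = s
                    next_frontier.append(t)
        frontier = next_frontier
    return None


# Hypothetical toy model: a counter modulo 8 starting at 0; state 5 is buggy.
succ = lambda s: [(s + 1) % 8]
trace_found = bounded_bfs([0], succ, lambda s: s == 5, bound=5)
trace_missed = bounded_bfs([0], succ, lambda s: s == 5, bound=4)
```

With bound 5 the traversal reaches the buggy state and returns the trace; with bound 4 it stops one step short and reports nothing, which is why a bounded run can only falsify.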
SAT-based BMC
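Bound (k=4)
Given initial states $I(s)$, buggy states $B(s)$ and transition relation $T(s, s')$, the SAT solver is run on
$$I(s_0) \wedge \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1}) \wedge \bigvee_{i=0}^{k} B(s_i)$$
Sat
Unsat: increase k?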
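A minimal sketch of the unrolling, assuming a toy 2-bit counter as the model: the check looks for states s_0..s_k such that I(s_0) holds, T(s_i, s_{i+1}) holds for every i < k, and B(s_i) holds for some i ≤ k. Exhaustive enumeration stands in for the SAT solver here, and all names (`bmc`, `shortest_counterexample`, the counter model) are hypothetical.

```python
from itertools import product


def bmc(I, T, B, n_bits, k):
    """Look for s_0..s_k with I(s_0), T(s_i, s_{i+1}) for all i < k, and some B(s_i).

    Exhaustive enumeration plays the role of the SAT solver.
    """
    states = list(product((0, 1), repeat=n_bits))
    for path in product(states, repeat=k + 1):
        if (I(path[0])
                and all(T(path[i], path[i + 1]) for i in range(k))
                and any(B(s) for s in path)):
            return path          # satisfying valuation = counterexample trace
    return None                  # unsatisfiable at this bound


def value(s):
    """Decode a 2-bit state into an integer."""
    return 2 * s[0] + s[1]


# Toy model: a 2-bit counter that starts at 0; reaching 3 is the "bug".
I = lambda s: value(s) == 0
T = lambda s, t: value(t) == (value(s) + 1) % 4
B = lambda s: value(s) == 3


def shortest_counterexample(max_k):
    """Increase k until the formula becomes satisfiable or max_k is reached."""
    for k in range(max_k + 1):
        path = bmc(I, T, B, n_bits=2, k=k)
        if path is not None:
            return k, path
    return None
```

In this toy model the formula is unsatisfiable for k = 0, 1, 2 and becomes satisfiable at k = 3, mirroring the "Unsat: increase k?" loop.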
SAT solvers
Input: a propositional formula F(x1, ..., xn)
Output: a valuation v = v1, ..., vn with vi ∈ {0,1} s.t. F(v1, ..., vn) = 1
A program that can answer the question “does there exist v s.t. F(v) = 1?” is a SAT solver
Focus on solving SAT
• By exploring the space of possible assignments
• Using a sound and complete method
    Stålmarck’s (patented)
    Davis-Logemann-Loveland (DLL)
DLL method
s = {F, v} is an object
next ∈ { SAT, UNSAT, LA, LB, HR } is a variable
DLL-SOLVE(s)
  next ← LA
  repeat
    case next of
      LA : next ← LOOK-AHEAD(s)    (returns HR, LB or SAT)
      LB : next ← LOOK-BACK(s)     (returns LA or UNSAT)
      HR : next ← HEURISTIC(s)     (returns LA or SAT)
  until next ∈ { SAT, UNSAT }
  return next
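The DLL-SOLVE loop can be sketched in runnable form as follows. This is a simplified illustration, not SIMO: look-ahead is plain unit propagation, look-back is chronological backtracking (no backjumping or learning), and the heuristic picks the first unassigned literal. The `State` class, the trail encoding and the clause format (lists of non-zero integers, negative for negated variables) are assumptions made for the example.

```python
SAT, UNSAT, LA, LB, HR = "SAT", "UNSAT", "LA", "LB", "HR"


class State:
    def __init__(self, clauses):
        self.clauses = clauses   # F: list of clauses, literals are non-zero ints
        self.value = {}          # v: partial valuation, variable -> bool
        self.trail = []          # (var, None) if implied, (var, tried_both) if decided


def lit_value(s, lit):
    val = s.value.get(abs(lit))
    return None if val is None else val == (lit > 0)


def look_ahead(s):
    """Unit propagation until fixpoint; reports SAT, a conflict (LB) or HR."""
    changed = True
    while changed:
        changed = False
        for clause in s.clauses:
            vals = [lit_value(s, l) for l in clause]
            if True in vals:
                continue                         # clause already satisfied
            free = [l for l, val in zip(clause, vals) if val is None]
            if not free:
                return LB                        # every literal false: conflict
            if len(free) == 1:                   # unit clause: forced assignment
                s.value[abs(free[0])] = free[0] > 0
                s.trail.append((abs(free[0]), None))
                changed = True
    if all(any(lit_value(s, l) for l in c) for c in s.clauses):
        return SAT
    return HR


def look_back(s):
    """Chronological backtracking: flip the latest decision not yet tried both ways."""
    while s.trail:
        var, tried_both = s.trail.pop()
        old = s.value.pop(var)
        if tried_both is False:
            s.value[var] = not old
            s.trail.append((var, True))
            return LA
    return UNSAT


def heuristic(s):
    """Branch on the first unassigned literal found."""
    for clause in s.clauses:
        for lit in clause:
            if abs(lit) not in s.value:
                s.value[abs(lit)] = lit > 0
                s.trail.append((abs(lit), False))
                return LA
    return SAT


def dll_solve(s):
    step = {LA: look_ahead, LB: look_back, HR: heuristic}
    nxt = LA
    while nxt not in (SAT, UNSAT):
        nxt = step[nxt](s)
    return nxt
```

For example, `dll_solve(State([[1, 2], [-1, 2], [-2, 3]]))` decides variable 1, propagates 2 and 3, and stops with SAT, while a formula containing both `[1]` and `[-1]` is refuted by look-ahead and look-back alone.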
SIMO: a DLL-based SAT solver
Boolean Constraint Propagation (BCP) is the only Look-Ahead strategy
Non-chronological Look-Back
• Backjumping (BJ): escapes trivially unsatisfiable subtrees
• Learning: dynamically adds constraints to the formula
Search heuristics
• Static: branching order is supplied by the user
• Dynamic
    Greedy heuristics: simplify as many clauses as possible
    BCP-based: explore most constrained choices first
• Independent (relevant) vs. dependent variables
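As an illustration of a greedy dynamic heuristic, the sketch below scores variables by how often they occur in the shortest clauses that are still open, which is the usual reading of MOMS ("maximum occurrences in clauses of minimum size"). Whether SIMO's Moms works exactly this way is an assumption, and so is the optional `candidates` argument, which is meant to suggest restricting the choice to independent (relevant) variables.

```python
from collections import Counter


def moms_choice(clauses, assigned, candidates=None):
    """Pick the eligible unassigned variable that occurs most often in the
    shortest open clauses; ties go to the smallest variable index.

    clauses:    list of clauses, literals are non-zero ints
    assigned:   dict variable -> bool (current partial valuation)
    candidates: optional set of variables the choice is restricted to
    """
    def eligible(lit):
        return candidates is None or abs(lit) in candidates

    open_clauses = []
    for clause in clauses:
        if any(assigned.get(abs(l)) == (l > 0) for l in clause):
            continue                                  # already satisfied
        free = [l for l in clause if abs(l) not in assigned]
        if any(eligible(l) for l in free):
            open_clauses.append(free)
    if not open_clauses:
        return None
    min_size = min(len(c) for c in open_clauses)
    counts = Counter(abs(l)
                     for c in open_clauses if len(c) == min_size
                     for l in c if eligible(l))
    return max(sorted(counts), key=counts.__getitem__)


example = [[1, 2, 3], [2, 3], [-2, 4], [3, 4, 5]]
choice_all = moms_choice(example, {})
choice_relevant = moms_choice(example, {}, candidates={3, 4})
choice_after_2 = moms_choice(example, {2: True})
```

In the example, variable 2 appears in both binary clauses and is chosen first; restricting the candidates to {3, 4} changes the choice, and once 2 is set to true the unit clause on 4 dominates.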
SIMO’s search heuristics
[Table: the heuristics Moms, Morel, Unirel, Unirel2 and Unit, each marked All or Relevant (variables) under the columns Scoring, Selection and Propagation. Moms: All, All. Morel: Relevant, Relevant.]
Forecast: BDD-based (B)MC
[Figure: Forecast architecture. Inputs: Property (ForSpec), Model (HDL) and Directives. Stages: Spec Synthesis, RTL synthesis, Model Checking Algorithms, Interface to BDD engines (CAL, CUDD, Intel’s BDD, …). Output: Proof/Counterexample.]
Thunder: SAT-based BMC
[Figure: Thunder architecture. Inputs: Property (ForSpec), Model (HDL) and Directives. Stages: Spec Synthesis, RTL synthesis, Formula generation ++, Interface to SAT engines (GRASP, SIMO, SATO, Prover). Output: Proof/Counterexample.]
Performance and capacity
Performance (what resources?)
• CPU time
• Memory consumption
Capacity (what model size?)
• BDD technology tops out at 400 state variables (typically)
• SAT technology has subtle limitations depending on:
    The kind of property being checked
    The length of the counterexample
Measuring performance
Benchmarks to measure performance are
• Focusing on safety properties
• Challenging for BDD-based model checking
• In the capacity range of BDD-based model checking
In more detail
• A total of 17 circuits coming from Intel’s internal selection, with known minimal counterexample length k
• Using 2 formulas per circuit with the Thunder/SIMO flow
    A satisfiable instance (falsification) at bound k, and
    An unsatisfiable instance (verification) at bound k-1
An optimal setting for Thunder
With BJ + learning enabled, we tried different heuristics
• Moms (M) and Morel (MR)
• Unit (U), Unirel (UR) and Unirel2 (UR2)
SIMO admits a single optimal setting (UR2)
• Faster on the instances solved by all the heuristics (16)
• Solves all instances in less than 20 minutes of CPU time
Unirel2 is the default setting with the Thunder/SIMO flow
[Chart: CPU time (s), 0 to 1200, over instances (total 26), for M, MR, U, UR and UR2]
Bounded traversal in Forecast
[Chart: CPU time (s), 0 to 8000, over instances (total 13), for ABL, ABP, AUP, SBL, SBP and SUP]
With automatically derived initial order
• Bounded lazy (ABL)
• Bounded prioritized (ABP)
• Unbounded prioritized (AUP)
Bounding does not yield consistent improvements!
With semi-automatically derived initial order
• Bounded settings (SBL, SBP)
• Unbounded prioritized (SUP)
Bounding does not yield consistent improvements!
An optimal setting for Forecast?
[Chart: CPU time (s), 0 to 7000, over instances (total 17), for AUP and ST]
Default setting is AUP
• Best approximates the notion of default setting in Thunder
• AUP is the best among the A settings (ABL, ABP, AUP)
Tuned setting (ST)
• Semi-automatic initial order
• Specific combinations of:
    Unbounded traversal
    Prioritized traversal
    Lazy strategy
    Partitioning the transition relation
No single optimal tuned setting for Forecast
Thunder vs. Forecast
[Chart: CPU time (s), 0 to 7000, over instances (total 17), for AUP, UR2 and ST]
Forecast default AUP is worse than Thunder UR2
Forecast tuned ST compares well with Thunder UR2
Forecast ST time does not include:
• Getting pruning directives
• Finding a good initial order
• Getting the best setting
Measuring capacity
The capacity benchmark is derived from the performance benchmark
• Getting rid of the pruning directives supplied by the experienced users
• Enlarging the size of the model beyond the scope of BDD-based MC
Unpruned models for this analysis…
• …have thousands of sequential elements (up to 10k)
• …are beyond the capacity of Forecast
Thunder capacity boost
Circuit         Latches+Inputs   Latches+Inputs    Variables in   Thunder
                                 (after pruning)   SAT formula    CPU time
Circuit 1(5)    12011            152               6831           6.10
Circuit 1(4)    12011            152               5403           5.10
Circuit 2(7)    7054             661               24487          96.10
Circuit 2(6)    7054             661               20552          16.37
Circuit 3(11)   6586             1129              119248         78.61
Circuit 3(10)   6586             1129              107838         68.20
Circuit 4       9704             1069              21351          29.39
Circuit 5       17262            5542                             TIMEOUT
Circuit 6       6832             2936              121786         576.24
Circuit 7       3321             532               35752          73.32
Circuit 8       1457             1012              50758          267.91
Measuring productivity
Productivity decreases with user intervention
• Need to reduce the model size
• Need to find a good order on state variables
• Need to find a good tool setting
No user intervention means no productivity penalty
• Using Thunder/SIMO BMC flow:
    Dynamic search heuristic: no need for an initial order
    Single optimal setting: Unirel2 (with BJ and learning)
    Extended capacity: no manual pruning
• Comparison with Forecast BMC flow indicates that SAT (rather than bounding) is the key for better productivity
Witnessed benefits of BMC
A single optimal setting found for Thunder using SIMO: Unirel2 with backjumping and learning
SAT (rather than bounding) turns out to be the key benefit when using BMC technology
A complete evaluation
• Performance of tuned BDDs parallels SAT
• Impressive capacity of SAT vs. BDDs
• SAT wins from the productivity standpoint
Useful links
The version of the paper with the correct numbers in the capacity benchmarks:
www.cs.rice.edu/~vardi
www.cs.rice.edu/~tac
More information about SIMO:
www.cs.rice.edu/CS/Verification
www.mrg.dist.unige.it/star