Carnegie Mellon University
SAT-Based Decision Procedures for Linear Arithmetic and Uninterpreted Functions
http://www.cs.cmu.edu/~bryant
Randal E. Bryant
– 2 –
Decision Procedures in Formal Verification
[Figure: RTL / source code + specification → abstraction → formal model + specification → verification → OK / error]
Verification uses a decision procedure for a decidable fragment of first-order logic
Applications: out-of-order, pipelined microprocessors; cache coherence protocols; device drivers; compiler validation; …
– 3 –
SAT-based Decision Procedures
Eager encoding
  Input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable
Lazy encoding
  Input formula → approximate Boolean encoder → Boolean formula → SAT solver
  SAT solver reports unsatisfiable, or passes a satisfying assignment to the first-order conjunctions SAT checker
  Checker reports satisfiable, or reports unsatisfiable and returns an additional clause to the SAT solver
– 4 –
Lazy Encoding Characteristics
+ Can be extended to handle a wide variety of theories
+ Clean and modular design
– Does not scale well
  Number of calls to the conjunction checker is typically exponential in formula size
  Each call is independent: nothing learned in one call can be exploited by another
[Figure: first-order conjunctions SAT checker built from per-theory solvers (uninterpreted functions, linear arithmetic, bit vectors, …, theory N) joined by a theory combiner]
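The lazy loop can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the architecture of any particular tool: the Boolean "SAT solver" is a brute-force enumerator, the only theory is equality between variables (checked with union-find), the "additional clause" simply blocks the rejected assignment, and all names (`theory_consistent`, `lazy_sat`) are hypothetical.

```python
from itertools import product

def theory_consistent(assignment, atoms):
    """Conjunction checker for equality logic, using union-find.

    atoms[i] is a pair (x, y) standing for the equation x = y;
    assignment[i] says whether that equation is asserted true or false.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Merge the classes of every equation asserted true.
    for (x, y), value in zip(atoms, assignment):
        if value:
            parent[find(x)] = find(y)
    # Every equation asserted false must relate two different classes.
    return all(find(x) != find(y)
               for (x, y), value in zip(atoms, assignment) if not value)

def lazy_sat(bool_formula, atoms):
    """Return (satisfiable, number of calls to the conjunction checker)."""
    blocked = set()  # assignments ruled out by the theory ("additional clauses")
    calls = 0
    while True:
        # Stand-in for the SAT solver: first unblocked Boolean model, if any.
        candidate = next((a for a in product([False, True], repeat=len(atoms))
                          if a not in blocked and bool_formula(*a)), None)
        if candidate is None:
            return False, calls
        calls += 1
        if theory_consistent(candidate, atoms):
            return True, calls
        blocked.add(candidate)

# x = y and y = z and not (x = z): Boolean-satisfiable, theory-unsatisfiable.
atoms = [("x", "y"), ("y", "z"), ("x", "z")]
result = lazy_sat(lambda exy, eyz, exz: exy and eyz and not exz, atoms)
```

Each rejected assignment costs one checker call and removes only that single assignment, which illustrates why the number of calls can grow quickly with formula size.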
– 5 –
Eager Encoding Characteristics
– Must encode all information about domain properties into the Boolean formula
– Some properties can give exponential blowup
+ Lets the SAT solver do all of the work
Good Approach for Some Domains
  Modern SAT solvers have remarkable capacity
  Good at extracting relevant portions out of very large formulas
  Learn about formula properties as search proceeds
Focus of this talk
[Figure: input formula → satisfiability-preserving Boolean encoder → Boolean formula → SAT solver → satisfiable / unsatisfiable]
– 6 –
Data and Function Abstraction
Bit-vectors to (unbounded) integers
  Bits x0, x1, x2, …, xn-1 become a single integer x
Functional units to uninterpreted functions
  An ALU becomes f, constrained only by functional consistency: a = x ∧ b = y ⇒ f(a,b) = f(x,y)
Common Operations
  If-then-else: ITE(p, x, y) yields x when p is 1 and y when p is 0
  Test for equality: x = y
– 7 –
Abstract Modeling of Microprocessor
For any block that transforms or evaluates data:
  Replace with generic, unspecified function
  Also view instruction memory as function
[Figure: pipelined datapath (PC, +4, instruction memory, IF/ID, register file, ID/EX, ALU, EX/WB, control, register-ID comparators; fields Rd, Ra, Rb, Imm, Op, Adat) with data-transforming blocks replaced by functions F1, F2, F3]
– 8 –
EUF: Equality with Uninterpreted Functions
Decidable fragment of first-order logic
Formulas (F): Boolean expressions
  ¬F, F1 ∧ F2, F1 ∨ F2    Boolean connectives
  T1 = T2                  Equation
  P(T1, …, Tk)             Predicate application
Terms (T): Integer expressions
  ITE(F, T1, T2)           If-then-else
  Fun(T1, …, Tk)           Function application
Functions (Fun): Integer → Integer
  f                        Uninterpreted function symbol
  Read, Write              Memory operations
Predicates (P): Integer → Boolean
  p                        Uninterpreted predicate symbol
– 9 –
EUF Decision Problem
Circuit Representation of Formula
  Truth values: dashed lines; model control; logical connectives, equations
  Integer values: solid lines; model data; uninterpreted functions, if-then-else operation
Task
  Determine whether formula F is universally valid
    True for all interpretations of variables and function symbols
  Often expressed as an (un)satisfiability problem
    » Prove that formula ¬F is not satisfiable
[Figure: circuit over inputs e1, e0, x0, d0 built from ITE multiplexers, applications of f, and equality comparators]
– 10 –
Finite Model Property for EUF
Observation
  Any formula has a limited number of distinct expressions
  Only property that matters is whether or not different terms are equal
[Figure: circuit over inputs e1, e0, x0, d0; its distinct terms are x0, d0, f(x0), f(d0)]
– 11 –
Boolean Encoding of Integer Values
For Each Expression
  Either equal to or distinct from each preceding expression
Boolean Encoding
  Use Boolean values to encode integers over a small range
  EUF formula can be translated into propositional logic
    Logic circuit with multiplexors, comparators, logic gates
    Tautology iff original formula valid

Expression   Possible Values   Bit Encoding
x0           {0}               0    0
d0           {0,1}             0    b10
f(x0)        {0,1,2}           b21  b20
f(d0)        {0,1,2,3}         b31  b30
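The value ranges in the table can be exercised directly. The sketch below is an illustration under assumptions, not the encoder from the talk: it treats every term as a plain integer variable (function applications already eliminated), lets the i-th term range over {0, …, i}, and checks validity by enumeration instead of building a Boolean circuit. The names `small_domain_valid` and `encoding_bits` are hypothetical.

```python
from itertools import product

def small_domain_valid(formula, num_terms):
    """Check an equality-logic formula over the small domain.

    Term i (0-based) ranges over {0, ..., i}: it is either equal to one of
    the preceding terms or distinct from all of them.
    """
    ranges = [range(i + 1) for i in range(num_terms)]
    return all(formula(*values) for values in product(*ranges))

def encoding_bits(num_terms):
    """Number of Boolean variables needed to encode all terms in binary."""
    return sum(i.bit_length() for i in range(num_terms))

# Transitivity of equality is valid; "a = b" alone is not.
transitivity = lambda a, b, c: not (a == b and b == c) or a == c
is_valid = small_domain_valid(transitivity, 3)
not_valid = small_domain_valid(lambda a, b: a == b, 2)
```

For the four terms in the table, `encoding_bits(4)` counts the five Boolean variables b10, b21, b20, b31, b30.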
– 12 –
Some History of EUF Decision Procedures
Ackermann, 1954
  Quantifier-free decision problem can be decided based on finite instantiations
Burch & Dill, CAV ’94
  Automatic decision procedure
    » Davis-Putnam enumeration
    » Congruence closure to enforce functional consistency
Boolean approaches
  Goel, et al., CAV ’98
    » Attempted with BDDs, but didn’t get good results
  Bryant, German, Velev, CAV ’99
    » Could verify microprocessor using BDDs
  Velev & Bryant, DAC 2001
    » Demonstrated power of modern SAT procedures
– 13 –
Exploiting Positive Equality
Bryant, German, Velev, CAV ’99
  First successful use of Boolean methods for EUF
Positive Equality
  Equations that appear in unnegated form
Exploiting
  Can greatly reduce number of cases required to show validity
    Only need to consider maximally diverse interpretations
  Reduce number of Boolean variables in bit-level encoding
– 14 –
Diverse Interpretations: Illustration
Task
  Verify someone’s obscure code for 4×4 array transpose

    void trans(int a[4][4])
    {
        int t;
        for (t = 4; t < 15; t++)
            if (~t&2 || t&8 && ~t&1) {
                int r = t&0x3;
                int c = t>>2;
                int val = a[r][c];
                a[r][c] = a[c][r];
                a[c][r] = val;
            }
    }

Observation
  Array elements altered only by copying one to another
    Only operations on array elements
  Just need to make sure right set of copies performed
– 15 –
Verifying Array Code
Test for trans4

  a (input)         a’ (expected output)
   0  1  2  3        0  4  8 12
   4  5  6  7        1  5  9 13
   8  9 10 11        2  6 10 14
  12 13 14 15        3  7 11 15

Single Test Adequate
  Unique value for each possible source element
    “Maximally diverse”
  If a’[r][c] = a[c][r], then must have copied proper value
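The single-test argument can be tried out with a direct Python port of the C routine. This is a sketch for illustration: the port assumes the usual C precedence (`&&` binds tighter than `||`), and the helper name `passes_diverse_test` is hypothetical.

```python
def trans(a):
    """In-place port of the slide's C routine; a is a 4x4 list of lists."""
    for t in range(4, 15):
        # Same condition as the C code: ~t&2 || (t&8 && ~t&1)
        if (~t & 2) or ((t & 8) and (~t & 1)):
            r = t & 0x3
            c = t >> 2
            a[r][c], a[c][r] = a[c][r], a[r][c]

def passes_diverse_test():
    """Run the single maximally diverse test: every element has a unique value."""
    a = [[4 * r + c for c in range(4)] for r in range(4)]
    trans(a)
    # After a correct transpose, position (r, c) holds the value that
    # started at (c, r), which was 4*c + r.
    return all(a[r][c] == 4 * c + r for r in range(4) for c in range(4))

ok = passes_diverse_test()
```

Because every source element carries a distinct value, a wrong or missing copy cannot be masked by two elements happening to be equal.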
– 16 –
Characteristics of Array Verification
Correctness Condition
  a’[0][0] = a[0][0] ∧ a’[0][1] = a[1][0] ∧ a’[0][2] = a[2][0] ∧ …
  … ∧ a’[3][2] = a[2][3] ∧ a’[3][3] = a[3][3]
Properties
  All equations are in positive form
  Worst-case test is one that tends to make things unequal
    Maximally diverse interpretation: use as many different values as possible
  All maximally diverse interpretations are isomorphic
    Only need to try one to prove all handled correctly
– 17 –
Equations in Processor Verification

Data Types            Equations
Register IDs          Control stalling & forwarding
Instruction address   Only top-level verification condition
Program data          Only top-level verification condition

[Figure: pipelined datapath (PC, +4, instruction memory, IF/ID, register file, ID/EX, ALU, EX/WB, control, register-ID comparators; fields Rd, Ra, Rb, Imm, Op, Adat)]
– 18 –
Exploiting Equation Structure
Positive Equations
  In top-level verification condition
  Can use maximally diverse interpretation
Negative Equations
  Pipeline control logic
    Between register IDs
    Operation depends on whether or not two IDs are equal
  Must use general encoding
    Encode with Boolean variables
    All possibilities of IDs that match and/or don’t match
– 19 –
Application of Positive Equality
Observation
  All equations are positive in this formula
  Can consider single, diverse interpretation for terms
    x0 = 5, d0 = 6, f(x0) = 7, f(d0) = 8
[Figure: circuit over inputs e1, e0, x0, d0 with wires annotated by the values 5, 6, 7, 8 taken under this interpretation]
– 20 –
Function Elimination: Ackermann’s Method
Replace All Function Applications by Integer Variables
  Introduce new domain variable
  Enforce functional consistency by global constraints
  Unclear how to restrict evaluation to diverse interpretations
[Figure: applications f(x1) and f(x2) replaced by variables vf1 and vf2; comparators on x1 = x2 and vf1 = vf2 feed a consistency constraint combined with formula F]
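As a rough illustration of Ackermann's method for a single unary function f, the sketch below introduces one fresh variable per application and emits one functional-consistency constraint per pair of applications. The string format of the constraints, the function name `ackermann_eliminate`, and the assumption that argument names are distinct are all choices made for this example.

```python
from itertools import combinations

def ackermann_eliminate(arguments):
    """Eliminate applications f(arg) for each arg in `arguments`.

    Returns a mapping from argument name to fresh variable, plus the
    global constraints that enforce functional consistency.
    """
    fresh = {arg: f"vf{i}" for i, arg in enumerate(arguments, start=1)}
    constraints = [f"({a} = {b}) => ({fresh[a]} = {fresh[b]})"
                   for a, b in combinations(arguments, 2)]
    return fresh, constraints

fresh, constraints = ackermann_eliminate(["x1", "x2"])
```

The number of constraints grows quadratically with the number of applications, since every pair needs its own implication.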
– 21 –
Function Elimination: ITE Method
General Technique
  Introduce new domain variable
  Nested ITE structure maintains functional consistency
[Figure: f(x1) → vf1; f(x2) → ITE(x2 = x1, vf1, vf2); f(x3) → ITE(x3 = x1, vf1, ITE(x3 = x2, vf2, vf3))]
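The nested ITE structure can be generated mechanically. The following sketch handles a single unary function and builds terms as strings; the name `ite_eliminate` and its optional `results` parameter are assumptions made for illustration. Supplying fixed constants for `results` instead of fresh variables corresponds to using fixed values rather than variables.

```python
def ite_eliminate(arguments, results=None):
    """Return one replacement term per application f(arg), in order.

    The i-th application reuses the result of the first earlier application
    whose argument is equal, and otherwise yields its own result.
    """
    if results is None:
        results = [f"vf{i + 1}" for i in range(len(arguments))]
    replacements = []
    for i, arg in enumerate(arguments):
        term = results[i]
        # Wrap from the innermost comparison (latest earlier application)
        # outwards, so that the earliest application is tested first.
        for j in range(i - 1, -1, -1):
            term = f"ITE({arg} = {arguments[j]}, {results[j]}, {term})"
        replacements.append(term)
    return replacements

terms = ite_eliminate(["x1", "x2", "x3"])
diverse = ite_eliminate(["x1", "x2", "x3"], results=["5", "6", "7"])
```

Unlike Ackermann's method, no side constraints are needed: consistency is built into the replacement terms themselves.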
– 22 –
Generating Diverse Encoding
Replacing Application
  Use fixed values rather than variables
  Application results equal iff arguments equal
[Figure: nested ITE structure for f(x1), f(x2), f(x3) with the constants 5, 6, 7 in place of the variables vf1, vf2, vf3]
– 23 –
Benefits of Positive Equality
Microprocessor Benchmarks
  1xDLX: Single-issue RISC processor
  2xDLX-EX-BP: Dual-issue processor with exception handling & branch prediction
  9VLIW-BP: 9-way VLIW processor with branch prediction
Measurements
  Using BerkMin SAT solver

Benchmark              Using Pos. Eq.   No Pos. Eq.
1xDLX         buggy    0.02             2
              good     0.07             229
2xDLX-EX-BP   buggy    4                15
              good     15               > 24 hrs
9VLIW-BP      buggy    10               > 24 hrs
              good     224              > 24 hrs

Velev & Bryant, JSC ’02
– 24 –
Revisiting Encoding Techniques
Small Domain (SD)
  Use bit-level encodings of bounded integers
  Implicitly encode properties of equality logic
Per-Constraint Encoding (EIJ)
  Introduce explicit Boolean variable for each equation
  Additional transitivity constraints to express properties of equality logic
Example: is x = y ∧ y = z ∧ z ≠ x satisfiable?
  Small domain:    x1x0 = y1y0 ∧ y1y0 = z1z0 ∧ z1z0 ≠ x1x0
  Per-constraint:  exy ∧ eyz ∧ ¬exz
  Transitivity constraints:
    eyz ∧ ezx ⇒ exy
    exy ∧ eyz ⇒ exz
    exy ∧ exz ⇒ eyz
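The three-variable example is small enough to check by brute force. The sketch below is only an illustration: it enumerates all eight assignments to exy, eyz, exz instead of calling a SAT solver, and the function names are hypothetical.

```python
from itertools import product

def f_sat(exy, eyz, exz):
    """Per-constraint encoding of x = y, y = z, z != x."""
    return exy and eyz and not exz

def f_trans(exy, eyz, exz):
    """The three transitivity constraints over the triangle x, y, z."""
    implies = lambda p, q: (not p) or q
    return (implies(eyz and exz, exy)
            and implies(exy and eyz, exz)
            and implies(exy and exz, eyz))

def satisfiable(*constraints):
    """Brute-force stand-in for a SAT solver over three Boolean variables."""
    return any(all(c(*bits) for c in constraints)
               for bits in product([False, True], repeat=3))

without_trans = satisfiable(f_sat)        # Boolean abstraction alone
with_trans = satisfiable(f_sat, f_trans)  # abstraction plus transitivity
```

Without the transitivity constraints the abstraction is satisfiable, so the constraints are what carries the meaning of equality into the Boolean formula.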
– 25 –
Per-Constraint Encoding
Introduced by Goel et al., CAV ’98
Exploiting sparse structure by Bryant & Velev, CAV 2000
Procedure
  Initial formula F
    Want to prove valid
    Prove that ¬F is not satisfiable
  Replace each equation x = y by Boolean variable exy
    Gives formula Fsat
  Generate formula expressing transitivity constraints
    Gives formula Ftrans
  Use SAT solver to show that Fsat ∧ Ftrans is not satisfiable
Motivation
  Provides SAT solver with more direct representation of underlying problem
– 26 –
Graph Interpretation of Transitivity
Transitivity Violation
  Cycle in graph
  Exactly one edge has ei,j = false
[Figure: graph whose edges are equations; a cycle in which all edges but one are true]
– 27 –
Exploiting Chords
Chord
  Edge connecting two non-adjacent vertices in cycle
Property
  Sufficient to enforce transitivity constraints for all chord-free cycles
  If transitivity holds for all chord-free cycles, then it holds for arbitrary cycles
– 28 –
Enumerating Chord-Free Cycles
Strategy
  Enumerate chord-free cycles in graph
  Each cycle of length k yields k transitivity constraints
Problem
  Potentially exponential number of chord-free cycles
[Figure: chain of k sections numbered 1, 2, …, k; 2^k + k chord-free cycles]
– 29 –
Adding Chords
Strategy
  Add edges to graph to reduce number of chord-free cycles
[Figure: chain of k sections before adding chords (2^k + k chord-free cycles) and after adding chords (2k + 1 chord-free cycles)]
Trade-Off
  Reduces formula size
  Increases number of relational variables
– 30 –
Chordal Graph
Definition
  Every cycle of length > 3 has a chord
Goal
  Add minimum number of edges to make graph chordal
Relation to Sparse Gaussian Elimination
  Choose pivot ordering that minimizes fill-in
  NP-hard
  Simple heuristics effective
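One simple heuristic of this kind is greedy vertex elimination in minimum-degree order, as in sparse Gaussian elimination. The sketch below is an assumption-laden illustration rather than the talk's implementation: it eliminates the lowest-degree remaining vertex, connects its remaining neighbors pairwise (the fill-in edges are the added chords), and records every triangle formed, each of which yields three transitivity constraints. All names are hypothetical.

```python
from itertools import combinations

def triangulate(vertices, edges):
    """Greedy minimum-degree elimination. Returns (chords, triangles)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    chords, triangles = [], []
    remaining = set(vertices)
    while remaining:
        # Pick the remaining vertex with the fewest remaining neighbors
        # (ties broken by vertex order for determinism).
        v = min(sorted(remaining), key=lambda u: len(adj[u] & remaining))
        neighbors = sorted(adj[v] & remaining)
        for a, b in combinations(neighbors, 2):
            if b not in adj[a]:          # fill-in edge = added chord
                adj[a].add(b)
                adj[b].add(a)
                chords.append((a, b))
            triangles.append((v, a, b))
        remaining.remove(v)
    return chords, triangles

def edge_var(a, b):
    """Name of the relational variable for the edge between a and b."""
    lo, hi = sorted((a, b))
    return f"e{lo},{hi}"

def transitivity_clauses(triangles):
    """Three implications per triangle: any two edges imply the third."""
    clauses = []
    for tri in triangles:
        for i in range(3):
            x, y, z = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
            clauses.append(f"{edge_var(x, y)} & {edge_var(y, z)} => {edge_var(x, z)}")
    return clauses

# A 4-cycle needs one chord, giving two triangles and six clauses.
chords, triangles = triangulate([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
clauses = transitivity_clauses(triangles)
```

Each added chord introduces one more relational variable, which is the trade-off against the reduced number of cycles.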
– 31 –
1xDLX-C Equation Structure
Vertices
  For each vi
  13 different register identifiers
Edges
  For each equation
  Control stalling and forwarding logic
  27 relational variables
    Out of 78 possible
– 32 –
Adding Chordal Edges to 1xDLX-C
Original
  27 relational variables
  286 cycles
  858 clauses
Augmented
  33 relational variables
  40 cycles
  120 clauses
– 33 –
2xDLX-CCt Equation Structure
Equations
  Between 25 different register identifiers
  143 relational variables
    Out of 300 possible
– 34 –
Adding Chordal Edges to 2xDLX-CCt
Original
  143 relational variables
  2,136 cycles
  8,364 clauses
Augmented
  193 relational variables
  858 cycles
  2,574 clauses
– 35 –
Choosing Encoding Method
Comparison
  Formula length n with m integer variables & function applications
  Worst-case complexity

                      Small Domain         Per-Constraint
  Boolean variables   O(m log m)           O(m^2)
  Formula size        O(n + m^2 log m)     O(n + m^3)

Per-Constraint Encoding Works Well in Practice
  Generates slightly larger formulas than small domain
  Better performance by SAT solver
– 36 –
Encoding Comparison
Benchmarks
  Superscalar, out-of-order datapath
  2–6 instructions issued in parallel
Measurements
  Using BerkMin SAT solver

Issue    Per-Constraint                Small Domain
Width    Vars     Clauses    Time      Vars   Clauses   Time
2        139      8,213      1.6       81     1,294     1.7
3        308      33,270     15        127    3,780     19
4        553      96,480     65        194    8,362     99
5        857      240,892    154       249    15,647    255
6        1,243    528,962    1,957     304    26,738    3,206

Velev & Bryant, JSC ’02
– 37 –
Extensions
Difference logic
  Predicates of form x ≤ y + C
  Original logic of UCLID
  Use integer variables to represent pointers into buffers
    C = 1
Linear constraints
  Predicates of form a1x1 + a2x2 + … + anxn ≤ b
  Used in applying UCLID to software verification and software security problems
– 38 –
Difference Logic
Predicates of form x ≤ y + C
  C generally a small integer
Encoding Methods
  Small domain
    Range bound n · max |C|
  Per-constraint encoding
    Variables of form e^C_{x,y}
    Can have exponential blowup in number of variables
Choosing Encoding Method
  Per-constraint better, as long as it doesn’t blow up
  Predicting blowup
    Successfully used classifier trained by machine learning (Seshia, Lahiri & Bryant, DAC ’03)
– 39 –
Linear Constraints
Predicates of form a1x1 + a2x2 + … + anxn ≤ b
Common Case
  All but k predicates are difference predicates
    ai = +1, aj = –1, rest = 0
  Rest are sparse
    At most w coefficients nonzero
    Coefficient values small

  n      #variables
  w      max #non-zero terms
  k      #non-difference constraints
  bmax   max |constant|
  amax   max |coefficient|
– 40 –
Linear Constraints
Small Domain Encoding (Seshia & Bryant, LICS ’04)
  Find value D such that only need to consider solutions with 0 ≤ xi < D, for all i
  Bound on D:
    (n+2) · n · (bmax+1) · (w · amax)^k
  Encode as SAT problem with log(D) bits / integer variable
  Practical for real applications

  n      #variables
  w      max #non-zero terms
  k      #non-difference constraints
  bmax   max |constant|
  amax   max |coefficient|
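To get a feel for the size of the bound, the sketch below evaluates the expression (n+2) · n · (bmax+1) · (w · amax)^k and the resulting number of bits per integer variable. It simply takes the bound as stated on the slide; the function names and the sample parameter values are made up for illustration.

```python
def domain_bound(n, w, k, bmax, amax):
    """Evaluate the slide's bound on D: (n+2) * n * (bmax+1) * (w*amax)**k."""
    return (n + 2) * n * (bmax + 1) * (w * amax) ** k

def bits_per_variable(d):
    """Bits needed to represent every value in the range 0 <= x < d."""
    return max(d - 1, 0).bit_length()

# Hypothetical problem: 10 variables, 2 non-difference constraints with at
# most 3 nonzero coefficients of magnitude <= 2, constants of magnitude <= 7.
d = domain_bound(n=10, w=3, k=2, bmax=7, amax=2)
bits = bits_per_variable(d)
```

Because the bit count grows only with log(D), the exponential dependence on k adds roughly log2(w · amax) bits per non-difference constraint.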
– 41 –
Some Lessons We’ve Learned
Preserve Boolean Structure
  Other approaches require collapsing to conjunctions of predicates
Exploit Problem Characteristics
  Sparseness
    Tighten bounds and/or reduce number of constraints
  Polarity structure
    Positive equality
Let SAT Solver Do the Work
  Eager encoding: provide sufficient set of constraints to prove / disprove formula
  SAT solvers are good at digesting large volumes of information