Professor Lori A. Clarke

Laboratory for Advanced Software Engineering Research, Department of Computer Science, University of Massachusetts, Amherst

http://laser.cs.umass.edu/

Finite State Verification:

• An Emerging Technology for Validating Software Systems

• A technique and method that lies between testing and verification

• Automated tools that support this approach

Sorry State of Affairs

• Testing consumes about half the cost of software development

• Maintenance consumes about 80% of the full life cycle costs, much of that devoted to testing

• Most companies use ad hoc QA practices

• Unhappy with the results; unhappy with the cost
– Failed projects
– Delayed product releases

Testing

• can:
– Uncover failures
– Show specifications are (not) met for specific test cases
– Be an indication of overall reliability

• cannot:
– Prove that a program will/will not behave in a particular way

Must do better!

• Increasing number of high-assurance applications
– Medical applications
– Flight control software
– Electronic commerce

• Increasing number of complex systems
– Systems of systems
– Distributed systems

Distributed Systems

• Better performance, better flexibility, but there is a cost

• Distributed systems are more difficult to test than sequential systems
– The number of execution paths can grow exponentially with the number of processes
– Testing cannot even demonstrate that a system works on the selected/executed test data
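To make the growth in execution paths concrete, here is a minimal sketch that counts the interleavings of independent sequential processes. It assumes every process runs a fixed number of atomic steps and never synchronizes; the function name `interleavings` is illustrative and does not come from the talk.

```python
from math import factorial

def interleavings(steps_per_process):
    """Number of distinct interleavings of independent sequential processes.

    steps_per_process[i] is the number of atomic steps in process i.
    The count is the multinomial coefficient (sum n_i)! / prod(n_i!).
    """
    total = factorial(sum(steps_per_process))
    for n in steps_per_process:
        total //= factorial(n)  # exact: a multinomial coefficient is an integer
    return total

# Each process runs 3 atomic steps; add processes one at a time.
for k in range(1, 6):
    print(k, "processes:", interleavings([3] * k))
```

Real systems synchronize, so the true number of paths is usually smaller than this bound, but the trend is the same: each added process multiplies the number of schedules a test suite would have to cover.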

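Complexity of Distributed Systems

[Figure: control flow graphs for two tasks, T1 and T2, with nodes numbered 1 to 9, beside the graph of their combined states: (1,6), (2,6), (3,6), (1,7), (2,7), (3,7), (1,8), (2,8), (3,8), 4, (5,9).]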

Uncertainty of Testing

[Figure: the control flow graphs for tasks T1 and T2 and their combined-state graph, annotated with the statements X:=1 and X:=2 and the question X==?]

Formal Verification: An Alternative to Testing

• Theorem-proving-based verification
– Use mathematical reasoning
– Prove properties about all possible executions
– Difficult and error prone

• Finite state verification
– Reason about a finite model of the system
– Prove properties about all possible executions, but not as powerful as theorem proving
– Almost a totally automated process

Spectrum of Difficulty

Ad-hoc Testing

Systematic Testing

Theorem Proving

Finite State Verification

• Arbitrary test cases

• Requirements-based test planning

• Requirements captured as properties

• Properties guaranteed on all possible executions

Finite State Verification (FSV)

• Holds the promise of providing a cost-effective way of verifying important properties about a system
– Not all faults are created equal
– Invest effort into the most important properties

• Several promising prototypes
1. Reachability-Based Model Checking
• SPIN or Symbolic Model Checking (SMV)
2. Flow Equations
• Integer Necessary Conditions (INCA)
3. Data Flow Analysis
• FLAVERS

High-Level Architecture of FSV Systems

[Figure: a Property Translator turns the property into a property representation, and a System Translator turns the system into a system model. The Reasoning Engine takes both and reports either "Property Verified" or counter examples for the model.]

Conservative Analysis

• If property verified, property holds for all possible executions of the system

• If property not verified:
– An error
OR
– A spurious result
• System model abstracts information to be tractable
• Conservative abstractions over-approximate behavior
• If the inconsistency relies upon over-approximations, then it is a spurious result, e.g., the counter example corresponds to an infeasible path

System Model

• Depends on property being verified

• Eliminate information that does not impact the proof

• Abstraction techniques allow “states” in the model to be reduced/collapsed

Some Properties of Properties

• State-based versus event-based
– State-based: once temperature is greater than 100 degrees, lock is true
– Event-based: elevator door closes before elevator moves

• Single locations versus (sub)paths
– Deadlock or race conditions
– Sequences of states or events

• Safety versus liveness

A quick look at three approaches to FSV

1. Model Checking

2. Flow Equations

3. Data Flow Analysis

1. Model Checking

• Properties usually expressed in a temporal logic

• System represented as a (possibly “abstracted”) reachability graph
– State based

• Reasoning engine propagates valid subformulas through the graph

High-Level Architecture of Model Checking

[Figure: a Property Translator turns the temporal logic property into a property representation, and a System Translator turns the system into a state-based reachability graph. Subformula propagation over both reports "Property Verified".]

Representing Properties

• CTL operators
– G: globally
– F: future
– X: next
– U: until

• At a state in the model, a path quantifier is prefixed: A (for all paths) or E (for some path)
– AG p means that for all paths from this state, p is true and will remain true
– AF p means that for all paths from this state, p will eventually be true
– EG p means that for some path from this state, p is true and will remain true
– EF p means that for some path from this state, p will eventually be true

Propagating Propositions

[Figures: two small example graphs whose nodes carry the labels p, AF p, EF p, and X p, showing how labels propagate from a node labeled p.]

Propagation rules

• For each formula type

• Only need to look at a node’s successors

• Need a reachability graph that shows the states (i.e., the values) of the variables
– Example: process 1 can be null, trying to obtain the lock, or in its critical region (n1, t1, c1); process 2 can be null, trying to obtain the lock, or in its critical region (n2, t2, c2); turn is a variable that indicates which process can obtain the lock (0, 1, 2)

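Example: mutual exclusion protocol*

[Figure: reachability graph over (process1, process2, turn) with nine states: (n1,n2,turn=0), (t1,n2,turn=1), (c1,n2,turn=1), (t1,t2,turn=1), (c1,t2,turn=1), (n1,t2,turn=2), (n1,c2,turn=2), (t1,t2,turn=2), (t1,c2,turn=2). The edges are not recoverable from the extracted text.]

*McMillan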

Example Property

• Property: AG(t1=>AF c1)

• If process1 tries to get the lock (t1) then eventually it gets into its critical region (c1)

• Note: we would like to prove this for all processes, but FSV approaches usually must instantiate the property (and the system)

Example: propagation

[Figure: the mutual exclusion reachability graph with the label AF c1 propagated onto its states, working toward AG(t1=>AF c1).]

Need to continue propagating

• (t1 => AF c1) means (AF c1 ∨ ¬t1)


[Figure: the mutual exclusion reachability graph with each state labeled t1 => AF c1.]

Connected region where all nodes have t1 => AF c1 ==> AG(t1 => AF c1)

[Figure: the mutual exclusion reachability graph with each state labeled AG(t1 => AF c1).]

Formula Propagation

• Propagate until no change
– Propagate from smaller to larger subformulas
– Keep all the formulas associated with a node, e.g., AF c1, t1=>AF c1, AG(t1=>AF c1)

• “Smart” algorithm: linear in the size of the model and the size of the formula
– But the model is exponential
– Many optimization techniques

• Symbolic model checking
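The propagation idea can be sketched with two fixpoint loops over an explicit graph. This is a naive illustration, not the linear-time algorithm the slide refers to. The state names follow the slides, but the transition relation `EDGES` is an assumption: the edges of the slide's graph were lost in extraction, and the ones below follow the commonly cited form of this mutual exclusion example.

```python
# ASSUMED transition relation for the nine-state mutual exclusion graph.
# State names are "process1,process2,turn"; edges are not from the slides.
EDGES = {
    "n1,n2,0": ["t1,n2,1", "n1,t2,2"],
    "t1,n2,1": ["c1,n2,1", "t1,t2,1"],
    "c1,n2,1": ["c1,t2,1", "n1,n2,0"],
    "t1,t2,1": ["c1,t2,1"],
    "c1,t2,1": ["n1,t2,2"],
    "n1,t2,2": ["n1,c2,2", "t1,t2,2"],
    "n1,c2,2": ["t1,c2,2", "n1,n2,0"],
    "t1,t2,2": ["t1,c2,2"],
    "t1,c2,2": ["t1,n2,1"],
}
STATES = set(EDGES)

def atom(name):
    """States whose label contains the atomic proposition, e.g. 't1'."""
    return {s for s in STATES if name in s.split(",")}

def af(target):
    """AF target: label a state once all of its successors are labeled."""
    labeled = set(target)
    changed = True
    while changed:
        changed = False
        for s in STATES - labeled:
            if all(t in labeled for t in EDGES[s]):
                labeled.add(s)
                changed = True
    return labeled

def ag(target):
    """AG target: remove a state if any successor has lost the label."""
    labeled = set(target)
    changed = True
    while changed:
        changed = False
        for s in list(labeled):
            if any(t not in labeled for t in EDGES[s]):
                labeled.discard(s)
                changed = True
    return labeled

# Smaller subformulas first: AF c1, then t1 => AF c1, then AG(...).
af_c1 = af(atom("c1"))
implies = (STATES - atom("t1")) | af_c1   # (t1 => AF c1) is (not t1) or (AF c1)
holds = ag(implies)
print("AG(t1 => AF c1) holds in the initial state:", "n1,n2,0" in holds)
```

Each loop only inspects a node's successors, which is why the rules can be applied locally until nothing changes.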

Symbolic Model Checking

• With abstraction, nodes may represent sets of values
– Ordered Binary Decision Diagram (OBDD)

[Figure: the Boolean function ab+c drawn first as a full binary decision tree and then as progressively smaller ordered binary decision diagrams obtained by merging shared subgraphs.]

Note: order is a, b, c

Ordered Binary Decision Diagrams

• Order of variables can affect the size of the model

• In general, can not determine the optimal order without trying them all
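A small sketch of the ordering effect, under stated assumptions: the helper `robdd_size` is hypothetical, and it builds the reduced diagram by brute-force Shannon expansion, which is only practical for a handful of variables. The test function (x1 ∨ x2) ∧ (y1 ∨ y2) is the one used in the OBDD example of this talk.

```python
from itertools import permutations

def robdd_size(f, order):
    """Count the internal nodes of the reduced OBDD of f for a variable order.

    f takes a dict {variable name: bool}. Terminals are the ids 0 and 1.
    """
    unique = {}  # (variable, low id, high id) -> node id, shared across the diagram

    def build(i, env):
        if i == len(order):
            return int(bool(f(env)))
        low = build(i + 1, {**env, order[i]: False})
        high = build(i + 1, {**env, order[i]: True})
        if low == high:            # redundant test: skip the node
            return low
        key = (order[i], low, high)
        if key not in unique:      # merge isomorphic subgraphs
            unique[key] = len(unique) + 2
        return unique[key]

    build(0, {})
    return len(unique)

def f(v):
    return (v["x1"] or v["x2"]) and (v["y1"] or v["y2"])

print(robdd_size(f, ["x1", "x2", "y1", "y2"]))   # related variables adjacent
print(robdd_size(f, ["x1", "y1", "x2", "y2"]))   # related variables interleaved

# Trying every order, as the slide says is needed in general.
sizes = {p: robdd_size(f, list(p)) for p in permutations(["x1", "x2", "y1", "y2"])}
print(min(sizes.values()), max(sizes.values()))
```

With only four variables the exhaustive search over 24 orders is cheap; the number of orders grows factorially, which is why practical tools rely on heuristics instead.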

Example of OBDDs

[Figure: two OBDDs for (x1 ∨ x2) ∧ (y1 ∨ y2). With the order x1, x2, y1, y2 the diagram has one node per variable; with the order x1, y1, x2, y2 it has two y1 nodes and two x2 nodes.]

Some observations: Model Checking

• Worst-case bound linear in the size of the model
– Model exponential

• Experimentally often very effective

• Not clear if model checking or symbolic model checking is superior
– Depends on the problem

2. Flow Equations

• Model system as finite state automata

• Use extended network flow inequalities to capture legal flow through a concurrent system

• Represent negation of the property as a set of inequalities

High-Level Architecture of INCA

[Figure: architecture of the Integer Necessary Condition Analyzer. A System Translator produces FSAs, an FSA Translator turns them into a set of inequalities, and a Property Translator turns the property into a set of inequalities. An integer linear programming system then reports either "Property Verified" (no solution) or counter examples for the model (solution).]

Example: Task Flow Equations

[Figure: three task automata whose edges carry the flow variables x0 to x4, x5 to x8, and x9 to x13; events a, a’, b, and b’ label edges.]

x1 = x0 + x3, x2 = x1, x2 = x3 + x4, x0 = 1, x4 = 1

x6 = x5 + x7, x6 = x7 + x8, x5 = 1, x8 = 1

x10 = x9 + x12, x11 = x10, x11 = x12 + x13, x9 = 1, x13 = 1

Flow into a node = flow out of a node

Example: Inter-task Flow Equations

x2 = x6

x11 = x7 + x8

Rendezvous are always matched: # calls = # accepts

Example: Require Non-Negative Flow

∀j: 0 ≤ xj

Flow over edges is non-negative

Example: Property

Are there more occurrences of event a than event b?

Property: For all paths, event a occurs more than event b

Property equation: x2 > x11

Property complement: x2 ≤ x11

Solving the Set of Inequalities

• Determine if the combined system of inequalities is consistent
– Use integer linear programming

• If consistent, there is a set of flows through automata that violate the property

• Provides guidance for trace through the model (but may not be executable)

Solving for a property

Task flow equations: x1 = 1 + x3, x2 = x1, x2 = x3 + 1, x6 = 1 + x7, x6 = x7 + 1, x10 = 1 + x12, x11 = x10, x11 = x12 + 1

Inter-task flow equations: x2 = x6, x11 = x7 + x8

Non-negative flow: ∀j: 0 ≤ xj

Property complement: x2 ≤ x11

Does this set of inequalities have a solution?

A solution exists, e.g., x3 = x7 = x12 = 0 and all other xi = 1

=> property does not hold
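The consistency check can be reproduced with a brute-force search, shown here as a sketch rather than as how INCA works: a real tool hands the system to an integer linear programming solver. The names `satisfies` and `find_solution` and the search bound are illustrative assumptions; the constraints are the ones listed on the slides, with x[j] standing for xj.

```python
from itertools import product

def satisfies(x):
    """Check an assignment x[0..13] of edge flows against the slide's constraints."""
    task = (
        x[0] == 1 and x[4] == 1
        and x[1] == x[0] + x[3] and x[2] == x[1] and x[2] == x[3] + x[4]
        and x[5] == 1 and x[8] == 1
        and x[6] == x[5] + x[7] and x[6] == x[7] + x[8]
        and x[9] == 1 and x[13] == 1
        and x[10] == x[9] + x[12] and x[11] == x[10] and x[11] == x[12] + x[13]
    )
    inter_task = x[2] == x[6] and x[11] == x[7] + x[8]
    non_negative = all(v >= 0 for v in x)
    complement = x[2] <= x[11]          # negation of the property x2 > x11
    return task and inter_task and non_negative and complement

def find_solution(bound=3):
    """Try small values for the loop edges x3, x7, x12; the rest follow from them."""
    for x3, x7, x12 in product(range(bound + 1), repeat=3):
        x = [0] * 14
        x[0] = x[4] = x[5] = x[8] = x[9] = x[13] = 1
        x[3], x[7], x[12] = x3, x7, x12
        x[1] = x[2] = 1 + x3
        x[6] = 1 + x7
        x[10] = x[11] = 1 + x12
        if satisfies(x):
            return x
    return None

solution = find_solution()
print("property complement is satisfiable:", solution is not None)
print({f"x{j}": v for j, v in enumerate(solution or [])})
```

Any assignment returned here is a set of flows that satisfies the necessary conditions and violates the property, which is exactly what the analysis reports as a potential counter example.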

Counter example

[Figure: the task automata with the solution x3 = x7 = x12 = 0, all other xi = 1, marked as a counter example; the property does not hold.]

Some Limitations

• Integer linear programming has an exponential worst-case bound

• Inter-process order information is not preserved
– Only checks whether event counts are consistent
– Like most static techniques, may produce spurious results

Some Benefits

• Does not enumerate the state space!

• Integer linear programming is often very efficient
– Empirical evidence: linear inequality systems usually grow linearly and take sub-exponential times to solve

• In practice, INCA is often an effective technique

3. Data Flow Analysis: FLAVERS

FLow Analysis for VERification of Systems

• Represents property as a finite state automaton

• System model is a collection of annotated control flow graphs
– Inter-process communication and interleavings are represented with additional edges
– Does not enumerate all reachable states
– Over-approximates relevant executable behaviors

• Reasoning engine based on data flow analysis

High-Level Architecture of Data Flow Analysis

[Figure: a Property Translator turns the property into an FSA, and a System Translator turns the system into a collection of annotated CFGs (control flow graphs). State Propagation over both reports either "Property Verified" or counter examples from the model.]

Modeling the System

[Figure: control flow graphs for tasks T1 and T2 beside the graph of their combined states, illustrating state explosion.]

Modeling the System

[Figure: control flow graphs for tasks T1 and T2, with nodes numbered 1 to 9.]

• Automatically creates the program model from source code

• Instead of the state space, explicitly represents interleaved execution via edges

• Smaller model

• Loss of precision

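Representing Properties

Example:

[Figure: property FSA with states 0, 1, and 2 over the events open, close, and move. Transition labels in the diagram: open; close; move; close, move; open; and close, open, move.]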

State Propagation

• States of the property are propagated through the model

• The property is proved if only accepting (non-accepting) states are contained in the final node of the model

Example

public static void main(String[] args) {
    …
    if (elevatorStopped) {
        ...
        openDoors();
    }
    recordState();
    if (elevatorStopped) {
        ...
        closeDoors();
    }
    moveToNextFloor();
}

[Figure: the corresponding control flow graph with nodes if, open, if, close, move.]

Example

[Figure: the control flow graph (if, open, if, close, move) annotated with the sets of property states {0}, {1}, {0,1}, {0}, and {0,2}, shown beside the property FSA.]
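State propagation over the elevator example can be sketched as a worklist data flow analysis. Both tables below are assumptions reconstructed from the slide labels, not taken from FLAVERS: `DELTA` encodes a property FSA in which state 0 means doors closed, state 1 means doors open, and state 2 is a violation trap reached by moving with the doors open; `CFG` encodes the control flow graph of the code fragment, ignoring the branch conditions.

```python
from collections import deque

# ASSUMED property FSA: (state, event) -> next state.
DELTA = {
    (0, "open"): 1, (0, "close"): 0, (0, "move"): 0,
    (1, "open"): 1, (1, "close"): 0, (1, "move"): 2,
    (2, "open"): 2, (2, "close"): 2, (2, "move"): 2,
}

# ASSUMED CFG: node -> (event emitted by the node or None, successor nodes).
CFG = {
    "if1":   (None,    ["open", "if2"]),
    "open":  ("open",  ["if2"]),
    "if2":   (None,    ["close", "move"]),
    "close": ("close", ["move"]),
    "move":  ("move",  []),
}

def propagate(cfg, delta, entry, start_state=0):
    """Return, for each node, the set of property states possible after it."""
    incoming = {n: set() for n in cfg}
    out = {n: set() for n in cfg}
    incoming[entry] = {start_state}
    work = deque([entry])
    while work:
        n = work.popleft()
        event, successors = cfg[n]
        out[n] = {delta[(q, event)] if event else q for q in incoming[n]}
        for s in successors:
            if not out[n] <= incoming[s]:   # new states reached: revisit s
                incoming[s] |= out[n]
                work.append(s)
    return out

result = propagate(CFG, DELTA, "if1")
for node in CFG:
    print(node, sorted(result[node]))
print("violation possible in the model:", 2 in result["move"])
```

State 2 reaches the final node because the model contains the path that opens the doors and then skips the close branch. Whether that path is feasible in the program is a separate question that the plain control flow graph cannot answer.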

Incrementally Improving Precision

[Figure: the FSV architecture (a Property Translator producing an FSA, a System Translator producing the system model, and State Propagation reporting "Property Verified"), extended with constraints that are added as further inputs.]

Example: Boolean variable constraint

== is a predicate

= is assignment

[Figure: constraint automaton for a Boolean variable S with states unknown, true, false, and viol. Transitions are labeled with the predicates S==true and S==false and the assignments S=true and S=false.]

Example with Constraints

[Figure: the elevator control flow graph with branch edges labeled S==true and S==false, shown with the property FSA and the constraint automaton. Nodes are annotated with pairs of automaton states: (0,0), (0,1), (1,1), (1,1), and (1,viol).]

[Figure: the same graphs with sets of pairs propagated through the control flow graph: {(1,1), (0,2)}, {(1,1), (0,viol)}, {(1,viol), (0,2)}, {(0,1)}, and {(0,1), (0,2)}.]
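The effect of the constraint can be sketched by propagating pairs instead of single states. Everything in this block is an assumed reconstruction from the slide labels: pairs are written (property state, constraint state), `PROP` is the elevator property FSA with state 2 as the violation, `CONS` is the Boolean variable automaton restricted to predicates, and S stands for elevatorStopped. Pairs whose constraint state becomes "viol" describe infeasible paths and are dropped.

```python
from collections import deque

# ASSUMED property FSA: (state, event) -> next state; 2 is the violation trap.
PROP = {
    (0, "open"): 1, (0, "close"): 0, (0, "move"): 0,
    (1, "open"): 1, (1, "close"): 0, (1, "move"): 2,
    (2, "open"): 2, (2, "close"): 2, (2, "move"): 2,
}

# ASSUMED constraint automaton for Boolean S: (state, predicate) -> next state.
CONS = {
    ("unknown", "S==true"): "true",  ("unknown", "S==false"): "false",
    ("true", "S==true"): "true",     ("true", "S==false"): "viol",
    ("false", "S==false"): "false",  ("false", "S==true"): "viol",
}

EVENT = {"if1": None, "open": "open", "if2": None, "close": "close", "move": "move"}
EDGES = [  # (source, predicate on the edge or None, target)
    ("if1", "S==true", "open"), ("if1", "S==false", "if2"),
    ("open", None, "if2"),
    ("if2", "S==true", "close"), ("if2", "S==false", "move"),
    ("close", None, "move"),
]

def property_states_at_exit(use_constraint):
    """Property states reachable after 'move', with or without the constraint."""
    incoming = {n: set() for n in EVENT}
    out = {n: set() for n in EVENT}
    incoming["if1"] = {(0, "unknown")}
    work = deque(["if1"])
    while work:
        n = work.popleft()
        ev = EVENT[n]
        out[n] = {(PROP[(p, ev)] if ev else p, c) for p, c in incoming[n]}
        for src, pred, dst in EDGES:
            if src != n:
                continue
            moved = set()
            for p, c in out[n]:
                if use_constraint and pred:
                    c = CONS[(c, pred)]
                if c != "viol":             # infeasible under the constraint
                    moved.add((p, c))
            if not moved <= incoming[dst]:
                incoming[dst] |= moved
                work.append(dst)
    return {p for p, _ in out["move"]}

print("without constraint:", sorted(property_states_at_exit(False)))
print("with constraint:   ", sorted(property_states_at_exit(True)))
```

The cost of the extra precision is visible in the data: the analysis now tracks the product of property states and constraint states rather than property states alone.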

A 2004 paper

• Matthew B. Dwyer, Lori A. Clarke, Jamieson M. Cobleigh, Gleb Naumovich, Flow Analysis for Verifying Properties of Concurrent Software Systems, Department of Computer Science, University of Massachusetts, Amherst, MA 01003, April 2004. (UM-CS-2004-006)

http://laser.cs.umass.edu/techreports/04-06.pdf

Quantified regular expression (QRE)

for events {valid, invalid, transaction}
show all executions satisfy invalid*; (valid; transaction)

Ada Tasks for the ATM Example

Trace-Flow Graph

Refined Control-Flow Graphs

Task Synchronization

Reducing the Size of the TFG

State-propagation Analysis

Some Observations: Data Flow Analysis

• Overall complexity is O(N²S)
– N is the number of nodes in the model
– S is the number of states: property × constraints
– Experimentally: performance subexponential

• Usually requires several iterations to determine needed constraints

• Constraints
– Many automatically generated on request
– Can be used to model other information

Experimental Comparisons

• All these approaches are:
– Very effective on some problems
– Disappointing on some problems

• Hard to predict how they will perform

• Experimental results
– George S. Avrunin, James C. Corbett, Matthew B. Dwyer, Corina S. Pasareanu, and Stephen F. Siegel, Comparing Finite-State Verification Techniques for Concurrent Software

Can we move beyond academic prototypes to practitioners’ tools?

• Yes, but there is more work to be done
– Optimization, optimization, optimization
– Process support
– Better support for specifying properties
– Better support for generating, selecting, visualizing counter example traces
– Better approaches for dealing with dynamism
– Full support for real languages
– Full lifecycle support
• Integration with testing

Specifying Properties

• It is very hard to specify properties precisely
– E.g., open and close file repeatedly
• Must the file always be opened? Or, IF it is opened, then must it be closed?
• Can the file be opened repeatedly before it is closed?

• Need notations that are easy to use
– Specification patterns

• Need tools to help understand properties
– Need to test the properties

Counter Example Traces

• Want “short” but “useful” counter examples

• How to select the “next” counter example?

• How to incorporate user guidance?

• How to go from traces in the model to traces in the program?


Dynamism

• FSV is a static analysis approach that deals with static models
– Must create a specific instance of the model
• E.g., N philosophers => 5 philosophers
– Cannot handle
• dynamic objects
• dynamic process creation

• Need hybrid techniques that integrate theorem proving with FSV

Support for Real Languages

• Many language features have not been addressed
– Aliasing
– Exception handling
– Event-based notification

Lifecycle-based Verification

• High-level architectural design
– Extremely important for distributed systems
• Detect problems early
– Need to support heterogeneous interaction models

• Low-level design
– Additional detail leads to additional properties
– Need to maintain consistency with the HLA

Lifecycle-based Verification (continued)

• Coding
– Partial systems
– Incremental, compositional development/verification

• Debugging
– Hypothesize fault in terms of a property
– FSV provides a counter example trace or invalidates the hypothesis

Lifecycle-based Verification (continued)

• Testing
– Generalize test cases to their corresponding property
– Test planning via requirements-based property specification

• Regression testing
– Re-verify properties that should not have changed
• Need efficient re-verification techniques

Integrating Testing and Verification

• Testing and verification complement one another
– Verification makes assumptions that should be monitored dynamically
– Testing finds problems that should then be examined globally

• Need to develop integrated techniques

Synergy between Testing and Verification

[Figure: cycle linking Testing and Verification, with arrows labeled properties, faults, assumptions/constraints, counter examples, assertions, and test plans/cases.]

Conclusions

• Testing alone cannot provide the assurance that is needed for many applications
– Especially distributed systems

• FSV is a promising technology
– Applicable to a wide range of properties
– Applicable throughout the lifecycle
– Initial empirical results promising

Conclusion

• Finite State Verification is a major paradigm shift
– More difficult than testing, but not that much more difficult
– Cultural resistance to doing anything different
• Is the pain worth the gain?

• Grand challenge: Can we lower the obstacles to adoption?