[IEEE 2008 Sixth IEEE International Conference on Software Engineering and Formal Methods - Cape Town, South Africa (2008.11.10-2008.11.14)] 2008 Sixth IEEE International Conference

Formal Change Impact Analyses of Extended Finite State Machines using a Theorem Prover

Bo Guo and Mahadevan Subramaniam

Computer Science Department, University of Nebraska-Omaha, Omaha, NE 68182.

Abstract

This paper describes a formal change impact analysis approach for systematic evolution of communicating systems. Systems are modeled using a network of communicating extended finite state machines (CEFSMs) with variables ranging over commonly used data types including numbers, booleans, arrays, and object fields. Parameterized messages exchanged over queues and shared variables are used for communication. Changes to the system are performed at the transition level by adding/deleting transitions. Given a changed transition, the impacted system transitions are automatically computed using a bounded, selective state exploration based on the inductive assertion approach. A theorem prover extended with queue axioms is used to discharge the verification conditions. Multiple symbolic values for each variable present in a system state are represented as a set of rewrite rules to minimize state space overheads. Rewrite-rule based procedures are described for reducing the number of symbolic values in system states. We also describe heuristics to identify simultaneously enabled and disabling transitions and describe a procedure to reduce the number of verification conditions generated during the impact analysis. The effectiveness of the proposed approach is illustrated on several applications including web services and cache coherence protocols.

1. Introduction

Evolution and maintenance of communication systems is a challenging activity. Simple changes can lead to unanticipated outcomes due to the subtle interactions among the components in such systems. Manually analyzing the impact of changes to such systems is a labor-intensive and error-prone activity.

In this paper, we describe an automatic approach for performing change impact analysis for communicating systems. We focus on communicating extended finite state machines (CEFSMs) that have been extensively used to model the behavior of communicating systems, data-rich protocols, and more recently, web services [4, 9, 15, 16]. A system comprises a network of CEFSMs that communicate with each other by exchanging messages and through shared data variables. Message parameters and variables can range over commonly used data types including numbers, booleans, arrays, and object fields.

Changes to the system are performed at the transition level by adding or deleting one or more transitions from one or more CEFSMs. Every transition that can appear in a system execution with a changed transition is deemed to be impacted by the change. Such a model of impact captures the information necessary to ensure that every change preserves application-independent properties of these systems such as consistency and executability [13, 14], which are a prerequisite for further verification.

Our impact analysis procedure is based on the well-known inductive assertions approach and automatically generates all the system states reachable from a given modified transition up to a specified bound. At each step of the procedure, a verification condition is generated using a given system state and a given transition, translated to a first-order logic formula, and input to a theorem prover. If the formula is satisfiable then the next system state produced by executing that transition is generated. Then, the given state is further explored using other transitions. We use a theorem prover called Simplify, which supports decision procedures for numbers, booleans, equality, maps, and partial orders [6].

In the system states of CEFSMs, several previous values of a variable may be used in the parameters of messages that are yet to be processed. The impact analysis procedure can incur significant overheads in terms of the state space and the required reasoning effort due to these multiple symbolic values in each system state. Our contributions towards mitigating this problem are as follows.

• Compactly represent symbolic values of variables in system states by rewrite rules that can be simplified lazily outside of the prover.

• Reduce the number of symbolic values in a global state by automatically expressing older values in terms of the new one, and

• Pre-process transitions to determine i) transitions disabled by a given transition in every system state and ii) blocks of transitions that are likely simultaneously enabled in every system state. We use this information to avoid the generation of certain verification conditions to be discharged by the prover during impact analysis.

978-0-7695-3437-4/08 $25.00 © 2008 IEEE

DOI 10.1109/SEFM.2008.40

We have applied the proposed approach to several recently published web service examples as well as several cache coherence protocols. We studied the effort required to analyze the impact of cumulative changes to several transitions in a system as well as the distribution of this effort over changes to individual transitions. The number of system states visited, time taken, the number of prover calls, and the number of symbolic values maintained in the system states are used to measure impact analysis effort. Our initial results seem quite promising. For the cumulative changes, in our examples, the proposed approach improves on all of the metrics in comparison to a brute-force inductive assertion approach. Our results on impact analysis effort for individual transitions of cache coherence protocols show that it is costlier to change transitions in the reply controllers of caches and in the memory than in the request controllers. This result conforms to our expectations since transitions in these controllers have many more involved interactions than others.

The rest of this paper is organized as follows. After describing related work next, Section 3 gives an overview of CEFSMs and the theorem prover Simplify. Section 4 describes the use of rewrite rules to represent multiple symbolic values in the system states, and the procedures for reducing these values in the system states. In Section 5, the proposed change impact analysis approach for CEFSMs using bounded state exploration is described. In Section 6, we describe how the theorem prover can be used to pre-process the CEFSMs to reduce the number of calls to the prover during change impact analyses. Experimental results are discussed in Section 7 and Section 8 concludes the paper.

2. Related Work

Change impact analysis is an active area of research in software engineering. There is extensive published work in this area (see [11] for an excellent overview). However, most of this work deals with sequential programs and is not directly applicable to communicating systems. Further, most of the earlier impact analysis approaches either use static analyses of program flow graphs [11] or perform static analyses of dynamic information (like traces) collected from deployed systems [12]. None of these approaches is based on the inductive assertion approach with a theorem prover. This work is also perhaps the first to propose the use of rewrite rules to model symbolic values.

This paper builds on our earlier work in [13, 14, 2] on change analysis of communicating finite state machines (with no parameters or variables). In [13], we proposed a model of changes for communicating finite state machines and developed conditions on these changes to preserve certain properties. Our work in [14] described a method for change-guided repair of inconsistencies for communicating finite state machines. Our preliminary efforts for state exploration of CEFSMs are described in [2].

3. Preliminaries

System Model. We model a system $P = (C_1, C_2, \cdots, C_n, \varepsilon)$ as a network of CEFSMs $C_i$ along with the environment $\varepsilon$. The CEFSMs $C_i$ are connected using (possibly infinite) FIFO queues. The information and notation in this section are largely derived from earlier works. More details can be found in [5, 10].

Each CEFSM $C_i = (I_i, O_i, S_i, V_i, T_i)$, modeling the behavior of the controller $C_i$, is a 5-tuple where $I_i$, $O_i$, $S_i$, $V_i$, and $T_i$ are finite sets of parameterized input and output messages, states, variables, and transitions respectively. Each message $m$ in $I_i$ (in $O_i$) has parameters $p_1, \cdots, p_k$ with type signature $\vec{\sigma} = (\sigma_1, \cdots, \sigma_k)$; $\sigma_1$ and $\sigma_2$ are controller ids denoting the sender and receiver of $m$, and the rest can be one of integers, booleans, arrays, and object fields. The set $V_i = X_i \cup CH_i$ is a union of the global variables $X_i$ appearing in $T_i$ and the queue variables $CH_i$ denoting the input and output queues to (from) $C_i$. A queue variable $Q(j, i)$ in $CH_i$ is a queue from $C_j$ to $C_i$. We use $Q(j, i).hd$ and $Q(j, i).tl$ to stand for the messages at the head and the tail of this queue; $enq$ and $deq$ to enqueue a message into and dequeue a message from the queue respectively; the predicates $full$ and $empty$ determine if a queue is full or empty respectively; $Q(j, i).hd.m$ stands for the message name at the head of the queue; and $Q(j, i).hd.m.k$ stands for the parameter $p_k$ of the message at the head of the queue.

Each transition $t$ in $T_i$ is of the form $Q(j, i).hd == m_j(\vec{p}),\ P_t(\vec{x_i}, \vec{p}),\ s_t \mapsto q_t,\ enq(Q(i, l), m_l(\vec{e})),\ A_t(\vec{x_i}, \vec{p})$, where $\vec{p} = (p_1, \cdots, p_k)$ is a tuple of distinct parameter variables, and the predicate $P_t$ and the actions $A_t$ depend on global variables $\vec{x_i}$ ($\vec{x_i} \subseteq X_i$) and the parameters $p_i$. Above, $\vec{e} = (e_1, \cdots, e_o)$ stands for a tuple of expressions over global variables and parameters, and $A_t(\vec{x_i}, \vec{p}) = \{x_1 = e_1; \cdots; x_w = e_w\}$ is an ordered sequence of assignments to global variables.¹
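The transition form above can be pictured as a small record. The following Python sketch is only an illustration under stated assumptions: the class `Transition`, its field names, and the method `fire` are hypothetical and do not come from the paper; parameter binding is omitted for brevity, and the output message is assumed to be built from the variable values held before the assignments run (which is what the example in Section 4 suggests). It works on concrete integer values, not on the symbolic states used later.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Env = Dict[str, int]          # concrete values of global variables
Expr = Callable[[Env], int]   # an expression e evaluated over Env


@dataclass
class Transition:
    """Hypothetical record for one CEFSM transition (names are illustrative)."""
    in_queue: Tuple[str, str]                     # Q(j, i) as (sender, receiver)
    in_msg: str                                   # expected head message name m_j
    guard: Callable[[Env], bool]                  # predicate P_t
    src: str                                      # input state s_t
    dst: str                                      # output state q_t
    out_queue: Optional[Tuple[str, str]] = None   # Q(i, l), if any
    out_msg: Optional[str] = None                 # output message name m_l
    out_args: List[Expr] = field(default_factory=list)             # tuple e
    actions: List[Tuple[str, Expr]] = field(default_factory=list)  # A_t

    def fire(self, env: Env):
        """Return (output message or None, updated environment)."""
        msg = None
        if self.out_msg is not None:
            # Assumption: output parameters use the values before A_t runs.
            msg = (self.out_msg, tuple(e(env) for e in self.out_args))
        new_env = dict(env)
        for var, expr in self.actions:   # assignments execute in order
            new_env[var] = expr(new_env)
        return msg, new_env


# The transition t1 used as an example in Section 4:
# I(env, 1).hd == m1, x > 0, s0 -> s1, enq(O(1, 2), m2(1, 2, x)), {x = -x}
t1 = Transition(
    in_queue=("env", "1"), in_msg="m1",
    guard=lambda e: e["x"] > 0, src="s0", dst="s1",
    out_queue=("1", "2"), out_msg="m2",
    out_args=[lambda e: 1, lambda e: 2, lambda e: e["x"]],
    actions=[("x", lambda e: -e["x"])],
)
```

Calling `t1.fire({"x": 3})` yields the message `m2` carrying the old value of `x` together with an environment in which `x` has been negated, which is the situation that motivates keeping several values of a variable in one global state.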

¹ We use == to denote equality and = to denote assignment. We can easily generalize $A_t$ to have conditional statements. Allowing loops whose bounds cannot be determined a priori in $A_t$ requires invariants and is not considered in this paper.

The semantics of a system $P$ is operationally defined using a directed global state graph. The nodes of this graph are global states of the system $P$. Each directed edge between nodes in this graph is labeled by a transition and represents the execution of that transition, starting in one global state and ending in the other.

A global state of $P$, $g = (\langle s_1, \cdots, s_n \rangle, pred)$, is a pair whose first element is an $n$-tuple of its controller states and whose second element $pred$ is a conjunction of atomic predicates over $V_i$. An atomic predicate is formed by using relational operators (and, or, not, ==, ≠, <, ≤, >, ≥) over expressions over the different data types (integers, booleans, arrays, and object fields). Note that parameters do not appear in any global state. Conditions on the parameters are expressed as conditions over the corresponding queue variables. For instance, a condition $p_k \geq 0$ over parameter $p_k$ of message $m$ at the head of queue $Q(j, i)$ is expressed in a global state as $Q(j, i).hd.m.k \geq 0$. An initial global state of $P$ is a global state $g$ in which each controller state $s_i$ in the first element of $g$ is an initial state of $S_i$, and the second element $pred$ is the initial predicate, which is a conjunction of atomic predicates over queue variables stating that all queues between any two controllers are empty.² Atomic predicates may include application-dependent conditions on global variables and may also include conditions on parameters of messages in the queues from the environment.³

A transition $t$ in controller $C_i$ is enabled in a given global state $g$ if its input state and input message match the state and the head of the corresponding queue in $g$ and its predicate $P_t$ is satisfied by $g$. In addition, the output message queue in $g$ must not be full so that $t$ can enqueue its output message in that queue. An execution step of $P$, $g \rightarrow_t g'$, executes a transition $t$ enabled by the global state $g$ and produces the global state $g'$. The global state $g'$ is said to be enabled by $t$. An executable path of $P$ is a sequence of execution steps of $P$. A preamble of $t$, $PreA(t)$, is an executable path starting in an initial global state and ending in a state $g$ that enables $t$. A postamble of $t$, $PosA(t)$, is an executable path starting in a global state $g'$ enabled by $t$ and ending in an initial global state. The context of a transition $t$ is the set of all preambles and postambles of $t$.

² We assume all environment messages to be available in the initial global state. The method is easily extended to allow external inputs in any global state.

³ Note that message parameters in queues from the environment may not be bound to global variables. So, their conditions cannot be eliminated by flattening CEFSMs.

Simplify Prover. Simplify has been used to perform extended static analysis and model checking of software programs [6, 7]. Usually, quantified formulas called verification conditions are generated from the program and input to the prover. The prover automatically determines the validity of an input formula: it returns valid if the formula evaluates to true under all assignments to the variables in the formula and returns invalid otherwise. To check if a formula $F$ is satisfiable, its negation $not(F)$ is input to the prover. $F$ is unsatisfiable if the prover returns valid; $F$ is satisfiable if the prover returns invalid. Simplify contains decision procedures for numbers, booleans, equality, partial orders, and the theory of maps [6].
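The satisfiability check by negation described above can be illustrated with a toy validity checker. This sketch does not call Simplify and does not reproduce its interface; `is_valid` and `is_satisfiable` are hypothetical helpers that enumerate truth assignments over a handful of boolean variables, which is enough to show the logic of the reduction.

```python
from itertools import product


def is_valid(formula, variables):
    """Toy stand-in for a prover: True iff formula holds under every assignment."""
    return all(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


def is_satisfiable(formula, variables):
    """F is satisfiable exactly when not(F) is reported as invalid."""
    negation = lambda assignment: not formula(assignment)
    return not is_valid(negation, variables)
```

For example, `is_satisfiable(lambda a: a["p"] and not a["q"], ["p", "q"])` holds because the negated formula fails for `p = True, q = False`, while a contradiction such as `p and not p` is reported as unsatisfiable because its negation is valid.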

The data types (and the operations) of the message parameters and data variables in the CEFSMs are automatically translated to the data types supported by the prover. Aggregate types like arrays (object fields) are translated into maps from array index data types (field ids) to array (field) element data types. We model CEFSM queues as maps from queue indices to queue elements. Formulas defining the queue constructors initq (create an empty queue), enq (enqueue an element), and deq (dequeue), the selectors head and tail, and the queue predicates full and empty are defined in Simplify as predicates and are added to the prover as additional axioms. Please refer to [2, 3] for details.
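To make the "queue as a map from indices to elements" idea concrete, here is a minimal executable sketch. It is an assumption-laden illustration, not the axiomatization given to the prover in [2, 3]: the class `MapQueue`, the `front`/`back` index fields, and the optional `capacity` are all hypothetical. Each operation returns a new queue, mirroring how a later queue value is expressed in terms of an earlier one.

```python
class MapQueue:
    """FIFO queue stored as a map from integer indices to elements (illustrative)."""

    def __init__(self, capacity=None, cells=None, front=0, back=0):
        self.capacity = capacity       # None models an unbounded queue
        self.cells = dict(cells or {})
        self.front = front             # index of the head element
        self.back = back               # index one past the tail element

    def empty(self):
        return self.front == self.back

    def full(self):
        return self.capacity is not None and self.back - self.front >= self.capacity

    def head(self):
        if self.empty():
            raise ValueError("head of empty queue")
        return self.cells[self.front]

    def tail(self):
        if self.empty():
            raise ValueError("tail of empty queue")
        return self.cells[self.back - 1]

    def enq(self, message):
        if self.full():
            raise ValueError("enq on full queue")
        cells = dict(self.cells)
        cells[self.back] = message
        return MapQueue(self.capacity, cells, self.front, self.back + 1)

    def deq(self):
        if self.empty():
            raise ValueError("deq on empty queue")
        cells = dict(self.cells)
        del cells[self.front]
        return MapQueue(self.capacity, cells, self.front + 1, self.back)


def initq(capacity=None):
    """Create an empty queue."""
    return MapQueue(capacity)
```

Because `enq` and `deq` never mutate their receiver, an older queue value stays available after a newer one is built, which is the property the multiple-instance treatment of queue variables relies on.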

4. Handling Multiple Symbolic Values

It is often necessary to refer to several previous values of a variable in a global state. As an example, consider a controller $C_1$ executing a transition $t_1$: $I(\varepsilon, 1).hd == m_1(\varepsilon, 1),\ x > 0,\ s_0 \mapsto s_1,\ enq(O(1, 2), m_2(1, 2, x)),\ \{x = -x\}$, in a global state $g_0$: $(s_0,\ x > 0 \wedge not(full(O)) \wedge I.hd == m_1(\varepsilon, 1))$, resulting in another global state, say, $g_1$. It is clear that we must maintain the previous positive value of $x$ (as well as its current negative value) in the global state $g_1$ so that an appropriate transition from controller $C_2$ can process the output message of the transition $t_1$. We can execute another transition $t_2$: $I(\varepsilon, 1).hd == m_3(\varepsilon, 1),\ x < 0,\ s_1 \mapsto s_0,\ enq(O(1, 2), m_4(1, 2, x)),\ \{x = x + 1\}$ in the global state $g_1$ to produce another global state, say, $g_2$. Now, two previous values of $x$ (in addition to its current value) must be maintained in $g_2$ so that the output messages of both the transitions $t_1$ and $t_2$ are correctly processed. It is easy to generalize the above example so that several previous values of a variable need to be maintained in a global state.⁴ We can also similarly construct examples where both current and previous values of a queue variable need to be maintained in a global state.

To refer to multiple values of a variable in a global state we create multiple instances of that variable. A new variable is introduced to refer to each instance. The variable $n_w x$ represents the $w$th instance of a variable $x$. The instance referring to the most recent value of $x$ (the one with the highest $w$ value) is called the latest instance of $x$. The initial value of $x$ is called the initial instance and is denoted by the variable $n_0 x$.

⁴ Note that CEFSMs can be easily generalized to support multi-casting where each transition sends several output messages. Executing a single such transition can produce global states maintaining multiple previous values of variables. Examples of such transitions may be found in the protocols in Section 7.

In the rest of this section, we first describe how the variable instances and their values are maintained in global states. Then, we describe procedures for reducing the number of variable instances in a global state.

4.1. Global States with Rewrite Rules

A symbolic value is associated with each instance of a variable in each global state. Each symbolic value is simply a term in the language of the prover and denotes the computation performed on a variable before reaching a particular global state. We can automatically construct the symbolic value of an instance of a variable in a global state by repeatedly composing the relevant action statements in the transitions used to reach that state. As we reach deeper states, several statements may need to be composed and this can result in large terms. Storing many such terms in a global state corresponding to the multiple variable instances in the state can become prohibitively expensive.

Our idea to address this problem is simple: implicitly represent the symbolic value of each instance in a global state by a set of rewrite rules. We extend each global state $g = (\langle s_1, \cdots, s_n \rangle, pred, R)$ to be a triple that contains a (possibly empty) set of rewrite rules $R$ in addition to the predicate $pred$ and the $n$-tuple of controller states. $R$ has at most one rewrite rule for each variable instance appearing in the state $g$. There are no rules in $R$ for the initial instances. Each action statement executed to reach the global state $g$ is translated into a rewrite rule and included in $R$. For instance, the action statement $\{x = -x\}$ in transition $t_1$ above is translated into the rule $n_1 x \to -n_0 x$, using the instance $n_0 x$ for the initial value of $x$ and the instance $n_1 x$ for its modified value, and is used to produce the global state $g_1$ with $pred = n_0 x > 0$ and $R = \{n_1 x \to -n_0 x\}$. Similarly, the global state $g_2$ with $pred = n_0 x > 0 \wedge n_1 x < 0$ and $R = \{n_1 x \to -n_0 x,\ n_2 x \to n_1 x + 1\}$ is produced after executing transitions $t_1$ and $t_2$.

More generally, for a given sequence of actions we substitute every $w$th definition (occurrence on the left-hand side of a statement) of every global variable $x_j$ in the sequence, and all of the subsequent references to $x_j$ up to (but not including) its next definition, by a new variable instance $n_w x_j$. Then, each action statement $n_w x_j = e_w$ is simply converted into the rule $n_w x_j \to e_w$. Let $R(A_t)$ stand for the ordered set of rules produced from a given sequence of action statements $A_t$. The set $R(A_t)$ is included in the global state generated by executing the corresponding transition.⁵
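The renaming of definitions and references into instances resembles single-assignment renaming and can be sketched in a few lines. The code below is a simplified illustration, not the authors' implementation: terms are nested tuples such as `("+", ("var", "x"), ("const", 1))`, the function names `rename` and `rules_from_actions` are hypothetical, and instance names are built as the string `n<w><x>`.

```python
def rename(expr, latest):
    """Replace each variable reference by its current instance name."""
    kind = expr[0]
    if kind == "var":
        return ("var", f"n{latest.get(expr[1], 0)}{expr[1]}")
    if kind == "const":
        return expr
    return (kind,) + tuple(rename(arg, latest) for arg in expr[1:])


def rules_from_actions(actions, latest):
    """Turn an ordered list of assignments (var, expr) into rewrite rules.

    `latest` maps a variable to the index of its latest instance so far.
    Returns the ordered rules and the updated instance indices.
    """
    latest = dict(latest)
    rules = []
    for var, expr in actions:
        rhs = rename(expr, latest)             # references see the instance before this definition
        latest[var] = latest.get(var, 0) + 1   # each definition creates a new instance
        rules.append((f"n{latest[var]}{var}", rhs))
    return rules, latest


# Actions of t1 ({x = -x}) followed by those of t2 ({x = x + 1}).
rules_t1, after_t1 = rules_from_actions([("x", ("neg", ("var", "x")))], {})
rules_t2, after_t2 = rules_from_actions(
    [("x", ("+", ("var", "x"), ("const", 1)))], after_t1
)
```

Here `rules_t1` encodes the rule n1x → −n0x and `rules_t2` encodes n2x → n1x + 1, matching the rule sets given for the global states g1 and g2 above.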

⁵ More details on the generation of global states are described in the post-image and pre-image computation steps in Section 5.

The symbolic value associated with an instance $n_w x$ in a global state can be computed whenever necessary by simplifying $n_w x$ using the rewrite rules in that global state. We use rewriting to ensure that a distinct symbolic value is associated with each instance of a variable in a global state. Whenever a new state is generated, the instances of a variable are simplified, and among instances having equivalent symbolic values the redundant ones are discarded. While discharging verification conditions using the prover, we first compute the symbolic values of the variable instances by rewriting. Then, these variable instances in the state predicates are substituted by their symbolic values to generate formulas to be input to the prover.
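Computing a symbolic value on demand amounts to applying the rules until only initial instances remain. The sketch below assumes the same tuple-based term encoding as a convenient illustration; `symbolic_value` and `show` are hypothetical helpers, and no algebraic simplification is attempted. The recursion terminates because each rule refers only to instances older than its left-hand side.

```python
def symbolic_value(term, rules):
    """Rewrite a term with `rules` (instance name -> term) until no rule applies."""
    kind = term[0]
    if kind == "var":
        name = term[1]
        # Initial instances have no rule and are left untouched.
        return symbolic_value(rules[name], rules) if name in rules else term
    if kind == "const":
        return term
    return (kind,) + tuple(symbolic_value(arg, rules) for arg in term[1:])


def show(term):
    """Render a term as a readable string."""
    kind = term[0]
    if kind == "var":
        return term[1]
    if kind == "const":
        return str(term[1])
    if kind == "neg":
        return f"-{show(term[1])}"
    return f"({show(term[1])} {kind} {show(term[2])})"


# Rule set R of the global state g2: n1x -> -n0x, n2x -> n1x + 1.
R_g2 = {
    "n1x": ("neg", ("var", "n0x")),
    "n2x": ("+", ("var", "n1x"), ("const", 1)),
}
value_of_n2x = show(symbolic_value(("var", "n2x"), R_g2))
```

With these rules, `value_of_n2x` is the string `(-n0x + 1)`: the latest value of x expressed over its initial instance only. The same substitution would be applied to the instances occurring in a state predicate before a formula is handed to the prover.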

Our representation of global states is similar to that used in symbolic execution [8]. The predicates in our global states correspond to the path conditions used in the states of symbolic execution. However, we use rewrite rules to represent symbolic values. Further, unlike symbolic execution, multiple symbolic values are associated with each variable.

4.2. Eliminating Variable Instances

In many cases, a previous value of a variable can be expressed in terms of its current value. We can then eliminate the older instance of the variable by using its latest instance and reduce the number of variables in the global states.

As an example, suppose that a controller $C_1$ executes the transition $t$: $I(\varepsilon, 1).hd == m_1(\varepsilon, 1),\ x > 1,\ s_0 \mapsto s_1,\ enq(O(1, \varepsilon), m_2(1, \varepsilon)),\ \{x = x + 1\}$, in a global state $g_0$ with predicate $n_1 x \geq 0$. The resulting global state $g_1$ includes the atomic predicates $n_1 x \geq 0$ and $n_1 x > 1$ and the rewrite rule $n_2 x \to n_1 x + 1$, which involves the older instance $n_1 x$ as well as the latest instance $n_2 x$. We do not need both $n_1 x$ and $n_2 x$ in $g_1$ if we can express $n_1 x$ in terms of $n_2 x$.

To do so, first, the rule $n_2 x \to n_1 x + 1$ in $g_1$ can be transformed into the rule $n_1 x \to n_2 x - 1$, which rewrites the older instance into a term with the newer instance. Then, the predicates and the rewrite rules in $g_1$ can be simplified using this rule to obtain a global state, say, $g'_1$, with predicates $n_2 x \geq 1$ and $n_2 x > 2$, and the rewrite rule $n_2 x \to (n_2 x - 1) + 1$. Finally, we can simplify $g'_1$ to delete the trivial rule (whose two sides are the same⁶) and uniformly rename $n_2 x$ by $n_1 x$ to avoid creating unnecessary instances.
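Written out step by step, the elimination in this example proceeds as follows (a worked restatement of the text above, using only the predicates and the rule already given):

```latex
\begin{align*}
n_2x \to n_1x + 1 \quad &\Longrightarrow\quad n_1x \to n_2x - 1
  && \text{(invert the rule)} \\
n_1x \ge 0 \quad &\Longrightarrow\quad n_2x - 1 \ge 0 \iff n_2x \ge 1
  && \text{(rewrite first predicate)} \\
n_1x > 1 \quad &\Longrightarrow\quad n_2x - 1 > 1 \iff n_2x > 2
  && \text{(rewrite second predicate)} \\
n_2x \to n_1x + 1 \quad &\Longrightarrow\quad n_2x \to (n_2x - 1) + 1
  && \text{(trivial rule, deleted)}
\end{align*}
```

After the final renaming of $n_2x$ to $n_1x$, the state carries a single instance of $x$ with predicates $n_1x \ge 1$ and $n_1x > 2$.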

⁶ In this case we use pre-defined rules expressing associativity and cancellativity of + for simplification. Alternatively, the prover can also be called to check for trivial rules.

In general, when a transition $t$ with an action statement $x = f(n_u y_1, \cdots, x, \cdots, n_u y_n)$ is executed in a global state $g$, first, we formulate the rule $n_{v+1} x \to f(n_u y_1, \cdots, n_v x, \cdots, n_u y_n)$, where $n_v x$ is the latest instance of the variable $x$ in $g$. Second, this rule is transformed into a rule of the form $n_v x \to h(n_u y_1, \cdots, n_{v+1} x, \cdots, n_u y_n)$. Any action $n_{v+1} x = f(n_u y_1, \cdots, n_v x, \cdots, n_u y_n)$ over the theory of integer linear arithmetic can always be automatically transformed by first writing it in the canonical form $n_{v+1} x = a_0 + a_1 n_u y_1 + \cdots + c\, n_v x + \cdots + a_n n_u y_n$, which can be expressed as the predicates $n_v x == n_{v+1} x - a_0 - \cdots - a_n n_u y_n$ and $n_v x \bmod c == 0$, the first of which is added to the rewrite rules of the resulting global state and the second of which is added to the predicates of that state. Finally, we simplify the predicates and the rewrite rules in the resulting global state by eliminating all occurrences of $n_v x$ using the above rule, deleting trivial rules and predicates, and renaming instances to remove redundancies. The above approach naturally extends to handle a list of action statements with multiple side-effects to multiple variables. Please refer to [3] for more details. In the rule set obtained after elimination, each rule is of the form $n_p x_i \to e'_i$, where $n_p x_i$ is the oldest instance of $x_i$ and $e'_i$ is composed of the latest generated instances only. Let $RE(A_t)$ stand for the rule set and $PE(A_t)$ stand for the set of predicates obtained. The predicates $PE(A_t)$ are added to the global state, which is then simplified using the rule set $RE(A_t)$ to eliminate older instances as described above.

Note that global states are not needed to compute the rewrite rule set $RE(A_t)$ and the predicates $PE(A_t)$. In fact, we pre-compute these for each transition by using the latest instances of system variables $n_u y_1, \cdots, n_u y_n$ as parameters. These parameters are then instantiated based on the global state where the corresponding transition is executed.

5. Change Impact Analysis for CEFSMs

A change to a system can either add a new transition or delete an existing one. Replacing a transition is modeled by a sequence of changes: a deletion followed by an addition. The impact of adding a transition $t$ is the set of transitions that can appear with $t$ in any run of the changed system. The impact of deleting a transition $t$ is the set of transitions that can appear with $t$ in the original system. The impact of a sequence of changes is the union of the impacts of the changes in the sequence.

The impact of changing a transition $t$ is computed by performing a bounded state exploration computing the set of postambles and preambles of $t$ containing distinct global states. If two postambles (or preambles) involve equivalent sets of global states, then only one is considered. Let transition $t$: $Q(j, i).hd == m_j(\vec{p}),\ P_t(\vec{x_i}, \vec{p}),\ s_t \mapsto q_t,\ enq(Q(i, l), m_l(\vec{e})),\ A_t(\vec{x_i}, \vec{p})$ be from controller $C_i$, where $\vec{p} = (p_1, \cdots, p_k)$ is a tuple of distinct parameter variables, $\vec{e} = (e_1, \cdots, e_o)$ is a tuple of expressions over parameters and global variables, and the action list $A_t$ is an ordered sequence of assignments $\{x_1 = e_1; \cdots; x_w = e_w\}$. Below, $RE(A_t)$ stands for the set of rewrite rules and $PE(A_t)$ is the set of predicates pre-computed from $t$ as described in Section 4.

The main steps for finding the impact of modifying a transition $t$ are the following.

1. Process the transition $t$: Replace every occurrence of each parameter $p_k$ from $\vec{p}$ by the queue variable expression $Q(j, i).m_j.k$ in $P_t$, $A_t$, $\vec{e}$, $RE(A_t)$, and $PE(A_t)$. Let $nP_t$, $nA_t$, $\vec{ne}$, $nRE(A_t)$, and $nPE(A_t)$ respectively be the results. Further, while computing a pre-image (post-image) below, the processing step is performed relative to a given global state $g'$ ($g$). In these cases, we first replace each occurrence of the queue variables $Q(j, i)$ and $Q(i, l)$, and each occurrence of each global variable $x_i$ in $t$, by their latest variable instances in the global state $g'$ ($g$). The instance variables in the rule set $RE(A_t)$ and the predicates $PE(A_t)$ are also similarly instantiated. The parameters are then eliminated as described above.

Below, we use $n_L x$ to refer to the latest instance of variable $x$ and $n_P x$ to refer to its penultimate (latest minus 1) instance. The instances $n_L x$ and $n_P x$ in a verification condition refer to those in the given global state. The instances $n_L x$ and $n_P x$ in a generated state refer to the latest and penultimate instances of $x$ in the generated state.

2. Compute Postambles of $t$: The set of postambles $PosA(t)$ is iteratively computed in a breadth-first manner using the following two steps. First, we compute the (symbolic) global state $g'$ enabled by $t$, i.e., all global states that can be reached after executing $t$. The verification condition generated is $VC_1: nP_t \wedge not(full(n_L Q(i, l)))$. The first conjunct of $VC_1$ is the condition of transition $t$ (after processing $t$ as described in step 1). The second conjunct states that the latest instance of the output queue $Q(i, l)$ is not full in the global state where $t$ is executed (an unknown state), and this ensures that the output message of $t$, if any, can be enqueued. If $VC_1$ is satisfiable then the global state generated is

$g' = (\langle xs_1, \cdots, q_t, \cdots, xs_n \rangle,\ VC_1 \wedge nPE(A_t),\ nRE(A_t) \cup \{n_L Q(j, i) \to deq(n_P Q(j, i)),\ n_P Q(j, i).hd \to m_j(\vec{xp}),\ n_L Q(i, l) \to enq(n_P Q(i, l), m_l(\vec{ne}))\})$

The first component of the state $g'$ is an $n$-tuple of controller states whose $i$th component is set to the output state of $t$ and all other components are set to distinct, unknown controller states represented by the variables $xs_n$. The second component of $g'$ is a conjunction of the verification condition $VC_1$ and the predicate $nPE(A_t)$ that is generated while eliminating variable instances from action statements as described in Section 4. The last component of the state $g'$ comprises a set of rewrite rules, $nRE(A_t)$, corresponding to the actions of the transition $t$, and additional rules relating queue values before and after the execution of transition $t$. The conjunct $not(full(n_L Q(i, l)))$ is not needed if either $t$ does not have an output message or the queue $Q(i, l)$ is unbounded. The last rewrite rule concerning the output queue $n_L Q(i, l)$ is absent when $t$ does not contain any output message. Note that the variables not appearing in the transition $t$ are all uninstantiated in the state $g'$. These variables may get instantiated by back propagation as successors of the state $g'$ are generated, as explained below.


Post-images for each global state $g$ are repeatedly created with respect to each system transition $t$. Given a state $g = (\langle xs_1, \cdots, xs_n \rangle, pred, R)$ and a transition $t$, we first syntactically check that the input state of $t$, $s_t$, matches the $i$th component of the first element of $g$ and that the head of the queue $n_L Q(j, i)$ in $g$⁷ matches the message $m_j$ along with its parameters. If so, we generate the verification condition $VC_2: pred \wedge nP_t \wedge not(full(n_L Q(i, l)))$. The verification condition $VC_2$ states that the predicates in state $g$ and the condition of the transition $t$ must be satisfiable and that the latest instance of the output queue must not be full in the state $g$. The condition $VC_2$ ensures that the transition $t$ is enabled in the state $g$ and that $t$ will execute its output actions and end in a global state. If $VC_2$ is satisfiable then the post-image generated is

$g' = (\langle xs_1, \cdots, q_t, \cdots, xs_n \rangle,\ VC_2 \wedge nPE(A_t),\ nRE(A_t) \cup \{n_L Q(j, i) \to deq(n_P Q(j, i)),\ n_P Q(j, i).hd \to m_j(\vec{xp}),\ n_L Q(i, l) \to enq(n_P Q(i, l), m_l(\vec{ne}))\} \cup R)$.

The $i$th component in the first element of $g'$ is set to the output state of the transition $t$. The other components in the $n$-tuple are copied over from the state $g$. The second element of the state is the conjunction of the verification condition $VC_2$ and the predicate $nPE(A_t)$ generated while performing elimination of instances from the actions $A_t$ of transition $t$. The last element of $g'$ comprises the set of rewrite rules $nRE(A_t)$ derived from the actions $A_t$ and 3 additional rules relating the values of the input and output queues in the state $g'$ to their previous values in the state $g$. The rules $R$ in the state $g$ are also copied over.

Backpropagation: Certain information from the post-image $g'$ is propagated back to the global state $g$ by updating its components. If the $i$th component of the first element of $g$ is uninstantiated then it is set to be the input state of the transition $t$. Other components in the first element of $g$ are unaffected. The second component of $g$ is set to be the second component of the state $g'$ and finally, all the rewrite rules from $g'$ whose left-hand sides involve the input and output queue values (and other aggregate variables) are added to the rules in $g$. Each post-image $g'$ generated from the given state $g$ creates a new state $g_i$ obtained from $g$ by propagation. The information is propagated to the predecessors of $g$ by creating copies of the paths reaching $g$ with new, distinct global states containing the updated information. These paths are extended to reach the new state $g_i$, and the state $g_i$ reaches the post-image $g'$ using the transition $t$. Once the post-images of $g$ with respect to all the transitions have been considered, the state $g$ is removed. Then, the computation is continued from the instantiated states $g_i$.

⁷ Note that the instances are simplified by rewriting before matching. Also, an uninstantiated variable in a global state matches any arbitrary value.

The postambles $PosA(t)$ are computed iteratively in a breadth-first manner by starting with the state enabled by $t$. At each step, the post-image is generated and its feasibility is checked. If the post-image is feasible then a postamble prefix is constructed with that post-image and, after performing back propagation, the process is continued with every distinct post-image generated. Since we are interested in computing the impact of changes to $t$, the process terminates if all transitions have appeared in $PosA(t)$, implying that changes to $t$ impact the entire system. In all other cases, the process continues until every postamble prefix ends in an initial global state, no new post-images are generated, or the specified depth is reached.
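The control flow of this bounded breadth-first exploration, including its three stopping conditions, can be sketched as follows. This is a simplified illustration under several assumptions: global states are opaque hashable values, the symbolic post-image computation and the prover call are hidden behind a caller-supplied function `post_image` (returning `None` when the verification condition is unsatisfiable), back propagation is omitted, and all names are hypothetical.

```python
from collections import deque


def bounded_postamble_impact(start_state, transitions, post_image, is_initial, bound):
    """Collect transitions appearing on postamble prefixes of bounded depth.

    start_state -- the state enabled by the changed transition
    transitions -- iterable of transition identifiers
    post_image  -- post_image(state, t) -> successor state, or None if infeasible
    is_initial  -- predicate recognising initial global states
    bound       -- maximum exploration depth
    """
    all_transitions = set(transitions)
    impacted = set()
    seen = {start_state}
    frontier = deque([(start_state, 0)])
    while frontier:
        state, depth = frontier.popleft()
        if depth == bound or is_initial(state):
            continue                     # prefix is complete or depth is exhausted
        for t in all_transitions:
            successor = post_image(state, t)
            if successor is None:
                continue                 # t is not enabled in this state
            impacted.add(t)
            if impacted == all_transitions:
                return impacted          # the change impacts the entire system
            if successor not in seen:    # only distinct post-images are explored
                seen.add(successor)
                frontier.append((successor, depth + 1))
    return impacted


# Toy system with integer states: b: 1 -> 2, c: 2 -> 0, d: 5 -> 6; 0 is initial.
steps = {(1, "b"): 2, (2, "c"): 0, (5, "d"): 6}
toy_post = lambda state, t: steps.get((state, t))
toy_impact = bounded_postamble_impact(1, ["b", "c", "d"], toy_post, lambda s: s == 0, 10)
```

In the toy system the exploration from state 1 reaches the initial state 0 after `b` and `c`, so `toy_impact` contains those two transitions and excludes `d`, which is never enabled on any postamble. The changed transition itself would be added to the result by the caller.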

3. Compute Preambles of $t$: The set of preambles of $t$, $PreA(t)$, is iteratively computed in a breadth-first manner using the following two steps. First, we compute the (symbolic) global state $g$ enabling $t$, i.e., all global states in which $t$ can be executed. The verification condition $VC_3: nP_t$ is generated to ensure that $t$ is enabled in some global state. If $VC_3$ is satisfiable then the global state generated is

$g = (\langle xs_1, \cdots, s_t, \cdots, xs_n \rangle,\ VC_3 \wedge not(full(n_0 Q(i, l))),\ \{n_0 Q(j, i).hd \to m_j(\vec{xp})\})$

The ith component of the first element of the state g is set to the input state of transition t. The second element is set to the conjunction of VC3 and the condition that the output queue is not full. The second conjunct is dropped if t does not have an output message. The last element of g contains the rewrite rule that equates the head of the input queue to the input message of t.

To generate the ordered set of rewrite rules R(At) from a given ordered sequence of actions At = {x1 = ne1, · · ·, xw = new} for preambles, we replace the last definition of xw by its latest instance nLxw in the given state and replace the occurrences of xw in new and in other preceding right-hand sides by the previous instance nP xw until the preceding definition of xw is found, and so on. Let nR(At) stand for the processed set of rules. The processed predicate nPt for preambles is generated by replacing each global variable x by its earliest instance in nR(At). If a variable x does not appear in the actions of nR(At) then the latest instance nLx is used. A queue variable Q in Pt of a transition t is replaced by its previous instance nP Q.

Next, pre-images for each global state g′ are repeatedly created with respect to each transition t. Given a state g′ = (〈xs1, · · ·, xsn〉, pred′, R′) and a transition t, we first syntactically check that the output state of t, qt, matches the ith component of the first element of the state g′ and that the tail of the output queue, nLQ(i, l), matches the message ml along with its o parameters. If so, we generate the verification condition

VC4: nPt ∧ Pr ∧ not(full(nLQ(j, i))) ∧ ⋀_{z=1}^{o} nLQ(i, l).tl.m.z == nez,

where the first conjunct, nPt, checks that the condition of t is satisfied in the global state g enabling t. The second conjunct, Pr, gives the condition on queue and global variables that must be satisfied in the state g so that the predicate pred′ holds in the state g′ after performing the actions nAt of the transition. The third conjunct in VC4 checks that the input queue nLQ(j, i) in the state g′ is not full. Finally, the last conjunct ensures that the parameter expressions neo's (after processing as explained earlier) in the output message of t and the corresponding ones in the queue in the state g′ are consistent. The conjunct Pr is obtained by simplifying the predicate pred′ in g′ using the rewrite system nR(At). The last conjunct in VC4 is absent if the transition t does not have any output message. If VC4 is satisfiable then we generate the pre-image

g = (〈 · · ·, st, · · ·〉, VC4 ∧ not(full(nP Q(i, l))), {nP Q(j, i).hd → mj(−→xp), nLQ(i, l) → enq(nP Q(i, l), ml(−→ne)), nLQ(j, i) → deq(nP Q(j, i))} ∪ nR(At))

Above, in the first element of g, the ith component is the input state st of the transition. All the other components are copied over from the corresponding components of the first element of the state g′. The second element includes the verification condition VC4 along with the condition that the previous value of the output queue (before executing t) is not full, so that the output message of t can be enqueued. The last element of g includes the rule set R(At) derived from the actions of t along with three other rules relating the current and previous values of the input and the output queues. The first rule states that the head of the previous value of the input queue has the input message of the transition t. The second rule states that the current value of the output queue is obtained by enqueueing the output message of t to the previous value of this queue, and the last rule states that the current value of the input queue is obtained by dequeuing its previous value.
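A concrete reading of the three queue rules may help. The sketch below assumes queues are bounded tuples of concrete messages, whereas the paper works with symbolic queue values and rewrite rules; the helper `fire` and the sample messages are hypothetical.

```python
def enq(queue, message):
    """Queue obtained by adding a message at the tail."""
    return queue + (message,)

def deq(queue):
    """Queue obtained by removing the head."""
    return queue[1:]

def hd(queue):
    """Message at the head of a non-empty queue."""
    return queue[0]

def full(queue, capacity):
    return len(queue) >= capacity

def fire(prev_in, prev_out, in_msg, out_msg, capacity):
    """Relate previous and latest queue values across one transition.

    Returns (latest_in, latest_out), or None if the head of the previous
    input queue is not in_msg or the previous output queue is full.
    """
    if not prev_in or hd(prev_in) != in_msg:
        return None
    if full(prev_out, capacity):
        return None
    return deq(prev_in), enq(prev_out, out_msg)

prev_in = (("req", 1), ("req", 2))
prev_out = (("ack", 0),)
result = fire(prev_in, prev_out, ("req", 1), ("ack", 1), capacity=2)
blocked = fire(prev_in, (("ack", 0), ("ack", 9)), ("req", 1), ("ack", 1), capacity=2)
```

Here `result` pairs the dequeued input queue with the enqueued output queue, and `blocked` is `None` because the previous output queue is already full.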

Each pre-image computation is followed by a back propagation step to instantiate variables, as done for post-images. More details are available in [3]. The preambles, PreA(t), are computed iteratively by starting with the state enabling t and continuing with every distinct pre-image generated. The process stops if all the transitions have been found to be impacted. Otherwise, the process is continued until every preamble prefix ends in an initial global state, no new pre-images are generated, or the specified depth is reached.

4. Combine Pre-ambles and Post-ambles: Since we perform state exploration from the point of change, and only up to a certain depth, the transitions appearing in the paths provide a conservative over-approximation of the impacted transitions. To provide a more accurate set of impacted transitions, we check, for each start state gs of each postamble in the set PosA(t), whether there is a preamble in PreA(t) ending in a state gl such that gs is a post-image of gl with respect to t. Only the transitions appearing in such paths are reported as being impacted by the change.
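The combination step can be pictured with a small sketch. It assumes each preamble and postamble is recorded as a dictionary holding its transitions and its end or start state, and that a hypothetical predicate `is_post_image(gl, gs)` answers whether gs is a post-image of gl with respect to t; in the paper this check is discharged with the prover.

```python
def combine(preambles, postambles, is_post_image):
    """Report only the transitions on preamble/postamble pairs that connect."""
    impacted = set()
    for post in postambles:
        for pre in preambles:
            if is_post_image(pre["end"], post["start"]):
                impacted.update(pre["transitions"], post["transitions"])
    return impacted

# Hypothetical data: only the pair (gA, gA') is linked by the changed transition t.
preambles = [
    {"transitions": ["t1", "t2"], "end": "gA"},
    {"transitions": ["t5"], "end": "gB"},
]
postambles = [
    {"start": "gA'", "transitions": ["t3"]},
    {"start": "gZ", "transitions": ["t4"]},
]
linked = {("gA", "gA'")}

impacted = combine(preambles, postambles, lambda gl, gs: (gl, gs) in linked)
```

Transitions `t4` and `t5` appear on explored paths but are dropped, because their paths cannot be joined through t.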

6. Pre-processing to Compute Change Impacts

Computing impacts starting from a changed transition reduces state exploration efforts in many cases in practice, especially when the transition appears only in certain limited scenarios of the system. However, invoking the prover for each CEFSM transition at each global state generated for impact analysis can get expensive due to the large number of global states.

We can reduce the prover calls by conservatively identifying the transitions that are disabled by a transition in every global state of the system. Whenever a transition is enabled in a global state, all the transitions disabled by it are ignored in extending that state.

A transition ti disables a transition tj if either i) their input states are different, or ii) their input messages come from the same CEFSM, or iii) their conditions Pi and Pj can never be satisfied together. The first two conditions are straightforward to check. For the last condition, we check that Pi ∧ Pj is unsatisfiable and Pi ∨ Pj is valid, i.e., they are satisfied together only in a global state with a trivial formula (footnote 8). Below, the formula φ denotes these two conditions. Note that disables is a symmetric relation on transitions.

for each transition t: m(p1, · · ·, pk), s1, c ↦ s2, A in Ti
    for each transition t′: m′(p′1, · · ·, p′k), s′1, c′ ↦ s′2, A′ in Tj, j ≠ i
        if ((s1 ≠ s′1) ∨ ((p1 = p′1) ∧ (p2 = p′2)) ∨ φ)
            Add t′ to DisableSet(t) and add t to DisableSet(t′)

6.1. Identifying Enabled Transition Blocks

Often, the above simple disable check does not include transitions coming from different controllers. One possibility for handling such transitions is to determine if they are always enabled together in all states, in which case, based on a single verification condition, we can simultaneously generate multiple global states. This is like simultaneously extending a global state by multiple arcs labeled by these transitions.

8 We can make this check more accurate for non-convex theories like integers by checking that no convex formula can satisfy both c and c′. However, this simple check is also quite effective, as illustrated by our experiments.

We use a simple heuristic and identify two transitions ti and tj to be likely simultaneously enabled if their disable sets are identical. Suppose to the contrary that their disable sets differ, say, by a transition tk. Without loss of generality, let tk belong to the disable set of ti but not to that of tj. Since every transition must be enabled in some global state, there must be a global state g that enables tk. Such a state cannot enable ti, which disables tk, but it may enable tj. Hence ti and tj are not simultaneously enabled in all states. As an example, let the condition of ti be x == 0, the condition of tk be x ≠ 0, and the condition of tj be x ≤ 0. It can be verified that ti disables tk but tj does not disable tk. A global state g with x < 0 enables tj and tk but not ti.
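The example can be checked mechanically. The sketch below is an approximation under a stated assumption: instead of calling a theorem prover, it tests the two parts of φ by enumerating a small, hypothetical sample of integer values for x. It covers only the third disable condition (the conditions on input states and message sources are ignored).

```python
from itertools import combinations

# Hypothetical finite sample of values for x; a prover reasons over all integers.
DOMAIN = range(-3, 4)

conditions = {
    "ti": lambda x: x == 0,
    "tk": lambda x: x != 0,
    "tj": lambda x: x <= 0,
}

def phi(p, q, domain=DOMAIN):
    """p AND q is never satisfied, and p OR q always holds, on the sample."""
    never_together = not any(p(x) and q(x) for x in domain)
    always_one = all(p(x) or q(x) for x in domain)
    return never_together and always_one

def disable_sets(conds):
    """Symmetric disable sets induced by phi alone."""
    ds = {name: set() for name in conds}
    for a, b in combinations(conds, 2):
        if phi(conds[a], conds[b]):
            ds[a].add(b)
            ds[b].add(a)
    return ds

ds = disable_sets(conditions)
likely_together = ds["ti"] == ds["tj"]

# Sample values of x that enable tj and tk but not ti.
witness = [x for x in DOMAIN
           if conditions["tj"](x) and conditions["tk"](x) and not conditions["ti"](x)]
```

The disable sets of ti and tj differ, so the heuristic does not group them, and every negative value in the sample is a witness state of the kind described in the text.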

We extend the likely simultaneously enabled relation to associate a transition block with each transition. The transition block of a transition ti, Bl(ti) = {U1, · · ·, Ul}, is a set of sets of transitions Ul that are likely simultaneously enabled with ti. Each set Ul is maximally enabled, i.e., no transition in a set Ul belongs to the disable set of any other transition of Ul, and every system transition to that does not belong to Ul belongs to the disable set of some transition tl of Ul. The transition block Bl(ti) is computed using the disable sets of transitions as follows.

Let Bl(ti) = {{ti}}
for each transition t in (T − DS(ti)) do
    for each Ul in Bl(ti) do
        if t ∉ DisableSet(tl), for any tl ∈ Ul
            Bl(ti) = (Bl(ti) − {Ul}) ∪ {Ul ∪ {t}}
        else
            Bl(ti) = Bl(ti) ∪ {(Ul − {tl | t ∈ DisableSet(tl)}) ∪ {t}}
Discard the Ul's that are subsets of other sets in Bl(ti).

We represent each Bl(ti) = 〈Ki, {E(i, 1), · · ·, E(i, l)}〉 in a factored form as a pair, where Ki, called the kernel of Bl(ti), represents the set of transitions common to all sets of transitions in a transition block, and the E(i, l)'s are disjoint sets of transitions, called the extensions to the kernel. A given block Bl(ti) can be translated into its factored form by simply intersecting the sets Ul and removing the set of common transitions from each Ul. Note that for each block Bl(ti), the kernel Ki is non-empty and must contain at least the transition ti.

As an example, given T = {t1, t2, t3, t4, t5} and DS(t1) = DS(t4) = {}, DS(t2) = {t3, t5}, DS(t3) = {t2}, and DS(t5) = {t2}, we have Bl(t1) = 〈{t1, t4}, {{t2}, {t3, t5}}〉, Bl(t2) = 〈{t1, t2, t4}, {{}}〉, and so on.
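The block computation and its factored form can be reproduced on this example. The following is one possible reading of the procedure, not the authors' implementation: it rebuilds the list of sets after each transition instead of updating Bl(ti) in place, and the function names are invented for the sketch.

```python
def transition_block(ti, transitions, disable):
    """Maximal sets of transitions likely to be enabled together with ti."""
    blocks = [frozenset([ti])]
    for t in transitions:
        if t == ti or t in disable[ti]:
            continue
        updated = []
        for u in blocks:
            clashing = {tl for tl in u if t in disable[tl]}
            if not clashing:
                updated.append(u | {t})               # t joins the set
            else:
                updated.append(u)                     # keep the old set
                updated.append((u - clashing) | {t})  # variant without the clashes
        unique = set(updated)
        # Discard sets that are proper subsets of another set.
        blocks = [u for u in unique if not any(u < v for v in unique)]
    return set(blocks)

def factor(block):
    """Factored form: (kernel, extensions)."""
    kernel = frozenset.intersection(*block)
    extensions = {u - kernel for u in block}
    return kernel, extensions

T = ["t1", "t2", "t3", "t4", "t5"]
DS = {"t1": set(), "t2": {"t3", "t5"}, "t3": {"t2"}, "t4": set(), "t5": {"t2"}}

kernel1, ext1 = factor(transition_block("t1", T, DS))
kernel2, ext2 = factor(transition_block("t2", T, DS))
```

On the data above, the sketch yields the kernels and extensions stated in the text for Bl(t1) and Bl(t2).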

The transition blocks are used to generate next states from a global state g as described by the procedure in Figure 1. The procedure first checks whether the kernel of each unmarked block is satisfiable with the given global state g by generating a single verification condition comprising the conditions governing all of the transitions in the kernel and the predicates in g. If so, it simultaneously generates all the global states produced from the transitions in the kernel and also propagates the satisfiable result to the other blocks. The propagation marks the entire block of a transition ti that either appears in the kernel or is disabled by such a transition. It also removes these transitions from the kernel and extensions of the other blocks. Then, each extension in the block is checked for satisfiability and processed in a similar fashion.

If the kernel of a block Bl(ti) is unsatisfiable then we propagate the unsatisfiability result. Kernels and extensions of other blocks that subsume the unsatisfiable kernel are marked unsatisfiable in this case. Then, we check if ti can be satisfied by g and, if so, check each remaining transition in the kernel. If not, the entire block is discarded after propagating the unsatisfiability information. The unsatisfiable result for an extension is similarly propagated. In this case also, we discard the entire block of a transition ti if the extension is the singleton {ti}.

Inputs: A set of blocks and a global state g.
Outputs: A set of global states
Main:
    foreach unmarked block Bl(ti) = 〈Ki, {E(i, 1), · · ·, E(i, l)}〉:
        if (ProcessithKernel(g, Ki, ti)) == Ok then
            foreach E(i, l) in Bl(ti): ProcessithExts(g, E(i, l))

ProcessithKernel(g, K, ti):
    if Satisfiable(g, K) then
        PropagateSat(K); GenSetStates(g, K); return Ok
    else /* Kernel K is unsatisfiable */
        PropagateNoSat(K);
        if (ProcessTransition(g, ti)) == Ok then
            foreach tj in (K − {ti}): ProcessTransition(g, tj)
        else /* ti itself is not enabled */
            PropagateNoSat({ti}); return Fail /* discard entire ith block */

ProcessithExts(g, E):
    if Satisfiable(g ∧ E) then
        PropagateSat(E); GenSetStates(g, E);
    else /* Extension E is unsatisfiable */
        PropagateNoSat(E);
        foreach tj in E: ProcessTransition(g, tj)

ProcessTransition(g, t):
    if Satisfiable(g, t) then
        PropagateSat({t}); GenState(g, t); return Ok
    else
        PropagateNoSat({t}); return Fail

PropagateSat(TS):
    DS(TS) = ⋃_{t ∈ TS} DisableSet(t)
    MarkT = TS ∪ DS(TS) /* Processed/disabled transitions */
    foreach unmarked block Bl(ti) = 〈Ki, {E(i, 1), · · ·, E(i, l)}〉 do
        if ti ∈ MarkT then Mark Bl(ti); /* discard */
        else
            /* mark subsumed kernels and extensions */
            if Ki ⊆ TS then Mark Ki Satisfiable;
            foreach E(i, l) in Bl(ti) do
                if E(i, l) ⊆ TS then Mark E(i, l) Satisfiable
            /* Remove processed/disabled transitions */
            Ki = Ki − MarkT;
            foreach E(i, l) in Bl(ti) do
                E(i, l) = E(i, l) − MarkT;

PropagateNoSat(TS):
    foreach unmarked block Bl(ti) = 〈Ki, {E(i, 1), · · ·, E(i, l)}〉 do
        if {ti} == TS then Mark Bl(ti); /* discard */
        else /* mark subsuming kernels and extensions */
            if Ki ⊇ TS then Mark Ki Unsatisfiable
            foreach E(i, l) in Bl(ti) do
                if E(i, l) ⊇ TS then Mark E(i, l) Unsatisfiable

Figure 1. Generating States Using Blocks

Note that the block information in the above procedure is used only to avoid prover calls. The match for the controller state and the messages must still be done syntactically for each satisfiable block in each global state. It is also important to note that pre-processing cannot reduce the number of global states explored. So, we can use the disable information to avoid transitions only when the relevant parts of the global state are instantiated. For instance, if the global state has the predicate x == −1 and enables a transition t1 with condition x < 0 then we do not have to consider a transition t2 with condition x >= 0 in that state. However, if no information is known about x in the global state then we need to consider both. This is done automatically when we do not use pre-processing, since we compute preambles and postambles in a breadth-first manner and create a new state whenever the information is back propagated as described in Section 5. Note that a new state will be created for each of the conditions in the above example.

7. Preliminary Experiments

The proposed procedure was applied to several examples including a simple web service (Ws1), webATM (Ws2), and a third-party call protocol (Ws3) from [9, 15, 16], as well as three cache coherence protocols: one with no global variables (CC1), one with the number of processors as a parameter (CCP), and one with four replicated controllers (CCM). Our initial results were obtained on an IBM Linux machine with 512MB of memory, using an implementation in Perl.

The experimental methodology used is as follows. First, each transition was changed and the impact of the change was computed by bounded state exploration for depths 1-16. For each depth and each change, we studied the cost of computing the impact of the change in terms of four metrics: i) the time taken, ii) the number of new variables introduced and eliminated, iii) the number of states explored, and iv) the number of prover calls. We first pre-processed and annotated each transition with i) the rewrite rule set RE(t), ii) the predicates PE(t), and iii) the disable set and transition block. Then, the experiments were repeated without pre-processing and the same data was collected.

Cumulative Change Effort: Our results for the time and number of new variables are given in Figure 2. Column Example gives the case study along with the number of transitions. Column Depth shows the level at which the exploration was cut off. Columns Tp and Tnp give the time in seconds with and without pre-processing, respectively. Columns Nvp and Nvnp give the new global variables created with and without pre-processing, respectively. The last column, Elim, shows the number of variables eliminated with pre-processing. The results are shown in most cases for two representative depths, 2 and 8. For CCM we only considered up to depth 4 for cumulative changes. Data for the other depths are similar. The costs of impact analysis are aggregates over changing all the transitions.

The results show that pre-processing reduces the time taken in all the cases reported in Figure 2. Similarly, the number of new variables generated is reduced or unchanged in all the cases with pre-processing, simply because we explore fewer states. Examples Ws3 and CC1 do not have any global variables and hence no new variables are created for these. The number of new variables created apparently increases with depth since our current implementation first creates new variables and then performs elimination instead of interleaving them at each step. In all the examples, almost all of the new variables are eliminated except for those created by transitions assigning initial values. Hence elimination does not depend on the depth explored in most cases. The only exception is web service Ws2, where a new variable is created at each depth based on the ATM operation (deposit, withdraw, balance, invalid) that the user chooses.

Figure 3 depicts the results for the number of prover calls for cumulatively changing all the transitions. The results are shown for the maximum depth values. Pre-processing reduces the number of calls in all cases. Note, however, that pre-processing does not affect the number of global states explored.

Example   Depth   Tp (secs)   Tnp (secs)   Nvp    Nvnp   Elim
Ws1(4)    2       0.17        0.18         9      12     6
Ws1(4)    8       0.35        0.42         24     96     21
Ws2(13)   2       0.98        1.34         24     24     19
Ws2(13)   8       34.92       36.27        968    1428   957
Ws3(15)   2       1.23        1.84         0      0      0
Ws3(15)   8       3.32        3.88         0      0      0
CC1(24)   2       8.79        38.73        0      0      0
CC1(24)   8       623.46      714.05       0      0      0
CCP(36)   2       13.33       129.62       121    245    118
CCP(36)   8       638.48      771.62       259    2151   256
CCM(64)   2       62.75       2078.26      176    886    173
CCM(64)   4       723.16      2662.66      1599   9739   1596

Figure 2. Cumulative Change Impact Costs
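To make the timing columns easier to compare, the ratio Tnp/Tp can be computed directly from the rows of Figure 2. The snippet below is only a reading aid; the ratio is a derived quantity that the paper does not report, and the variable names are chosen for the example.

```python
# (example, depth, Tp in seconds, Tnp in seconds), copied from Figure 2.
ROWS = [
    ("Ws1", 2, 0.17, 0.18), ("Ws1", 8, 0.35, 0.42),
    ("Ws2", 2, 0.98, 1.34), ("Ws2", 8, 34.92, 36.27),
    ("Ws3", 2, 1.23, 1.84), ("Ws3", 8, 3.32, 3.88),
    ("CC1", 2, 8.79, 38.73), ("CC1", 8, 623.46, 714.05),
    ("CCP", 2, 13.33, 129.62), ("CCP", 8, 638.48, 771.62),
    ("CCM", 2, 62.75, 2078.26), ("CCM", 4, 723.16, 2662.66),
]

# Ratio of time without pre-processing to time with pre-processing.
speedup = {(name, depth): tnp / tp for name, depth, tp, tnp in ROWS}

smallest = min(speedup, key=speedup.get)
largest = max(speedup, key=speedup.get)

for (name, depth), ratio in sorted(speedup.items()):
    print(f"{name} depth {depth}: {ratio:.2f}x")
```

Every ratio exceeds 1. The largest is for CCM at depth 2 (about 33x) and the smallest is for Ws2 at depth 8.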


Figure 3. Number of prover calls vs. cumulative changes.

Change Effort Distribution: We also studied how the change impact analysis costs are distributed over the transitions within each of our examples. We changed each transition and computed the four metrics with and without pre-processing for depths ranging from 1-8. Figure 4 depicts our results for the CCP example for the number of prover calls at depth 4. The impact analysis effort for transitions 1-20 is relatively low since these belong to the request controller and are involved only in either read or write transactions but not both. The transitions 21-26 from the reply cache controller incur the most cost since they interact with both the memory and request controllers in almost all transactions. Transitions 27-36, which belong to the memory controller, interact more than those in the request controller but not as much as those in the reply cache controller, and hence incur an intermediate cost. The same pattern can be seen with pre-processing, but with fewer calls.

[Figure 4 plot: number of prover calls (vertical axis, 0 to 1200) for each changed transition, with and without pre-processing.]

Figure 4. Calls vs. individual transition changes in CCP.

8. Conclusions

An approach for change impact analysis of communication systems modeled using a network of CEFSMs with parameters is proposed. Changes are performed at the transition level by adding/deleting transitions. The impact of a change is the set of transitions that can appear in a system run with the modified transition. A bounded state exploration method to automatically compute the impacts of changes is described. The use of rewrite rules to compactly represent multiple symbolic values appearing in global states of such systems is proposed. It is shown how rewriting, along with decision procedures implemented in a well-known theorem prover, can be used to discharge the generated verification conditions with minimal overhead to the prover. We described how actions in transitions can be pre-processed into rewrite rules, which can then be used to reduce the number of symbolic values maintained in each global state. We also described heuristics to determine transitions that can never be executed together and those that are likely to be executed together. Procedures based on these heuristics are developed to reduce the number of verification conditions to be discharged by the prover. The effectiveness of the proposed approach is shown by applying it to recent applications of CEFSMs to model web services as well as to parameterized cache coherence protocols.

References

[1] F. Baader and T. Nipkow, Term Rewriting and All That. Cambridge University Press, 1998.

[2] B. Guo, M. Subramaniam, "Selective State Exploration of Communicating Extended Finite State Machines Using a Theorem Prover", Work-in-progress paper, In Testing of Communicating Systems, TestCom-20, 2008.

[3] M. Subramaniam and B. Guo, "A Rewrite-based Approach for Change Impact Analysis of Communicating Systems", CS Technical Report, University of Nebraska, June 2008.

[4] C. Bourhfir, E. Aboulhamid, F. Khendek, R. Dssouli, "Test Case Selection from SDL Specifications", In Computer Networks, 35, 2001.

[5] D. Brand and P. Zafiropulo, "On Communicating Finite State Machines", JACM, 30(2), 1983.

[6] D. Detlefs, G. Nelson, J. B. Saxe, "Simplify: A Theorem Prover for Program Checking", In Journal of the ACM, 52(3), 2005.

[7] T. A. Henzinger, R. Jhala, R. Majumdar, and G. Sutre, "Software Verification with BLAST", In Proc. of the 10th SPIN Workshop on Model Checking Software (SPIN), Lecture Notes in Computer Science 2648, 2003.

[8] J. C. King, "Symbolic Execution and Program Testing", In CACM, 19(7), 1976.

[9] C. Keum, S. Kang, I. Ko, J. Baik, "Generating Test Cases for Web Services Using Extended Finite State Machine", In Testing of Communicating Systems, TestCom-18, LNCS 3964, 2006.

[10] D. Lee and M. Yannakakis, "Principles and Methods of Testing Finite State Machines – A Survey", Proceedings of the IEEE, 84(8), 1996.

[11] X. Ren, F. Shah, F. Tip, B. G. Ryder, and O. Chesley, "Chianti: A Tool for Change Impact Analyses of Java Programs", In Proc. of Conference on Object-oriented Programming Systems, Languages and Applications (OOPSLA-04), 2004.

[12] A. Orso, T. Apiwattanapong, and M. J. Harrold, "Leveraging Field Data for Impact Analyses and Regression Testing", In Proc. ACM Symp. on Foundations of Software Engineering (FSE), 2003.

[13] M. Subramaniam and P. Chundi, "Preserving Consistency and Executability of Protocols across Updates", In Proc. Sixth Intl. Conference on Formal Engineering Methods (ICFEM), LNCS, 2004.

[14] M. Subramaniam, H. Siy, "Consistently Incorporating Changes to Evolve Transition-based Systems", In 11th European Conference on Software Maintenance and Reengineering (CSMR), 2007.

[15] H. Ural and Z. Xu, "An EFSM-based Passive Fault Detection Approach", In Testing of Communicating Systems, TestCom-19, LNCS 4581, 2007.

[16] A. Benharref, R. Dssouli, S. Adel, A. En-nouaary, and R. Glitho, "New Approach for EFSM-Based Passive Testing of Web Services", In Testing of Communicating Systems, TestCom-19, LNCS 4581, 2007.
