
A methodology for evaluating test coverage criteria of high levelPetri nets



A Methodology for Evaluating Test Coverage Criteria of High Level Petri Nets

Junhua Ding (a), Peter J. Clarke (b), Gonzalo Argote-Garcia (b), Xudong He (b)

(a) Department of Computer Science, East Carolina University, Greenville, NC 27858, USA
(b) School of Computing and Information Sciences, Florida International University, Miami, FL 33173, USA

Abstract

High level Petri nets have been extensively used for modeling concurrent systems; however, their strong expressive power reduces their ability to be easily analyzed. Currently there are few effective formal analysis techniques to support the validation of high level Petri nets. The executable nature of high level Petri nets means that during validation they can be analyzed using test criteria defined on the net model. Recently, theoretical test adequacy coverage criteria for concurrent systems using high level Petri nets have been proposed. However, determining the effectiveness of these test adequacy criteria has not yet been undertaken. In this paper, we present an approach for evaluating the proposed test adequacy criteria for high level Petri nets through experimentation. In our experiments we use the simulation functionality of the model checker SPIN to analyze various test coverage criteria on high level Petri nets.

Key words: Predicate/Transition nets, Software Testing, Test Adequacy Criteria, Model Checker SPIN

1. Introduction

Software systems have increasingly become more complex, in particular safety and mission critical systems. How to ensure the dependability of complex software systems is a grand research challenge. Modeling, especially based on a well-defined formal method, plays an important and critical role in the development of large complex systems. High level Petri nets [14, 30] are a formal computation model well suited for concurrent and distributed systems, and have been extensively applied to system modeling in almost every branch of computer science, as well as in many other scientific and engineering disciplines. The benefits of using high level Petri nets as a modeling method are particularly significant due to the nature of today's software systems that operate concurrently in distributed environments. The expressive nature of high level Petri nets provides a means to model functionality, data, behavior, and structure. However, this expressiveness results in net models that are difficult to analyze during validation. Despite various attempts to extend traditional Petri net reachability tree analysis techniques and adapt model checking techniques to high level Petri nets, there are no effective formal analysis techniques that support the validation of high level Petri nets. Fortunately, high level Petri nets are executable models; therefore, analysis techniques used during the validation of programs may also be conceptually applicable to high level Petri nets.

In the past few years, testing theories and methods for validating concurrent system models represented as high level Petri nets have been proposed. Zhu and He [29] formally defined test coverage criteria on Predicate/Transition nets, a type of high level Petri net. However, these testing theories and methods have not been validated for their practical use due to a lack of suitable tool support. Validating these testing theories and methods requires a tool that (1) supports the translation of a high level Petri net into a model that the tool understands, and (2) facilitates recording the dynamic behavior of the model. Recently, we explored the simulation capability of the well-known model checker SPIN [17], and found that it provides the facilities necessary to validate the theories and methods proposed by Zhu and He [29].

Email addresses: [email protected] (Junhua Ding), [email protected] (Peter J. Clarke), [email protected] (Gonzalo Argote-Garcia), [email protected] (Xudong He)

In this paper, we present our results on how to use the simulation capability of the model checker SPIN to validate the test adequacy criteria for concurrent systems modeled as high level Petri nets. SPIN is used to control the execution of a high level Petri net, expressed as a Promela process with a global state [17], and a monitor (another Promela process) used to record and evaluate the events generated by the net processes. Using the events recorded by the monitor we are able to determine the adequacy of test coverage criteria defined by Zhu and He [29]. The contributions of this paper are as follows:

1. A framework for systematically evaluating the adequacy of test coverage criteria of high level Petri nets. The framework provides an executable platform to execute the net models and record test coverage information. The platform includes a run-time Monitor that determines the test coverage information on the models being executed.

2. A discussion on the conditions and solutions for detecting deadlock, livelock, and infinite loops when testing concurrent systems represented as high level Petri nets.

3. A detailed description of our simulation-based approach for evaluating the adequacy of different test coverage criteria on two examples. These examples include: (1) the dining philosophers problem (the example that threads the paper), and (2) the alternating bit protocol (ABP), a more detailed case study. The limitations of the simulation-based approach are also presented based on the evaluation data.

Preprint submitted to Information and Software Technology, June 15, 2009

In the next section we present the concepts essential to understanding high level Petri nets, the testing of these nets, and the simulation capability of the model checker SPIN. Section 3 describes how we use the model checker SPIN to simulate the behavior of a high level Petri net, record net events during the simulation, and evaluate the adequacy of test coverage criteria. Section 4 presents a case study that is used to evaluate our approach. In Section 5 we discuss the important issues related to the simulation-based evaluation of the test adequacy criteria for high level Petri nets. Section 6 describes related work, followed by concluding remarks in Section 7.

2. Background

Predicate/Transition nets (PrT nets) are a class of high level Petri nets, and can be used to define other high level Petri nets. We choose PrT nets in this paper because the formal definitions of PrT nets are very close to the newly proposed international standard on high level Petri nets [14], and because the testing theory and methods for concurrent systems proposed in [29] are based on PrT nets.

2.1. Predicate/Transition Nets

A PrT net consists of: (1) a finite net structure (P, T, F), (2) an algebraic specification SPEC, and (3) a net inscription (ϕ, L, R, M0) [14, pp. 459-476]. P and T are the sets of predicates and transitions, respectively, where P ∩ T = ∅. F is the flow relation, where F ⊆ (P × T) ∪ (T × P). SPEC is a meta-language used to define the tokens, labels, and constraints of a PrT net. The underlying specification SPEC = (S, OP, Eq) consists of a signature Σ = (S, OP) and a set Eq of Σ-equations. S is a set of sorts and OP is a family of sorted operations. Tokens of a PrT net are ground terms of the signature Σ, denoted as M_CONS. The set of labels is denoted by Label_S(X), where X is the set of sorted variables disjoint from OP. Each label can be a multiset expression of the form {k1 x1, ..., kn xn}. Constraints of a PrT net are a subset of first order logic formulas containing the S-terms of sort bool over X, denoted as Term_{OP,bool}(X).

The net inscription (ϕ, L, R, M0) associates each graphical symbol of the net structure (P, T, F) with an entity in the underlying SPEC, and thus defines the static semantics of a PrT net. Each predicate in a PrT net is a data structure and a component of the overall system state. The mapping ϕ : P → ℘(S) assigns a subset of sorts to each predicate p in P, which defines its valid values, i.e., proper tokens. The mapping L : F → Label_S(X) is a sort-respecting labeling of flows. The mapping R : T → Term_{OP,bool}(X) associates each transition t in T with a constraint expressed as a first order logic formula in the underlying algebraic specification. The constraints define a transition in terms of pre-conditions and post-conditions. The pre-condition specifies the constraints on the incoming arcs, and the post-conditions specify the relationships between the variables of the incoming arcs and the label variables of the outgoing arcs.

A marking m of a PrT net is a mapping P → M_CONS from the set of predicates to multi-sets of tokens. M0 is a set of initial markings, which thus serve as the test cases. A transition is enabled if its pre-set contains enough tokens and its constraint is satisfied with an occurrence mode. The pre-set (•t) of a transition t is the set of input places of that transition. Similarly, the post-set (t•) of a transition t is the set of output places of that transition. The firing of an enabled transition consumes the tokens in the pre-set and produces tokens in the post-set. Two transitions (including the same transition with two different occurrence modes) fire concurrently if they are not in conflict. Conflicts are resolved non-deterministically. The firing of an enabled transition is atomic. We define the behavior of a PrT net to be the set of all possible execution sequences E. Each execution sequence e ∈ E represents consecutively reachable markings from the initial marking, in which a successor marking is obtained through a step (the firing of some enabled transitions) from the predecessor marking. We denote an execution as

e: m0 --n0--> m1 --n1--> m2 --n2--> ... --n(k-1)--> mk --nk--> ...

where ni is a set of transitions, m0 ∈ M0 is an initial marking, and mi, i = 1, 2, ..., are markings such that mi is obtained from mi−1 by firing the transition set ni. The execution sequence e is said to be flat if all the ni are singletons; otherwise e is said to be non-flat. We can obtain a flat execution sequence from a non-flat execution sequence by interleaving the transitions in the non-singleton ni's.
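The interleaving step can be sketched in Python (a hypothetical helper for illustration, not part of the paper's toolchain): given a non-flat execution as a list of transition sets, it enumerates the flat sequences obtained by ordering the transitions inside each concurrent step.

```python
from itertools import permutations, product

def flatten(execution):
    """Enumerate the flat execution sequences derivable from a non-flat one
    by interleaving the transitions within each concurrent step n_i."""
    # All orderings of each step, sorted for a deterministic result.
    per_step = [sorted(map(list, permutations(step))) for step in execution]
    # Pick one ordering per step and concatenate the steps in order.
    return [[t for step_order in choice for t in step_order]
            for choice in product(*per_step)]

# A two-step execution whose second step fires t2 and t3 concurrently:
flat = flatten([{"t1"}, {"t2", "t3"}])
```

Here the non-flat sequence yields two flat interleavings, t1 t2 t3 and t1 t3 t2, both preserving the order of the steps themselves.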

Figure 1 shows a PrT net model for the dining philosophers problem. The PrT net in Figure 1 consists of three predicates (Thinking, Eating, and Chopstick) and two transitions (Pickup and Putdown). The flow relation in Figure 1 consists of the six labeled arcs f1 through f6, annotated with variables of the sorts shown in italics. We use the dining philosophers example to illustrate our approach throughout the paper.
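To make the enabling and firing rules concrete, the following Python sketch (a hypothetical analogue of the net in Figure 1, not the paper's implementation) encodes the Pickup transition with the guard ch1 = ph and ch2 = ph ⊕ 1; the bound K on the number of philosophers is an assumption of this sketch.

```python
from collections import Counter

K = 3  # number of philosophers/chopsticks (assumption for this sketch)

def pickup_enabled(m, ph):
    """Pickup is enabled for philosopher ph if the pre-set (Thinking and
    Chopstick) holds the required tokens under the guard's occurrence mode."""
    ch1, ch2 = ph, (ph + 1) % K  # guard: ch1 = ph, ch2 = ph (+) 1
    return (m["Thinking"][ph] > 0 and
            m["Chopstick"][ch1] > 0 and
            m["Chopstick"][ch2] > 0)

def pickup_fire(m, ph):
    """Atomically consume the pre-set tokens and produce the post-set token."""
    assert pickup_enabled(m, ph)
    ch1, ch2 = ph, (ph + 1) % K
    m["Thinking"][ph] -= 1
    m["Chopstick"][ch1] -= 1
    m["Chopstick"][ch2] -= 1
    m["Eating"][(ph, ch1, ch2)] += 1

# Initial marking: every philosopher thinking, every chopstick on the table.
m0 = {"Thinking": Counter(range(K)),
      "Chopstick": Counter(range(K)),
      "Eating": Counter()}
pickup_fire(m0, 0)
```

After philosopher 0 picks up chopsticks 0 and 1, Pickup is no longer enabled for the neighboring philosophers, illustrating how conflicts between occurrence modes arise.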

2.2. Testing PrT Nets

PrT nets are formal models that use graphical and mathematical notations, and are well suited for modeling and analyzing concurrent and distributed systems. PrT nets can play different roles in the development of concurrent systems, e.g., as a formal specification or as an executable model. In general these two roles provide the developer with the opportunity to combine both verification and testing of the PrT net model, thereby providing a higher level of confidence in the correctness of the system. Verification of a PrT net model can be performed by using a model checker such as SPIN to check various properties of the net [12]. The nature of PrT nets allows the application of both specification-based and program-based testing techniques. Testing a PrT net is possible because it is considered both a specification and an executable model. In this paper we focus on structural test coverage criteria of PrT nets.

Beizer [2] defines test coverage as any metric of completeness with respect to a test selection criterion. Zhu et al. [28] further explain the notion of test adequacy criteria by describing three definitions of test adequacy criteria, which include:


[Figure 1 diagram omitted: the predicates Thinking, Chopstick, and Eating are connected to the transitions Pickup and Putdown by the labeled arcs f1 through f6, which carry the tokens <ph>, <ch1, ch2>, and <ph, ch1, ch2>; the guard on Pickup is (ch1 = ph) /\ (ch2 = ph ⊕ 1).]

ϕ(Thinking) = ℘(PHIL)
ϕ(Chopstick) = ℘(CHOP)
ϕ(Eating) = ℘(PHIL, CHOP, CHOP)

PHIL and CHOP are sorts that represent philosophers and chopsticks respectively, where ⊕ is modulus k addition and ℘(A) is the power set of A.

Figure 1: A PrT net for dining philosophers.

(1) test adequacy criteria as stopping rules; (2) test adequacy criteria as measurements; and (3) test adequacy criteria as generators of test cases. In this paper we focus on the second definition presented by Zhu et al. [28], test adequacy criteria as measurements. The formal definition presented by Zhu et al. [28] states that a test adequacy criterion as a measurement is a mapping from the cross product of a set of programs, a set of specifications, and a class of test sets to a real number between zero and one. The greater the real number, the more adequate the test set is. Applying the dual nature of PrT nets to the concept of test adequacy criteria as measurements implies that: (1) PrT nets, which are executable models, map to a set of programs; (2) PrT nets map to a set of specifications; and (3) the initial markings for each PrT net represent a set of test inputs.
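Viewed as a measurement, an adequacy criterion can be sketched as a simple ratio (a hypothetical illustration of the definition, not the paper's tool):

```python
def adequacy(covered, required):
    """Adequacy as a measurement: the fraction of coverage elements required
    by the criterion that the test set actually exercised, a value in [0, 1]."""
    required = set(required)
    if not required:
        return 1.0  # nothing to cover: vacuously adequate
    return len(required & set(covered)) / len(required)

# E.g., a test set whose runs fired only t1, against a criterion requiring
# both transitions of a net:
score = adequacy({"t1"}, {"t1", "t2"})
```

A score of 1.0 means the test set fully satisfies the criterion; anything less quantifies how far it falls short.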

Zhu and He [29] provided a methodology for testing PrT nets based on the general theory of testing concurrent software systems. They identified four classes of testing strategies: transition-oriented testing, state-oriented testing, flow-oriented testing, and specification-oriented testing [29]. For each strategy, a set of schemes to observe and record testing results and a set of coverage criteria to measure test adequacy were defined. The observational scheme for a concurrent system p is the ordered pair <B, μ>, where B is the set of partial orders of events generated by p, and μ represents the mapping from a test set to a non-empty consistent subset of all partial orders for p [29]. Note that due to non-determinism and concurrency, two or more partial orders may be generated by the same test input for a given p. Zhu and He defined a set of test coverage criteria for PrT nets based on this observational scheme [29]. In this paper, we propose a framework for evaluating the adequacy of these test coverage criteria, including: transition coverage; K-concurrency length-L trace coverage; all transition trace coverage; interleaving length-L transition sequence coverage; state coverage; state transition coverage; state transition path coverage; inward flow, outward flow, and flow coverage; flow path coverage; and equation coverage.

2.3. SPIN

The model checker SPIN [25] is a generic model checking tool for formally analyzing the logical consistency of distributed systems, which are defined using Promela [17]. SPIN has three basic functions: (1) as an exhaustive state space analyzer for rigorously proving the validity of user-specified correctness requirements; (2) as a system simulator for rapid prototyping; and (3) as a bit-state space analyzer that can validate large protocol systems with maximal coverage of the state space.

Promela [17] is a verification modeling language that uses a C-like programming language style. It provides a way of making abstractions of distributed systems that suppress details unrelated to process interaction. A Promela program consists of processes, message channels, and variables. Processes are global objects. Message channels and variables can be declared either globally or locally within a process. Each process specifies the behaviors, channels, and variables that define the environment in which the process runs. In this paper we focus on the simulation component of the tool SPIN. The tool SPIN offers three options for performing simulation: (1) random, (2) interactive, and (3) guided. The random simulation option allows a user to monitor the behavior of a model by printing any output produced by the model to the console or to files. Interactive simulation allows a user to resolve non-deterministic choices during the simulation of the model by selecting an option during the simulation process. If there is only one option, then SPIN immediately selects that option and continues the simulation. Guided simulation uses a specially encoded trail file, generated by the verifier after a correctness violation, to guide the search. The execution sequence stored in the trail file represents the events leading up to the violation.

We use the random simulation option to monitor the behaviors of a PrT net model. Guided simulation is used to generate test sets or initial markings based on verification results, e.g., counterexamples. Using these test sets, the simulation can explicitly execute specific scenarios, such as reaching a particular state, flow, or transition in a PrT net. It is possible that a SPIN simulation will execute indefinitely, making it difficult to monitor all the behavior of the model from the console. Therefore we pipe the output from the simulation to files for analysis at a later time. To limit the output of the simulation on the model we use different combinations of the -u and -j options: the -uN option limits the simulation to the first N steps, and the -jN option skips over the first N steps. There are other options [25] provided by the SPIN tool that provide additional flexibility in managing the output from the random simulation facility.

2.4. Transforming PrT Nets to Promela

In order to simulate PrT nets using SPIN, it is necessary to translate PrT net models into SPIN models, specified using Promela. Several researchers have used SPIN to check models specified using Petri nets [6, 10, 12]. The basic idea is to translate Petri nets into equivalent SPIN models (Promela programs), and their properties into assertions or never claims in Promela programs. We provide a general procedure and rules for translating a PrT net into a Promela program in SPIN.

The Promela program structure. Assuming the PrT net is a composition of other PrT nets, each individual net is translated into a process in the Promela program. The sorts of a PrT net are translated into integer types and structured types in Promela. Predicates (places) in the PrT net are translated into fixed-length array variables. Each transition in the PrT net is translated into an atomic statement body defining the relationship between the pre-set and the post-set of the transition. The init process is used to assign the initial values of the program according to the initial marking of the PrT net.

Translating predicates. Each predicate in a net is translated into a global variable in the Promela program. The type of the predicate is translated into an equivalent variable type in the Promela program. If a type is not a predefined type in Promela, then the type has to be defined, and it has the same domain, range, and operations as the type in the PrT net. Note that some types cannot be defined in the Promela model due to the limitation of types in SPIN; therefore some restrictions have to be applied to the transformation of data types. Given a predicate T defined in the PrT net as ϕ(T) = ℘(PHIL, CHOP, CHOP), a corresponding type is defined for the type of predicate T in the Promela program: typedef T { byte ph; byte ch1; byte ch2 }. The value range of each variable represents the possible markings of the predicate in the PrT net. Therefore, the number of possible values of a variable is the number of possible tokens of the corresponding place. If a place p is k-bounded, the declaration statement for the place p is an array with k elements whose type is the predicate type. Thus, we treat a predicate symbol as a set of proposition symbols. This can be done when each p is bounded and ϕ(p) is finite [15].
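The translation of a typed predicate into a record type and a k-bounded array can be mirrored in Python (a hypothetical analogue for illustration; the actual translation targets Promela, and the bound K is an assumption of this sketch):

```python
from dataclasses import dataclass
from typing import List, Optional

# Analogue of the Promela typedef for the Eating predicate, whose sort is
# a product of a philosopher and two chopsticks.
@dataclass
class EatingToken:
    ph: int   # philosopher
    ch1: int  # first chopstick
    ch2: int  # second chopstick

K = 3  # bound on the place (assumption for this sketch)
eating: List[Optional[EatingToken]] = [None] * K  # fixed-length "array variable"

def add_token(place, token):
    """Put a token into the first free slot; raises ValueError if the place
    is full, i.e., if the bound k would be exceeded."""
    i = place.index(None)
    place[i] = token

add_token(eating, EatingToken(ph=0, ch1=0, ch2=1))
```

Bounding the place is what makes the predicate representable as a fixed-length array, matching the restriction stated above.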

Translating transitions. The transitions of a PrT net are enclosed in a do..od statement in the Promela program. Each transition is defined as a guarded atomic statement within the do..od statement. The atomic statement defines the firing rules of the transition. The combination of the do..od statement and the guarded atomic statements ensures the non-deterministic firing of the transitions. The atomic statement consists of a series of case statements, each corresponding to one possible firing input of the transition. The body of each case statement explicitly defines the translation from the input to the output. Global variables and channel variables are used to synchronize the communication between processes.
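The effect of the do..od loop with guarded atomic alternatives can be sketched in Python (a hypothetical scheduler for illustration, not the generated Promela code): each iteration collects the enabled alternatives, picks one at random, and executes its body atomically.

```python
import random

def simulate(marking, transitions, max_steps=100):
    """Analogue of the Promela do..od loop over guarded atomic statements.
    `transitions` is a list of (name, guard, body) triples; the loop stops
    when no guard holds (deadlock) or the step bound is reached."""
    trace = []
    for _ in range(max_steps):
        enabled = [(name, body) for name, guard, body in transitions
                   if guard(marking)]
        if not enabled:
            break  # no alternative is executable: the do..od blocks
        name, body = random.choice(enabled)  # non-deterministic selection
        body(marking)                        # atomic firing
        trace.append(name)
    return trace

# A one-place net whose only transition decrements a counter three times:
def dec_guard(m): return m["p"] > 0
def dec_body(m): m["p"] -= 1
trace = simulate({"p": 3}, [("dec", dec_guard, dec_body)])
```

With several transitions enabled at once, random.choice plays the role of SPIN's non-deterministic selection among executable alternatives.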

Defining the initial marking. Each global variable of the Promela program is initialized via the init process with a value that is the initial marking of the corresponding predicate in the PrT net.

Defining properties to be verified. System properties are defined using LTL formulas, which are translated into never claims in a Promela program. In addition, some properties are defined as accept-state labels or other assertion labels, such as basic assertions, end-state labels, progress-state labels, and trace assertions, in a Promela program.

[Figure 2 diagram omitted: a flowchart in which M0 (the initial marking/test cases) is fed to the PrT net model in Promela; the coverage achieved is evaluated against a test coverage criterion; if the test set is not yet adequate, additional test cases are generated and the cycle repeats; once adequate, an evaluation report is generated.]

Figure 2: Test coverage criteria analyzer.

3. Evaluating Test Coverage Criteria of PrT Nets

In this section we describe the approach that uses the SPIN simulation facility to analyze PrT net models for various test coverage criteria. We also show how to apply the approach to evaluate the PrT net model of the dining philosophers example.

3.1. Experimental Design

Figure 2 shows a high level representation of the approach that uses the simulation mode of SPIN to analyze the adequacy of various test coverage criteria in PrT nets. We refer to the tool in Figure 2 as the Test Coverage Criteria Analyzer, or TCC analyzer. In the approach, a PrT net model is converted into a Promela program using the steps outlined in Section 2.4. The Promela program is instrumented with Promela statements in a special process, referred to as the Monitor [20], for collecting execution events such as the transitions fired, the states covered, the guard conditions tested, or the tokens that flowed through arcs. The evaluation procedure is based on the experimental testing work developed by Briand et al. [4] and by Frankl and Weiss [9]. The evaluation procedure can be described as follows:

Step 1: Transform the PrT net under test into a Promela program. Each PrT net is transformed into one process in a Promela program. If a system includes multiple PrT nets, then each PrT net is defined as a Promela process and invoked by the init process in the Promela program (see top right of Figure 2).

Step 2: Specify the test coverage criterion to be evaluated in the Monitor process. The test coverage criterion is specified as an atomic transaction in the Monitor, which defines the information to be recorded in the log files (see second box on the left of Figure 2). For example, if an evaluation is about transition coverage, then all transitions enabled under a marking are recorded in the weak coverage file, and all fired transitions are recorded in the strong coverage file. For simplicity, we evaluate each test coverage criterion separately.

Step 3: Evaluate the coverage. The Analyzer checks whether the evaluation requirements are satisfied or not by analyzing the recorded events in the Monitor (see second box on the right of Figure 2). For some test coverage criteria, such as all transitions coverage, it is necessary to find test sets that cover the criterion 100%. For other test coverage criteria, such as all states, it is infeasible to find test sets that cover the criterion 100%; some restrictions then have to be applied to the criterion so that the evaluation of the adequacy is performed on the restricted criterion. The evaluation requirements of each criterion are represented as a separate transition in the Monitor with guard conditions, which are satisfied when the corresponding criterion is 100% covered by a test set.

Step 4: Build an adequate test set. Each test case is input to a Promela program via the init process. A PrT net model includes a set of initial markings, which serve as the initial test cases, or initializations, in the corresponding Promela program (see top box on the left of Figure 2). An adequate test set for a test coverage criterion may include a set of test cases, which are incrementally generated from a set of initial markings in the PrT net. However, the initial markings may not be explicitly defined in a PrT net model, and may not be adequate for most of the test coverage criteria. Systematically generating test cases for testing a PrT net is therefore necessary. Based on domain knowledge, we first randomly select an initial test case. If the initial test case is not adequate for a test coverage criterion, then we change or add more test cases to the initial test case based on the evaluation reports and execution log files from the testing of previous test cases (see box on the bottom right of Figure 2).

In order to generate an adequate test set, we have to run the simulation using different test sets many times. For some complex systems, we may use other tools such as a reachability tree analyzer [27] or a mutation generator [3] to generate test cases. The procedure for building an adequate test set is not automatic at this time, but the evaluation results are very helpful in building an adequate test set. Due to the degree of randomness in selecting one transition to fire from several enabled ones, the coverage observed for one criterion may not be consistent across different runs. Therefore, we need to run a large number of test sets to draw a conclusion on the adequacy. In our experiments we build at least 50 test sets for evaluating each criterion.

Step 5: Generate the analysis report. Evaluation of the coverage for a given criterion is performed by computing the ratio of the recorded events against the expected events (see box on the bottom left of Figure 2). The goal for each criterion is to attain 100% coverage for that criterion. The evaluation report includes the coverage ratio and other coverage information, such as the name of the criterion, e.g., all transitions, all transition paths, all state transitions, or all equations. Several files are used to record execution streams, including the transitions enabled at a marking (state), the transitions fired, state or marking paths, transition paths, and flow paths. The total length of the execution sequences for each test set is recorded as the evaluation cost for the criterion. Due to the issue of non-determinism, some test sets may not be adequate when they are re-run. The ratio of adequate test sets among the 50 selected test sets is also an indicator of the stability of the adequate test suite for a criterion.
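The report computation in this step can be sketched as follows (a hypothetical illustration; the paper's Analyzer works over the SPIN log files): it computes the per-run coverage ratio and the stability ratio, i.e. the fraction of re-runs in which the same test set reached 100% coverage.

```python
def evaluation_report(runs, expected):
    """`runs` is a list of event sets recorded across repeated simulations of
    the same test set; `expected` is the set of events the criterion requires."""
    ratios = [len(set(run) & expected) / len(expected) for run in runs]
    stability = sum(r == 1.0 for r in ratios) / len(ratios)
    return {"coverage_ratios": ratios, "stability": stability}

# Two re-runs of one test set under non-determinism: only the first run
# happened to fire both required transitions.
report = evaluation_report([{"t1", "t2"}, {"t1"}], {"t1", "t2"})
```

A stability below 1.0 flags exactly the situation described above: a test set that was adequate in one run but not when re-run.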

3.2. Recording Dynamic Behaviors

Our approach for evaluating the adequacy of test coverage criteria, identified in [29], is based on analyzing the event stream generated by the execution of a PrT net model. In general, it is infeasible to mechanically evaluate the test adequacy for all criteria given a PrT net model. However, it is important to know the limitations and the effectiveness when evaluating the test adequacy for a given criterion. The evaluation of test adequacy criteria has two purposes: (1) if the evaluation of a criterion is feasible, then the analyzer tool should generate an adequate test set that covers the criterion, and be able to calculate the cost of evaluating the test adequacy criterion based on the test set; and (2) if the evaluation of a criterion is infeasible, then the analyzer tool should generate an adequate test set for a restricted feasible version of the criterion and calculate the cost for this restricted criterion. We assume that the restricted criterion is a modified version of the original criterion.

3.2.1. Experimental conditions

In order to record non-deterministic executions in a PrT net, we introduce the concepts of weak coverage and strong coverage in our approach. Weak coverage means that a given criterion is possibly covered under a marking. More specifically, a transition is weakly covered if the transition is enabled under a marking M; a state or an abstract state is weakly covered if the state is possibly reachable under a marking M; and a flow is weakly covered if some tokens may pass through the flow under a marking M. For example, if two transitions t1 and t2 in a PrT net are both enabled under marking M0, which one will fire is non-deterministic. In this situation, both t1 and t2 satisfy the weak coverage criterion. Strong coverage is the regular coverage, which means the coverage of the criterion can be determined: for example, a transition is fired during a simulation, a state is reached, or a token passes through a flow.
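The distinction can be sketched in Python (our own illustration, not the paper's tool): from a log of (enabled transitions, fired transition) pairs, the weakly covered set is the union of all enabled sets, while the strongly covered set contains only the transitions that actually fired.

```python
# Sketch: classify transitions as weakly vs. strongly covered from a log of
# (enabled_set, fired_transition) records, one record per firing step.
def classify_coverage(event_log):
    weak, strong = set(), set()
    for enabled, fired in event_log:
        weak.update(enabled)   # enabled under some marking M -> weakly covered
        strong.add(fired)      # actually fired -> regular (strong) coverage
    return weak, strong

# t1 and t2 are both enabled under M0, but only t1 fires:
# t2 is weakly covered only.
log = [({"t1", "t2"}, "t1"), ({"t3"}, "t3")]
weak, strong = classify_coverage(log)
```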

End a simulation. The major challenge when evaluating test coverage criteria using a simulation-based method is deciding when the simulation should end. The simulation has to be stopped at a particular time for several reasons. The first reason is that some predicates in a PrT net model are not bounded and the number of tokens held in a predicate becomes too large. The second reason is the occurrence of endless loops in a PrT net. The third reason is that an execution may enter a livelock or a deadlock state. To address the first reason, we transform non-bounded predicates into bounded predicates, and then the data type of the predicate is reduced to a data type with finite possible values. It is reasonable in software testing to transform a non-bounded predicate into a bounded predicate and restrict the data type of the predicate to a data type with finite possible values. To address the issue of endless loops, a stopping condition has to be defined to terminate the execution of the PrT net model. SPIN can solve the problems caused by the third reason, since it can detect deadlock during a simulation. A livelock occurs when a process is prevented from progressing due to unfair scheduling, but fair scheduling of processes is ensured in SPIN.

Figure 3 illustrates the concurrency, conflict and loop situations that we need to handle during the evaluation of a PrT net model. For example, Figure 3(c) illustrates the method for dealing with the situation where a predicate might be unbounded, such as the predicate P6 in the figure. The number of tokens in the place P6 could be infinite, but we have to restrict the number of tokens held in the place. If the number of tokens in place P6 is over the limit, then the problem should be reported because it is a potential overflow issue. Because each predicate is represented as a group of global variables with the same type in the corresponding Promela program, the limit on the number of tokens for each predicate is already handled during the transformation from a PrT net to its Promela program. That is, the number of variables in the type definition representing the place is the number of tokens that can be held in that place. Each data type in a Promela program must be finite, so each predicate in a PrT net is considered bounded with a finite data type.

End conditions. Based on the above discussion, we conclude that a simulation has to end if one of the following conditions is satisfied: (1) the Promela program exits or aborts; (2) the target test criterion is successfully evaluated; (3) the simulation enters a deadlock state; (4) the simulation enters a livelock state; (5) the simulation enters an endless loop; or (6) the length of the execution sequence of the running program exceeds a pre-defined limit. If a simulation encounters one of the first two conditions, then the program will automatically stop. We define deadlock here as a program entering a state such that no transition can be enabled in the program. Detecting deadlock in a concurrent program is a major challenge; however, there are several solutions to this problem [5, 7]. During the simulation of a Promela program, the program will exit if there is no enabled transition. The fair scheduling of processes in SPIN can prevent the occurrence of livelock during simulation. In this section we design an approach to detect the endless loop condition during simulation so that the simulation can stop. If a simulation continuously runs without generating new states for a period of time, and none of the first five end conditions are found, then the simulation has to be stopped based on the length of the execution sequences.
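The six end conditions can be collected into a single check, sketched below in Python (the function and field names are our assumptions, not part of the paper's tool):

```python
# Sketch: one predicate over the simulation state deciding whether to stop.
def should_stop(state):
    """state: dict of flags/counters maintained by the running simulation."""
    return (state["program_exited"]               # (1) program exits or aborts
            or state["criterion_covered"]         # (2) target criterion evaluated
            or not state["enabled_transitions"]   # (3) deadlock: nothing enabled
            or state["livelock_detected"]         # (4) livelock
            or state["endless_loop_detected"]     # (5) endless loop
            or state["steps_since_new_state"] > state["step_limit"])  # (6)

s = {"program_exited": False, "criterion_covered": False,
     "enabled_transitions": ["t1"], "livelock_detected": False,
     "endless_loop_detected": False, "steps_since_new_state": 12,
     "step_limit": 10}
```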

Detecting an endless loop. We developed a practical solution for detecting endless loops during simulation of a PrT net model. In addition, if the length of an execution sequence (the number of transitions invoked since the latest new state was generated) in a simulation exceeds a pre-defined limit, then the simulation has to be terminated.

[Figure 4, panels (a) and (b), appears here.]
Figure 4: Data structures for detecting an endless loop.

To detect an endless loop, we use matrices to keep track of enabled transitions. One vector called Tvec is used for recording the firing times of each transition; the initial value is 0 for each transition. A matrix Pma is used for recording the state sequence of each transition: one dimension represents the transitions, and the other dimension records the state sequence of each transition. The state of each transition is the union of the preset and postset, •t ∪ t•, of transition t when transition t fires (other predicates should remain the same). Figure 4 shows the two data structures used for detecting an endless loop: Figure 4(a) shows the vector Tvec and Figure 4(b) the matrix Pma. When transition t fires, if the state of transition t does not exist in the state sequence of t in the matrix Pma after t is fired, then the number of each transition in Tvec is reset to one if the original number was larger than 1, and the state of transition t is appended to the state sequence of t in the matrix Pma. However, if the state exists in the state sequence of t, the number of t in Tvec is increased by one, but the state will not be added to the state sequence of t in the matrix Pma. If the number of any transition in Tvec is changed from 0 to 1, then the number of each transition in Tvec is reset to one if the original number is larger than 1, and each state sequence in Pma is cleared except the latest state in each sequence, because the program is moving forward.
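The Tvec/Pma bookkeeping can be sketched in Python. This is our reconstruction of the algorithm described above, not the authors' code; `limit` stands for the pre-defined repetition bound introduced in the next paragraph.

```python
# Sketch: per-transition repeat counts (Tvec) and seen pre/postset states (Pma).
class LoopDetector:
    def __init__(self, transitions, limit):
        self.tvec = {t: 0 for t in transitions}   # firing (repeat) counts
        self.pma = {t: [] for t in transitions}   # seen preset-union-postset states
        self.limit = limit

    def _reset_counts(self):
        for t in self.tvec:
            if self.tvec[t] > 1:
                self.tvec[t] = 1

    def fired(self, t, state):
        """Record a firing of t with its state; return True if we should stop."""
        if state not in self.pma[t]:
            # a new state: the net is progressing, so damp the counters
            self._reset_counts()
            self.pma[t].append(state)
        else:
            old = self.tvec[t]
            self.tvec[t] += 1
            if old == 0:   # count moved from 0 to 1: keep only the latest states
                self._reset_counts()
                for seq in self.pma.values():
                    del seq[:-1]
        return self.tvec[t] > self.limit

d = LoopDetector(["t1", "t2"], limit=3)
results = [d.fired("t1", "s") for _ in range(5)]  # same state over and over
```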

A threshold for each transition in Tvec is defined to stop the execution of a PrT net. As soon as any number in Tvec is larger than the pre-defined threshold, the simulation stops. Using the above solution, the loop in Figure 3(a) can be detected because eventually transition t1 and t2 or t3 will repeat many times without generating any new states. Note that if the target test criterion is adequate (end condition 2), then the simulation of the program exits normally. For example, if we are evaluating the transition coverage in Figure 3(a), as soon as t1, t2, and t3 are all covered, the simulation can stop. In an extreme situation, t2 or t3 may never be covered; then the solution discussed for detecting an endless loop will detect the endless situation. In this situation, it is a challenge to decide when the program should stop because the firing of t2 or t3 is non-deterministic. For the situation in Figure 3(b), the state sequences in Pma can detect the endless loop; however, if the data type of a predicate in the preset or postset of a transition has many possible values, the size of the matrix Pma might be very large.

3.3. Evaluate the Adequacy of Test Coverage Criteria

In this section we discuss how to evaluate the adequacy of each criterion identified in [29]. Each criterion is evaluated through running and generating adequate test sets using an incremental approach. The adequacy of each criterion is measured as a percentage of the coverage. We also consider the cost of evaluating the adequacy of each criterion in PrT nets. The cost is measured in terms of the total number of transitions invoked during the simulation, under the assumption that cost is overall proportional to the number of transition invocations [4].

3.3.1. Transition-Oriented Coverage

Test coverage criteria on transitions include: transition coverage, K-concurrency length-L trace coverage, all transition trace coverage and interleaving length-L transition sequence coverage. Except for transition coverage, it is infeasible to evaluate the adequacy of the other transition-based coverage criteria using the simulation approach. For those types of criteria, we check whether particular cases of a criterion are covered or not.

[Figure 3, panels (a)-(c), appears here.]
Figure 3: PrT net models may not stop during the simulation.

Transition coverage. To evaluate transition coverage we randomly choose the initial test cases based on the initial markings or domain knowledge. If some transitions are not covered by the initial set of test cases when the simulation ends, then we modify the initial set of test cases or add more test cases to the initial set based on previous simulation results. If a transition t cannot be covered by a test case ts, then we backtrack from the transition t to its immediate input places {p1, ..., ps} to find an input for enabling t. If any place px in {p1, ..., ps} cannot be directly assigned an initial marking, then we need to find a transition tx whose output places include px. We check whether an initial marking can enable tx or not; if it can, we choose the initial marking that can enable tx and t; otherwise, we have to repeat the above steps until an initial marking can potentially cover the transition t.
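The backtracking step can be sketched in Python (an illustrative sketch under our own net encoding, not the authors' implementation): from an uncovered transition t, walk back through its input places to the transitions that produce tokens for them, collecting the candidate places to seed with an initial marking.

```python
# Sketch: net encoded as {transition: (input_places, output_places)}.
def backtrack(net, t, seen=None):
    """Return the places whose markings can potentially enable t,
    searching producer transitions transitively."""
    seen = set() if seen is None else seen
    places = set(net[t][0])                    # immediate input places of t
    for p in net[t][0]:
        for tx, (ins, outs) in net.items():
            if p in outs and tx not in seen:   # tx produces tokens for p
                seen.add(tx)
                places |= backtrack(net, tx, seen)
    return places

net = {"t1": ({"p1"}, {"p2"}), "t2": ({"p2"}, {"p3"})}
# to cover t2 we may mark p2 directly, or mark p1 and let t1 fire first
candidates = backtrack(net, "t2")
```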

Due to the complexity of a PrT net, which may include cycles of transition paths and complex constraints on transitions, automatically generating adequate test cases is challenging. Based on domain knowledge and results from manually analyzing evaluation outputs, we generate adequate test cases step by step. If a transition is weakly covered but not covered by the initial test case, then we have to add new test cases that can be determined to cover the transition. However, several transitions might always be enabled concurrently. If that is the case, then we have to duplicate similar test cases multiple times in a test set, based on the assumption that the scheduling of enabled transitions is fair. If there is a transition that is never covered by any test case, then we have to check whether the PrT net model has defects. If we cannot find problems in a model, we report the ratio of the coverage, and list the transitions that cannot be covered by any of the test cases in the test set.

K-concurrency length-L trace coverage. The challenge with this criterion is calculating TrcK,L(N), the set of all feasible K-concurrency length-L transition traces, for a PrT net model using the simulation method. If TrcK,L(N) is pre-defined or calculated using other approaches, such as analyzing the PrT net model or the domain model, then evaluating the adequacy of this criterion is feasible. In most situations, only some of the transition traces are important to the system properties we are interested in; the other traces can be ignored. In this paper, instead of considering all feasible K-concurrency length-L transition traces, we focus on the interesting transition traces during evaluation, i.e., those traces related to system properties. If all interesting K-concurrency length-L transition traces are covered by a test set, then the test set is considered adequate for the K-concurrency length-L trace coverage. The concurrency degree of a transition trace is the maximum number of transitions that may fire (weak coverage) in that trace. During the simulation, all enabled transitions under a certain marking M are recorded, thereby providing a record that consists of all enabled transitions. A record is added to the log file as soon as a transition fires. The execution log file records a sequence of concurrent transitions (i.e., the weak coverage). If all interesting K-concurrency length-L transition traces are included in the log file, then the test set is considered adequate for this test criterion; otherwise, information regarding the non-coverage is provided.
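The log-file check can be sketched in Python (the record format and all names are our assumptions): each log record holds the set of transitions enabled when one transition fired; a trace is covered if its transitions appear in order in some window whose maximum enabled-set size reaches K.

```python
# Sketch: check interesting K-concurrency length-L traces against the log.
def trace_covered(log, trace, k):
    L = len(trace)
    for i in range(len(log) - L + 1):
        window = log[i:i + L]
        in_order = all(t in rec for t, rec in zip(trace, window))
        degree = max(len(rec) for rec in window)   # concurrency degree
        if in_order and degree >= k:
            return True
    return False

def adequate(log, interesting_traces, k):
    return all(trace_covered(log, tr, k) for tr in interesting_traces)

log = [{"pickup"}, {"pickup", "putdown"}, {"putdown"}]
```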

Interleaving length-L transition sequence coverage. This criterion is a special case of K-concurrency length-L trace coverage with 1-concurrency. Therefore, the evaluation approach discussed for evaluating K-concurrency length-L trace coverage can be directly used for evaluating this criterion. Because the longer sequence coverage subsumes the shorter sequence coverage for the interleaving length-L transition sequence coverage, we increase the length of the trace step by step until the sequence does not include any new trace.

All transition trace coverage. It is usually infeasible to calculate all transition traces in a PrT net model, especially for models with infinite states, i.e., the number of all transition traces could be infinite. Therefore, we only consider coverage of a group of transition sequences. The evaluation of the all transition trace coverage is exactly the same as the method for the interleaving length-L transition sequence coverage.

3.3.2. State-Oriented Coverage

The state-oriented test coverage criteria include state coverage, state transition coverage, and state transition path coverage. Evaluating the adequacy of state coverage criteria is usually not feasible using the simulation method, since some PrT net models may have infinite states. Instead of using the explicit states in the evaluation of test coverage criteria, abstract states are used. The concept of abstract states provides a way to reduce the state space of the model by identifying a finite set of states to be used during state coverage testing. If the state space is small enough then there can be an m-to-one mapping of the explicit states to the abstract states, where m approaches one. In general a mapping needs to be defined from the markings (explicit states) in the net to the abstract states. An example of a set of abstract states for the dining philosophers problem is the number of philosophers eating. In this case the abstract states are 0, 1, 2. One set of possible markings associated with these states is: 0 - Thinking = {1,2,3,4,5}, Eating = {}, Chopsticks = {1,2,3,4,5}; 1 - Thinking = {2,3,4,5}, Eating = {<1,1,2>}, Chopsticks = {3,4,5}; and 2 - Thinking = {1,3,5}, Eating = {<2,2,3>, <4,4,5>}, Chopsticks = {1}. Therefore, by inspecting the content of the variables or log files that store the markings of the net, it is possible to deduce which abstract states have been covered. Note that in the above example there is one possible marking for the abstract state 0, five possible markings for the abstract state 1, and five possible markings for abstract state 2.
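The marking-to-abstract-state mapping for this example can be sketched directly (the dictionary encoding of markings is our own):

```python
# Sketch: abstract state of the dining philosophers = number of philosophers
# eating, derived from the Eating predicate of a marking.
def abstract_state(marking):
    """marking: dict with token sets for Thinking, Eating, Chopsticks."""
    return len(marking["Eating"])

m0 = {"Thinking": {1, 2, 3, 4, 5}, "Eating": set(), "Chopsticks": {1, 2, 3, 4, 5}}
m1 = {"Thinking": {2, 3, 4, 5}, "Eating": {(1, 1, 2)}, "Chopsticks": {3, 4, 5}}
m2 = {"Thinking": {1, 3, 5}, "Eating": {(2, 2, 3), (4, 4, 5)}, "Chopsticks": {1}}

covered = {abstract_state(m) for m in (m0, m1, m2)}  # abstract states seen so far
```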

State coverage. Choosing test cases for evaluating the abstract state coverage is closely related to the abstraction procedure, which is beyond the scope of this paper. During the abstraction process we identify the relation that maps the markings to the abstract states for the PrT net model. The simulation program then uses this relation to convert each state into an abstract state, and then computes the coverage based on the abstract states. If some abstract states are not covered by a test set, then we have to analyze the execution log files and use the evaluation reports to update the test set. For many systems, we may only be interested in a subset of the total number of states (important states); therefore we only need to evaluate the coverage for these important states. During the simulation, the state or abstract state is recorded and stored in the execution log files. As soon as all states or abstract states are included in the log files, the test set is considered adequate for state coverage and the simulation is concluded.

State transition coverage. This coverage criterion is similar to state coverage. It is feasible to evaluate the adequacy of abstract state/state transition coverage if the total number of abstract/state transitions of a PrT net model is known. Otherwise, we only evaluate the subset of state transitions that are important to the model under test. If all state transitions in this subset are covered by a test set, then we conclude that the test set is adequate for the state transition coverage. During the simulation, the initial state is recorded as the first item in the execution state sequence in the log file, and then a new state is appended as soon as a transition fires during the simulation. By searching for state transitions in the state log file, the system can decide whether to update the test set or end the simulation based on the coverage of each state transition.
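Scanning the state log can be sketched as follows (our own illustration; the log is simply the sequence of states visited, so the covered state transitions are its adjacent pairs):

```python
# Sketch: covered state transitions from a state log, and an adequacy check
# against a subset of important state transitions.
def covered_state_transitions(state_log):
    return {(s, t) for s, t in zip(state_log, state_log[1:])}

def adequate_for(state_log, important_transitions):
    return set(important_transitions) <= covered_state_transitions(state_log)

# abstract states = number of philosophers eating
log = [0, 1, 2, 1, 0]
```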

State transition path coverage. The idea for evaluating the adequacy of length-k state transition path coverage is similar to state transition coverage: we are only interested in a particular subset of state transition paths. By searching for each state transition path <S1, S2, ..., Sn> in the state log file, we can conclude whether a state transition path is covered or not by a test set. If all state transition paths of interest are covered by a test set, then the test set is considered adequate for state transition path coverage; otherwise, it is not.

3.3.3. Flow-Oriented Coverage

In this section, we consider inward flow coverage, outward flow coverage, flow coverage, and flow path coverage. Flow choice coverage and flow combination coverage are specific to hierarchical PrT nets [13], so we do not discuss them in this paper.

Flow coverage. This coverage criterion includes inward flow coverage and outward flow coverage. It is easy to calculate the total numbers of inward flows, outward flows and flows, because the inward flows and outward flows are explicitly defined in transition statements in the Promela program of a PrT net. When a transition t fires, all input flows of the transition t are recorded in the input flow sequence of the execution, and all output flows of the transition t are recorded in the output flow sequence of the execution of a PrT net. If all input flows, all output flows, or both the input flows and output flows of all transitions exist in the log files, then the test set is adequate for the inward flow coverage, the outward flow coverage, or the flow coverage, respectively. The approach for flow coverage uses exactly the same idea discussed in Section 3.3.1 for choosing test cases and dealing with non-deterministic issues.
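The bookkeeping can be sketched in Python (a hedged illustration; the record format and names are ours): each firing contributes its input and output flows to the respective logs, and adequacy is then set inclusion against the statically known flows.

```python
# Sketch: inward, outward and combined flow coverage from per-firing records.
def flow_coverage(firings, flows_in, flows_out):
    """firings: list of (input_flows, output_flows) per fired transition."""
    seen_in, seen_out = set(), set()
    for ins, outs in firings:
        seen_in |= set(ins)
        seen_out |= set(outs)
    return {"inward": seen_in >= set(flows_in),
            "outward": seen_out >= set(flows_out),
            "flow": seen_in >= set(flows_in) and seen_out >= set(flows_out)}

firings = [({"f1", "f2"}, {"f3"}), ({"f3"}, {"f1", "f2"})]
result = flow_coverage(firings, flows_in={"f1", "f2", "f3"},
                       flows_out={"f1", "f2", "f3"})
```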

Flow path coverage. It is a challenge to determine all feasible flow paths in a PrT net model using the simulation method, so we don't consider the total number of feasible flow paths. We only check a subset of flow paths, i.e., those that are important to a PrT net model. The strategy for evaluating the flow path coverage is the same as the method for evaluating the transition path coverage, except that it uses flow events (inward flows and outward flows) instead of transition events.

3.3.4. Specification-Oriented Coverage

We only consider equation coverage in this paper. Each equation eq of a transition constraint in a PrT net model is translated into a predicate in the guard condition of an atomic transition statement in the Promela program. When a transition is enabled, the value of the equation must be true, and the equation event is recorded in the log file. If all equations in a PrT net are evaluated to true by a test set, then we conclude that the test set is adequate for equation coverage. We don't consider mutation of specifications in this paper.
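As an illustration, the Pickup constraint of the dining philosophers net, (ch1 = ph) ∧ (ch2 = ph ⊕ 1), used later in Section 3.4, can be encoded as a guard predicate whose true evaluations are logged. This is our own sketch; reading ⊕ as increment modulo the number N of philosophers is an assumption.

```python
# Sketch: an equation of a transition constraint as a guard predicate.
N = 5  # number of philosophers (assumption for this example)

def pickup_equation(ph, ch1, ch2):
    # (ch1 = ph) and (ch2 = ph (+) 1), with (+) taken as mod-N increment
    return ch1 == ph and ch2 == ph % N + 1

covered_equations = set()
for ph, ch1, ch2 in [(1, 1, 2), (5, 5, 1), (2, 3, 3)]:
    if pickup_equation(ph, ch1, ch2):
        covered_equations.add("pickup_eq")  # equation evaluated to true: logged
```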

3.4. A Running Example

We illustrate our approach for evaluating the adequacy of each test coverage criterion by applying it to the dining philosophers problem. An outline of the Promela program for analyzing a test coverage criterion of the problem (see Figure 1) is shown in Figure 5. Recall that the PrT net for the dining philosophers problem consists of three predicates (Thinking, Eating, Chopsticks) and two transitions (Pickup, Putdown). The flow relation in Figure 1 consists of six arcs labeled f1 through f6; each arc is also labeled with a tuple representing the types of the tokens consumed when a


#define N 5
typedef tokenE { byte ph; byte ch1; byte ch2; }
/* token type used in Eating predicate */

byte thinking[N]; byte chopstick[N];
tokenE eating[N];
/* predicates: philosophers, chopsticks, eating */
byte trans[2];  /* Monitor - transitions */
byte states[3]; /* Monitor - states */
byte flows[6];  /* Monitor - flows */
byte t_vec[2];  /* Monitor - firing times */
/* Monitor - state sequence of a transition */
byte p_m1[3], p_m2[3], p_m3[3], p_m4[3], p_m5[3];
......
proctype DP(){
  ....
  do
  /* transition pickup */
  :: atomic { /* guard for transition pickup */
     -> /* execute transition pickup */
     /* update global variables in Monitor */
     }
  /* transition putdown */
  :: atomic { /* guard for transition putdown */
     -> /* execute transition putdown */
     /* update global variables in Monitor */
     }
  od
}

proctype Monitor(){
  /* Monitor checks coverage criteria */
  do
  :: atomic { /* guard condition for a criterion */
     /* if guard condition fails then skip */
     }
  :: atomic { /* detecting endless loop */
     /* if an endless loop is detected, then */
     /* generate report, exit */
     }
  :: atomic { /* checking execution sequences */
     /* if the length is over the limit, then */
     /* generate report, exit */
     }
  :: atomic { /* detecting deadlock */
     /* if a deadlock is detected, then */
     /* generate report, exit */
     }
  :: else -> {
     /* generate report */
     break
     }
  od;
  assert(false) /* Stops simulation */
}
init {
  /* Initialize global variables */
  /* Initialize initial marking */
  /* Run processes DP and Monitor */
}

Figure 5: An outline of the Promela program for analyzing a criterion.

transition fires. The Promela program in Figure 5 consists of four main sections: (1) global declarations defining user-defined types, global variables and other macros; (2) process DP, representing the PrT net in Figure 1; (3) process Monitor, representing the monitor in Figure 2; and (4) process init, used to start each process and initialize global variables according to the initial marking in the corresponding PrT net.

The global declarations section contains a constant N that holds the number of philosophers and chopsticks, the data type declaration tokenE that represents tokens used in the eating predicate, and the array variables that represent the predicates thinking, chopstick and eating. There are several arrays for recording the transitions, states, and flows used by the monitor. Execution events are written to log files. Several array variables are used to detect deadlock or endless loops. The process DP consists of two atomic statements, which are used to model the transitions pickup and putdown. Each atomic statement consists of a guard condition, statements used to define the behaviors of the transition, and statements used to update global variables and event log files. When a transition is fired, we conclude the following: the transition was covered, the constraint expression for the transition was evaluated, the state was reached, the flows that tokens just passed through were covered, and the places associated with the transition were tested. If multiple transitions are enabled at the same time, the transition selected to be executed is chosen non-deterministically. The procedure for evaluating the adequacy of test coverage criteria is discussed as follows:

Step 1: Choose the initial marking. The initial marking for the dining philosophers PrT net model is decided as soon as the number of philosophers is defined. The initial marking is transformed into an initialization or a test case of the corresponding Promela program. We chose the following initial markings: ({1}, {1}, {}); ({1, 2}, {1, 2}, {}); ({1, 2, 3}, {1, 2, 3}, {}); ({1, 2, 3, 4}, {1, 2, 3, 4}, {}); ({1, 2, 3, 4, 5}, {1, 2, 3, 4, 5}, {}), where the first field represents philosophers, the second field represents chopsticks, and the third field represents eating.

Step 2: Evaluate transition coverage. When the initial test case was 1 philosopher and 1 chopstick, we found that the simulation exited without covering any transitions, since no transition was enabled under the test case. Based on the above simulation result, the initial test case was changed to 2 philosophers and 2 chopsticks, and then all transitions were covered. Therefore, the second test case is adequate for all transition coverage. However, it does not cover any 2-concurrency transition trace, since no two transitions were enabled at the same time in the weak coverage log file. We then changed the initial test case to 3 philosophers and 3 chopsticks, but the coverage did not change. If we change the initial test case to 4 philosophers and 4 chopsticks, then the 2-concurrency length-2 trace coverage is 100%. To achieve 100% we had to run the simulation with the test case a long time, since the trace {pickup, pickup} was difficult to cover. This test case did not cover any length-3 traces such as pickup, pickup, pickup. When we increase the number of philosophers and chopsticks, the coverage for the concurrency and the length of the trace also increases. As soon as more than one philosopher and one chopstick are selected, the program can run forever, so the simulation has to be terminated using the techniques discussed in Section 3.2. Some test criteria may not be adequately covered by a test set all the time; therefore, we need to evaluate each test coverage criterion with many test cases (we chose 50 test cases) to improve our confidence in the correctness of the PrT net model.

Step 3: Evaluate state coverage. If we consider any number of philosophers, then the number of states of the dining philosophers problem could be huge. For 5 dining philosophers, there are 11 reachable states. When we chose the initial test case as 5 philosophers and 5 chopsticks, all 11 states were covered by the initial test case. For state transition coverage we found that some feasible state transitions were not covered. For example, the transition from the state where 0 philosophers are eating to the state where 2 philosophers are eating was not covered, because only one transition can fire at a time in our simulation program. As a result of analyzing the log file for weak coverage of states, we found that the above state transition (0 to 2 philosophers) is weakly covered. Instead of checking all state transition paths, we check the coverage of a group of specific transition paths such as: ({2, 3, 4, 5}, {3, 4, 5}, {{1}, {1, 2}}); ({3, 4, 5}, {5}, {{1, 2}, {1, 2, 3, 4}}); ({2, 3, 4, 5}, {3, 4, 5}, {{1}, {1, 2}}). We found that this transition path is covered by the initial test case. We identified three abstract states for the 5 dining philosophers problem: not-eating - no philosopher is eating, one-eating - one philosopher is eating, and two-eating - two philosophers are eating. The evaluation of state coverage was then performed on the abstract states.

Step 4: Evaluate flow coverage. If we choose the initial test case as 5 philosophers and 5 chopsticks, then the inward flows, outward flows and flows are 100% covered by the test case. Flow path coverage suffers from the same problem as other path coverage criteria: it is difficult to find all paths using the simulation approach. However, we can check the coverage for a group of flow paths. If a path is not covered by a test set, then additional test cases will be added to the test set.

Step 5: Evaluate specification coverage. We only considered equation coverage for the dining philosophers example. There is only one equation in this case, which is the constraint on the transition Pickup: (ch1 = ph) ∧ (ch2 = ph ⊕ 1). When we chose the initial test case as 5 philosophers and 5 chopsticks, the equation was covered, so the test case is adequate for equation coverage.

4. Case Study

In this section we evaluate our approach to testing high level Petri nets by applying it to the PrT net model for the Alternating Bit Protocol (ABP). We describe the evaluation procedure and results for each test adequacy criterion and discuss the limitations of the simulation-based evaluation method.

4.1. Specifying ABP

The ABP is a protocol that consists of a sender, a receiver, and two channels, for reliable transmission over channels that may corrupt, but not duplicate, messages. The channels may corrupt a message or an acknowledgment; if this corruption occurs then the message or acknowledgment has to be resent (we don't consider continuously sending messages or acknowledgments in this paper). The protocol guarantees that (1) an accepted message will eventually be delivered, (2) an accepted message is delivered only once, and (3) the accepted messages are delivered in order.

The PrT net for ABP is shown in Figure 6. The net model has three components: the Sender, the Channel and the Receiver. The Sender component of the net accepts messages from the environment via the Accept predicate, shown on the upper left of Figure 6, and sends them to the Channel component via the DataOut predicate, shown on the dotted lines between the Sender and the Channel. These messages are passed from the Channel component via the DataIn predicate to the Receiver component and delivered to the environment via the Deliver predicate, shown in the upper right of Figure 6.

The Sender component of the net has four predicates (Accept, DataBuf, DataOut, AckIn), two transitions (sendData, resendData) and eight flows f1, f2, f3, f4, f5, f6, f7, and f8. Two of the predicates (DataOut, AckIn) are shared with the Channel component of the net. The arcs in the Sender component of the net are annotated based on the tokens consumed or created. These annotations include: <m> - the original message; <d>, <d′> - where d or d′ each includes two elements, i.e., d[1] or d′[1] represents a message and d[2] or d′[2] represents a bit value for the message sequence; and <ack> - a bit value for the acknowledgment.

The Channel component of the net contains four predicates (DataOut, AckIn, DataIn, AckOut), four transitions (transmitted, corrupted, acorrupted, atransmitted), and eight flows f9, f10, f11, f12, f16, f17, f18, f19. The predicates DataIn and AckOut are shared with the Receiver component of the net. The a prefixing the transitions refers to the acknowledgment, e.g., acorrupted - corrupted acknowledgment. The annotations on the arcs are similar to those in the Sender component, except that b is used to represent corrupted message or acknowledgment tokens. The Receiver component of the net is similar to the Sender component, except that the message is delivered to the environment, and the acknowledgment is generated in the Receiver component.

The inscriptions for the ABP net shown in Figure 6 are given below, where = is an assignment operator and � denotes the empty message.

1. Net inscription of the Sender model:
ϕ(Accept) = ℘(MESSAGE), where MESSAGE is the type of strings
ϕ(AckIn) = BIT, where BIT = {0, 1}
ϕ(DataOut) = ϕ(DataBuf) = BIT × MESSAGE
R(sendData) = (ack ∈ BIT ∧ ack == 1 − d[1] ∧ d′[1] = 1 − ack ∧ d′[2] = m)
R(resendData) = (ack == d′[1] ∧ d′[2] ≠ � ∧ d′[1] = 1 − d′[1] ∧ d′[2] = m)

2. The net inscription of the Receiver model is as follows:
ϕ(Deliver) = ℘(MESSAGE)
ϕ(AckOut) = ϕ(AckBuf) = BIT


Figure 6: A PrT net model of the ABP protocol.

ϕ(DataIn) = (BIT × MESSAGE) ∪ corrupted
R(deliverData) = (d ∈ BIT × MESSAGE ∧ d[1] = 1 − b ∧ b′ = d[1] ∧ m = d[2])
R(resendAck) = (b == d[1] ∧ d[2] == ∅)

3. The net inscription of the Channel model is as follows:
R(corrupted) = (b = 1 − d′[1])
R(acorrupted) = (b′ = b)

Using the guidelines in Section 2.4, we identify the PrT net model for ABP as consisting of four components: the Sender, Receiver, Mchannel and Achannel. The Channel component of the net in Figure 6 is partitioned into Mchannel - the message channel - and Achannel - the acknowledgment channel. The PrT net model of ABP thus results in four Promela proctype constructs, one for each of the sub-nets Sender, Receiver, Mchannel and Achannel. Each transition is represented as an atomic construct in the proctype representing its sub-net.

4.2. Evaluating Test Coverage Criteria of ABP

In this section we first discuss how to build adequate test sets for evaluating each test coverage criterion, and then describe the evaluation process and the results for each criterion. We conclude this section with an analysis of the experimental results.

4.2.1. Building test sets

The test cases for running the ABP model include messages from the environment, message sequence numbers, and message acknowledgments. We represent each message as a three-digit number, and the message sequence numbers and message acknowledgments as 0 or 1. Corrupted messages or acknowledgments are detected by wrong bit values (0 or 1). Based on the PrT net model, the only difference between test cases should be the value of the initial marking for predicate Accept, which holds the messages from the environment. One of the test cases is shown below:

M0(Accept) = {128}, M0(Deliver) = {}, M0(DataBuf) = {<1, ∅>}, M0(AckIn) = {0}, M0(DataOut) = {}, M0(DataIn) = {}, M0(AckOut) = {}, M0(AckBuf) = {1}

where ∅ represents the absence of a message, and 0 and 1 are the alternating sequence numbers.

Each test case includes 0 to 1000 messages, and each test set includes zero or more test cases; in this particular case, each test set includes only one test case. Due to a degree of randomness in building adequate test sets for a given test coverage criterion, we ran 50 different test sets for each test coverage criterion. Each test set should be adequate for the test coverage criterion in ABP; we consider a test set adequate if it covers the criterion 100% at least once across different simulation executions. Note, however, that test set adequacy is not guaranteed. The size of an adequate test set is measured as the average size of the 50 adequate test sets, where the size of each test set is the number of messages in the predicate Accept of the ABP model. The cost of evaluating each criterion is measured by the number of transitions invoked during the simulation.
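The adequacy check described above can be sketched in a few lines. The following Python sketch is purely illustrative (it is not the authors' tool); `fake_run`, the 0.4 coverage probability, and the transition names used as coverage targets are all hypothetical stand-ins for a SPIN simulation run.

```python
import random

def is_adequate(test_set, required, run_simulation, repeats=5):
    """A test set is deemed adequate if, in at least one of several
    simulation runs, it covers 100% of the required elements."""
    for _ in range(repeats):
        covered = run_simulation(test_set)
        if required <= covered:
            return True
    return False

def average_adequate_size(test_sets, required, run_simulation):
    """Average size (number of messages) over the adequate test sets."""
    sizes = [len(ts) for ts in test_sets
             if is_adequate(ts, required, run_simulation)]
    return sum(sizes) / len(sizes) if sizes else 0.0

# Hypothetical stand-in for a SPIN simulation run: each message gives a
# non-deterministic chance to exercise each of four transitions.
REQUIRED = {"sendData", "resendData", "deliverData", "resendAck"}

def fake_run(test_set, rng=random.Random(7)):
    covered = set()
    for _ in test_set:
        covered.update(t for t in REQUIRED if rng.random() < 0.4)
    return covered

sets = [[f"msg{i}" for i in range(n)] for n in (1, 8, 16)]
print(average_adequate_size(sets, REQUIRED, fake_run))
```

The `repeats` parameter reflects the observation that adequacy is judged over several simulation executions rather than a single run.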

4.2.2. Evaluating coverage

Transition coverage. We first checked all transition coverage, then 2-concurrency length-3 trace coverage, followed by interleaving length-6 transition sequence coverage. We did not check all transition trace coverage since the number of transition traces in ABP is infinite. Since it is not possible to find all feasible 2-concurrency length-3 traces or interleaving length-6 transition sequences using our tool, we performed a reachability analysis to find these values. We started the testing with one message, and found that four transitions (the normal situation) were covered when the simulation exited. We increased the number of messages for place Accept to 8, at which point we found that all transitions were covered by the test set. However, a re-run of the test set may not guarantee 100% coverage of the transitions due to the non-determinism of the corruption in the Channel program. When we increased the number of messages to 16, the coverage of all transitions was fairly stable across re-runs.

When we checked the 2-concurrency length-3 trace coverage, we reused the test sets from transition coverage testing, and found that a test set was adequate when it included 8 messages, but the adequacy coverage was not stable until we increased the number of messages in the test set to 64. However, the adequate test set for interleaving length-6 transition sequence coverage was much larger, because it was rare for a message and its acknowledgment to both be corrupted and resent. When we increased the number of messages to 240, we obtained an adequate test set for interleaving length-6 transition sequence coverage.

State coverage. In the ABP model we consider places Accept and Deliver as having only two states: with messages or without messages. This assumption results in a total of 48 feasible states. If a test set included 8 or more messages, then all states were covered by it. However, checking state transition coverage and state transition path coverage is much more complex, since it is very challenging to find all feasible state transitions based on the explicit states. Instead of evaluating the explicit states directly, we define six abstract states for ABP: ready to send – the system is ready to send messages; sending – a message has been sent but has not arrived at the receiver; resending message – a message was corrupted and the system is re-sending it; sending ack – an acknowledgment has been sent by the receiver but has not arrived at the sender; resending ack – an acknowledgment was corrupted and the system is re-sending it; and received – a message has been received by the receiver. Based on knowledge of ABP, we can find all feasible state transitions and all feasible length-6 state transition paths. The test sets that were adequate for state coverage were also adequate for state transition coverage. In order to adequately cover length-6 state transition paths, the number of messages in the test set has to be very large.
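Once the abstract states are fixed, the feasible length-6 state transition paths can be enumerated mechanically and used as coverage targets. The Python sketch below is illustrative only: the transition relation `NEXT` over the six abstract states is a hypothetical example (the paper does not list the actual feasible transitions), but the counting and enumeration routines are standard.

```python
from functools import lru_cache

# Hypothetical transition relation over the six abstract states of ABP;
# the actual feasible transitions depend on the net and are assumed here.
NEXT = {
    "ready":         ["sending"],
    "sending":       ["resending_msg", "received"],
    "resending_msg": ["sending"],
    "received":      ["sending_ack"],
    "sending_ack":   ["resending_ack", "ready"],
    "resending_ack": ["sending_ack"],
}

@lru_cache(maxsize=None)
def count_paths(state, length):
    """Number of state transition paths with `length` transitions
    starting from `state`."""
    if length == 0:
        return 1
    return sum(count_paths(nxt, length - 1) for nxt in NEXT[state])

def paths(state, length):
    """Enumerate the paths explicitly, e.g. to use them as the set of
    coverage targets a test set must exercise."""
    if length == 0:
        yield (state,)
        return
    for nxt in NEXT[state]:
        for tail in paths(nxt, length - 1):
            yield (state,) + tail

print(count_paths("ready", 6))
```

Even this toy relation shows why path coverage is expensive: the number of targets grows with path length, while rare paths (those through both resending states) need many messages to be exercised.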

Flow coverage. We used the same test sets as for transition coverage to evaluate the flow coverage. Inward flow coverage, outward flow coverage and flow coverage were easily covered 100% by a test set that included 8 or more messages. For flow path coverage, however, the size of the adequate test set depends on the length of the flow paths. When the length of the flow path reaches 11, the size of the test set has to be increased substantially; beyond that, even as the length of the flow path increased, the size of the adequate test set did not grow much.

Specification coverage. We only consider equation coverage. Since each equation is part of the necessary condition of its associated transition, we regard an equation as tested as soon as its transition is fired.

4.2.3. Summary

There was a slight change in the coverage ratio every time we re-ran a test set. However, we must be aware that in extreme situations some flow paths, states or transitions, such as resendAck, may never be covered even if many messages are accepted by the model, because whether a message is corrupted is determined non-deterministically during the simulation. The fairness provided by SPIN makes the simulation easier because each transition or process has a fair opportunity to execute. We ran 50 test sets for each test coverage criterion several times, and found that those 50 test sets were adequate for all transition coverage, all flow coverage and equation coverage almost all the time. However, several test sets were no longer adequate for path-related coverage criteria when we re-ran them, and even increasing the size of the test sets did not change this much.

Figure 7(a) shows the relationship between the size of the test sets and the coverage for four different coverage criteria, where the coverage is the average coverage of the 50 test sets and the size is the average size of the 50 test sets. In Figure 7, A represents all transition coverage and all state coverage, B represents 2-concurrency length-3 transition coverage and C represents length-6 state transition path coverage. The size of the test set is not drawn to scale in Figure 7(a). Figure 7(b) shows the relationship between the average coverage of the 50 test sets for the four coverage criteria and the average number of transitions invoked during the simulation, where A, B and C represent the same coverage criteria as in Figure 7(a). The size of the steps is not drawn to scale in Figure 7(b).

Based on the results and experimental experience from the ABP case study, we found that the simulation-based approach is effective for evaluating the adequacy of many test coverage criteria for testing PrT nets. However, the main challenges of simulation-based evaluation are resolving non-determinism in concurrent systems and detecting endless loops, livelock and deadlock during the simulation. Because simulation-based evaluation only examines a selected subset of the system state space during execution, it is difficult to evaluate the adequacy of some test coverage criteria, such as all interleaving transition trace coverage in ABP (the number of interleaving transition traces is infinite). Fortunately, in most cases we only need to evaluate the feasible subsets of those infeasible test coverage criteria, and those subset criteria are still valid for evaluating many system properties of interest.

5. Discussion

In this section we present the important issues in using the SPIN simulation mode to perform test coverage analysis on PrT nets. We discuss our approach for both of the PrT net models presented in this paper: the net for the dining philosophers problem and the net for the ABP protocol.

Figure 7: Experimental results for four coverage criteria, plotting coverage% against test set size in (a) and against the number of simulation steps in (b): A - all transition coverage and all state coverage, B - 2-concurrency length-3 transition coverage, and C - length-6 state transition path coverage.

Handling infinite state spaces. In Section 2.2 we stated that a marking defines a state in a PrT net. The state space for a PrT net can be infinite or very large, making it hard to cover all the states of these nets. Bounding all places of a PrT net is necessary but not sufficient to limit the number of states; we also need to limit the number of possible values a token can take in each place. In our transformation from a PrT net to a Promela program, each place is translated to a group of variables with the same data type (bounding the place), and the data types defined in a PrT net are mapped to bounded integers or structured types in the Promela program (bounding the values of tokens). For example, in our case study we encode strings as integers. In testing PrT nets, we use abstract states, where each abstract state represents some configuration of interest to be observed during the execution of the net. The set of abstract states is finite, and we map a set of possibly infinite markings (explicit states) to an abstract state. If we want to observe the occurrence of an abstract state, i.e., that the net is in a given abstract state, then we need to observe one or more of the explicit states that map to the corresponding abstract state.

During online monitoring of the execution of a PrT net model we provide a means for tracking both the explicit states and the abstract states. The abstract state space is mapped to a fixed-length array variable in the Promela program, and we use this array variable to record the states covered during the simulation. Testing PrT nets using abstract states provides useful information. For example, in the five dining philosophers problem, we defined three abstract states: one for the situation in which no philosopher is eating (s0), one in which one philosopher is eating (s1), and one in which two philosophers are eating (s2). If during the testing process we never reach s1 or s2, we do not need to ask further questions, such as whether every philosopher, or a specific philosopher, gets the chance to eat.
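The mapping from explicit markings to abstract states for the dining philosophers example can be sketched as follows. This is an illustrative Python sketch, not the Promela monitor itself; it assumes a marking is summarized by the number of philosophers currently eating.

```python
def abstract_state(num_eating):
    """Map an explicit marking, summarized here by how many philosophers
    are eating, to one of the three abstract states s0, s1, s2."""
    if num_eating == 0:
        return "s0"
    if num_eating == 1:
        return "s1"
    if num_eating == 2:
        return "s2"
    # With five philosophers and five forks, at most two can eat at once.
    raise ValueError("infeasible marking for the five-philosopher net")

# Fixed-length coverage record, mirroring the array variable used in
# the Promela program to log which abstract states a simulation visits.
covered = {s: False for s in ("s0", "s1", "s2")}

def observe(num_eating):
    covered[abstract_state(num_eating)] = True

for n in (0, 1, 2, 1, 0):   # a hypothetical execution trace
    observe(n)
print(all(covered.values()))
```

The point of the abstraction is visible here: infinitely many explicit markings collapse onto three observable states, so the coverage record stays fixed-size no matter how long the simulation runs.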

Alternative translation approaches. We performed additional experiments using two variations on how PrT net transitions are translated to Promela statements: one in which each PrT net maps to a unique process (see Sections 3 and 4) and another in which each transition is mapped to a process. Similar coverage results were obtained in both cases. However, in the first case, we had to add an extra variable for controlling the enabling of each transition.

Handling dynamic semantics. In the SPIN simulation mode, the execution sequence of the translated PrT net can be very large, and we may have to stop the simulation at some point during the execution of the program. When the coverage criteria are met (e.g., the state coverage metric yields 1, or 100% coverage), the simulation is terminated using an assert statement or ended by a deadlock-detecting process, as discussed in Section 3.2.1. One challenge in the "one Promela process per net" implementation is deciding which transition to fire and which substitution to use for testing the enabling of a transition, given the tokens in the input places. One property a Promela program should have is that a transition that is enabled infinitely often will fire infinitely often. Both the non-deterministic choice of an atomic transition statement and the fairness of transition firing must be guaranteed during the simulation. The Promela program non-deterministically chooses both a transition and the substitution to be used for checking its enabling condition. The implementation has the following form:

:: atomic{ test_to_pick_transition && test_substitution
           -> fire_transition }

The condition test_to_pick_transition contains a variable that determines the execution chance of the transition. If a transition (the atomic construct) is chosen for evaluation by SPIN and it is enabled, its firing depends on whether test_to_pick_transition is true or false, and this decision is particularly important for testing transition coverage in the ABP model. The initial approach used to implement the mchannel process in the ABP model consisted of two transitions (transmitted and corrupted), as shown below:

proctype mchannel() {
  do
  :: atomic{ guard_transmitted -> fire_transmitted }
  :: atomic{ guard_corrupted -> fire_corrupted }
  :: else -> skip
  od
}

Using the initial approach, in the simulation mode of SPIN the first transition (transmitted) may have a higher chance to fire than the others if all of them are enabled under a marking, so the fairness of firing is not maintained. To avoid this problem, we defined a variable (tran) that alternates its value and is part of the guard of each transition (this is the variable in test_to_pick_transition), resulting in the modified code shown below:

proctype mchannel() {
  byte tran = 0;
  do
  :: atomic{ guard_transmitted && tran==0 -> fire_transmitted }
  :: atomic{ guard_corrupted && tran==1 -> fire_corrupted }
  :: else -> tran = (tran+1)%2
  od
}
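The effect of the rotating tran guard can be illustrated outside SPIN. The following Python sketch is our own simplified illustration (not part of the translation): it contrasts a scheduler that always takes the first enabled option with one whose options are guarded by a round-robin variable, as in the modified mchannel process.

```python
def biased_schedule(tokens):
    """Always pick the first enabled option: corrupted never fires
    when transmitted is also enabled under the marking."""
    fired = {"transmitted": 0, "corrupted": 0}
    for _ in range(tokens):
        fired["transmitted"] += 1   # first option always wins
    return fired

def rotating_schedule(tokens):
    """Guard each option with tran, advancing tran in round-robin
    fashion; a simplified stand-in for the else-branch rotation in
    the modified mchannel process."""
    fired = {"transmitted": 0, "corrupted": 0}
    tran = 0
    for _ in range(tokens):
        if tran == 0:
            fired["transmitted"] += 1
        else:
            fired["corrupted"] += 1
        tran = (tran + 1) % 2
    return fired

print(biased_schedule(100))
print(rotating_schedule(100))
```

Under the biased scheduler the corrupted transition is starved, so transition coverage of corrupted could never be achieved; the rotating guard gives both transitions a fair chance to fire.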

Liveness and safety issues. In this paper, we presented a framework using the SPIN simulator to record and measure a variety of structural testing coverage criteria for PrT nets proposed in [29]. It should be noted that even though we may reach 100% coverage of various testing criteria for a given PrT net, only a selected subset of the system's state space or executions was examined. Therefore we cannot draw conclusions with regard to general safety and liveness properties, which often require an exhaustive analysis of the complete state space or behavior of the system. Only in special circumstances can we conclude that a liveness property holds when a selected test execution satisfies it, or that a safety property does not hold when a selected test execution violates it. We believe that these structural testing coverage criteria provide insights and establish the basis for property-oriented testing coverage criteria, which we will explore in future work.

6. Related Work

In this section we compare our work to other approaches that use different testing techniques, such as specification-based testing, program-based testing, or a combination of the two, to evaluate test adequacy criteria. Research and experimentation on simulation-based evaluation [8, 9, 16] of test criteria or of the effectiveness of testing strategies have been conducted for many years. Most of the experimentation on simulation-based evaluation of test criteria was performed on the system implementation and not on a formal specification of the system. Rutherford et al. [24] discussed a fault-based analysis technique for evaluating test suites and test adequacy criteria using discrete-event based simulations. The focus of the evaluation described in [24] is on system-level testing of distributed systems. The experiments were conducted on three implementations of client-server applications, and the fault-based analysis was performed using mutation analysis on the implementations. Unlike the work presented in [24], our method is based on the simulation of PrT net models using the simulation mode of the model checker SPIN. The focus of the work in [24] is on comparing the effectiveness of test criteria for testing the implementation of distributed systems, whereas our work is on the systematic evaluation of test adequacy criteria on executable concurrent system models. Formal specification-based testing techniques were first introduced by Richardson et al. [22], who extended implementation-based testing techniques to formal specification languages. Several adequacy criteria for testing software architectures, based on the Chemical Abstract Machine model [18], were defined in the work by Richardson and Wolf [23]. Richardson and Wolf did not, however, discuss the issues related to evaluating the adequacy of the test criteria they defined in [23].

There are several program-based testing techniques for concurrent programs that use reachability analysis to create graphs from the source code. The main objective of these techniques is to minimize the effects caused by state explosion. Taylor et al. [26] describe a testing approach that extends structural testing criteria for sequential programs to concurrent programs, and propose a hierarchy of supporting structural testing techniques. These criteria are defined in terms of the features of a concurrency graph. Koppol et al. [19] use annotated labeled transition systems (ALTSs) to select test sequences for concurrent programs. ALTSs reduce the impact of the state explosion problem by performing incremental reachability analysis. ALTSs are similar to concurrency graphs, and the criteria by Taylor et al. can be applied to ALTSs. Koppol et al. define additional test criteria that focus on synchronization events. Unlike the approaches in [26] and [19], our approach uses executable system-level models defined at a higher level of abstraction, thereby providing us with the ability to handle the state explosion problem by using abstract states. Using system-level models also removes some of the implementation details and the dependence on certain language features.

Several researchers have used model checking to analyze and/or verify Petri net models. Gannod and Gupta [10] describe a tool that supports the use of the model checker SPIN to analyze and verify Petri nets constructed using the DOME tool. The tool by Gannod and Gupta focuses on integrating a modeling environment (DOME) with an analysis environment (SPIN). Gannod and Gupta do not consider the analysis of PrT nets in their work. Grahlmann and Pohl [12] integrate the SPIN verifier into the PEP tool (Programming Environment based on Petri nets). Several examples were presented in [12] highlighting the advantages of the integration between SPIN and PEP, the major one being the speedup of SPIN-based analysis versus prefix-based analysis. We use an approach similar to the one presented by Grahlmann and Pohl [12] when converting PrT nets into Promela programs. None of the above approaches that use SPIN to analyze and/or verify Petri nets consider analyzing the net with respect to test adequacy criteria. Using test coverage criteria during analysis provides a measure of the adequacy of the initial markings used in the verification process. In addition, our approach uses a monitor written in Promela to record and/or evaluate net events for the test criteria during the analysis process.

In our work we do not use the model checking facility of SPIN; however, several researchers have investigated using model checking to support the testing of software. Ammann et al. [1] explored the role of model checkers in software testing by investigating how the powerful computation engines in model checkers can be used to generate and evaluate test sets for a variety of test coverage criteria. The model Ammann et al. used is an FSM with constraints over states and executions represented as temporal logic constraints. The main contribution by Ammann et al. is using the model checker SMV to generate test sets using specification-based mutation analysis. Other test criteria described in [1] include uncorrelated full predicate coverage, transition pair coverage and branch coverage. Gargantini and Heitmeyer [11] used a model checker to produce counterexamples that are then used to generate test sequences. The model checker generates these counterexamples by verifying negated premises (trap properties) taken from the specification. They show how a model checker can be used to automatically generate test cases that satisfy certain structural coverage criteria. Their approach claims to generate test cases from any development artifact that can be represented as an FSM. The structural coverage criteria by Rayadurgam and Heimdahl [21] are defined in terms of the transition relation of the FSM, where each transition is viewed as a triple (pre-state, post-state, guard); a guard is a condition that must be satisfied for a change from a pre-state to a post-state. We do not focus on generating test cases but on evaluating the structural coverage criteria for a PrT net given a test case (initial marking).

Zhu and He formally define the test coverage criteria of PrT nets in [29]; our work extends theirs by providing a practical implementation for evaluating the test adequacy criteria. We use the model checker SPIN to simulate the execution of the PrT net, log the net events generated, and evaluate the adequacy of each test coverage criterion.

7. Summary and Future Work

Testing continues to be an effective approach for detecting errors and can be readily applied to concurrent and distributed systems. Objectively evaluating test adequacy criteria is critical for generating effective test sets in an efficient manner. In this paper we presented a methodology for automatically evaluating the test coverage criteria of PrT net models using the simulation capability of the model checker SPIN. We systematically evaluated the adequacy of the different test coverage criteria that can be applied to PrT net models. In addition, to overcome the inefficiency of evaluating the test coverage criteria using the simulation-based method, some practical and effective approaches for evaluation were proposed in this paper. The research results of this paper can easily be integrated into testing or model checking tools and used to evaluate test sets or test adequacy criteria of concurrent systems. Furthermore, our results for high level Petri nets can easily be adapted to other formal specification methods for which corresponding coverage criteria can be defined. In the future we plan to use the test cases generated by our methodology to produce program-based test sets and to evaluate the effectiveness of these test sets on the implementation.

Acknowledgments. We thank the anonymous reviewers for their constructive comments, which helped improve the quality of this paper. This research was supported in part by the National Science Foundation of the USA under grants HRD-0317692 and HRD-0833093.

References

[1] P. Ammann, P. E. Black, W. Ding, Model checkers in software testing, Tech. Rep. NIST-IR 6777, National Institute of Standards and Technology (2002).

[2] B. Beizer, Software Testing Techniques, Van Nostrand Reinhold, New York, NY, 1990.

[3] P. E. Black, V. Okun, Y. Yesha, Mutation operators for specifications, in: Proceedings of the 15th IEEE International Conference on Automated Software Engineering (ASE 2000), 2000, pp. 81–88.

[4] L. C. Briand, Y. Labiche, Y. Wang, Using simulation to empirically investigate test coverage criteria based on statechart, in: ICSE '04: Proceedings of the 26th International Conference on Software Engineering, Washington, DC, USA, 2004, pp. 86–95.

[5] C. DeMartini, R. Iosif, R. Sisto, A deadlock detection tool for concurrent Java programs, Softw. Pract. Exper. 29 (7) (1999) 577–603.

[6] J. Ding, A methodology for formally modeling and analyzing software architecture of mobile agent systems, Ph.D. thesis, Florida International University (2004).

[7] M. B. Dwyer, L. A. Clarke, J. M. Cobleigh, G. Naumovich, Flow analysis for verifying properties of concurrent software systems, ACM Trans. Softw. Eng. Methodol. 13 (4) (2004) 359–430.

[8] P. G. Frankl, S. N. Weiss, An experimental comparison of the effectiveness of the all-uses and all-edges adequacy criteria, in: TAV4: Proceedings of the Symposium on Testing, Analysis, and Verification, 1991, pp. 154–164.

[9] P. G. Frankl, S. N. Weiss, An experimental comparison of the effectiveness of branch testing and data flow testing, IEEE Transactions on Software Engineering 19 (1993) 774–787.

[10] G. C. Gannod, S. Gupta, An automated tool for analyzing Petri nets using SPIN, in: Proceedings of the 16th IEEE International Conference on Automated Software Engineering (ASE'01), IEEE, 2001, pp. 404–407.

[11] A. Gargantini, C. Heitmeyer, Using model checking to generate tests from requirements specifications, in: Proceedings of the 7th European Software Engineering Conference held jointly with the 7th ACM SIGSOFT International Symposium on Foundations of Software Engineering, ACM, 1999, pp. 146–162.

[12] B. Grahlmann, C. Pohl, Profiting from SPIN in PEP, in: Proceedings of the 4th International SPIN Workshop (SPIN '98), 1998.

[13] X. He, A formal definition of hierarchical predicate transition nets, in: Proceedings of the 17th International Conference on Application and Theory of Petri Nets, 1996, pp. 212–229.

[14] X. He, T. Murata, High-Level Petri Nets - Extensions, Analysis, and Applications, Electrical Engineering Handbook (ed. Wai-Kai Chen), Elsevier Academic Press, 2005.

[15] X. He, H. Yu, T. Shi, J. Ding, Y. Deng, Formally analyzing software architectural specifications using SAM, Journal of Systems and Software 71 (1-2) (2004) 11–29.

[16] R. M. Hierons, Comparing test sets and criteria in the presence of test hypotheses and fault domains, ACM Trans. Softw. Eng. Methodol. 11 (4) (2002) 427–448.

[17] G. J. Holzmann, The Spin Model Checker: Primer and Reference Manual, Addison-Wesley, Boston, MA, 2003.

[18] P. Inverardi, A. L. Wolf, Formal specification and analysis of software architectures using the chemical abstract machine model, IEEE Transactions on Software Engineering 21 (4) (1995) 373–386.

[19] P. V. Koppol, R. H. Carver, K. C. Tai, Incremental integration testing of concurrent programs, IEEE Transactions on Software Engineering 28 (6) (2002) 607–623.

[20] C. E. McDowell, D. P. Helmbold, Debugging concurrent programs, ACM Computing Surveys (CSUR) 21 (4) (1989) 593–622.

[21] S. Rayadurgam, M. P. Heimdahl, Coverage based test-case generation using model checkers, in: Proceedings of the 8th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS 2001), IEEE Computer Society, 2001, pp. 83–91.

[22] D. Richardson, O. O'Malley, C. Tittle, Approaches to specification-based testing, SIGSOFT Softw. Eng. Notes 14 (8) (1989) 86–96.

[23] D. J. Richardson, A. L. Wolf, Software testing at the architectural level, in: Joint Proceedings of the Second International Software Architecture Workshop (ISAW-2) and International Workshop on Multiple Perspectives in Software Development (Viewpoints '96) on SIGSOFT '96 Workshops, 1996, pp. 68–71.

[24] M. J. Rutherford, A. Carzaniga, A. L. Wolf, Evaluating test suites and adequacy criteria using simulation-based models of distributed systems, IEEE Transactions on Software Engineering 34 (4) (2008) 452–470.

[25] Spin, On-The-Fly, LTL Model Checking with SPIN, http://spinroot.com/ (August 2008).

[26] R. Taylor, D. Levine, C. Kelly, Structural testing of concurrent programs, IEEE Transactions on Software Engineering 18 (3) (1992) 206–215.

[27] K. Varpaaniemi, J. Halme, K. Hiekkanen, T. Pyssysalo, PROD reference manual, Tech. rep., Helsinki University of Technology, Dept. of Computer Science and Eng., Digital Systems Lab. (1995).

[28] H. Zhu, P. A. V. Hall, J. H. R. May, Software unit testing coverage and adequacy, ACM Computing Surveys 29 (4) (1997) 366–427.

[29] H. Zhu, X. He, A methodology of testing high-level Petri nets, Information and Software Technology 44 (2002) 473–489.

[30] R. Zurawski, M. Zhou, Petri nets and industrial applications: A tutorial, IEEE Trans. on Industrial Electronics 41 (6) (1994) 567–583.
