788 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 21, NO. 3, JUNE 2013

A Formal Data-Centric Approach for Passive Testing of Communication Protocols

Felipe Lalanne and Stephane Maag, Member, IEEE

Abstract—There is currently a high level of consciousness of the importance and impact of formally testing communicating networks. By applying formal description techniques and formal testing approaches, we are able to validate the conformance of implementations to the requirements of communication protocols. In this context, passive testing techniques are used whenever the system under test cannot be interrupted or access to its interfaces is unavailable. Under such conditions, communication traces are extracted from points of observation and compared to the expected behavior formally specified as properties. Since most works on the subject come from a formal model context, they are optimized for testing the control part of the communication with a secondary focus on the data parts. In the current work, we provide a data-centric approach for black-box testing of network protocols. A formalism is provided to express complex properties in a bottom-up fashion starting from expected data relations in messages. A novel algorithm is provided for evaluation of properties in protocol traces. Experimental results on Session Initiation Protocol (SIP) traces for IP Multimedia Subsystem (IMS) services are provided.

Index Terms—Conformance, data, IP Multimedia Subsystem (IMS), passive testing, protocols.

I. INTRODUCTION

IN CURRENT times, where communication is essential and an immense array of services is available online, computer networks continue to grow, and new communication protocols and services are continuously being developed. In this area, communication standards are essential to allow different systems to interwork. Although these standards can be formally verified [41], their implementations may contain faults and therefore must be tested. Testing is mainly known as the process of checking that a system possesses a set of desired properties and behavior. There is a high level of consciousness of its importance and impact for the future deployment and use of software and systems. This is notably observed in the numerous works tackling the testing areas: works provided by the research communities, of course [18], but also by the industry [15] and the standardization institutes [16].

Traditional testing approaches are based on formal models that allow them to automate the testing process (generation and execution of test sequences) in order to cover a large set of expected behaviors and to improve the reliability of the protocols. By applying formal description techniques and formal testing approaches, we are able to validate multiple aspects of protocol implementations: scalability, security, and particularly the conformance [1] to the requirements of the protocol. Within these techniques, passive testing is used whenever the state of an implementation under test (IUT) cannot be controlled by means of test sequences, either because access to the interfaces of the system is unavailable or because a reset of the IUT is unwanted.

Passive testing is based on the observation of input and output events of an implementation under test at runtime. The term “passive” means that the tests do not disturb the natural runtime of a protocol, as the implementation under test is not stimulated. The record of the event observation is called a trace. In order to check the conformance of the IUT, this trace is compared to its expected behavior, defined either by a formal model (whenever available) or by one or more expected functional properties.

The current work deals with black-box testing (i.e., the details of the protocol’s implementation are unknown) for distributed network systems. As will be explained later in a case study for IP Multimedia Subsystem (IMS) services, we also lack access to the communication interfaces of the system. Given these constraints, the current work focuses on passive testing techniques for distributed systems. Due to the complexity of current protocols, in particular application-layer protocols, complete formal specifications are rarely available (or are too expensive to develop). We therefore base our approach on testing of expected protocol behaviors (formally specified as properties).

Passive testing is often conflated with runtime monitoring [10]; they both observe or monitor a run of the system (contained in a trace) and attempt to determine the satisfaction of a given correctness property [25]. However, while passive testing has the specific purpose of delivering a verdict about the conformance of a black-box implementation (IUT), runtime verification deals with the more general aspects of property evaluation and monitor generation, without necessarily attempting to provide a verdict about the system.

Passive testing and runtime monitoring approaches derive respectively from model-based testing and model checking techniques. Due to this, they are usually propositional in nature, that is, they consider events (inputs/outputs) in the trace to be part of a (usually) finite set of symbols called the control part or control portion of the communication. This allows the testing of properties such as “if a given input is observed, then a given output must be observed at some point in the future.” Since modern protocols, especially application-layer ones, depend heavily on data, many works extend the propositional approach by using a concept like “parameterized propositions,” where propositions are considered to include a set of parameters, commonly belonging to some (usually) finite domains. An example provided in [37] for LTL formulas is a formula¹ indicating that every file that has been opened must eventually be closed. Such a kind of expression admits the incorporation of data into the definition of properties. However, being propositional or control-centric in nature, it does not allow the definition of complex relations (for instance, functional relations) between data fields, or, as the number of parameters increases, it does so to the detriment of the succinctness and readability of formulas. It is the premise of the current work that when the main functionality of a protocol is contained in the data and not in the control part, a data-centric approach, i.e., a bottom-up definition of properties, starting from expected data relations to express properties of incremental complexity, provides a more effective solution to test network protocols. We propose a novel approach to achieve precisely that purpose.

Manuscript received September 23, 2011; revised April 28, 2012; accepted July 12, 2012; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor T. Wolf. Date of publication August 09, 2012; date of current version June 12, 2013.
The authors are with the TELECOM SudParis, CNRS UMR 5157, 91011 Evry Cedex, France (e-mail: [email protected]; [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNET.2012.2210443
1063-6692/$31.00 © 2012 IEEE

Our main contributions are the following.

• A logic-based formalism that allows the definition of complex properties based on data for testing distributed network protocols. The basis for the syntax of the logic is Horn clauses, extended to allow trace messages as parameters and the comparison of data fields between messages. Trace quantifiers are introduced in Section II to express temporal relations between messages, relating either to the future or the past of the trace. The high-level expressiveness of the logic allows for complex relations between messages and data to be defined and used for expressing formulas that can be tested on protocol traces. This is an improvement over the very few existing approaches dealing with protocol data parts.

• A novel algorithm to check whether a set of formal properties (expected protocol behaviors) are contained within a trace set is introduced in Section III. While many works assume that it is possible to know the trace states (some comparisons to related works are provided in Section IV), we cannot make this assumption in a black-box testing context. Evaluation results can be obtained per trace, which can be related to verdicts about the IUT, according to preestablished conditions.

• Implementation of the introduced concepts into a framework, briefly described in Section V.

• Real experiments on the Session Initiation Protocol (SIP) in a real IMS testbed with our testing tool.

Moreover, we provide some discussions on our results and approach as well as perspectives in Section VI and conclude in Section VII.

II. USING HORN LOGIC TO EXPRESS DATA-AWARE PROPERTIES

In this section, we describe a logic, based on Horn clauses, to specify predicates and formulas to be evaluated on protocol traces. The base unit of division of protocol traces is the message, which we describe first.

¹ LTL symbols: $\bigcirc\varphi$ indicates that formula $\varphi$ must hold on the next state, $\Box\varphi$ that $\varphi$ must always (generally) hold, and $\Diamond\varphi$ that $\varphi$ must hold at some point in the future.

A. Messages in Network Protocols

Entities in communication protocols determine their behavior on the basis of two things: internal state information (including internal variables) and external data received in the form of messages (packets) from different communication peers. In passive testing, the internal state is unknown, and only message data is available to the tester. A formal description of messages is then essential to effectively perform testing.

A message in a communication protocol is, using the most general view possible, a collection of fields belonging to different data domains. Domains may be either atomic or compound. An element in an atomic domain provides information as a unit, e.g., “the set of port numbers” and “the set of last names” are atomic domains. An element in a compound domain can be subdivided into multiple data elements that may be used either individually or as an ensemble. For instance, “the set of URIs” is a compound domain, where each element is a URI composed in turn of a protocol, a user, and a domain element.

Formally, an atomic domain is a set of either numeric or string values.

Definition 2.1: Let $S$ denote the set of all strings, and $N$ the set of all numeric values; an atomic domain is any set $D$, where $D \subseteq S$ or $D \subseteq N$. Each element $v \in D$ is called an atomic value.

In a compound domain, each element (or compound value) is represented by a set of pairs $(l_i, v_i)$, where $l_i$ is used to indicate the functionality of the piece of data contained in $v_i$. For instance, a URI is represented by the compound value

$$\{(\mathit{protocol}, v_1), (\mathit{user}, v_2), (\mathit{domain}, v_3)\}$$

where $\mathit{protocol}$, $\mathit{user}$, and $\mathit{domain}$ are the labels indicating the functionality of the atomic values $v_1$, $v_2$, and $v_3$, respectively.

Definition 2.2: Let $L = \{l_1, \ldots, l_k\}$ be a set of labels and $D_1, \ldots, D_k$ a sequence of domains (not necessarily disjoint), with $k \geq 1$. A compound value of length $k$ is a set $\{(l_1, v_1), \ldots, (l_k, v_k)\}$, where $l_i \in L$ and $v_i \in D_i \cup \{\varepsilon\}$, for $i = 1, \ldots, k$.

Notice that for a particular label, the accompanying value can be undefined if the $\varepsilon$ (null) value is used.

Definition 2.3: Given a set of labels $L = \{l_1, \ldots, l_k\}$ and a sequence of domains $D_1, \ldots, D_k$ (not necessarily disjoint), with $k \geq 1$, a compound domain is the set of all compound values that can be formed with pairs $(l_i, v_i)$, with $l_i \in L$ and $v_i \in D_i \cup \{\varepsilon\}$ ($i = 1, \ldots, k$). We denote such a compound domain as the tuple $(L, D_1, \ldots, D_k)$.

It should be observed that the domains $D_i$ are not necessarily restricted to atomic domains, making it possible to define recursive structures by using compound domains within compound domains.

Definition 2.4: Let $M = (L, D_1, \ldots, D_k)$ be a compound domain. Then, a function $\pi_M : M \times L \rightarrow D_1 \cup \cdots \cup D_k \cup \{\varepsilon\}$ is defined, where given a compound value $v \in M$ then, for each pair $(l_i, v_i) \in v$, $\pi_M(v, l_i) = v_i$. In other words, the function $\pi_M$ projects the component of $v$ associated with the label $l_i$.


Finally, for a given protocol $P$, we denote as $M_P$ the set of all valid messages² of the protocol (the domain of the protocol). Without too much loss of generality, we assume that $M_P$ can be described as a compound domain³. A message in a protocol $P$ is any element of the compound domain $M_P$.

Example 2.1: A possible message for the SIP protocol, specified using the previous definition, is

representing an INVITE request (a call request) from . Notice that the value associated to the label is also a compound value, .

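In the example, given the message $m$, it might be desirable to extract the value associated with a label $l_2$ inside the value associated with a label $l_1$. Writing $\pi$ for the projection function of Definition 2.4, this would require the nested function call $\pi_D(\pi_{M_{SIP}}(m, l_1), l_2)$⁴, with $M_{SIP}$ the compound domain of all possible SIP messages. Accessing data inside messages is a basic requirement for the current approach. In order to simplify this type of nested calls, the function $\mathrm{sel}$ is defined.

Definition 2.5: Given a compound value $v$, $\mathrm{dom}(v)$ denotes the domain set of $v$. Let $t$ be a tuple of labels, $\mathrm{head}(t)$ a function that returns the first element of the tuple, $\mathrm{tail}(t)$ a function that returns the resulting tuple when removing the first element of $t$, and $|t|$ a function that returns the length of the tuple. The function $\mathrm{sel}$ is then defined recursively as

$$\mathrm{sel}(v, t) = \begin{cases} \pi_{\mathrm{dom}(v)}(v, \mathrm{head}(t)) & \text{if } |t| = 1 \\ \mathrm{sel}(\pi_{\mathrm{dom}(v)}(v, \mathrm{head}(t)), \mathrm{tail}(t)) & \text{if } |t| > 1 \text{ and } \pi_{\mathrm{dom}(v)}(v, \mathrm{head}(t)) \text{ is a compound value} \\ \varepsilon & \text{otherwise.} \end{cases}$$

Intuitively, the function $\mathrm{sel}$ receives a compound value $v$ and a sequence of labels $t$ and returns the value pointed to by the labels, or $\varepsilon$ (null) if the pointed value does not exist. In order to ease the reading of formulas in the rest of the paper, the notation $v.l_1.l_2.\ldots.l_k$ is used to represent the call $\mathrm{sel}(v, (l_1, l_2, \ldots, l_k))$. For instance, for the message $m$ defined in Example 2.1, the value accompanying the label $l_2$ for element $l_1$ in message $m$ would be represented by $m.l_1.l_2$.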
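The nested access described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: compound values are modeled as nested dictionaries, the null value as `None`, and the function name `select` as well as the message labels and values are hypothetical, not taken from the paper.

```python
def select(value, labels):
    """Follow a tuple of labels inside a compound value.

    Returns the value reached by the labels, or None (the null value)
    if the path does not exist. Compound values are modeled as dicts.
    """
    if not labels or not isinstance(value, dict):
        return None
    head, tail = labels[0], labels[1:]
    component = value.get(head)      # single-label projection
    if not tail:
        return component             # one label left: return the component
    if isinstance(component, dict):
        return select(component, tail)   # recurse into the nested compound value
    return None                      # path continues but component is atomic or null


# Hypothetical message: labels and values are illustrative assumptions.
msg = {"method": "INVITE", "from": {"user": "alice", "domain": "example.org"}}
print(select(msg, ("from", "user")))
```

A dotted selector such as `m.from.user` then corresponds to the call `select(m, ("from", "user"))`.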

B. Definition of a Trace

A trace is a collection of messages of the same domain (i.e., using the same protocol) containing the observed interactions of an entity of a network, the entity being called the point of observation (P.O) in passive testing, with one or more peers during an indeterminate period of time. In other words, the trace is the collection of all messages exchanged by the P.O within its life. Depending on the interpretation of life, such a definition makes a trace potentially infinite. Testing of properties, however, can only occur in a finite segment of the trace.

Definition 2.6: Given the domain of messages $M_P$ for a protocol $P$, a trace is a sequence $\Gamma = m_1, m_2, \ldots$ of potentially infinite length, where $m_i \in M_P$.

Definition 2.7: Given a trace $\Gamma = m_1, m_2, \ldots$, a trace segment is any finite subsequence of $\Gamma$, that is, any sequence of messages $\rho = m_i, m_{i+1}, \ldots, m_{j-1}, m_j$ ($j > i$), where $\rho$ is completely contained in $\Gamma$ (same messages in the same order). A message $m$ belongs to a finite trace $\rho = m_1, \ldots, m_n$ if $m = m_i$ for some $i \in \{1, \ldots, n\}$.

Definition 2.8: Given a finite trace $\rho = m_1, \ldots, m_n$ of length $n$, let $\mathrm{pos}(m_i) = i$ denote the position of $m_i$ in the trace⁵; the order relations $\{<, >\}$ are defined in a trace, where for $m, m' \in \rho$, $m < m' \Leftrightarrow \mathrm{pos}(m) < \mathrm{pos}(m')$ and $m > m' \Leftrightarrow \mathrm{pos}(m) > \mathrm{pos}(m')$.

In practical terms, the trace extraction process should assign each collected message to its position and time of observation, therefore the function $\mathrm{pos}$ will simply return that value. Since it is only possible to capture trace segments, in the rest of the document, trace will be used to refer to a trace segment.

² As defined in the protocol specification documents.
³ Given the recursive structure allowed by Definition 2.3, such an assumption should hold for a large range of protocols.
⁴ Since the value accompanying in is also a compound value of some domain , we use to indicate .
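As a minimal sketch of Definitions 2.7 and 2.8, a captured trace can be stored as a list of messages tagged with their position, which makes the order relations a simple integer comparison. The class and function names below (`Observed`, `capture`, `before`, `after`, `is_segment`) and the message contents are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Observed:
    pos: int     # position assigned at capture time; also makes messages distinguishable
    data: dict   # the message itself, modeled as a compound value


def capture(messages):
    """Tag each collected message with its 1-based position in the trace."""
    return [Observed(pos=i, data=m) for i, m in enumerate(messages, start=1)]


def before(a, b):
    """Order relation a < b on messages of the same trace."""
    return a.pos < b.pos


def after(a, b):
    """Order relation a > b on messages of the same trace."""
    return a.pos > b.pos


def is_segment(segment, trace):
    """True if `segment` is a contiguous run of `trace` (same messages, same order)."""
    n, k = len(trace), len(segment)
    return any(trace[i:i + k] == segment for i in range(n - k + 1))


# Hypothetical three-message trace.
trace = capture([{"method": "INVITE"}, {"statusCode": 180}, {"statusCode": 200}])
print(before(trace[0], trace[2]), is_segment(trace[1:], trace))
```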

C. Syntax of Formulas

In order to describe properties, a syntax based on Horn clauses is used. The syntax is closely related to that of the query language Datalog, described in [2], for deductive databases. Formulas in this logic can be defined with the introduction of terms and atoms. A term is either a constant, a variable, or a pointer to a subelement of a variable, as obtained with the function of Definition 2.5.

Definition 2.9: A term is either a constant, a variable, or a selector variable. In Backus–Naur form (BNF)

$$t ::= c \mid x \mid x.l.l \ldots l$$

where $c$ is a constant in some domain (e.g., a message in a trace), $x$ is a variable, $l$ represents a label, and $x.l.l \ldots l$ is called a selector variable, equivalent to evaluating the nested selection of Definition 2.5 on $x$ with the tuple of labels $(l, l, \ldots, l)$.

Definition 2.10: An atom is defined as

$$A ::= p(t, \ldots, t) \mid t = t \mid t \neq t$$

where $t$ represents a term, and $p(t, \ldots, t)$ is a predicate of label $p$ and arity given by the number of elements in parentheses. The symbols $=$ and $\neq$ represent the binary relations “equals to” and “not equals to,” respectively.

In this logic, relations between terms and atoms are stated by the definition of clauses. A clause is an expression of the form

$$A_0 \leftarrow A_1 \wedge \cdots \wedge A_n$$

where $A_0$, called the head of the clause, has the form $A_0 = p(t^*_1, \ldots, t^*_k)$, where the $t^*_i$ are a restriction on terms for the head of the clause ($t^* ::= c \mid x$). The expression $A_1 \wedge \cdots \wedge A_n$ is called the body of the clause, where the $A_i$ are atoms. Disjunction in this logic is provided by overloading predicate declarations, i.e., the definition of multiple clauses with the same head. The following set of declarations:

$$p(t^*_1, \ldots, t^*_k) \leftarrow B_1$$
$$\vdots$$
$$p(t^*_1, \ldots, t^*_k) \leftarrow B_n$$

where each $B_i$ is a clause body, is equivalent to

$$p(t^*_1, \ldots, t^*_k) \leftarrow B_1 \vee \cdots \vee B_n$$

Example 2.2: The relation , defined as follows:

accepts all messages with method or .

Finally, a formula is defined as follows.

• If $A_1, \ldots, A_n$ are atoms, with $n \geq 1$, then $A_1 \wedge \cdots \wedge A_n$ is a formula.

• If $\varphi$ and $\psi$ are formulas, then so is $\varphi \rightarrow \psi$.

• If $x, y$ are variables and $\varphi$ is a formula, then so are $\forall_x \varphi$, $\forall_{y>x} \varphi$, $\forall_{y<x} \varphi$, $\exists_x \varphi$, $\exists_{y>x} \varphi$, and $\exists_{y<x} \varphi$.

We can condense this information using the following BNF:

$$\varphi ::= A_1 \wedge \cdots \wedge A_n \mid \varphi \rightarrow \varphi \mid \forall_x \varphi \mid \forall_{y>x} \varphi \mid \forall_{y<x} \varphi \mid \exists_x \varphi \mid \exists_{y>x} \varphi \mid \exists_{y<x} \varphi$$

where $A_1, \ldots, A_n$ are atoms, $n \geq 1$, and $x$ and $y$ are variables. Some more details regarding the syntax are provided in the following.

• The $\rightarrow$ operator indicates causality in a formula and should be read as an “if–then” relation.

• The $\forall$ and $\exists$ quantifiers are equivalent to their counterparts in predicate logic. However, as will be seen in the semantics of the logic, the quantifiers in this logic only apply to the trace. Then, given a trace $\rho$, $\forall_x \varphi$ is equivalent to $\forall (x \in \rho)\, \varphi$, and $\forall_{y<x} \varphi$ is equivalent to $\forall (y \in \rho;\ y < x)\, \varphi$, with “$<$” indicating the order relation from Definition 2.8. These types of quantifiers are called trace quantifiers.

⁵ We assume that all messages in a trace can be distinguished from one another. This can be achieved by adding some unique marker (e.g., the message index) to each message during trace collection.

D. Semantics of the Logic

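The semantics used in this work are related to the traditional Apt–Van Emden–Kowalski semantics for logic programs [40]; however, some changes are introduced to deal with messages and trace temporal quantifiers. We begin by introducing the concepts of substitutions (as defined in [30]) and of ground expressions.

Definition 2.11: A substitution is a finite set of bindings $\theta = \{x_1/t_1, \ldots, x_k/t_k\}$ where each $t_i$ is a term and each $x_i$ is a variable such that $x_i \neq t_i$ and $x_i \neq x_j$ if $i \neq j$.

The application $x\theta$ of a substitution $\theta$ to a variable $x$ is defined as follows:

$$x\theta = \begin{cases} t & \text{if } x/t \in \theta \\ x & \text{otherwise.} \end{cases}$$

The application of a substitution $\theta$ to a selector variable $x.l_1 \ldots l_k$ is defined as

$$(x.l_1 \ldots l_k)\theta = \begin{cases} t.l_1 \ldots l_k & \text{if } x/t \in \theta \text{ with } t \text{ a compound value} \\ x.l_1 \ldots l_k & \text{otherwise.} \end{cases}$$

The application of a particular binding $x/t$ to an expression $E$ (atom, clause, formula) is the replacement of each occurrence of $x$ by $t$ in the expression. The application of a substitution $\theta$ on an expression $E$, denoted by $E\theta$, is the application of all bindings in $\theta$ to all terms appearing in $E$.

Example 2.3: The application of the substitution to the clause is the clause .

Two substitutions can also be composed to form a new substitution. Let $\theta = \{x_1/s_1, \ldots, x_m/s_m\}$ and $\sigma = \{y_1/t_1, \ldots, y_n/t_n\}$ be substitutions. The composition of $\theta$ and $\sigma$, denoted by $\theta\sigma$, is obtained from the set

$$\{x_1/s_1\sigma, \ldots, x_m/s_m\sigma, y_1/t_1, \ldots, y_n/t_n\}$$

by removing all $x_i/s_i\sigma$ where $x_i = s_i\sigma$ and by removing those $y_j/t_j$ for which $y_j \in \{x_1, \ldots, x_m\}$. In other words, redundant bindings and inconsistent bindings (there cannot be two different bindings for the same variable simultaneously in the resulting substitution) are removed from the composed substitution.

Definition 2.12: A ground expression is any expression where only constant (ground) terms are present. A ground instance of an expression $E$ is the expression $E\theta$, where $\theta$ is a substitution and every variable in the expression has a binding to a constant term in $\theta$.

Let $K = \{C_1, \ldots, C_p\}$ be a set of clauses and $\rho = m_1, \ldots, m_n$ a trace. An interpretation⁶ is any function $I$ such that for every expression $E$ that can be formed with elements (clauses, atoms, terms) of $K$ and terms from $\rho$, $I(E) \in \{\top, \bot\}$. It is said that $E$ is true in $I$ if $I(E) = \top$.

⁶ Called a Herbrand interpretation in logic programming.

The semantics of formulas under a particular interpretation $I$ is given by the following rules.

• The expression $t_1 = t_2$ is true, iff $t_1$ equals $t_2$ (they are the same term).

• The expression $t_1 \neq t_2$ is true, iff $t_1$ is not equal to $t_2$ (they are not the same term).

• A ground atom $A$ is true, iff $I(A) = \top$.

• An atom $A$ is true, iff every ground instance of $A$ is true in $I$.

• The expression $A_1 \wedge \cdots \wedge A_n$, where the $A_i$ are atoms, is true, iff every $A_i$ is true in $I$.

• A clause $C$ is true, iff every ground instance of $C$ is true in $I$.

• A set of clauses $K = \{C_1, \ldots, C_p\}$ is true, iff every clause $C_i$ is true in $I$.

An interpretation is called a model for a clause set $K = \{C_1, \ldots, C_p\}$ and a trace $\rho$ if every $C_i \in K$ is true in $I$. A formula $\varphi$ is true for a set $K$ and a trace $\rho$ (true in $K, \rho$, for short), if it is true in every model of $K, \rho$. It is a known result [40] that if $M$ is a minimal model⁷ for $K, \rho$, then if $M(\varphi) = \top$, then $\varphi$ is true for $K, \rho$.

The general semantics of formulas is then defined as follows. Let $K$ be a clause set, $\rho$ a trace for a protocol, and $M$ a minimal model; the operator $\hat{M}$ defines the semantics of formulas

$$\hat{M}(A_1 \wedge \cdots \wedge A_n) = \begin{cases} \top & \text{if } M(A_1 \wedge \cdots \wedge A_n) = \top \\ \bot & \text{otherwise.} \end{cases}$$

Then, for $x$ and $y$ variables in $\varphi$, we define the semantics of the quantifiers $\forall_x$ and $\exists_x$ for a potentially infinite trace $\Gamma = m_1, m_2, \ldots$ as

$$\hat{M}(\forall_x \varphi) = \begin{cases} \top & \text{if } \hat{M}(\varphi\theta) = \top \text{ for every } \theta \text{ with } x/m \in \theta \text{ and } m \in \Gamma \\ \bot & \text{if } \exists\theta \text{ with } x/m \in \theta \text{ and } m \in \Gamma, \text{ where } \hat{M}(\varphi\theta) = \bot \end{cases}$$

$$\hat{M}(\exists_x \varphi) = \begin{cases} \top & \text{if } \exists\theta \text{ with } x/m \in \theta \text{ and } m \in \Gamma, \text{ where } \hat{M}(\varphi\theta) = \top \\ \bot & \text{if } \hat{M}(\varphi\theta) = \bot \text{ for every } \theta \text{ with } x/m \in \theta \text{ and } m \in \Gamma \end{cases}$$

Since a finite trace $\rho$ is a finite segment of an infinite execution, it is not possible to declare a “$\top$” result for $\forall_x \varphi$ as in the infinite case, since we do not know if $\varphi$ may become “$\bot$” after the end of $\rho$. Equivalently, for $\exists_x \varphi$, it is unknown whether $\varphi$ becomes true for future values of $x$. Similar issues have been observed in passive testing [9], as well as in runtime monitoring [7], for evaluations on finite traces. The semantics for trace quantifiers then requires the introduction of a new truth value “?” (inconclusive) to indicate that no definite response can be provided. The semantics of quantifiers for finite traces is defined as

$$\hat{M}(\forall_x \varphi) = \begin{cases} \bot & \text{if } \exists\theta \text{ with } x/m \in \theta \text{ and } m \in \rho, \text{ where } \hat{M}(\varphi\theta) = \bot \\ ? & \text{otherwise} \end{cases}$$

$$\hat{M}(\exists_x \varphi) = \begin{cases} \top & \text{if } \exists\theta \text{ with } x/m \in \theta \text{ and } m \in \rho, \text{ where } \hat{M}(\varphi\theta) = \top \\ ? & \text{otherwise.} \end{cases}$$

The rest of the quantifiers are detailed in the following, where $x$ is assumed to be bound as a message previously obtained by $\forall_x$ or $\exists_x$:

$$\hat{M}(\forall_{y>x} \varphi) = \begin{cases} \bot & \text{if } \exists\theta \text{ with } y/m \in \theta, \text{ where } m > x \text{ and } \hat{M}(\varphi\theta) = \bot \\ ? & \text{otherwise} \end{cases}$$

$$\hat{M}(\exists_{y>x} \varphi) = \begin{cases} \top & \text{if } \exists\theta \text{ with } y/m \in \theta, \text{ where } m > x \text{ and } \hat{M}(\varphi\theta) = \top \\ ? & \text{otherwise.} \end{cases}$$

The semantics for $\forall_{y<x}$ and $\exists_{y<x}$ are equivalent to the last two formulas, exchanging $>$ by $<$. Finally, the truth value for $\hat{M}(\varphi \rightarrow \psi)$ is obtained from $\hat{M}(\varphi)$ and $\hat{M}(\psi)$ using the truth table shown in Table I.

The semantics of formulas described in the current section are not meant to provide a method for procedural evaluation of formulas, since it would be quite inefficient to calculate every model of $K$ and trace $\rho$ in order to test a particular property. An algorithm for evaluation of formulas is provided in Section III.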
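The finite-trace reading of the trace quantifiers can be illustrated with a short Python sketch. It is a direct, unoptimized transcription of the idea that a universal quantifier can only be refuted and an existential quantifier can only be confirmed on a finite segment; it is not the evaluation algorithm of Section III. The truth-value constants, function names, and sample messages are assumptions made for the example.

```python
TRUE, FALSE, INCONCLUSIVE = "T", "F", "?"


def forall(trace, phi):
    """Universal trace quantifier on a finite trace.

    FALSE as soon as one message falsifies phi; otherwise inconclusive,
    because a later (unobserved) message could still falsify it.
    """
    for m in trace:
        if phi(m) == FALSE:
            return FALSE
    return INCONCLUSIVE


def exists(trace, phi):
    """Existential trace quantifier on a finite trace.

    TRUE as soon as one message satisfies phi; otherwise inconclusive,
    because a later (unobserved) message could still satisfy it.
    """
    for m in trace:
        if phi(m) == TRUE:
            return TRUE
    return INCONCLUSIVE


def exists_after(trace, i, phi):
    """Existential quantifier over messages after trace[i]; phi takes (x, y)."""
    x = trace[i]
    return exists(trace[i + 1:], lambda y: phi(x, y))


# Hypothetical atomic check on dict-shaped messages.
def is_request(m):
    return TRUE if "method" in m else FALSE


trace = [{"method": "INVITE"}, {"statusCode": 200}]
print(forall(trace, is_request), exists(trace, is_request))
```

Note that `forall` never returns `TRUE` and `exists` never returns `FALSE`, which mirrors the need for the inconclusive value on finite traces.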

⁷ Obtained as the intersection of all models for $K, \rho$.

TABLE I
3-VALUED TRUTH OPERATOR “$\rightarrow$”

TABLE II
STRUCTURE OF AN SIP MESSAGE

E. Example for the SIP

We finalize this section with an example definition for the SIP [35] (see Section V-A.1 for more details on SIP), which will also be useful in the rest of the paper.

For testing rules in the SIP, an SIP message is defined with the labels described in Table II. With such structure, the following clauses identify a message as a request or a response:

A message is a response [35, Section 8.2.6.2] to another message if the following clause holds:

Using these definitions allows us to express properties as the following.

• The property “every message is either a request or a response” can be tested defining the additional clauses

then results for are answers to the property.

• The property “every request must have a response after it” is defined as

III. EVALUATION OF PROPERTIES ON TRACES

In Section II-D, a two-part semantics was defined for the logic: one for formulas of type $A_1 \wedge \cdots \wedge A_n$ (atomic formulas), and a second one for formulas including trace quantifiers. In the current section, we provide an algorithm for the evaluation of formulas, which also requires a two-part methodology: 1) resolution of atomic formulas, where a variant of the classical Selective Linear Definite-clause (SLD) resolution algorithm [4] is introduced; 2) evaluation of quantifiers and declaration of verdicts for a given trace. Both parts are described in the current section, although most attention is given to the second part, given that the SLD algorithm is well documented in the literature. We begin by introducing the concept of unifiers and unification.

Definition 3.1: Let $t_1$ and $t_2$ be two terms. A substitution $\theta$ such that $t_1\theta$ and $t_2\theta$ are identical (denoted $t_1\theta = t_2\theta$) is called a unifier of $t_1$ and $t_2$. If such a substitution exists, it is said that the terms unify under $\theta$.

Similarly, a unifier of two expressions⁸ $E_1$ and $E_2$ is a substitution $\theta$ such that $E_1\theta = E_2\theta$ (the expressions are identical and their terms unify). If a unifier exists, then the expressions are said to be unifiable. A substitution $\theta$ is said to be more general than a substitution $\sigma$ if there exists a substitution $\gamma$ such that $\sigma = \theta\gamma$. A unifier $\theta$ is said to be the most general unifier (mgu) of two expressions $E_1$ and $E_2$ iff $\theta$ is more general than any other unifier. This is denoted as $\theta = \mathrm{mgu}(E_1, E_2)$.

Example 3.1: Atoms and are not unifiable, while the mgu of and is the set .
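To make unification concrete, the following Python sketch computes a most general unifier for two flat atoms, i.e., atoms whose arguments are only variables or constants, which is the shape of term this logic allows in clause heads. Since there are no function symbols, no occurs check is needed. The representation (atoms as tuples, a small `Var` class, substitutions as dictionaries) is an assumption made for the example.

```python
class Var:
    """A logic variable; two variables are the same only if they are the same object."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def walk(term, subst):
    """Follow variable bindings until reaching a constant or an unbound variable."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    """Return an mgu of atoms a and b as a dict {Var: term}, or None if none exists.

    Atoms are tuples (predicate, arg1, ..., argk) whose arguments are
    Var instances or constants.
    """
    subst = dict(subst or {})
    if a[0] != b[0] or len(a) != len(b):
        return None                          # different predicate or arity
    for s, t in zip(a[1:], b[1:]):
        s, t = walk(s, subst), walk(t, subst)
        if s is t:
            continue                         # same variable
        if isinstance(s, Var):
            subst[s] = t
        elif isinstance(t, Var):
            subst[t] = s
        elif s != t:
            return None                      # two different constants
    return subst


x, y = Var("x"), Var("y")
print(unify(("p", x, "a"), ("p", "b", y)))
```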

A. SLD-Resolution

Given a set of clauses , SLD-resolu-tion of an atomic formula (also calledquery) follows a top-down process. For each atom , if a clause

exists in whose head unifies with(with unifier ), then is replaced by , and theprocess is repeated on the resulting formula

. Otherwise, no solution exists, and the algorithmreturns “ .” If in the clause and the unification succeeds,then is considered to be true and can be replaced by thesymbol “ ” in the resulting formula. A solution has been foundwhen all atoms in the formula are “ .” This process can be sum-marized by the following inference rule (borrowed from [30]):

where:1) are atoms.2) is a clause in , with renamed variables,so no conflicts can occur if the formula and the clause co-incide on the variable names.

3) .In the present work, can also be of type or. In such a case, the algorithm must evaluate the operation

or ,9, then replacing in the formula by“ ” if the evaluation succeeds. If there exists more than oneclause in that can be unified with , the algorithm musttest all alternatives until finding one that replaces every atom

8 See Definition 2.12.

9 If either operand is a variable, then “$=$” can be considered equivalent to unification; however, the evaluation of “$\neq$” should return an error unless both operands are the same variable.

Fig. 1. Resolution trees for query , for cases: (a) , a request, and (b) , a response. Each node in the tree represents a step in the resolution, starting from the head of the clause until a “$\top$” leaf node is reached. Variable symbols are replaced by their binding obtained after unification, e.g., is the result of unification of (the query), with the head of clause (1) and substitution .

in the resulting formula by “$\top$” (we cannot declare false until all alternatives are tested); this is also illustrated by Fig. 1(b) for Example 3.2.

In the following, a short example is provided to illustrate the resolution mechanism.

Example 3.2: Let the following be a clause set for the SIP protocol, as defined in Section II-E, simplified for the purposes of the example:

The clause set indicates that a valid SIP message is either a request or a response. A valid request is one with method , and a valid response is one with status code 200. Again, this is an oversimplification made for the purposes of the example. Let us test a property stating that only valid messages should be found in the trace, i.e., . However, since the interest is to demonstrate SLD-resolution, for now we will ignore the quantifiers and assume , an arbitrary message in the trace. Fig. 1 shows the evaluation steps when the query is evaluated for messages , a request, and , a response.
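The top-down resolution with alternatives can be sketched as follows for a clause set in the spirit of Example 3.2. This is a simplified illustration, not the algorithm of the tool: the query variable is assumed to be already bound to a concrete message, so full unification is replaced by attribute comparison. The clause encoding, the function `solve`, the attribute names `method` and `status`, and the request method `"INVITE"` are all assumptions introduced for this sketch.

```python
# Hypothetical clause set: each head maps to a list of alternative bodies.
# A body is a list of goals: ("call", predicate) or ("eq", attribute, value).
CLAUSES = {
    "valid": [[("call", "request")], [("call", "response")]],
    "request": [[("eq", "method", "INVITE")]],  # "INVITE" is an assumed method name
    "response": [[("eq", "status", 200)]],
}

def solve(goals, msg, clauses=CLAUSES):
    """Return True if every goal can be resolved for the message msg."""
    if not goals:
        return True  # every atom has been replaced by "true"
    first, rest = goals[0], goals[1:]
    if first[0] == "eq":
        _, attribute, value = first
        return msg.get(attribute) == value and solve(rest, msg, clauses)
    # "call" goal: try every clause for the predicate, backtracking on failure;
    # failure is only declared once all alternatives have been tested.
    for body in clauses.get(first[1], []):
        if solve(body + rest, msg, clauses):
            return True
    return False
```

With these assumptions, `solve([("call", "valid")], {"method": "INVITE"})` succeeds through the first alternative, while a message `{"status": 200}` fails the request branch and succeeds only after the response alternative is tried, which mirrors the situation of Fig. 1(b).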

B. Evaluation of Formulas in Traces

Given a formula defined using a set of clauses , it is not necessarily interesting to attempt to produce a single (general) satisfaction result (“$\top$,” “$\bot$,” or “?”) for a particular trace . Let us take, as an example, a property . Due to the semantics of the logic (Section II-D), the truth value of can only be “ ” if the value of is also “ ,” otherwise the value defaults to “ .” Since can never be “ ” (for any ), then the evaluation of can never yield a result other than inconclusive. Even though results obtained this way do not provide useful information, particular values of that make true or false do. Multiple results will be expected for the evaluation of a formula in a trace, and two rules will be used for reporting a particular result.
1) Given a formula $\forall_x \phi$, an independent result should be declared for every value of $x$, unless the expression is nested inside an “$\exists$” or “$\rightarrow$” formula.


2) Given a formula $\exists_x \phi$, a result should be given only if there exists some value of $x$ in the trace that makes the property true. Any other values for $x$ are irrelevant for the resolution.

An evaluation algorithm can be constructed using recursion, and the details of the different cases in the evaluation are provided as follows, where $eval(\phi, \theta, \rho)$ returns the evaluation of the formula $\phi$ using substitution $\theta$ into trace $\rho$.
• Evaluation of a formula $\forall_x \phi$ provides an independent solution with the evaluation of $\phi$ for every possible value of $x$ in the trace. The result is provided by the following formula:

If the formula is nested inside an existential formula or an implication, the result of evaluation is provided by

if wherewith

otherwise.

• Evaluation of a formula $\exists_x \phi$ looks for a “$\top$” result to the evaluation of $\phi$

if wherewith

otherwise.

• Evaluation of a formula $\forall_{y>x} \phi$ assumes that a binding for $x$ to a message in the trace already exists in $\theta$ and provides a solution for every evaluation of $\phi$ after the position of that message. Depending on the implementation, an error should occur if no binding for $x$ exists in $\theta$

For nested formulas

if withwhere ,

otherwise.

• Evaluation of a formula $\exists_{y>x} \phi$ looks for a “$\top$” result to the evaluation of $\phi$ after the position of $x$ in the substitution

if withwhere ,

otherwise.

• Evaluation of $\forall_{y<x} \phi$ and $\exists_{y<x} \phi$ is analogous to their equivalents with “$>$,” just replacing every occurrence of “$>$” by “$<$.”

• Evaluation of a formula $\phi \rightarrow \psi$ will first evaluate $\phi$, and only if the result is “$\top$” is $\psi$ then evaluated. New bindings obtained from the evaluation of $\phi$ must be used in the evaluation of $\psi$. If the evaluation of $\phi$ is “$\bot$” or “?,” the result is considered vacuous (uninteresting), since evaluation of and are independent of the value of

if and

if and

if and

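• Evaluation of a formula $A_1 \wedge \ldots \wedge A_n$, where $A_1, \ldots, A_n$ are atoms, returns the value obtained using SLD-resolution, as described in Section III-A

$$eval(A_1 \wedge \ldots \wedge A_n, \theta, \rho) = \begin{cases} \top & \text{if } (A_1 \wedge \ldots \wedge A_n)\theta \text{ has a solution} \\ \bot & \text{otherwise.} \end{cases}$$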
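The recursive structure of the cases above can be sketched in a few lines. This is one possible reading of the rules rather than the algorithm of the tool: the tuple encoding of formulas, the names `evaluate` and `report`, the use of `None` for vacuous results, and in particular the default “?” value returned when a quantifier finds no deciding message are assumptions of this sketch. Atoms are plain Python predicates instead of SLD queries, and bindings produced by the premise of an implication are not propagated to its conclusion.

```python
T, F, UNK = "T", "F", "?"  # true, false, inconclusive

def evaluate(f, theta, trace):
    """Nested (single-result) evaluation of formula f under the bindings theta."""
    kind = f[0]
    if kind == "atom":  # ("atom", predicate over the bound messages)
        env = {var: trace[i] for var, i in theta.items()}
        return T if f[1](env) else F
    if kind == "implies":  # ("implies", premise, conclusion)
        if evaluate(f[1], theta, trace) != T:
            return None  # vacuous result
        return evaluate(f[2], theta, trace)
    # Quantifiers: ("forall" | "exists", var, after_var_or_None, body)
    _, var, after, body = f
    # Raises KeyError if the reference variable has no binding yet.
    start = theta[after] + 1 if after is not None else 0
    results = [evaluate(body, {**theta, var: i}, trace)
               for i in range(start, len(trace))]
    if kind == "exists":
        return T if T in results else UNK  # assumed default: inconclusive
    return F if F in results else UNK      # "forall", assumed default: inconclusive

def report(f, trace):
    """Top-level 'forall x': one independent result per message of the trace."""
    _, var, _, body = f
    return [evaluate(body, {var: i}, trace) for i in range(len(trace))]
```

As a usage illustration, a property of the form "for every request, a later response with the same identifier exists" evaluated on a three-message trace (request 1, response 1, request 2) yields one true result, one vacuous result, and one inconclusive result for the unanswered request at the end of the trace.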

The previous rules define every possible case for the evaluation algorithm. In the last part of this section, the time complexity of the algorithm is provided.

1) Complexity of the Algorithm: As seen in the previous section, the evaluation of formulas can be represented by a tree. Intuitively, the time complexity of evaluation will depend on the number of nodes in the tree. On the other hand, the memory complexity will only depend on the height of the tree, given that only the part of the tree being evaluated needs to be kept in memory (top-down resolution). Time complexity then being the most critical issue, we will focus on it in the current section.

Since each evaluation will be processed to one or more formulas of type , two time complexities can be recognized for each formula: represents the time required for the evaluation of a formula and represents the worst-case time of evaluation of a formula . For a simple formula, the height of the resolution tree is small compared to the length of the trace, therefore an upper bound for the evaluation time of a simple formula can be used. Therefore, given a formula with quantifiers , where each , variables related to messages, and a trace , we can state that

which shows that the worst-case complexity for this type of formula is $O(n^k)$, where $n$ is the length of the trace and $k$ is the number of quantifiers in the formula. Although it seems large, it should be emphasized that this represents the worst-case complexity, i.e., the complexity for a trace where the evaluation of every quantifier returns “ .” It should also be clarified that this corresponds to the complexity of analyzing the whole trace, and not for obtaining individual solutions, which depends on the type of quantifiers used.


For a formula such as , the analysis of the complexity is similar, with the exception that the indexes on the summation will no longer be independent. Using the same methodology, the time complexity is found to be

Similarly, for a formula with three quantifiers, the time complexity is given by . Following an inductive methodology, we can show that, for a formula , the complexity is given by a polynomial of order .

For a formula with a “$\rightarrow$” operator

where represent different quantifiers, it is also simple to show that the time complexity of the evaluation is in the worst case.
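The difference between independent and dependent summation indexes can be made concrete with a small worked count. This is only an illustration under the assumption that each evaluation of the quantifier-free part costs a constant $c$ and that the trace has $n$ messages; the symbols $c$ and $n$ are chosen for this illustration.

```latex
% Two quantifiers ranging independently over the whole trace:
\sum_{i=1}^{n} \sum_{j=1}^{n} c \;=\; c\,n^{2}

% Second quantifier restricted to messages after the first one (j > i):
\sum_{i=1}^{n} \sum_{j=i+1}^{n} c \;=\; c \sum_{i=1}^{n} (n - i) \;=\; c\,\frac{n(n-1)}{2}
```

Both counts are polynomials of order 2 in $n$, so restricting the second quantifier to later messages roughly halves the work without changing the order of the worst-case complexity.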

C. Obtaining Conformance Verdicts

In the previous sections, we have introduced a syntax and semantics for formulas, where the satisfaction of a formula (expressed as a relation between data variables) in a trace can be determined within truth values $\{\top, \bot, ?\}$. However, such results are only interesting in a testing context if they can be translated to a conformance verdict about the implementation (pass, fail, inconclusive) to indicate the existence (or lack of observation) of faults. In this section, we present our approach to deal with this.

An important issue needs to be dealt with when testing temporal properties on finite traces. For a property such as “if a first event happens, then a second event must occur in the future,” if the first event is observed but the second is not, how does one determine whether the second event was not produced, or whether the trace collection stopped too early? We have already considered this in the design of our semantics for formulas, with the introduction of the truth value “?.” The issue is now to determine whether a fail or an inconclusive verdict should be provided for a “?” satisfaction result. We provide four alternative solutions, integrated into our approach.
1) Assume that the trace is not long enough, i.e., a “?” satisfaction result necessarily yields an inconclusive verdict. Although this is a strict assumption, and it may not always provide interesting results, further analysis of the verdicts might. For instance, a large number of inconclusive verdicts for the same property may be indicative of a fault.

2) Assume that the trace is long enough, i.e., a “?” satisfaction result yields a fail verdict. If the trace is long enough, this might be a suitable alternative; however, some false positive results (i.e., premature fail verdicts) may be expected at the edges of the trace.

3) Define an alternative, conditional behavior to be observed. If, while attempting to observe a given property, the conditional behavior is observed first, then a fail verdict is returned; otherwise, an inconclusive verdict is returned. However, the condition may not always be easy to define.

4) Explicitly define behavior that should not be observed, or negative behavior. This way, satisfaction of a negative property (a “$\top$” result) must indicate a fail verdict. However, negative behavior may not always be easy to define from the requirements of the service or protocol.

With such alternatives, we define positive and negative passive tests as follows.

Definition 3.2: Let $\phi$ and $\psi$ be formulas in our logic. A positive passive test is defined as the pair formed by $\phi$ and $\psi$, where $\phi$ represents the expected or test behavior and $\psi$ represents a conditional behavior. The evaluation of a passive test returns the verdict pass if the behavior defined by $\phi$ is observed, fail if the behavior specified by $\phi$ is not observed and the behavior specified by $\psi$ is observed, and inconclusive if neither is observed.

As a convention, a test whose condition is “$\top$” represents a test where the condition is true, i.e., a fail verdict must be returned if the sequence defined by $\phi$ is not observed. A test whose condition is “$\bot$” represents a test where the condition is false, i.e., an inconclusive verdict must be returned if the sequence defined by $\phi$ is not observed.

Definition 3.3: Let $\phi$ and $\psi$ be formulas in our logic. A negative passive test is defined as , where $\phi$ represents a negative test behavior and $\psi$ is the conditional behavior. In evaluation, , , and . Analogously to positive passive tests, a negative test whose condition is “$\top$” returns a pass verdict if $\phi$ is not observed, and one whose condition is “$\bot$” returns an inconclusive verdict if $\phi$ is not observed.
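The mapping from observations to verdicts can be written down directly. The sketch below is a hedged illustration: the function names and the boolean parameters are hypothetical, `positive_verdict` follows the wording of Definition 3.2, and `negative_verdict` encodes one reading of Definition 3.3 in which an observed negative behavior always yields a fail verdict.

```python
def positive_verdict(test_observed, condition_observed):
    """Verdict of a positive passive test, following Definition 3.2."""
    if test_observed:
        return "pass"
    # Test behavior not observed: the conditional behavior decides.
    return "fail" if condition_observed else "inconclusive"

def negative_verdict(test_observed, condition_observed):
    """Verdict of a negative passive test (assumed reading of Definition 3.3)."""
    if test_observed:
        return "fail"  # the behavior that should not occur was observed
    return "pass" if condition_observed else "inconclusive"
```

Passing `condition_observed=True` unconditionally corresponds to a test with a true condition, and `condition_observed=False` to a test with a false condition, which never produces a fail verdict for a positive test.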

IV. COMPARISON TO RELATED WORK

A number of different approaches to the testing and monitoring of formulas in traces exist in the literature for passive testing and runtime monitoring. In the following, we describe works in both categories in relation to their ability to express data relations for defining properties.

A. Passive Testing

Formal testing methods have been used for years to prove correctness of implementations by combining test case evaluation with proofs of critical properties. In [18], the authors present a description of the state of the art and theory behind these techniques. Within this domain, and in particular for network protocols, passive testing techniques have to be used to test already-deployed platforms or when direct access to the interfaces is not available. Some examples of these techniques using Finite State Machine derivations are described in [24], [28], and [38]. Most of these techniques consider only control portions; in [22], [23], and [39], data portion testing is approached by evaluation of traces in Event-based Extended Finite State Machine (EEFSM) and Simplified Extended Finite State Machine (SEFSM) models, testing correctness in the specification states and internal variable values. Our approach, although inspired by these works, is different in the sense that we test critical properties directly on the trace and only consider a model for potential verification of the properties. In [5] and [12], the invariant approach was presented; it was also studied in [9] and [19]. A study of the application of invariants to an IMS service was also presented by us in [20] and [21].


In more recent work, the authors of [29] define a methodology for the definition and testing of time-extended invariants, where data is also a fundamental principle in the definition of formulas and a packet (similar to a message in our work) is the base container data. In this approach, the satisfaction of the packets to certain events is evaluated, and properties are expressed as , where and are events defined as a set of constraints on the data fields of packets, is the number of packets where the event should be expected to occur after finding in the trace, and is the amount of time where event should be found on the trace after (or before) event . This work served as an inspiration for the work described in this paper; however, in our work we improve on it by allowing the definition of formulas that test data relations between multiple messages/packets. Although we do not take into account real time in the present work, we will discuss it in Section VI.

Although closer to runtime monitoring, the authors of [11] propose a framework for defining and testing security properties on Web services using the Nomad [14] language, based on previous works by the authors of [26] and [27]. As a work on Web services, data passed to the operations of the service are taken into account for the definition of properties, and multiple events in the trace can be compared, allowing the definition of, for instance, properties such as “Operation can only be called between operations and .” Nevertheless, in Web services, operations are atomic, that is, the invocation of each operation can be clearly followed in the trace, which is not the case with network protocols, where operations depend on many messages and sometimes on the data associated with the messages.

B. Runtime Monitoring

Runtime monitoring and runtime verification techniques have gained momentum in recent years, particularly using model checking techniques for testing properties on the trace. The authors of [25] provide a good survey and introduction of methodologies in this area. The usual approach consists of the definition of some logic (LTL is commonly used), which is used to create properties from which a monitor is defined to test on the trace. The authors of [7] describe the definition of monitors as finite state machines for LTL formulas. They introduce a 3-valued semantics (true, false, inconclusive) in order to test formulas for finite segments of the trace.10 In [8], they expand their analysis on inconclusive results by proposing a 4-valued semantics to distinguish cases where the property is most likely to become true or become false on the continuation of the trace. The analysis provided in this work on finite and infinite traces is based on the definitions from these authors, applied to our logic.

Regarding the inclusion of data, the concept of parameterized propositions is introduced by the authors of [37]. Propositions can contain data variables, and quantifiers can be defined for the data variables by the introduction of a operator, formulas of type , where

10 In their work, a trace segment is considered a finite word with an infinite continuation, so formulas that deal with the future of the trace have to take into account that the property can become true (or false) on the continuation of the trace.

are quantifiers and are variables. In this approach, valid data values in formulas are fixed, so if is used on the left side, the set with valid values must have been defined previously. Although it is an interesting approach to data testing, it is still propositional in nature. Our approach adds flexibility to the definition of formulas by considering data the central part of the communication.

A similar approach to ours is presented by the authors of [34] for attack detection in logs. Their work uses a simplified LTL syntax where only the operators “ ” and “ ” are used. A formula

attempts to find a record in the log matching the first part ( and are fields in a record), and a future one matching the second part, where the variable is assigned using a mechanism similar to unification. However, their approach, focusing on attack detection, only deals with infinite traces since definite verdicts are not a requirement.

Another work, defined to test message-based workflows, is provided by the authors of [17] in the definition of the logic . Here, data are a more central part of the definition of formulas, and LTL temporal operators are used to indicate temporal relations between messages in the trace. Messages are defined as a set of pairs , similarly to our work, and formulas are defined with quantifiers specific to the labels. As an example, the formula

indicates that generally, if a message with method is found, then there exists a field in that message, such that a future message with status 200 exists with the same . Although the syntax of the logic is flexible, it can quickly lose clarity as the number of variables required increases. Our current work improves on this by allowing constraints to be grouped with clause definitions.

Finally, in [6], the authors propose a logic for runtime monitoring of programs, called EAGLE, that uses the recursive relation from LTL (and its analogous for the past) to define a logic based only on the operators next (represented by ) and previous (represented by ). Formulas are defined recursively and can be used to define other formulas. Constraints on the data variables and time constraints can also be tested by their framework. However, their logic is propositional in nature, and their representation of data is aimed at characterizing variables and variable expressions in programs, which makes it less than ideal for testing message exchanges in a network protocol as required in our work.

C. Comparison on Qualitative Aspects

Passive testing and runtime monitoring mainly consider conformance and safety properties. Two main reasons can be identified. The first one is that many interesting properties to be tested on traces are not expressible in some logics (e.g., LTL). Second, even when a property is expressible, it can be unmonitorable [33]. In our approach, nonregular properties such as proper nesting of operators like calls and returns [36] are expressible. Furthermore, in many solutions, the researchers assume that the current states of the observed trace are known. In our case, we do not require such assumptions. Our monitoring


TABLE III
SOME COMPARATIVE ASPECTS OF PASSIVE TESTING TOOLS

technique has no effect on the running behavior of the monitored system. Moreover, as mentioned above, our points of observation are set in a black-box framework that does not allow any homing phase [9]. Since no specification of the implementation under test is provided, the extracted traces are not related to any known states. Because of these concerns, a comparison of the approaches according to their expressiveness, efficiency, complexity, and capabilities is not easy to settle [3]. Nevertheless, we try in the following to compare some key aspects of these approaches.

Our method can be compared to the techniques used in PASTE [9] and EAGLE [6]. These two tools are representative of how to passively and efficiently test a protocol. EAGLE provides an interesting formalism to express complex properties. However, it assumes that the values of the variables of each state in a trace are known. Furthermore, even if its expressiveness is close to ours, the design of such properties is difficult given their complex scheme, making it hard to implement them efficiently. PASTE embeds innovative passive testing algorithms, but does not consider the causality between the data portions in a trace. With our approach, we argue first for the need to check all the packets in a trace since the states are unknown (not done in EAGLE) and second for the analysis of data constraints through all the packets of the trace (not done in PASTE). When comparing memory and time complexities of the three algorithms (see Table III), our tool Datamon presents a high time complexity in comparison to the others. However, the Datamon memory complexity is much more favorable. The reasons are as follows. In our work, we manage some data in the formula and we do not assume any knowledge about the implementation states, which increases the time complexity. However, in compensation, our top-down resolution tree leads to a linear memory complexity. The complexity is not the only key point for comparing our approach to others. Table III details the comparative aspects of the three above-mentioned approaches plus another one named MOP11 commonly used in benchmarks [13].

Table III illustrates the advantages and limitations of our approach. Although the time complexity of our algorithm is higher, the space complexity is considerably better. Two limitations of Datamon can be noted: It does not take into account formulas written in temporal logics (EAGLE and MOP do), and the obtained verdicts cannot act directly on the IUT.

11 http://fsl.cs.uiuc.edu/index.php/Monitoring-oriented_programming

Fig. 2. Architecture for the framework.

V. IMPLEMENTATION AND EXPERIMENTS

We have implemented this testing framework using Java. The implemented system is composed of three main modules: 1) filtering and conversion of collected traces; 2) evaluation of tests; and 3) evaluation of formulas.

Fig. 2 shows the way the modules interact and the inputs and outputs from each one.

The trace processing module takes the raw traces collected from the network exchange, and it converts the messages from the input format. In our particular implementation, the input trace format is PDML, an XML format that can be obtained from Wireshark12 traces. The purpose of the module is to convert each packet in the raw trace into a data structure (a compound value) conforming to the definition of a message. The format for the message is defined in a different input file to the module. There, each subelement for the target message (e.g., “.method,” “.cseq.seq”) is associated with the respective element in the trace source format. This module also performs filtering of the trace in order to only take into account messages of the studied protocol (defined in the tool configuration). Since this is a separate module of the implementation, alternative trace formats can be supported by modifying or extending this module.

The test evaluation module receives as input a passive test as defined in Section III-C, as well as a trace from the trace processing module, and produces a verdict from the satisfaction results of the test and conditional formulas. The formula evaluation module is implemented as described in Section III-B. It receives a trace and a formula, along with the clause definitions, and returns a set of satisfaction results for the query in the trace, as well as the messages and variable bindings obtained in the process. The implementation and the files used for the experiments can be found at http://www-public.telecom-sudparis.eu/~lalanne/ieee.html.
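The conversion and filtering performed by the trace processing module can be sketched as follows. The actual tool is written in Java and reads the mapping from a separate input file; this Python sketch only illustrates the idea. The mapping `FIELD_MAP`, the function `to_messages`, and the sample document are assumptions made for the illustration, and the PDML field names used here should be checked against an actual Wireshark export before being relied upon.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from message attributes to PDML field names.
FIELD_MAP = {
    "method": "sip.Method",
    "status": "sip.Status-Code",
    "cseq.seq": "sip.CSeq.seq",
}

# Tiny hand-written PDML-like sample: one SIP packet and one non-SIP packet.
SAMPLE_PDML = """<pdml>
  <packet>
    <proto name="sip">
      <field name="sip.Method" show="INVITE"/>
      <field name="sip.CSeq.seq" show="1"/>
    </proto>
  </packet>
  <packet>
    <proto name="tcp"><field name="tcp.port" show="5060"/></proto>
  </packet>
</pdml>"""

def to_messages(pdml_text, protocol="sip", field_map=FIELD_MAP):
    """Convert the packets of the given protocol into flat message dictionaries."""
    messages = []
    for packet in ET.fromstring(pdml_text).iter("packet"):
        proto = packet.find(f"proto[@name='{protocol}']")
        if proto is None:
            continue  # filtering: skip packets of other protocols
        # Collect every (possibly nested) field of the protocol layer.
        shown = {f.get("name"): f.get("show") for f in proto.iter("field")}
        messages.append({attr: shown[name]
                         for attr, name in field_map.items() if name in shown})
    return messages
```

On the sample above, only the first packet is kept, and it is converted into a message with the attributes `method` and `cseq.seq`. Supporting another trace format would amount to replacing this function while keeping the message structure unchanged.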

A. Experiments

1) IP Multimedia Subsystem Services: The IP Multimedia Subsystem (IMS) is a standardized framework for delivering IP multimedia services to mobile users. It was originally intended to deliver Internet services over GPRS connectivity. This

12 Wireshark: http://www.wireshark.org/


Fig. 3. Core functions of the IMS framework.

vision was extended by the 3GPP, 3GPP2, and TISPAN standardization bodies to support more access networks, such as Wireless LAN, CDMA2000, and fixed access networks. The IMS aims at facilitating the access to voice or multimedia services in an access-independent way in order to develop the fixed-mobile convergence. To ease the integration with the Internet world, the IMS makes heavy use of IETF standards.

The core of the IMS network consists of the Call Session Control Functions (CSCF), which redirect requests depending on the type of service, the Home Subscriber Server (HSS), a database for the provisioning of users, and the Application Server (AS), where the different services run and interoperate. Most communication with the core network and between the services is done using the SIP [35]. Fig. 3 shows the core functions of the IMS framework and the protocols used for communication between the different entities.

The SIP is an application-layer protocol that relies on request and response messages for communication. Messages may have to traverse multiple proxies and servers, which may be stateless or stateful, to arrive at a destination, meaning that messages must include different types of session/routing data to ensure communication. Messages can contain a body part that can be used for complementing or extending information provided by the headers of the protocol (for instance by providing the media configuration by using the session description protocol). Also, several other RFCs have been defined to extend the protocol to allow messaging, event publishing, and notification. These extensions are used by services of the IMS architecture such as the Presence service [31] and the Push-to-talk Over Cellular (PoC) service [32].

For the experiments, traces for an ad hoc PoC session establishment were obtained from a production IMS implementation, provided by the Alcatel-Lucent company and extracted from the interfaces of the IMS core (as shown in Fig. 3). These traces contain all communication between the client, the IMS core, and the AS. Given the point of observation, message exchanges for the Presence service, as well as the PoC service, can be observed in the collected traces. Also, many packets for protocols other than SIP (TCP, RTCP, TalkBurst) appear as well. Although the tool allows the filtering of these messages, due to the massive amount of extra information (in one case, out of 137 530 messages, only 299 were SIP messages), filtering was done prior to the tests. In the following, the properties used for evaluation, the syntax, and the obtained results for each one are described.

TABLE IV
RESULTS OF TESTING THE PROPERTY “FOR EVERY REQUEST THERE MUST BE A RESPONSE” ON THE SET OF TRACES

2) For Every Request There Must be a Response: This property can be used for a monitoring purpose in order to draw further conclusions from the results. Due to the nature of the property, false results can never be provided for the evaluation of the test part of the passive test. Furthermore, given the generality of the test formula, a condition cannot be defined since the condition depends on the type of request and response. Finally, due to the fact that the provided traces are not very long, a “$\bot$” condition is used to avoid false positive verdicts (cf. Section III-C).

Nevertheless, as will be shown, inconclusive results can also provide interesting information about the peers of the communication. The property is defined using a positive passive test with the following test and condition :

where accepts all nonprovisional responses (nonfinal responses, with status ) to requests with a method different than ACK, which does not require a response. The results from the evaluation on the traces are shown in Table IV. As expected, most traces show only true results for the property evaluation; however, traces 4 and 10 show an unusual number of inconclusive results. Taking a closer look at trace 10, all of the errors correspond to messages, with an header corresponding to a event (RFC 4575). Furthermore, the first two messages indicate the original request (from the AS to the SIP core and from the SIP core to the client, respectively), and the rest correspond to retransmissions of the first two. All of these messages are at the end of the trace, which could indicate that the client closed the connection before receiving the NOTIFY message. The same phenomenon can be observed on trace 4.

3) Every Session Initialization Must be Acknowledged: The session initialization procedure is a three-way handshake, composed of the messages – – . The construction of the request is detailed in [35, Section 13.2.2.4] and is described by the following clause:


TABLE V
RESULTS OF TESTING THE PROPERTY “EVERY SESSION INITIALIZATION MUST BE ACKNOWLEDGED” ON THE SET OF TRACES

A positive passive test with the following test and condition pair serves to validate the property that every successful request should be acknowledged:

where accepts all success responses (status code between 200 and 299) and the failure criterion for the passive test is based on finding a session terminating request , which provides indication that the session was initiated without an message appearing in the trace. The predicate is defined as

The results of evaluation are shown in Table V. Since the premise of the test formula is much more restrictive than for the example in Section V-A.2, very few cases are reported by the tool, and most results are vacuous (not shown in the table). This can be observed in traces 1, 5, and 8, where only vacuous results were reported, since these traces mostly contain and messages, and no responses to messages appear.

In traces 3 and 6, a fail verdict was produced, meaning that a session was initiated without acknowledgment from the client, determined by the appearance of a session termination message as indicated by the condition . Since a single fail verdict was produced in comparison to several pass verdicts, the result may be indicative of an error in the collection of the trace, and it is not necessarily conclusive. Nevertheless, it shows the effectiveness of our approach to detect inconsistent behavior.

4) Subscription to Events and Notifications: The presence service is a system to disseminate presence information. This service can be used as part of an IMS network and relies on the SIP for session initialization and setup. In the presence service, a user (the watcher) can subscribe to be notified of another user's (the presentity) presence information. This works by using the SIP messages , , and for subscription, update, and notification, respectively. These messages also allow the subscription to types of events other than presence, which is indicated in the header on the SIP message. It is desirable then to test that whenever there is a subscription, a notification must occur upon an update event. This

TABLE VI
RESULTS OF TESTING THE PROPERTY “WHENEVER AN UPDATE EVENT HAPPENS, SUBSCRIBED USERS MUST BE NOTIFIED” ON THE SET OF TRACES

can be tested with a positive passive test , where and are defined as follows:

where , , and hold on , , and events, respectively. Notice that the variables , , and may not have a value at the beginning of the evaluation; in that case, their value is set by the evaluation of the clause, shown in the following:

Here, the “$=$” operator compares the two terms. However, if one of the terms is an unassigned variable, then the operator works as an assignment. In the formula, the values assigned on the evaluation of will then be used for comparison in the evaluation of . This is another way of defining formulas, different from using only message attributes.

The results of evaluating the passive test are shown in Table VI. The results show no inconclusive results, although they also show that the full notification sequence is not present in most traces, with the exception of traces 9 and 10. Notice that we are explicitly looking for a sequence ; however, the sequence can also be present for subscription to server events, therefore and events might also appear on the trace. To test the capabilities of detection, some messages were manually introduced on a trace, matching existing messages. The lack of notification for the update was correctly detected by the evaluation of the property.

From the results in Table VI, it can also be seen that the evaluation of this property is much more time-consuming than the ones in Tables IV and V. Although this is expected given the number of quantifiers and the complexity of evaluation described in Section III-B.1, the definition of the test formula is also quite inefficient. During evaluation, all combinations of and are tested until both and return a


TABLE VII
RESULTS OF TESTING THE PROPERTY “WHENEVER AN UPDATE EVENT HAPPENS, SUBSCRIBED USERS MUST BE NOTIFIED” ON THE SET OF TRACES

“ ” result. The evaluation time can be improved by rewriting as

which can be understood as: “if an update event is found, then if a previous subscription to that event exists, a notification must be provided at some point after the update event.” The results of evaluating this property are shown in Table VII. Notice that for trace 9, a different number of true results is returned. This is due to the order of search given by the property. In the previous case, one pair – sufficed to return a result. In the current property, for each , it will look for a matching . Since for every subscription there can exist multiple updates, the number of true results differs.
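The assign-or-compare behavior of the equality operator and the search order of the rewritten property can be illustrated with a small sketch. This is not the authors' algorithm: the dictionary-based message representation, the attribute names `method`, `user`, and `event`, the `?`-prefixed variable convention, the helper names `unify`, `matches`, and `check_updates`, and the toy trace are all assumptions made for illustration. The sketch also assumes that an update with no later notification yields "inconclusive", since the trace may simply have ended too early; a stricter reading could report "false" instead.

```python
def unify(left, right, env):
    """Compare two terms; bind an unassigned variable instead of comparing.

    Variables are strings starting with '?' (an assumed convention).
    Returns an extended copy of env on success, or None if the terms differ.
    """
    def resolve(term):
        # Replace an already-bound variable by its value.
        if isinstance(term, str) and term.startswith("?"):
            return env.get(term, term)
        return term

    left, right = resolve(left), resolve(right)
    for var, value in ((left, right), (right, left)):
        if isinstance(var, str) and var.startswith("?"):
            return {**env, var: value}  # unassigned variable: act as assignment
    return env if left == right else None  # two values: act as comparison


def matches(msg, pattern, env):
    """Unify every attribute of pattern against msg, threading the bindings."""
    for key, term in pattern.items():
        env = unify(term, msg.get(key), env)
        if env is None:
            return None
    return env


def check_updates(trace):
    """One verdict per update that has an earlier matching subscription."""
    verdicts = []
    for i, msg in enumerate(trace):
        # The update binds ?u and ?e; later clauses only compare against them.
        env = matches(msg, {"method": "PUBLISH", "user": "?u", "event": "?e"}, {})
        if env is None:
            continue
        sub = {"method": "SUBSCRIBE", "user": "?u", "event": "?e"}
        if not any(matches(m, sub, env) is not None for m in trace[:i]):
            continue  # no previous subscription: the property does not apply
        note = {"method": "NOTIFY", "user": "?u", "event": "?e"}
        found = any(matches(m, note, env) is not None for m in trace[i + 1:])
        verdicts.append("true" if found else "inconclusive")
    return verdicts


trace = [
    {"method": "SUBSCRIBE", "user": "alice", "event": "presence"},
    {"method": "PUBLISH", "user": "alice", "event": "presence"},
    {"method": "NOTIFY", "user": "alice", "event": "presence"},
    {"method": "PUBLISH", "user": "alice", "event": "presence"},
]
print(check_updates(trace))  # ['true', 'inconclusive']
```

In this toy trace a single subscription is followed by two updates, so two verdicts are produced, which mirrors why the number of results changes when the search starts from the update rather than from the subscription.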

VI. DISCUSSION AND FUTURE WORKS

As can be seen from Section V, inconclusive verdicts are often unavoidable when evaluating properties on finite traces, and a conditional behavior is not always possible to define. For a formula , there is no way of deciding when to stop looking for a message for which holds and declare a verdict, other than reaching the end of the trace. A possible solution to this issue is to incorporate real-time constraints in the declaration of the properties. In terms of syntax, extending the type of constraints allowed inside trace quantifiers might be sufficient. A new semantics and algorithm would then have to be defined to distinguish between the end of the trace and the failure of a real-time constraint, in order to return an inconclusive or a false result, respectively.

Online testing, i.e., evaluation of properties while the IUT is running, might also be a desirable improvement, for instance for the testing of security properties. Two issues must be solved first, though: 1) storage of traces (and for how long) to allow the definition of temporal quantifiers on the past of a given message; 2) parallel processing of properties to avoid running indefinitely without returning results, as might happen for properties with nested quantifiers.

As seen in Section V, and in accordance with the complexity

analyzed in Section III-C.1, whenever nested quantifiers are used, the evaluation time grows by a factor proportional to the length of the trace for each nesting level. Given two equivalent properties,

and , the processing time for the former is much larger than for the latter, because the first one tests every combination of values for and , whereas the latter only starts the search for a suitable value for after an is found that makes true. Modification of the syntax and semantics to allow for the second type of formula should be studied further.

Race conditions, i.e., two or more messages of the protocol

occurring at exactly the same time (appearing with the same timestamp on the trace), merit further study. Depending on the type of property and the testing purpose, the same pair of messages may cause a false/inconclusive or a true verdict to be returned, depending on the order in which they are considered. This issue presents an interesting research challenge for the topic of passive testing.
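The order dependence caused by equal timestamps can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the message fields `ts`, `kind`, and `id`, the hypothetical REQUEST/RESPONSE property, and the function names are not taken from the paper. It evaluates a simple "every request is eventually followed by its response" property on every ordering of the messages that is consistent with their timestamps, and collects the distinct verdicts.

```python
from itertools import permutations


def followed_by_response(trace):
    """Verdict for: every REQUEST is followed later by a RESPONSE with the same id."""
    for i, msg in enumerate(trace):
        if msg["kind"] != "REQUEST":
            continue
        answered = any(
            m["kind"] == "RESPONSE" and m["id"] == msg["id"]
            for m in trace[i + 1:]
        )
        if not answered:
            # End of trace reached without a response: nothing can be concluded.
            return "inconclusive"
    return "true"


def verdicts_over_tie_orders(messages):
    """Evaluate the property on each ordering compatible with the timestamps."""
    results = set()
    for order in permutations(messages):
        # Keep only orderings in which timestamps never decrease.
        if all(a["ts"] <= b["ts"] for a, b in zip(order, order[1:])):
            results.add(followed_by_response(list(order)))
    return results


messages = [
    {"ts": 10, "kind": "REQUEST", "id": 1},
    {"ts": 10, "kind": "RESPONSE", "id": 1},
]
print(verdicts_over_tie_orders(messages))
```

With both messages carrying the same timestamp, the two admissible orderings give different verdicts ("true" when the request is considered first, "inconclusive" otherwise). Enumerating permutations is exponential and only workable for a handful of tied messages; it is used here purely to expose the ambiguity.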

VII. CONCLUSION

This paper introduces a novel approach to passive testing of network protocol implementations, with a particular focus on IMS services. Motivated by the fact that modern (particularly application-layer) protocols are highly dependent on data to set up and coordinate communication, our approach reduces the focus on testing the control part of the communication and provides a means for directly testing data relations between messages in a network trace.

The approach allows the definition of high-level relations between messages or message data, and then uses such relations to define properties that are evaluated on the trace. An algorithm for evaluation is defined in our work, where evaluation of the property returns a “⊤,” “⊥,” or “?” result whenever the property is confirmed, falsified, or no useful information can be derived, respectively, on the given trace, and a method for relating satisfaction of a formula with a conformance verdict has been provided. The approach has been implemented in a framework, and results from testing properties on IMS PoC service traces are provided.

The results are positive: the implemented approach allows complex data relations to be defined and tested efficiently. Nevertheless, improvements remain as future work, for example real-time testing, better support for online monitoring, and a richer syntax.

REFERENCES

[1] ISO, “Information technology - Open systems interconnection - Conformance testing methodology and framework - Part 1: General concepts,” Tech. Rep. ISO/IEC 9646-1, 1994.

[2] A. Serge, H. Richard, and V. Victor, Datalog and Recursion. Reading, MA: Addison-Wesley, 1995, ch. 12, pp. 271–310.

[3] R. Alur and T. A. Henzinger, “Real-time logics: Complexity and expressiveness,” Inf. Comput., vol. 104, no. 1, pp. 35–77, 1993.

[4] K. R. Apt and M. H. Van Emden, “Contributions to the theory of logic programming,” J. ACM, vol. 29, no. 3, pp. 841–862, 1982.

[5] J. Arnedo, A. Cavalli, and M. Núñez, “Fast testing of critical properties through passive testing,” in Proc. Testing Commun. Syst., 2003, pp. 608–608.

[6] B. Howard, A. Goldberg, K. Havelund, and K. Koushik, “Rule-based runtime verification,” in Proc. Verif., Model Checking, Abstract Interpret., 2004, pp. 277–306.

[7] A. Bauer and M. Leucker, “Runtime verification for LTL and TLTL,” Trans. Softw. Eng. Methodol., vol. X, pp. 1–68, 2007.

[8] A. Bauer, M. Leucker, and C. Schallhart, “The good, the bad, and the ugly, but how ugly is ugly?,” in Proc. 7th Int. Conf. Runtime Verif., 2007, pp. 126–138.


[9] E. Bayse, A. Cavalli, M. Núñez, and F. Zaïdi, “A passive testing approach based on invariants: Application to the WAP,” Comput. Netw., vol. 48, no. 2, pp. 247–266, 2005.

[10] K. Brzezinski, Towards the Methodological Harmonization of Passive Testing Across ICT Communities. Rijeka, Croatia: InTech, 2009, pp. 143–168.

[11] T.-D. Cao, T.-T. Phan-Quang, P. Felix, and R. Castanet, “Automated runtime verification for web services,” in Proc. IEEE Int. Conf. Web Services, Jul. 2010, pp. 76–82.

[12] A. Cavalli, S. Prokopenko, and C. Gervy, “New approaches for passive testing using an extended finite state machine specification,” Inf. Softw. Technol., vol. 45, no. 12, pp. 837–852, 2003.

[13] C. Christian, G. J. Pace, and G. Schneider, “Dynamic event-based runtime monitoring of real-time and contextual properties,” in Formal Methods for Industrial Critical Systems. Berlin, Germany: Springer-Verlag, 2009, pp. 135–149.

[14] F. Cuppens, N. Cuppens-Boulahia, and T. Sans, Nomad: A Security Model With Non Atomic Actions and Deadlines. Piscataway, NJ: IEEE Press, 2005.

[15] ETSI, Sophia-Antipolis, France, “Methods for testing and specification (MTS); the testing and test control notation version 3; Part 1: TTCN-3 core language v3.2.1,” Tech. Rep. ETSI/ES 201 873–1, 2007.

[16] European Telecommunications Standards Institute, Sophia-Antipolis, France, “Advanced Testing methods - Vocabulary of terms used in communication protocols conformance testing,” Tech. Rep., 1993.

[17] S. Halle and R. Villemarie, “Runtime monitoring of message-based workflows with data,” in Proc. 12th Int. IEEE Enterprise Distrib. Object Comput. Conf., Sep. 2008, pp. 63–72.

[18] R. M. Hierons, P. Krause, G. Lüttgen, A. J. H. Simons, S. Vilkomir, M. R. Woodward, H. Z. K. Bogdanov, J. P. Bowen, R. Cleaveland, J. Derrick, J. Dick, M. Gheorghe, M. Harman, and K. Kapoor, “Using formal specifications to support testing,” Comput. Surveys, vol. 41, no. 2, pp. 1–76, 2009.

[19] B. T. Ladani, B. Alcalde, and A. Cavalli, “Passive testing - a constrained invariant checking approach,” in Proc. 17th IFIP Int. Conf. Testing Commun. Syst., 2005, pp. 9–22.

[20] F. Lalanne and S. Maag, “From the IMS PoC service monitoring to its formal conformance testing,” in Proc. 6th Int. Conf. Mobile Technol., Appl. Syst., Nice, France, 2009, pp. 1–8.

[21] F. Lalanne, S. Maag, E. M. De Oca, A. Cavalli, W. Mallouli, and A. Gonguet, “An automated passive testing approach for the IMS PoC service,” in Proc. IEEE/ACM Int. Conf. Autom. Softw. Eng., 2009, pp. 535–539.

[22] D. Lee and R. E. Miller, “A formal approach for passive testing of protocol data portions,” in Proc. 10th IEEE Int. Conf. Netw. Protocols, 2002, pp. 122–131.

[23] D. Lee and R. E. Miller, “Network protocol system monitoring - a formal approach with passive testing,” IEEE/ACM Trans. Netw., vol. 14, no. 2, pp. 424–437, Apr. 2006.

[24] D. Lee, A. N. Netravali, K. K. Sabnani, B. Sugla, and A. John, “Passive testing and applications to network management,” in Proc. IEEE Int. Conf. Netw. Protocols, 1997, pp. 113–122.

[25] M. Leucker and C. Schallhart, “A brief account of runtime verification,” J. Logic Algebraic Program., vol. 78, no. 5, pp. 293–303, May 2009.

[26] Z. Li, Y. Jin, and J. Han, “A runtime monitoring and validation framework for web service interactions,” in Proc. IEEE Australian Softw. Eng. Conf., 2006, pp. 10–PP.

[27] Z. Li, J. Han, and Y. Jin, “Pattern-based specification and validation of web services interaction properties,” in Proc. ICSOC, 2005, pp. 73–86.

[28] R. E. Miller, “Passive testing of networks using a CFSM specification,” in Proc. IEEE IPCCC, 1998, pp. 111–116.

[29] G. Morales, S. Maag, A. Cavalli, W. Mallouli, and E. De Oca, “Timed extended invariants for the passive testing of web services,” in Proc. 8th IEEE ICWS, 2010, pp. 592–599.

[30] U. Nilsson and J. Maluszynski, Logic, Programming and Prolog, 2nd ed. New York: Wiley, 1990, vol. 5.

[31] Open Mobile Alliance, San Diego, CA, “Internet messaging and presence service features and functions,” Approved ver. 1.2, 2005.

[32] Open Mobile Alliance, San Diego, CA, “Push to talk over cellular requirements,” Approved ver. 1.0, 2006.

[33] A. Pnueli and A. Zaks, “PSL model checking and run-time verification via testers,” in Proc. 14th Int. Symp. Formal Methods, Hamilton, Canada, 2006, pp. 573–586.

[34] M. Roger, M. Roger, and J. Goubault-Larrecq, “Log auditing through model-checking,” in Proc. 14th IEEE CSFW, 2001, pp. 220–236.

[35] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, “SIP: Session Initiation Protocol,” IETF, Tech. Rep. RFC 3261, 2002.

[36] G. Rosu, F. Chen, and T. Ball, “Synthesizing monitors for safety properties: This time with calls and returns,” in Proc. 8th Int. Workshop Runtime Verif., Budapest, Hungary, 2008, Lecture Notes in Computer Science, pp. 51–68.

[37] V. Stolz, “Temporal assertions with parametrized propositions,” J. Logic Comput., vol. 20, no. 3, pp. 743–757, Nov. 2008.

[38] A. Cavalli and M. Tabourier, “Passive testing and application to the GSM-MAP protocol,” Inf. Softw. Technol., vol. 41, no. 11–12, pp. 813–821, Sep. 1999.

[39] H. Ural and Z. Xu, “An EFSM-based passive fault detection approach,” in Lecture Notes Comput. Sci., 2007, pp. 335–350.

[40] M. H. Van Emden and R. A. Kowalski, “The semantics of predicate logic as a programming language,” J. ACM, vol. 23, no. 4, pp. 733–742, 1976.

[41] J. Woodcock, P. G. Larsen, J. Bicarregui, and J. Fitzgerald, “Formal methods: Practice and experience,” Comput. Surveys, vol. 41, no. 19, pp. 1–19, Oct. 2009.

Felipe Lalanne received the Ph.D. degree in computer science from TELECOM SudParis, Evry, France, in 2012.
His main research interests and activities are the use of formal methods and their application to testing, as well as formal models and their utilization in service specifications.

Stephane Maag (M’11) received the Ph.D. degree in computer science from Evry University, Evry, France, in 2002.
He has been an Associate Professor with TELECOM SudParis, Evry, France, since 2002, dealing with formal approaches for the validation of protocols and services. He is involved in the CNRS research group Samovar. He has chaired several conferences/workshops and published more than 60 papers in journals and conferences.