10.1.1.96.7072

  • 7/28/2019 10.1.1.96.7072

    1/8

    Integrating Fault Tree Analysis with Event Ordering Information

    Marco Bozzano & Adolfo Villafiorita
    ITC-IRST, Via Sommarive 18, Povo, 38050 Trento, Italy
    {bozzano,adolfo}@irst.itc.it

    ABSTRACT: Fault tree analysis is a traditional and well-established technique for analyzing system design and robustness. Its purpose is to identify sets of basic events, called cut sets, which can cause a given top level event, e.g., a system malfunction, to occur. In this paper we present an algorithm that extracts ordering information, i.e., finds out possible ordering constraints which are required to hold between basic events in a cut set. The algorithm is completely automatic, and has been incorporated into a more general framework, based on model checking techniques, for automatic fault tree generation and analysis.

    1 INTRODUCTION

    The development of safety critical systems requires checking that the system behaves as expected not only in nominal situations, but also under certain degraded situations. Thus, on the one hand, system models are developed by the design engineers in order to specify and to analyze the expected behaviour of the system under consideration. On the other hand, the envisaged system is analyzed by safety specialists with respect to malfunctions, i.e., unintended behaviour. The safety analysis, performed at each stage of the system development, is intended to identify all possible hazards with their relevant causes. Traditional safety analysis methods include, e.g., Functional Hazard Analysis, Failure Mode and Effect Analysis (FMEA), and Fault Tree Analysis (FTA) (Vesely et al. 1981).

    Fault tree analysis (Vesely et al. 1981), in particular, is a deductive and top-down method to analyze system design and robustness. Roughly speaking, the FTA process consists in picking a top level event (e.g., a system malfunction condition) and identifying all possible sets of basic events, called cut sets, which can cause the top event to occur. Among them, one would like to isolate minimal cut sets, that is, cut sets which do not include events that ultimately do not affect the occurrence of the top event. The information on cut sets is then collected in a fault tree, which consists of system and component events, connected by gates which define the logical relations between events. The cut set representation provided by traditional fault tree analysis is not structured. A cut set is simply seen as a flat collection of basic events, and no information is provided about their mutual relationship. Although events are often allowed to happen in any order, in general there may be timing constraints which enforce a particular event to happen before or after another one. This can happen as a result of a causality relation, a functional dependency, or more subtle reasons related to dynamic scenarios where system behaviour can be affected by, e.g., automatic control systems or operator actions (Siu 1994).

    This work has been and is being developed within ESACS, a European sponsored project, contract no. G4RD-CT-2000-00361. See also URL http://www.esacs.org/

    In this paper, we are interested in automatically computing ordering information of basic events. Specifically, given a top level event and a minimal cut set computed via fault tree analysis, we want to find out whether there are ordering constraints which hold between pairs of basic events in the cut set. We call this event ordering analysis. We present an algorithm which complements traditional fault tree analysis by providing event ordering information for basic events in a cut set. The algorithm is completely automatic, and has been incorporated into a more general framework for automatic fault tree generation and analysis. The core of our ordering analysis algorithm is based on known procedures for minimization (i.e., computation of minimal cut sets) of boolean functions (Coudert and Madre 1992; Coudert and Madre 1993; Manquinho et al. 1998) represented as Binary Decision Diagrams (BDDs) (Bryant 1992). The encoding of the problem and some adjustments necessary to deal with inconsistency are original. The encoding is based on ordering information variables, that is, variables which relate pairs of different basic events, tracking the information about the mutual order in which the two events may or may not occur.

    Our framework is based on model checking (Clarke et al. 2000), a well-established method for formally verifying temporal properties of finite-state concurrent systems. Model checking has been applied for the formal verification of a number of significant safety-critical industrial systems (Holzmann 1997; Larsen et al. 1997; Cimatti et al. 2002). We have incorporated fault tree and ordering analysis functionalities into the model checking tool NuSMV (Cimatti et al. 2002), a BDD-based symbolic model checker developed at ITC-IRST, which originated from a re-engineering and re-implementation of SMV (McMillan 1993). NuSMV is a well-structured, open, flexible and well-documented platform for model checking, and it has been designed to be robust and close to industrial system standards (Cimatti et al. 2000).

    This line of research has been carried out inside the ESACS project, a European-Union-sponsored project whose main goals are to define a methodology to improve the safety analysis practice for complex systems development, to set up a shared environment based on tools supporting the methodology, and to validate the methodology through its application to case studies. The fault tree and ordering analysis functionalities which we discuss in this paper have been included in a more general safety analysis platform which we are developing inside the ESACS project (Bozzano et al. 2003).

    Structure of the paper. The rest of the paper is structured as follows. In Section 2 we give a brief overview of the basics of fault tree analysis and we introduce event ordering analysis, explaining its significance and its relationship with model checking. In Section 3 we introduce a simple example which we will use in Section 4, where we present our minimization algorithm for ordering analysis and we briefly discuss its integration with fault tree analysis based on model checking. Finally, in Section 5 we discuss related work and draw some conclusions.

    2 EVENT ORDERING ANALYSIS

    Fault Tree Analysis (FTA) (Vesely et al. 1981; Liggesmeyer and Rothfelder 1998; Rae 2000) is a deductive, top-down method to analyze system design and robustness. It usually involves specifying a top level event (TLE hereafter) to be analyzed (e.g., a failure state), and identifying all possible sets of basic events (e.g., basic faults) which may cause that TLE to occur. Benefits of FTA include, e.g., identifying possible system reliability or safety problems at design time, assessing system reliability or safety during operation, and identifying root causes of equipment failures. Fault trees provide a convenient symbolic representation of the combination of events resulting in the occurrence of the top event. Fault trees are usually represented in a graphical way, structured as a parallel or sequential combination of AND/OR gates.

    In this paper we are interested in deductive methods which can be used to automatically generate fault trees starting from a given system model and top level event. In particular, we focus on analysis techniques based on model checking. Model checking (Clarke et al. 2000) is a well-established method for formally verifying temporal properties of finite-state concurrent systems. System specifications are written as temporal logic formulas, and efficient symbolic algorithms (based on data structures like BDDs (Bryant 1992)) are used to traverse the model defined by the system and check if the specification holds or not. The application of model checking to fault tree generation works in the following way. Given a system model and a top level event (TLE) to analyze, model checking techniques can be used to extract automatically all collections of basic events (called minimal cut sets) which can trigger the TLE. The generated cut sets are minimal in the sense that only events that are strictly necessary for the TLE to occur are retained.
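As an illustration of what minimality means here, the following Python sketch enumerates minimal cut sets by brute force. It is not the symbolic, BDD-based procedure used in NuSMV; the event names, the `top` function and the helper `minimal_cut_sets` are hypothetical, and the top event is assumed to be monotone (additional failures never undo it).

```python
from itertools import combinations

def minimal_cut_sets(events, top_event):
    """Return all minimal sets of basic events that trigger top_event.

    top_event maps a frozenset of occurred events to True/False and is
    assumed to be monotone (an assumption of this sketch).
    """
    found = []
    # Visit candidates by increasing size, so smaller cut sets are found first.
    for size in range(len(events) + 1):
        for candidate in map(frozenset, combinations(events, size)):
            # A superset of an already found cut set cannot be minimal.
            if any(cut <= candidate for cut in found):
                continue
            if top_event(candidate):
                found.append(candidate)
    return found

# Hypothetical system: the top event occurs if both pumps fail,
# or if the valve fails on its own.
def top(failed):
    return {"pump_a", "pump_b"} <= failed or "valve" in failed

cuts = minimal_cut_sets(["pump_a", "pump_b", "valve"], top)
print(cuts)
```

The set {pump_a, pump_b, valve} also triggers the top event, but it is not reported because it contains a smaller cut set.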

    In this paper, we discuss and propose an algorithm for extending FTA with event ordering information. In traditional FTA, cut sets are simply flat collections (i.e., conjunctions) of events which can trigger a given TLE. However, there might be timing constraints enforcing a particular event to happen before or after another one, in order for the TLE to be triggered (i.e., the TLE would not show if the order of the two events were swapped). Ordering constraints can be due, e.g., to a causality relation or a functional dependency between events, or caused by more complex interactions involving the dynamics of the system under consideration. Whatever the reason, event ordering analysis can provide useful information which can be used by the design and safety engineers to fully understand the ultimate causes of a given system malfunction, so that adequate countermeasures can be taken.

    3 AN EXAMPLE

    We present below an example which we will use in Section 4 to explain our methodology. The example is deliberately simple for illustration purposes and should not be regarded as modeling a realistic system. We refer to (Bozzano et al. 2003) for more meaningful examples to which the methodology and the algorithm have been applied. Let us consider the circuit drawn in Figure 1. The circuit is composed of two JK flip-flops and an OR gate, and it has three input bits and one output. In short, a JK flip-flop is a (clock driven) logical component with two input bits (J and K) and two output bits (Q and !Q, the latter simply being the negation of the former). The truth table of the JK flip-flop is such that whenever J and K are both low the output signal Q (which can be either low or high) remains unchanged, whenever only J or only K is set to high the output Q is set to, respectively, one or zero, and, finally, a high signal on both J and K is used to toggle the current value of Q.

    [Figure 1: A simple circuit with two JK flip-flops (inputs: bit 1, bit 2, bit 3; output: out)]

    In the circuit drawn in Figure 1, the three input bits are set to zero, but they can non-deterministically fail, at any time and in any order, in which case their value is inverted (i.e., it is set to one) forever (note that we assume persistent failures). Initially, we assume all signals to be low, i.e., the input bits, the Q outputs of the flip-flops and, consequently, the output of the circuit, are all set to zero. A NuSMV model of the circuit is shown in Figure 2. It is composed of three modules, one for modeling an input bit, one for modeling a JK flip-flop (note that this module simply implements the truth table of a JK flip-flop) and the main module, which puts all the components together and defines the output signal of the circuit. For simplicity, we have not modeled flip-flop clocks explicitly. We assume that the input values J and K are transferred to the flip-flop outputs at each NuSMV transition (i.e., a NuSMV transition can be thought of as causing a triggering edge of the clock pulse).
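The flip-flop behaviour described above can be sketched outside NuSMV. The following Python fragment is only an illustration and is not part of the NuSMV model; the function name `jk_next` is an assumption. It mirrors the case analysis of a JK flip-flop, with one call corresponding to one transition step.

```python
def jk_next(j: bool, k: bool, q: bool) -> bool:
    """Next value of Q for a JK flip-flop, given inputs J, K and current Q."""
    if j and k:
        return not q      # both inputs high: toggle
    if j:
        return True       # only J high: set Q to one
    if k:
        return False      # only K high: reset Q to zero
    return q              # both inputs low: hold the current value

# Print the full truth table as: J K Q -> next Q
for j in (False, True):
    for k in (False, True):
        for q in (False, True):
            print(int(j), int(k), int(q), "->", int(jk_next(j, k, q)))
```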

    Top level events to be used for fault tree analysis can be expressed in the temporal logic CTL (Emerson 1990). Arbitrary CTL formulas can be used to perform FTA in NuSMV. Some examples are:

        AG(out)                                     (T1)
        AG((out -> AX(!out)) & (!out -> AX out))    (T2)
        EG((out -> AX(!out)) & (!out -> AX out))    (T3)

    The top level event T1 is a CTL formula specifying all the states of the system in which the output of the circuit is forced to be set to value one forever, i.e., for every possible path (evolution of the system) the output is globally set to value one on that path. Similarly, the CTL formula T2 is a specification of all the states such that the output of the circuit is forced to oscillate forever back and forth between the values zero and one. Finally, the CTL formula T3 is a specification of all the states such that there exists one path on which the output is globally forced to oscillate forever.

    As mentioned in Section 2, we have implemented a procedure for performing fault tree analysis in NuSMV. As an aside, we mention that the safety analysis platform we are developing inside the ESACS project provides additional features for managing failure modes. Specifically, fault tree computation starts with the user assigning a set of failures to the various components, which are then automatically inserted into the original model of the system. The result is an extended system model with failure variables (e.g., the variable FailureMode of the bit module in Figure 2). Model checking techniques can then be applied to the extended NuSMV model (e.g., the model in Figure 2) to extract automatically all collections of basic events, i.e., all minimal cut sets, which can cause any of the above TLEs. Cut sets are expressed in terms of sets of failure events, i.e., pairs consisting of a failure variable and a failure mode. The results of fault tree analysis for the model in Figure 2 and the CTL formulas T1, T2, and T3 are shown below (hereafter, we abbreviate (biti.FailureMode, inverted) as biti_inv). In this particular case, exactly one minimal cut set Mi is computed for each formula Ti (note that in general more than one cut set can be computed for a TLE).

        {bit1_inv, bit3_inv}              (M1)
        {bit1_inv, bit2_inv, bit3_inv}    (M2)
        {bit2_inv}                        (M3)

    For M1, we have that, in order for the output of the circuit to be forced to return value one forever (property T1), it is necessary that both the first and the third bit fail. Notice that the output of the circuit can also get stuck at value one as a result of a failure of the first bit only. In this case, however, the output of the circuit is not forced to that value, i.e., as the reader can verify, there exist possible evolutions of the circuit such that the output can assume value zero.

    [Figure 3: A Fault Tree for T1 (top level event, with basic events bit1_inv "bit1 inverted" and bit3_inv "bit3 inverted")]

    In Figure 3 we show a simple graphical representation for the fault tree corresponding to T1. The cut set is minimal in the sense that only events that are strictly necessary for the TLE to occur are retained. Similarly, minimal cut set M2 states that all bits must be failed in order for the output of the circuit to be forced to oscillate forever. Notice that failure of all input bits is not a sufficient condition for oscillation of the circuit.

        MODULE bit(input)
        VAR
          out : boolean;
          FailureMode : {no_failure,inverted};
        ASSIGN
          init(FailureMode) := no_failure;
          next(FailureMode) := case
            FailureMode = no_failure : {no_failure,inverted};
            FailureMode = inverted : inverted;
          esac;
          out := case
            FailureMode = no_failure : input;
            FailureMode = inverted : !input;
          esac;

        MODULE ff(J,K)
        VAR
          Q : boolean;
        ASSIGN
          init(Q) := 0;
          next(Q) := case
            !J & !K : Q;
            !J & K : 0;
            J & !K : 1;
            J & K : !Q;
          esac;

        MODULE main
        VAR
          bit1 : bit(0);
          bit2 : bit(0);
          bit3 : bit(0);
          ff1 : ff(bit1.out,bit2.out);
          ff2 : ff(bit2.out,bit3.out);
          out : boolean;
        ASSIGN
          out := ff1.Q | ff2.Q;

    Figure 2: A NuSMV model for the circuit in Figure 1

    In fact, there are some timing constraints which must be satisfied in order for the circuit to show this oscillating behaviour. Extracting information about these timing constraints is exactly the purpose of the ordering analysis which is described in the next section.

    4 THE MINIMIZATION ALGORITHM

    In this section we explain in detail our algorithm for event ordering analysis. Specifically, we describe a procedure which takes as input a system model (e.g., the NuSMV model in Figure 2) and a cut set, and is able to extract event ordering information. The core of the algorithm is based on procedures for computing prime implicants of boolean functions (Coudert and Madre 1993), and exploits the BDD-based representation for boolean functions, which is used extensively in NuSMV. The algorithm is made up of a number of different phases, which are detailed below.

    0) Pick a minimal cut set. A prerequisite of our procedure is having a system model SM and a top level event TLE at hand. Then, as explained in Section 3, we run NuSMV on SM and TLE, and we get a collection of minimal cut sets. Assuming the collection is not empty, we pick one MCS. The purpose of the ordering analysis algorithm is to extract ordering constraint information from MCS.

    1) Generate the ordering information model. Assuming MCS is composed of a set of n failure events, say (fm_1, var_name_1), ..., (fm_n, var_name_n), for each pair of distinct failure events with indexes i and j in MCS (i ≠ j), we introduce a new ordering variable order_var_name_ij in the NuSMV model SM, to keep track of the mutual order in which the two failure events may happen. In order to give a complete encoding for ordering information, we thus need one variable per unordered pair, for a total of $\frac{1}{2}n(n-1)$ ordering variables. We call the resulting model ordering information model (OIM hereafter). The NuSMV skeleton for defining an ordering variable is shown in Figure 4. The skeleton is instantiated each time with different actual parameters for failure variables and failure modes. Every ordering variable can assume one among the three values {before, after, simult}, the intuition being that the first event happens before, after, or at the same time as the second one (note that the notion of simultaneousness is relative to the granularity of the NuSMV step, e.g., one clock pulse for the circuit in Figure 1). An auxiliary variable definition (which may assume the additional value unknown) is necessary in order to code the fact that the value of a given variable is still unknown during the computation (the unknown value will be eventually overwritten, because all failure events in MCS are forced to occur, see below).
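To make the encoding concrete, the sketch below derives the value of every ordering variable from the step at which each failure event occurs. It is a plain Python illustration rather than the NuSMV encoding; the function name `ordering_variables` and the example failure steps are assumptions.

```python
from itertools import combinations

def ordering_variables(fail_step):
    """Map each unordered pair of failure events to before/after/simult.

    fail_step maps a failure event name to the (hypothetical) step at
    which that event occurs.
    """
    order = {}
    for a, b in combinations(sorted(fail_step), 2):
        if fail_step[a] < fail_step[b]:
            order[(a, b)] = "before"
        elif fail_step[a] > fail_step[b]:
            order[(a, b)] = "after"
        else:
            order[(a, b)] = "simult"
    return order

# Hypothetical run: bit1 fails first, then bit3, then bit2.
steps = {"bit1_inv": 1, "bit2_inv": 3, "bit3_inv": 2}
order = ordering_variables(steps)
for pair, value in order.items():
    print(pair, value)
```

For n = 3 failure events the dictionary has 3 entries, matching the count of n(n-1)/2 ordering variables.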

    2) Re-run NuSMV on the ordering information model. In this phase NuSMV is re-run on the OIM, with the same TLE, in order to track the information captured by the ordering variables. The analysis is specialized to the given MCS, i.e., the formula provided to NuSMV for the analysis forces the failure events contained in MCS (and only them) to occur. The result is a BDD representing all the different configurations (including system variables and ordering variables) which can cause TLE in presence of the failure events in MCS. For instance, consider the NuSMV model, the top level event T2, and the relevant minimal cut set M2 described in Section 3. The minimal cut set and the top level event are combined together, yielding the CTL formula

        AG((out -> AX(!out)) & (!out -> AX out)) & bit1_inv & bit2_inv & bit3_inv

    NuSMV is fed with this formula in order to generate a BDD representing all states causing T2 because of the failure events in M2. This BDD also includes the information about ordering variables.

        VAR
          ORDER_VAR_NAME_AUX : {UNKNOWN, BEFORE, AFTER, SIMULT};
          ORDER_VAR_NAME : {BEFORE, AFTER, SIMULT};
        ASSIGN
          init(ORDER_VAR_NAME_AUX) := UNKNOWN;
          next(ORDER_VAR_NAME_AUX) := case
            ORDER_VAR_NAME_AUX = UNKNOWN : case
              (VAR_NAME_1 = FM_1 & VAR_NAME_2 = NO_FAILURE) : BEFORE;
              (VAR_NAME_1 = NO_FAILURE & VAR_NAME_2 = FM_2) : AFTER;
              (VAR_NAME_1 = FM_1 & VAR_NAME_2 = FM_2) : SIMULT;
              1 : UNKNOWN;
            esac;
            1 : ORDER_VAR_NAME_AUX;
          esac;
          ORDER_VAR_NAME := case
            ORDER_VAR_NAME_AUX = UNKNOWN : SIMULT; -- does not matter
            1 : ORDER_VAR_NAME_AUX;
          esac;

    Figure 4: NuSMV code skeleton for ordering variable definition

    3) Abstract away non-ordering variables. In this phase we simply abstract away variables other than ordering ones. The result is still a BDD representing all the possible failure event orderings.
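Abstracting variables away amounts to existentially quantifying them, i.e., projecting each configuration onto the variables that are kept. The following Python sketch shows this on an explicit set of configurations instead of a BDD; the variable names and the three sample configurations are made up for illustration.

```python
def project(configurations, keep):
    """Project each configuration onto the variables in keep.

    Duplicates collapse, which is the explicit-set counterpart of
    existentially quantifying the dropped variables.
    """
    return {tuple((name, cfg[name]) for name in keep) for cfg in configurations}

# Hypothetical configurations mixing an ordering variable with system variables.
configs = [
    {"order_12": "before", "ff1.Q": True, "out": True},
    {"order_12": "before", "ff1.Q": False, "out": True},
    {"order_12": "simult", "ff1.Q": True, "out": True},
]

orderings = project(configs, ["order_12"])
print(sorted(orderings))
```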

    4) Extract ordering constraints. This phase contains the core of the minimization algorithm, and is composed of three interrelated sub-phases.

    4.1. Add inconsistent configurations. The ordering variable encoding described in point 1 above is redundant in the following sense. Consider, e.g., three variables v_ij, v_jk, and v_ik, representing the order in which failure events i and j (j and k, i and k) occur. Clearly, if, say, v_ij and v_jk are both set to the value before, for transitivity also v_ik will be (necessarily) set to before. In other words, the encoding allows for inconsistent configurations which will never be the result of the model checking analysis (e.g., (before, before, after) in the previous example). During this phase we extend the BDD resulting from phase 3 with a BDD representing such inconsistent configurations (i.e., we consider their disjunction). This amounts to admitting inconsistent configurations as if they were perfectly legal. In this way (and with a little adjustment explained in phase 4.3 below) we can ensure that the minimization algorithm (see phase 4.2) works properly. Intuitively, the minimization algorithm works by picking variables whose value is irrelevant (in the sense that they can assume any among the possible legal values). Therefore, phase 4.1 is necessary to give the minimization algorithm enough information to correctly recognize which variables are irrelevant and which are not.
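The redundancy of the encoding can be checked by enumeration. The Python sketch below lists, for three failure events, which triples (v_12, v_23, v_13) can be produced by actual failure times and which cannot. It works on explicit sets rather than BDDs, and the helper names are assumptions.

```python
from itertools import product

VALUES = ("before", "after", "simult")

def relation(ta, tb):
    """Ordering value for two events occurring at steps ta and tb."""
    return "before" if ta < tb else "after" if ta > tb else "simult"

def consistent_configurations():
    """All (v12, v23, v13) triples realizable by some choice of failure steps.

    Three distinct step values are enough to realize every possible
    relative ordering of three events.
    """
    return {
        (relation(t1, t2), relation(t2, t3), relation(t1, t3))
        for t1, t2, t3 in product(range(3), repeat=3)
    }

consistent = consistent_configurations()
inconsistent = set(product(VALUES, repeat=3)) - consistent

print(len(consistent), "consistent,", len(inconsistent), "inconsistent")
print(("before", "before", "after") in inconsistent)
```

Of the 27 possible triples, only those corresponding to a genuine ordering (with ties allowed) of the three events are consistent; the remaining ones are the configurations that phase 4.1 adds to the BDD.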

    4.2. Compute prime implicants. We simply run the standard algorithm for computing prime implicants of a boolean function (Coudert and Madre 1993) on the BDD resulting after phase 4.1. For instance, for the top level event T2 and cut set M2 this phase computes 16 prime implicants. An example of prime implicant is the following (it represents the timing constraint enforcing the first bit to fail before the second one and the second one before the third one):

        p0) -------------------------
        bit1_inv **before** bit2_inv
        bit2_inv **before** bit3_inv
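The notion of prime implicant can be illustrated with a brute-force sketch over multi-valued variables. This is not the implicit BDD-based procedure of Coudert and Madre used in NuSMV; it is a naive enumeration meant only to show what the result looks like, and the function `prime_implicants`, the variable names and the example function `f` are assumptions.

```python
from itertools import combinations, product

def prime_implicants(domains, f):
    """Brute-force prime implicants of f over finite-domain variables.

    An implicant is a partial assignment under which f holds for every
    completion; it is prime if no proper sub-assignment is an implicant.
    """
    names = list(domains)

    def is_implicant(partial):
        free = [n for n in names if n not in partial]
        return all(
            f({**partial, **dict(zip(free, vals))})
            for vals in product(*(domains[n] for n in free))
        )

    primes = []
    # Smaller assignments first, so any implicant that survives is prime.
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            for vals in product(*(domains[n] for n in subset)):
                partial = dict(zip(subset, vals))
                # Skip assignments that extend an already found prime.
                if any(p.items() <= partial.items() for p in primes):
                    continue
                if is_implicant(partial):
                    primes.append(partial)
    return primes

ORD = ("before", "after", "simult")
domains = {"v12": ORD, "v13": ORD}

# Hypothetical function: true when either ordering variable is "before".
def f(cfg):
    return cfg["v12"] == "before" or cfg["v13"] == "before"

primes = prime_implicants(domains, f)
print(primes)
```

Each prime implicant mentions only the variables whose value matters; the other variable is left out because any of its values is acceptable.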

    4.3. Run simplification subroutine. The purpose of this phase is to cut inconsistent results and subsumed results (i.e., results that logically imply another, more general result) from the output of phase 4.2. Inconsistent results can arise as a side effect of phase 4.1, and must be discarded after the minimization phase, whereas the purpose of the simplification subroutine is to retain only minimal results. For the top level event T2 and cut set M2, the previously generated 16 implicants are reduced to 3 after the simplification phase. For instance, the prime implicant p0 is discarded because it is subsumed by prime implicant p3 (see below): by transitivity, p0 forces both the first and the second bit to fail before the third one.
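Both checks of the simplification phase can be sketched by enumerating the orderings that satisfy a list of constraints. The Python fragment below is an explicit-enumeration illustration, not the subroutine implemented in NuSMV; the helper names are assumptions, and the constraint lists transcribe p0 (phase 4.2) and p2, p3 (phase 5), plus a made-up cyclic list `cyclic`.

```python
from itertools import product

EVENTS = ("bit1_inv", "bit2_inv", "bit3_inv")

def relation(ta, tb):
    return "before" if ta < tb else "after" if ta > tb else "simult"

def models(constraints):
    """Orderings of EVENTS (as tuples of pairwise relations) satisfying constraints."""
    result = set()
    for times in product(range(len(EVENTS)), repeat=len(EVENTS)):
        t = dict(zip(EVENTS, times))
        if all(relation(t[a], t[b]) == rel for a, rel, b in constraints):
            result.add(tuple(relation(t[a], t[b])
                             for a in EVENTS for b in EVENTS if a < b))
    return result

def is_inconsistent(p):
    return not models(p)            # no ordering satisfies all constraints

def subsumed_by(p, q):
    return models(p) <= models(q)   # every ordering allowed by p is allowed by q

p0 = [("bit1_inv", "before", "bit2_inv"), ("bit2_inv", "before", "bit3_inv")]
p2 = [("bit2_inv", "before", "bit1_inv"), ("bit3_inv", "before", "bit1_inv")]
p3 = [("bit1_inv", "before", "bit3_inv"), ("bit2_inv", "before", "bit3_inv")]
# Hypothetical cyclic result of the kind introduced by phase 4.1.
cyclic = p0 + [("bit3_inv", "before", "bit1_inv")]

print(subsumed_by(p0, p3), subsumed_by(p0, p2), is_inconsistent(cyclic))
```

In this sketch p0 is subsumed by p3 but not by p2, and the cyclic list is flagged as inconsistent.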

    5) Show results. The final output consists of the collection of prime implicants resulting after phase 4. For the top level event T2 and cut set M2, NuSMV outputs the following three prime implicants:

        p1) -------------------------
        bit1_inv **simult** bit2_inv

        p2) -------------------------
        bit2_inv **before** bit1_inv
        bit3_inv **before** bit1_inv

        p3) -------------------------
        bit1_inv **before** bit3_inv
        bit2_inv **before** bit3_inv

    [Figure 5: Prime implicants as precedence graphs (panels p1, p2, p3)]

    Each prime implicant is a list of ordering constraints between failure events, and represents a different alternative (i.e., a different ordering possibly causing the TLE). Each list of failure events represents a precedence graph, showing which failure event must happen before which, for a given prime implicant. The output might be drawn in a more suggestive way, as shown in Figure 5. Each node in the graph contains one or more failure events (an index i denotes failure event biti_inv), which are supposed to happen simultaneously, whereas each arrow represents a before relation between two (sets of) failure events.

    The precedence graphs shown in Figure 5 represent the ordering information which has been obtained by running our algorithm on the model, top level event T2 and minimal cut set M2 described in Section 3. As discussed in Section 3, the CTL formula T2 is a specification of all the states such that the output of the circuit is forced to oscillate forever between the values zero and one, whereas minimal cut set M2 shows that all three input bits of the circuit must fail in order for such a behaviour to be observable. The ordering analysis results give us further information about this oscillating phenomenon, i.e., show that some further timing constraints between the failure events must hold as well. In particular, one of the orderings shown in Figure 5 must hold: either the first and the second bit must fail simultaneously (and the third one at any time, including simultaneously with the other two), or the second and third bit must fail before the first (the order between the former two bits is left unspecified), or the first and second must fail before the third (again, the order between the former two bits is left unspecified). A possible ordering which is ruled out by the results of our ordering analysis is the one in which the three bits fail in the following order: the first bit, then the third one, and finally the second one.
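The ruled-out ordering can be checked with a small simulation of the circuit. The Python sketch below is a hand-written re-implementation of the model's transition behaviour, not output of NuSMV; the function names, the horizon and the range of failure steps explored are assumptions. A failure schedule gives, for each bit, the first step at which it reads as inverted.

```python
from itertools import product

def jk_next(j, k, q):
    """JK flip-flop: hold, reset, set or toggle depending on J and K."""
    if j and k:
        return not q
    if j:
        return True
    if k:
        return False
    return q

def simulate(fail_steps, horizon=20):
    """Output of the circuit at each step for a given failure schedule."""
    q1 = q2 = False
    outs = []
    for t in range(horizon):
        b1, b2, b3 = (t >= f for f in fail_steps)
        outs.append(q1 or q2)                      # out := ff1.Q | ff2.Q
        q1, q2 = jk_next(b1, b2, q1), jk_next(b2, b3, q2)
    return outs

def oscillates(fail_steps, horizon=20, tail=8):
    """True if the output alternates at every step in the final tail steps."""
    last = simulate(fail_steps, horizon)[-tail:]
    return all(a != b for a, b in zip(last, last[1:]))

print(oscillates((1, 1, 1)))   # all three bits fail simultaneously
print(oscillates((1, 3, 2)))   # bit 1, then bit 3, then bit 2

# Schedules with bit 1 before bit 3 before bit 2 that still oscillate.
counterexamples = [s for s in product(range(1, 5), repeat=3)
                   if s[0] < s[2] < s[1] and oscillates(s)]
print(counterexamples)
```

Within the explored range, no schedule in which the first bit fails before the third and the third before the second makes the output oscillate. The sketch also suggests that the extracted constraints are necessary rather than sufficient: the schedule (1, 1, 2) oscillates while (1, 1, 3) does not, although both satisfy the simultaneity constraint between the first two bits.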

    4.1 Ordering and fault tree analysis

    We conclude this section with a brief explanation about how the minimization algorithm can be integrated with fault tree analysis. A system model is assumed to be given. The verification process consists of the following phases. First of all, a top level event to analyze is chosen (clearly, the analysis can be repeated for different top level events). Then, we run the minimization algorithm of (Coudert and Madre 1993) to compute the minimal cut sets of basic events causing TLE. Finally, for each cut set we generate an ordering information model and we perform the ordering analysis. The output is a fault tree for TLE, where each cut set in the tree is equipped with ordering information (that is, a precedence graph).

    For the example of Section 3, the analysis can proceed in the following way. First, a top level event is chosen (e.g., T1, T2, or T3) and NuSMV is run to perform fault tree analysis. The result is a collection of minimal cut sets of failure events. For each cut set, ordering analysis is performed on a suitable ordering information model. The results show that: for T1 there are no timing constraints (all orderings between the two failure events are possible); for T2, the output of NuSMV is shown in Figure 5; finally, the cut set M3 computed for T3 only includes one basic event, hence there is nothing to order and ordering analysis is not needed.

    5 CONCLUSIONS AND RELATED WORK

    In this paper we have presented an algorithm which improves the results given by traditional fault tree analysis (Vesely et al. 1981). In particular, our algorithm performs what we call event ordering analysis. This analysis can be conveniently used to extract possible ordering constraints holding between basic events in a given cut set, thus providing a deeper insight into the causes of system malfunction and supporting the reliability and safety analysis process. Although very simple, the example in Section 3 shows the importance and significance of event ordering information. It also suggests that timing constraints can arise very naturally in industrial systems.

    As explained in the paper, our algorithm for ordering analysis is based on classical procedures for minimization of boolean functions, specifically on the implicit-search procedure described in (Coudert and Madre 1992; Coudert and Madre 1993), which is based on Binary Decision Diagrams (BDDs) (Bryant 1992). This choice was quite natural, given that the NuSMV model checker makes pervasive use of BDD data structures. For alternative explicit-search and SAT-based techniques for computation of prime implicants, see (Manquinho et al. 1998). The results computed by the algorithm may differ depending on the exact order in which ordering variables are chosen in the minimization step. This is a consequence of the non-determinism which is inherent in the prime implicant computation (Coudert and Madre 1993). However, we conjecture that the minimization procedure enjoys some optimality properties which we are studying as part of our future work.

    A large amount of work has been done in the area of probabilistic safety assessment (PSA) and in particular on dynamic reliability (Siu 1994). Dynamic reliability is concerned with extending the classical event or fault tree approaches to PSA by taking into consideration the mutual interactions between the hardware components of a plant and the physical evolution of its process variables (Marseguerra et al. 1998). Examples of scenarios which dynamic reliability tries to take into consideration are, e.g., human intervention, expert judgment, the role of control/protection systems, the so-called failures on demand (i.e., failure of a component to intervene), and also the ordering of events during accident propagation (Senni, Semenza, and Galvagni 1991; Cacciabue and Cojazzi 1994; Cacciabue and Cojazzi 1995). Different approaches to dynamic reliability include, e.g., state transitions or Markov models (Aldemir 1987; Papazoglou 1994), the dynamic event tree methodology (Cojazzi et al. 1992), and direct simulation via Monte Carlo analysis (Smidts and Devooght 1992; Marseguerra et al. 1998). The work which is probably closest to ours is (Cojazzi et al. 1992), which describes dynamic event trees as a convenient means to represent the timing and order of intervention of a plant's sub-systems and their eventual failures. With respect to the classification the authors propose, our approach can support simultaneous failures, whereas, at the moment, we are working under the hypothesis of persistent failures (i.e., no repair is possible).

    The most notable difference between our approach and the works on dynamic reliability mentioned above is that we present automatic techniques, based on model checking, for both fault tree generation and ordering analysis, whereas traditional works on dynamic reliability rely on manual analysis (e.g., Markovian analysis (Papazoglou 1994)) or simulation (e.g., Monte Carlo simulation (Marseguerra et al. 1998), the TRETA package of (Cojazzi et al. 1992)). Automation is clearly a point in favour of our framework. Furthermore, we support automatic verification of arbitrary temporal CTL properties (in particular, both safety and liveness properties). Current work is focusing on a number of improvements and extensions in order to make the methodology competitive with existing approaches and usable in realistic scenarios. First of all, there are some improvements at the modeling level. The NuSMV models used so far are discrete, finite-state transition models. In order to allow for more realistic models, we are considering an extension of NuSMV with hybrid dynamics, along the lines of (Henzinger 1996; Henzinger et al. 1997). This would allow both more complex variable dynamics and a more realistic modeling of time (currently, time is modeled by an abstract transition step). Furthermore, we need to extend our framework in order to deal with probabilistic assessment. Although not illustrated in this paper, associating probabilistic estimates to basic events and evaluating the resulting fault trees is straightforward. However, more work needs to be done in order to support more complex probabilistic dynamics (see, e.g., (Devooght and Smidts 1994)). Also, we want to overcome the current limitation to permanent failures.

    We also mention (Manian et al. 1998; Sullivan et al. 1999), which describe DIFTree, a methodology supporting (however, still at the manual level) fault tree construction and allowing for different kinds of analyses of sub-trees (e.g., Markovian or Monte Carlo simulation for dynamic ones, and BDD-based evaluation for static ones). The notation the authors use for non-logical (dynamic) gates of fault trees and the support for sample probabilistic distributions could be useful features to integrate in our framework.

    Finally, the line of research concerning ordering analysis has been carried out inside the ESACS project. As a contribution to the project, we are developing an integrated platform providing the safety engineers with tools for the specification, analysis and validation of complex systems. Formal verification functionalities of the platform are based on model checking, and in particular on NuSMV (Cimatti et al. 2002). In this paper we have focused only on the aspects related to the minimization procedure. We refer the reader to (Bozzano et al. 2003), where a more detailed description of the project goals, the ESACS methodology, and more realistic examples to which the methodology has been applied can be found.

    REFERENCES

    Aldemir, T. (1987). Computer-assisted Markov Failure Modeling of Process Control Systems. IEEE Transactions on Reliability R-36, 133–144.

    Bozzano, M. et al. (2003). ESACS: An Integrated Methodology for Design and Safety Analysis of Complex Systems. In Proc. of European Safety and Reliability Conference (ESREL'03).

    Bryant, R. (1992). Symbolic Boolean Manipulation with Ordered Binary Decision Diagrams. ACM Computing Surveys 24(3), 293–318.

    Cacciabue, P. and G. Cojazzi (1994). A human factor methodology for safety assessment based on the Dylam approach. Reliability Engineering and System Safety 45, 127–138.

    Cacciabue, P. and G. Cojazzi (1995). An integrated simulation approach for the analysis of pilot-aeroplane interaction. Control Engineering Practice 3(2), 257–266.

    Cimatti, A., E. Clarke, E. Giunchiglia, F. Giunchiglia, M. Pistore, M. Roveri, R. Sebastiani, and A. Tacchella (2002). NuSMV2: An OpenSource Tool for Symbolic Model Checking. In Proc. 14th International Conference on Computer Aided Verification (CAV'02), LNCS. Springer-Verlag.

    Cimatti, A., E. Clarke, F. Giunchiglia, and M. Roveri (2000). NUSMV: a new symbolic model checker. International Journal on Software Tools for Technology Transfer 2(4), 410–425.

    Clarke, E., O. Grumberg, and D. Peled (2000). Model Checking. MIT Press.

    Cojazzi, G., J. M. Izquierdo, E. Melendez, and M. S. Perea (1992). The Reliability and Safety Assessment of Protection Systems by the Use of Dynamic Event Trees. The DYLAM-TRETA Package. In Proc. XVIII Annual Meeting Spanish Nucl. Soc.

    Coudert, O. and J. Madre (1992). Implicit and Incremental Computation of Primes and Essential Primes of Boolean Functions. In Proc. 29th Design Automation Conference (DAC'92), pp. 36–39. IEEE Computer Society Press.

    Coudert, O. and J. Madre (1993). Fault Tree Analysis: 10^20 Prime Implicants and Beyond. In Proc. Annual Reliability and Maintainability Symposium.

    Devooght, J. and C. Smidts (1994). Probabilistic Dynamics; The Mathematical and Computing Problems Ahead. In T. Aldemir, N. O. Siu, A. Mosleh, P. C. Cacciabue, and B. G. Goktepe (Eds.), Reliability and Safety Assessment of Dynamic Process Systems, Volume 120 of NATO ASI Series F, pp. 85–100. Springer-Verlag.

    Emerson, E. (1990). Temporal and modal logic. In J. van Leeuwen (Ed.), Handbook of Theoretical Computer Science, Volume B, pp. 995–1072. Elsevier Science.

    Henzinger, T. A. (1996). The Theory of Hybrid Automata. In Proc. 11th Annual International Symposium on Logic in Computer Science (LICS'96), pp. 278–292. IEEE Computer Society Press.

    Henzinger, T. A., P.-H. Ho, and H. Wong-Toi (1997). HyTech: A Model Checker for Hybrid Systems. Software Tools for Technology Transfer 1, 110–122.

    Holzmann, G. (1997). The Model Checker SPIN. IEEE Transactions on Software Engineering 23(5), 279

    Larsen, K., P. Pettersson, and W. Yi (1997). UPPAAL in a Nutshell. International Journal on Software Tools for Technology Transfer 1(1-2), 134–152.

    Liggesmeyer, P. and M. Rothfelder (1998). Improving System Reliability with Automatic Fault Tree Generation. In Proc. 28th International Symposium on Fault-Tolerant Computing (FTCS'98), Munich, Germany, pp. 90–99. IEEE Computer Society Press.

    Manian, R., J. Dugan, D. Coppit, and K. Sullivan (1998). Combining Various Solution Techniques for Dynamic Fault Tree Analysis of Computer Systems. In Proc. 3rd International High-Assurance Systems Engineering Symposium (HASE'98), pp. 21–28. IEEE Computer Society Press.

    Manquinho, V., A. Oliveira, and J. Marques-Silva (1998). Models and Algorithms for Computing Minimum-Size Prime Implicants. In Proc. International Workshop on Boolean Problems (IWBP'98).

    Marseguerra, M., E. Zio, J. Devooght, and P. E. Labeau (1998). A concept paper on dynamic reliability via Monte Carlo simulation. Mathematics and Computers in Simulation 47, 371–382.

    McMillan, K. (1993). Symbolic Model Checking. Kluwer Academic Publ.

    Papazoglou, I. A. (1994). Markovian Reliability Analysis of Dynamic Systems. In T. Aldemir, N. O. Siu, A. Mosleh, P. C. Cacciabue, and B. G. Goktepe (Eds.), Reliability and Safety Assessment of Dynamic Process Systems, Volume 120 of NATO ASI Series F, pp. 24–43. Springer-Verlag.

    Rae, A. (2000). Automatic Fault Tree Generation - Missile Defence System Case Study. Technical Report 00-36, Software Verification Research Centre, University of Queensland.

    Senni, S., M. Semenza, and R. Galvagni (1991). ADMIRA: An Analytical Dynamic Methodology for Integrated Risk Assessment. In G. Apostolakis (Ed.), Probabilistic Safety Assessment and Management, pp. 407–412. Elsevier Science.

    Siu, N. O. (1994). Risk Assessment for Dynamic Systems: An Overview. Reliability Engineering and System Safety 43, 43–74.

    Smidts, C. and J. Devooght (1992). Probabilistic Reactor Dynamics II. A Monte-Carlo Study of a Fast Reactor Transient. Nuclear Science and Engineering 111(3), 241–256.

    Sullivan, K., J. Dugan, and D. Coppit (1999). The Galileo Fault Tree Analysis Tool. In Proc. 29th Annual International Symposium on Fault-Tolerant Computing (FTCS'99), pp. 232–235. IEEE Computer Society Press.

    Vesely, W., F. Goldberg, N. Roberts, and D. Haasl (1981). Fault Tree Handbook. Technical Report NUREG-0492, Systems and Reliability Research, Office of Nuclear Regulatory Research, U.S. Nuclear Regulatory Commission.