State model based

State Model Based

Disciplina: Tópicos Avançado em Avaliação de Desempenho de Sistemas Aluno: Rafael Roque de Souza Professores: Eduardo Tavares, Ricardo Massa

Agenda � Dependability:

� Motivation � Dependability � Attributes � Model Base State � Modelos Formais

� Makov chain � Petri Net

� Tools � Conclusão

2

3/81DRCN2009 Copyright © by Kishor S. Trivedi

Health & Medicine

Communication

Avionics

Entertainment Banking

Motivation: Dependence on Computer SystemsMotivation

� Dependability:

�  The concept of dependable computing first appears in the 1830’s in the context of Babbage’s Calculating Engine [1,2].

�  The ability to deliver service that can be justifiably be trusted [3]. �  The ability to avoid service failures that are more frequent and

more severe than acceptable [3].

What is Dependability?

More… Default approach: Utilize a formalism to model system dependability • Quantify the availability of components, calculate system availability based on this data and a set of assumptions - the availability model

• Most models expose the same expressiveness • Each formalism allows to focus on certain aspects • Component-based models: Reliability block diagram, fault tree • State-based models: Markov chains, petri nets

• System understanding evolved from hardware to software to IT infrastructures

What is Dependability?

•  Some assumptions

–  All failure and repair events are exponentially distributed –  Components are either fully working or completely failed –  All failure and repair events are pair-wisely stochastically independent –  Correct functioning at t can be treated as event, with event probability

derived from the availability value computed by failure rate and repair rate

Exemple

Dependable Systems Course PT 2011

Dependability Modeling

3

Up Down

µ = 1MTTR

� = 1MTTF

• Some assumptions

• All failure and repair events are exponentially distributed

• Components are either fully working or completely failed

• All failure and repair events are pair-wisely stochastically independent

• Correct functioning at t can be treated as event, with event probability derived from the availability value computed by failure rate and repair rate

Schedule � Dependability:

� Dependability � Attributes � Model Base State � Modelos Formais



7

According to different properties, which may be more or less emphasized depending on the application intended for the computer system under consideration:

�  Availability is always required, although to a varying degree depending on the application;

�  reliability, safety, confidentiality may or may not be required according to the application [3].

Dependability and its a4ributes

Dependability and its a4ributes

� Trustworthiness of a computer system such that reliance can justifiably be placed on the service it delivers, It encompasses the following attributes:

� Availability: readiness for correct service

� Reliability: continuity of correct service

� Safety: absence of catastrophic consequences

�  Integrity: absence of improper system alterations

� Maintainability: ability to undergo modifications and repairs 9

What is Availability? Availability is the probability that the system will still be operating to requirements at a given time.

� Availability is a function not only of how rarely a system fails (reliability) but also of how quickly it can be repaired (time to repair)

�  Availability of 0.998 means software is available for 998 out of 1000 time units

10

11

What is Reliability? �  Reliability is the probability that the system will deliver a set of

services for a given period of time, whereas a system is fault tolerant when it does not fail even when there are faulty components.

Example

•  Reliability x Availability? •  Look at the example:

–  Consider a site online that negotiators get down to 1 minute every 4 hours, ie every 240 (4x60) minutes. The availability and 239/240 = 99 583% (relative high availability)

– Reliability: can it be low down periods occur at critical times when the market isfluctuating and clients who trade their shares!

Safety is an extension of reliability. When the state of correct service and the states of incorrect service due to non-catastrophic failure are grouped into a safe state (in the sense of being free from catastrophic damage, not from danger), safety is a measure of continuous safeness, or equivalently, of the time to catastrophic failure. Safety is thus reliability with respect to catastrophic failures.

What is Safety

•  A measure of the time to service restoration since the last failure occurrence, or equivalently, measure of the continuous delivery of incorrect service;

•  The maintainability model situations where the system fails and the return to proper functioning of the state requires any maintenance. The maintainability and defined as the probability of a repair system was successfully completed in given time.

What is Maintainability


� Dependability � Attributs � Model Base State � Model



15

State-space methods are much more comprehensive. They allow explicit modeling of complex relationships (e.g., [5]), and their transition structure encodes important sequencing information. Historically, state-space methods have been explored in the context of mathematical models that specify probabilistic assumptions about time durations and transition behavior. We now review those models and comment on how they are being applied in the security context.

What is Model State?

•  In contrast with state-space models, combinatorial models do not enumerate all possible system states to obtain a solution. Instead, simpler approaches are used to compute system dependability measures. Despite several extensions that have been made to combinatorial models, they do not easily capture certain features, such as stochastic dependence and imperfect fault coverage. We present a brief overview of combinatorial models.

What is Model State?

•  Reliability: “The ability of a system to perform a required function under given conditions for a given time interval.” No recovery is assumed after system fails (there can be recovery after a component failure)

•  • Availability: “The ability of a system to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval.“

Dependability Attributes or Measures


Reliability Availability

Dependability Measures

Dependability Attributes or Measures

• Reliability: “The ability of a system to perform a required function under given conditions for a given time interval.” No recovery is assumed after system fails (there can be recovery after a component failure)• Availability: “The ability of a system to be in a state to perform a required function at a given instant of time or at any instant of time within a given time interval.“


Numerical solution tool

Close-formsolution

Dependability Evaluation Methods

Model-based

Discrete-event simulation

Hybrid

Analytic Models

Quantitative evaluation

Measurement-based

Numerical solutionOf analytic modelsNot as well utilized;

Unnecessarily excessiveUse of simulation

Dependability Evaluation Methods

Modeling Taxonomy


Modeling Taxonomy

Hierarchical modelslargeness

Abstract models

Discrete-event simulation

Hybrid

Analytic models

Combinatorial modelsEfficiency, simplicity

State-space modelsDependency capture

•  States and labeled state transitions •  State can keep track of:

–  Number of functioning resources of each type –  States of recovery for each failed resource –  Number of tasks of each type waiting at each resource –  Allocation of resources to tasks

•  A transition: –  Can occur from any state to any other state –  Can represent a simple or a compound event

State-Space Models

•  State space explosion problem or the largeness problem •  Stochastic Petri nets and related formalisms for easy specification

and automated generation/solution of underlying Markov model

•  Or use hierarchical (Multilevel) model composition. –  e.g. Upper level : FT or RBD, lower level: Markov chains –  Many practical examples of the use of hierarchical models exist

Problem with State Space Models

•  Can relax the assumption of exponential distributions

State-Space model taxonomy


State-Space model taxonomy

(discrete) State space models

Markovian models

non-Markovian models

discrete-time Markov chains (DTMC)

continuous-time Markov chains (CTMC)

Markov reward models (MRM)

Semi-Markov process (SMP)

Markov regenerative process

Non-Homogeneous Markov

Can relax the assumption of exponential distributions


� Dependability � Attributs � Model Base State � Modelos Formais



24

•  Discrete random process, usually drawn as state transition diagram • Markov property - Next step depends only on the current step

•  Impossible to predict future states, but useful for statistical

properties • Finite state space (chain), transitions with probabilities, initial

state probabilities • Transient state - Probability to not return to this state (finite number of visits) • Recurrent state - Probability of 1 to return to this state after unspecified time t

• Mean recurrence time can be used as MTTF metric • Time-homogeneous Markov chains - Transition probabilities do not change in time.

What is Markov Chain


Markov Chains

• Discrete random process, usually drawn as state transition diagram

• Markov property - Next step depends only on the current step

• Impossible to predict future states, but useful for statistical properties

• Finite state space (chain), transitions with probabilities, initial state probabilities

• Transient state - Probability to not return to this state (finite number of visits)

• Recurrent state - Probability of 1 to return to this state after unspecified time t

• Mean recurrence time can be used as MTTF metric

• Time-homogeneous Markov chains - Transition probabilities do not change in time

8


Markov Chains

• Discrete random process, usually drawn as state transition diagram

• Markov property - Next step depends only on the current step

• Impossible to predict future states, but useful for statistical properties

• Finite state space (chain), transitions with probabilities, initial state probabilities

• Transient state - Probability to not return to this state (finite number of visits)

• Recurrent state - Probability of 1 to return to this state after unspecified time t

• Mean recurrence time can be used as MTTF metric

• Time-homogeneous Markov chains - Transition probabilities do not change in time

8

•  Discrete-time Markov chain (DTMC) • System state only changes after a fixed time interval, system is in

exactly one state • Transition to next state depends on transition probability (non-

negative) at t • Each row of the probability transition matrix represents flow out

of that state, the columns the transition flow into the state, row sum is one •  Continuous-time Markov chain (CTMC)

• Allows state changes at any instance of time - continous parameter space, still discrete state space • Transition to next state after spending some time in a state - holding time • Generator matrix Q therefore expresses transition rates instead of probabilities • By definition, the diagonal entries are equal to minus the total rate out of that state –  Rates with which no state change takes place

What is Markov Chain

•  Initial distribution vector can be combined with transition matrix to find probabilities for being in one of the states after one step

•  Each row sum of the transition matrix is 1

Markov Chains - DTMC Example



10

Transition Matrix

• Each row sum of the transition matrix is 1

• Initial distribution vector can be combined with transition matrix to find probabilities for being in one of the states after one step

Probability Matrix after 2 steps

(C) Tamara Lynn Anthony



10

Transition Matrix







10

Transition Matrix





•  Each state represents a particular error state, transition with component failure rate • States expresses number of failed components at any given time • Time-homogeneous process - Failure / repair rates do not change over time • Components have identical failure rates and identical repair rates

–  Failure and repair events are independent, process is memory-less • Row sum is zero: Probability mass flowing out of state i will go to some other state

–  Example:

Markov Chains - CTMC Example


Dependability Modeling with CTMCs

• Each state represents a particular error state, transition with component failure rate

• States expresses number of failed components at any given time

• Time-homogeneous process - Failure / repair rates do not change over time

• Components have identical failure rates and identical repair rates

• Failure and repair events are independent, process is memory-less

• Row sum is zero: Probability mass flowing out of state i will go to some other state

• Example:

11

3�s0 = µs1

3�s0 + 2µs2 = µs1 + 2�s1

2�s1 + 3µs3 = 2µs2 + �s2

�s2 = 3µs3

s0 + s1 + s2 + s3 = 1

s0 =µ

3�s1

s2 =�

µs1

s3 =�2

3µ2s1

s1 =3µ2�

(µ+ �)3

s0 =µ3

(µ+ �)3; s1 =

3µ2�

(µ+ �)3; s2 =

3µ�2

(µ+ �)3; s3 =

�3

(µ+ �)3

A = s0 + s1 =µ2(µ+ 3�)

(µ+ �)3= 3a2 + 2a3 a =

µ

(µ+ �)


Example: 2-of-3 System

13


•  Interested in steady-state availability of the system • Interpretation as steady-state probability for the system being

operational at –  Derived from probability vector -> contains steady-state

probabilities for the system being in one of the failure states after a number of steps

–  ,Static‘ steady-state availability computable if probabilities are in equilibrium

•  Probability for leaving state is similar to probability for going into that state - probability mass is evenly distributed

–  Typically achieved after a high number of steps




• Interested in steady-state availability of the system

• Interpretation as steady-state probability for the system being operational at t

• Derived from probability vector -> contains steady-state probabilities for the system being in one of the failure states after a number of steps

• ,Static‘ steady-state availability computable if probabilities are in equilibrium

• Probability for leaving state is similar to probability for going into that state - probability mass is evenly distributed

• Typically achieved after a high number of steps

12

•  Resulting formula equals to result from Boolean investigation, but Markov chains also support non-independent events - common cause failure

•  Markov chains grow exponentially with their number of components - which is bad

–  Divide-and-conquer - Decompose and aggregate chain parts –  Structural decomposition - Consider a system as set of independent subsystems –  Behavioral decomposition - Assume time constants for some fault occurences and handling

processes based on criticality - e.g. fault in parked airplane

Markov Chains


Markov Chains

• Resulting formula equals to result from Boolean investigation, but Markov chains also support non-independent events - common cause failure

• Markov chains grow exponentially with their number of components - which is bad

• Divide-and-conquer - Decompose and aggregate chain parts

• Structural decomposition - Consider a system as set of independent subsystems

• Behavioral decomposition - Assume time constants for some fault occurences and handling processes based on criticality - e.g. fault in parked airplane

14

•  Mathematical model for concurrent systems with many components (Carl Adam Petri)

•  Bipartit directed graph (places vs. transitions) •  Each place has a capacity for tokens, default is unlimited or one •  Each arc has a weight expressing a cost factor, default is one •  Places are pre- / postconditions for transitions •  Distribution of tokens is called a marking

• Every net has an initial marking

What is Petri Nets


Stochatic Petri Nets

• Mathematical model for concurrent systems with many components (Carl Adam Petri)

• Bipartit directed graph (places vs. transitions)

• Each place has a capacity for tokens, default is unlimited or one

• Each arc has a weight expressing a cost factor, default is one

• Places are pre- / postconditions for transitions

• Distribution of tokens is called a marking

• Every net has an initial marking

16

Place / State

Transition

Token

Input place of the transition

Output place of the transition

Redes de Petri EstocásticasExtension SPN

•  Transition is activated (may fire) when –  All input places contain enough tokens for the transition costs –  All output places have enough capacity to take the new tokens

•  Tokens are consumed and placed in output places, considering the arc weights

•  Atomic nondeterminstic operation - any activated transition may fire •  Firing happens with given delay •  More complex Petri net versions can •  distinguish different token types

–  Colored tokens (data values) –  Activation times for tokens

•  Petri nets allow both formal analysis (for exponential distribution) and simulation

Stochastic Petri Nets



• Transition is activated (may fire) when

• All input places contain enough tokens for the transition costs

• All output places have enough capacity to take the new tokens

• Tokens are consumed and placed in output places, considering the arc weights

• Atomic nondeterminstic operation - any activated transition may fire

• Firing happens with given delay

• More complex Petri net versions can distinguish different token types

• Colored tokens (data values)

• Activation times for tokens

• Petri nets allow both formal analysis (for exponential distribution) and simulation

17

•  A stochastic process and a sequence of random variables indexed on time witha well-defined correlation structure

•  Have probability distributions associated with them –  Arrival of customers in a bank queue –  Number of requests in a Web Server

•  Why stochastic modeling? –  In many systems, you need to join in time to events

•  How to model stochastic processes? –  Analytical queuing –  theory models –  Petri Nets


•  Probabilistic behavior model •  Distributions:

–  Exponential - SPN (stochastic Petri net) –  exponential or immediate GSPN(Generalized) –  There are other models with arbitrary functions


•  Reachability set –  All possible markings reachable from an initial marking –  Possible analysis questions

•  Can some system state (e.g. an error state) be reached at all ?

•  Exists a firing sequence that transforms M0 to M ?

• Boundedness –  Marking is bounded if there is a k so that for every reachable

marking the number of tokens in each place is bounded by k –  Useful for modeling limited (bounded) resources

Typical Petri Net Properties

•  Complexity of the petri net does not depend on the number of components !




23

Complexity of the petri net does not depend on the number of components !

•  Modeling of cold standby components (inhibitor arc) •  Limited repair capacities - at most R repairmen available at a time •  Dependability analysis - prove that there is no state where some

property is violated

Example: K-of-N With Standby and Repairmen


Example: K-of-N With Standby and Repairmen

• Modeling of cold standby components (inhibitor arc)

• Limited repair capacities - at most R repairmen available at a time

• Dependability analysis - prove that there is no state where some property is violated

24

•  buffer size = #token_capacity(p1 + p2) •  unit count = #token_capacity(p3 + p4) •  Firing rate of t1 is arrival rate

t2 is an immediate transition •  Firing rate of t3 is the service rate, depends on token count in p4

Example: Parallel System with Input Buffer



25

Input buffer with positions

Identical units

• buffer size = #token_capacity(p1 + p2)

• unit count = #token_capacity(p3 + p4)

• Firing rate of t1 is arrival rate

• t2 is an immediate transition

• Firing rate of t3 is the service rate, depends on token count in p4

Free buffer positions

Filled buffer positions

Free units

(C) A

nd

rea B

ob

bio

Active units



26

(C) Andrea Bobbio

• Light lines - Fault free operation

• Heavy lines - Failures

• Dotted lines - repairs

• Rate computation demands exponential distribution


•  In many cases, simulation is the only way to solve the net • More than one outgoing non-exponential distribution • Special guard functions • Complexity issues

• ... • Typical simulation problems

• Modeled failure rates might be small, so many runs needed for valid result

• Random number generation • Confidence intervals

Petri Net Simulation

•  Petri net has according reachability graph • Combines to Markov chain when transition probabilities are given

Petri Net ->Markov Chain


Petri Net -> Markov Chain

• Petri net has according reachability graph

• Combines to Markov chain when transition probabilities are given

21





21





21





21


� Dependability � Attributs � Model Base State � Modelos Formais



44

•  Astro •  Mercury •  TimeNet •  Sharp •  CPN Tool •  INA •  ...

Tools

•  [1] D. Lardner, Babbage's calculating engine. Edinburgh Review, July 1834. Reprinted in P. Morrison and E. Morrison, editors, Charles Babbage and His Calculating Engines. Dover, 1961.

•  [2] C. Babbage. On the mathematical powers of the calculating engine (December 1837). Unpublished Manuscript. Buxton MS7, Museum of the History of Science. In B. Randell, editor, The Origins of Digital Computers: Selected papers, pages 17-52. Springer, 1974.

•  [3]Fundamental Concepts of Dependability by A Avizienis, J C Laprie, B Randell, Brian Randell •  K. Goseva-Popstojanov, K. S. Trivedi, Stochastic Modeling Formalisms for Dependability,

Performance and Performability, LNCS 1769, 2000 •  [5] David M. Nicol, Fellow, IEEE, William H. Sanders, Fellow, IEEE, and Kishor S. Trivedi, Fellow,

IEEE, Dependability to Security Model-Based Evaluation •  [6] . K. Muppala, M. Malhotra, and K. S. Trivedi, “Markov dependability models of complex

systems: Analysis techniques,” in Reliability and Maintenance of Complex Systems, S. Ozekici, Ed. Berlin, Germany: Springer, 1996, pp. 442–486.

•  [7] Vedran Kordic, Petri Net Theory and Applications •  [8] Peter J. Haas, Stochastic Petri Nets- Modelling, Stability, Simulation, Springer. •  [9] Ebeling, C. E., An Introduction to Reliability and Maintainability Engineering. Illinois, Waveland

Press, 1997

References

•  http://www.ee.duke.edu/~kst/ •  http://www.modcs.org •  http://www.informatik.uni-hamburg.de/TGI/PetriNets •  http://www.informatik.uni-hamburg.de/TGI/PetriNets/tools/quick.html •  http://tandem.bu.edu/rsg.html

Links

Thank You!

Thanks

Technology

State model based