ELEC-C7210 Modeling and analysis of communication networks
9. Reliability theory
Material based on original slides by Tuomas Tirronen


Contents

• Introduction
• Structural system models
• Reliability of structures of independent repairable components
• Reliable network topology design


History

• As a technological concept, reliability emerged after WW1; practical methods were developed during and after WW2
  – For example, Lusser's law, i.e., the product probability law for a series of components, was formulated by Robert Lusser during V1 flying bomb tests
  – It arose from the need to improve and control the quality of industrial products with many parts
• 50s, 60s
  – ballistic missiles, space programs
  – first journal, IEEE Transactions on Reliability, 1963
• 70s
  – safety of nuclear power plants
• 80s, 90s
  – oil and gas industries, computer programs to evaluate reliability, software reliability, ...
• 00s: new kinds of operation concepts (remote control/maintenance of systems) require reliability analysis, network reliability, etc.


Approaches to reliability

• Hardware reliability
  – Physical approach
    • Strength S of an item is a random variable
    • Load L the item is exposed to is another random variable
    • Reliability R = Pr(S > L)
    • Structural reliability analysis
  – Actuarial approach ← our approach
    • Time to failure T is studied using its distribution F(t)
    • All information about individual strengths, loads, etc. is conveyed in F(t)
    • System reliability analysis
• Software reliability
• Human reliability
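To make the physical approach concrete, the sketch below evaluates R = Pr(S > L) under an assumption that is not made in the slides: strength S and load L are independent and normally distributed. In that case S − L is also normal, so R reduces to a standard normal CDF. The function name and the numeric parameters are illustrative only.

```python
import math

def stress_strength_reliability(mu_s, sigma_s, mu_l, sigma_l):
    """R = Pr(S > L) for independent normal strength S and load L (assumed model)."""
    # S - L is normal with mean mu_s - mu_l and variance sigma_s^2 + sigma_l^2.
    z = (mu_s - mu_l) / math.sqrt(sigma_s ** 2 + sigma_l ** 2)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    # Illustrative numbers, not taken from the lecture.
    print(stress_strength_reliability(mu_s=100.0, sigma_s=10.0, mu_l=70.0, sigma_l=10.0))
```

With other distributions for S and L the same probability would generally have to be computed by numerical integration or simulation.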


Basic concepts (1)

• Reliability: The ability of an item to perform a required function, under given environmental and operational conditions and for a stated period of time (ISO 8402)
  – the item can be a single component or a larger entity (system)
  – the required function may refer to a single function or many
• Quality: The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs (ISO 8402)
  – i.e., "conformance to specifications"
  – reliability can be seen as an extension of quality into the time domain
• Availability: The ability of an item to perform its required function at a stated instant of time or over a stated period of time (BS 4778)
  – i.e., can the item be used at some time instant, or what is the fraction of time the item is usable (= average availability)


Basic concepts (2)

• Maintainability: The ability of an item, under stated conditions, to be retained in, or restored to, a state in which it can perform its required functions, when maintenance is performed under stated conditions and using prescribed procedures and resources (BS 4778)
  – if an item can be repaired, then maintainability determines the availability of the item
• Dependability: Collective term to describe availability performance and its influencing factors: reliability performance, maintainability performance and maintenance support performance (IEC 60300)
  – umbrella term often used when covering reliability issues
• Safety: Freedom from those conditions that can cause death, injury, occupational illness, or damage to or loss of equipment or property (MIL-STD-882D)
• Security: Dependability with respect to prevention of deliberate hostile actions


Basic concepts (3)

• Fault
  – a defect or mistake which leads to an error; the reason for an error
• Error
  – a system state which can lead to a failure
• Failure
  – The termination of the ability of an item to perform a required function (BS 4778)
  – An unacceptable deviation from the design tolerance or in the anticipated delivered service, an incorrect output, the incapacity to perform the desired function (NASA 2002)


Basic concepts (4)

[Figure: causal chain Cause → Fault → Error → Failure]

Fault prevention
• aim is to design a system without faults
• physical shielding of components, careful manufacturing, etc.

Fault tolerance
• aim is to be able to provide the service even in the presence of faults
• main tool: redundancy!
  – hardware
  – software
  – information
  – time


Repairable and nonrepairable items

• Two types of items can be studied
  – Nonrepairable items
    • The item can be a single item or a larger system
    • We are only interested in the time until the first failure; whatever happens after this is of no interest to us
    • Interesting measures include: mean time to failure, reliability (function) and failure rate
  – Repairable items (our focus)
    • Single item or larger system
    • Interesting measures include: availability, mean time between failures, mean down time, number of failures in some time interval
    • In some sources the term dependability is used instead of availability to mean the same thing


Systems of items

• We also study systems of many items or components. There are two possibilities for modeling systems:
  – Systems of independent components (our focus)
    • Easy analysis: independence of components → probabilities of component states multiply
    • Most examples during this course assume independence
  – Systems with dependent components
    • Exact analysis is harder, even impossible, because of the dependencies
    • Analysis of the system as a stochastic process


Tools and models

• As a function of time, we have state models where systems are modelled as stochastic processes (cf. queueing models in earlier lectures)
  – especially repairable items/systems
  – the failure process, repair times, etc.
• Structure of a system and its subsystems → structural models
  – reliability block diagrams, structure function
• Tools:
  – Basic probability theory
  – Stochastic processes
    • Markov chains/processes
    • (Renewal processes)
  – Statistical methods
• Main limitations of "probabilistic reliability analysis": human errors, human factor


Applications

• Risk analysis
  – Identification of accidental events
  – Causal analysis
  – Consequence analysis
• Environmental protection
• Quality
• Optimization and maintenance
• Engineering design
• Verification of quality
• Research and development
• ...


Reliability in communications and networking (1)

• From the user's point of view, an interesting quality of service concept is the network availability
  – $A(t) = \Pr(\text{user can access the agreed network services at time } t)$
  – Average availability tells us the fraction of time the system is available
• A way to understand the availability of networks is to study the downtime of a network (or the outage of some specific service) per year

# nines    Avg. availability    Downtime / year
2-nines    0.99                 87.6 hours
3-nines    0.999                8 hours 46 mins
4-nines    0.9999               52 mins 34 secs
5-nines    0.99999              5 mins 15 secs
6-nines    0.999999             31.5 secs
7-nines    0.9999999            3.15 secs
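The downtime column of the table follows directly from the average availability. The sketch below reproduces the calculation, assuming a year of 365 days (8760 hours), which is the year length the tabulated values are consistent with. The function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumed year length

def availability_for_nines(n):
    """Average availability with n nines, e.g. n = 3 gives 0.999."""
    return 1.0 - 10.0 ** (-n)

def downtime_hours_per_year(availability):
    """Expected downtime per year for a given average availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

if __name__ == "__main__":
    for n in range(2, 8):
        a = availability_for_nines(n)
        seconds = downtime_hours_per_year(a) * 3600
        print(f"{n}-nines  A = {a:.7f}  downtime = {seconds:.2f} s/year")
```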


Reliability in communications and networking (2)

• In addition to availability, network operators, service providers and equipment manufacturers are also interested in
  – reliability of components (mean times to failure, number of failures in some time interval, etc.)
  – maintainability
  – security of networks
• Reliability is an important factor when planning new services, networks or equipment
• Note that dependability, reliability and availability may have different definitions in different sources. Take care to check how each concept is defined in the source at hand.


Aim of the lecture

• We focus on
  – Repairable systems
  – Systems of independent components
  – Exponentially distributed times to failure and down times
  – Thus, we get simple models using Markovian analysis
  – Applying the models to topology design of communication networks, where availability is defined as connectivity of the network


Literature

• Reliability theory / Dependability
  – System Reliability Theory: Models, Statistical Methods and Applications, 2nd edition, Marvin Rausand and Arnljot Høyland, Wiley, 2004
  – Mesh-Based Survivable Networks: Options and Strategies for Optical, MPLS, SONET and ATM Networking, Wayne D. Grover, Prentice Hall, 2004
  – Lecture notes: Luotettavuus, käytettävyys, huollettavuus [Reliability, availability, maintainability] (luotettavuusteoria.pdf), Keijo Ruohonen, TTKK, 2002
  – TKK courses AS-116.3180, Mat-2.3118




Reliability block diagrams

• Reliability block diagrams (RBD) are used to describe the function of a system of components
  – an RBD shows the logical connections between components
• A system works if there is a path of functioning components from the start point (a) to the end point (b)
• RBDs give a deterministic model for the structure of a system
  – the whole system works properly if and only if some set of the components functions
• It is important to determine which specific function of the system is modelled: the logical structure may be different for different functions


Series and parallel structures

• When a system functions if and only if all of the components function, the logical structure is a series structure
• When a system functions if and only if at least one of its n components functions, the logical structure is a parallel structure
• Series and parallel structures can be further combined to model more complex structures

[Figure: two reliability block diagrams between end points a and b: a series structure of components 1, 2, 3, 4 and a parallel structure of components 1, 2, 3, 4]


Structure function (1)

• The state vector of a structure is $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, where each state variable $x_i$ is 1 when component i is functioning and 0 when component i is in a failed state
• The structure function of the system is

$$\phi(\mathbf{x}) = \begin{cases} 1 & \text{if the system is functioning} \\ 0 & \text{if the system is in a failed state} \end{cases}$$

• For a series structure, the structure function is

$$\phi(\mathbf{x}) = x_1 \cdot x_2 \cdots x_n = \prod_{i=1}^{n} x_i$$

  – the system works if and only if $x_i = 1$ for all i


Structure function (2)

• For a parallel structure, the structure function is

$$\phi(\mathbf{x}) = 1 - (1-x_1)(1-x_2)\cdots(1-x_n) = 1 - \prod_{i=1}^{n}(1-x_i) = \coprod_{i=1}^{n} x_i$$

  – If any $x_i = 1$, then the system functions
  – The last operator $\coprod$ (upside-down product) is read "ip"
• Example: for a structure with 2 components in parallel we have

$$\phi(x_1, x_2) = \coprod_{i=1}^{2} x_i = 1 - (1-x_1)(1-x_2) = x_1 + x_2 - x_1 x_2$$
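The series and parallel structure functions translate almost literally into code. The following is a minimal sketch; the function names `phi_series` and `phi_parallel` are illustrative and the state vector is passed as a tuple of 0/1 values.

```python
from math import prod

def phi_series(x):
    """Series structure: 1 only if every component state is 1."""
    return prod(x)

def phi_parallel(x):
    """Parallel structure ("ip"): 1 if at least one component state is 1."""
    return 1 - prod(1 - xi for xi in x)

if __name__ == "__main__":
    for x in [(1, 1), (1, 0), (0, 1), (0, 0)]:
        x1, x2 = x
        # Two-component identity: 1 - (1 - x1)(1 - x2) = x1 + x2 - x1*x2
        assert phi_parallel(x) == x1 + x2 - x1 * x2
        print(x, phi_series(x), phi_parallel(x))
```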


Path set and cut set methods

• For small systems the structure function $\phi(\mathbf{x})$ can be written down by visually inspecting the system as a combination of series and parallel structures
• However, for large systems this is not feasible!
• Therefore, we need systematic computational methods for generating the structure function $\phi(\mathbf{x})$
  – Path set and cut set methods allow this


Path/cut sets (1)

• Definition: A path set P is a set of components which by functioning ensure that the system is functioning. A path set is minimal if it cannot be reduced without losing its status as a path set.
• Definition: A cut set K is a set of components which by failing cause the system to fail. A cut set is minimal if it cannot be reduced without losing its status as a cut set.
• Example:

[Figure: component 1 in series with the parallel pair of components 2 and 3]

  Path sets: {1,2}, {1,3}, {1,2,3}
  Cut sets: {1}, {2,3}, {1,2}, {1,3}, {1,2,3}
  Minimal path sets: $P_1 = \{1,2\}$, $P_2 = \{1,3\}$
  Minimal cut sets: $K_1 = \{1\}$, $K_2 = \{2,3\}$
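For a system this small, the path and cut sets in the example can be checked by brute force. The sketch below enumerates all non-empty subsets of the three components and applies the two definitions directly. It is only a verification aid under the assumption that the example system is component 1 in series with the parallel pair (2, 3); exhaustive enumeration does not scale to large systems.

```python
from itertools import combinations

COMPONENTS = (1, 2, 3)

def phi(working):
    """Example system: component 1 in series with (2 parallel 3)."""
    return 1 in working and (2 in working or 3 in working)

def all_subsets(items):
    """All non-empty subsets of items as frozensets."""
    for r in range(1, len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def path_sets():
    # A path set keeps the system up even if every other component has failed.
    return [s for s in all_subsets(COMPONENTS) if phi(s)]

def cut_sets():
    # A cut set brings the system down even if every other component works.
    everything = frozenset(COMPONENTS)
    return [s for s in all_subsets(COMPONENTS) if not phi(everything - s)]

def minimal(sets):
    # Keep only the sets that have no proper subset in the same family.
    return [s for s in sets if not any(t < s for t in sets)]

if __name__ == "__main__":
    print("minimal path sets:", [sorted(s) for s in minimal(path_sets())])
    print("minimal cut sets: ", [sorted(s) for s in minimal(cut_sets())])
```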


Path set method

• Let $\rho_j(\mathbf{x})$ denote the structure function of the jth minimal path set; its components form a series structure:

$$\rho_j(\mathbf{x}) = \prod_{i \in P_j} x_i$$

• The whole structure functions if and only if at least one minimal path set is functioning:

$$\phi(\mathbf{x}) = \coprod_{j} \rho_j(\mathbf{x}) = 1 - \prod_{j} \bigl(1 - \rho_j(\mathbf{x})\bigr) = \coprod_{j} \prod_{i \in P_j} x_i$$

• Path set method:
  1. Determine the path sets of the structure
  2. Determine the minimal path sets $P_j$
  3. Calculate the structure functions of the minimal path sets as series structures
  4. Take "ip" over all functions you get in step 3
  5. Simplify as needed (TIP: a power of a binary variable equals the variable itself, $x_i^j = x_i$)
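As a worked illustration of the five steps, consider the earlier example with minimal path sets {1,2} and {1,3}. The series structures of the two minimal path sets are $x_1 x_2$ and $x_1 x_3$, and taking "ip" over them and simplifying gives

$$\phi(\mathbf{x}) = 1 - (1 - x_1 x_2)(1 - x_1 x_3) = x_1 x_2 + x_1 x_3 - x_1^2 x_2 x_3 = x_1 x_2 + x_1 x_3 - x_1 x_2 x_3$$

where the last step uses $x_1^2 = x_1$ for a binary variable.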


Cut set method

• Let $\kappa_j(\mathbf{x})$ denote the structure function of the jth minimal cut set; its components form a parallel structure:

$$\kappa_j(\mathbf{x}) = \coprod_{i \in K_j} x_i = 1 - \prod_{i \in K_j} (1 - x_i)$$

• Now the structure fails if and only if at least one structure corresponding to a minimal cut set fails:

$$\phi(\mathbf{x}) = \prod_{j} \kappa_j(\mathbf{x}) = \prod_{j} \coprod_{i \in K_j} x_i$$

• Cut set method:
  1. Determine the cut sets of the structure
  2. Determine the minimal cut sets $K_j$
  3. Calculate the structure functions of the minimal cut sets as parallel structures
  4. Multiply all functions you get in step 3
  5. Simplify as needed
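The path set and cut set methods must give the same structure function. The sketch below checks this numerically for the earlier example (minimal path sets {1,2}, {1,3}; minimal cut sets {1}, {2,3}) by evaluating both constructions on all eight states. The helper name `ip` and the dictionary-based state representation are illustrative choices.

```python
from itertools import product
from math import prod

MIN_PATH_SETS = [{1, 2}, {1, 3}]
MIN_CUT_SETS = [{1}, {2, 3}]

def ip(values):
    """The "ip" operator: 1 - prod(1 - v) over the given values."""
    return 1 - prod(1 - v for v in values)

def phi_path(x):
    # "ip" over the series structures of the minimal path sets
    return ip(prod(x[i] for i in p) for p in MIN_PATH_SETS)

def phi_cut(x):
    # product over the parallel structures of the minimal cut sets
    return prod(ip(x[i] for i in k) for k in MIN_CUT_SETS)

def phi_direct(x):
    # component 1 in series with the parallel pair (2, 3)
    return x[1] * ip([x[2], x[3]])

if __name__ == "__main__":
    for bits in product((0, 1), repeat=3):
        x = dict(zip((1, 2, 3), bits))
        print(bits, phi_direct(x), phi_path(x), phi_cut(x))
```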


Demo/Exercise

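• Determine the structure function of the system of independent components below
  a) directly (by using the results for series/parallel structures and combining them)
  b) using the path set method
  c) using the cut set method

[Figure: component 1 in series with the parallel pair of components 2 and 3]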




Repairable components/systems

• Now we study systems where components can be repaired or replaced upon failures (or even before), i.e., repairable components
• We are interested, for example, in
  – system reliability
  – component/system availability
  – mean number of failures during a time interval
  – mean time between failures, MTBF
  – mean downtime (or repair time) of systems, MDT (MTTR)
• For this purpose we can model the systems/failure processes as stochastic processes
  – thus, we have already studied the theoretical background at the beginning of this course


Reliability of maintained systems (1)

• The system is called maintainable when its components are repaired/restored to working condition using some kind of maintenance
  – Can be preventive, corrective, …
• Let X(t) denote the stochastic process of the system, with X(t) = 1 if the system is operational and 0 otherwise
• The main measure is availability, A(t)
  – the unavailability, $U(t) = \bar{A}(t) = 1 - A(t)$, is also studied

Availability:
$$A(t) = P\{X(t) = 1\}$$

Average availability:
$$A_{av}(\tau) = \frac{1}{\tau} \int_0^{\tau} A(t)\,dt$$

Long run average availability:
$$A_{av} = \lim_{\tau \to \infty} A_{av}(\tau) = \lim_{\tau \to \infty} \frac{1}{\tau} \int_0^{\tau} A(t)\,dt$$

Limiting availability (when it exists):
$$A = \lim_{t \to \infty} A(t)$$


Availability of single component as on-off process (1)

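• We can model a single component as an on-off type process X(t) with

$$X(t) = \begin{cases} 1 & \text{if the component is operational} \\ 0 & \text{otherwise} \end{cases}$$

• Measures related to maintainable systems are
  – Mean time between failures, MTBF
  – Mean downtime, MDT
  – Mean time to failure, MTTF

[Figure: sample path of the on-off process X(t) versus t, with the intervals MTBF, MTTF and MDT marked]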


Reliability of single component as on-off process (2)

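• Markov model
  – Times to failure are independent and exponentially distributed with mean 1/λ (= MTTF)
  – Down times are independent and exponentially distributed with mean 1/μ (= MDT)

[Figure: two-state Markov chain with states 1 (up) and 0 (down); transition rate λ from 1 to 0 and μ from 0 to 1; up times ~ Exp(λ), down times ~ Exp(μ)]

• Steady state distribution simply:

$$\pi_1 = \frac{\mu}{\lambda + \mu} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MDT}} = A_{av}, \qquad \pi_0 = \frac{\lambda}{\lambda + \mu} = \frac{\mathrm{MDT}}{\mathrm{MTTF} + \mathrm{MDT}} = U_{av}$$

  – The steady-state distribution holds even when the up and down times have general distributions (but are still independent): insensitivity property
  – The process is then no longer Markovian but a so-called renewal process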
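The steady state distribution follows from a short calculation. Balancing the probability flow between the two states and normalising gives

$$\lambda \pi_1 = \mu \pi_0, \qquad \pi_0 + \pi_1 = 1 \quad\Longrightarrow\quad \pi_1 = \frac{\mu}{\lambda + \mu}, \quad \pi_0 = \frac{\lambda}{\lambda + \mu}$$

and substituting $\lambda = 1/\mathrm{MTTF}$ and $\mu = 1/\mathrm{MDT}$ yields

$$\pi_1 = \frac{1/\mathrm{MDT}}{1/\mathrm{MTTF} + 1/\mathrm{MDT}} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MDT}}$$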


Examples (1)

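• Example 1: A machine has MTTF = 1000 hours and MDT = 5 hours. The average availability is

$$A_{av} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MDT}} = \frac{1000}{1000 + 5} \approx 0.995$$

• Example 2: An item has independent uptimes with constant failure rate λ. Downtimes are IID with mean MDT. Usually MDT << MTTF, and the average unavailability is then approximately

$$\bar{A}_{av} = 1 - A_{av} = 1 - \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MDT}} = \frac{\mathrm{MDT}}{\mathrm{MTTF} + \mathrm{MDT}} = \frac{\lambda \cdot \mathrm{MDT}}{1 + \lambda \cdot \mathrm{MDT}} \approx \lambda \cdot \mathrm{MDT}$$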
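Both examples are one-line computations. The sketch below evaluates them with the numbers of Example 1, so that the exact unavailability can be compared with the approximation λ · MDT; the function names are illustrative.

```python
def average_availability(mttf, mdt):
    """A_av = MTTF / (MTTF + MDT)."""
    return mttf / (mttf + mdt)

def average_unavailability(mttf, mdt):
    """Exact average unavailability, MDT / (MTTF + MDT)."""
    return mdt / (mttf + mdt)

def approx_unavailability(failure_rate, mdt):
    """Approximation lambda * MDT, reasonable only when MDT << MTTF."""
    return failure_rate * mdt

if __name__ == "__main__":
    mttf, mdt = 1000.0, 5.0  # hours, as in Example 1
    print("availability        :", average_availability(mttf, mdt))
    print("unavailability exact:", average_unavailability(mttf, mdt))
    print("unavailability appr.:", approx_unavailability(1.0 / mttf, mdt))
```

The approximation slightly overestimates the unavailability, since it drops the term λ · MDT from the denominator.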


Systems of independent components (1)

• Consider a system consisting of n independent components
  – The state vector of the system is $\mathbf{X}(t) = (X_1(t), X_2(t), \ldots, X_n(t))$
• The times to failure and down times of component i are independent and exponentially distributed with means $1/\lambda_i$ and $1/\mu_i$, respectively
  – Let $p_i = P\{X_i = 1\} = \mu_i / (\lambda_i + \mu_i)$
  – That is, $p_i$ is the availability of component i
• Then the steady state distribution of state $\mathbf{x} = (x_1, \ldots, x_n)$ is simply the product of the Bernoulli distributions of the components,

$$\pi(\mathbf{x}) = \prod_{i=1}^{n} p_i^{x_i} (1 - p_i)^{1 - x_i}$$

  – Again, the distribution holds even under general distributions for the up and down times (insensitivity)


Systems of independent components (2)

• In general, the average availability of the system is defined as

$$A_{av} = P\{\phi(\mathbf{X}) = 1\}$$

• The state space Ω can be partitioned into two sets
  1. Up states $\Omega_{\text{up}}$
     • where the system is working. Note that some components may be in a failed state, but the system still provides the intended service.
  2. Down states $\Omega_{\text{down}}$
     • where the system does not perform the required function
• The (average) availability of the system is given by

$$A_{av} = \sum_{\mathbf{x} \in \Omega_{\text{up}}} \pi(\mathbf{x}) = \sum_{\mathbf{x} \in \Omega} 1\{\phi(\mathbf{x}) = 1\}\, \pi(\mathbf{x})$$

  – similarly, unavailability is the sum of the probabilities of the down states
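The sum over up states can be evaluated directly for small systems. The sketch below enumerates all 2^n states, weights each with the product-form probability, and adds up the states where the structure function equals 1. The example structure (component 1 in series with the parallel pair 2, 3) and the availabilities are illustrative assumptions, not data from the lecture.

```python
from itertools import product

def state_probability(x, p):
    """Product-form probability of state x given component availabilities p."""
    prob = 1.0
    for xi, pi in zip(x, p):
        prob *= pi if xi == 1 else 1.0 - pi
    return prob

def system_availability(phi, p):
    """Sum the probabilities of the up states, i.e. states with phi(x) == 1."""
    return sum(state_probability(x, p)
               for x in product((0, 1), repeat=len(p))
               if phi(x) == 1)

def phi_example(x):
    # Hypothetical structure: component 1 in series with (2 parallel 3)
    x1, x2, x3 = x
    return x1 * (1 - (1 - x2) * (1 - x3))

if __name__ == "__main__":
    p = (0.99, 0.95, 0.98)  # illustrative availabilities
    print(system_availability(phi_example, p))
```

The enumeration visits 2^n states, so it is practical only for a small number of components.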


Systems of independent components (3)

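• As $\phi(\mathbf{X})$ is a binary-valued function,

$$A_{av} = P\{\phi(\mathbf{X}) = 1\} = E[\phi(\mathbf{X})]$$

• For a series structure (independence of the $X_i$'s!):

$$A_{av} = E[\phi(\mathbf{X})] = E\Bigl[\prod_{i=1}^{n} X_i\Bigr] = \prod_{i=1}^{n} E[X_i] = \prod_{i=1}^{n} p_i$$

• And similarly for a parallel structure:

$$A_{av} = E[\phi(\mathbf{X})] = E\Bigl[\coprod_{i=1}^{n} X_i\Bigr] = 1 - E\Bigl[\prod_{i=1}^{n} (1 - X_i)\Bigr] = 1 - \prod_{i=1}^{n} (1 - E[X_i]) = 1 - \prod_{i=1}^{n} (1 - p_i) = \coprod_{i=1}^{n} p_i$$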


Systems of independent components (4)

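• However, in general

$$A_{av} = E[\phi(\mathbf{X})] \neq \phi(\mathbf{p}), \qquad \mathbf{p} = (p_1, \ldots, p_n)$$

  – Thus, to calculate the availability one cannot just write down the structure function $\phi(\mathbf{x})$ and replace the $x_i$'s by the corresponding $p_i$'s!
• Instead, the function must first be simplified
  – Note that $\phi(\mathbf{x})$ is a polynomial function
  – All higher powers of the $x_i$'s are equal to $x_i$, i.e., $x_i^2 \to x_i$, etc. (for a binary $X_i$, $E[X_i^2] = E[X_i] = p_i$, not $p_i^2$)
  – To the simplified structure function one can then apply the expectation operator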
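The effect of skipping the simplification step can be seen numerically. The sketch below uses the path-set form of the earlier three-component example: on binary inputs the unsimplified and simplified polynomials agree, but substituting availabilities into the unsimplified form gives a different (wrong) number, because $x_1$ appears twice in it. The availabilities are illustrative, not from the lecture.

```python
from itertools import product

def unsimplified(x1, x2, x3):
    # Path-set form before simplification: 1 - (1 - x1*x2)(1 - x1*x3)
    return 1 - (1 - x1 * x2) * (1 - x1 * x3)

def simplified(x1, x2, x3):
    # After expanding and using x1**2 = x1 for a binary x1
    return x1 * x2 + x1 * x3 - x1 * x2 * x3

if __name__ == "__main__":
    # As structure functions (binary inputs) the two forms are identical.
    assert all(unsimplified(*x) == simplified(*x)
               for x in product((0, 1), repeat=3))
    p = (0.9, 0.8, 0.7)  # illustrative availabilities
    print("naive substitution  :", unsimplified(*p))
    print("correct availability:", simplified(*p))
```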


Demo/exercise

• Calculate the availability of the system below using the data given in the table
• Hint: Use the structure function derived earlier and the availabilities of the components

[Figure: component 1 in series with the parallel pair of components 2 and 3]

i    MTTF_i (hours)    MDT_i (hours)
1    750               8
2    300               15
3    500               10


Models with state-dependent rates

• Earlier we assumed that components are completely independent of each other
• Markov models can have state-dependent rates
  – The dynamics (or transition rates) may depend on the state to reflect some physical causes resulting from the given state
  – For example, if there is only one repairman, the repair rates are affected when there are many faults
  – But we still assume that the times to failure and the down times obey exponential distributions
• One can construct the associated Markov process and solve the steady state via the global balance equations


Example

• Consider a parallel structure of two components. Uptimes are exponentially distributed with rates $\lambda_1$ and $\lambda_2$. Repair rates are, correspondingly, $\mu_1$ and $\mu_2$. Also, there is only one person to repair, and when both components are down he splits his time equally between components 1 and 2.

System state    State of component 1    State of component 2
0               1                       1
1               0                       1
2               1                       0
3               0                       0

[Figure: state transition diagram with rates 0→1: $\lambda_1$, 0→2: $\lambda_2$, 1→0: $\mu_1$, 2→0: $\mu_2$, 1→3: $\lambda_2$, 2→3: $\lambda_1$, 3→1: $\mu_2/2$, 3→2: $\mu_1/2$]

• Now solve the equilibrium probabilities $\pi_i$. The average availability is the probability that at least one component works:

$$A_{av} = \pi_0 + \pi_1 + \pi_2$$
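The equilibrium probabilities of this four-state chain can be obtained by solving the global balance equations numerically. The sketch below builds the transition rate matrix from the state table (state 3 is left towards state 1 at rate μ2/2 and towards state 2 at rate μ1/2), replaces one balance equation with the normalisation condition, and solves the linear system with a small hand-written Gaussian elimination so that no external library is needed. Function names and the numeric rates are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def steady_state(l1, l2, m1, m2):
    """Equilibrium probabilities (pi_0, ..., pi_3) of the two-component example."""
    q = [[0.0] * 4 for _ in range(4)]   # q[i][j]: rate from state i to state j
    q[0][1], q[0][2] = l1, l2           # component 1 fails / component 2 fails
    q[1][0], q[1][3] = m1, l2           # component 1 repaired / component 2 fails
    q[2][0], q[2][3] = m2, l1           # component 2 repaired / component 1 fails
    q[3][1], q[3][2] = m2 / 2, m1 / 2   # shared repairman works at half rate on each
    for i in range(4):
        q[i][i] = -sum(q[i])            # diagonal makes every row sum to zero
    # Global balance pi Q = 0, with the last equation replaced by sum(pi) = 1.
    a = [[q[i][j] for i in range(4)] for j in range(4)]
    a[3] = [1.0] * 4
    return solve_linear(a, [0.0, 0.0, 0.0, 1.0])

def availability(l1, l2, m1, m2):
    pi = steady_state(l1, l2, m1, m2)
    return pi[0] + pi[1] + pi[2]

if __name__ == "__main__":
    # Illustrative rates (per hour), not taken from the lecture.
    print(availability(l1=0.001, l2=0.002, m1=0.1, m2=0.2))
```

For the symmetric case with all four rates equal to 1, the balance equations can be solved by hand and give π = (0.2, 0.2, 0.2, 0.4), i.e. an availability of 0.6, which is a convenient sanity check.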




Topology design problem

• Topology design is the starting point in network design
• Think of the network as a graph with nodes connected by links
  – Typically, network topology is heavily influenced by the set of physical locations that need connectivity, so the nodes are often given
  – Also, many of the primary links between nodes are defined by the node locations
  – In practice, the design space allows adding only a few additional links and nodes
• The question is:
  – Given a network topology (nodes + links), what is a reliable network?
  – By considering the network as a graph, reliability/availability can be formalized by the notion of graph connectivity


Graphs and k-connectivity

• Consider the network as a graph G(N,J) consisting of a set of nodes N and a set of links J
• Definition: A graph is said to be connected if there exists a path between every pair of nodes in the graph.
• Definition: Graph G is k-edge-connected if it remains connected after removal of any k-1 edges.
  – Remember: edge = link
• Definition: Graph G is k-vertex-connected if it remains connected after removal of any k-1 vertices.
  – Remember: vertex = node
  – Removal of a node means that all links connected to the node are also removed from the graph
• Efficient algorithms exist to check the k-connectivity of a graph
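The definition of k-edge-connectivity can be applied literally on small graphs. The sketch below is a brute-force check, not one of the efficient algorithms mentioned above: it removes every possible set of k-1 edges and tests connectivity with a depth-first search. The function names and the list-of-pairs edge representation are illustrative choices.

```python
from itertools import combinations

def is_connected(nodes, edges):
    """Depth-first search from an arbitrary node over undirected edges."""
    nodes = list(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

def is_k_edge_connected(nodes, edges, k):
    """Brute force: the graph must stay connected after removing any k-1 edges."""
    edges = list(edges)
    # If the graph has fewer than k-1 edges, removing all of them is the worst case.
    r = min(k - 1, len(edges))
    return all(
        is_connected(nodes, [e for i, e in enumerate(edges) if i not in removed])
        for removed in combinations(range(len(edges)), r)
    )

if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(is_k_edge_connected(range(4), ring, 2))  # a ring survives any single link failure
    print(is_k_edge_connected(range(4), ring, 3))
```

The number of edge subsets grows combinatorially with k and the number of links, so this approach is meant for illustration only.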


Examples

• 1-edge-connected

• 2-edge-connected


Topology design method (1)

• Topology design objective:
  – For redundancy, all nodes in the network need to be at least 2-(edge)-connected with probability 0.99999 (i.e., "5 nines")
  – That is, the network must be resilient to single link failures
• Consider a given network topology represented by graph G(N,J)
• Assume that link $j \in J$ is operational with probability $p_j$, but the nodes are perfectly reliable
  – State $\mathbf{x} = (x_1, \ldots, x_J)$
  – State space $\Omega = \{0,1\}^J$


Topology design method (2)

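• The structure function is then

$$\phi(\mathbf{x}) = \begin{cases} 1, & \text{if the network in state } \mathbf{x} \text{ is 2-connected} \\ 0, & \text{otherwise} \end{cases}$$

• And the availability is defined as

$$A_{av} = P\{\text{network is 2-connected}\} = \sum_{\mathbf{x} \in \Omega:\ \phi(\mathbf{x}) = 1} \pi(\mathbf{x})$$

  – Note that the size of the state space is $2^J$ (grows exponentially!)
• If the availability is too low, new links need to be added
  – Need to define heuristics for identifying the most useful locations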
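For a small topology the availability sum can be evaluated by enumerating all 2^J link states. The sketch below does this under the stated assumptions (independent links, perfectly reliable nodes) and interprets "2-connected" as 2-edge-connected, checked by brute force: the surviving graph must be connected and stay connected after removing any single link. The example topologies and link probabilities are illustrative.

```python
from itertools import product

def is_connected(nodes, edges):
    """Depth-first search from an arbitrary node over undirected edges."""
    nodes = list(nodes)
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

def is_two_edge_connected(nodes, edges):
    # Connected, and still connected after removing any single link.
    if not is_connected(nodes, edges):
        return False
    return all(is_connected(nodes, edges[:i] + edges[i + 1:])
               for i in range(len(edges)))

def two_connected_availability(nodes, links, p):
    """Sum pi(x) over all link states x whose surviving graph is 2-edge-connected."""
    total = 0.0
    for x in product((0, 1), repeat=len(links)):
        prob = 1.0
        for xi, pi in zip(x, p):
            prob *= pi if xi else 1.0 - pi
        up_links = [link for link, xi in zip(links, x) if xi]
        if is_two_edge_connected(nodes, up_links):
            total += prob
    return total

if __name__ == "__main__":
    nodes = [0, 1, 2, 3]
    links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # ring of four plus one chord
    print(two_connected_availability(nodes, links, [0.999] * len(links)))
```

In this example the chord does not help: once any ring link is down, one of the nodes is left with a single link, so the network is 2-edge-connected only while the whole ring is up. This illustrates why the placement of additional links matters.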


Topology design method (3)

• Taking into account node failures
  – Node $n \in N$ is operational with probability $q_n$
• We still require that all nodes must stay 2-connected with 5-nines
  – Thus, all nodes must then be operational and

$$A_{av} = P\{\text{2-connected} \mid \text{all nodes on}\} \cdot P\{\text{all nodes on}\} = (q_1 \cdots q_N) \cdot P\{\text{2-connected} \mid \text{all nodes on}\}$$

  – The conditional probability of 2-(edge)-connectedness is evaluated as before, assuming that nodes do not fail
• Note! This is just one version of the topology design objective, and new ones can easily be defined.


THE END

• What you should understand/remember:
  – what kinds of things reliability theory studies
  – basic measures: MTTF, MDT
  – how to calculate the structure function of simple systems and how to use it to calculate the availability/reliability of a system
  – how to make Markov models of simple maintained systems and calculate the availability
  – how graph connectivity can be used as a measure of reliability in data network topology design