Estimating Component Availability by Dempster-Shafer Belief Networks

Lan Guo
Lane Department of Computer Science & Electrical Engineering
West Virginia University, Morgantown, WV 26506
Background

This work is based on the research of estimating component availability of a large, distributed network (Y. Yu and E. Stoker, ISSRE'01).
The dataset was obtained from field observation over 18 months.
Bayesian Belief Network (BBN) and traditional MTTR probability computation were used in the previous work.
We would like to develop a novel, objective methodology to estimate component availability.
Drawbacks of BBNs

Bayesian Belief Networks (BBNs) are subject to human biases and logical inconsistency:
The structure of the BBN is based on the subjective opinions of domain experts.
The prior in Bayes' Theorem is subjective.
A uniform prior is logically inconsistent.
A BBN example:
[Diagram: a BBN with nodes slept-in, traffic, and late.]
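The toy example above can be made concrete. The sketch below assumes slept-in and traffic are the parent nodes of late, as in the usual version of this example; every probability is a made-up illustrative value, which is precisely the subjectivity the slides criticize.

```python
# Hypothetical BBN for the slide's example: slept-in -> late <- traffic.
P_slept_in = 0.1          # subjective prior (illustrative only)
P_traffic = 0.3           # subjective prior (illustrative only)

# Conditional probability table for late, indexed by (slept_in, traffic).
P_late = {
    (True, True): 0.95,
    (True, False): 0.80,
    (False, True): 0.40,
    (False, False): 0.05,
}

def marginal_late():
    """Marginalize out the parents: P(late) = sum over parent states."""
    total = 0.0
    for s in (True, False):
        for t in (True, False):
            ps = P_slept_in if s else 1 - P_slept_in
            pt = P_traffic if t else 1 - P_traffic
            total += ps * pt * P_late[(s, t)]
    return total

print(round(marginal_late(), 4))  # prints 0.224
```

Every number in the table comes from a domain expert's opinion; change any prior and the marginal changes with it, which is the subjectivity the next slides set out to remove.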
Why D-S Belief Networks

The Dempster-Shafer (D-S) Belief Network is a complete formalism of evidential reasoning.
The D-S inference scheme is a more general and robust theory than Bayes' Theorem.
The D-S Belief Network and the D-S theory are objective and free of human biases.
How the D-S Network Works

The Induction Algorithm builds the belief network automatically from the dataset.
Belief for certain node(s) is dynamically updated based on evidence by Dempster's rule of combination.
Updated belief is propagated through the whole network by the Belief Revision Algorithm.
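The belief-update step above relies on Dempster's rule of combination, which is standard D-S theory: multiply the masses of every pair of focal elements, discard the conflicting (empty-intersection) mass, and renormalize. A minimal sketch, with a made-up two-sensor availability example:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    m1, m2: dicts mapping frozenset hypotheses (subsets of the frame of
    discernment) to mass.  Conflicting mass (empty intersection) is
    discarded and the remainder renormalized.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    k = 1.0 - conflict
    return {h: w / k for h, w in combined.items()}

# Two hypothetical evidence sources on whether a component is up or down.
UP, DOWN = frozenset({"up"}), frozenset({"down"})
BOTH = UP | DOWN                      # total ignorance
m1 = {UP: 0.6, BOTH: 0.4}
m2 = {UP: 0.7, DOWN: 0.2, BOTH: 0.1}
m12 = dempster_combine(m1, m2)
```

Here the 0.6 x 0.2 mass assigned to the contradictory pair (up, down) is the conflict; after renormalizing by 1 - 0.12 = 0.88, the combined belief in "up" rises to about 0.86.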
Improvement upon the Former Induction Algorithm

Drawbacks of the former Induction Algorithm:
The Induction Algorithm by Liu et al. is dramatically dependent on the sample size.
It violates the assumption of the Binomial Distribution that the sample size must be constant.
It gives erroneous results for the dataset.
Our Induction Algorithm is based on a sound scheme: prediction logic.
Our Induction Algorithm

Begin
  Set a significance level ∇_min and a minimal U_min
  For node_p, p ∈ [0, n_max − 1] and node_q, q ∈ [p + 1, n_max]
      (Note: n_max is the total number of nodes)
    For all empirical case samples N, compute a contingency table
      M_pq = | N11  N12 |
             | N21  N22 |
    For each relation type k out of the six cases, find the solution to
      Max U_p
      subject to  U_p > U_min
                  ∇_p ≥ ∇_min
                  ω_ij = 1 or 0 (if N_ij corresponds to an error cell,
                    ω_ij = 1; otherwise, ω_ij = 0)
                  ∇(b) > ∇(b′) if ω(b) = 1 and ω(b′) = 0
    If the solution exists, then return a type k relation
End
Our Induction Algorithm (continued)

For a single error cell, if N_ij is the number of error occurrences:
  U_p = U_ij = (N_i· × N_·j) / N²
  ∇_p = ∇_ij = 1 − P_ij / U_ij, where P_ij = N_ij / N

For multiple error cells:
  U_p = Σ_ij ω_ij U_ij   (ω_ij = 1 for error cells; otherwise, ω_ij = 0)
  ∇_p = 1 − (Σ_ij ω_ij P_ij) / (Σ_ij ω_ij U_ij)
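The scope U and precision ∇ above are direct arithmetic on the contingency table. A minimal sketch, assuming a hypothetical 2x2 table of counts and one designated error cell (the table values below are invented for illustration):

```python
def scope_and_precision(table, error_cells):
    """Scope U and precision (the 'del' measure) for a proposition.

    table:       2x2 contingency table of counts, table[i][j] = N_ij.
    error_cells: set of (i, j) cells the proposition predicts should be
                 empty (omega_ij = 1).
    Follows the slide formulas: U_ij = N_i. * N_.j / N**2 and
    precision = 1 - (sum of P_ij) / (sum of U_ij) over error cells.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(2)) for j in range(2)]
    u = sum(row_tot[i] * col_tot[j] / n**2 for i, j in error_cells)
    p = sum(table[i][j] / n for i, j in error_cells)
    return u, 1.0 - p / u

# Hypothetical counts: rows = component degraded?, cols = unavailable?
table = [[40, 5],    # N11, N12
         [10, 45]]   # N21, N22
U, precision = scope_and_precision(table, {(0, 1)})
```

With only 5 of the expected 22.5 cases landing in the error cell, precision comes out near 0.78, so the candidate relation would pass a ∇_min of, say, 0.5.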
Experiment

We started with the Bayesian network for estimating component availability in the large distributed network.
Based on the node probability tables associated with the Bayesian network, we generated two sets of data samples:
one for constructing the D-S belief network, with 1000 data points;
the other for validating the evidential reasoning scheme, with 100 data points.
We applied our induction algorithm to induce the implication relationship between each pair of nodes.
Experiment (continued)

For the testing sample, we randomly selected an unobserved node, used its value as the new evidence, and propagated the updated belief values to the other reachable nodes.
For each of the unobserved nodes, we compared the predicted belief value with the value in the testing sample, and output the evaluation metrics. We repeated these two steps until all nodes were observed.
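The validation loop just described can be sketched as follows. The `propagate` callback stands in for the D-S belief revision step, which the slides do not spell out; the node names, sample values, and the always-0.5 dummy predictor are all invented for illustration.

```python
import random

def validate(nodes, test_samples, propagate):
    """Reveal one unobserved node at a time as evidence, re-propagate,
    and record the absolute error on every still-unobserved node.

    propagate(evidence, unobserved) is assumed to return a predicted
    belief value for each unobserved node given the evidence so far.
    """
    errors = []
    for sample in test_samples:
        unobserved = list(nodes)
        evidence = {}
        while unobserved:
            node = random.choice(unobserved)   # randomly pick a node to observe
            evidence[node] = sample[node]
            unobserved.remove(node)
            predicted = propagate(evidence, unobserved)
            errors.extend(abs(predicted[n] - sample[n]) for n in unobserved)
    return sum(errors) / len(errors) if errors else 0.0

# Dummy propagation that always predicts 0.5, just to exercise the loop.
nodes = ["cpu", "disk", "net"]
samples = [{n: 0.7 for n in nodes}]
mean_err = validate(nodes, samples, lambda ev, un: {n: 0.5 for n in un})
```

Each test sample contributes one error per (reveal step, remaining node) pair, so the mean error naturally shrinks as more nodes are observed, which is what the first results chart plots.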
Evaluation Metrics

The absolute difference between the actual value in the testing sample and the computed belief value:
  Δ_X = |Bel_emp(X) − Bel_est(X)|

Mean estimate error:
  S̄ = (1 / (N × n_max)) Σ_{i=1}^{N} Σ_{j=1}^{n_max} S_ij

Standard error of estimate:
  S_e = sqrt( Σ_{i=1}^{N} Σ_{j=1}^{n_max} (S_ij − S̄)² / (N × n_max − 1) )

(S_ij is the estimate error Δ_X for node j in testing sample i.)
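The two metrics are ordinary sample statistics over the per-node errors S_ij. A short sketch on a toy 2x2 matrix of actual and predicted belief values (the numbers are invented):

```python
from math import sqrt

def evaluation_metrics(bel_emp, bel_est):
    """Mean estimate error and standard error of estimate.

    bel_emp, bel_est: N x n_max matrices (lists of lists) of actual and
    predicted belief values; S_ij = |Bel_emp(X) - Bel_est(X)|.
    """
    s = [abs(e - p)
         for row_e, row_p in zip(bel_emp, bel_est)
         for e, p in zip(row_e, row_p)]
    mean = sum(s) / len(s)
    std_err = sqrt(sum((x - mean) ** 2 for x in s) / (len(s) - 1))
    return mean, std_err

bel_emp = [[1.0, 0.0], [0.5, 0.5]]   # actual values (toy data)
bel_est = [[0.8, 0.1], [0.5, 0.3]]   # predicted belief values (toy data)
mean_err, std_err = evaluation_metrics(bel_emp, bel_est)
```

Here the errors S are [0.2, 0.1, 0.0, 0.2], giving a mean estimate error of 0.125; the standard error uses the N x n_max − 1 denominator from the slide formula.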
Results (1)

[Chart: mean error vs. number of nodes observed (1–10), comparing no inference against the implication method; y-axis (mean error) spans 0 to 0.8.]
Results (2)

[Chart: availability vs. sample size, comparing observation, D-S belief, traditional probability, and Bayesian belief; y-axis (availability) spans 0.6 to 0.95.]
Conclusions

Our Induction Algorithm is an efficient, sound, dynamic, and general means for automatically constructing D-S belief networks.
The induced belief network is free from human biases.
The implication method over the D-S network greatly reduced the prediction error.
This study is the first attempt to apply the D-S belief network to software reliability engineering.
Our future work includes employing the entropy notion for optimal inference with greater prediction accuracy over the whole network.