IMPERIAL COLLEGE LONDON
DEPARTMENT OF
MECHANICAL ENGINEERING
Quantification of Uncertainty in
Probabilistic Safety Analysis
Engineering Doctorate Thesis
Ashraf Ben Mamdouh El-Shanawany
March 2017
DECLARATION OF ORIGINALITY
The work in this thesis is my own and all else is appropriately referenced.
COPYRIGHT DECLARATION
The copyright of this thesis rests with the author and is made available under a Creative Commons
Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or
transmit the thesis on the condition that they attribute it, that they do not use it for commercial
purposes and that they do not alter, transform or build upon it. For any reuse or redistribution,
researchers must make clear to others the licence terms of this work.
ACKNOWLEDGEMENTS
I am grateful to many people who have supported me during this work, both personally and
professionally. In particular I would like to thank my parents, Mamdouh El-Shanawany and Elizabeth
El-Shanawany, my supervisor Simon Walker, Keith Ardron, Charles Shepherd, Lavinia Raganelli,
Jasbir Sidhu and Rebecca Brewer for their support, in all its many forms.
ABSTRACT
This thesis develops methods for quantification and interpretation of uncertainty in probabilistic
safety analysis, focussing on fault trees. The output of a fault tree analysis is, usually, the probability
of occurrence of an undesirable event (top event) calculated using the failure probabilities of
identified basic events. The standard method for evaluating the uncertainty distribution is by Monte
Carlo simulation, but this is a computationally intensive approach to uncertainty estimation and does
not, readily, reveal the dominant reasons for the uncertainty. A closed form approximation for the
fault tree top event uncertainty distribution, for models using only lognormal distributions for model
inputs, is developed in this thesis. Its output is compared with the output from two sampling based
approximation methods; standard Monte Carlo analysis, and Wilks’ method, which is based on order
statistics using small sample sizes. Wilks’ method can be used to provide an upper bound for the
percentiles of the top event distribution, and is computationally cheap. The combination of the lognormal
approximation and Wilks’ method can be used to give, respectively, the overall shape and high
confidence bounds on particular percentiles of interest. This is an attractive, practical option for evaluation of
uncertainty in fault trees and, more generally, uncertainty in certain multilinear models. A new
practical method of ranking uncertainty contributors in lognormal models is developed which can be
evaluated in closed form, based on cutset uncertainty. The method is demonstrated via examples,
including a simple fault tree model and a model which is the size of a commercial PSA model for a
nuclear power plant. Finally, quantification of “hidden uncertainties” is considered; hidden
uncertainties are those which are not typically considered in PSA models, but may contribute
considerable uncertainty to the overall results if included. A specific example of the inclusion of a
missing uncertainty is explained in detail, and the effects on PSA quantification are considered. It is
demonstrated that the effect on the PSA results can be significant, potentially permuting the order of
the most important cutsets, which is of practical concern for the interpretation of PSA models. Finally,
suggestions are made for the identification and inclusion of further hidden uncertainties.
CONTENTS
NOMENCLATURE & ACRONYMS 10
MATHEMATICAL SYMBOLS & NOTATION 12
1. INTRODUCTION 14
1.1 Benefits and Limitations of PSA 15
1.2 Thesis Aims & Structure 18
2. REVIEW OF PREVIOUS WORK ON UNCERTAINTY IN PSA 20
2.1 Probability Formulations 20
2.1.1 Frequentist 22
2.1.2 Bayesian 24
2.1.3 Uncertainty Analysis – General Formulation 26
2.2 Risk 28
2.3 Types of Uncertainty 29
2.3.1 Types of Uncertainty 29
2.3.2 Uncertainty Domains 33
2.4 Sources of Uncertainty in PSA Models 40
2.4.1 Quantification Uncertainty 40
2.4.2 Component Reliability Uncertainty 41
2.4.3 Scenario Uncertainty 42
2.4.4 CCF Uncertainty 43
2.4.5 Human Reliability Uncertainty 45
2.4.6 Success Criteria Uncertainty 46
2.4.7 Model Completeness Uncertainty 47
2.5 PSA Importance Measures 50
2.5.1 Basic Event Importance Measures 50
2.5.2 Sensitivity and Uncertainty Importance Measures Applied to Uncertainties 56
2.6 Further Work 64
3. CLOSED FORM UNCERTAINTY APPROXIMATION IN LOGICAL MODELS 65
3.1 Fault Tree Quantification 65
3.2 The Lognormal Distribution 67
3.3 Product of Lognormal Distributions 68
3.4 Sum of Lognormal Distributions 68
3.4.1 Fenton-Wilkinson 69
3.5 Uncertainty Approximation for Lognormal Models 73
3.5.1 Uncertainty Closed Form Approximation Using FW 1st and 2nd Moment
Matching 73
3.6 Method for Uncertainty Approximation Based on Order Statistics 74
3.6.1 Order Statistics 74
3.7 Application of Methods to Fault Tree Example 75
3.7.1 Example Fault Tree 76
3.7.2 Comparison of Results 77
3.8 Discussion & Further Work 80
4. DEVELOPMENT OF METHOD FOR RANKING UNCERTAINTIES 82
4.1 Analytic Expressions for Uncertainty Importance in Lognormal Models 82
4.1.1 Uncertainty Importance for Linear Models 82
4.1.2 Fractional Contributions to the Variance 82
4.1.3 Uncertainty Importance for Multi-Linear Models 83
4.1.4 Uncertainty Importance for Multi-Linear Models Using Only Lognormal
Inputs 85
4.2 Ranking Method for “Known” Uncertainties 86
4.2.1 Uncertainty Contributors in Large, Realistically Valued Models 86
4.2.2 Example Model 1: System X 90
4.2.3 Example Model 2: Facility Sized Model 102
5. INCLUSION OF MODEL UNCERTAINTIES IN PSA 115
5.1 Structural Uncertainty and Success Criteria 115
5.2 Example: Incorporating Hidden Uncertainty Using Basic Event Uncertainty 118
5.2.1 Xenon Decay Transient 119
5.2.2 Time Factors 120
5.2.3 Incorporating Uncertainty 121
5.2.4 Results 122
5.2.5 Discussion 127
5.3 Proposals for Incorporating Hidden Modelling Uncertainties into a PSA 127
6. CONCLUSIONS AND FURTHER WORK 130
REFERENCES 133
TABLES
Table 1: Basic Event Distributions ....................................................................................................... 77
Table 2: Cutsets for Example Fault Tree .............................................................................................. 77
Table 3: Comparison of Approximated Percentiles .............................................................................. 79
Table 4: Cutset Variance For Two Lognormal Basic Events, Case 1 .................................................... 88
Table 5: Cutset Variance For Two Lognormal Basic Events, Case 2 .................................................... 88
Table 6: Cutset Variance For Two Lognormal Basic Events, Case 3 .................................................... 88
Table 7: Failure Parameters for the Model in Figure 5 ......................................................................... 90
Table 8: Combinations of Inputs and Outputs for Gate S ..................................................................... 91
Table 9: Combinations of Inputs and Outputs for Gate T ..................................................................... 92
Table 10: Combinations of Inputs and Outputs for Gate R .................................................................. 93
Table 11: Fussell-Vesely Importance Values for Basic Events of the Model ........................................ 93
Table 12: Error Factors For the model in Figure 5 ............................................................................... 94
Table 13: Comparison of Risk Spectrum and Python Top Event Distributions ................................... 96
Table 14: Basic Event Lognormal Parameters ...................................................................................... 97
Table 15: Cutset Lognormal Parameters ............................................................................................... 97
Table 16: Iman Uncertainty Importance listings for the model ............................................................ 99
Table 17: Uncertainty Importance Using Closed Form ...................................................................... 102
Table 18: Top Ten Cutsets By Fractional Contribution ...................................................................... 105
Table 19: Basic Events in the top Ten Mean Cutset Contributors ...................................................... 106
Table 20: PWR Basic Event Importance to Top Event Mean ............................................................ 108
Table 21: Ranking of Cutsets by Contribution To Variance of Core Damage Frequency ................. 112
Table 22: Discrete Probability Distribution Over the Human Error Probability Estimate ................. 122
Table 23: Summary of Basic Events ................................................................................................... 123
Table 24: Model Failure Parameters Table ......................................................................................... 124
Table 25: Minimum Cutsets for the Reference Case .......................................................................... 125
Table 26: Basic Event Importance and Sensitivity for the Reference Case ........................................ 125
Table 27: Minimum Cutsets for the Sensitivity Case ......................................................................... 126
Table 28: Fractional Contributions of Basic Events for the Uncertainty Case ................................... 126
FIGURES
Figure 1: OpenTurns Overview ............................................................................................................ 28
Figure 2: Uncertainty Domains Affecting Uncertainty In Risk ............................................................ 34
Figure 3: Example Fault Tree ............................................................................................................... 76
Figure 4: Approximated Top Event CDF, found by (i) Monte Carlo Sampling 20,000 points (Blue),
(ii) Wilks’ 95/x estimates from 100 points (Green Diamonds), (iii) Lognormal
Approximation (Red) ................................................................................................. 78
Figure 5: Example Fault Tree ............................................................................................................... 90
Figure 6: PDF of the probability of failure on demand for the System Estimated using RS MC ......... 95
Figure 7: PDF of the probability of failure on demand for System X Example Estimated Via Python
MC .............................................................................................................................. 96
Figure 8: CDF of Approximated Lognormal Distribution Vs Two Monte Carlo Sampling Codes ....... 98
Figure 9: Top Event Probability Uncertainty When Basic Event A is forced to be certain ................ 100
Figure 10: Top Event Probability Uncertainty When Basic Event B is forced to be certain .............. 101
Figure 11: PWR Overview .................................................................................................................. 103
Figure 12: CDF of Frequency of Inadvertent Trip Resulting in Core Damage .................................. 109
Figure 13: PDF of Frequency of Inadvertent Trip Resulting in Core Damage ................................... 109
Figure 14: Approximation of CDF For Frequency of Inadvertent Trip Resulting in Core Damage .. 110
Figure 15: Variance Vs Error Factor For Lognormal Distribution ..................................................... 113
Figure 16: Structural Model Uncertainty of A Fault Tree .................................................................. 116
Figure 17: The Major Xenon Formation and Decay Processes .......................................................... 119
Figure 18: Time Uncertainty Fault Tree Example .............................................................................. 123
APPENDICES
APPENDIX A – CONSERVATISM, UNCERTAINTY AND BEST ESTIMATES
APPENDIX B – WILKS’ METHOD
APPENDIX C – PRODUCT OF LOGNORMAL RANDOM VARIABLES
NOMENCLATURE & ACRONYMS
AGR – Advanced Gas-cooled Reactor
ASEP – Accident Sequence Evaluation Programme
ASTRUM – Automated Statistical Treatment of Uncertainty Method
CCF – Common Cause Failure
CDF – Cumulative Distribution Function
CSAU – Code Scaling Applicability and Uncertainty
DST – Dempster-Shafer Theory
ECCS – Emergency Core Cooling System
FC – Fractional Contribution
FV – Fussell-Vesely
GRS – Gesellschaft für Anlagen- und Reaktorsicherheit
HEART – Human Error Assessment and Reduction Technique
HSE – Health and Safety Executive
IDPSA – Integrated Deterministic-Probabilistic Safety Analysis
LOCA – Loss of Coolant Accident
LWR – Light Water Reactor
MCS – Minimal Cutset
MGL – Multiple Greek Letter
NP – Nondeterministic Polynomial
NPP – Nuclear Power Plant
NRC – Nuclear Regulatory Commission
PCT – Peak Clad Temperature
PDF – Probability Density Function
PRA – Probabilistic Risk Analysis
PRNG – Pseudo-Random Number Generator
PSA – Probabilistic Safety Analysis
PWR – Pressurised Water Reactor
RA – Risk Achievement
RAW – Risk Achievement Worth
RDW – Risk Decrease Worth
RIF – Risk Increase Factor
RR – Risk Reduction
RRW – Risk Reduction Worth
RSS – Reactor Safety Study
SRC – Standardised Regression Coefficient
THERP – Technique for Human Error Rate Prediction
TMI – Three Mile Island
UPM – Unified Partial Method
MATHEMATICAL SYMBOLS & NOTATION
n! ∏_{i=1}^{n} i Factorial of n
∂/∂x Partial derivative with respect to x
iid Independent and identically distributed
ln Natural logarithm
μ Lognormal parameter: mean of the natural logarithm of the variable
σ Lognormal parameter: standard deviation of the natural logarithm of the variable
𝜉 Noise variable
f, g Real valued functions
a, b, k Real valued constants
x, y Real valued variables
Q() Unavailability
R, U, X, Y, Z Random variables
RA(Xi) Risk achievement of Xi
RAW(Xi) Risk achievement worth of Xi
RR(Xi) Risk reduction of Xi
RRW(Xi) Risk reduction worth of Xi
FV(Xi) Fussell-Vesely measure of Xi
B(Xi) Birnbaum measure of Xi
P(X) Probability of X
P(X|θ) Conditional probability of X given θ
E[X] ∫_{−∞}^{∞} x f(x) dx Expectation of X
E[Xᵏ] exp(kμ) exp(k²σ²/2) The kth moment of X (for lognormal X)
E[X|Y] ∫_{−∞}^{∞} x f(x|y) dx Conditional expectation of X given Y
Var[X] ∫_{−∞}^{∞} (x − E[X])² f(x) dx Variance of X
Var(X|Y) ∫_{−∞}^{∞} (x − E_Y[X|Y])² f(x|y) dx Conditional variance of X given Y
E_Y[E[X|Y]] ∫_{−∞}^{∞} f(y) (∫_{−∞}^{∞} x f(x|y) dx) dy Expectation, over Y, of the conditional expectation of X given Y
E_Y[Var(X|Y)] ∫_{−∞}^{∞} f(y) (∫_{−∞}^{∞} (x − E_Y[X|Y])² f(x|y) dx) dy Expectation, over Y, of the conditional variance of X given Y
Var_Y(E[X|Y]) ∫_{−∞}^{∞} f(y) (∫_{−∞}^{∞} x f(x|y) dx − E_Y[E[X|Y]])² dy Variance, over Y, of the conditional expectation of X given Y
Cov(X,Y) Covariance of X and Y
Φ⁻¹(z) Inverse Cumulative Distribution Function of the Standard Normal Distribution
max, min Maximum, minimum
argmax Argument of the maximum
1. INTRODUCTION
Probabilistic Safety Analysis (PSA) is a modelling process used to quantify the risk from hazardous
operations. Internationally it is also referred to as Probabilistic Risk Analysis (PRA). PSA was
pioneered by the space and nuclear industries in order to help characterise the risks involved in
operations which had potentially large adverse consequences; respectively loss of a spacecraft and
crew, or release of radioactive material from a nuclear facility. This thesis considers quantification of
uncertainty in PSA models.
The construction of a PSA model, for some specified operation or facility, can be thought of as
consisting of the following overall steps:
1. Identification of initiating events which could present a challenge to safe operation;
2. Construction of a logical model which identifies plausible sequences of equipment failures
following an initiating event, potentially resulting in adverse consequences;
3. Identification of possible harmful outcomes which could occur as a result of these sequences;
4. Quantitative estimation of parameters of the logical model;
5. Estimation of the probability and/or frequency of specified outcome events (consequences).
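As a minimal sketch of step 5 (the gate structure and probability values below are invented for illustration, and are not taken from any model in this thesis), the top event probability of a simple fault tree can be estimated from its minimal cutsets using the rare-event approximation:

```python
# Rare-event (first-order) approximation for a fault tree top event:
# P(top) is approximately the sum, over minimal cutsets, of the product of
# the basic event probabilities in each cutset. Valid when all basic event
# probabilities are small.

basic_events = {"A": 1e-3, "B": 2e-3, "C": 5e-4}  # hypothetical probabilities

# Minimal cutsets: the top event occurs if every event in some cutset occurs.
minimal_cutsets = [("A",), ("B", "C")]

p_top = 0.0
for cutset in minimal_cutsets:
    product = 1.0
    for event in cutset:
        product *= basic_events[event]
    p_top += product
# p_top is approximately 1.001e-3
```

The rare-event approximation neglects the probability of two cutsets occurring together, which is acceptable when all basic event probabilities are small; fault tree quantification is treated properly in Chapter 3.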
The idea is to consider all potential plant behaviours which can affect safety, or mission success, in
one integrated model. These may include performance of plant systems, dependencies between plant
systems, human interactions with the mechanical components, such as operator actions, maintenance
and repair, and support systems such as electrical power and cooling water.
PSA uses both logical and probabilistic reasoning. Logical reasoning, also referred to as deductive
reasoning, uses only a set of agreed rules and initial assumptions. Hence, logical reasoning can be
thought of as a mechanical process whereby logical results certainly follow from starting assumptions
and correct application of the logical rules. Probabilistic reasoning, on the other hand, does not deal in
certainty. The rules of probabilistic reasoning allow plausible results to be identified rather than
certain ones. For example, on observing rain clouds in the sky probabilistic reasoning could be used to
conclude that it will probably rain later. This is distinct from a logical conclusion, since there are other
possibilities. Fault trees and event trees in PSA use logical reasoning. The failure parameter inputs of
PSA are described probabilistically. A robust basis for probabilistic reasoning is explained in Chapter
2. Chapter 2 also explains how logical reasoning and probabilistic reasoning can usefully be combined
for PSA models. Key to understanding uncertainty in PSA are the two layers of uncertainty which
exist. Briefly, the first layer is the natural description of risk results in terms of probabilities of
adverse events. The second layer is the uncertainty in the risk results, which again are described
probabilistically. It is the second layer of uncertainty with which this thesis is primarily concerned.
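A hedged sketch of the two layers, with invented numbers: the first layer assigns each basic event a probability; the second layer treats those probabilities as uncertain, here lognormally distributed, and propagates that uncertainty to the top event by sampling:

```python
import math
import random

random.seed(0)

# Second-layer uncertainty: each basic event probability is itself a random
# variable, here lognormal with hypothetical log-mean and log-sd parameters.
events = {"A": (math.log(1e-3), 0.5), "B": (math.log(2e-3), 0.7)}

# Propagate by Monte Carlo: sample the inputs and evaluate the model, here a
# two-event OR gate under the rare-event approximation (sum of the inputs).
samples = sorted(
    sum(random.lognormvariate(mu, sigma) for mu, sigma in events.values())
    for _ in range(10_000)
)

median = samples[len(samples) // 2]
p95 = samples[int(0.95 * len(samples))]  # estimate of the 95th percentile
```

The sorted samples describe the uncertainty distribution of the top event probability, which is exactly the second-layer quantity of interest in this thesis.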
1.1 Benefits and Limitations of PSA
A major qualitative benefit of PSA in the context of nuclear power production is the process of
conducting a structured analysis leading to a logical model of ways in which the plant can fail. This
process, in itself, provides an alternative mechanism to review plant safety and raise questions about
its intended operation, allowing cross-checking conclusions reached by teams of analysts from
different disciplines. A second benefit of PSA follows from the quantitative analysis which provides
information about the probability of undesirable sequences occurring. This is used to estimate which
postulated sequences contribute most to the risk from the plant or process. The results of the
quantitative analysis can identify high risk sequences which might be eliminated or at least mitigated
if this is practicable. The overall aim is to determine whether the plant risk is balanced and provide an
input into ensuring that the identified risks are as low as reasonably practical. The purpose of
uncertainty analysis is to give us information about how confident we can be in the predicted risk, and
the largest sources of uncertainty.
The greatest and most persistent criticisms of PSA are that the results are uncertain: the plant and
human reliability data that must be input to the models are themselves uncertain, some initiating
events and their associated failure sequences may go unidentified, and it is uncertain how accident
sequences really evolve. These criticisms are not unreasonable, but one of the reasons for the
development of PSA was to meaningfully manage precisely these issues. Historically, one approach
for allowing for uncertainty has been the use of pessimistic or conservative assumptions and data,
which allows analysts to claim that the risk predictions at least bound the “true” risk. However, the
use of conservatism, especially unequally applied conservatism, distorts the results of the modelling,
reducing the value of the PSA results. In recent years increasing effort has been made to incorporate
uncertainty into risk results, but much remains to be done [1]. Properly addressing and quantifying
uncertainty remains one of the largest challenges to PSA. Improving the uncertainty analysis
would bolster both the validity of the results and the credibility of the modelling method. This thesis
aims to contribute to the uncertainty analysis of PSA, by highlighting deficiencies and proposing
improved techniques.
PSA techniques are applicable to a range of applications, but in this thesis the application domain will
be restricted to nuclear power plants (NPPs). The first systematic probabilistic safety analysis of a
NPP was the Reactor Safety Study (RSS), conducted by Rasmussen et al [2] in 1975, which covered
both PWR and BWR designs. The RSS analysed reactor safety by considering potential accident sequences
that would lead to core melt due to plausible initiating events. The study used fault trees and event
trees to logically represent the possible paths that accident sequences could follow. Quantitative
estimates of the probability of events were made through a combination of observed data and
engineering judgement, allowing the overall frequency of accident sequences, involving numerous
events, to be estimated. Combining all postulated sequences provided an overall estimation of the risk
of NPP operation. The report significantly altered the prevailing opinion of the risk from commercial
NPP operation. Broadly, RSS estimated that the frequency of a severe accident involving core melt
was higher than had been previously thought, but that the likely consequences were less severe.
Furthermore there were significant risk contributors which were not due to the design basis “worst
case” type accidents, such as large break loss of coolant accidents (LOCAs) that had been previously
the focus of plant design efforts. Combinations of relatively minor plant failures were estimated to be
risk significant. For example, the study highlighted the potential following certain sequences of events
for a coincident stuck open pressure relief valve on the primary circuit to cause core melt, identifying
an event sequence that was very close to that which later unfolded in the accident at Three Mile Island
(TMI). Lessons learnt from TMI helped establish PRA as a useful technique, and redirected some of
the, probably excessive, attention that had previously been given to worst case analyses, such as Large
Break LOCA. There were numerous criticisms of the RSS, but the most persistent is that there were
too many uncertainties in the modelling process, and that the overall risk figures were too speculative
to be relied upon. The Lewis Committee [3] conducted an in depth review of the RSS and reported
that “we are unable to determine whether the absolute probabilities of accident sequences in RSS are
high or low, but we believe that the error bounds on those estimates are, in general, greatly
understated”. Since RSS was published, progress has been made in incorporating uncertainty into
PSA, however, the criticism that PSA models contain many approximations and uncertainties still
remains valid.
The criticism that the uncertainties were underestimated in RSS is an interesting one, since throughout
the modelling process conservative judgements were consistently preferred. This raises the question
of why there is a sense of unease about the uncertainty if all judgements are made in a conservative
manner. A missing, or partially complete, uncertainty analysis fails to provide an indication of the
confidence level in the results, and this may contribute to the unease. This type of confidence
indication is an important component of scientific work. An underlying concern is that an absence of
uncertainty information may indicate that the uncertainties are not properly understood.
Conservative estimates distort PSA results, making them less useful [27]. Reference 4 describes some
of the misleading conclusions that can be drawn from conservative models using the example of the
risk impact of maintenance. Replacing conservative analysis with uncertainty analysis avoids these
distortions. Despite an overall tendency towards conservative estimates in RSS, the Lewis committee
report found that both conservative and non-conservative probability judgements were made in RSS.
RSS was followed by further generations of PSAs, in which the approach taken to uncertainty
analysis has become progressively more sophisticated. It is now standard, in industry, to perform
uncertainty calculations as part of a PSA, but this still only quantifies a subset of the uncertainties that
affect PSA. A suggested improvement to PSA is integrated deterministic-probabilistic safety analysis
(IDPSA) [5]. IDPSA simulates the plant physics in detail and integrates this with simulated plant
reliability and plant equipment failures. This provides direct analysis of how plant equipment failures
interact with the physical modelling. The numerous uncertainties that need to be considered for a
facility level PSA present a large challenge in terms of the amount of effort that would need to be
expended.
1.2 Thesis Aims & Structure
Risk analysis can be viewed as a structured method of understanding and comparing uncertainties.
Taking this viewpoint then it is clearly important that uncertainty analysis of PSA is as complete and
accurate as possible. To this end, the key questions that are addressed in this thesis are:
• “How do we identify the uncertainties in PSA?”
• “How do we estimate the uncertainty in PSA?”
• “How important is the uncertainty of different contributors?”
• “How can we estimate the uncertainty in model parameters?”
• “What can the uncertainty in model parameters tell us?”
Chapter 2 presents a review of the literature on PSA uncertainty analysis. State of the art uncertainty
analysis is reviewed, and broader uncertainty analysis schemes used outside of PSA are discussed.
Chapter 3 develops a closed form approximate method for finding the uncertainty distribution of PSA
outputs as a function of the input uncertainty parameters. The method developed is much less
computationally demanding than Monte Carlo sampling. This step is important since the ability to
estimate uncertainty distributions is a necessary precursor to any quantitative uncertainty analysis.
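As an illustration of the kind of closed form approximation involved (the parameters below are invented; the method itself is developed in Section 3.5), the Fenton-Wilkinson approach matches the first two moments of a sum of independent lognormal variables to a single lognormal:

```python
import math

def fenton_wilkinson(params):
    """Approximate a sum of independent lognormal variables, given as
    (mu, sigma) pairs for their logarithms, by a single lognormal whose
    first two moments match those of the sum."""
    mean = sum(math.exp(mu + 0.5 * sigma**2) for mu, sigma in params)
    var = sum(math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)
              for mu, sigma in params)
    sigma2 = math.log(1.0 + var / mean**2)  # matched log-variance
    mu_s = math.log(mean) - 0.5 * sigma2    # matched log-mean
    return mu_s, math.sqrt(sigma2)

# Hypothetical cutset parameters
mu_s, sigma_s = fenton_wilkinson([(math.log(1e-3), 0.5), (math.log(2e-3), 0.7)])
```

Because the method needs only the moments of the inputs, it avoids sampling entirely, which is the source of the computational saving over Monte Carlo.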
Chapter 4 develops a practical method of ranking uncertainty sources in terms of their contribution to
overall PSA uncertainty. Breaking down the uncertainty contributors is of value to understand where
additional effort might be focused in reducing uncertainty, for example by collecting more data,
performing more experiments, developing theoretical models further, or running simulations at a finer
scale. Conversely it may be possible to observe that some uncertainties are very unimportant to the
overall uncertainty, and that further reducing the uncertainty would not change top event uncertainty
by much. This latter consideration is analogous to Fleming’s observations [6] about the
“unimportance” of basic events, and may be easier to show than the importance of contributors.
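For the special case of a model that is a sum of independent lognormal terms, a ranking of this kind can be sketched in a few lines (the cutset names and parameters below are invented; Chapter 4 develops the general method):

```python
import math

def lognormal_variance(mu, sigma):
    """Variance of a lognormal variable with log-parameters (mu, sigma)."""
    return math.exp(2 * mu + sigma**2) * (math.exp(sigma**2) - 1)

# Hypothetical cutsets with lognormal uncertainty on each cutset value
cutsets = {
    "CS1": (math.log(1e-3), 0.5),
    "CS2": (math.log(2e-3), 0.7),
    "CS3": (math.log(5e-4), 1.2),
}

# For a sum of independent terms the output variance is the sum of the term
# variances, so each term's fractional contribution can be read off directly.
total = sum(lognormal_variance(mu, s) for mu, s in cutsets.values())
fractions = {name: lognormal_variance(mu, s) / total
             for name, (mu, s) in cutsets.items()}
ranking = sorted(fractions, key=fractions.get, reverse=True)
```

Note that a wide error factor can make a cutset with a small mean (CS3 here) a large variance contributor, which is one reason variance-based ranking differs from ranking by mean contribution.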
To date, in nuclear PSA many sources of uncertainty are not explicitly incorporated into the model:
these can be thought of as “hidden but identifiable uncertainties”. Chapter 5 explains methods of
incorporating hidden uncertainties into PSA models. An example of incorporating a hidden
uncertainty is presented, noting that the main effort involved is understanding how the uncertainty
would affect the model, rather than the mathematical propagation of uncertainty. Finally, guidance is
provided on how to determine which hidden uncertainties to address, by considering which
uncertainty sources are most likely to be important.
2. REVIEW OF PREVIOUS WORK ON UNCERTAINTY IN PSA
This chapter presents a review of literature relevant to the development of uncertainty analysis in
PSA. The topics covered are:
1. Probabilistic reasoning: Probability is required to describe uncertainty quantitatively. PSA has
two levels of uncertainty; one layer is the assignment of point probability values to model
inputs. The second layer is the uncertainty over these probability inputs. The focus of this
thesis is on the second layer of uncertainty. There are two main schools of thought on
probability, “Frequentist” and “Bayesian”, which are reviewed and explained.
2. Risk: The mathematical description of risk is introduced and reviewed, to make clear the
basic terminology used in the rest of the thesis.
3. Types of uncertainty. Uncertainty is typically interpreted as being of one of two different
types in the context of PSA; aleatory or epistemic.
4. Sources of Uncertainty in PSA Models. There are numerous different sources of uncertainty
in PSA models. Note sources of uncertainty are distinct from the types of uncertainty
introduced above (aleatory and epistemic); each source of uncertainty may have components
of both aleatory and epistemic uncertainty.
5. PSA importance measures. One of the ways in which PSA models are interpreted is via
importance measures. “First layer uncertainty” PSA importance measures are reviewed first,
followed by a review of the importance measures for the second layer of uncertainty. In order
to explain uncertainty importance measures, methods of quantifying uncertainty are reviewed.
2.1 Probability Formulations
Abstract mathematical models can be made precise, and the structure of results in these abstract
formulations follows purely from the application of logic. However, real world problems typically
contain uncertainty, and so pure logic fails to provide a full picture. Probability provides a bridge
between idealised abstract formulations and real world problems. Probability theory, in its modern
form, is described in numerous sources; the primary references used here are References 7, 8 and 9.
There are many interpretations of probability; Cox [10] noted just how many different viewpoints on
probability are possible:
“If minor differences are counted, the number of schools seems to be somewhere between
two and the number of authors, and probably nearer the latter number”
The viewpoint can affect how analysis is carried out, and hence have an effect on the output of an
analysis. It is important to have a good understanding of the different formulations of probability in
order to interpret the probability component of risk. Possible frameworks include Bayesian,
Frequentist statistics, fuzzy logic and Dempster-Shafer theory (DST). DST is best described as a
possibilistic framework, rather than a probabilistic one. Fuzzy logic generalises classical set
membership to allow graded degrees of membership. Cox [10] summarised that there were two main
schools of thought:
“The concept of probability has from the beginning of the theory involved two ideas:
the idea of frequency in an ensemble and the idea of reasonable expectation.”
Due to the number of different interpretations, only Bayesian and Frequentist (sometimes referred to
as classical) viewpoints are considered here since these are, in the opinion of the author, the best
suited to handling problems involving uncertainty in PSA.
The difference between the two can be reduced to the interpretation of probability; the Frequentist
views probability as the limiting frequency of an infinite number of repeated random experiments.
The Bayesian views probability as a belief assignment. The Frequentist view requires the notion of
infinite sets (infinite repetitions of “identical” trials) in order to define probability as a frequency
attained in the limit, while Bayesians assign degrees of belief (probability) without reference to
infinite repeatability. This gives rise to a different perspective on unknown quantities. For example,
consider a parameter which indicates the probability of getting a head when tossing a coin. The
Frequentist views this parameter as a fixed (but unknown) value, while the Bayesian views different
values as having different probabilities (expressed by the belief distribution). This naturally leads to
interest in different types of question; the Frequentist tends to consider confidence intervals which
express the confidence that the “true” parameter value lies in an interval. The Bayesian considers
probability (belief) distributions which assign probabilities that the parameter is equal to different
values. For the Frequentist, it does not make sense to say that there is a probability that the parameter
is equal to some value, since the parameter has a true value and the probability that the parameter
takes that value is equal to one.
Different authors have different preferred styles, and it is apparent that any problem can be usefully
described and analysed in any style. The present author favours a Bayesian interpretation, which
appears the most natural and straightforward approach.
2.1.1 Frequentist
The Frequentist viewpoint is that probability can be understood as the frequency of some event
happening in a large number of trials, with the number of trials tending to infinity. In the example of a
coin toss, the probability that a heads occurs is defined by the limiting frequency of heads in an
infinite number of coin tosses. A prior probability on the probability of a heads occurring hence
appears to be an ill-defined concept to a Frequentist. The denial of prior probabilities restricts the
range of problems that Frequentist arguments can be used on. The requirement that we think in terms
of independent repetitions of a random experiment also limits the type of model we can use to simpler
models than can be considered in the Bayesian case. The benefits of a Frequentist approach lie mainly
in purity. If the conditions required can be achieved, then Frequentist analysis is precise, and avoids
the need for subjective probability beliefs.
The main advantages claimed by a Frequentist viewpoint are:
• Objectivity: there is no need to invoke priors or belief functions
• Ease of application: statistical tests can be applied straightforwardly
However, to a Bayesian, both of these claims are misleading. Objectivity is never really possible since
model choices must be made in order to perform statistical analysis. Assumptions are required by a
Frequentist in order to set up the problem for analysis, in particular the assumption that trials can be
repeated infinitely often. Ease of application, while bringing clear benefits, also leaves methods open
to misapplication. For example, the statistical standard in many scientific areas of research is to
achieve statistical significance at the 5% level, meaning that the probability of a type I error (the
probability that the null hypothesis is incorrectly rejected) is less than 5%. The aim to achieve
statistical significance at the 5% level has led to several pitfalls in the approach to analyses:
• Not defining the number of samples to use beforehand, and continuing to sample until 5%
significance is achieved (this inflates the Type I error probability well beyond the nominal 5%,
since continued sampling will eventually cross the significance threshold by chance alone).
• Studies with vastly different sample sizes may appear to be “equivalent”; for example in
reaching statistical significance for Type I error, the value of a test with 100 samples is the
same as for a test with 1,000 samples. This can distort the impact of weak studies.
• Excessive confidence in results: for example, using 5% as a gold standard ensures that, even if all
the assumptions going into each analysis are correct, around 1 in 20 tests of a true null hypothesis
will wrongly reject it. However, each study will tend to be interpreted with confidence since it has
been judged to be statistically significant.
• Studies are set up backwards: Typically a researcher will study whether or not some process
has an effect since they believe that the process will have an effect. However, the standard
null hypothesis is that there is no effect; i.e. usually the opposite to what the researcher
believes. For example, consider a potential new drug. The null hypothesis when trialling the
drug would be that the drug has no effect; i.e. is no better than a placebo. However, this is
obviously not what the researcher believes. If he believed that, he would not be performing
expensive trials of the drug in the first place.
Note that there are more sophisticated Frequentist style analyses that can complement Type I error
tests. However, due to the apparent ease of applying a Type I significance test, many studies do not
include more in depth analysis.
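The optional-stopping pitfall in the first bullet above can be illustrated by simulation. The following sketch (the present author's illustration, using a normal approximation to the binomial test; the function name and parameters are invented for this example) repeatedly tests a fair coin after every toss and records whether significance is ever, wrongly, reached:

```python
import random
import math

def peeking_rejects(n_max: int, alpha: float = 0.05, seed: int = 0) -> bool:
    """Simulate fair-coin tosses under the null hypothesis (p = 0.5),
    testing for significance after every toss ("peeking"). Returns True
    if the null is ever rejected before n_max tosses."""
    rng = random.Random(seed)
    heads = 0
    z_crit = 1.96  # two-sided 5% critical value, normal approximation
    for n in range(1, n_max + 1):
        heads += rng.random() < 0.5
        if n >= 30:  # normal approximation needs a reasonable sample size
            z = (heads - 0.5 * n) / math.sqrt(0.25 * n)
            if abs(z) > z_crit:
                return True
    return False

# Fraction of experiments that (wrongly) reach significance at some point:
trials = 500
false_positives = sum(peeking_rejects(1000, seed=s) for s in range(trials))
print(false_positives / trials)  # substantially above the nominal 0.05
```

Although each individual test has a 5% Type I error probability, sampling to significance pushes the realised error rate far higher, which is the distortion described above.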
In a PSA context, Frequentist analysis is best suited to estimating confidence intervals for failure
parameters, rather than assigning probabilities across a range of values. This is more limiting than the Bayesian
approach discussed next.
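As a concrete illustration of such a confidence interval, the following sketch applies the standard chi-square method for a constant failure rate estimated from an observed failure count and exposure time (the function name and the numbers are purely illustrative):

```python
from scipy.stats import chi2

def poisson_rate_ci(k: int, t: float, alpha: float = 0.05):
    """Classical two-sided (1 - alpha) confidence interval for a constant
    failure rate, given k failures observed over an exposure time t.
    Bounds come from chi-square quantiles of the Poisson count."""
    lower = chi2.ppf(alpha / 2, 2 * k) / (2 * t) if k > 0 else 0.0
    upper = chi2.ppf(1 - alpha / 2, 2 * (k + 1)) / (2 * t)
    return lower, upper

# Illustrative: 3 failures observed in 10,000 pump-hours
low, high = poisson_rate_ci(3, 1.0e4)
print(f"95% CI for failure rate: ({low:.2e}, {high:.2e}) per hour")
```

The interval expresses confidence that the "true" fixed rate lies between the bounds; it does not assign probabilities to individual rate values, which is exactly the limitation noted above.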
2.1.2 Bayesian
The Bayesian viewpoint is that probability is about an expression of belief. This allows Bayesians to
consider and assign belief values to events which may only occur once, or which are only
hypothesised to occur without ever being observed. In a Bayesian viewpoint it is entirely reasonable to have an
opinion, expressed probabilistically, about the probability of a coin coming up heads, without ever
having seen that coin, or a coin toss. This is known as a prior belief distribution. Different people may
have different beliefs about the probability of a heads.
Bayes’ theorem is a very simple, but useful, statement about conditional probabilities:
P(θ|X) = P(X|θ)P(θ) / P(X)    (1)
The usual setting consists of a parameter θ of a model and some observed data, X. The term P(X|θ)
is the likelihood of the parameter value given the observed data,
X, P(θ) is the prior probability of the parameter θ. P(X) is the (prior) probability of the data, which is
best interpreted as a normalisation constant. Interpreted in a reliability context, this equation says that
the probability of a reliability parameter θ given some observed data X is equal to the probability of
the data given the parameter value, multiplied by the prior probability of the parameter value, and
divided by the probability of the data. The prior probability is an expression of the belief in values of
θ before the data, X, was observed. The prior distribution is sometimes estimated from related data,
and schemes using this format are referred to as empirical Bayesian methods.
Bayes’ theorem is not unique to Bayesian statistics, although the theorem takes on a much more
central role in Bayesian analyses. In the coin example, Bayes’ theorem provides a mechanism to
modify the prior distribution once some coin tosses have been observed. This results in a posterior
distribution for the probability of obtaining a heads, which makes use of all the available information.
Bayes’ theorem, interpreted using belief functions, allows consideration of a wider class of problem
than a pure Frequentist approach. The combination of a Bayesian viewpoint together with Bayes’
theorem allows consideration of problems involving real world difficulties such as little observed
data, and the presence of nuisance parameters (which can be integrated out). In addition, the absence
of data can be covered by prior distribution assumptions, for example by expert judgment elicitation.
This extension allows the consideration of many interesting real world problems.
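The coin example can be made concrete in the conjugate Beta-Binomial case: with a Beta prior on the probability of heads, equation (1) yields a Beta posterior, so the update reduces to simple arithmetic. (The function below is the present author's illustrative sketch, not taken from a cited source.)

```python
def update_beta(alpha: float, beta: float, heads: int, tails: int):
    """Posterior Beta parameters after observing coin-toss data.
    With a Beta(alpha, beta) prior on P(heads), Bayes' theorem gives
    a Beta(alpha + heads, beta + tails) posterior (conjugacy)."""
    return alpha + heads, beta + tails

# A uniform Beta(1, 1) prior ("no opinion"), then observe 7 heads, 3 tails
a, b = update_beta(1.0, 1.0, heads=7, tails=3)
posterior_mean = a / (a + b)
print(a, b, posterior_mean)  # posterior Beta(8, 4); mean ≈ 0.667
```

The posterior distribution makes use of both the prior belief and the observed tosses; with more data the influence of the prior diminishes.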
There is an extensive literature on Bayesian methods and numerical solutions, for example standard
texts such as References 11 and 12. The Cox axioms from 1946 [10] are often quoted as the
underpinning assumptions of the Bayesian viewpoint. However, in their original form these axioms
were not stated mathematically. The Cox axioms have hence been discussed and made more formal
since their original statement. MacKay’s [11] interpretation of the Cox axioms is as follows:
Call B(x) “the degree of belief in proposition x”, and B(x|y) the degree of belief in the conditional
proposition, ‘x, assuming proposition y to be true’.
Axiom 1: Degrees of belief can be ordered, so transitivity follows. If B(x) is ‘greater’ than B(y), and
B(y) is ‘greater’ than B(z), then B(x) is ‘greater’ than B(z). This has the consequence that beliefs can
be mapped onto real numbers.
Axiom 2: The degree of belief in a proposition x and its negation xᶜ are related by some function f, i.e.
B(x) = f[B(xᶜ)].
Axiom 3: The degree of belief in a conjunction of propositions (x AND y) is related to the degree of
belief in the conditional proposition x|y and the degree of belief in the proposition y, i.e. there is a
function g such that B(x,y) = g[B(x|y), B(y)].
One of the practical difficulties of a Bayesian approach occurs when non-conjugate prior-likelihood
distributions are used. In these cases the posterior cannot be written in closed form, and sampling is
necessary to calculate the non-analytic integrals which occur during the course of analysis. With the
availability of cheap computing power this is no longer a substantial impediment. The theoretical
Bayesian formulations are well supported by practical code implementations, such as the freely
available BUGS software [13, 14] created by MRC Biostatistics group at Cambridge.
2.1.3 Uncertainty Analysis – General Formulation
The general problem of uncertainty analysis can be stated as finding the uncertainty distribution of an
output variable given input parameters, and input uncertainties; this can be described generally as
follows:
y = f(X, U)    (2)
In the above y is a univariate output variable, X is a vector of input parameters, U is a list of
uncertainties, and f is the function for the value of y. A purely aleatory uncertainty for a univariate
function could be expressed as follows:
y = f(x) + ξ    (3)
Where 𝜉 is an aleatory (random noise) variable. The length of the vector U may be different to the
length of X, and typically will be larger. In order to be able to incorporate a wide range of uncertainty
types, including some which may affect the structure of our model, then it is also necessary to allow
the uncertainty to affect the form of the function f. This concept will be returned to in Chapter 5.
In general the list of uncertainties, U, can be considered to include unknown uncertainties. The
inclusion of the unknown uncertainties would then give the “correct” form of the output variable.
However, for cases including unknown uncertainties, the probability distribution of the output of f
cannot be estimated; the formulation is then purely theoretical and cannot be implemented
in practice. This is discussed further in Section 2.4.7.
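For the known uncertainties, the distribution of the output in equation (2) is typically estimated by Monte Carlo sampling. A minimal sketch follows; the model function and input distributions are illustrative assumptions, not taken from any particular PSA:

```python
import random
import statistics

def propagate(f, samplers, n=100_000, seed=1):
    """Monte Carlo uncertainty propagation: draw each input from its
    uncertainty distribution, evaluate f, and collect the output sample."""
    rng = random.Random(seed)
    return [f(*(s(rng) for s in samplers)) for _ in range(n)]

# Illustrative model y = x1 * x2 with lognormal input uncertainties
samplers = [lambda r: r.lognormvariate(-7.0, 0.5),
            lambda r: r.lognormvariate(-6.0, 0.8)]
ys = propagate(lambda x1, x2: x1 * x2, samplers)
print(statistics.median(ys))  # theoretical median is exp(-13) ≈ 2.26e-6
```

The output sample `ys` is an empirical estimate of the uncertainty distribution of y, from which percentiles or density estimates can be derived.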
The formulation of a model and its uncertainty described by McKay [15] is more specific to PSA than
the general terminology used above; McKay uses nonparametric models, explained below.
Nonparametric models (or distribution free methods) make no assumptions about the form of the
distributions used. The term is also sometimes used to describe models in which the model structure is
allowed to be variable. A prediction model for a variable y is defined as a computational model m(.),
taking a vector of inputs x. The uncertainty in the prediction variable, y, is termed the prediction
distribution, given by a probability density function fy. McKay’s description of the overall problem is
expressed by the following system:
x_i ~ f_i(x_i),   x_i ∈ V_i,   i ∈ {1, …, n}    (4)
y = m(x)    (5)
y ~ f_y(y)    (6)
Where n is the total number of input variables, V_i is the domain for input i, and f_i is a probability
density function over variable i. An interpretation of the above for PSA is that m(.) is the evaluation
function of a specified gate in a fault tree with some defined structure.
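To make m(.) concrete, the following sketch evaluates the top event of a small hypothetical fault tree, TOP = (B1 AND B2) OR B3, and propagates lognormal uncertainty on the basic event probabilities by sampling. The tree structure and parameter values are invented for illustration only:

```python
import random
import statistics

def top_event(p1: float, p2: float, p3: float) -> float:
    """m(x): top-event probability for a small illustrative fault tree,
    TOP = (B1 AND B2) OR B3, with independent basic events."""
    p_and = p1 * p2
    return p_and + p3 - p_and * p3  # OR of two independent events

# Propagate lognormal uncertainty on the basic-event probabilities
rng = random.Random(0)
samples = [top_event(rng.lognormvariate(-5, 0.7),
                     rng.lognormvariate(-5, 0.7),
                     rng.lognormvariate(-9, 1.0))
           for _ in range(50_000)]
print(statistics.mean(samples))
```

Here each draw corresponds to one realisation of the input vector x in equation (4), and the collected `samples` approximate the prediction distribution f_y of equation (6).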
Many application specific methods and algorithms have been developed to propagate uncertainty for
mathematical models generally, and more specifically for PSA. OpenTurns [16] provides a good
example of a general framework for uncertainty propagation and analysis, developed for risk and
reliability applications. The software was constructed by a joint academic-industry collaboration with
the aim of unifying disparate application-specific efforts to quantify uncertainty in the performance of
complex systems. The motivation is the observation that uncertainty quantification follows a similar
process for different applications, such as probabilistic representation of inputs, propagation of
uncertainty according to a defined model function, and the interpretation of the results via tools such
as sensitivity and importance measures. The overall OpenTurns framework, taken from the
OpenTurns reference guide [17] is shown below:
FIGURE 1: OPENTURNS OVERVIEW
Where x is a tuple of input variables over which uncertainty is studied, d is a tuple of input variables
which are assumed to be certain (that is uncertainty is considered negligible for d), h specifies the
model function, and y are output variables of interest.
A key part of the process is interpretation of the results via the use of ranking metrics. In PSA this is
typically achieved by importance measures, reviewed in Section 2.5.
Chapter 3 will develop a closed form solution for PSA uncertainty for a parametric model in which all
inputs are restricted to be lognormal. The benefits of the simplicity of this approach will be explained
in Chapter 4. Chapter 5 will consider uncertainty in model structure, which can be considered to be a
nonparametric model.
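A special case indicates why the lognormal restriction is convenient: for an AND gate with independent lognormal basic events, the top event probability is a product of lognormals and is therefore itself exactly lognormal, with log-scale parameters obtained by summing. Sums of lognormals (OR gates) have no such exact form, which is why Chapter 3 develops an approximation. The sketch below (the present author's illustration) checks the AND-gate property by sampling:

```python
import random
import statistics
import math

# For an AND gate, TOP = product of independent lognormal basic events
# X_i ~ Lognormal(mu_i, sigma_i), so ln(TOP) = sum of ln(X_i) is normal
# with mu = sum(mu_i) and sigma^2 = sum(sigma_i^2): TOP is lognormal.
mus, sigmas = [-6.0, -5.0, -4.0], [0.5, 0.7, 0.3]
mu_top = sum(mus)                                # -15.0
sigma_top = math.sqrt(sum(s * s for s in sigmas))

rng = random.Random(2)
logs = [sum(rng.normalvariate(m, s) for m, s in zip(mus, sigmas))
        for _ in range(20_000)]
print(statistics.mean(logs), statistics.stdev(logs))  # ≈ mu_top, sigma_top
```

Because the parameters combine additively, the top event uncertainty can be written down without any sampling at all, which is the efficiency gain discussed in Chapter 4.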
2.2 Risk
Colloquially, "risk" is often used to describe only the consequence component; it is used
synonymously with "adverse consequence", such as loss or injury due to exposure to some hazard.
However, this excludes the probability component of the mathematical description of risk, and
inclusion of this can lead to very different interpretations. For example an activity which can result in
death, colloquially might be described as having a high risk, however, in a mathematical sense the
same activity might be described as low risk if the probability of death is very low. The colloquial
usage of risk is similar to hazards in risk analysis; hazards are possible events which can cause
adverse consequences. The rest of this thesis only considers risk in a technical sense.
Risk consists of two basic components; probability and consequence. In addition to probability and
consequence, Kaplan and Garrick [18] include the “scenario” as a third component of risk in their
definition. They define risk as a triplet consisting of scenario, probability and consequence. The units
of an undesirable consequence are application specific. For example, it could be numbers of injuries
or fatalities, property damage or financial loss. For PSA models, an often reported value (or
distribution if uncertainty is evaluated) is the predicted frequency of exposure of a member of the
public to different radiation doses. In this thesis Kaplan’s definition is used as the founding basis. Of
the triplet defined in Kaplan’s definition, it is seen that the scenario and consequence vary greatly
depending on the problem at hand, and these two components of risk will be considered depending on
the application, rather than attempting to define them in a general context. The probability component
of risk can be described coherently in its own right; as described in the previous section.
2.3 Types of Uncertainty
PSA uncertainty is explained in Section 2.3.1 with respect to the concepts and terminology used in
relevant literature sources; namely aleatory and epistemic uncertainty. An explanation of uncertainty
due to the root causes is given in Section 2.3.2.
2.3.1 Types of Uncertainty
Apostolakis [19, 20] introduced the categories of aleatory and epistemic uncertainty. Aleatory
uncertainty refers to uncertainty due to randomness of events, such as the probability of obtaining a
head from a coin toss. Epistemic uncertainty refers to uncertainty due to inadequate knowledge, for
example uncertainty in the underlying physics of fluid flow. This categorisation of uncertainties is
widely used in the literature today. Another view of aleatory and epistemic uncertainty has been
provided by Gelman [21] who draws the distinction between ignorance and randomness. The example
he gives is to consider the difference between estimating the probability of heads from the toss of a
fair coin, and estimating the probability of the greatest boxer in the world defeating the greatest
wrestler in the world. In the first example, the uncertainty is very well understood, and the uncertainty
is due to randomness; a reasonable prior assignment is P(Heads)=P(Tails)=0.5. In the second we are
(as non-experts) ignorant about the probabilities, and the uncertainty is due to ignorance. The second
type of problem provides a challenge to defining a prior and Gelman notes “has no clearly defined
solution (hence the jumble of “noninformative priors” and “reference priors” in the statistical
literature)”. Numerous authors have taken a sceptical viewpoint of the difference between aleatory
and epistemic uncertainty, and have argued about whether or not a true difference exists. For example
Winkler [22] considers that the notion of types of uncertainty is fundamentally flawed, but also notes
that there can be beneficial practical implications from considering different types of uncertainty.
Both aleatory and epistemic uncertainty are handled using identical mathematical tools; i.e. the tools
of probability and statistics. However, despite the mathematical equivalence it can be observed that by
introducing the distinction, additional consideration about the sources of uncertainty is forced upon
the analyst. If this has the beneficial outcome of evoking clearer descriptions and quantification of
uncertainty sources, then the distinction between aleatory and epistemic uncertainty could be a useful
one. Given the prevalence of these terms in the literature, and the numerous interpretations, it is
important to have a concept of what these phrases mean for PSA. Aleatory and epistemic uncertainties
are therefore each explained further below, in the context of PSA.
2.3.1.1 Aleatory Uncertainty
Aleatory uncertainty, often referred to as statistical uncertainty in PSA, is usually associated with
estimates of failure parameters. The best example of aleatory uncertainty arises when considering
idealised systems in which independent and identically distributed samples are taken at random from
well-defined distributions; for example failure rates and probabilities of failure on demand of pumps.
Within the nuclear industry various alternative methods for failure parameter estimation have been
developed, as discussed for example in References 23, 24, 25 and 26. Of the sources of uncertainty in
PSA models, aleatory uncertainty appears to be the best understood, having had a long history of
study in numerous applications.
2.3.1.2 Epistemic Uncertainty
PSA models are based on numerous supporting analyses that determine whether a given failure
sequence will result in a success or failure condition. The supporting analyses employ various
phenomenological models; for example reactivity calculations, heat transfer and fluid flow models.
However, such models are incomplete due to an incomplete understanding of the physical behaviour
of complex systems. This introduces uncertainties into the results of phenomenological modelling,
which in turn causes uncertainty in the PSA results. This uncertainty is an example of epistemic
uncertainty, which is only rarely represented in modern PSAs.
The supporting analyses for PSA include thermal hydraulic modelling, fluid flow modelling, neutron
flux and material stress calculations. These analyses primarily support claims that are made on the
properties of reactor materials and systems. A key example is the peak temperature of the fuel pin
cladding. If peak clad temperature exceeds a critical level then radioactive materials inside the fuel
pins may escape into the environment.
2.3.1.3 Aleatory, Epistemic or Both?
The frequencies of certain external events are of interest for PSA; for example the frequencies of
hazards such as flooding or earthquakes. Depending on the method that is used to estimate the
frequency of external hazards, the uncertainty in the hazard magnitude may be either viewed as
epistemic or as purely aleatory (statistical). For example consider the return frequency of high winds
exceeding a stated wind speed: if this is estimated by modelling the underlying physics then the
uncertainty is largely epistemic. If on the other hand the return frequency is estimated by using
historical data, for example from observations collected over 100 years, the uncertainty in the return
frequency is being considered as an aleatory uncertainty.
Uncertainties in long term trends can best be thought of as epistemic uncertainties, since future
changes to a pattern depend mainly on physical processes rather than random changes. For example
consider the frequency of extreme high wind; the return frequency is affected by potential global
climate changes, which are primarily subject to epistemic uncertainty; i.e. our lack of knowledge about
how climate change will unfold.
It can be useful to split the uncertainty in a risk estimate into aleatory and epistemic components to
better understand whether the uncertainty can be reduced further. However, mathematically, no
distinction needs to be made between aleatory and epistemic uncertainty in order to proceed with an
uncertainty analysis, once the input uncertainties (aleatory or epistemic) have been defined. Note that
one person may consider an input to have aleatory uncertainty while another may interpret it as
epistemic uncertainty, or indeed have no uncertainty in the result at all (epistemic certainty). For
example consider pseudo-random number generators (PRNG). For the person who wrote the PRNG
there is no uncertainty in the outcome; it follows a known pattern. Meanwhile someone who
presses “go” on the PRNG is, ideally, unable to distinguish the resultant sequence of numbers from a
true random process (such as radioactivity counts). In practice most input elements to a PSA appear to
have both aleatory and epistemic uncertainty.
Finally, note that conversion between uncertainty types is possible; consider that the advance of
science tends to change our view of different uncertainties. What may have appeared
entirely random to our predecessors, such as the appearance of comets, may be well understood at a
later date, and only subject to some residual epistemic uncertainty. Quantum mechanics provides a
good example of movement in the opposite direction; the behaviour of atoms and subatomic particles
in the 1920s would have been viewed as mainly subject to epistemic uncertainty. However, the
quantum interpretation of sub-atomic physics in a fundamentally probabilistic way results in a shift
back to aleatory uncertainty. The choice between aleatory or epistemic uncertainty can hence be seen
as depending on a subjective viewpoint.
One reason that it may be useful to make a distinction between aleatory and epistemic uncertainty lies
in the independence or dependence of different uncertainties. For a set of truly aleatory uncertainties,
the uncertainties are entirely independent, by definition. Dependence between two aleatory
uncertainties implies some generating mechanism, which we may or may not be ignorant about.
However, epistemic uncertainties may be independent or dependent: since the nature of the unknown
knowledge is itself unknown, any dependence structure between variables is possible. Hence, there may be
explanatory causative mechanisms between two apparently unrelated epistemic uncertainties. This can
potentially have a significant effect on the overall predicted uncertainty distributions.
Bier [27] notes that a fundamental difference in state of knowledge uncertainty versus true random
uncertainty is the degree of correlation between different plants and the likely actions required to
correct errors. Improvements in the state of knowledge are likely to result in changes or backfits to
numerous plants, whereas a true random error would be site specific and only affect one or a few
plants. For example, a single poorly performing pump (aleatory uncertainty in pump failure
probability) might be delivered to a single plant, whereas a knowledge error (epistemic uncertainty)
that resulted in the installation of pumps in an environment that the pump cannot operate in (e.g.
humidity) is more likely to affect several plants. This is important information to consider in terms of
the impact of a particular uncertainty source, and the likely mitigating actions necessary.
2.3.2 Uncertainty Domains
Uncertainty estimates form an important part of good engineering and scientific work. However,
uncertainty information is usually difficult to pass between different domains of enquiry. This section
explains the concept of uncertainty domains. If a way can be found to pass uncertainty
information effectively between domains of knowledge, then PSA can be made to reflect fully the
current understanding of uncertainty due to all quantified sources.
As an illustrative PSA example, consider that the thermal conductivity parameters under
specified conditions are uncertain. Uncertainty in the physical parameter causes some uncertainty in
fluid flow predictions. Uncertainty in fluid flow predictions causes uncertainty in coolant flow
requirements. This causes uncertainty in success criteria which causes uncertainty in PSA model
structure. At each stage there are numerous other uncertainties, including physical parameter
uncertainties, geometry uncertainties, discretisation uncertainty, model uncertainty, numerical
calculation uncertainties, and simplifying assumptions, to name but a few. Accounting for all of these
uncertainties in PSA is difficult since the relationship between the uncertainties and PSA parameters
34
(such as failure parameters and model structure) is not clear. Put another way, there is no thermal
conductivity parameter in PSA models; any uncertainty due to this parameter must be represented
indirectly in the PSA by understanding the relationship between thermal conductivity and the
elements which are included in PSA models, such as gates and basic events.
2.3.2.1 Uncertainty Domains in PSA
In nuclear safety the lowest level of uncertainty might reasonably be viewed to be uncertainty in the
basic physical processes. Figure 2 below has been developed, as part of this thesis, to show how
uncertainty in physical phenomena can be traced through to risk models such as PSA.
FIGURE 2: UNCERTAINTY DOMAINS AFFECTING UNCERTAINTY IN RISK
In an ideal world, a decision maker would make decisions based directly on the real natural
phenomena that occur. However, this is not possible since the physical phenomena are incompletely
understood, and even incompletely conceptualised. In the transition from each stage in the figure to
the next, some error is introduced, because each step involves a simplification of the preceding one,
necessary or convenient for the analysis in the next step to be practical. The addition of the error term
is due to uncertainty in the preceding step. Observations of the phenomena may be more or less
accurate, depending for example on measurement instruments used, but in all cases some error will be
introduced. Observations are used to create theoretical models of the basic processes. The theoretical
models also introduce some error as they involve codification of observed data. The theoretical
models will likely be used to investigate “what if” scenarios, for example “what if a large break
LOCA occurred”. This investigation typically requires the use of numerical approximations and
algorithms, which will introduce further uncertainties. The results of the numerical modelling will be
used to define engineering requirements on systems, which in the context of safety analysis, are
typically expressed as (conservative) success criteria. The success criteria of a system are the criteria
that the functions performed by the system must meet in order for the system, as predicted by the
numerical modelling, to remain in a safe state. By defining success criteria for each safety system on a
plant, a logical model of the plant can be constructed to identify the sequences, in which these
functions are not performed, that would result in adverse consequences such as a radiological release.
criteria to a logical risk model (the fault tree structure of a PSA) also introduces uncertainties. Finally
decisions can be informed by the results of the risk model. Although the area of interest in this thesis
is PSA risk uncertainty, in order to estimate uncertainty in the risk model, it is necessary for the
uncertainty in the preceding layers to be understood and propagated through the risk model (PSA).
Only rarely does PSA explicitly incorporate uncertainties arising from lower levels of analysis. This
does not mean that the uncertainties are not considered at each level; only that the uncertainty is not
carried forward and included in PSA uncertainty results. For example, the uncertainty contribution
due to thermal-hydraulic analysis of a transient is not traceable to the uncertainty in core damage
result from a PSA model, in the present state of the art. Note that safety studies which have been
performed with uncertainty included do inform probabilistic safety analysis. In these cases, the
safety analysis (with uncertainty) is generally used to substantiate a claimed set of success criteria. In
other words uncertainty analysis at the low levels of abstraction is used in a conservative manner to
demonstrate that a system can perform its required safety duty. Due to the differing way that analyses
are performed at each level of abstraction, passing on uncertainty information is usually not a simple
task. That is, the task is not easily reducible to a mathematical problem of Monte Carlo simulation,
since the PSA-relevant distributions are generally not known.
2.3.2.2 Uncertainty Quantification in Nuclear Safety Analyses
To provide a better understanding of the sources of uncertainty, methods of uncertainty quantification
used in the nuclear safety analyses are now reviewed. These relate to the mid-range level of Figure 2
above, such as uncertainty due to theoretical models of fluid flow. It is necessary to understand how
these uncertainties arise and how they are quantified in their own domain in order to understand how
they can affect uncertainty in PSA models. One key link is the development of success criteria that
determine the logic of PSA models. Ideally, a PSA analyst would like a probability distribution over
possible success criteria, rather than a justification (with high confidence) that one specified success
criterion is acceptable. Often, existing methods would need to be modified in order to supply this
uncertainty information, since uncertainty is calculated over different parameters at different levels in
Figure 2.
Starting in the 1980s the USNRC undertook a programme of work called the Code Scaling,
Applicability and Uncertainty (CSAU) evaluation methodology. The aim of this work was to provide
support for revised rules on the acceptance of LWR Emergency Core Cooling System (ECCS) based
on best estimate rather than conservative calculations. Central to this shift to best-estimate methods,
was an explicit evaluation of uncertainties for complex phenomena (in place of conservative
estimates). The CSAU programme was initially aimed specifically at the performance of ECCS, and
acceptance criteria of parameters such as peak clad temperature, hydrogen generation rates, coolable
core geometry and long term cooling were not altered. In the first instance the method was designed
primarily for assessing uncertainty in thermohydraulic calculations, but it was intended that the
applicability of the method could be extended to severe accident studies and risk assessments [28]. A
series of six papers [28, 29, 31, 32, 33, 34] collectively described the evaluation methodology. The
first in the series of papers, Reference 28, primarily addresses uncertainties due to scaling of results
from test facilities to apply to full size plants. There are several important considerations to scaling
which are summarised below:
• Code scaling: this applies where a code validated against tests in small scale setups must
then be applied to setups of a different scale. This may affect the formulation of the models used, and also
implementation issues such as nodalisation.
• Best estimate codes use empirical correlations and parameters which are often estimated from
scaled down test facilities. Understanding how these correlations may change for different
scales is important in making correct use of available data. In some instances not all the data
may be consistent; for example results of integral tests may differ from those of single effect
tests. Discrepancies need to be understood in order to be able to use the code in a range of
situations. It is also important to understand the range of values that parameters may take, for
example due to the various plant states and operating conditions which can occur.
• Codes can have compensating errors due to tuning parameters to fit certain datasets, which
then fail to compensate when applied to different data sets. This has an analogue in a machine
learning context which is known as over-fitting. In failure parameter estimation for PSA the
problem can also arise; for example by using the maximum likelihood estimate for a dataset
with zero observed failures. In all these cases, the problem is (lack of) robustness of the
proposed solution with slight variations in the dataset.
• Some parameter estimates need to be made beyond the range supported by experimental data.
These parameters can still be chosen on a best estimate basis, but in these cases it is necessary
to perform sensitivity studies to understand how they may affect safety analyses.
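The fragility of the maximum likelihood estimate mentioned above can be illustrated with a short sketch. The numbers below are invented for illustration, and the Jeffreys prior is one common (but not the only) way to obtain a more robust estimate:

```python
# Sketch (illustrative values): fragility of the maximum likelihood
# estimate (MLE) of a failure-on-demand probability when few or no
# failures have been observed, compared with a Bayesian posterior
# mean under a Jeffreys Beta(0.5, 0.5) prior.

def mle_estimate(failures, demands):
    """MLE of the failure probability: observed failures / demands."""
    return failures / demands

def jeffreys_mean(failures, demands):
    """Posterior mean under a Jeffreys Beta(0.5, 0.5) prior."""
    return (failures + 0.5) / (demands + 1.0)

demands = 100

# With zero observed failures the MLE claims failure is impossible.
print(mle_estimate(0, demands))   # 0.0
print(jeffreys_mean(0, demands))  # ~0.005

# One extra failure moves the MLE from 0 to 0.01 (an infinite relative
# change), while the Bayesian estimate moves only modestly.
print(mle_estimate(1, demands))   # 0.01
print(jeffreys_mean(1, demands))  # ~0.015
```

The point is the robustness property itself: small perturbations of the dataset should not produce qualitatively different reliability estimates.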
The CSAU method used three elements to take account of the factors summarised above. The three
elements are further decomposed into fourteen steps.
1) The first element is determining the requirements and code capabilities. There are six steps in
this determination, described in detail in Reference 29. The steps are: 1) specification of the
scenario, 2) selection of the NPP, 3) identification and ranking of phenomena, 4) selection of
frozen code, 5) development of a complete set of code documentation and 6) determination of code
applicability. These six steps are together used to determine the code applicability for the
intended purpose. A complementary method, Phenomena Identification and Ranking
Technique (PIRT) [30] is used in the step to identify the important phenomena that must be
considered to ensure that the particular case is modelled correctly. PIRT is a widely used and
systematic way of identifying and ranking relevant phenomena based on expert knowledge,
sample simulations and group discussions.
2) The second element in the method is assessment and ranging of model parameters, as
described in Reference 31. There are four main steps to this element; the overall goal is to
assess code results against experimental data, and to understand scaling of results obtained
from experimental rigs to apply to modelled systems. This part of the process is further
discussed in Reference 32, with reference to scaling techniques for loss of coolant accident
test facilities, and in particular for the modelling of the large break LOCA. Reference 32 notes
that uncertainty is not time invariant i.e. that the uncertainty in a given parameter (the
example used is peak clad temperature) increases as time progresses, which should be
expected as simulations move further from the known starting conditions:
“The uncertainty in the PCT (defined here as the difference between the 95th
percentile and the mean), at any point or within any interval in time is a consequence
of the preceding time behavior. It is not constant with time and generally tends to
increase with time. The most significant consequence of this condition is to require
uncertainty evaluations for each major phase of a scenario.”
This is a general feature of almost any numerical-approximation based analysis; uncertainty is
compounded both with time progression and with the number of stages required for the
analysis. This observation is important for PSA, in which the success criteria are not time
variant; i.e. a single set of success conditions (determining the model structure) is assumed
relevant throughout a defined mission time (usually 24 hours).
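The growth of uncertainty with time noted in the quotation can be mimicked with a deliberately simple sketch (a random walk, not a thermohydraulic model): when each time step adds an independent perturbation, the spread of the simulated quantity grows with elapsed time, so an uncertainty band evaluated for one phase does not apply to later phases.

```python
# Toy sketch: spread of a random walk grows with the number of steps,
# mimicking how uncertainty compounds as a simulation moves further
# from known starting conditions. Values are illustrative only.
import random
import statistics

def spread_at_step(n_steps, n_runs=2000, step_sd=1.0, seed=1):
    """Standard deviation across runs of a random walk after n steps."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_runs):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_sd)
        finals.append(x)
    return statistics.stdev(finals)

# The spread grows roughly as sqrt(n): later phases are more uncertain.
for n in (1, 4, 16):
    print(n, round(spread_at_step(n), 2))
```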
3) The third element is quantitative estimation of the uncertainty and performance of sensitivity
studies, as described in part 4 of CSAU [33], and further explained in part 5 [32]. The
uncertainty quantification incorporates the uncertainty from all identified sources, including
coding inaccuracies, scaling errors, parameter uncertainty and initiating scenario uncertainty.
Sensitivity studies can be described as “what if” analyses in which various values of
parameters are hypothesised to check how this would affect code predictions. Sensitivity
studies are particularly useful and important for those parameters for which the bounds of the
uncertainty distribution are themselves highly uncertain e.g. when estimates are needed for
parameters outside of the bounds supported by experimental data. The overall CSAU process
of estimating uncertainty using computational codes is reviewed in Reference [34] against
physically based arguments. This is a valuable final consideration in any uncertainty
quantification that uses complex computational codes, providing a “sense check” on the
predictions made.
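A one-at-a-time parameter sweep is one simple way in which such "what if" sensitivity studies can be organised. The model and parameter ranges below are invented purely to show the mechanics:

```python
# Sketch of a one-at-a-time sensitivity study on a hypothetical model
# (not a CSAU calculation): each parameter is pushed to the ends of an
# assumed range while the others stay at their best-estimate values.

def model(a, b, c):
    # Hypothetical response standing in for a code prediction.
    return a * b + c ** 2

best = {"a": 2.0, "b": 3.0, "c": 1.0}
ranges = {"a": (1.0, 3.0), "b": (2.5, 3.5), "c": (0.0, 4.0)}

sensitivity = {}
for name, (lo, hi) in ranges.items():
    outputs = []
    for value in (lo, hi):
        point = dict(best)
        point[name] = value
        outputs.append(model(**point))
    # Swing: total change in the prediction across the parameter range.
    sensitivity[name] = max(outputs) - min(outputs)

# Ranking shows which input uncertainties matter most to the output.
print(sorted(sensitivity, key=sensitivity.get, reverse=True))
```

In practice, swing rankings of this kind are used to prioritise which parameters warrant detailed uncertainty characterisation.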
The CSAU process is considered highly relevant to the incorporation of uncertainties into PSA since
it is a comprehensive description of an overall process of incorporating uncertainty into calculations,
which remains valid. Most implementations implicitly or explicitly use the framework devised under
CSAU. More recent work (referred to as BEPU, best estimate plus uncertainty) uses alternative but
basically similar methods. One example is the Gesellschaft für Anlagen- und Reaktorsicherheit (GRS)
method for estimation of uncertainty in loss of coolant accidents [35, 36], which follows a similar
pattern to CSAU. Westinghouse has published its Automated Statistical Treatment of Uncertainty
Method (ASTRUM) [37], which has also been applied to loss of coolant accidents [38, 39]. ASTRUM
method based on an accuracy extrapolation method [40], and methodologies have also been
developed in the UK and France. Reviews of the methods are provided in References 41 and 42. In
the opinion of this author, CSAU, despite its age, remains the clearest description of the overall
approach. The most recent implementation to date, together with a detailed comparison of the
uncertainty estimation methods, is presented in Reference 42.
It is noted that, in some stages in these methods, conservative estimates are still used. An example of
how conservative estimates may still appear in a ‘best-estimate’ analysis is in the implementation of
the various computer codes. For example, current generations of thermohydraulic codes such as
RELAP, TRACE, ATHLET and CATHARE, are intended as best estimate codes. However, this is not
necessarily achieved in practice, since elements of the problem setup, including the models used, user
effects, system nodalisation and code options (such as the choked flow model), are often chosen
conservatively. From the
perspective of risk models these estimates can still be used in defining success criteria with
distributions, albeit noting that there remains a conservative bias.
For the purposes of this thesis, the uncertainty methodology of analyses upstream of PSA is important
insofar as it exists. It provides a baseline demonstration that uncertainties are calculated at lower levels
and could, in principle, be incorporated into risk models. However, the uncertainty is often calculated
over parameters that cannot be directly used in PSA models. For example, calculating the uncertainty
in peak clad temperature as an output of thermal hydraulic studies does not, directly, inform a risk
analyst about the uncertainty in how many pumps are required to provide sufficient coolant flow.
2.4 Sources of Uncertainty in PSA Models
2.4.1 Quantification Uncertainty
Some uncertainty is introduced into PSA models due to numerical approximations in the PSA codes used.
Calculation of minimal cutsets in a Boolean model is NP-hard (nondeterministic polynomial time hard)
[43]. As a consequence, the minimal cutset listings obtained from large fault tree
models necessarily use approximations. An overview of Boolean algebra, and of the
approximations required for quantification of logical models, is given in Reference 44. A major
source of error/uncertainty can be the truncation of the cutsets, since it is possible that a sum over
many small cutsets could contribute a non-negligible quantity. Codes such as Wam-Bam have been
used to bound this effect [45]. The possible size of quantification error can be practically investigated
by improving the approximation (for example decreasing threshold cut-off values for minimal cutsets)
and checking the change in the results that occurs. This can be used to provide confidence in the
stability of the results, and insensitivity to analysis settings.
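The effect of cutset truncation can be made concrete with a small sketch using the rare-event approximation (the cutset probabilities below are invented): a few dominant cutsets are kept, and the sum of the discarded cutsets bounds the truncation error of this approximation.

```python
# Sketch: truncating minimal cutsets below a cutoff under-estimates the
# top event probability under the rare-event (sum of cutsets)
# approximation. Cutset probabilities are illustrative only.

# A few dominant cutsets plus many individually negligible ones.
cutsets = [1e-4, 5e-5, 2e-5] + [1e-7] * 500

cutoff = 1e-6
kept = [p for p in cutsets if p >= cutoff]
dropped = [p for p in cutsets if p < cutoff]

full = sum(cutsets)         # ~2.2e-4
truncated = sum(kept)       # ~1.7e-4
error_bound = sum(dropped)  # ~5e-5: the many small cutsets add up

print(full, truncated, error_bound)
```

Halving the cutoff and observing little change in the result is precisely the kind of practical stability check described above.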
Propagation of uncertainty distributions through the model is typically done via Monte Carlo
simulations. This introduces randomness, and hence uncertainty, into the results. The level of
uncertainty in estimated distributions diminishes with increasing sample size. Reliability software
such as Risk Spectrum (Relcon), Open FTA (GNU), CAFTA (EPRI), Fault Tree Plus (Isograph) and
SAPHIRE (NRC) all share the ability to propagate uncertainty through the model to express model
results in terms of distributions. The BUGS manual [14] has practical advice on managing variability
due to simulations in general. Some of this is applicable to PSA software, depending on the particular
software and how much of the mechanics are visible to the user. Practically, the uncertainty due to
sample size can be managed by altering the sample size until there is limited variability from run to
run.
In general, the uncertainty in the PSA results is expressed without explicitly recognising the
quantification uncertainty, since it is usually considered to be small compared to other sources of
uncertainty.
2.4.2 Component Reliability Uncertainty
Component reliability estimates are a basic input to PSA models. Components can be either active or
passive. Active components are those which are required to change state in order to fulfil their
function, for example pumps, valves and switches. Passive components are not required to change
state to fulfil their function; for example pipework and storage tanks. Reliability of active and passive
components is assessed differently, and hence uncertainty in these is considered individually below.
2.4.2.1 Active Component Uncertainty
The reliability of active components is estimated from observed data wherever possible. The methods
for doing this are mature, and covered in numerous sources, for example References 83 and 46. The
approach to estimating the failure parameters is to compare the component in question against
databases of operational experience. The operational experience is sometimes split into Type 1 and
Type 2 categories. Type 1 data consists of primary data sources, such as work order cards, which
describe operational experience and defects or failures encountered. Type 2 data is a summary
collated from Type 1 data, in which the number of failures is compared with the number of demands.
Failure statistics can be used to find uncertainties in component reliabilities that can be input into a
PSA model. Uncertainties in component reliability are generally treated as aleatory; see Section
2.3.1.1 for a description of aleatory uncertainty.
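As an illustration of how Type 2 style data can be converted into an uncertainty distribution rather than a point estimate, the sketch below applies a conjugate Gamma update to a failure rate; the counts, exposure time and choice of Jeffreys prior are assumptions made purely for illustration:

```python
# Sketch: Bayesian estimate of a failure rate from operational
# experience, using a Jeffreys Gamma prior. Counts are invented.
import random

failures = 3            # observed failures
exposure_hours = 5.0e4  # cumulative operating time

# Gamma posterior: shape = failures + 0.5 (Jeffreys), rate = exposure.
shape = failures + 0.5
rate = exposure_hours

posterior_mean = shape / rate  # failures per hour, ~7e-5

# Percentiles by sampling (random.gammavariate takes a SCALE parameter).
random.seed(0)
samples = sorted(random.gammavariate(shape, 1.0 / rate) for _ in range(20000))
p05 = samples[int(0.05 * len(samples))]
p95 = samples[int(0.95 * len(samples))]
print(posterior_mean, p05, p95)
```

The 5th and 95th percentiles, rather than only the mean, are what feed the uncertainty analysis of the PSA model.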
2.4.2.2 Passive Component Uncertainty
In addition to consideration of the uncertainty in failure parameters for active components, uncertainty
in the reliability of passive systems is, sometimes, also considered in PSA models. For example the
structural reliability of the reactor containment building in an over-pressurisation event is an
important input quantity in a PSA model of an NPP in consideration of certain accident sequences.
Reliability of passive systems is of increasing interest, due to the increased reliance on passive safety
features in new NPP designs. Although in PSA studies some passive safety system failures are
discounted, such systems are generally reliant on the continued structural integrity of pipes and tanks,
or on complex phenomena, such as natural circulation, the prediction of which is uncertain: these
uncertainties need to be included in reliability parameters for passive components modelled in a PSA
[47, 48, 49]. Uncertainties in the reliability of passive components must usually be determined using
mathematical models of their failure modes that take account of uncertainties in model input data and
in the models themselves.
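A minimal example of such a mathematical failure model is a stress-strength calculation, in which a passive component fails when an applied load exceeds its capacity. The distributions below are invented for illustration:

```python
# Sketch: stress-strength interference for a passive component. The
# failure probability is estimated by Monte Carlo as the probability
# that a sampled load exceeds a sampled capacity. Values are invented.
import random

def failure_probability(n_samples=100000, seed=0):
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        load = rng.gauss(100.0, 15.0)      # e.g. applied pressure load
        capacity = rng.gauss(160.0, 10.0)  # e.g. structural capacity
        if load > capacity:
            failures += 1
    return failures / n_samples

# A small but non-zero failure probability, driven by the overlap of
# the tails of the two distributions.
print(failure_probability())
```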
2.4.3 Scenario Uncertainty
Scenario uncertainty is the uncertainty that arises due to the difficulty of identifying all the possible
initiating events and event combinations that can occur in an NPP. To make the modelling task
tractable, only some of the possible scenarios can be analysed; this is enabled by the development of
bounding arguments which envelope a number of different scenarios. This could for example be a
bounding leak size. Starting from bounded analysis, one possible way in which additional uncertainty
can be included is by more detailed consideration of the possible scenarios; that is moving to more
fine grained bounding. For example an analysis may assume the reactor to be “at power”. To provide
some consideration of reactor power state uncertainty, this could be adjusted to a low power and a
high power state, appropriately proportioned according to the particular station’s operating history.
This idea can be extended to all parts of the model, for example initiating events, operator actions, and
dose band consequence releases. The issue of scenario uncertainty is linked to the issue of
phenomenological modelling. Phenomenological modelling will only analyse, out of necessity, a
certain subset of possible setups. It is useful to conceptually separate the issue of scenario uncertainty
from phenomenological modelling; here phenomenological modelling uncertainty is considered to
include all modelling aspects such as uncertainties in relevant physical parameters and model
assumptions. Physical parameters for thermal hydraulic analysis include, for example, thermal
conductivity, viscosity and flow rate; model assumptions include, for example, the choice of
turbulence model and of a homogeneous or heterogeneous fluid flow treatment. Scenario uncertainty
relates to the macro level setup of the problem, such as the initiating transient, the initial state of the
plant and environmental conditions. Scenario uncertainty is effectively considered through the use of
fault schedules and bounding fault schedules. By considering the possible plant and hazard faults that
could occur, a fault scenario is modelled to a given level of detail. The fault schedule is typically too
fine grained to use in a meaningful way in a PSA model, resulting in the consideration of bounding
faults. Initial conditions, such as reactor power state and environmental conditions, are typically
considered in a bounding manner.
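The refinement of an "at power" assumption into weighted power states, as described above, amounts to a simple weighted combination. The state fractions and conditional frequencies below are hypothetical:

```python
# Sketch: splitting a bounding "at power" state into low and high power
# states weighted by hypothetical operating history fractions.

# Conditional top event frequencies (per year) for each power state.
freq_low_power = 2.0e-6
freq_high_power = 8.0e-6

# Fraction of operating history spent in each state (must sum to 1).
frac_low = 0.3
frac_high = 0.7
assert abs(frac_low + frac_high - 1.0) < 1e-12

weighted_freq = frac_low * freq_low_power + frac_high * freq_high_power
print(weighted_freq)  # ~6.2e-06, vs 8.0e-06 under a bounding high power assumption
```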
2.4.4 CCF Uncertainty
In complex systems, such as NPPs, there are numerous interactions between different parts of the
system. The net effect is that different parts do not act independently of one another. Statistically, this
means that the failures of any two components cannot, in general, be treated as independent events.
However, given the large number of components at a power station, the number of potential
combinations and dependencies is large. Most PSA models make the assumption that basic entities in
the model are independent of one another, and add dependencies in places in the model where it is
expected to be significant. This is done by considering common cause failures (CCFs).
One of the main areas where CCFs are important is in the assessment of the reliability of redundant
setups. The use of redundancy in design, in which multiple identical components are employed to
perform the same safety function, is a central concept in the design of high reliability systems, as it