
Model validation under epistemic uncertainty


Reliability Engineering and System Safety 96 (2011) 1232–1241


Shankar Sankararaman, Sankaran Mahadevan*

Department of Civil and Environmental Engineering, Vanderbilt University, Nashville, TN 37235, USA

Article info

Available online 7 April 2011

Keywords: Model validation; Interval data; Sparse data; Likelihood; Bayesian statistics; Hypothesis testing

doi:10.1016/j.ress.2010.07.014

* Corresponding author. Tel.: +1 615 322 3040. E-mail address: sankaran.mahadevan@vanderbilt.edu (S. Mahadevan).

Abstract

This paper develops a methodology to assess the validity of computational models when some quantities may be affected by epistemic uncertainty. Three types of epistemic uncertainty regarding input random variables – interval data, sparse point data, and probability distributions with parameter uncertainty – are considered. When the model inputs are described using sparse point data and/or interval data, a likelihood-based methodology is used to represent these variables as probability distributions. Two approaches – a parametric approach and a non-parametric approach – are pursued for this purpose. While the parametric approach leads to a family of distributions due to distribution parameter uncertainty, the principles of conditional probability and total probability can be used to integrate the family of distributions into a single distribution. The non-parametric approach directly yields a single probability distribution. The probabilistic model predictions are compared against experimental observations, which may again be point data or interval data. A generalized likelihood function is constructed for Bayesian updating, and the posterior distribution of the model output is estimated. The Bayes factor metric is extended to assess the validity of the model under both aleatory and epistemic uncertainty and to estimate the confidence in the model prediction. The proposed method is illustrated using a numerical example.


1. Introduction

Computational models are increasingly used to solve practical problems in various engineering disciplines. The quality of model prediction is affected by various sources of uncertainty, such as model form assumptions and solution approximations, natural variability in model inputs and parameters, and data uncertainty due to sparse and imprecise information. When the model is used for system risk assessment, or for certification of reliability and safety under actual use conditions, it is important to quantify the uncertainty and confidence in the model prediction in order to facilitate risk-informed decision making. Quantification of margins and uncertainties (QMU) has been advocated as a systematic procedure to aid decision making under uncertainty. Helton [1] discusses and illustrates the conceptual and computational basis of QMU in analyses that use computational models to predict the behavior of complex systems. Model validation is an important component of a QMU analysis that is intimately connected with the assessment and representation of uncertainty [1]. This paper focuses on model validation under epistemic uncertainty in both the model inputs and the available experimental evidence for validation.


The process of model validation measures the extent of agreement between the model output and experimental observations [2]. A visual comparison, usually referred to as "graphical validation", though valuable, is inadequate in many cases [3,4]. Such an approach is only qualitative and cannot explicitly account for the different sources of uncertainty.

Oberkampf and Barone [5] explain the need for rigorous quantitative validation metrics, which can be perceived as computable measures that compare model predictions and experimental results over a range of input (or control) variables to sharpen the assessment of computational accuracy. An important aspect of model validation is the rigorous, explicit treatment of multiple sources and different types of uncertainty. The various types of uncertainty can be broadly classified into two categories: aleatory and epistemic. In the context of validation, both the model inputs and the experimental evidence are uncertain. There is general consensus among researchers that a rigorous approach to model validation should explicitly account for the various sources of uncertainty – physical variability, information uncertainty, measurement error, etc. – and develop a robust model validation metric that can quantitatively judge the performance of the model and assess the confidence in the model prediction.

Model validation under aleatory uncertainty has been studied by several researchers, and there are methods available in the literature [5–13] to solve this problem. Oberkampf and Barone [5] used statistical confidence intervals to propose validation metrics for interpolation and regression of experimental data. Oberkampf and Trucano [6] discussed benchmarks for model validation metrics and demonstrated the construction of validation metrics based on experimental error. A validation metric developed by Hills and Leslie [7] normalizes the difference between the model prediction and the experimental observations and computes a relative error norm. Urbina et al. [8] developed a validation metric that includes the uncertainty in experimental evidence due to limited data through statistical distributions and hypothesis testing. These validation metrics [7,8] dealt with aleatory uncertainty and included measurement errors and model errors.

In the general context of model validation under aleatory uncertainty, model inputs are assumed to be random variables and hence can be described using unique probability distributions. Experimental evidence, i.e. validation data, is assumed to consist of point data. Rebba and Mahadevan [9] summarized various statistical methodologies for model validation under aleatory uncertainty, where the statistical distributions of model predictions and experimental observations are compared. Zhang and Mahadevan [12] used a metric based on Bayesian hypothesis testing to assess the validity of reliability computation models; reliability analysis models were considered for validation, and reliability data (with respect to system failure) were used for the quantitative validation of a reliability model for a helicopter rotor hub component. This methodology does not confound model updating methods with model validation, and it directly quantifies the extent to which the experimental data support the model. Mahadevan and Rebba [13] used this methodology to include different sources of aleatory uncertainty in model validation. In this approach, the prior probability distribution of the model output is estimated after accounting for various sources of aleatory uncertainty, and experimental observations are used to calculate the posterior distribution, acknowledging the presence of measurement errors. The validity of the model is then assessed using the Bayes factor, which is the ratio of the likelihoods of observing the data under two competing hypotheses. Jiang and Mahadevan [14] showed how the threshold Bayes factor for model acceptance can be derived based on a risk vs. cost trade-off, thereby aiding in robust, meaningful decision making.

Though it is clear that the Bayesian hypothesis testing approach is suitable for the purpose of model validation, this method cannot be used directly when the model inputs and validation data are quantities with epistemic uncertainty. Researchers have used non-probabilistic techniques to deal with interval data. Several methods based on interval analysis [15], evidence theory [16,17], fuzzy sets [18], convex models of uncertainty [19], etc. have been investigated for the treatment of epistemic uncertainty due to interval data. These methods have been used primarily for uncertainty quantification, and it is not clear how to perform model validation assessment under epistemic uncertainty with them. Ferson et al. [20] address the uncertainty (sampling error) arising due to the small number of runs of an expensive simulation model; however, they do not deal with cases where the input quantities and the validation measurements are available in the form of intervals.

This paper extends the Bayesian model validation approach to include epistemic uncertainty arising from sparse or imprecise data with respect to the model inputs and validation evidence. Epistemic uncertainty regarding a variable can be of two types: a poorly known stochastic quantity [21] or a poorly known deterministic quantity [22]. In this paper, we are concerned only with the former type of epistemic uncertainty, where sparse and/or imprecise information (i.e. sparse point data and/or interval data) is available regarding a stochastic quantity; as a result, the distribution type and/or the distribution parameters are uncertain.

In this paper, a likelihood-based approach is used to construct probability distributions for stochastic quantities when information about them is available in the form of interval data and/or sparse point data. The definition of likelihood is extended to interval data from first principles. In the case of point data, the likelihood is derived from the probability density function of the random variable; in the case of interval data, the likelihood is derived from the cumulative distribution function of the random variable. Using this approach, each model input described using single or multiple intervals is represented using a probability distribution before uncertainty propagation analysis. The methods are illustrated using both parametric and non-parametric probability distributions. During the validation assessment, the Bayesian hypothesis testing approach is extended to include interval information in the validation data.

Note that the use of the likelihood approach to construct the probability distribution is not the same as approximating an interval with a distribution; rather, it is a systematic approach to account for the uncertainty in the distribution type and distribution parameters of a stochastic quantity in the presence of sparse and/or imprecise data. Some researchers argue against a probabilistic approach to handling interval data on the grounds that it may add information not contained in the data. However, if the quantity is stochastic to begin with, it is only appropriate to represent it with a probability distribution. Regarding faithfulness to the available information, the proposed method addresses this concern either (1) by quantifying the uncertainty in the parameters of the probability distribution and using a family of distributions in the parametric approach or (2) by constructing non-parametric probability distributions that are more flexible than parametric distributions and more faithful to the available information.

The following sections describe the proposed methodology in detail. Section 2 describes the methodology to construct probability distributions for model inputs that are sources of epistemic uncertainty. Section 3 develops the methodology for Bayesian updating using epistemic validation data. Section 4 derives the validation metric based on Bayesian hypothesis testing, accounting for both aleatory and epistemic uncertainty. Section 5 illustrates the proposed methodology using a steady state heat transfer problem.

2. Probabilistic representation of epistemic model inputs

Consider a computational model, represented by G, whose inputs are denoted by X and whose corresponding output is denoted by Y. Note that X is the vector of all model inputs; let X also denote a particular model input variable. Information on X may be available in different formats:

a. sufficient data to construct a precise probability distribution (aleatory uncertainty);
b. sparse point data and/or interval data (epistemic uncertainty); and
c. probability distribution with uncertain parameters (epistemic uncertainty).

This paper presents an approach that represents information in different formats using probability distributions for the purpose of uncertainty propagation and model validation. Each model input is eventually represented using a single probability distribution, as summarized in Fig. 1. The following subsections discuss the methods to process epistemic uncertainty due to sparse point data and interval data through a probabilistic treatment. Both parametric and non-parametric probability distributions are considered. The parametric method for sparse point data and interval data gives a known, conditional distribution form with uncertain parameters, which is the same situation as the third format above. This leads to a family of probability distributions, and the principle of total probability is used to integrate this family into a single unconditional probability distribution, as explained in the following subsection. As seen in Fig. 1, the non-parametric method directly gives a single probability distribution.

Fig. 1. Probabilistic representation of epistemic model inputs. (Sparse point data/interval data are processed either through the parametric approach, where distribution parameter uncertainty yields a family of distributions that is integrated into a single distribution, or through the non-parametric approach, which yields a single distribution directly; aleatory uncertainty is already a single distribution.)

2.1. Parametric probability distributions

Let a model input X be described using a combination of point data (m data points, x_i, i = 1 to m) and interval data (n intervals, [a_i, b_i], i = 1 to n). The principle of likelihood can be used to construct a probability distribution f_X(x) with this information.

Assume that the distribution type (normal, lognormal, etc.) of X is known, and let P denote the distribution parameters (for example, the mean and standard deviation in the case of a normal distribution). The probability density function (PDF) of X is then denoted by f_X(x|P). Note that this density function is conditioned on the choice of parameters, and hence these parameters need to be estimated using the available evidence. The likelihood of the parameters P is proportional to the probability of observing the data conditioned on P. Assuming that the sources of data are independent, the expression for the likelihood can be derived from first principles, as follows.

First, consider a single data point x_i. Pawitan [23] defines likelihood as the probability of observing data x_i given P. He states, "A slight technical issue arises when dealing with continuous outcomes, since theoretically the probability of any point value x is zero. We can resolve this problem by admitting that in real life, there is only a finite precision: observing x is short for observing X ∈ (x − ε/2, x + ε/2), where ε is the precision limit." If ε is small enough, on observing x_i, the likelihood for P is

$$L(P) = P\left(X \in \left(x_i - \frac{\varepsilon}{2},\ x_i + \frac{\varepsilon}{2}\right) \Big|\, P\right) = \int_{x_i - \varepsilon/2}^{x_i + \varepsilon/2} f_X(x|P)\,dx = \varepsilon\, f_X(x_i|P)\ \text{(by the mean value theorem)} \propto f_X(x_i|P) \tag{1}$$

As ε is an arbitrary, infinitesimally small constant, the expression for the likelihood is meaningful only up to an arbitrary constant [23].

If there are multiple point data (m data points, x_i, i = 1 to m), and if the sources of data are independent, then the overall likelihood can be calculated as

$$L(P) \propto \prod_{i=1}^{m} f_X(x_i|P) \tag{2}$$

The word "independence" in Eq. (2) implies that the sources of data, i.e. the different experiments, or the different experts from which the data originate, are considered to be independent. In other words, the outcome of one experiment (or the prediction of one expert) does not affect the outcome of another experiment (or the prediction of another expert). While this is a frequent assumption in statistical analysis, it is also theoretically possible to construct the likelihood function in the case of dependent (correlated) data sets.

Note that the above derivation of the likelihood considers an infinitesimally small interval around the data point x_i. It is therefore straightforward to extend this definition to any interval [a, b]: the likelihood of the parameters P for a single interval [a, b] can be derived as

$$L(P) \propto \mathrm{Prob}(\text{data}\,|\,P) \propto \mathrm{Prob}(x \in [a,b]\,|\,P) = \mathrm{Prob}(a \le x \le b\,|\,P) \propto \int_{a}^{b} f_X(x|P)\,dx \tag{3}$$

This definition calculates the probability of observing an interval conditioned on the parameters P; it is not equivalent to approximating an interval using a probability distribution. Next, this definition can be extended to multiple interval data, and the expression for the likelihood of the parameters P can be written as

$$L(P) \propto \prod_{i=1}^{n} \int_{a_i}^{b_i} f_X(x|P)\,dx \tag{4}$$

Hence, the overall likelihood (accounting for both point data and interval data) can be written as

$$L(P) \propto \left[\prod_{i=1}^{n} \int_{a_i}^{b_i} f_X(x|P)\,dx\right]\left[\prod_{i=1}^{m} f_X(x_i|P)\right] \tag{5}$$

The parameters P can be estimated by maximizing the likelihood function in Eq. (5); this estimate is popularly known as the maximum likelihood estimate. Further, the uncertainty in the estimate of the parameters can be quantified using Bayes' theorem. Let f_P(p) denote the joint probability density of the parameters P. By choosing a uniform prior distribution for P (over the domain of definition of P), Bayes' theorem (after canceling the constant prior distribution in the numerator and denominator) reduces to

$$f_P(p) = \frac{L(p)}{\int L(p)\,dp} \tag{6}$$

For a given set of parameters P, each model input X can be represented by a probability distribution. If the distribution parameters P are also uncertain, X is represented using a family of distributions, with each member of the family resulting from a particular realization of the distribution parameters [24].
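As a concrete illustration of Eqs. (2)–(5), the following sketch fits an assumed normal distribution to a mixture of sparse point data and interval data by maximizing the likelihood of Eq. (5). The data values, the normality assumption, and the optimizer settings are illustrative placeholders, not values from the paper.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# Illustrative sparse point data and interval data for one model input X.
points = np.array([0.58, 0.52])
intervals = [(0.50, 0.55), (0.45, 0.48)]

def neg_log_likelihood(params):
    """Negative log of Eq. (5), assuming X ~ Normal with P = (mu, sigma)."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf
    # Point-data terms, Eq. (2): product of PDF values.
    ll = norm.logpdf(points, mu, sigma).sum()
    # Interval-data terms, Eq. (4): product of CDF differences.
    for a, b in intervals:
        ll += np.log(norm.cdf(b, mu, sigma) - norm.cdf(a, mu, sigma))
    return -ll

# Maximum likelihood estimate of the distribution parameters P.
res = minimize(neg_log_likelihood, x0=[0.5, 0.05], method="Nelder-Mead")
mu_mle, sigma_mle = res.x
```

The uncertainty in (mu, sigma) can then be quantified by normalizing this likelihood over a parameter grid, as in Eq. (6).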

As explained earlier in Section 1, each quantity X is an input to a computational model, and the distribution of the model output Y needs to be calculated using uncertainty propagation analysis. Given a family of distributions for X, each of these distributions (corresponding to a particular realization of the distribution parameters P) can be used to calculate the probability distribution of the model output Y, thereby resulting in a family of distributions for Y. This requires a double-loop (two-level, nested) Monte Carlo analysis: first, a sample of P, which gives one distribution within the above-mentioned family, is drawn, and then several samples of X are drawn from this distribution, which are used to calculate one probability distribution for the model output Y. Second, the entire procedure is repeated several times for different samples of P to calculate the family of probability distributions for the model output Y. If each level of sampling requires 1000 model evaluations, then nested sampling would require a total of 10^6 model evaluations to construct the family of distributions for the output Y, which might not be affordable. Hence, this paper discards this nested sampling-based approach and uses the principles of conditional probability and total probability to integrate the family of distributions of X as

$$f_X(x) = \int f_X(x|P)\, f_P(P)\,dP \tag{7}$$

Note that the distribution of P (calculated in Eq. (6)) is not parametrically available (i.e. it is numerical and cannot be classified as normal, lognormal, etc.). Hence the integral in Eq. (7) needs to be evaluated numerically. However, this is only an integration issue and does not involve the calculation of any performance function or a computational model (which is usually the more expensive calculation in practical problems of uncertainty propagation). The integration in Eq. (7) is performed only in the "input representation" stage, before the "uncertainty propagation" stage, thereby leading to a single probability distribution for X, which includes the uncertainty in the distribution parameters P. Thus, the two levels of sampling have been "unnested" for the sake of faster computation. This strategy drastically reduces the computational effort in uncertainty propagation with interval data (in general, when input variables have uncertain distribution parameters) and results in a single probability distribution for the output, as seen in Fig. 1.
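A minimal sketch of this numerical integration: the family of conditional PDFs f_X(x|P) is averaged over samples of P, standing in for draws from the numerical parameter density of Eq. (6). All numbers below are hypothetical placeholders.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical samples of P = (mu, sigma), standing in for draws from
# the numerical parameter density f_P(p) of Eq. (6).
mu_samples = np.random.normal(0.52, 0.02, 5000)
sigma_samples = np.abs(np.random.normal(0.04, 0.01, 5000))

x_grid = np.linspace(0.3, 0.8, 500)

# Eq. (7): f_X(x) = E_P[f_X(x|P)], a Monte Carlo average over P samples.
pdf_uncond = np.mean(
    [norm.pdf(x_grid, m, s) for m, s in zip(mu_samples, sigma_samples)],
    axis=0,
)
# 'pdf_uncond' is the single unconditional density; it integrates to ~1.
```

Note that no evaluation of the computational model G is involved here; the averaging is purely an input-representation step.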

The total probability-based integration approach is computationally efficient and requires many fewer evaluations of the computational model in comparison with the double-loop nested approach. The former approach is beneficial for overall decision making, as it integrates the two levels of uncertainty into a single distribution, thereby facilitating easy uncertainty propagation and model validation. The latter approach helps separately assess the contributions of the different types of uncertainty: multiple probability distributions can be calculated, where each probability distribution represents the variability in the quantity of interest, while the extent of the spread of these distributions gives an estimate of the contribution of data uncertainty to the estimates of the distribution parameters P. The integration-based approach does not facilitate such analysis, as it combines both levels of uncertainty and calculates a single probability distribution.

This integration approach can also be used to handle distribution parameter uncertainty. If information on a model input is available in the form of a distribution type, i.e. f_X(x|P), and uncertain distribution parameters, i.e. f_P(p), then Eq. (7) can be used to represent that particular model input using a single probability distribution.

Recall that a particular distribution type, i.e. f_X(x|P), was assumed for the quantity X at the beginning of this subsection. As the evidence is in the form of sparse point data and/or interval data, the distribution form may not be known initially and needs to be assumed. Further, this assumed distribution form is altered once the uncertainty in the estimates of the parameters P is accounted for through the calculation in Eq. (7). It can be verified that different initial distribution type assumptions (e.g. normal or lognormal for the quantity X) lead to different final non-parametric distributions. To overcome this disadvantage, a non-parametric approach is developed in the following subsection.

2.2. Non-parametric probability distributions

Consider a random variable X whose probability density function f_X(x) needs to be constructed. Evidence is available in the form of point data (m data points, x_i, i = 1 to m) and interval data (n intervals, [a_i, b_i], i = 1 to n), and this information is used to construct f_X(x) through the optimization-based procedure developed below.

Discretize the domain of X into a finite number of points, say y_i, i = 1 to Q. Assume that the probability density function values at each of these Q points are given by f_X(x = y_i) = p_i for i = 1 to Q. Using an interpolation technique, the entire probability density function f_X(x) can be calculated over the entire domain of X. Then the probability of observing the given data (point data and interval data), i.e. the likelihood, can be calculated using Eq. (5). This likelihood is a function of the following:

(a) the discretization points selected, i.e. y_i, i = 1 to Q;
(b) the corresponding probability density function values p_i; and
(c) the type of interpolation technique used.

In this paper, the discretization is fixed, i.e. uniformly spaced y_i values (i = 1 to Q) over the domain of X are chosen in advance, and the values of p_i that maximize the likelihood function are calculated. The value of Q (the number of discretization points) is chosen based on the available computational power: the higher the value of Q, the better the results. The optimization problem is formulated as

$$
\begin{aligned}
&\text{Given } y_i \in X\ \forall i,\ i = 1 \text{ to } Q,\\
&\max\ L(p), \text{ where } p = \{p_1, p_2, p_3, \ldots, p_{Q-1}, p_Q\} \text{ and } f_X(x = y_i) = p_i,\\
&\text{subject to:}\\
&\quad (1)\ p_i \ge 0\ \forall i,\\
&\quad (2)\ f_X(x) \ge 0\ \forall x,\\
&\quad (3)\ \int f_X(x)\,dx = 1.
\end{aligned} \tag{8}
$$

Note: the values p_i at y_i (i = 1 to Q) are used to interpolate and calculate f_X(x).

The objective of this optimization problem is to maximize the likelihood function, subject to three constraints: (1) the vector p contains probability density values, which must be non-negative; further, the resultant function f_X(x) must satisfy the properties of a probability density function, i.e. (2) it must be non-negative everywhere and (3) it must integrate to unity.
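The following sketch illustrates the constrained maximum likelihood problem of Eq. (8), substituting simple linear interpolation between the discretization points for the paper's Gaussian process interpolation (discussed next). The data, grid, and optimizer settings are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

points = np.array([0.58, 0.52])                # illustrative point data
intervals = [(0.50, 0.55), (0.45, 0.48)]       # illustrative interval data

y_grid = np.linspace(0.44, 0.60, 10)           # fixed discretization points y_i
x_fine = np.linspace(0.44, 0.60, 400)          # fine grid for integration

def neg_log_likelihood(p):
    if np.any(p < 0):                          # constraint (1): p_i >= 0
        return np.inf
    pdf = np.interp(x_fine, y_grid, p)         # interpolated density f_X(x)
    area = np.trapz(pdf, x_fine)
    if area <= 0:
        return np.inf
    pdf = pdf / area                           # constraint (3): unit area
    ll = np.sum(np.log(np.interp(points, x_fine, pdf)))  # point-data terms
    for a, b in intervals:                     # interval-data terms
        mask = (x_fine >= a) & (x_fine <= b)
        ll += np.log(np.trapz(pdf[mask], x_fine[mask]))
    return -ll

res = minimize(neg_log_likelihood, x0=np.ones(len(y_grid)),
               method="Nelder-Mead", options={"maxiter": 20000})
p_opt = res.x                                  # optimal density values at y_i
```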

Different interpolation techniques – linear interpolation, spline-based interpolation, and Gaussian process interpolation – were tried; Gaussian process interpolation is used in this paper because it does not assume any explicit functional form for the probability density function f_X(x). The details of this interpolation technique can be found in [25–30].

Gaussian process interpolation is based on the idea that when input points are near one another, the correlation between their corresponding outputs will be high. As a result, the uncertainty associated with the model's predictions will be small for input points that are near the points used to train the model and will increase as one moves farther from the training points. It is assumed that the underlying function (the probability density values p, in this case) being modeled can be described by [26]

$$p = h(y)^{T}\beta + Z(y) \tag{9}$$

In Eq. (9), h refers to the trend of the model, β is the vector of trend coefficients, and Z is a stationary Gaussian process with zero mean (and covariance defined below) that describes the departure of the model from its underlying trend. In this paper, the trend is assumed constant, and β is the vector of the means of the responses at the training points (y_i and corresponding p_i). The covariance between the outputs of the Gaussian process Z at points a and b is assumed to be [30]

$$\mathrm{Cov}[Z(a), Z(b)] = \sigma_Z^2\, e^{-\varphi (a-b)^2} \tag{10}$$

In Eq. (10), φ is a scale parameter that indicates the correlation between the training points.


Note that the assumptions made regarding the Gaussian process model are commonly used by various researchers [25–30]. The underlying function is assumed to consist of two parts: a stationary part and a non-stationary part (as seen in Eq. (9)). The non-stationary part includes a trend function; though the trend of the model can be assumed to be any function, several studies have used just a constant value and found it to be generally sufficient [30]. The correlation between points is assumed to decay with distance, and hence a squared exponential covariance function is used in this paper. Any other appropriate covariance function may also be used. The focus of this paper is only to use an interpolation technique to construct the likelihood function and hence the probability density function, not to improve existing Gaussian process modeling techniques. Hence, this paper uses commonly adopted assumptions in constructing the Gaussian process interpolation model.

In this paper, 10 training points (y_i and corresponding p_i, i = 1 to 10) are used to train the Gaussian process. The probability density function f_X(x) is calculated by substituting different prediction points y* into the following equation:

$$p^{*} = h(y^{*})^{T}\beta + r(y^{*})^{T} R^{-1}(p - F\beta) \tag{11}$$

In Eq. (11), r(y*) is a vector containing the covariance between y* and each of the training points y_i. R is an n × n matrix (n = 10 here) containing the correlations between the training points (i.e. the correlation between y_i and y_j for all i, j = 1 to 10). p is the vector of the outputs of the Gaussian process model, i.e. the probability density values at the training points (p_i, i = 1 to 10). The matrix F contains the trend function for each training point. See Bichon et al. [30] for a detailed implementation of this method.
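A compact sketch of the prediction equation (11) with a constant trend and the squared exponential covariance of Eq. (10). The fixed values of φ and σ_Z², the jitter term, and the placeholder training data are assumptions of this sketch, not values from the paper.

```python
import numpy as np

def gp_predict(y_star, y_train, p_train, phi=50.0, sigma2=1.0):
    # Covariance matrix R between training points, Eq. (10).
    diff = y_train[:, None] - y_train[None, :]
    R = sigma2 * np.exp(-phi * diff**2)
    R_inv = np.linalg.inv(R + 1e-10 * np.eye(len(y_train)))  # jitter for stability
    # Constant trend: F is a column of ones; beta is the generalized mean.
    F = np.ones((len(y_train), 1))
    beta = (F.T @ R_inv @ p_train).item() / (F.T @ R_inv @ F).item()
    # Covariance vector r(y*) between the prediction point and training points.
    r = sigma2 * np.exp(-phi * (y_star - y_train)**2)
    # Eq. (11): p* = h(y*)^T beta + r(y*)^T R^{-1} (p - F beta)
    return beta + r @ R_inv @ (p_train - beta)

y_train = np.linspace(0.44, 0.60, 10)     # 10 training points, as in the paper
p_train = np.random.rand(10)              # placeholder density values p_i
p_star = gp_predict(0.50, y_train, p_train)
```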

Once the Gaussian process approximation of f_X(x) is constructed, it is used to calculate the likelihood function, as shown in the optimization formulation in Eq. (8). The likelihood function is then maximized, and the optimal probability density function is estimated.

This methodology is distribution parameter-free, i.e. the probability density function and the probability distribution are not represented using parameters. Hence the maximum likelihood estimate of f_X(x) sufficiently represents all the information contained in the given data. This maximum likelihood estimate of f_X(x) is calculated using Gaussian process interpolation, which is a non-parametric curve, and hence this methodology is referred to as non-parametric. This approach is significantly different from the parametric method discussed in Section 2.1, where well-known probability distributions with explicit parameters were used and the uncertainty in the estimates of the distribution parameters (and hence the unconditional distribution f_X(x)) was calculated using Bayes' theorem. Thus, the non-parametric approach directly gives a single probability distribution for each model input, because it does not involve any estimation of parameters, whereas the parametric approach, as described in Section 2.1, gives a family of distributions, which is then integrated to calculate a single unconditional distribution, as in Eq. (7).

Note: The use of a GP model to represent the PDF is non-parametric not only from a PDF point of view but also from a parameter estimation point of view. The fitting of a GP model indeed involves the estimation of certain quantities, such as the length and scale constants of the Gaussian process, but these are not the parameters of the GP model, because they alone do not control the GP model prediction. In fact, the GP model preserves all the training point information; to make a new prediction, we continue to use all the training point information. We do not simply extract a set of parameters (as in regression models) and stop using the training point information going forward. This is what makes the GP model non-parametric. For this reason, several researchers have used the term "hyperparameters" in referring to the length and scale constants [25], in order to distinguish these quantities from the "parameters" in parametric curves (such as y = a + bx, where a and b are considered parameters), where the parameters alone are sufficient for prediction.

3. Bayesian updating using epistemic validation data

Sections 2.1 and 2.2 presented methods to represent information in the form of interval data and/or sparse point data using probability distributions. Using this approach, every model input can be represented using a single probability distribution. The next step is to propagate the input uncertainty through the computational model G and calculate the probability distribution of the model output Y.

Several methods for probabilistic uncertainty propagation are well established in the literature. Hence, it is advantageous when all the different types of uncertainty are represented using probability distributions, thereby facilitating the use of well-known methods such as Monte Carlo simulation (MCS) and first-order and second-order reliability methods (FORM, SORM) [31] for uncertainty propagation. Let f_Y(y) denote the resultant probability density function of the model prediction Y.
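A minimal Monte Carlo propagation sketch; the model G and the input distributions below are hypothetical stand-ins for the problem at hand.

```python
import numpy as np

def model_G(x1, x2):
    """Hypothetical computational model G with two inputs."""
    return x1**2 + 3.0 * x2

n = 100_000
# Inputs sampled from the single probability distributions obtained in Section 2.
x1 = np.random.normal(5.0, 1.0, n)
x2 = np.random.normal(0.5, 0.05, n)
y = model_G(x1, x2)                 # samples of the model output Y

# f_Y(y) can then be estimated from the samples, e.g. via a histogram or KDE.
pdf_y, edges = np.histogram(y, bins=100, density=True)
```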

The process of model validation quantifies the extent to which experimental data agree with the model prediction. Hence, it is essential to perform experiments and collect data against which the model predictions can be compared. In this case, the model prediction is probabilistic, and hence the experimental evidence is compared with the probability distribution of the model output.

Until now, researchers have developed methods for model validation only for those cases where the experimental evidence is in the form of point data. There are some cases where experiments may result in interval data rather than point data [32,33]. Data collected through temporally spaced inspections may lead to intervals; e.g. if no crack is detected at time t1 and a crack is detected at time t2, then the crack initiation time may be reported as [t1, t2]. Also, errors associated with calibrated instruments may result in observations that are best described using intervals.

An intuitive approach for model validation in this case may be to check whether the model prediction falls within the bounds of the interval; if it does, then it may be concluded that the model does not disagree with the data. However, this conclusion is a qualitative statement of model validation and does not provide a quantitative measure of the extent of agreement between model prediction and validation data. Further, when the evidence is in the form of multiple intervals (some of them overlapping and some non-overlapping), such a comparison may not even be feasible. On the other hand, the Bayesian hypothesis testing technique used by Rebba et al. [11] provides a quantitative measure of model validation. The work by Rebba et al. [11] considered only aleatory uncertainty; this paper extends that approach to include epistemic uncertainty in the form of interval data using the definition of likelihood, thereby facilitating quantitative model validation in the presence of interval data.

Rebba et al. [11] used the Bayes factor validation metric, which is equal to the ratio of the likelihoods of observing the data under two competing hypotheses. In this approach, the distribution of the model output, i.e. f_Y(y), is updated after collecting experimental evidence (D). This section explains the Bayesian updating procedure, whereas the model validation method is explained in the next section.

First, the observed error e is calculated as the difference between model prediction and experimental observation, and the likelihood function of the model output is constructed using a Gaussian random variable that represents this error with zero mean and an estimated standard deviation [11]. Let f_e(e) denote the corresponding probability density function. When the experimental evidence is in the form of point data, the likelihood function is constructed using the probability density function of the above-mentioned Gaussian distribution; in the case of interval data, it is calculated from the corresponding cumulative distribution function. Suppose there are p data points (y_i, i = 1 to p) and q intervals ([c_i, d_i], i = 1 to q) as experimental evidence. To evaluate the likelihood of the model output Y, the probability of observing the given data needs to be calculated conditioned on the model output Y. This probability can be calculated using the error distribution function, as explained below.

In the case of point data, the error e_i can be calculated by subtracting y_i from the model output y. In the case of interval data, the minimum and maximum errors can be calculated for each interval [c_i, d_i] by considering all possible values of y within that interval. This is a simple linear operation and does not require interval analysis; it is sufficient to evaluate the errors at the boundaries of the interval to obtain the minimum and maximum errors. Hence, for each interval, a corresponding "error interval" [a_i, b_i] can be calculated. Using this information, the likelihood of the model output Y can be calculated as

$$L(y) \propto \left[\prod_{i=1}^{q} \int_{a_i}^{b_i} f_e(e|y)\,de\right]\left[\prod_{i=1}^{p} f_e(e_i|y)\right] \tag{12}$$

Once the likelihood of the model output Y is calculated using Eq. (12), it can be used for Bayesian updating. The prior density function of the model output f_Y(y) is multiplied by the likelihood function L(y), and the product is normalized so that the area under the resultant distribution is unity; this gives the posterior density function of the model output in the presence of data D:

$$f_Y(y|D) = \frac{L(y)\, f_Y(y)}{\int L(y)\, f_Y(y)\,dy} \tag{13}$$

The prior density function f_Y(y) and the posterior density function f_Y(y|D) calculated in Eq. (13) can be used to assess the validity of the computational model, as explained in the following section.
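A grid-based sketch of Eqs. (12) and (13), using for illustration the point and interval measurements that appear in the numerical example of Section 5; the prior, grid bounds, and error standard deviation are placeholder assumptions.

```python
import numpy as np
from scipy.stats import norm

obs_points = [18.8, 18.2]            # point measurements y_i
obs_intervals = [(18.9, 19.0)]       # interval measurement [c_i, d_i]
sigma_e = 1.0                        # assumed error standard deviation

def likelihood(y):
    """L(y), Eq. (12): error PDF at point data, error CDF difference on intervals."""
    L = np.ones_like(y)
    for yi in obs_points:
        L *= norm.pdf(yi - y, 0.0, sigma_e)
    for c, d in obs_intervals:
        L *= norm.cdf(d - y, 0.0, sigma_e) - norm.cdf(c - y, 0.0, sigma_e)
    return L

y_grid = np.linspace(5.0, 45.0, 2000)
prior = norm.pdf(y_grid, 20.0, 4.0)  # placeholder for f_Y(y) from propagation
post_unnorm = likelihood(y_grid) * prior
posterior = post_unnorm / np.trapz(post_unnorm, y_grid)   # Eq. (13)
```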

4. Model validation

Rebba et al. [11] proposed a Bayesian hypothesis testing approach for model validation that can account for different sources of aleatory uncertainty in a systematic manner. This paper extends that approach to include epistemic uncertainty. For this purpose, methods were derived in Sections 2 and 3 to represent information regarding epistemic uncertainty using probability distributions. This section implements the model validation metric developed by Rebba et al. [11] to include both aleatory and epistemic uncertainty.

4.1. Bayes factor

Bayesian hypothesis testing compares two models or two hypotheses [34] and calculates the extent to which the observed data D favor each hypothesis. Suppose that there are two hypotheses H0 and H1; the relative probabilities of these hypotheses can be updated using Bayes' theorem as [35]

$$\frac{\mathrm{Prob}(H_0|D)}{\mathrm{Prob}(H_1|D)} = \frac{\mathrm{Prob}(D|H_0\ \text{is true})}{\mathrm{Prob}(D|H_1\ \text{is true})}\,\frac{\mathrm{Prob}(H_0)}{\mathrm{Prob}(H_1)} \tag{14}$$

The first term on the right-hand side of Eq. (14), the ratio of the likelihoods under the two hypotheses, is defined as the Bayes factor (B). If B > 1, then the data D favor the hypothesis H0 more than the hypothesis H1. In the context of model validation, the two hypotheses H0 and H1 may be chosen as "the model is correct" and "the model is incorrect", respectively. Hence, the Bayes factor is a measure of the extent to which the data support the model.

Assume that for a given choice of model inputs, the model prediction is y0; note that the model prediction is a deterministic quantity. The numerator in Eq. (14) is proportional to the likelihood of Y calculated in Eq. (12), evaluated at the model prediction y0 [23]. (Note that this likelihood function is calculated for both point data and intervals of experimental evidence.) Let the constant of proportionality be denoted by k:

$$\mathrm{Prob}(D|H_0\ \text{is true}) = \mathrm{Prob}(D|\text{model is acceptable}) = \mathrm{Prob}(D|y = y_0) = k\, L(y)\big|_{y=y_0} \tag{15}$$

The denominator in Eq. (14) is evaluated using the principle of total probability (integrated over all possible values of y) as

$$\mathrm{Prob}(D|H_1\ \text{is true}) = \mathrm{Prob}(D|\text{model is not acceptable}) = \mathrm{Prob}(D|y \ne y_0) = \int k\, L(y)\, f(y)\,dy \tag{16}$$

assuming that the PDF of Y is the same under both hypotheses H0 and H1 [11].

The Bayes factor can then be evaluated by substituting Eqs. (15) and (16) into Eq. (14):

$$B = \frac{L(y)}{\int L(y)\, f(y)\,dy}\Bigg|_{y=y_0} \tag{17}$$

By combining Eqs. (17) and (13), the Bayes factor can be expressed in terms of f(y) and f(y|D), i.e. the prior and posterior densities of y, respectively:

$$B = \frac{f(y|D)}{f(y)}\Bigg|_{y=y_0} \tag{18}$$

Since the epistemic uncertainty in the model inputs and the validation data has already been converted into probabilistic information (in Sections 2 and 3), the Bayes factor-based model validation approach proposed by Rebba et al. [11] is now also applicable in the presence of epistemic uncertainty. Further, model validation problems with both aleatory and epistemic uncertainty can also be handled easily, as all information is represented using probability distributions.
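Given the gridded prior and posterior from the updating sketch above, Eq. (18) reduces to a pointwise density ratio at the model prediction y0; this helper is a sketch under those assumptions.

```python
import numpy as np

def bayes_factor(y0, y_grid, prior, posterior):
    """Eq. (18): ratio of posterior to prior density at the prediction y0."""
    return np.interp(y0, y_grid, posterior) / np.interp(y0, y_grid, prior)

# B > 1 indicates that the validation data favor the model at y0.
```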

4.2. Confidence in model prediction

If the Bayes factor B is greater than 1, the data favor the hypothesis H0, and hence the model; if the Bayes factor is less than 1, the model has failed the test of validation. According to Jeffreys [36], a Bayes factor such that 1 < B < 3 is "barely worth mentioning", 3 < B < 10 is substantial, 10 < B < 30 is strong, 30 < B < 100 is very strong, and B > 100 is decisive. However, instead of these qualitative categories, the Bayes factor can be related to the posterior probability of "the model being correct", and this gives a direct quantitative estimate of the confidence in the model prediction. The posterior probability P(H0|D) can be evaluated using Bayes' theorem as

$$\text{Model confidence} = P(H_0|D) = \frac{P(D|H_0)\,P(H_0)}{P(D|H_0)\,P(H_0) + P(D|H_1)\,P(H_1)} \tag{19}$$

In Eq. (19), P(H0) and P(H1) denote the prior probabilities of the hypotheses H0 and H1 being true. Prior to model validation, if the analyst knows nothing about the probabilities of these hypotheses, each can be assigned a value of 0.5. Hence, the confidence in the model prediction, P(H0|D), can be calculated as

$$P(H_0|D) = \frac{P(D|H_0)}{P(D|H_0) + P(D|H_1)} = \frac{k\, L(y)\big|_{y=y_0}}{k\, L(y)\big|_{y=y_0} + \int k\, L(y)\, f(y)\,dy} \tag{20}$$

Substituting Eq. (17) into Eq. (20),

$$\text{Model confidence} = \frac{B}{B+1} \tag{21}$$

Thus, the validity of the model can be assessed using the Bayes factor, and the confidence in the model prediction can be easily quantified.

Note that the term "model confidence" is used loosely here, as opposed to the technical term "confidence bounds" in the statistical literature. The confidence in the model is additionally derived from several other sources, such as the analyst's prior experience, judgment, and other qualitative sources of information.

Fig. 2. Distribution of conductivity k (probability density function vs. conductivity of the wire).

5. Numerical example

This section presents a numerical example that illustrates the proposed methodology for model validation. For the purpose of illustration, different types of uncertainty – sparse point data, interval data, and distribution parameter uncertainty – are assumed to exist simultaneously in the model inputs. Further, the experimental observations may be point data or interval data.

5.1. Problem description and data

Consider steady state heat transfer in a thin wire of length L, with thermal conductivity k and convective heat coefficient b. The temperature at the midpoint of the wire needs to be predicted. For the sake of illustration, it is assumed that this problem is essentially one-dimensional and that the solution can be obtained from the following boundary value problem [11]:

$$-k\frac{\partial^2 T}{\partial x^2} + bT = Q(x) \tag{22}$$

$$T(0) = T_0 \tag{23}$$

$$T(L) = T_L \tag{24}$$

Assume that the heat source is Q(x) = 25(2x − L)² [8]. Rebba et al. [11] assumed that the temperatures at the ends of the wire are both zero (T0 = TL = 0). This is an idealized scenario; this paper instead considers uncertainty in the boundary conditions, i.e. the temperatures at the ends of the wire (T0 and TL) are assumed to be normally distributed as N(0, 1). This is an example of aleatory uncertainty (in the boundary conditions).

Suppose that the probability distribution of the conductivity k of the wire is described by an expert as normal but with uncertain distribution parameters. For the sake of illustration, it is assumed that the mean follows a normal distribution N(5, 0.2) and that the standard deviation follows a lognormal distribution with mean 1 and standard deviation 0.1.

Suppose that a distribution is not available for the convective heat coefficient b. Instead, it is described using two intervals, [0.5, 0.55] and [0.45, 0.48], and two additional point data, 0.58 and 0.52.

The length of the wire is assumed to be deterministic, L = 4. Let the purpose of the model be to predict the temperature at the middle of the wire, i.e. at x = 2.0.

Suppose, for given values of the end temperatures of the wire and the model parameters k and b, the numerical model (Eq. (22)) predicted a temperature of 18.51, i.e. Eq. (22) predicted a temperature of T(L/2) = 18.51 for T0 = TL = 0, k = 5, and b = 0.5. A wire made of a material with properties k and b having the same measured values as input to the numerical model is tested three times repeatedly to measure the temperature at location x = 2.0. The measured temperatures are likely to be different in each experiment. For the purpose of illustration, assume that two measurements are point data and one is an interval, i.e. 18.8, 18.2, and [18.9, 19.0]. It is required to assess whether the experimental evidence supports the numerical model in Eq. (22).
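A finite-difference sketch of the boundary value problem in Eqs. (22)–(24). This is a generic solver written for illustration; the paper's own numerical model (which reports T(2.0) = 18.51 for k = 5 and b = 0.5) may differ in its discretization and other details.

```python
import numpy as np

def solve_wire(k, b, T0=0.0, TL=0.0, L=4.0, n=401):
    """Solve -k T'' + b T = Q(x), T(0)=T0, T(L)=TL, with Q(x) = 25(2x - L)^2."""
    x = np.linspace(0.0, L, n)
    h = x[1] - x[0]
    Q = 25.0 * (2.0 * x - L) ** 2
    # Central differences on interior nodes give a tridiagonal system:
    # (-k/h^2) T_{i-1} + (2k/h^2 + b) T_i + (-k/h^2) T_{i+1} = Q_i.
    main = np.full(n - 2, 2.0 * k / h**2 + b)
    off = np.full(n - 3, -k / h**2)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    rhs = Q[1:-1].copy()
    rhs[0] += k / h**2 * T0       # fold boundary values into the right-hand side
    rhs[-1] += k / h**2 * TL
    T = np.concatenate([[T0], np.linalg.solve(A, rhs), [TL]])
    return np.interp(L / 2.0, x, T)   # temperature at the midpoint x = L/2

T_mid = solve_wire(k=5.0, b=0.5)
```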

Various steps involved in the validation procedure areexplained below.

5.2. Probabilistic representation of model inputs

The first step is to represent each uncertain model input using a single probability distribution. The temperatures at the ends of the wire are already available in this form.

The distribution of the conductivity k of the wire is described by an expert as normal but with uncertain distribution parameter values. This means that the distribution is normal, conditioned on the given set of parameter values. As explained in Section 2.1, Eq. (7) is used to calculate the unconditional probability distribution of k, accounting for the uncertainty in the estimates of the parameters. The resultant distribution of the conductivity k is shown in Fig. 2. Note that the integral in Eq. (7) renders the resultant unconditional distribution non-normal and non-parametric; a numerical approach is required to use this distribution for uncertainty propagation.
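A sketch of Eq. (7) specialized to the conductivity k, averaging the conditional normal PDFs over the stated parameter uncertainty. Converting the lognormal's mean and standard deviation (1 and 0.1) to the log-space parameterization is a detail assumed by this sketch.

```python
import numpy as np
from scipy.stats import norm, lognorm

n = 20_000
mu_samples = np.random.normal(5.0, 0.2, n)       # mean of k ~ N(5, 0.2)

# Lognormal standard deviation of k: mean 1, std 0.1, converted to
# scipy's shape/scale parameterization.
cv2 = (0.1 / 1.0) ** 2
s = np.sqrt(np.log(1.0 + cv2))                   # log-space std
scale = 1.0 / np.sqrt(1.0 + cv2)                 # exp(log-space mean)
sigma_samples = lognorm.rvs(s, scale=scale, size=n)

k_grid = np.linspace(0.0, 10.0, 500)
pdf_k = np.mean([norm.pdf(k_grid, m, sd)
                 for m, sd in zip(mu_samples, sigma_samples)], axis=0)
# 'pdf_k' is the single unconditional (non-normal) density shown in Fig. 2.
```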

There is no probability distribution information for the convective heat coefficient b. It is described using two intervals, [0.5, 0.55] and [0.45, 0.48], and two point values, 0.58 and 0.52. The likelihood-based approach explained in Section 2 is used to construct the probability distribution of b. It is assumed that the probability density function is zero in the regions b < 0.45 and b > 0.58, as there is no evidence available there. A non-parametric probability distribution is estimated using the Gaussian process interpolation technique and the optimization procedure in Eq. (8). The resultant distribution is shown in Fig. 3.

It is clear that the non-parametric approach assigns higher probability density values in the regions where there is evidence and lower density values where there is none. The result is a multimodal distribution.

The probability density function in Fig. 3 has been computed based only on the available evidence (two intervals and two point data), which is assumed to have come from expert opinion. In general, expert opinion may not be very precise, and it may be difficult to explicitly quantify the precision of expert opinion. Here, we do not consider this imprecision; we assume that the available data are precise, and the probability distributions (and later, the validation metric) are calculated based only on the available information. Therefore, the probability density function is zero in the region b > 0.58. Also, note that when point data are used, there is a peak at each point data location. This leads to a peak at b = 0.58, which may make the PDF appear unrealistic, but it is faithful to the available data. Any attempt to make the PDF more realistic would imply that the analyst has information beyond the available data.

Fig. 3. Distribution of convective heat coefficient b (probability density function vs. convective heat coefficient).

Fig. 4. Distribution of the model output T(2.0) (prior probability density function vs. temperature at the middle of the wire).

Fig. 5. Likelihood of the model output T(2.0) (likelihood vs. model output).

Now, each uncertain model input has been represented using a single probability distribution. The next step is to calculate the probability distribution of the model output using a method of uncertainty propagation.

5.3. Probability distribution of the model output

The probability distributions of the model inputs were calculated in Section 5.2. These inputs are propagated through the computational model in Eq. (22), and the probability density function of the model output, i.e. the temperature at the mid-section of the wire, T(2.0), is calculated. Any uncertainty propagation technique (MCS, FORM, SORM, etc.) may be used for this purpose; an MCS-based approach is used in this paper. The output distribution is shown in Fig. 4.

The next step is to construct the likelihood function of themodel output using the experimental evidence.

5.4. Construction of the likelihood function

The likelihood of the model output is constructed as the probability of observing the given data conditioned on the model output. Recall that the validation data were obtained through experiments; two measurements are point data and one is an interval, i.e. 18.8, 18.2, and [18.9, 19.0]. A Gaussian distribution is assumed for the experimental error, with zero mean and a known variance. For the purpose of illustration, unit variance is assumed, and the likelihood function is evaluated using Eq. (12).

The likelihood of the model output, calculated using the experimental evidence, is shown in Fig. 5. Note that the numerical values of the calculated likelihood are not indicated on the vertical axis; the likelihood is expressed using the idea of proportionality, and the exact numerical values are not of interest.

5.5. Bayesian updating and calculation of validation metric

The likelihood function is multiplied with the prior, and the product is normalized to calculate the posterior distribution of the model output. Rebba et al. [11] use a Markov chain Monte Carlo (MCMC) sampling approach to draw samples from the posterior distribution instead of computing the normalizing constant ∫ f(y) L(y) dy. However, this is a simple one-dimensional integration, and hence any quadrature-based numerical technique may be employed instead of the expensive MCMC procedure for Bayesian updating. In particular, adaptive recursive Simpson's quadrature [37] is used in this paper. Consider a one-dimensional integral and its approximation using Simpson's rule:

$$\int_a^b f(x)\,dx \approx \frac{b-a}{6}\left[f(a) + 4f\!\left(\frac{a+b}{2}\right) + f(b)\right] = S(a,b) \tag{25}$$

The adaptive recursive quadrature algorithm subdivides the interval of integration (a, b) into two sub-intervals (a, c) and (c, b), with a < c < b, and Simpson's rule is applied to each sub-interval. The error in the estimate of the integral is calculated by comparing the integral values before and after splitting. The splitting is repeated recursively; the criterion for determining when to stop dividing a particular interval depends on the tolerance level ε. The stopping criterion [37] may be chosen as

$$|S(a,c) + S(c,b) - S(a,b)| < 15\varepsilon \tag{26}$$

This quadrature is used to calculate the normalizing constant in Eq. (13), and the results of the Bayesian updating are shown in Fig. 6.
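A recursive sketch of adaptive Simpson quadrature with the stopping rule of Eq. (26), suitable for the one-dimensional normalizing constant of Eq. (13); the tolerance and integration limits in the usage comment are illustrative.

```python
def simpson(f, a, b):
    """Basic Simpson approximation S(a, b) of Eq. (25)."""
    c = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(c) + f(b))

def adaptive_simpson(f, a, b, eps=1e-8):
    """Recursively refine until |S(a,c) + S(c,b) - S(a,b)| < 15 eps, Eq. (26)."""
    c = 0.5 * (a + b)
    whole = simpson(f, a, b)
    left, right = simpson(f, a, c), simpson(f, c, b)
    if abs(left + right - whole) < 15.0 * eps:
        return left + right + (left + right - whole) / 15.0
    return (adaptive_simpson(f, a, c, eps / 2.0)
            + adaptive_simpson(f, c, b, eps / 2.0))

# Example use for the normalizing constant of Eq. (13):
# norm_const = adaptive_simpson(lambda y: prior_pdf(y) * likelihood(y), 8.0, 30.0)
```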

Next, the Bayes factor metric is evaluated at the model prediction T = 18.51 by calculating the ratio between the posterior and the prior densities [11], as in Eq. (18). It is evident from the figure that the Bayes factor is greater than one, and hence the data support the model. The exact value of the Bayes factor is B = 6.7, indicating that the data strongly agree with the model prediction.

[Fig. 6. Bayesian updating of model output: prior and posterior probability density functions of the temperature at the middle of the wire, T(2.0).]

The confidence associated with the model prediction, i.e. the probability of the model being correct given the experimental evidence, is calculated using Eq. (21) and is found to be equal to 87%. If this level of confidence is sufficient for the analyst, then the model can be used for further evaluation and its intended use; otherwise, the model needs to be refined.
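As a consistency check, if Eq. (21) takes the form commonly used with Bayes factors under equal prior model probabilities (an assumption here, since the equation appears earlier in the paper), then

$$P(\text{model correct} \mid \text{data}) = \frac{B}{B+1} = \frac{6.7}{6.7+1} \approx 0.87,$$

which reproduces the reported 87%.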

6. Conclusion

This paper proposed a methodology for model validation when the model inputs and/or validation data may have epistemic uncertainty. The proposed approach provides a unified framework for considering combinations of aleatory and epistemic uncertainty.

Each model input may be described using a completely known probability distribution (aleatory), sparse point data (epistemic), interval data (epistemic), or a parametric probability distribution whose parameters are uncertain (epistemic). A likelihood-based approach is used to represent combinations of sparse point data and interval data using probability distributions. Both parametric and non-parametric approaches are considered. While the parametric approach results in a family of probability distributions, the non-parametric approach results in a single probability distribution. A family of distributions can be integrated into a single unconditional probability distribution using the principle of total probability. This integration is computationally efficient, and it combines the contributions of different types and sources of uncertainty into a single validation metric, which is helpful for decision-making purposes. However, it may not be suitable for sensitivity analysis, where the quantification of the individual contributions of aleatory and epistemic uncertainty to the model output uncertainty is of interest.

The proposed method also handles validation experimental evidence in the form of point data, interval data, or both. The distribution of the model output is updated using Bayes' theorem, and the Bayes factor validation metric is extended to model validation under epistemic uncertainty. The confidence in the model prediction is also calculated using the Bayes factor.

The framework developed in this paper is flexible and can effectively handle multiple sources of epistemic and aleatory uncertainty in the context of model validation.

The scope of this paper is limited to a few types of epistemic uncertainty, namely sparse point data, interval data, and distributions with uncertain parameters. Future work needs to include other types of epistemic uncertainty, such as categorical variables and qualitative data, within the model validation framework.

Acknowledgment

This research was supported by funds from the Sandia National Laboratories through Contract no. BG-7732 (Technical Monitors: Dr. Thomas A. Paez and Dr. Angel Urbina) and NASA Langley Research Center (Cooperative agreement no. NNX08AF56A, Technical Monitor: Dr. Lawrence Green). The support is gratefully acknowledged.

References

[1] Helton J. Conceptual and computational basis for the quantification of margins and uncertainty. Sandia report SAND2009-3055.

[2] American Institute of Aeronautics and Astronautics. Guide for the verification and validation of computational fluid dynamics simulations. AIAA-G-077-1998. Reston, VA; 1998.

[3] Defense Modeling and Simulation Office. Verification, validation, and accreditation (VV&A) recommended practices guide. Office of the Director of Defense Research and Engineering, www.dmso.mil/docslib. Alexandria, VA; April 1996.

[4] Trucano TG. Aspects of ASCI code verification and validation. Sandia technical report no. SAND2000-0390C. Albuquerque, NM: Sandia National Laboratories; 2000.

[5] Oberkampf WL, Barone MF. Measures of agreement between computation and experiment: validation metrics. Journal of Computational Physics 2006;217(1):5–36, doi:10.1016/j.jcp.2006.03.037.

[6] Oberkampf WL, Trucano TG. Verification and validation in computational fluid dynamics. Progress in Aerospace Sciences 2002;38:209–72.

[7] Hills RG, Leslie IH. Statistical validation of engineering and scientific models: validation experiments to application. Report no. SAND2003-0706. Albuquerque, NM: Sandia National Laboratories; 2003.

[8] Urbina A, Paez TL, Hasselman TK, Wathugala GW, Yap K. Assessment of model accuracy relative to stochastic system behavior. In: Proceedings of the 44th AIAA structures, structural dynamics, and materials conference. Norfolk, VA; April 7–10, 2003.

[9] Rebba R, Mahadevan S. Computational methods for model reliability assessment. Reliability Engineering & System Safety 2008;93(8):1197–207.

[10] Rebba R. Model validation and design under uncertainty. PhD dissertation. Vanderbilt University, Nashville, TN, USA; 2005.

[11] Rebba R, Mahadevan S, Huang S. Validation and error estimation of computational models. Reliability Engineering & System Safety 2006;91:1390–7.

[12] Zhang R, Mahadevan S. Bayesian methodology for reliability model acceptance. Reliability Engineering & System Safety 2003;80(1):95–103, doi:10.1016/S0951-8320(02)00269-7.

[13] Mahadevan S, Rebba R. Validation of reliability computational models using Bayes networks. Reliability Engineering & System Safety 2005;86:223–32.

[14] Jiang X, Mahadevan S. Bayesian risk-based decision method for model validation under uncertainty. Reliability Engineering & System Safety 2007;92(6):707–18, doi:10.1016/j.ress.2006.03.006.

[15] Ha-Rok B, Grandhi RV, Canfield RA. An approximation approach for uncertainty quantification using evidence theory. Reliability Engineering & System Safety 2004;86(3):215–25, doi:10.1016/j.ress.2004.01.011.

[16] Shafer G. A mathematical theory of evidence. Princeton University Press; 1976.

[17] Agarwal H, Renaud JE, Preston EL, Padmanabhan D. Uncertainty quantification using evidence theory in multidisciplinary design optimization. Reliability Engineering & System Safety 2004;85(1–3):281–94.

[18] Rao SS, Annamdas KK. An evidence-based fuzzy approach for the safety analysis of uncertain systems. In: Proceedings of the 50th AIAA/ASME/ASCE/AHS/ASC structures, structural dynamics, and materials conference. Paper number AIAA-2009-2263. Palm Springs, California; 2009.

[19] Ben-Haim Y, Elishakoff I. Convex models of uncertainty in applied mechanics. Studies in Applied Mechanics 1990;25.

[20] Ferson S, Oberkampf WL, Ginzburg L. Model validation and predictive capability for the thermal challenge problem. Computer Methods in Applied Mechanics and Engineering 2008;197(29–32):2408–30, doi:10.1016/j.cma.2007.07.030.

[21] Baudrit C, Dubois D. Practical representations of incomplete probabilistic knowledge. Computational Statistics & Data Analysis 2006:86–108, doi:10.1016/j.csda.2006.02.009.

[22] Helton JC, Johnson JD, Oberkampf WL. An exploration of alternative approaches to the representation of uncertainty in model predictions. Reliability Engineering & System Safety 2004:39–71, doi:10.1016/j.ress.2004.03.025.

[23] Pawitan Y. In all likelihood: statistical modeling and inference using likelihood. New York: Oxford Science Publications; 2001.

[24] McDonald M, Mahadevan S. Uncertainty quantification and propagation for multi-disciplinary system analysis. In: Proceedings of the 12th AIAA/ISSMO multi-disciplinary analysis and optimization conference. Victoria, British Columbia, Canada; September 10–12, 2008.


[25] Rasmussen CE, Williams CKI. Gaussian processes for machine learning. MIT Press; 2006.

[26] Cressie N. Statistics for spatial data. New York: Wiley; 1993.

[27] Wackernagel H. Multivariate geostatistics—an introduction with applications. Berlin: Springer; 1995.

[28] Chiles JP, Delfiner P. Geostatistics: modeling spatial uncertainty. Wiley series in probability and statistics; 1999.

[29] McFarland J. Uncertainty analysis for computer simulations through validation and calibration. PhD dissertation. Vanderbilt University; 2008.

[30] Bichon B, Eldred M, Swiler L, Mahadevan S, McFarland J. Efficient global reliability analysis for nonlinear implicit performance functions. AIAA Journal 2008;46(10):2459–68, doi:10.2514/1.34321.

[31] Haldar A, Mahadevan S. Probability, reliability and statistical methods in engineering design. New York: John Wiley & Sons; 2000.

[32] Ferson S, Kreinovich V, Hajagos J, Oberkampf W, Ginzburg L. Experimental uncertainty estimation and statistics for data having interval uncertainty. Sandia National Laboratories technical report SAND2007-0939. Albuquerque, New Mexico; 2007.

[33] Du X, Sudjianto A, Huang B. Reliability based design with mixture of random and interval variables. Journal of Mechanical Design, ASME 2005;127:1068–76.

[34] Berger JO, Pericchi LR. The intrinsic Bayes factor for model selection and prediction. Journal of the American Statistical Association 1996;91:109–22.

[35] Leonard T, Hsu JSJ. Bayesian methods: an analysis for statisticians and interdisciplinary researchers. Cambridge: Cambridge University Press; 1999.

[36] Jeffreys H. The theory of probability. 3rd ed. Oxford; 1961. p. 432.

[37] McKeeman MW. Algorithm 145: adaptive numerical integration by Simpson's rule. Communications of the ACM 1962;5.