9
Application of the control variate technique to estimation of total sensitivity indices S. Kucherenko a,n , B. Delpuech a , B. Iooss b , S. Tarantola c a Imperial College London, UK b Electricité de France, France c Joint Research Centre of the European Commission, Ispra, Italy article info Available online 17 July 2014 Keywords: Global sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract Global sensitivity analysis is widely used in many areas of science, biology, sociology and policy planning. The variance-based methods also known as Sobol' sensitivity indices has become the method of choice among practitioners due to its efciency and ease of interpretation. For complex practical problems, estimation of Sobol' sensitivity indices generally requires a large number of function evaluations to achieve reasonable convergence. To improve the efciency of the Monte Carlo estimates for the Sobol' total sensitivity indices we apply the control variate reduction technique and develop a new formula for evaluation of total sensitivity indices. Presented results using well known test functions show the efciency of the developed technique. & 2014 Elsevier Ltd. All rights reserved. 1. Introduction Global sensitivity analysis (SA) is the study of how the uncer- tainty in model output is apportioned to the uncertainty in model inputs [1,2]. Over the last decade, global SA has gained acceptance among practitioners in the process of model development, cali- bration and validation, reliability and robustness analysis, decision-making under uncertainty, quality-assurance, and com- plexity reduction. There have been many successful improvements in the efciency of estimating the main effect Sobol' indices ranging from the advanced formulas for small sensitivity indices [3,4] to application of RBD [5] and various metamodelling methods [611]. However, there have been no similar advances concerning estimation of the total sensitivity indices and the SobolJansen formula [12,13] remains to be the only formula used in the direct computation of the total sensitivity indices. To improve the efciency of the Monte Carlo (MC) estimates for total sensitivity indices we apply the variance reduction technique and develop a new formula for the evaluation of total sensitivity indices. We also present results using well known test functions to show the efciency of the developed technique. This paper is organised as follows. The next section introduces control variates reduction technique. ANOVA decomposition and Sobol' sensitivity indices are briey presented in Section 3. In Section 4 we describe how to apply the control variate technique to improve the efciency of the SobolJansen formula for evaluation of total sensitivity indices. The use of metamodels is presented in Section 5. This section also provides the details of the random sampling-high dimensional model representation (RS- HDMR) model which is used in this paper. Four different test cases are considered in Section 6. Finally, conclusions are shown in Section 7. 2. Control variate reduction technique Consider the integral of the function f over the n-dimensional unit hypercube H n : I½f ¼ Z f ðxÞdx: This integral can be seen as an expectation of f ðxÞ with respect to an n-dimensional random variable x that is uniformly distrib- uted: I½f ¼ E½f ðxÞ ¼ Z f ðxÞdx: ð1Þ Further we denote μ f ¼ E½f ðxÞ. To calculate this expectation, a sequence of N random or quasi random points x ðiÞ is sampled and then the mathematical expectation (1) is approximated as follows: I N ½f ¼ 1 N N i ¼ 1 f ðx ðiÞ Þ ð2Þ It is important to estimate an integration error ε ¼jI ½f I N ½f j. For the MC method the expectation of ε 2 is equal to σ 2 ðf Þ=N, Contents lists available at ScienceDirect journal homepage: www.elsevier.com/locate/ress Reliability Engineering and System Safety http://dx.doi.org/10.1016/j.ress.2014.07.008 0951-8320/& 2014 Elsevier Ltd. All rights reserved. n Corresponding author. Tel.: þ44 2075946624; fax: þ44 2075941129. E-mail address: [email protected] (S. Kucherenko). Reliability Engineering and System Safety 134 (2015) 251259

Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

Application of the control variate technique to estimationof total sensitivity indices

S. Kucherenko a,n, B. Delpuech a, B. Iooss b, S. Tarantola c

a Imperial College London, UKb Electricité de France, Francec Joint Research Centre of the European Commission, Ispra, Italy

a r t i c l e i n f o

Available online 17 July 2014

Keywords:Global sensitivity analysisSobol' sensitivity indicesControl variate techniqueRandom sampling- high dimensional model

representation

a b s t r a c t

Global sensitivity analysis is widely used in many areas of science, biology, sociology and policy planning.The variance-based methods also known as Sobol' sensitivity indices has become the method of choiceamong practitioners due to its efficiency and ease of interpretation. For complex practical problems,estimation of Sobol' sensitivity indices generally requires a large number of function evaluations toachieve reasonable convergence. To improve the efficiency of the Monte Carlo estimates for the Sobol'total sensitivity indices we apply the control variate reduction technique and develop a new formula forevaluation of total sensitivity indices. Presented results using well known test functions show theefficiency of the developed technique.

& 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Global sensitivity analysis (SA) is the study of how the uncer-tainty in model output is apportioned to the uncertainty in modelinputs [1,2]. Over the last decade, global SA has gained acceptanceamong practitioners in the process of model development, cali-bration and validation, reliability and robustness analysis,decision-making under uncertainty, quality-assurance, and com-plexity reduction. There have been many successful improvementsin the efficiency of estimating the main effect Sobol' indicesranging from the advanced formulas for small sensitivity indices[3,4] to application of RBD [5] and various metamodelling methods[6–11]. However, there have been no similar advances concerningestimation of the total sensitivity indices and the Sobol–Jansenformula [12,13] remains to be the only formula used in the directcomputation of the total sensitivity indices.

To improve the efficiency of the Monte Carlo (MC) estimates fortotal sensitivity indices we apply the variance reduction techniqueand develop a new formula for the evaluation of total sensitivityindices. We also present results using well known test functions toshow the efficiency of the developed technique.

This paper is organised as follows. The next section introducescontrol variates reduction technique. ANOVA decomposition andSobol' sensitivity indices are briefly presented in Section 3. InSection 4 we describe how to apply the control variate techniqueto improve the efficiency of the Sobol–Jansen formula for

evaluation of total sensitivity indices. The use of metamodels ispresented in Section 5. This section also provides the details of therandom sampling-high dimensional model representation (RS-HDMR) model which is used in this paper. Four different testcases are considered in Section 6. Finally, conclusions are shown inSection 7.

2. Control variate reduction technique

Consider the integral of the function f over the n-dimensionalunit hypercube Hn:

I½f � ¼Z

f ðxÞdx:

This integral can be seen as an expectation of f ðxÞ with respectto an n-dimensional random variable x that is uniformly distrib-uted:

I½f � ¼ E½f ðxÞ� ¼Z

f ðxÞdx: ð1Þ

Further we denote μf ¼ E½f ðxÞ�. To calculate this expectation,a sequence of N random or quasi random points xðiÞ is sampledand then the mathematical expectation (1) is approximated asfollows:

IN½f � ¼1N

∑N

i ¼ 1f ðxðiÞÞ ð2Þ

It is important to estimate an integration error ε¼ jI½f �� IN ½f �j.For the MC method the expectation of ε2 is equal to σ2ðf Þ=N,

Contents lists available at ScienceDirect

journal homepage: www.elsevier.com/locate/ress

Reliability Engineering and System Safety

http://dx.doi.org/10.1016/j.ress.2014.07.0080951-8320/& 2014 Elsevier Ltd. All rights reserved.

n Corresponding author. Tel.: þ44 2075946624; fax: þ44 2075941129.E-mail address: [email protected] (S. Kucherenko).

Reliability Engineering and System Safety 134 (2015) 251–259

Page 2: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

where σ2 ¼ Var½f � and the root mean square error εMC ¼ σ=N0:5. Toreduce εMC one can either increase the number of sampled pointsN, or to decrease the variance σ2. The objective of this paper is todevelop a new efficient formula for evaluation of total sensitivityindices based on the reduction of the variance of its MC estimate.There are various variance reduction techniques used to increasethe accuracy of the MC estimates [14]. One of them is the controlvariate method which we further consider.

To improve the MC estimate of μf we define a new functionf ðxÞ:

f ðxÞ ¼ f ðxÞþCðgðxÞ�μgÞ; ð3Þ

where C is a constant coefficient, gðxÞ is a function known as acontrol variate for which an expectation μg ¼

RgðxÞdx is known. It

is easy to see that E½f ðxÞ� is an unbiased estimator for μf for anychoice of the coefficient C: E½f ðxÞ� ¼ E½f ðxÞ� as E½CðgðxÞ�μgÞ� ¼ 0:

Consider the variance of the resulting estimator f :

Var½f � ¼ Var½f �þC2Var½g�þ2C Cov½f ; g� ð4ÞThe main objective is to make sure that Var½f �rVar½f �. Mini-

mum of (4)

minC

Var½f �

is attained at C ¼ Cn:

Cn ¼ �Cov½f ; g�Var½g� :

In this case

Var½f � ¼ Var½f �� Cov2½f ; g�Var½g� :

The root mean square error εMC is then equal to Var½f �=N0:5. Thedifficulty here is to find a good control variate g to build anunbiased estimator f with a reduced variance.

3. Sobol' sensitivity indices

The method of global sensitivity indices is based on thedecomposition of a function into summands of increasing dimen-sionality. Consider an integrable function f ðxÞ defined in the unithypercube Hn. It can be expanded in the following form:

f ðxÞ ¼ f 0þ ∑n

j ¼ 1f jðxjÞþ ∑

0r io jrnf ijðxi; xjÞþ…þ f 1…nðx1;…; xnÞ: ð5Þ

Each of the components f i1…is ðxi1 ;…; xis Þ is a function of aunique subset of variables from x. It can be proven [12,15] thatthe expansion (6) is unique ifZ 1

0f i1…is ðxi1 ;…; xis Þdxik ¼ 0; 1rkrs;

in which case it is called a decomposition into summands ofdifferent dimensions. This decomposition is also known as theANOVA decomposition.

For square integrable functions, the variances of the terms inthe ANOVA decomposition add up to the total variance of thefunction

D¼ ∑n

s ¼ 1∑n

i1 o…o is

Di1 ;…;is :

Here D¼ Var½f � is the total variance, Di1 ;…;is are partial variancesdefined as

Di1 ;…;is ¼Z 1

0f 2i1 ;…;is ðxi1 ;…; xis Þdxi1 ;…; xis

We note that is customary in the area of global sensitivityindices to use D to denote the variance instead of Var½f � which ismore common in statistics.

Sobol' main effect sensitivity indices are defined as the ratios

Si1 ;…;is ¼Di1 ;…;is=D:

Consider two complementary subsets of variables xj and x � j,where x � j is n�1 dimensional vector. The total variance DT

j isdefined as

DTj ¼D�D � j;

where D � j is the sum of all the marginal variances not containingj in their subscripts. The corresponding total sensitivity indices[1,12] are equal to

STj ¼DTj =D:

Obviously, STj ¼ S�S � j, where S � j ¼D � j=D. Knowledge of Sjand STj in most cases provides sufficient information to determinethe sensitivity of the analysed function to individual inputvariables.

Sobol' sensitivity indices can be computed using direct for-mulas [12,13]. Significant progress has been made in improvingefficiencies of computing main effect indices [5]–[3]. In the nextsection we propose a new efficient approach to compute totalsensitivity indices.

4. Application of the control variate technique to estimationof total sensitivity indices

We apply the control variate technique idea presented inSection 2 to the evaluation of the total sensitivity indices STj forthe case of a single variable xj. This approach can be easilygeneralised for the case of a set of variables.

The Sobol–Jansen formula has a form [12,13]:

STj ¼12D

Z½f ðxÞ� f ðx0j; x � jÞ�2dx dx0j; ð6Þ

where x0j is j -th component of the n - dimensional x0 vector whichis sampled independently from x. In this section we omit forbrevity the integral limits.

Application of the control variate technique would requirefinding a new function (3), such that

Fðx; x0jÞ ¼ Fðx; x0jÞþCðGðx; x0jÞ�μGÞ;where Fðx; x0jÞ ¼ ½f ðxÞ� f ðx0j; x � jÞ�2, μG ¼ R

Gðx; x0jÞdx dx0j. It is notpossible to find a control variate Gðx; x0jÞ in a general case. However,a natural choice for the control variate for a function f ðxÞ is tochoose the first order terms of ANOVA

gðxÞ ¼ f 0þ ∑n

i ¼ 1f iðxiÞ: ð7Þ

Similarly, a good control variate for f ðx0j; x � jÞ is

gðx0j; x � jÞ ¼ f jðx0jÞþ ∑n

i ¼ 1;ia jf iðxiÞ: ð8Þ

Theorem. Approximating functions f ðxÞ and f ðx0j; x � jÞ in (6) with

(7) and (8) results in the following formula for STj :

STj ¼12D

Z½f ðxÞ� f jðxjÞ�½f ðx0j; x � jÞ� f jðx0jÞ��2dx dx0jþ Sj: ð9Þ

Proof. Consider function FM ¼ f ðxÞ� f jðxjÞ�½f ðx0j; x � jÞ� f jðx0jÞ�: Itsvariance is

Var½FM � ¼Z

½f ðxÞ� f ðx0j; x � jÞ�½f jðxjÞ� f jðx0jÞ��2dx dxj'

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259252

Page 3: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

¼Z

½f ðxÞ� f ðx0j; x � jÞ�2dxdx0jþZ

½f jðxjÞ� f jðx0jÞ�2dx dx0j

�2Z

½f ðxÞ� f ðx0j; x � jÞ�½f jðxjÞ� f jðx0jÞ�dx dx0j: ð10Þ

The first term in (10) is clearly equal to 2 DTj (formula (6)).Z

½f ðxÞ� f ðx0j; x � jÞ�2dx dx0j ¼ 2 DTj :

The second term isZ½f jðxjÞ� f jðx0jÞ�2dx dx0j ¼ 2Dj

and the third term is

2Z

½f ðxÞ� f ðx0j; x � jÞ�½f jðxjÞ� f jðx0jÞ�dx dx0j ¼ 4Dj:

The final form of the variance (10) is

Var½FM � ¼ Var½f ðxÞ� f ðx0j; x � jÞ��2Dj:

By dividing this expression by the total variance D andrearranging terms we obtain formula (9).

To apply formula (9) the values of the first order sensitivityindices and the first order terms of the ANOVA decompositionneed to be known. The efficiency of this method can be increasedfurther by adding higher order terms to control variate, however itmakes this method more difficult to apply practically.

As shown in [3] models with respect to their dependence on theinput variables can be loosely divided into three categories: modelswith not equally important variables only a few of which areimportant (type A), models with equally important variables andwith dominant low order terms (type B) and models with equallyimportant variables and with dominant interaction terms (type C).For type A and B functions f ðxÞ and f ðx0j; x � jÞ can be approximatedwith (7) and (8), respectively with sufficient accuracies.

The developed technique should provide a significant efficiencyimprovement for types A and B functions. For the type C functionsSTj are much larger than Sj and hence the choice of f jðxjÞ as controlvariate is not sufficient as the higher order interaction terms play asignificant role in the ANOVA decomposition of f ðxÞ.

5. The use of metamodels

The developed technique requires that the first order ANOVA termsf jðxjÞ and the corresponding Sj to be known explicitly. In a general caseof functions not known analytically f jðxjÞ and Sj can only be found bybuilding metamodels and then extracting the numerical values of thefirst order sensitivity indices and approximation of the first orderterms of the ANOVA decomposition from these metamodels. Weapplied the RS-HDMR method suggested in [9] to approximate f j andSj in the formula (9), although it is possible to use other metamodel-ling methods: kriging, RBF, etc. [6–8]. The RS-HDMR method is similarto the Polynomial Chaos Expansion (PCE) method [10]. The differencebetween these methods is that RS-HDMR is based on cutting higherorder terms in the ANOVA decomposition above a certain order(typically second or third) because for many practical problems onlythe low order terms in the ANOVA decomposition (5) are important,

hence f ðxÞ can be approximated by

f ðxÞ � f 0þ ∑d

s ¼ 1∑

i1 o…o is

f i1…is ðxi1 ;…; xis Þ: ð11Þ

Here d is a truncation order, which for many practical problemscan be equal to 2 or 3 [9]. The component functions in (11) areexpanded in terms of a suitable set of functions depending on thedistribution of input variables. Typically in the RS-HDMR and PCEmethods orthonormal polynomials are used. Using a completebase of orthogonal polynomials {φr} the first and second orderterms are expanded as:

f jðxjÞ ¼ ∑1

r ¼ 1αjrφrðxjÞ; f ijðxi; xjÞ ¼ ∑

1

p ¼ 1∑1

q ¼ 1βijpqφpðxiÞφqðxjÞ: ð12Þ

Here fαjrg; fβij

pqg are the coefficients of decomposition whichneeds to be determined. Due to orthogonality of polynomials thefirst and second order sensitivity indices are defined using onlythe coefficients of the decomposition (12) as follows [10]:

Sj ¼1D

∑1

r ¼ 1ðαj

rÞ2; Sij ¼1D

∑1

p ¼ 1∑1

q ¼ 1ðβij

pqÞ2: ð13Þ

In practice summations in (12) and (13) are limited to somemaximum orders. Coefficients of decomposition fαj

rg; fβijpqg can be

found using a regression method or MC or Quasi MC integration. Thedetails of implementation can be found in [11], where a Quasi RS-HDMR (QRS-HDMR) variant of the RS-HDMRmethod was developed.One of the distinctive characteristics of the QRS-HDMR method isthat Quasi MC sampling is used in the metamodel training set. In thiswork we used a regression method for determining the coefficientsof decomposition.

6. Test cases

In this section we evaluate total sensitivity indices STj for a set offunctions with known analytical values for the first order sensi-tivity indices and the expressions for the first order terms of theANOVA decomposition using the original formula (6) (which wefurther call standard) and the improved formula (9) using expres-sions for f jðxjÞ and Sj computed either analytically or applying themetamodelling technique. In each test case, the MC estimates oftotal sensitivity indices computed using formulas (6) and (9) arecalculated for a number of sampled points N up to 215. In all caseswe used MC sampling. We present results for the convergence ofSTj and the root mean square error (RMSE) εjðNÞ versus N. RMSE isdefined as:

εjðNÞ ¼1K

∑K

k ¼ 1ðSTnumðkÞ

j �STA

j Þ2" #1=2

: ð14Þ

Here STnumðkÞ

j is a numerical estimate of STj at kth repetition, STA

j isan analytical value of STj . For the MC method all runs should bestatistically independent. In this work the number of independentrepetitions K was equal to 20.

For each test case we also present histograms which graphicallyshow the difference between the original and control variate reductiontechniques. For all tests we present the histogram of the distribution off ðxÞ� f ðx0j; x � jÞ which is used in the standard formula (6) and thehistogram of the distribution of f ðxÞ� f jðxjÞ�½f ðx0j; x � jÞ� f jðx0jÞ� whichis used in the formula (9). We give the values of the variance VSðjÞ forthe standard formula (6) which corresponds to the distributionf ðxÞ� f ðx0j; x � jÞ, the variance VRðjÞ for the control variate formula (9),which corresponds to the distribution f ðxÞ� f jðxjÞ�½f ðx0j; x � jÞ� f jðx0jÞ�and their ratio VRðjÞ=VSðjÞ. We note that values VSðjÞ and VRðjÞ dependon an input variable j.

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259 253

Page 4: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

6.1. Ishigami function

The Ishigami function is often used as a benchmark in sensi-tivity analyses studies [1]. This function is defined as:

f ðxÞ ¼ sinðx1Þþ a sin2ðx2Þþ bx34 sin ðx1Þ:

Here input factors xj are uniformly distributed: �πrxjrπ;j¼ 1;2;3, parameters a¼ 7; b¼ 0:1. The first order terms of theANOVA decomposition for this function are f 1ðx1Þ ¼ sinðx1Þð1þðbπ4=5Þ Þ, f 2ðx2Þ ¼ a sin2ðx2Þ�a=2, f 3ðx3Þ ¼ 0.

Table 1 presents analytical values of the first order and totalindices. From the values of sensitivity indices one can see thatstrictly speaking it is neither type A nor type B function. However,for input 2 ST2 ¼ S2, hence the first term of the right hand sideexpression in (9) is equal to 0 and application of the control variateformula should show the dramatic increase in efficiency. One canalso expect that formula (9) can give efficiency improvement for thefirst input. For the third input variable f 3ðx3Þ ¼ 0 and S3 ¼ 0, hencethere should be no difference in efficiencies of formulas (6) and (9).

Numerical tests confirm these assumptions. Total sensitivityindices STj for the first and second input variables were computedusing the original formula (6) and the improved formula (9). We usedanalytical expressions for both f jðxjÞ and Sj (curves noted as “reduc-tion”) and expressions for f jðxjÞ and Sj obtained from a metamodel asdescribed in Section 3 (curves noted as “HDMR reduction”). QRS-HDMR with N¼128 sampled points was used to build a metamodeland to approximate f j and Sj. We note that from the efficiency pointof view, metamodels should be constructed with the minimumpossible number of function evaluations.

Figs. 1 and 2 show the results of convergence and RMSE ofSTj ; j¼ 1; 2; 3 versus the number of sampled points N for threedifferent cases: standard formula (6), control variate formula (9)with using f jðxjÞ and Sj computed analytically, and f jðxjÞ and Sj basedon metamodel computations. Control variate formula (9) withusing f jðxjÞ and Sj computed either analytically or using metamodelgives a significant speed up in convergence for the first input (Fig. 1a),while for the second input it gives practically instantaneous conver-gence (Fig. 1b) as expected. For the third input for reasons given abovethere is no difference in convergence of all three methods (Fig. 1c).

It follows from Fig. 2a that for the first input the control variateformula gives improvements in convergence up to 3 folds, whenanalytical expressions for f 1ðx1Þ and S1 are used and up to 2 folds,when metamodel based expressions for f 1ðx1Þ and S1 are used. Forthe second input (Fig. 2b) in the case of analytically computedf 2ðx2Þ and S2 indices ST2 ¼ S2 regardless of the number of sampledpoints. As a result, RMSE for this case is exactly equal to 0. In thecase of f 2ðx2Þ and S2 computed using metamodels, RMSE is manyorders of magnitude smaller than that computed using thestandard formula. For the third input RMSE for all three methodsis the same (Fig. 2c). Hence we can conclude that the controlvariate formula does not give improvements in the case of inputsfor which Sj � 0 and STj is relatively large.

Computation of variance for the first input for the original andcontrol variate formulas gives the following results: VSðj¼ 1Þ ¼13:8, VRðj¼ 1Þ ¼ 5:8. The ratio VSðj¼ 1Þ= VRðj¼ 1Þ ¼ 2:4. It alsofollows from the presented hystogram in Fig. 3 that the controlvariate formula shows a decrease of the variance.

6.2. G-function

The g-function is also used as a benchmark in global sensitivityanalyses tests [12]. This function is defined as:

f ðxÞ ¼ ∏n

j ¼ 1gjðxj; ajÞ ¼ ∏

n

j ¼ 1

j4xj�2jþaj1þaj

:

Here n is the number of independent input factors xj :0rxjr1; j¼ 1;…;n. Further we consider a problem with n¼8

Table 1First order and total sensitivity indices. Ishigami function.

Factor First order index Total index

1 0.3138 0.55742 0.4424 0.44243 0.0 0.2436

Fig. 1. Convergence of STj for the Ishigami function. First input (a); second input (b);third input (c). Blue line: standard formula (6); red line: control variate reductionformula (9) with using analytically given f jðxjÞ and Sj; green line: control variatereduction formula (9) with using f jðxjÞ and Sj based on metamodel computations;purple line: analytical values of STj . (For interpretation of the references to colour inthis figure legend, the reader is referred to the web version of this article.)

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259254

Page 5: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

input variables with the vector of parameter fajg ¼ ½0; 1; 4:5;9; 99; 99; 99; 99�. Parameter aj is set to determine the impor-tance of the input factor xj, given that the range of variation ofgjðxj; ajÞ depends exclusively on the value of aj. If aj ¼ 0, thecorresponding input xj is important; if aj ¼ 1, xj is relativelyimportant, while for aj ¼ 9 it becomes non-important and foraj ¼ 99 non-significant.

For this function the first order partial variances are equal toDj ¼ 1=3ð1þajÞ2, the higher order partial variances are productsDj1…js ¼Dj1 ;…;Djs and the total variance is D¼∏n

j ¼ 1ðDjþ1Þ�1.The first order terms of the ANOVA decomposition are f jðxjÞ ¼gjðxj; ajÞ�1; j¼ 1;…;n. Table 2 presents analytical values of the firstorder and total sensitivity indices. From the values of the firstorder and total sensitivity indices one can see that STj is close to Sj.This is a type A function, and hence the developed method shouldbe efficient.

We note, that the g-function has a kink at xj ¼ 0:5; j¼ 1;…;n andit is not differentiable at this point. Building metamodels for suchfunctions using continuous smooth function as a decomposition baseis especially difficult as it is not possible to approximate regions ofthe functions in the vicinity of kinks with sufficient local accuracy.QRS-HDMR with N¼1024 sampled points was used to build ametamodel and to approximate f j and Sj. This number of sampledpoints is much higher than that used for the Ishigami function(N¼128) due to both the existence of a kink and higher dimension-ality of the considered g-function (n¼8 while for the Ishigamifunction n¼3).

The results for the convergence and RMSE of STj ; j¼ 1; 3; 5 versusthe number of sampled points N for the standard formula (6) and thecontrol variate formula (9) with using f jðxjÞ and Sj computedanalytically and using a metamodel are presented in Figs. 4 and 5.Improvements in convergence with control variate formula (9)strongly depend on the value of STj : it is very significant for the firstinput with high values of Sj and STj (Fig. 4a) and it is less significant forthe third input with small values of Sj and STj (Fig. 4b) for bothanalytically and metamodel computed f jðxjÞ and Sj. However, for thefirth input there is only some improvement in convergence with f jðxjÞand Sj computed analytically while in the case of metamodel basedf jðxjÞ and Sj there is no convergence to the analytical value of STj(Fig. 4). It is explained by the fact that for non-significant inputs with

Fig. 2. log2ðRMSEÞ vs. log2ðNÞ for the Ishigami function. First input (a); second input(b); third input (c). Blue line: standard formula (6); red line: control variatereduction formula (9) with using analytically given f jðxjÞ and Sj; green line: controlvariate reduction formula (9) with using f jðxjÞ and Sj based on metamodelcomputations. (For interpretation of the references to colour in this figure legend,the reader is referred to the web version of this article.)

Fig. 3. Histogram for the first input. Blue: standard formula (6); red: control variatereduction formula (9) with using using f jðxjÞ and Sj based on metamodelcomputations. Ishigami function. (For interpretation of the references to colour inthis figure legend, the reader is referred to the web version of this article.)

Table 2First order and total sensitivity indices. G-function.

Factor First order index Total index

1 0.716 0.7872 0.179 0.2423 0.024 0.0344 0.0072 0.0105 1.0e-04 1.05e-046 1.0e-04 1.05e-047 1.0e-04 1.05e-048 1.0e-04 1.05e-04

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259 255

Page 6: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

extremely small values of Sj and STj as in the case of inputs 5 to 8 it ispractically not possible to compute f jðxjÞ and Sj with sufficientaccuracy due to high metamodelling errors. We note, that thissituation is different from the previous test case with the Ishigamifunction for which Sj ¼ 0 but STj has a relatively large value.

Fig. 5 also confirms these findings: for the first input the controlvariate formula gives improvements in convergence up to 15 folds,for the third input it is approximately 4 folds for both analytically andmetamodel based f jðxjÞ and Sj. For the fifth input improvements inconvergence is 4 folds for analytically given f jðxjÞ and Sj while for the

metamodel based f jðxjÞ and Sj there is no convergence to the truevalue of STj . We also notice that the values of RMSE drop manyorders of magnitude with the increase of the input number.

Fig. 6 shows that switching from integrand ½f ðxÞ� f ðx0j; zÞ� to

f ðxÞ� f jðxjÞ�½f ðx0j; zÞ� f jðx0jÞ� gives a dramatic decrease in the var-

iance: for the first input VSðj¼ 1Þ ¼ 0:8, VRðj¼ 1Þ ¼ 0:07, the ratioVSðj¼ 1Þ= VRðj¼ 1Þ ¼ 11:4.

Fig. 4. Convergence of STj for G-function. First input (a); third input (b); fifth input (c).Blue line: standard formula (6); red line: control variate reduction formula (9) withusing analytically given f jðxjÞ and Sj; green line: control variate reduction formula (9)with using f jðxjÞ and Sj based on metamodel computations; purple line: analyticalvalues of STj . (For interpretation of the references to colour in this figure legend, thereader is referred to the web version of this article.)

Fig. 5. log2ðRMSEÞ vs. log2ðNÞ for G-function. First input (a); third input (b); fifthinput (c). Blue line: standard formula (6); red line: control variate reductionformula (9) with using analytically given f jðxjÞ and Sj; green line: control variatereduction formula (9) with using f jðxjÞ and Sj based on metamodel computations;purple line: analytical values of STj . (For interpretation of the references to colour inthis figure legend, the reader is referred to the web version of this article.)

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259256

Page 7: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

We can conclude that the control variate formula does not giveimprovements in the case of inputs for which STj � 0.

6.3. Product function

Owen considered product functions of the following form [4]:

f ðxÞ ¼ ∏n

j ¼ 1½μjþτjgjðxjÞ�:

Here gðxÞ ¼ffiffiffiffiffiffi12

pðx�0:5Þ. In this study, we use n¼6,

fτjg ¼ ½1; 1; 0:5; 0:5; 0:25; 0:25� and μj ¼ 1 for all j. The totalvariance D is D¼∏ðμi

2þτi2Þ�∏μi2. The first order terms of the

ANOVA decomposition are

f jðxjÞ ¼ ½μjþτjgjðxjÞ� ∏n

i¼ 1ia j

ðμiÞ�∏iμi:

The analytical values for first order and total sensitivity indicesare:

Sj ¼1D

ðμ2j þτ2j Þ�∏

iμi

" #; STj ¼

12D

2τ2j ∏ia j

ðμ2i þτ2i Þ

" #:

From the values of the first order and total sensitivity indicesone can see (Table 3) that this is a type C function, STj are muchlarger than Sj and hence the choice of f jðxjÞ as a control variate isnot sufficient as the higher order interaction terms play a sig-nificant role in the ANOVA decomposition of f ðxÞ.

Fig. 7 shows that although both methods converge to theanalytical values, there is no real improvement in applying thecontrol variate method using only the first order terms f jðxjÞ. Plotsof RMSE (Fig. 8) also show that there are no significant improve-ments in the use of the control variate method.

Comparison of the histograms also shows that there is nosignificant reduction of variance for a function of this type (Fig. 9):for the first input VSðj¼ 1Þ ¼ 7:0, VRðj¼ 1Þ ¼ 5:0, the ratioVSðj¼ 1Þ= VRðj¼ 1Þ ¼ 1:4. Similar results were obtained for otherinputs.

7. Conclusions

It was shown that the control variate reduction technique canbe used to improve the efficiency of the Sobol–Jansen formula forevaluation of total sensitivity indices. The control variate reductionformula for total sensitivity indices requires the knowledge of thefirst order sensitivity indices and the first order terms of theANOVA decomposition. They can be obtained either from

Fig. 6. Histogram for the first input. Blue: standard formula (6); red: control variatereduction formula (9) with using analytically given f 1ðx1Þ and S1. G-function. (Forinterpretation of the references to colour in this figure legend, the reader is referredto the web version of this article.)

Table 3First order and total sensitivity indices. Owen's product function.

Sj ; j¼ 1;…;6 0.165 0.165 0.041 0.041 0.0103 0.0103

STj ; j¼ 1;…;6 0.583 0.583 0.233 0.233 0.0685 0.0685

Fig. 7. Convergence of STj for Owen's product function. Second input (a); forth input(b); sixth input (c). Blue line: standard formula (6); red line: control variatereduction formula (9) with using analytically given f jðxjÞ and Sj; purple line:analytical values of STj . (For interpretation of the references to colour in this figurelegend, the reader is referred to the web version of this article.)

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259 257

Page 8: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

analytical exact expressions (in the case of explicitly given modelfunctions) or from using metamodelling techniques, although thelatter is more practical.

Presented results using well known test functions show highefficiency of the developed technique for types A and B functions.However, for type C functions the control variate reductionformula does not offer improved efficiency which is explained bythe importance of higher order terms in the ANOVA decomposi-tion for this type of functions.

In most cases a practitioner does not know in advance the type offunctions. However, in practice it is possible to estimate main effectsensitivity indices with a limited number of function evaluationsusing recent advances in global sensitivity analysis and metamodel-ling. In the case of inputs for which Sj � 0 it is not advisable to usethe control variate formula as it does not give improvements in thecase of inputs for which Sj � 0 and STj is relatively large, while fornon-significant inputs with extremely small values of Sj and STj thecontrol variate formula is less efficient than the Sobol–Jansenformula as it is practically not possible to compute the first ordersensitivity indices and the first order terms of the ANOVA decom-position with sufficient accuracy due to high metamodelling errors.

The efficiency of the proposed method can be increased furtherby adding higher order terms to control variate approximation of themodel function. However this would make the method less practical.

Finally, we note that it is important in practice to obtainconfidence intervals associated with MC estimates (in our case inestimating total sensitivity index using the control variate techni-que). It can be done e.g. by using the bootstrap procedure [16] and itsvariants specially adapted for evaluation of sensitivity indices basedon metamodels [17] or using Gaussian process formulation [18].

Acknowledgements

The financial support by the EPSRC Grant EP/H03126X/1 isgratefully acknowledged. Authors also like to thank Dr. ShufangSong for her help in preparation of this manuscript.

References

[1] Saltelli A, Tarantola S, Campolongo F, Ratto M. Sensitivity Analysis in Practice.London: Wiley; 2004.

[2] Sobol' I, Kucherenko S. Global sensitivity indices for nonlinear mathematicalmodels review. Wilmott Mag 2005;1:56–61.

[3] Kucherenko S, Feil B, Shah N, Mauntz W. The identification of model effectivedimensions using global sensitivity analysis. Reliab Eng Syst Safety2011;96:440–9.

[4] Owen A. Better estimation of small Sobol’ sensitivity indices. ACM Trans ModComp Simul 2013;23(11):1–17.

[5] Tarantola S, Gatelli D, Mara T. Random balance designs for the estimation offirst order global sensitivity indices. Reliab Eng Syst Safety 2006;91(6):717–27.

[6] Oakley A, O’Hagan A. Probabilistic sensitivity analysis of complex models: aBayesian approach. J R Stat Soc: Ser B 2004;66:751–69.

[7] Marrel A, Iooss B, Laurent B, Roustant O. Calculations of Sobol indices for theGaussian process metamodel. Reliab Eng Syst Safety 2009;94:742–51.

[8] Buhmann M. Radial basis functions: theory and implementations. CambridgeUniversity Press; 2003.

Fig. 8. log2ðRMSEÞ vs. log2ðNÞ for Owen's product function. Second input (a); forthinput (b); sixth input (c). Blue line: standard formula (6); red line: control variatereduction formula (9) with using analytically given f jðxjÞ and Sj; purple line:analytical values of STj . (For interpretation of the references to colour in this figurelegend, the reader is referred to the web version of this article.)

Fig. 9. Histogram for the first input. Blue: standard formula (6); red: control variatereduction formula (9) with using analytically given f 1ðx1Þ and S1. Owen's ProductFunction. (For interpretation of the references to colour in this figure legend, thereader is referred to the web version of this article.)

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259258

Page 9: Reliability Engineering and System SafetyGlobal sensitivity analysis Sobol' sensitivity indices Control variate technique Random sampling- high dimensional model representation abstract

[9] Li G, Wang S, Rabitz H. Practical approaches to construct RS-HDMR componentfunctions. J Phys Chem 2002;106:8721–33.

[10] Blatman G, Sudret B. Adaptive sparse polynomial chaos expansion based onleast angle regression. J Comput Phys 2011;230:2345–67.

[11] Zuniga M, Kucherenko S, Shah N. Metamodelling with independent anddependent inputs. Comput Phys Commun 2013;184(6):1570–80.

[12] Sobol’ I. Global sensitivity indices for nonlinear mathematical models andtheir Monte Carlo estimates. Math Comput Simul 2001;55(1–3):271–80.

[13] Saltelli A, Annoni P, Azzini I, Campolongo F, Ratto M, Tarantola S. Variancebased sensitivity analysis of model output. Design and estimator for the totalsensitivity index. Comput Phys Commun 2010;181(2):259–70.

[14] Rubinstein RY. Simulation and the Monte Carlo method. New York, NY, USA:John Wiley & Sons, Inc.; 1981.

[15] Sobol’ I. Sensitivity estimates for nonlinear mathematical models. Matem Mod1990;2(1):112–8 ([in Russian, Translated in Sobol’ I.M. Sensitivity estimates fornonlinear mathematical models, Math Mod and Comp Exp 1993;26:407–14).

[16] Davison A, Hinkley D. Bootstrap methods and their application. New York, NY:Cambridge University Press; 1997.

[17] Storlie C, Swiler L, Helton J, Sallaberry C. Implementation and evaluation ofnonparametric regression procedures for sensitivity analysis of computation-ally demanding models. Reliab Eng Syst Safety 2009;94(11):1735–63.

[18] Le Gratiet, Cannamela C, Iooss B. A Bayesian approach for global sensitivityanalysis of (multifidelity) computer codes. SIAM/ASA J Uncert Quant 2014;2(1):336–63.

S. Kucherenko et al. / Reliability Engineering and System Safety 134 (2015) 251–259 259