Andrea Saltelli, Jessica Cariboni and Francesca Campolongo

1

Andrea Saltelli, Jessica Cariboni and Francesca Campolongo

European Commission, Joint Research Centre

SAMO 2007 Budapest

Accelerating factors screening

2

1. Sensitivity analysis web at JRC (software, tutorials,..)

http://sensitivity-analysis.jrc.cec.eu.int/

2. New book on SA with exercises for students - at Wiley for review - Please flag errors!

3. Summer school in 2008 – date to be decided

Sensitivity analysis at the Joint Research Centre of Ispra

3

Where do we stand in terms of good practices for global SA?

Screening: Morris – Campolongo – EE (1991-2007)

Quantitative: Sobol’, plus several investigators, 1990-2007

4

Screening: Morris – Campolongo – EE (1991-2007)
• Good but not so efficient

Quantitative: Sobol’, Saltelli (1993-2002)
• Efficient for Si (Mara’ + Tarantola [scrambled FAST], Ratto + Young [SDR] + proximities [Marco’s presentation of yesterday])
• Not so efficient for STi (Saltelli 2002)

5

The EE method can be seen as an extension of a derivative-based analysis.

Where to start? From the best available practice in screening: The method of Elementary Effects (Morris 1991)

Max Morris, Department of Statistics, Iowa State University

6

The method of Elementary Effects

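Model: $y = y(x_1, \ldots, x_k)$

Elementary Effect for the i-th input factor at a point $X^0$:

$$EE_i(x_1^0, \ldots, x_k^0) = \frac{y(x_1^0, \ldots, x_{i-1}^0, x_i^0 + \Delta, x_{i+1}^0, \ldots, x_k^0) - y(x_1^0, \ldots, x_k^0)}{\Delta}$$

[Figure: in the (x1, x2) plane, a step of size Δ along x1 from $(x_1^0, x_2^0)$ to $(x_1^0 + \Delta, x_2^0)$]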

7

Using the EE method

The EEi is still a local measure. Solution: take the average of several EE.

[Figure: sample points X1, …, Xr in the (x1, x2) input space]

r elementary effects $EE_i^1, EE_i^2, \ldots, EE_i^r$ are computed at X1, …, Xr and then averaged.

Average of the EEi’s: μ(xi)

Standard deviation of the EEi’s: σ(xi)

Factors can be screened on the (μ(xi), σ(xi)) plane.
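The averaging step can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation: the base points are drawn at random rather than on a grid, and `toy_model`, `elementary_effects` and the values of r and Δ are assumptions chosen for the example.

```python
import numpy as np

def elementary_effects(model, k, r, delta, rng):
    """Return an (r, k) array: one elementary effect per factor at r base points."""
    ee = np.empty((r, k))
    for j in range(r):
        # Base point drawn so that x + delta stays inside [0, 1] (assumption).
        x0 = rng.uniform(0.0, 1.0 - delta, size=k)
        y0 = model(x0)
        for i in range(k):
            x = x0.copy()
            x[i] += delta                      # move factor i only
            ee[j, i] = (model(x) - y0) / delta
    return ee

def toy_model(x):
    # Hypothetical model: linear in x[0], nonlinear in x[1], x[2] has no effect.
    return 2.0 * x[0] + x[1] ** 2

rng = np.random.default_rng(0)
ee = elementary_effects(toy_model, k=3, r=10, delta=0.25, rng=rng)
mu = ee.mean(axis=0)            # average of the EEi's
sigma = ee.std(axis=0, ddof=1)  # standard deviation of the EEi's
```

In this toy case the linear factor has a constant effect (sigma close to zero), the nonlinear factor has a non-zero sigma, and the inert factor has mu and sigma equal to zero.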

8

A graphical representation of results

[Figure: factors plotted on the (mu, sigma) plane, mu on the horizontal axis (0 to 0.45) and sigma on the vertical axis (0 to 0.9); labelled points: DK5, ZJ3, DK3, DJ3, ZK5, ZK4, DJ4, DK2, ZJ6, ZK1]

9

Using the EE method

Each input varies across p possible values (levels, usually quantiles) within its range of variation.

Example: xi ~ U(0,1), p = 4, with levels p1 = 0, p2 = 1/3, p3 = 2/3, p4 = 1

The optimal choice for Δ is Δ = p / [2(p − 1)]

[Figure: grid in 2D with levels 0, 1/3, 2/3, 1 on each axis. Sampling the levels uniformly.]
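As a small worked check of the levels and of Δ = p / [2(p − 1)], the sketch below uses exact fractions. The helper names `levels` and `optimal_delta` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def levels(p):
    """The p equally spaced levels in [0, 1]."""
    return [Fraction(j, p - 1) for j in range(p)]

def optimal_delta(p):
    """Step size delta = p / (2 (p - 1))."""
    return Fraction(p, 2 * (p - 1))

p = 4
grid = levels(p)
delta = optimal_delta(p)
# Levels from which a step of +delta lands on another level of the grid.
starts = [x for x in grid if x + delta in grid]
```

For p = 4 this gives Δ = 2/3, and the admissible starting levels for a +Δ step are 0 and 1/3.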

10

Improving the EE (Campolongo et al., ….. 2007)

- Using μ*(xi), the mean of the absolute values of the EEi, instead of the couple μ(xi) and σ(xi)

- Maximizing the spread of the trajectories in the input space

- Application to groups of factors

[Figure: two trajectories in the (x1, x2) plane, A B C and A’ B’ C’]
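Why the absolute value helps can be seen on a non-monotonic function, where elementary effects of opposite sign cancel in μ. The sketch below is an illustration under assumptions: the one-factor test function is hypothetical, and the two starting levels and Δ = 2/3 correspond to p = 4.

```python
import numpy as np

delta = 2.0 / 3.0
starts = np.array([0.0, 1.0 / 3.0])   # starting levels for p = 4

def model(x):
    # Hypothetical non-monotonic function, symmetric about x = 0.5.
    return (x - 0.5) ** 2

ee = (model(starts + delta) - model(starts)) / delta   # one EE per start
mu = ee.mean()                 # signed effects cancel
mu_star = np.abs(ee).mean()    # mean of absolute effects
```

Here the two effects are -1/3 and +1/3, so μ is zero while μ* is 1/3: the factor is influential but μ alone would miss it.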

11

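A comparison with variance-based methods: is μ*(xi) related to either Si or STi?

Empirical evidence: the g-function of Sobol’

$$y = \prod_{i=1}^{k} g_i(X_i), \qquad g_i(X_i) = \frac{|4X_i - 2| + a_i}{1 + a_i}, \qquad a_i \ge 0, \qquad X_i \sim U[0,1]$$

STi available analytically

[Figure: curves for a = 0.9, a = 9 and a = 99]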
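A minimal sketch of the g-function and of its analytic indices follows. The closed forms used are the standard ones for this function (partial variance V_i = 1 / (3 (1 + a_i)^2), total variance V = prod(1 + V_i) − 1); the function names and the example vector `a` are assumptions for illustration.

```python
import numpy as np

def g_function(x, a):
    """Sobol' g-function; x has shape (n, k) or (k,), a has shape (k,)."""
    x = np.atleast_2d(x)
    return np.prod((np.abs(4.0 * x - 2.0) + a) / (1.0 + a), axis=1)

def analytic_indices(a):
    """Analytic first-order (Si) and total (STi) indices of the g-function."""
    a = np.asarray(a, dtype=float)
    vi = 1.0 / (3.0 * (1.0 + a) ** 2)        # partial variance of each g_i
    total_var = np.prod(1.0 + vi) - 1.0      # variance of y
    s_first = vi / total_var
    # V_Ti = V_i * prod_{j != i} (1 + V_j)
    s_total = vi * np.prod(1.0 + vi) / (1.0 + vi) / total_var
    return s_first, s_total

a = np.array([0.9, 9.0, 99.0])   # example values: small a means important factor
s_first, s_total = analytic_indices(a)
```

A small a_i gives a large index, a large a_i a negligible one, which is what makes the function convenient for testing screening methods.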

12

Empirical evidence: the g-function

Factor a(i)

x1 0.001

x2 89.9

x3 5.54

x4 42.1

x5 0.78

x6 1.26

x7 0.04

x8 0.79

x9 74.51

x10 4.32

x11 82.51

x12 41.62

[Figure: μ* plotted against ST (0 to 0.5) for the twelve factors, for N=130 and N=6656; labelled points: X1, X3, X5, X6, X7, X8, X10]

A comparison with variance-based methods: μ*(xi) is a good proxy for STi

13

Implementing the EE method

The original implementation estimates r EE’s per input.

r trajectories of (k+1) sample points are generated, each providing one EE per input.

[Figure: a trajectory of the EE design in the (x1, x2, x3) space, with outputs Y1, Y2, Y3, Y4]

Total cost = r (k + 1); r is in the range 4-10.

Each trajectory gives k effects EE at the cost of (k + 1) simulations. Efficiency = k/(k+1) ~ 1
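The trajectory idea and its cost can be sketched as follows. This is a simplified illustration, not the original construction: it assumes an even number of levels p, uses only +Δ steps, and the function names and the linear test model are hypothetical.

```python
import numpy as np

def trajectory(k, p, rng):
    """One trajectory of k + 1 points; each step moves a single factor by +delta."""
    delta = p / (2.0 * (p - 1))
    # Starting levels from which a +delta step stays in [0, 1] (p even assumed).
    start_levels = np.arange(p // 2) / (p - 1)
    x = rng.choice(start_levels, size=k)
    points = [x.copy()]
    order = rng.permutation(k)          # random order in which factors move
    for i in order:
        x[i] += delta
        points.append(x.copy())
    return np.array(points), order, delta

def ee_from_trajectory(model, points, order, delta):
    """k elementary effects from the k + 1 model runs of one trajectory."""
    y = np.array([model(pt) for pt in points])
    ee = np.empty(len(order))
    for step, i in enumerate(order):
        ee[i] = (y[step + 1] - y[step]) / delta
    return ee

rng = np.random.default_rng(1)
k, r, p = 3, 4, 4
model = lambda x: x[0] + 2.0 * x[1] + 3.0 * x[2]   # hypothetical linear model
runs = 0
all_ee = []
for _ in range(r):
    pts, order, delta = trajectory(k, p, rng)
    runs += len(pts)
    all_ee.append(ee_from_trajectory(model, pts, order, delta))
all_ee = np.array(all_ee)
```

With r = 4 and k = 3 the loop uses r (k + 1) = 16 model runs and returns r k = 12 effects, i.e. an efficiency of k/(k+1).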

14

Conclusion: the EE is a useful method

Is its efficiency k/(k+1) ~ 1 good?

We can compare with the Saltelli 2002 method to implement the calculation of the first order and total order sensitivity indices:

15

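With one of this …

$$B = \begin{pmatrix} x_1^{(N+1)} & x_2^{(N+1)} & \cdots & x_i^{(N+1)} & \cdots & x_k^{(N+1)} \\ x_1^{(N+2)} & x_2^{(N+2)} & \cdots & x_i^{(N+2)} & \cdots & x_k^{(N+2)} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_1^{(2N)} & x_2^{(2N)} & \cdots & x_i^{(2N)} & \cdots & x_k^{(2N)} \end{pmatrix}$$

… plus one of this …

$$A = \begin{pmatrix} x_1^{(1)} & x_2^{(1)} & \cdots & x_i^{(1)} & \cdots & x_k^{(1)} \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_i^{(2)} & \cdots & x_k^{(2)} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_1^{(N)} & x_2^{(N)} & \cdots & x_i^{(N)} & \cdots & x_k^{(N)} \end{pmatrix}$$

… plus K of these (one for each i = 1, …, k) …

$$A_i = \begin{pmatrix} x_1^{(N+1)} & x_2^{(N+1)} & \cdots & x_i^{(1)} & \cdots & x_k^{(N+1)} \\ x_1^{(N+2)} & x_2^{(N+2)} & \cdots & x_i^{(2)} & \cdots & x_k^{(N+2)} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_1^{(2N)} & x_2^{(2N)} & \cdots & x_i^{(N)} & \cdots & x_k^{(2N)} \end{pmatrix}$$

… one can compute all first and total effects for k factors.

Saltelli 2002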

16

One of this: matrix B (N × k, sample rows N+1, …, 2N)

One of this: matrix A (N × k, sample rows 1, …, N)

K of these: matrices A_i, i = 1, …, k (matrix B with its i-th column taken from A)

Total: N(k+2) runs

To obtain N*2*k elementary effects (for Si or STi)

Efficiency = 2k/(k+2) ~ 2

Better than the EE method.

Saltelli 2002
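The run count can be checked by building the design explicitly. The sketch below only constructs the matrices and counts rows; it does not implement the estimators of Si and STi. It assumes plain uniform random sampling, and that each A_i is matrix B with its i-th column taken from A; the function name is hypothetical.

```python
import numpy as np

def saltelli_design(n, k, rng):
    """Return A, B and the k matrices A_i of an N x k design."""
    a = rng.random((n, k))
    b = rng.random((n, k))
    a_i = []
    for i in range(k):
        m = b.copy()
        m[:, i] = a[:, i]    # all columns from B, column i from A
        a_i.append(m)
    return a, b, a_i

n, k = 8, 5
a, b, a_i = saltelli_design(n, k, np.random.default_rng(0))
total_runs = a.shape[0] + b.shape[0] + sum(m.shape[0] for m in a_i)
effects = 2 * n * k                   # one first-order and one total effect per row of each A_i
efficiency = effects / total_runs     # equals 2k / (k + 2)
```

For N = 8 and k = 5 the design has N(k+2) = 56 rows and yields 80 effects.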

17

Conclusion: the efficiency of EE might have scope for improvement.

The better efficiency of the global method (Saltelli 2002) against the screening method (EE) is due to the fact that two effects (one of the first order and one of the total order) are computed from each row of Ai.

Can we do the same with EE?

18

Matrices B (sample rows N+1, …, 2N), A (sample rows 1, …, N) and A_i (matrix B with its i-th column taken from A).

From A to A_i … is one step in the non-Xi direction (all factors move but Xi)

Saltelli 2002

19

Matrices B (sample rows N+1, …, 2N), A (sample rows 1, …, N) and A_i (matrix B with its i-th column taken from A).

From B to A_i … is one step in the Xi direction (Xi moves and X~i does not)

Saltelli 2002

20

How about alternating steps along the Xi’s axes with steps along the X~i’s also for an EE-like screening method?

How can we combine steps along Xi’s axes with steps along the X~i’s?

21

Can we generate efficiently exploration trajectories in the hyperspace of the input factors where steps in the Xi and X~i directions are nicely arranged, e.g. in a square?

Beyond Elementary Effects Method

22


23

Our thesis is that

(1) Both |y1-y3| and |y2-y4| tell me about the first order effect of X1

24

… and that :

(2) ||y1-y4|-|y1-y2||, ||y2-y3|-|y2-y1||, ||y3-y2|-|y3-y4||, ||y4-y1|-|y4-y3||, all tell me about the total order effect of a factor.
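The six quantities of the thesis can be written down directly. The sketch below just transcribes the expressions from the two statements above; the function name and the additive toy model used to produce y1 to y4 are assumptions for illustration.

```python
def square_measures(y1, y2, y3, y4):
    """Quantities computed from the four outputs at the corners of one square."""
    first_order = [abs(y1 - y3), abs(y2 - y4)]
    total_order = [
        abs(abs(y1 - y4) - abs(y1 - y2)),
        abs(abs(y2 - y3) - abs(y2 - y1)),
        abs(abs(y3 - y2) - abs(y3 - y4)),
        abs(abs(y4 - y1) - abs(y4 - y3)),
    ]
    return first_order, total_order

# Hypothetical additive model y = x1 + x2 on the corners of a unit square:
# run 1 = (0, 0), run 2 = (0, 1), run 3 = (1, 0), run 4 = (1, 1).
y1, y2, y3, y4 = 0, 1, 1, 2
first, total = square_measures(y1, y2, y3, y4)
```

For this additive toy model all six quantities are equal to 1.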

25

Before trying to substantiate our thesis we take a look at how these squares could be built efficiently

Four runs, six factors

26

Four runs, six factors, six steps along the X~i directions

We call these four runs ‘base runs’

27

Base runs

Clones

For each step in the X~i direction we add two in the Xi direction

28

Base runs

Clones

Let’s count:
Run 3 is a step away from run 1 in the X1 direction.
Run 4 is a step away from run 2 in the X1 direction.
Run 2 was already a step away from run 1 in the X~1 direction.
Run 4 is also a step away from run 3 in the X~1 direction.
… the square is closed.
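One way such squares could be generated is sketched below. This is an assumed construction, not necessarily the authors' algorithm: each factor is assigned to one pair of base runs, the two runs of the pair share the value of that factor (a step in the X~i direction), and each gets a clone that differs only in that factor (a step in the Xi direction). Unlike the numbering used on the slide, the sketch stores all base runs first and the clones after them.

```python
import itertools
import numpy as np

def square_design(n_base, rng):
    """Base runs plus clones; returns the runs and one closed square per factor."""
    pairs = list(itertools.combinations(range(n_base), 2))
    k = len(pairs)                        # one factor per pair of base runs
    base = rng.random((n_base, k))
    for i, (p, q) in enumerate(pairs):
        base[q, i] = base[p, i]           # runs p and q differ in all factors but X_i
    runs = [row.copy() for row in base]
    squares = []
    for i, (p, q) in enumerate(pairs):
        new_value = rng.random()
        clone_p, clone_q = base[p].copy(), base[q].copy()
        clone_p[i] = new_value            # clone of p: only X_i moves
        clone_q[i] = new_value            # clone of q: only X_i moves
        runs.extend([clone_p, clone_q])
        idx = len(runs)
        squares.append((i, p, q, idx - 2, idx - 1))
    return np.array(runs), squares

runs, squares = square_design(4, np.random.default_rng(0))
```

With four base runs this gives six factors, sixteen runs and six closed squares, each made of two base runs and their two clones.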

29


30

Base runs

Clones

Let us do some more counting. We have 4 base runs, 16 runs in total, six factors and four effects per factor.

Efficiency = 24/16 = 3/2

31

For 6 base runs, we have 15 factors, 36 runs in total, again four effects per factor.

Efficiency = 60/36 = 5/3, approaching 2 as the number of factors increases …

It would be nice to stop here! … but let us go back to the 6 factors example
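The two counts above follow one pattern, sketched here. The general formula is an assumption inferred from the two worked cases (4 base runs and 6 base runs): one factor per pair of base runs, two clones per factor, four effects per factor. The function name is hypothetical.

```python
from math import comb

def design_counts(n_base):
    """Factors, total runs, effects and efficiency for a given number of base runs."""
    k = comb(n_base, 2)            # one factor per pair of base runs
    total_runs = n_base + 2 * k    # base runs plus two clones per factor
    effects = 4 * k                # four effects per factor
    return k, total_runs, effects, effects / total_runs

counts_4 = design_counts(4)
counts_6 = design_counts(6)
```

Under this assumption the total is n_base squared runs and the efficiency is 2 (n_base − 1) / n_base, which tends to 2 as the number of base runs grows.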

32

There are many more effects hidden in the scheme: e.g. three more effects for run 16.

Most of these effects are of the X~i type.

The number of extra terms is between 2k and 4k.

33

The number of extra terms grows with k

Some of these need only one more point to close a square

Most of these need two extra points to close a square

34

Let us forget about the additional terms for the moment and let us try screening …

35

Numerical Experiment: g-function

$$g = \prod_{i=1}^{k} g_i(X_i), \qquad \text{where} \quad g_i(X_i) = \frac{|4X_i - 2| + a_i}{1 + a_i}$$

36

Results: g-function (180 runs)

a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15

0.01 0.02 0.05 99 0.30 1.50 78 57 89 96 0.50 98 87 88 90

37

g function

[Figure: for factors x1 to x10, comparison of ST*10, EE1 (2007), EE2 (2007) and mu* (2007); vertical axis from 0.0 to 10.0]

Number of runs: EE(2007)= 25; EE =22

K=10, a=(0.01,0.02,0.015,99,78,57,89,97,96,87)

38

Test function Book (2007)

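$$Y = \sum_{i=1}^{14} Z_i \Omega_i$$

where

$$Z_i \sim N(0, \sigma_{Z_i}) \ \text{(factors } 1, \ldots, 14\text{)}, \qquad \Omega_i \sim N(\bar{\omega}_i, \sigma_{\Omega_i}) \ \text{(factors } 15, \ldots, 28\text{)}$$

Parameter vectors (14 entries each):

[0.5, 0.5, 0.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7]

[1, 1, 0.5, 0.5, 0.5, 0.5, 3, 3, 4, 4, 5, 2, 6, 7]

[1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 1, 1, 6, 7]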

The last two Z’s and the last two omegas are the most important factors

39

Number of runs: new method= 64; old method =58

Book test case

[Figure: for factors X1 to X28, comparison of ST*1000, EE1 (2007), EE2 (2007) and mu* (2007); vertical axis from 0 to 600]

40

g function 25 replicas of EE1(2007)

41

g function 25 replicas of EE2(2007)

42

g function 25 replicas of EE

43

book function 25 replicas of EE2(2007)

44

book function 25 replicas of EE1(2007)

45

book function 25 replicas of EE

46

What next? Good for Si, STi ?

47

Si couple

STi couple

Si couple

STi couple

48

Si couple

STi couple

Si couple

STi couple

Try to exploit this design for the improvement of the Saltelli 2002 method for the STi

49

The number of extra terms grows with k

Some of these need only one more point to close a square

Most of these need two extra points to close a square

(closed squares give 4 effects, 2 Si & 2 STi)

50

Conclusions

The new scheme (aka il matricione) has promises for EE and STi

Work on the algorithms is needed to make a sizeable difference with best available practices …

51

il matricione
