1
Rationale 1 The demand for eruption scenario forecast is pres- sing. Volcanic systems are largely out of direct observa- tion. Some of the quantities which determine volcanic pro- cesses are uncertain. Taking into account these uncertainties in predictive models can be exceedingly demanding. We are therefore developing the application of sto- chastic methods to largely reduce the computational costs. What is stochastic quantization? 2 A practical situation: the random vector X =(X 1 ,...,X d ) is part of the input data of a numerical code ϕ and the random variable Y is one relevant model output; the probability density function f (x) of X is assumed to be known; there is a maximum number N of affordable simula- tions. Strategy stochastic quantization method: find N values of X , x (1) 1 ,...,x (1) d ,..., x (N ) 1 ,...,x (N ) d , and N corresponding weights, w (1) ,...,w (N ) , with N i=1 w (i) = 1, so that the resulting discrete distribu- tion is the “best” approximation of f (x); known input distribution unknown output distribution M odel ϕ quantization Stochastic M odel ϕ input discretization output discretization probability density probability density probability density probability density variable X 1 variable X 1 variable Y variable Y N = 10 N = 10 x (1) x (10) y (1) y (10) w 1 w 2 w 2 w 1 w 10 w 10 ... ... ... ... for i =1,...,N compute y (i) = ϕ x (i) 1 ,...,x (i) d and give it the weight w (i) ; the resulting discre- te distribution is an approximation of the unknown distribution of Y . Figure 1: approximation of input and output distribu- tions. The orange arrows represent performed computa- tions, while the blue one represents a computation often out of reach in real situations. The single parameter input case, d=1 3 Introduce a distance between two probability distri- butions, in the case in which X is a scalar quantity: if F (x) is the cumulative distribution function as- sociated with the density f (x) and ˆ F (x) is the one associated with its discretization ˆ f (x), we define the distance between f and ˆ f as follows: d(f, ˆ f )= x max x min F (x) - ˆ F (x) dx, (1) where x min and x max are the minimum and maximum possible values of X . countinuous cumulative probability F (x) variable X 0 1 discrete cumulative probability ˆ F (x) variable X x (1) x (10) w 1 w 2 w 10 0 1 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ... ... ... The procedure consists in searching for N points x (1) ,...,x (N ) and N corresponding weights w (1) ,...,w (N ) that minimize the quantity d(f, ˆ f ). distance between distributions probability F (x) - ˆ F (x) variable X 0 1 Figure 2: the distance between the continuous probability distribution and the discrete one is the shaded region area. The multi-parameter input case, d>1 4 When X is a d-dimensional vector quantity, a diffe- rent definition of distance is more appropriate. Let ˆ X be a discrete random vector with probability distribution ˆ f , approximating a continuous random vector X . The distance between f and ˆ f can be defined as the mean value of the error X - ˆ X re- sulting from the substitution of X with ˆ X . We thus minimize d(f, ˆ f )= E |X - ˆ X | . random points with density f (x) variable X 2 variable X 1 x (1) x (2) x (3) x (4) x (5) x (6) x (7) x (1) ,...,x (7) possible values of ˆ X variable X 2 variable X 1 It can be shown that, in the case d = 1, E |X - ˆ X | = x max x min F (x) - ˆ F (x) dx; hence, the criterion for the multidimensional pro- blem is a generalization of that used in the one- dimensional case. d(f, ˆ f ) is calculated through a Monte Carlo method which involves the concept of Voronoi partitions. The procedure consists in searching for the discre- te random vector ˆ X that minimizes E |X - ˆ X | ; the possible values x (1) ,...,x (N ) of ˆ X and the corre- sponding weights w (1) ,...,w (N ) generate the discrete approximation ˆ f of the density f . Figure 3: implementation of the multi-parameter input case. The blue points are a sample of X =(X 1 ,X 2 ); the orange points x (1) ,...,x (7) are the possible values of ˆ X , which is a discrete approximation of X . The orange lines define the Voronoi regions generated by the set of points x (1) ,...,x (7) : the region associated to x (i) contains the blue points which are closer to x (i) than to any other of the orange points. Testing stochastic quantization in simple cases 5 5 9 5 1 5 1 5 2 5 mean real value numerosity 20 50 100 200 500 1000 2000 MC SQ,N = 20 5 5 6 5 7 5 8 5 9 5 1 standard deviation numerosity 20 50 100 200 500 1000 2000 real value MC SQ,N = 20 6 8 2 2 4 6 8 3 2 95th percentile numerosity 20 50 100 200 500 1000 2000 real value MC SQ,N = 20 Let ϕ(X ) be a known analytical function. The pro- bability distribution of the output random variable Y can be calculated exactly and compared with the ap- proximations produced by SQ and by Monte Carlo (MC) methods with variable numerosity. The case in the figure refers to ϕ(X 1 ,X 2 )= X 2 1 X 2 2 . Figure 4: with the SQ method and only N = 20 simu- lations, we approximate the true values at a confidence level corresponding to N = 2000 MC simulations for the mean or N = 200 MC simulations for the standard deviation. Application of SQ to volcanic conduit dynamics 6 Application of SQ to a situation in which the output probability distribution cannot be explicitly calcula- ted, but quite complete statistical information about it can be obtained through MC simulations. One-dimensional steady model of magma flow in a cilindrical conduit with fixed diameter and uniform temperature [1]. Random input quantities: diameter D of the conduit and total mass fraction w H 2 0 of water. Random output quantity: logarithm of the mass flow rate ˙ m. Figure 5: the correspondence between the distribution of mass flow rate found with 1000 MC simulations and SQ method is fully satisfactory when N SQ = 20. Stochastic quantization M odel ϕ M odel ϕ w H 2 O (%) w H 2 O (%) 5 5 probability density probability density 6 7 6 7 D(m) 0.2 0.7 1.2 1.2 0.7 0.2 D(m) 0.5 9.5 9.5 0.5 6.5 7.5 7.5 6.5 probability distribution log 10 ( ˙ m(kg/s) ) log 10 ( ˙ m(kg/s) ) probability distribution SQ,N SQ = 20 SQ,N SQ = 15 SQ,N SQ = 10 MC MC CONCLUSIONS The SQ method allows the introduction of uncertainties in the deterministic approach without requiring excee- ding CPU time. As a consequence, volcanic scenarios can be estimated in the future by means of complex deterministic models and taking into account the intrin- sic uncertainties involved in the definition of volcanic systems. Merging deterministic and probabilistic approaches to forecast volcanic scenarios E. Peruzzo 1 , L. Bisconti 2 , M. Barsanti 2,3 , F. Flandoli 3 and P. Papale 2 1 Scuola Normale Superiore, Pisa, Italy 2 Istituto Nazionale di Geofisica e Vulcanologia, Sezione di Pisa, Italy 3 Dipartimento di Matematica Applicata, Università di Pisa, Italy e-mail:[email protected] Abstract We present the stochastic quantization (SQ) method for the approximation of a continuous probability den- sity function with a discrete one. This technique redu- ces the number of numerical simulations required to get a reasonably complete picture of the possible eruptive conditions at a considered volcano. Finally we show the results of a test using a one-dimensional steady model of magma flow [1] as a benchmark. Young Scientists' Outstanding Poster Paper Contest G e n e r a l A s s e m b l y 2 0 0 9 This poster participates in YSOPP References [1] P. Papale, Dynamics of magma flow in volca- nic conduits with variable fragmentation efficiency and nonequilibrium pumice degassing, J. Geophys. Res., 106, 11043-11065, 2001 [2] S. Graf, H. Luschgy, Foundations of quanti- zation for probability distributions, Springer-Verlag, 2000

Merging deterministic and probabilistic approaches to ... · e-mail:[email protected] Abstract We present the stochastic quantization (SQ) method for the approximation of a continuous

Embed Size (px)

Citation preview

Page 1: Merging deterministic and probabilistic approaches to ... · e-mail:e.peruzzo@sns.it Abstract We present the stochastic quantization (SQ) method for the approximation of a continuous

Rationale 1

• The demand for eruption scenario forecast is pres-sing.

• Volcanic systems are largely out of direct observa-tion.

• Some of the quantities which determine volcanic pro-cesses are uncertain.

• Taking into account these uncertainties in predictivemodels can be exceedingly demanding.

• We are therefore developing the application of sto-chastic methods to largely reduce the computationalcosts.

What is stochastic quantization? 2

A practical situation:

• the random vector X = (X1, . . . , Xd) is part of theinput data of a numerical code ! and the randomvariable Y is one relevant model output;

• the probability density function f(x) of X is assumedto be known;

• there is a maximum number N of a!ordable simula-tions.

Strategy ! stochastic quantization method:

• find N values of X,!x(1)1 , . . . , x(1)

d

", . . . ,

!x(N)1 , . . . , x(N)

d

",

and N corresponding weights,#w(1), . . . , w(N)

$, with

%Ni=1 w(i) = 1, so that the resulting discrete distribu-

tion is the “best” approximation of f(x);

known

input

distribution

unknown

output

distributionModel

!

quantizationStochastic

Model

!

inputdiscretization

outputdiscretization

prob

abilit

yde

nsi

typr

obab

ilit

yde

nsi

ty

prob

abilit

yde

nsi

typr

obab

ilit

yde

nsi

ty

variable X1

variable X1

variable Y

variable Y

N = 10

N = 10

x(1) x(10) y(1) y(10)

w1

w2

w2

w1w10 w10

. . . . . . . . .. . .

• for i = 1, . . . , N compute

y(i) = !!x(i)1 , . . . , x(i)

d

"

and give it the weight w(i); the resulting discre-te distribution is an approximation of the unknowndistribution of Y .

Figure 1: approximation of input and output distribu-tions. The orange arrows represent performed computa-tions, while the blue one represents a computation oftenout of reach in real situations.

The single parameter input case, d=1 3

• Introduce a distance between two probability distri-butions, in the case in which X is a scalar quantity:if F (x) is the cumulative distribution function as-sociated with the density f(x) and F (x) is the oneassociated with its discretization f(x), we define thedistance between f and f as follows:

d(f, f) =! xmax

xmin

"""F (x)! F (x)"""dx, (1)

where xmin and xmax are the minimum and maximumpossible values of X.

countinuous

cumulative

prob

abilit

y

F (x)

variable X0

1

discrete

cumulative

prob

abilit

y

F (x)

variable Xx(1) x(10)

w1

w2

w10

0

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

. . . . . .. . .

• The procedure consists in searching for N points!x(1), . . . , x(N)

"

and N corresponding weights!w(1), . . . , w(N)

"

that minimize the quantity d(f, f).

distance

between

distributions

prob

abilit

y

F (x)! F (x)

variable X0

1

Figure 2: the distance between thecontinuous probability distributionand the discrete one is the shadedregion area.

The multi-parameter input case, d>1 4

• When X is a d-dimensional vector quantity, a di!e-rent definition of distance is more appropriate.

• Let X be a discrete random vector with probabilitydistribution f , approximating a continuous randomvector X. The distance between f and f can bedefined as the mean value of the error

!!!X ! X!!! re-

sulting from the substitution of X with X. We thusminimize

d(f, f) = E"|X ! X|

#.

random points with density f(x)

vari

able

X2

variable X1

x(1) x(2)

x(3)

x(4)

x(5)

x(6)

x(7)

x(1), . . . , x(7) ! possible values of X

vari

able

X2

variable X1

• It can be shown that, in the case d = 1,

E!|X ! X|

"=

# xmax

xmin

$$$F (x)! F (x)$$$dx;

hence, the criterion for the multidimensional pro-blem is a generalization of that used in the one-dimensional case.

• d(f, f) is calculated through a Monte Carlo methodwhich involves the concept of Voronoi partitions.

• The procedure consists in searching for the discre-te random vector X that minimizes E

!|X ! X|

"; the

possible values x(1), . . . , x(N) of X and the corre-sponding weights w(1), . . . , w(N) generate the discreteapproximation f of the density f .

Figure 3: implementation of the multi-parameter inputcase. The blue points are a sample of X = (X1, X2); theorange points x(1), . . . , x(7) are the possible values of X,which is a discrete approximation of X. The orange linesdefine the Voronoi regions generated by the set of pointsx(1), . . . , x(7): the region associated to x(i) contains theblue points which are closer to x(i) than to any other ofthe orange points.

Testing stochastic quantization in simple cases 5

0.085

0.09

0.095

0.1

0.105

0.11

0.115

0.12

0.125

mea

n

real value

numerosity20 50 100 200 500 1000 2000

MC

SQ,N = 20

0.05

0.055

0.06

0.065

0.07

0.075

0.08

0.085

0.09

0.095

0.1

stan

dard

devi

atio

n

numerosity20 50 100 200 500 1000 2000

real valueMC

SQ,N = 20

0.16

0.18

0.2

0.22

0.24

0.26

0.28

0.3

0.32

95th

perc

enti

le

numerosity20 50 100 200 500 1000 2000

real valueMC

SQ,N = 20

• Let !(X) be a known analytical function. The pro-bability distribution of the output random variable Y

can be calculated exactly and compared with the ap-proximations produced by SQ and by Monte Carlo(MC) methods with variable numerosity.

• The case in the figure refers to

!(X1, X2) = X21X2

2 .

Figure 4: with the SQ method and only N = 20 simu-lations, we approximate the true values at a confidencelevel corresponding to N = 2000 MC simulations forthe mean or N = 200 MC simulations for the standarddeviation.

Application of SQ to volcanic conduit dynamics 6

• Application of SQ to a situation in which the outputprobability distribution cannot be explicitly calcula-ted, but quite complete statistical information aboutit can be obtained through MC simulations.

• One-dimensional steady model of magma flow in acilindrical conduit with fixed diameter and uniformtemperature [1].

• Random input quantities: diameter D of the conduitand total mass fraction wH20 of water.Random output quantity: logarithm of the mass flowrate m.

Figure 5: the correspondence between the distributionof mass flow rate found with 1000 MC simulations andSQ method is fully satisfactory when NSQ = 20.

Stochastic quantization

Model

!

Model

!

wH2O(%)

wH2O(%)

5

5

prob

abilit

yde

nsi

typr

obab

ilit

yde

nsi

ty

6 7

6 7

D(m)

0.2

0.71.2

1.20.7

0.2

D(m)0.5

9.5

9.5

0.5

6.5 7.5

7.56.5

prob

abilit

ydi

stri

buti

on

log10

!m(kg/s)

"

log10

!m(kg/s)

"

prob

abilit

ydi

stri

buti

on

SQ,NSQ = 20SQ,NSQ = 15SQ,NSQ = 10

MC

MC

CONCLUSIONSThe SQ method allows the introduction of uncertaintiesin the deterministic approach without requiring excee-ding CPU time. As a consequence, volcanic scenarioscan be estimated in the future by means of complexdeterministic models and taking into account the intrin-sic uncertainties involved in the definition of volcanicsystems.

Merging deterministic and probabilistic approaches to forecast volcanic scenarios

E. Peruzzo1, L. Bisconti2, M. Barsanti2,3, F. Flandoli3 and P. Papale2

1 Scuola Normale Superiore, Pisa, Italy2 Istituto Nazionale di Geofisica e Vulcanologia, Sezione di Pisa, Italy

3 Dipartimento di Matematica Applicata, Università di Pisa, Italye-mail:[email protected]

Abstract

We present the stochastic quantization (SQ) methodfor the approximation of a continuous probability den-sity function with a discrete one. This technique redu-ces the number of numerical simulations required to geta reasonably complete picture of the possible eruptiveconditions at a considered volcano. Finally we show theresults of a test using a one-dimensional steady modelof magma flow [1] as a benchmark.

Young Scientists'Outstanding Poster Paper

Contest

General Assembly

2009

This poster participates in

YSOPPReferences[1] P. Papale, Dynamics of magma flow in volca-

nic conduits with variable fragmentation e!ciencyand nonequilibrium pumice degassing, J. Geophys.Res., 106, 11043-11065, 2001

[2] S. Graf, H. Luschgy, Foundations of quanti-zation for probability distributions, Springer-Verlag,2000