
A) Transformation method (for continuous distributions) U(0,1) : uniform distribution f(x) : arbitrary distribution f(x) dx = U(0,1)(u) du When inverse


Page 1

a) Transformation method (for continuous distributions)

U(0,1): uniform distribution
f(x): arbitrary distribution

f(x) dx = U(0,1)(u) du

When the inverse of the integral of f(x), $F^{-1}(u)$, is known, then $x = F^{-1}(u)$ is distributed according to f(x)

Example: Exponential distribution

4. MC Methods 4.2 Generators for arbitrary distributions

K. Desch – Statistical methods of data analysis SS10

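$$u = F(x) = \int_{-\infty}^{x} f(t)\,dt$$

$$f(x;\lambda) = \lambda e^{-\lambda x}, \qquad x \ge 0$$

$$u = \int_{0}^{x} \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x} \quad\Rightarrow\quad x = F^{-1}(u) = -\ln(1-u)/\lambda$$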
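The exponential example can be tried out numerically. The following is a minimal sketch, assuming Python's standard `random` module as the U(0,1) generator; the function name `sample_exponential`, the seed and the value of λ are illustrative choices, not part of the lecture.

```python
import math
import random

def sample_exponential(lam, rng):
    """Draw one x from f(x; lam) = lam * exp(-lam * x), x >= 0."""
    u = rng.random()                  # u uniform in [0, 1)
    return -math.log(1.0 - u) / lam   # x = F^{-1}(u); 1 - u lies in (0, 1]

rng = random.Random(1)                # fixed seed only for reproducibility
lam = 2.0                             # illustrative value of lambda
xs = [sample_exponential(lam, rng) for _ in range(100_000)]
mean_x = sum(xs) / len(xs)            # expected to be close to 1/lam
print(mean_x)
```

The sample mean should come out near 1/λ, which is a quick sanity check that the transformed numbers follow the exponential distribution.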

Page 2

b) Transformation method (discrete distributions)


$$P_{k+1} = \sum_{i=1}^{k} P(x_i), \qquad P_1 = 0, \quad P_{n+1} = 1$$
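A possible implementation of the discrete case is sketched below. It assumes the usual selection rule, which is not legible in the slide text: draw u uniformly in [0,1) and return the $x_k$ for which $P_k \le u < P_{k+1}$. The helper name `make_discrete_sampler` and the example probabilities are hypothetical.

```python
import bisect
import itertools
import random

def make_discrete_sampler(values, probs):
    """Return a function drawing values[k] with probability probs[k]."""
    # Cumulative sums P(x_1), P(x_1)+P(x_2), ..., 1 (these are P_2 ... P_{n+1})
    cum = list(itertools.accumulate(probs))

    def draw(rng):
        u = rng.random()                        # u uniform in [0, 1)
        k = bisect.bisect_right(cum, u)         # first index with cum[k] > u
        return values[min(k, len(values) - 1)]  # guard against rounding in the last sum

    return draw

rng = random.Random(2)                          # fixed seed only for reproducibility
draw = make_discrete_sampler(["a", "b", "c"], [0.2, 0.5, 0.3])
draws = [draw(rng) for _ in range(50_000)]
freq_b = draws.count("b") / len(draws)          # expected to be close to 0.5
print(freq_b)
```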

c) Hit-or-miss method (brute force)

Uniform distribution from 0 to c: $u_i$

Uniform distribution from $x_{min}$ to $x_{max}$: $x_i$

When $u_i \le f(x_i)$ → accept $x_i$, otherwise reject it

- two random numbers per try

- inefficient when f(x) ≪ c over most of the range

- need to (conservatively) estimate c (the maximum of f(x)); this can be done in a "warm-up" run
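The hit-or-miss recipe can be sketched as follows. The target f(x) = sin(x) on [0, π] with c = 1 is an assumed example, and `hit_or_miss` is a hypothetical helper name.

```python
import math
import random

def hit_or_miss(f, x_min, x_max, c, n, rng):
    """Collect n accepted x values; also return the number of tries."""
    accepted, tries = [], 0
    while len(accepted) < n:
        tries += 1
        x = rng.uniform(x_min, x_max)   # x_i uniform in [x_min, x_max]
        u = rng.uniform(0.0, c)         # u_i uniform in [0, c]
        if u <= f(x):                   # accept x_i when u_i <= f(x_i)
            accepted.append(x)
    return accepted, tries

rng = random.Random(3)                  # fixed seed only for reproducibility
accepted, tries = hit_or_miss(math.sin, 0.0, math.pi, 1.0, 20_000, rng)
efficiency = len(accepted) / tries      # area under f divided by area of the box
mean_x = sum(accepted) / len(accepted)
print(efficiency, mean_x)
```

For this example the efficiency should be close to 2/π and the mean of the accepted values close to π/2; a sharply peaked f(x) would give a much lower efficiency.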

Page 3

Improvement:

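- search for an analytical function s(x) close to f(x), normalized so that it can be sampled by the transformation method

- use c so that c · s(x) ≥ f(x) for all x

$$S(x) := \int_{-\infty}^{x} s(t)\,dt, \qquad x_i = S^{-1}(u_i)$$

1. take $u_i$ in [0,1] and calculate $x_i = S^{-1}(u_i)$

2. take $u_j$ in [0,c]

3. when $u_j \cdot s(x_i) \le f(x_i)$ accept $x_i$, otherwise reject it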
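The three steps can be illustrated with a small sketch. The target (a half-normal p.d.f.), the envelope s(x) = exp(-x) and the constant c = sqrt(2e/π) are assumptions chosen for this example so that c · s(x) ≥ f(x) holds for all x ≥ 0; none of them come from the lecture.

```python
import math
import random

def f(x):
    """Target: half-normal p.d.f. on x >= 0 (assumed example)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)

def s(x):
    """Envelope shape: exponential p.d.f. exp(-x) on x >= 0."""
    return math.exp(-x)

def s_inv_cdf(u):
    """S^{-1}(u) for S(x) = 1 - exp(-x)."""
    return -math.log(1.0 - u)

C = math.sqrt(2.0 * math.e / math.pi)   # max of f(x)/s(x), reached at x = 1

def sample_with_envelope(n, rng):
    out, tries = [], 0
    while len(out) < n:
        tries += 1
        x = s_inv_cdf(rng.random())     # step 1: x_i = S^{-1}(u_i)
        u = rng.uniform(0.0, C)         # step 2: u_j uniform in [0, c]
        if u * s(x) <= f(x):            # step 3: accept or reject
            out.append(x)
    return out, tries

rng = random.Random(4)                  # fixed seed only for reproducibility
xs, tries = sample_with_envelope(20_000, rng)
efficiency = len(xs) / tries            # expected to be close to 1/C
mean_x = sum(xs) / len(xs)              # half-normal mean is sqrt(2/pi)
print(efficiency, mean_x)
```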

Page 4

Search for: $I = \int_a^b g(x)\,dx$

4. MC Methods 4.3 Monte Carlo Integration


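Integration over one dimension:

$$I = \int_a^b g(x)\,dx = (b-a)\,\frac{1}{b-a}\int_a^b g(x)\,dx = (b-a)\,E[g]$$

(E[g] = expectation value of g w.r.t. the uniform distribution on [a,b])

Take $x_i$ uniformly distributed in [a,b] →

$$I \approx I_{MC} = \frac{b-a}{n}\sum_{i=1}^{n} g(x_i)$$

Variance (with $g_i = g(x_i)$):

$$V[I_{MC}] = \sigma_I^2 = V\!\left[\frac{b-a}{n}\sum_{i=1}^{n} g_i\right] = \left(\frac{b-a}{n}\right)^{2} V\!\left[\sum_{i=1}^{n} g_i\right] = \frac{(b-a)^2}{n}\,V[g_i]$$

$$V[g_i] = E[g_i^2] - E[g_i]^2 \approx \frac{\sum_i g_i^2}{n} - \left(\frac{\sum_i g_i}{n}\right)^{2}$$

(CLT)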
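A minimal sketch of the one-dimensional estimate and its error, following the formulas for $I_{MC}$ and $V[I_{MC}]$. The integrand sin(x) on [0, π] (exact value 2) and the name `mc_integrate` are assumptions made for illustration.

```python
import math
import random

def mc_integrate(g, a, b, n, rng):
    """Return (I_MC, sigma_I) for the integral of g over [a, b]."""
    values = [g(rng.uniform(a, b)) for _ in range(n)]   # g_i = g(x_i)
    mean_g = sum(values) / n
    var_g = sum(v * v for v in values) / n - mean_g ** 2  # V[g_i]
    estimate = (b - a) * mean_g                           # I_MC
    sigma = (b - a) * math.sqrt(max(var_g, 0.0) / n)      # sqrt(V[I_MC])
    return estimate, sigma

rng = random.Random(5)                  # fixed seed only for reproducibility
estimate, sigma = mc_integrate(math.sin, 0.0, math.pi, 100_000, rng)
print(estimate, sigma)
```

The quoted error falls like 1/√n, so reducing it by a factor of ten costs a hundred times more points.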

Page 5

Alternative: hit-or-miss integration

Page 6

Variance-reduced methods

a) Importance sampling:

If f(x) is a known p.d.f. (normalized on [a,b]) which can be integrated and inverted, then:

$$I = \int_a^b g(x)\,dx = \int_a^b \frac{g(x)}{f(x)}\,f(x)\,dx = E_f\!\left[\frac{g(x)}{f(x)}\right] = E_f[r(x)], \qquad r(x) = \frac{g(x)}{f(x)}$$

The expectation value of r(x) can be obtained with random numbers $x_i$ distributed according to f(x):

$$I_{MC} = \frac{1}{n}\sum_{i=1}^{n} \frac{g(x_i)}{f(x_i)}$$

(for the uniform p.d.f. f(x) = 1/(b-a) this reduces to $\frac{b-a}{n}\sum_i g(x_i)$)

- Variance of r(x): $V[r_i] = E[(r_i - \bar{r})^2]$ will be small when r is flat, i.e. when f has nearly the same shape as g

- The method takes care of (integrable) singularities (find an f(x) which has the same singularity structure as g(x))
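As an illustration of importance sampling with an integrable singularity, the sketch below integrates g(x) = (1+x)/(2√x) over (0,1], whose exact value is 4/3. The sampling density f(x) = 1/(2√x), with F(x) = √x and $F^{-1}(u) = u^2$, is an assumed choice that shares the singularity of g at x = 0.

```python
import math
import random

def g(x):
    """Integrand with an integrable singularity at x = 0 (assumed example)."""
    return (1.0 + x) / (2.0 * math.sqrt(x))

def f(x):
    """Sampling p.d.f. on (0, 1] with the same singularity: f(x) = 1/(2 sqrt(x))."""
    return 1.0 / (2.0 * math.sqrt(x))

def f_inv_cdf(u):
    """F(x) = sqrt(x), hence F^{-1}(u) = u^2."""
    return u * u

def importance_sample(n, rng):
    total = 0.0
    for _ in range(n):
        u = 1.0 - rng.random()      # u in (0, 1], avoids x = 0 exactly
        x = f_inv_cdf(u)            # x distributed according to f(x)
        total += g(x) / f(x)        # r(x) = g(x)/f(x) = 1 + x, which is flat
    return total / n

rng = random.Random(6)              # fixed seed only for reproducibility
estimate = importance_sample(20_000, rng)
print(estimate)                     # expected to be close to 4/3
```

Here r(x) stays between 1 and 2, so its variance is finite, whereas uniform sampling of g itself would have infinite variance because $g^2 \sim 1/x$ near zero.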

Page 7

b) Control function

(subtraction of an integrable analytical function)

$$\int g(x)\,dx = \underbrace{\int f(x)\,dx}_{\text{analytical}} + \underbrace{\int \big(g(x) - f(x)\big)\,dx}_{\text{MC}}$$
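A short sketch of the control-function idea, assuming g(x) = exp(x) on [0,1] and the control function f(x) = 1 + x, whose integral 3/2 is known analytically. Both functions are illustrative choices; only the difference g - f is integrated by Monte Carlo.

```python
import math
import random

def control_function_integrate(n, rng):
    """Integrate exp(x) over [0, 1] as analytic part + MC of the remainder."""
    analytic = 1.5                                  # integral of f(x) = 1 + x over [0, 1]
    diffs = []
    for _ in range(n):
        x = rng.random()                            # x uniform in [0, 1), interval width 1
        diffs.append(math.exp(x) - (1.0 + x))       # g(x) - f(x)
    mean_d = sum(diffs) / n
    var_d = sum(d * d for d in diffs) / n - mean_d ** 2
    return analytic + mean_d, var_d

rng = random.Random(7)                              # fixed seed only for reproducibility
estimate, var_residual = control_function_integrate(20_000, rng)
print(estimate, var_residual)                       # exact integral is e - 1
```

The variance of the remainder g - f is several times smaller than the variance of g itself, which is where the gain comes from.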

c) Partitioning

(split the integration range into several "flatter" regions)
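Partitioning can be sketched as below, assuming equal-width sub-ranges and the same number of points in each; the slides do not specify how the range is split. The integrand sin(x) on [0, π] and the name `partitioned_integrate` are illustrative.

```python
import math
import random

def partitioned_integrate(g, a, b, n_regions, n_per_region, rng):
    """Sum of independent MC estimates over equal-width sub-ranges of [a, b]."""
    width = (b - a) / n_regions
    total = 0.0
    for k in range(n_regions):
        lo = a + k * width
        s = sum(g(rng.uniform(lo, lo + width)) for _ in range(n_per_region))
        total += width * s / n_per_region   # MC estimate for this sub-range
    return total

rng = random.Random(8)                      # fixed seed only for reproducibility
estimate = partitioned_integrate(math.sin, 0.0, math.pi, 20, 500, rng)
print(estimate)                             # exact integral is 2
```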

Page 8

Let x be a random variable distributed according to f(x)

n independent "measurements" of x, $\vec{x} = (x_1,\dots,x_n)$, form a sample of size n from the distribution f(x) (outcome of an experiment)

$\vec{x}$ itself is a random variable with p.d.f. $f_{sample}(\vec{x})$

Sample space: all possible values of $\vec{x} = (x_1,\dots,x_n)$

If all $x_i$ are independent,

$$f_{sample}(\vec{x}) = f(x_1)\cdot f(x_2)\cdots f(x_n)$$

is the p.d.f. for $\vec{x}$

5. Estimation 5.1 Sample space, Estimators


Page 9

A central problem of (frequentist) statistics:

Find the properties of f(x) when only a sample $\vec{x} = (x_1,\dots,x_n)$ has been measured

Task: construct functions of the $x_i$ to estimate the properties of f(x) (e.g. μ, σ², …)

Often f depends on parameters $\theta_j$: $f(x_i;\theta_j)$; try to estimate the parameters $\theta_j$ from the measured sample $\vec{x}$

A function of the $x_i$ is called a statistic.

If a statistic is used to estimate parameters (μ, σ², θ, …), it is called an estimator

Notation: $\hat{\theta}$ is an estimator for θ

$\hat{\theta}$ can be calculated; the true value θ is unknown

Estimation of p.d.f. parameters is also called a fit


Page 10

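5. Estimation 5.2 Properties of Estimators

1. Consistency:

An estimator is consistent if for each ε > 0:

$$\lim_{n\to\infty} P\big(|\hat{\theta} - \theta| \ge \varepsilon\big) = 0$$

In simple words: for n → ∞, $\hat{\theta} \to \theta$ (or "$\lim_{n\to\infty} \hat{\theta} = \theta$")

2. Bias:

$\hat{\theta}(x_1,\dots,x_n)$ is itself a random variable, distributed according to a p.d.f. $g(\hat{\theta};\theta)$

This p.d.f. is called the sampling distribution

Expectation value of the sampling distribution:

$$E[\hat{\theta}(\vec{x})] = \int \hat{\theta}\, g(\hat{\theta};\theta)\, d\hat{\theta} = \int\!\cdots\!\int \hat{\theta}(\vec{x})\, f(x_1;\theta)\cdots f(x_n;\theta)\, dx_1 \cdots dx_n$$

because $g(\hat{\theta}(x_1,\dots,x_n))\, d\hat{\theta} = \prod_i f(x_i)\, dx_i$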

Page 11

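The bias of an estimator is defined as

$$b = E[\hat{\theta}] - \theta$$

An estimator is unbiased (or bias-free) if b = 0, i.e. $E[\hat{\theta}] = \theta$

An estimator is asymptotically unbiased if $\lim_{n\to\infty} b = 0$

Attention:

Consistent: a statement about large sample size

Unbiased: a statement about fixed sample size

3. Efficiency:

One estimator is more efficient than another if its variance is smaller, or more precise if its mean squared error (MSE), $E[(\hat{\theta} - \theta)^2]$, is smaller

$$MSE = E[(\hat{\theta} - \theta)^2] = V[\hat{\theta}] + b^2$$

because

$$b^2 = (E[\hat{\theta}] - \theta)^2 = E[\hat{\theta}]^2 - 2\theta E[\hat{\theta}] + \theta^2$$

and

$$E[(\hat{\theta} - \theta)^2] = E[\hat{\theta}^2] - 2\theta E[\hat{\theta}] + \theta^2 = E[\hat{\theta}^2] + b^2 - E[\hat{\theta}]^2 = V[\hat{\theta}] + b^2$$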
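Bias, variance and MSE can be checked with a toy simulation. The sketch below assumes a specific biased estimator that does not appear in the slides: the variance estimate with a 1/n normalization, applied to Gaussian samples of size n = 5 with true variance 1 (expected bias -σ²/n = -0.2).

```python
import random
import statistics

def biased_variance(sample):
    """Variance estimate with 1/n normalization (biased for finite n)."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

rng = random.Random(9)                          # fixed seed only for reproducibility
n, true_var, trials = 5, 1.0, 20_000
estimates = [
    biased_variance([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
]

bias = statistics.fmean(estimates) - true_var   # b = E[theta_hat] - theta
variance = statistics.pvariance(estimates)      # V[theta_hat] over the toy experiments
mse = statistics.fmean([(e - true_var) ** 2 for e in estimates])
print(bias, variance, mse, variance + bias ** 2)
```

The last two printed numbers agree up to rounding, since MSE = V + b² also holds exactly for the simulated ensemble.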

Page 12

4. Robustness

An estimator is robust if it does not depend strongly on single measurements (which might be systematically wrong)

5. Simplicity

(subjective)

Page 13

5. Estimation 5.3 Estimation of the mean


In principle one can construct an arbitrary number of different estimators for the mean value of a p.d.f., μ = E[x]

Examples:

- $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$: mean of the sample

- $\frac{1}{10}\sum_{i=1}^{10} x_i$: mean of the first ten members of the sample

- $\frac{1}{n-1}\sum_{i=1}^{n} x_i$

- the constant 42

- the median of the sample

- $\frac{x_{max} + x_{min}}{2}$

All have different (wanted and unwanted) properties
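The different properties of these estimators can be made visible with a toy study. The sketch below assumes Gaussian samples with μ = 3, σ = 1 and n = 50 (arbitrary choices) and reports bias and variance of each estimator over repeated experiments; the dictionary keys are just labels for this example.

```python
import random
import statistics

# The example estimators of the mean, written as functions of the sample xs
ESTIMATORS = {
    "sample mean": lambda xs: sum(xs) / len(xs),
    "first ten": lambda xs: sum(xs[:10]) / 10,
    "sum/(n-1)": lambda xs: sum(xs) / (len(xs) - 1),
    "constant 42": lambda xs: 42.0,
    "median": statistics.median,
    "midrange": lambda xs: (max(xs) + min(xs)) / 2,
}

def study(mu, sigma, n, trials, rng):
    """Return {name: (bias, variance)} estimated from repeated toy samples."""
    results = {}
    for name, estimator in ESTIMATORS.items():
        values = [
            estimator([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(trials)
        ]
        results[name] = (statistics.fmean(values) - mu,
                         statistics.pvariance(values))
    return results

results = study(mu=3.0, sigma=1.0, n=50, trials=2000, rng=random.Random(10))
for name, (bias, var) in results.items():
    print(f"{name:12s} bias={bias:+.4f} variance={var:.5f}")
```

The constant 42 has zero variance but a large bias, while the mean of the first ten members is unbiased but has a larger variance than the full sample mean.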

Page 14

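The mean of a sample provides an estimate of the true mean:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

a) $\bar{x}$ is consistent:

CLT: the p.d.f. of $\bar{x}$ approaches a Gaussian with variance $\sigma_{\bar{x}}^2 = \frac{1}{n}\sigma_x^2 \to 0$ for n → ∞

b) $\bar{x}$ is unbiased:

$$E[\bar{x}] = E\!\left[\frac{1}{n}\sum_i x_i\right] = \frac{1}{n}(n\mu) = \mu$$

c) Is $\bar{x}$ efficient?

$$V[\hat{\theta}] = V[\bar{x}] = E[\bar{x}^2] - (E[\bar{x}])^2 = E\!\left[\left(\frac{\sum_i x_i}{n}\right)^{2}\right] - \mu^2$$

$$= \frac{1}{n^2}\sum_{i}\sum_{j}\big(E[x_i x_j] - \mu^2\big) = \frac{1}{n^2}\sum_{i}\big(E[x_i^2] - \mu^2\big) = \frac{1}{n^2}\, n\, V[x] = \frac{1}{n}\sigma^2$$

using cov(x_i, x_j) = 0 for i ≠ j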
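The result $V[\bar{x}] = \sigma^2/n$ can be checked numerically. This is a minimal sketch assuming Gaussian measurements with σ = 1; the function name `variance_of_mean` and the sample sizes are illustrative.

```python
import random
import statistics

def variance_of_mean(n, sigma, trials, rng):
    """Variance of the sample mean over repeated toy samples of size n."""
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.pvariance(means)

rng = random.Random(11)                           # fixed seed only for reproducibility
var_n4 = variance_of_mean(4, 1.0, 5000, rng)      # expected near 1/4
var_n64 = variance_of_mean(64, 1.0, 5000, rng)    # expected near 1/64
print(var_n4, var_n64)
```

Increasing the sample size by a factor of 16 should reduce the variance of the mean by about the same factor.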