Computational Methods in Physics PHYS 3437 Dr Rob Thacker Dept of Astronomy & Physics (MM-301C) [email protected]


Today’s Lecture

Introduction to Monte Carlo methods

Background

Integration techniques

Introduction

“Monte Carlo” refers to the use of random numbers to model random events, which may in turn model a mathematical or physical problem

Typically, MC methods require many millions of random numbers

Of course, computers cannot actually generate truly random numbers

However, we can make the period of repetition absolutely enormous

Such pseudo-random number generators are based on truncating numbers to some of their significant digits

See Numerical Recipes, pp. 266-280 (2nd edition, FORTRAN)

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

John von Neumann
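To make the idea of a pseudo-random generator concrete, here is a minimal sketch of a linear congruential generator. It is an illustration only, not the generator the lecture has in mind: the class name `LCG` and method `next_uniform` are hypothetical, and the default constants are the "quick and dirty" values quoted in Numerical Recipes. The period of such a generator can be at most `m`.

```python
class LCG:
    """Linear congruential generator: x_{n+1} = (a * x_n + c) mod m.

    The default constants are an illustrative choice, not a recommendation
    for serious Monte Carlo work.
    """

    def __init__(self, seed=1, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def next_uniform(self):
        """Advance the state and return a float in [0, 1)."""
        # Taking the result mod m keeps only the low-order part of a*x + c.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m


if __name__ == "__main__":
    gen = LCG(seed=42)
    print([round(gen.next_uniform(), 4) for _ in range(5)])
```

Two generators started from the same seed produce the same sequence, which is what makes the numbers "pseudo" random and also what makes MC runs reproducible.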

History of numerical Monte Carlo methods

Another contribution to numerical methods related to research at Los Alamos

Late 1940s: scientists want to follow the paths of neutrons through various sub-atomic collision events

Ulam & von Neumann suggest using random sampling to estimate this process

100 events can be calculated in 5 hours on ENIAC

The method is given the name “Monte Carlo” by Nicholas Metropolis

Explosion of inappropriate use in the 1950s gave the technique a bad name

Subsequent research illuminated when the method was appropriate

Terminology

Random deviate – numbers chosen uniformly on [0,1]

Normal deviate – numbers chosen randomly on (-∞,∞), weighted by a Gaussian

Background to MC integration

Suppose we have a definite integral

$$I = \int_a^b f(x)\,dx$$

Given a “good” set of N sample points $\{x_i\}$ we can estimate the integral as

$$I = \int_a^b f(x)\,dx \approx \frac{b-a}{N} \sum_{i=1}^{N} f(x_i)$$

[Figure: f(x) plotted on [a,b] with sample points (e.g. $x_3$, $x_9$) marked. Each sample point yields an element of the integral of width (b-a)/N and height $f(x_i)$.]
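The estimate above can be sketched in a few lines of Python. This is a minimal illustration using the standard library; the function name `mc_integrate` and its parameters are hypothetical, not taken from the lecture.

```python
import math
import random


def mc_integrate(f, a, b, n, rng=random):
    """Estimate the integral of f on [a, b] as (b - a)/n * sum of f(x_i)."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)  # one uniformly distributed sample point
        total += f(x)
    return (b - a) * total / n


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is reproducible
    print(mc_integrate(math.exp, 0.0, 1.0, 10_000, rng), math.e - 1)
```

Passing a seeded `random.Random` instance rather than relying on the global generator keeps repeated runs comparable.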

What MC integration really does

While the previous explanation is a reasonable interpretation of the way MC integration works, the most popular explanation is below

[Figure: f(x) on [a,b]; the height of each bar is given by a random sample of f(x), and the average height is marked.]

Mathematical Applications

Let’s formalize this just a little bit…

Since by the mean value theorem

$$\int_a^b f(x)\,dx = (b-a)\langle f \rangle$$

we can approximate the integral by calculating $(b-a)\langle f \rangle$, and we can calculate $\langle f \rangle$ by averaging many values of f(x)

$$\langle f \rangle_N = \frac{1}{N} \sum_{i=1}^{N} f(x_i)$$

where $x_i \in [a,b]$ and the values are chosen randomly

Example

Consider evaluating

$$I = \int_0^1 e^x\,dx = e - 1 = 1.718281828\ldots$$

Let’s take N=1000, then evaluate $f(x)=e^x$ with $x \in [0,1]$ at 1000 random points

For this set of points define $I_1 = (b-a)\langle f \rangle_{N,1} = \langle f \rangle_{N,1}$, since b-a=1

Next choose 1000 different $x \in [0,1]$ and create a new estimate $I_2 = \langle f \rangle_{N,2}$

Next choose another 1000 different $x \in [0,1]$ and create a new estimate $I_3 = \langle f \rangle_{N,3}$
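The three estimates described here can be generated as follows. This is a sketch under stated assumptions: the helper names `mean_of_f` and `three_estimates` and the seed are hypothetical, and Python's built-in generator stands in for whatever generator the course uses.

```python
import math
import random


def mean_of_f(f, n, rng):
    """Return <f>_N, the average of f over n uniform samples on [0, 1]."""
    return sum(f(rng.random()) for _ in range(n)) / n


def three_estimates(seed=2024, n=1000):
    """Return [I_1, I_2, I_3], each built from a fresh set of n points."""
    rng = random.Random(seed)
    return [mean_of_f(math.exp, n, rng) for _ in range(3)]


if __name__ == "__main__":
    for k, estimate in enumerate(three_estimates(), start=1):
        print(f"I_{k} = {estimate:.4f}   (exact: {math.e - 1:.4f})")
```

Each estimate differs slightly from the others and from e-1, because each is built from a different set of random points.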

Distribution of the estimates

We can carry on doing this, say 10,000 times, at which point we’ll have 10,000 values estimating the integral, and the distribution of these values will be a normal distribution

The distribution of all of the $I_N$ integrals constrains the errors we would expect on a single $I_N$ estimate

This is the Central Limit Theorem: for any given $I_N$ estimate, the sum of the random variables within it will converge toward a normal distribution

Specifically, the standard deviation of this distribution, $\sigma_N$, will be the estimate of the error in a single $I_N$ estimate

The mean, $x_0$, will approach e-1

[Figure: Gaussian $e^{-(x-x_0)^2/(2\sigma_N^2)}$ centred on $x_0$, with $x_0 \pm \sigma_N$ marked on the horizontal axis and heights 1 and 1/e marked on the vertical axis.]
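A quick numerical check of this picture: the sketch below builds many independent estimates and reports their mean and standard deviation. The function name `estimate_distribution`, the seed, and the default of 2,000 estimates (rather than 10,000, to keep the run short) are assumptions made for illustration.

```python
import math
import random
import statistics


def estimate_distribution(n_samples=1000, n_estimates=2000, seed=7):
    """Return (mean, standard deviation) of many MC estimates of the
    integral of e^x on [0, 1], each built from n_samples random points."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_estimates):
        total = 0.0
        for _ in range(n_samples):
            total += math.exp(rng.random())
        estimates.append(total / n_samples)  # b - a = 1, so I = <f>_N
    return statistics.mean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    mean, spread = estimate_distribution()
    print(f"mean of estimates   = {mean:.5f}  (e - 1 = {math.e - 1:.5f})")
    print(f"spread of estimates = {spread:.5f}")
```

The mean should sit close to e-1, and the spread is the error one would attach to any single estimate.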

Calculating $\sigma_N$

The formula for the standard deviation of N samples is

$$(N-1)\,\sigma_N^2 = \langle f^2 \rangle - \langle f \rangle^2 = \frac{1}{N}\sum_{i=1}^{N} f(x_i)^2 - \left(\frac{1}{N}\sum_{i=1}^{N} f(x_i)\right)^2$$

If there is no deviation in the data then the RHS is zero

Given some deviation, as N→∞ the RHS will settle to some constant value > 0 (in this case ~0.2420359…)

Thus we can write

$$\sigma_N \sim \frac{1}{\sqrt{N-1}} \sim \frac{1}{\sqrt{N}}$$
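The error estimate can be computed alongside the integral from a single set of samples. The sketch below is one way to do it; the function name `mc_with_error` is hypothetical. For $e^x$ on [0,1] the quantity $\langle f^2 \rangle - \langle f \rangle^2$ approaches about 0.242, whose square root is about 0.492, so with N=10,000 the reported error should be near 0.0049.

```python
import math
import random


def mc_with_error(f, a, b, n, rng):
    """Return (estimate, sigma_N) for the integral of f on [a, b],
    using (N - 1) * sigma_N^2 = <f^2> - <f>^2 from the same n samples."""
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        fx = f(rng.uniform(a, b))
        total += fx
        total_sq += fx * fx
    mean = total / n
    variance = total_sq / n - mean * mean  # <f^2> - <f>^2
    # Guard against a tiny negative value caused by floating-point rounding.
    sigma_n = math.sqrt(max(variance, 0.0) / (n - 1))
    return (b - a) * mean, (b - a) * sigma_n


if __name__ == "__main__":
    estimate, error = mc_with_error(math.exp, 0.0, 1.0, 10_000, random.Random(5))
    print(f"I = {estimate:.4f} +/- {error:.4f}")
```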

A rough measure of how good a random number generator is: how well does a histogram of the 10,000 estimates fit a Gaussian?

[Figure (mc.ps): histogram of the integral estimates. 1000 samples per I integration; standard deviation is 0.491/√1000. Increasing the number of integral estimates makes the distribution closer and closer to the infinite limit.]

Resulting statistics

For data that fits a Gaussian, the theory of probability distribution functions asserts that

68.3% of the data ($\langle f \rangle_N$) will fall within $\pm\sigma_N$ of the mean

95.4% of the data (19/20) will fall within $\pm 2\sigma_N$ of the mean

99.7% of the data will fall within $\pm 3\sigma_N$

etc…

Interpretation of poll data:

“These results will be accurate to ±4%, (19 times out of 20)”

The ±4% corresponds to $\pm 2\sigma$, with $\sigma \propto 1/\sqrt{N}$

This highlights one of the difficulties with random sampling: to improve the result by a factor of 2 we must increase N by a factor of 4!
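The factor-of-4 rule can be checked numerically by comparing the spread of estimates made with N points against the spread made with 4N points. This is a rough sketch; the function names, seed, and sample counts are illustrative assumptions, and the measured ratio will only be approximately 2 because the spreads are themselves estimated from a finite number of runs.

```python
import math
import random
import statistics


def spread_of_estimates(n_samples, n_estimates, rng):
    """Standard deviation of repeated MC estimates of the integral of e^x on [0, 1]."""
    estimates = [
        sum(math.exp(rng.random()) for _ in range(n_samples)) / n_samples
        for _ in range(n_estimates)
    ]
    return statistics.stdev(estimates)


def error_ratio(n=100, n_estimates=400, seed=11):
    """Ratio of the spread with n samples to the spread with 4n samples."""
    rng = random.Random(seed)
    return (spread_of_estimates(n, n_estimates, rng)
            / spread_of_estimates(4 * n, n_estimates, rng))


if __name__ == "__main__":
    print(f"sigma(N) / sigma(4N) = {error_ratio():.2f}  (expect roughly 2)")
```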

Why would we use this method to evaluate integrals?

For 1D it doesn’t make a lot of sense

Taking $h \sim 1/N$, the composite trapezoid rule has error $\sim h^2 \sim 1/N^2 = N^{-2}$

Double N, get result 4 times better

In 2D, we can use an extension of the trapezoid rule to use squares

Taking $h \sim 1/N^{1/2}$, then error $\sim h^2 \sim N^{-1}$

In 3D we get $h \sim 1/N^{1/3}$, then error $\sim h^2 \sim N^{-2/3}$

In 4D we get $h \sim 1/N^{1/4}$, then error $\sim h^2 \sim N^{-1/2}$

MC beneficial for more than 4 dimensions

Monte Carlo methods always have $\sigma_N \sim N^{-1/2}$ regardless of the dimension

Comparing to the 4D convergence behaviour we see that MC integration becomes practical at this point

It wouldn’t make any sense for 3D though

For anything higher than 4D (e.g. 6D, 9D, which are possible!) MC methods tend to be the only way of doing these calculations
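The dimension-by-dimension estimates follow a single pattern, which can be written out as a short worked derivation. The symbol d for the number of dimensions is introduced here for illustration; N is the total number of points, as elsewhere in the lecture.

```latex
h \sim N^{-1/d}
\quad\Longrightarrow\quad
\text{trapezoid error} \sim h^{2} \sim N^{-2/d},
\qquad
\text{MC error} \sim N^{-1/2}

N^{-2/d} = N^{-1/2} \iff d = 4
```

For d < 4 the grid-based rule converges faster, at d = 4 the two rates match, and for d > 4 (for example d = 6, where the trapezoid error goes as $N^{-1/3}$) the MC rate is the faster of the two.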

MC methods also have the useful property of being comparatively immune to singularities, provided that

The random generator doesn’t hit the singularity

The integral does indeed exist!

Importance sampling

In reality many integrals have functions that vary rapidly in one part of the number line and more slowly in others

To capture this behaviour with MC methods requires that we introduce some way of “putting our points where we need them the most”

We really want to introduce a new function into the problem, one that allows us to put the samples in the right places

General outline

Suppose we have two similar functions g(x) & f(x), and g(x) is easy to integrate, then

$$I = \int_a^b f(x)\,dx = \int_a^b \frac{f(x)}{g(x)}\,g(x)\,dx$$

Change variables: let $dy = g(x)\,dx$, then

$$y(x) = \int^x g(x')\,dx'$$

When x = a, y = y(a); when x = b, y = y(b), so

$$I = \int_{y(a)}^{y(b)} \frac{f(x(y))}{g(x(y))}\,dy$$

General outline cont

The integral we have derived

$$I = \int_{y(a)}^{y(b)} \frac{f(x(y))}{g(x(y))}\,dy$$

has some nice properties:

Because g(x)~f(x) (i.e. g(x) is a reasonable approximation of f(x) that is easy to integrate) then the integrand should be approximately 1 and the integrand shouldn’t vary much!

It should be possible to calculate a good approximation with a fairly small number of samples

Thus by applying the change of variables and mapping our sample points we get a better answer with fewer samples
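The transformed integral can be estimated with a small generic routine. The sketch below assumes the caller supplies g, the inverse mapping x(y), and the end points y(a) and y(b) analytically; the function name `importance_sample` and its parameter names are hypothetical.

```python
import math
import random


def importance_sample(f, g, x_of_y, y_a, y_b, n, rng):
    """Estimate I = integral from y(a) to y(b) of f(x(y)) / g(x(y)) dy
    using n uniform samples in y."""
    total = 0.0
    for _ in range(n):
        y = rng.uniform(y_a, y_b)  # sample uniformly in the new variable
        x = x_of_y(y)              # map back to the original variable
        total += f(x) / g(x)       # close to 1 when g approximates f well
    return (y_b - y_a) * total / n


if __name__ == "__main__":
    # Limiting case f = g = 1 + 2x on [0, 1]: y(x) = x + x^2, so y(1) = 2
    # and x(y) = (-1 + sqrt(1 + 4y)) / 2. The ratio f/g is exactly 1.
    f = lambda x: 1.0 + 2.0 * x
    x_of_y = lambda y: (-1.0 + math.sqrt(1.0 + 4.0 * y)) / 2.0
    print(importance_sample(f, f, x_of_y, 0.0, 2.0, 10, random.Random(0)))
```

In the limiting case f = g every sample contributes exactly 1, so the routine returns y(b) - y(a) = 2, the exact integral, with no scatter at all. The closer g is to f, the closer one gets to this ideal.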

Example

Let’s look at integrating $f(x)=e^x$ again on [0,1]

MC random samples are 0.23, 0.69, 0.51, 0.93

Our integral estimate is then

$$I \approx (1-0)\,\frac{1}{4}\left(e^{0.23} + e^{0.69} + e^{0.51} + e^{0.93}\right) = 1.8630$$

Difference to answer = 1.8630 - 1.7183 = 0.1447

Apply importance sampling

We first need to decide on our g(x) function; as a guess let us take g(x)=1+x

Well, it isn’t really a guess – we know this is the first two terms of the Taylor expansion of $e^x$!

y(x) is thus given by

$$y(x) = \int_0^x (1+x')\,dx' = x + \frac{x^2}{2}$$

For end points we get y(0)=0, y(1)=3/2

Rearrange y(x) to give x(y):

$$x(y) = -1 + \sqrt{1+2y}$$

Set up integral & evaluate samples

The integral to evaluate is now

$$I = \int_0^{3/2} \frac{e^{x}}{1+x}\,dy = \int_0^{3/2} \frac{e^{-1+\sqrt{1+2y}}}{\sqrt{1+2y}}\,dy$$

We must now choose y’s on the interval [0,3/2]

y        $e^{-1+\sqrt{1+2y}}/\sqrt{1+2y}$
0.345    1.038
1.035    1.211
0.765    1.135
1.395    1.324

The integrand values are close to 1 because g(x)~f(x)

Evaluate

For the new integral we have

$$I \approx (3/2-0)\,\frac{1}{4}\,(1.038 + 1.211 + 1.135 + 1.324) = 1.7655$$

Difference to answer = 1.7655 - 1.7183 = 0.0472

So 3 times better than the previous estimate!

Clearly this technique of ensuring the integrand doesn’t vary too much is extremely powerful

Importance sampling is particularly important in multidimensional integrals and can add 1 or 2 significant figures of accuracy for a minimal amount of effort
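The two four-sample estimates can be reproduced directly. The sketch below uses the samples 0.23, 0.69, 0.51, 0.93 from the example and observes that the tabulated y values are 3/2 times those samples (0.345 = 1.5 × 0.23, and so on); the function names are hypothetical.

```python
import math

SAMPLES = [0.23, 0.69, 0.51, 0.93]  # the four random samples used in the example


def plain_estimate(samples):
    """Direct MC estimate of the integral of e^x on [0, 1]."""
    return sum(math.exp(x) for x in samples) / len(samples)


def importance_estimate(samples):
    """Importance-sampled estimate with g(x) = 1 + x, so y runs over [0, 3/2]."""
    y_max = 1.5
    total = 0.0
    for u in samples:
        y = y_max * u                     # scale the sample onto [0, 3/2]
        root = math.sqrt(1.0 + 2.0 * y)   # 1 + x(y), since x(y) = -1 + sqrt(1 + 2y)
        total += math.exp(root - 1.0) / root
    return y_max * total / len(samples)


if __name__ == "__main__":
    exact = math.e - 1
    for name, value in [("plain", plain_estimate(SAMPLES)),
                        ("importance", importance_estimate(SAMPLES))]:
        print(f"{name:>10}: {value:.4f}  (error {abs(value - exact):.4f})")
```

To four decimal places these give 1.8630 and 1.7655, matching the values worked out by hand.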

Rejection technique

Thus far we’ve looked in detail at the effect of changing sample points on the overall estimate of the integral

An alternative approach may be necessary when you cannot easily sample the desired region, which we’ll call W

Particularly important in multi-dimensional integrals when you can calculate the integral for a simple boundary but not a complex one

We define a larger region V that includes W

Note you must also be able to calculate the size of V easily

The sample function is then redefined to be zero outside W, but to have its normal value within it

Rejection technique diagram

[Figure: region W, the region we want to calculate, lies under the curve f(x) and is enclosed by the larger region V.]

Area of W = area (integral) of region V multiplied by the fraction of points falling below f(x) within V

Algorithm: just count the total number of points calculated & the number in W!
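The counting algorithm can be sketched for the simplest case, where W is the area under a non-negative curve f(x) on [a,b] and V is a bounding rectangle. The function name `rejection_area` and the parameter `height` (an upper bound on f, which the caller must supply) are assumptions for illustration.

```python
import math
import random


def rejection_area(f, a, b, height, n, rng):
    """Estimate the area under f on [a, b] by counting random points in the
    rectangle V = [a, b] x [0, height] that fall below the curve.
    Assumes 0 <= f(x) <= height everywhere on [a, b]."""
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, height)
        if y < f(x):  # the point lies inside W
            hits += 1
    area_v = (b - a) * height
    return area_v * hits / n


if __name__ == "__main__":
    # e^x on [0, 1] is bounded above by e, so V has area e.
    estimate = rejection_area(math.exp, 0.0, 1.0, math.e, 20_000, random.Random(3))
    print(f"{estimate:.4f}  (exact: {math.e - 1:.4f})")
```

A tighter bounding region wastes fewer points, since every point that lands outside W contributes nothing beyond the count.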

Better selection of points: sub-random sequences

Choosing N points using a uniform deviate produces an error that converges as $N^{-0.5}$

If we could choose points “better” we could make convergence faster

For example, using a Cartesian grid of points leads to a method that converges as $N^{-1}$

Think of Cartesian points as “avoiding” one another and thus sampling a given region more completely

However, we don’t know a priori how fine the grid should be

We want to avoid short range correlations – particles shouldn’t be too close to one another

A better solution is to choose points that attempt to “maximally avoid” one another

A list of sub-random sequences

Many to choose from!

Tore-SQRT sequences

Van der Corput & Halton sequences

Faure sequence

Generalized Faure sequence

Nets & (t,s)-sequences

Sobol sequence

Niederreiter sequence

We’ll look very briefly at Halton & Sobol sequences, both of which are covered in detail in Numerical Recipes

Halton’s sequence

Suppose in 1D we obtain the jth number in the sequence, denoted $H_j$, via

(1) Write j as a number in base b, where b is prime

e.g. 17 in base 3 is 122

(2) Reverse the digits and place a radix point in front

e.g. 0.221 base 3

It should be clear why this works: adding an additional digit makes the “mesh” of numbers progressively finer

For a sequence of points in n dimensions $(x_i^1, \ldots, x_i^n)$ we would typically use the first n primes to generate separate sequences for each of the $x_i^j$ components
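The two steps above translate into a short digit-reversal loop. This is a minimal sketch; the function name `halton` is hypothetical, and the routine does not check that the base is prime.

```python
def halton(j, base):
    """Return H_j: write j in the given base, reverse the digits,
    and place a radix point in front."""
    result = 0.0
    scale = 1.0 / base
    while j > 0:
        j, digit = divmod(j, base)  # peel off the least significant digit
        result += digit * scale     # it becomes the next digit after the point
        scale /= base
    return result


if __name__ == "__main__":
    # 17 in base 3 is 122, which reverses to 0.221 (base 3) = 2/3 + 2/9 + 1/27.
    print(halton(17, 3), 25 / 27)
    print([halton(j, 2) for j in range(1, 8)])
```

In base 2 the sequence begins 1/2, 1/4, 3/4, 1/8, …: each new digit fills in the gaps left by the previous ones, which is the progressively finer mesh described above.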

2D Halton’s sequence example

[Figure: pairs of points constructed from base 3 & 5 Halton sequences]
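Pairs like these can be generated by running one Halton sequence per coordinate. The sketch below builds base 3 & 5 pairs and, as an illustrative use that is not part of the lecture, averages $e^{x+y}$ over them to estimate its integral on the unit square, whose exact value is $(e-1)^2$. All function names are hypothetical.

```python
import math


def radical_inverse(j, base):
    """Digit-reversed value of j in the given base (one Halton coordinate)."""
    result, scale = 0.0, 1.0 / base
    while j > 0:
        j, digit = divmod(j, base)
        result += digit * scale
        scale /= base
    return result


def halton_2d(n, base_x=3, base_y=5):
    """First n points (j = 1..n) of the 2D Halton sequence for the given bases."""
    return [(radical_inverse(j, base_x), radical_inverse(j, base_y))
            for j in range(1, n + 1)]


def integrate_2d(f, points):
    """Average f over the points; on the unit square this estimates the integral."""
    return sum(f(x, y) for x, y in points) / len(points)


if __name__ == "__main__":
    points = halton_2d(2000)
    estimate = integrate_2d(lambda x, y: math.exp(x + y), points)
    print(f"{estimate:.4f}  (exact: {(math.e - 1) ** 2:.4f})")
```

Using a different prime for each coordinate keeps the two sequences from falling into step with each other.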

Sobol (1967) sequence

Useful method described in Numerical Recipes as providing close to $N^{-1}$ convergence rate

Algorithms are also available at www.netlib.org

[Figure: from Numerical Recipes]

Summary

MC methods are a useful way of numerically integrating systems that are not tractable by other methods

The key part of MC methods is the $N^{-0.5}$ convergence rate

Numerical integration techniques can be greatly improved using importance sampling

If you cannot write down a function easily then the rejection technique can often be employed

Next Lecture

More on MC methods – simulating random walks
