Computational Methods in Physics
PHYS 3437
Dr Rob Thacker
Dept of Astronomy & Physics (MM-301C)
[email protected]@ap.smu.ca
Today’s Lecture
Introduction to Monte Carlo methods
  Background
  Integration techniques
Introduction
“Monte Carlo” refers to the use of random numbers to model random events, which may in turn model a mathematical or physical problem
Typically, MC methods require many millions of random numbers
  Of course, computers cannot actually generate truly random numbers
  However, we can make the period of repetition absolutely enormous
  Such pseudo-random number generators are based on truncating the significant digits of numbers
  See Numerical Recipes, p 266-280 (2nd edition FORTRAN)
“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”
John von Neumann
History of numerical Monte Carlo methods
Another contribution to numerical methods related to research at Los Alamos
Late 1940s: scientists want to follow paths of neutrons following various sub-atomic collision events
Ulam & von Neumann suggest using random sampling to estimate this process
  100 events can be calculated in 5 hours on ENIAC
The method is given the name “Monte Carlo” by Nicholas Metropolis
Explosion of inappropriate use in the 1950s gave the technique a bad name
  Subsequent research illuminated when the method was appropriate
Terminology
Random deviate: numbers chosen uniformly on [0,1]
Normal deviate: numbers chosen randomly on (-∞,∞), weighted by a Gaussian
Background to MC integration
Suppose we have a definite integral
$$I = \int_a^b f(x)\,dx$$
Given a “good” set of N sample points $\{x_i\}$ we can estimate the integral as
$$I = \int_a^b f(x)\,dx \approx \frac{b-a}{N}\sum_{i=1}^{N} f(x_i)$$
[Figure: f(x) plotted between a and b with sample points marked (e.g. $x_3$, $x_9$). Each sample point yields an element of the integral of width (b-a)/N and height $f(x_i)$.]
What MC integration really does
While the previous explanation is a reasonable interpretation of the way MC integration works, the most popular explanation is below
[Figure: f(x) between a and b, with heights given by random samples of f(x) and their average marked.]
Mathematical Applications
Let’s formalize this just a little bit…
Since by the mean value theorem
$$\int_a^b f(x)\,dx = (b-a)\langle f\rangle$$
we can approximate the integral by calculating $(b-a)\langle f\rangle$, and we can calculate $\langle f\rangle$ by averaging many values of f(x)
$$\langle f\rangle_N = \frac{1}{N}\sum_{i=1}^{N} f(x_i)$$
where $x_i \in [a,b]$ and the values are chosen randomly
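The averaging formula above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture: the function name `mc_integrate`, the seed, and the sample count are assumptions, and Python's standard `random` module stands in for whatever generator the course uses.

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] as (b - a) * <f>_N."""
    total = 0.0
    for _ in range(n):
        x = a + (b - a) * rng.random()  # uniform deviate mapped onto [a, b)
        total += f(x)
    return (b - a) * total / n

rng = random.Random(12345)  # fixed seed (an arbitrary choice) so the run is repeatable
estimate = mc_integrate(math.exp, 0.0, 1.0, 100_000, rng)
print(estimate, math.e - 1.0)  # MC estimate next to the exact value e - 1
```

Passing the generator in explicitly keeps the function deterministic for a given seed, which is convenient when comparing runs.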
Example
Consider evaluating
$$I = \int_0^1 e^x\,dx = e - 1 = 1.718281828\ldots$$
Let’s take N=1000, then evaluate $f(x)=e^x$ with $x \in [0,1]$ at 1000 random points
For this set of points define $I_1 = (b-a)\langle f\rangle_{N,1} = \langle f\rangle_{N,1}$ since b-a=1
Next choose 1000 different $x \in [0,1]$ and create a new estimate $I_2 = \langle f\rangle_{N,2}$
Next choose another 1000 different $x \in [0,1]$ and create a new estimate $I_3 = \langle f\rangle_{N,3}$
Distribution of the estimates
We can carry on doing this, say 10,000 times, at which point we’ll have 10,000 values estimating the integral, and the distribution of these values will be approximately a normal distribution
The distribution of all of the $I_N$ integrals constrains the errors we would expect on a single $I_N$ estimate
  This is the Central Limit Theorem: for any given $I_N$ estimate, the sum of the random variables within it will converge toward a normal distribution
  Specifically, the standard deviation will be the estimate of the error in a single $I_N$ estimate
  The mean, $x_0$, will approach e-1
[Figure: Gaussian $e^{-(x-x_0)^2/2\sigma_N^2}$ centred on $x_0$, with $x_0-\sigma_N$ and $x_0+\sigma_N$ marked on the horizontal axis and 1 and 1/e marked on the vertical axis.]
Calculating $\sigma_N$
The formula for the standard deviation of N samples is
$$(N-1)\,\sigma_N^2 = \langle f^2\rangle - \langle f\rangle^2 = \frac{1}{N}\sum_{i=1}^{N} f(x_i)^2 - \left(\frac{1}{N}\sum_{i=1}^{N} f(x_i)\right)^2$$
If there is no deviation in the data then the RHS is zero
Given some deviation, as N→∞ the RHS will settle to some constant value > 0 (in this case ~0.2420359…)
Thus we can write
$$\sigma_N \sim \frac{1}{\sqrt{N-1}} \sim \frac{1}{\sqrt{N}}$$
A rough measure of how good a random number generator is, is how well a histogram of the 10,000 estimates fits a Gaussian.
[Figure: mc.ps plot, histogram of the integral estimates. 1000 samples per integration; standard deviation is 0.491/√1000. Increasing the number of integral estimates makes the distribution closer and closer to the infinite limit.]
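The repeated-estimate experiment can be sketched as follows. This is an illustrative script, not the one used to make the plot: it uses 2,000 estimates rather than 10,000 to keep the run short, the names and the seed are assumptions, and the predicted spread uses the large-N form $\sqrt{(\langle f^2\rangle - \langle f\rangle^2)/N}$ with the exact moments of $e^x$ on [0,1].

```python
import math
import random
import statistics

def estimate_once(n, rng):
    """One estimate <f>_N of the integral of e^x on [0, 1]."""
    return sum(math.exp(rng.random()) for _ in range(n)) / n

rng = random.Random(2024)   # arbitrary fixed seed
n_samples = 1000            # samples per estimate, as in the example
n_estimates = 2000          # assumption: fewer than the 10,000 quoted, for speed

estimates = [estimate_once(n_samples, rng) for _ in range(n_estimates)]
mean = statistics.fmean(estimates)
spread = statistics.stdev(estimates)

# Exact variance of e^x on [0, 1]: (e^2 - 1)/2 - (e - 1)^2 = 0.2420359...
variance_f = (math.e ** 2 - 1.0) / 2.0 - (math.e - 1.0) ** 2
predicted = math.sqrt(variance_f / n_samples)

# Fraction of estimates within one standard deviation of the mean
within_one_sigma = sum(abs(e - mean) <= spread for e in estimates) / n_estimates

print(mean, math.e - 1.0)    # mean of the estimates against e - 1
print(spread, predicted)     # measured spread against the ~0.49/sqrt(N) prediction
print(within_one_sigma)      # should be near 0.683 if the histogram is Gaussian
```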
Resulting statistics
For data that fits a Gaussian, the theory of probability distribution functions asserts that
  68.3% of the data ($\langle f\rangle_N$) will fall within $\pm\sigma_N$ of the mean
  95.4% of the data (19/20) will fall within $\pm 2\sigma_N$ of the mean
  99.7% of the data will fall within $\pm 3\sigma_N$
  etc…
Interpretation of poll data:
  “These results will be accurate to ±4%, (19 times out of 20)”
  The ±4% corresponds to $\pm 2\sigma$
  $\sigma \propto 1/\sqrt{N}$
This highlights one of the difficulties with random sampling: to improve the result by a factor of 2 we must increase N by a factor of 4!
Why would we use this method to evaluate integrals?
For 1D it doesn’t make a lot of sense
  Taking h~1/N, the composite trapezoid rule error ~ $h^2$ ~ $1/N^2$ = $N^{-2}$
  Double N, get result 4 times better
In 2D, we can use an extension of the trapezoid rule to use squares
  Taking h ~ $1/N^{1/2}$ then error ~ $h^2$ ~ $N^{-1}$
In 3D we get h ~ $1/N^{1/3}$ then error ~ $h^2$ ~ $N^{-2/3}$
In 4D we get h ~ $1/N^{1/4}$ then error ~ $h^2$ ~ $N^{-1/2}$
MC beneficial for more than 4 dimensions
Monte Carlo methods always have $\sigma_N \sim N^{-1/2}$ regardless of the dimension
  Comparing to the 4D convergence behaviour we see that MC integration becomes practical at this point
  It wouldn’t make any sense for 3D though
For anything higher than 4D (e.g. 6D, 9D, which are possible!) MC methods tend to be the only way of doing these calculations
MC methods also have the useful property of being comparatively immune to singularities, provided that
  The random generator doesn’t hit the singularity
  The integral does indeed exist!
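As a rough illustration of the higher-dimensional case, the sketch below integrates $e^{x_1+\dots+x_6}$ over the 6D unit hypercube, where the exact answer is $(e-1)^6$. The integrand, the function names, the seed, and the sample count are all assumptions chosen for the example. For comparison, a grid with only 10 points per axis would already need $10^6$ function evaluations in 6D.

```python
import math
import random

def mc_integrate_unit_cube(f, dim, n, rng):
    """MC estimate of the integral of f over the unit hypercube [0, 1]^dim."""
    total = 0.0
    for _ in range(n):
        point = [rng.random() for _ in range(dim)]  # one uniform deviate per axis
        total += f(point)
    return total / n  # the hypercube has volume 1, so the estimate is just <f>_N

def integrand(point):
    """Hypothetical test integrand exp(x_1 + ... + x_d), separable so it has a known answer."""
    return math.exp(sum(point))

rng = random.Random(7)  # arbitrary fixed seed
dim = 6
exact = (math.e - 1.0) ** dim
estimate_6d = mc_integrate_unit_cube(integrand, dim, 200_000, rng)
print(estimate_6d, exact)
```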
Importance sampling
In reality many integrals have functions that vary rapidly in one part of the number line and more slowly in others
To capture this behaviour with MC methods requires that we introduce some way of “putting our points where we need them the most”
We really want to introduce a new function into the problem, one that allows us to put the samples in the right places
General outline
Suppose we have two similar functions g(x) & f(x), and g(x) is easy to integrate, then
$$I = \int_a^b f(x)\,dx = \int_a^b \frac{f(x)}{g(x)}\,g(x)\,dx$$
Change variables: let dy = g(x)dx, then
$$y(x) = \int^x g(x')\,dx'$$
When x=a, y=y(a); when x=b, y=y(b), so
$$I = \int_{y(a)}^{y(b)} \frac{f(x(y))}{g(x(y))}\,dy$$
General outline cont
The integral we have derived
$$I = \int_{y(a)}^{y(b)} \frac{f(x(y))}{g(x(y))}\,dy$$
has some nice properties:
  Because g(x)~f(x) (i.e. g(x) is a reasonable approximation of f(x) that is easy to integrate) the integrand should be approximately 1, and the integrand shouldn’t vary much!
  It should be possible to calculate a good approximation with a fairly small number of samples
Thus by applying the change of variables and mapping our sample points we get a better answer with fewer samples
Example
Let’s look at integrating $f(x)=e^x$ again on [0,1]
MC random samples are 0.23, 0.69, 0.51, 0.93
Our integral estimate is then
$$I \approx (1-0)\times\frac{1}{4}\left(e^{0.23}+e^{0.69}+e^{0.51}+e^{0.93}\right) = 1.8630$$
Difference to answer = 1.8630 - 1.7183 = 0.1447
Apply importance sampling
We first need to decide on our g(x) function; as a guess let us take g(x)=1+x
  Well, it isn’t really a guess: we know this is the first two terms of the Taylor expansion of $e^x$!
y(x) is thus given by
$$y(x) = \int_0^x (1+x')\,dx' = x + \frac{x^2}{2}$$
For end points we get y(0)=0, y(1)=3/2
Rearrange y(x) to give x(y):
$$x(y) = -1 + \sqrt{1+2y}$$
Set up integral & evaluate samples
The integral to evaluate is now
$$I = \int_0^{3/2} \frac{e^{x(y)}}{1+x(y)}\,dy = \int_0^{3/2} \frac{e^{-1+\sqrt{1+2y}}}{\sqrt{1+2y}}\,dy$$
We must now choose y’s on the interval [0,3/2]

y        $e^{-1+\sqrt{1+2y}}/\sqrt{1+2y}$
0.345    1.038
1.035    1.211
0.765    1.135
1.395    1.324

The integrand values are close to 1 because g(x)~f(x)
Evaluate
For the new integral we have
$$I \approx (3/2-0)\times\frac{1}{4}\left(1.038+1.211+1.135+1.324\right) = 1.7655$$
Difference to answer = 1.7655 - 1.7183 = 0.0472
So 3 times better than the previous estimate!
Clearly this technique of ensuring the integrand doesn’t vary too much is extremely powerful
Importance sampling is particularly important in multidimensional integrals and can add 1 or 2 significant figures of accuracy for a minimal amount of effort
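The two four-sample estimates can be checked with a short script. It is a sketch using the sample values quoted in the example. The y values 0.345, 1.035, 0.765, 1.395 are 3/2 times the original uniform samples, which is how they are generated here. The variable and function names are illustrative.

```python
import math

uniform_samples = [0.23, 0.69, 0.51, 0.93]  # the four samples quoted in the example
exact = math.e - 1.0

# Plain MC estimate on [0, 1]: (b - a) * <f>_N with f(x) = e^x
plain = (1.0 - 0.0) * sum(math.exp(x) for x in uniform_samples) / len(uniform_samples)

def x_of_y(y):
    """Inverse of y(x) = x + x^2/2, i.e. x(y) = -1 + sqrt(1 + 2y)."""
    return -1.0 + math.sqrt(1.0 + 2.0 * y)

def weighted_integrand(y):
    """f(x(y)) / g(x(y)) with f(x) = e^x and g(x) = 1 + x."""
    x = x_of_y(y)
    return math.exp(x) / (1.0 + x)

y_max = 1.5  # y(1) = 3/2
y_samples = [y_max * u for u in uniform_samples]  # map the samples onto [0, 3/2]
weighted = y_max * sum(weighted_integrand(y) for y in y_samples) / len(y_samples)

print(round(plain, 4), round(abs(plain - exact), 4))        # plain estimate and its error
print(round(weighted, 4), round(abs(weighted - exact), 4))  # importance-sampled estimate and its error
```

With only four samples the factor-of-three improvement is specific to this set of points. Repeating the comparison with many random samples would give a fairer picture of the variance reduction.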
Rejection technique
Thus far we’ve looked in detail at the effect of changing sample points on the overall estimate of the integral
An alternative approach may be necessary when you cannot easily sample the desired region, which we’ll call W
  Particularly important in multi-dimensional integrals when you can calculate the integral for a simple boundary but not a complex one
We define a larger region V that includes W
  Note you must also be able to calculate the size of V easily
The sample function is then redefined to be zero outside the volume, but to have its normal value within it
Rejection technique diagram
[Figure: region W, the region we want to calculate, lying below f(x) and enclosed by a larger region V.]
Area of W = integral of region V multiplied by the fraction of points falling below f(x) within V
Algorithm: just count the total number of points calculated & the number in W!
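The counting algorithm can be sketched as below, assuming V is the rectangle [a, b] × [0, f_max] and W is the area under f(x). The choice of $f(x)=e^x$ on [0,1] with f_max = e, the function name, and the seed are assumptions made so the result can be compared with e - 1.

```python
import math
import random

def rejection_area(f, a, b, f_max, n, rng):
    """Area under f on [a, b], from the fraction of points in V that fall in W."""
    hits = 0
    for _ in range(n):
        x = a + (b - a) * rng.random()  # uniform point in the box V
        y = f_max * rng.random()
        if y < f(x):                    # the point lies below f(x), i.e. inside W
            hits += 1
    area_v = (b - a) * f_max            # size of V, which must be easy to compute
    return area_v * hits / n

rng = random.Random(99)  # arbitrary fixed seed
area_w = rejection_area(math.exp, 0.0, 1.0, math.e, 200_000, rng)
print(area_w, math.e - 1.0)
```

The tighter V wraps around W, the larger the fraction of accepted points and the smaller the statistical error for a given number of samples.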
Better selection of points: sub-random sequences
Choosing N points using a uniform deviate produces an error that converges as $N^{-0.5}$
If we could choose points “better” we could make convergence faster
  For example, using a Cartesian grid of points leads to a method that converges as $N^{-1}$
  Think of Cartesian points as “avoiding” one another and thus sampling a given region more completely
  However, we don’t know a priori how fine the grid should be
  We want to avoid short range correlations: particles shouldn’t be too close to one another
A better solution is to choose points that attempt to “maximally avoid” one another
A list of sub-random sequences
Many to choose from!
  Tore-SQRT sequences
  Van der Corput & Halton sequences
  Faure sequence
  Generalized Faure sequence
  Nets & (t,s)-sequences
  Sobol sequence
  Niederreiter sequence
We’ll look very briefly at Halton & Sobol sequences, both of which are covered in detail in Numerical Recipes
Halton’s sequence
Suppose in 1d we obtain the jth number in the sequence, denoted $H_j$, via
  (1) Write j as a number in base b, where b is prime
    e.g. 17 in base 3 is 122
  (2) Reverse the digits and place a radix point in front
    e.g. 0.221 base 3
It should be clear why this works: adding an additional digit makes the “mesh” of numbers progressively finer
For a sequence of points in n dimensions $(x_i^1,\ldots,x_i^n)$ we would typically use the first n primes to generate separate sequences for each of the $x_i^j$ components
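The two steps above translate directly into code. The following is a minimal sketch, with `halton` and `halton_points` as illustrative names. Peeling off the base-b digits of j from least to most significant and weighting them by successive negative powers of b is the same as reversing the digits behind the radix point.

```python
def halton(j, base):
    """Return H_j: the base-`base` digits of j reversed behind the radix point."""
    result = 0.0
    place = 1.0 / base                  # weight of the first digit after the radix point
    while j > 0:
        j, digit = divmod(j, base)      # strip the least significant digit of j
        result += digit * place
        place /= base
    return result

def halton_points(count, bases=(2, 3)):
    """First `count` points in len(bases) dimensions, one prime base per component."""
    return [tuple(halton(j, b) for b in bases) for j in range(1, count + 1)]

print(halton(17, 3))      # 0.221 in base 3 = 2/3 + 2/9 + 1/27 = 25/27
print(halton_points(4))   # first four 2D points using bases 2 and 3
```

The default bases (2, 3) follow the first-n-primes convention stated above. Passing `bases=(3, 5)` gives pairs built from base 3 and base 5 sequences instead.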
2d Halton’s sequence example
[Figure: pairs of points constructed from base 3 & 5 Halton sequences]
Sobol (1967) sequence
Useful method described in Numerical Recipes as providing close to $N^{-1}$ convergence rate
Algorithms are also available at www.netlib.org
[Figure: from Num. Recipes]
Summary
MC methods are a useful way of numerically integrating systems that are not tractable by other methods
The key part of MC methods is the $N^{-0.5}$ convergence rate
Numerical integration techniques can be greatly improved using importance sampling
If you cannot write down a function easily then the rejection technique can often be employed
Next Lecture
More on MC methods: simulating random walks