
Preliminary Monte-Carlo QMC

Introduction to Quantum Monte-Carlo

Francesco Sottile

Ecole Polytechnique and ETSF

ESNUM 9 June 2016


Outline

Preliminary (statistics) concepts

Monte-Carlo: means, samplings and Markov chains

Quantum Monte-Carlo: variational and diffusion MC


Two theorems

Law of large numbers

If you perform the same experiment a large number of times, the average of the results obtained should be close to the expected value, and will tend to become closer as more trials are performed.

mean or expected value: µ = ⟨x⟩ = Σⱼ pⱼ xⱼ = ∫ dx p(x) x

variance: σ² = ⟨(x − µ)²⟩ = Σⱼ pⱼ xⱼ² − µ² = ∫ dx p(x) x² − µ²

Sₙ = (x₁ + x₂ + … + xₙ)/n ⟶ µ

Central limit theorem

The mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed:

(Sₙ − µ) ⟶ N(0, σ²/n)

Large numbers + central limit

Sₙ ⟶ µ ± ξ/√n
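The two theorems combine into the 1/√n error scaling of a sample mean; a quick numerical check with uniform random variables (µ = 1/2, σ² = 1/12), as an illustrative example not taken from the slides:

```python
import random

def sample_mean(n, rng):
    """S_n for n uniform [0, 1) variables: mu = 1/2, sigma^2 = 1/12."""
    return sum(rng.random() for _ in range(n)) / n

rng = random.Random(0)
# Law of large numbers: S_n approaches mu.
# Central limit theorem: the residual error shrinks like sigma / sqrt(n).
err_small = abs(sample_mean(100, rng) - 0.5)     # typically ~ 0.03
err_large = abs(sample_mean(10_000, rng) - 0.5)  # typically ~ 0.003
```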


Pseudo-Random Number Generator (PRNG)

Two (+1) requirements for a good PRNG

• It has to be good: long period, good lattice structure, good sequences, etc.

• It has to be fast.

• (+1) It has to be reproducible.


Random Numbers

• Today's libraries give reliable uniform random numbers (∈ [0, 1]).

• By transformation from the uniform distribution, we are able to create random numbers distributed according to other (simple) functions, like the Gaussian.
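One standard such transformation is the textbook Box-Muller method (not detailed in the slides); a minimal sketch, with a fixed seed for the "+1" reproducibility requirement:

```python
import math, random

def box_muller(rng):
    """Map two uniform [0, 1) samples to two independent
    standard-normal samples (Box-Muller transform)."""
    u1, u2 = rng.random(), rng.random()
    r = math.sqrt(-2.0 * math.log(1.0 - u1))  # 1 - u1 avoids log(0)
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

rng = random.Random(42)  # fixed seed: the run is reproducible
samples = [box_muller(rng)[0] for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```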


Monte-Carlo sampling

[Figure: a quarter of the unit circle inscribed in the unit square O-A-B-C; points are thrown uniformly in the square, and the crosses falling inside the quarter circle are counted as hits.]

π/4 ≈ N_hit / N_tot
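The hit-or-miss estimate above can be sketched in a few lines (illustrative code, not from the slides):

```python
import random

def estimate_pi(n_tot, seed=0):
    """Throw n_tot points uniformly in the unit square and count
    the hits inside the quarter circle x^2 + y^2 <= 1."""
    rng = random.Random(seed)
    n_hit = sum(1 for _ in range(n_tot)
                if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * n_hit / n_tot

pi_estimate = estimate_pi(100_000)
```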

Monte-Carlo sampling

[Figure: f(x) = √(1 − x²) on [0, 1], evaluated at random points x₁, x₂, …]

π/4 ≈ ∫₀¹ √(1 − x²) dx ≈ (V/N) Σᵢᴺ √(1 − xᵢ²)

(here the volume of the integration domain is V = 1)
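The same π/4 integral, evaluated with the sample-mean formula above (an illustrative sketch, with V = 1):

```python
import math, random

def mc_integral(f, n, seed=0):
    """Sample-mean Monte-Carlo: I ~ (V/N) sum_i f(x_i), with V = 1
    and the x_i uniform on [0, 1]."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

quarter_pi = mc_integral(lambda x: math.sqrt(1.0 - x * x), 100_000)
```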

Barely relevant Monte-Carlo sampling

I = ∫ f(x) dx

advantages

• easy to implement

disadvantages

• converges only like O(1/√N), poorly compared to Simpson's method O(1/N⁴)

Barely relevant Monte-Carlo sampling

I = ∫ ··· ∫ f(x) dᵈx

advantages

• easy to implement

• still converges like O(1/√N), compared to Simpson's method O(1/N^(4/d))

disadvantages

• we hit a lot of empty (or barely relevant) space

Importance sampling

I = ∫₀¹ f(x) dx = ∫₀¹ [f(x)/p(x)] p(x) dx

I = ⟨f⟩ = ⟨f/p⟩ₚ

How to choose p(x)?

• Choose p(x) to minimize the variance:
σ = 0 ⟵ p(x) = f(x)/I (quite useless: it requires knowing I itself)

• In practice: p(x) close to f(x), but simple enough to be sampled.


Importance sampling

I = ∫₀¹ (eˣ − 1)/(e − 1) dx = 0.418

Sampling of f(x) = (eˣ − 1)/(e − 1) using different probability distributions gives different variances:

p₁(x) = 1             σ₁ = 0.3009540
p₂(x) = 2x            σ₂ = 0.0560286
p₃(x) = eˣ/(e − 1)    σ₃ = 0.1380024
p₄(x) = 3x²           σ₄ = 0.1838806

[Figure: f(x) and the four distributions p₁…p₄ plotted on [0, 1].]
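A sketch of the estimator ⟨f/p⟩ₚ for this integral with p₂(x) = 2x, sampled by inverse transform (x = √u); the σ values above are the slides', the code is illustrative:

```python
import math, random

def importance_estimate(n, seed=0):
    """Estimate I = int_0^1 (e^x - 1)/(e - 1) dx as the average of
    f(x)/p(x) with x drawn from p(x) = 2x (inverse transform x = sqrt(u))."""
    rng = random.Random(seed)
    f = lambda x: (math.exp(x) - 1.0) / (math.e - 1.0)
    total = 0.0
    for _ in range(n):
        x = math.sqrt(1.0 - rng.random())  # u in (0, 1] keeps x > 0
        total += f(x) / (2.0 * x)          # average of f/p under p
    return total / n

estimate = importance_estimate(100_000)  # exact value: 0.418...
```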

Importance sampling

✓ importance sampling is crucial in practice

✓ it relies on finding p(x)

✗ in many dimensions, complex p(x) are difficult to find and to sample

One solution: Markov chains


Markov Chains

Distribution functions p(x)

e^(−βH)/∫e^(−βH) ;  |ψ|²/∫|ψ|² ;  e^(−S(x))/∫e^(−S(x))

Two problems to overcome

• Hamiltonians, wavefunctions, and actions are complicated (d-dimensional) functions: there is no way to find an analytic primitive.

• They are normalized by an integral that has intrinsically the same difficulty as the main integral.

Markov Chains

Markov chain sequence

x₁ →(P) x₂ →(P) x₃ ··· →(P) xₙ

x₁, x₂, … random but not independent

Markov chain operator P(x → y)

It is possible to demonstrate that, no matter how complicated p(x) is:

• P(x → y) generates a sequence that, in the end, is distributed according to p(x)

• we don't need to know P(x → y) explicitly

• we don't need to know p(x), but just a function proportional to p(x).


Markov Chains

Simple example: a two-level system

Population of cityA and population of cityB. Every year, 40% of the people of cityA move to cityB, and 30% the contrary. Initially the populations are A and B, for cityA and cityB. So the second year will be

(A′)   (0.6A + 0.3B)   (0.6  0.3) (A)
(B′) = (0.4A + 0.7B) = (0.4  0.7) (B)

P = (0.6 0.3; 0.4 0.7) is the stochastic matrix

Markov Chains

Finding the converged stable distribution (of people)

• Iterate the Markov process, applying P to (A, B) to produce (A′, B′), then (A″, B″), etc.

• At convergence, PX = X: the converged distribution is the eigenvector of the stochastic matrix associated with the unit eigenvalue.

This is the case in which we know the stochastic matrix and we find the final distribution function.

But we want the opposite: to generate sequences according to a known final distribution function, without knowing P.
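The two-city example, iterated numerically (a sketch): repeated application of P drives any initial split toward the eigenvalue-1 eigenvector of P, which for this matrix is (3/7, 4/7).

```python
def iterate_markov(a, b, n_years):
    """Apply the stochastic matrix P = [[0.6, 0.3], [0.4, 0.7]]
    to the population vector (A, B) n_years times."""
    for _ in range(n_years):
        a, b = 0.6 * a + 0.3 * b, 0.4 * a + 0.7 * b
    return a, b

# Any starting split of a unit total converges to the stationary
# distribution (3/7, 4/7), the solution of PX = X.
a, b = iterate_markov(1.0, 0.0, 50)
```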


Markov Chains

Detailed balance principle, or microreversibility

It can be demonstrated that a stochastic matrix P converges to the distribution function p(x) if

p(x) P(x → y) = p(y) P(y → x)

Missing: how to construct P in order to generate this sequence.
Missing: here it seems we still have to know p(x).


Metropolis method

Method to generate a microreversible P(x → y)

• We are at x.

• We propose a trial move x_T according to a symmetric probability distribution F(x → x_T) = F(x_T → x).

• We accept the trial move x_T (and so put y = x_T) with probability min(1, p(x_T)/p(x)).

We don't need the exact p(x), but just a function αp(x) proportional to it.


Metropolis method

In practice

• F(x → x_T) is a Gaussian centered on x, with σ dynamically adjusted.

• How do we accept a trial move with probability p(x_T)/p(x)? Draw a random ξ ∈ [0, 1] and accept if p(x_T)/p(x) > ξ.
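The recipe above in code, for a toy one-dimensional target (an illustrative sketch; note that only a function proportional to p(x) is needed):

```python
import math, random

def metropolis(log_p, x0, sigma, n_steps, seed=0):
    """Metropolis chain: Gaussian trial moves centered on x,
    accepted when p(x_T)/p(x) exceeds a uniform xi in [0, 1]."""
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_steps):
        xt = rng.gauss(x, sigma)               # symmetric F(x -> x_T)
        if math.exp(log_p(xt) - log_p(x)) > rng.random():
            x = xt                             # accept the trial move
        chain.append(x)
    return chain

# Unnormalized standard Gaussian: log p(x) = -x^2/2 up to a constant.
chain = metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, 200_000)
mean = sum(chain) / len(chain)
```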


Metropolis method

The M(RT)² method is today used in many different applications, ranging from non-linear differential equations to the simulation of galaxy formation: what about electronic-structure calculations?


Quantum Monte-Carlo

• A method to calculate exact values of certain (ground-state) properties.

• Capable of reaching high accuracy.

• Wavefunction sampling is an alternative to brute-force wavefunction representation (CI, CC), with advantages and disadvantages:
  • QMC has better scaling (N³ vs N⁶)
  • QMC is subject to statistical errors


Quantum Monte-Carlo

• Variational Monte-Carlo

• Diffusion Monte-Carlo

• Path Integral Monte-Carlo, Reptation Monte-Carlo, Green's function Monte-Carlo


Variational Monte-Carlo

Variational Theorem

Given

⟨E⟩ = ∫ dx ψ*(x) H ψ(x) / ∫ dx ψ*(x) ψ(x)

the variational theorem states that ⟨E⟩ ≥ E₀, and ⟨E⟩ = E₀ if and only if ψ ∝ φ₀.

Variational Monte-Carlo

Idea of VMC

Let's consider a trial ψ_T(x, {α}):

∫ dx ψ_T*(x, {α}) H ψ_T(x, {α}) / ∫ dx ψ_T*(x, {α}) ψ_T(x, {α}) = E({α}) ≥ E₀

Minimizing E({α}) with respect to the parameters {α} will give an (upper) estimate of E₀.

Of course, we will use Monte-Carlo methods to calculate the 3N-dimensional integrals.


Variational Monte-Carlo

What we don't do

Naïvely, we might uniformly sample ψHψ and ψψ for the two integrals

∫ dx ψ_T* H ψ_T ;  ∫ dx ψ_T* ψ_T

for any {α}.

Variational Monte-Carlo

What we do: importance sampling

⟨E⟩ = ∫ dx ψ_T* H ψ_T / ∫ dx |ψ_T|² = ∫ dx |ψ_T|² (Hψ_T/ψ_T) / ∫ dx |ψ_T|²

⟨E⟩ = ∫ dx ρ(x, {α}) E_L(x, {α})

E_L(x, {α}) = H ψ_T(x, {α}) / ψ_T(x, {α})

ρ(x, {α}) = |ψ_T(x, {α})|² / ∫ dx |ψ_T(x, {α})|²

Variational Monte-Carlo

VMC in practice

1. Generate a number of copies of the system, each one with different (random) electron coordinates x (the walkers).

2. Choose a form for the trial wavefunction ψ_T(x, {α}).

3. Use the Metropolis method to propagate the walkers.

4. Monitor some observables during the Markov chain, like the local energy, the variance, etc.

5. When the walkers are distributed like |ψ_T|², say at step L, calculate the local energy

E = (1/N) Σ_{i=L}^{L+N} E_L(xᵢ, {α})

6. Change {α} and go back to step 3.
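The steps above can be illustrated on the 1D harmonic oscillator (ħ = m = ω = 1) with the trial ψ_T(x) = e^(−αx²), for which E_L(x) = α + x²(1/2 − 2α²); this toy model is an illustration chosen here, not the slides' system:

```python
import math, random

def vmc_energy(alpha, n_steps=200_000, sigma=1.0, seed=1):
    """VMC for the 1D harmonic oscillator with psi_T = exp(-alpha x^2):
    a Metropolis walker samples |psi_T|^2 and accumulates the local
    energy E_L(x) = alpha + x^2 (1/2 - 2 alpha^2)."""
    rng = random.Random(seed)
    x, e_sum = 0.0, 0.0
    for _ in range(n_steps):
        xt = rng.gauss(x, sigma)
        # |psi_T(x_T)|^2 / |psi_T(x)|^2 = exp(-2 alpha (x_T^2 - x^2))
        if math.exp(-2.0 * alpha * (xt * xt - x * x)) > rng.random():
            x = xt
        e_sum += alpha + x * x * (0.5 - 2.0 * alpha * alpha)
    return e_sum / n_steps

# Variational theorem: E(alpha) >= E0 = 1/2, with equality at alpha = 1/2,
# where E_L is constant (zero-variance property).
e_exact = vmc_energy(0.5)
e_worse = vmc_energy(0.4)
```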


The trial wavefunction

Antisymmetric function for fermions

ψ_T(x) = D(x) J(x)

ψ_T(x) = Σᵥ^few cᵥ det[ψ↑ᵥ,ₙ(rᵢ)] det[ψ↓ᵥ,ₘ(rⱼ)] e^(−V(x))

V(x) = Σᵢ V₁(rᵢ) + Σ_{i, j>i} V₂(r_ij)

The trial wavefunction

Jastrow factor for jellium spheres

J = exp(Σ_{i=1}^N V₁(rᵢ)) exp(Σ_{i<j}^N V₂(r_ij)) exp(V^(N))

V₁(rᵢ) = Σ_{n=1}^{20} αₙ^(i) j₀(nβrᵢ)

V₂^(λ)(r_ij) = [a^(λ) r_ij + c^(λ) r_ij² + e^(λ) r_ij³] / [1 + b^(λ)(rᵢ) r_ij + d^(λ) r_ij² + r_ij³]

with

b^(λ)(rᵢ) = b₀^(λ) + b₁^(λ) arctan[(rᵢ² − R_b²)/K^(λ)]

λ = A, P (antiparallel and parallel spins), and j₀ is the zeroth-order spherical Bessel function, j₀(x) = sin x / x.

V^(N) = γ(P_C)² + δ(P_S)²,  with  P_C = Σᵢᴺ rᵢ  and  P_S = 2 Σᵢᴺ rᵢ S_z^(i)

Variational Monte-Carlo

• Variational Monte-Carlo gives high-quality results (it recovers about 90% of the correlation energy)

• but it is still approximate (it relies on the choice of the trial wavefunction)

We now want extremely accurate (exact) results for the ground-state energy.


Diffusion Monte-Carlo

ψ(t) = Σₙ cₙ e^(−(i/ħ)Eₙt) φₙ,   with  H φₙ = Eₙ φₙ

cₙ = ∫ dx φₙ(x) ψ(x, 0),   n = 0, 1, 2, …

In imaginary time,

ψ(τ) = c₀ e^(−E₀τ) φ₀ + c₁ e^(−E₁τ) φ₁ + c₂ e^(−E₂τ) φ₂ + …  ⟶ (τ → ∞)  ∝ φ₀

We want a practical scheme to do this imaginary-time evolution and recover the ground-state energy.


Diffusion Monte-Carlo

First step: shift of the energy

iħ ∂ψ(x, t)/∂t = [−(ħ²/2m)∇² + (V(x) − E_T)] ψ(x, t)

ψ(x, t) = Σₙ cₙ e^(−(i/ħ)(Eₙ − E_T)t) φₙ(x)

Diffusion Monte-Carlo

Second step: Wick rotation in time

−ħ ∂ψ(x, τ)/∂τ = [−(ħ²/2m)∇² + (V(x) − E_T)] ψ(x, τ)

ψ(x, τ) = Σₙ cₙ e^(−(Eₙ − E_T)τ/ħ) φₙ(x)

Role of E_T

• E_T > E₀: the wavefunction will diverge exponentially fast

• E_T < E₀: the wavefunction will vanish exponentially fast

• E_T = E₀: the wavefunction will converge exponentially to φ₀!

We want a practical method that, starting from an initial wavefunction, performs an imaginary-time iteration, permitting successive adjustments of E_T, such that at the end the stationary solution corresponds to E_T(τ → ∞) = E₀.


DMC: practical scheme

First step: generation of walkers

Generate N_w replicas of the system, sampled from the initial wavefunction ψ_T(x, 0):

ψ(x, 0) = Σ_{i=1}^{N_w} wᵢ δ(x − xᵢ)

Second step: writing the propagator

The integral form of the imaginary-time Schrödinger equation involves the Green's function:

ψ(x′, τ + δτ) = ∫ dx G(x, x′, δτ) ψ(x, τ)

DMC: practical scheme

The propagator G

• Only the diffusive term:

∂ψ(x, τ)/∂τ = D ∇²ψ(x, τ)   →   G_D(x, x′, δτ) = e^(−(x−x′)²/2δτ)

• Only the rate term (branching):

∂ψ(x, τ)/∂τ = −(V(x) − E_T) ψ(x, τ)   →   G_B(x, x, δτ) = e^(−(V(x)−E_T)δτ)

The branching factor fixes the multiplicity of the walker:

M = int[e^(−(V(x)−E_T)δτ) + ξ]


DMC: practical scheme

The propagator G

∂ψ(x , τ)

∂τ=[−D ∇2 + (V (x)− ET )

]ψ(x , τ)

G (x , x ′, δτ) = GD(x , x ′, δτ) · GB(x , x ′, δτ) +O(δτ)2

= e−(x − x ′)2

2δτ− (V (x)− ET ) δτ

+O(δτ)2

• diffuse a walker, and accept x ′ with probability

min(

1, ψ(x ′)G(x ,x ′)ψ(x)G(x ′,x)

)• remove or proliferate the walker according to the multiplicity,

calculated with the branching term


DMC: practical scheme

Third step: calculate the quantity of interest

Calculate the quantity of interest (at this step) by averaging over the walkers:

E₀(x, τ) = (1/N_w) Σ_{i=1}^{N_w} E_L(xᵢ, τ),   with  E_L = Hψ(x, τ)/ψ(x, τ)

Fourth step: adjust the trial energy

E_T^new = (E_T + E₀(x, τ))/2

We continue to propagate until E₀ = E_T, the exact result for the ground-state energy.
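A naive sketch of this loop for the 1D harmonic oscillator V(x) = x²/2 (ħ = m = 1, so D = 1/2). For this easy problem the plain scheme is stable enough; the fluctuation and sign problems discussed next appear in realistic many-fermion systems. The population-control update of E_T below is one common choice, not the slides':

```python
import math, random

def dmc_energy(n_walkers=500, n_steps=2000, dtau=0.01, seed=3):
    """Naive DMC for V(x) = x^2 / 2: Gaussian diffusion with variance
    dtau (D = 1/2), branching multiplicity int(exp(-(V - E_T) dtau) + xi),
    and E_T steered so the population stays near its target size."""
    rng = random.Random(seed)
    walkers = [rng.uniform(-1.0, 1.0) for _ in range(n_walkers)]
    e_t, e_sum, n_meas = 0.0, 0.0, 0
    for step in range(n_steps):
        new = []
        for x in walkers:
            x += rng.gauss(0.0, math.sqrt(dtau))            # diffusion
            m = int(math.exp(-(0.5 * x * x - e_t) * dtau) + rng.random())
            new.extend([x] * min(m, 3))                     # branching
        walkers = new
        # population control: nudge E_T toward the stationary value
        e_t += 0.1 * math.log(n_walkers / max(len(walkers), 1))
        if step >= n_steps // 2:                            # equilibrated part
            e_sum += e_t
            n_meas += 1
    return e_sum / n_meas

e0 = dmc_energy()  # exact ground-state energy: 0.5
```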

DMC: practical scheme

This is exactly how things... do not work.

DMC: two issues

Fluctuations

The branching term causes large fluctuations in the number of walkers, preventing convergence. Solution: importance sampling.

Interpretation

ψ_T has to be positive everywhere (which is not the case) to be interpreted as a walker distribution density. Solution: fixed-node approximation.

DMC: Importance Sampling

f(x, τ) = ψ_T(x) ψ(x, τ)

f(x, 0) = |ψ_T(x)|²

so let's get some walkers from our previous Variational Monte-Carlo calculation.


DMC: Importance Sampling

−∂f(x, τ)/∂τ = [−D∇² + (E_L(x) − E_T)] f(x, τ) + D∇·[f(x, τ) v(x)]

with

E_L(x) = Hψ_T/ψ_T   and   v(x) = ∇ψ_T/ψ_T

• The branching term is now related to the local energy, rather than to the potential.

• A new term appears, a drift term, for which the corresponding Green's function can be easily evaluated: G(x, x′, δτ) = δ(x − x′ − v(x)δτ).

So our final equation becomes a drifted diffusion process (Brownian motion within an external field) plus branching.


DMC: Walkers evolution

[Figure: evolution of the walker population during the DMC run.]

DMC: Fixed-nodes approximation

Fixed nodes: how?

f(x, τ) = ψ_T(x) ψ(x, τ) ≥ 0

This is possible if nodes(ψ(x, τ)) = nodes(ψ_T(x)) all along the imaginary-time evolution.

Walker refusal: during the drifted diffusive process a new position is proposed for the walker. If this position changes the sign of the wavefunction, the move is refused.

This is an approximation and implies a (small) error.

Preliminary Monte-Carlo QMC

DMC: Fixed-nodes approximation

• FN-DMC depends uniquely on the nodes of the trial wave-function.

• Even within fixed nodes, the accuracy of DMC is very high, comparable to much more cumbersome methods like CI or CC.

Preliminary Monte-Carlo QMC

DMC: Can we release the nodes?

• An antisymmetric function can be written as a difference between two positive functions, f1 and f2.

• Let's associate a set of walkers to f1, called W1, and a set of walkers to f2, called W2.

• The Schrödinger equation is linear, so we can perform the imaginary-time iteration of these two sets of walkers separately.

• The W1 and W2 wavefunctions have a non-negligible overlap with the bosonic ground state, which is lower in energy.

• In the absence of numerical errors, the bosonic parts of W1 and W2 cancel out, and we obtain an exact result.

• Since numerical errors are present, after a few steps ⇒ bosonic catastrophe.
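The first step above, writing an antisymmetric function as a difference of two positive functions, is easy to make concrete. The function chosen here (sin(x) on a grid) is just an illustration, not anything from the slides:

```python
import numpy as np

# Decompose an antisymmetric function f into two positive functions,
# f = f1 - f2, as in the release-node construction.
x = np.linspace(-np.pi, np.pi, 201)
f = np.sin(x)                    # antisymmetric: f(-x) = -f(x)

f1 = np.maximum(f, 0.0)          # positive part
f2 = np.maximum(-f, 0.0)         # negative part, flipped to be positive

assert np.all(f1 >= 0.0) and np.all(f2 >= 0.0)
assert np.allclose(f1 - f2, f)   # the decomposition is exact
```

Each positive part can then be sampled by its own walker population (W1 and W2); only the difference of the two populations carries the fermionic signal, which is why the cancellation of their growing bosonic components is so fragile numerically.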

Preliminary Monte-Carlo QMC

Fermion MC

Preliminary Monte-Carlo QMC

Fermion Sign Problem: unresolved

Several solutions are proposed every year, BUT

• They have errors

• They are known not to work

• They have uncontrolled approximations

• Scaling not demonstrated

Preliminary Monte-Carlo QMC

QMC for solids: many issues

• The Bloch theorem is only valid in one-particle theories, not for many-body wavefunctions.

Consequences: twisted boundary conditions, supercells, finite-size errors, Ewald sums are not exact.

• The kinetic energy is large for deep (core) electrons, so pseudo-potentials are mandatory.

• Non-local pseudo-potentials worsen the fermion-sign problem.
