Bielefeld University
Faculty of Physics
Master's thesis
Error reduction for the non-perturbative calculation of heavy quark momentum diffusion with dynamical fermions
Author: Hauke Sören Sandmeyer
Supervisor and 1st corrector: Prof. Dr. Edwin Laermann
2nd corrector: Dr. Olaf Kaczmarek
June 29, 2015
Contents
1. Introduction
    1.1. The QCD Lagrangian
    1.2. Path integral formulation
2. QCD on the lattice
    2.1. Discretizing the action
    2.2. The doubling problem
    2.3. Wilson fermions and chiral symmetry breaking
    2.4. Staggered action
    2.5. Temperature on the lattice and the continuum limit
    2.6. Z3 symmetry and phase transition
3. Numerical simulation
    3.1. Monte Carlo simulations
    3.2. Heatbath and Overrelaxation
    3.3. Rational Hybrid Monte Carlo
    3.4. Local boson fields
        3.4.1. Even odd preconditioning
        3.4.2. Choosing the polynomial
        3.4.3. Explicit update formulation
4. Error reduction for gluonic operators
    4.1. Lüscher-Weisz method
    4.2. Local boson fields and sublattices
    4.3. Optimal sublattice shape
5. Heavy quark diffusion
    5.1. The spectral function
    5.2. Transport properties through spectral functions
    5.3. Heavy quark momentum diffusion from heavy quark effective theory
    5.4. Correction of lattice effects
6. Methods
    6.1. Error handling and auto correlation time
    6.2. Root calculation
7. Technical setup
    7.1. CUDA
    7.2. Optimization
8. Thermalizing with local bosons
    8.1. Tuning parameters
    8.2. Autocorrelation time
    8.3. Final choice of testing parameters
9. Noise reduction
    9.1. The unimproved correlator
    9.2. Tuning the sublattice update
        9.2.1. Preliminary considerations
        9.2.2. Analysis of the error reduction for different sublattice size
        9.2.3. Polyakov loop improvement
    9.3. Error reduction for different masses and number of roots
10. Results
    10.1. Continuum extrapolation for quenched theory
    10.2. Quenched theory at different temperature versus full QCD
11. Conclusion and outlook
B. Results of the correlator
1. Introduction
The basic approach of thermodynamics is the study of states that
are in equilibrium. Within this framework, mathematical tools based
on statistical considerations, like the partition function, allow
the calculation of expectation values of certain quantities.
However, in real physical experiments it is impossible to create a
perfect equilibrium. For many experiments this is not a problem, as
they involve a large number of particles (O(10^23)) and small
temperature gradients and, therefore, are close to
equilibrium.
In heavy ion collisions, which are among the most important
experiments in particle physics, this is not the case. In those
collisions, temperatures of about 150000 times the core temperature
of the sun are created for fractions of a zeptosecond (∼ 10^−23 s).
Nevertheless, within these short time periods, thermalization
processes lead to the creation of a thermal medium, the strongly
interacting quark gluon plasma [1, 2]. The charm and the bottom
quark are key ingredients for understanding the thermodynamic and
hydrodynamic properties of this quark gluon plasma. Since their
masses are much larger than the temperature of the quark gluon
plasma, their creation occurs mainly in the early pre-equilibrium
phase of the collision.
In contrast to qualitative arguments that suggest a rather low
thermalization rate because of the heavy mass, jets containing b or
c quarks are found to be effectively quenched [3, 4]. Perturbative
calculations also fail to describe these fast thermalization rates
[5, 6]. Thus, a non-perturbative calculation of the heavy quark
diffusion constant, which can be related to the thermalization
rates of heavy quarks, is highly desirable.
Unfortunately, a non-perturbative analytical solution of the
underlying theory, Quantum Chromodynamics (QCD), does not exist.
Nevertheless, a numerical approach, lattice gauge theory, allows
the calculation of expectation values in thermal media. Lattice
gauge theory discretizes space-time and calculates expectation
values of physical quantities with statistical methods. It is based
on thermal field theory in imaginary time, which requires an
analytic continuation for real-time observables. For that purpose,
very precise results from lattice simulations are necessary.
For the measurement of transport properties on the lattice, the
spectral functions of so-called correlation functions can be used.
Only the correlators themselves can be calculated directly on the
lattice, which makes it necessary to extract the spectral function
through analytic continuation or by making assumptions about its
structure. For the calculation of the heavy quark diffusion
constant, the spectral function of the vector current-current
correlator can be used. However, this spectral function is found to
be rather complex at finite temperatures, which makes it difficult
to perform such an extraction.
This problem can be circumvented by calculating the heavy quark
momentum diffusion constant instead. The latter can be related to
the diffusion constant, and its spectral function is much smoother
[7]. Still, the extraction of the spectral function requires a very
high precision of the corresponding correlator.
Since lattice gauge theory uses statistical methods to calculate
observables, the statistical error decreases only as 1/√N, where N
is the number of measurements. Depending on the operator, this
means that a high numerical effort is needed to decrease the
errors.
Previous lattice studies used the Lüscher-Weisz error reduction
method [8] and the so-called quenched approximation to calculate
the heavy quark momentum diffusion coefficient. The quenched
approximation omits the fermionic part in the simulation of the
quark gluon plasma. The goal of this thesis is to develop an error
reduction method that can be used with dynamical fermions. This is
done by combining the Lüscher-Weisz method with local bosons [9].
The outline is as follows. First, we give an overview of the
general formalism of lattice QCD and its different numerical
implementations. Then, the Lüscher-Weisz method is introduced and
transferred to the local boson method. After giving an overview of
heavy quark diffusion and the technical implementation, we analyze
the efficiency of the new algorithm. In the end, we compare with
the results stemming from the quenched approximation.
1.1. The QCD Lagrangian
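Since quantum chromodynamics is a quantum field theory, its
construction starts, in accordance with the Dirac equation, from
the non-interacting Lorentz-invariant Lagrangian,
\[ \mathcal{L}_{\text{free}} = \sum_f \bar\psi(x)^{\alpha,a}_f \left( i\gamma^\mu_{\alpha\beta}\,\partial_\mu\,\delta^{ab} - m_f\,\delta_{\alpha\beta}\,\delta^{ab} \right) \psi(x)^{\beta,b}_f . \tag{1.1} \]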
Here α and β are Dirac indices, a and b label the colors, µ is the
Lorentz index, and we sum over all flavors f.
The basic idea for the construction of a Yang-Mills Lagrangian is
to demand invariance under local SU(3) gauge transformations,
\[ S(x) = \exp\left( \frac{i}{2}\, \boldsymbol{\theta}(x)\cdot\boldsymbol{\lambda} \right), \tag{1.2} \]
where λ is the vector of all Gell-Mann matrices and θ(x) denotes
the real, position-dependent transformation parameters. This
invariance can be obtained by inserting gauge fields Aµ(x), the
so-called gluonic fields, which are themselves members of the gauge
algebra via Aµ(x) = A^i_µ(x) T^i with T^i = λ^i/2.
Omitting the Dirac and color indices and keeping the sum over the
flavors in mind, the Lagrangian simply reads
\[ \mathcal{L}_F = \bar\psi(x)\left( i\gamma^\mu(\partial_\mu - i g A_\mu) - m \right)\psi(x) =: \bar\psi(x)\left( i\gamma^\mu D_\mu - m \right)\psi(x), \tag{1.3} \]
where g is an arbitrary parameter and specifies the coupling
strength between the quark fields and the gluonic fields. To
complete the gauge invariance, we require that the gauge fields
transform as
\[ A_\mu(x) \to A'_\mu(x) = S(x)\left\{ A_\mu(x) + \frac{i}{g}\,\partial_\mu \right\} S^{-1}(x), \tag{1.4} \]
and we end up with the gauge-invariant fermionic part of the
Lagrangian, LF. A consequence of this construction is that
additional fields are now involved and interact with the quark
fields. These fields themselves need a kinetic term in the
Lagrangian. Using the gluonic field strength tensor,
\[ F_{\mu\nu} := \frac{i}{g}\,[D_\mu, D_\nu] = F^i_{\mu\nu} T^i, \tag{1.5} \]
we define
\[ \mathcal{L}_G = \frac{1}{4} F^i_{\mu\nu} F^{\mu\nu,i}. \tag{1.6} \]
We finally get the gauge- and Lorentz-invariant Yang-Mills
Lagrangian,
\[ \mathcal{L} = \bar\psi(x)\left( i\gamma^\mu D_\mu - m \right)\psi(x) - \frac{1}{4} F^i_{\mu\nu} F^{\mu\nu,i}, \tag{1.7} \]
which is the key formula for all following QCD calculations.
1.2. Path integral formulation
An analytical solution of the equations of motion for equation
(1.7) has not yet been found and seems to be very difficult to
obtain. Nevertheless, the well-known solution of the free theory
(1.1) can be used to perform perturbation theory. This allows, for
example, the calculation of cross sections in scattering events.
However, for estimating expectation values in thermal equilibrium,
a different approach is necessary. The so-called path integral
formalism allows one to introduce temperature in quantum field
theory and, moreover, makes it possible to calculate physical
quantities numerically.
We start with a transition amplitude for general fields φi(t′) and
φj(t),
\[ G(\phi_i, t'; \phi_j, t) = \langle \phi_i |\, e^{-i(t'-t)H} \,| \phi_j \rangle . \tag{1.8} \]
These fields can be scalar fields or quark spinors for example. The
latter is the case in QCD, where the fields are defined according
to the Lagrangian (1.7).
In a naive application of field theories according to the bare
Lagrangian, ultraviolet divergences occur. To make the theory well
defined, regularization schemes have to be introduced. For standard
perturbation theory, dimensional regularization can be used. Here
we use the lattice regularization, which means that we discretize
space. The idea is to introduce a 3D lattice with lattice spacing a
and write
\[ \mathbf{x} \to a\mathbf{n}, \qquad n_i = 0, 1, \dots, N_\sigma - 1. \tag{1.9} \]
Then integrals transform to sums,
\[ \int d\mathbf{x}\, f(\mathbf{x}) = a^3 \sum_{\mathbf{n}} f(a\mathbf{n}), \tag{1.10} \]
and derivatives to symmetric finite differences,
\[ \partial_i f(\mathbf{n}) = \frac{f(a(\mathbf{n}+\mathbf{e}_i)) - f(a(\mathbf{n}-\mathbf{e}_i))}{2a}. \tag{1.11} \]
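The discretization rules (1.10) and (1.11) can be tried out in a minimal sketch. It is a one-dimensional analogue (so the prefactor is a instead of a^3), and the function names and the test function f(x) = x^2 are assumptions chosen for illustration.

```python
def lattice_integral(f, a, n_sites):
    """1D analogue of (1.10): a * sum_n f(a n) replaces the integral."""
    return a * sum(f(a * n) for n in range(n_sites))

def lattice_derivative(f, a, n):
    """Symmetric difference of (1.11) at lattice site n."""
    return (f(a * (n + 1)) - f(a * (n - 1))) / (2 * a)

f = lambda x: x * x
a = 0.1

deriv = lattice_derivative(f, a, 10)   # derivative of x^2 at x = 1
integral = lattice_integral(f, a, 10)  # approximates the integral over [0, 1)

print(deriv, integral)
```

For a quadratic function the symmetric difference reproduces the exact derivative 2 up to rounding, while the lattice sum (0.285) still differs from the continuum value 1/3 and only approaches it for a → 0.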
Since the Hamiltonian commutes with itself, we can also discretize
the time evolution operator U(t′, t) with
\[ U(t', t) = \prod_{n=1}^{N_t} U(t+na,\, t+(n-1)a) = \prod_{n=1}^{N_t} e^{-iaH}, \tag{1.12} \]
where Nt is defined by Nt a = t′ − t. Note that H is already the
discretized version here. We now insert the completeness relation
∫ dφ |φ⟩⟨φ| = 1 after each factor of the time evolution operator in
equation (1.8). Taking the limit a → 0 and Nt, Nσ → ∞, this results
in a path integral representation of the transition amplitude [10],
\[ G(\phi_i, t'; \phi_j, t) = \int \mathcal{D}\phi\; e^{-iS(\phi,t',t)}. \tag{1.13} \]
The integration runs over all possible paths of fields that satisfy
the boundary conditions φ(t′) = φi and φ(t) = φj. More precisely,
the integration measure is defined as
\[ \mathcal{D}\phi := \lim_{N_\sigma, N_t \to \infty} \prod_{n\in\Lambda} d\phi_n, \tag{1.14} \]
where Λ is the set of all lattice points. The action S is given by
the integration over the Lagrangian,
\[ S(\phi, t', t) = \int_t^{t'} dt'' \int d^3x\; \mathcal{L}(\phi, t''). \tag{1.15} \]
In the case of QCD, the spinors transform to anti-commuting
Grassmann numbers in this construction [10, 11].
To connect to temperature, we look at the partition function,
\[ Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right), \tag{1.16} \]
with β = 1/T. We are free to choose the basis in which we
calculate the trace and, therefore, can use φi for this purpose. We
get
\[ Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right) = \int d\phi_i\, \langle \phi_i |\, e^{-\beta H} \,| \phi_i \rangle \]
and observe a similarity to equation (1.8). The integrand differs
only in the periodicity φi = φj and in the factor in front of the
Hamiltonian. We conclude that we can write the partition function
as a path integral if we extend our action to imaginary time by
analytic continuation:
\[ Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right) =: \int \mathcal{D}\phi\; e^{-iS(\phi,-i\beta,0)}. \tag{1.17} \]
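Dealing with the imaginary time in S(φ,−iβ, 0) is rather unwieldy.
Thus, we perform a Wick rotation t → −iτ and get a Euclidean
action SE(τ) = −iS(t → −iτ). With the Lagrangian from equation
(1.7) the QCD action transforms to
\[ S_E(\bar\psi, \psi, \beta, m) = \int_0^\beta d\tau \int d^3x \left[ \bar\psi(x)\left( \gamma_\mu D_\mu + m \right)\psi(x) + \frac{1}{4} F^i_{\mu\nu} F^i_{\mu\nu} \right]. \tag{1.18} \]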
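Having constructed the partition function, we can also calculate
expectation values of an arbitrary operator O in a similar
way:
\[ \langle O \rangle = \frac{1}{Z} \operatorname{tr}\left( O e^{-\beta H} \right) = \frac{1}{Z} \int \mathcal{D}\phi\; O(\phi)\, e^{-S_E(\phi,\beta)}. \tag{1.19} \]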
This expression admits an important interpretation. The action is
real and therefore the factor
\[ \frac{1}{Z}\, e^{-S_E(\phi,\beta)} \tag{1.20} \]
is real as well. It can be interpreted as a probability weight for
the state φ. We integrate over all states and weight the value of
the operator O(φ) accordingly. This is the starting point for
lattice calculations, where such expectation values are
approximated numerically using this probability interpretation in a
discretized space.
2. QCD on the lattice
Instead of solving the equations of motion of the QCD Lagrangian
directly, one could think of a numerical approach to these
equations. However, due to the huge number of degrees of freedom,
this would require too much computing power. Therefore, an
examination of the time evolution of observables at high
temperatures is currently impossible.
What is left is the calculation of expectation values according to
the path integral formulation (1.19).
2.1. Discretizing the action
As outlined in section 1.2, the path integral has been introduced
using a discrete space-time. In the end, the limit a → 0 defined
the path integral for the continuum theory. We now use the discrete
version for numerical calculations. Hence, we have to discretize
the QCD action (1.18) and introduce the lattice
\[ \Lambda = \{\, n = (n_0, n_1, n_2, n_3) \mid n_i \in \mathbb{N};\ n_0 \le N_\tau - 1;\ n_1, n_2, n_3 \le N_\sigma - 1 \,\}. \tag{2.1} \]
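A space-time point x may then be written as an. With the
discretization rules in section 1.2, a simple discretized version
of the free fermion action is given by
\[ S_F^{\text{free}} = a^4 \sum_{n\in\Lambda} \bar\psi(an) \left( \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(a(n+\hat\mu)) - \psi(a(n-\hat\mu))}{2a} + m\,\psi(an) \right), \tag{2.2} \]
where µ̂ is the unit vector in the direction µ. For the sake of
brevity, we rescale the fields with a^{3/2}ψ(an) → ψ(n) and the
mass with am → m. Note that from now on the physical mass is given
by mphys = m/a. With this rescaling we get
\[ S_F^{\text{free}} = \sum_{n\in\Lambda} \bar\psi(n) \left( \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(n+\hat\mu) - \psi(n-\hat\mu)}{2} + m\,\psi(n) \right), \tag{2.3} \]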
which is not gauge invariant since we couple ψ(n) with ψ(n ± µ̂).
Gauge invariance under transformations S(n) ∈ SU(3) at each lattice
point n can be obtained with the so-called link variables Uµ(n).
These links are members of the gauge group and transform as
\[ U'_\mu(n) = S(n)\, U_\mu(n)\, S^{-1}(n+\hat\mu). \tag{2.4} \]
Through this transformation property the action
\[ S_F = \sum_{n\in\Lambda} \bar\psi(n) \left( \sum_{\mu=0}^{3} \gamma_\mu \frac{U_\mu(n)\psi(n+\hat\mu) - U_{-\mu}(n)\psi(n-\hat\mu)}{2} + m\,\psi(n) \right) \tag{2.5} \]
becomes gauge invariant. Here we have used the common abbreviation
U−µ(n) = U†µ(n − µ̂).
Figure 2.1: Visualization of the plaquette Uµν(n)
To show that we get the right continuum result in the limit a → 0,
let us have a closer look at the transformation property in
equation (2.4). The local gauge at lattice point n + µ̂ has been
transported to the point n. In the continuum, such a gauge
transporter is known as the Schwinger line integral,
\[ U(x, y) = \exp\left( i g \int_x^y dz_\mu\, A_\mu(z) \right). \tag{2.6} \]
We find that the link variables are the lattice version of the
gauge transporter, and the above expression simplifies to
\[ U_\mu(n) = e^{i g a A_\mu(n)}. \tag{2.7} \]
If we Taylor expand this expression for a → 0, we recover the
continuum expression according to
\[ S_F^{\text{cont.}} = S_F^{\text{disc.}} + O(a^2). \tag{2.8} \]
At every lattice point, we introduced four link variables, one for
each direction µ. We can interpret these links as sitting between
the lattice points, which is visualized by the schematic diagram in
figure 2.1.
To discretize the gluonic part of the action, we define the
plaquette Uµν(n) as a product of four link variables in a closed
loop:
\[ U_{\mu\nu}(n) = U_\mu(n)\, U_\nu(n+\hat\mu)\, U^\dagger_\mu(n+\hat\nu)\, U^\dagger_\nu(n). \tag{2.9} \]
Figure 2.1 depicts a visualization of the plaquette. Due to the
invariance of the trace under cyclic permutations, the trace over a
plaquette is gauge invariant. Therefore, it is reasonable to
construct the gauge part of the action using the trace of the
plaquette.
If we use the exponential expression of the link variables again
and make use of the Baker-Campbell-Hausdorff formula, the
plaquette may be written as
\[ U_{\mu\nu}(n) = \exp\left( i g a^2 F_{\mu\nu}(n) + O(a^3) \right). \tag{2.10} \]
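The continuum expression can be restored if only the third term of
the Taylor expansion is used. This can be done by taking the real
part and compensating the leading 1 using
\[ S_G[U] = \beta \sum_{n\in\Lambda} \sum_{\mu<\nu} \left( 1 - \frac{1}{3} \operatorname{Re}\operatorname{tr}[U_{\mu\nu}(n)] \right). \tag{2.11} \]
The quantity
\[ \ldots := \frac{1}{3|\Lambda|} \sum_{n\in\Lambda} \sum_{\mu<\nu} \operatorname{Re}\operatorname{tr}[U_{\mu\nu}(n)] \tag{2.12} \]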
is often called plaquette as well and is used in various lattice
analyses.
Having discretized the action, we look at the path integral in more
detail. The integral runs over all fields involved in the action.
Here, these are the Grassmann-valued fields ψ̄, ψ and the link
variables Uµ. Thus, the integration measure is defined as
\[ \int \mathcal{D}\bar\psi\, \mathcal{D}\psi \int \mathcal{D}U := \int \prod_{n\in\Lambda} d\bar\psi(n)\, d\psi(n) \int \prod_{n\in\Lambda} \prod_{\mu=0}^{3} dU_\mu(n), \tag{2.13} \]
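where dψ(n) is the Grassmann integration measure and dUµ(n) is the
Haar measure for SU(3) matrices. If we write the fermion action
(2.5) as
\[ S_F = \sum_{n,m\in\Lambda} \bar\psi_n\, D_{nm}[U,m]\, \psi_m, \tag{2.14} \]
with
\[ D_{nm}[U,m] = \sum_{\mu=0}^{3} \gamma_\mu \frac{U_\mu(n)\,\delta_{n+\hat\mu,m} - U_{-\mu}(n)\,\delta_{n-\hat\mu,m}}{2} + m\,\delta_{n,m}, \tag{2.15} \]
a direct integration over the Grassmann variables results in [10]
\[ Z = \int \mathcal{D}\bar\psi\, \mathcal{D}\psi\, \mathcal{D}U\; e^{-S_F[U,\psi,\bar\psi,m] - S_G[U]} = \int \mathcal{D}U\, \det D[U,m]\; e^{-S_G[U]}. \tag{2.16} \]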
For more than one flavor involved, the partition function
reads
\[ Z = \int \mathcal{D}U \prod_f \det D[U,m_f]\; e^{-S_G[U]}. \tag{2.17} \]
The matrix Dnm defined above is often called the Dirac matrix.
Similarly, one can derive a path integral representation for
expectation values. For operators that depend on the quark fields,
one has to take the Grassmann variables into account before
integrating them out. However, for purely gluonic operators one can
simply write
\[ \langle O \rangle = \frac{1}{Z} \int \mathcal{D}U\; O[U]\, \det D[U,m]\; e^{-S_G[U]}. \tag{2.18} \]
In this expression we have, apart from the operator O[U] itself,
two terms that depend on the links U. The first one, the
determinant det D[U,m], is highly non-local in U, while the second
one, exp(−SG[U]), only involves nearest-neighbor interactions. This
non-locality of the determinant makes it difficult to simulate
lattice QCD involving fermions (see section 3).
A drastic simplification is to set det D[U,m] = 1, which is
equivalent to m → ∞. This heavy quark limit is the so-called
quenched approximation.
2.2. The doubling problem
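The finiteness of the lattice in numerical simulations has a
dramatic side effect. The so-called doubling problem arises when
looking at the Fourier-transformed quark propagator in the free
theory. The free quark propagator is defined as
\[ G^{ab}(n) = \langle \psi^a(n)\, \bar\psi^b(0) \rangle_{\text{free}}, \tag{2.19} \]
with a, b color indices. The Fourier transformation on the lattice
also becomes discrete with
\[ f(n) = \frac{1}{\sqrt{|\Lambda|}} \sum_p \tilde f(p)\, e^{i p\cdot n}, \tag{2.20} \]
where the momenta take the values
\[ p_\mu = \frac{2\pi}{N_\mu}\left( k_\mu + \theta_\mu \right), \qquad k_\mu \in \left\{ -\frac{N_\mu}{2}+1, \dots, \frac{N_\mu}{2} \right\}. \tag{2.21} \]
The value θµ depends on the boundary conditions. It is 1/2 for
anti-periodic boundaries and 0 for periodic boundaries.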
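We can now transform the free fermion action (2.2) and get
\[ S_F = \sum_p \bar\psi_p\, D_p\, \psi_p \tag{2.22} \]
with
\[ D_p = m\mathbb{1} + i \sum_{\mu=0}^{3} \gamma_\mu \sin p_\mu. \tag{2.23} \]
As in the continuum, the Fourier-transformed quark propagator is
given by the inverse of the Dirac operator,
\[ G^{ab}(n) = \frac{1}{|\Lambda|} \sum_p \frac{\left[ m\mathbb{1} - i \sum_{\mu} \gamma_\mu \sin p_\mu \right]^{ab} e^{i p\cdot n}}{\sin^2 p_0 + \omega^2(p)}, \tag{2.24} \]
with
\[ \omega^2(p) = m^2 + \sum_{i=1}^{3} \sin^2 p_i. \tag{2.25} \]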
In the limit a → 0 with fixed pphys. = pµ/a, this reduces to the
correct continuum expression with one pole at p = 0. For finite a,
however, 15 so-called doublers arise, where ω(p) = m. These
additional poles correspond to extra particles with E(p) = m, where
one or more entries of p are equal to π. For the free theory, these
doublers might not be a problem, since they do not interact. In an
interacting theory, however, they act as additional particles that
do not exist in the continuum.
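The counting of the doublers can be made explicit with a small sketch. It enumerates the corners of the Brillouin zone, where every momentum component is 0 or π and therefore sin pµ = 0; the mass value and the helper name `omega_squared` are illustrative assumptions.

```python
import itertools
import math

def omega_squared(p_spatial, m):
    """omega^2(p) = m^2 + sum_i sin^2(p_i), eq. (2.25)."""
    return m * m + sum(math.sin(p) ** 2 for p in p_spatial)

m = 0.1  # hypothetical bare mass in lattice units

# All corners of the Brillouin zone: every component is either 0 or pi.
corners = list(itertools.product((0.0, math.pi), repeat=4))

# sin(p_mu) vanishes at every corner, so each corner yields a pole.
poles = [p for p in corners
         if all(abs(math.sin(c)) < 1e-12 for c in p)]
doublers = [p for p in poles if any(c != 0.0 for c in p)]

# At each corner the spatial part gives omega(p) = m.
energies_ok = all(abs(omega_squared(p[1:], m) - m * m) < 1e-12 for p in poles)

print(len(poles), len(doublers), energies_ok)  # 16 15 True
```

Of the 16 poles, only the one at p = 0 survives in the continuum limit; the remaining 15 are the doublers.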
2.3. Wilson fermions and chiral symmetry breaking
In order to define a theory without doublers, different approaches
have been developed. One possibility is to add an extra term to the
Dirac matrix which vanishes in the limit a → 0. This so-called
Wilson term
\[ L_W := -\sum_{\mu=0}^{3} \frac{U_\mu(n)\,\delta_{n+\hat\mu,m} - 2\cdot\mathbb{1}\,\delta_{n,m} + U_{-\mu}(n)\,\delta_{n-\hat\mu,m}}{2} \tag{2.26} \]
leads to an additional term in the Fourier-transformed Dirac
operator,
\[ D(p) = m\mathbb{1} + i \sum_{\mu=0}^{3} \gamma_\mu \sin p_\mu + \mathbb{1} \sum_{\mu=0}^{3} \left( 1 - \cos p_\mu \right). \tag{2.27} \]
This additional term solves the doubling problem, but breaks the
so-called chiral symmetry, which holds for the naive discretization
(2.5) in the limit m → 0.
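How the additional term in (2.27) removes the doublers can be seen by evaluating it at the 16 corners of the Brillouin zone. The following is a minimal sketch with hypothetical helper names; it only evaluates the free-theory expression in lattice units.

```python
import itertools
import math

def wilson_shift(p):
    """Extra term of eq. (2.27): sum_mu (1 - cos p_mu)."""
    return sum(1.0 - math.cos(c) for c in p)

corners = itertools.product((0.0, math.pi), repeat=4)

# Count how many corners receive which shift; it equals 2 * (number of pi entries).
shifts = {}
for p in corners:
    key = round(wilson_shift(p))
    shifts[key] = shifts.get(key, 0) + 1

print(shifts)  # {0: 1, 2: 4, 4: 6, 6: 4, 8: 1}
```

Only the physical pole at p = 0 is left unshifted. Each doubler acquires an extra mass of 2 per momentum component equal to π in lattice units, which corresponds to 2/a in physical units and therefore decouples it for a → 0.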
Through chiral symmetry one can split the massless fermion action
into a right-handed and a left-handed term. To do so, we introduce
the projectors
\[ P_\pm = \frac{1 \pm \gamma_5}{2} \tag{2.28} \]
and write ψ = ψL + ψR with
\[ \psi_R = P_+\psi, \quad \psi_L = P_-\psi, \quad \bar\psi_R = \bar\psi P_-, \quad \bar\psi_L = \bar\psi P_+. \tag{2.29} \]
Chiral symmetry now manifests itself in the fact that the mixed
terms in the expanded version of the action vanish when m = 0. Then
we can write
\[ \mathcal{L} = \mathcal{L}_R + \mathcal{L}_L, \tag{2.30} \]
where LR and LL are defined as usual with ψL/R instead of ψ.
Including different flavors, the massless action has the total
symmetry [10]
\[ SU(N_f)_L \otimes SU(N_f)_R \otimes U(1)_V \otimes U(1)_A, \tag{2.31} \]
where the notation shows that the right- and left-handed terms
transform independently under SU(Nf) flavor mixing. A degenerate
mass term breaks that symmetry since it mixes right-handed and
left-handed terms. What is left is the symmetry
\[ SU(N_f)_V \otimes U(1)_V, \tag{2.32} \]
where the conserved quantity corresponding to U(1)V is the baryon
number. For Wilson fermions, the extra 'mass term' breaks the
symmetry explicitly even in the massless limit.
An important consequence of spontaneous chiral symmetry breaking is
the appearance of the so-called Goldstone bosons, which are
massless bosonic excitations [11]. In the continuum theory, the
pion is the Goldstone boson, as it is much lighter than other
mesons. (It would be massless in the limit m → 0.) Since Wilson
fermions break chiral symmetry explicitly, it is impossible to
investigate masses that involve Goldstone bosons with Wilson
fermions.
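2.4. Staggered action
In the case of Wilson fermions we removed the doublers by adding a
new term. A different approach is to interpret the 15 doublers as
additional flavors. Such a theory is of course unphysical, but
Kogut and Susskind showed that one can reduce the number of flavors
from 16 to 4 [12]. To distinguish them from the physical flavors,
one then usually speaks of tastes.
We start again with the free discretized fermion action,
\[ S_F^{\text{free}} = \sum_{n\in\Lambda} \bar\psi(n) \left( \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(n+\hat\mu) - \psi(n-\hat\mu)}{2} + m\,\psi(n) \right), \tag{2.33} \]
and apply the local transformation
\[ \psi(n) = \gamma_0^{n_0}\gamma_1^{n_1}\gamma_2^{n_2}\gamma_3^{n_3}\, \psi'(n), \qquad \bar\psi(n) = \bar\psi'(n)\, \gamma_3^{n_3}\gamma_2^{n_2}\gamma_1^{n_1}\gamma_0^{n_0}, \tag{2.34} \]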
where the ni are the entries of the lattice point n. Commuting the
gamma matrices with γµ in the action, we get different phase
factors for each lattice point:
\[ \bar\psi(n)\,\gamma_\mu\,\psi(n\pm\hat\mu) = \eta_\mu(n)\, \bar\psi'(n)\, \mathbb{1}\, \psi'(n\pm\hat\mu). \tag{2.35} \]
The so-called staggered phases ηµ are given by
\[ \eta_\mu(n) = (-1)^{\sum_{\nu<\mu} n_\nu}. \tag{2.36} \]
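The staggered phases of (2.36) are easy to tabulate. The following minimal sketch uses hypothetical helper names and also checks that ηµ does not change under a hop in its own direction µ, because the sum in the exponent only runs over ν < µ.

```python
import itertools

def eta(mu, n):
    """Staggered phase eta_mu(n) = (-1)^(sum of n_nu for nu < mu), eq. (2.36)."""
    return -1 if sum(n[:mu]) % 2 else 1

def shift(n, mu, step):
    """Return the site n + step * mu_hat (no boundary wrapping needed here)."""
    m = list(n)
    m[mu] += step
    return tuple(m)

sites = list(itertools.product(range(2), repeat=4))  # one 2^4 hypercube

# eta_mu does not depend on n_mu, so it is unchanged by a hop in direction mu.
hop_invariant = all(eta(mu, shift(n, mu, s)) == eta(mu, n)
                    for n in sites for mu in range(4) for s in (-1, 1))

print(eta(0, (1, 1, 1, 1)), eta(1, (1, 0, 0, 0)), eta(3, (1, 1, 0, 0)), hop_invariant)
# 1 -1 1 True
```

In particular η0(n) = 1 everywhere, and each phase only takes the values ±1, so the Dirac structure has been traded for simple site-dependent signs.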
Applying these transformations to the total action (2.5), the
action becomes diagonal in Dirac space, since the γ matrix
structure is now represented by the phases. The key observation is
that we can reduce the number of fermion species from 16 to four if
we drop three of the four spinor components. Therefore, we
introduce the fields χ(n) and χ̄(n), which live only in color
space, and we get the simpler action
\[ S_F = \sum_{n\in\Lambda} \bar\chi(n) \left( \sum_{\mu=0}^{3} \eta_\mu(n) \frac{U_\mu(n)\chi(n+\hat\mu) - U_{-\mu}(n)\chi(n-\hat\mu)}{2} + m\,\chi(n) \right). \tag{2.37} \]
After this simplification, we have to answer how many quarks this
action represents. To do so, we have to map back to the Dirac
structure. Assuming that the lattice dimensions are even, this can
be done by introducing hypercubes of size 2^4. We address these
cubes through a vector N, while a sub-vector ρ points to the actual
lattice point within the cube. Using the notation χρ(N) = χ(2N + ρ),
we map back to the spinor fields with the linear transformation
\[ \psi^t_\alpha(N) = \frac{1}{8} \sum_\rho \Gamma^{\alpha t}_{\rho}\, \chi_\rho(N), \tag{2.38} \]
with
\[ \Gamma^{\alpha t}_{\rho} = \left( \gamma_0^{\rho_0}\gamma_1^{\rho_1}\gamma_2^{\rho_2}\gamma_3^{\rho_3} \right)_{\alpha t}. \tag{2.39} \]
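With this definition it is obvious that we now have four tastes t
involved. After some algebra one obtains for the total
action,
\[ S_F = 16 \sum_N \left( \sum_{t=0}^{3} \left[ m\, \bar\psi^t(N)\psi^t(N) + \sum_{\mu=0}^{3} \bar\psi^t(N)\, \gamma_\mu \nabla_\mu\, \psi^t(N) \right] - \sum_{t,t'=0}^{3} \sum_{\mu=0}^{3} \bar\psi^t(N)\, \gamma_5 (\tau_5 \tau_\mu)_{t,t'} (\nabla_\mu)^2\, \psi^{t'}(N) \right), \tag{2.40} \]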
with τµ = γ^T_µ. Here ∇µ is the discretized derivative on the
lattice of hypercubes. The third term vanishes in the continuum
limit, but it mixes the tastes and breaks chiral symmetry for
finite lattices. The advantage over Wilson fermions is that the
action remains invariant under U(1) ⊗ U(1) transformations.
Therefore, investigations involving Goldstone particles are still
possible in a reduced form.
We now have an action that introduces four degenerate tastes. Using
the staggered action (2.37) in equation (2.18), we simulate physics
with four quarks of identical mass. Thus, in order to reduce the
number of tastes, one has to take roots of the determinant. For
example, if we want to calculate an expectation value for physics
with two light quarks of mass mu/d and one heavier quark of mass
ms, we have to use
\[ \langle O \rangle = \frac{1}{Z} \int \mathcal{D}U\; O[U] \left( \det D[U,m_{u/d}] \right)^{1/2} \left( \det D[U,m_s] \right)^{1/4} e^{-S_G[U]}. \tag{2.41} \]
The question whether rooting is allowed for finite lattices is
controversial. We have seen that the tastes decouple in the
continuum limit, but it is not yet fully clear whether a continuum
extrapolation based on finite lattices leads to correct physical
values. For instance, we have to ask whether the universality class
remains unchanged. Nevertheless, lattice simulations using
staggered quarks have led to the right physical results; see
[13, 14] for further details. Improvements of the action, like
"highly improved staggered quarks" (HISQ), reduce the taste mixing
and, moreover, reduce the order of the discretization error [15].
For the probability interpretation of equation (2.18), it is
important that the determinant det D is real and positive. We have
to check whether this is still the case for staggered fermions. To
do so, we look at the eigenvalue spectrum of the Dirac matrix. We
define the massless staggered Dirac matrix as D(m = 0) := M. It is
easy to see that it is anti-Hermitian, which means M† = −M. As a
direct consequence, M has purely imaginary eigenvalues. Moreover, M
has a γ5 hermiticity,
\[ (M^\dagger)_{n,m} = -M_{n,m} = \eta_5(n)\, M_{n,m}\, \eta_5(m), \tag{2.42} \]
from which it follows that for every eigenvalue λi there is a
complex-conjugate counterpart λj = λi*. Thus, the determinant det M
is real. Furthermore, the mass term m ensures that det D is
strictly positive.
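These properties can be verified on a toy example. The sketch below assumes a free (all links equal to 1) staggered matrix on a small two-dimensional periodic lattice, which is a simplification chosen for this illustration; in that case M is real, so anti-Hermitian means antisymmetric. The lattice size, the mass value and all function names are hypothetical.

```python
from fractions import Fraction

L = 4  # sites per direction of a small 2D periodic lattice (assumed toy size)

def site(n0, n1):
    return (n0 % L) + L * (n1 % L)

def massless_matrix():
    """M[n][m] = 1/2 sum_mu eta_mu(n) (delta_{n+mu,m} - delta_{n-mu,m})."""
    size = L * L
    M = [[Fraction(0)] * size for _ in range(size)]
    for n0 in range(L):
        for n1 in range(L):
            n = site(n0, n1)
            hops = [(1, site(n0 + 1, n1), site(n0 - 1, n1)),           # eta_0 = 1
                    ((-1) ** n0, site(n0, n1 + 1), site(n0, n1 - 1))]  # eta_1 = (-1)^n0
            for eta, forward, backward in hops:
                M[n][forward] += Fraction(eta, 2)
                M[n][backward] -= Fraction(eta, 2)
    return M

def determinant(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    det = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            for c in range(col, len(a)):
                a[r][c] -= factor * a[col][c]
    return det

M = massless_matrix()
size = L * L
anti_hermitian = all(M[i][j] == -M[j][i] for i in range(size) for j in range(size))

mass = Fraction(1, 10)
D = [[M[i][j] + (mass if i == j else 0) for j in range(size)] for i in range(size)]
det_D = determinant(D)

print(anti_hermitian, det_D > 0)  # True True
```

Because the eigenvalues of M come in pairs ±iλ, the determinant of D = M + m is a product of factors m² + λ² (and powers of m for zero modes), which is positive for any m > 0.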
In addition to the staggered phases, we also have to define
boundary conditions for the fermions. One usually chooses periodic
boundaries in the spatial directions and anti-periodic boundaries
in the temporal direction.
In the definition of the Dirac matrix, we only access the fermion
fields through terms with links multiplied to them. Hence, it is
sufficient to introduce another phase that multiplies all links
crossing the lattice boundary in the temporal direction by −1.
We may also shift the staggered phases into the links. Using
\[ \eta_\mu(n\pm\hat\mu) = \eta_\mu(n), \tag{2.43} \]
we can multiply the link Uµ(n) by the phase ηµ(n) without changing
the Dirac matrix.
2.5. Temperature on the lattice and the continuum limit
As outlined in section 1.2, the imaginary time was introduced to
calculate the partition function using the path integral formalism.
The time integration in the definition of the action runs over
τ ∈ [0, β]. From the discretization in section 2.1, it follows that
the connection between the lattice spacing a and the temperature is
given by
\[ \beta = \frac{1}{T} = N_\tau a. \tag{2.44} \]
This means that the lattice spacing directly defines the
temperature. However, through the rescaling of the fields (2.3) the
action no longer depends on the lattice spacing. The only relevant
parameters are g and m. Hence, the lattice spacing is defined
indirectly by these two parameters. For simplicity, from now on we
switch to the quenched theory, where g is the only relevant
parameter.
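As a small numerical illustration of (2.44), the following sketch converts a lattice spacing in fm and a temporal extent Nτ into a temperature in MeV using ħc ≈ 197.327 MeV fm. The example values Nτ = 8 and a = 0.1 fm are hypothetical and not taken from this thesis.

```python
HBAR_C_MEV_FM = 197.3269804  # hbar * c in MeV fm (conversion constant)

def temperature_mev(n_tau, a_fm):
    """T = 1 / (N_tau * a), converted from fm^-1 to MeV."""
    return HBAR_C_MEV_FM / (n_tau * a_fm)

# Hypothetical example values: N_tau = 8 time slices, a = 0.1 fm.
print(temperature_mev(8, 0.1))   # roughly 246.7 MeV
print(temperature_mev(8, 0.05))  # halving a at fixed N_tau doubles T
```

At fixed Nτ, the temperature can thus only be changed through the lattice spacing, that is, through the bare parameters.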
Any physical observable Γ cannot depend on the lattice spacing a,
which means dΓ/da = 0. The lattice versions of those quantities,
ΓL(g), are related to the physical ones by multiplication with some
power of the lattice spacing, ΓL(g) = a^{dΓ} Γ. This leads to the
renormalization group equation,
\[ a \frac{d}{da}\Gamma = -a^{-d_\Gamma} \left( d_\Gamma + \beta(g) \frac{\partial}{\partial g} \right) \Gamma_L(g) = 0, \tag{2.45} \]
with the β-function
\[ \beta(g) = -a \frac{dg}{da}. \tag{2.46} \]
Solving the differential equation for an arbitrary β-function
gives
\[ a(g) = a(g_0) \exp\left( -\int_{g_0}^{g} \frac{dg'}{\beta(g')} \right). \tag{2.47} \]
We find that the β-function controls how the lattice spacing and
the gauge coupling are related. Moreover, it shows that a continuum
limit a → 0 is only possible if β(g) has a root.
The β-function may be expanded around g = 0 and then takes the form
[16]
\[ \beta(g) = -\beta_0 g^3 - \beta_1 g^5 + O(g^7), \qquad \beta_0 = \frac{1}{(4\pi)^2} \left( 11 - \frac{2}{3} n_f \right), \quad \beta_1 = \frac{1}{(4\pi)^4} \left( 102 - \frac{38}{3} n_f \right). \tag{2.48} \]
It can be shown that this expansion does not change when switching
to dynamical QCD [16]. Inserting this into (2.47) shows that the
continuum limit is given for g → 0. Otherwise, confinement would be
broken. From the definition m = a mphys. one sees that this also
implies m → 0.
For fixed lattice dimensions, the limit a → 0 would result in an
infinitesimally small volume. Therefore, it is necessary to
increase Nσ and Nτ so that β = Nτ a and Lσ = Nσ a stay constant.
Since computer resources are finite, a typical measurement on the
lattice is performed at several lattice spacings a, and the results
are extrapolated to the continuum. As the lattice spacing can only
be controlled through the parameters g and m, a can only be
estimated. Having defined the lattice dimensions as well as g and
m, the temperature and the lattice spacing may be measured using
the static quark potential.
2.6. Z3 symmetry and phase transition
In section 2.3 we have seen that the action is invariant under
chiral transformations in the massless limit. Therefore, the chiral
condensate would be an order parameter for a phase transition in
this limit.
For the quenched limit m → ∞, another order parameter exists.
Consider a transformation of all lattice links in the time
direction for a fixed value nτ:
\[ U_\tau(n) \to U'_\tau(n) = z_k\, U_\tau(n) \quad \forall\, n = (n_\tau, \mathbf{n}) \text{ with } n_\tau \text{ fixed}, \quad z_k \in Z_3, \tag{2.49} \]
where Z3 is the center group for Nc = 3. While the gauge action is
clearly invariant under this transformation, a Wilson loop wrapping
around the lattice in the time direction is not. The latter is
called the Polyakov loop and transforms as P(n) → zk P(n).
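The different behaviour of plaquettes and Polyakov loops under the center transformation (2.49) can be checked in a minimal sketch. As a simplification it uses complex phases as links on a small two-dimensional lattice instead of SU(3) matrices in four dimensions; this works for the purpose of the check because center elements are multiples of the identity and commute with everything. Lattice sizes and names are hypothetical.

```python
import cmath
import math
import random

NT, NX = 4, 3
rng = random.Random(7)

def random_phase():
    return cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))

# links[mu][t][x]: mu = 0 is the time direction, mu = 1 the spatial one.
links = [[[random_phase() for _ in range(NX)] for _ in range(NT)] for _ in range(2)]

def plaquettes(u):
    """U_0(n) U_1(n+0) U_0(n+1)^* U_1(n)^* for all sites, periodic boundaries."""
    return [u[0][t][x] * u[1][(t + 1) % NT][x]
            * u[0][t][(x + 1) % NX].conjugate() * u[1][t][x].conjugate()
            for t in range(NT) for x in range(NX)]

def polyakov_loops(u):
    """Product of all temporal links at fixed spatial position x."""
    loops = []
    for x in range(NX):
        loop = 1 + 0j
        for t in range(NT):
            loop *= u[0][t][x]
        loops.append(loop)
    return loops

z = cmath.exp(2j * math.pi / 3)  # nontrivial element of Z3
t_fixed = 2
rotated = [[row[:] for row in links[0]], [row[:] for row in links[1]]]
rotated[0][t_fixed] = [z * link for link in links[0][t_fixed]]

plaq_unchanged = all(abs(p - q) < 1e-12
                     for p, q in zip(plaquettes(links), plaquettes(rotated)))
loops_rotated = all(abs(z * p - q) < 1e-12
                    for p, q in zip(polyakov_loops(links), polyakov_loops(rotated)))

print(plaq_unchanged, loops_rotated)  # True True
```

Every plaquette contains the transformed time slice either not at all or once forwards and once backwards, so the factors z and z* cancel, whereas each Polyakov loop crosses the slice exactly once.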
This global center symmetry can be broken spontaneously, which is
the case above the so-called critical temperature Tc. This is why
the Polyakov loop can be used as an order parameter for the phase
transition at Tc.
Figure 2.2: A rough sketch of the QCD phase diagram (axes: chemical potential µ and temperature T)
Although the fermion action explicitly breaks the Z3 symmetry, the
Polyakov loop is still a well-suited quantity to observe the
transition from the hadronic phase to the quark gluon plasma in
dynamical QCD.
In figure 2.2 a typical phase diagram of QCD is shown. Without
considering the chemical potential µ, one simulates physics on the
y-axis, where only a crossover between the two phases takes place.
The red line separates the two phases, and the dashed part
indicates the crossover region. Following the red line, at higher
chemical potential one expects at some point a second-order phase
transition (marked by the red point). At even higher chemical
potential, the phase transition becomes first order.
19
3. Numerical simulation
The expectation value of an operator O is defined as
$$\langle O\rangle = \frac{1}{Z}\int DU\, O[U]\,(\det D[U])^s\, e^{-S_G[U]}, \tag{3.1}$$
where the parameter s controls the number of flavors. For Wilson fermions it is just Nf, while for staggered fermions one has to use s = Nf/4. To evaluate this expression numerically, we have to deal with a high-dimensional path integral. This rules out numerical integration methods whose error depends on the dimension of the integral, as is the case for Gaussian quadrature, for instance.
As a solution, Monte Carlo integration approximates the integral using a stochastic approach and, therefore, comes with an error that is independent of the dimension.
We first define the Monte Carlo method and then give an overview of different implementations for both the quenched and the dynamical theory. Note that the following constructions are not only valid for gauge fields represented through the links. One may use the same algorithms for scalar fields, which will be important in section 3.4.
3.1. Monte Carlo simulations
The basic idea is to approximate the integral using a finite set of so-called gauge configurations. Each configuration Ci is defined as a full set of link variables. The path integral can now be approximated using
$$\langle O\rangle \approx \frac{1}{N_{conf}}\sum_{i=1}^{N_{conf}} O[C_i]\,\frac{(\det D[C_i])^s}{Z}\, e^{-S_G[C_i]}, \tag{3.2}$$
where Nconf is the total number of gauge configurations. Alternatively, if the configurations Ci are distributed according to
$$dP[C] = \frac{(\det D[C])^s}{Z}\, e^{-S_G[C]}\, dC, \tag{3.3}$$
one can write
$$\langle O\rangle \approx \frac{1}{N_{conf}}\sum_{i=1}^{N_{conf}} O[C_i]. \tag{3.4}$$
An algorithm to obtain configurations with the desired probability distribution (3.3) is the Markov chain. It generates configurations in a stochastic sequence, where the construction of a configuration is based on the preceding one. In the limit Nconf → ∞, the configurations are distributed according to the probability distribution (3.3). However, for a finite number of configurations, one has to take the correlation between subsequent configurations into account.
Within the Markov chain, the configuration Cn is selected through a transition probability P(C = Cn−1 → C′ = Cn). This transition probability is independent of the index n. We now have to choose P(C → C′) in such a way that the total probability to generate configuration Ci is given by (3.3). One can show that it is sufficient to require [17]
1. Ergodicity: Every configuration can be reached in a finite number of steps.
2. Detailed balance: P(C)P(C → C′) = P(C′)P(C′ → C).
3. Normalization: Σ_{C′} P(C → C′) = 1.
Starting with an arbitrary configuration, the probability for generating a certain configuration in the next step depends on this starting configuration. Only after some time does the probability to generate this configuration reach the distribution in (3.3). This process of approaching the so-called equilibrium is called thermalization, and the time until the equilibrium is reached is the thermalization time.
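The three conditions and the thermalization phase can be illustrated with a minimal sketch. It assumes a toy "configuration" consisting of a single real number x with the hypothetical action S(x) = x²/2 and uses a Metropolis accept/reject step with a symmetric proposal, which is one common way (not necessarily the one used in this work) to satisfy detailed balance. All names and parameters are illustrative.

```python
import math
import random

def action(x):
    # Toy action S(x) = x^2 / 2 (assumption): exp(-S) is a unit Gaussian.
    return 0.5 * x * x

def metropolis_chain(n_conf, n_therm=1000, step=1.5, seed=1):
    """Generate n_conf samples distributed as exp(-S) after thermalization."""
    rng = random.Random(seed)
    x = 10.0  # deliberately far away from equilibrium
    samples = []
    for i in range(n_therm + n_conf):
        x_new = x + rng.uniform(-step, step)  # symmetric proposal
        # Accept with probability min(1, exp(-(S' - S))): detailed balance.
        if rng.random() < math.exp(min(0.0, action(x) - action(x_new))):
            x = x_new
        if i >= n_therm:  # discard the thermalization phase
            samples.append(x)
    return samples

samples = metropolis_chain(200_000)
x2 = sum(s * s for s in samples) / len(samples)
print(f"<x^2> = {x2:.3f} (exact value for this toy action: 1)")
```

Subsequent samples of such a chain are correlated, so the statistical error is larger than for the same number of independent samples.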
3.2. Heatbath and Overrelaxation
For generating configurations, the choice of the transition probability has to match the above conditions and, moreover, should lead to a short thermalization time. There are several such algorithms, which are based on different approaches. Here we concentrate on the heatbath and the overrelaxation method, which both use a local update. This means only one element of the configuration is updated, while all other elements are kept fixed.
We first look at a simple bosonic quadratic action,
$$S_\phi = \left(\phi_n - \bar\phi_n\right)^2 + \text{const.}, \tag{3.5}$$
where $\phi_n$ is the element that should be updated and $\bar\phi_n$ contains all terms that are connected to $\phi_n$.
The principle of the heatbath algorithm is to generate a new element $\phi'_n$ according to the local probability distribution, which stems from all terms multiplied with $\phi_n$. Using the quadratic structure, we can choose a Gaussian distributed random number η with
$$P(\eta) \propto e^{-\eta^2} \tag{3.6}$$
and set the new element as
$$\phi_n \to \phi'_n = \bar\phi_n + \eta. \tag{3.7}$$
The overrelaxation algorithm chooses $\phi'_n$ so that $S[\phi_n, \bar\phi_n]$ remains constant, but $\phi'_n$ and $\phi_n$ are maximally different. This can be done by setting
$$\phi_n \to \phi'_n = 2\bar\phi_n - \phi_n. \tag{3.8}$$
As the action stays constant, the overrelaxation method does not fulfill ergodicity. It only acts within the subspace of configurations with a certain value of the action. Therefore, it has to be combined with another, ergodic algorithm. Such combined update routines benefit from a rapidly decreasing correlation between two subsequent configurations. To distinguish the individual heatbath and overrelaxation updates from such a combination, the latter is usually called a sweep.
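The two local updates can be sketched for a toy model. The example assumes a free scalar field on a periodic chain with a hypothetical mass term `M2`; its local action has the form A(φn − φ̄n)² + const. with a coefficient A that is not equal to one, so the width of the Gaussian in (3.6) is rescaled accordingly. The names `heatbath`, `overrelax` and `sweep` are illustrative only.

```python
import math
import random

M2 = 0.5             # hypothetical mass term m^2 of the toy action
A = 1.0 + 0.5 * M2   # coefficient of phi_n^2 in the local action

def total_action(phi):
    # S = sum_n [ (phi_n - phi_{n+1})^2 / 2 + M2 * phi_n^2 / 2 ], periodic chain
    n = len(phi)
    return sum(0.5 * (phi[i] - phi[(i + 1) % n]) ** 2 + 0.5 * M2 * phi[i] ** 2
               for i in range(n))

def phi_bar(phi, i):
    # Local action for site i: A * (phi_i - phi_bar)^2 + const.
    n = len(phi)
    return (phi[(i + 1) % n] + phi[(i - 1) % n]) / (2.0 * A)

def heatbath(phi, i, rng):
    # Draw phi_i from exp(-A (phi_i - phi_bar)^2): Gaussian with sigma^2 = 1/(2A).
    phi[i] = phi_bar(phi, i) + rng.gauss(0.0, 1.0 / math.sqrt(2.0 * A))

def overrelax(phi, i):
    # Reflect phi_i about phi_bar, cf. (3.8): the action stays constant.
    phi[i] = 2.0 * phi_bar(phi, i) - phi[i]

def sweep(phi, rng, n_over=4):
    # One heatbath pass followed by n_over overrelaxation passes,
    # visiting even sites first and odd sites afterwards.
    n = len(phi)
    order = list(range(0, n, 2)) + list(range(1, n, 2))
    for i in order:
        heatbath(phi, i, rng)
    for _ in range(n_over):
        for i in order:
            overrelax(phi, i)

rng = random.Random(3)
phi = [rng.uniform(-1.0, 1.0) for _ in range(16)]
s_before = total_action(phi)
for i in range(len(phi)):
    overrelax(phi, i)
s_after = total_action(phi)
print(f"change of the action under overrelaxation: {abs(s_after - s_before):.2e}")
sweep(phi, rng)
```

The printed difference vanishes up to rounding, which shows why overrelaxation alone cannot be ergodic and has to be combined with the heatbath step.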
For SU(N) gauge configurations the situation becomes more
complicated. We first introduce methods for quenched SU(2) theory
and then construct a method for arbitrary SU(N) matrices based on
the SU(2) update.
Let us assume we want to update a certain link $U_\mu(n) =: U$. The local probability distribution is then given by
$$dP(U) = dU \exp\left(\frac{\beta}{N}\,\mathrm{Re}\,\mathrm{tr}[UA]\right) \tag{3.9}$$
with
$$A := \sum_{\nu\neq\mu}\Big(U_\nu(n+\mu)\,U_{-\mu}(n+\mu+\nu)\,U_{-\nu}(n+\nu) + U_{-\nu}(n+\mu)\,U_{-\mu}(n+\mu-\nu)\,U_\nu(n-\nu)\Big). \tag{3.10}$$
The quantity A consists of all constant links that are multiplied with $U_\mu(n)$ and is often referred to as the staple. For SU(2) matrices, A may be rewritten as A = aV, where a is given by $a = \sqrt{\det A}$ and V is an SU(2) matrix. Using the invariance of the Haar measure under transformations of the origin in group space (d(UV) = dU), we can rewrite the local probability distribution as
$$dP(X) = dX \exp\left(\frac{a\beta}{2}\,\mathrm{Re}\,\mathrm{tr}[X]\right) \tag{3.11}$$
with X = UV. If we generate a matrix X according to the above probability distribution, the new link is obtained by
$$U \to U'_\mu(n) = U' = XV^\dagger. \tag{3.12}$$
For the details of generating X accordingly see [18] and [10]. In the case of SU(2) overrelaxation, one can use
$$U \to U' = V^\dagger U^\dagger V^\dagger \tag{3.13}$$
and it follows
$$\mathrm{tr}[U'A] = \mathrm{tr}[V^\dagger U^\dagger V^\dagger A] = a\,\mathrm{tr}[V^\dagger U^\dagger] = \mathrm{tr}[UA], \tag{3.14}$$
where the last step uses that the trace of an SU(2) matrix is real.
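The invariance of tr[UA] under the SU(2) overrelaxation step can be checked numerically. The following sketch assumes a quaternion parametrization of SU(2) and mimics the staple by a sum of six random SU(2) matrices; the helper names are hypothetical and not taken from any lattice code.

```python
import math
import random

def su2(a0, a1, a2, a3):
    # SU(2) matrix a0 + i (a1 s1 + a2 s2 + a3 s3) from a normalized quaternion.
    norm = math.sqrt(a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3)
    a0, a1, a2, a3 = a0 / norm, a1 / norm, a2 / norm, a3 / norm
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

rng = random.Random(5)
rand_su2 = lambda: su2(*(rng.gauss(0.0, 1.0) for _ in range(4)))

U = rand_su2()
# Stand-in for the staple: a sum of SU(2) matrices is proportional to SU(2).
terms = [rand_su2() for _ in range(6)]
A = [[sum(t[i][j] for t in terms) for j in range(2)] for i in range(2)]

det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
a = math.sqrt(det_A.real)                       # A = a V with V in SU(2)
V = [[A[i][j] / a for j in range(2)] for i in range(2)]

U_new = mul(mul(dagger(V), dagger(U)), dagger(V))   # overrelaxation step (3.13)
tr_old = trace(mul(U, A))
tr_new = trace(mul(U_new, A))
print(f"tr[UA] = {tr_old.real:.6f}, tr[U'A] = {tr_new.real:.6f}")
```

Both traces agree up to rounding, so the local action is unchanged by the update.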
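For SU(N) theories, Cabibbo and Marinari invented an algorithm based on embedded SU(2) subgroups [19]. For SU(3) these subgroups can be chosen as
$$R = \begin{pmatrix} r_{11} & r_{12} & 0\\ r_{21} & r_{22} & 0\\ 0 & 0 & 1\end{pmatrix},\quad S = \begin{pmatrix} s_{11} & 0 & s_{13}\\ 0 & 1 & 0\\ s_{31} & 0 & s_{33}\end{pmatrix},\quad T = \begin{pmatrix} 1 & 0 & 0\\ 0 & t_{22} & t_{23}\\ 0 & t_{32} & t_{33}\end{pmatrix}. \tag{3.15}$$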
We now update the link U by left multiplication with one of these subgroup elements, for example R. Writing W := UA, we get for the trace
$$\mathrm{tr}[RW] = r_{11}w_{11} + r_{12}w_{21} + r_{21}w_{12} + r_{22}w_{22} + \text{terms without } r_{ij}. \tag{3.16}$$
We observe that only the sub block elements of W corresponding to those of R are relevant. Therefore, one gets the same situation as in the SU(2) case, except that W is not proportional to an SU(2) matrix. This can be fixed by setting
$$W \to W' = \frac{1}{2}\begin{pmatrix} w_{11} + w^*_{22} & w_{12} - w^*_{21}\\ w_{21} - w^*_{12} & w^*_{11} + w_{22}\end{pmatrix}, \tag{3.17}$$
which leaves the real part of the trace (3.16) unchanged but can be rewritten as W′ = aV. From now on, we can use the SU(2) update routines. With the other two sub matrices, the full link update is given by
$$U \to U' = TSRU. \tag{3.18}$$
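The projection (3.17) can be verified numerically. The sketch below assumes a random complex 2×2 sub block W and a random SU(2) element R in the quaternion parametrization, and checks that the real part of the trace is unchanged and that W′ is proportional to an SU(2) matrix. All names are illustrative.

```python
import math
import random

rng = random.Random(7)

def rand_complex():
    return complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

def rand_su2():
    # Random SU(2) element from a normalized quaternion (a0, a1, a2, a3).
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    a0, a1, a2, a3 = (c / norm for c in q)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def re_trace_product(r, w):
    # Re tr[R W] restricted to the 2x2 sub block.
    return sum(r[i][j] * w[j][i] for i in range(2) for j in range(2)).real

# Arbitrary complex 2x2 sub block of W = U A (not proportional to SU(2)).
W = [[rand_complex(), rand_complex()], [rand_complex(), rand_complex()]]

# Projection (3.17).
W_proj = [
    [0.5 * (W[0][0] + W[1][1].conjugate()), 0.5 * (W[0][1] - W[1][0].conjugate())],
    [0.5 * (W[1][0] - W[0][1].conjugate()), 0.5 * (W[0][0].conjugate() + W[1][1])],
]

R = rand_su2()
before = re_trace_product(R, W)
after = re_trace_product(R, W_proj)

# W' = a V: the determinant is real and non-negative, and V is unitary.
det = W_proj[0][0] * W_proj[1][1] - W_proj[0][1] * W_proj[1][0]
a = math.sqrt(det.real)
V = [[W_proj[i][j] / a for j in range(2)] for i in range(2)]
unitarity_defect = max(
    abs(sum(V[i][k] * V[j][k].conjugate() for k in range(2)) - (1.0 if i == j else 0.0))
    for i in range(2) for j in range(2)
)
print(f"Re tr[RW] = {before:.6f}, Re tr[RW'] = {after:.6f}")
```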
This method also remains valid for non quenched actions, as can be seen in section 3.4.3.
For a full lattice update, one has to visit all of its elements. The order in which the individual links are visited does not influence the equilibrium distribution, but it controls the speed of reaching the equilibrium. A common choice is to update all elements which sit on even lattice points first and then all elements on odd points.
3.3. Rational Hybrid Monte Carlo
For generating dynamical configurations, which means including the determinant in equation (3.3) in the probability distribution, one needs a different approach. Instead of a local algorithm, the Hybrid Monte Carlo updates the whole configuration in one step. Let us first look at s = 2. The determinant may now be rewritten as
$$(\det D)^2 = \det[DD^\dagger] = \pi^{-|\Lambda|}\int D\phi\, D\phi^\dagger\, e^{-\phi^\dagger (DD^\dagger)^{-1}\phi}, \tag{3.19}$$
where φ is a complex, bosonic field with color structure living on the lattice points, often called pseudo fermion. The full partition function is then given by
$$Z = \int D\phi\, D\phi^\dagger\, DU\, e^{-\phi^\dagger (DD^\dagger[U])^{-1}\phi - S_G[U]}, \tag{3.20}$$
where we omitted the trivial $\pi^{-|\Lambda|}$ factor. We now insert another field Pµ(n) that does not change expectation values,
Z = ∫ DPDφDφ†DUe−
. (3.21)
the exponent can be interpreted as a Hamiltonian with the conjugate variables Q and P. The principle of the Hybrid Monte Carlo is to solve the resulting equations of motion
$$\dot P = -\frac{\partial H}{\partial Q}, \tag{3.23}$$
$$\dot Q = \frac{\partial H}{\partial P} \tag{3.24}$$
numerically using a leapfrog algorithm [20]. The necessary fields φ and P are newly generated for each update.
For an exact solution, this would result in the right probability distribution. However, the numerical integration of the equations of motion introduces errors that have to be corrected. This can be done by a Metropolis step: a generated configuration is only accepted if a random number r ∈ [0, 1) is smaller than exp(S − S′), where S and S′ are the values of the action before and after the update.
For s ≠ 2, the necessary inversion $(DD^\dagger)^{-s/2}$ can be approximated with a rational function, which gives the method its name 'Rational Hybrid Monte Carlo' (RHMC).
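The structure of a Hybrid Monte Carlo update (momentum refresh, leapfrog integration, Metropolis correction) can be sketched for a toy system. The example assumes a single degree of freedom q with the hypothetical action S(q) = q²/2 and the Hamiltonian H = p²/2 + S(q); it contains no gauge fields, pseudo fermions or rational approximation, and all names and step sizes are illustrative.

```python
import math
import random

def S(q):
    # Toy action (assumption): S(q) = q^2 / 2.
    return 0.5 * q * q

def dS(q):
    # Derivative of the toy action, i.e. minus the force.
    return q

def leapfrog(q, p, eps, n_steps):
    # Integrate dq/dt = dH/dp and dp/dt = -dH/dq with H = p^2/2 + S(q).
    p -= 0.5 * eps * dS(q)
    for _ in range(n_steps - 1):
        q += eps * p
        p -= eps * dS(q)
    q += eps * p
    p -= 0.5 * eps * dS(q)
    return q, p

def hmc(n_conf, eps=0.3, n_steps=5, seed=2):
    rng = random.Random(seed)
    q = 0.0
    chain = []
    accepted = 0
    for _ in range(n_conf):
        p = rng.gauss(0.0, 1.0)            # momenta are refreshed for every update
        h_old = 0.5 * p * p + S(q)
        q_new, p_new = leapfrog(q, p, eps, n_steps)
        h_new = 0.5 * p_new * p_new + S(q_new)
        # Metropolis step corrects the integration error: accept with exp(H - H').
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        chain.append(q)
    return chain, accepted / n_conf

chain, acceptance = hmc(50_000)
q2 = sum(q * q for q in chain) / len(chain)
print(f"<q^2> = {q2:.3f}, acceptance = {acceptance:.3f}")
```

A smaller step size `eps` raises the acceptance rate at the price of more force evaluations per trajectory.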
A great advantage of this rational Hybrid Monte Carlo is that it
does not rely on special properties of the used action. This makes
it possible to use improved actions like the HISQ action [15]. The
disadvantage is the global update, which makes numerical approaches
that rely on the locality of the action not applicable.
3.4. Local boson fields
The method of local boson fields is another method to approximate the determinant det D in equation (3.3), first proposed by Lüscher in 1993 [9]. In contrast to RHMC, the corresponding actions stay local.
The main idea is to approximate this determinant using a polynomial and a set of local boson fields. The polynomial
$$P_N(z) = \sum_{i=0}^{N} c_i z^i = c_N\prod_{k=1}^{N}(z - z_k) \tag{3.25}$$
is chosen in such a way that it approximates $z^{-s}$ in a certain interval. The determinant may now be put into the denominator of a fraction by
$$(\det D)^s \approx \frac{1}{\det P_N(D)}. \tag{3.26}$$
This matrix polynomial only converges if the eigenvalues lie inside the convergence region of P(z). Let us assume that this is the case for P with matrix D. Moreover, we require at this point that the roots zk come in complex conjugate pairs.
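Then we can rewrite the determinant as a Gaussian integral over N/2 boson fields:
$$\begin{aligned}(\det D)^s &\approx |c_N|^{-|\Lambda|}\prod_{k=1}^{N}\frac{1}{\det(D - z_k)} = |c_N|^{-|\Lambda|}\prod_{k=1}^{N/2}\frac{1}{\det\left[(D - z_k)^\dagger (D - z_k)\right]}\\ &= |c_N|^{-|\Lambda|}\int\prod_{k=1}^{N/2} D\phi_k\, D\phi_k^\dagger\; e^{-\phi_k^\dagger (D - z_k)^\dagger (D - z_k)\phi_k}\\ &=: |c_N|^{-|\Lambda|}\int\prod_{k=1}^{N/2} D\phi_k\, D\phi_k^\dagger\; e^{-S_L}.\end{aligned} \tag{3.27}$$
Here the φk are, as for the RHMC, boson fields with color structure, so-called pseudo fermions, and |Λ| is the volume of the lattice. In the first step we have used
$$D^\dagger_{n,m} = \eta_5(n)\, D_{n,m}\, \eta_5(m), \qquad \eta_5(n)^2 = 1 \tag{3.28}$$
for staggered fermions or
$$D^\dagger_{n,m} = \gamma_5\, D_{n,m}\, \gamma_5, \qquad \gamma_5^2 = 1 \tag{3.29}$$
for Wilson fermions, respectively.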
3.4.1. Even odd preconditioning
The convergence of the polynomial depends on the position of the eigenvalues. As the spectrum of the Wilson matrix differs from the staggered formulation and the numerical calculations in this work are performed using the latter, we concentrate in the following on staggered fermions. For Wilson fermions see [21, 22].
The eigenvalues of the staggered Dirac matrix lie on a line between m − iλmax and m + iλmax. This is inconvenient for the construction of the polynomial. Through even odd preconditioning it is possible to map this region to $[m^2, m^2 + \lambda^2_{max}]$. A construction without even odd preconditioning with s = 1 can be found in [23].
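The concept is based on the even odd symmetry of the Dirac matrix. We observe that D only connects even lattice points with odd ones and vice versa. If we now order the vector of boson fields as
$$\phi = \begin{pmatrix}\phi_e\\ \phi_o\end{pmatrix}, \tag{3.30}$$
the Dirac matrix takes the block form
$$D := \begin{pmatrix} m\mathbb{1} & D_{eo}\\ D_{oe} & m\mathbb{1}\end{pmatrix}, \tag{3.31}$$
and its determinant can be written as
$$\det D = \det(m^2 - D_{oe}D_{eo}). \tag{3.32}$$
Therefore, the spectrum lies, as intended, within $[m^2, m^2 + \lambda^2_{max}]$. We could now use the preconditioned matrix $m^2 - D_{oe}D_{eo}$ as D in the formulation of the Lüscher action $S_L$:
$$S_L = \sum_{k=1}^{N/2}\phi_k^\dagger (D - z_k)^\dagger (D - z_k)\phi_k. \tag{3.33}$$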
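However, D†D is less local and an update formulation becomes much harder. This can be solved by rewriting the roots zk → rk and demanding
$$\det(D - z_k) = \det(m^2 - D_{oe}D_{eo} - z_k) \overset{!}{=} \det\begin{pmatrix}(m - r_k)\mathbb{1} & D_{eo}\\ D_{oe} & (m - r_k)\mathbb{1}\end{pmatrix}. \tag{3.34}$$
The new 'roots' are given by $r_k = m - \sqrt{m^2 - z_k}$, and the even odd preconditioning is simply introduced by a redefinition of the roots. For the construction in (3.27) we need the roots to come in complex conjugate pairs. This can be ensured if we choose
$$\sqrt{z_k} = \begin{cases}\sqrt{r}\,e^{i\theta/2}, & \text{for } \mathrm{Im}(z_k) > 0\\ \sqrt{r}\,e^{i(\theta/2 + \pi)}, & \text{for } \mathrm{Im}(z_k) < 0\end{cases} \tag{3.35}$$
with $r = |z_k|$ and $\theta = \arg(z_k)$. The total Lüscher action now reads
$$S_L = \sum_{k=1}^{N/2}\phi_k^\dagger (D - r_k)^\dagger (D - r_k)\phi_k. \tag{3.36}$$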
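The determinant identities behind the even odd preconditioning can be checked numerically on a small example. The sketch assumes random complex 3×3 blocks with Doe = −Deo† (anti-Hermitian hopping part), a hypothetical mass m = 0.1 and an arbitrary complex number standing in for a root zk. It compares the determinant of the full block matrix with the determinant of the reduced matrix, both for the diagonal m and for the shifted diagonal m − rk.

```python
import cmath
import random

def det(mat):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in mat]
    n = len(a)
    result = 1.0 + 0.0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0.0:
            return 0.0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block_matrix(diag, d_eo, d_oe):
    # ( diag * 1   D_eo     )
    # ( D_oe       diag * 1 )
    n = len(d_eo)
    top = [[diag if i == j else 0j for j in range(n)] + d_eo[i] for i in range(n)]
    bottom = [d_oe[i] + [diag if i == j else 0j for j in range(n)] for i in range(n)]
    return top + bottom

def reduced_matrix(diag, d_eo, d_oe):
    # diag^2 * 1 - D_oe D_eo
    n = len(d_eo)
    prod = matmul(d_oe, d_eo)
    return [[(diag * diag if i == j else 0j) - prod[i][j] for j in range(n)]
            for i in range(n)]

rng = random.Random(11)
n = 3
D_eo = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]
D_oe = [[-D_eo[j][i].conjugate() for j in range(n)] for i in range(n)]  # D_oe = -D_eo^dagger

m = 0.1                       # hypothetical mass
z_k = complex(0.3, 0.2)       # arbitrary stand-in for a polynomial root
r_k = m - cmath.sqrt(m * m - z_k)

full_m, reduced_m = det(block_matrix(m, D_eo, D_oe)), det(reduced_matrix(m, D_eo, D_oe))
full_r = det(block_matrix(m - r_k, D_eo, D_oe))
reduced_r = det(reduced_matrix(m - r_k, D_eo, D_oe))  # equals det(m^2 - z_k - D_oe D_eo)
print(abs(full_m - reduced_m), abs(full_r - reduced_r))
```

Both differences vanish up to rounding, since (m − rk)² = m² − zk by construction of rk.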
3.4.2. Choosing the polynomial
The accuracy of this local boson theory is controlled by the order N of the polynomial and by its roots. To save computer time and memory, it is necessary to get the best accuracy for a given N. For this purpose, different choices for the polynomial have been proposed. For Wilson fermions, Lüscher originally suggested Chebychev polynomials [9]. In [24] Montvay suggested using a least squares algorithm to optimize the polynomial. De Forcrand et al. proposed using adapted polynomials to reduce the error of the determinant [25]. For simplicity we concentrate on Chebychev polynomials and adopt the construction given in [26]. The Chebychev polynomials are defined by
$$T_k[x] = \cos(k \arccos(x)). \tag{3.37}$$
Through the even odd preconditioning, the eigenvalues of D lie within $[m^2, m^2 + \lambda^2_{max}]$, where λmax is the largest eigenvalue of MoeMeo. To match the convergence region of the Chebychev polynomials, [−1, 1], we have to shift and rescale the Dirac matrix. We start with the rescaling,
$$D'_{oo} = \frac{2}{2m^2 + \lambda^2_{max}}\, D_{oo} \tag{3.38}$$
and then shift with
$$D'_{oo} := D_{oo} - (1 - \varepsilon) M_{oo}, \tag{3.39}$$
where ε is defined as
$$\varepsilon := 1 - \frac{\lambda^2_{max}}{2m^2 + \lambda^2_{max}}. \tag{3.40}$$
λ2 max
. (3.41)
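The spectrum of −Moo then lies within [−1, 1]. We now write the polynomial expansion of $x^{-s}$ in terms of y as
$$x^{-s} = [1 + (1 - \varepsilon)y]^{-s} \approx \sum_{k=0}^{N} c_k T_k[y], \tag{3.42}$$
with y = (x − 1)/(1 − ε). The coefficients ck may then be calculated as
$$c_k = \frac{2}{\pi(1 + \delta_{k,0})}\int_{-1}^{1}\frac{[1 + (1 - \varepsilon)y]^{-s}\, T_k[y]}{\sqrt{1 - y^2}}\, dy = \frac{2r^k}{1 + \delta_{k,0}}\,(1 + r^2)^s\, F(s, s + k; 1 + k; r^2)\,\frac{\Gamma(s + k)}{\Gamma(s)\Gamma(1 + k)}, \tag{3.43}$$
where
$$r = \frac{-1 + \sqrt{\varepsilon(2 - \varepsilon)}}{1 - \varepsilon}, \tag{3.44}$$
F(α, β; γ; z) is the Gaussian hypergeometric function and Γ(z) is the Gamma function. It can be shown that the absolute error of the polynomial is bounded by [26]
$$\left|x^{-s} - P_N[x]\right| \le \frac{2}{\Gamma(s)}\left(\frac{1 + r^2}{1 - r^2}\right)^s\frac{(-r)^{N+1}}{1 + r}, \tag{3.45}$$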
with −1 ≤ r ≤ 0, which results in an exponential error reduction with increasing N. However, the rate of this decrease depends on the value of r. With the definition in (3.40) we get ε ≪ 1 for m ≪ 1 and, therefore, |r| ≲ 1. This is the main problem of the local boson construction, as the degree N of the polynomial has to be increased rapidly for decreasing mass m. Figure 3.1 shows the upper error bound depending on the mass for different numbers of roots. One clearly sees how the effort rises for decreasing m. Note that the largest absolute error usually occurs near m². Therefore, the relative error, defined as the residual
$$R(x) = \left|x\,(P_N(x))^{1/s} - 1\right|, \tag{3.46}$$
is much smaller. Figure 3.2 shows that this residual is uniformly distributed over the convergence region. Again, the need for a large number of roots for smaller masses becomes evident.
For the formulation of the local boson theory, the roots have to come in complex conjugate pairs. As the coefficients ck are real, this can easily be ensured by choosing N even. The roots may then be calculated numerically. They lie on an ellipse around the convergence region, as can be checked in figure 3.3.
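The exponential convergence of the Chebychev expansion can be explored with a short numerical sketch. It assumes that the coefficients are obtained by Chebychev-Gauss quadrature instead of the closed form (3.43), uses the illustrative parameters m = 0.1, λmax = 2.5 and s = 1/2, and measures the largest absolute error on a grid in y ∈ [−1, 1]. The function names are hypothetical, and the roots of the polynomial are not computed here.

```python
import math

def cheb_coeffs(f, n_max, n_nodes=512):
    # Chebychev-Gauss quadrature for c_k in f(y) ~ sum_k c_k T_k(y).
    thetas = [math.pi * (j + 0.5) / n_nodes for j in range(n_nodes)]
    fvals = [f(math.cos(t)) for t in thetas]
    coeffs = []
    for k in range(n_max + 1):
        c = 2.0 / n_nodes * sum(fv * math.cos(k * t) for fv, t in zip(fvals, thetas))
        coeffs.append(0.5 * c if k == 0 else c)
    return coeffs

def cheb_eval(coeffs, y):
    # Evaluate sum_k c_k T_k(y) using T_k(y) = cos(k arccos y).
    theta = math.acos(y)
    return sum(c * math.cos(k * theta) for k, c in enumerate(coeffs))

def max_abs_error(n, m=0.1, lam_max=2.5, s=0.5, n_grid=400):
    # epsilon as in (3.40), rewritten as 2 m^2 / (2 m^2 + lam_max^2).
    eps = 2.0 * m * m / (2.0 * m * m + lam_max ** 2)
    f = lambda y: (1.0 + (1.0 - eps) * y) ** (-s)
    coeffs = cheb_coeffs(f, n)
    worst = 0.0
    for i in range(n_grid + 1):
        y = -1.0 + 2.0 * i / n_grid
        worst = max(worst, abs(f(y) - cheb_eval(coeffs, y)))
    return worst

errors = {n: max_abs_error(n) for n in (16, 32, 64)}
for n, err in errors.items():
    print(f"N = {n:3d}: max |x^-s - P_N(x)| = {err:.3e}")
```

Lowering `m` in this sketch moves the singularity of the expanded function closer to the interval and slows the decay of the error with N, in line with the discussion above.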
Figure 3.1: Absolute error of the approximating polynomial according to equation (3.45) as a function of the mass m for different numbers of roots N (N = 64, 128, 256, 512), with λmax = 2.5 and s = 1/2.
Figure 3.2: Residual R(x) for different numbers of roots N at fixed m = 0.064 (left) and for different values of m (m = 0.016, 0.032, 0.064, 0.1) at fixed N = 128 (right), with λmax = 2.5 and s = 1/2.
Figure 3.3: Position of the roots zk in the complex plane for N = 32, m = 0.064, λmax = 2.5 and s = 1/2. The color map shows the value of the polynomial at the given point z.
3.4.3. Explicit update formulation
In the following, explicit calculations for the update process are shown. For the sake of brevity we skip the staggered phases and assume that they have been absorbed into the links. We aim to rewrite (3.27) in such a way that the overrelaxation and heatbath algorithms can be applied. For the boson fields, this requires a Gaussian like structure for one selected field element φk(x) in the exponent. First we expand the Lüscher action and get
$$S_{L,k} = \phi_k^\dagger (M + m - r_k)^\dagger (M + m - r_k)\phi_k = \phi_k^\dagger\left(M^\dagger M + 2\,\mathrm{Im}(r_k)\, iM - 2m\,\mathrm{Re}(r_k) + m^2 + |r_k|^2\right)\phi_k, \tag{3.47}$$
where we have used M† = −M. We now have to factor out a certain element of φk and therefore need to look closer at
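$$\phi_k^\dagger M^\dagger M\phi_k = \sum_n\Big(\tfrac{1}{2}\sum_\mu\big[U_\mu(n)\phi_k(n+\mu) - U^\dagger_\mu(n-\mu)\phi_k(n-\mu)\big]\Big)^\dagger\cdot\Big(\tfrac{1}{2}\sum_\nu\big[U_\nu(n)\phi_k(n+\nu) - U^\dagger_\nu(n-\nu)\phi_k(n-\nu)\big]\Big) \tag{3.48}$$
$$= \frac{1}{4}\sum_n\sum_{\mu,\nu}\phi_k^\dagger(n)\Big[\underbrace{-U_\mu(n)U_\nu(n+\mu)\phi_k(n+\mu+\nu)}_{=:\,\mathrm{i}(n)}\;\underbrace{+\,U_\mu(n)U^\dagger_\nu(n+\mu-\nu)\phi_k(n+\mu-\nu)}_{=:\,\mathrm{ii}(n)}\;\underbrace{+\,U^\dagger_\mu(n-\mu)U_\nu(n-\mu)\phi_k(n-\mu+\nu)}_{=:\,\mathrm{iii}(n)}\;\underbrace{-\,U^\dagger_\mu(n-\mu)U^\dagger_\nu(n-\nu-\mu)\phi_k(n-\nu-\mu)}_{=:\,\mathrm{iv}(n)}\Big]. \tag{3.49}$$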
When updating the boson fields, we change one element φk(x) of
field φk and keep the rest (φk(y)) fixed. We then get terms that
connect the element φk(x) with other constant field elements:
φ†k(x)Xφk(y) := 1 4φ † k(x)
∑ µ,ν
iv(x)
. (3.50)
The different connections of the field elements and links are
visualized in figure 3.4. The remaining terms connect φ(x) with
itself and simplify to
φ†k(x)Y φk(y) := 1 4φ † k(x)
(∑ µ=ν
and we get in total
φ†kM †Mφk = φ†k(x)Xφk(y) + φ†k(y)X†φk(x) + 2 |φk(x)|2 + const. .
(3.52)
For the update process, we have to include all terms which contain
the element φk(x). These can stem from the left side of equation
(3.47) and from the right side. We get
φ†k (D − rk)† (D − rk)φk = φ†k (M +m− rk)† (M +m− rk)φk =φ†k(x)
[Xφk(y) + 2 Im(rk)iMφk(y)]
:=bk
]
)
and can now introduce the gaussian like structure with( A
1/2 k φ†k(x) + b†kA
−1/2 k
Figure 3.4: Visualization of the squared Dirac matrix. The µ direction is fixed to n1, while both two dimensional possibilities for the direction of ν (n1 and n2) are shown (dashed links). The starting field element φ(n) is colored in blue.
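With a Gaussian distributed field χ, one heatbath step is given by
$$\chi = A_k^{1/2}\phi'_k(x) + b_k A_k^{-1/2} \quad\Leftrightarrow\quad \phi'_k(x) = \chi A_k^{-1/2} - b_k A_k^{-1}, \tag{3.55}$$
while an overrelaxation step according to (3.8) reads
$$\phi'_k(x) = -\phi_k(x) - 2A_k^{-1}b_k. \tag{3.56}$$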
In the gauge field update, we want to update one certain link Uσ(x) and keep the rest constant. To see where this one link contributes to the total value of the Lüscher action, we look again at (3.48) and factor out Uσ(x). We then get
$$S_{L,k} = \frac{1}{2}\Big(\phi_k^\dagger(x+\sigma)\,U^\dagger_\sigma(x)\,a_k - \phi_k^\dagger(x)\,U_\sigma(x)\,b_k + a_k^\dagger\,U_\sigma(x)\,\phi_k(x+\sigma) - b_k^\dagger\,U^\dagger_\sigma(x)\,\phi_k(x)\Big) + \text{const.} \tag{3.57}$$
2 + Im(rk)φk(x) (3.58)
∑ ν 6=σ U
2 − Im(rk)φk(x+ σ) . (3.59)
Since everything is scalar in total, this can be combined to
$$S_{L,k} = \mathrm{Re}\left[\phi_k^\dagger(x+\sigma)\,U^\dagger_\sigma(x)\,a_k - \phi_k^\dagger(x)\,U_\sigma(x)\,b_k\right] + \text{const.} \tag{3.60}$$
and we can introduce a trace over this scalar and reorder the links
and fields:
SL,k = Re {
]} + const.
= Re {
]} + const.
= Re {
= Re {tr [Uσ(x)Fφ,k]} (3.61)
Here we have used the fact that we are just considering the real part in the first step. Including the staple $F'_U$ and the link $U'_\sigma$ stemming from the non staggered gauge part of the action, we get in total
$$S = U'_\sigma F'_U - U_\sigma\sum_k F_{\phi,k}. \tag{3.62}$$
For staggered fermions, the gauge part has been transformed as $U'_\sigma F'_U = -U_\sigma F_U$ and we can use the standard overrelaxation and heatbath algorithms with the weight
$$P[U, \phi] = \frac{1}{Z}\exp\Big\{-U_\sigma(x)\Big(F_U + \sum_k F_{\phi,k}\Big)\Big\}. \tag{3.63}$$
This formulation shows one of the great disadvantages of the local boson theory. It is very difficult to bring improvements of the action, like the HISQ action, into the above form. Moreover, such improvements are less local, whereas locality is necessary for the error reduction in section 4. Therefore, one has to deal with O(a²) errors and, moreover, it is impossible to introduce terms that reduce the taste mixing of the staggered action, as is the case for the HISQ action.
Figure 4.1: Lattice structure for one sublattice and a Wilson loop wrapping around the lattice (labels: link sets V and U, operators X[V] and A[U]).
4. Error reduction for gluonic operators
Many physical observables can be calculated on the lattice using purely gluonic operators. However, this does not mean one can fully skip the fermionic part, as it also influences the link variables. Therefore, in the quenched approximation, one neglects some information.
In the calculation of expectation values in lattice gauge theories, one has to deal with a finite set of gauge configurations. This leads to statistical errors of these expectation values. Unfortunately, these errors can be very large for some operators, especially for large distance correlations. To reduce the noise, which usually stems only from parts of the operator, different approaches have been suggested. In 1983, Parisi et al. suggested integrating over individual links while keeping all other links constant [27]. The disadvantage is that this link integration is only possible for operators whose links lie on a straight line.
This problem was solved by Lüscher and Weisz using sublattice updates [8]. Both methods have in common that they only work with local actions. So far, no attempt has been made to use error reduction methods for dynamical fermions. In the following we first describe the basic methods and then formulate a variant of the Lüscher-Weisz algorithm for dynamical fermions.
4.1. Luscher Weisz method
We start with a method in which the noise contribution of one part of an operator is reduced, and then show that the statistical error decreases rapidly if we reduce the noise of more than one part. We assume that we can split the operator O into a product of the operators X[V] and A[U], where U and V are disjoint sets of link variables in different sublattices. An example of such an operator is a Wilson loop wrapping around the whole lattice. Figure 4.1 shows how the lattice is divided into two sublattices. It is not necessary that the links at the borders of the sublattice are included. The sublattice then has the form of a comb (see figure 4.4).
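The expectation value ⟨XA⟩ is computed according to
$$\langle O\rangle = \langle XA\rangle = \frac{1}{Z}\int DU\, DV\; X[V]\, A[U]\, e^{-S[U,V]} \tag{4.1}$$
with
$$Z = \int DU\, DV\; e^{-S[U,V]}. \tag{4.2}$$
Figure 4.2: Lattice split into the sublattices V and W and the fixed regions U.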
To reduce the noise stemming from the operator X, the central idea is to replace X[V] by another operator $\langle X\rangle_U$ without changing the expectation value. We define $\langle X\rangle_U$ as
$$\langle X\rangle_U = \frac{\int DV'\, X[V']\, e^{-S_{loc.}[U,V']}}{\int DV'\, e^{-S_{loc.}[U,V']}}, \tag{4.3}$$
where $S_{loc.}[U, V']$ consists of all terms in the local action that are connected to any element of V. Therefore, $\langle X\rangle_U$ depends on U only through the border links that lie next to V. For a better overview we write
$$f(U, V') := e^{-S_{loc.}[U,V']} \tag{4.4}$$
and define g(U) to collect all remaining terms, including A[U]. We then get
$$\begin{aligned}\langle XA\rangle &= \frac{1}{Z}\int DU\, DV\; X[V]\, f(U,V)\, g(U)\\ &= \frac{1}{Z}\int DU\, DV\; X[V]\, f(U,V)\, g(U)\,\frac{\int DV' f(U,V')}{\int DV' f(U,V')}\\ &= \frac{1}{Z}\int DU\, DV\; \frac{\int DV' X[V']\, f(U,V')}{\int DV' f(U,V')}\, f(U,V)\, g(U)\\ &= \langle A\,\langle X\rangle_U\rangle,\end{aligned} \tag{4.5}$$
where we have first expanded the fraction with $\int DV' f(U, V')$ and then used the symmetry between V and V′ in the numerator, changing X[V] to X[V′] in the third line. Since $\langle X\rangle_U$ is an averaged quantity, the noise stemming from X in Monte Carlo simulations is now reduced.
The same trick is also possible for more than one sublattice. Here we split the lattice into four parts, get four operators X[V], A[U], Y[W] and B[U], and want to compute ⟨XAYB⟩ via interchange of X and Y (see figure 4.2). $\langle X\rangle_U$ and $\langle Y\rangle_U$ are now defined as
$$\langle X\rangle_U = \frac{\int DV'\, X[V']\, e^{-S_{loc.}[U,V']}}{\int DV'\, e^{-S_{loc.}[U,V']}}, \qquad \langle Y\rangle_U = \frac{\int DW'\, Y[W']\, e^{-S_{loc.}[U,W']}}{\int DW'\, e^{-S_{loc.}[U,W']}}. \tag{4.6}$$
Using the same trick, we can now write
$$\langle XAYB\rangle = \langle\, \langle X\rangle_U\, A\, \langle Y\rangle_U\, B\,\rangle. \tag{4.7}$$
In terms of averaging over gauge configurations, we now correlate two different averages, which leads to a very high error reduction.
Note that the construction above only works if $S_{loc.}[U, V']$ does not depend on W and $S_{loc.}[U, W']$ does not depend on V. Otherwise one would lose the symmetry between V and V′ or W and W′, respectively. Depending on the locality of the action, this might lead to a need for gaps between the sublattices. This means it might be impossible to shrink U to zero.
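The effect of replacing two operators by their sublattice averages, as in (4.7), can be illustrated with a toy model. The sketch assumes three real variables instead of link variables: a "fixed" variable u and two "sublattice" variables v and w with the hypothetical action S = u²/2 + (v − u)²/2 + (w − u)²/2, and the operators X = v, Y = w, A = B = 1. For fixed u, v and w are independent, which plays the role of the locality requirement. The sub-averages are estimated from `n_sub` sub-updates; all names are illustrative.

```python
import random

def run(n_conf=20_000, n_sub=10, seed=4):
    rng = random.Random(seed)
    naive, improved = [], []
    for _ in range(n_conf):
        u = rng.gauss(0.0, 1.0)  # marginal distribution of u for this toy action
        # Naive estimator: one sample of v and w per configuration.
        v = rng.gauss(u, 1.0)
        w = rng.gauss(u, 1.0)
        naive.append(v * w)
        # Improved estimator: sub-averages of v and w at fixed u.
        v_avg = sum(rng.gauss(u, 1.0) for _ in range(n_sub)) / n_sub
        w_avg = sum(rng.gauss(u, 1.0) for _ in range(n_sub)) / n_sub
        improved.append(v_avg * w_avg)
    return naive, improved

def mean_and_variance(values):
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    return mean, var

naive, improved = run()
mean_naive, var_naive = mean_and_variance(naive)
mean_improved, var_improved = mean_and_variance(improved)
print(f"naive:    <XY> = {mean_naive:.3f}, variance = {var_naive:.2f}")
print(f"improved: <XY> = {mean_improved:.3f}, variance = {var_improved:.2f}")
```

Both estimators agree with the exact value ⟨vw⟩ = ⟨u²⟩ = 1 within errors, while the variance of the improved estimator is clearly smaller. In this toy model the gain saturates once the remaining fluctuations of u dominate.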
An improvement of this algorithm is an interlacing of the sublattices. Assuming a very local action that makes it possible to shrink the fixed area U to zero, one could for example calculate ⟨ABCD⟩ through
$$\langle ABCD\rangle = \langle\,\langle\langle A\rangle\langle B\rangle\rangle\,\langle\langle C\rangle\langle D\rangle\rangle\,\rangle, \tag{4.8}$$
where A, B, C and D all lie in different sublattices. Each bracket denotes a different average. The inner ones are small sublattices that include only one of the above operators. The outer brackets denote the whole lattice update. Additionally, sublattices that include A and B, or C and D, at the same time are realized. For the update process, one starts with generating sublattices that include both A and B, or C and D. Within these sublattices, new sub updates are generated that include only one of the parts A, B, C or D.
However, this pyramiding only makes sense if the action is very local and the noise stems from many terms in the actual operator. For Wilson loop correlations in quenched theory, Lüscher and Weisz showed that this results in an exponential error reduction [8].
4.2. Local boson fields and sublattices
For local boson algorithms, the term $e^{-S[U]}$ is given by
$$\prod_{k=1}^{n} D\phi_k\, D\phi_k^\dagger\; e^{-S[U,\phi_k,\phi_k^\dagger]}, \tag{4.9}$$
∑ k
φ†k (D[U ]− rk)† (D[U ]− rk)φk − SG[U ] . (4.10)
Figure 4.3: Lattice structure for one sublattice with local boson fields (labels: V, U, ψk, φk). A Wilson loop is wrapped around the lattice. The braces mark the region of the local boson sublattices, while the link sublattices are defined through the continuous line.
Since this modified action is local, we can define $\langle X\rangle_U$ and $\langle Y\rangle_U$ without dependence on W and V, respectively. But we now have to take care of the boson fields. Therefore, we define another sublattice ψk which contains the elements of the boson fields we want to average over. This sublattice does not necessarily have to share the dimension of the link sublattice V. The remaining elements of the boson field are combined into φk. An example configuration can be seen in figure 4.3. Then, the average $\langle X\rangle_U$ also depends on φk and we get
$$\langle X\rangle_{U,\phi_k} = \frac{\int DV'\, D\psi'_k\; X[V']\, e^{-S_{L,loc.}[U,V',\phi_k,\psi'_k]}}{\int DV'\, D\psi'_k\; e^{-S_{L,loc.}[U,V',\phi_k,\psi'_k]}}, \tag{4.11}$$
where we do not explicitly write down the product over the k's and the dependencies on the daggered terms. Now we are able to use the same trick as in equation (4.5) if we define f(U, V, ψ, φ) to contain all those terms that are connected to a link of V or to any boson field element of ψk. The remaining terms are combined in g(U, φ). The proof then proceeds analogously to (4.5).
4.3. Optimal sublattice shape
In the following, we want to construct the optimal shape of the
different sublattices. We first look at the quenched case and then
include boson fields. To also reduce the noise for short distances
one wants to shift the sublattices as close to each other as
possible. This requires that, in the local actions, no element of
one sublattice is multiplied to any element out of the other
sublattice. The gauge action involves the plaquette. If one updates
the full sublattices including the border links, one needs at least
a gap with the width of two links between them. On the contrary,
without the border links, one can align the sublattices directly
next to each other. (See figure 4.4)
Including the local boson fields, the situation becomes more complicated. In the update of the boson fields, one element φk(x) depends on its next-to-nearest neighbors, for example on the element φk(x + 2µ). This means that we have to separate the boson field sublattices by at least distance 3. A good compromise can be found in figure 4.5, where the boson field sublattices are separated by distance 3 and the gauge sublattices by distance 1. We are not able to shift the link sublattices closer to each other, as the boson fields would then depend on the links in the center of the gap between the boson field sublattices.
Figure 4.4: Optimal shape of two neighboring quenched sublattices. The two sublattices of links, V (red) and W (blue), are drawn with fat lines. The fixed links are drawn with dashed lines. One link of sublattice V (Uµ(x)) and its furthest dependencies in the update process, such as U−ν(x + µ), are highlighted (green).
Figure 4.5: Optimal shape of two neighboring local boson sublattices. The two link sublattices, V (red) and W (blue), are drawn with fat lines. The local boson sublattices ψ (red) and χ (blue) are marked with circles. Two elements of the boson fields, ψk(x) (orange) and χk(y) (green), and their furthest dependencies for the update process, such as φk(x + 2µ) and U−µ(y − µ), are highlighted.
5. Heavy quark diffusion
Despite the high numerical costs, error reduction methods for pure
gluonic operators are necessary for correlators that have a bad
signal to noise ratio for larger distances. Such correlators occur
amongst others in the calculation of transport coefficients in
heavy quark diffusion.
We chose the color electric correlator defined in [7] to test the
new error reduction methods for dynamical fermions. This correlator
can be used to find the heavy quark momentum diffusion coefficient.
As this correlator is an addition of different Wilson loops, the
above error reduction methods are applicable.
5.1. The spectral function
The usual calculation of transport properties in the quark-gluon plasma using lattice QCD relies on the connection between current-current correlators and their spectral distribution. In the following we give a short overview of spectral functions and their connection to transport coefficients, following [28] and [29]. For a detailed description see [28]. We start with the definition of the Wightman correlation functions in real time,
$$\begin{aligned}G^{AB}_>(t) &:= \mathrm{Tr}\{\rho A(t)B(0)\} = \langle A(t)B(0)\rangle,\\ G^{AB}_<(t) &:= \mathrm{Tr}\{\rho B(0)A(t)\} = \langle B(0)A(t)\rangle = G^{BA}_>(-t),\end{aligned} \tag{5.1}$$
where ρ is the density matrix $\rho = \frac{1}{Z}e^{-\beta H}$ and A(t), B(t) are operators in the Heisenberg picture. From
$$e^{-\beta H}A(t)e^{\beta H} = A(t + i\beta) \tag{5.2}$$
one can derive the Kubo-Martin-Schwinger (KMS) relation,
$$G^{AB}_>(t) = G^{BA}_>(-t - i\beta). \tag{5.3}$$
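It follows that the Fourier transforms of the Wightman correlation functions,
$$G^{AB}_>(\omega) = \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, G^{AB}_>(t), \qquad G^{AB}_<(\omega) = \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, G^{AB}_<(t), \tag{5.4}$$
fulfill the relation
$$G^{AB}_<(\omega) = G^{BA}_>(-\omega) = e^{-\beta\omega}\, G^{AB}_>(\omega). \tag{5.5}$$
To define the spectral function, we first look at the expectation value of the commutator,
$$G^{AB}(t) = i\,\mathrm{Tr}\{\rho[A(t), B(0)]\} = i\left(G^{AB}_>(t) - G^{AB}_<(t)\right), \tag{5.6}$$
with the property
$$G^{A^\dagger B^\dagger}(t) = -G^{AB}(t^*)^*. \tag{5.7}$$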
We also define the retarded correlator as the positive half Fourier integral over this commutator,
$$G^{AB}_R(\omega) = \int_0^{\infty} dt\, e^{i\omega t}\, G^{AB}(t). \tag{5.8}$$
The spectral function can now be defined through the Fourier transform $G^{AB}(\omega)$ of the commutator,
$$\rho^{AB}(\omega) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} dt\, e^{i\omega t}\, G^{AB}(t). \tag{5.9}$$
Through the properties (5.7) this spectral function may be written
as
ρAB(ω) = 1 2πi
) (5.10)
ρAA †(ω) = 1
† R (ω). (5.11)
For simplicity we choose from now on the special case A = B = A† and, therefore, omit the indices A and B. The connection between the Wightman correlation functions and the spectral function can then be easily obtained from equation (5.5):
$$G_>(\omega) = \frac{2\pi e^{\beta\omega}}{e^{\beta\omega} - 1}\,\rho(\omega) \quad\text{and}\quad G_<(\omega) = \frac{2\pi}{e^{\beta\omega} - 1}\,\rho(\omega). \tag{5.12}$$
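The step from (5.5) to (5.12) can be spelled out as a short worked derivation. It only uses the case A = B, the commutator (5.6), the definition (5.9) of the spectral function and the relation (5.5) between the two Wightman functions:

$$\begin{aligned}\rho(\omega) &= \frac{1}{2\pi i}\, i\left(G_>(\omega) - G_<(\omega)\right) = \frac{1}{2\pi}\left(1 - e^{-\beta\omega}\right)G_>(\omega),\\ \Rightarrow\quad G_>(\omega) &= \frac{2\pi}{1 - e^{-\beta\omega}}\,\rho(\omega) = \frac{2\pi e^{\beta\omega}}{e^{\beta\omega} - 1}\,\rho(\omega),\\ G_<(\omega) &= e^{-\beta\omega}\, G_>(\omega) = \frac{2\pi}{e^{\beta\omega} - 1}\,\rho(\omega).\end{aligned}$$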
So far, the spectral function has been defined in Minkowski space. We establish the connection to lattice correlators by switching to imaginary time τ and define the Euclidean correlator as
$$G_E(\tau) = G_>(-i\tau). \tag{5.13}$$
Writing the retarded correlator in real time as
$$G_R(t) = i\theta(t)\,\langle[A(t), A(0)]\rangle \tag{5.14}$$
and expressing θ(t) through
$$\theta(t) = i\int_{-\infty}^{\infty}\frac{d\omega'}{2\pi}\,\frac{e^{-i\omega' t}}{\omega' + i\delta}, \tag{5.15}$$
one can easily see that the frequency space retarded correlator is directly linked to the Euclidean correlator by analytic continuation,
$$G_R(\omega) = G_E(\omega_n \to -i\omega - i\delta), \tag{5.16}$$
where $G_E(\omega_n)$ are the Fourier coefficients of the Euclidean correlator,
$$G_E(\omega_n) = \int_0^{\beta} d\tau\, e^{i\omega_n\tau}\, G_E(\tau). \tag{5.17}$$
Thus, after computing the Euclidean correlator on the lattice, one could extract the spectral function through analytic continuation.
The spectral function can also be directly linked to the Euclidean correlator. To relate to physics, we now look at hadronic operators,
$$J_H(\tau, \mathbf{x}) := \bar\psi(\tau, \mathbf{x})\,\Gamma_H\,\psi(\tau, \mathbf{x}), \tag{5.18}$$
where $\Gamma_H$ defines the particle channel through $\Gamma_H \in \{1, \gamma_5, \gamma_\mu, \gamma_\mu\gamma_5\}$. The corresponding Euclidean correlator is
$$G_H(\tau, \mathbf{x}) = \langle J_H(\tau, \mathbf{x})\, J_H(0, \mathbf{0})\rangle. \tag{5.19}$$
Using the definition of the Euclidean correlator we can write
GH(τ,p) = ∫
) .
The spectral function can now be inserted through equation (5.12),
\[ G_H(\tau,\mathbf{p}) = \int_0^\infty d\omega\, K(\omega,\tau)\, \rho_H(\omega,\mathbf{p}). \tag{5.22} \]
Here one can see the physical meaning of the spectral function. As $K(\omega,\tau)$ is the free boson propagator, the spectral function denotes the spectral distribution of the current-current correlator in terms of energy.
Inverting equation (5.22) then offers another method to extract the spectral function. However, both methods are very imprecise in practice, as the Euclidean correlation function can only be calculated at a finite number of discrete values of τ. This makes it impossible to determine the spectral function without making assumptions either about the correlator or about the spectral function itself.
5.2. Transport properties through spectral functions
Now we want to see how transport properties are encoded in the
spectral function. We concentrate on the diffusion of heavy quarks
in the quark gluon plasma. For the derivation of other transport
properties see e.g. [28].
Transport coefficients are mainly relevant for systems out of
equilibrium. Now, the idea is to only introduce a small, slow
perturbation to a system. The response of the system is expected to
be linear, from which one can calculate the transport coefficients.
This Ansatz is called linear response theory. We start with the
perturbed Hamiltonian,
Hf (t) = H − f(t)B(t), (5.23)
where H is the Hamiltonian of the unperturbed system. Using the
standard evolution equation of an operator, one can find that the
difference of the perturbed and the unperturbed expectation value
of an operator A is given by [30]
δ A(t) := A(t)f − A(0)
= ∫ t
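We now assume the source term to be
\[ f(t) = e^{\varepsilon t}\, \theta(-t)\, f_0 \tag{5.25} \]
and define the static susceptibility $\chi_s^{AB}$ through
\[ \delta\langle A(t=0) \rangle_f =: \chi_s^{AB} f_0. \tag{5.26} \]
One can then show that the retarded correlator is given by
\[ G_R^{AB}(\omega)\, f_0 = \delta\langle A(0) \rangle_f + i\omega \int_0^\infty dt\, e^{i\omega t}\, \delta\langle A(t) \rangle_f. \tag{5.27} \]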
With these definitions, we can turn to heavy quark diffusion in the quark gluon plasma. Since the heavy quark mass satisfies $M \gg T$ and the thermal momentum is $p \sim \sqrt{MT}$, many collisions with the thermal medium are needed to change the momentum substantially. Therefore, it is possible to use the Langevin formalism to describe the thermalization of the heavy quarks [5]. The equations of motion read
\[ \frac{dx_i}{dt} = \frac{p_i}{M}, \qquad \frac{dp_i}{dt} = \xi_i(t) - \eta\, p_i(t), \qquad \langle \xi_i(t)\, \xi_j(t') \rangle = \kappa\, \delta_{ij}\, \delta(t - t'), \tag{5.28} \]
where η is a momentum drag coefficient and $\xi_i(t)$ is a source of temporally uncorrelated kicks. κ is the momentum diffusion coefficient, i.e. the mean squared momentum transfer per unit time.
For a given ξ(t), the solution is easily found and provides Einstein's fluctuation-dissipation relation
\[ \eta = \frac{\kappa}{2MT}. \tag{5.29} \]
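The relation (5.29) can be checked numerically. The following is a minimal sketch and not the procedure used in this work: it integrates the momentum equation of (5.28) for a single component with a simple Euler-Maruyama step and compares the long-time average of p² with the equipartition value MT. The function name, the step size, the seed and the parameter values (in arbitrary units) are illustrative assumptions.

```python
import math
import random


def simulate_langevin(M, T, kappa, dt=0.05, n_steps=400_000, seed=1):
    """Euler-Maruyama integration of dp/dt = xi(t) - eta * p for one momentum component.

    eta is fixed by the fluctuation-dissipation relation eta = kappa / (2 M T).
    Returns the time average of p^2 after a burn-in phase.
    """
    rng = random.Random(seed)          # fixed seed: reproducible illustration
    eta = kappa / (2.0 * M * T)
    kick = math.sqrt(kappa * dt)       # <xi xi> = kappa * delta(t - t') gives variance kappa * dt per step
    p = 0.0
    burn_in = n_steps // 10            # discard the approach to equilibrium
    acc = 0.0
    count = 0
    for step in range(n_steps):
        p += -eta * p * dt + kick * rng.gauss(0.0, 1.0)
        if step >= burn_in:
            acc += p * p
            count += 1
    return acc / count


# Assumed example values: M = 10, T = 1, kappa = 20, hence eta = 1.
M, T, kappa = 10.0, 1.0, 20.0
p2 = simulate_langevin(M, T, kappa)
print(f"<p^2> = {p2:.3f}, M*T = {M * T:.3f}")
```

With a finite step size the discrete process has a stationary variance slightly above MT, by a relative amount of order η·dt, so a smaller dt reduces this bias at the cost of more steps.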
From a comparison with hydrodynamical linear response theory, one can also derive a relation to the diffusion coefficient D [28],
\[ D = \frac{2T^2}{\kappa}. \tag{5.30} \]
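Let $P(t,\mathbf{x})$ be the probability for a heavy quark to start at the origin and to diffuse to $\mathbf{x}$ within the time t. If $N(0,\mathbf{x})$ is the initial distribution of the quarks, the distribution after time t will be
\[ N(t,\mathbf{x}) = \int d^3x'\, P(t,\mathbf{x}-\mathbf{x}')\, N(0,\mathbf{x}'), \tag{5.31} \]
or in momentum space,
\[ N(t,\mathbf{k}) = P(t,\mathbf{k})\, N(0,\mathbf{k}). \tag{5.32} \]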
Linear response theory can then be used to connect to the retarded current-current correlator ($\Gamma_H = \gamma_\mu$) by equation (5.27), and we get
\[ G_R^{00}(\omega,\mathbf{k}) = \chi_s(\mathbf{k}) \left( 1 + i\omega \int_0^\infty dt\, e^{i\omega t}\, P(t,\mathbf{k}) \right), \tag{5.33} \]
where the indices of $G_R^{00}$ refer to the density component of the current operator $J^\mu$.
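As an illustration of (5.33), one may assume, as a simple model that is not derived here, that the heavy quark density obeys the diffusion equation, so that $P(t,\mathbf{k}) = e^{-Dk^2 t}$. Reading (5.33) as $G_R^{00} = \chi_s \left( 1 + i\omega \int_0^\infty dt\, e^{i\omega t} P \right)$, the time integral can be done in closed form:

```latex
\begin{align}
\int_0^\infty dt\, e^{i\omega t}\, e^{-Dk^2 t} &= \frac{1}{Dk^2 - i\omega}, \\
G_R^{00}(\omega,\mathbf{k}) &= \chi_s(\mathbf{k}) \left( 1 + \frac{i\omega}{Dk^2 - i\omega} \right)
  = \chi_s(\mathbf{k})\, \frac{Dk^2}{Dk^2 - i\omega}, \\
\frac{\operatorname{Im} G_R^{00}(\omega,\mathbf{k})}{\omega} &= \chi_s(\mathbf{k})\, \frac{Dk^2}{\omega^2 + (Dk^2)^2}.
\end{align}
```

Under this assumption the retarded correlator has a single pole at $\omega = -iDk^2$, and its imaginary part divided by ω is a Lorentzian of width $Dk^2$.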
From a small perturbation of the chemical potential, $\mu(\mathbf{x}) = \mu_0 + \delta\mu(\mathbf{x})$, one can derive the static susceptibility as [31]
χs = 4Nc
) . (5.34)
The remaining quantity needed to calculate the transport coefficients is the distribution of the heavy quarks, $N(t,\mathbf{x})$. It can be shown that a Gaussian distribution is a good approximation [31]. The spectral function then takes the form of a Lorentzian,
ρ00(ω,0) ω
ω2 + η2 , (5.35)
where $\omega_{UV}$ is a threshold at which other physical processes become relevant. This equation is called the Kubo formula and is often expressed in terms of the spatial components of the spectral function, $\rho^{ii}(\omega,\mathbf{k})$. The latter are related to the time component by [31]
\[ \rho^{ii}(\omega,\mathbf{k}) = \frac{\omega^2}{k^2}\, \rho^{00}(\omega,\mathbf{k}). \tag{5.36} \]
It follows that the diffusion coefficient D can be calculated
through
D = 1 3χs
lim ω→0
3∑ i=1
Figure 5.1: Sketch of the heavy quark current-current spectral function for different temperatures (including the curve T = ∞).
As stated above, the extraction of the spectral function from
lattice results is very imprecise and requires assumptions about
the spectral function. A common method is to parametrize the
spectral function based on perturbative or phenomenological
predictions. However, finding a parametrization for the vector
current-current correlation function is very difficult.
For the free theory (T → ∞), a perturbative calculation can be done [32]. A sketch of the spectral function for different temperatures is shown in figure 5.1. For T → ∞ the Lorentzian peak at low frequencies, often called the transport peak, becomes a delta function. Above a threshold of $2m_q$, pair production is possible, which results in a quadratically diverging shape of the spectral function.
For lower temperatures, the threshold shifts to higher frequencies and peaks of bound states, like charmonium or bottomonium, appear. The shape of those peaks and the position of the threshold can only be approximated qualitatively.
Moreover, from the sketch in figure 5.1, it becomes clear that the
complex structure of the spectral function requires a large number
of values of the corresponding correlator for an inversion of
equation (5.22). This requires large lattices, which are
numerically expensive.
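To illustrate why the transport peak is hard to resolve from Euclidean data, the following sketch evaluates the integral (5.22) for two model peaks whose widths differ by a factor of two. It is only an illustration under stated assumptions: the kernel is taken to be the commonly used $K(\omega,\tau) = \cosh(\omega(\tau - \beta/2))/\sinh(\omega\beta/2)$, the peak is modelled as $\rho(\omega)/\omega = \eta/(\pi(\omega^2+\eta^2))$ with unit area, all quantities are in units of T, and the function names are hypothetical.

```python
import math


def kernel_times_omega(omega, tau, T=1.0):
    """omega * K(omega, tau) with K = cosh(omega (tau - 1/(2T))) / sinh(omega / (2T)).

    The product is finite at omega = 0, where it tends to 2T.
    """
    if omega == 0.0:
        return 2.0 * T
    return omega * math.cosh(omega * (tau - 0.5 / T)) / math.sinh(0.5 * omega / T)


def correlator_from_peak(eta, tau, T=1.0, omega_max=60.0, n=60_000):
    """G(tau) = int_0^omega_max d omega K(omega, tau) rho(omega) for the model peak.

    Model (assumption): rho(omega) / omega = eta / (pi (omega^2 + eta^2)).
    Composite Simpson rule with n (even) intervals.
    """
    h = omega_max / n
    total = 0.0
    for i in range(n + 1):
        w = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        lorentz = eta / (math.pi * (w * w + eta * eta))
        total += weight * kernel_times_omega(w, tau, T) * lorentz
    return total * h / 3.0


# Two peak widths, evaluated at the midpoint tau = 1/(2T).
g_narrow = correlator_from_peak(eta=0.1, tau=0.5)
g_wide = correlator_from_peak(eta=0.2, tau=0.5)
rel_diff = abs(g_narrow - g_wide) / g_narrow
print(g_narrow, g_wide, rel_diff)
```

In this model, doubling the width of the peak changes the correlator at the midpoint by only a few percent, so the width can only be constrained by correlator data with correspondingly small errors.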
5.3. Heavy quark momentum diffusion from heavy quark effective
theory
We have seen that the complexity of the spectral function makes a
direct measurement of the diffusion coefficient very involved. In
2009 Caron-Huot et al. developed a method to calculate the heavy
quark momentum diffusion coefficient, κ, directly [7]. The idea is
based on heavy quark effective theory and results in a
color-electric correlator, whose
spectral function is much smoother. The spectral function $\rho^{\mu\nu}(\omega)$ is based on the correlator of the operator $J^\mu$ at zero momentum. One can relate the spatial components $J^i$ to the heavy quark's velocity by $v^i = \int d\mathbf{x}\, J^i$. Thus, the classical force acting on a quark is given by $M \int d\mathbf{x}\, \frac{dJ^i}{dt}$. Using this observation, one can define the momentum diffusion coefficient as
κ(M) := 2πM2 kin
, (5.38)
where $M_{kin}$ is the heavy quark's kinetic mass. From the above classical relations, it follows that this defines a correlation of the force acting on the heavy quark with itself.
Taking into account that this correlator should be mass independent for $M_{kin} \to \infty$, one reaches
κ = β
}⟩] . (5.39)
Using a Foldy-Wouthuysen transformation [33], it can be shown that the leading force is the chromo-electric force induced by the color-electric field $E_i$. Therefore, the derivatives of the currents can be replaced by
dJ i
) , (5.40)
where θ and φ are two-component spinors of heavy quark effective
theory (HQET) and M is the pole mass
M = m(µ) [
) +O(g4)
] , (5.41)
with $m(\mu)$ the $\overline{\mathrm{MS}}$ mass and $C_F$ given by $C_F := (N_c^2 - 1)/(2N_c)$. Equation (5.39) transforms to
κ = β
) (0,0)
}⟩ ,
(5.42)
where the limit ω → 0 has already been taken. This two-point function satisfies the KMS condition (5.3) and, therefore, it is possible to relate it to a Euclidean correlator. Finally, the Euclidean color-electric correlator reads
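\[ G_E(\tau) = -\frac{1}{3} \sum_{i=1}^{3} \frac{\langle \mathrm{Re\,Tr}\, [U(\beta,\tau)\, gE_i(\tau,\mathbf{0})\, U(\tau,0)\, gE_i(0,\mathbf{0})] \rangle}{\langle \mathrm{Re\,Tr}\, [U(\beta,0)] \rangle}, \tag{5.43} \]
where $U(\tau_1,\tau_2)$ is a color parallel transporter for static quarks. Having extracted the corresponding spectral function through (5.22) or (5.11), the momentum diffusion coefficient follows from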
κ = lim ω→0
Figure 5.2: The spectral function corresponding to the color-electric correlator (5.43), calculated perturbatively for $N_f = 0$ at $T = 3T_c$ (left) and $T = 12T_c$ (right) and plotted against ω/T. [34]
In contrast to the spectral function of the current-current correlator, this spectral function seems to be rather smooth and is not contaminated by any bound state contributions. The vacuum contribution to the spectral function has been calculated perturbatively at next-to-leading order (NLO) [34] and up to next-to-next-to-leading order (NNLO) [35] without revealing any discontinuities or peaks, as can be seen in figure 5.2.
At leading order, the spectral function scales as $\omega^3$ according to
\[ \rho(\omega) = \frac{g^2 C_F}{6\pi}\, \omega^3 + O(g^4). \tag{5.45} \]
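Therefore, the heavy quark momentum diffusion coefficient vanishes at leading order in perturbation theory and only higher order calculations contribute to its value. The corresponding leading order color-electric correlator is given by
\[ G_{norm}(\tau T) := \frac{G^{LO}_{cont}(\tau T)}{g^2 C_F} = \pi^2 T^4 \left[ \frac{\cos^2(\pi\tau T)}{\sin^4(\pi\tau T)} + \frac{1}{3\sin^2(\pi\tau T)} \right] \tag{5.46} \]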
and may be used to normalize lattice results for a better visualization. For the lattice version of the color-electric correlator, one has to discretize the electric fields. Inspired by lattice heavy quark effective theory, this can be done by writing [36, 7]
\[ J^j = \frac{i}{2aM} \left( \theta^\dagger(n+\hat{j})\, U_j^\dagger(n)\, \theta(n) - \theta^\dagger(n)\, U_j(n)\, \theta(n+\hat{j}) - (\theta \to \phi) \right). \tag{5.47} \]
Here $U_j(n)$ denotes the lattice links. Inserting this into (5.39) results in a combination of Wilson lines:
GE(τ) = ∑ i∈{±1,...,±3}Re Tr
⟨ ( − ) i
( − ) i
. (5.48)
Reading from left to right, the numerator starts with a straight Wilson line. The following expression in parentheses represents the color-electric field. The corresponding lines are single links sitting on a square. The next Wilson line is shifted in the direction i and is followed by another electric field. The denominator is the expectation value of the Polyakov loop.
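As a numerical cross-check of the leading-order expressions, the following sketch integrates the spectral function $\rho(\omega) = \omega^3/(6\pi)$ (equation (5.45) with the factor $g^2 C_F$ divided out) against the kernel and compares the result with the closed form $\pi^2 T^4 [\cos^2(\pi\tau T)/\sin^4(\pi\tau T) + 1/(3\sin^2(\pi\tau T))]$. Two assumptions enter that are not fixed by the text above: the spectral representation is normalized as $G(\tau) = \int_0^\infty \frac{d\omega}{\pi}\, \rho(\omega) K(\omega,\tau)$ with $K = \cosh(\omega(\tau-\beta/2))/\sinh(\omega\beta/2)$, and units with T = 1 are used. The function names are hypothetical.

```python
import math


def g_norm(tau_T):
    """Closed form pi^2 [cos^2(pi x) / sin^4(pi x) + 1 / (3 sin^2(pi x))], in units of T^4."""
    s = math.sin(math.pi * tau_T)
    c = math.cos(math.pi * tau_T)
    return math.pi ** 2 * (c * c / s ** 4 + 1.0 / (3.0 * s * s))


def g_from_spectral_function(tau_T, omega_max=400.0, n=40_000):
    """(1/pi) int_0^omega_max d omega rho(omega) K(omega, tau) with rho = omega^3 / (6 pi), T = 1.

    The kernel cosh(omega (tau - 1/2)) / sinh(omega / 2) is rewritten with decaying
    exponentials to avoid overflow at large omega. Composite Simpson rule, n even.
    """
    h = omega_max / n
    total = 0.0
    for i in range(1, n + 1):          # the integrand vanishes at omega = 0
        w = i * h
        kern = (math.exp(-w * tau_T) + math.exp(-w * (1.0 - tau_T))) / (1.0 - math.exp(-w))
        weight = 1 if i == n else (4 if i % 2 else 2)
        total += weight * w ** 3 / (6.0 * math.pi) * kern
    return total * h / (3.0 * math.pi)


for x in (0.1, 0.25, 0.5):
    print(x, g_from_spectral_function(x), g_norm(x))
```

With this normalization the two columns agree to the accuracy of the quadrature; a different normalization convention of the kernel would rescale the integral by a constant factor.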
5.4. Correction of lattice effects
The correlator (5.48) exhibits discretization effects. Due to the finite lattice spacing a, the correlator may take different values compared to the continuum.
The situation can be brought under control by 'tree level improvement' [37]. As the correlator has been computed in perturbation theory both for the continuum and for the lattice discretization [7, 37], one can estimate the strength of the lattice effect. The correction is then done by shifting the correlator to a different time by defining the pairs $(\tau T, \overline{\tau T})$ according to
\[ G^{LO}_{cont.}(\overline{\tau T}) = G^{LO}_{lat.}(\tau T), \tag{5.49} \]
where $\tau T$ can naturally only take discrete values. The shifted correlator is now given as
\[ G_{imp.}(\overline{\tau T}) = G_{lat.}(\tau T). \tag{5.50} \]
A table with different pairs $(\tau T, \overline{\tau T})$ can be found in the appendix.
Another correction has to be done because of discretization effects at loop level. The correction factor Z can be computed perturbatively from the heavy quark self-energy. It is chosen in such a way that it cancels effects stemming from the 1-loop lattice regularization, and one gets at leading order [37]
ZLO(g2) ≈ 1− 0.59777 β
ZNLO(g2) ≈ 1 + 0.474 β
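The tree-level improvement described above amounts to solving the continuum leading-order correlator for the shifted distance. A minimal sketch, assuming the closed form $\pi^2 [\cos^2(\pi x)/\sin^4(\pi x) + 1/(3\sin^2(\pi x))]$ for the continuum correlator in units of $g^2 C_F T^4$ and a hypothetical input value `g_lat_lo` for the leading-order lattice correlator in the same units (the lattice formula itself is not reproduced here):

```python
import math


def g_norm(tau_T):
    """Continuum leading-order correlator divided by g^2 C_F T^4 (assumed closed form)."""
    s = math.sin(math.pi * tau_T)
    c = math.cos(math.pi * tau_T)
    return math.pi ** 2 * (c * c / s ** 4 + 1.0 / (3.0 * s * s))


def improved_tau_T(g_lat_lo, lo=1e-6, hi=0.5, n_iter=200):
    """Solve g_norm(x) = g_lat_lo for x in (0, 1/2] by bisection.

    g_norm decreases monotonically on (0, 1/2], so the solution is unique.
    """
    if not (g_norm(hi) <= g_lat_lo <= g_norm(lo)):
        raise ValueError("value outside the range of the continuum correlator")
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if g_norm(mid) > g_lat_lo:
            lo = mid                   # continuum value still too large: move to larger tau T
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical lattice value, chosen 5 % above the continuum value at tau T = 0.25.
shifted = improved_tau_T(1.05 * g_norm(0.25))
print(shifted)
```

Because the continuum correlator falls with increasing τT, a lattice value above the continuum one is mapped to a slightly smaller improved distance.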
6. Methods
6.1. Error handling and autocorrelation time
When measuring quantities on the lattice, one uses configurations that are generated by a Markov chain. Although one usually skips some configurations within the update process, successive configurations are usually still correlated. The standard error formulas, which assume independent measurements, then underestimate the error. Instead one can use the jackknife method with blocking, which takes the correlation into account.
For that purpose, one divides the set of N measurements of a quantity O into M blocks of size n = N/M. If N is not divisible by M, one has to omit some measurements until n becomes an integer. We now create M subsets of the N measurements by leaving out one block at a time, which we label with the index i. The average over such a subset is given by
\[ \bar{O}_i = \frac{1}{N-n} \sum_{\substack{j=1 \\ j \neq i}}^{M} \sum_{k=1}^{n} O_{(j-1)n+k}, \tag{6.1} \]
where $O_j$ is one measurement in the Markov chain. Using these block averages, one defines the so-called pseudo values as
\[ \tilde{O}_i = M\bar{O} - (M-1)\, \bar{O}_i, \tag{6.2} \]
where $\bar{O}$ is the ordinary average over the whole set of measurements. From these pseudo values one can calculate the average and the standard error as usual according to
\[ O = \frac{1}{M} \sum_{i=1}^{M} \tilde{O}_i \pm \delta O \tag{6.3} \]
with
\[ \delta O = \sqrt{ \frac{1}{M(M-1)} \sum_{i=1}^{M} \Big( \tilde{O}_i - \frac{1}{M} \sum_{j=1}^{M} \tilde{O}_j \Big)^2 }. \tag{6.4} \]
The number of blocks, M, is chosen as follows. One runs several error calculations with the above technique for an increasing block size. The error estimate increases until a plateau is reached. On the plateau the blocks are effectively uncorrelated, so the estimate corresponds to the error for uncorrelated data. Usually this is the case for M ∼ 10−20.
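The blocked jackknife described above can be sketched in a few lines. This is an illustrative implementation for a simple average, not the analysis code of this work; the function name and the example data are assumptions.

```python
import math


def jackknife(data, n_blocks):
    """Blocked jackknife estimate (mean, error) for the average of `data`.

    The data are cut into n_blocks blocks of equal size; surplus measurements
    at the end are dropped. Pseudo values are M * mean - (M - 1) * (leave-one-block-out mean).
    """
    n = len(data) // n_blocks                  # block size
    data = list(data[: n * n_blocks])          # drop the remainder
    total = sum(data)
    size = len(data)
    mean = total / size
    block_sums = [sum(data[j * n:(j + 1) * n]) for j in range(n_blocks)]
    # averages over the subsets with one block left out
    left_out = [(total - s) / (size - n) for s in block_sums]
    pseudo = [n_blocks * mean - (n_blocks - 1) * x for x in left_out]
    estimate = sum(pseudo) / n_blocks
    variance = sum((p - estimate) ** 2 for p in pseudo) / (n_blocks * (n_blocks - 1))
    return estimate, math.sqrt(variance)


# Toy data: for a plain average the pseudo values reduce to the block means.
value, error = jackknife([1, 2, 3, 4, 5, 6, 7, 8], n_blocks=4)
print(value, error)
```

To look for the plateau mentioned above, one would call the function repeatedly with a decreasing number of blocks (that is, an increasing block size) and monitor the returned error.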
However, the necessary number of blocks depends on the correlation between the individual measurements. This correlation can be examined through the so-called autocorrelation time. It can be obtained by first defining the autocorrelation function
\[ C_O(t) = \langle O_i O_{i+t} \rangle - \langle O_i \rangle \langle O_{i+t} \rangle \tag{6.5} \]
for different values of t. In a typical Markov chain, the autocorrelation function decays exponentially,
\[ \frac{C_O(t)}{C_O(0)} \sim e^{-t/\tau}, \tag{6.6} \]
where τ is the so-called exponential autocorrelation time. Often, the decay is composed of several exponential terms,
\[ \frac{C_O(t)}{C_O(0)} \sim \sum_i A_i\, e^{-t/\tau_i}. \tag{6.7} \]
The exponential autocorrelation time is then given by the largest of the $\tau_i$.
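The normalized autocorrelation function can be estimated directly from a measured chain. The sketch below is an illustration on synthetic data: the chain is an AR(1) process, whose autocorrelation is known to decay as $a^t$, and the autocorrelation time is read off from a single-exponential form at t = 1. The function names, the seed and the use of one global sample mean for both factors of the product are assumptions of this example.

```python
import math
import random


def normalized_autocorrelation(series, t_max):
    """Estimate C_O(t) / C_O(0) for t = 0..t_max from one Markov chain.

    Uses the global sample mean for both factors, a common simplification.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    c = []
    for t in range(t_max + 1):
        m = n - t
        c.append(sum(centered[i] * centered[i + t] for i in range(m)) / m)
    return [x / c[0] for x in c]


def ar1_chain(a, n, seed=7):
    """Synthetic test chain x_{k+1} = a x_k + noise with autocorrelation a^t."""
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


chain = ar1_chain(a=0.8, n=20_000)
rho = normalized_autocorrelation(chain, t_max=5)
tau_est = -1.0 / math.log(rho[1])      # single-exponential estimate from t = 1
print(rho[:3], tau_est)
```

For real lattice data with several exponential contributions, a fit at larger t would be needed to isolate the slowest mode; the single-point estimate above is only adequate for a purely exponential decay.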
6.2. Root calculation
The calculation of the roots of the polynomial (3.42) has to be done numerically. For a small number of roots (∼ 200), this can be done using Mathematica with the FindRoot command. Note that the default double precision is not sufficient to calculate the roots, as the polynomial takes very large values outside the convergence area (see figure 3.3).
For a higher number of roots, one has to use a dedicated root finding algorithm. In this work, we chose a Levenberg–Marquardt algorithm to find minima of $|P(z)|^2$. This algorithm works as follows. At first a starting guess $z_0$ for the root is chosen. Moreover, one defines a parameter λ and chooses a start value, for example $\lambda = 10^{-6}$. If we write z as a vector
\[ z = \begin{pmatrix} x \\ y \end{pmatrix}, \tag{6.8} \]
we can interpret the map
\[ z \to f(z) := |P(z)|^2 \tag{6.9} \]
as $\mathbb{R}^2 \to \mathbb{R}$. A new guess for the minimum can now be defined as
\[ z_{i+1} = z_i - \left( G(z_i) + \lambda\, \mathrm{diag}\, G(z_i) \right)^{-1} \nabla f(z_i) \tag{6.10} \]
with the matrix
\[ G_{i,j}(z) = \frac{\partial^2 f(z)}{\partial z_i\, \partial z_j}. \tag{6.11} \]
With this new guess one performs a selection step according to
\[ f(z_{i+1}) - f(z_i) \begin{cases} < 0: & \text{use } z_{i+1} \text{ and set } \lambda := \lambda/10, \\ \geq 0: & \text{keep } z_i \text{ and set } \lambda := 10\lambda. \end{cases} \tag{6.12} \]
Since a step is only accepted if it lowers f, the value of f never increases during the iteration. One now only needs good starting guesses for the minima. These can be obtained by following the elliptic distribution of the roots.
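The iteration can be sketched for a low-degree polynomial in ordinary double precision. This is an illustration only: the polynomial (3.42) itself requires higher precision, as noted above, and the function names, the tolerance and the test polynomial $z^2 + 1$ are assumptions. The sketch uses the usual Levenberg–Marquardt convention of lowering λ after an accepted step and raising it after a rejected one, and it builds the gradient and the Hessian of $f = |P|^2$ analytically from $P$, $P'$ and $P''$.

```python
def poly_and_derivatives(coeffs, z):
    """Horner scheme returning P(z), P'(z), P''(z); coeffs ordered from the highest power."""
    p = dp = ddp = 0j
    for c in coeffs:
        ddp = ddp * z + 2 * dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, ddp


def lm_find_root(coeffs, z0, lam=1e-6, max_iter=200, tol=1e-24):
    """Minimize f(x, y) = |P(x + i y)|^2 with damped Newton (Levenberg-Marquardt) steps."""
    z = complex(z0)
    p, dp, ddp = poly_and_derivatives(coeffs, z)
    f = abs(p) ** 2
    for _ in range(max_iter):
        if f < tol:
            break
        a = p.conjugate() * dp
        b = p.conjugate() * ddp
        grad_x, grad_y = 2 * a.real, -2 * a.imag
        # Hessian of f with respect to (x, y)
        g_xx = 2 * abs(dp) ** 2 + 2 * b.real
        g_yy = 2 * abs(dp) ** 2 - 2 * b.real
        g_xy = -2 * b.imag
        # damped matrix G + lam * diag(G)
        m_xx = g_xx * (1 + lam)
        m_yy = g_yy * (1 + lam)
        det = m_xx * m_yy - g_xy * g_xy
        if det == 0.0:
            lam *= 10
            continue
        step_x = (m_yy * grad_x - g_xy * grad_y) / det
        step_y = (m_xx * grad_y - g_xy * grad_x) / det
        z_new = complex(z.real - step_x, z.imag - step_y)
        p_new, dp_new, ddp_new = poly_and_derivatives(coeffs, z_new)
        f_new = abs(p_new) ** 2
        if f_new < f:                  # accept the step and reduce the damping
            z, p, dp, ddp, f = z_new, p_new, dp_new, ddp_new, f_new
            lam /= 10
        else:                          # reject the step and increase the damping
            lam *= 10
    return z, f


# Test polynomial z^2 + 1 with a starting guess near the root z = i.
root, residual = lm_find_root([1, 0, 1], 0.3 + 0.8j)
print(root, residual)
```

For many roots one would call the routine once per starting guess, with the guesses placed along the expected root distribution.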
7. Technical setup
7. Technic