Bielefeld University
Faculty of Physics

Master's thesis

Error reduction for the non-perturbative calculation of heavy quark momentum diffusion with dynamical fermions

Author: Hauke Sören Sandmeyer
Supervisor and 1st corrector: Prof. Dr. Edwin Laermann
2nd corrector: Dr. Olaf Kaczmarek
June 29, 2015

Contents

1. Introduction
   1.1. The QCD Lagrangian
   1.2. Path integral formulation
2. QCD on the lattice
   2.1. Discretizing the action
   2.2. The doubling problem
   2.3. Wilson fermions and chiral symmetry breaking
   2.4. Staggered action
   2.5. Temperature on the lattice and the continuum limit
   2.6. Z3 symmetry and phase transition
3. Numerical simulation
   3.1. Monte Carlo simulations
   3.2. Heatbath and Overrelaxation
   3.3. Rational Hybrid Monte Carlo
   3.4. Local boson fields
      3.4.1. Even odd preconditioning
      3.4.2. Choosing the polynomial
      3.4.3. Explicit update formulation
4. Error reduction for gluonic operators
   4.1. Lüscher-Weisz method
   4.2. Local boson fields and sublattices
   4.3. Optimal sublattice shape
5. Heavy quark diffusion
   5.1. The spectral function
   5.2. Transport properties through spectral functions
   5.3. Heavy quark momentum diffusion from heavy quark effective theory
   5.4. Correction of lattice effects
6. Methods
   6.1. Error handling and auto correlation time
   6.2. Root calculation
7. Technical setup
   7.1. CUDA
   7.2. Optimization
8. Thermalizing with local bosons
   8.1. Tuning parameters
   8.2. Autocorrelation time
   8.3. Final choice of testing parameters
9. Noise reduction
   9.1. The unimproved correlator
   9.2. Tuning the sublattice update
      9.2.1. Preliminary considerations
      9.2.2. Analysis of the error reduction for different sublattice size
      9.2.3. Polyakov loop improvement
   9.3. Error reduction for different masses and number of roots
10. Results
   10.1. Continuum extrapolation for quenched theory
   10.2. Quenched theory at different temperature versus full QCD
11. Conclusion and outlook
B. Results of the correlator

1. Introduction
The basic approach of thermodynamics is the study of states that are in equilibrium. Within this framework, mathematical tools based on statistical considerations, like the partition function, allow the calculation of expectation values of certain quantities. However, in real physical experiments it is impossible to create a perfect equilibrium. For many experiments this is not a problem, as they involve a large number of particles ($O(10^{23})$) and small temperature gradients and are therefore close to equilibrium.
In heavy ion collisions, which are among the most important experiments in particle research, this is not the case. In those collisions, temperatures of about 150000 times the core temperature of the sun are created for fractions of a zeptosecond ($\sim 10^{-23}\,\mathrm{s}$). Nevertheless, within these short time periods, thermalization processes lead to the creation of a thermal medium, the strongly interacting quark gluon plasma [1, 2]. The charm and the bottom quark are key ingredients for understanding the thermodynamic and hydrodynamic properties of this quark gluon plasma. Since their masses are much larger than the temperature of the quark gluon plasma, they are created mainly in the early pre-equilibrium phase of the collision.
In contrast to qualitative arguments that suggest a rather low thermalization rate because of the heavy mass, jets containing b or c quarks are found to be effectively quenched [3, 4]. Perturbative calculations do not describe these fast thermalization rates either [5, 6]. Thus, a non-perturbative calculation of the heavy quark diffusion constant, which can be related to the thermalization rates of heavy quarks, is highly desirable.
Unfortunately, a non-perturbative analytical solution of the underlying theory, Quantum Chromodynamics (QCD), does not exist. Nevertheless, a numerical approach allows the calculation of expectation values in thermal media using lattice gauge theory. Lattice gauge theory discretizes space-time and calculates expectation values of physical quantities with statistical methods. It is based on thermal field theory in imaginary time, which requires an analytic continuation for real time observables. For that purpose, very precise results from lattice simulations are necessary.
For the measurement of transport properties on the lattice, the spectral function of so called correlation functions can be used. Only such correlators can be calculated directly on the lattice, which makes it necessary to extract the spectral function through analytic continuation or by making assumptions about its structure. For the calculation of the heavy quark diffusion constant, the spectral function of the vector current-current correlator can be used. However, this spectral function is found to be rather complex at finite temperatures, which makes it difficult to perform such an extraction.
The situation can be improved by calculating the heavy quark momentum diffusion constant instead. The latter can be related to the diffusion constant, and its spectral function is much smoother [7]. Still, the extraction of the spectral function requires a very high precision of the corresponding correlator.
Since lattice gauge theory uses statistical methods to calculate observables, the statistical error decreases only as $1/\sqrt{N}$, where $N$ is the number of measurements. Depending on the operator, a high numerical effort is therefore required to decrease the errors.
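The $1/\sqrt{N}$ behaviour can be illustrated with a minimal sketch. It assumes statistically independent Gaussian toy "measurements" instead of lattice data; the helper name `standard_error`, the seed and the sample sizes are choices made for this illustration only.

```python
import random
import statistics

def standard_error(samples):
    """Standard error of the mean for statistically independent samples."""
    return statistics.stdev(samples) / len(samples) ** 0.5

rng = random.Random(1)  # fixed seed, chosen only to make the run reproducible

# Toy "measurements": unit-variance Gaussian noise around a true value of 0.
errors = {}
for n in (100, 400, 1600, 6400):
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    errors[n] = standard_error(samples)
    # error * sqrt(N) stays roughly constant if the error scales as 1/sqrt(N).
    print(f"N = {n:5d}  error = {errors[n]:.4f}  error * sqrt(N) = {errors[n] * n ** 0.5:.3f}")
```

Reducing the error by a factor of two therefore costs four times as many measurements, which is what motivates dedicated error reduction methods.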
Previous lattice studies used the Lüscher-Weisz error reduction method [8] and the so called quenched approximation to calculate the heavy quark momentum diffusion coefficient. The quenched approximation omits the fermionic part in the simulation of the quark gluon plasma. The goal of this thesis is to develop an error reduction method that can be used with dynamical fermions. This is done by combining the Lüscher-Weisz method with local bosons [9].
The outline is as follows. First we give an overview of the general formalism of lattice QCD and its different numerical implementations. Then, the Lüscher-Weisz method is introduced and transferred to the local boson method. After giving an overview of heavy quark diffusion and the technical implementation, we analyze the efficiency of the new algorithm. In the end we compare with the results obtained in the quenched approximation.
1.1. The QCD Lagrangian
Since quantum chromodynamics is a quantum field theory, its construction starts, following the Dirac equation, from the non-interacting Lorentz invariant Lagrangian,

$$\mathcal{L}_{\text{free}} = \sum_f \bar\psi(x)^{\alpha,a}_f \left( i\gamma^{\mu}_{\alpha\beta}\,\partial_\mu\,\delta^{ab} - m_f\,\delta_{\alpha\beta}\,\delta^{ab} \right) \psi(x)^{\beta,b}_f. \tag{1.1}$$

Here $\alpha$ and $\beta$ are the indices of the Dirac structure, $a$ and $b$ label the colors, $\mu$ refers to the Lorentz structure, and we sum over all flavors $f$.
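The basic idea for the construction of a Yang-Mills Lagrangian is to demand invariance under local SU(3) gauge transformations,

$$S(x) = \exp\left( \frac{i}{2}\,\boldsymbol{\theta}(x) \cdot \boldsymbol{\lambda} \right), \tag{1.2}$$

where $\boldsymbol{\lambda}$ is the vector of all Gell-Mann matrices and $\boldsymbol{\theta}(x)$ denotes the local transformation parameters. This invariance can be obtained by inserting gauge fields $A_\mu(x)$, the so called gluonic fields, which are themselves members of the gauge algebra via $A_\mu(x) = A^i_\mu(x) T^i$ with $T^i = \frac{1}{2}\lambda^i$.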
Omitting the Dirac and color indices and keeping the sum over the flavors in mind, the Lagrangian simply reads
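$$\mathcal{L}_F = \bar\psi(x) \left( i\gamma^\mu \left( \partial_\mu - igA_\mu \right) - m \right) \psi(x) =: \bar\psi(x) \left( i\gamma^\mu D_\mu - m \right) \psi(x), \tag{1.3}$$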
where g is an arbitrary parameter and specifies the coupling strength between the quark fields and the gluonic fields. To complete the gauge invariance, we require that the gauge fields transform as
$$A_\mu(x) \to A'_\mu(x) = S(x) \left\{ A_\mu(x) + \frac{i}{g}\,\partial_\mu \right\} S^{-1}(x), \tag{1.4}$$
and we end up with the gauge invariant fermionic part of the Lagrangian, $\mathcal{L}_F$. The effect of this construction is that additional fields are now involved and interact with the quark fields. These fields themselves need a kinetic term in the Lagrangian. Using the gluonic field strength tensor,
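$$F_{\mu\nu} := \frac{i}{g} \left[ D_\mu, D_\nu \right] = F^i_{\mu\nu} T^i, \tag{1.5}$$

the kinetic term of the gauge fields can be written as

$$\mathcal{L}_G = -\frac{1}{4} F^i_{\mu\nu} F^{\mu\nu,i}. \tag{1.6}$$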
We finally get the gauge and Lorentz invariant Yang-Mills Lagrangian,

$$\mathcal{L} = \bar\psi(x) \left( i\gamma^\mu D_\mu - m \right) \psi(x) - \frac{1}{4} F^i_{\mu\nu} F^{\mu\nu,i}, \tag{1.7}$$

which is the key formula for all following QCD calculations.
1.2. Path integral formulation
An analytical solution of the equations of motion for equation (1.7) has not yet been found and appears to be very difficult to obtain. Nevertheless, the well known solution of the free theory (1.1) can be used to perform perturbation theory. This allows, for example, the calculation of cross sections in scattering events.
However, for estimating expectation values in thermal equilibrium, a different approach is necessary. The so called path integral formalism allows one to introduce temperature in quantum field theory and, moreover, makes it possible to calculate physical quantities numerically.
We start with a transition amplitude for general fields $\phi_i(t')$ and $\phi_j(t)$,

$$G(\phi_i, t'; \phi_j, t) = \langle \phi_i |\, e^{-i(t'-t)H} \,| \phi_j \rangle. \tag{1.8}$$
These fields can be scalar fields or quark spinors for example. The latter is the case in QCD, where the fields are defined according to the Lagrangian (1.7).
In a naive application of field theories according to the bare Lagrangian, ultraviolet divergences occur. To make the theory well defined, regularization schemes have to be introduced. For standard perturbation theory, dimensional regularization can be used. Here we use the lattice regularization, which means that we discretize space. The idea is to introduce a 3D lattice with lattice spacing $a$ and write

$$\mathbf{x} \to a\mathbf{n}, \qquad n_i = 0, 1, \ldots, N_\sigma - 1. \tag{1.9}$$

Then integrals turn into sums,

$$\int d^3x\, f(\mathbf{x}) = a^3 \sum_{\mathbf{n}} f(a\mathbf{n}), \tag{1.10}$$

and derivatives into symmetric finite differences,

$$\partial_i f(\mathbf{n}) = \frac{f(a(\mathbf{n} + \mathbf{e}_i)) - f(a(\mathbf{n} - \mathbf{e}_i))}{2a}. \tag{1.11}$$
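The discretization rules can be tried out in a minimal sketch. It assumes a one-dimensional periodic box instead of the 3D lattice and uses $\sin$ as a test function; the function names are hypothetical and serve only this illustration.

```python
import math

def lattice_integral(f, a, n_sites):
    """Replace the integral over [0, a * n_sites) by the lattice sum a * sum_n f(a n)."""
    return a * sum(f(a * n) for n in range(n_sites))

def central_difference(f, a, n):
    """Symmetric lattice derivative (f(a(n+1)) - f(a(n-1))) / (2a) at site n."""
    return (f(a * (n + 1)) - f(a * (n - 1))) / (2.0 * a)

def derivative_error(n_sites, length=2.0 * math.pi):
    """Deviation of the lattice derivative of sin at x = a from the exact cos(a)."""
    a = length / n_sites
    return abs(central_difference(math.sin, a, 1) - math.cos(a))

for n_sites in (8, 16, 32, 64):
    a = 2.0 * math.pi / n_sites  # the spacing shrinks while the box size stays fixed
    integral = lattice_integral(lambda x: math.sin(x) ** 2, a, n_sites)
    print(f"N = {n_sites:3d}  a = {a:.4f}  integral = {integral:.6f}  "
          f"derivative error = {derivative_error(n_sites):.2e}")
```

Halving the lattice spacing reduces the error of the symmetric difference by roughly a factor of four, in line with a discretization error of order $a^2$.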
Since the Hamiltonian commutes with itself, we can also discretize the time evolution operator $U(t', t)$ with

$$U(t', t) = \prod_{n=1}^{N_t} U(t + na,\, t + (n-1)a) = \prod_{n=1}^{N_t} e^{-iaH}, \tag{1.12}$$
where $N_t$ is defined by $N_t a = t' - t$. Note that $H$ is already the discretized version here. We now insert the completeness relation $\int d\phi\, |\phi\rangle\langle\phi| = 1$ after each factor of the time evolution operator in equation (1.8). Taking the limit $a \to 0$ and $N_t, N_\sigma \to \infty$, this results in a path integral representation of the transition amplitude [10],
$$G(\phi_i, t'; \phi_j, t) = \int \mathcal{D}\phi\; e^{-iS(\phi, t', t)}. \tag{1.13}$$
The integration runs over all possible paths of fields that satisfy the boundary conditions $\phi(t') = \phi_i$ and $\phi(t) = \phi_j$. More precisely, the integration measure is defined as

$$\mathcal{D}\phi := \lim_{N_\sigma, N_t \to \infty} \prod_{n \in \Lambda} d\phi_n, \tag{1.14}$$

where $\Lambda$ is the set of all lattice points. The action $S$ is given by the integration over the Lagrangian,

$$S(\phi, t', t) = \int_t^{t'} dt'' \int d^3x\; \mathcal{L}(\phi, t''). \tag{1.15}$$
In the case of QCD, the spinors transform to anti-commuting Grassmann numbers in this construction [10, 11].
To connect to temperature, we look at the partition function,

$$Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right), \tag{1.16}$$
with β = 1/T . We are free to choose the basis in which we calculate the trace and, therefore, can use φi for this purpose. We get
$$Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right) = \int d\phi_i\, \langle \phi_i |\, e^{-\beta H} \,| \phi_i \rangle$$
and observe a similarity to equation (1.8). The integrand differs only in the periodicity $\phi_i = \phi_j$ and in the factor in front of the Hamiltonian. We conclude that we can write the partition function as a path integral if we extend our action to imaginary time by analytic continuation:

$$Z(\beta) = \operatorname{tr}\left( e^{-\beta H} \right) =: \int \mathcal{D}\phi\; e^{-iS(\phi, -i\beta, 0)}. \tag{1.17}$$
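Dealing with the imaginary time in $S(\phi, -i\beta, 0)$ is rather cumbersome. Thus, we perform a Wick rotation $t \to -i\tau$ and get a Euclidean action $S_E(\tau) = -iS(t \to -i\tau)$. With the Lagrangian from equation (1.7) the QCD action transforms to

$$S_E(\bar\psi, \psi, \beta, m) = \int_0^\beta d\tau \int d^3x \left[ \bar\psi(x) \left( \gamma_\mu D_\mu + m \right) \psi(x) + \frac{1}{4} F^i_{\mu\nu} F^i_{\mu\nu} \right]. \tag{1.18}$$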
Having constructed the partition function, we can also calculate expectation values of an arbitrary operator O in a similar way:
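$$\langle O \rangle = \frac{1}{Z} \operatorname{tr}\left( O e^{-\beta H} \right) = \frac{1}{Z} \int \mathcal{D}\phi\; O(\phi)\, e^{-S_E(\phi, \beta)}. \tag{1.19}$$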
Here an important interpretation can be made. The action is real and therefore the factor

$$\frac{1}{Z}\, e^{-S_E(\phi, \beta)} \tag{1.20}$$

is real as well. It can be interpreted as a probability weight for the state $\phi$. We integrate over all states and weight the value of the operator $O(\phi)$ accordingly. This is the starting point for lattice calculations, where such expectation values are approximated numerically using this probability interpretation in a discretized space.
2. QCD on the lattice
Instead of solving the equations of motion of the QCD Lagrangian directly, one could think of a numerical approach to these equations. However, due to the huge number of degrees of freedom, this would require too much computational power. Therefore, an examination of the time evolution of observables at high temperatures is currently impossible.
What is left is the calculation of expectation values according to the path integral formulation (1.19).
2.1. Discretizing the action
As outlined in section 1.2, the path integral has been introduced using a discrete space-time. In the end, the limit $a \to 0$ defined the path integral for the continuous theory. We now use the discrete version for numerical calculations. Hence, we have to discretize the QCD action (1.18) and introduce the lattice

$$\Lambda = \{ n = (n_0, n_1, n_2, n_3) \mid n_i \in \mathbb{N};\; n_0 \le N_\tau - 1;\; n_1, n_2, n_3 \le N_\sigma - 1 \}. \tag{2.1}$$
A space time point x may then be written as an. With the discretization rules in section 1.2, a simple discretized version of the free fermion action is given by
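$$S^{\text{free}}_F = a^4 \sum_{n \in \Lambda} \bar\psi(an) \left[ \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(a(n + \hat\mu)) - \psi(a(n - \hat\mu))}{2a} + m\,\psi(an) \right], \tag{2.2}$$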
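where $\hat\mu$ is the unit vector in the direction $\mu$. For the sake of brevity, we rescale the fields with $a^{3/2}\psi(an) \to \psi(n)$ and the mass with $am \to m$. Note that from now on the physical mass is given by $m_{\text{phys}} = m/a$. With this rescaling we get

$$S^{\text{free}}_F = \sum_{n \in \Lambda} \bar\psi(n) \left[ \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(n + \hat\mu) - \psi(n - \hat\mu)}{2} + m\,\psi(n) \right], \tag{2.3}$$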
which is not gauge invariant since we couple $\bar\psi(n)$ with $\psi(n \pm \hat\mu)$. Gauge invariance under transformations $S(n) \in SU(3)$ at each lattice point $n$ can be achieved with the so called link variables $U_\mu(n)$. These links are members of the gauge group and transform as

$$U'_\mu(n) = S(n)\, U_\mu(n)\, S^{-1}(n + \hat\mu). \tag{2.4}$$
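Through this transformation property the action

$$S_F = \sum_{n \in \Lambda} \bar\psi(n) \left[ \sum_{\mu=0}^{3} \gamma_\mu \frac{U_\mu(n)\,\psi(n + \hat\mu) - U_{-\mu}(n)\,\psi(n - \hat\mu)}{2} + m\,\psi(n) \right] \tag{2.5}$$

becomes gauge invariant. Here we have used the common abbreviation $U_{-\mu}(n) = U^\dagger_\mu(n - \hat\mu)$.

Figure 2.1: Visualization of the plaquette $U_{\mu\nu}(n)$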
To prove that we get the right continuum result in the limit a→ 0, let us have a closer look at the transformation property in equation (2.4). The local gauge at lattice point n+µ has been transported to the point n. For the continuum, such a gauge transporter is known as the Schwinger line integral,
$$U(x, y) = e^{\,ig \int_x^y dz^\mu A_\mu(z)}. \tag{2.6}$$
We find that the link variables are the lattice version of the gauge transporter, and the above expression simplifies to

$$U_\mu(n) = e^{\,iga A_\mu(n)}. \tag{2.7}$$

If we Taylor expand this expression for $a \to 0$, we recover the continuum expression according to

$$S^{\text{cont.}}_F = S^{\text{disc.}}_F + O(a^2). \tag{2.8}$$

At every lattice point we introduced four link variables, one for each direction $\mu$. We can interpret these links as sitting between the lattice points, which is visualized by a schematic diagram in figure 2.1.
To discretize the gluonic part of the action, we define the plaquette $U_{\mu\nu}(n)$ as a product of four link variables in a closed loop:

$$U_{\mu\nu}(n) = U_\mu(n)\, U_\nu(n + \hat\mu)\, U^\dagger_\mu(n + \hat\nu)\, U^\dagger_\nu(n). \tag{2.9}$$

Figure 2.1 depicts a visualization of the plaquette. Since the trace is invariant under cyclic permutations, the trace over a plaquette is gauge invariant. Therefore, it is reasonable to construct the gauge part of the action from the trace of the plaquette.
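The gauge invariance of the plaquette trace can be checked numerically in a minimal sketch. To keep the example short it assumes the gauge group SU(2) on a two-dimensional periodic lattice instead of SU(3) in four dimensions; all function names are hypothetical. The transformation rule (2.4) and the plaquette definition (2.9) are used as stated above.

```python
import math
import random

def random_su2(rng):
    """Random SU(2) matrix [[a, b], [-conj(b), conj(a)]] with |a|^2 + |b|^2 = 1."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in x))
    a = complex(x[0], x[1]) / norm
    b = complex(x[2], x[3]) / norm
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def mul(p, q):
    """Product of two 2x2 complex matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(p):
    """Hermitian conjugate, which equals the inverse for SU(2) matrices."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def shift(n, mu, size):
    """Neighbour of site n in direction mu with periodic boundaries."""
    m = list(n)
    m[mu] = (m[mu] + 1) % size
    return tuple(m)

def plaquette_trace(links, n, size):
    """Re tr of U_0(n) U_1(n+0) U_0(n+1)^dagger U_1(n)^dagger, cf. equation (2.9)."""
    u = mul(links[n, 0], links[shift(n, 0, size), 1])
    u = mul(u, dagger(links[shift(n, 1, size), 0]))
    u = mul(u, dagger(links[n, 1]))
    return (u[0][0] + u[1][1]).real

def gauge_transform(links, gauge, size):
    """Apply U'_mu(n) = S(n) U_mu(n) S(n+mu)^dagger to every link, cf. equation (2.4)."""
    return {(n, mu): mul(mul(gauge[n], u), dagger(gauge[shift(n, mu, size)]))
            for (n, mu), u in links.items()}

rng = random.Random(7)
size = 4
sites = [(i, j) for i in range(size) for j in range(size)]
links = {(n, mu): random_su2(rng) for n in sites for mu in range(2)}
gauge = {n: random_su2(rng) for n in sites}
transformed = gauge_transform(links, gauge, size)
max_change = max(abs(plaquette_trace(links, n, size) - plaquette_trace(transformed, n, size))
                 for n in sites)
print(f"largest change of Re tr U_01(n) under the gauge transformation: {max_change:.2e}")
```

The individual links change under the transformation, while the traced plaquette agrees up to rounding errors.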
If we use the exponential expression of the link variables again and make use of the Baker-Campbell-Hausdorff formula, the plaquette may be written as

$$U_{\mu\nu}(n) = \exp\left( iga^2 F_{\mu\nu}(n) + O(a^3) \right). \tag{2.10}$$

The continuum expression can be restored if only the third term of the Taylor expansion is used. This can be done by taking the real part and compensating the leading 1 using

$$S_G[U] = \beta \sum_{n \in \Lambda} \sum_{\mu < \nu} \left( 1 - \frac{1}{3} \operatorname{Re} \operatorname{tr}\left[ U_{\mu\nu}(n) \right] \right), \tag{2.11}$$

where $\beta = 6/g^2$ is the inverse gauge coupling. The average

$$\frac{1}{3|\Lambda|} \sum_{n \in \Lambda} \sum_{\mu < \nu} \operatorname{Re} \operatorname{tr}\left[ U_{\mu\nu}(n) \right] \tag{2.12}$$

is often called plaquette as well and is used in various lattice analyses.
Having discretized the action, we look at the path integral in more detail. The integral runs over all fields involved in the action. Here, these are the Grassmann valued fields $\bar\psi$, $\psi$ and the link variables $U_\mu$. Thus, the integration measure is defined as
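$$\int \mathcal{D}\bar\psi\, \mathcal{D}\psi \int \mathcal{D}U := \int \prod_{n \in \Lambda} d\bar\psi(n)\, d\psi(n) \int \prod_{n \in \Lambda} \prod_{\mu=0}^{3} dU_\mu(n), \tag{2.13}$$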
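where $d\psi(n)$ is the Grassmann integration measure and $dU_\mu(n)$ is the Haar measure for SU(3) matrices. If we write the fermion action (2.5) as

$$S_F = \sum_{n, m \in \Lambda} \bar\psi_n\, D_{nm}[U, m]\, \psi_m, \tag{2.14}$$

with

$$D_{nm}[U, m] = \sum_{\mu=0}^{3} \gamma_\mu \frac{U_\mu(n)\,\delta_{n+\hat\mu, m} - U_{-\mu}(n)\,\delta_{n-\hat\mu, m}}{2} + m\,\delta_{n,m}, \tag{2.15}$$

a direct integration over the Grassmann variables results in [10]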
$$Z = \int \mathcal{D}\bar\psi\, \mathcal{D}\psi\, \mathcal{D}U\; e^{-S_F[U, \bar\psi, \psi, m] - S_G[U]} = \int \mathcal{D}U\; \det D[U, m]\; e^{-S_G[U]}. \tag{2.16}$$

For more than one flavor involved, the partition function reads

$$Z = \int \mathcal{D}U \prod_f \det D[U, m_f]\; e^{-S_G[U]}. \tag{2.17}$$

The matrix $D_{nm}$ defined above is often called the Dirac matrix.
Similarly, one can derive a path integral representation for expectation values. For operators that depend on the quark fields one has to take the Grassmann variables into account before integrating them out. However, for purely gluonic operators one can simply write

$$\langle O \rangle = \frac{1}{Z} \int \mathcal{D}U\; O[U]\, \det D[U, m]\; e^{-S_G[U]}. \tag{2.18}$$
In this expression we have, apart from the operator $O[U]$ itself, two terms that depend on the links $U$. The first one, the determinant $\det D[U, m]$, is highly nonlocal in $U$, while the second one, $\exp(-S_G[U])$, only involves nearest-neighbor interactions. This non-locality of the determinant makes it difficult to simulate lattice QCD with fermions (see section 3).
A drastic simplification is to set $\det D[U, m] = 1$, which is equivalent to $m \to \infty$. This heavy quark limit is the so called quenched approximation.
2.2. The doubling problem
The finiteness of the lattice in numerical simulations has a dramatic side effect. The so called doubling problem arises when looking at the Fourier transformed quark propagator in free theory. The free quark propagator is defined as
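$$G^{ab}(n) = \left\langle \psi^a(n)\, \bar\psi^b(0) \right\rangle_{\text{free}}, \tag{2.19}$$

with color indices $a$, $b$. The Fourier transformation on the lattice also becomes discrete with

$$f(n) = \frac{1}{\sqrt{|\Lambda|}} \sum_p \tilde f(p)\, e^{\,ip \cdot n}, \tag{2.20}$$

$$p_\mu = \frac{2\pi}{N_\mu} \left( k_\mu + \theta_\mu \right), \qquad k_\mu \in \left\{ -\frac{N_\mu}{2} + 1, \ldots, \frac{N_\mu}{2} \right\}. \tag{2.21}$$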
The value θµ depends on the boundary conditions. It is 1/2 for anti periodic boundaries and 0 for periodic boundaries.
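We can now transform the free fermion action (2.2) and get

$$S_F = \sum_p \bar\psi_p\, D_p\, \psi_p \tag{2.22}$$

with

$$D_p = m\mathbb{1} + i \sum_{\mu=0}^{3} \gamma_\mu \sin p_\mu. \tag{2.23}$$

As in the continuum, the Fourier transformed quark propagator is given by the inverse of the Dirac operator,

$$G^{ab}(n) = \frac{1}{|\Lambda|} \sum_p \frac{\left[ m\mathbb{1} - i \sum_{\mu=0}^{3} \gamma_\mu \sin p_\mu \right]^{ab}}{\sin^2 p_0 + \omega^2(p)}\, e^{\,ip \cdot n}, \tag{2.24}$$

with

$$\omega^2(p) = m^2 + \sum_{i=1}^{3} \sin^2 p_i. \tag{2.25}$$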
In the limit $a \to 0$ with fixed $p_{\text{phys.}} = p_\mu / a$, this goes over to the correct continuum version with one pole at $p = 0$. Still, for finite $a$, 15 so called doublers arise, where $\omega(p) = m$. These additional poles correspond to extra particles with $E(p) = m$, where one or more entries of $p$ are equal to $\pi$. For the free theory, these doublers might not be a problem, since they do not interact. But for an interacting theory, they simulate additional particles that do not exist in the continuum.
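The counting of the doublers can be reproduced with a minimal sketch. It assumes periodic boundaries and a grid of eight momenta per direction, and it searches for the momenta at which $\sin p_\mu$ vanishes in all four directions; the function name and the grid size are choices made for this illustration.

```python
import itertools
import math

def all_sines_vanish(p, tol=1e-12):
    """True if sin(p_mu) = 0 in every direction, where the massless D_p vanishes."""
    return all(abs(math.sin(p_mu)) < tol for p_mu in p)

# Allowed momenta 2*pi*k/N with k = -N/2 + 1, ..., N/2 (periodic boundaries assumed).
n_side = 8
grid = [2.0 * math.pi * k / n_side for k in range(-n_side // 2 + 1, n_side // 2 + 1)]

zeros = [p for p in itertools.product(grid, repeat=4) if all_sines_vanish(p)]
doublers = [p for p in zeros if any(abs(p_mu) > 1e-12 for p_mu in p)]
print(f"{len(zeros)} momenta with vanishing sines, {len(doublers)} of them with p != 0")
```

Each component can be either $0$ or $\pi$, which gives $2^4 = 16$ such momenta: the physical pole at $p = 0$ and the 15 doublers.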
2.3. Wilson fermions and chiral symmetry breaking
In order to define a theory without doublers, different approaches have been developed. One possibility is to add an extra term to the Dirac matrix which vanishes in the limit $a \to 0$. This so called Wilson term

$$L_W := -\sum_{\mu=0}^{3} \frac{U_\mu(n)\,\delta_{n+\hat\mu, m} - 2\,\mathbb{1}\,\delta_{n,m} + U_{-\mu}(n)\,\delta_{n-\hat\mu, m}}{2} \tag{2.26}$$

leads to an additional term in the Fourier transformed Dirac operator,

$$D(p) = m\mathbb{1} + i \sum_{\mu=0}^{3} \gamma_\mu \sin p_\mu + \mathbb{1} \sum_{\mu=0}^{3} \left( 1 - \cos p_\mu \right). \tag{2.27}$$
This additional term solves the doubling problem, but it breaks the so called chiral symmetry, which the naive discretization (2.5) possesses in the limit $m \to 0$.
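How the Wilson term removes the doublers can be seen by evaluating the term proportional to the unit matrix in equation (2.27) at the corners of the Brillouin zone. The following lines are a minimal sketch in lattice units; the mass value and the function name are arbitrary choices for this illustration.

```python
import itertools
import math

def wilson_diagonal_term(p, m):
    """Coefficient of the unit matrix in equation (2.27): m + sum_mu (1 - cos p_mu)."""
    return m + sum(1.0 - math.cos(p_mu) for p_mu in p)

m = 0.1  # bare mass in lattice units, an arbitrary example value
corners = list(itertools.product((0.0, math.pi), repeat=4))

# Group the 16 corners by the number of momentum components equal to pi.
effective_mass = {}
for p in corners:
    n_pi = sum(1 for p_mu in p if p_mu > 0.0)
    effective_mass.setdefault(n_pi, set()).add(round(wilson_diagonal_term(p, m), 12))

for n_pi, masses in sorted(effective_mass.items()):
    print(f"{n_pi} components equal to pi: effective mass term {sorted(masses)}")
```

At $p = 0$ the term reduces to $m$, while a doubler with $k$ components equal to $\pi$ acquires the mass term $m + 2k$ in lattice units, that is $m_{\text{phys}} + 2k/a$ in physical units, so the doublers decouple in the limit $a \to 0$.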
Through chiral symmetry one can split the massless fermion action into a right-handed term and a left-handed term. To do so, we introduce the projectors

$$P_\pm = \frac{1 \pm \gamma_5}{2} \tag{2.28}$$

and write $\psi = \psi_L + \psi_R$ with

$$\psi_R = P_+ \psi, \quad \psi_L = P_- \psi, \quad \bar\psi_R = \bar\psi P_-, \quad \bar\psi_L = \bar\psi P_+. \tag{2.29}$$
The chiral symmetry is now given by the observation that the mixed terms in the expanded version of the action vanish when $m = 0$. Then we can write

$$\mathcal{L} = \mathcal{L}_R + \mathcal{L}_L, \tag{2.30}$$

where $\mathcal{L}_R$ and $\mathcal{L}_L$ are defined as usual with $\psi_{L/R}$ instead of $\psi$. Including different flavors, the massless action has the total symmetry [10]

$$SU(N_f)_L \otimes SU(N_f)_R \otimes U(1)_V \otimes U(1)_A, \tag{2.31}$$
where the notation shows that the right-handed and left-handed terms transform independently under $SU(N_f)$ flavor mixing. A degenerate mass term breaks that symmetry since it mixes right-handed and left-handed terms. What is left is the symmetry

$$SU(N_f)_V \otimes U(1)_V, \tag{2.32}$$

where the conserved quantity corresponding to $U(1)_V$ is the baryon number.
Due to the extra 'mass term', the symmetry is explicitly broken even in the massless limit for Wilson fermions.
An important consequence of spontaneous chiral symmetry breaking is the appearance of the so called Goldstone bosons, which are massless bosonic excitations [11]. For the continuum theory, the pion is the Goldstone boson, as it is much lighter than other mesons. (It would be massless in the limit $m \to 0$.) Therefore, it is impossible to investigate masses that involve Goldstone bosons with Wilson fermions.
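2.4. Staggered action
In the case of Wilson fermions we removed the doublers by adding a new term. A different approach is to interpret the 15 doublers as additional flavors. Such a theory is of course unphysical, but Kogut and Susskind showed that one can reduce the number of flavors from 16 to 4 [12]. To distinguish them from the physical flavors, one then usually speaks of tastes.
We start again with the free discretized fermion action,

$$S^{\text{free}}_F = \sum_{n \in \Lambda} \bar\psi(n) \left[ \sum_{\mu=0}^{3} \gamma_\mu \frac{\psi(n + \hat\mu) - \psi(n - \hat\mu)}{2} + m\,\psi(n) \right], \tag{2.33}$$

and perform the transformation

$$\psi(n) = \gamma_0^{n_0} \gamma_1^{n_1} \gamma_2^{n_2} \gamma_3^{n_3}\, \psi'(n), \qquad \bar\psi(n) = \bar\psi'(n)\, \gamma_3^{n_3} \gamma_2^{n_2} \gamma_1^{n_1} \gamma_0^{n_0}, \tag{2.34}$$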
where the $n_i$ are the entries of the lattice point $n$. Commuting the gamma matrices with $\gamma_\mu$ in the action, we get different phase factors for each lattice point:

$$\bar\psi(n)\, \gamma_\mu\, \psi(n \pm \hat\mu) = \eta_\mu(n)\, \bar\psi'(n)\, \mathbb{1}\, \psi'(n \pm \hat\mu). \tag{2.35}$$

The so called staggered phases $\eta_\mu$ are given by

$$\eta_\mu(n) = (-1)^{\sum_{\nu < \mu} n_\nu}. \tag{2.36}$$
Applying these transformations to the total action (2.5), the action becomes diagonal in Dirac space since the $\gamma$ matrix structure is now represented by the phases. The observation now is that we can reduce the number of doublers to four if we drop three of the four spinor components. Therefore, we introduce the fields $\chi(n)$ and $\bar\chi(n)$, which live only in color space, and we get the simpler action

$$S_F = \sum_{n \in \Lambda} \bar\chi(n) \left[ \sum_{\mu=0}^{3} \eta_\mu(n) \frac{U_\mu(n)\,\chi(n + \hat\mu) - U_{-\mu}(n)\,\chi(n - \hat\mu)}{2} + m\,\chi(n) \right]. \tag{2.37}$$
After this simplification, we have to answer how many quarks this action represents. To do so, we have to map back to the Dirac structure. Assuming that the lattice dimensions are even, this can be done by introducing hypercubes of size $2^4$. We access these cubes through a vector $N$, while a sub vector $\rho$ points to the actual lattice point. Using the notation $\chi_\rho(N) = \chi(2N + \rho)$, we map back to the spinor fields with the linear transformation

$$\psi^t_\alpha(N) = \frac{1}{8} \sum_\rho \Gamma_{\alpha t, \rho}\, \chi_\rho(N), \tag{2.38}$$

with

$$\Gamma_{\alpha t, \rho} = \left( \gamma_0^{\rho_0} \gamma_1^{\rho_1} \gamma_2^{\rho_2} \gamma_3^{\rho_3} \right)_{\alpha t}. \tag{2.39}$$
With this definition it is obvious that we now have four tastes $t$ involved. After some algebra one obtains for the total action

$$S_F = 16 \sum_N \left( \sum_{t=0}^{3} \bar\psi^t(N) \left[ m + \sum_{\mu=0}^{3} \gamma_\mu \nabla_\mu \right] \psi^t(N) - \sum_{t,t'=0}^{3} \sum_{\mu=0}^{3} \bar\psi^t(N)\, \gamma_5\, (\tau_5 \tau_\mu)_{t,t'}\, \nabla_\mu^2\, \psi^{t'}(N) \right), \tag{2.40}$$

with $\tau_\mu = \gamma_\mu^T$. Here $\nabla_\mu$ is the discretized derivative on the lattice of hypercubes. The third term vanishes in the continuum limit, but it mixes tastes and breaks the chiral symmetry on finite lattices. The advantage over Wilson fermions is that it still leaves the action invariant under $U(1) \otimes U(1)$ transformations. Therefore, investigations with Goldstone particles are still possible in a reduced form.
We now have an action that introduces four degenerate tastes. Using the staggered action (2.37) for equation (2.18), we simulate physics with 4 quarks, which have identical masses. Thus, in order to reduce the number of tastes, one has to take roots of the determinant. For example, if we want to calculate an expectation value for physics with two light quarks mu/d and one heavy quark ms, we have to use
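$$\langle O \rangle = \frac{1}{Z} \int \mathcal{D}U\; O[U] \left( \det D[U, m_{u/d}] \right)^{1/2} \left( \det D[U, m_s] \right)^{1/4} e^{-S_G[U]}. \tag{2.41}$$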
The question whether rooting is allowed for finite lattices is controversial. We have seen that the tastes decouple in the continuum limit, but it is not yet fully clear whether a continuum extrapolation based on finite lattices leads to correct physical values. For instance, we have to ask whether the universality class remains unchanged. Nevertheless, lattice simulations using staggered quarks have led to the right physical results. See [13, 14] for further details. Improvements of the action, like "highly improved staggered quarks" (HISQ), reduce the taste mixing and, moreover, reduce the order of the discretization error [15].
For the probability interpretation of equation (2.18), it is important that the determinant $\det D$ is real and positive. We have to check whether this is still the case for staggered fermions. To do so, we look at the eigenvalue spectrum of the Dirac matrix. We define the massless staggered Dirac matrix as $D(m = 0) := M$. It is easy to see that it is anti-hermitian, which means $M^\dagger = -M$. As a direct consequence, $M$ has purely imaginary eigenvalues. Moreover, $M$ has a $\gamma_5$ hermiticity

$$M^\dagger_{n,m} = -M_{n,m} = \eta_5(n)\, M_{n,m}\, \eta_5(m), \tag{2.42}$$

from which it follows that for every eigenvalue $\lambda_i$ there is a complex conjugate counterpart $\lambda_j = \lambda_i^*$. Thus, the determinant $\det M$ is real. Furthermore, the mass term $m$ ensures that $\det D$ is strictly positive: the eigenvalues of $D$ come in pairs $m \pm i|\lambda_i|$, whose product $m^2 + |\lambda_i|^2$ is positive.
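The positivity argument can be illustrated numerically. The sketch below does not construct a staggered Dirac matrix; it simply assumes a spectrum of purely imaginary eigenvalue pairs $\pm i\mu$ with randomly chosen $\mu$ and multiplies the corresponding eigenvalues of $D = M + m$. The function name, the seed and the parameter values are hypothetical.

```python
import random

def determinant_from_pairs(mus, m):
    """det D as the product of (m + i mu)(m - i mu) over the eigenvalue pairs of M."""
    det = complex(1.0, 0.0)
    for mu in mus:
        det *= complex(m, mu) * complex(m, -mu)
    return det

rng = random.Random(5)
mus = [rng.uniform(0.0, 2.0) for _ in range(20)]  # assumed toy spectrum, not lattice data
m = 0.1
det = determinant_from_pairs(mus, m)
print(f"Re det D = {det.real:.6e}, Im det D = {det.imag:.1e}")
```

Each pair contributes the real factor $m^2 + \mu^2$, so the imaginary part vanishes and the real part is positive for any $m \neq 0$.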
In addition to the staggered phases, we also have to define boundary conditions for the fermions. One usually chooses periodic boundaries in spatial direction and anti-periodic boundaries in the temporal direction.
In the definition of the Dirac matrix, we only access the fermion fields through terms with links multiplied to them. Hence, it is sufficient to introduce another phase that multiplies all links crossing the lattice boundary in the temporal direction by $-1$.
We may also shift the staggered phases into the links. Using

$$\eta_\mu(n \pm \hat\mu) = \eta_\mu(n), \tag{2.43}$$

we can multiply the phase $\eta_\mu(n)$ into the link $U_\mu(n)$ without changing the Dirac matrix.
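The staggered phases of equation (2.36) and the property (2.43) are easy to verify in a few lines. This is a minimal sketch; the function names and the $4^4$ block of test sites are choices made for this illustration.

```python
import itertools

def eta(mu, n):
    """Staggered phase eta_mu(n) = (-1)^(n_0 + ... + n_{mu-1}), cf. equation (2.36)."""
    return -1 if sum(n[:mu]) % 2 else 1

def shifted(n, mu, step):
    """Site n moved by step lattice units in direction mu."""
    m = list(n)
    m[mu] += step
    return tuple(m)

sites = list(itertools.product(range(4), repeat=4))

# Equation (2.43): eta_mu does not depend on the coordinate n_mu itself.
shift_invariant = all(
    eta(mu, shifted(n, mu, step)) == eta(mu, n)
    for n in sites for mu in range(4) for step in (+1, -1)
)
print("eta_mu(n +- mu) == eta_mu(n) on all test sites:", shift_invariant)
```

Since $\eta_\mu(n)$ only depends on the coordinates $n_\nu$ with $\nu < \mu$, the same phase multiplies the forward and the backward hopping term, which is why it can be absorbed into the link.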
2.5. Temperature on the lattice and the continuum limit
As outlined in section 1.2, the imaginary time was introduced to calculate the partition function using the path integral formalism. The time integration in the definition of the action runs over $\tau \in [0, \beta]$. From the discretization in section 2.1, it follows that the connection between the lattice spacing $a$ and the temperature is given by

$$\beta = \frac{1}{T} = N_\tau a. \tag{2.44}$$
This means that the lattice spacing directly defines the temperature. However, through the rescaling of the fields (2.3) the action does not depend on the lattice spacing any more. The only relevant parameters are $g$ and $m$. Hence, the lattice spacing is indirectly defined by these two parameters. For simplicity, from now on we switch to the quenched theory, where $g$ is the only relevant parameter.
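Equation (2.44) can be turned into numbers with a short sketch. It assumes that the lattice spacing is given in fm and uses $\hbar c \approx 197.327\,\mathrm{MeV\,fm}$ to convert to MeV; the example values of $N_\tau$ and $a$ are arbitrary and are not parameters of this thesis.

```python
HBAR_C_MEV_FM = 197.3269804  # conversion constant hbar * c in MeV fm

def temperature_mev(n_tau, a_fm):
    """Temperature T = 1 / (N_tau * a) in MeV for a lattice spacing given in fm."""
    return HBAR_C_MEV_FM / (n_tau * a_fm)

# Arbitrary example values, chosen only to illustrate the relation.
for n_tau, a_fm in ((8, 0.10), (8, 0.05), (16, 0.05)):
    print(f"N_tau = {n_tau:2d}  a = {a_fm:.2f} fm  T = {temperature_mev(n_tau, a_fm):.1f} MeV")
```

Halving the lattice spacing at fixed $N_\tau$ doubles the temperature, so a fixed temperature requires $N_\tau$ to grow proportionally to $1/a$.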
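Any physical observable $\Gamma$ cannot depend on the lattice spacing $a$, which means $d\Gamma/da = 0$. The lattice versions of those quantities, $\Gamma_L(g)$, are related to the physical ones by multiplication with some power of the lattice spacing, $\Gamma_L(g) = a^{d_\Gamma}\Gamma$. This leads to the renormalization group equation,

$$a \frac{d}{da} \Gamma = \left( a \frac{\partial}{\partial a} - \beta(g) \frac{\partial}{\partial g} \right) \Gamma = 0, \tag{2.45}$$

with the $\beta$-function

$$\beta(g) = -a \frac{dg}{da}. \tag{2.46}$$

Solving the differential equation for an arbitrary $\beta$-function gives

$$a(g) = a(g_0) \exp\left( -\int_{g_0}^{g} \frac{dg'}{\beta(g')} \right). \tag{2.47}$$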
We find that the β-function controls how the lattice spacing and the gauge coupling are related. Moreover, it shows that a continuum limit a→ 0 is only possible if β(g) has a root.
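The $\beta$-function may be expanded around $g = 0$ and then takes the form [16]

$$\beta(g) = -\beta_0 g^3 - \beta_1 g^5 + O(g^7), \qquad \beta_0 = \frac{1}{(4\pi)^2} \left( 11 - \frac{2}{3} n_f \right), \qquad \beta_1 = \frac{1}{(4\pi)^4} \left( 102 - \frac{38}{3} n_f \right). \tag{2.48}$$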
It can be shown that this expansion does not change when switching to dynamical QCD [16]. Inserting this into (2.47) shows that the continuum limit is given for g → 0. Otherwise, confinement would be broken. From the definition m = amphys. one sees that this also implies m→ 0.
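The relation between the coupling and the lattice spacing can be made explicit in a minimal sketch. It assumes the standard two-loop coefficients of SU(3) and the leading asymptotic solution of (2.47), $a\Lambda = (\beta_0 g^2)^{-\beta_1/(2\beta_0^2)} \exp\left(-1/(2\beta_0 g^2)\right)$, where the integration constant $\Lambda$ and corrections of relative order $g^2$ are not determined here. The function names are hypothetical.

```python
import math

def beta_coefficients(n_f):
    """Two-loop coefficients beta_0 and beta_1 of the SU(3) beta-function."""
    beta0 = (11.0 - 2.0 * n_f / 3.0) / (4.0 * math.pi) ** 2
    beta1 = (102.0 - 38.0 * n_f / 3.0) / (4.0 * math.pi) ** 4
    return beta0, beta1

def a_lambda(g_squared, n_f=0):
    """Leading asymptotic solution for a * Lambda as a function of g^2."""
    beta0, beta1 = beta_coefficients(n_f)
    x = beta0 * g_squared
    return x ** (-beta1 / (2.0 * beta0 ** 2)) * math.exp(-1.0 / (2.0 * x))

for g_squared in (1.0, 0.9, 0.8):
    print(f"g^2 = {g_squared:.1f}  a * Lambda = {a_lambda(g_squared):.3e}")
```

The lattice spacing shrinks exponentially fast as $g$ decreases, which is the statement that the continuum limit is reached for $g \to 0$.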
For fixed lattice dimensions, the limit $a \to 0$ would result in an infinitely small volume. Therefore, it is necessary to increase $N_\sigma$ and $N_\tau$ so that $\beta = N_\tau a$ and $L_\sigma = N_\sigma a$ stay constant.
Since computer resources are finite, a typical measurement on the lattice is performed at different lattice spacings $a$, and the results are extrapolated to the continuum. As the lattice spacing can only be controlled by the parameters $g$ and $m$, $a$ can only be estimated. Having defined the lattice dimensions and $g$ and $m$, the temperature and the lattice spacing may be measured using the static quark potential.
2.6. Z3 symmetry and phase transition
In section 2.3 we have seen that the action is invariant under chiral transformations in the massless limit. Therefore, the chiral condensate would be an order parameter for a phase transition in this limit.
For the quenched limit $m \to \infty$, another order parameter exists. Imagine a transformation of all lattice links in time direction for a fixed value $n_\tau$:

$$U_\tau(n) \to U'_\tau(n) = z_k U_\tau(n) \quad \forall\, n = (n_\tau, \mathbf{n}) \text{ with } n_\tau \text{ fixed}, \quad z_k \in Z_3, \tag{2.49}$$
where Z3 is the center group for Nc = 3. While the gauge action is clearly invariant under this transformation, a Wilson loop wrapping around the lattice in time direction is not. The latter is called Polyakov loop and transforms as P (n)→ zkP (n).
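The behaviour of the Polyakov loop under the center transformation (2.49) can be checked in a minimal sketch. For brevity it assumes diagonal SU(3) matrices, which are stored as their three diagonal entries, at a single spatial site; a full simulation uses general SU(3) links. All names and parameter values are hypothetical.

```python
import cmath
import math
import random

def random_diagonal_su3(rng):
    """Diagonal SU(3) element diag(e^{ia}, e^{ib}, e^{-i(a+b)}), stored as its diagonal."""
    a = rng.uniform(-math.pi, math.pi)
    b = rng.uniform(-math.pi, math.pi)
    return [cmath.exp(1j * a), cmath.exp(1j * b), cmath.exp(-1j * (a + b))]

def polyakov_loop(temporal_links):
    """Trace of the product of all temporal links at one spatial site."""
    diag = [1.0 + 0.0j] * 3
    for link in temporal_links:
        diag = [d * u for d, u in zip(diag, link)]
    return sum(diag)

def center_element(k):
    """Element z_k = exp(2 pi i k / 3) of the center group Z_3."""
    return cmath.exp(2j * math.pi * k / 3.0)

rng = random.Random(3)
n_tau = 6
links = [random_diagonal_su3(rng) for _ in range(n_tau)]

z = center_element(1)
fixed_slice = 2  # the time slice on which the temporal links are multiplied by z
transformed = [[z * u for u in link] if t == fixed_slice else link
               for t, link in enumerate(links)]

print("P  =", polyakov_loop(links))
print("P' =", polyakov_loop(transformed))
print("zP =", z * polyakov_loop(links))
```

The loop crosses the transformed time slice exactly once and therefore picks up a single factor $z_k$. In a plaquette, a temporal link from that slice always appears together with the hermitian conjugate of another one, so that $z_k z_k^* = 1$ and the gauge action is unchanged.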
Due to spontaneous symmetry breaking, this global center symmetry might be broken, which is the case above the so called critical temperature $T_c$. This is why the Polyakov loop can be used as an order parameter for the phase transition at $T_c$.

Figure 2.2: A rough sketch of the QCD phase diagram (horizontal axis: chemical potential µ; vertical axis: temperature T)
Although the fermion action explicitly breaks the Z3 symmetry, the Polyakov loop is still a well suited quantity to observe the transition from the hadronic phase to the quark gluon plasma in dynamical QCD.
In figure 2.2 a typical phase diagram for QCD is shown. Without considering the chemical potential µ, one simulates physics on the y-axis, where only a crossover between the two phases takes place. The red line separates the two phases and the dashed part indicates the crossover region. Following the red line, at higher chemical potential one expects at some point a second order phase transition (marked by the red point). At even higher chemical potential, the phase transition becomes first order.
19
3. Numerical simulation
The expectation value of an operator O is defined as
\[ \langle O \rangle = \frac{1}{Z} \int \mathcal{D}U \, O[U] \, (\det D[U])^s \, e^{-S_G[U]}, \tag{3.1} \]
where the parameter s controls the number of flavors. For Wilson fermions it is simply Nf, while for staggered fermions one has to use s = Nf/4. To evaluate this expression numerically, we have to deal with a high-dimensional path integral. This rules out numerical integration methods whose error depends on the dimension of the integral, as is the case for Gaussian quadrature, for instance.
As a solution, Monte Carlo integration approximates the integral using a stochastic approach and therefore comes with an error that does not depend on the dimension.
We first define the Monte Carlo method and then give an overview of different implementations for both the quenched and the dynamical theory. Note that the following constructions are not only valid for gauge fields represented through the links. One may use the same algorithms for scalar fields, which will be important in section 3.4.
3.1. Monte Carlo simulations
The basic idea is to approximate the integral using a finite set of so-called gauge configurations. Each configuration Ci is a full set of link variables. The path integral can then be approximated by
\[ \langle O \rangle \approx \frac{1}{N_\mathrm{conf}} \sum_{i=1}^{N_\mathrm{conf}} O[C_i] \, \frac{(\det D[C_i])^s}{Z} \, e^{-S_G[C_i]}, \tag{3.2} \]
where Nconf is the total number of gauge configurations. Alternatively, if the configurations Ci are distributed according to
\[ dP[C] = \frac{(\det D[C])^s}{Z} \, e^{-S_G[C]} \, dC, \tag{3.3} \]
one can write
\[ \langle O \rangle \approx \frac{1}{N_\mathrm{conf}} \sum_{i=1}^{N_\mathrm{conf}} O[C_i]. \tag{3.4} \]
An algorithm to obtain the desired probability distribution (3.3) is the Markov chain. It generates configurations in a stochastic sequence, where each configuration is constructed from the one before. In the limit Nconf → ∞, all configurations are distributed according to the probability distribution (3.3). However, for a finite number of configurations, one has to take the correlation between subsequent configurations into account.
Within the Markov chain, the configuration Cn is selected through a transition probability P(C = Cn−1 → C′ = Cn). This transition probability is independent of the index n. We now have to choose P(C → C′) in such a way that the total probability to generate configuration Ci is given by (3.3). One can show that it is sufficient to require [17]:
1. Ergodicity: Every configuration can be reached in a finite number of steps.
2. Detailed balance: P (C)P (C → C ′) = P (C ′)P (C ′ → C).
3. Normalization: ∑_{C′} P(C → C′) = 1.
Starting with an arbitrary configuration, the probability for generating a certain configuration in the next step depends on this starting configuration. Only after some time will the probability to generate this configuration reach the distribution in (3.3). This process of approaching the so-called equilibrium is called thermalization, and the time until the equilibrium is reached is the thermalization time.
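The three conditions above can be illustrated with a very small Markov chain. The following is only a sketch under stated assumptions: the state space (five states on a ring), the toy action and the Metropolis acceptance rule are illustrative choices that are not taken from this section, and all function names are hypothetical. It shows that a chain fulfilling ergodicity, detailed balance and normalization reproduces the weight exp(−S) after thermalization.

```python
import math
import random


def metropolis_chain(action, n_states, n_steps, rng):
    """Markov chain on the states 0..n_states-1 with target P(C) ~ exp(-action(C)).

    Returns the relative frequency with which each state was visited.
    """
    state = 0
    counts = [0] * n_states
    for _ in range(n_steps):
        # Symmetric proposal: hop to one of the two neighbours on a ring.
        # Every state can be reached in a finite number of steps (ergodicity).
        proposal = (state + rng.choice((-1, 1))) % n_states
        # Metropolis acceptance: with a symmetric proposal this satisfies
        # detailed balance P(C) P(C -> C') = P(C') P(C' -> C).
        if rng.random() < math.exp(min(0.0, action(state) - action(proposal))):
            state = proposal
        counts[state] += 1
    return [c / n_steps for c in counts]


def exact_distribution(action, n_states):
    """Normalized Boltzmann weights exp(-action(C)) / Z for comparison."""
    weights = [math.exp(-action(c)) for c in range(n_states)]
    z = sum(weights)
    return [w / z for w in weights]


def toy_action(c):
    """Hypothetical toy action, used only for this illustration."""
    return 0.5 * (c - 2) ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    print(metropolis_chain(toy_action, 5, 200_000, rng))
    print(exact_distribution(toy_action, 5))
```

The first steps of such a chain still remember the starting state, which is the thermalization effect described above. In practice one would discard them before measuring.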
3.2. Heatbath and Overrelaxation
For generating configurations, the choice of the transition probability has to match the above conditions and, moreover, should lead to a short thermalization time. There are several such algorithms, which are based on different approaches. Here we concentrate on the heatbath and the overrelaxation method, which both use a local update. This means that only one element of the configuration is updated, while all other elements are kept fixed.
We first look at a simple bosonic quadratic action,
\[ S_\phi = \left( \phi_n - \tilde{\phi}_n \right)^2 + \mathrm{const.}, \tag{3.5} \]
where $\phi_n$ is the element that should be updated and $\tilde{\phi}_n$ collects all terms that are connected to $\phi_n$.
The principle of the heatbath algorithm is to generate a new element $\phi'_n$ according to the local probability distribution, which stems from all terms that multiply $\phi_n$. Using the quadratic structure, we can choose a Gaussian distributed random number η with
\[ P(\eta) \propto e^{-\eta^2} \tag{3.6} \]
and set the new element as
\[ \phi_n \to \phi'_n = \tilde{\phi}_n + \eta. \tag{3.7} \]
The overrelaxation algorithm chooses $\phi'_n$ so that $S[\phi_n, \tilde{\phi}_n]$ remains constant, but $\phi'_n$ and $\phi_n$ are maximally different. This can be done by setting
\[ \phi_n \to \phi'_n = 2\tilde{\phi}_n - \phi_n. \tag{3.8} \]
As the action stays constant, the overrelaxation method does not fulfill ergodicity. It only acts within the subspace of configurations with a certain value of the action. Therefore, it has to be combined with another, ergodic algorithm. Such combined update routines benefit from a rapidly decreasing correlation between two successive configurations. To distinguish the individual heatbath and overrelaxation updates from such a combination, the latter is usually called a sweep.
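The two local updates for the quadratic action (3.5) can be sketched in a few lines. This is a minimal illustration for a single real field element, not the update code used in this work. The argument `center` stands for the neighbor-dependent quantity that is subtracted from the updated element in (3.5), and the function names are hypothetical.

```python
import math
import random


def local_action(phi, center):
    """Quadratic local action (phi - center)^2 of equation (3.5), constant dropped."""
    return (phi - center) ** 2


def heatbath(center, rng):
    """Heatbath step (3.7): phi' = center + eta with P(eta) ~ exp(-eta^2).

    exp(-eta^2) is a Gaussian with variance 1/2, hence the width sqrt(1/2).
    The new value does not depend on the old one.
    """
    return center + rng.gauss(0.0, math.sqrt(0.5))


def overrelax(phi, center):
    """Overrelaxation step (3.8): reflect phi about the center.

    The distance to the center, and with it the local action, is unchanged.
    """
    return 2.0 * center - phi


if __name__ == "__main__":
    rng = random.Random(0)
    phi, center = 0.7, 0.2
    reflected = overrelax(phi, center)
    print(local_action(phi, center), local_action(reflected, center))
    print(heatbath(center, rng))
```

Because the reflection never changes the action, only the heatbath step moves the chain between different values of the action, which is why the two are combined in a sweep.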
For SU(N) gauge configurations the situation becomes more complicated. We first introduce methods for quenched SU(2) theory and then construct a method for arbitrary SU(N) matrices based on the SU(2) update.
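Let us assume we want to update a certain link Uµ(n) =: U. The local probability distribution is then given by
\[ dP(U) = dU \exp\left( \frac{\beta}{N} \operatorname{Re} \operatorname{tr}[UA] \right) \tag{3.9} \]
with
\[ A := \sum_{\nu \neq \mu} \Big( U_\nu(n+\mu)\, U_{-\mu}(n+\mu+\nu)\, U_{-\nu}(n+\nu) + U_{-\nu}(n+\mu)\, U_{-\mu}(n+\mu-\nu)\, U_\nu(n-\nu) \Big). \tag{3.10} \]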
The quantity A consists of all constant links that are multiplied with Uµ(n) and is often referred to as the staple. For SU(2) matrices, A may be rewritten as A = aV, where $a = \sqrt{\det A}$ and V is an SU(2) matrix. Using the invariance of the Haar measure under shifts of the origin in group space, d(UV) = dU, we can rewrite the local probability distribution as
\[ dP(X) = dX \exp\left( \frac{a\beta}{2} \operatorname{Re} \operatorname{tr}[X] \right) \tag{3.11} \]
with X = UV. If we generate a matrix X according to the above probability distribution, the new link is obtained by
\[ U \to U'_\mu(n) = U' = X V^\dagger. \tag{3.12} \]
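For the details of generating X accordingly see [18] and [10]. In the case of SU(2) overrelaxation, one can use
\[ U \to U' = V^\dagger U^\dagger V^\dagger, \tag{3.13} \]
which leaves the local action unchanged, since the trace of an SU(2) matrix is real:
\[ \operatorname{tr}[U'A] = \operatorname{tr}[V^\dagger U^\dagger V^\dagger A] = a \operatorname{tr}[V^\dagger U^\dagger] = \operatorname{tr}[UA]. \tag{3.14} \]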
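The invariance of tr[UA] under the SU(2) overrelaxation step can be checked numerically. The sketch below is an illustration under assumptions: the staple is modeled as a sum of random SU(2) matrices instead of actual lattice links, 2×2 matrices are stored as nested lists, and all helper names are hypothetical.

```python
import math
import random


def mul(x, y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(x):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]


def trace(x):
    return x[0][0] + x[1][1]


def det(x):
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]


def random_su2(rng):
    """Random SU(2) matrix built from a normalized real four-vector."""
    a = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in a))
    a0, a1, a2, a3 = (c / norm for c in a)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]


def random_staple(rng, n_terms=6):
    """Stand-in for the staple A: a sum of SU(2) matrices (assumption)."""
    total = [[0j, 0j], [0j, 0j]]
    for _ in range(n_terms):
        g = random_su2(rng)
        total = [[total[i][j] + g[i][j] for j in range(2)] for i in range(2)]
    return total


def overrelax_su2(u, staple):
    """Overrelaxation U' = V^dagger U^dagger V^dagger with A = a V, a = sqrt(det A)."""
    a = math.sqrt(det(staple).real)
    v = [[entry / a for entry in row] for row in staple]
    v_dag = dagger(v)
    return mul(mul(v_dag, dagger(u)), v_dag)


if __name__ == "__main__":
    rng = random.Random(42)
    u, staple = random_su2(rng), random_staple(rng)
    u_new = overrelax_su2(u, staple)
    print(trace(mul(u, staple)), trace(mul(u_new, staple)))
```

The decomposition A = aV relies on the fact that a sum of SU(2) matrices is again proportional to an SU(2) matrix, which does not hold for SU(3).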
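For SU(N) theories, Cabibbo and Marinari invented an algorithm based on embedded SU(2) subgroups [19]. For SU(3) these subgroups can be chosen as
\[ R = \begin{pmatrix} r_{11} & r_{12} & 0 \\ r_{21} & r_{22} & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad S = \begin{pmatrix} s_{11} & 0 & s_{12} \\ 0 & 1 & 0 \\ s_{21} & 0 & s_{22} \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 & 0 \\ 0 & t_{11} & t_{12} \\ 0 & t_{21} & t_{22} \end{pmatrix}. \tag{3.15} \]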
We now update the link U by left multiplication with one of these subgroup elements, for example R. Writing W := UA, we get for the trace
\[ \operatorname{tr}[RW] = r_{11}w_{11} + r_{12}w_{21} + r_{21}w_{12} + r_{22}w_{22} + \text{terms without } r_{ij}. \tag{3.16} \]
We observe that only the sub-block elements of W corresponding to those of R are relevant. Therefore, one gets the same situation as in the SU(2) case, except that this sub-block of W is not proportional to an SU(2) matrix. This can be fixed by setting
\[ W \to W' = \frac{1}{2} \begin{pmatrix} w_{11} + w_{22}^* & w_{12} - w_{21}^* \\ w_{21} - w_{12}^* & w_{11}^* + w_{22} \end{pmatrix}, \tag{3.17} \]
which leaves the real part of the trace unchanged but can be rewritten as W′ = aV. From now on, we can use the SU(2) update routines. With the other two submatrices, the full link update is given by
\[ U \to U' = TSRU. \tag{3.18} \]
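The projection (3.17) can be verified with a short numerical experiment. The following sketch assumes a random complex 2×2 block in place of the actual sub-block of W and a random SU(2) matrix in place of the sub-block of R. It checks that the real part of the trace is unchanged and that the projected block has the form of a multiple of an SU(2) matrix. The function names are hypothetical.

```python
import math
import random


def random_su2(rng):
    """Random SU(2) matrix built from a normalized real four-vector."""
    a = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in a))
    a0, a1, a2, a3 = (c / norm for c in a)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]


def random_block(rng):
    """Arbitrary complex 2x2 block, standing in for the sub-block of W."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
            for _ in range(2)]


def project_block(w):
    """Equation (3.17): map a 2x2 block to a multiple of an SU(2) matrix."""
    (w11, w12), (w21, w22) = w
    return [[0.5 * (w11 + w22.conjugate()), 0.5 * (w12 - w21.conjugate())],
            [0.5 * (w21 - w12.conjugate()), 0.5 * (w11.conjugate() + w22)]]


def re_trace_product(r, w):
    """Re tr[r w] for two 2x2 matrices."""
    return sum(r[i][j] * w[j][i] for i in range(2) for j in range(2)).real


if __name__ == "__main__":
    rng = random.Random(5)
    r, w = random_su2(rng), random_block(rng)
    print(re_trace_product(r, w), re_trace_product(r, project_block(w)))
```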
This method also remains valid for non-quenched actions, as can be seen in section 3.4.3.
For a full lattice update, one has to visit all of its elements. The order in which the individual links are visited does not affect the equilibrium distribution, but it influences how quickly the equilibrium is reached. A common choice is to first update all elements which sit on even lattice points and then all elements on odd points.
3.3. Rational Hybrid Monte Carlo
For generating dynamical configurations, that is, for including the determinant in equation (3.3) in the probability distribution, one needs a different approach. Instead of a local algorithm, the Hybrid Monte Carlo updates the whole configuration in one step. Let us first look at s = 2. The determinant may then be rewritten as
\[ (\det D)^2 = \det[DD^\dagger] = \pi^{-|\Lambda|} \int \mathcal{D}\phi \, \mathcal{D}\phi^\dagger \, e^{-\phi^\dagger (DD^\dagger)^{-1} \phi}, \tag{3.19} \]
where φ is a complex, bosonic field with color structure living on the lattice points, often called pseudo fermion. The full partition function is then given by
\[ Z = \int \mathcal{D}\phi \, \mathcal{D}\phi^\dagger \, \mathcal{D}U \, e^{-\phi^\dagger (DD^\dagger[U])^{-1} \phi - S_G[U]}, \tag{3.20} \]
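where we omitted the trivial $\pi^{-|\Lambda|}$ factor. We now insert another field Pµ(n) that does not change expectation values,
\[ Z = \int \mathcal{D}P \, \mathcal{D}\phi \, \mathcal{D}\phi^\dagger \, \mathcal{D}U \, \exp\left( -\frac{1}{2} \sum_{n,\mu} \operatorname{tr}[P_\mu(n)^2] - \phi^\dagger (DD^\dagger[U])^{-1} \phi - S_G[U] \right). \tag{3.21} \]
If the links are parametrized through algebra-valued fields Qµ(n), the exponent can be interpreted as a Hamiltonian with the conjugate variables Q and P. The principle of the Hybrid Monte Carlo is to solve the resulting equations of motion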
\[ \dot{P} = -\frac{\partial H}{\partial Q}, \tag{3.23} \]
\[ \dot{Q} = \frac{\partial H}{\partial P} \tag{3.24} \]
numerically using a leapfrog algorithm [20]. The necessary fields φ and P are newly generated for each update.
With an exact solution of the equations of motion, this would result in the right probability distribution. However, the numerical integration introduces a finite step size error, which has to be corrected. This can be done by a Metropolis step: a generated configuration is only accepted if a random number r ∈ [0, 1) is smaller than exp(H − H′), where H and H′ are the values of the Hamiltonian before and after the update.
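The interplay of momentum refreshment, leapfrog integration and the Metropolis correction can be sketched for a single real variable. This is a toy illustration under assumptions: the action q²/2 replaces the gauge and pseudo fermion action, the kinetic term is p²/2, and the function names are hypothetical. It is not the RHMC implementation used in this work.

```python
import math
import random


def leapfrog(q, p, grad, step, n_steps):
    """Integrate dq/dt = p, dp/dt = -grad(q) with the leapfrog scheme."""
    p -= 0.5 * step * grad(q)          # initial half step for the momentum
    for i in range(n_steps):
        q += step * p                  # full step for the coordinate
        if i < n_steps - 1:
            p -= step * grad(q)        # full step for the momentum
    p -= 0.5 * step * grad(q)          # final half step for the momentum
    return q, p


def hmc_update(q, action, grad, step, n_steps, rng):
    """One trajectory followed by the Metropolis accept/reject step."""
    p = rng.gauss(0.0, 1.0)            # fresh momentum for every update
    h_old = 0.5 * p * p + action(q)
    q_new, p_new = leapfrog(q, p, grad, step, n_steps)
    h_new = 0.5 * p_new * p_new + action(q_new)
    # Accept if r < exp(H - H'); min() only avoids an overflow in exp.
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new, True
    return q, False


def toy_action(q):
    """Hypothetical toy action with exact variance <q^2> = 1."""
    return 0.5 * q * q


def toy_gradient(q):
    return q


if __name__ == "__main__":
    rng = random.Random(11)
    q, total = 0.0, 0.0
    n = 20_000
    for _ in range(n):
        q, _ = hmc_update(q, toy_action, toy_gradient, 0.3, 5, rng)
        total += q * q
    print("estimated <q^2>:", total / n)
```

The leapfrog scheme is reversible and area preserving, which is what makes the simple acceptance test sufficient to remove the step size error.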
For s ≠ 2 the necessary inverse $(DD^\dagger)^{-s/2}$ can be approximated with a rational function, which gives the 'Rational Hybrid Monte Carlo' (RHMC) its name.
A great advantage of this rational Hybrid Monte Carlo is that it does not rely on special properties of the used action. This makes it possible to use improved actions like the HISQ action [15]. The disadvantage is the global update, which makes numerical approaches that rely on the locality of the action not applicable.
3.4. Local boson fields
The method of local boson fields is another way to approximate the determinant det D in equation (3.3), first proposed by Lüscher in 1993 [9]. In contrast to the RHMC, the corresponding actions stay local.
The main idea is to approximate this determinant using a polynomial and a set of local boson fields. The polynomial
\[ P_N(z) = \sum_{i=0}^{N} c_i z^i = c_N \prod_{k=1}^{N} (z - z_k) \tag{3.25} \]
is chosen in such a way that it approximates $z^{-s}$ in a certain interval. The determinant may now be moved into the denominator of a fraction by
\[ (\det D)^s \approx \frac{1}{\det P_N(D)}. \tag{3.26} \]
The convergence of this matrix polynomial is only given, if the eigenvalues lie inside the convergence region of P (z). Let us assume that this is the case for P with matrix D. Moreover we require at this point that the roots zk come in complex conjugate pairs.
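Then we can rewrite the determinant as a Gaussian integral over boson fields:
\[
\begin{aligned}
(\det D)^s &\approx |c_N|^{-|\Lambda|} \prod_{k=1}^{N} \frac{1}{\det(D - z_k)} \\
&= |c_N|^{-|\Lambda|} \prod_{k=1}^{N/2} \frac{1}{\det\left[ (D - z_k)^\dagger (D - z_k) \right]} \\
&= |c_N|^{-|\Lambda|} \int \prod_{k=1}^{N/2} \mathcal{D}\phi_k \, \mathcal{D}\phi_k^\dagger \, \exp\left( -\sum_{k=1}^{N/2} \phi_k^\dagger (D - z_k)^\dagger (D - z_k) \phi_k \right) \\
&=: |c_N|^{-|\Lambda|} \int \prod_{k=1}^{N/2} \mathcal{D}\phi_k \, \mathcal{D}\phi_k^\dagger \, e^{-S_L}.
\end{aligned} \tag{3.27}
\]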
Here the φk are, as for the RHMC, boson fields with color structure, so-called pseudo fermions, and |Λ| is the volume of the lattice. In the first step we have used
\[ D^\dagger_{n,m} = \eta_5(n) D_{n,m} \eta_5(m), \quad \eta_5(n)^2 = 1 \tag{3.28} \]
for staggered fermions or
\[ D^\dagger_{n,m} = \gamma_5 D_{n,m} \gamma_5, \quad \gamma_5^2 = 1 \tag{3.29} \]
for Wilson fermions, respectively.
3.4.1. Even odd preconditioning
The convergence of the polynomial depends on the position of the eigenvalues. As the spectrum of the Wilson matrix differs from the staggered formulation and the numerical calculations in this work are performed using the latter, we concentrate in the following on staggered fermions. For Wilson fermions see [21, 22].
The eigenvalues of the staggered Dirac matrix lie on a line between m − iλmax and m + iλmax. This is rather inconvenient for the construction of the polynomial. Through even odd preconditioning it is possible to map this region to $[m^2, m^2 + \lambda_\mathrm{max}^2]$. A construction without even odd preconditioning with s = 1 can be found in [23].
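The concept is based on the even odd symmetry of the Dirac matrix. We observe that D only connects even lattice points with odd ones and vice versa. If we now order the vector of boson fields as
\[ \phi = \begin{pmatrix} \phi_e \\ \phi_o \end{pmatrix}, \tag{3.30} \]
the Dirac matrix takes the block form
\[ D := \begin{pmatrix} m\mathbf{1} & D_{eo} \\ D_{oe} & m\mathbf{1} \end{pmatrix}, \tag{3.31} \]
and its determinant can be written as
\[ \det D = \det(m^2 - D_{oe}D_{eo}). \tag{3.32} \]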
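Therefore, the spectrum of the preconditioned matrix $\hat{D} := m^2 - D_{oe}D_{eo}$ lies, as intended, within $[m^2, m^2 + \lambda_\mathrm{max}^2]$. We could now use $\hat{D}$ for the formulation of the Lüscher action $S_L$:
\[ S_L = \sum_{k=1}^{N/2} \phi_k^\dagger (\hat{D} - z_k)^\dagger (\hat{D} - z_k) \phi_k. \tag{3.33} \]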
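However, $\hat{D}^\dagger \hat{D}$ with $\hat{D} = m^2 - D_{oe}D_{eo}$ is less local, and an update formulation becomes much harder. This can be solved by rewriting the roots $z_k \to r_k$ and demanding
\[ \det(\hat{D} - z_k) = \det(m^2 - D_{oe}D_{eo} - z_k) \overset{!}{=} \det(D - r_k) = \det \begin{pmatrix} m - r_k & D_{eo} \\ D_{oe} & m - r_k \end{pmatrix}. \tag{3.34} \]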
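The new 'roots' are given by $r_k = m - \sqrt{m^2 - z_k}$, and the even odd preconditioning is simply introduced by a redefinition of the roots.
For the construction in (3.27) we need the roots to come in complex conjugate pairs. This can be ensured if we choose
\[ \sqrt{z_k} = \begin{cases} \sqrt{r}\, e^{i\varphi/2}, & \text{for } \operatorname{Im}(z_k) > 0 \\ \sqrt{r}\, e^{i(\varphi/2 + \pi)}, & \text{for } \operatorname{Im}(z_k) < 0 \end{cases} \tag{3.35} \]
with $r = |z_k|$ and $\varphi = \arg(z_k)$. The total Lüscher action now reads
\[ S_L = \sum_{k=1}^{N/2} \phi_k^\dagger (D - r_k)^\dagger (D - r_k) \phi_k. \tag{3.36} \]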
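The redefinition of the roots can be checked on the level of single eigenvalues. The sketch below is an illustration under assumptions: `lam2` denotes one eigenvalue of −D_oe D_eo, the square root is taken on the principal branch of Python's `cmath.sqrt` (which may differ from the branch convention chosen in this section), and the function names are hypothetical.

```python
import cmath


def shifted_root(z, m):
    """New root r = m - sqrt(m^2 - z), so that (m - r)^2 = m^2 - z."""
    return m - cmath.sqrt(m * m - z)


def preconditioned_factor(lam2, z, m):
    """Eigenvalue of m^2 - D_oe D_eo - z for an eigenvalue lam2 of -D_oe D_eo."""
    return m * m + lam2 - z


def block_factor(lam2, r, m):
    """Contribution of the same eigenvalue to det of the block matrix D - r."""
    return (m - r) ** 2 + lam2


if __name__ == "__main__":
    m, z, lam2 = 0.1, complex(0.3, 0.2), 1.7
    r = shifted_root(z, m)
    print(preconditioned_factor(lam2, z, m), block_factor(lam2, r, m))
    # Complex conjugate roots z are mapped to complex conjugate roots r.
    print(r, shifted_root(z.conjugate(), m))
```

Since both factors agree for every eigenvalue, the two determinants agree as well, which is the content of the requirement on the new roots.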
3.4.2. Choosing the polynomial
The accuracy of this local boson theory is controlled by the order N of the polynomial and by its roots. To save computer time and memory, it is necessary to get the best accuracy for a given N. For this purpose, different choices for the polynomial have been proposed. For Wilson fermions, Lüscher originally suggested Chebychev polynomials [9]. In [24] Montvay suggested using a least-squares algorithm to optimize the polynomial. De Forcrand et al. proposed adapted polynomials to reduce the error of the determinant [25]. For simplicity we concentrate on Chebychev polynomials and adopt the construction given in [26]. The Chebychev polynomials are defined by
\[ T_k[x] = \cos(k \arccos(x)). \tag{3.37} \]
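Through the even odd preconditioning, the eigenvalues of the preconditioned matrix $D_{oo} = m^2 - D_{oe}D_{eo}$ lie within $[m^2, m^2 + \lambda_\mathrm{max}^2]$, where $\lambda_\mathrm{max}^2$ is the largest eigenvalue of $-D_{oe}D_{eo}$. To match the convergence region of the Chebychev polynomials, [−1, 1], we have to rescale and shift this matrix. We start with the rescaling,
\[ D'_{oo} = \frac{2}{2m^2 + \lambda_\mathrm{max}^2} D_{oo}, \tag{3.38} \]
and then shift with
\[ D'_{oo} =: \mathbf{1} - (1 - \epsilon) M_{oo}, \tag{3.39} \]
where ε is defined as
\[ \epsilon := 1 - \frac{\lambda_\mathrm{max}^2}{2m^2 + \lambda_\mathrm{max}^2}, \tag{3.40} \]
so that
\[ M_{oo} = \mathbf{1} + \frac{2}{\lambda_\mathrm{max}^2} D_{oe}D_{eo}. \tag{3.41} \]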
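The spectrum of $-M_{oo}$ then obviously lies within [−1, 1]. We now write the polynomial expansion of $x^{-s}$ in terms of y as
\[ x^{-s} = [1 + (1 - \epsilon) y]^{-s} \approx \sum_{k=0}^{N} c_k T_k[y], \tag{3.42} \]
with y = (x − 1)/(1 − ε). The coefficients $c_k$ may then be calculated as
\[ c_k = \frac{2}{\pi (1 + \delta_{k,0})} \int_{-1}^{1} \frac{T_k[y] \, [1 + (1 - \epsilon) y]^{-s}}{\sqrt{1 - y^2}} \, dy = \frac{2 r^k}{1 + \delta_{k,0}} (1 + r^2)^s F(s, s+k; 1+k; r^2) \frac{\Gamma(s+k)}{\Gamma(s)\Gamma(1+k)}, \tag{3.43} \]
where
\[ r = \frac{-1 + \sqrt{\epsilon (2 - \epsilon)}}{1 - \epsilon}, \tag{3.44} \]
F(α, β; γ; z) is the Gaussian hypergeometric function and Γ(z) is the Gamma function. It can be shown that the absolute error of the polynomial is bounded by [26]
\[ \left| x^{-s} - P_N[x] \right| \le \frac{2}{\Gamma(s)} \left( \frac{1 + r^2}{1 - r^2} \right)^s \frac{(-r)^{N+1}}{1 + r}, \tag{3.45} \]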
with −1 ≤ r ≤ 0, which results in an exponential error reduction with increasing N. However, the rate of this decrease depends on the value of r. With the definition in (3.40) we get ε ≪ 1 for m ≪ 1 and, therefore, |r| ≲ 1. This is the main problem of the local boson construction, as the degree N of the polynomial has to be increased rapidly for decreasing mass m. Figure 3.1 shows the upper error bound as a function of the mass for different numbers of roots. One clearly sees how the effort rises for decreasing m. Note that the largest absolute error usually occurs near m². Therefore, the relative error, defined as the residual
\[ R(x) = \left| x \, (P_N(x))^{1/s} - 1 \right|, \tag{3.46} \]
is much smaller. Figure 3.2 shows that this residual is uniformly distributed over the convergence region. Again the need for a large number of roots for smaller masses becomes evident.
For the formulation of the local boson theory, the roots have to come in complex conjugate pairs. As the coefficients ck are real, this can easily be ensured by choosing N even. The roots may then be calculated numerically. They lie around the convergence region on an ellipse, as can be checked in figure 3.3.
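The Chebychev construction can be reproduced with the standard library alone. The sketch below evaluates the closed form of the coefficients in (3.43), sums the truncated series (3.42) and compares the deviation from x^(−s) with the bound (3.45). It is an illustration under assumptions: the hypergeometric function is summed as a plain power series (adequate for r² < 1), the parameter values in the example are arbitrary, and all function names are hypothetical.

```python
import math


def hyp2f1(a, b, c, z, tol=1e-16, max_terms=10_000):
    """Gaussian hypergeometric series F(a, b; c; z) for |z| < 1."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def chebyshev_setup(m, lam_max):
    """epsilon from (3.40) and r from (3.44)."""
    eps = 1.0 - lam_max ** 2 / (2.0 * m * m + lam_max ** 2)
    r = (-1.0 + math.sqrt(eps * (2.0 - eps))) / (1.0 - eps)
    return eps, r


def coefficients(s, r, order):
    """Coefficients c_0..c_order from the closed form in (3.43)."""
    cs = []
    for k in range(order + 1):
        gamma_ratio = math.exp(math.lgamma(s + k) - math.lgamma(s) - math.lgamma(1 + k))
        delta = 1.0 if k == 0 else 0.0
        cs.append(2.0 * r ** k / (1.0 + delta) * (1.0 + r * r) ** s
                  * hyp2f1(s, s + k, 1 + k, r * r) * gamma_ratio)
    return cs


def evaluate(cs, x, eps):
    """Truncated series sum_k c_k T_k[y] with y = (x - 1) / (1 - eps)."""
    y = max(-1.0, min(1.0, (x - 1.0) / (1.0 - eps)))  # guard against rounding
    theta = math.acos(y)
    return sum(c * math.cos(k * theta) for k, c in enumerate(cs))


def error_bound(s, r, order):
    """Right-hand side of (3.45)."""
    return (2.0 / math.gamma(s) * ((1.0 + r * r) / (1.0 - r * r)) ** s
            * (-r) ** (order + 1) / (1.0 + r))


if __name__ == "__main__":
    s, order = 0.5, 32
    eps, r = chebyshev_setup(m=0.5, lam_max=2.5)
    cs = coefficients(s, r, order)
    xs = [eps + (2.0 - 2.0 * eps) * i / 200 for i in range(201)]
    print("max error:", max(abs(x ** (-s) - evaluate(cs, x, eps)) for x in xs))
    print("bound    :", error_bound(s, r, order))
```

Here x is the rescaled variable, so the interval [ε, 2 − ε] corresponds to the spectrum of the rescaled matrix. Lowering m in the example pushes r towards −1 and shows directly how quickly the required order grows.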
Figure 3.1: Absolute error of the approximating polynomial according to equation (3.45) as a function of the mass m for different numbers of roots N (N = 64, 128, 256, 512). λmax = 2.5, s = 1/2.
Figure 3.2: Residual R(x) for different numbers of roots N at fixed m = 0.064 (left) and different values of m (m = 0.016, 0.032, 0.064, 0.1) at fixed N = 128 (right), λmax = 2.5, s = 1/2.
Figure 3.3: Position of the roots zk in the complex plane (axes: real(x), imag(x)) for N = 32, m = 0.064, λmax = 2.5 and s = 1/2. The color map shows the value of the polynomial at the given point z.
3.4.3. Explicit update formulation
In the following, explicit calculations for the update process are shown. For the sake of brevity we skip the staggered phases and assume that they have been multiplied into the links. We aim to rewrite (3.27) in such a way that the overrelaxation and heatbath algorithms can be applied. For the boson fields, this requires a Gaussian-like structure for one selected field element φk(x) in the exponent. First we expand the Lüscher action and get
\[ S_{L,k} = \phi_k^\dagger (M + m - r_k)^\dagger (M + m - r_k) \phi_k = \phi_k^\dagger \left( M^\dagger M + 2 \operatorname{Im}(r_k)\, i M - 2m \operatorname{Re}(r_k) + m^2 + |r_k|^2 \right) \phi_k, \tag{3.47} \]
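where we have used M† = −M. We now have to factor out a certain element of φk and therefore need to look closer at
\[ \phi_k^\dagger M^\dagger M \phi_k = \sum_n \left( \frac{1}{2} \sum_\mu \left[ U_\mu(n) \phi_k(n+\mu) - U_\mu^\dagger(n-\mu) \phi_k(n-\mu) \right] \right)^\dagger \cdot \left( \frac{1}{2} \sum_\nu \left[ U_\nu(n) \phi_k(n+\nu) - U_\nu^\dagger(n-\nu) \phi_k(n-\nu) \right] \right) \tag{3.48} \]
\[
\begin{aligned}
= \frac{1}{4} \sum_n \phi_k^\dagger(n) \sum_{\mu,\nu} \Big[ & \underbrace{-U_\mu(n) U_\nu(n+\mu) \phi_k(n+\mu+\nu)}_{=:\, i(n)} \; \underbrace{+ U_\mu(n) U_\nu^\dagger(n+\mu-\nu) \phi_k(n+\mu-\nu)}_{=:\, ii(n)} \\
& \underbrace{+ U_\mu^\dagger(n-\mu) U_\nu(n-\mu) \phi_k(n-\mu+\nu)}_{=:\, iii(n)} \; \underbrace{- U_\mu^\dagger(n-\mu) U_\nu^\dagger(n-\nu-\mu) \phi_k(n-\nu-\mu)}_{=:\, iv(n)} \Big].
\end{aligned} \tag{3.49}
\]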
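When updating the boson fields, we change one element φk(x) of the field φk and keep the rest, φk(y) with y ≠ x, fixed. We then get terms that connect the element φk(x) with other, constant field elements:
\[ \phi_k^\dagger(x) X \phi_k(y) := \frac{1}{4} \phi_k^\dagger(x) \sum_{\mu,\nu} \Big[ i(x) + (1 - \delta_{\mu\nu}) \big( ii(x) + iii(x) \big) + iv(x) \Big]. \tag{3.50} \]
The different connections of the field elements and links are visualized in figure 3.4. The remaining terms connect φ(x) with itself and simplify to
\[ \phi_k^\dagger(x) Y \phi_k(x) := \frac{1}{4} \phi_k^\dagger(x) \sum_{\mu = \nu} \big( ii(x) + iii(x) \big) = 2 |\phi_k(x)|^2, \tag{3.51} \]
and we get in total
\[ \phi_k^\dagger M^\dagger M \phi_k = \phi_k^\dagger(x) X \phi_k(y) + \phi_k^\dagger(y) X^\dagger \phi_k(x) + 2 |\phi_k(x)|^2 + \mathrm{const.} \tag{3.52} \]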
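For the update process, we have to include all terms which contain the element φk(x). These can stem from the left side of equation (3.47) and from the right side. We get
\[
\begin{aligned}
\phi_k^\dagger (D - r_k)^\dagger (D - r_k) \phi_k &= \phi_k^\dagger (M + m - r_k)^\dagger (M + m - r_k) \phi_k \\
&= \phi_k^\dagger(x) \underbrace{\left[ X \phi_k(y) + 2 \operatorname{Im}(r_k)\, i M \phi_k(y) \right]}_{:=\, b_k} + b_k^\dagger \phi_k(x) + \underbrace{\left( 2 - 2m \operatorname{Re}(r_k) + m^2 + |r_k|^2 \right)}_{:=\, A_k} |\phi_k(x)|^2 + \mathrm{const.}
\end{aligned} \tag{3.53}
\]
and can now introduce the Gaussian-like structure with
\[ \phi_k^\dagger (D - r_k)^\dagger (D - r_k) \phi_k = \left( A_k^{1/2} \phi_k^\dagger(x) + b_k^\dagger A_k^{-1/2} \right) \left( A_k^{1/2} \phi_k(x) + b_k A_k^{-1/2} \right) + \mathrm{const.} \tag{3.54} \]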
Figure 3.4: Visualization of the squared Dirac matrix (lattice directions n1 and n2). The µ direction is fixed to n1, while both two-dimensional possibilities for the direction of ν (n1 and n2) are shown (dashed links). The starting field element φ(n) is colored in blue.
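With a Gaussian distributed field χ, one heatbath step is given by
\[ \chi = A_k^{1/2} \phi'_k(x) + b_k A_k^{-1/2} \quad \Leftrightarrow \quad \phi'_k(x) = \chi A_k^{-1/2} - b_k A_k^{-1}. \tag{3.55} \]
The corresponding overrelaxation step reads
\[ \phi'_k(x) = -\phi_k(x) - 2 A_k^{-1} b_k. \tag{3.56} \]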
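In the gauge field update, we want to update one certain link Uσ(x) and keep the rest constant. To see where this one link contributes to the total value of the Lüscher action, we look again at (3.48) and factor out Uσ(x). We then get
\[ S_{L,k} = \frac{1}{2} \Big( \phi_k^\dagger(x+\sigma) U_\sigma^\dagger(x) a_k - \phi_k^\dagger(x) U_\sigma(x) b_k + a_k^\dagger U_\sigma(x) \phi_k(x+\sigma) - b_k^\dagger U_\sigma^\dagger(x) \phi_k(x) \Big) + \mathrm{const.} \tag{3.57} \]
Here ak and bk collect the constant fields and links that multiply the selected link. According to their definitions (3.58) and (3.59), ak contains a term proportional to Im(rk)φk(x), while bk contains sums over ν ≠ σ of link products and a term proportional to Im(rk)φk(x + σ).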
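Since everything is scalar in total, this can be combined to
\[ S_{L,k} = \operatorname{Re} \left[ \phi_k^\dagger(x+\sigma) U_\sigma^\dagger(x) a_k - \phi_k^\dagger(x) U_\sigma(x) b_k \right] + \mathrm{const.} \tag{3.60} \]
and we can introduce a trace over this scalar and reorder the links and fields:
\[
\begin{aligned}
S_{L,k} &= \operatorname{Re} \left\{ \operatorname{tr} \left[ \phi_k^\dagger(x+\sigma) U_\sigma^\dagger(x) a_k - \phi_k^\dagger(x) U_\sigma(x) b_k \right] \right\} + \mathrm{const.} \\
&= \operatorname{Re} \left\{ \operatorname{tr} \left[ a_k^\dagger U_\sigma(x) \phi_k(x+\sigma) - \phi_k^\dagger(x) U_\sigma(x) b_k \right] \right\} + \mathrm{const.} \\
&= \operatorname{Re} \left\{ \operatorname{tr} \left[ U_\sigma(x) \left( \phi_k(x+\sigma) a_k^\dagger - b_k \phi_k^\dagger(x) \right) \right] \right\} + \mathrm{const.} \\
&= \operatorname{Re} \left\{ \operatorname{tr} \left[ U_\sigma(x) F_{\phi,k} \right] \right\} + \mathrm{const.}
\end{aligned} \tag{3.61}
\]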
Here we have used the fact that we are just considering the real part in the first step. Including the staple F′U and the link U′σ stemming from the non-staggered gauge part of the action, we get in total
\[ S = U'_\sigma F'_U - U_\sigma \sum_k F_{\phi,k}. \tag{3.62} \]
For staggered fermions, the gauge part has been transformed as U′σF′U = −UσFU and we can use the standard overrelaxation and heatbath algorithms with the weight
\[ P[U, \phi] = \frac{1}{Z} \exp\left\{ -U_\sigma(x) \left( F_U + \sum_k F_{\phi,k} \right) \right\}. \tag{3.63} \]
This formulation shows one of the great disadvantages of the local boson theory. It is very difficult to bring improvements of the action, like the HISQ action, into the above form. Moreover, such improvements are less local, whereas locality is necessary for the error reduction in section 4. Therefore, one has to deal with O(a²) errors. In addition, it is impossible to introduce terms that reduce the taste mixing of the staggered action, as the HISQ action does.
Figure 4.1: Lattice structure for one sublattice and a Wilson loop wrapping around the lattice (labels in the sketch: sublattice V with operator X[V], fixed region U with operator A[U]).
4. Error reduction for gluonic operators
Many physical observables can be calculated on the lattice using pure gluonic operators. However, this does not mean one can fully skip the fermionic part, as it also influences the link variables. Therefore, in the quenched approximation, one neglects some information.
In the calculation of expectation values in lattice gauge theories, one has to deal with a finite set of gauge configurations. This leads to statistical errors of these expectation values. Unfortunately, these errors can be very large for some operators, especially for correlations over large distances. To reduce the noise, which usually stems only from parts of the operator, different approaches have been suggested. In 1983, Parisi et al. suggested integrating over individual links while keeping all other links constant [27]. The disadvantage is that this link integration is only possible for operators whose links lie on a straight line.
This problem was solved by Lüscher and Weisz using sublattice updates [8]. Both methods have in common that they only work with local actions. So far, no attempt has been made to use error reduction methods for dynamical fermions. In the following we first describe the basic methods and then formulate a variant of the Lüscher-Weisz algorithm for dynamical fermions.
4.1. Lüscher-Weisz method
We start with a method in which the noise of the contribution of one part of an operator is reduced, and then show that the statistical error decreases rapidly if we reduce the noise in more than one part. We assume that we can split the operator O into a product of the operators X[V] and A[U], where U and V are disjoint sets of link variables in different sublattices. An example of such an operator is a Wilson loop wrapping around the whole lattice. Figure 4.1 shows how the lattice is divided into two sublattices. It is not necessary that the links at the borders of the sublattice are included. The sublattice then has the form of a comb (see figure 4.4).
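The expectation value ⟨XA⟩ is computed according to
\[ \langle O \rangle = \langle XA \rangle = \frac{1}{Z} \int \mathcal{D}U \, \mathcal{D}V \, X[V] \, A[U] \, e^{-S[U,V]} \tag{4.1} \]
with
\[ Z = \int \mathcal{D}U \, \mathcal{D}V \, e^{-S[U,V]}. \tag{4.2} \]
Figure 4.2 (sketch; regions labeled V, U, W).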
To reduce the noise stemming from the operator X, the central idea is to replace X[V] by another operator ⟨X⟩_U without changing the expectation value. We define ⟨X⟩_U as
\[ \langle X \rangle_U = \frac{\int \mathcal{D}V' \, X[V'] \, e^{-S_\mathrm{loc.}[U,V']}}{\int \mathcal{D}V' \, e^{-S_\mathrm{loc.}[U,V']}}, \tag{4.3} \]
where S_loc.[U, V′] consists of all terms in the local action that are connected to any element of V. Therefore, ⟨X⟩_U depends on U only through the border links that lie close to V. For a better overview we write
\[ f(U, V') := e^{-S_\mathrm{loc.}[U,V']} \tag{4.4} \]
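and define g(U) to collect all remaining terms, including A[U]. We then get
\[
\begin{aligned}
\langle XA \rangle &= \frac{1}{Z} \int \mathcal{D}U \, \mathcal{D}V \, X[V] \, f(U,V) \, g(U) \\
&= \frac{1}{Z} \int \mathcal{D}U \, \mathcal{D}V \, X[V] \, f(U,V) \, g(U) \, \frac{\int \mathcal{D}V' f(U,V')}{\int \mathcal{D}V' f(U,V')} \\
&= \frac{1}{Z} \int \mathcal{D}U \, \mathcal{D}V \, f(U,V) \, g(U) \, \frac{\int \mathcal{D}V' X[V'] f(U,V')}{\int \mathcal{D}V' f(U,V')} \\
&= \langle A \, \langle X \rangle_U \rangle,
\end{aligned} \tag{4.5}
\]
where we have first expanded the fraction with ∫DV′ f(U, V′), then used the symmetry between V and V′ in the numerator and changed X[V] to X[V′] in the third line. Since ⟨X⟩_U is an averaged quantity, the noise stemming from X in Monte Carlo simulations is now reduced.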
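The identity derived above can be verified exactly on a model that is small enough to enumerate. The sketch below is only an illustration under assumptions: three Ising-like spins replace the link variables, with two fixed spins u and one sublattice spin v, and the couplings, operators and function names are hypothetical choices. It compares the direct expectation value with the one in which X is replaced by its sublattice average at fixed u.

```python
import itertools
import math

SPINS = (-1, 1)


def s_loc(u, v, beta):
    """Terms of the toy action that involve the sublattice variable v."""
    return -beta * v * (u[0] + u[1])


def s_rest(u, beta):
    """Remaining terms of the toy action, depending on the fixed variables only."""
    return -beta * u[0] * u[1]


def x_op(v):
    """Operator X[V] living in the sublattice."""
    return v


def a_op(u):
    """Operator A[U] living on the fixed variables."""
    return u[0]


def direct(beta):
    """Expectation value of X*A by full enumeration."""
    num = den = 0.0
    for u in itertools.product(SPINS, repeat=2):
        for v in SPINS:
            weight = math.exp(-s_loc(u, v, beta) - s_rest(u, beta))
            num += x_op(v) * a_op(u) * weight
            den += weight
    return num / den


def sublattice_average(u, beta):
    """Average of X over the sublattice variable at fixed u."""
    num = sum(x_op(v) * math.exp(-s_loc(u, v, beta)) for v in SPINS)
    den = sum(math.exp(-s_loc(u, v, beta)) for v in SPINS)
    return num / den


def two_level(beta):
    """Expectation value of A times the sublattice average of X."""
    num = den = 0.0
    for u in itertools.product(SPINS, repeat=2):
        x_avg = sublattice_average(u, beta)
        for v in SPINS:
            weight = math.exp(-s_loc(u, v, beta) - s_rest(u, beta))
            num += a_op(u) * x_avg * weight
            den += weight
    return num / den


if __name__ == "__main__":
    print(direct(0.7), two_level(0.7))
```

In a Monte Carlo setting the sublattice average is estimated with additional sub-updates at fixed border variables. The exact enumeration only confirms that the replacement does not bias the result.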
The same trick is also possible for more than one sublattice. Here we split the lattice into four parts, get four operators X[V], A[U], Y[W], and B[U], and want to compute ⟨XAYB⟩ by replacing X and Y with their sublattice averages (see figure 4.2). ⟨X⟩_U and ⟨Y⟩_U are now defined as
\[ \langle X \rangle_U = \frac{\int \mathcal{D}V' \, X[V'] \, e^{-S_\mathrm{loc.}[U,V']}}{\int \mathcal{D}V' \, e^{-S_\mathrm{loc.}[U,V']}}, \quad \langle Y \rangle_U = \frac{\int \mathcal{D}W' \, Y[W'] \, e^{-S_\mathrm{loc.}[U,W']}}{\int \mathcal{D}W' \, e^{-S_\mathrm{loc.}[U,W']}}. \tag{4.6} \]
Using the same trick we can now write
\[ \langle XAYB \rangle = \langle \langle X \rangle_U \, A \, \langle Y \rangle_U \, B \rangle. \tag{4.7} \]
In terms of averaging over gauge configurations we correlate now two different averages, which leads to a very high error reduction.
Note that the construction above only works if S_loc.[U, V′] does not depend on W and S_loc.[U, W′] does not depend on V. Otherwise one would lose the symmetry between V and V′ or between W and W′, respectively. Depending on the locality of the action, this might lead to a need for gaps between the sublattices. This means it might be impossible to shrink U to zero.
An improvement of this algorithm is an interlacing of the sublattices. Assuming a very local action that makes it possible to shrink the fixed area U to zero, one could for example calculate ⟨ABCD⟩ through
\[ \langle ABCD \rangle = \Big\langle \big\langle \langle A \rangle \langle B \rangle \big\rangle \, \big\langle \langle C \rangle \langle D \rangle \big\rangle \Big\rangle, \tag{4.8} \]
where A, B, C and D all lie in different sublattices. Each bracket denotes a different average. The inner ones are small sublattices that include only one of the above operators. The outer brackets denote the whole lattice update. Additionally, sublattices that include both A and B or both C and D are realized. For the update process, one starts with generating sublattices that include both A and B or both C and D. Within these sublattices, new sub-updates are generated that include only one of the parts A, B, C or D.
However, this nesting only makes sense if the action is very local and the noise stems from many terms in the actual operator. For Wilson loop correlations in quenched theory, Lüscher and Weisz showed that this results in an exponential error reduction [8].
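4.2. Local boson fields and sublattices
For local boson algorithms, the term $e^{-S[U]}$ is given by
\[ \int \prod_{k=1}^{n} \mathcal{D}\phi_k \, \mathcal{D}\phi_k^\dagger \, e^{-S[U, \phi_k, \phi_k^\dagger]}, \tag{4.9} \]
with
\[ S[U, \phi_k, \phi_k^\dagger] = \sum_k \phi_k^\dagger (D[U] - r_k)^\dagger (D[U] - r_k) \phi_k - S_G[U]. \tag{4.10} \]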
Figure 4.3: Lattice structure for one sublattice with local boson fields (labels in the sketch: V, U, ψk, φk). A Wilson loop is wrapped around the lattice. The braces mark the region of the local boson sublattices, while the link sublattices are defined through the continuous line.
Since this modified action is local, we can define ⟨X⟩_U and ⟨Y⟩_U without dependence on W and V, respectively. But we now have to take care of the boson fields. Therefore, we define another sublattice ψk, which contains the elements of the boson fields we want to average over. This sublattice does not necessarily have to share the dimension of the link sublattice V. The remaining elements of the boson field are combined into φk. An example configuration can be seen in figure 4.3. Then, the average ⟨X⟩_U also depends on φk and we get
\[ \langle X \rangle_{U,\phi_k} = \frac{\int \mathcal{D}V' \, \mathcal{D}\psi'_k \, X[V'] \, e^{-S_{L,\mathrm{loc.}}[U, V', \phi_k, \psi'_k]}}{\int \mathcal{D}V' \, \mathcal{D}\psi'_k \, e^{-S_{L,\mathrm{loc.}}[U, V', \phi_k, \psi'_k]}}, \tag{4.11} \]
where we do not explicitly write down the product over the k's and the dependencies on the daggered terms. Now we are able to use the same trick as in equation (4.5) if we define f(U, V, ψ, φ) to contain all those terms that are connected in any way to a link in V or to any boson field element in ψk. The remaining terms are combined in g(U, φ). Then the proof can be done analogously to (4.5).
4.3. Optimal sublattice shape
In the following, we want to construct the optimal shape of the different sublattices. We first look at the quenched case and then include boson fields. To also reduce the noise for short distances one wants to shift the sublattices as close to each other as possible. This requires that, in the local actions, no element of one sublattice is multiplied to any element out of the other sublattice. The gauge action involves the plaquette. If one updates the full sublattices including the border links, one needs at least a gap with the width of two links between them. On the contrary, without the border links, one can align the sublattices directly next to each other. (See figure 4.4)
Including the local boson fields, the situation becomes more complicated. In the update of the boson fields, one element φk(x) depends on next-to-nearest neighbors, for example on the element φk(x + 2µ). This means that we have to separate the boson field sublattices by at least distance 3. A good compromise can be found in figure 4.5, where the boson field sublattices are separated by distance 3 and the gauge sublattices by distance 1. We are not able to shift the link sublattices closer to each other, as the boson fields would then depend on the links in the center of the gap between the boson field sublattices.
Figure 4.4: Optimal shape of two neighboring quenched sublattices. The two sublattices of links, V (red) and W (blue), are drawn with fat lines. The fixed links are drawn with dashed lines. One link of sublattice V (Uµ(x)) and its furthest dependencies in the update process, such as U−ν(x + µ), are highlighted (green).
Figure 4.5: Optimal shape of two neighboring local boson sublattices. The two link sublattices, V (red) and W (blue), are drawn with fat lines. The local boson sublattices ψ (red) and χ (blue) are marked with circles. Two elements of the boson fields, ψk(x) (orange) and χk(y) (green), and their furthest dependencies in the update process, such as φk(x + 2µ) and U−µ(y − µ), are highlighted.
5. Heavy quark diffusion
Despite the high numerical costs, error reduction methods for pure gluonic operators are necessary for correlators that have a bad signal-to-noise ratio at larger distances. Such correlators occur, among others, in the calculation of transport coefficients in heavy quark diffusion.
We chose the color electric correlator defined in [7] to test the new error reduction methods for dynamical fermions. This correlator can be used to find the heavy quark momentum diffusion coefficient. As this correlator is a sum of different Wilson loops, the above error reduction methods are applicable.
5.1. The spectral function
The usual calculation of transport properties in the quark-gluon plasma using lattice QCD relies on the connection between current-current correlators and their spectral distribution. In the following we give a short overview of spectral functions and their connection to transport coefficients. For a detailed description see [28]. We here follow [28] and [29]. We start with the definition of the Wightman correlation functions in real time,
\[
\begin{aligned}
G^{AB}_>(t) &:= \operatorname{Tr}\{\rho A(t) B(0)\} = \langle A(t) B(0) \rangle, \\
G^{AB}_<(t) &:= \operatorname{Tr}\{\rho B(0) A(t)\} = \langle B(0) A(t) \rangle = G^{BA}_>(-t),
\end{aligned} \tag{5.1}
\]
where $\rho = \frac{1}{Z} e^{-\beta H}$ is the density matrix and A(t), B(t) are operators in the Heisenberg picture. From
\[ e^{-\beta H} A(t) e^{\beta H} = A(t + i\beta) \tag{5.2} \]
one can derive the Kubo-Martin-Schwinger (KMS) relation,
\[ G^{AB}_>(t) = G^{BA}_>(-t - i\beta). \tag{5.3} \]
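It follows that the Fourier transformed versions of the Wightman correlation functions,
\[
\begin{aligned}
G^{AB}_>(\omega) &= \int_{-\infty}^{\infty} dt \, e^{i\omega t} G^{AB}_>(t), \\
G^{AB}_<(\omega) &= \int_{-\infty}^{\infty} dt \, e^{i\omega t} G^{AB}_<(t),
\end{aligned} \tag{5.4}
\]
fulfill the relation
\[ G^{AB}_<(\omega) = G^{BA}_>(-\omega) = e^{-\beta\omega} G^{AB}_>(\omega). \tag{5.5} \]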
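To define the spectral function, we first look at the expectation value of the commutator,
\[ G^{AB}(t) = i \operatorname{Tr}\{\rho [A(t), B(0)]\} = i \left( G^{AB}_>(t) - G^{AB}_<(t) \right), \tag{5.6} \]
which has the properties
\[ G^{AB}(-t) = -G^{BA}(t), \quad G^{A^\dagger B^\dagger}(t) = G^{AB}(t^*)^*. \tag{5.7} \]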
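We also define the retarded correlator as the positive half Fourier integral over this commutator,
\[ G^{AB}_R(\omega) = \int_0^{\infty} dt \, e^{i\omega t} G^{AB}(t). \tag{5.8} \]
The spectral function can now be defined through the Fourier transform of the commutator $G^{AB}(t)$,
\[ \rho^{AB}(\omega) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} dt \, e^{i\omega t} G^{AB}(t). \tag{5.9} \]
Through the properties (5.7) this spectral function may be written as
\[ \rho^{AB}(\omega) = \frac{1}{2\pi i} \left( G^{AB}_R(\omega) - G^{B^\dagger A^\dagger}_R(\omega)^* \right) \tag{5.10} \]
and, in particular,
\[ \rho^{AA^\dagger}(\omega) = \frac{1}{\pi} \operatorname{Im} G^{AA^\dagger}_R(\omega). \tag{5.11} \]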
For simplicity we choose from now on the special case A = B = A† and, therefore, omit the indices A and B. The connection between the Wightman correlation function and the spectral function can then easily be obtained from equation (5.5):
\[ G_>(\omega) = \frac{2\pi e^{\beta\omega}}{e^{\beta\omega} - 1} \rho(\omega) \quad \text{and} \quad G_<(\omega) = \frac{2\pi}{e^{\beta\omega} - 1} \rho(\omega). \tag{5.12} \]
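So far, the spectral function has been defined in Minkowski space. We establish the connection to lattice correlators by switching to imaginary time τ and define the Euclidean correlator as
\[ G_E(\tau) = G_>(-i\tau). \tag{5.13} \]
Writing the retarded correlator in real time as
\[ G_R(t) = i\theta(t) \langle [A(t), A(0)] \rangle \tag{5.14} \]
and expressing θ(t) through
\[ \theta(t) = i \int_{-\infty}^{\infty} \frac{d\omega'}{2\pi} \frac{e^{-i\omega' t}}{\omega' + i\delta}, \tag{5.15} \]
one can easily see that the frequency space retarded correlator is directly linked to the Euclidean correlator by analytic continuation,
\[ G_R(\omega) = G_E(\omega_n \to -i\omega - i\delta), \tag{5.16} \]
where $G_E(\omega_n)$ are the Fourier coefficients of the Euclidean correlator,
\[ G_E(\omega_n) = \int_0^{\beta} d\tau \, e^{i\omega_n \tau} G_E(\tau). \tag{5.17} \]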
Thus, after computing the Euclidean correlator from the lattice, one could extract the spectral function through analytic continuation.
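The spectral function can also be directly linked to the Euclidean correlator. To relate to physics, we now look at hadronic operators,
\[ J_H(\tau, \mathbf{x}) := \bar{\psi}(\tau, \mathbf{x}) \Gamma_H \psi(\tau, \mathbf{x}), \tag{5.18} \]
where $\Gamma_H \in \{1, \gamma_5, \gamma_\mu, \gamma_\mu\gamma_5\}$ defines the particle channel. The corresponding Euclidean correlator is
\[ G_H(\tau, \mathbf{x}) = \langle J_H(\tau, \mathbf{x}) J_H(0, \mathbf{0}) \rangle. \tag{5.19} \]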
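Using the definition of the Euclidean correlator, we can write its spatial Fourier transform as
\[ G_H(\tau, \mathbf{p}) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi} \, e^{-\omega\tau} G^H_>(\omega, \mathbf{p}). \]
The spectral function can now be inserted through equation (5.12). Since the spectral function is odd in ω, one obtains
\[ G_H(\tau, \mathbf{p}) = \int_0^{\infty} d\omega \, K(\omega, \tau) \, \rho_H(\omega, \mathbf{p}), \tag{5.22} \]
with the kernel
\[ K(\omega, \tau) = \frac{\cosh(\omega(\tau - \beta/2))}{\sinh(\omega\beta/2)}. \]
Here one can see the physical meaning of the spectral function. As K(ω, τ) is the free boson propagator, the spectral function describes the spectral distribution of the current-current correlator in terms of energy.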
Subsequently, inverting equation (5.22) offers another method to extract the spectral function. However, both methods are very imprecise in practise, as the Euclidean corre- lation function can only be calculated in a discretized form. This makes it impossible to calculate the spectral function without making assumptions either about the correlator or about the spectral function itself.
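The forward direction of equation (5.22) is easy to evaluate numerically, which gives a cross-check for any assumed spectral function. The following Python sketch assumes the standard explicit form of the kernel, K(ω, τ) = cosh(ω(τ − β/2)) / sinh(ωβ/2), which is not written out above. The function names, the cutoff omega_max and the midpoint rule are illustrative choices and not part of the original analysis.

```python
import math


def kernel(omega, tau, beta):
    """Assumed kernel K = cosh(omega*(tau - beta/2)) / sinh(omega*beta/2).

    Rewritten with decaying exponentials so that large omega does not overflow.
    """
    numerator = math.exp(-omega * tau) + math.exp(-omega * (beta - tau))
    return numerator / (1.0 - math.exp(-omega * beta))


def euclidean_correlator(rho, tau, beta, omega_max=200.0, n=20000):
    """Midpoint-rule estimate of int_0^omega_max d(omega) K(omega, tau) rho(omega).

    rho is a callable model spectral function (hypothetical input);
    omega_max must be large enough for the integrand to have died out.
    """
    h = omega_max / n
    total = 0.0
    for k in range(n):
        omega = (k + 0.5) * h  # midpoints avoid the 1/omega pole of K at omega = 0
        total += kernel(omega, tau, beta) * rho(omega)
    return h * total
```

As a check, for ρ(ω) = ω³ and β = 1 the call euclidean_correlator(lambda w: w**3, 0.5, 1.0) should come out close to 2π⁴ ≈ 194.8, the exact value of the integral of ω³ / sinh(ω/2) from 0 to ∞. The inverse direction, recovering ρ from a handful of values of the correlator, is the ill-posed part and is not attempted here.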
5.2. Transport properties through spectral functions
Now we want to see how transport properties are encoded in the spectral function. We concentrate on the diffusion of heavy quarks in the quark gluon plasma. For the derivation of other transport properties see e.g. [28].
Transport coefficients are mainly relevant for systems out of equilibrium. Now, the idea is to only introduce a small, slow perturbation to a system. The response of the system is expected to be linear, from which one can calculate the transport coefficients. This Ansatz is called linear response theory. We start with the perturbed Hamiltonian,
Hf (t) = H − f(t)B(t), (5.23)
where H is the Hamiltonian of the unperturbed system. Using the standard evolution equation of an operator, one finds that the difference between the perturbed and the unperturbed expectation value of an operator A is given by [30]
\delta\langle A(t)\rangle := \langle A(t)\rangle_f - \langle A(0)\rangle = \int_{-\infty}^{t} dt' \, G_R^{AB}(t - t') f(t'). \qquad (5.24)
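We now assume the source term to be
f(t) = e^{\varepsilon t} \theta(-t) f_0 \qquad (5.25)
and define the static susceptibility χ_s^{AB} through
\delta\langle A(t=0)\rangle_f =: \chi_s^{AB} f_0. \qquad (5.26)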
One can then show that the retarded correlator is given by
G_R^{AB}(\omega) f_0 = \delta\langle A(0)\rangle_f + i\omega \int_0^\infty dt \, e^{i\omega t} \, \delta\langle A(t)\rangle_f. \qquad (5.27)
With these definitions, we can turn to heavy quark diffusion in the quark gluon plasma. Since the heavy quark mass satisfies M ≫ T, its thermal momentum p ∼ √(MT) is much larger than the typical momentum transfer of order T, so it takes many collisions with the thermal medium to change the momentum substantially. Therefore, it is possible to use the Langevin formalism to describe the thermalization of the heavy quarks [5]. The equations of motion for the latter are defined as
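\frac{dx_i}{dt} = \frac{p_i}{M}, \qquad \frac{dp_i}{dt} = \xi_i(t) - \eta \, p_i(t), \qquad \langle \xi_i(t) \xi_j(t') \rangle = \kappa \, \delta_{ij} \, \delta(t - t'), \qquad (5.28)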
where η is a momentum drag coefficient and ξ_i(t) is a source of temporally uncorrelated kicks. κ is the momentum diffusion coefficient, i.e. the mean squared momentum transfer per unit time.
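The Langevin dynamics above can be illustrated with a short simulation. The following Python sketch integrates one momentum component with a simple Euler-Maruyama step, which is a choice made here for illustration and not a method taken from this work. It assumes the drag is fixed by η = κ/(2MT), in which case equipartition requires each component to equilibrate to ⟨p²⟩ = MT. All parameter values and names are hypothetical.

```python
import math
import random


def langevin_momentum_variance(kappa, mass, temperature,
                               n_particles=2000, t_final=50.0, dt=0.1, seed=1):
    """Estimate <p^2> of one momentum component after t_final.

    Each particle starts at p = 0 and follows dp = -eta * p * dt + sqrt(kappa * dt) * N(0, 1),
    with the drag eta = kappa / (2 * M * T) assumed from the fluctuation-dissipation relation.
    """
    eta = kappa / (2.0 * mass * temperature)
    rng = random.Random(seed)
    kick = math.sqrt(kappa * dt)  # standard deviation of the random kick per step
    n_steps = int(round(t_final / dt))
    total = 0.0
    for _ in range(n_particles):
        p = 0.0
        for _ in range(n_steps):
            p += -eta * p * dt + kick * rng.gauss(0.0, 1.0)
        total += p * p
    return total / n_particles
```

With the illustrative values kappa = 2, mass = 10 and temperature = 1, the relaxation time 1/η is 10, so t_final = 50 is long enough for the ensemble to thermalize, and the returned value should lie close to MT = 10 up to statistical noise and a small discretization bias of order η·dt.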
For a given ξ(t), the solution is easily found and provides Einstein’s fluctuation-dissipation relation
\eta = \frac{\kappa}{2MT}. \qquad (5.29)
From a comparison with hydrodynamical linear response theory, one can also derive a relation to the diffusion coefficient D [28],
D = \frac{2T^2}{\kappa}. \qquad (5.30)
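Let P(t,x) be the probability for a heavy quark to start at the origin and to diffuse to x within the time t. If N(0,x) is the initial distribution of the quarks, the distribution after time t will be
N(t,\mathbf{x}) = \int d^3x' \, P(t,\mathbf{x} - \mathbf{x}') \, N(0,\mathbf{x}'), \qquad (5.31)
or in momentum space,
N(t,\mathbf{k}) = P(t,\mathbf{k}) \, N(0,\mathbf{k}). \qquad (5.32)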
Linear response theory can then be used to connect to the retarded current-current correlator (Γ_H = γ_μ) by equation (5.27), and we get
G_R^{00}(\omega,\mathbf{k}) = \chi_s(\mathbf{k}) \left( 1 + i\omega \int_0^\infty dt \, e^{i\omega t} P(t,\mathbf{k}) \right), \qquad (5.33)
where the indices of G_R^{00} refer to the density component of the current operator J^μ.
From a small perturbation of the chemical potential, μ(x) = μ_0 + δμ(x), one can derive the static susceptibility as [31]
χs = 4Nc
) . (5.34)
The remaining quantity needed to calculate the transport coefficients is the initial distribution of the heavy quarks, N(t,x). It can be shown that a Gaussian distribution is a good approximation [31]. The spectral function then takes the form of a Lorentzian,
ρ00(ω,0) ω
ω2 + η2 , (5.35)
where ω_UV is a threshold above which other physical processes become relevant. This equation is called the Kubo formula and is often expressed in terms of the spatial components of the spectral function, ρ^{ii}(ω,k). The latter is related to the time component by [31]
\rho^{ii}(\omega,\mathbf{k}) = \frac{\omega^2}{\mathbf{k}^2} \, \rho^{00}(\omega,\mathbf{k}). \qquad (5.36)
It follows that the diffusion coefficient D can be calculated through
D = \frac{1}{3\chi_s} \lim_{\omega\to 0} \sum_{i=1}^{3} \frac{\rho^{ii}(\omega,\mathbf{0})}{\omega}. \qquad (5.37)
Figure 5.1: Sketch of the heavy quark current-current spectral function for different temperatures, including T = ∞
As stated above, the extraction of the spectral function from lattice results is very imprecise and requires assumptions about the spectral function. A common method is to parametrize the spectral function based on perturbative or phenomenological predictions. However, finding a parametrization for the vector current-current correlation function is very difficult.
For the free theory (T → ∞), a perturbative calculation can be done [32]. A sketch of the spectral function for different temperatures is shown in figure 5.1. For T → ∞ the Lorentzian peak at low frequencies, often called the transport peak, becomes a delta function. Above a threshold of 2m_q, pair production is possible, which results in a quadratically rising spectral function.
For lower temperatures, the threshold shifts to higher frequencies and peaks of bound states, like charmonium or bottomonium, appear. The shape of those peaks and the position of the threshold can only be approximated qualitatively.
Moreover, from the sketch in figure 5.1, it becomes clear that the complex structure of the spectral function requires a large number of values of the corresponding correlator for an inversion of equation (5.22). This requires large lattices, which are numerically expensive.
5.3. Heavy quark momentum diffusion from heavy quark effective theory
We have seen that the complexity of the spectral function makes a direct measurement of the diffusion coefficient very involved. In 2009 Caron-Huot et al. developed a method to calculate the heavy quark momentum diffusion coefficient, κ, directly [7]. The idea is based on heavy quark effective theory and results in a color-electric correlator, whose
spectral function is much smoother.
The spectral function ρ^{μν}(ω) is based on the correlator of the operator J^μ at zero momentum. One can relate the spatial components J^i to the heavy quark’s velocity by v^i = ∫ dx J^i. Thus, the classical force acting on a quark is given by M ∫ dx dJ^i/dt. Using this observation, one can define the momentum diffusion coefficient as
κ(M) := 2πM2 kin
, (5.38)
where M_kin is the heavy quark’s kinetic mass. From the above classical relations, it follows that this defines a correlation of the force acting on the heavy quark with itself.
Taking into account that this correlator should become mass independent in the heavy quark limit M_kin → ∞, one reaches
κ = β
}⟩] . (5.39)
Using a Foldy-Wouthuysen transformation [33], it can be shown that the leading force is the chromo-electric force induced by the color-electric field E_i. Therefore, the derivatives of the currents can be replaced by
dJ i
) , (5.40)
where θ and φ are two-component spinors of heavy quark effective theory (HQET) and M is the pole mass
M = m(µ) [
) +O(g4)
] , (5.41)
with m(μ) the \overline{MS} mass and C_F := (N_c^2 − 1)/(2N_c). Equation (5.39) transforms to
κ = β
) (0,0)
}⟩ ,
(5.42)
where the limit ω → 0 has already been taken. This two-point function satisfies the KMS condition (5.3) and can therefore be related to a Euclidean correlator. Finally, the Euclidean color-electric correlator reads
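G_E(\tau) = -\frac{1}{3} \sum_{i=1}^{3} \frac{\langle \mathrm{Re\,Tr} \, [U(\beta,\tau) \, gE_i(\tau,\mathbf{0}) \, U(\tau,0) \, gE_i(0,\mathbf{0})] \rangle}{\langle \mathrm{Re\,Tr} \, [U(\beta,0)] \rangle}, \qquad (5.43)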
where U(τ_1, τ_2) is a color parallel transporter for static quarks. Having extracted the corresponding spectral function through (5.22) or (5.11), the momentum diffusion coefficient follows from
\kappa = \lim_{\omega\to 0} \frac{2T\rho(\omega)}{\omega}. \qquad (5.44)
Figure 5.2: The spectral function corresponding to the color-electric correlator (5.43), calculated perturbatively for N_f = 0 at T = 3T_c (left) and T = 12T_c (right) and plotted against ω/T. [34]
In contrast to the spectral function of the current-current correlator, this spectral function appears to be rather smooth and is not contaminated by any bound state contributions. The vacuum contribution to the spectral function has been calculated perturbatively at next-to-leading order (NLO) [34] and up to next-to-next-to-leading order (NNLO) [35] without revealing any discontinuities or peaks, as can be seen in figure 5.2.
At leading order, the spectral function scales as ω³ according to
\rho(\omega) = \frac{g^2 C_F}{6\pi} \, \omega^3 + O(g^4). \qquad (5.45)
Therefore, the heavy quark momentum diffusion coefficient vanishes at leading order in perturbation theory, and only higher order calculations contribute to its value. The corresponding leading order color-electric correlator is given by
G_{norm}(\tau T) := \frac{G^{LO}_{cont}(\tau T)}{g^2 C_F} = \pi^2 T^4 \left[ \frac{\cos^2(\pi\tau T)}{\sin^4(\pi\tau T)} + \frac{1}{3\sin^2(\pi\tau T)} \right] \qquad (5.46)
and may be used to normalize lattice results for a better visualization.
For the lattice version of the color-electric correlator, one has to discretize the electric fields. Inspired by lattice heavy quark effective theory, this can be done by writing [36, 7]
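J^j = \frac{i}{2aM} \left( \theta^\dagger(n+\hat{j}) \, U_j^\dagger(n) \, \theta(n) - \theta^\dagger(n) \, U_j(n) \, \theta(n+\hat{j}) - (\theta \to \phi) \right). \qquad (5.47)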
Now Uj(n) refers to the lattice links. Inserting this in (5.39) results in a combination of Wilson lines:
GE(τ) = ∑ i∈{±1,...,±3}Re Tr
⟨ ( − ) i
( − ) i
. (5.48)
Reading from left to right, the numerator starts with a straight Wilson line. The following expression in parentheses represents the color-electric field. The corresponding lines are single links sitting on a square. The next Wilson line is shifted in the direction of i and is followed by another electric field. The denominator is the expectation value of the Polyakov loop.
5.4. Correction of lattice effects
The correlator (5.48) exhibits discretization effects. Due to the finite lattice spacing a, the correlator may take values that differ from those in the continuum.
The situation can be brought under control by 'tree level improvement' [37]. As the correlator has been computed in perturbation theory both for the continuum and for the lattice discretization [7, 37], one can estimate the strength of the lattice effect. The correction is then done by shifting the correlator to a different time \overline{\tau T}, defining the pairs (τT, \overline{\tau T}) according to
G^{LO}_{cont}(\overline{\tau T}) = G^{LO}_{lat}(\tau T), \qquad (5.49)
where τT can naturally only take discrete values. The shifted correlator is now given as
G_{imp}(\overline{\tau T}) = G_{lat}(\tau T). \qquad (5.50)
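The shifted times can be found numerically once the leading-order lattice correlator is known. The Python sketch below assumes the standard leading-order continuum expression G_cont^LO/(g² C_F) = π² T⁴ [cos²(πτT)/sin⁴(πτT) + 1/(3 sin²(πτT))] in units of T⁴ and solves for the shifted time by bisection. The leading-order lattice value g_lat_lo is a hypothetical input in the same normalization; it has to be taken from lattice perturbation theory and is not computed here.

```python
import math


def g_norm(tau_t):
    """Assumed leading-order continuum correlator G_cont^LO / (g^2 C_F), in units of T^4."""
    s = math.sin(math.pi * tau_t)
    c = math.cos(math.pi * tau_t)
    return math.pi ** 2 * (c * c / s ** 4 + 1.0 / (3.0 * s * s))


def improved_time(g_lat_lo, lo=1e-6, hi=0.5, iterations=200):
    """Solve g_norm(tau_bar) = g_lat_lo for tau_bar in (0, 1/2] by bisection.

    g_norm decreases monotonically on this interval, so the solution is unique
    whenever g_lat_lo lies between g_norm(hi) and g_norm(lo).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g_norm(mid) > g_lat_lo:
            lo = mid  # correlator still too large, so the solution lies at larger tau
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

If, for example, the lattice value at τT = 0.25 were 5 % above the continuum one, improved_time(1.05 * g_norm(0.25)) would return a shifted time slightly below 0.25. Only the ratio of lattice to continuum values enters, so the overall normalization drops out as long as both sides use the same one.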
A table with different pairs (τT, \overline{\tau T}) can be found in the appendix.
Another correction has to be done because of discretization effects at loop level. The correction factor Z can be computed perturbatively from the heavy quark self-energy. It is chosen in such a way that it cancels effects stemming from the 1-loop lattice regularization, and one gets for leading order [37]
ZLO(g2) ≈ 1− 0.59777 β
ZNLO(g2) ≈ 1 + 0.474 β
6. Methods
6.1. Error handling and autocorrelation time
When measuring quantities on the lattice, one uses configurations that are generated by a Markov chain. Although one usually skips some configurations within the update process, successive configurations are usually still correlated. This makes it impossible to use the standard routines for the error calculation, such as the standard deviation. Instead one can use the Jackknife method, which takes the correlation into account.
For that purpose, one divides the set of N measurements of a quantity O into M blocks of size n = N/M. If N is not divisible by M, one omits measurements until n becomes an integer. We now create M subsets of the N measurements by leaving out one block at a time, which we label with the index i. The average over such a subset is given by
\bar{O}_i = \frac{1}{N - n} \sum_{j=1, \, j \neq i}^{M} \sum_{k=1}^{n} O_{(j-1)n+k}, \qquad (6.1)
where O_j is one measurement in the Markov chain. Using these block averages, one defines the so-called pseudo values as
\tilde{O}_i = M\bar{O} - (M - 1)\bar{O}_i, \qquad (6.2)
where \bar{O} is the normal average over the whole set of measurements. From these pseudo values one can calculate the average and the standard error as usual according to
\langle O \rangle = \frac{1}{M} \sum_{i=1}^{M} \tilde{O}_i \pm \delta O \qquad (6.3)
with
\delta O = \sqrt{ \frac{1}{M(M-1)} \sum_{i=1}^{M} \left( \tilde{O}_i - \langle O \rangle \right)^2 }. \qquad (6.4)
The number of blocks, M, is chosen as follows. One runs several error calculations with the above technique for an increasing block size. The error increases with the block size until a plateau is reached; there, the blocks are effectively uncorrelated and the error estimate is reliable. Usually this is the case for M ∼ 10−20.
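The Jackknife procedure described above can be sketched in a few lines of Python. The function name and the plain-list interface are illustrative assumptions; the sketch treats the simplest case in which the quantity of interest is the mean of the measurements, and it drops surplus measurements at the end of the chain.

```python
import math


def jackknife(measurements, n_blocks):
    """Jackknife estimate and error of the mean, using n_blocks blocks."""
    n = len(measurements) // n_blocks         # block size n = N / M
    data = measurements[: n * n_blocks]       # omit surplus measurements
    total = sum(data)
    mean = total / len(data)
    block_sums = [sum(data[j * n:(j + 1) * n]) for j in range(n_blocks)]
    # averages with one block left out, cf. (6.1)
    reduced = [(total - s) / (len(data) - n) for s in block_sums]
    # pseudo values, cf. (6.2)
    pseudo = [n_blocks * mean - (n_blocks - 1) * r for r in reduced]
    estimate = sum(pseudo) / n_blocks
    # standard error of the pseudo values, cf. (6.4)
    variance = sum((p - estimate) ** 2 for p in pseudo) / (n_blocks * (n_blocks - 1))
    return estimate, math.sqrt(variance)
```

To look for the plateau, one would call jackknife on the same data for a decreasing number of blocks (that is, an increasing block size) and compare the returned errors. For a derived quantity, such as a ratio of averages, the mean inside the function would be replaced by that quantity evaluated on each reduced data set.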
However, the necessary number of blocks depends on the correlation between the individual measurements. This correlation can be examined through the so-called autocorrelation time. It can be obtained by first defining the autocorrelation function
C_O(t) = \langle O_i O_{i+t} \rangle - \langle O_i \rangle \langle O_{i+t} \rangle \qquad (6.5)
for different values of t. In a typical Markov chain, the autocorrelation function decays exponentially,
\frac{C_O(t)}{C_O(0)} \sim e^{-t/\tau}, \qquad (6.6)
where τ is the so-called exponential autocorrelation time. Often, the decay is composed of several exponential terms,
\frac{C_O(t)}{C_O(0)} \sim \sum_i A_i \, e^{-t/\tau_i}. \qquad (6.7)
The exponential autocorrelation time is then given by the largest of those τ_i.
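A minimal Python sketch of how the autocorrelation function and a rough estimate of τ could be obtained from a single chain is given below. The single-exponential estimate from one separation t is a simplification chosen here; in practice one would fit the decay over a range of t. The helper ar1_chain only generates hypothetical test data with a known autocorrelation time and is not related to lattice configurations.

```python
import math
import random


def autocorrelation(series, t):
    """Estimate C(t) = <O_i O_{i+t}> - <O_i><O_{i+t}> from one chain."""
    m = len(series) - t
    head, tail = series[:m], series[t:t + m]
    mean_head = sum(head) / m
    mean_tail = sum(tail) / m
    return sum(a * b for a, b in zip(head, tail)) / m - mean_head * mean_tail


def exponential_time(series, t):
    """Estimate tau from C(t)/C(0) ~ exp(-t/tau), assuming a single exponential.

    Raises ValueError if the estimated ratio is not positive (too noisy at this t).
    """
    ratio = autocorrelation(series, t) / autocorrelation(series, 0)
    return -t / math.log(ratio)


def ar1_chain(a, n, seed=0):
    """Hypothetical test data: x_{i+1} = a * x_i + noise, for which tau = -1 / ln(a)."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain
```

For the synthetic chain ar1_chain(0.9, 200000) the exact value is τ = −1/ln(0.9) ≈ 9.5, and exponential_time evaluated at t = 5 should reproduce it within statistical errors.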
6.2. Root calculation
The calculation of the roots of the polynomial (3.42) has to be done numerically. For a small number of roots (∼ 200), this can be done using Mathematica with the FindRoot command. Note that the default double precision is not sufficient to calculate the roots, as the polynomial takes very large values outside the convergence area (see figure 3.3).
For a higher number of roots, one has to use a dedicated root finding algorithm. In this work, we chose a Levenberg–Marquardt algorithm to find minima of |P(z)|². This algorithm works as follows. First, a starting guess z_0 for the root is chosen. Moreover, one defines a parameter λ and chooses a start value, for example λ = 10^{−6}. If we write z as a vector
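z = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad (6.8)
we can interpret the map
z \to f(z) := |P(z)|^2 \qquad (6.9)
as a function R² → R. A new guess for the minimum can now be defined as
z_{i+1} = z_i - \left( G(z_i) + \lambda \, \mathrm{diag} \, G(z_i) \right)^{-1} \nabla f(z_i) \qquad (6.10)
with the matrix
G_{i,j}(z) = \frac{\partial^2 f(z)}{\partial z_i \partial z_j}. \qquad (6.11)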
With this new guess one performs a selection step according to
f(z_{i+1}) − f(z_i) < 0: use z_{i+1} and set λ := λ/10,
f(z_{i+1}) − f(z_i) ≥ 0: keep z_i and set λ := 10λ. (6.12)
Lowering λ after a successful step moves the update towards a pure Newton step, while raising it after a rejected step shortens the next step. One can show that in every iteration step the new value z_{i+1} is closer to the actual minimum. One now only needs to make good starting guesses for the minima. This can be done by following the elliptic distribution of the roots.
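The iteration can be sketched in Python for a polynomial given by its coefficients. This is an illustration in ordinary double precision on a small test polynomial, not the implementation used in this work, and it does not address the extended precision needed for the polynomial (3.42). The damping update follows the usual Levenberg–Marquardt convention (decrease λ after an accepted step, increase it after a rejected one). The gradient and the matrix G are written out analytically from P, P' and P'', using that P is holomorphic; all function names are hypothetical.

```python
def poly_and_derivatives(coeffs, z):
    """Horner scheme returning P(z), P'(z), P''(z); coeffs are ordered highest power first."""
    p = dp = ddp = 0j
    for c in coeffs:
        ddp = ddp * z + 2 * dp
        dp = dp * z + p
        p = p * z + c
    return p, dp, ddp


def levenberg_marquardt_root(coeffs, z0, lam=1e-6, tol=1e-20, max_iter=200):
    """Minimize f(x, y) = |P(x + i y)|^2 with damped Newton steps."""
    z = complex(z0)
    p, dp, ddp = poly_and_derivatives(coeffs, z)
    f = abs(p) ** 2
    for _ in range(max_iter):
        if f < tol:
            break
        w1 = p.conjugate() * dp
        w2 = p.conjugate() * ddp
        gx, gy = 2 * w1.real, -2 * w1.imag          # gradient of f
        gxx = 2 * (abs(dp) ** 2 + w2.real)          # second derivatives of f
        gyy = 2 * (abs(dp) ** 2 - w2.real)
        gxy = -2 * w2.imag
        a, d = gxx * (1 + lam), gyy * (1 + lam)     # G + lam * diag(G)
        det = a * d - gxy * gxy
        step = complex((d * gx - gxy * gy) / det, (a * gy - gxy * gx) / det)
        z_new = z - step
        p_new, dp_new, ddp_new = poly_and_derivatives(coeffs, z_new)
        f_new = abs(p_new) ** 2
        if f_new < f:                               # accept the step, reduce damping
            z, p, dp, ddp, f = z_new, p_new, dp_new, ddp_new, f_new
            lam /= 10
        else:                                       # reject the step, increase damping
            lam *= 10
    return z
```

For the test polynomial P(z) = z³ − 1, the call levenberg_marquardt_root([1, 0, 0, -1], 1.2 + 0.3j) converges to the root at z = 1. Starting guesses far from any root may end in a different root or stall, which is why good guesses along the known distribution of the roots matter.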
7. Technical setup
7. Technic