
Annals of Physics 383 (2017) 285–311


Asymptotic behaviour of time averages for non-ergodic Gaussian processes

Jakub Ślęzak

Wroclaw University of Science and Technology, Wybrzeże Wyspiańskiego 27, 50-370 Wrocław, Poland

Highlights

• Ergodic criteria for Gaussian models which use Fourier transform are provided.
• Smooth, field Hamiltonian models are ergodic, discrete models are non-ergodic.
• Fractal Fourier structure can induce recurring correlations and non-mixing.
• Non-ergodic models exhibit non-linear dynamics and complex constants of motion.
• This non-linearity can be studied using time-averaged characteristic function.

Article info

Article history:
Received 10 February 2017
Accepted 16 May 2017
Available online 31 May 2017

Keywords:
Ergodicity breaking
Gaussian process
Statistical analysis
Generalised Langevin equation

Abstract

In this work, we study the behaviour of time-averages for stationary (non-ageing), but ergodicity-breaking Gaussian processes using their representation in Fourier space. We provide explicit formulae for various time-averaged quantities, such as mean square displacement and density, and analyse the behaviour of the time-averaged characteristic function, which gives insight into the rich memory structure of the studied processes. Moreover, we show applications of the ergodic criteria in Fourier space, determining the ergodicity of the generalised Langevin equation's solutions.

© 2017 Elsevier Inc. All rights reserved.

1. Introduction

1.1. The goal

The relation between time averages and ensemble averages is one of the most important topics of statistical physics, and this area of research is under intense development. The abstract mathematical ergodic theory has become a very wide subject, but in recent years a new trend has emerged which concentrates on very practical questions.

E-mail address: [email protected].

http://dx.doi.org/10.1016/j.aop.2017.05.015
0003-4916/© 2017 Elsevier Inc. All rights reserved.

One example is the behaviour of the time-averaged mean square displacement

\[ \overline{\delta^2}(\Delta) := \frac{1}{T-\Delta} \int_0^{T-\Delta} d\tau\, \bigl(X(\tau+\Delta) - X(\tau)\bigr)^2. \tag{1} \]

This quantity can be estimated using only one (sufficiently long) trajectory, in contrast to the ensemble-averaged mean square displacement

\[ \delta^2(\Delta) := \int dP\, \bigl(X(t+\Delta) - X(t)\bigr)^2, \tag{2} \]

which requires many trajectories to estimate. The comparison of these two types of mean square displacement is central to the study of weak ergodicity breaking and can be used to distinguish between different models of classical and anomalous dynamics [1,2].
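The two definitions translate directly into estimators for sampled data. The following is a minimal Python sketch, assuming a trajectory stored as a list of values on a uniform time grid and a lag given as a whole number of sampling steps; the function names `time_averaged_msd` and `ensemble_averaged_msd` are illustrative and do not come from the text.

```python
def time_averaged_msd(x, lag):
    """Discrete analogue of Eq. (1) for one trajectory on a uniform grid.

    x   : list of samples X(0), X(dt), ..., X((n-1)*dt)
    lag : the lag Delta expressed as a whole number of sampling steps
    """
    n = len(x)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(x)")
    # Average the squared increments over all admissible starting points.
    total = sum((x[i + lag] - x[i]) ** 2 for i in range(n - lag))
    return total / (n - lag)


def ensemble_averaged_msd(trajectories, t_index, lag):
    """Discrete analogue of Eq. (2): average over many trajectories at a fixed time."""
    total = sum((x[t_index + lag] - x[t_index]) ** 2 for x in trajectories)
    return total / len(trajectories)


# Deterministic check: for X(t) = t every increment over `lag` steps equals `lag`.
ramp = [float(i) for i in range(100)]
print(time_averaged_msd(ramp, 5))  # 25.0
```

The first estimator needs a single long path, the second needs many paths but only two time points from each, which is the practical difference stressed above.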

In this work we discuss the middle-ground between the abstract ergodic theory and the notions used in applications, concentrating on three main areas:

I. We characterise the behaviour of time-averages for non-ergodic processes, including explicit formulae for useful quantities such as the time-averaged mean square displacement (1) and time-averaged density (Section 3.2).

II. We analyse the behaviour of the time-averaged characteristic function, which is a less known statistic providing deeper insight into the dependence structure of the studied processes, unobtainable using only covariance-based methods (Section 3.2).

III. We determine the ergodicity and mixing of various physical models, the most important being the generalised and classical Langevin equations (Sections 3.1 and 4.2).

Together, these results form a basis of methodology for statistical analysis of ergodicity-breaking processes, which appear in well-used physical models.

Such an intent is impossible to realise in full generality, so we concentrate on a class crucial from the point of view of modelling: the Gaussian processes. (However, we do remark briefly on how to treat the non-Gaussian case in Section 5.) Under this assumption, one can use the elegant and practical description of the ergodic behaviour in Fourier space. Calculating the Fourier transform of a function

\[ \hat f(\omega) := \int_{\mathbb{R}} dt\, e^{i\omega t} f(t) \tag{3} \]

in order to study its properties is a very common technique. However, its relation to the ageing phenomenon (Section 2) and ergodicity (Section 3) may be surprising, and is severely underrated in the physical literature.

1.2. Basics of ergodic theory

We consider a continuous-time stochastic process $X = (X_t)_{t\in\mathbb{R}}$, which may be real or even complex valued. In this section the results are general and apply also to non-Gaussian processes. The variable $X$ can be position, velocity, intensity of light, etc. We are interested in the behaviour of various averages of $f(X)$, where $f$ may in general be a function of the whole trajectory $X$. We assume that $\mathbb{E}|f(X)| < \infty$ and call a function $f$ with this property an observable. The expected value is an average under the probabilistic measure associated with $X$, $\mathbb{E}[f(X)] = \int dP\, f(X)$; in the physical literature the notation $\langle f(X)\rangle$ is also in use. Examples include the observable of the mean $f(X) = X(t)$, the mean square displacement $f(X) = \bigl(X(t+\Delta) - X(t)\bigr)^2$, the covariance $f(X) = X(t+\Delta)X(t)$, and others. Note that in the above examples the observables seemingly depend on the time $t$; however, our later assumptions will make this choice irrelevant, and it will be possible to take $t = 0$ without any loss of generality.

If the process has a time-varying mean $m(t) := \mathbb{E}[X(t)] \neq \mathrm{const.}$, we can always decompose it into a random, zero-mean part and a deterministic non-zero part. Because we will study systems in which the complex behaviour is contained in the random part, we assume $m(t) = 0$; the case $m(t) \neq 0$ would be a straightforward generalisation.


For every process there exists an associated family of time-shift operators $S_\tau$ which describe the temporal evolution of the system, i.e. $S_\tau X = (X(t+\tau))_{t\in\mathbb{R}}$.

With that in mind we can formulate the ergodic theorem. We will use a general form of this result: the probabilistic variant of the Birkhoff ergodic theorem, which gives deeper insight into the behaviour of both ergodic and non-ergodic processes.

Theorem 1. If the shift operators $S_\tau$ are measure-preserving, then the time average exists almost surely and is an $S$-invariant random variable $\mathbb{E}[f(X)|\mathcal{C}]$ [3,4],

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, f(S_\tau X) = \mathbb{E}[f(X)|\mathcal{C}]. \tag{4} \]

By the variable $\mathbb{E}[\,\cdot\,|\mathcal{C}]$ we understand the conditional expected value under the condition $\mathcal{C}$. The condition $\mathcal{C}$ is formally the σ-algebra of time-invariant sets, i.e. sets invariant under the transformations $S_\tau$; essentially, we calculate the expected value assuming that all time-invariant properties of $X$ are fixed. The physical interpretation of $\mathcal{C}$ is that it is the set of constants of motion associated with $X$.

By a measure-preserving transformation we mean that $P(f(X) \in A) = P(f(S_\tau X) \in A)$ for all observables $f$ and measurable events $A$. This essentially means that $X$ and the time-shifted $S_\tau X$ are statistically indistinguishable, which is called the stationarity condition in the probabilistic literature [3], and non-ageing in many physical papers [2].

Therefore the Birkhoff ergodic theorem essentially states that if the process $X$ is stationary, the time average of any observable converges to a random variable $\mathbb{E}[f(X)|\mathcal{C}]$, which we can precisely determine if we can identify $\mathcal{C}$, i.e. all time-invariant properties of $X$.

It is now clear that for the classical ergodic theorem to hold, the process should be stationary and $\mathcal{C}$ must be sufficiently weak, so that $\mathbb{E}[f(X)|\mathcal{C}] = \mathbb{E}[f(X)]$; there should be no significant time-invariant properties of $X$.

One immediate and important consequence of the Birkhoff theorem is that for a stationary process the ensemble average of the time average of any observable is equal to the ensemble average without any time averaging,

\[ \mathbb{E}\Bigl[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, f(S_\tau X) \Bigr] = \mathbb{E}[f(X)]. \tag{5} \]

It is a direct consequence of the so-called tower property $\mathbb{E}\bigl[\mathbb{E}[\,\cdot\,|\mathcal{C}]\bigr] = \mathbb{E}[\,\cdot\,]$. In particular, Eq. (5) shows that for stationary processes there is no possibility of phenomena such as weak ergodicity breaking [2], because the time-averaged mean square displacement (1) must satisfy

\[ \mathbb{E}\bigl[\overline{\delta^2}(\Delta)\bigr] = \mathbb{E}\bigl[\bigl(X(t+\Delta) - X(t)\bigr)^2\bigr]. \tag{6} \]

Another concept closely related to ergodicity is mixing. We say that the system is mixing if for any observables $f$, $g$ such that $\mathbb{E}|f(X)g(X)| < \infty$,

\[ \lim_{T\to\infty} \mathbb{E}[f(X)\, g(S_T X)] = \mathbb{E}[f(X)]\, \mathbb{E}[g(X)]. \tag{7} \]

This is basically a statement that the process $X$ and the shifted process $S_T X$ are asymptotically independent; the dynamics of $X$ leaves no persisting memory. Mixing implies ergodicity and is often easier to study than ergodicity itself; however, there are examples of non-mixing ergodic processes, even in the Gaussian case, which we will show in Section 3.

1.3. Gaussian processes

The main part of our considerations is true only for the class of Gaussian processes. A Gaussian process is a process for which any finite sum of the type $\sum_k a_k X(t_k)$ has a Gaussian distribution. It is not enough that for any $t$ the values $X(t)$ have a Gaussian distribution, as it is very easy to construct counterexamples using copula theory. The necessary and sufficient condition for the process to be Gaussian is that all $X(t)$ are Gaussian (we also admit the degenerate case $X(t) = \mathrm{const.}$) and that they are only linearly dependent [5]. The presence of non-linear dynamics excludes Gaussianity. Nevertheless, a very large class of widely applied models is still Gaussian, which will become apparent in the next sections.

The limitation on the possible memory type has large consequences: the Gaussian variables are fully described by their linear dependence structure, which is reflected in the second moments. Any Gaussian process is uniquely determined by the mean function $m(t) = \mathbb{E}[X(t)]$ (which we later assume to be 0 without loss of generality) and the covariance function $r_X(s,t) := \mathbb{E}[X(s)X(t)]$. In terms of these functions, a Gaussian process is stationary if and only if $m(t) = \mathrm{const.}$ and $r_X(s,t) = r_X(t-s)$.

Another consequence of the purely linear structure of Gaussian processes is that the general mixing condition (7) reduces to the much simpler requirement that [4]

\[ \lim_{T\to\infty} r_X(T) = 0, \tag{8} \]

which is often straightforward to check. Ergodicity itself can also be expressed in the language of the covariance function. Instead of ergodicity, the equivalent notion of metric transitivity is often used in this context [4,6]. The main part of this theory was completed by Maruyama in 1970 [7].

Theorem 2 (Maruyama). A Gaussian stationary process $X$ is ergodic if and only if

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, |r_X(\tau)| = 0. \tag{9} \]

Note that the presence of the modulus $|r_X(\tau)|$ is crucially important, because it excludes periodic oscillations of $r_X$. Generally this condition may also be easy to check, as it is enough to know the asymptotic tail behaviour of the covariance function. At the same time, it does not give much insight into the memory structure of non-ergodic Gaussian processes, which we will study later.
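The role of the modulus in condition (9) can be seen in a small numerical experiment. The sketch below is only an illustration under stated assumptions: the two covariance functions, $\cos(2t)$ and $e^{-|t|}$ (the latter being the covariance of an Ornstein-Uhlenbeck process with unit parameters), and the helper name `cesaro_abs_mean` are choices made here, not taken from the text.

```python
import math

def cesaro_abs_mean(r, T, steps=100_000):
    """Midpoint-rule approximation of (1/T) * integral_0^T |r(tau)| d tau, cf. Eq. (9)."""
    h = T / steps
    return sum(abs(r((i + 0.5) * h)) for i in range(steps)) * h / T

def periodic(t):
    # Covariance with a purely periodic component (illustrative choice).
    return math.cos(2.0 * t)

def decaying(t):
    # Exponentially decaying covariance (illustrative choice).
    return math.exp(-abs(t))

for T in (10.0, 100.0, 1000.0):
    # The first column stays near 2/pi, the second decays like 1/T.
    print(T, cesaro_abs_mean(periodic, T), cesaro_abs_mean(decaying, T))
```

Without the modulus the average of $\cos(2\tau)$ would also tend to zero, so the periodic case would wrongly pass the test; with the modulus it stays near $2/\pi$ and the criterion correctly signals non-ergodicity.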

2. Stationarity in Fourier space

2.1. Harmonisable representation

The Fourier transform of a stationary Gaussian process $X$ cannot be a process well-defined in the classical sense, as a stationary process cannot decay to zero at infinity. However, one can define the Fourier transform in the weak sense [8]

\[ S(\omega) := \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^{T} dt\, \frac{e^{-i\omega t} - 1}{-it}\, X(t). \tag{10} \]

The above limit exists in the mean-square sense; moreover, the process $X$ can be expressed as

\[ X(t) = \int_{\mathbb{R}} dS(\omega)\, e^{i\omega t}, \tag{11} \]

where the integral over the increments $dS(\omega)$ is also understood in the mean-square sense. Any process which can be expressed as (11) is called a weakly harmonisable process. The process $S$ is called a spectral process. Often the values of the spectral process are taken to be complex conjugate, $S(-\omega) = S(\omega)^*$, if we need the resulting $X$ to be real valued.

If the process $S$ has independent increments, i.e. is a scaled Brownian motion in Fourier space, we call $X$ harmonisable, or strongly harmonisable, and, as one can directly calculate,

\[ r_X(s, s+t) = r_X(t) = \int_{\mathbb{R}} \sigma_X(d\omega)\, e^{i\omega t}, \tag{12} \]

where the measure $\sigma_X$, called the spectral measure, is defined as the mean-square amplitude of the increments of $S$,

\[ \sigma_X(d\omega) := \mathbb{E}|dS(\omega)|^2. \tag{13} \]


Because we have shown that $r_X$ depends only on one parameter, any harmonisable Gaussian process is stationary. The converse also holds (see the proof in [4]). In other words, the following theorem holds.

Theorem 3. A Gaussian process $X$ is stationary if and only if the corresponding spectral process $S$ has independent increments.

The measure $\sigma_X$ is non-negative and has total mass $\sigma_X(\mathbb{R}) = \mathbb{E}|X(t)|^2 < \infty$. One might think that for important physical cases it is enough to limit ourselves to the case when $\sigma_X$ is absolutely continuous, that is $\sigma_X(d\omega) = d\omega\, s(\omega)$, and has a density $s$ called the power spectral density [8]. The next section shows that this is not true.

2.2. Harmonic processes

Consider the elementary example of motion in a harmonic potential, governed by the equation

\[ \ddot X = -\omega_0^2 X, \quad X(0) = X_0, \quad \dot X(0) = V_0, \tag{14} \]

which has the solution

\[ X(t) = X_0 \cos(\omega_0 t) + \frac{V_0}{\omega_0} \sin(\omega_0 t). \tag{15} \]

The evolution of the system is purely deterministic. But if we assume that at the beginning of the evolution the system interacted with a heat bath, the initial conditions $X_0$ and $V_0$ are random and have the Gibbs distribution given by the density

\[ \rho(x_0, v_0) \sim \exp\Bigl(-\omega_0^2 \frac{x_0^2}{2 k_B T}\Bigr) \exp\Bigl(-\frac{v_0^2}{2 k_B T}\Bigr). \tag{16} \]

Therefore the values $X(t)$ for any $t$ are also random, and a simple calculation shows that the resulting stochastic process has the covariance function

\[ r_X(t) = \frac{k_B T}{\omega_0^2} \cos(\omega_0 t), \tag{17} \]

so it is a stationary Gaussian process. Its spectral representation is therefore given by the measure

\[ \sigma_X(d\omega) = \frac{k_B T}{2\omega_0^2} \bigl(\delta(d\omega - \omega_0) + \delta(d\omega + \omega_0)\bigr), \tag{18} \]

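which is concentrated at two points, which we denote by two Dirac deltas. In a natural way a question about ergodicity arises. Whereas the time average of the observable of the position $f(X) = X(t)$,

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, X(\tau) = \lim_{T\to\infty} \frac{1}{T} \Bigl( \frac{X_0}{\omega_0} \sin(\omega_0 T) - \frac{V_0}{\omega_0^2} \bigl(\cos(\omega_0 T) - 1\bigr) \Bigr) = 0, \tag{19} \]

converges to the ensemble mean $0 = \mathbb{E}[X(t)]$, the observable of the mean square displacement does not, as

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, \bigl(X(\tau+\Delta) - X(\tau)\bigr)^2 = 2 \Bigl( X_0^2 + \frac{V_0^2}{\omega_0^2} \Bigr) \Bigl( \sin\Bigl(\frac{\omega_0 \Delta}{2}\Bigr) \Bigr)^2 \tag{20} \]

differs from the ensemble average

\[ \mathbb{E}\bigl[\bigl(X(t+\Delta) - X(t)\bigr)^2\bigr] = \frac{4 k_B T}{\omega_0^2} \Bigl( \sin\Bigl(\frac{\omega_0 \Delta}{2}\Bigr) \Bigr)^2. \tag{21} \]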
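As a short consistency check, the ensemble average of the random time average (20) should reproduce (21), as required by the general relation (5). The worked calculation below uses only the variances implied by the Gibbs density (16).

```latex
\begin{align*}
\mathbb{E}\bigl[X_0^2\bigr] &= \frac{k_B T}{\omega_0^2}, \qquad
\mathbb{E}\bigl[V_0^2\bigr] = k_B T, \\
\mathbb{E}\Bigl[\, 2\Bigl(X_0^2 + \frac{V_0^2}{\omega_0^2}\Bigr)
   \sin^2\!\Bigl(\frac{\omega_0\Delta}{2}\Bigr) \Bigr]
 &= 2\Bigl(\frac{k_B T}{\omega_0^2} + \frac{k_B T}{\omega_0^2}\Bigr)
   \sin^2\!\Bigl(\frac{\omega_0\Delta}{2}\Bigr)
  = \frac{4 k_B T}{\omega_0^2}\,
   \sin^2\!\Bigl(\frac{\omega_0\Delta}{2}\Bigr).
\end{align*}
```

So the two quantities agree on average, while differing on every individual trajectory.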

From the point of view of dynamical system theory this lack of ergodicity is expected; after the initial contact with the heat bath the system evolves as a microcanonical ensemble and the trajectories are trapped on the surface of constant energy, which prohibits ergodicity. Indeed, the term $X_0^2 + V_0^2/\omega_0^2$ on the right-hand side of Eq. (20) is, up to a constant factor, the total energy of the system; it is random, but constant along each fixed trajectory of $X$. On the other hand, the factor $2 k_B T/\omega_0^2$ in (21) is the ensemble average of the same quantity, i.e. it corresponds to the mean total energy.
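This behaviour is easy to reproduce numerically. The following minimal Python sketch samples the exact solution (15) with Gibbs-distributed initial conditions and compares the time-averaged mean square displacement of each path with the right-hand side of (20) and with the ensemble value (21). The parameter values (`omega0`, `kT`, `dt`, `n`, `lag`), the unit mass and the function names are assumptions made for illustration only.

```python
import math
import random

def oscillator_path(x0, v0, omega0, dt, n):
    """Exact solution (15) sampled at t = 0, dt, ..., (n-1)*dt."""
    return [x0 * math.cos(omega0 * i * dt) + v0 / omega0 * math.sin(omega0 * i * dt)
            for i in range(n)]

def time_averaged_msd(x, lag):
    """Discrete version of the time average on the left-hand side of (20)."""
    m = len(x) - lag
    return sum((x[i + lag] - x[i]) ** 2 for i in range(m)) / m

omega0, kT = 1.3, 1.0            # illustrative values, units with mass = 1
dt, n, lag = 0.01, 100_000, 100  # T = 1000, Delta = lag * dt = 1
rng = random.Random(0)
shape = math.sin(omega0 * lag * dt / 2) ** 2

for _ in range(3):
    # Gibbs initial conditions from (16): Var X0 = kT / omega0^2, Var V0 = kT.
    x0 = rng.gauss(0.0, math.sqrt(kT) / omega0)
    v0 = rng.gauss(0.0, math.sqrt(kT))
    tamsd = time_averaged_msd(oscillator_path(x0, v0, omega0, dt, n), lag)
    predicted = 2 * (x0 ** 2 + (v0 / omega0) ** 2) * shape   # Eq. (20)
    print(f"time average {tamsd:.4f}   Eq. (20) {predicted:.4f}")

print(f"ensemble average, Eq. (21): {4 * kT / omega0 ** 2 * shape:.4f}")
```

Each trajectory matches its own prediction from (20), but the values differ from path to path and in general from the single ensemble number (21).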

Similar reasoning applies to the more general process of the form

\[ X(t) = \sum_k A_k e^{i\omega_k t}, \tag{22} \]

where the sum can even be infinite if the $A_k$ are independent complex Gaussian variables and $\sum_k \mathbb{E}|A_k|^2 < \infty$. The random functions of this class, called harmonic processes, appear e.g. in phonon theory [9], where it is straightforward to recognise normal modes in sums of the type (22).

The covariance function and spectral measure which correspond to a given harmonic process are

\[ r_X(t) = \sum_k \mathbb{E}|A_k|^2 \cos(\omega_k t), \qquad \sigma_X(d\omega) = \sum_k \frac{\mathbb{E}|A_k|^2}{2} \bigl(\delta(d\omega - \omega_k) + \delta(d\omega + \omega_k)\bigr). \tag{23} \]

If one calculates the ensemble- and time-average of the mean square displacement for such a process, the different modes of oscillation prove to be uncoupled in both the time- and ensemble-average sense; the corresponding formulae are sums of terms as in Eq. (21) or (20) (for details see Appendix A). Therefore, any process within this class is stationary, but non-ergodic. The next section will show that it is actually the only case of a non-ergodic Gaussian stationary process.

3. Ergodicity in Fourier space

3.1. Spectral form of Maruyama’s theorem

All the properties of a Gaussian process can be described interchangeably by its covariance function or its spectral measure; this very specific property of the Gaussian class is caused by its linear structure. The Maruyama theorem can also be expressed in the language of the spectral measure, and this reformulation leads to a surprisingly elegant statement [4].

Theorem 4. A stationary Gaussian process is ergodic if and only if its spectral measure has no atoms (point masses).

To fully understand this theorem, note that any measure can be decomposed as a sum of three distinct components: the absolutely continuous, singular and discrete measures. For a stochastic process the corresponding decomposition of the spectral measure $\sigma = \sigma_{ac} + \sigma_s + \sigma_d$ also causes the process itself to decompose into three independent components

\[ X(t) = X_{ac}(t) + X_s(t) + X_d(t), \tag{24} \]

which is guaranteed by the harmonisable representation (11). Now:

• The component $X_d(t)$ is non-ergodic and is a Gaussian harmonic process.
• The component $X_{ac}$ is mixing. It has a power spectral density. The Riemann–Lebesgue lemma shows that in this situation the covariance (and all other memory functions) of $X_{ac}$ decays at infinity, i.e. the values of the process become asymptotically independent at long time scales.
• The last, singular component $X_s$ is ergodic, but its memory structure may not be typical. Its covariance function does not necessarily decay to 0. It may oscillate, but it must be aperiodic and the high-correlation events must become more scarce as $t \to \infty$.

The set of measures for which the covariance function decays, called Rajchman measures, generally does not have a convenient description [10]. As a demonstration, let us consider an example using the most well-known singular measure: the Cantor measure.

The Cantor set is obtained by removing the middle one-third from the interval [0, 1], then repeating this procedure on the two remaining intervals [0, 1/3], [2/3, 1], and recursively applying this procedure infinitely many times. The points which remain are the Cantor points. An elementary calculation proves that the total length of the intervals removed during the construction of the Cantor set is 1; therefore the Cantor measure cannot have a density and must be singular.

Fig. 1. Plot of the covariance function $r_C$ of the process with the Cantor spectral measure.

The points of the Cantor set can be conveniently characterised as the points of the interval [0, 1] which have no digits "1" in their ternary representation, i.e. have the representation $\sum_{k=1}^{\infty} d_k 3^{-k}$ for some sequence $d_k \in \{0, 2\}$.

The Cantor measure $\sigma_C$ is the uniform measure on the Cantor set. We move this measure left to the interval [−1/2, 1/2] so that the corresponding process will be real-valued. The Cantor numbers in this interval can be represented as $\sum_{k=1}^{\infty} d_k 3^{-k}$ where $d_k \in \{-1, 1\}$. The corresponding Cantor measure is probably most simply understood as a discrete uniform distribution on the i.i.d. series $D_k$, $P(D_k = -1) = P(D_k = 1) = 1/2$, mapped onto the interval [−1/2, 1/2] using the formula $Y = \sum_{k=1}^{\infty} D_k 3^{-k}$.

The process $X_C$ which has the Cantor measure as its spectral measure has the covariance function

\[ r_{X_C}(t) = \int_{\mathbb{R}} \sigma_C(d\omega)\, e^{i\omega t} = \mathbb{E}\Bigl[ e^{it \sum_{k=1}^{\infty} D_k 3^{-k}} \Bigr] = \prod_{k=1}^{\infty} \mathbb{E}\Bigl[ e^{it D_k 3^{-k}} \Bigr] = \prod_{k=1}^{\infty} \cos\bigl(t\, 3^{-k}\bigr). \tag{25} \]

In contrast to more well-known classes of covariance functions, $r_{X_C}$ has a specific property close to self-similarity,

\[ r_{X_C}(3t) = \cos(t)\, r_{X_C}(t), \tag{26} \]

which also guarantees that $r_{X_C}$ does not decay to zero. The extremal points of $r_C$ are located at $t_k = 3^k \pi$, where, by (26), it attains the values $r_C(t_k) = (-1)^k r_C(\pi) \approx \pm 0.47$, see Fig. 1. It may not be obvious that the function $r_C$ can be easily calculated numerically; however, it can be shown that, taking $N \ge \log_3 t$ terms from the product (25), we overshoot no more than to the level $\exp(2 t^2 9^{-N})\, r_C$, which is a very fast convergence; see the proposition in Appendix B and its proof.
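The truncated product (25) is straightforward to evaluate. The minimal Python sketch below computes it and checks the scaling relation (26) and the recurring magnitude near 0.47; the function name `cantor_covariance` and the safety margin `extra_terms` added on top of $\log_3 t$ are illustrative assumptions, not values from the text.

```python
import math

def cantor_covariance(t, extra_terms=30):
    """Truncated product (25): prod_{k=1}^{N} cos(t * 3**-k).

    N is ceil(log_3 |t|) plus `extra_terms`; the margin is an
    illustrative choice that makes the truncation error negligible.
    """
    n_terms = extra_terms + (math.ceil(math.log(abs(t), 3)) if abs(t) > 1 else 0)
    value = 1.0
    for k in range(1, n_terms + 1):
        value *= math.cos(t * 3.0 ** (-k))
    return value

print(cantor_covariance(math.pi))             # approximately 0.47

# Scaling relation (26): r(3t) = cos(t) * r(t).
t = 2.0
print(cantor_covariance(3 * t), math.cos(t) * cantor_covariance(t))

# The magnitude at t = 3^k * pi does not decay with k.
for k in range(1, 6):
    print(k, abs(cantor_covariance(3 ** k * math.pi)))
```

The printed magnitudes stay at the same level for all $k$, which is the recurring correlation that prevents mixing.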

This demonstrates that the process $X_C$ has a recurring correlation and cannot be mixing; however, the correlation events become exponentially rarer as the time delay increases, which allows for ergodicity.

Physically, $X_C$ can be interpreted as a dynamical process generated by a heat bath with Cantor-like geometry in which we observe the macroscopic collective average of the oscillators' positions. The fractal structure of such a system is recognisable only in Fourier space. In position space it is visible only as the recurring correlation. It does not affect, e.g., the regularity of the trajectories, which is determined by the asymptotics of the covariance function near $t = 0$. In fact, in the above case the trajectories of $X_C$ are smooth. We comment more on the relation between the heat bath models and ergodicity in Section 4.2.

Unfortunately, there is no simple correspondence between the fractal structure of the spectral measure and the recurring correlation or mixing, which becomes evident even with a slight generalisation of the model. If, instead of removing one third, we perform the recursive removal procedure such that at any step the remaining intervals on the left and right have length one-ηth of the previous one (η being a real number bigger than 2), the resulting singular measure yields a process which is non-mixing for natural η, but mixing for any η which is not a Pisot–Vijayaraghavan number [11]. The Pisot–Vijayaraghavan numbers form a closed countable set, so even arbitrarily small changes of η can change the mixing behaviour.

The complex ergodic behaviour complicates the analysis of models with singular spectral measures, but it is worth stressing that the erratic behaviour of the covariance functions may be useful for describing observations which could otherwise be dismissed as experimental errors. It is also worth noting that singular measures are gaining attention for their relation to fractal dynamics and self-similarity [12–14].

3.2. Generalised Maruyama theorem for non-ergodic stationary processes

Any real stationary Gaussian process can be written as

\[ X(t) = X_{\mathrm{erg}}(t) + \sum_{k=1}^{N} R_k \cos(\Theta_k + \omega_k t) + X_0, \quad \omega_k \neq 0, \tag{27} \]

which follows from the ergodic decomposition (24), after taking the real part of the harmonic process (22). The variables $R_k := |A_k|$ are the amplitudes of the spectral points at frequencies $\omega_k$ and have Rayleigh distributions with scale parameters $\sigma_k = \mathbb{E}|A_k|^2$; $\Theta_k$ are the phases of $A_k$ and have the uniform distribution on $[0, 2\pi)$, a consequence of the rotational invariance of i.i.d. Gaussian vectors. In full generality the number of spectral points may be infinite, $N = \infty$; however, in this case the process $X$ may exhibit complicated aperiodic behaviour. As this is not very important for most applications, further on in this section we limit our considerations to the case $N < \infty$.

The decomposition into non-ergodic and ergodic components yields a useful and straightforward description of the statistical properties of Gaussian processes. It is made possible by the full characterisation of the invariant sets of this dynamical system.

Theorem 5. For any stationary Gaussian process $X$ with $N$ spectral points at rationally incommensurable frequencies $\{\omega_k\}_{k=1}^{N}$, the family of invariant sets $\mathcal{C}$ is the σ-algebra $\sigma(X_0, \{R_k\}_k)$.

The proof is given in Appendix A. The assumption that the spectral points at $\{\omega_k\}_{k=1}^{N}$ are rationally incommensurable means that they cannot be represented as $\omega_k = q_k \alpha$ for any rational $q_k$'s. It is fulfilled in most real physical systems, in which the $\omega_k$ are self-frequencies of the harmonic oscillators and depend on a complex set of the system's parameters.

In such a case, the aforementioned theorem guarantees that for any observable $f$ the time average converges to

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, f(S_\tau X) = \mathbb{E}\bigl[ f(X) \,\big|\, X_0, \{R_k\}_k \bigr], \tag{28} \]

i.e. to the ensemble average calculated under the condition that the amplitudes of the spectral points and the constant term $X_0$ are fixed. For observables which depend on one time moment of $X$ only, $f(X) = f(X(t))$, the above formula simplifies to the explicit integral

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, f(X(\tau)) = \frac{1}{(2\pi)^N} \int_0^{2\pi} d\theta_1 \int_0^{2\pi} d\theta_2 \ldots \int_{\mathbb{R}} dx\, \frac{1}{\sqrt{2\pi}\, c}\, e^{-\frac{x^2}{2c^2}}\, f\Bigl( x + \sum_k R_k \cos(\theta_k) + X_0 \Bigr), \tag{29} \]

where the prefactor $(2\pi)^{-N}$ normalises the uniform distribution of the phases. The result depends only on the variance of the ergodic component, $c^2 = \mathbb{E}[X_{\mathrm{erg}}(t)^2]$, the $R_k$'s and $X_0$, which are random, but fixed for each trajectory.

In particular, using the cumulative distribution function method, one can calculate the non-ergodic time-averaged probability density $\rho$ of any given discrete spectral component with amplitude $R_k = R$,

\[ \rho(x) = \rho_{X_k}(x|R) = \frac{(\pi R)^{-1}}{\sqrt{1 - (x/R)^2}}, \quad -R \le x \le R. \tag{30} \]


This quantity should be observed if one uses time-average estimation of the probability density, e.g. kernel density estimators or a histogram [15].

Fig. 2. Time-average kernel density estimation (each blue line corresponds to one trajectory with a different random $R$) and the ergodic density for a Gaussian process with one discrete spectral component (black line), $\mathbb{E}[R^2] = \mathbb{E}[X_{\mathrm{erg}}(t)^2] = 1$. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
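The density (30) can be checked on a single deterministic path. The minimal Python sketch below measures the fraction of time that one spectral component $R\cos(\Theta + \omega t)$ spends below a given level and compares it with the distribution function obtained by integrating (30), namely $1/2 + \arcsin(x/R)/\pi$. The parameter values and function names are illustrative assumptions, and no ergodic component is included.

```python
import math

def arcsine_density(x, R):
    """Time-averaged density (30) of a single spectral component of amplitude R."""
    return 1.0 / (math.pi * R * math.sqrt(1.0 - (x / R) ** 2))

def time_fraction_below(level, R, omega, phase, T, steps=200_000):
    """Fraction of the time in [0, T] that R*cos(phase + omega*t) spends at or below `level`."""
    h = T / steps
    hits = sum(1 for i in range(steps)
               if R * math.cos(phase + omega * (i + 0.5) * h) <= level)
    return hits / steps

R, omega, phase, T = 2.0, 1.0, 0.3, 2000.0   # illustrative values
for level in (-1.5, 0.0, 1.0):
    empirical = time_fraction_below(level, R, omega, phase, T)
    # Integrating (30) from -R to `level` gives the arcsine distribution function.
    exact = 0.5 + math.asin(level / R) / math.pi
    print(level, round(empirical, 3), round(exact, 3))
```

The agreement does not depend on the phase, in line with the statement that only the amplitude $R$ is a constant of motion for a single component.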

This is an interesting example of a singularity caused by non-ergodicity. The flat extrema of the cosine function are responsible for the probability density divergence of type $x^{-1/2}$ at the points $-R$ and $+R$. However, this unusual behaviour is not easy to observe directly, as typical data contain some ergodic component, e.g. some kind of noise from the experimental setup. In such a case, the observed empirical probability distribution $\rho$ is a convolution of the ergodic component's Gaussian density with a given stationary variance $\sigma^2$ and some number of densities of type (30), see Fig. 2. The singular concentration of the probability mass around $-R$ and $+R$ distorts the tails of the distribution in a specific way: it thickens them by shifting the original distribution, but thins them through division by a square root factor. More precisely, we can recognise that this convolution of the densities is an example of a Laplace transform and use an Abelian theorem [16] to obtain the asymptotic behaviour of the left tail,

\[
\begin{aligned}
\rho(-x) &= \frac{1}{\sqrt{2\pi^3}\, \sigma R} \int_{-R}^{R} dy\, \frac{1}{\sqrt{1 - (y/R)^2}}\, e^{-\frac{(y+x)^2}{2\sigma^2}} \\
&= e^{-\frac{(x-R)^2}{2\sigma^2}}\, \frac{1}{\sqrt{2\pi^3 R}\, \sigma} \int_0^{2R} dy\, \frac{1}{\sqrt{y}\, \sqrt{2 - y/R}}\, e^{-\frac{y^2}{2\sigma^2}}\, e^{-\frac{y(x-R)}{\sigma^2}} \\
&\sim \frac{1}{2\pi \sqrt{R\,(x-R)}}\, e^{-\frac{(x-R)^2}{2\sigma^2}}, \quad x \to \infty;
\end{aligned}
\tag{31}
\]

symmetrically for the right tail. This result does not change significantly for any finite number of spectral points, as it depends only on the presence of singularities of the convolved densities.

The asymptotic formula (31) differs considerably from the normal distribution; for most of the realisations the amplitudes $R_k$ are large enough to strongly affect the time-averaged probability density, see Fig. 2. For nearly all realisations statistical tests also show significant non-Gaussianity (e.g. Shapiro and Kolmogorov–Smirnov tests). However, for small amplitudes $R_k$ this effect could be less noticeable.

The fast decay of the function $\exp(-x^2)$ may complicate the analysis of non-ergodicity through density estimation. A more convenient method is to use the time-averaged characteristic function. It is the time average of the observable $f(X) = \exp(i\theta X(t))$, which for a non-ergodic component equals

\[ \lim_{T\to\infty} \frac{1}{T} \int_0^T d\tau\, e^{i\theta X(\tau)} = \frac{1}{2\pi} \int_{-\pi}^{\pi} dx\, e^{i\theta R \cos(x)} = J_0(R\theta), \tag{32} \]

where $J_0$ is the Bessel function of the first kind and order 0. Therefore the time-averaged characteristic function $\varphi$ of any stationary Gaussian process with incommensurable frequencies has the form

\[ \varphi(\theta) = e^{-c^2\theta^2/2} \prod_k J_0(R_k \theta), \tag{33} \]

where $\exp(-c^2\theta^2/2)$ is the ensemble-averaged characteristic function of the ergodic component of the process. The additional, non-ergodic factors $J_0(R_k\theta)$ cause the whole function to have zeros determined by the zeros of the function $J_0$, which may be approximated numerically, the first being $x_1 \approx 2.405$, the second $x_2 \approx 5.520$. Therefore the location of the zeros of the time-averaged characteristic function may serve as a preliminary estimate of the number and values of the amplitudes $R_k$. More precise estimation requires least-squares fitting using the formula from Eq. (33). Exemplary results of this procedure are shown in Fig. 3.

Fig. 3. Estimation of $c$ (with $c^2 = \mathbb{E}[X_{\mathrm{erg}}(t)^2]$), $R_1$ and $R_2$ for different sample sizes $N$ (panels: estimated $c$, estimation of $R_1$, estimation of $R_2$; $N$ = 200, 500, 1000, 2000), using a least-squares fit of the time-averaged characteristic function for the process $X_{\mathrm{erg}}(t) + 3\cos(t) + 2\cos(\sqrt{2}\,t)$.
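The zero-based preliminary estimate can be illustrated on the simplest case. The following Python sketch is a minimal example under strong assumptions: a single spectral component with no ergodic part, illustrative parameter values, a power-series evaluation of $J_0$, and a coarse grid of θ values with linear interpolation of the first sign change. It is not the least-squares procedure used for Fig. 3.

```python
import cmath
import math

def bessel_j0(x, terms=60):
    """Power series J0(x) = sum_m (-1)^m (x^2/4)^m / (m!)^2, adequate for moderate |x|."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -(x * x / 4.0) / ((m + 1) ** 2)
    return total

def time_averaged_cf(theta, path):
    """Time-averaged characteristic function: mean of exp(i*theta*X) along one path."""
    return sum(cmath.exp(1j * theta * x) for x in path) / len(path)

def first_zero(thetas, values):
    """Linear interpolation of the first positive-to-non-positive sign change."""
    for j in range(1, len(thetas)):
        if values[j - 1] > 0 >= values[j]:
            frac = values[j - 1] / (values[j - 1] - values[j])
            return thetas[j - 1] + frac * (thetas[j] - thetas[j - 1])
    raise ValueError("no sign change on the grid")

R, omega, phase = 3.0, 1.0, 1.1     # illustrative values
dt, n = 0.05, 20_000                # T = 1000
path = [R * math.cos(phase + omega * i * dt) for i in range(n)]

# Compare the empirical time average with J0(R * theta) from Eq. (32).
for theta in (0.2, 0.5, 1.0):
    print(theta, time_averaged_cf(theta, path).real, bessel_j0(R * theta))

# Preliminary amplitude estimate: first zero of J0 is at about 2.405.
thetas = [0.05 * j for j in range(1, 41)]
values = [time_averaged_cf(th, path).real for th in thetas]
estimated_R = 2.405 / first_zero(thetas, values)
print("estimated R:", estimated_R)
```

With an ergodic component present, the Gaussian factor in (33) damps the function but does not move its zeros, which is why the zero locations remain a usable starting point for a subsequent fit.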

The estimators of $R_k$ tend to undershoot, which causes a negative bias, especially for shorter trajectories, but they are generally reliable. The results depend on the sample length $N$, but do not depend on the sampling time $\Delta t$ as long as $\pi/\Delta t$ is incommensurable with $\{\omega_k\}_k$. If this is not true, the measured time series has a periodic component and the proper values of the time-averaged observables are obtained in the infill asymptotics, i.e. $\Delta t \to 0$ and $\Delta t\, N \to \infty$. Such requirements guarantee that the calculated mean converges to the time integral $\frac{1}{T}\int_0^T d\tau$ where $T \to \infty$.

Let us consider models in which it may happen that the $\omega_k$ are commensurable. This situation may appear in a real system when the $\omega_k$, by coincidence or due to some symmetry, are close to being commensurable, that is, they can be expressed as $\omega_k = \alpha p_k/q_k + \epsilon_k$, where $p_k, q_k$ are small natural numbers and $\epsilon_k$ is small compared to $1/T$.

For commensurable $\omega_k$ the harmonic process is in fact periodic, and the length of its period is proportional to the lowest common denominator of the $\omega_k$'s. We show an example in Fig. 4, where the empirical characteristic function of the process $R_1\cos(\Theta_1 + \omega_1 t) + R_2\cos(\Theta_2 + \omega_2 t)$ is presented. For simplicity we fixed $\omega_1 = 1$ and $R_1 = 1$, $R_2 = 2$, as the $R_k$'s are constants of motion in this case. Because of the periodicity, the calculated time average depends on the random initial phases $\Theta_1$, $\Theta_2$.

Fig. 4. Time-averaged characteristic function of the process $\cos(\Theta_1 + t) + 2\cos(\Theta_2 + \omega_2 t)$ for different $\omega_2$ and 20 realisations of $\Theta_1$, $\Theta_2$. The black line, corresponding to the simulation for $\omega_2 = \sqrt{2}$ up to numerical accuracy $\epsilon \sim 10^{-16}$, agrees perfectly with the ergodic average $J_0(t)J_0(2t)$, see Eq. (33).

We actually know the exact dependence on both the random amplitudes and the phases; the time-averaged characteristic function of any stationary Gaussian process is given by the formula (see Appendix A)

\[ \varphi(\theta) = e^{-c^2\theta^2/2} \sum_{S\in G} \exp\Bigl( i \sum_{m_j \in S} m_j \Theta_j \Bigr) \prod_{m_j \in S} i^{m_j} J_{m_j}(\theta R_j), \tag{34} \]

where $G$ is the family of sets of $m_j$'s for which $\sum_j m_j \omega_j = 0$. The result depends on the linear combinations of the random phases $\sum_j m_j \Theta_j \pmod{2\pi}$. This is true not only for the time-averaged characteristic function, but also for any observable, which is stated in the following theorem:

Theorem 6. For any stationary Gaussian process $X$ with spectral points $\{\omega_k\}_{k=1}^{N}$ (some of which may be commensurable), the family of invariant sets $\mathcal{C}$ is the σ-algebra $\sigma(X_0, \{R_k\}_k, M)$, where $X_0$ is the constant term, $R_k$ are the amplitudes of the atoms of the spectral measure and $M$ is the family

\[ M = \Bigl\{ \sum_j m_j \Theta_j \pmod{2\pi} \;:\; \sum_j m_j \omega_j = 0 \Bigr\}. \tag{35} \]

Moreover, $M$ can be reduced to contain at most $N - 1$ integer linear combinations.

The proof is given in Appendix A. This general theorem completely determines the behaviour of time-averaged observables for stationary Gaussian processes and has important practical applications, allowing for statistical analysis of non-ergodic stationary Gaussian models.

As an example, let us come back to the time-averaged characteristic function (34). For the case of a process with two spectral points with frequencies $\omega_1 = 1$, $\omega_2 = 2$, the numerically calculated time averages are shown as yellow lines in Fig. 4. The integer combinations in the family $M$ from Eq. (35) are exactly $m_1 = 2$, $m_2 = -1$ and multiples of those. Therefore the time-averaged characteristic function depends only on $2\Theta_1 - \Theta_2 \pmod{2\pi}$. Indeed, any yellow line in Fig. 4 corresponds to many different random choices of $\Theta_1$, $\Theta_2$, and these lines can be parametrised only by the number $c \in [0, 2\pi)$ defined as $2\Theta_1 - \Theta_2 = c \pmod{2\pi}$. Similarly, the green lines in the same figure depend only on $3\Theta_1 - 2\Theta_2 \pmod{2\pi}$, because we have $m_1 = 3$, $m_2 = -2$, and so on.
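The dependence on the single combination $2\Theta_1 - \Theta_2$ can be verified directly. The Python sketch below averages $\exp(i\theta X(t))$ over one common period for $\omega_1 = 1$, $\omega_2 = 2$ and compares phase pairs that share, or do not share, the value of $2\Theta_1 - \Theta_2$. The amplitudes, the value of θ, the chosen phases and the function name are illustrative assumptions.

```python
import cmath
import math

def period_averaged_cf(theta, r1, r2, phase1, phase2, omega2=2, steps=4000):
    """Average of exp(i*theta*X(t)) over one common period 2*pi,
    for X(t) = r1*cos(phase1 + t) + r2*cos(phase2 + omega2*t) with integer omega2."""
    h = 2.0 * math.pi / steps
    total = 0j
    for i in range(steps):
        t = i * h
        x = r1 * math.cos(phase1 + t) + r2 * math.cos(phase2 + omega2 * t)
        total += cmath.exp(1j * theta * x)
    return total / steps

theta, r1, r2 = 1.0, 1.0, 2.0
a = period_averaged_cf(theta, r1, r2, 0.4, 0.3)    # 2*0.4 - 0.3   = 0.5
b = period_averaged_cf(theta, r1, r2, 1.4, 2.3)    # 2*1.4 - 2.3   = 0.5
c = period_averaged_cf(theta, r1, r2, 0.4, -1.2)   # 2*0.4 - (-1.2) = 2.0

print(abs(a - b))   # tiny: same value of 2*Theta1 - Theta2
print(abs(a - c))   # noticeably larger: different value of the combination
```

The pair `a`, `b` differs only by a time shift of the trajectory, which leaves a full-period average unchanged; the pair `a`, `c` corresponds to genuinely different values of the constant of motion.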

It may seem counter-intuitive that the rationality or irrationality of the number ω2, which cannot be experimentally studied, affects numerical simulations and the behaviour of the real systems. This apparent paradox disappears if we carefully analyse the behaviour of the time-averages in the two essential cases:

• For ω2 = p/q with coprime p, q the trajectory will have period 2πq. For q sufficiently larger than the experiment time (q ≫ T), the periodicity will be unobservable during the measurement and the time-averaged observables will seem independent of initial phases.

• For irrational ω2 = p/q + ϵ with coprime p, q and ϵq ≪ 1 the time-averaged observables will not depend on initial phases in the long-time limit T → ∞; however, the process will be very close to a periodic one, therefore the convergence will be slow.

So, we realise that in a real experiment, in which the time of measurement is finite, T < ∞, the practically significant property is how close ω2 is to an irreducible fraction with a small denominator. Analogously, for multiple ωk's we only need to determine how close they are to a set of commensurable numbers with simple rational ratios.
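The notion of closeness to an irreducible fraction with a small denominator can be made concrete with a short computation. This is a sketch under the assumption that the denominator bound max_denominator is chosen by the user (for instance comparable to the measurement time T); the function name is hypothetical and not part of the paper.

```python
from fractions import Fraction


def nearest_simple_fraction(omega, max_denominator):
    """Return the closest fraction p/q to omega with q <= max_denominator,
    together with the distance |omega - p/q|."""
    frac = Fraction(omega).limit_denominator(max_denominator)
    return frac, abs(omega - float(frac))


# A frequency very close to 1/3 and one that is far from any simple fraction.
print(nearest_simple_fraction(0.3334, 10))
print(nearest_simple_fraction(2 ** 0.5 - 1, 10))
```

A small distance for a small denominator bound indicates a frequency for which phase-dependent time averages are to be expected at finite T.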


Fig. 5. Numerical estimation of the variance var[φ(2.5)] obtained using 10^3 samples of trajectories with length T = 200 and values of ω2 taken as one thousand uniformly scattered numbers in the interval [0, 1] stored in double format. (Vertical axis: logarithmic scale from 10^{−6} to 10^{−2}; horizontal axis: ω2, with ticks at the simple fractions 0, 1/7, 1/6, 1/5, 1/4, 2/7, 1/3, 3/8, 2/5, 3/7, 1/2, 4/7, 3/5, 5/8, 2/3, 5/7, 3/4, 4/5, 5/6, 6/7.)

Fig. 6. Comparison between estimated time-average covariance function and a theoretical ensemble one for a stationary non-ergodic process with commensurable frequencies of the spectral points.

In order to illustrate this fact numerically we simulated the process R1 cos(Θ1 + t) + R2 cos(Θ2 + ω2t) with fixed R1 = 2, R2 = 3 and calculated the variance of the time-averaged characteristic function at one point, φ(2.5). For irrational ω2 and in the limit T → ∞ this quantity should equal zero, but in the finite-time numerical experiment it is a positive function of ω2 which indeed measures how close ω2 is to a simple irreducible fraction. See Fig. 5, where the peaks of the variance indicate the positions of the simplest irreducible fractions like 1/2, 1/3, 2/3, 1/4, 3/4 etc.

Another result, which may be somewhat unexpected, is that according to the calculation in Section 2.2, even for commensurable $\{\omega_k\}_k$ the time-averaged second-order properties depend only on $\{R_k\}_k$ and the dependence on the initial phases $\{\Theta_k\}_k$ is lost. Indeed, it is sufficient to prove the agreement of the time- and ensemble-averaged covariance function, which we show in Appendix A. It means that the additional memory structure induced by the commensurability of $\{\omega_k\}_k$ is purely non-linear and it is possible to detect it only using higher-order statistics, e.g. the time-averaged characteristic function. This fact does not contradict the purely linear dependence structure of Gaussian processes, because this property applies only to the time-averages, which in the non-ergodic case make use of only a part of the full information contained in the process. The additional non-linear dependences can be interpreted as shadows of the full, linear but inaccessible information.

An example of covariance estimation is shown in Fig. 6, where we sampled the stationary Ornstein–Uhlenbeck process with mean-returning parameter λ = 3, with the addition of two spectral components 2 cos(t) and cos(2t). The covariance conditioned by R1 = 2, R2 = 1 is

$$
r(t) = e^{-\lambda t} + 2\cos(t) + \frac{1}{2}\cos(2t) \tag{36}
$$

and it agrees with the estimated time-averaged covariance, as shown in Fig. 6.
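The harmonic part of Eq. (36) can be reproduced with a few lines of code. The sketch below is an illustration only and leaves out the Ornstein–Uhlenbeck component: it averages X(t + Δ)X(t) for X(t) = 2 cos(Θ1 + t) + cos(Θ2 + 2t) over one period and compares the result with 2 cos(Δ) + cos(2Δ)/2. The function names and the grid size are assumptions.

```python
import math


def time_avg_cov(lag, phases, amps=(2.0, 1.0), freqs=(1, 2), n_steps=2048):
    """Time average of X(t + lag) * X(t) over one period 2*pi, where
    X(t) = sum_k amps[k] * cos(phases[k] + freqs[k] * t), integer freqs."""
    def x(t):
        return sum(r * math.cos(p + w * t) for r, p, w in zip(amps, phases, freqs))

    total = 0.0
    for k in range(n_steps):
        t = 2.0 * math.pi * k / n_steps
        total += x(t + lag) * x(t)
    return total / n_steps


def ensemble_cov(lag, amps=(2.0, 1.0), freqs=(1, 2)):
    """Covariance conditioned on the amplitudes: sum_k R_k^2 / 2 * cos(w_k * lag)."""
    return sum(0.5 * r * r * math.cos(w * lag) for r, w in zip(amps, freqs))


for phases in [(0.1, 2.0), (1.5, 0.4)]:
    print(time_avg_cov(0.9, phases), ensemble_cov(0.9))
```

Both phase choices should give the same time average, although the frequencies 1 and 2 are commensurable, which illustrates that the phase dependence is invisible at second order.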

4. Ergodicity of the linear response systems

4.1. Linear filters

In this section we will make use of three basic facts.


Proposition 1. Let us consider any finite measure σ and a measurable function f, defined σ-almost everywhere. In this case:

1. If σ is absolutely continuous, then σf is absolutely continuous.
2. If σ is Rajchman, then σf is Rajchman.
3. If σ is continuous, then σf is continuous.

The function f is often called the spectral gain.

Fact 1 follows from the definition of the measure σf. For any measurable set A it equals, by definition,

$$
(\sigma f)(A) = \int_A \sigma(d\omega)\, f(\omega), \tag{37}
$$

in short (σf)(dω) = σ(dω)f(ω). So, if σ has the form σ(dω) = dω s(ω), then σf has the form dω s(ω)f(ω). Fact 2 is a known result from measure theory and can be proven using trigonometric polynomials [10,17]. The proof of Fact 3 is also simple: σ({x0})f(x0) ≠ 0 only if σ({x0}) ≠ 0.

Considered together, these three facts guarantee that if a Gaussian process X has spectral measure σ, then a Gaussian process Y with spectral measure σf inherits all ergodic properties (ergodicity, mixing) from X. Process Y can only have more ergodic properties than X.

The above proposition can be used to determine the ergodic behaviour of various transformations of a given process. If process X has spectral process S (i.e. is given by Eq. (11)), then the time-shifted process t ↦ X(t − T) has harmonisable representation

$$
X(t - T) = \int_{\mathbb{R}} dS(\omega)\, e^{-i\omega T} e^{i\omega t}, \tag{38}
$$

in other words its spectral process has increments $dS(\omega)e^{-i\omega T}$. Consequently, it has the same spectral measure and the same distribution as X. The direct generalisation of this fact is that any process given by

$$
Y(t) = \sum_k a_k X(t - T_k), \tag{39}
$$

with deterministic $a_k$, $\sum_k |a_k|^2 < \infty$, has a harmonisable representation

$$
Y(t) = \int_{\mathbb{R}} dS(\omega) \sum_k a_k e^{-i\omega T_k} e^{i\omega t}, \tag{40}
$$

therefore it has spectral measure $\sigma(d\omega)\left|\sum_k a_k e^{-i\omega T_k}\right|^2$ and inherits the ergodic properties of the process X. Processes of this form appear directly e.g. in biological applications [18].

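Because the mean square displacement is the variance of the process Y(t) = X(t + ∆) − X(t), we can use the above results and obtain

$$
\delta^2(\Delta) = E|X(t + \Delta) - X(t)|^2 = \int_{\mathbb{R}} \sigma(d\omega) \left|e^{i\omega\Delta} - 1\right|^2 = 4\int_{\mathbb{R}} \sigma(d\omega) \sin^2\!\left(\frac{\omega\Delta}{2}\right). \tag{41}
$$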
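The spectral gain of a combination of time shifts, and its special case behind the mean square displacement formula, can be evaluated directly. The following sketch uses hypothetical function names; it computes $|\sum_k a_k e^{-i\omega T_k}|^2$ and compares the increment case $a = (1, -1)$, $T = (-\Delta, 0)$ with $4\sin^2(\omega\Delta/2)$.

```python
import cmath
import math


def shift_filter_gain(omega, coeffs, shifts):
    """Spectral gain |sum_k a_k exp(-i*omega*T_k)|^2 of Y(t) = sum_k a_k X(t - T_k)."""
    return abs(sum(a * cmath.exp(-1j * omega * t_k) for a, t_k in zip(coeffs, shifts))) ** 2


def increment_gain(omega, delta):
    """Closed form of the gain for the increment X(t + delta) - X(t)."""
    return 4.0 * math.sin(0.5 * omega * delta) ** 2


for omega in (0.0, 0.7, 3.1):
    print(shift_filter_gain(omega, (1.0, -1.0), (-1.3, 0.0)), increment_gain(omega, 1.3))
```

The gain vanishes at ω = 0, so taking increments removes a constant spectral component at zero frequency.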

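Moreover, taking the limit $\lim_{h\to 0}(X(t + h) - X(t))/h$ one obtains the harmonisable representation of the mean-square derivative

$$
\frac{d}{dt}X(t) = i\int_{\mathbb{R}} dS(\omega)\, \omega e^{i\omega t}, \tag{42}
$$

which exists if and only if $\int \sigma(d\omega)\, \omega^2 < \infty$. Analogously, any process given by

$$
Y(t) = \sum_k a_k \frac{d^k}{dt^k} X(t), \tag{43}
$$

with $\sum_k |a_k|^2\omega^{2k} < \infty$ for ω in the support of σ, has harmonisable representation

$$
Y(t) = \int_{\mathbb{R}} dS(\omega) \sum_k a_k (i\omega)^k e^{i\omega t}, \tag{44}
$$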


and spectral measure $\sigma(d\omega)\left|\sum_k a_k(i\omega)^k\right|^2$. Because $\left|\sum_k a_k(i\omega)^k\right|^2$ is a continuous function defined on the whole $\mathbb{R}$, the process Y inherits the ergodic properties of X. The non-ergodicity of X may be absent in Y, because the function $\omega \mapsto \left|\sum_k a_k(i\omega)^k\right|^2$ may have zeros, and if such a zero coincides with the position of a spectral point of X, the process Y does not contain this spectral point. The simplest such case is when X has exactly one spectral point at ω = 0, that is, it contains a time-independent Gaussian constant $X_0$. In this situation any time-averaged observable which depends on the mean of X does not converge to the ensemble average. However, the derivative $\frac{d}{dt}X(t)$, corresponding to the spectral gain function $\omega \mapsto \omega^2$, does not contain $X_0$ and is ergodic.

Similar reasoning generalises formula (39). For the convolution

$$
Y(t) = g * X(t) = \int_{\mathbb{R}} ds\, g(t - s)X(s) \tag{45}
$$

it yields

$$
Y(t) = \int_{\mathbb{R}} dS(\omega)\, \hat g(\omega) e^{i\omega t}, \tag{46}
$$

where $\hat g$ is the Fourier transform of $g \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Analogously to the previous case, Y inherits the ergodic properties of X. If zeros of $\hat g$ coincide with the spectral points of X, the process Y may be ergodic whereas X is not. Moreover, if the singular non-mixing measure of X is contained in the domain outside of the support of $\hat g$, Y may be mixing when X is non-mixing.

Special care must be taken in the more general case $g \in L^2(\mathbb{R})$ but $g \notin L^1(\mathbb{R})$. The Fourier transform of such g may have jumps. The well-known result from Fourier theory [19] states that if $\hat g$ has a jump at $\omega_0$, then $\int_{-T}^{T} dt\, g(t)e^{i\omega_0 t}$ converges as T → ∞ to the value $(\hat g(\omega_0^-) + \hat g(\omega_0^+))/2$, i.e. to the exact middle of the discontinuity of $\hat g$. If the process X has a spectral point at the exact frequency $\omega_0$, then for the process Y this spectral point will be modulated by $|\hat g(\omega_0^-) + \hat g(\omega_0^+)|^2/4$, as long as the filter is applied symmetrically as a limit of convolutions with functions g which are supported on the interval [−T, T].

The representation (46) vastly increases the number of models for which it is easy to study ergodicity using Fourier methods, as convolution is one of the most often chosen methods to model time-invariant linear responses of a system (see also Section 4.2). One of the most common examples that appear in practice is g being a one- or two-sided exponential decaying with rate $\lambda_d$. These choices correspond to the spectral responses $1/(\lambda_d^2 + \omega^2)$ or $4\lambda_d^2/(\lambda_d^2 + \omega^2)^2$, respectively.

One other practical consequence is that one can filter out non-ergodicity from the data. Using estimators of the power spectral density (e.g. the periodogram [20]) the locations of spectral points can be estimated and the corresponding non-ergodicity removed by using any filter whose spectral gain function has zeros at their frequencies. The simplest choice of such a filter is the smoothing

$$
\tilde X(t) = \int_{t-\pi/\omega_0}^{t+\pi/\omega_0} ds\, X(s), \tag{47}
$$

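which integrates the spectral component at $\omega_0$ over its period, therefore removing it. It corresponds to the filter $g_{\omega_0}$ and spectral gain $\hat g_{\omega_0}$

$$
g_{\omega_0}(t) = \begin{cases} 1, & |t| \le \pi/\omega_0, \\ 0, & |t| > \pi/\omega_0, \end{cases} \qquad \hat g_{\omega_0}(\omega) = \sqrt{\frac{2\omega_0^2}{\pi^3}}\, \operatorname{sinc}\!\left(\frac{\omega\pi}{\omega_0}\right). \tag{48}
$$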

For multiple spectral points one may use the filter $g_{\omega_1} * g_{\omega_2} * \dots * g_{\omega_N}$ or any other with a suitable spectral gain. The gain $|\hat g|^2$ behaves like $\sim(\omega - \omega_0)^2$ near $\omega_0$, therefore it removes the spectral point $\omega_0$ in a numerically stable manner. If one is more interested in a sure removal of the non-ergodicity near the location $\omega_0$ than in not distorting the spectrum, one can use a spectral gain which is flatter around $\omega_0$, e.g. using a triangular function filter guarantees the asymptotic behaviour $\sim(\omega - \omega_0)^4$. Another useful choice is the spectral gain

$$
\hat g(\omega) = \begin{cases} 1 - e^{-(\omega-\omega_0)^2/c}, & |\omega| \le L; \\ 0, & |\omega| > L, \end{cases} \tag{49}
$$

which allows for calibration of the level of distortion around $\omega_0$ (parameter c) and of the frequency cut-off (parameter L). The corresponding filter can be expressed using error, Gaussian and trigonometric functions, so it can be easily computed for the purpose of statistical usage.
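The action of the smoothing filter (47) on a single harmonic component can be verified numerically. The sketch below is illustrative only: the function name, the midpoint quadrature and the test signals are assumptions. It integrates a signal over the window [t − π/ω0, t + π/ω0] and shows that a component at ω0 is removed while a component at another frequency survives.

```python
import math


def smoothed(signal, t, omega0, n_steps=2000):
    """Midpoint-rule approximation of the integral of signal(s)
    over the window [t - pi/omega0, t + pi/omega0]."""
    half_width = math.pi / omega0
    h = 2.0 * half_width / n_steps
    return h * sum(signal(t - half_width + (k + 0.5) * h) for k in range(n_steps))


def at_omega0(s):
    # Harmonic component exactly at omega0 = 1.5 with an arbitrary phase.
    return 3.0 * math.cos(0.4 + 1.5 * s)


def elsewhere(s):
    # Harmonic component at a different frequency, 0.5.
    return math.cos(0.5 * s)


print(smoothed(at_omega0, 2.0, 1.5))   # close to zero: the component is removed
print(smoothed(elsewhere, 0.0, 1.5))   # non-zero: the component passes through
```

The second value should be close to 4 sin(π/3), the exact integral of cos(0.5 s) over the window centred at zero.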

The above approach can be understood as follows: instead of using the original observables f(X), one uses the modified observable $\tilde f(X) = f(Y)$, for which the time- and ensemble-averages coincide, even when this is not generally the case. Analysing the transformed process Y instead of X may be more difficult, as properties of X are distorted by filtering, albeit in a controlled manner. However, for a small number of spectral points it is manageable; moreover, it can be used as an effective method of localising spectral points: if the observables of the filtered process behave ergodically, it is a statistical verification of a good choice of these locations. Further analysis can be performed on the filtered process Y, which is ergodic, or by staying with the original X and using methods from Section 3.2.

4.2. Linear response systems and Langevin equations

Probably the most basic example of an ergodic Gaussian process is the solution of the classical Langevin equation

$$
\frac{d}{dt}X(t) = -\lambda X(t) + \xi(t), \tag{50}
$$

governed by white Gaussian noise ξ(t) defined as increments of the Brownian motion, $dt\,\xi(t) = dB(t)$. The stationary solution of this equation is given by the convolution

$$
X(t) = \int_{\mathbb{R}} ds\, \xi(s)\, G(t - s) = \int_{\mathbb{R}} dB(s)\, G(t - s), \tag{51}
$$

where the Green function G is given by the corresponding deterministic problem

$$
\frac{d}{dt}G(t) = -\lambda G(t) + \delta(t). \tag{52}
$$

Here δ is the Dirac delta considered as a distribution and the whole equation is interpreted in the distributional sense. The solution is called causal when G(t) = 0 for t < 0 and the process X at time t depends only on past values of the noise, i.e. dB(s) for s ≤ t. In such a case the solution is the one-sided exponential decay $G(t) = e^{-\lambda t}$ for t ≥ 0. It is stationary, therefore it must have a harmonisable representation. The Fourier transform of the Green function must satisfy $i\omega\hat G = -\lambda\hat G + 1$, so it must be equal to $\hat G(\omega) = (\lambda + i\omega)^{-1}$, and the harmonisable representation is

$$
X(t) = \int_{\mathbb{R}} d\hat B(\omega)\, \frac{1}{\lambda + i\omega}\, e^{i\omega t}, \tag{53}
$$

where $d\hat B$ is white Gaussian noise in Fourier space (which can be interpreted as a generalised Fourier transform of dB, see [21]). Therefore the solution is mixing and has power spectral density $(\lambda^2 + \omega^2)^{-1}$.

The above result is much more general. The linear differential system

$$
a_k\frac{d^k}{dt^k}X(t) + a_{k-1}\frac{d^{k-1}}{dt^{k-1}}X(t) + \dots + a_0X(t) = \xi(t), \tag{54}
$$

has a causal stationary solution given by a proper Green's function, with harmonisable representation

$$
X(t) = \int_{\mathbb{R}} dS(\omega)\, \frac{1}{\sum_{j=0}^{k} a_j(i\omega)^j}\, e^{i\omega t}, \tag{55}
$$

where $dS(\omega) = d\omega\,\hat\xi(\omega)$ is a spectral process of the stationary Gaussian noise ξ, which in general may not be white noise. It is clear that X inherits the ergodic properties of ξ. In contrast to the case considered in Section 4.1, it cannot be ergodic when ξ is not, as the rational function $1/\left|\sum_j a_j(i\omega)^j\right|^2$ does not have zeros.
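The Fourier transform of the causal Green function of the classical Langevin equation (50) can be checked by direct quadrature. This is a minimal sketch, assuming the convention in which the transform is the integral of $G(t)e^{-i\omega t}$; the truncation time t_max, the step count and the function name are illustrative choices.

```python
import cmath


def green_fourier_transform(omega, lam, t_max=15.0, n_steps=60000):
    """Trapezoid-rule approximation of the integral over [0, t_max] of
    exp(-lam * t) * exp(-1j * omega * t), i.e. the transform of the causal G."""
    h = t_max / n_steps

    def integrand(t):
        return cmath.exp(-(lam + 1j * omega) * t)

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for k in range(1, n_steps):
        total += integrand(k * h)
    return h * total


lam, omega = 3.0, 2.0
numeric = green_fourier_transform(omega, lam)
exact = 1.0 / (lam + 1j * omega)
print(numeric, exact)
print(abs(numeric) ** 2, 1.0 / (lam ** 2 + omega ** 2))  # power spectral density
```

The squared modulus reproduces the power spectral density $(\lambda^2 + \omega^2)^{-1}$, which is strictly positive for all ω, in agreement with the remark that such a gain has no zeros.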


Very similar reasoning applies to systems with a richer memory structure, namely those described by the generalised Langevin equation [22]

$$
\frac{d}{dt}X(t) = -\int_0^t ds\, K(t - s)X(s) + \xi(t). \tag{56}
$$

The equation in the above form does not have a stationary solution. However, its non-stationary solution, determined, again, by the corresponding Green function

$$
X(t) = X_0G(t) + \int_0^t ds\, \xi(s)\, G(t - s), \tag{57}
$$

converges pointwise as t → ∞ to the stationary process given by

$$
X(t) = \int_{-\infty}^{t} ds\, \xi(s)\, G(t - s). \tag{58}
$$

The above integrals and convergence are defined in the mean-square or almost sure sense, depending on the regularity of the memory kernel K and the noise ξ [23,24]. This process is a solution of the stationary generalised Langevin equation

$$
\frac{d}{dt}X(t) = -\int_{-\infty}^{t} ds\, K(t - s)X(s) + \xi(t). \tag{59}
$$

In this form we recognise the convolution of the process X with the kernel K understood as a causal function. For $K \in L^2(\mathbb{R})$, the harmonisable representation of the solution is, analogously to the previous results,

$$
X(t) = \int_{\mathbb{R}} dS(\omega)\, \frac{1}{i\omega + \hat K(\omega)}\, e^{i\omega t}, \tag{60}
$$

where once again $dS(\omega) = d\omega\,\hat\xi(\omega)$. Without any change to the previous considerations, the solution X inherits the ergodic properties from the noise ξ. The formula (60) is valid even for some kernels $K \notin L^2(\mathbb{R})$. A widely used model of subdiffusion is the generalised Langevin equation in which ξ is fractional Brownian noise [23,25] and the kernel is $K(t) = t^{-\alpha}$, 0 < α < 2. One can still use the Fourier theory for functions outside $L^2(\mathbb{R})$, where $\hat K(\omega) = |\omega|^{\alpha-1}\Gamma(1-\alpha)e^{i\,\mathrm{sgn}(\omega)\pi(\alpha-1)/2}$ [19], substitute it into Eq. (60) and obtain a valid result [23].

The above considerations apply also to the full form of the generalised Langevin equation [22,23,25] with the external potential

$$
m\frac{d^2}{dt^2}X(t) = -\lambda X(t) - \int_{-\infty}^{t} ds\, K(t - s)\frac{d}{ds}X(s) + \xi(t), \tag{61}
$$

in which the similarity to the Newton equation is more visible. When modelling diffusion by Eq. (59) the process X describes velocity, and its integral, the position, is not ergodic, and not even stationary (the particle is not confined). For Eq. (61) the process X is the position itself, and due to the confining term −λX(t) it is stationary and ergodic when the driving noise ξ is.

The Fourier space approach also gives additional insight into the physical origin of the non-ergodicity. In many applications based on classical or quantum statistical models, the generalised Langevin equation is derived from the bath of harmonic oscillators model. It is often called the Kac–Zwanzig model for classical systems [22,23,26,27] and the Caldeira–Leggett model for quantum systems [28,29]. In any case the Hamiltonian depends on the macroscopic coordinates X, P and the microscopic oscillators $\{q_j, p_j\}_j$, and is given by

$$
H = \frac{P^2}{2M} + \sum_j \frac{p_j^2}{2m_j} + \sum_j \frac{m_j\omega_j^2}{2}q_j^2 + \sum_j \gamma_j q_j X, \tag{62}
$$

with the possible addition of a macroscopic potential acting on X. The above sum may be infinite as long as the total energy H is finite. The analytic formula for the noise ξ may be obtained by solving the corresponding Hamilton equations [22,23], and it has a purely point spectrum supported on the self-frequencies of the bath harmonic oscillators

$$
\sigma_\xi(d\omega) = \sum_j \frac{m_j\gamma_j^2}{2\omega_j^2}\big(\delta(d\omega - \omega_j) + \delta(d\omega + \omega_j)\big). \tag{63}
$$

This fact already prohibits ergodicity of the solution. This is not surprising from the point of view of abstract ergodic theory, which excludes ergodicity for systems governed by quadratic Hamiltonians such as that of a discrete heat bath [10]. Moreover, even the stationary solution does not exist. Eq. (59) is not directly derived from the Kac–Zwanzig model. Instead, one can strictly derive only its non-stationary variants like Eq. (56), in which the convolution integral is taken from 0 to t and initial conditions must be provided. The stationary equation (59) is a long-time limit of the former one, but only under the assumption that the stationary solution exists. This is not generally the case.

In this model, the fluctuation–dissipation theorem holds and states that the kernel K is equal to the covariance function of ξ. For a finite number of oscillators this is a finite sum of cosines, which is periodic. Even for an infinite number of oscillators the process ξ is non-ergodic, so both its covariance function and the kernel K do not decay at infinity (see mixing condition (8)). It means that even in the thermodynamic limit N → ∞ the Kac–Zwanzig model cannot describe the long-time asymptotics of the most often used memory functions. These can nonetheless be approximated by this model on finite time scales, most simply by the spectral points which form the Fourier series of a given covariance function in a fixed interval.

A non-decaying kernel also causes problems with divergence of the convolution integral in Eq. (59). There is no spectral measure that would fulfil Eq. (59) or (61), so there cannot be a stationary solution in the classical sense. The model, however, is still a valid derivation of the stationary but non-ergodic stochastic noise ξ. Only the solution of the generalised Langevin equation cannot be stationary if the memory kernel is given by the fluctuation–dissipation theorem.

The amplitudes of the spectral points depend on the energies of the harmonic oscillators. If these may be considered small, under proper rescaling the bath may be replaced by a smooth field φ, π. Under this assumption the system is described by the field Hamiltonian [30]

$$
H = \frac{P^2}{2M} + \frac{1}{2}\int dx\, |\pi(x)|^2 + \frac{1}{2}\int dx\, |\varphi'(x)|^2 + \int dx\, d(x)\varphi'(x)X, \tag{64}
$$

where the function d, called the density of states, couples the field to the coordinate X and the conjugated momentum P. The function d must be square integrable for the last term to be finite. The generalised Langevin equation can still be derived and has the same form as in the previous case, but this time the noise ξ has a power spectral density $|\hat d|^2$. The system is mixing. Besides that there are no essential restrictions on the type of memory determined by this model. The Fourier transform is a bijection from $L^2(\mathbb{R})$ to $L^2(\mathbb{R})$, so in this model $|\hat d|^2$ can be any $L^1(\mathbb{R})$ function [31]. The possible corresponding covariance functions form a wide space which encompasses the commonly used exponential and power law decays.

The most important conclusion from this comparison of these two related models is that theergodicity is determined by the type of the heat bath in the physical model: the discrete or the fieldone.

5. Comments about non-Gaussian case

In previous sections we concentrated on Gaussian processes. Here we want to briefly comment on the limitations of this methodology. The non-linear memory structure of a non-Gaussian process is in general so complex that analysing ergodicity is complicated. Even negative results are not commonly available. Nevertheless, some generalisations can be made. First, any harmonic process

$$
X(t) = \sum_k R_k\cos(\Theta_k + \omega_k t) + X_0, \tag{65}
$$

is non-ergodic for random $R_k$, as the analysis performed in Section 2.2 does not depend on the precise distribution of the $R_k$'s. These processes are also stationary, because under any time-shift τ the joint distribution of the phases does not change: $[\Theta_1 + \tau\omega_1, \Theta_2 + \tau\omega_2, \dots] \pmod{2\pi} \overset{d}{=} [\Theta_1, \Theta_2, \dots]$. Deterministic $R_k$'s can be treated as a degenerate case of Gaussian variables with zero variance, so in this specific case the process is ergodic if the frequencies $\omega_k$ are incommensurable.

Ergodic properties of the general class of infinitely divisible processes were described using generalised memory functions such as the codifference and the correlation cascade [32–35]. For this class, the notion of a harmonisable process is also in use and is defined by the analogous formula [8,13]

$$
X(t) = \int_{\mathbb{R}} dS(\omega)\, e^{i\omega t}, \tag{66}
$$

where S is a rescaling of a Lévy process L, i.e. a process with independent stationary increments, for which additionally the increments are rotationally invariant.

When dS(ω) = dL(ω)f(ω), the integral (66) may be defined using a Poisson random measure [36], and if the Lévy process L has a second moment, its covariance structure is the same as for a harmonisable Gaussian process with power spectral density $|f|^2$. The process X is generally non-ergodic, but the time-averaged covariance function converges in the mean-square sense to the ensemble one if and only if

$$
\frac{1}{T^2}\int_0^T dt_1\int_0^T dt_2\, E[X(t_1 + \Delta)X(t_2 + \Delta)X(t_1)X(t_2)] \xrightarrow{T\to\infty} 0, \tag{67}
$$

which is a direct application of the second order ergodic theorem [37]. In practice it is most often expected that this equation is fulfilled, and the detection of non-ergodicity in this case requires using higher-order observables.

In the case of stable processes, dS(ω) is interpreted as a rotationally invariant α-stable random measure with some control measure $\sigma_\alpha(d\omega)$. The stable harmonisable processes are stationary; however, they are distinct from the stable moving-average processes, in particular solutions of Langevin equations [13]. This fact by itself limits their practical applications. Moreover, they are all non-ergodic, as shown in [38]. Still, there is an analogue of the elegant representation of the covariance structure of Gaussian harmonisable processes. For stationary stable processes, instead of covariance, the codifference function

$$
\tau(t) := 2\ln E\big[e^{iX(0)}\big] - \ln E\big[e^{i(X(t)-X(0))}\big] \tag{68}
$$

is used, together with notions of long- and short-range dependence analogous to those used for covariance. For harmonisable processes the codifference is given by a relatively simple formula, which generalises the Gaussian result from Section 2.1

$$
\tau(t) = 4^\alpha\int_{\mathbb{R}} \sigma_\alpha(d\omega) \left|\sin\!\left(\frac{\omega t}{2}\right)\right|^\alpha - 2\int_{\mathbb{R}} \sigma_\alpha(d\omega), \tag{69}
$$

which is a particular case of a more general result proven in Appendix B. This quantity may be used in practice to study the memory structure of harmonisable stable processes. However, the behaviour of the time-averaged version of (68) has not yet been sufficiently studied.

6. Summary

The main mathematical results of this work are Theorems 5 and 6 from Section 3.2. These two results determine the asymptotic behaviour of time-averages which correspond to physically important properties for an important and large class of models. Using these theorems we show that in order to thoroughly study the ergodicity, non-linear statistics must be used. Therefore we propose a new tool, the time-averaged characteristic function, and show the theoretical properties of this function in some important cases. The validity of our methods is checked using Monte-Carlo simulations.

The general overview of the spectral theory of Gaussian processes and its links to ergodic theory is also an important part of the paper. We try to provide a physical interpretation of many mathematical results in this field. We use them to explain the behaviour of solutions of the generalised Langevin equation and their relation to the underlying physical models of the heat bath. We also show a few other examples of systems described by the spectral theory, such as the process with Cantor spectral measure, which exhibits unusual recurrent correlations.

Acknowledgement

The research was supported by NCN Maestro Grant No. 2012/06/A/ST1/00258.

Appendix A. Proof of the generalised Maruyama ergodic theorem

Proof of Theorem 5. We will combine methods for trigonometric series presented in [39] and the Gaussian ergodic theorem from Section 3.9 of [4].

We will use the representation

$$
X(t) = X_{\mathrm{erg}}(t) + \sum_{k=1}^{N} R_k\cos(\Theta_k + \omega_k t) + X_0, \qquad \omega_k \neq 0, \tag{A.1}
$$

where the $\omega_k$ are distinct. The process $X_{\mathrm{erg}}(t)$ has no spectral points. The full distribution of X is generated by the values $X_{\mathrm{erg}}(t_l)$, $R_j\cos(\Theta_j + \omega_j t_j)$ and $X_0$. Therefore it is sufficient to study the time-averaged distribution of the sum

$$
\sum_{l=1}^{L} \theta_l X_{\mathrm{erg}}(t + t_l) + \sum_{j=1}^{N} \lambda_j R_j\cos(\Theta_j + \omega_j t_j + \omega_j t) + \lambda_0 X_0. \tag{A.2}
$$

For brevity we denote $\tilde\Theta_j := \Theta_j + \omega_j t_j$. As the distribution is uniquely determined by the corresponding characteristic function, we will compare the time-averaged characteristic function and the ensemble-averaged one given the condition $\sigma(X_0, \{R_k\})$. Their equality will prove the theorem.

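First we calculate the conditional ensemble-averaged characteristic function. The variables $\tilde\Theta_j$ have the same distribution as $\Theta_j$ modulo 2π, because they are independent of each other, of the $R_j$'s, $X_0$ and $X_{\mathrm{erg}}$, and have marginal uniform distribution. We get

$$
\begin{aligned}
E\Big[e^{i\left(\sum_{l=1}^{L}\theta_l X_{\mathrm{erg}}(t+t_l) + \sum_{j=1}^{N}\lambda_j R_j\cos(\tilde\Theta_j+\omega_j t) + \lambda_0 X_0\right)} \,\Big|\, X_0, \{R_k\}\Big]
&= E\Big[e^{i\sum_{l=1}^{L}\theta_l X_{\mathrm{erg}}(t_l)}\Big] \prod_{j=1}^{N} E\Big[e^{i\lambda_j R_j\cos(\Theta_j)} \,\Big|\, R_j\Big]\, e^{i\lambda_0 X_0} \\
&= \varphi_{\theta_1,\dots,\theta_L} \prod_{j=1}^{N} J_0(R_j\lambda_j)\, e^{i\lambda_0 X_0}, \qquad \varphi_{\theta_1,\dots,\theta_L} := E\Big[e^{i\sum_{l=1}^{L}\theta_l X_{\mathrm{erg}}(t_l)}\Big],
\end{aligned} \tag{A.3}
$$

where $J_0$ are Bessel functions of the first kind and order 0; they stem from the formula

$$
\frac{1}{2\pi}\int_{-\pi}^{\pi} dx\, e^{i\lambda R\cos(x)} = J_0(\lambda R). \tag{A.4}
$$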
The rest of the proof will be the calculation of the time-average. We denote

$$
\Phi(t) := e^{i\sum_{l=1}^{L}\theta_l X_{\mathrm{erg}}(t+t_l)}, \tag{A.5}
$$

which appears in the integral used during the calculation of the time-average

$$
I_T := \int_0^T d\tau\, \Phi(\tau)\, e^{i\sum_{j=1}^{N}\lambda_j R_j\cos(\tilde\Theta_j+\omega_j\tau)}\, e^{i\lambda_0 X_0}. \tag{A.6}
$$

The factor $e^{i\lambda_0 X_0}$ already agrees with the conditional average, so we will assume $X_0 = 0$ later on for brevity.


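Next we expand each exponent of cosine using the Jacobi–Anger identity

$$
e^{iz\cos(w)} = \sum_{m=-\infty}^{\infty} i^m e^{imw} J_m(z), \tag{A.7}
$$

obtaining

$$
\begin{aligned}
I_T &= \int_0^T d\tau\, \Phi(\tau) \prod_{j=1}^{N} \sum_{m=-\infty}^{\infty} i^m e^{im(\tilde\Theta_j+\omega_j\tau)} J_m(\lambda_j R_j) \\
&= \int_0^T d\tau\, \Phi(\tau) \sum_{S\in M_N} \prod_{m_j\in S} e^{im_j(\tilde\Theta_j+\omega_j\tau)}\, i^{m_j} J_{m_j}(\lambda_j R_j) \\
&= \int_0^T d\tau\, \Phi(\tau) \sum_{S\in M_N} \exp\Big(i\sum_{m_j\in S} m_j(\tilde\Theta_j + \omega_j\tau)\Big) \prod_{m_j\in S} i^{m_j} J_{m_j}(\lambda_j R_j) \\
&= \sum_{S\in M_N} \exp\Big(i\sum_{m_j\in S} m_j\tilde\Theta_j\Big) \prod_{m_j\in S} i^{m_j} J_{m_j}(\lambda_j R_j) \int_0^T d\tau\, \Phi(\tau) \exp\Big(i\tau\sum_{m_j\in S} m_j\omega_j\Big),
\end{aligned} \tag{A.8}
$$

where $M_N$ is the family of all N-element subsets of integers S. Exchanging the order of the infinite sum and the integral is possible because the integrated function is bounded by 1, which also justifies commuting the sum and the limit in the next step. We shall denote $\Omega_S := \sum_{m_j\in S} m_j\omega_j$ and check that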

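$$
\lim_{T\to\infty} \frac{1}{T}\int_0^T d\tau\, \Phi(\tau)\exp(i\tau\Omega_S) = 0, \quad \text{if } \Omega_S \neq 0. \tag{A.9}
$$

The proof of this statement is given in the lemma below. Let us conclude the whole proof. Because the $\omega_k$ are rationally incommensurable, the equality $\Omega_S = \sum_{m_j\in S} m_j\omega_j = 0$ holds for integer $m_j$ only when $m_1 = m_2 = \dots = m_N = 0$. In the sum $\sum_{S\in M_N}$ only one element $S = \{0, 0, \dots, 0\}$ remains and

$$
\lim_{T\to\infty} \frac{1}{T}I_T = \prod_{j=1}^{N} J_0(\lambda_j R_j) \lim_{T\to\infty} \frac{1}{T}\int_0^T d\tau\, \Phi(\tau) = \prod_{j=1}^{N} J_0(\lambda_j R_j)\, \varphi_{\theta_1,\dots,\theta_L}, \tag{A.10}
$$

where the last equality holds due to the ergodicity of $X_{\mathrm{erg}}$. This is the desired conditional mean.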

Lemma 1.

$$
\lim_{T\to\infty} \frac{1}{T}\int_0^T d\tau\, \Phi(\tau)\exp(i\tau\Omega) = 0, \tag{A.11}
$$

for Ω ≠ 0 and Φ given by (A.5).

Proof. First note that Φ is a strictly stationary random process. Take U ∼ U([0, 2π)) independent of Φ. The process $t \mapsto \Phi(t)\exp(i(U + \Omega t))$ is also stationary and has finite first absolute moment equal to 1. Therefore, the Birkhoff ergodic theorem guarantees the almost sure existence of the time-average

$$
\lim_{T\to\infty} \frac{1}{T}\int_0^T d\tau\, \Phi(\tau)\exp(i(U + \tau\Omega)) = \exp(iU)\lim_{T\to\infty} \frac{1}{T}\int_0^T d\tau\, \Phi(\tau)\exp(i\tau\Omega) = X(\Omega). \tag{A.12}
$$

So, the limit (A.11) also exists almost surely and equals the random variable exp(−iU)X(Ω) (for more details on X(Ω) see Doob [3], Chapter XI.2, page 516). We will prove that it is 0.

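Let r be the covariance function of $X_{\mathrm{erg}}$ and σ its continuous spectral measure. We will study $E|\cdot|^2$ of the above time-average and show its mean-square convergence to 0, which suffices to prove also almost sure convergence to the same limit.

$$
\begin{aligned}
&\frac{1}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, E\Big[\exp\Big(i\sum_{j=1}^{L}\theta_j\big(X_{\mathrm{erg}}(t_j + \tau_1) - X_{\mathrm{erg}}(t_j + \tau_2)\big)\Big)\Big]\, e^{i\Omega(\tau_1-\tau_2)} \\
&\quad= \frac{C}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, \exp\Big(\sum_{j,k=1}^{L}\theta_j\theta_k\, r(t_k - t_j + \tau_1 - \tau_2)\Big)\, e^{i\Omega(\tau_1-\tau_2)} \\
&\quad= \frac{C}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, \exp\Big(\int_{\mathbb{R}} d\tilde\sigma(\omega)\, e^{i\omega(\tau_1-\tau_2)}\Big)\, e^{i\Omega(\tau_1-\tau_2)},
\end{aligned} \tag{A.13}
$$

where we denoted by C the factor before the integral and by $\tilde\sigma$ the modified spectral measure; it is just σ multiplied by a continuous function:

$$
C = \exp\Big(-\sum_{j,k=1}^{L}\theta_j\theta_k\, r(t_k - t_j)\Big), \qquad d\tilde\sigma(\omega) := d\sigma(\omega)\Big|\sum_{j=1}^{L}\theta_j e^{i\omega t_j}\Big|^2. \tag{A.14}
$$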

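Next, we expand the external exp(·) into a Taylor series, obtaining

$$
\begin{aligned}
&\frac{C}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, \Big(1 + \sum_{n=1}^{\infty}\frac{1}{n!}\Big(\int_{\mathbb{R}} d\tilde\sigma(\omega)\, e^{i\omega(\tau_1-\tau_2)}\Big)^n\Big)\, e^{i\Omega(\tau_1-\tau_2)} \\
&\quad= \frac{C}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, \Big(1 + \sum_{n=1}^{\infty}\frac{1}{n!}\int_{\mathbb{R}} d\tilde\sigma^{*n}(\omega)\, e^{i\omega(\tau_1-\tau_2)}\Big)\, e^{i\Omega(\tau_1-\tau_2)},
\end{aligned} \tag{A.15}
$$

where $\tilde\sigma^{*n}$ is the n-fold convolution power of $\tilde\sigma$. This Taylor series is uniformly convergent. We commute the limit T → ∞ with the sum and calculate the integrals; for the first term (the constant 1) we have

$$
\frac{1}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, e^{i\Omega(\tau_1-\tau_2)} = \frac{1}{T^2}\frac{1}{\Omega^2}\left|e^{i\Omega T} - 1\right|^2 \xrightarrow{T\to\infty} 0, \tag{A.16}
$$

where the assumption Ω ≠ 0 is crucial. For any other term

$$
\begin{aligned}
&\lim_{T\to\infty}\frac{1}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\int_{\mathbb{R}} d\tilde\sigma^{*n}(\omega)\, e^{i(\omega+\Omega)(\tau_1-\tau_2)} \\
&\quad= \lim_{T\to\infty}\frac{1}{T^2}\int_{\mathbb{R}} d\tilde\sigma^{*n}(\omega - \Omega)\int_0^T d\tau_1\int_0^T d\tau_2\, e^{i\omega(\tau_1-\tau_2)} \\
&\quad= 2\lim_{T\to\infty}\int_{\mathbb{R}} d\tilde\sigma^{*n}(\omega - \Omega)\, \frac{1 - \cos(\omega T)}{(\omega T)^2}.
\end{aligned} \tag{A.17}
$$

In the last line one recognises the functional which returns the jump of the measure $\tilde\sigma^{*n}$ at the point Ω. But the measure $\tilde\sigma$ is continuous and $\tilde\sigma^{*n}$ is also continuous; the result is $\tilde\sigma^{*n}(\{\Omega\}) = 0$.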

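Proof of Theorem 6. Combining (A.8) and (A.9) we obtain the formula for the time-averaged characteristic function of the finite-dimensional distribution in the general case, which is

$$
\varphi_{\theta_1,\dots,\theta_L}\, e^{i\lambda_0 X_0} \sum_{S\in G_N} \exp\Big(i\sum_{m_j\in S} m_j\tilde\Theta_j\Big) \prod_{m_j\in S} i^{m_j} J_{m_j}(\lambda_j R_j); \tag{A.18}
$$

here $G_N$ are all N-element subsets of integers $m_j$ for which $\sum_{j=1}^{N} m_j\omega_j = 0$. What is left is to show that the above quantity equals the conditional ensemble-averaged characteristic function

$$
E\Big[e^{i\left(\sum_{l=1}^{L}\theta_l X_{\mathrm{erg}}(t+t_l) + \sum_{j=1}^{N}\lambda_j R_j\cos(\tilde\Theta_j+\omega_j t) + \lambda_0 X_0\right)} \,\Big|\, X_0, \{R_k\}, M\Big] = \varphi_{\theta_1,\dots,\theta_L}\, e^{i\lambda_0 X_0}\, E\Big[e^{i\sum_{j=1}^{N}\lambda_j R_j\cos(\tilde\Theta_j+\omega_j t)} \,\Big|\, \{R_k\}, M\Big]. \tag{A.19}
$$

The factor $\varphi_{\theta_1,\dots,\theta_L} e^{i\lambda_0 X_0}$ already agrees, so we will omit it later on. Next, we expand the remaining expected value using the Jacobi–Anger identity (A.7), obtaining

$$
\sum_{S\in M_N} \prod_{m_j\in S} i^{m_j} J_{m_j}(\lambda_j R_j)\, E\Big[e^{i\sum_{m_k\in S} m_k(\tilde\Theta_k+\omega_k t)} \,\Big|\, M\Big]. \tag{A.20}
$$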

Because $\theta \mapsto e^{i\theta}$ is an injection on [0, 2π) and the $\tilde\Theta_k$ differ from the $\Theta_k$ only by deterministic constants, the σ-algebra σ(M) is equivalent to $\sigma(\tilde M)$ generated by

$$
\tilde M = \Big\{ \exp\Big(i\sum_j m_j\tilde\Theta_j\Big) \,:\, \sum_j m_j\omega_j = 0 \Big\}. \tag{A.21}
$$

For terms with $S \in G_N \subset M_N$ the random phases in (A.20) are $\tilde M$-measurable, therefore also M-measurable; moreover, they agree with the corresponding terms in (A.18). What is left is to show that the expected value of the remaining terms, for $S \notin G_N$, is zero.

Now, for any incommensurable $\omega_j$ the corresponding $m_j = 0$. Commensurable $\omega_j$'s can be divided into subsets of jointly commensurable numbers, i.e. into blocks $\{\omega_{k_i}\}$ for which $\omega_{k_i} = \alpha q_{k_i}/p_{k_i}$, $q_{k_i} \in \mathbb{Z}$, $p_{k_i} \nmid q_{k_i}$; different blocks have different incommensurable factors α. Each such block corresponds to a different subset of independent $\Theta_j$, therefore they can be considered separately.

Let us choose one such block and, for simplicity of notation, change indices such that these are $\{\omega_1, \omega_2, \dots, \omega_r\}$. The condition $\sum_{j=1}^{r} m_j\omega_j = 0$ is equivalent to the condition $\sum_{j=1}^{r} m_j\eta_j = 0$, where the $\eta_j$ are relatively prime integers obtained by multiplying $\omega_j$ by the least common multiple of the $p_j$'s. The equation

$$
\sum_{j=1}^{r} m_j\eta_j = 0 \tag{A.22}
$$

has exactly r − 1 linearly independent solutions in integers [39]. For our one chosen block let us name these solutions $\{m_j^1\}, \{m_j^2\}, \dots, \{m_j^{r-1}\}$. Any other solution is a linear combination of the elementary solutions

$$
m_j = \sum_{\rho=1}^{r-1} \nu_\rho m_j^\rho, \qquad \nu_\rho \in \mathbb{Z}. \tag{A.23}
$$

Therefore

$$
\sum_{j=1}^{r} m_j\Theta_j = \sum_{j=1}^{r}\sum_{\rho=1}^{r-1} \nu_\rho m_j^\rho\Theta_j = \sum_{\rho=1}^{r-1} \nu_\rho \sum_{j=1}^{r} m_j^\rho\Theta_j, \tag{A.24}
$$

and for each block M actually depends only on r − 1 variables $\Xi_\rho = \sum_{j=1}^{r} m_j^\rho\Theta_j \pmod{2\pi}$; the rest of the variables are linear combinations of the elementary ones. For all blocks together M depends on at most N − 1 such variables.

The factor in the studied conditional expectancy corresponding to the chosen block is

E

⎡⎣exp(i

r∑j=1

mjΘj

)|Ξρ

r−1ρ=1

⎤⎦ , mj ∈ S. (A.25)


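Because $m_j \in S \in G_N$, the sum $\sum_{j=1}^{r} m_j\Theta_j \pmod{2\pi}$ is linearly independent of the set $\{\Xi_\rho\}_{\rho=1}^{r-1}$. We prove in the lemma below that this implies that the sum is also probabilistically independent of $\{\Xi_\rho\}_{\rho=1}^{r-1}$ and has uniform distribution on $[0, 2\pi)$. Therefore

\[
E\Bigl[\exp\Bigl(i\sum_{j=1}^{r} m_j\Theta_j\Bigr) \Bigm| \{\Xi_\rho\}_{\rho=1}^{r-1}\Bigr]
= E\Bigl[\exp\Bigl(i\sum_{j=1}^{r} m_j\Theta_j\Bigr)\Bigr]
= E\bigl[e^{i\Theta'}\bigr]
= \frac{1}{2\pi}\int_0^{2\pi} d\theta\, e^{i\theta} = 0, \qquad \Theta' \sim U(0, 2\pi). \tag{A.26}
\]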

We have proven that all elements in the sum (A.20) which contain combinations of $\Theta_j$ linearly independent of elements of $M$ are zero. Only $M$-dependent elements remain, which exactly agrees with the time-average characteristic function (A.18). This concludes the proof.

Lemma 2. For i.i.d. $\{\Theta_j\}_{j=1}^{N}$, $\Theta_j \sim U(0, 2\pi)$, any $N$ linearly independent integer combinations

\[
\Xi_i = \sum_{j=1}^{N} m_{ij}\Theta_j \pmod{2\pi}, \qquad m_{ij} \in \mathbb{Z} \tag{A.27}
\]

are a set of jointly independent random variables with distribution $\Xi_i \sim U(0, 2\pi)$.

Proof. Because we work in modulo $2\pi$ arithmetic, all variables can be considered to have values in the torus $\mathbb{T} = \mathbb{R}/2\pi\mathbb{Z}$ and $U(0, 2\pi) \equiv U(\mathbb{T})$. The continuous dual of $\mathbb{T}^N$ is $\mathbb{Z}^N$, and thus it is a natural space of parameters of the characteristic function of the vector $(\Xi_i)_{i=1}^{N}$.

For the uniform distribution on the torus the characteristic function has a very simple form: it is the Kronecker delta

\[
E\bigl[e^{ik\Theta_j}\bigr] = \delta_k, \qquad k \in \mathbb{Z}. \tag{A.28}
\]

This is clear if we think about the characteristic function as a Fourier series of the density $1/(2\pi)$ on $\mathbb{T}$. We will show that the multidimensional characteristic function of $(\Xi_i)_{i=1}^{N}$ is the product $\delta_{k_1}\cdots\delta_{k_N}$, which corresponds to the distribution $U(\mathbb{T}^N)$, that is, to i.i.d. uniform variables $\Xi_i$.

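Let us choose any $k_1, \ldots, k_N \in \mathbb{Z}$ and consider $\sum_i k_i\Xi_i$. We calculate the characteristic function

\[
\begin{aligned}
E\Bigl[\exp\Bigl(i\sum_{i=1}^{N} k_i\Xi_i\Bigr)\Bigr]
&= E\Bigl[\exp\Bigl(i\sum_{i=1}^{N} k_i\sum_{j=1}^{N} m_{ij}\Theta_j\Bigr)\Bigr]
 = E\Bigl[\exp\Bigl(i\sum_{j=1}^{N} \Theta_j\sum_{i=1}^{N} k_i m_{ij}\Bigr)\Bigr] \\
&= \prod_{j=1}^{N} E\Bigl[\exp\Bigl(i\Theta_j\sum_{i=1}^{N} k_i m_{ij}\Bigr)\Bigr]
 = \prod_{j=1}^{N} \delta_{\sum_{i=1}^{N} k_i m_{ij}}.
\end{aligned}
\tag{A.29}
\]

The above product equals 1 if and only if $\sum_{i=1}^{N} k_i m_{ij} = 0$ for all $j$; in all other cases it equals 0. But the integer combinations (A.27) are linearly independent, which is equivalent to saying that this holds if and only if $k_1 = k_2 = \cdots = k_N = 0$. So the above formula must be exactly $\delta_{k_1}\cdots\delta_{k_N}$, which was to be demonstrated.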
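The Kronecker-delta argument can be tried out on small integer matrices. The following sketch evaluates the product of deltas from (A.29) exactly and searches a finite grid of $k$ for nonzero vectors at which it equals 1. The function names, the two sample matrices and the search radius `k_max` are assumptions made for illustration; a finite grid can only exhibit counterexamples, it does not replace the proof.

```python
from itertools import product


def char_fn(k, m):
    """E[exp(i * sum_i k[i] * Xi_i)] for Xi_i = sum_j m[i][j] * Theta_j mod 2*pi.

    Following (A.29) this is a product of Kronecker deltas, one per Theta_j,
    each evaluated at the integer sum_i k[i] * m[i][j].
    """
    for j in range(len(m[0])):
        if sum(k[i] * m[i][j] for i in range(len(m))) != 0:
            return 0
    return 1


def nonzero_k_with_value_one(m, k_max=3):
    """Nonzero k in [-k_max, k_max]^N at which the characteristic function is 1."""
    grid = range(-k_max, k_max + 1)
    return [k for k in product(grid, repeat=len(m))
            if any(k) and char_fn(k, m) == 1]


independent = [[1, 1], [1, -1]]   # rows linearly independent (determinant -2)
dependent = [[1, 2], [2, 4]]      # second row is twice the first

print(nonzero_k_with_value_one(independent))  # []
print(nonzero_k_with_value_one(dependent))    # [(-2, 1), (2, -1)]
```

For the independent matrix only $k = 0$ gives the value 1, as the characteristic function of $U(\mathbb{T}^N)$ requires; for the dependent matrix the nonzero vectors found show that the $\Xi_i$ are not jointly uniform.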

Proposition 2. For any stationary Gaussian process $X$, the time-average covariance structure is the ensemble-average structure conditioned by the $\sigma$-algebra $\sigma(X_0, \{R_k\}_k)$, where $R_k$ are the amplitudes of the atoms of the spectral measure and $X_0$ is the constant term. The result is true even for rationally commensurable frequencies of spectral points.


Proof. We begin by calculating the conditional ensemble-average covariance. The conditional mean equals

\[
E\bigl[X(t) \mid X_0, \{R_k\}_k\bigr] = X_0, \quad \text{as } E[\cos(\Theta_k)] = 0 \text{ and } E[X_{\mathrm{erg}}(t)] = 0. \tag{A.30}
\]

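Next we fix $t_1, t_2$ and use the independence of $X_{\mathrm{erg}}$ and the $X_k$'s:

\[
\begin{aligned}
E\bigl[(X(t_1) - X_0)(X(t_2) - X_0) \mid X_0, \{R_k\}_k\bigr]
&= E\bigl[X_{\mathrm{erg}}(t_1)X_{\mathrm{erg}}(t_2) \mid X_0, \{R_k\}_k\bigr] + \sum_{k=1}^{\infty} E\bigl[X_k(t_1)X_k(t_2) \mid X_0, \{R_k\}_k\bigr] \\
&= E\bigl[X_{\mathrm{erg}}(t_1)X_{\mathrm{erg}}(t_2)\bigr] + \sum_{k=1}^{\infty} R_k^2\, E\bigl[\cos(\Theta_k + \omega_k t_1)\cos(\Theta_k + \omega_k t_2)\bigr] \\
&= r(t_2 - t_1) + \sum_{k=1}^{\infty} R_k^2\, \frac{1}{2\pi}\int_0^{2\pi} dx\, \cos(x + \omega_k t_1)\cos(x + \omega_k t_2) \\
&= r(t_2 - t_1) + \frac{1}{2}\sum_{k=1}^{\infty} R_k^2 \cos(\omega_k(t_2 - t_1)).
\end{aligned}
\tag{A.31}
\]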
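The phase average in the last step of (A.31) is easy to verify numerically. The sketch below is a minimal check, not part of the proof: `phase_average` and `conditional_covariance` are hypothetical names, and the exponential covariance used for the ergodic component is an arbitrary stand-in chosen only to make the example runnable.

```python
import math


def phase_average(omega, t1, t2, n=64):
    """Average of cos(x + omega*t1) * cos(x + omega*t2) over x in [0, 2*pi).

    The integrand is a trigonometric polynomial of degree 2 in x, so an
    n-point uniform grid with n >= 3 reproduces the integral up to rounding.
    """
    total = 0.0
    for i in range(n):
        x = 2.0 * math.pi * i / n
        total += math.cos(x + omega * t1) * math.cos(x + omega * t2)
    return total / n


def conditional_covariance(r_erg, amplitudes, omegas, t1, t2):
    """r(t2 - t1) + 1/2 * sum_k R_k^2 * cos(omega_k * (t2 - t1))."""
    lag = t2 - t1
    return r_erg(lag) + 0.5 * sum(R * R * math.cos(w * lag)
                                  for R, w in zip(amplitudes, omegas))


def r_erg(lag):
    """Assumed covariance of the ergodic part (illustrative choice only)."""
    return math.exp(-abs(lag))


omega, t1, t2 = 1.3, 0.4, 2.1
print(phase_average(omega, t1, t2), 0.5 * math.cos(omega * (t2 - t1)))
print(conditional_covariance(r_erg, [1.0, 2.0], [1.0, 3.0], t1, t2))
```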

The time average consists of integrals of the terms $X_{\mathrm{erg}}(t_i + \tau)X_{\mathrm{erg}}(t_j + \tau)$, $X_k(t_i + \tau)X_k(t_j + \tau)$, $X_{k_1}(t_i + \tau)X_{k_2}(t_j + \tau)$ with $k_1 \neq k_2$, and $X_{\mathrm{erg}}(t_i + \tau)X_k(t_j + \tau)$, $i, j \in \{1, 2\}$; we call them $I_1, I_2, I_3, I_4$, respectively. All sums are absolutely convergent, so we can commute summation, integration and taking the limit $T \to \infty$.

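The time average $T^{-1}I_1$ converges to $r(t_2 - t_1)$ because $X_{\mathrm{erg}}$ is ergodic. For $T^{-1}I_2$ we have

\[
\begin{aligned}
\frac{1}{T} I_2 &= \frac{1}{T} R_k^2 \int_0^T d\tau\, \cos(\Theta_k + \omega_k(t_1 + \tau))\cos(\Theta_k + \omega_k(t_2 + \tau)) \\
&= \frac{1}{2} R_k^2 \cos(\omega_k(t_2 - t_1)) + \frac{R_k^2}{2\omega_k T}\cos(2\Theta_k + \omega_k(t_1 + t_2 + T))\sin(\omega_k T) \\
&\xrightarrow{T\to\infty} \frac{1}{2} R_k^2 \cos(\omega_k(t_2 - t_1)).
\end{aligned}
\tag{A.32}
\]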

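Therefore we need to prove that $T^{-1}I_3$ and $T^{-1}I_4$ decay to 0. For $T^{-1}I_3$ it is straightforward: denoting $\omega_\pm := \omega_{k_2} \pm \omega_{k_1}$, $\Theta_\pm := \Theta_{k_2} \pm \Theta_{k_1}$ we get

\[
\begin{aligned}
\frac{1}{T} I_3 = {} & \frac{R_{k_1}R_{k_2}}{2T\omega_-}\Bigl(\sin(\omega_{k_1}t_i - \omega_{k_2}t_j - \Theta_-) - \sin(\omega_{k_1}t_i - \omega_{k_2}t_j - \Theta_- - T\omega_-)\Bigr) \\
& + \frac{R_{k_1}R_{k_2}}{2T\omega_+}\Bigl(\sin(\omega_{k_1}t_i + \omega_{k_2}t_j + \Theta_+ + T\omega_+) - \sin(\omega_{k_1}t_i + \omega_{k_2}t_j + \Theta_+)\Bigr)
\xrightarrow{T\to\infty} 0.
\end{aligned}
\tag{A.33}
\]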
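The finite-$T$ expressions for the harmonic terms can be compared against direct numerical integration. The sketch below assumes atoms of the form $X_k(t) = R_k\cos(\Theta_k + \omega_k t)$, as used in the conditional covariance above; all function names and parameter values are hypothetical, and a composite Simpson rule stands in for the exact integral.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0


def x_k(R, w, theta, t):
    """Single harmonic component R * cos(theta + w * t)."""
    return R * math.cos(theta + w * t)


def same_frequency_average(R, w, theta, t1, t2, T):
    """Closed form of (1/T) * int_0^T X_k(t1 + tau) * X_k(t2 + tau) dtau."""
    limit = 0.5 * R * R * math.cos(w * (t2 - t1))
    remainder = (R * R / (2.0 * w * T)
                 * math.cos(2.0 * theta + w * (t1 + t2 + T)) * math.sin(w * T))
    return limit + remainder


def cross_frequency_average(R1, w1, th1, R2, w2, th2, ti, tj, T):
    """Closed form of (1/T) * int_0^T X_k1(ti + tau) * X_k2(tj + tau) dtau, w1 != w2."""
    wm, wp = w2 - w1, w2 + w1
    thm, thp = th2 - th1, th2 + th1
    c = w1 * ti - w2 * tj - thm
    d = w1 * ti + w2 * tj + thp
    return R1 * R2 * ((math.sin(c) - math.sin(c - T * wm)) / (2.0 * T * wm)
                      + (math.sin(d + T * wp) - math.sin(d)) / (2.0 * T * wp))


R1, w1, th1 = 1.5, 2.0, 0.7      # illustrative amplitude, frequency, phase
R2, w2, th2 = 0.8, 2.5, 1.9
t1, t2, T = 0.3, 1.1, 40.0

same_numeric = simpson(
    lambda tau: x_k(R1, w1, th1, t1 + tau) * x_k(R1, w1, th1, t2 + tau), 0.0, T) / T
cross_numeric = simpson(
    lambda tau: x_k(R1, w1, th1, t1 + tau) * x_k(R2, w2, th2, t2 + tau), 0.0, T) / T

print(same_numeric, same_frequency_average(R1, w1, th1, t1, t2, T))
print(cross_numeric, cross_frequency_average(R1, w1, th1, R2, w2, th2, t1, t2, T))
```

Both remainders are bounded by a constant divided by $T$, which is the decay used in the proof.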

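As for $T^{-1}I_4$, we will show that the time average of $X_{\mathrm{erg}}(t_i + \tau)R_k\exp(i\Theta_k + i\omega_k\tau)$ converges, which is an equivalent condition. The factor $R_k\exp(i\Theta_k)$ can be brought outside the integral, therefore it is only required to show convergence of the time average of $X_{\mathrm{erg}}(t_i + \tau)\exp(i\omega_k\tau)$. Up to a deterministic phase factor and a shift of the integration window, the latter is

\[
\frac{1}{T}\int_0^T d\tau\, X_{\mathrm{erg}}(\tau)e^{i\omega_k\tau}. \tag{A.34}
\]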

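The limit $T \to \infty$ of the above formula exists almost surely; the argument is the same as at the beginning of the lemma. We will prove that it is 0. Let us calculate $E|\cdot|^2$ of (A.34):

\[
\begin{aligned}
\frac{1}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\, r(\tau_2 - \tau_1)e^{i\omega_k(\tau_2 - \tau_1)}
&= \frac{1}{T^2}\int_0^T d\tau_1\int_0^T d\tau_2\int_{\mathbb{R}} d\sigma(\omega)\, e^{i\omega(\tau_2 - \tau_1)}e^{i\omega_k(\tau_2 - \tau_1)} \\
&= \frac{1}{T^2}\int_{\mathbb{R}} d\sigma(\omega)\int_0^T d\tau_1\int_0^T d\tau_2\, e^{i(\omega + \omega_k)(\tau_2 - \tau_1)} \\
&= 2\int_{\mathbb{R}} d\sigma(\omega - \omega_k)\,\frac{1 - \cos(\omega T)}{(\omega T)^2}
\xrightarrow{T\to\infty} \sigma(\{\omega_k\}) = 0.
\end{aligned}
\]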

That shows mean-square and almost sure convergence to 0.
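The role of the kernel $2(1 - \cos x)/x^2$ in the last computation can be illustrated with a purely discrete, symmetric spectral measure. This is a deliberately simplified stand-in (the ergodic component in the proof has no atoms), and the names `kernel`, `mean_square` and the dictionary `atoms` are assumptions of the sketch. It shows that the mean square tends to the mass of the atom matching the test frequency, and to zero when no such atom exists.

```python
import math


def kernel(x):
    """2 * (1 - cos x) / x^2, extended by its limit 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return 2.0 * (1.0 - math.cos(x)) / (x * x)


def mean_square(T, atoms, omega_k):
    """E|(1/T) int_0^T X(tau) exp(i*omega_k*tau) dtau|^2 for a discrete measure.

    `atoms` maps each frequency to the spectral mass placed there; the sum
    is the discrete analogue of the integral of the kernel against sigma.
    """
    return sum(mass * kernel((omega + omega_k) * T)
               for omega, mass in atoms.items())


atoms = {-2.0: 0.5, 2.0: 0.5}    # symmetric measure, covariance cos(2t)

for T in (10.0, 1e2, 1e4):
    # matched frequency (atom present) versus unmatched frequency (no atom)
    print(T, mean_square(T, atoms, 2.0), mean_square(T, atoms, 1.0))
```

Since the kernel is bounded by $4/x^2$, every unmatched atom contributes a term of order $T^{-2}$.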

Appendix B. Other auxiliary results

Proposition 3. For the numerically approximated covariance function of the Cantor process

\[
r(t) := \prod_{k=1}^{N} \cos\bigl(3^{-k}t\bigr) \tag{B.1}
\]

the above value converges, as $N \to \infty$, to the Cantor covariance function $r_C$, and for $N \ge \log_3 t$ it is bounded by

\[
\bigl(1 + c\,t^2 9^{-N}\bigr) r_C(t) \le r(t) \le e^{d\,t^2 9^{-N}} r_C(t), \qquad c, d > 0 \tag{B.2}
\]

for $t$'s where it is positive, and by the reverse inequality for $t$'s where it is negative.

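Proof. For $N \ge \log_3 t$ we have $x = 3^{-(N+j)}t \le \pi/2$, $\cos\bigl(3^{-(N+j)}t\bigr) > 0$ and

\[
1 - \frac{x^2}{2} \le \cos(x) \le 1 - \frac{4}{\pi^2}x^2. \tag{B.3}
\]

Expressing $r$ in terms of $r_C$,

\[
r(t) = r_C(t)\prod_{j=1}^{\infty} \frac{1}{\cos\bigl(3^{-(N+j)}t\bigr)}, \tag{B.4}
\]

we obtain, in the region of positive values of $r_C$ and $r$,

\[
r_C(t)\prod_{j=1}^{\infty} \frac{1}{1 - 3^{-2(N+j)}t^2\,4/\pi^2} \le r(t) \le r_C(t)\prod_{j=1}^{\infty} \frac{1}{1 - 3^{-2(N+j)}t^2/2}. \tag{B.5}
\]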

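We need to approximate the product of the terms

\[
\frac{1}{1 - a9^{-N}9^{-j}} = 1 + \frac{1}{a^{-1}9^{N}9^{j} - 1} = 1 + 9^{-N}\frac{1}{a^{-1}9^{j} - 9^{-N}} = 1 + p_j. \tag{B.6}
\]

To use the inequality, valid for positive $p_j$'s,

\[
1 + \sum_{j=1}^{\infty} p_j \le \prod_{j=1}^{\infty} (1 + p_j) \le \exp\Bigl(\sum_{j=1}^{\infty} p_j\Bigr), \tag{B.7}
\]

we estimate the sum of the $p_j$'s by

\[
\sum_{j=1}^{\infty} p_j \ge a9^{-N}\sum_{j=1}^{\infty} \frac{1}{9^{j}} = \frac{a9^{-N}}{8} \tag{B.8}
\]

from below, and writing

\[
\sum_{j=1}^{\infty} p_j = 9^{-N}\frac{1}{9a^{-1} - 9^{-N}} + 9^{-N}\sum_{j=1}^{\infty} \frac{1}{9a^{-1}9^{j} - 9^{-N}} \le \frac{1}{9^{N}a^{-1} - 1} + \frac{1}{9}\sum_{j=1}^{\infty} p_j \tag{B.9}
\]

we estimate

\[
\sum_{j=1}^{\infty} p_j \le \frac{9}{8}\,\frac{1}{9^{N}a^{-1} - 1} \tag{B.10}
\]

from above. Substituting the appropriate $a$ (namely $a = 4t^2/\pi^2$ in the lower bound and $a = t^2/2$ in the upper bound) we obtain the final result

\[
\Bigl(1 + \frac{t^2}{2\pi^2}9^{-N}\Bigr) r_C(t) \le r(t) \le \exp\Bigl(\frac{9}{8}\,\frac{9^{-N}}{2t^{-2} - 9^{-N}}\Bigr) r_C(t). \tag{B.11}
\]

In the region where $r_C$ and $r$ are negative the above inequality is reversed.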
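The two-sided bound on the truncated product can be checked numerically. The sketch below is an illustration under assumptions: `finite_product` and `bounds_hold` are hypothetical names, $r_C$ is approximated by a product with `n_limit = 60` factors (beyond which the cosines equal 1 in double precision for moderate $t$), and the constants are those that follow from the bounds $1 - x^2/2 \le \cos x \le 1 - 4x^2/\pi^2$.

```python
import math


def finite_product(t, n_terms):
    """prod_{k=1}^{n_terms} cos(3^{-k} * t)."""
    value = 1.0
    for k in range(1, n_terms + 1):
        value *= math.cos(t / 3.0 ** k)
    return value


def bounds_hold(t, N, n_limit=60):
    """Check the bounds on r(t) = finite_product(t, N) for t > 0, N >= log_3(t)."""
    r = finite_product(t, N)
    r_c = finite_product(t, n_limit)          # numerical stand-in for r_C(t)
    lower = 1.0 + t * t / (2.0 * math.pi ** 2) * 9.0 ** (-N)
    upper = math.exp(9.0 / 8.0 * 9.0 ** (-N) / (2.0 / (t * t) - 9.0 ** (-N)))
    if r_c > 0:
        return lower * r_c <= r <= upper * r_c
    # negative region: multiplying by r_c < 0 reverses the inequalities
    return upper * r_c <= r <= lower * r_c


N = 3
for t in (0.5, 1.0, 2.0, 5.0, 20.0):          # all satisfy N >= log_3(t)
    print(t, finite_product(t, N), bounds_hold(t, N))
```

The sample points include both signs of $r_C(t)$, so both the direct and the reversed inequality are exercised.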

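Proposition 4. For an $\alpha$-stable harmonisable process

\[
X(t) = \int_{\mathbb{R}} dS(\omega)\, e^{i\omega t}, \tag{B.12}
\]

where the spectral process $S$ has control measure $\sigma_\alpha$, the general codifference function

\[
\tau_{\theta_1,\theta_2}(s, t) := \ln E\bigl[e^{i\theta_1 X(s)}\bigr] + \ln E\bigl[e^{i\theta_2 X(t)}\bigr] - \ln E\bigl[e^{i(\theta_1 X(s) + \theta_2 X(t))}\bigr] \tag{B.13}
\]

equals

\[
\tau_{\theta_1,\theta_2}(s, t) = \int_{\mathbb{R}} \sigma_\alpha(d\omega)\,\bigl|\theta_1 + \theta_2 e^{i\omega(t - s)}\bigr|^{\alpha} - \bigl(|\theta_1|^{\alpha} + |\theta_2|^{\alpha}\bigr)\int_{\mathbb{R}} \sigma_\alpha(d\omega). \tag{B.14}
\]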

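Proof. We use the following formula [13]:

\[
E\Bigl[\exp\Bigl(i\int_{\mathbb{R}} dS(\omega)\, f(\omega)\Bigr)\Bigr] = \exp\Bigl(-\int_{\mathbb{R}} \sigma_\alpha(d\omega)\, |f(\omega)|^{\alpha}\Bigr) \tag{B.15}
\]

and note that

\[
\theta_1 X(s) + \theta_2 X(t) = \int_{\mathbb{R}} dS(\omega)\,\bigl(\theta_1 e^{i\omega s} + \theta_2 e^{i\omega t}\bigr). \tag{B.16}
\]

Therefore

\[
E\bigl[e^{i(\theta_1 X(s) + \theta_2 X(t))}\bigr] = \exp\Bigl(-\int_{\mathbb{R}} \sigma_\alpha(d\omega)\,\bigl|\theta_1 e^{i\omega s} + \theta_2 e^{i\omega t}\bigr|^{\alpha}\Bigr) = \exp\Bigl(-\int_{\mathbb{R}} \sigma_\alpha(d\omega)\,\bigl|\theta_1 + \theta_2 e^{i\omega(t - s)}\bigr|^{\alpha}\Bigr) \tag{B.17}
\]

and

\[
\begin{aligned}
\tau_{\theta_1,\theta_2}(s, t) &= -|\theta_1|^{\alpha}\int_{\mathbb{R}} \sigma_\alpha(d\omega) - |\theta_2|^{\alpha}\int_{\mathbb{R}} \sigma_\alpha(d\omega) + \int_{\mathbb{R}} \sigma_\alpha(d\omega)\,\bigl|\theta_1 e^{i\omega s} + \theta_2 e^{i\omega t}\bigr|^{\alpha} \\
&= \int_{\mathbb{R}} \sigma_\alpha(d\omega)\,\bigl|\theta_1 + \theta_2 e^{i\omega(t - s)}\bigr|^{\alpha} - \bigl(|\theta_1|^{\alpha} + |\theta_2|^{\alpha}\bigr)\int_{\mathbb{R}} \sigma_\alpha(d\omega).
\end{aligned}
\tag{B.18}
\]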
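The codifference formula can be evaluated directly when the control measure is a finite sum of atoms. The sketch below is a minimal check under that assumption; the function names, the atoms and the parameter values are hypothetical. It assembles the codifference term by term from the definition (B.13) using (B.15), and compares it with the closed form (B.14). For $\alpha = 2$ the closed form reduces to $2\theta_1\theta_2\int\sigma_\alpha(d\omega)\cos(\omega(t - s))$, which is used as a second check.

```python
import cmath
import math


def log_cf(f_values, masses, alpha):
    """ln E exp(i * int dS f), from (B.15), for a control measure made of atoms."""
    return -sum(mass * abs(f) ** alpha for f, mass in zip(f_values, masses))


def codifference_definition(theta1, theta2, s, t, omegas, masses, alpha):
    """Codifference assembled term by term from the definition (B.13)."""
    f_s = [theta1 * cmath.exp(1j * w * s) for w in omegas]
    f_t = [theta2 * cmath.exp(1j * w * t) for w in omegas]
    f_st = [a + b for a, b in zip(f_s, f_t)]
    return (log_cf(f_s, masses, alpha) + log_cf(f_t, masses, alpha)
            - log_cf(f_st, masses, alpha))


def codifference_formula(theta1, theta2, s, t, omegas, masses, alpha):
    """Closed form (B.14); it depends on s and t only through t - s."""
    integral = sum(
        mass * abs(theta1 + theta2 * cmath.exp(1j * w * (t - s))) ** alpha
        for w, mass in zip(omegas, masses))
    return integral - (abs(theta1) ** alpha + abs(theta2) ** alpha) * sum(masses)


omegas = [-1.0, 0.5, 2.0]        # illustrative atom locations
masses = [0.3, 1.0, 0.7]         # illustrative atom masses
theta1, theta2, s, t = 0.8, -1.3, 0.4, 2.5

for alpha in (1.0, 1.5, 2.0):
    print(alpha,
          codifference_definition(theta1, theta2, s, t, omegas, masses, alpha),
          codifference_formula(theta1, theta2, s, t, omegas, masses, alpha))
```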

References

[1] R. Metzler, Int. J. Mod. Phys. Conf. Ser. 36 (2015) 1560007. http://dx.doi.org/10.1142/S2010194515600071.
[2] R. Metzler, J.-H. Jeon, A.G. Cherstvy, E. Barkai, Phys. Chem. Chem. Phys. 16 (44) (2014) 24128–24164. http://dx.doi.org/10.1039/C4CP03465A.
[3] J.L. Doob, Stochastic Processes, Wiley-Interscience, 1990.
[4] H. Dym, H.P. McKean, Gaussian Processes, Function Theory and the Inverse Spectral Problem, Academic Press, 1976.
[5] S. Janson, Gaussian Hilbert Spaces, Cambridge University Press, 1997.
[6] U. Grenander, Ark. Mat. 1 (1950) 195–277.
[7] G. Maruyama, Theory Probab. Appl. 15 (1970) 1–22.
[8] A.M. Yaglom, Correlation Theory of Stationary and Related Random Functions, Springer, 1987.
[9] R. Fox, Phys. Rep. 48 (3) (1978) 179–283.
[10] R. Lyons, J. Fourier Anal. Appl. Special Issue (1995) 363–377.
[11] R. Salem, Algebraic Numbers and Fourier Analysis, Heath, 1963.
[12] B. Mandelbrot, Gaussian Self-Affinity and Fractals, Springer, 2002.


[13] G. Samorodnitsky, M. Taqqu, Stable Non-Gaussian Random Processes, Chapman & Hall, 1994.
[14] P. Embrechts, M. Maejima, Selfsimilar Processes, Princeton University Press, 2002.
[15] B.W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman and Hall, 1986.
[16] A. Postnikov, Tauberian Theory and Its Applications, American Mathematical Society, 1980.
[17] T. Eisner, B. Farkas, M. Haase, R. Nagel, Operator Theoretic Aspects of Ergodic Theory, Springer, 2015.
[18] R. Barbuti, G. Caravagna, A. Maggiolo-Schettini, P. Milazzo, in: C. Priami, R.-J. Back, I. Petre, E. de Vink (Eds.), Transactions on Computational Systems Biology XIII, Springer, 2011, pp. 61–84.
[19] D. Champeney, A Handbook of Fourier Theorems, Cambridge University Press, 1987.
[20] P. Brockwell, R. Davis, Time Series: Theory and Methods, Springer, 2006.
[21] I.M. Gel’fand, N.Y. Vilenkin, Generalized Functions: Applications of Harmonic Analysis, Academic Press, 1964.
[22] R. Zwanzig, Nonequilibrium Statistical Mechanics, Oxford University Press, 2001.
[23] S.C. Kou, Ann. Appl. Stat. 2 (2) (2008) 501–535.
[24] V. Anh, R. McVinish, J. Appl. Math. Stoch. Anal. 16 (2) (2003) 97–119.
[25] A. Viñales, M. Despósito, Phys. Rev. E 73 (2006) 016111. http://dx.doi.org/10.1103/PhysRevE.73.016111.
[26] H. Mori, Progr. Theoret. Phys. 33 (3) (1965) 423–455.
[27] R. Kubo, Rep. Progr. Phys. 29 (1966) 255–284.
[28] A. Caldeira, Scholarpedia 5 (2) (2010) 9187. http://dx.doi.org/10.4249/scholarpedia.9187.
[29] U. Weiss, Quantum Dissipative Systems, World Scientific, 2008.
[30] L. Rey-Bellet, in: S. Attal, A. Joye, C.-A. Pillet (Eds.), Quantum Dissipative Systems II, Springer, 2006, pp. 41–78.
[31] Y. Katznelson, An Introduction to Harmonic Analysis, Cambridge University Press, 2004.
[32] A. Janicki, A. Weron, Simulations and Chaotic Behaviour, Marcel Dekker, 1994.
[33] M. Magdziarz, Theory Probab. Appl. 54 (2) (2009) 322–324. http://dx.doi.org/10.1137/S0040585X97984267.
[34] J. Rosiński, T. Żak, Theory Probab. Appl. 61 (2) (1996) 277–288. http://dx.doi.org/10.1016/0304-4149(95)00083-6.
[35] M. Magdziarz, Stochastic Process. Appl. 119 (2009) 3416–3434. http://dx.doi.org/10.1016/j.spa.2009.06.002.
[36] T. Marquardt, Bernoulli 12 (6) (2006) 1099–1126. http://dx.doi.org/10.3150/bj/1165269152.
[37] K. Sobczyk, Stochastic Differential Equations: With Applications to Physics and Engineering, Springer, 2013.
[38] M. Magdziarz, A. Weron, Phys. Rev. Lett. 105 (2010) 260603. http://dx.doi.org/10.1103/PhysRevLett.105.260603.
[39] E. Kimme, SIAM Rev. 7 (1) (1965) 88–99.














