
Lithuanian Mathematical Journal, Vol. 52, No. 4, October, 2012, pp. 390–399

LIMITS OF THE LETAC PRINCIPLE. THE DISCRETE CASE

Vytautas Kazakevičius

Faculty of Mathematics and Informatics, Vilnius University, Naugarduko 24, LT-03225 Vilnius, Lithuania (e-mail: [email protected])

Received July 6, 2012; revised August 31, 2012

Abstract. The Letac principle says that if the backward iterations of a random function converge almost surely, then the forward iterations converge in distribution. In this paper, we find conditions for the converse implication to hold.

MSC: 60J10

Keywords: iterations of random functions, Letac principle

1 INTRODUCTION

Let E be a locally compact topological space with a countable base of open sets, and let C(E,E) be the space of all continuous functions f : E → E endowed with the compact-open topology. It is well known that both E and C(E,E) are Polish spaces. If f, g ∈ C(E,E) and x ∈ E, we write fx and gf for f(x) and g∘f, respectively; so gfx = g(f(x)). Both functions (f, x) ↦ fx and (f, g) ↦ gf are continuous. We call random elements of E random variables and those of C(E,E) random functions.

Let F be a random function, and (Fn) a sequence of its independent copies. Define

Xn = Fn · · ·F1 and ξn = F1 · · ·Fn.

Random functions Xn and ξn are called, respectively, forward and backward iterations of F. Obviously, for all n, the distributions of Xn and ξn coincide. Nevertheless, the distributions of the sequences (Xn) and (ξn) are completely different. For example, for all x, the sequence (Xnx) is a Markov chain with transition probability kernel

P (x,A) = P{Fx ∈ A}. (1.1)

In general, the sequences (ξnx) are not Markov chains.

It is known that if some (ξnx) converges almost surely to some random variable x̄ with distribution π, then π is an invariant distribution of P. If the limit exists and is the same for all x, then π is the unique invariant distribution of P. This fact was first proved by Letac [3] and is called the Letac principle. It is often used to prove the existence of an invariant distribution of a Markov chain, especially in the nonirreducible case (see, e.g., [1]).
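As an illustration of the two kinds of iterations, here is a minimal simulation sketch (ours, not part of the paper; the contracting random affine map below is a hypothetical stand-in for F). The backward orbits settle down pathwise, which is the regime of the Letac principle, while the forward orbits keep moving, even though Xnx and ξnx have the same distribution for every fixed n.

```python
import random

def forward_orbit(fs, x):
    """X_n x = F_n ... F_1 x: the map sampled first is applied first."""
    for f in fs:
        x = f(x)
    return x

def backward_orbit(fs, x):
    """xi_n x = F_1 ... F_n x: the map sampled last is applied first."""
    for f in reversed(fs):
        x = f(x)
    return x

def sample_f(rng):
    # hypothetical example: F x = A x + B with |A| < 1/2, contracting
    a, b = rng.uniform(-0.5, 0.5), rng.uniform(-1.0, 1.0)
    return lambda x: a * x + b

rng = random.Random(0)
fs = [sample_f(rng) for _ in range(60)]
# The backward values stabilize; the forward values do not.
print([round(backward_orbit(fs[:n], 0.0), 6) for n in (10, 20, 40, 60)])
print([round(forward_orbit(fs[:n], 0.0), 6) for n in (10, 20, 40, 60)])
```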

It is interesting to investigate the limits of this principle. For example, the following question arises: If we want to prove some fact about a Markov chain generated by a random function F, how restrictive is the assumption of almost sure convergence of backward iterations? In other words, can the Letac principle be reversed: Is it true that the convergence in distribution of (Xnx) implies the almost sure convergence of (ξnx)? In this paper, we give an answer in the discrete case.

The paper is organized as follows. In Section 2, we formulate the main results. We show that if, for some x0, ξnx0 P→ x̄ (P→ stands for convergence in probability) and if x̄0 is a random variable with the same distribution as x̄ but independent of (Fn), then (Xn x̄0) is a stationary mixing sequence. In the discrete case, this means that there exists a class of recurrent states which is positive, aperiodic, and accessible with probability one from x0. So, in what follows, we can suppose without loss of generality (restricting, if needed, the kernel P to the corresponding subset of E) that P is ergodic. In this case, we show that (ξn) tends in probability to a constant random function and therefore

P{Xnx ≠ Xnx′} → 0    (1.2)

for all x, x′ ∈ E. Next, we prove that the ergodicity of P, together with (1.2), implies the convergence in probability of (ξn). We also show that for every ergodic P, there exists a random function F satisfying (1.1) for which (1.2) also holds.

In Section 3, we investigate when the convergence in probability of backward iterations implies the almost sure convergence. Section 4 contains the proofs.

Throughout the paper, a sequence means a family indexed by positive integers. The set of all sequences of elements of E is denoted by E∞. A Markov chain with state space E is a sequence of random variables and also a random element of E∞. The distribution of a Markov chain with the initial distribution α and the transition probability kernel P is denoted by Pα, and Eα stands for the corresponding expectation operator. We write Px and Ex if α is degenerate at x. We call the identity function (xn) ↦ (xn) on E∞ the canonical Markov chain. It is a Markov chain when considered as a random element of E∞ defined on some probability space (E∞, Pα). For any A ⊂ E, τA denotes the hitting time of the set A by the canonical Markov chain. We write τx for it if A = {x}.

As noted above, P→ means convergence in probability, d→ stands for convergence in distribution, and d= means equality in distribution.

2 MAIN RESULTS

We begin with an example where backward iterations do not converge in probability while forward iterations do.

Example 1. Let E = {1, 2, 3}, and let F take values f1 and f2 with equal probabilities, where

f1x = {2 for x ≠ 3; 3 for x = 3},    f2x = {2 for x = 2; 3 for x ≠ 2}.

Since f1² = f2f1 = f1 and f2² = f1f2 = f2, we have Xn = F1 and ξn = Fn for all n. Hence, (Xn) converges in distribution (and even almost surely), while (ξn) does not converge in probability.

Kernel (1.1) in the example above is given by the matrix

P = (0 1/2 1/2; 0 1 0; 0 0 1).

Note that ξnx → x for x = 2, 3, and only the sequence (ξn1) does not have a limit in probability. All three sequences have limit distributions, which are, respectively, δ2, δ3, and (1/2)δ2 + (1/2)δ3, where δx denotes the probability concentrated at x. Therefore, we can suppose that the reason why (ξn1) does not converge in probability is that its limit distribution is not an extreme point of the set of invariant distributions of the matrix P. This guess turns out to be correct, as our first two theorems show.
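The collapse of compositions in Example 1 can be checked mechanically; the following small sketch (ours, not from the paper) encodes f1 and f2 as dictionaries on E = {1, 2, 3} and simulates ξn1 = Fn1:

```python
import random

f1 = {1: 2, 2: 2, 3: 3}   # f1 x = 2 for x != 3, f1 3 = 3
f2 = {1: 3, 2: 2, 3: 3}   # f2 x = 3 for x != 2, f2 2 = 2

# g(f(x)) = f(x) for all f, g in {f1, f2}: compositions collapse to the
# inner map, so X_n = F_1 and xi_n = F_n, as claimed in Example 1.
for f in (f1, f2):
    for g in (f1, f2):
        assert all(g[f[x]] == f[x] for x in (1, 2, 3))

# xi_n 1 = F_n 1 keeps flipping between 2 and 3: it converges in
# distribution to (1/2) delta_2 + (1/2) delta_3 but not in probability.
rng = random.Random(1)
print([rng.choice((f1, f2))[1] for _ in range(12)])
```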

Recall that a stationary sequence of random variables (xn) is called ergodic if, for any shift-invariant measurable A ⊂ E∞, the probability P{(xn) ∈ A} equals either 1 or 0. It is called mixing if

P{(x1, x2, . . .) ∈ A, (xn+1, xn+2, . . .) ∈ B} → P{(x1, x2, . . .) ∈ A} P{(x1, x2, . . .) ∈ B}   (n → ∞)

for any measurable A, B ⊂ E∞. Any mixing sequence is ergodic, but the converse is not true.

Theorem 1. If, for some x0, ξnx0 P→ x̄ and x̄0 is a random variable with the same distribution as x̄ but independent of (Fn), then (Xn x̄0) is a mixing stationary sequence.

From now on, we switch completely to the discrete case. If E is a discrete space, it is countable (because separable), and the well-developed theory of countable Markov chains applies. Let C+ be the set of all equivalence classes of positive recurrent states, E+ = ⋃_{C∈C+} C, and τ = τ_{E+}. For each C ∈ C+, let πC stand for the invariant probability of P with πC(C) = 1, let dC be the period of C, and let (C0, . . . , C_{dC−1}) be its cyclic partition. Also denote

pC,i(x) = Px{τ < ∞, xτ ∈ Ci},   i = 0, . . . , dC − 1,

and

pC(x) = ∑_{i=0}^{dC−1} pC,i(x) = Px{τ < ∞, xτ ∈ C}.

Theorem 2. (1) (Xnx) converges in distribution if and only if Px{τ < ∞} = 1 and pC,i(x) = pC(x)/dC for all C ∈ C+ and all i = 0, . . . , dC − 1.
(2) Let Xnx d→ x̄0, and let x̄0 be independent of (Fn). Then the sequence (Xn x̄0) is stationary. It is ergodic if and only if pC(x) = 1 for some C ∈ C+, and it is mixing if and only if dC = 1 for that C.

Theorems 1 and 2 show that if ξnx0 P→ x̄, then there exists a positive aperiodic class C which is accessed with probability 1 from x0. The set of all x with Px{τC < ∞} = 1 is absorbing and contains x0. Of course, the assumption ξnx0 P→ x̄ gives no information about the behavior of the sequences (ξnx) for x outside this set. Therefore, for simplicity, we can restrict P to it and consider only the case where it is all of E, i.e., where there exists only one class of recurrent states, which is aperiodic, positive, and accessible from everywhere. Nummelin [4] calls such kernels ergodic. So, in the sequel, we suppose that P is ergodic.

The next example shows that the ergodicity of P is not sufficient for the convergence in probability of backward iterations.

Example 2. Let E = {1, 2} and P{F = f1} = P{F = f2} = 1/2, where f1x = x and f2x = 3 − x. Then

P = (1/2 1/2; 1/2 1/2),

which is obviously ergodic. But it is easily seen that {f1, f2} is an Abelian group, and therefore

Xn = ξn = {f1 if m is even; f2 if m is odd}.

Here, m is the number of occurrences of f2 in the sample F1, . . . , Fn. Hence, P{ξn ≠ ξn+1} = 1/2, and (ξn) has no limit in probability.


Note that if gix = i for i = 1, 2 and P{F = g1} = P{F = g2} = 1/2, then P is the same as in the example above, and ξn = F1 for all n. So, we cannot expect to obtain the convergence in probability of backward iterations by making additional assumptions about P. The point is that the theory of random functions is richer than the theory of Markov chains: a random function F defines not only kernel (1.1), which gives only the one-dimensional distributions of F, but also the kernel

P2((x, x′), (y, y′)) = P{Fx = y, Fx′ = y′}    (2.1)

representing two-dimensional distributions, and analogous kernels in higher dimensions. The remarkable fact is that the convergence in probability of backward iterations can be characterized in terms of the one- and two-dimensional distributions of F alone.
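To see concretely how the kernel (2.1) distinguishes random functions sharing the same kernel (1.1), the following sketch (ours, not from the paper) computes P and P2 both for the pair f1, f2 of Example 2 and for the constant maps g1, g2 of the remark above:

```python
from itertools import product

E = (1, 2)
f1, f2 = {1: 1, 2: 2}, {1: 2, 2: 1}   # Example 2: f1 x = x, f2 x = 3 - x
g1, g2 = {1: 1, 2: 1}, {1: 2, 2: 2}   # the remark: g_i x = i

def kernels(maps):
    """One- and two-dimensional kernels of F uniform on the given maps."""
    P = {(x, y): sum(f[x] == y for f in maps) / len(maps)
         for x, y in product(E, E)}
    P2 = {((x, xp), (y, yp)): sum(f[x] == y and f[xp] == yp for f in maps) / len(maps)
          for x, xp, y, yp in product(E, E, E, E)}
    return P, P2

Pf, P2f = kernels((f1, f2))
Pg, P2g = kernels((g1, g2))
print(Pf == Pg)    # True: both give the kernel with all entries 1/2
print(P2f == P2g)  # False: the joint motion of two points differs
```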

The following theorem is the main result of the paper.

Theorem 3. If P is ergodic, then the following statements are equivalent:

(i) for some x0, (ξnx0) converges in probability;
(ii) (ξn) tends in probability to some constant random function;
(iii) for all x, x′, there exists n with P{Xnx ≠ Xnx′} < 1;

(iv) (1.2) holds.

If we analyze the kernel P rather than the random function F, we can replace F by another function with the same one-dimensional distributions. Our next theorem shows that there always exists a function whose backward iterations converge in probability.

Theorem 4. If P is ergodic and all Fx, x ∈ E, are independent, then (1.2) holds.
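A quick numerical illustration of Theorem 4 (ours, not from the paper; the 3-state kernel below is a hypothetical example): realize F by drawing the values Fx, x ∈ E, independently, each from the row P(x, ·), and watch P{Xnx ≠ Xnx′} decay, which by Theorem 3 yields the convergence in probability of backward iterations.

```python
import random

# a hypothetical ergodic kernel on E = {0, 1, 2} (every row is positive)
P = [[0.2, 0.5, 0.3],
     [0.6, 0.1, 0.3],
     [0.3, 0.3, 0.4]]
E = range(3)

def sample_F(rng):
    # Theorem 4's mechanism: the values Fx, x in E, drawn independently,
    # the value at x from the row P(x, .)
    return [rng.choices(E, weights=P[x])[0] for x in E]

def miss_prob(x, xp, n, rng, trials=20_000):
    """Monte Carlo estimate of P{X_n x != X_n x'}."""
    miss = 0
    for _ in range(trials):
        y, yp = x, xp
        for _ in range(n):
            F = sample_F(rng)
            y, yp = F[y], F[yp]
        miss += y != yp
    return miss / trials

rng = random.Random(7)
print([round(miss_prob(0, 2, n, rng), 3) for n in (1, 3, 6, 12)])  # decays toward 0
```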

3 ALMOST SURE CONVERGENCE

The following example shows that backward iterations can converge in probability but not almost surely.

Example 3. Let E = {1, 2, 3, . . .} and

P = (1 0 0 0 · · ·; p2 0 q2 0 · · ·; p3 0 0 q3 · · ·; . . .)

with px = 1/x and qx = 1 − 1/x.

Clearly, 1 is an absorbing state. It is accessed from x ≥ 2 with probability 1 − qxqx+1 · · · = 1 − ∏_{n≥x} (n − 1)/n = 1, and therefore, P is ergodic. Let F be a random function such that P{Fx = y} = P(x, y) for all x, y ∈ E and all Fx are independent.

Since 1 is absorbing, we have

Xn2 = {1 if ∃i ≤ n, Fi(i + 1) = 1; n + 2 otherwise}

and

ξn2 = {1 if ∃i ≤ n, F_{n+1−i}(i + 1) = 1; n + 2 otherwise}.


The random variables Fi(j) corresponding to different pairs (i, j) are independent; therefore, the events {ξn2 = 1}, n ≥ 2, are independent. Moreover,

P{ξn2 = 1} = 1 − q2 · · · q_{n+1} = 1 − 1/(n + 1).

Therefore, ξn2 tends to 1 in probability, but not almost surely, because, for all k,

P{∀n ≥ k, ξn2 = 1} = ∏_{n≥k} (1 − 1/(n + 1)) = 0.
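Both computations in Example 3 are easy to confirm numerically; a short sketch (ours, not from the paper) verifies the telescoping product and estimates P{ξn2 = 1} by Monte Carlo:

```python
import random
from math import prod

def q(x):
    return 1.0 - 1.0 / x

# telescoping: q_2 ... q_{n+1} = 1/(n+1), hence P{xi_n 2 = 1} = 1 - 1/(n+1)
for n in (1, 5, 50):
    assert abs(prod(q(x) for x in range(2, n + 2)) - 1.0 / (n + 1)) < 1e-12

def xi_n_2_is_1(n, rng):
    # xi_n 2 = 1 iff F_{n+1-i}(i+1) = 1 for some i <= n; each such value is
    # an independent draw, equal to 1 with probability p_{i+1} = 1/(i+1)
    return any(rng.random() < 1.0 / (i + 1) for i in range(1, n + 1))

rng = random.Random(3)
n = 9
est = sum(xi_n_2_is_1(n, rng) for _ in range(100_000)) / 100_000
print(est, 1 - 1 / (n + 1))   # both close to 0.9
```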

The reason why, in the example above, backward iterations fail to converge almost surely is that the absorbing class C = {1} is accessed in a time with infinite mean. It seems probable that (1.2), together with the condition ExτC < ∞, yields the almost sure convergence of backward iterations, but we have neither proved this nor found a counterexample. Therefore, we end up with some partial results concerning the almost sure convergence.

Theorem 5. If E is finite, P is ergodic, and (1.2) holds, then (ξn) converges almost surely.
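Theorem 5 can be watched in action (an editorial sketch, not from the paper): for a finite E, track ξn itself as a chain on C(E,E); with independent coordinates (so that (1.2) holds by Theorem 4), it is absorbed into the set Φ of constant functions, exactly as in the proof given in Section 4.

```python
import random

E = (0, 1, 2)

def sample_f(rng):
    # independent coordinates, uniform on E: an ergodic P for which (1.2) holds
    return tuple(rng.choice(E) for _ in E)

def compose(g, f):
    """(gf) x = g(f(x)), with functions stored as tuples of values."""
    return tuple(g[f[x]] for x in E)

rng = random.Random(11)
xi = E                                  # xi_0 = identity map
for n in range(1, 200):
    xi = compose(xi, sample_f(rng))     # xi_n = xi_{n-1} F_n
    if len(set(xi)) == 1:               # xi_n landed in Phi and stays there
        print(f"xi_n is the constant function {xi[0]} from n = {n} on")
        break
```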

Theorem 6. If P is ergodic and (ξnx0) converges almost surely for some x0, then (ξnx) converges almost surely for all x accessible from x0.

4 PROOFS

The proof of Theorem 1 is preceded by two simple lemmas. Both facts are undoubtedly known, but we did not find proofs in the literature to cite.

Lemma 1. Let xns and xs, n, s ≥ 1, be random variables, and let xns P→ xs as n → ∞ for all s. Then there exists a sequence of indices nk → ∞ such that, almost surely, x_{n_k,s} → xs as k → ∞ for all s.

Proof. Let d be the metric of E, and let

ρ((xs), (ys)) = ∑_{s≥1} 2⁻ˢ [d(xs, ys) ∧ 1].

Then ρ metrizes E∞. From the assumption of the lemma it follows that, for all s, E[d(xns, xs) ∧ 1] → 0. By the dominated convergence theorem, Eρ((xns), (xs)) → 0. Hence, ρ((xns), (xs)) P→ 0, and therefore, there exists a sequence nk → ∞ such that, almost surely, ρ((x_{n_k,s}), (xs)) → 0. Obviously, then, almost surely, x_{n_k,s} → xs for all s. □

Lemma 2. Let W be a measurable space, (εn | n ∈ Z) a family of independent identically distributed random elements of W, and ϕ a measurable function from W∞ to E. For n ≥ 1, define xn = ϕ(εn, εn−1, . . .). Then (xn) is a mixing stationary sequence.

Proof. Denote

ψ(wi | i ∈ Z) = (ϕ(w1, w0, . . .), ϕ(w2, w1, . . .), . . .).

For all measurable A ⊂ Wᵏ, B ⊂ Wˡ, integers i1, . . . , ik, j1, . . . , jl, and n > ik − j1, we have

P{(ε_{i1}, . . . , ε_{ik}) ∈ A, (ε_{n+j1}, . . . , ε_{n+jl}) ∈ B} = P{(ε_{i1}, . . . , ε_{ik}) ∈ A} P{(ε_{j1}, . . . , ε_{jl}) ∈ B}.


By Theorem 1.17 of Walters [5], (εn | n ∈ Z) is mixing. Therefore, for all measurable A, B ⊂ E∞,

P{(xi) ∈ A, (xn+i) ∈ B} = P{(εi | i ∈ Z) ∈ ψ⁻¹(A), (εn+i | i ∈ Z) ∈ ψ⁻¹(B)}
→ P{(εi | i ∈ Z) ∈ ψ⁻¹(A)} P{(εi | i ∈ Z) ∈ ψ⁻¹(B)}   (n → ∞)
= P{(xi) ∈ A} P{(xi) ∈ B},

i.e., (xn) is also mixing. □

Proof of Theorem 1. Since convergence in probability implies convergence in distribution and ξnx0 d= Xnx0, we have Xnx0 d→ x̄0. Then also Xn+1x0 d→ x̄0. But Xn+1x0 d= FXnx0, where F is independent of (Fn) and x̄0. By the continuous mapping principle, FXnx0 d→ Fx̄0, and therefore, Fx̄0 d= x̄0. Then, for all n,

(x̄0, F1, . . . , Fn) d= (F1x̄0, F2, . . . , Fn+1),

which yields

(F1x̄0, F2F1x̄0, . . . , Fn · · ·F1x̄0) d= (F2F1x̄0, F3F2F1x̄0, . . . , Fn+1 · · ·F1x̄0),

i.e., (X1x̄0, . . . , Xnx̄0) d= (X2x̄0, . . . , Xn+1x̄0). Hence, (Xn x̄0) is a stationary sequence. It remains to prove that it is mixing.

Since (ξn+sx0 | n ≥ 1) is a subsequence of (ξnx0), ξn+sx0 P→ x̄ for all s. By Lemma 1, there exists a sequence nk → ∞ such that, almost surely, ξ_{n_k+s}x0 → x̄ for all s. Let ϕ be any measurable function from [C(E,E)]∞ to E such that

ϕ(f1, f2, . . .) = lim_{k→∞} f1 · · · f_{n_k}x0

at each point (fn) where the limit exists. Obviously, almost surely,

ϕ(F1, F2, . . .) = lim_{k→∞} F1 · · ·F_{n_k}x0 = lim ξ_{n_k}x0 = x̄.

The distribution of (Fn+1) is the same as that of (Fn), and therefore, almost surely,

F2 · · ·F_{n_k+1}x0 → ϕ(F2, F3, . . .),

and then, by continuity,

x̄ = lim ξ_{n_k+1}x0 = lim F1 · · ·F_{n_k+1}x0 = F1ϕ(F2, F3, . . .).

Hence, almost surely,

F1ϕ(F2, F3, . . .) = ϕ(F1, F2, . . .).

Now let (Fn | n ∈ Z) be a family of independent copies of F and, for n ≥ 1, let

xn = ϕ(Fn, Fn−1, . . .).

By Lemma 2, the sequence (xn) is mixing. Moreover, almost surely, xn = Fnxn−1 for all n. Therefore, for all n ≥ 1, almost surely,

xn = Fn · · ·F1x̄0 = Xnx̄0,

where x̄0 = ϕ(F0, F−1, . . .) is distributed as x̄ and independent of (Fn | n ≥ 1). Hence, (Xn x̄0) is mixing. □


The proof of Theorem 2 is based on the following proposition, which easily follows from the limit theorems for countable Markov chains.

Proposition 1. Let x, y be fixed points of E.

(1) Let y belong to some C ∈ C+. If pC,i(x) = pC(x)/dC for all i = 0, . . . , dC − 1, then Pn(x, y) → pC(x)πC(y); otherwise, the sequence (Pn(x, y)) has no limit.
(2) If y belongs to some null class or is transient, then Pn(x, y) → 0.

Proof of Theorem 2. (1) If pC,i(x) ≠ pC(x)/dC for some C ∈ C+ and some i, then we get from Proposition 1 that Pn(x, y) has no limit for y ∈ C. Since Pn(x, ·) is the distribution of Xnx, (Xnx) does not converge in distribution.

If pC,i(x) = pC(x)/dC for all C ∈ C+ and i, Proposition 1 yields that Pn(x, y) → π(y) for all y ∈ E, where

π(y) = {pC(x)πC(y) for y ∈ C ∈ C+; 0 for y ∉ E+}.

It remains to notice that π is a probability if and only if Px{τ < ∞} = 1. Indeed,

∑_y π(y) = ∑_{C∈C+} ∑_{y∈C} pC(x)πC(y) = ∑_{C∈C+} pC(x) = Px{τ < ∞}.

(2) The stationarity of (Xn x̄0) is established exactly as in the proof of Theorem 1. It follows from

Pπ = ∑_y π(y)Py = ∑_{C∈C+} ∑_{y∈C} pC(x)πC(y)Py = ∑_{C∈C+} pC(x)P_{πC}

that if at least two of the probabilities pC(x) are positive, then Pπ is not an extreme point of the set of shift-invariant probabilities on E∞. It is well known (see, e.g., the remark to Lemma 9.4 in Walters [5]) that then (Xn x̄0) is not ergodic. If pC(x) = 1 for some C ∈ C+, then (Xn x̄0) is a stationary irreducible positive recurrent and, therefore, ergodic Markov chain; see [2, Chap. 2, Sect. 5, Thm. 6].

If (Xn x̄0) is mixing, it is ergodic, and therefore pC(x) = 1 for some C ∈ C+. Let us prove that C is aperiodic. Suppose the contrary, and let d > 1 be its period. Let (C0, . . . , Cd−1) be a cyclic partition of C. Fix any y ∈ C0 and any z ∈ C1 with P(y, z) > 0. Then, for all m ≥ 1, Pπ{x1 = z, xmd = z} = 0, while Pπ{x1 = z} ≥ π(y)P(y, z) > 0. Hence,

Pπ{x1 = z, xn+1 = z} ↛ [Pπ{x1 = z}]²,

a contradiction.

If pC(x) = 1 for some positive aperiodic class C, then (Xn x̄0) is a stationary irreducible positive recurrent aperiodic and, therefore, mixing Markov chain; see [2, Chap. 2, Sect. 5, remark after Thm. 6]. □

Proof of Theorem 3. (i) ⇒ (ii) Let ξnx0 P→ x̄. We will prove that then

ξnx P→ x̄    (4.1)

for all x. First, let x be recurrent and hence accessible from x0. Find k with Pk(x0, x) > 0, and denote this probability by δ. For all m < n, we have

δP{ξnx ≠ ξmx0} = P{ξnx ≠ ξmx0, Fn+1 · · ·Fn+kx0 = x} ≤ P{ξ_{n+k}x0 ≠ ξmx0}.


This implies

P{ξnx ≠ x̄} ≤ P{ξnx ≠ ξmx0} + P{ξmx0 ≠ x̄} ≤ δ⁻¹P{ξ_{n+k}x0 ≠ ξmx0} + P{ξmx0 ≠ x̄} ≤ δ⁻¹P{ξ_{n+k}x0 ≠ x̄} + (1 + δ⁻¹)P{ξmx0 ≠ x̄}.

Hence, for all m,

lim sup_{n→∞} P{ξnx ≠ x̄} ≤ (1 + δ⁻¹)P{ξmx0 ≠ x̄},

i.e., P{ξnx ≠ x̄} → 0.

Now let x be an arbitrary state. Fix a recurrent state x1. Then, for all k ≥ 1,

P{F1 · · ·Fnx ≠ x̄} ≤ P{∃j ≤ k, F1 · · ·F_{n−j}x1 ≠ x̄} + P{∀j ≤ k, F_{n−j+1} · · ·Fnx ≠ x1} ≤ ∑_{j=1}^{k} P{ξ_{n−j}x1 ≠ x̄} + Px{τx1 > k}.

It follows from the first part of the proof that the first term tends to 0 as n → ∞; therefore,

lim sup_{n→∞} P{ξnx ≠ x̄} ≤ Px{τx1 > k}.

Since x1 is recurrent and k is chosen arbitrarily, P{ξnx ≠ x̄} → 0.

Now we prove that (4.1) implies the convergence in probability of (ξn). The space C(E,E) is metrized by the metric

ρ(f, g) = ∑_x cx [d(fx, gx) ∧ 1],

where (cx | x ∈ E) is some family of positive numbers summable to 1. Let (4.1) hold, and let x̄ also denote the random constant function with all values equal to x̄. By the dominated convergence theorem, Eρ(ξn, x̄) → 0, which yields ξn P→ x̄.

(ii) ⇒ (iii) If ξn P→ x̄, then, for all x, x′ ∈ E,

P{Xnx ≠ Xnx′} = P{ξnx ≠ ξnx′} ≤ P{ξnx ≠ x̄} + P{ξnx′ ≠ x̄} → 0.

Hence, P{Xnx ≠ Xnx′} < 1 for n sufficiently large.

(iii) ⇒ (iv) Consider the sequence ((Xnx, Xnx′)), which is a Markov chain with transition probability kernel (2.1). Clearly, the diagonal Δ = {(x, x) | x ∈ E} is an absorbing set, and (iii) means that it is accessible from everywhere. Therefore, all states (x, x′) ∉ Δ are transient and almost surely are visited by the chain only finitely many times. So, if the chain remains outside Δ with positive probability, then Xnx → ∞ with positive probability, which is impossible because the ergodicity of P implies Xnx = OP(1). Hence, Δ is accessed with probability 1, and then

P{Xnx ≠ Xnx′} = P{∀i ≤ n, (Xix, Xix′) ∉ Δ} → 0.

(iv) ⇒ (i) Fix ε > 0. By the ergodicity of P, Xnx0 = OP(1), and there exists a finite K ⊂ E such that

sup_n P{Xnx0 ∉ K} < ε.

Denote m = #K, and find n0 such that, for all n ≥ n0 and all x ∈ K \ {x0},

P{Xnx0 ≠ Xnx} < ε/m.


Then, for all n ≥ n0 and k ≥ 1,

P{d(ξnx0, ξ_{n+k}x0) > ε} = ∑_{x≠x0} P{d(ξnx0, ξnx) > ε, Fn+1 · · ·Fn+kx0 = x}
= ∑_{x≠x0} P{d(ξnx0, ξnx) > ε} P{Xkx0 = x}
≤ ε + ∑_{x∈K\{x0}} P{Xnx0 ≠ Xnx} ≤ 2ε.

Hence, (ξnx0) is a Cauchy sequence in probability and therefore has a limit in probability. □

Proof of Theorem 4. Fix x ≠ x′, and consider the chain ((Xnx, Xnx′)). In view of Theorem 3, we need to show that P{Xnx ≠ Xnx′} < 1 for some n. Suppose, on the contrary, that, almost surely, Xnx ≠ Xnx′ for all n. Then, by the independence assumption, (Xnx) is independent of (Xnx′). Let y be any recurrent state, and let τk (respectively, τ′k) denote the successive moments of hitting y by the chain (Xnx) (respectively, (Xnx′)). Clearly, (τk) and (τ′k) are independent random walks (with delay) on Z. Moreover, E(τ2 − τ1) = E(τ′2 − τ′1) < ∞ (because of the ergodicity of P). Therefore, (τk − τ′k) is a random walk with zero-mean increments and hence recurrent. Almost surely, it visits every state infinitely often. Therefore, almost surely, τk = τ′k for some k. This yields P{∃n, Xnx = y = Xnx′} = 1, and we get a contradiction. □

Proof of Theorem 5. If E is finite, the space C(E,E) is also finite. Let Φ denote its subset consisting of constant functions. By Theorem 3, (ξn) converges in probability to a random constant function. Therefore, there exist a ∈ Φ and n with P{ξn = a} > 0. Since fa ∈ Φ for all f ∈ C(E,E), we have

′} = 1, and we get a contradiction. � Proof of Theorem 5. If E is finite, the space C(E,E) is also finite. Let Φ denote its subset consisting ofconstant functions. By Theorem 3, (ξn) converges in probability to a random constant function. Therefore,there exist a ∈ Φ and n with P{ξn = a} > 0. Since fa ∈ Φ for all f ∈ C(E,E), we have

P{fξn ∈ Φ} > 0 (4.2)

for all f. Now consider (ξn) as a Markov chain with finite state space C(E,E). Obviously, each a ∈ Φ is an absorbing state. Then Φ is also absorbing, and (4.2) means that it is accessible from everywhere. Hence, all f ∉ Φ are transient. The chain (ξn) almost surely visits each transient state only finitely many times. Therefore, almost surely, ξn hits Φ, and hence, almost surely, ξn → ξτ, where τ is the hitting time of Φ. □

Proof of Theorem 6. Fix x, y ∈ E and denote

U0 = {f | fx0 = y}, U = {f | fx = y}.

Find l with Pl(x0, x) > 0, and denote that probability by δ. If f ∉ U, then

P{fFl · · ·F1 ∉ U0} = P{fFl · · ·F1x0 ≠ y} ≥ P{Fl · · ·F1x0 = x} = δ.

Then, using the Markov property, it is easy to prove that, almost surely,

∀m, ∃n ≥ m, ξn ∈ Uᶜ  ⟹  ∀m, ∃n ≥ m, ξn ∈ U0ᶜ,

i.e., almost surely,

∃m, ∀n ≥ m, ξn ∈ U0  ⟹  ∃m, ∀n ≥ m, ξn ∈ U.

However, ∃m, ∀n ≥ m, ξn ∈ U0 (respectively, ∃m, ∀n ≥ m, ξn ∈ U) means that ξnx0 → y (respectively, ξnx → y), and by the assumption of the theorem, (ξnx0) tends almost surely to some random variable x̄. Therefore, for all y,

P{x̄ = y, ξnx ↛ y} = 0.


This yields

P{ξnx ↛ x̄} = 0,

i.e., ξnx → x̄ almost surely. □

REFERENCES

1. P. Diaconis and D. Freedman, Iterated random functions, SIAM Rev., 41:45–76, 1999.

2. I. Ghikhman and A. Skorokhod, Theory of Random Processes, Vol. 1, Nauka, Moscow, 1971 (in Russian).

3. G. Letac, A contraction principle for certain Markov chains and its applications, in J.E. Cohen, H. Kesten, and Ch.M. Newman (Eds.), Random Matrices and Their Applications, Contemp. Math., Vol. 50, Amer. Math. Soc., Providence, RI, 1986, pp. 263–273.

4. E. Nummelin, General Irreducible Markov Chains and Nonnegative Operators, Cambridge Univ. Press, Cambridge, 1984.

5. P. Walters, An Introduction to Ergodic Theory, Grad. Texts Math., Vol. 79, Springer-Verlag, New York, Heidelberg, Berlin, 1982.
