Advanced Probability: Solutions to Sheet 2
Guolong Li∗
November 26, 2013
1 Discrete-time martingales
Exercise 1.1
Let us suppose that, at time 0, an urn contains a single black ball and a
single white ball. At each time n ≥ 1, a ball is chosen uniformly at random
from those in the urn and it is replaced, together with another ball of the
same colour. We will denote the number of black balls that we have chosen
by time n by Bn. Immediately after time n, then, the urn contains n + 2
balls, of which Bn + 1 are black.
Proposition. Let Mn := (Bn+1)/(n+2) for all n ≥ 0; this is the proportion
of black balls in the urn immediately after time n. With respect to a certain
natural filtration, M = (Mn : n ≥ 0) is a martingale that converges a.s. and
in Lp for all p ∈ [1,∞) to some [0, 1]-valued random variable, X∞.
Proof. Let us define Fn := σ(B0, . . . , Bn) for all n ≥ 0. We shall show that
the process M is a martingale with respect to this filtration. Clearly, for
all n ≥ 0, Mn is Fn-measurable and such that |Mn| ≤ 1, so that it is in
particular integrable. We also see that, almost surely,
E[M_{n+1} | F_n] = M_n · (B_n + 2)/(n + 3) + (1 − M_n) · (B_n + 1)/(n + 3)
                 = [(B_n + 1)(B_n + 2) + (n + 1 − B_n)(B_n + 1)] / [(n + 2)(n + 3)]
                 = [(B_n + 1)(n + 3)] / [(n + 2)(n + 3)] = M_n,
and so we conclude that M is a martingale.
∗Comments and corrections should be sent to [email protected].
As we mentioned earlier, |Mn| ≤ 1 for all n ≥ 0, so the process is bounded
in Lp for all p ≥ 1. Therefore, by the Lp martingale convergence theorem,
there exists a random variable X∞ such that, for all p ∈ (1,∞), Mn → X∞
a.s. and in Lp. (By the ‘a.s.’ part of the theorem, this X∞ is the same for
each p ∈ (1,∞).) A fortiori, then, Mn → X∞ in L1 as well. Additionally, as
Mn ∈ [0, 1], it follows that X∞ ∈ [0, 1] a.s.
Proposition. For each k ≥ 1 and each n ≥ 0, define

M^{(k)}_n := ∏_{r=1}^{k} (B_n + r)/(n + r + 1).

The process (M^{(k)}_n : n ≥ 0) is a martingale with respect to the same natural
filtration as in the previous proposition.
Proof. Let us fix some k ≥ 1. Again, it is obvious that, for each n ≥ 0,
M^{(k)}_n is F_n-measurable and that, as each factor lies in [0, 1], |M^{(k)}_n| ≤ 1; the
process is therefore adapted and integrable. To verify that the martingale
property obtains, let us fix some n ≥ 0. Then, a.s.,
E[M^{(k)}_{n+1} | F_n] = E[∏_{r=1}^{k} (B_{n+1} + r)/(n + r + 2) | F_n]
= M_n · ∏_{r=1}^{k} (B_n + r + 1)/(n + r + 2) + (1 − M_n) · ∏_{r=1}^{k} (B_n + r)/(n + r + 2)
= [∏_{r=2}^{k} (B_n + r) / ∏_{r=1}^{k} (n + r + 2)] · [(B_n + 1)(B_n + k + 1) + (n + 1 − B_n)(B_n + 1)]/(n + 2)
= [∏_{r=2}^{k} (B_n + r) / ∏_{r=1}^{k} (n + r + 2)] · (B_n + 1)(n + k + 2)/(n + 2)
= ∏_{r=1}^{k} (B_n + r) / ∏_{r=1}^{k} (n + r + 1) = M^{(k)}_n.
Looking back at the definition of M^{(k)}_n, it is quite clear that it almost
equals M_n^k. Our next goal will be to quantify this and show that, as n → ∞,
the difference disappears in a suitable manner. Each factor in the definition
of M^{(k)}_n can be rewritten as

(B_n + r)/(n + r + 1) = (B_n + r)/(n + 2) · (n + 2)/(n + r + 1) = (M_n + (r − 1)/(n + 2)) · (n + 2)/(n + r + 1).
From this it is clear that each of the k factors tends to X∞ a.s. as n → ∞
and hence that M^{(k)}_n → X∞^k a.s. as n → ∞. As we mentioned earlier,
|M^{(k)}_n| ≤ 1 so, by the same reasoning as that employed in the proof of
our first proposition, there exists a random variable M∞ such that, for all
p ∈ [1,∞), M^{(k)}_n → M∞ a.s. and in Lp. Therefore X∞^k = M∞ a.s. and
so, for all p ∈ [1,∞), M^{(k)}_n → X∞^k a.s. and in Lp. In particular, then, the
convergence holds in L1, and

E[X∞^k] = lim_{n→∞} E[M^{(k)}_n] = E[M^{(k)}_0] = 1/(k + 1),
where the penultimate equality in the above holds by the martingale property.
We can use this to determine the law of X∞. The moment generating
function of X∞ exists as X∞ ∈ [0, 1] a.s. It is given by
M_{X∞}(t) := E[e^{tX∞}] = E[Σ_{k=0}^{∞} (tX∞)^k / k!] = Σ_{k=0}^{∞} t^k E[X∞^k] / k! = Σ_{k=0}^{∞} t^k/(k + 1)! = (e^t − 1)/t.
We have used Fubini’s theorem in the third equality together with the
absolute convergence of the series in the third term. The moment generating
function of a random variable which is uniformly distributed on [0, 1] is
precisely equal to MX∞ so it follows that X∞ is itself uniformly distributed
on [0, 1].
We shall next reobtain this result in a more direct way by showing that
Bn is uniformly distributed on {0, 1, . . . , n} for each n ≥ 0. For the cases
n = 0 and n = 1, this is immediate; let us suppose that we have established
the result for B_N. Let us take some 1 ≤ k ≤ N. Then
P(B_{N+1} = k) = P(B_{N+1} = k | B_N = k) P(B_N = k) + P(B_{N+1} = k | B_N = k − 1) P(B_N = k − 1)
= [(N + 2) − (k + 1)]/(N + 2) · 1/(N + 1) + [(k − 1) + 1]/(N + 2) · 1/(N + 1)
= 1/(N + 2).
We also have that

P(B_{N+1} = 0) = P(B_{N+1} = 0 | B_N = 0) P(B_N = 0) = (N + 1)/(N + 2) · 1/(N + 1) = 1/(N + 2)

and, finally, that

P(B_{N+1} = N + 1) = P(B_{N+1} = N + 1 | B_N = N) P(B_N = N) = (N + 1)/(N + 2) · 1/(N + 1) = 1/(N + 2).
It follows from these calculations that B_{N+1} is uniformly distributed on the
set {0, 1, . . . , N + 1} and hence, by induction, we have proven the claim.
We can use this to rederive the distribution of X∞ in the following way.
As Mn → X∞ a.s., Mn → X∞ in distribution, so if we can show that the
distribution function of Mn converges everywhere to that of a random variable
that is uniformly distributed on [0, 1], it will follow that X∞ is uniformly
distributed on [0, 1]. We begin by noting that, as Mn = (Bn + 1)/(n + 2), Mn
is uniformly distributed on {1/(n + 2), . . . , (n + 1)/(n + 2)}. If Fn denotes
the distribution function of Mn, by definition

Fn(x) = 0 if x < 0,    Fn(x) = ⌊(n + 2)x⌋/(n + 1) if 0 ≤ x < 1,    Fn(x) = 1 if x ≥ 1.

Clearly Fn(x) → 0 if x < 0 and Fn(x) → 1 if x ≥ 1, so let us suppose that
0 ≤ x < 1. Then Fn(x) = ⌊(n + 2)x⌋/(n + 1) → x as n → ∞, and hence Fn
converges everywhere to the distribution function of a random variable that
is uniformly distributed on [0, 1]. We conclude that X∞ is, as we had shown
earlier, uniformly distributed on [0, 1].
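The induction can also be checked mechanically. The sketch below (the function name is ours, not part of the exercise) propagates the exact law of Bn through the urn's transition probabilities, using the fact from the text that after time m the urn holds m + 2 balls of which Bm + 1 are black, and confirms that the law remains uniform.

```python
from fractions import Fraction

def urn_law(n):
    """Exact law of Bn: after time m the urn holds m + 2 balls, of which
    Bm + 1 are black, so a black ball is drawn with probability (Bm+1)/(m+2)."""
    law = {0: Fraction(1)}  # B0 = 0: no ball has been drawn yet
    for m in range(n):
        new = {}
        for k, pr in law.items():
            p_black = Fraction(k + 1, m + 2)
            new[k + 1] = new.get(k + 1, 0) + pr * p_black  # black ball drawn
            new[k] = new.get(k, 0) + pr * (1 - p_black)    # white ball drawn
        law = new
    return law

law10 = urn_law(10)  # the law of B10, which should be uniform on {0, ..., 10}
```

Because the arithmetic is exact rational arithmetic, the uniformity P(B10 = k) = 1/11 comes out exactly rather than up to simulation error.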
Proposition. Let 0 < θ < 1 and define, for all n ≥ 0,

N_n(θ) := (n + 1)!/(B_n! (n − B_n)!) · θ^{B_n} (1 − θ)^{n − B_n}.

The process N(θ) := (N_n(θ) : n ≥ 0) is a martingale with respect to the same
filtration as that in the previous propositions.
Proof. Again, it is trivially the case that N_n(θ) is F_n-measurable and
integrable, so it suffices to check the martingale property. Let n ≥ 0. Then,
a.s.,

E[N_{n+1}(θ) | F_n] = E[(n + 2)!/(B_{n+1}! (n + 1 − B_{n+1})!) · θ^{B_{n+1}} (1 − θ)^{n+1−B_{n+1}} | F_n]
= M_n · (n + 2)!/((B_n + 1)! (n − B_n)!) · θ^{B_n+1} (1 − θ)^{n−B_n} + (1 − M_n) · (n + 2)!/(B_n! (n + 1 − B_n)!) · θ^{B_n} (1 − θ)^{n+1−B_n}
= (n + 1)!/(B_n! (n − B_n)!) · θ^{B_n} (1 − θ)^{n−B_n} (θ + 1 − θ)
= N_n(θ).
Exercise 1.2
Given Θ = θ, the probability of observing a particular sequence B_1, B_2, . . . , B_n is
θ^{B_n}(1 − θ)^{n−B_n}. So, when Θ is uniformly distributed on [0, 1], the probability of
observing the sequence B_1, B_2, . . . , B_n is

∫_0^1 θ^{B_n}(1 − θ)^{n−B_n} dθ = B(B_n + 1, n − B_n + 1) = B_n!(n − B_n)!/(n + 1)!,

where B denotes the beta function. It follows that N_n(θ) is indeed the
conditional density of Θ given B_1, B_2, . . . , B_n. The probability that
the (n + 1)th toss is a head is then the conditional expectation of Θ given
B_1, B_2, . . . , B_n, namely

(n + 1)!/(B_n!(n − B_n)!) · ∫_0^1 θ^{B_n+1}(1 − θ)^{n−B_n} dθ = (B_n + 1)/(n + 2).

The process therefore has the same probabilistic structure as the urn of the
previous exercise.
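The closing identity can be verified exactly, since at positive integer arguments the beta function reduces to factorials. A minimal sketch with hypothetical helper names:

```python
from fractions import Fraction
from math import factorial

def beta(x, y):
    """Beta function at positive integer arguments: B(x, y) = (x-1)!(y-1)!/(x+y-1)!."""
    return Fraction(factorial(x - 1) * factorial(y - 1), factorial(x + y - 1))

def predictive_head_prob(n, b):
    """P(the (n+1)th toss is a head | Bn = b) under a uniform prior: the
    posterior normalising constant times the integral of theta^(b+1)(1-theta)^(n-b),
    i.e. times B(b + 2, n - b + 1)."""
    norm = Fraction(factorial(n + 1), factorial(b) * factorial(n - b))
    return norm * beta(b + 2, n - b + 1)

# Compare against the urn's proportion (b + 1)/(n + 2) for every small case.
checks = [(predictive_head_prob(n, b), Fraction(b + 1, n + 2))
          for n in range(12) for b in range(n + 1)]
```

The exact rational computation confirms that the Bayesian predictive probability agrees with the urn's proportion of black balls in every case checked.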
Exercise 1.3
At each time n ≥ 1, let us suppose that an idealised monkey types a capital
letter at random; that is, let us suppose that (U_n : n ≥ 1) is an iid sequence
of random variables that are uniformly distributed on {A, . . . , Z} and that U_n
corresponds to the nth random character that the monkey types. Let us define
T to be the first time that the idealised monkey types 'ABRACADABRA',
i.e.

T := inf{n ≥ 11 : U_{n−10} = A, U_{n−9} = B, . . . , U_n = A}.
Our goal in this exercise is to calculate E[T ] with a martingale argument.
In addition to the above, let us suppose that we have a sequence of
gamblers indexed by k ∈ Z>0 and that each gambler plays the following
game. He begins by betting £1 that the nth letter will be ‘A’. If he loses, he
leaves; if he wins, he receives £26 and continues playing. If he is still playing,
he places his fortune—all £26 of it—on the (n + 1)th letter being ‘B’. If
he loses, he leaves; if he wins, he receives £26². He continues in this way,
betting everything on ‘ABRACADABRA’ being the eventual sequence.
We will now introduce a martingale based on these gamblers. Let us
stipulate that the kth gambler begins playing at time k and that X^{(k)}_n denotes
the kth gambler's fortune at time n; to be more specific, X^{(k)}_{k−1} = 1, X^{(k)}_k
is either 0 (if U_k ≠ A) or 26 (if U_k = A), and so on up to X^{(k)}_{k+10}, which
is either 0 or 26^{11}. (We choose this time parameterisation so that the kth
gambler first bets on the outcome of the kth letter.) Let us also set X^{(k)}_n = 0
if n < k − 1 and let us set X^{(k)}_n = X^{(k)}_{k+10} if n > k + 10; this second condition
corresponds to the gambler keeping his fortune if the monkey has typed
'ABRACADABRA'. Let us define X_n := Σ_{k=1}^{n} X^{(k)}_n. The random variable
X_n corresponds to the sum of the fortunes of all 'active' players. Finally,
let us set F_n := σ(U_1, . . . , U_n) for each n ≥ 1. If k < n − 9, then X^{(k)}_{n+1} = X^{(k)}_{k+10},
which is F_n-measurable as k + 10 ≤ n. It follows that, almost surely,
E[X_{n+1} | F_n] = Σ_{k=1}^{n+1} E[X^{(k)}_{n+1} | F_n] = Σ_{k=1}^{n−10} E[X^{(k)}_{n+1} | F_n] + Σ_{k=n−9}^{n+1} E[X^{(k)}_{n+1} | F_n]
= Σ_{k=1}^{n−10} X^{(k)}_{k+10} + Σ_{k=n−9}^{n+1} E[X^{(k)}_{n+1} | F_n]
= Σ_{k=1}^{n−10} X^{(k)}_n + Σ_{k=n−9}^{n+1} (26 · X^{(k)}_n · (1/26) + 0)
= X_n + X^{(n+1)}_n
= X_n + 1.
It follows from the above that (Xn − n : n ≥ 1) satisfies the martingale
property. Moreover, Xn − n is clearly Fn-measurable. It is also integrable as
|X_n − n| ≤ n · 26^{11} + n, and hence (X_n − n : n ≥ 1) is a martingale.
Now, the random time T is plainly a stopping time and it is clearly the
case that
T ≤ inf{n ∈ 11 · Z_{>0} : U_{n−10} = A, U_{n−9} = B, U_{n−8} = R, . . . , U_n = A}.
By independence, the right-hand side of the above is a geometric random
variable with success probability 26−11. As this has a finite (albeit large) mean,
it follows by comparison that E[T ] <∞. By Exercise 2.4, if (Xn − n : n ≥ 1)
has (a.s.) bounded increments then we can apply an optional stopping
theorem with T . We see that
|X_{n+1} − (n + 1) − X_n + n| ≤ 1 + |X_{n+1} − X_n|
= 1 + |Σ_{k=1}^{n+1} X^{(k)}_{n+1} − Σ_{k=1}^{n} X^{(k)}_n|
= 1 + |Σ_{k=1}^{n−10} X^{(k)}_{k+10} + Σ_{k=n−9}^{n+1} X^{(k)}_{n+1} − Σ_{k=1}^{n−11} X^{(k)}_{k+10} − Σ_{k=n−10}^{n} X^{(k)}_n|
≤ 1 + X^{(n−10)}_n + Σ_{k=n−9}^{n+1} X^{(k)}_{n+1} + Σ_{k=n−10}^{n} X^{(k)}_n
≤ 1 + 23 · 26^{11},

and hence (X_n − n : n ≥ 1) has bounded increments. Therefore E[X_T − T] =
E[X_1 − 1] = 0, and so E[T] = E[X_T] = 26^{11} + 26^4 + 26, as X^{(k)}_T ≠ 0 if and
only if k ∈ {T, T − 3, T − 10}.
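The surviving indices {T, T − 3, T − 10} correspond to the prefixes of 'ABRACADABRA' that are also suffixes, of lengths 11, 4 and 1, so E[T] is the sum of 26^ℓ over those lengths. The sketch below (our own helper, not part of the solution) recomputes those lengths with the standard KMP failure function and evaluates the sum.

```python
def border_lengths(word):
    """Lengths of every prefix of `word` that is also a suffix of `word`
    (including the whole word), found by walking the KMP failure function
    down from the full length."""
    n = len(word)
    fail = [0] * n  # fail[i]: length of the longest proper border of word[:i+1]
    k = 0
    for i in range(1, n):
        while k > 0 and word[i] != word[k]:
            k = fail[k - 1]
        if word[i] == word[k]:
            k += 1
        fail[i] = k
    lengths, l = [], n
    while l > 0:
        lengths.append(l)
        l = fail[l - 1]
    return lengths

lengths = border_lengths("ABRACADABRA")     # the surviving gamblers' match lengths
expected_T = sum(26 ** l for l in lengths)  # 26^11 + 26^4 + 26
```

This is the general pattern behind the answer: for any target word, E[T] is the sum of (alphabet size)^ℓ over the word's border lengths.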
Exercise 1.4
Let us write P(A | G ) for E[1A | G ] and let us suppose that X = (Xn : n ≥ 0)
is a sequence of (0, 1)-valued random variables defined in the following
way. We begin by fixing some a ∈ (0, 1) and setting X0 := a a.s. We then
inductively define Xn+1 from Xn by
P(Xn+1 = Xn/2 | Fn) = 1−Xn = 1− P(Xn+1 = (Xn + 1)/2 | Fn),
where Fn := σ(Xk : 0 ≤ k ≤ n).
Proposition. The process X is a martingale with respect to its natural
filtration. Moreover, there exists some X∞ such that Xn → X∞ in Lp for
each p ∈ [1,∞).
Proof. The process is clearly adapted and, as X_n is a.s. (0, 1)-valued for each
n ≥ 0, it is integrable. We will now check the martingale property. If we
fix n ≥ 0 then, a.s.,
E[X_{n+1} | F_n] = E[X_{n+1} 1{X_{n+1} = X_n/2} | F_n] + E[X_{n+1} 1{X_{n+1} = (X_n+1)/2} | F_n]
= E[(X_n/2) 1{X_{n+1} = X_n/2} | F_n] + E[((X_n + 1)/2) 1{X_{n+1} = (X_n+1)/2} | F_n]
= (X_n/2) E[1{X_{n+1} = X_n/2} | F_n] + ((X_n + 1)/2) E[1{X_{n+1} = (X_n+1)/2} | F_n]
= X_n(1 − X_n)/2 + (X_n + 1)X_n/2 = X_n,
where we have used the fact that Xn is bounded and Fn-measurable in
the third equality. The process is therefore a martingale. As X is bounded,
it is in particular bounded in Lp for all p ∈ (1,∞); by the Lp martingale
convergence theorem, there is some X∞ such that, for all p > 1, Xn → X∞
a.s. and in Lp. As Lp convergence implies L1 convergence on finite measure
spaces when p ∈ (1,∞), Xn → X∞ in L1 as well.
For the second part of the exercise, we see that
E[(X_{n+1} − X_n)²] = E[E[(X_n²/4) 1{X_{n+1} = X_n/2} | F_n]] + E[E[((1 − X_n)²/4) 1{X_{n+1} = (X_n+1)/2} | F_n]]
= E[(X_n²/4) E[1{X_{n+1} = X_n/2} | F_n]] + E[((1 − X_n)²/4) E[1{X_{n+1} = (X_n+1)/2} | F_n]]
= E[X_n²(1 − X_n) + (1 − X_n)² X_n]/4
= E[X_n(1 − X_n)]/4,
where the second equality holds as Xn is bounded and Fn-measurable.
As X_n → X∞ both in L1 and L2, we have that E[X_n] → E[X∞] and
E[X_n²] → E[X∞²] and hence that

E[X_n(1 − X_n)] → E[X∞(1 − X∞)].
Moreover, as (Xn : n ≥ 1) is a Cauchy sequence in L2,
E[Xn(1−Xn)]/4 = E[(Xn+1 −Xn)2]→ 0.
By combining the above two facts we see that E[X∞(1−X∞)] = 0. As X∞
is a.s. [0, 1]-valued, X∞(1 − X∞) ≥ 0 a.s.; as its expectation is 0, it must
equal 0 a.s., and hence X∞ is a.s. {0, 1}-valued. Finally, as X_n → X∞ in L1
and as X is a martingale, E[X∞] = E[X0] = a and therefore X∞ is Bernoulli
distributed with parameter a.
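The decay of E[X_n(1 − X_n)] can be made concrete: the same conditioning as above gives E[X_{n+1}(1 − X_{n+1}) | F_n] = (3/4) X_n(1 − X_n), a step not taken in the text. The sketch below (helper name ours) propagates the exact law of X_n from X_0 = a and checks both this geometric decay and the martingale identity E[X_n] = a.

```python
from fractions import Fraction

def law_at(n, a):
    """Exact law of Xn for the chain started at X0 = a, moving to Xn/2 with
    probability 1 - Xn and to (Xn + 1)/2 with probability Xn."""
    law = {Fraction(a): Fraction(1)}
    for _ in range(n):
        new = {}
        for x, pr in law.items():
            new[x / 2] = new.get(x / 2, 0) + pr * (1 - x)        # down move
            new[(x + 1) / 2] = new.get((x + 1) / 2, 0) + pr * x  # up move
        law = new
    return law

a = Fraction(1, 3)
law8 = law_at(8, a)
mean8 = sum(x * pr for x, pr in law8.items())             # should equal a
decay8 = sum(x * (1 - x) * pr for x, pr in law8.items())  # should be (3/4)^8 a(1-a)
```

The exact identity E[X_8(1 − X_8)] = (3/4)^8 a(1 − a) shows quantitatively how the mass of X_n concentrates near {0, 1}.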
Exercise 1.5
Let us suppose that X = (Xn : n ≥ 0) is a martingale in L2.
Proposition. The increments of X are pairwise orthogonal. That is, for all
n ≠ m,

E[(X_{n+1} − X_n)(X_{m+1} − X_m)] = 0.
Proof. Without loss of generality let us suppose that n > m ≥ 0. Then
E[(Xm+1 −Xm)(Xn+1 −Xn)] = E[E[(Xm+1 −Xm)(Xn+1 −Xn) | Fn]]
= E[(Xm+1 −Xm)E[Xn+1 −Xn | Fn]]
= E[(Xm+1 −Xm)(Xn −Xn)]
= 0.
The second equality in the above holds as Xm+1 − Xm is Fn-measurable
and in L2 and Xn+1 −Xn is in L2, and the penultimate equality holds by
the martingale property.
Proposition. The process X is bounded in L2 if and only if

Σ_{n=0}^{∞} E[(X_{n+1} − X_n)²] < ∞.
Proof. The Pythagorean theorem for inner product spaces says that, if
{v_1, . . . , v_N} is an orthogonal set, then ‖Σ_{n=1}^{N} v_n‖² = Σ_{n=1}^{N} ‖v_n‖². As
{X_{n+1} − X_n : 0 ≤ n ≤ N} is an orthogonal set in L2, we see that
Σ_{n=0}^{N} E[(X_{n+1} − X_n)²] = E[(X_{N+1} − X_0)²] and hence that

0 ≤ lim_{N→∞} Σ_{n=0}^{N} E[(X_{n+1} − X_n)²] = lim_{N→∞} E[(X_{N+1} − X_0)²] ≤ sup_{N≥0} E[(X_{N+1} − X_0)²].

If Σ_{n=0}^{∞} E[(X_{n+1} − X_n)²] < ∞ then the second term in the above is finite
and, as convergent sequences are bounded, the third term is finite too.
It follows that X = (X − X_0) + X_0 is bounded in L2. Conversely, if X is
bounded in L2 then the third term is finite and hence so is the first.
Exercise 1.6
In this exercise we will prove several versions of Wald’s identity. Let us take
X = (Xn : n ≥ 1) to be an iid sequence of integrable random variables, and
let us define S_n := X_1 + · · · + X_n for all n ≥ 0, F_n := σ(X_1, . . . , X_n) for all
n ≥ 1, and F_0 := {∅, Ω}. Finally, let T be a stopping time with respect to
the filtration (F_n : n ≥ 0).
Proposition. If Xn is nonnegative for each n ≥ 1 then E[ST ] = E[T ]E[X1].
Remark. Under the assumptions above, if ω is such that T (ω) = ∞ then
we should interpret ST (ω) to mean limn→∞ Sn(ω). Provided we permit the
possibility of infinity, this interpretation is sensible as (Sn : n ≥ 1) is a
nondecreasing sequence.
Proof. Let us consider the process Y = (Yn : n ≥ 0) := (Sn−nE[X1] : n ≥ 0).
It is trivial that Y is adapted to the filtration (Fn : n ≥ 0) and that Yn is
integrable. We also a.s. have that, for each n ≥ 0,
E[Yn+1 | Fn] = E[Yn +Xn+1 − E[X1] | Fn] = Yn + E[Xn+1]− E[X1] = Yn.
This is the case as Y is adapted and as (Xn : n ≥ 1) is an iid sequence. It
follows from these properties that Y is a martingale and hence that Y^T is,
too. The martingale property implies that E[Y^T_n] = E[Y^T_0] = 0, that is, that
E[S_{T∧n}] = E[T ∧ n] E[X_1]. If we apply the monotone convergence theorem to
both sides of this then we see that E[S_T] = E[T] E[X_1].
Proposition. If T is integrable then E[ST ] = E[T ]E[X1].
Proof. We will use our previous proposition in our proof. We see that

S_n = Σ_{k=1}^{n} X_k = Σ_{k=1}^{n} X_k^+ − Σ_{k=1}^{n} X_k^− =: S_n^1 − S_n^2.

As (X_n^+ : n ≥ 1) and (X_n^− : n ≥ 1) are iid sequences of nonnegative,
integrable random variables, the first part of the question implies that
E[S_T^1] = E[T] E[X_1^+] and E[S_T^2] = E[T] E[X_1^−]. As E[T] < ∞ and E[|X_1|] < ∞,
S_T^1 and S_T^2 are integrable, and hence so is S_T = S_T^1 − S_T^2:

E[S_T] = E[S_T^1] − E[S_T^2] = E[T](E[X_1^+] − E[X_1^−]) = E[T] E[X_1].
Let us now suppose that X_1 is centred and that T_a := inf{n ≥ 0 : S_n ≥ a}
for some fixed a > 0. Were it the case that E[T_a] < ∞, we would have that
T_a < ∞ a.s. and hence, by the previous proposition, that 0 < a ≤ E[S_{T_a}] =
E[T_a] E[X_1] = 0, which is absurd. It follows that E[T_a] = ∞.
For the final part of the exercise, we may as well consider the more general
scenario in which P(X_n = 1) = p ∈ (1/2, 1) and P(X_n = −1) = q := 1 − p, as
the work is identical. If we can prove that E[T_a] < ∞ then we can apply the
second proposition and conclude that E[S_{T_a}] = E[X_1] E[T_a], i.e. that E[T_a] =
⌈a⌉/(p − q). For each n ≥ 0, T_a ∧ n is a bounded (and therefore integrable)
stopping time. By the previous proposition, then, E[T_a ∧ n] E[X_1] = E[S_{T_a ∧ n}].
The monotone convergence theorem then tells us that

0 ≤ E[T_a] E[X_1] = lim_{n→∞} E[T_a ∧ n] E[X_1] = lim_{n→∞} E[S_{T_a ∧ n}] ≤ ⌈a⌉.

The last inequality is true as, for each n ≥ 0, E[S_{T_a ∧ n}] ≤ ⌈a⌉. Therefore, as
E[X_1] > 0, E[T_a] < ∞ and so, from our earlier reasoning, we conclude that
E[T_a] = ⌈a⌉/(p − q). In the case where p = 2/3, E[T_a] = 3⌈a⌉.
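The value 1/(p − q) = 3 per unit level can be cross-checked without martingales. By the strong Markov property everything reduces to E[T_1], and the first-passage law P(T_1 = 2n + 1) = C_n p^{n+1} q^n (C_n the nth Catalan number; this decomposition is standard but not part of the solution) can be summed numerically:

```python
from fractions import Fraction

def catalan(n):
    """Catalan number C_n = (2n)! / (n! (n+1)!), via the standard recurrence."""
    c = Fraction(1)
    for i in range(n):
        c = c * 2 * (2 * i + 1) / (i + 2)
    return c

p, q = Fraction(2, 3), Fraction(1, 3)
# E[T_1] = sum over n of (2n+1) * P(T_1 = 2n+1); the truncation error is
# geometrically small because the terms decay like (4pq)^n = (8/9)^n.
expected_T1 = sum((2 * n + 1) * catalan(n) * p ** (n + 1) * q ** n
                  for n in range(300))
```

The truncated sum agrees with ⌈a⌉/(p − q) for a = 1, i.e. with the value 3 obtained from Wald's identity, to well beyond floating-point display precision.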
Exercise 1.7
In this exercise we will investigate the gambler’s ruin. Let us suppose that
(Xn : n ≥ 1) is an iid sequence with P(X1 = 1) = p ∈ (0, 1) and P(X1 =
−1) = q := 1 − p and that a, b ∈ Z are such that 0 < a < b. Let us define
S_n := a + X_1 + · · · + X_n for all n ≥ 0; F_n := σ(X_1, . . . , X_n) for all n ≥ 1;
F_0 := {∅, Ω}; and T := inf{n ≥ 0 : S_n ∈ {0, b}}.
Proposition. The process M = (M_n : n ≥ 0) defined by M_n := (q/p)^{S_n} for
all n ≥ 0 is a martingale with respect to (F_n : n ≥ 0).

Proof. The process M is clearly adapted. It is also integrable as, for each
n ≥ 0, |M_n| = (q/p)^{S_n} ≤ (q/p)^{a+n} ∨ (q/p)^{a−n} < ∞. To establish the
martingale property, let us fix some n ≥ 0. Then, noting that M_n is bounded
and that X_{n+1} is independent of F_n, we a.s. have that

E[M_{n+1} | F_n] = E[M_n (q/p)^{X_{n+1}} | F_n] = (q/p)^{S_n} E[(q/p)^{X_{n+1}} | F_n]
= M_n E[(q/p)^{X_{n+1}}]
= M_n ((q/p) · p + (p/q) · q) = M_n.
Proposition. The process N = (N_n : n ≥ 0) defined by N_n := S_n − n(p − q) for all n ≥ 0 is a martingale with respect to (F_n : n ≥ 0).
Proof. The process N is clearly adapted. As Nn is a finite sum of integrable
random variables added to a constant for each n ≥ 1, it is also integrable.
Let us fix some n ≥ 0. We see that
E[Nn+1 | Fn] = E[Xn+1] + Sn − (n+ 1)(p− q) = Nn
as Xn+1 is independent of Fn and E[Xn+1] = p− q. It follows that N is a
martingale.
If S steps up b times in a row, then T must have occurred by the end of
that sequence, so
T ≤ b · inf{k + 1 : k ∈ Z_{≥0}, X_{kb+j} = 1 for all 1 ≤ j ≤ b} =: bτ
with probability 1. It is clear that τ has the distribution of a geometric
random variable (taking values in {1, 2, . . .}) with success probability p^b. As
τ has finite expectation, E[T] < ∞; it follows that T < ∞ a.s. If T_α denotes
the first time that S hits α ∈ Z, then T = T_0 ∧ T_b and

P(T_0 < T_b) + P(T_b < T_0) = 1. (1)
As M^T is a bounded process, we can apply the dominated convergence
theorem to give us

(q/p)^a = E[M^T_0] = E[M^T_n] → E[M_T] = P(T_0 < T_b) + (q/p)^b P(T_b < T_0), (2)

where the second equality follows from M^T being a martingale. By (1) and
(2),

P(T_b < T_0) = ((q/p)^a − 1)/((q/p)^b − 1)  and  P(T_0 < T_b) = ((q/p)^b − (q/p)^a)/((q/p)^b − 1). (3)
We note that P(ST = 0) = P(T0 < Tb). We also see that N has bounded
increments:
|Nn+1−Nn| = |(Sn+1−Sn)−((n+1)(p−q)−n(p−q))| ≤ |Xn+1|+|p−q| ≤ 2.
As E[T ] <∞, the optional stopping theorem entails that
a = E[N0] = E[NT ] = E[ST − T (p− q)] = bP(Tb < T0)− E[T ](p− q).
By the above, together with (3), we conclude that

E[T] = (b P(T_b < T_0) − a)/(p − q) = (b((q/p)^a − 1) − a((q/p)^b − 1)) / ((p − q)((q/p)^b − 1)).
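Formulas (3) and the expression for E[T] can be checked against the one-step conditioning equations that the hitting probability and expected duration must satisfy. The sketch below (helper names are ours) solves those equations exactly for small a and b and compares with the closed forms:

```python
from fractions import Fraction

def ruin_exact(a, b, p):
    """Solve h(s) = p h(s+1) + q h(s-1) (probability of hitting b before 0)
    and m(s) = 1 + p m(s+1) + q m(s-1) (expected duration), by propagating
    each recurrence with the value at s = 1 left as an unknown u and then
    matching the boundary condition at s = b."""
    q = 1 - p

    def solve(inhomog, target):
        # Represent each value as c0 + c1 * u, u the unknown value at s = 1.
        vals = [(Fraction(0), Fraction(0)), (Fraction(0), Fraction(1))]
        for s in range(1, b):
            c0, c1 = vals[s]
            d0, d1 = vals[s - 1]
            vals.append(((c0 - inhomog - q * d0) / p, (c1 - q * d1) / p))
        c0, c1 = vals[b]
        u = (target - c0) / c1
        return [v0 + v1 * u for v0, v1 in vals]

    h = solve(Fraction(0), Fraction(1))  # boundary values h(0) = 0, h(b) = 1
    m = solve(Fraction(1), Fraction(0))  # boundary values m(0) = 0, m(b) = 0
    return h[a], m[a]

p = Fraction(2, 3)
q = 1 - p
a, b = 3, 7
hit_b, duration = ruin_exact(a, b, p)
r = q / p
formula_hit = (r ** a - 1) / (r ** b - 1)  # P(Tb < T0), from (3)
formula_T = (b * (r ** a - 1) - a * (r ** b - 1)) / ((p - q) * (r ** b - 1))
```

Since the closed forms were derived from the same dynamics, exact rational agreement here is a strong consistency check on the algebra in (3).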
Exercise 1.8
We shall prove a version of the Azuma–Hoeffding inequality. Let us suppose
that Y is a centred random variable that takes its values in [−c, c]. The
function f_θ : [−c, c] → R defined by f_θ(y) := e^{θy} is convex for each θ ∈ R. If
−c ≤ y ≤ c, then y = λ(−c) + (1 − λ)c with λ = (c − y)/2c ∈ [0, 1]. As f_θ is
convex, f_θ(y) ≤ λ f_θ(−c) + (1 − λ) f_θ(c). That is,

e^{θy} ≤ (c − y)/(2c) · e^{−θc} + (c + y)/(2c) · e^{θc}.
It follows that, as E[Y] = 0,

E[e^{θY}] ≤ E[(c − Y)/(2c) · e^{−θc} + (c + Y)/(2c) · e^{θc}] = (e^{θc} + e^{−θc})/2 = cosh(θc).
We also have that

cosh(θc) = Σ_{n=0}^{∞} (θc)^{2n}/(2n)! ≤ Σ_{n=0}^{∞} (θc)^{2n}/(2^n · n!) = Σ_{n=0}^{∞} (1/n!) · ((θc)²/2)^n = exp((θc)²/2).
We also have a conditional analogue to this; the proof is essentially identical.
Lemma. If Y is a random variable taking values in [−c, c] and if E[Y | G] = 0
a.s. for some sub-σ-algebra G of F then, for all θ ∈ R, with probability 1,

E[e^{θY} | G] ≤ cosh(θc) ≤ exp((θc)²/2).
Next, let us suppose that M = (Mn : n ≥ 0) is a martingale such that
M0 = 0 and, for each n ≥ 1, there is some cn > 0 such that |Mn−Mn−1| ≤ cn.
By convexity, nonnegativity and Jensen's inequality, the process (e^{θM_n} : n ≥ 0)
is a nonnegative submartingale. If we take θ > 0, by Doob's submartingale
inequality we also have that, for each x > 0,

P(sup_{k≤n} M_k ≥ x) = P(sup_{k≤n} e^{θM_k} ≥ e^{θx}) ≤ e^{−θx} E[e^{θM_n}].
As e^{θM_n} is bounded for every n ≥ 0, for each n ≥ 1 we have that

E[e^{θM_n}] = E[E[e^{θ(M_n − M_{n−1})} e^{θM_{n−1}} | F_{n−1}]] = E[e^{θM_{n−1}} E[e^{θ(M_n − M_{n−1})} | F_{n−1}]]

and so, as M_n − M_{n−1} satisfies the conditions of the lemma with G := F_{n−1}
and c := c_n,

E[e^{θM_n}] ≤ e^{θ²c_n²/2} E[e^{θM_{n−1}}] ≤ · · · ≤ exp((θ²/2) Σ_{k=1}^{n} c_k²).
Therefore, for every θ > 0 and x > 0,

P(sup_{k≤n} M_k ≥ x) ≤ exp((θ²/2) Σ_{i=1}^{n} c_i² − θx).

By varying θ > 0 and using elementary calculus, we see that the right-hand
side is minimised when θ = x/Σ_{i=1}^{n} c_i² > 0. By putting this value of θ into
the above, we conclude that

P(sup_{k≤n} M_k ≥ x) ≤ exp(−x²/(2 Σ_{k=1}^{n} c_k²)).
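As a concrete sanity check (our example, not part of the exercise): for the simple symmetric random walk, c_k = 1 for every k, and the exact probability that the running maximum reaches a level x can be computed by a small absorbing dynamic programme and compared with the bound exp(−x²/2n).

```python
from fractions import Fraction
from math import exp

def prob_max_reaches(n, x):
    """Exact P(max over k <= n of S_k >= x) for a simple symmetric +-1 walk,
    computed by a DP that absorbs the walk the first time it reaches level x."""
    dist = {0: Fraction(1)}
    absorbed = Fraction(0)
    for _ in range(n):
        new = {}
        for s, pr in dist.items():
            for step in (1, -1):
                t = s + step
                if t >= x:
                    absorbed += pr / 2  # level x reached: absorb this mass
                else:
                    new[t] = new.get(t, 0) + pr / 2
        dist = new
    return absorbed

n = 30
# With c_k = 1 the Azuma-Hoeffding bound is exp(-x^2 / 2n); it should
# dominate the exact probability at every level x.
checks = [float(prob_max_reaches(n, x)) <= exp(-x * x / (2 * n))
          for x in range(1, n + 1)]
```

The bound is not tight for this walk, but it holds uniformly in x, which is exactly what the inequality promises.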
Exercise 1.9
Let us suppose that f : [0, 1] → R is Lipschitz continuous with Lipschitz
constant K and that, for each n ≥ 0, f_n is the piecewise linear function
that agrees with f on the set D_n := {k2^{−n} : 0 ≤ k ≤ 2^n}. Let us take
M_n := f'_n 1_{D_n^c}, say. By definition,

M_n = Σ_{k=0}^{2^n−1} [(f((k + 1)2^{−n}) − f(k2^{−n})) / 2^{−n}] · 1_{(k2^{−n}, (k+1)2^{−n})}.
This is a suggestive way of writing Mn, especially once we notice that
([0, 1],B([0, 1]), µ) is a probability space, where µ denotes Lebesgue measure.
For each n ≥ 0, let us define

F_n := σ(P(D_n) ∪ {(k2^{−n}, (k + 1)2^{−n}) : 0 ≤ k ≤ 2^n − 1}).
This defines a filtration on our probability space.
Lemma. The process M = (Mn : n ≥ 0) is a martingale with respect to
(Fn : n ≥ 0).
Proof. Let us fix some n ≥ 0. It is clear that M_n is F_n-measurable and
integrable as it is a linear combination of indicator functions of sets in F_n.
To show that the martingale property holds, it is enough to verify that
E[M_{n+1} 1_{(k2^{−n},(k+1)2^{−n})}] = E[M_n 1_{(k2^{−n},(k+1)2^{−n})}] for each 0 ≤ k ≤ 2^n − 1.
(This is the case as every set in F_n is a finite disjoint union of sets of this
type together with a null set, namely a subset of D_n.) This is true, however:
if 0 ≤ k ≤ 2^n − 1, then

E[M_{n+1} 1_{(k2^{−n},(k+1)2^{−n})}] = [f((2k + 1)2^{−(n+1)}) − f(k2^{−n})]/2^{−(n+1)} · 2^{−(n+1)}
  + [f((k + 1)2^{−n}) − f((2k + 1)2^{−(n+1)})]/2^{−(n+1)} · 2^{−(n+1)}
= f((k + 1)2^{−n}) − f(k2^{−n})
= E[M_n 1_{(k2^{−n},(k+1)2^{−n})}].
We will now show that f can be written as the integral of a bounded
function. As f is Lipschitz continuous with Lipschitz constant K, we have
that

|M_n| ≤ Σ_{k=0}^{2^n−1} [|f((k + 1)2^{−n}) − f(k2^{−n})| / 2^{−n}] · 1_{(k2^{−n}, (k+1)2^{−n})} ≤ K
for each n ≥ 0. It follows that M is a bounded process and hence that it
is uniformly integrable. By the UI martingale convergence theorem there
is an integrable random variable M∞ such that Mn → M∞ a.s. and in L1.
It is thus clear that |M∞| ≤ K a.s. By the L1 convergence, if n ≥ 1 and
0 ≤ k ≤ 2^n then E[M_{n+m} 1_{[0, k2^{−n})}] → E[M∞ 1_{[0, k2^{−n})}] as m → ∞, and
therefore

f(k2^{−n}) − f(0) = E[M_{n+m} 1_{[0, k2^{−n})}] → E[M∞ 1_{[0, k2^{−n})}] = ∫_0^{k2^{−n}} M∞(x) dx

as m → ∞. It follows that for all dyadic rationals q ∈ [0, 1],

f(q) = f(0) + ∫_0^q M∞(x) dx. (4)
If y ∈ [0, 1] and (q_n : n ≥ 0) is a sequence of dyadic rationals in [0, 1] such
that q_n → y, then

|f(y) − f(0) − ∫_0^y M∞(x) dx| ≤ |f(y) − f(q_n)| + |f(q_n) − f(0) − ∫_0^{q_n} M∞(x) dx| + |∫_{q_n}^y M∞(x) dx|,

which tends to 0 as n → ∞ as f is continuous and |M∞| ≤ K. We conclude
that (4) also holds when q is any element of [0, 1].
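A numerical illustration, under the assumption f(x) = |x − 1/2| (our choice of example, with K = 1): the slopes M_n are ±1, stay bounded by K, and their Riemann sums telescope to f(q) − f(0), which is the discrete form of identity (4).

```python
def f(x):
    """A concrete Lipschitz function with constant K = 1 (our example)."""
    return abs(x - 0.5)

def slope(n, x):
    """M_n(x): the slope of the level-n dyadic interpolant f_n on the
    interval (k 2^-n, (k+1) 2^-n) containing x."""
    k = int(x * 2 ** n)
    return (f((k + 1) / 2 ** n) - f(k / 2 ** n)) * 2 ** n

n = 12
# |M_n| <= K everywhere, as in the text.
bounded = all(abs(slope(n, (i + 0.5) / 2 ** n)) <= 1.0 for i in range(2 ** n))

# The Riemann sum of M_n up to a dyadic point q telescopes to f(q) - f(0).
q_checks = []
for j in range(1, 2 ** 6):
    q = j / 2 ** 6
    integral = sum(slope(n, (i + 0.5) / 2 ** n)
                   for i in range(int(q * 2 ** n))) / 2 ** n
    q_checks.append(abs(f(q) - f(0) - integral) < 1e-9)
```

Here the limit M∞ is the a.e. derivative sign(x − 1/2), so the code is computing the fundamental-theorem identity (4) on a dyadic grid.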
Exercise 1.10
We shall prove a decomposition result for submartingales due to Doob.1
Doob’s decomposition theorem. If X = (Xn : n ≥ 0) is a submartingale
then, modulo null sets, there is a unique martingale M = (Mn : n ≥ 0) and
a unique previsible process A = (An : n ≥ 0) such that A0 = 0 a.s., A is a.s.
nondecreasing and X = M +A a.s.
Proof. We will begin with a proof of uniqueness. If X has a decomposition
X = M +A a.s. of the required type then, a.s.,
E[X_{n+1} − X_n | F_n] = E[M_{n+1} − M_n | F_n] + E[A_{n+1} − A_n | F_n] = A_{n+1} − A_n.

Therefore we a.s. have that, for all n ≥ 0,

A_{n+1} = Σ_{k=0}^{n} (A_{k+1} − A_k) = Σ_{k=0}^{n} E[X_{k+1} − X_k | F_k], (5)
1There is a continuous-time analogue of this result due to Meyer.
and so A is uniquely determined (up to null sets) and hence M = X −A is
also uniquely determined (up to null sets).
For existence, let us define A0 := 0 and, for all n ≥ 0, An+1 as in (5)
and Mn := Xn − An. It is immediate from the definition of A that it is
previsible; it is equally immediate from the submartingale property that it
is a.s. nondecreasing. To see that M is a martingale we notice that, for all
n ≥ 0, Mn is Fn-measurable and integrable as it is a linear combination of
Fn-measurable and integrable functions, and that, a.s.,
E[Mn+1 −Mn | Fn] = E[Xn+1 −Xn | Fn]− E[An+1 −An | Fn] = 0
by the definition of A. The required decomposition thus exists.
Proposition. Suppose that X is a submartingale and that X = M + A a.s.
is its Doob decomposition. The processes M and A are bounded in L1 if and
only if X is bounded in L1; whenever this is the case, A∞ := lim_{n→∞} A_n < ∞ a.s.
Proof. It follows immediately from the triangle inequality that if A and M are
bounded in L1 then X is bounded in L1. Conversely, if sup_{n≥0} E[|X_n|] ≤ K
then, for each n ≥ 0,

E[|A_{n+1}|] = E[A_{n+1}] = E[Σ_{k=0}^{n} E[X_{k+1} − X_k | F_k]] = E[X_{n+1}] − E[X_0] ≤ 2K

and hence, as A_0 = 0 a.s., sup_{n≥0} E[|A_n|] ≤ 2K. As M = X − A a.s., the
triangle inequality implies that M is also bounded in L1. To close, as A is a.s.
nondecreasing, it possesses an a.s. limit A∞. If X is bounded in L1, then A is
also bounded in L1, and hence E[A∞] < ∞ by the monotone convergence
theorem; this entails that A∞ < ∞ a.s.
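A concrete instance (our example, not the exercise's): for a simple symmetric random walk S, the process X_n = S_n² is a submartingale with E[X_{n+1} − X_n | F_n] = E[2 S_n ε_{n+1} + 1 | F_n] = 1, so formula (5) gives A_n = n and hence M_n = S_n² − n. Exhaustive enumeration of paths confirms that M is centred:

```python
from fractions import Fraction
from itertools import product

# Doob decomposition of X_n = S_n^2 for a simple symmetric random walk:
# by (5), A_n = n and M_n = S_n^2 - n. We verify E[M_n] = 0 by summing
# over all 2^N equally likely +-1 paths.
N = 10
mean_X = [Fraction(0)] * (N + 1)
for steps in product((1, -1), repeat=N):
    s = 0
    for n, eps in enumerate(steps, start=1):
        s += eps
        mean_X[n] += Fraction(s * s, 2 ** N)

# M is a martingale started at 0, so E[M_n] = E[X_n] - A_n should vanish.
martingale_means = [mean_X[n] - n for n in range(N + 1)]
```

The enumeration recovers the textbook fact E[S_n²] = n, i.e. the previsible part A absorbs exactly the submartingale's drift.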
Exercise 1.11
Proposition. If X = (Xn : n ≥ 0) is a UI submartingale and X = M +A
a.s. is its Doob decomposition then M is UI.
Proof. As X is UI it is bounded in L1 and so, by the results of the previous
exercise, A must also be bounded in L1. Our work there also showed that
0 ≤ A_n ≤ A∞ ∈ L1 and hence A is also UI. We will now show that M = X − A a.s. is UI,
so let us fix some ε > 0 and let δ > 0 be such that, if P(B) < δ, then
sup_{n≥0} E[|A_n| 1_B] < ε/2 and sup_{n≥0} E[|X_n| 1_B] < ε/2. Then

sup_{n≥0} E[|M_n| 1_B] ≤ sup_{n≥0} E[|A_n| 1_B] + sup_{n≥0} E[|X_n| 1_B] < ε/2 + ε/2 = ε.

It follows that M is UI.
Proposition. If X = (Xn : n ≥ 0) is a UI submartingale and S and T are
stopping times such that T ≥ S, then E[XT | FS ] ≥ XS a.s.
Remark. You might be shaking your head here, as T needn’t be finite. This
is true. However, as X is UI, Xn → X∞ a.s. for some random variable X∞
and hence, if T (ω) =∞, we can (and do) take XT (ω) to mean X∞(ω).
Proof. As we a.s. have that X_T = A_T + M_T = (A_T − A_S) + A_S + M_T, we
also a.s. have that

E[X_T | F_S] = E[A_T − A_S | F_S] + E[A_S | F_S] + E[M_T | F_S] ≥ A_S + M_S = X_S.
In the inequality we have used, respectively, the fact that A is a.s. nonde-
creasing and T ≥ S; the fact that AS is FS-measurable; and the optional
stopping theorem for UI martingales. The only one of these facts that has
not been proven in lectures is the second, which we prove now. For each
Borel set B,

A_S^{−1}(B) = (A_∞^{−1}(B) ∩ {S = ∞}) ∪ ⋃_{n=0}^{∞} (A_n^{−1}(B) ∩ {S = n}). (6)
Let us fix some n ≥ 0. Then, for all m ≥ 0, the set A_n^{−1}(B) ∩ {S = n} ∩ {S ≤ m}
is equal to ∅ if m < n and to A_n^{−1}(B) ∩ {S = n} otherwise; it belongs to F_m
in both cases. We also see that A_∞^{−1}(B) ∩ {S = ∞} ∩ {S ≤ m} = ∅, which
is an element of every σ-algebra, and hence A_S^{−1}(B) ∈ F_S for each Borel
set B by (6). We conclude that A_S is F_S-measurable.
2 Weak Convergence
Exercise 2.1

Let us suppose that (Xn : n ≥ 1) is a sequence of iid random variables, each
uniformly distributed on [0, 1], and let us define Mn := X1 ∨ · · · ∨Xn. Let
us also define, for all n ≥ 1, Y_n := n(1 − M_n) and F_{Y_n} to be its distribution
function. Then, for all x ∈ R,

F_{Y_n}(x) = P(n(1 − M_n) ≤ x) = P(M_n ≥ 1 − x/n) = 1 − P(M_n < 1 − x/n)
           = 1 − P(∀i ≤ n, X_i < 1 − x/n)
           = 1 − (P(X_1 < 1 − x/n))^n,

where we have used the iid hypothesis in the final line. If x < 0, then
P(X_1 < 1 − x/n) = 1 and so F_{Y_n}(x) = 0 for all n ≥ 1. If x ≥ 0 then, for
all sufficiently large n, P(X_1 < 1 − x/n) = 1 − x/n and therefore (P(X_1 <
1 − x/n))^n = (1 − x/n)^n → e^{−x} as n → ∞, and hence F_{Y_n}(x) → 1 − e^{−x} as
n → ∞. It follows that

F_{Y_n}(x) → 0 if x < 0,  and  F_{Y_n}(x) → 1 − e^{−x} if x ≥ 0.

Hence Y_n tends in distribution to an exponential random variable of
parameter 1 as n → ∞.
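The convergence of distribution functions can be seen numerically: for 0 ≤ x ≤ n the exact value is F_{Y_n}(x) = 1 − (1 − x/n)^n, which is already very close to 1 − e^{−x} for large n. A sketch (function name ours):

```python
from math import exp

def F_Yn(n, x):
    """Exact distribution function of Y_n = n(1 - M_n): since M_n is the max
    of n iid Uniform[0, 1] variables, F(x) = 1 - (1 - x/n)^n for 0 <= x <= n."""
    if x < 0:
        return 0.0
    if x > n:
        return 1.0
    return 1.0 - (1.0 - x / n) ** n

# Compare with the Exp(1) distribution function on a grid of points.
xs = [0.1 * i for i in range(51)]
max_err = max(abs(F_Yn(10 ** 6, x) - (1.0 - exp(-x))) for x in xs)
```

The discrepancy is of order x²/n pointwise, so for n = 10^6 the two distribution functions agree to several decimal places across the grid.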
Exercise 2.2
Let us suppose that (Xn : n ≥ 0) is a sequence of random elements of a
metric space (M,d) and that all of these random elements are defined on
the same probability space, (Ω,F ,P).
Proposition. If Xn → X a.s. then Xn → X in distribution.
Proof. If Xn → X a.s. and f ∈ Cb(M) then, as f is continuous, f(Xn) → f(X) a.s. and so E[f(Xn)] → E[f(X)] by the dominated convergence theorem.
Proposition. If Xn → X in probability then Xn → X in distribution.
Proof. Let us suppose that Xn → X in probability and that it is not the case
that Xn → X in distribution; i.e., that there is some f ∈ Cb(M), some ε > 0
and some subsequence (n(k) : k ≥ 1) such that |E[f(Xn(k))] − E[f(X)]| ≥ ε
for all k ≥ 1. As Xn → X in probability, there is a further subsequence
(n(k(r)) : r ≥ 1) such that Xn(k(r)) → X a.s. as r → ∞. By the previous
proposition, Xn(k(r)) → X in distribution as r → ∞, which is absurd. It
therefore must be that Xn → X in distribution.
Proposition. If Xn → c in distribution for some constant c ∈ M then
Xn → c in probability.
Proof. Let us suppose that Xn → c in distribution and that B(c, ε) denotes
the (open) ball in M with centre c and radius ε > 0. By the portmanteau
lemma we see that

lim inf_{n→∞} P(Xn ∈ B(c, ε)) ≥ P(c ∈ B(c, ε)) = 1

and hence, as probabilities are at most 1, that P(Xn ∈ B(c, ε)) → 1 or,
phrased equivalently, that P(d(Xn, c) ≥ ε) → 0. As ε > 0 was arbitrary we
thus have that Xn → c in probability.
Exercise 2.3
Let us suppose that (Xn : n ≥ 0) and (Yn : n ≥ 0) are random variables. We
first remark that it need not be the case that, if Xn → X in distribution and
Yn → Y in distribution, then (Xn, Yn) → (X, Y) in distribution. For example,
we can take Xn and Z to be uniformly distributed on {−1, 1} and we can
define Yn := −Xn; in this case, both Xn and Yn tend to Z in distribution
but (Xn, Yn) does not tend to (Z, Z) in distribution. (To see this, take the
open set U = (−2, 0) × (−2, 0) and observe that P((Xn, Yn) ∈ U) = 0 for all
n ≥ 0, so that

lim inf_{n→∞} P((Xn, Yn) ∈ U) = 0 < 1/2 = P((Z, Z) ∈ U).

By the portmanteau lemma, it cannot be the case that (Xn, Yn) → (Z, Z) in
distribution.)
Proposition. Suppose that X and Y are independent random variables
defined on the probability space (Ω,F ,P) and that, for every n ≥ 0, Xn
and Yn are independent random variables defined on the probability space
(Ωn,Fn,Pn). If Xn → X and Yn → Y in distribution, (Xn, Yn)→ (X,Y ) in
distribution.
Proof. If we let φ denote the characteristic function then, for all ξ = (ξ1, ξ2) ∈ R²,

φ_{(Xn,Yn)}(ξ) = φ_{Xn}(ξ1) φ_{Yn}(ξ2) → φ_X(ξ1) φ_Y(ξ2) = φ_{(X,Y)}(ξ),

where the convergence follows from Lévy's convergence theorem and the
equalities follow from our independence assumptions. By Lévy's convergence
theorem, (Xn, Yn) → (X, Y) in distribution.
Proposition. Suppose that X,Y,X1, Y1, X2, Y2, . . . are random variables
defined on the common probability space (Ω,F ,P). If Y is a.s. constant,
Xn → X in distribution and Yn → Y in distribution, then (Xn, Yn)→ (X,Y )
in distribution.
Proof. Let us suppose that Y = c a.s., where c ∈ R is constant. It is clear
that Yn → c in distribution so, by the previous exercise, Yn → c in probability.
Let us fix some ξ = (ξ1, ξ2) ∈ R²; then

φ_{(Xn,Yn)}(ξ) = E[e^{iξ1Xn} e^{iξ2(Yn−c)}] e^{iξ2c} = E[e^{iξ1Xn}(e^{iξ2(Yn−c)} − 1)] e^{iξ2c} + φ_{Xn}(ξ1) e^{iξ2c}. (7)

Let us fix some ε > 0. By continuity, there is some δ > 0 such that |Yn − c| < δ
implies that |e^{iξ2(Yn−c)} − 1| < ε/4. As Yn → c in probability, for this δ and
ε there is some N ≥ 0 such that, for all n ≥ N, P(|Yn − c| ≥ δ) < ε/4. We
therefore see that

|E[e^{iξ1Xn}(e^{iξ2(Yn−c)} − 1)]| ≤ E[|e^{iξ2(Yn−c)} − 1|(1{|Yn−c| < δ} + 1{|Yn−c| ≥ δ})]
≤ ε/4 + 2 P(|Yn − c| ≥ δ)
≤ 3ε/4.

As Xn → X in distribution, by Lévy's convergence theorem there is some
N′ ≥ N such that, for all n ≥ N′, |φ_{Xn}(ξ1) − φ_X(ξ1)| ≤ ε/4. From (7), then,
for all n ≥ N′ ≥ N we have that

|φ_{(Xn,Yn)}(ξ) − φ_X(ξ1) e^{iξ2c}| ≤ |φ_{(Xn,Yn)}(ξ) − φ_{Xn}(ξ1) e^{iξ2c}| + |φ_{Xn}(ξ1) e^{iξ2c} − φ_X(ξ1) e^{iξ2c}|
= |E[e^{iξ1Xn}(e^{iξ2(Yn−c)} − 1)]| + |φ_{Xn}(ξ1) − φ_X(ξ1)|
≤ ε,

and so φ_{(Xn,Yn)}(ξ) → φ_X(ξ1) e^{iξ2c} as n → ∞. By Lévy's convergence theorem,
we conclude that (Xn, Yn) → (X, c) in distribution and hence that (Xn, Yn) → (X, Y) in distribution.
Remark. It is natural to consider the weak convergence of probability mea-
sures when those measures are on a given metric space, (M,d). Both proposi-
tions in this question have generalisations to this setting [?, Theorems 2.8 &
3.9]. I will develop a general proof here so that we can appreciate the change
in flavour when we attempt to generalise results to metric spaces; the proof
becomes overtly topological and ‘hands on’ when we are unable to bludgeon
claims with Levy’s cudgel.
Exercise 2.4
Proposition. All finite families of probability measures on (Rd,B(Rd)) are
tight.
Proof. Let us define Q_m := [−m, m]^d for all m ≥ 1; this is a compact set
and Q_m ↑ R^d as m → ∞. Let us suppose that µ1, . . . , µn are our probability
measures and let us fix some ε > 0. As µk(Q_m) → 1 as m → ∞, there must be
some M_k such that, for all m ≥ M_k, µk(Q_m^c) ≤ ε. It follows that

sup_{1≤k≤n} µk(Q^c_{M1∨···∨Mn}) ≤ ε

and thus, as ε > 0 was arbitrary, we conclude that our family is tight.
Proposition. If (µn : n ≥ 0) is a tight sequence of measures on (R^d, B(R^d))
with the property that sup_{n≥0} µn(R^d) < ∞, then there is some subsequence
(n(k) : k ≥ 0) such that µn(k) converges weakly to some measure, µ.
Proof. If there is a subsequence (nk : k ≥ 0) such that µn(k)(Rd) = 0 for
each k ≥ 0, then µn(k) converges weakly to the zero measure.
Let us suppose instead that there is some N ≥ 0 such that, for all
n ≥ N , µn(Rd) > 0. If it is the case that infn≥N µn(Rd) = 0, we can extract
a subsequence (nk : k ≥ 0) such that µn(k)(Rd) → 0. If f ∈ Cb(Rd), then
|µn(k)(f)| ≤ ‖f‖∞µn(k)(Rd)→ 0 as k →∞ and hence µn(k) converges weakly
to the zero measure as k →∞.
On the other hand, if it is the case that inf_{n≥N} µn(R^d) > 0, we can define
probability measures νn on (R^d, B(R^d)) by νn(A) := µn(A)/µn(R^d) for all
A ∈ B(R^d) and all n ≥ N. The sequence (νn : n ≥ N) is tight: for all
ε > 0, there is a compact set K with the property that sup_{n≥N} µn(K^c) ≤
ε · inf_{n≥N} µn(R^d), and hence

sup_{n≥N} νn(K^c) = sup_{n≥N} µn(K^c)/µn(R^d) ≤ ε.
By Prohorov’s theorem, there is a subsequence (νn(k) : k ≥ 0) of (νn : n ≥ N)
such that νn(k) converges weakly to a probability measure, ν, as k → ∞.
Further, as our assumptions imply that (µn(k)(Rd) : k ≥ 0) is a bounded
sequence, the Bolzano–Weierstrass theorem implies that there is a convergent
subsequence (µn(k(r))(Rd) : r ≥ 0). It follows therefore that, as r → ∞,
νn(k(r)) → ν weakly and µn(k(r))(R^d) → C, say. To close, we claim that
µn(k(r)) = µn(k(r))(R^d) · νn(k(r)) → C · ν =: µ weakly as r → ∞. To prove this,
we observe that if f ∈ Cb(R^d), then

µn(k(r))(f) = µn(k(r))(R^d) · νn(k(r))(f) → C · ν(f) = µ(f)

as r → ∞. We conclude that µn(k(r)) → µ weakly as r → ∞, as required.