
One Thousand Exercises in Probability

GEOFFREY R. GRIMMETT

Statistical Laboratory, University of Cambridge

and

DAVID R. STIRZAKER

Mathematical Institute, University of Oxford

OXFORD UNIVERSITY PRESS


OXFORD UNIVERSITY PRESS

Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York

Athens Auckland Bangkok Bogota Buenos Aires Cape Town Chennai Dar es Salaam Delhi Florence Hong Kong Istanbul Karachi Kolkata Kuala Lumpur Madrid Melbourne Mexico City Mumbai Nairobi Paris Sao Paulo Shanghai Singapore Taipei Tokyo Toronto Warsaw

with associated companies in Berlin Ibadan

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© Geoffrey R. Grimmett and David R. Stirzaker 2001

The moral rights of the author have been asserted

Database right Oxford University Press (maker)

First published 2001

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose this same condition on any acquirer

A catalogue record for this title is available from the British Library

Library of Congress Cataloging in Publication Data Data available

ISBN 0 19 857221 2

10 9 8 7 6 5 4 3 2 1

Typeset by the authors
Printed in Great Britain on acid-free paper by Biddles Ltd, Guildford & King's Lynn


Preface

This book contains more than 1000 exercises in probability and random processes, together with their solutions. Apart from being a volume of worked exercises in its own right, it is also a solutions manual for exercises and problems appearing in our textbook Probability and Random Processes (3rd edn), Oxford University Press, 2001, henceforth referred to as PRP. These exercises are not merely for drill, but complement and illustrate the text of PRP, or are entertaining, or both. The current volume extends our earlier book Probability and Random Processes: Problems and Solutions, and includes in addition around 400 new problems. Since many exercises have multiple parts, the total number of interrogatives exceeds 3000.

Despite being intended in part as a companion to PRP, the present volume is as self-contained as reasonably possible. Where knowledge of a substantial chunk of bookwork is unavoidable, the reader is provided with a reference to the relevant passage in PRP. Expressions such as 'clearly' appear frequently in the solutions. Although we do not use such terms in their Laplacian sense to mean 'with difficulty', to call something 'clear' is not to imply that explicit verification is necessarily free of tedium.

The table of contents reproduces that of PRP; the section and exercise numbers correspond to those of PRP; there are occasional references to examples and equations in PRP. The covered range of topics is broad, beginning with the elementary theory of probability and random variables, and continuing, via chapters on Markov chains and convergence, to extensive sections devoted to stationarity and ergodic theory, renewals, queues, martingales, and diffusions, including an introduction to the pricing of options. Generally speaking, exercises are questions which test knowledge of particular pieces of theory, while problems are less specific in their requirements. There are questions of all standards, the great majority being elementary or of intermediate difficulty. We ourselves have found some of the later ones to be rather tricky, but have refrained from magnifying any difficulty by adding asterisks or equivalent devices. If you are using this book for self-study, our advice would be not to attempt more than a respectable fraction of these at a first read.

We pay tribute to all those anonymous pedagogues whose examination papers, work assignments, and textbooks have been so influential in the shaping of this collection. To them and to their successors we wish, in turn, much happy plundering. If you find errors, try to keep them secret, except from us. If you know a better solution to any exercise, we will be happy to substitute it in a later edition.

We acknowledge the expertise of Sarah Shea-Simonds in preparing the TeX script of this volume, and of Andy Burbanks in advising on the front cover design, which depicts a favourite confluence of the authors.

Cambridge and Oxford
April 2001

G. R. G.
D. R. S.


Life is good for only two things, discovering mathematics and teaching it.
Siméon Poisson

In mathematics you don't understand things, you just get used to them.
John von Neumann

Probability is the bane of the age.
Anthony Powell, Casanova's Chinese Restaurant

The traditional professor writes a, says b, and means c; but it should be d.
George Pólya


Contents

1 Events and their probabilities
1.1 Introduction
1.2 Events as sets
1.3 Probability
1.4 Conditional probability
1.5 Independence
1.6 Completeness and product spaces
1.7 Worked examples
1.8 Problems

2 Random variables and their distributions
2.1 Random variables
2.2 The law of averages
2.3 Discrete and continuous variables
2.4 Worked examples
2.5 Random vectors
2.6 Monte Carlo simulation
2.7 Problems

3 Discrete random variables
3.1 Probability mass functions
3.2 Independence
3.3 Expectation
3.4 Indicators and matching
3.5 Examples of discrete variables
3.6 Dependence
3.7 Conditional distributions and conditional expectation
3.8 Sums of random variables
3.9 Simple random walk
3.10 Random walk: counting sample paths
3.11 Problems

4 Continuous random variables
4.1 Probability density functions
4.2 Independence
4.3 Expectation
4.4 Examples of continuous variables
4.5 Dependence
4.6 Conditional distributions and conditional expectation
4.7 Functions of random variables
4.8 Sums of random variables
4.9 Multivariate normal distribution
4.10 Distributions arising from the normal distribution
4.11 Sampling from a distribution
4.12 Coupling and Poisson approximation
4.13 Geometrical probability
4.14 Problems

5 Generating functions and their applications
5.1 Generating functions
5.2 Some applications
5.3 Random walk
5.4 Branching processes
5.5 Age-dependent branching processes
5.6 Expectation revisited
5.7 Characteristic functions
5.8 Examples of characteristic functions
5.9 Inversion and continuity theorems
5.10 Two limit theorems
5.11 Large deviations
5.12 Problems

6 Markov chains
6.1 Markov processes
6.2 Classification of states
6.3 Classification of chains
6.4 Stationary distributions and the limit theorem
6.5 Reversibility
6.6 Chains with finitely many states
6.7 Branching processes revisited
6.8 Birth processes and the Poisson process
6.9 Continuous-time Markov chains
6.10 Uniform semigroups
6.11 Birth-death processes and imbedding
6.12 Special processes
6.13 Spatial Poisson processes
6.14 Markov chain Monte Carlo
6.15 Problems

7 Convergence of random variables
7.1 Introduction
7.2 Modes of convergence
7.3 Some ancillary results
7.4 Laws of large numbers
7.5 The strong law
7.6 The law of the iterated logarithm
7.7 Martingales
7.8 Martingale convergence theorem
7.9 Prediction and conditional expectation
7.10 Uniform integrability
7.11 Problems

8 Random processes
8.1 Introduction
8.2 Stationary processes
8.3 Renewal processes
8.4 Queues
8.5 The Wiener process
8.6 Existence of processes
8.7 Problems

9 Stationary processes
9.1 Introduction
9.2 Linear prediction
9.3 Autocovariances and spectra
9.4 Stochastic integration and the spectral representation
9.5 The ergodic theorem
9.6 Gaussian processes
9.7 Problems

10 Renewals
10.1 The renewal equation
10.2 Limit theorems
10.3 Excess life
10.4 Applications
10.5 Renewal-reward processes
10.6 Problems

11 Queues
11.1 Single-server queues
11.2 M/M/1
11.3 M/G/1
11.4 G/M/1
11.5 G/G/1
11.6 Heavy traffic
11.7 Networks of queues
11.8 Problems

12 Martingales
12.1 Introduction
12.2 Martingale differences and Hoeffding's inequality
12.3 Crossings and convergence
12.4 Stopping times
12.5 Optional stopping
12.6 The maximal inequality
12.7 Backward martingales and continuous-time martingales
12.8 Some examples
12.9 Problems

13 Diffusion processes
13.1 Introduction
13.2 Brownian motion
13.3 Diffusion processes
13.4 First passage times
13.5 Barriers
13.6 Excursions and the Brownian bridge
13.7 Stochastic calculus
13.8 The Itô integral
13.9 Itô's formula
13.10 Option pricing
13.11 Passage probabilities and potentials
13.12 Problems

Bibliography

Index


1 Events and their probabilities

1.2 Exercises. Events as sets

1. Let {A_i : i ∈ I} be a collection of sets. Prove 'De Morgan's Laws'†:

( ⋃_i A_i )^c = ⋂_i A_i^c,    ( ⋂_i A_i )^c = ⋃_i A_i^c.

2. Let A and B belong to some σ-field F. Show that F contains the sets A ∩ B, A \ B, and A △ B.

3. A conventional knock-out tournament (such as that at Wimbledon) begins with 2^n competitors and has n rounds. There are no play-offs for the positions 2, 3, ..., 2^n − 1, and the initial table of draws is specified. Give a concise description of the sample space of all possible outcomes.

4. Let F be a σ-field of subsets of Ω and suppose that B ∈ F. Show that G = {A ∩ B : A ∈ F} is a σ-field of subsets of B.

5. Which of the following are identically true? For those that are not, say when they are true.

(a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C);
(b) A ∩ (B ∩ C) = (A ∩ B) ∩ C;
(c) (A ∪ B) ∩ C = A ∪ (B ∩ C);
(d) A \ (B ∩ C) = (A \ B) ∪ (A \ C).

1.3 Exercises. Probability

1. Let A and B be events with probabilities P(A) = 3/4 and P(B) = 1/3. Show that 1/12 ≤ P(A ∩ B) ≤ 1/3, and give examples to show that both extremes are possible. Find corresponding bounds for P(A ∪ B).

2. A fair coin is tossed repeatedly. Show that, with probability one, a head turns up sooner or later. Show similarly that any given finite sequence of heads and tails occurs eventually with probability one. Explain the connection with Murphy's Law.

3. Six cups and saucers come in pairs: there are two cups and saucers which are red, two white, and two with stars on. If the cups are placed randomly onto the saucers (one each), find the probability that no cup is upon a saucer of the same pattern.
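Exercise 3 lends itself to a quick numerical check. The following Python sketch (an illustrative addition, not part of the original exercises; the function name and trial count are arbitrary) shuffles the cups over the fixed saucers and estimates the probability that no cup lands on a saucer of its own pattern.

    import random

    def no_match_probability(trials=100_000):
        # Saucer patterns are fixed; the cups carry the same multiset of patterns.
        saucers = [0, 0, 1, 1, 2, 2]          # red, red, white, white, star, star
        cups = saucers[:]
        hits = 0
        for _ in range(trials):
            random.shuffle(cups)
            if all(c != s for c, s in zip(cups, saucers)):
                hits += 1
        return hits / trials

    print(no_match_probability())   # estimate of the required probability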

† Augustus De Morgan is well known for having given the first clear statement of the principle of mathematical induction. He applauded probability theory with the words: "The tendency of our study is to substitute the satisfaction of mental exercise for the pernicious enjoyment of an immoral stimulus".


4. Let A_1, A_2, ..., A_n be events where n ≥ 2, and prove that

P( ⋃_{i=1}^n A_i ) = Σ_i P(A_i) − Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) − ⋯ + (−1)^{n+1} P(A_1 ∩ A_2 ∩ ⋯ ∩ A_n).

In each packet of Corn Flakes may be found a plastic bust of one of the last five Vice-Chancellors of Cambridge University, the probability that any given packet contains any specific Vice-Chancellor being 1/5, independently of all other packets. Show that the probability that each of the last three Vice-Chancellors is obtained in a bulk purchase of six packets is 1 − 3(4/5)^6 + 3(3/5)^6 − (2/5)^6.

5. Let A_r, r ≥ 1, be events such that P(A_r) = 1 for all r. Show that P( ⋂_{r=1}^∞ A_r ) = 1.

6. You are given that at least one of the events A_r, 1 ≤ r ≤ n, is certain to occur, but certainly no more than two occur. If P(A_r) = p, and P(A_r ∩ A_s) = q, r ≠ s, show that p ≥ 1/n and q ≤ 2/n.

7. You are given that at least one, but no more than three, of the events A_r, 1 ≤ r ≤ n, occur, where n ≥ 3. The probability of at least two occurring is 1/2. If P(A_r) = p, P(A_r ∩ A_s) = q, r ≠ s, and P(A_r ∩ A_s ∩ A_t) = x, r < s < t, show that p ≥ 3/(2n), and q ≤ 4/n.

1.4 Exercises. Conditional probability

1. Prove that P(A | B) = P(B | A)P(A)/P(B) whenever P(A)P(B) ≠ 0. Show that, if P(A | B) > P(A), then P(B | A) > P(B).

2. For events A_1, A_2, ..., A_n satisfying P(A_1 ∩ A_2 ∩ ⋯ ∩ A_{n−1}) > 0, prove that

P(A_1 ∩ A_2 ∩ ⋯ ∩ A_n) = P(A_1) P(A_2 | A_1) P(A_3 | A_1 ∩ A_2) ⋯ P(A_n | A_1 ∩ A_2 ∩ ⋯ ∩ A_{n−1}).

3. A man possesses five coins, two of which are double-headed, one is double-tailed, and two are normal. He shuts his eyes, picks a coin at random, and tosses it. What is the probability that the lower face of the coin is a head?

He opens his eyes and sees that the coin is showing heads; what is the probability that the lower face is a head?

He shuts his eyes again, and tosses the coin again. What is the probability that the lower face is a head?

He opens his eyes and sees that the coin is showing heads; what is the probability that the lower face is a head?

He discards this coin, picks another at random, and tosses it. What is the probability that it shows heads?

4. What do you think of the following 'proof' by Lewis Carroll that an urn cannot contain two balls of the same colour? Suppose that the urn contains two balls, each of which is either black or white; thus, in the obvious notation, P(BB) = P(BW) = P(WB) = P(WW) = 1/4. We add a black ball, so that P(BBB) = P(BBW) = P(BWB) = P(BWW) = 1/4. Next we pick a ball at random; the chance that the ball is black is (using conditional probabilities) 1·(1/4) + (2/3)·(1/4) + (2/3)·(1/4) + (1/3)·(1/4) = 2/3. However, if there is probability 2/3 that a ball, chosen randomly from three, is black, then there must be two black and one white, which is to say that originally there was one black and one white ball in the urn.

5. The Monty Hall problem: goats and cars. (a) Cruel fate has made you a contestant in a game show; you have to choose one of three doors. One conceals a new car, two conceal old goats. You choose, but your chosen door is not opened immediately. Instead, the presenter opens another door to reveal a goat, and he offers you the opportunity to change your choice to the third door (unopened and so far unchosen). Let p be the (conditional) probability that the third door conceals the car. The value of p depends on the presenter's protocol. Devise protocols to yield the values p = 1/2, p = 1. Show that, for α ∈ [1/2, 1], there exists a protocol such that p = α. Are you well advised to change your choice to the third door?

(b) In a variant of this question, the presenter is permitted to open the first door chosen, and to reward you with whatever lies behind. If he chooses to open another door, then this door invariably conceals a goat. Let p be the probability that the unopened door conceals the car, conditional on the presenter having chosen to open a second door. Devise protocols to yield the values p = 0, p = 1, and deduce that, for any α ∈ [0, 1], there exists a protocol with p = α.
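One way to see how the protocol in part (a) affects the conditional probability is to simulate a parameterized protocol. The sketch below (an illustrative addition, not from the book; the parameter q and all names are assumptions for illustration) supposes you always pick door 0 and that, when the presenter can choose between the two goat doors, he opens door 2 with probability q; it then estimates the probability that the third door hides the car given that door 2 was opened.

    import random

    def conditional_p(q, trials=200_000):
        """Empirical P(door 1 hides the car | presenter opened door 2),
        when you pick door 0 and the presenter opens door 2 with
        probability q whenever he has a free choice."""
        opened_2 = wins = 0
        for _ in range(trials):
            car = random.randrange(3)
            if car == 0:                       # presenter may open door 1 or 2
                door = 2 if random.random() < q else 1
            else:                              # he must open the other goat door
                door = 3 - car
            if door == 2:
                opened_2 += 1
                wins += (car == 1)
        return wins / opened_2

    for q in (0.0, 0.5, 1.0):
        print(q, round(conditional_p(q), 3))   # roughly 1, 2/3, 1/2 respectively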

6. The prosecutor's fallacy†. Let G be the event that an accused is guilty, and T the event that some testimony is true. Some lawyers have argued on the assumption that P(G | T) = P(T | G). Show that this holds if and only if P(G) = P(T).

7. Urns. There are n urns of which the rth contains r − 1 red balls and n − r magenta balls. You pick an urn at random and remove two balls at random without replacement. Find the probability that:

(a) the second ball is magenta;

(b) the second ball is magenta, given that the first is magenta.

1.5 Exercises. Independence

1. Let A and B be independent events; show that A^c, B are independent, and deduce that A^c, B^c are independent.

2. We roll a die n times. Let A_{ij} be the event that the ith and jth rolls produce the same number. Show that the events {A_{ij} : 1 ≤ i < j ≤ n} are pairwise independent but not independent.

3. A fair coin is tossed repeatedly. Show that the following two statements are equivalent:

(a) the outcomes of different tosses are independent,

(b) for any given finite sequence of heads and tails, the chance of this sequence occurring in the first m tosses is 2^{−m}, where m is the length of the sequence.

4. Let Ω = {1, 2, ..., p} where p is prime, F be the set of all subsets of Ω, and P(A) = |A|/p for all A ∈ F. Show that, if A and B are independent events, then at least one of A and B is either ∅ or Ω.

5. Show that the conditional independence of A and B given C neither implies, nor is implied by, the independence of A and B. For which events C is it the case that, for all A and B, the events A and B are independent if and only if they are conditionally independent given C?

6. Safe or sorry? Some form of prophylaxis is said to be 90 per cent effective at prevention during one year's treatment. If the degrees of effectiveness in different years are independent, show that the treatment is more likely than not to fail within 7 years.

7. Families. Jane has three children, each of which is equally likely to be a boy or a girl independently of the others. Define the events:

A = {all the children are of the same sex},
B = {there is at most one boy},
C = {the family includes a boy and a girl}.

†The prosecution made this error in the famous Dreyfus case of 1894.


(a) Show that A is independent of B, and that B is independent of C.

(b) Is A independent of C?

(c) Do these results hold if boys and girls are not equally likely?

(d) Do these results hold if Jane has four children?

8. Galton's paradox. You flip three fair coins. At least two are alike, and it is an evens chance that the third is a head or a tail. Therefore P(all alike) = 1/2. Do you agree?

9. Two fair dice are rolled. Show that the event that their sum is 7 is independent of the score shown by the first die.

1.7 Exercises. Worked examples

1. There are two roads from A to B and two roads from B to C. Each of the four roads is blocked by snow with probability p, independently of the others. Find the probability that there is an open road from A to B given that there is no open route from A to C.

If, in addition, there is a direct road from A to C, this road being blocked with probability p independently of the others, find the required conditional probability.

2. Calculate the probability that a hand of 13 cards dealt from a normal shuffled pack of 52 contains exactly two kings and one ace. What is the probability that it contains exactly one ace given that it contains exactly two kings?
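The two probabilities in Exercise 2 can be computed exactly by counting hands. A short sketch (an illustrative addition, standard library only) does the arithmetic:

    from math import comb

    p_two_kings_one_ace = comb(4, 2) * comb(4, 1) * comb(44, 10) / comb(52, 13)
    p_two_kings = comb(4, 2) * comb(48, 11) / comb(52, 13)

    print(p_two_kings_one_ace)                 # P(exactly two kings and one ace)
    print(p_two_kings_one_ace / p_two_kings)   # P(one ace | exactly two kings)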

3. A symmetric random walk takes place on the integers 0, 1, 2, ..., N with absorbing barriers at 0 and N, starting at k. Show that the probability that the walk is never absorbed is zero.

4. The so-called 'sure thing principle' asserts that if you prefer x to y given C, and also prefer x to y given C^c, then you surely prefer x to y. Agreed?

5. A pack contains m cards, labelled 1, 2, ..., m. The cards are dealt out in a random order, one by one. Given that the label of the kth card dealt is the largest of the first k cards dealt, what is the probability that it is also the largest in the pack?

1.8 Problems

1. A traditional fair die is thrown twice. What is the probability that:

(a) a six turns up exactly once?

(b) both numbers are odd?

(c) the sum of the scores is 4?

(d) the sum of the scores is divisible by 3?

2. A fair coin is thrown repeatedly. What is the probability that on the nth throw:

(a) a head appears for the first time?

(b) the numbers of heads and tails to date are equal?

(c) exactly two heads have appeared altogether to date?

(d) at least two heads have appeared to date?

3. Let F and G be σ-fields of subsets of Ω.

(a) Use elementary set operations to show that F is closed under countable intersections; that is, if A_1, A_2, ... are in F, then so is ⋂_i A_i.

(b) Let H = F ∩ G be the collection of subsets of Ω lying in both F and G. Show that H is a σ-field.

(c) Show that F ∪ G, the collection of subsets of Ω lying in either F or G, is not necessarily a σ-field.


4. Describe the underlying probability spaces for the following experiments:

(a) a biased coin is tossed three times;

(b) two balls are drawn without replacement from an urn which originally contained two ultramarine and two vermilion balls;

(c) a biased coin is tossed repeatedly until a head turns up.

5. Show that the probability that exactly one of the events A and B occurs is

P(A) + P(B) − 2P(A ∩ B).

6. Prove that P(A ∪ B ∪ C) = 1 − P(A^c | B^c ∩ C^c) P(B^c | C^c) P(C^c).

7. (a) If A is independent of itself, show that P(A) is 0 or 1. (b) If P(A) is 0 or 1, show that A is independent of all events B.

8. Let F be a σ-field of subsets of Ω, and suppose P : F → [0, 1] satisfies: (i) P(Ω) = 1, and (ii) P is additive, in that P(A ∪ B) = P(A) + P(B) whenever A ∩ B = ∅. Show that P(∅) = 0.

9. Suppose (Ω, F, P) is a probability space and B ∈ F satisfies P(B) > 0. Let Q : F → [0, 1] be defined by Q(A) = P(A | B). Show that (Ω, F, Q) is a probability space. If C ∈ F and Q(C) > 0, show that Q(A | C) = P(A | B ∩ C); discuss.

10. Let B_1, B_2, ... be a partition of the sample space Ω, each B_i having positive probability, and show that

P(A) = Σ_{j=1}^∞ P(A | B_j) P(B_j).

11. Prove Boole's inequalities:

P( ⋃_{i=1}^n A_i ) ≤ Σ_{i=1}^n P(A_i),    P( ⋂_{i=1}^n A_i ) ≥ 1 − Σ_{i=1}^n P(A_i^c).

12. Prove that

P( ⋂_{i=1}^n A_i ) = Σ_i P(A_i) − Σ_{i<j} P(A_i ∪ A_j) + Σ_{i<j<k} P(A_i ∪ A_j ∪ A_k) − ⋯ − (−1)^n P(A_1 ∪ A_2 ∪ ⋯ ∪ A_n).

13. Let A_1, A_2, ..., A_n be events, and let N_k be the event that exactly k of the A_i occur. Prove the result sometimes referred to as Waring's theorem:

P(N_k) = Σ_{i=0}^{n−k} (−1)^i C(k + i, k) S_{k+i},   where   S_j = Σ_{i_1 < i_2 < ⋯ < i_j} P(A_{i_1} ∩ A_{i_2} ∩ ⋯ ∩ A_{i_j}).

Use this result to find an expression for the probability that a purchase of six packets of Corn Flakes yields exactly three distinct busts (see Exercise (1.3.4)).

14. Prove Bayes's formula: if A_1, A_2, ..., A_n is a partition of Ω, each A_i having positive probability, then

P(A_j | B) = P(B | A_j) P(A_j) / Σ_{i=1}^n P(B | A_i) P(A_i).


15. A random number N of dice is thrown. Let A_i be the event that N = i, and assume that P(A_i) = 2^{−i}, i ≥ 1. The sum of the scores is S. Find the probability that:

(a) N = 2 given S = 4;

(b) S = 4 given N is even;

(c) N = 2, given that S = 4 and the first die showed 1;

(d) the largest number shown by any die is r, where S is unknown.

16. Let A_1, A_2, ... be a sequence of events. Define

B_n = ⋃_{m=n}^∞ A_m,    C_n = ⋂_{m=n}^∞ A_m.

Clearly C_n ⊆ A_n ⊆ B_n. The sequences {B_n} and {C_n} are decreasing and increasing respectively with limits

lim B_n = B = ⋂_n B_n = ⋂_n ⋃_{m≥n} A_m,    lim C_n = C = ⋃_n C_n = ⋃_n ⋂_{m≥n} A_m.

The events B and C are denoted lim sup_{n→∞} A_n and lim inf_{n→∞} A_n respectively. Show that

(a) B = {ω ∈ Ω : ω ∈ A_n for infinitely many values of n},

(b) C = {ω ∈ Ω : ω ∈ A_n for all but finitely many values of n}.

We say that the sequence {A_n} converges to a limit A = lim A_n if B and C are the same set A. Suppose that A_n → A and show that

(c) A is an event, in that A ∈ F,

(d) P(A_n) → P(A).

17. In Problem (1.8.16) above, show that B and C are independent whenever B_n and C_n are independent for all n. Deduce that if this holds and furthermore A_n → A, then P(A) equals either zero or one.

18. Show that the assumption that P is countably additive is equivalent to the assumption that P is continuous. That is to say, show that if a function P : F → [0, 1] satisfies P(∅) = 0, P(Ω) = 1, and P(A ∪ B) = P(A) + P(B) whenever A, B ∈ F and A ∩ B = ∅, then P is countably additive (in the sense of satisfying Definition (1.3.1b)) if and only if P is continuous (in the sense of Lemma (1.3.5)).

19. Anne, Betty, Chloe, and Daisy were all friends at school. Subsequently each of the C(4, 2) = 6 subpairs meet up; at each of the six meetings the pair involved quarrel with some fixed probability p, or become firm friends with probability 1 − p. Quarrels take place independently of each other. In future, if any of the four hears a rumour, then she tells it to her firm friends only. If Anne hears a rumour, what is the probability that:

(a) Daisy hears it?

(b) Daisy hears it if Anne and Betty have quarrelled?

(c) Daisy hears it if Betty and Chloe have quarrelled?

(d) Daisy hears it if she has quarrelled with Anne?

20. A biased coin is tossed repeatedly. Each time there is a probability p of a head turning up. Let p_n be the probability that an even number of heads has occurred after n tosses (zero is an even number). Show that p_0 = 1 and that p_n = p(1 − p_{n−1}) + (1 − p)p_{n−1} if n ≥ 1. Solve this difference equation.
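A quick numerical check of the recurrence in Problem 20 (an illustrative sketch, not from the book; the values of p and n are arbitrary): iterate the difference equation and compare with a direct simulation of n tosses.

    import random

    def p_even_by_recurrence(p, n):
        prob = 1.0                        # p_0 = 1
        for _ in range(n):
            prob = p * (1 - prob) + (1 - p) * prob
        return prob

    def p_even_by_simulation(p, n, trials=100_000):
        even = sum(sum(random.random() < p for _ in range(n)) % 2 == 0
                   for _ in range(trials))
        return even / trials

    print(p_even_by_recurrence(0.3, 10), p_even_by_simulation(0.3, 10))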

21. A biased coin is tossed repeatedly. Find the probability that there is a run of r heads in a row before there is a run of s tails, where r and s are positive integers.

22. A bowl contains twenty cherries, exactly fifteen of which have had their stones removed. A greedy pig eats five whole cherries, picked at random, without remarking on the presence or absence of stones. Subsequently, a cherry is picked randomly from the remaining fifteen.


(a) What is the probability that this cherry contains a stone?

(b) Given that this cherry contains a stone, what is the probability that the pig consumed at least one stone?

23. The 'ménages' problem poses the following question. Some consider it to be desirable that men and women alternate when seated at a circular table. If n couples are seated randomly according to this rule, show that the probability that nobody sits next to his or her partner is

(1/n!) Σ_{k=0}^n (−1)^k [2n/(2n − k)] C(2n − k, k) (n − k)!.

You may find it useful to show first that the number of ways of selecting k non-overlapping pairs of adjacent seats is C(2n − k, k) · 2n/(2n − k).

24. An urn contains b blue balls and r red balls. They are removed at random and not replaced. Show that the probability that the first red ball drawn is the (k + 1)th ball drawn equals C(r + b − k − 1, b − k) / C(r + b, b). Find the probability that the last ball drawn is red.

25. An urn contains a azure balls and c carmine balls, where ac ≠ 0. Balls are removed at random and discarded until the first time that a ball (B, say) is removed having a different colour from its predecessor. The ball B is now replaced and the procedure restarted. This process continues until the last ball is drawn from the urn. Show that this last ball is equally likely to be azure or carmine.

26. Protocols. A pack of four cards contains one spade, one club, and the two red aces. You deal two cards faces downwards at random in front of a truthful friend. She inspects them and tells you that one of them is the ace of hearts. What is the chance that the other card is the ace of diamonds? Perhaps 1/3?

Suppose that your friend's protocol was: (a) with no red ace, say "no red ace", (b) with the ace of hearts, say "ace of hearts", (c) with the ace of diamonds but not the ace of hearts, say "ace of diamonds". Show that the probability in question is 1/3.

Devise a possible protocol for your friend such that the probability in question is zero.

27. Eddington's controversy. Four witnesses, A, B, C, and D, at a trial each speak the truth with probability 1/3 independently of each other. In their testimonies, A claimed that B denied that C declared that D lied. What is the (conditional) probability that D told the truth? [This problem seems to have appeared first as a parody in a university magazine of the 'typical' Cambridge Philosophy Tripos question.]

28. The probabilistic method. 10 per cent of the surface of a sphere S is coloured blue, the rest is red. Show that, irrespective of the manner in which the colours are distributed, it is possible to inscribe a cube in S with all its vertices red.

29. Repulsion. The event A is said to be repelled by the event B if P(A | B) < P(A), and to be attracted by B if P(A | B) > P(A). Show that if B attracts A, then A attracts B, and B^c repels A.

If A attracts B, and B attracts C, does A attract C?

30. Birthdays. If m students born on independent days in 1991 are attending a lecture, show that the probability that at least two of them share a birthday is p = 1 − (365)! / {(365 − m)! 365^m}. Show that p > 1/2 when m = 23.

31. Lottery. You choose r of the first n positive integers, and a lottery chooses a random subset L of the same size. What is the probability that:

(a) L includes no consecutive integers?


(b) L includes exactly one pair of consecutive integers?

(c) the numbers in L are drawn in increasing order?

(d) your choice of numbers is the same as L?

(e) there are exactly k of your numbers matching members of L?


32. Bridge. During a game of bridge, you are dealt at random a hand of thirteen cards. With an obvious notation, show that P(4S, 3H, 3D, 3C) ≈ 0.026 and P(4S, 4H, 3D, 2C) ≈ 0.018. However if suits are not specified, so numbers denote the shape of your hand, show that P(4, 3, 3, 3) ≈ 0.11 and P(4, 4, 3, 2) ≈ 0.22.

33. Poker. During a game of poker, you are dealt a five-card hand at random. With the convention that aces may count high or low, show that:

P(1 pair) ≈ 0.423,  P(2 pairs) ≈ 0.0475,  P(3 of a kind) ≈ 0.021,
P(straight) ≈ 0.0039,  P(flush) ≈ 0.0020,  P(full house) ≈ 0.0014,
P(4 of a kind) ≈ 0.00024,  P(straight flush) ≈ 0.000015.

34. Poker dice. There are five dice each displaying 9, 10, J, Q, K, A. Show that, when rolled:

P(no 2 alike) ≈ 0.093,  P(1 pair) ≈ 0.46,  P(2 pairs) ≈ 0.23,  P(3 of a kind) ≈ 0.15,
P(full house) ≈ 0.039,  P(4 of a kind) ≈ 0.019,  P(5 of a kind) ≈ 0.0008.

35. You are lost in the National Park of Bandrika†. Tourists comprise two-thirds of the visitors to the park, and give a correct answer to requests for directions with probability 3/4. (Answers to repeated questions are independent, even if the question and the person are the same.) If you ask a Bandrikan for directions, the answer is always false.

(a) You ask a passer-by whether the exit from the Park is East or West. The answer is East. What is the probability this is correct?

(b) You ask the same person again, and receive the same reply. Show that the probability that it is correct is 1/2.

(c) You ask the same person again, and receive the same reply. What is the probability that it is correct?

(d) You ask for the fourth time, and receive the answer East. Show that the probability it is correct is 27/70.

(e) Show that, had the fourth answer been West instead, the probability that East is nevertheless correct is 9/10.

36. Mr Bayes goes to Bandrika. Tom is in the same position as you were in the previous problem, but he has reason to believe that, with probability ε, East is the correct answer. Show that:

(a) whatever answer first received, Tom continues to believe that East is correct with probability ε,

(b) if the first two replies are the same (that is, either WW or EE), Tom continues to believe that East is correct with probability ε,

(c) after three like answers, Tom will calculate as follows, in the obvious notation:

P(East correct | EEE) = 9ε/(11 − 2ε),    P(East correct | WWW) = 11ε/(9 + 2ε).

Evaluate these when ε = 1/2.

†A fictional country made famous in the Hitchcock film 'The Lady Vanishes'.


37. Bonferroni's inequality. Show that

P( ⋃_{r=1}^n A_r ) ≥ Σ_{r=1}^n P(A_r) − Σ_{r<k} P(A_r ∩ A_k).

38. Kounias's inequality. Show that

P( ⋃_{r=1}^n A_r ) ≤ min_k { Σ_{r=1}^n P(A_r) − Σ_{r : r≠k} P(A_r ∩ A_k) }.

39. The n passengers for a Bell-Air flight in an airplane with n seats have been told their seat numbers. They get on the plane one by one. The first person sits in the wrong seat. Subsequent passengers sit in their assigned seats whenever they find them available, or otherwise in a randomly chosen empty seat. What is the probability that the last passenger finds his seat free?
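Problem 39 is easy to simulate. The sketch below (an illustrative addition; it reads 'the wrong seat' as a uniformly chosen seat other than the first passenger's own, which is an assumption about the intended model) estimates the probability that the last passenger finds his seat free.

    import random

    def last_seat_free(n, trials=100_000):
        count = 0
        for _ in range(trials):
            free = set(range(n))
            # passenger 0 sits in a randomly chosen wrong seat
            seat = random.choice([s for s in free if s != 0])
            free.remove(seat)
            for p in range(1, n - 1):
                seat = p if p in free else random.choice(list(free))
                free.remove(seat)
            count += (n - 1) in free      # is the last passenger's seat still free?
        return count / trials

    print(last_seat_free(8))   # compare with your answer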


2 Random variables and their distributions

2.1 Exercises. Random variables

1. Let X be a random variable on a given probability space, and let a ∈ ℝ. Show that

(i) aX is a random variable,

(ii) X − X = 0, the random variable taking the value 0 always, and X + X = 2X.

2. A random variable X has distribution function F. What is the distribution function of Y = aX + b, where a and b are real constants?

3. A fair coin is tossed n times. Show that, under reasonable assumptions, the probability of exactly k heads is C(n, k)(1/2)^n. What is the corresponding quantity when heads appears with probability p on each toss?

4. Show that if F and G are distribution functions and 0 ≤ λ ≤ 1 then λF + (1 − λ)G is a distribution function. Is the product FG a distribution function?

5. Let F be a distribution function and r a positive integer. Show that the following are distribution functions:

(a) F(x)^r, (b) 1 − {1 − F(x)}^r, (c) F(x) + {1 − F(x)} log{1 − F(x)}, (d) {F(x) − 1}e + exp{1 − F(x)}.

2.2 Exercises. The law of averages

1. You wish to ask each of a large number of people a question to which the answer "yes" is embarrassing. The following procedure is proposed in order to determine the embarrassed fraction of the population. As the question is asked, a coin is tossed out of sight of the questioner. If the answer would have been "no" and the coin shows heads, then the answer "yes" is given. Otherwise people respond truthfully. What do you think of this procedure?
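To see what the procedure of Exercise 1 achieves, note that if q is the true embarrassed fraction then a "yes" is returned with probability q + (1 − q)/2, so q is recoverable from the observed proportion of "yes" answers. A small illustrative sketch (not part of the original text; names and sample size are arbitrary):

    import random

    def estimate_embarrassed_fraction(true_q, n=100_000):
        yes = 0
        for _ in range(n):
            embarrassed = random.random() < true_q
            heads = random.random() < 0.5
            yes += embarrassed or heads   # "yes" if truthful yes, or truthful no plus heads
        return 2 * yes / n - 1            # invert P(yes) = (1 + q)/2

    print(estimate_embarrassed_fraction(0.2))   # close to 0.2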

2. A coin is tossed repeatedly and heads turns up on each toss with probability p. Let H_n and T_n be the numbers of heads and tails in n tosses. Show that, for ε > 0,

P( 2p − 1 − ε ≤ (1/n)(H_n − T_n) ≤ 2p − 1 + ε ) → 1   as n → ∞.

3. Let {X_r : r ≥ 1} be observations which are independent and identically distributed with unknown distribution function F. Describe and justify a method for estimating F(x).


2.3 Exercises. Discrete and continuous variables

1. Let X be a random variable with distribution function F, and let a = (a_m : −∞ < m < ∞) be a strictly increasing sequence of real numbers satisfying a_{−m} → −∞ and a_m → ∞ as m → ∞. Define G(x) = P(X ≤ a_m) when a_{m−1} ≤ x < a_m, so that G is the distribution function of a discrete random variable. How does the function G behave as the sequence a is chosen in such a way that sup_m |a_m − a_{m−1}| becomes smaller and smaller?

2. Let X be a random variable and let g : ℝ → ℝ be continuous and strictly increasing. Show that Y = g(X) is a random variable.

3. Let X be a random variable with distribution function

P(X ≤ x) = 0 if x ≤ 0,  x if 0 < x ≤ 1,  1 if x > 1.

Let F be a distribution function which is continuous and strictly increasing. Show that Y = F^{−1}(X) is a random variable having distribution function F. Is it necessary that F be continuous and/or strictly increasing?

4. Show that, if f and g are density functions, and 0 ≤ λ ≤ 1, then λf + (1 − λ)g is a density. Is the product fg a density function?

5. Which of the following are density functions? Find c and the corresponding distribution function F for those that are.

(a) f(x) = cx^{−d} for x > 1, and 0 otherwise.

(b) f(x) = c e^x (1 + e^x)^{−2}, x ∈ ℝ.

2.4 Exercises. Worked examples

1. Let X be a random variable with a continuous distribution function F. Find expressions for the distribution functions of the following random variables:

(a) X^2, (b) √X, (c) sin X, (d) G^{−1}(X), (e) F(X), (f) G^{−1}(F(X)),

where G is a continuous and strictly increasing function.

2. Truncation. Let X be a random variable with distribution function F, and let a < b. Sketch the distribution functions of the 'truncated' random variables Y and Z given by

Y = a if X < a,  X if a ≤ X ≤ b,  b if X > b;        Z = X if |X| ≤ b,  0 if |X| > b.

Indicate how these distribution functions behave as a → −∞, b → ∞.


2.5 Exercises. Random vectors

1. A fair coin is tossed twice. Let X be the number of heads, and let W be the indicator function of the event {X = 2}. Find P(X = x, W = w) for all appropriate values of x and w.

2. Let X be a Bernoulli random variable, so that P(X = 0) = 1 − p, P(X = 1) = p. Let Y = 1 − X and Z = XY. Find P(X = x, Y = y) and P(X = x, Z = z) for x, y, z ∈ {0, 1}.

3. The random variables X and Y have joint distribution function

F_{X,Y}(x, y) = 0 if x < 0,  and  F_{X,Y}(x, y) = (1 − e^{−x})(1/2 + (1/π) tan^{−1} y) if x ≥ 0.

Show that X and Y are (jointly) continuously distributed.

4. Let X and Y have joint distribution function F. Show that

P(a < X ≤ b, c < Y ≤ d) = F(b, d) − F(a, d) − F(b, c) + F(a, c)

whenever a < b and c < d.

5. Let X, Y be discrete random variables taking values in the integers, with joint mass function f. Show that, for integers x, y,

f(x, y) = P(X ≥ x, Y ≤ y) − P(X ≥ x + 1, Y ≤ y) − P(X ≥ x, Y ≤ y − 1) + P(X ≥ x + 1, Y ≤ y − 1).

Hence find the joint mass function of the smallest and largest numbers shown in r rolls of a fair die.

6. Is the function F(x, y) = 1 − e^{−xy}, 0 ≤ x, y < ∞, the joint distribution function of some pair of random variables?

2.7 Problems

1. Each toss of a coin results in a head with probability p. The coin is tossed until the first head appears. Let X be the total number of tosses. What is P(X > m)? Find the distribution function of the random variable X.

2. (a) Show that any discrete random variable may be written as a linear combination of indicator variables.

(b) Show that any random variable may be expressed as the limit of an increasing sequence of discrete random variables.

(c) Show that the limit of any increasing convergent sequence of random variables is a random variable.

3. (a) Show that, if X and Y are random variables on a probability space (Ω, F, P), then so are X + Y, XY, and min{X, Y}.

(b) Show that the set of all random variables on a given probability space (Ω, F, P) constitutes a vector space over the reals. If Ω is finite, write down a basis for this space.

4. Let X have distribution function

F(x) = 0 if x < 0,  (1/2)x if 0 ≤ x ≤ 2,  1 if x > 2,


and let Y = X^2. Find

(a) P(1/2 ≤ X ≤ 3/2), (b) P(1 ≤ X < 2), (c) P(Y ≤ X), (d) P(X ≤ 2Y), (e) P(X + Y ≤ 3/4), (f) the distribution function of Z = √X.

5. Let X have distribution function

F(x) = 0 if x < −1,  1 − p if −1 ≤ x < 0,  1 − p + (1/2)xp if 0 ≤ x ≤ 2,  1 if x > 2.

Sketch this function, and find: (a) P(X = −1), (b) P(X = 0), (c) P(X ≥ 1).

6. Buses arrive at ten minute intervals starting at noon. A man arrives at the bus stop a random number X minutes after noon, where X has distribution function

P(X ≤ x) = 0 if x < 0,  x/60 if 0 ≤ x ≤ 60,  1 if x > 60.

What is the probability that he waits less than five minutes for a bus?

7. Airlines find that each passenger who reserves a seat fails to turn up with probability 1/10, independently of the other passengers. So Teeny Weeny Airlines always sell 10 tickets for their 9 seat aeroplane while Blockbuster Airways always sell 20 tickets for their 18 seat aeroplane. Which is more often over-booked? (A numerical sketch follows Problem 11 below.)

8. A fairground performer claims the power of telekinesis. The crowd throws coins and he wills them to fall heads up. He succeeds five times out of six. What chance would he have of doing at least as well if he had no supernatural powers?

9. Express the distribution functions of

X^+ = max{0, X},  X^− = −min{0, X},  |X| = X^+ + X^−,  −X,

in terms of the distribution function F of the random variable X.

10. Show that F_X(x) is continuous at x = x_0 if and only if P(X = x_0) = 0.

11. The real number m is called a median of the distribution function F whenever lim_{y↑m} F(y) ≤ 1/2 ≤ F(m). Show that every distribution function F has at least one median, and that the set of medians of F is a closed interval of ℝ.
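Problem 7 above reduces to two binomial tail probabilities. The sketch below (an illustrative addition, standard library only) computes, for each airline, the chance that more ticket-holders turn up than there are seats.

    from math import comb

    def p_overbooked(tickets, seats, p_show=0.9):
        # probability that the number of passengers who show up exceeds the seats
        return sum(comb(tickets, k) * p_show**k * (1 - p_show)**(tickets - k)
                   for k in range(seats + 1, tickets + 1))

    print(p_overbooked(10, 9))     # Teeny Weeny Airlines
    print(p_overbooked(20, 18))    # Blockbuster Airways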

12. Show that it is not possible to weight two dice in such a way that the sum of the two numbers shown by these loaded dice is equally likely to take any value between 2 and 12 (inclusive).

13. A function d : S × S → ℝ is called a metric on S if: (i) d(s, t) = d(t, s) ≥ 0 for all s, t ∈ S, (ii) d(s, t) = 0 if and only if s = t, and (iii) d(s, t) ≤ d(s, u) + d(u, t) for all s, t, u ∈ S.

(a) Lévy metric. Let F and G be distribution functions and define the Lévy metric

d_L(F, G) = inf{ ε > 0 : G(x − ε) − ε ≤ F(x) ≤ G(x + ε) + ε for all x }.

Show that d_L is indeed a metric on the space of distribution functions.


(b) Total variation distance. Let X and Y be integer-valued random variables, and let

d_TV(X, Y) = Σ_k |P(X = k) − P(Y = k)|.

Show that d_TV satisfies (i) and (iii) with S the space of integer-valued random variables, and that d_TV(X, Y) = 0 if and only if P(X = Y) = 1. Thus d_TV is a metric on the space of equivalence classes of S with equivalence relation given by X ∼ Y if P(X = Y) = 1. We call d_TV the total variation distance.

Show that d_TV(X, Y) = 2 sup_{A⊆ℤ} |P(X ∈ A) − P(Y ∈ A)|.

14. Ascertain in the following cases whether or not F is the joint distribution function of some pair (X, Y) of random variables. If your conclusion is affirmative, find the distribution functions of X and Y separately.

(a) F(x, y) = 1 − e^{−x−y} if x, y ≥ 0, and 0 otherwise.

(b) F(x, y) = 1 − e^{−x} − xe^{−y} if 0 ≤ x ≤ y,  1 − e^{−y} − ye^{−y} if 0 ≤ y ≤ x,  and 0 otherwise.

15. It is required to place in order n books B_1, B_2, ..., B_n on a library shelf in such a way that readers searching from left to right waste as little time as possible on average. Assuming that each reader requires book B_i with probability p_i, find the ordering of the books which minimizes P(T ≥ k) for all k, where T is the (random) number of titles examined by a reader before discovery of the required book.

16. Transitive coins. Three coins each show heads with probability 3/5 and tails otherwise. The first counts 10 points for a head and 2 for a tail, the second counts 4 points for both head and tail, and the third counts 3 points for a head and 20 for a tail.

You and your opponent each choose a coin; you cannot choose the same coin. Each of you tosses your coin and the person with the larger score wins £10^10. Would you prefer to be the first to pick a coin or the second?

17. Before the development of radar and inertial navigation, flying to isolated islands (for example, from Los Angeles to Hawaii) was somewhat 'hit or miss ' . In heavy cloud or at night it was necessary to fly by dead reckoning, and then to search the surface. With the aid of a radio, the pilot had a good idea of the correct great circle along which to search, but could not be sure which of the two directions along this great circle was correct (since a strong tailwind could have carried the plane over its target). When you are the pilot, you calculate that you can make n searches before your plane will run out of fuel. On each search you will discover the island with probability P (if it is indeed in the direction of the search) independently of the results of other searches ; you estimate initially that there is probability a that the island is ahead of you. What policy should you adopt in deciding the directions of your various searches in order to maximize the probability of locating the island?

18. Eight pawns are placed randomly on a chessboard, no more than one to a square. What is the probability that: (a) they are in a straight line (do not forget the diagonals)? (b) no two are in the same row or column?


19. Which of the following are distribution functions? For those that are, give the corresponding density function f.

(a) F(x) = 1 − e^{−x^2} for x ≥ 0, and 0 otherwise.

(b) F(x) = e^{−1/x} for x > 0, and 0 otherwise.

(c) F(x) = e^x/(e^x + e^{−x}), x ∈ ℝ.

(d) F(x) = e^{−x^2} + e^x/(e^x + e^{−x}), x ∈ ℝ.

20. (a) If U and V are jointly continuous, show that P(U = V) = 0. (b) Let X be uniformly distributed on (0, 1), and let Y = X. Then X and Y are continuous, and P(X = Y) = 1. Is there a contradiction here?


3 Discrete random variables

3.1 Exercises. Probability mass functions

1. For what values of the constant C do the following define mass functions on the positive integers 1, 2, ...?
(a) Geometric: f(x) = C 2^{−x}.
(b) Logarithmic: f(x) = C 2^{−x}/x.
(c) Inverse square: f(x) = C x^{−2}.
(d) 'Modified' Poisson: f(x) = C 2^x/x!.

2. For a random variable X having (in turn) each of the four mass functions of Exercise (1), find:
(i) P(X > 1), (ii) the most probable value of X, (iii) the probability that X is even.

3. We toss n coins, and each one shows heads with probability p, independently of each of the others. Each coin which shows heads is tossed again. What is the mass function of the number of heads resulting from the second round of tosses?

4. Let S_k be the set of positive integers whose base-10 expansion contains exactly k elements (so that, for example, 1024 ∈ S_4). A fair coin is tossed until the first head appears, and we write T for the number of tosses required. We pick a random element, N say, from S_T, each such element having equal probability. What is the mass function of N?

5. Log-convexity. (a) Show that, if X is a binomial or Poisson random variable, then the mass function f(k) = P(X = k) has the property that f(k − 1)f(k + 1) ≤ f(k)^2.
(b) Show that, if f(k) = 90/(πk)^4, k ≥ 1, then f(k − 1)f(k + 1) ≥ f(k)^2.
(c) Find a mass function f such that f(k)^2 = f(k − 1)f(k + 1), k ≥ 1.

3.2 Exercises. Independence

1. Let X and Y be independent random variables, each taking the values −1 or 1 with probability 1/2, and let Z = XY. Show that X, Y, and Z are pairwise independent. Are they independent?

2. Let X and Y be independent random variables taking values in the positive integers and having the same mass function f(x) = 2^{−x} for x = 1, 2, .... Find:

(a) P(min{X, Y} ≤ x), (b) P(Y > X), (c) P(X = Y), (d) P(X ≥ kY), for a given positive integer k, (e) P(X divides Y), (f) P(X = rY), for a given positive rational r.


3. Let X_1, X_2, X_3 be independent random variables taking values in the positive integers and having mass functions given by P(X_i = x) = (1 − p_i)p_i^{x−1} for x = 1, 2, ..., and i = 1, 2, 3.

(a) Show that

P(X_1 < X_2 < X_3) = (1 − p_1)(1 − p_2) p_2 p_3^2 / {(1 − p_2 p_3)(1 − p_1 p_2 p_3)}.

(b) Find P(X_1 ≤ X_2 ≤ X_3).

4. Three players, A, B, and C, take turns to roll a die; they do this in the order ABCABCA....

(a) Show that the probability that, of the three players, A is the first to throw a 6, B the second, and C the third, is 216/1001.

(b) Show that the probability that the first 6 to appear is thrown by A, the second 6 to appear is thrown by B, and the third 6 to appear is thrown by C, is 46656/753571.

5. Let X_r, 1 ≤ r ≤ n, be independent random variables which are symmetric about 0; that is, X_r and −X_r have the same distributions. Show that, for all x, P(S_n ≥ x) = P(S_n ≤ −x) where S_n = Σ_{r=1}^n X_r.

Is the conclusion necessarily true without the assumption of independence?

3.3 Exercises. Expectation

1. Is it generally true that E(1/X) = 1/E(X)? Is it ever true that E(1/X) = 1/E(X)?

2. Coupons. Every package of some intrinsically dull commodity includes a small and exciting plastic object. There are c different types of object, and each package is equally likely to contain any given type. You buy one package each day.

(a) Find the mean number of days which elapse between the acquisitions of the jth new type of object and the (j + 1)th new type.

(b) Find the mean number of days which elapse before you have a full set of objects. (A simulation sketch follows the footnotes below.)

3. Each member of a group of n players rolls a die.

(a) For any pair of players who throw the same number, the group scores 1 point. Find the mean and variance of the total score of the group.

(b) Find the mean and variance of the total score if any pair of players who throw the same number scores that number.

4. St Petersburg paradox†. A fair coin is tossed repeatedly. Let T be the number of tosses until the first head. You are offered the following prospect, which you may accept on payment of a fee. If T = k, say, then you will receive £2^k. What would be a 'fair' fee to ask of you?

5. Let X have mass function

f(x) = {x(x + 1)}^{−1} if x = 1, 2, ...,  and 0 otherwise,

and let α ∈ ℝ. For what values of α is it the case‡ that E(X^α) < ∞?

†This problem was mentioned by Nicholas Bernoulli in 1713, and Daniel Bernoulli wrote about the question for the Academy of St Petersburg.

‡If α is not integral, then E(X^α) is called the fractional moment of order α of X. A point concerning notation: for real α and complex x = re^{iθ}, x^α should be interpreted as r^α e^{iθα}, so that |x^α| = r^α. In particular, E(|X^α|) = E(|X|^α).
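The means asked for in Exercise (3.3.2) can be checked by simulation. The sketch below (an illustrative addition, not from the book; the trial count and the value c = 6 are arbitrary) estimates the mean number of packages needed to collect a full set of c types.

    import random

    def mean_days_to_full_set(c, trials=20_000):
        total = 0
        for _ in range(trials):
            seen, days = set(), 0
            while len(seen) < c:
                seen.add(random.randrange(c))   # today's package contains a uniform type
                days += 1
            total += days
        return total / trials

    print(mean_days_to_full_set(6))   # compare with your formula for c = 6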


6. Show that var(a + X) = var(X) for any random variable X and constant a.

7. Arbitrage. Suppose you find a warm-hearted bookmaker offering payoff odds of π(k) against the kth horse in an n-horse race where Σ_{k=1}^n {π(k) + 1}^{−1} < 1. Show that you can distribute your bets in such a way as to ensure you win.

8. You roll a conventional fair die repeatedly. If it shows 1, you must stop, but you may choose to stop at any prior time. Your score is the number shown by the die on the final roll. What stopping strategy yields the greatest expected score? What strategy would you use if your score were the square of the final roll?

9. Continuing with Exercise (8), suppose now that you lose c points from your score each time you roll the die. What strategy maximizes the expected final score if c = 1? What is the best strategy if c = 1?

3.4 Exercises. Indicators and matching

1. A biased coin is tossed n times, and heads shows with probability p on each toss. A run is a sequence of throws which result in the same outcome, so that, for example, the sequence HHTHTTH contains five runs. Show that the expected number of runs is 1 + 2(n − 1)p(1 − p). Find the variance of the number of runs.

2. An urn contains n balls numbered 1, 2, ..., n. We remove k balls at random (without replacement) and add up their numbers. Find the mean and variance of the total.

3. Of the 2n people in a given collection of n couples, exactly m die. Assuming that the m have been picked at random, find the mean number of surviving couples. This problem was formulated by Daniel Bernoulli in 1768.

4. Urn R contains n red balls and urn B contains n blue balls. At each stage, a ball is selected at random from each urn, and they are swapped. Show that the mean number of red balls in urn R after stage k is (1/2)n{1 + (1 − 2/n)^k}. This 'diffusion model' was described by Daniel Bernoulli in 1769.

5. Consider a square with diagonals, with distinct source and sink. Each edge represents a component which is working correctly with probability p, independently of all other components. Write down an expression for the Boolean function which equals 1 if and only if there is a working path from source to sink, in terms of the indicator functions X_i of the events {edge i is working} as i runs over the set of edges. Hence calculate the reliability of the network.

6. A system is called a 'k out of n' system if it contains n components and it works whenever k or more of these components are working. Suppose that each component is working with probability p, independently of the other components, and let X_c be the indicator function of the event that component c is working. Find, in terms of the X_c, the indicator function of the event that the system works, and deduce the reliability of the system.

7. The probabilistic method. Let G = (V, E) be a finite graph. For any set W of vertices and any edge e ∈ E, define the indicator function

I_W(e) = 1 if e connects W and W^c, and 0 otherwise.

Set N_W = Σ_{e∈E} I_W(e). Show that there exists W ⊆ V such that N_W ≥ |E|/2.

8. A total of n bar magnets are placed end to end in a line with random independent orientations. Adjacent like poles repel, ends with opposite polarities join to form blocks. Let X be the number of blocks of joined magnets. Find E(X) and var(X).


9. Matching. (a) Use the inclusion–exclusion formula (3.4.2) to derive the result of Example (3.4.3), namely: in a random permutation of the first n integers, the probability that exactly r retain their original positions is

(1/r!) ( 1/2! − 1/3! + ⋯ + (−1)^{n−r}/(n − r)! ).

(b) Let d_n be the number of derangements of the first n integers (that is, rearrangements with no integers in their original positions). Show that d_{n+1} = n d_n + n d_{n−1} for n ≥ 2. Deduce the result of part (a).

3.5 Exercises. Examples of discrete variables

1. De Moivre trials. Each trial may result in any of t given outcomes, the ith outcome having probability p_i. Let N_i be the number of occurrences of the ith outcome in n independent trials. Show that

    P(N_i = n_i for 1 ≤ i ≤ t) = \frac{n!}{n_1!\,n_2!\cdots n_t!}\,p_1^{n_1}p_2^{n_2}\cdots p_t^{n_t}

for any collection n_1, n_2, ..., n_t of non-negative integers with sum n. The vector N is said to have the multinomial distribution.
2. In your pocket is a random number N of coins, where N has the Poisson distribution with parameter λ. You toss each coin once, with heads showing with probability p each time. Show that the total number of heads has the Poisson distribution with parameter λp.
3. Let X be Poisson distributed where P(X = n) = p_n(λ) = λ^n e^{−λ}/n! for n ≥ 0. Show that P(X ≤ n) = 1 − ∫_0^λ p_n(x) dx.
4. Capture–recapture. A population of b animals has had a number a of its members captured, marked, and released. Let X be the number of animals it is necessary to recapture (without re-release) in order to obtain m marked animals. Show that

    P(X = n) = \frac{a}{b}\binom{a-1}{m-1}\binom{b-a}{n-m}\Big/\binom{b-1}{n-1},

and find E(X). This distribution has been called negative hypergeometric.
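As an illustrative check of the mass function displayed above (a Python sketch of our own; the function names are hypothetical), one can simulate the recapture experiment directly and compare the empirical mean with the mean computed from that formula.

```python
import random
from math import comb

def sample_recaptures(a, b, m):
    """Draw without replacement from b balls, a of them marked, until m marked are seen."""
    urn = [1] * a + [0] * (b - a)
    random.shuffle(urn)
    marked = 0
    for n, ball in enumerate(urn, start=1):
        marked += ball
        if marked == m:
            return n

def pmf(a, b, m, n):
    # mass function displayed above: (a/b) C(a-1, m-1) C(b-a, n-m) / C(b-1, n-1)
    return a / b * comb(a - 1, m - 1) * comb(b - a, n - m) / comb(b - 1, n - 1)

a, b, m = 10, 30, 4
samples = [sample_recaptures(a, b, m) for _ in range(50_000)]
print(sum(samples) / len(samples))                                  # empirical mean
print(sum(n * pmf(a, b, m, n) for n in range(m, b - a + m + 1)))     # mean from the formula
```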

3.6 Exercises. Dependence

1. Show that the collection of random variables on a given probability space and having finite variance forms a vector space over the reals.
2. Find the marginal mass functions of the multinomial distribution of Exercise (3.5.1).
3. Let X and Y be discrete random variables with joint mass function

    f(x, y) = \frac{c}{(x+y-1)(x+y)(x+y+1)},   x, y = 1, 2, 3, ....

Find the marginal mass functions of X and Y, calculate c, and also the covariance of X and Y.
4. Let X and Y be discrete random variables with mean 0, variance 1, and covariance ρ. Show that E(max{X², Y²}) ≤ 1 + √(1 − ρ²).
5. Mutual information. Let X and Y be discrete random variables with joint mass function f.


(a) Show that E(log f_X(X)) ≥ E(log f_Y(X)).
(b) Show that the mutual information

    I = E\left(\log\frac{f(X, Y)}{f_X(X)\,f_Y(Y)}\right)

satisfies I ≥ 0, with equality if and only if X and Y are independent.


6. Voter paradox. Let X, Y, Z be discrete random variables with the property that their values are distinct with probability 1. Let a = P(X > Y), b = P(Y > Z), c = P(Z > X).
(a) Show that min{a, b, c} ≤ 2/3, and give an example where this bound is attained.
(b) Show that, if X, Y, Z are independent and identically distributed, then a = b = c = ½.
(c) Find min{a, b, c} and sup_p min{a, b, c} when P(X = 0) = 1, and Y, Z are independent with P(Z = 1) = P(Y = −1) = p, P(Z = −2) = P(Y = 2) = 1 − p. Here, sup_p denotes the supremum as p varies over [0, 1].
[Part (a) is related to the observation that, in an election, it is possible for more than half of the voters to prefer candidate A to candidate B, more than half B to C, and more than half C to A.]
7. Benford's distribution, or the law of anomalous numbers. If one picks a numerical entry at random from an almanac, or the annual accounts of a corporation, the first two significant digits, X, Y, are found to have approximately the joint mass function

    f(x, y) = \log_{10}\left(1 + \frac{1}{10x + y}\right),   1 ≤ x ≤ 9, 0 ≤ y ≤ 9.

Find the mass function of X and an approximation to its mean. [A heuristic explanation for this phenomenon may be found in the second of Feller's volumes (1971).]
8. Let X and Y have joint mass function

    f(j, k) = \frac{c(j + k)a^{j+k}}{j!\,k!},   j, k ≥ 0,

where a is a constant. Find c, P(X = j), P(X + Y = r), and E(X).

3.7 Exercises. Conditional distributions and conditional expectation

1. Show the following:
(a) E(aY + bZ | X) = aE(Y | X) + bE(Z | X) for a, b ∈ ℝ,
(b) E(Y | X) ≥ 0 if Y ≥ 0,
(c) E(1 | X) = 1,
(d) if X and Y are independent then E(Y | X) = E(Y),
(e) ('pull-through property') E(Yg(X) | X) = g(X)E(Y | X) for any suitable function g,
(f) ('tower property') E{E(Y | X, Z) | X} = E(Y | X) = E{E(Y | X) | X, Z}.
2. Uniqueness of conditional expectation. Suppose that X and Y are discrete random variables, and that φ(X) and ψ(X) are two functions of X satisfying

    E(φ(X)g(X)) = E(ψ(X)g(X)) = E(Yg(X))

for any function g for which all the expectations exist. Show that φ(X) and ψ(X) are almost surely equal, in that P(φ(X) = ψ(X)) = 1.


3. Suppose that the conditional expectation of Y given X is defined as the (almost surely) unique function ψ(X) such that E(ψ(X)g(X)) = E(Yg(X)) for all functions g for which the expectations exist. Show (a)–(f) of Exercise (1) above (with the occasional addition of the expression 'with probability 1').
4. How should we define var(Y | X), the conditional variance of Y given X? Show that var(Y) = E(var(Y | X)) + var(E(Y | X)).
5. The lifetime of a machine (in days) is a random variable T with mass function f. Given that the machine is working after t days, what is the mean subsequent lifetime of the machine when:
(a) f(x) = (N + 1)^{−1} for x ∈ {0, 1, ..., N},
(b) f(x) = 2^{−x} for x = 1, 2, ....
(The first part of Problem (3.11.13) may be useful.)
6. Let X_1, X_2, ... be identically distributed random variables with mean μ, and let N be a random variable taking values in the non-negative integers and independent of the X_i. Let S = X_1 + X_2 + ... + X_N. Show that E(S | N) = μN, and deduce that E(S) = μE(N).
7. A factory has produced n robots, each of which is faulty with probability φ. To each robot a test is applied which detects the fault (if present) with probability δ. Let X be the number of faulty robots, and Y the number detected as faulty. Assuming the usual independence, show that

    E(X | Y) = {nφ(1 − δ) + (1 − φ)Y}/(1 − φδ).

8. Families. Each child is equally likely to be male or female, independently of all other children.
(a) Show that, in a family of predetermined size, the expected number of boys equals the expected number of girls. Was the assumption of independence necessary?
(b) A randomly selected child is male; does the expected number of his brothers equal the expected number of his sisters? What happens if you do not require independence?
9. Let X and Y be independent with mean μ. Explain the error in the following equation:

    'E(X | X + Y = z) = E(X | X = z − Y) = E(z − Y) = z − μ'.

10. A coin shows heads with probability p. Let X_n be the number of flips required to obtain a run of n consecutive heads. Show that E(X_n) = Σ_{k=1}^{n} p^{−k}.
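The formula of Exercise 10 is easy to confirm numerically. The Python sketch below (our own illustrative check; the function name is hypothetical) simulates the waiting time for a run of n heads and compares it with Σ_{k=1}^{n} p^{−k}.

```python
import random

def flips_until_run(n, p):
    """Number of flips of a p-coin needed to see n consecutive heads."""
    run = flips = 0
    while run < n:
        flips += 1
        run = run + 1 if random.random() < p else 0
    return flips

n, p, trials = 5, 0.5, 20_000
print(sum(flips_until_run(n, p) for _ in range(trials)) / trials)
print(sum(p ** -k for k in range(1, n + 1)))   # claimed value, equal to 62 here
```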

3.8 Exercises. Sums of random variables

1. Let X and Y be independent variables, X being equally likely to take any value in {0, 1, ..., m}, and Y similarly in {0, 1, ..., n}. Find the mass function of Z = X + Y. The random variable Z is said to have the trapezoidal distribution.
2. Let X and Y have the joint mass function

    f(x, y) = \frac{c}{(x+y-1)(x+y)(x+y+1)},   x, y = 1, 2, 3, ....

Find the mass functions of U = X + Y and V = X − Y.
3. Let X and Y be independent geometric random variables with respective parameters α and β. Show that

    P(X + Y = z) = \frac{αβ}{α − β}\bigl\{(1 − β)^{z−1} − (1 − α)^{z−1}\bigr\}.


4. Let {X_r : 1 ≤ r ≤ n} be independent geometric random variables with parameter p. Show that Z = Σ_{r=1}^{n} X_r has a negative binomial distribution. [Hint: No calculations are necessary.]
5. Pepys's problem†. Sam rolls 6n dice once; he needs at least n sixes. Isaac rolls 6(n + 1) dice; he needs at least n + 1 sixes. Who is more likely to obtain the number of sixes he needs?
6. Let N be Poisson distributed with parameter λ. Show that, for any function g such that the expectations exist, E(Ng(N)) = λE(g(N + 1)). More generally, if S = Σ_{r=1}^{N} X_r, where {X_r : r ≥ 0} are independent identically distributed non-negative integer-valued random variables, show that

    E(Sg(S)) = λE(g(S + X_0)X_0).

3.9 Exercises. Simple random walk

1. Let T be the time which elapses before a simple random walk is absorbed at either of the absorbing barriers at 0 and N, having started at k where 0 ≤ k ≤ N. Show that P(T < ∞) = 1 and E(T^k) < ∞ for all k ≥ 1.
2. For simple random walk S with absorbing barriers at 0 and N, let W be the event that the particle is absorbed at 0 rather than at N, and let p_k = P(W | S_0 = k). Show that, if the particle starts at k where 0 < k < N, the conditional probability that the first step is rightwards, given W, equals pp_{k+1}/p_k. Deduce that the mean duration J_k of the walk, conditional on W, satisfies the equation

    pp_{k+1}J_{k+1} − p_k J_k + (p_k − pp_{k+1})J_{k−1} = −p_k   for 0 < k < N.

Show that we may take as boundary condition J_0 = 0. Find J_k in the symmetric case, when p = ½.
3. With the notation of Exercise (2), suppose further that at any step the particle may remain where it is with probability r where p + q + r = 1. Show that J_k satisfies

    pp_{k+1}J_{k+1} − (1 − r)p_k J_k + qp_{k−1}J_{k−1} = −p_k

and that, when ρ = q/p ≠ 1,

    J_k = \frac{1}{p − q}\cdot\frac{1}{ρ^k − ρ^N}\left\{k(ρ^k + ρ^N) − \frac{2Nρ^N(1 − ρ^k)}{1 − ρ^N}\right\}.

4. Problem of the points. A coin is tossed repeatedly, heads turning up with probability p on each toss. Player A wins the game if m heads appear before n tails have appeared, and player B wins otherwise. Let p_{mn} be the probability that A wins the game. Set up a difference equation for the p_{mn}. What are the boundary conditions?
5. Consider a simple random walk on the set {0, 1, 2, ..., N} in which each step is to the right with probability p or to the left with probability q = 1 − p. Absorbing barriers are placed at 0 and N. Show that the number X of positive steps of the walk before absorption satisfies

    E(X) = ½{D_k − k + N(1 − p_k)},

where D_k is the mean number of steps until absorption and p_k is the probability of absorption at 0.

6. (a) "Millionaires should always gamble, poor men never" [J. M. Keynes] .

(b) "If I wanted to gamble, I would buy a casino" [Po Getty].

(c) "That the chance of gain is naturally overvalued, we may learn from the universal success of lotteries" [Adam Smith, 1776] .

Discuss.

tPepys put a simple version of this problem to Newton in 1 693, but was reluctant to accept the correct reply he received.


3.10 Exercises. Random walk: counting sample paths

1. Consider a symmetric simple random walk S with S_0 = 0. Let T = min{n ≥ 1 : S_n = 0} be the time of the first return of the walk to its starting point. Show that

    P(T = 2n) = \frac{1}{2n − 1}\binom{2n}{n}2^{−2n},

and deduce that E(T^α) < ∞ if and only if α < ½. You may need Stirling's formula: n! ∼ n^{n+½}e^{−n}√(2π).
2. For a symmetric simple random walk starting at 0, show that the mass function of the maximum satisfies P(M_n = r) = P(S_n = r) + P(S_n = r + 1) for r ≥ 0.
3. For a symmetric simple random walk starting at 0, show that the probability that the first visit to S_{2n} takes place at time 2k equals the product P(S_{2k} = 0)P(S_{2n−2k} = 0), for 0 ≤ k ≤ n.

3.11 Problems

1. (a) Let X and Y be independent discrete random variables, and let g, h : ℝ → ℝ. Show that g(X) and h(Y) are independent.
(b) Show that two discrete random variables X and Y are independent if and only if f_{X,Y}(x, y) = f_X(x)f_Y(y) for all x, y ∈ ℝ.
(c) More generally, show that X and Y are independent if and only if f_{X,Y}(x, y) can be factorized as the product g(x)h(y) of a function of x alone and a function of y alone.
2. Show that if var(X) = 0 then X is almost surely constant; that is, there exists a ∈ ℝ such that P(X = a) = 1. (First show that if E(X²) = 0 then P(X = 0) = 1.)
3. (a) Let X be a discrete random variable and let g : ℝ → ℝ. Show that, when the sum is absolutely convergent,

    E(g(X)) = Σ_x g(x)P(X = x).

(b) If X and Y are independent and g, h : ℝ → ℝ, show that E(g(X)h(Y)) = E(g(X))E(h(Y)) whenever these expectations exist.
4. Let Ω = {ω_1, ω_2, ω_3}, with P(ω_1) = P(ω_2) = P(ω_3) = ⅓. Define X, Y, Z : Ω → ℝ by

    X(ω_1) = 1, X(ω_2) = 2, X(ω_3) = 3,
    Y(ω_1) = 2, Y(ω_2) = 3, Y(ω_3) = 1,
    Z(ω_1) = 2, Z(ω_2) = 2, Z(ω_3) = 1.

Show that X and Y have the same mass functions. Find the mass functions of X + Y, XY, and X/Y. Find the conditional mass functions f_{Y|Z} and f_{Z|Y}.
5. For what values of k and α is f a mass function, where:
(a) f(n) = k/{n(n + 1)}, n = 1, 2, ...,
(b) f(n) = kn^α, n = 1, 2, ... (zeta or Zipf distribution)?
6. Let X and Y be independent Poisson variables with respective parameters λ and μ. Show that:
(a) X + Y is Poisson, parameter λ + μ,
(b) the conditional distribution of X, given X + Y = n, is binomial, and find its parameters.


7. If X is geometric, show that P(X = n + k | X > n) = P(X = k) for k, n ≥ 1. Why do you think that this is called the 'lack of memory' property? Does any other distribution on the positive integers have this property?

8. Show that the sum of two independent binomial variables, bin(m, p) and bin(n, p) respectively, is bin(m + n, p).
9. Let N be the number of heads occurring in n tosses of a biased coin. Write down the mass function of N in terms of the probability p of heads turning up on each toss. Prove and utilize the identity

    \sum_{k}\binom{n}{2k}x^{2k}y^{n−2k} = \tfrac{1}{2}\{(x + y)^n + (y − x)^n\}

in order to calculate the probability p_n that N is even. Compare with Problem (1.8.20).

10. An urn contains N balls, b of which are blue and r (= N − b) of which are red. A random sample of n balls is withdrawn without replacement from the urn. Show that the number B of blue balls in this sample has the mass function

    P(B = k) = \binom{b}{k}\binom{r}{n−k}\Big/\binom{N}{n}.

This is called the hypergeometric distribution with parameters N, b, and n. Show further that if N, b, and r approach ∞ in such a way that b/N → p and r/N → 1 − p, then

    P(B = k) → \binom{n}{k}p^k(1 − p)^{n−k}.

You have shown that, for small n and large N, the distribution of B barely depends on whether or not the balls are replaced in the urn immediately after their withdrawal.
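A short numerical comparison makes the last statement vivid. The Python sketch below (illustrative only; parameter choices are ours) prints the hypergeometric mass function for increasing N alongside the limiting binomial mass function.

```python
from math import comb

def hypergeometric_pmf(N, b, n, k):
    """P(B = k): k blue in a sample of size n drawn without replacement."""
    return comb(b, k) * comb(N - b, n - k) / comb(N, n)

def binomial_pmf(n, p, k):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 5, 0.3
for N in (50, 500, 5000):
    b = int(p * N)
    print(N, [round(hypergeometric_pmf(N, b, n, k), 4) for k in range(n + 1)])
print('bin', [round(binomial_pmf(n, p, k), 4) for k in range(n + 1)])
```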

11. Let X and Y be independent bin(n, p) variables, and let Z = X + Y. Show that the conditional distribution of X given Z = N is the hypergeometric distribution of Problem (3.11.10).
12. Suppose X and Y take values in {0, 1}, with joint mass function f(x, y). Write f(0, 0) = a, f(0, 1) = b, f(1, 0) = c, f(1, 1) = d, and find necessary and sufficient conditions for X and Y to be: (a) uncorrelated, (b) independent.

13. (a) If X takes non-negative integer values show that

    E(X) = \sum_{n=0}^{\infty} P(X > n).

(b) An urn contains b blue and r red balls. Balls are removed at random until the first blue ball is drawn. Show that the expected number drawn is (b + r + 1)/(b + 1).
(c) The balls are replaced and then removed at random until all the remaining balls are of the same colour. Find the expected number remaining in the urn.
14. Let X_1, X_2, ..., X_n be independent random variables, and suppose that X_k is Bernoulli with parameter p_k. Show that Y = X_1 + X_2 + ... + X_n has mean and variance given by

    E(Y) = \sum_{k=1}^{n} p_k,   var(Y) = \sum_{k=1}^{n} p_k(1 − p_k).


Show that, for E(Y) fixed, var(Y) is a maximum when p_1 = p_2 = ... = p_n. That is to say, the variation in the sum is greatest when individuals are most alike. Is this contrary to intuition?
15. Let X = (X_1, X_2, ..., X_n) be a vector of random variables. The covariance matrix V(X) of X is defined to be the symmetric n by n matrix with entries (v_{ij} : 1 ≤ i, j ≤ n) given by v_{ij} = cov(X_i, X_j). Show that |V(X)| = 0 if and only if the X_i are linearly dependent with probability one, in that P(a_1X_1 + a_2X_2 + ... + a_nX_n = b) = 1 for some a and b. (|V| denotes the determinant of V.)
16. Let X and Y be independent Bernoulli random variables with parameter ½. Show that X + Y and |X − Y| are dependent though uncorrelated.
17. A secretary drops n matching pairs of letters and envelopes down the stairs, and then places the letters into the envelopes in a random order. Use indicators to show that the number X of correctly matched pairs has mean and variance 1 for all n ≥ 2. Show that the mass function of X converges to a Poisson mass function as n → ∞.
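The claims of Problem 17 are easy to test empirically. The following Python sketch (our own illustrative check) samples random permutations, estimates the mean and variance of the number of fixed points, and compares the empirical mass function with Poisson(1).

```python
import random
from math import exp, factorial

def matches(n):
    """Number of fixed points of a uniformly random permutation of n letters."""
    perm = list(range(n))
    random.shuffle(perm)
    return sum(i == perm[i] for i in range(n))

n, trials = 10, 100_000
xs = [matches(n) for _ in range(trials)]
mean = sum(xs) / trials
var = sum((x - mean) ** 2 for x in xs) / trials
print(mean, var)   # both should be close to 1
print([round(sum(x == k for x in xs) / trials, 4) for k in range(5)])
print([round(exp(-1) / factorial(k), 4) for k in range(5)])   # Poisson(1) mass function
```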

18. Let X = (X_1, X_2, ..., X_n) be a vector of independent random variables each having the Bernoulli distribution with parameter p. Let f : {0, 1}^n → ℝ be increasing, which is to say that f(x) ≤ f(y) whenever x_i ≤ y_i for each i.
(a) Let e(p) = E(f(X)). Show that e(p_1) ≤ e(p_2) if p_1 ≤ p_2.
(b) FKG inequality†. Let f and g be increasing functions from {0, 1}^n into ℝ. Show by induction on n that cov(f(X), g(X)) ≥ 0.
19. Let R(p) be the reliability function of a network G with a given source and sink, each edge of which is working with probability p, and let A be the event that there exists a working connection from source to sink. Show that

    R(p) = \sum_{ω} 1_A(ω)\,p^{N(ω)}(1 − p)^{m−N(ω)},

where ω is a typical realization (i.e., outcome) of the network, N(ω) is the number of working edges of ω, and m is the total number of edges of G.

Deduce that R′(p) = cov(1_A, N)/{p(1 − p)}, and hence that

    \frac{R(p)(1 − R(p))}{p(1 − p)} \le R′(p) \le \frac{mR(p)(1 − R(p))}{p(1 − p)}.

20. Let R(p) be the reliability function of a network G, each edge of which is working with probability p.
(a) Show that R(p_1p_2) ≤ R(p_1)R(p_2) if 0 ≤ p_1, p_2 ≤ 1.
(b) Show that R(p^γ) ≤ R(p)^γ for all 0 ≤ p ≤ 1 and γ ≥ 1.

21. DNA fingerprinting. In a certain style of detective fiction, the sleuth is required to declare "the criminal has the unusual characteristics ...; find this person and you have your man". Assume that any given individual has these unusual characteristics with probability 10^{−7} independently of all other individuals, and that the city in question contains 10^7 inhabitants. Calculate the expected number of such people in the city.
(a) Given that the police inspector finds such a person, what is the probability that there is at least one other?
(b) If the inspector finds two such people, what is the probability that there is at least one more?
(c) How many such people need be found before the inspector can be reasonably confident that he has found them all?

†Named after C. Fortuin, P. Kasteleyn, and J. Ginibre (1971), but due in this form to T. E. Harris (1960).


(d) For the given population, how improbable should the characteristics of the criminal be, in order that he (or she) be specified uniquely?
22. In 1710, J. Arbuthnot observed that male births had exceeded female births in London for 82 successive years. Arguing that the two sexes are equally likely, and 2^{−82} is very small, he attributed this run of masculinity to Divine Providence. Let us assume that each birth results in a girl with probability p = 0.485, and that the outcomes of different confinements are independent of each other. Ignoring the possibility of twins (and so on), show that the probability that girls outnumber boys in 2n live births is no greater than \binom{2n}{n}p^nq^n\{q/(q − p)\}, where q = 1 − p. Suppose that 20,000 children are born in each of 82 successive years. Show that the probability that boys outnumber girls every year is at least 0.99. You may need Stirling's formula.

23. Consider a symmetric random walk with an absorbing barrier at N and a reflecting barrier at 0 (so that, when the particle is at 0, it moves to 1 at the next step). Let a_k(j) be the probability that the particle, having started at k, visits 0 exactly j times before being absorbed at N. We make the convention that, if k = 0, then the starting point counts as one visit. Show that

    a_k(j) = \frac{N − k}{N^2}\left(1 − \frac{1}{N}\right)^{j−1},   j ≥ 1, 0 ≤ k ≤ N.

24. Problem of the points (3.9.4). A coin is tossed repeatedly, heads turning up with probability p on each toss. Player A wins the game if heads appears at least m times before tails has appeared n times; otherwise player B wins the game. Find the probability that A wins the game.
25. A coin is tossed repeatedly, heads appearing on each toss with probability p. A gambler starts with initial fortune k (where 0 < k < N); he wins one point for each head and loses one point for each tail. If his fortune is ever 0 he is bankrupted, whilst if it ever reaches N he stops gambling to buy a Jaguar. Suppose that p < ½. Show that the gambler can increase his chance of winning by doubling the stakes. You may assume that k and N are even.
What is the corresponding strategy if p ≥ ½?
26. A compulsive gambler is never satisfied. At each stage he wins £1 with probability p and loses £1 otherwise. Find the probability that he is ultimately bankrupted, having started with an initial fortune of £k.
27. Range of random walk. Let {X_n : n ≥ 1} be independent, identically distributed random variables taking integer values. Let S_0 = 0, S_n = Σ_{i=1}^{n} X_i. The range R_n of S_0, S_1, ..., S_n is the number of distinct values taken by the sequence. Show that P(R_n = R_{n−1} + 1) = P(S_1S_2⋯S_n ≠ 0), and deduce that, as n → ∞,

    n^{−1}E(R_n) → P(S_k ≠ 0 for all k ≥ 1).

Hence show that, for the simple random walk, n^{−1}E(R_n) → |p − q| as n → ∞.
28. Arc sine law for maxima. Consider a symmetric random walk S starting from the origin, and let M_n = max{S_i : 0 ≤ i ≤ n}. Show that, for i = 2k, 2k + 1, the probability that the walk reaches M_{2n} for the first time at time i equals ½P(S_{2k} = 0)P(S_{2n−2k} = 0).
29. Let S be a symmetric random walk with S_0 = 0, and let N_n be the number of points that have been visited by S exactly once up to time n. Show that E(N_n) = 2.
30. Family planning. Consider the following fragment of verse entitled 'Note for the scientist'.

    People who have three daughters try for more,
    And then it's fifty-fifty they'll have four,
    Those with a son or sons will let things be,


    Hence all these surplus women, QED.
(a) What do you think of the argument?


(b) Show that the mean number of children of either sex in a family whose fertile parents have followed this policy equals 1. (You should assume that each delivery yields exactly one child whose sex is equally likely to be male or female.) Discuss.
31. Let β > 1, let p_1, p_2, ... denote the prime numbers, and let N(1), N(2), ... be independent random variables, N(i) having mass function P(N(i) = k) = (1 − γ_i)γ_i^k for k ≥ 0, where γ_i = p_i^{−β} for all i. Show that M = Π_{i=1}^{∞} p_i^{N(i)} is a random integer with mass function P(M = m) = Cm^{−β} for m ≥ 1 (this may be called the Dirichlet distribution), where C is a constant satisfying

    C = \prod_{i=1}^{\infty}\left(1 − \frac{1}{p_i^{β}}\right) = \left(\sum_{m=1}^{\infty}\frac{1}{m^{β}}\right)^{-1}.

32. N + 1 plates are laid out around a circular dining table, and a hot cake is passed between them in the manner of a symmetric random walk: each time it arrives on a plate, it is tossed to one of the two neighbouring plates, each possibility having probability ½. The game stops at the moment when the cake has visited every plate at least once. Show that, with the exception of the plate where the cake began, each plate has probability 1/N of being the last plate visited by the cake.

33. Simplex algorithm. There are \binom{m}{n} points ranked in order of merit with no matches. You seek to reach the best, B. If you are at the jth best, you step to any one of the j − 1 better points, with equal probability of stepping to each. Let r_j be the expected number of steps to reach B from the jth best vertex. Show that r_j = Σ_{k=1}^{j−1} k^{−1}. Give an asymptotic expression for the expected time to reach B from the worst vertex, for large m, n.
34. Dimer problem. There are n unstable molecules in a row, m_1, m_2, ..., m_n. One of the n − 1 pairs of neighbours, chosen at random, combines to form a stable dimer; this process continues until there remain U_n isolated molecules no two of which are adjacent. Show that the probability that m_1 remains isolated is Σ_{r=0}^{n−1}(−1)^r/r! → e^{−1} as n → ∞. Deduce that lim_{n→∞} n^{−1}E(U_n) = e^{−2}.
35. Poisson approximation. Let {X_r : 1 ≤ r ≤ n} be independent Bernoulli random variables with respective parameters {p_r : 1 ≤ r ≤ n} satisfying p_r ≤ c < 1 for all r and some c. Let λ = Σ_{r=1}^{n} p_r and X = Σ_{r=1}^{n} X_r. Show that

36. Sampling. The length of the tail of the rth member of a troop of N chimeras is x_r. A random sample of n chimeras is taken (without replacement) and their tails measured. Let I_r be the indicator of the event that the rth chimera is in the sample. Set

    \bar{Y} = \frac{1}{n}\sum_{r=1}^{N} x_r I_r,   μ = \frac{1}{N}\sum_{r=1}^{N} x_r,   σ^2 = \frac{1}{N}\sum_{r=1}^{N}(x_r − μ)^2.

Show that E(\bar{Y}) = μ, and var(\bar{Y}) = (N − n)σ²/{n(N − 1)}.

37. Berkson's fallacy. Any individual in a group G contracts a certain disease C with probability γ; such individuals are hospitalized with probability c. Independently of this, anyone in G may be in hospital with probability a, for some other reason. Let X be the number in hospital, and Y the


number in hospital who have C (including those with C admitted for any other reason). Show that the correlation between X and Y is

    ρ(X, Y) = \sqrt{\frac{γp}{1 − γp}\cdot\frac{(1 − a)(1 − γc)}{a + γc − aγc}},

where p = a + c − ac. It has been stated erroneously that, when ρ(X, Y) is near unity, this is evidence for a causal relation between being in G and contracting C.
38. A telephone sales company attempts repeatedly to sell new kitchens to each of the N families in a village. Family i agrees to buy a new kitchen after it has been solicited K_i times, where the K_i are independent identically distributed random variables with mass function f(n) = P(K_i = n). The value ∞ is allowed, so that f(∞) ≥ 0. Let X_n be the number of kitchens sold at the nth round of solicitations, so that X_n = Σ_{i=1}^{N} I_{{K_i = n}}. Suppose that N is a random variable with the Poisson distribution with parameter ν.
(a) Show that the X_n are independent random variables, X_r having the Poisson distribution with parameter νf(r).
(b) The company loses heart after the Tth round of calls, where T = inf{n : X_n = 0}. Let S = X_1 + X_2 + ... + X_T be the number of solicitations made up to time T. Show further that E(S) = νE(F(T)) where F(k) = f(1) + f(2) + ... + f(k).

39. A particle performs a random walk on the non-negative integers as follows. When at the point n (> 0) its next position is uniformly distributed on the set {0, 1, 2, ..., n + 1}. When it hits 0 for the first time, it is absorbed. Suppose it starts at the point a.
(a) Find the probability that its position never exceeds a, and prove that, with probability 1, it is absorbed ultimately.
(b) Find the probability that the final step of the walk is from 1 to 0 when a = 1.
(c) Find the expected number of steps taken before absorption when a = 1.
40. Let G be a finite graph with neither loops nor multiple edges, and write d_v for the degree of the vertex v. An independent set is a set of vertices no pair of which is joined by an edge. Let α(G) be the size of the largest independent set of G. Use the probabilistic method to show that α(G) ≥ Σ_v 1/(d_v + 1). [This conclusion is sometimes referred to as Turán's theorem.]


4 Continuous random variables

4.1 Exercises. Probability density functions

1. For what values of the parameters are the following functions probability density functions?
(a) f(x) = C{x(1 − x)}^{−1/2}, 0 < x < 1, the density function of the 'arc sine law'.
(b) f(x) = C exp(−x − e^{−x}), x ∈ ℝ, the density function of the 'extreme-value distribution'.
(c) f(x) = C(1 + x²)^{−m}, x ∈ ℝ.
2. Find the density function of Y = aX, where a > 0, in terms of the density function of X. Show that the continuous random variables X and −X have the same distribution function if and only if f_X(x) = f_X(−x) for all x ∈ ℝ.
3. If f and g are density functions of random variables X and Y, show that af + (1 − a)g is a density function for 0 ≤ a ≤ 1, and describe a random variable of which it is the density function.
4. Survival. Let X be a positive random variable with density function f and distribution function F. Define the hazard function H(x) = −log[1 − F(x)] and the hazard rate

    r(x) = \lim_{h \downarrow 0}\frac{1}{h}P(X ≤ x + h \mid X > x),   x ≥ 0.

Show that:
(a) r(x) = H′(x) = f(x)/{1 − F(x)},
(b) if r(x) increases with x then H(x)/x increases with x,
(c) H(x)/x increases with x if and only if [1 − F(x)]^a ≤ 1 − F(ax) for all 0 ≤ a ≤ 1,
(d) if H(x)/x increases with x, then H(x + y) ≥ H(x) + H(y) for all x, y ≥ 0.

4.2 Exercises. Independence

1. I am selling my house, and have decided to accept the first offer exceeding £K. Assuming that offers are independent random variables with common distribution function F, find the expected number of offers received before I sell the house.
2. Let X and Y be independent random variables with common distribution function F and density function f. Show that V = max{X, Y} has distribution function P(V ≤ x) = F(x)² and density function f_V(x) = 2f(x)F(x), x ∈ ℝ. Find the density function of U = min{X, Y}.
3. The annual rainfall figures in Bandrika are independent identically distributed continuous random variables {X_r : r ≥ 1}. Find the probability that:


(a) X_1 < X_2 < X_3 < X_4, (b) X_1 > X_2 < X_3 < X_4.


4. Let {X_r : r ≥ 1} be independent and identically distributed with distribution function F satisfying F(y) < 1 for all y, and let Y(y) = min{k : X_k > y}. Show that

    \lim_{y \to \infty} P\bigl(Y(y) ≤ E\,Y(y)\bigr) = 1 − e^{−1}.

4.3 Exercises. Expectation

1. For what values of α is E(|X|^α) finite, if the density function of X is: (a) f(x) = e^{−x} for x ≥ 0, (b) f(x) = C(1 + x²)^{−m} for x ∈ ℝ? If α is not integral, then E(|X|^α) is called the fractional moment of order α of X, whenever the expectation is well defined; see Exercise (3.3.5).

2. Let X_1, X_2, ..., X_n be independent identically distributed random variables for which E(X_1^{−1}) exists. Show that, if m ≤ n, then E(S_m/S_n) = m/n, where S_m = X_1 + X_2 + ... + X_m.
3. Let X be a non-negative random variable with density function f. Show that

    E(X^r) = \int_0^{\infty} r x^{r−1} P(X > x)\,dx

for any r ≥ 1 for which the expectation is finite.
4. Show that the mean μ, median m, and variance σ² of the continuous random variable X satisfy (μ − m)² ≤ σ².
5. Let X be a random variable with mean μ and continuous distribution function F. Show that

    \int_{-\infty}^{a} F(x)\,dx = \int_{a}^{\infty} [1 − F(x)]\,dx

if and only if a = μ.

4.4 Exercises. Examples of continuous variables

1. Prove that the gamma function satisfies Γ(t) = (t − 1)Γ(t − 1) for t > 1, and deduce that Γ(n) = (n − 1)! for n = 1, 2, .... Show that Γ(½) = √π and deduce a closed form for Γ(n + ½) for n = 0, 1, 2, ....
2. Show, as claimed in (4.4.8), that the beta function satisfies B(a, b) = Γ(a)Γ(b)/Γ(a + b).
3. Let X have the uniform distribution on [0, 1]. For what function g does Y = g(X) have the exponential distribution with parameter 1?
4. Find the distribution function of a random variable X with the Cauchy distribution. For what values of α does |X| have a finite (possibly fractional) moment of order α?
5. Log-normal distribution. Let Y = e^X where X has the N(0, 1) distribution. Find the density function of Y.


6. Let X be N(μ, σ²). Show that E{(X − μ)g(X)} = σ²E(g′(X)) when both sides exist.
7. With the terminology of Exercise (4.1.4), find the hazard rate when:
(a) X has the Weibull distribution, P(X > x) = exp(−αx^β), x ≥ 0,
(b) X has the exponential distribution with parameter λ,
(c) X has density function af + (1 − a)g, where 0 < a < 1 and f and g are the densities of exponential variables with respective parameters λ and μ.
What happens to this last hazard rate r(x) in the limit as x → ∞?
8. Mills's ratio. For the standard normal density φ(x), show that φ′(x) + xφ(x) = 0. Hence show that

    \frac{1}{x} − \frac{1}{x^3} < \frac{1 − \Phi(x)}{\phi(x)} < \frac{1}{x} − \frac{1}{x^3} + \frac{3}{x^5},   x > 0.

4.5 Exercises. Dependence

1. Let f(x, y), x, y ∈ ℝ. Show that f is a continuous joint density function, but that the (first) marginal density function g(x) = ∫_{−∞}^{∞} f(x, y) dy is not continuous. Let Q = {q_n : n ≥ 1} be a set of real numbers, and define

    f_Q(x, y) = \sum_{n=1}^{\infty}\left(\tfrac{1}{2}\right)^n f(x − q_n, y).

Show that f_Q is a continuous joint density function whose first marginal density function is discontinuous at the points in Q. Can you construct a continuous joint density function whose first marginal density function is continuous nowhere?
2. Buffon's needle revisited. Two grids of parallel lines are superimposed: the first grid contains lines distance a apart, and the second contains lines distance b apart which are perpendicular to those of the first set. A needle of length r (< min{a, b}) is dropped at random. Show that the probability it intersects a line equals r(2a + 2b − r)/(πab).
3. Buffon's cross. The plane is ruled by the lines y = n, for n = 0, ±1, ..., and on to this plane we drop a cross formed by welding together two unit needles perpendicularly at their midpoints. Let Z be the number of intersections of the cross with the grid of parallel lines. Show that E(Z/2) = 2/π and that

    var(Z/2) = \frac{3 − \sqrt{2}}{\pi} − \frac{4}{\pi^2}.

If you had the choice of using either a needle of unit length, or the cross, in estimating 2/π, which would you use?
4. Let X and Y be independent random variables each having the uniform distribution on [0, 1]. Let U = min{X, Y} and V = max{X, Y}. Find E(U), and hence calculate cov(U, V).
5. Let X and Y be independent continuous random variables. Show that

    E(g(X)h(Y)) = E(g(X))E(h(Y)),

whenever these expectations exist. If X and Y have the exponential distribution with parameter 1, find E{exp(½(X + Y))}.


6. Three points A, B, C are chosen independently at random on the circumference of a circle. Let b(x) be the probability that at least one of the angles of the triangle ABC exceeds xπ. Show that

    b(x) = \begin{cases} 1 − (3x − 1)^2 & \text{if } \tfrac{1}{3} ≤ x ≤ \tfrac{1}{2}, \\ 3(1 − x)^2 & \text{if } \tfrac{1}{2} ≤ x ≤ 1. \end{cases}

Hence find the density and expectation of the largest angle in the triangle.
7. Let {X_r : 1 ≤ r ≤ n} be independent and identically distributed with finite variance, and define X̄ = n^{−1}Σ_{r=1}^{n} X_r. Show that cov(X̄, X_r − X̄) = 0.
8. Let X and Y be independent random variables with finite variances, and let U = X + Y and V = XY. Under what condition are U and V uncorrelated?
9. Let X and Y be independent continuous random variables, and let U be independent of X and Y, taking the values ±1 with probability ½. Define S = UX and T = UY. Show that S and T are in general dependent, but S² and T² are independent.

4.6 Exercises. Conditional distributions and conditional expectation

1. A point is picked uniformly at random on the surface of a unit sphere. Writing Θ and Φ for its longitude and latitude, find the conditional density functions of Θ given Φ, and of Φ given Θ.
2. Show that the conditional expectation ψ(X) = E(Y | X) satisfies E(ψ(X)g(X)) = E(Yg(X)), for any function g for which both expectations exist.
3. Construct an example of two random variables X and Y for which E(Y) = ∞ but such that E(Y | X) < ∞ almost surely.
4. Find the conditional density function and expectation of Y given X when they have joint density function:
(a) f(x, y) = λ²e^{−λy} for 0 ≤ x ≤ y < ∞,
(b) f(x, y) = xe^{−x(y+1)} for x, y ≥ 0.
5. Let Y be distributed as bin(n, X), where X is a random variable having a beta distribution on [0, 1] with parameters a and b. Describe the distribution of Y, and find its mean and variance. What is the distribution of Y in the special case when X is uniform?
6. Let {X_r : r ≥ 1} be independent and uniformly distributed on [0, 1]. Let 0 < x < 1 and define

    N = min{n ≥ 1 : X_1 + X_2 + ⋯ + X_n > x}.

Show that P(N > n) = x^n/n!, and hence find the mean and variance of N.
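Exercise 6 is another statement that is pleasant to check by simulation; note in particular that E(N) = Σ_{n≥0} P(N > n) = e^x. The Python sketch below is our own illustrative check.

```python
import random
from math import exp, factorial

def draws_needed(x):
    """Smallest n with U_1 + ... + U_n > x, for independent uniforms on [0, 1]."""
    total, n = 0.0, 0
    while total <= x:
        total += random.random()
        n += 1
    return n

x, trials = 0.7, 100_000
samples = [draws_needed(x) for _ in range(trials)]
print(sum(samples) / trials, exp(x))                                 # E(N) = e^x
n = 3
print(sum(s > n for s in samples) / trials, x ** n / factorial(n))   # P(N > n) = x^n / n!
```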

7. Let X and Y be random variables with correlation ρ. Show that E(var(Y | X)) ≤ (1 − ρ²) var Y.
8. Let X, Y, Z be independent and exponential random variables with respective parameters λ, μ, ν. Find P(X < Y < Z).
9. Let X and Y have the joint density f(x, y) = cx(y − x)e^{−y}, 0 ≤ x ≤ y < ∞.
(a) Find c.
(b) Show that:

    f_{X|Y}(x | y) = 6x(y − x)y^{−3},   0 ≤ x ≤ y,
    f_{Y|X}(y | x) = (y − x)e^{x−y},   0 ≤ x ≤ y < ∞.


(c) Deduce that E(X | Y) = ½Y and E(Y | X) = X + 2.
10. Let {X_r : r ≥ 0} be independent and identically distributed random variables with density function f and distribution function F. Let N = min{n ≥ 1 : X_n > X_0} and M = min{n ≥ 1 : X_0 ≥ X_1 ≥ ⋯ ≥ X_{n−1} < X_n}. Show that X_N has distribution function F + (1 − F) log(1 − F), and find P(M = m).

4.7 Exercises. Functions of random variables

1. Let X, Y, and Z be independent and uniformly distributed on [0, 1]. Find the joint density function of XY and Z², and show that P(XY < Z²) = 5/9.
2. Let X and Y be independent exponential random variables with parameter 1. Find the joint density function of U = X + Y and V = X/(X + Y), and deduce that V is uniformly distributed on [0, 1].
3. Let X be uniformly distributed on [0, ½π]. Find the density function of Y = sin X.
4. Find the density function of Y = sin^{−1} X when:
(a) X is uniformly distributed on [0, 1],
(b) X is uniformly distributed on [−1, 1].
5. Let X and Y have the bivariate normal density function

    f(x, y) = \frac{1}{2\pi\sqrt{1 − \rho^2}}\exp\left\{−\frac{1}{2(1 − \rho^2)}(x^2 − 2\rho xy + y^2)\right\}.

Show that X and Z = (Y − ρX)/√(1 − ρ²) are independent N(0, 1) variables, and deduce that

    P(X > 0, Y > 0) = \frac{1}{4} + \frac{1}{2\pi}\sin^{−1}\rho.

6. Let X and Y have the standard bivariate normal density function of Exercise (5), and define Z = max{X, Y}. Show that E(Z) = √{(1 − ρ)/π}, and E(Z²) = 1.
7. Let X and Y be independent exponential random variables with parameters λ and μ. Show that Z = min{X, Y} is independent of the event {X < Y}. Find:
(a) P(X = Z),
(b) the distributions of U = max{X − Y, 0}, denoted (X − Y)⁺, and V = max{X, Y} − min{X, Y},
(c) P(X ≤ t < X + Y) where t > 0.
8. A point (X, Y) is picked at random uniformly in the unit circle. Find the joint density of R and X, where R² = X² + Y².
9. A point (X, Y, Z) is picked uniformly at random inside the unit ball of ℝ³. Find the joint density of Z and R, where R² = X² + Y² + Z².
10. Let X and Y be independent and exponentially distributed with parameters λ and μ. Find the joint distribution of S = X + Y and R = X/(X + Y). What is the density of R?
11. Find the density of Y = a/(1 + X²), where X has the Cauchy distribution.
12. Let (X, Y) have the bivariate normal density of Exercise (5) with 0 ≤ ρ < 1. Show that

    [1 − \Phi(a)][1 − \Phi(c)] ≤ P(X > a, Y > b) ≤ [1 − \Phi(a)][1 − \Phi(c)] + \frac{\rho\phi(b)[1 − \Phi(d)]}{\phi(a)},


where c = (b − ρa)/√(1 − ρ²), d = (a − ρb)/√(1 − ρ²), and φ and Φ are the density and distribution function of the N(0, 1) distribution.
13. Let X have the Cauchy distribution. Show that Y = X^{−1} has the Cauchy distribution also. Find another non-trivial distribution with this property of invariance.
14. Let X and Y be independent and gamma distributed as Γ(λ, α), Γ(λ, β) respectively. Show that W = X + Y and Z = X/(X + Y) are independent, and that Z has the beta distribution with parameters α, β.

4.8 Exercises. Sums of random variables

1. Let X and Y be independent variables having the exponential distribution with parameters λ and μ respectively. Find the density function of X + Y.
2. Let X and Y be independent variables with the Cauchy distribution. Find the density function of αX + βY where αβ ≠ 0. (Do you know about contour integration?)
3. Find the density function of Z = X + Y when X and Y have joint density function f(x, y) = ½(x + y)e^{−(x+y)}, x, y ≥ 0.
4. Hypoexponential distribution. Let {X_r : r ≥ 1} be independent exponential random variables with respective parameters {λ_r : r ≥ 1} no two of which are equal. Find the density function of S_n = Σ_{r=1}^{n} X_r. [Hint: Use induction.]
5. (a) Let X, Y, Z be independent and uniformly distributed on [0, 1]. Find the density function of X + Y + Z.
(b) If {X_r : r ≥ 1} are independent and uniformly distributed on [0, 1], show that the density of Σ_{r=1}^{n} X_r at any point x ∈ (0, n) is a polynomial in x of degree n − 1.
6. For independent identically distributed random variables X and Y, show that U = X + Y and V = X − Y are uncorrelated but not necessarily independent. Show that U and V are independent if X and Y are N(0, 1).

7. Let X and Y have a bivariate normal density with zero means, variances σ², τ², and correlation ρ. Show that:
(a) E(X | Y) = \frac{ρσ}{τ}Y,
(b) var(X | Y) = σ²(1 − ρ²),
(c) E(X | X + Y = z) = \frac{(σ² + ρστ)z}{σ² + 2ρστ + τ²},
(d) var(X | X + Y = z) = \frac{σ²τ²(1 − ρ²)}{τ² + 2ρστ + σ²}.
8. Let X and Y be independent N(0, 1) random variables, and let Z = X + Y. Find the distribution and density of Z given that X > 0 and Y > 0. Show that

    E(Z | X > 0, Y > 0) = 2\sqrt{2/\pi}.


4.9 Exercises. Multivariate normal distribution

1. A symmetric matrix is called non-negative (respectively positive) definite if its eigenvalues are non-negative (respectively strictly positive). Show that a non-negative definite symmetric matrix V has a square root, in that there exists a symmetric matrix W satisfying W² = V. Show further that W is non-singular if and only if V is positive definite.
2. If X is a random vector with the N(μ, V) distribution where V is non-singular, show that Y = (X − μ)W^{−1} has the N(0, I) distribution, where I is the identity matrix and W is a symmetric matrix satisfying W² = V. The random vector Y is said to have the standard multivariate normal distribution.
3. Let X = (X_1, X_2, ..., X_n) have the N(μ, V) distribution, and show that Y = a_1X_1 + a_2X_2 + ... + a_nX_n has the (univariate) N(μ, σ²) distribution where

    μ = \sum_{i=1}^{n} a_i E(X_i),   σ² = \sum_{i=1}^{n} a_i^2\,\mathrm{var}(X_i) + 2\sum_{i<j} a_i a_j\,\mathrm{cov}(X_i, X_j).

4. Let X and Y have the bivariate normal distribution with zero means, unit variances, and correlation ρ. Find the joint density function of X + Y and X − Y, and their marginal density functions.
5. Let X have the N(0, 1) distribution and let a > 0. Show that the random variable Y given by

    Y = \begin{cases} X & \text{if } |X| < a, \\ −X & \text{if } |X| ≥ a, \end{cases}

has the N(0, 1) distribution, and find an expression for ρ(a) = cov(X, Y) in terms of the density function φ of X. Does the pair (X, Y) have a bivariate normal distribution?
6. Let {Y_r : 1 ≤ r ≤ n} be independent N(0, 1) random variables, and define X_j = Σ_{r=1}^{n} c_{jr}Y_r, 1 ≤ j ≤ n, for constants c_{jr}. Show that

    E(X_j \mid X_k) = \left(\frac{\sum_r c_{jr}c_{kr}}{\sum_r c_{kr}^2}\right)X_k.

What is var(X_j | X_k)?
7. Let the vector (X_r : 1 ≤ r ≤ n) have a multivariate normal distribution with covariance matrix V = (v_{ij}). Show that, conditional on the event Σ_{r=1}^{n} X_r = x, X_1 has the N(a, b) distribution where a = (ρs/t)x, b = s²(1 − ρ²), and s² = v_{11}, t² = Σ_{ij} v_{ij}, ρ = Σ_i v_{i1}/(st).
8. Let X, Y, and Z have a standard trivariate normal distribution centred at the origin, with zero means, unit variances, and correlation coefficients ρ_1, ρ_2, and ρ_3. Show that

    P(X > 0, Y > 0, Z > 0) = \frac{1}{8} + \frac{1}{4\pi}\{\sin^{−1}ρ_1 + \sin^{−1}ρ_2 + \sin^{−1}ρ_3\}.

9. Let X, Y, Z have the standard trivariate normal density of Exercise (8), with ρ_1 = ρ(X, Y). Show that

    E(Z \mid X, Y) = \{(ρ_3 − ρ_1ρ_2)X + (ρ_2 − ρ_1ρ_3)Y\}/(1 − ρ_1^2),
    var(Z \mid X, Y) = \{1 − ρ_1^2 − ρ_2^2 − ρ_3^2 + 2ρ_1ρ_2ρ_3\}/(1 − ρ_1^2).


4.10 Exercises. Distributions arising from the normal distribution

1. Let X_1 and X_2 be independent variables with the χ²(m) and χ²(n) distributions respectively. Show that X_1 + X_2 has the χ²(m + n) distribution.
2. Show that the mean of the t(r) distribution is 0, and that the mean of the F(r, s) distribution is s/(s − 2) if s > 2. What happens if s ≤ 2?
3. Show that the t(1) distribution and the Cauchy distribution are the same.
4. Let X and Y be independent variables having the exponential distribution with parameter 1. Show that X/Y has an F distribution. Which?
5. Use the result of Exercise (4.5.7) to show the independence of the sample mean and sample variance of an independent sample from the N(μ, σ²) distribution.
6. Let {X_r : 1 ≤ r ≤ n} be independent N(0, 1) variables. Let Ψ ∈ [0, π] be the angle between the vector (X_1, X_2, ..., X_n) and some fixed vector in ℝ^n. Show that Ψ has density f(ψ) = (sin ψ)^{n−2}/B(½, ½n − ½), 0 ≤ ψ < π, where B is the beta function.

4.11 Exercises. Sampling from a distribution

1. Uniform distribution. If U is uniformly distributed on [0, 1], what is the distribution of X = ⌊nU⌋ + 1?
2. Random permutation. Given the first n integers in any sequence S_0, proceed thus:
(a) pick any position P_0 from {1, 2, ..., n} at random, and swap the integer in that place of S_0 with the integer in the nth place of S_0, yielding S_1,
(b) pick any position P_1 from {1, 2, ..., n − 1} at random, and swap the integer in that place of S_1 with the integer in the (n − 1)th place of S_1, yielding S_2,
(c) at the (r − 1)th stage the integer in position P_{r−1}, chosen randomly from {1, 2, ..., n − r + 1}, is swapped with the integer at the (n − r + 1)th place of the sequence S_{r−1}.
Show that S_{n−1} is equally likely to be any of the n! permutations of {1, 2, ..., n}.
3. Gamma distribution. Use the rejection method to sample from the gamma density Γ(λ, t) where t (≥ 1) may not be assumed integral. [Hint: You might want to start with an exponential random variable with parameter 1/t.]
4. Beta distribution. Show how to sample from the beta density β(α, β) where α, β ≥ 1. [Hint: Use Exercise (3).]
5. Describe three distinct methods of sampling from the density f(x) = 6x(1 − x), 0 ≤ x ≤ 1.
6. Aliasing method. A finite real vector is called a probability vector if it has non-negative entries with sum 1. Show that a probability vector p of length n may be written in the form

    p = \frac{1}{n − 1}\sum_{r=1}^{n−1} v_r,

where each v_r is a probability vector with at most two non-zero entries. Describe a method, based on this observation, for sampling from p viewed as a probability mass function.
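One standard way to turn a decomposition of this kind into a sampler is Walker's alias method, which represents p by n two-point columns picked uniformly (a close cousin of the n − 1 vectors of Exercise 6, not the exercise's own construction). The Python sketch below, with names of our own choosing, builds the tables and samples from them.

```python
import random

def build_alias(p):
    """Walker/Vose alias tables for a probability vector p."""
    n = len(p)
    scaled = [n * q for q in p]
    small = [i for i, q in enumerate(scaled) if q < 1]
    large = [i for i, q in enumerate(scaled) if q >= 1]
    prob, alias = [0.0] * n, [0] * n
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l          # column s: keep s w.p. prob[s], else alias[s]
        scaled[l] -= 1 - scaled[s]
        (small if scaled[l] < 1 else large).append(l)
    for i in large + small:                       # leftovers have weight (numerically) 1
        prob[i] = 1.0
    return prob, alias

def sample_alias(prob, alias):
    i = random.randrange(len(prob))               # pick a column uniformly
    return i if random.random() < prob[i] else alias[i]

prob, alias = build_alias([0.1, 0.2, 0.3, 0.4])
counts = [0] * 4
for _ in range(100_000):
    counts[sample_alias(prob, alias)] += 1
print([c / 100_000 for c in counts])              # should approximate [0.1, 0.2, 0.3, 0.4]
```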

7. Box–Muller normals. Let U_1 and U_2 be independent and uniformly distributed on [0, 1], and let T_i = 2U_i − 1. Show that, conditional on the event that R = √(T_1² + T_2²) ≤ 1,

    X = \frac{T_1}{R}\sqrt{−2\log R^2},   Y = \frac{T_2}{R}\sqrt{−2\log R^2},


are independent standard normal random variables.
8. Let U be uniform on [0, 1] and 0 < q < 1. Show that X = 1 + ⌊log U/log q⌋ has a geometric distribution.
9. A point (X, Y) is picked uniformly at random in the semicircle x² + y² ≤ 1, x ≥ 0. What is the distribution of Z = Y/X?
10. Hazard-rate technique. Let X be a non-negative integer-valued random variable with h(r) = P(X = r | X ≥ r). If {U_i : i ≥ 0} are independent and uniform on [0, 1], show that Z = min{n : U_n ≤ h(n)} has the same distribution as X.
11. Antithetic variables. Let g(x_1, x_2, ..., x_n) be an increasing function in all its variables, and let {U_r : r ≥ 1} be independent and identically distributed random variables having the uniform distribution on [0, 1]. Show that

    cov\{g(U_1, U_2, ..., U_n),\ g(1 − U_1, 1 − U_2, ..., 1 − U_n)\} ≤ 0.

[Hint: Use the FKG inequality of Problem (3.11.18).] Explain how this can help in the efficient estimation of I = ∫_0^1 g(x) dx.
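The point of the negative covariance is variance reduction: averaging g(U) with g(1 − U) costs nothing extra but, for increasing g, cannot increase the variance. A minimal Python sketch of this idea (our own illustration, with an arbitrarily chosen g) is:

```python
import random
from statistics import fmean, pvariance

def compare(g, n=50_000):
    """Estimate I = integral of g over [0, 1] by plain Monte Carlo and by antithetic pairs."""
    us = [random.random() for _ in range(n)]
    crude = [g(u) for u in us]
    anti = [0.5 * (g(u) + g(1 - u)) for u in us]   # antithetic estimator
    return (fmean(crude), pvariance(crude) / n), (fmean(anti), pvariance(anti) / n)

g = lambda x: x ** 3        # any increasing g; the exact integral here is 1/4
print(compare(g))           # the antithetic estimator should show the smaller variance
```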

12. Importance sampling. We wish to estimate I = ∫ g(x)f_X(x) dx = E(g(X)), where either it is difficult to sample from the density f_X, or g(X) has a very large variance. Let f_Y be equivalent to f_X, which is to say that, for all x, f_X(x) = 0 if and only if f_Y(x) = 0. Let {Y_i : 1 ≤ i ≤ n} be independent random variables with density function f_Y, and define

    J = \frac{1}{n}\sum_{i=1}^{n}\frac{g(Y_i)f_X(Y_i)}{f_Y(Y_i)}.

Show that:
(a) E(J) = I = E\left[\dfrac{g(Y)f_X(Y)}{f_Y(Y)}\right],
(b) var(J) = \dfrac{1}{n}\left[E\left(\dfrac{g(Y)^2 f_X(Y)^2}{f_Y(Y)^2}\right) − I^2\right],
(c) J → I as n → ∞. (See Chapter 7 for an account of convergence.)
The idea here is that f_Y should be easy to sample from, and chosen if possible so that var J is much smaller than n^{−1}[E(g(X)²) − I²]. The function f_Y is called the importance density.

13. Construct two distinct methods of sampling from the arc sine density

    f(x) = \frac{2}{\pi\sqrt{1 − x^2}},   0 ≤ x ≤ 1.
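Assuming the density takes the form displayed above, its distribution function is F(x) = (2/π) sin^{−1} x, so inversion gives one method, and projecting a uniform angle gives another. A small Python sketch of our own:

```python
import random, math

def arcsine_inverse():
    """Inverse transform: if F(x) = (2/pi) * arcsin(x) on [0, 1], then X = sin(pi*U/2)."""
    return math.sin(0.5 * math.pi * random.random())

def arcsine_projection():
    """|cos(pi*U)| has the same distribution on [0, 1] (a uniform angle projected onto an axis)."""
    return abs(math.cos(math.pi * random.random()))

samples = [arcsine_inverse() for _ in range(100_000)]
print(sum(samples) / len(samples), 2 / math.pi)   # E(X) = 2/pi under the assumed density
```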

4.12 Exercises. Coupling and Poisson approximation

1. Show that X is stochastically larger than Y if and only if E(u(X)) ≥ E(u(Y)) for any non-decreasing function u for which the expectations exist.
2. Let X and Y be Poisson distributed with respective parameters λ and μ. Show that X is stochastically larger than Y if λ ≥ μ.
3. Show that the total variation distance between two discrete variables X, Y satisfies

    d_{TV}(X, Y) = 2\sup_{A \subseteq \mathbb{R}} |P(X ∈ A) − P(Y ∈ A)|.


4. Maximal coupling. Show for discrete random variables X, Y that P(X = Y) ≤ 1 − ½d_{TV}(X, Y), where d_{TV} denotes total variation distance.
5. Maximal coupling continued. Show that equality is possible in the inequality of Exercise (4.12.4) in the following sense. For any pair X, Y of discrete random variables, there exists a pair X′, Y′ having the same marginal distributions as X, Y such that P(X′ = Y′) = 1 − ½d_{TV}(X, Y).
6. Let X and Y be indicator variables with E(X) = p, E(Y) = q. What is the maximum possible value of P(X = Y), as a function of p, q? Explain how X, Y need to be distributed in order that P(X = Y) be: (a) maximized, (b) minimized.

4.13 Exercises. Geometrical probability

With apologies to those who prefer their exercises better posed ...
1. Pick two points A and B independently at random on the circumference of a circle C with centre O and unit radius. Let Π be the length of the perpendicular from O to the line AB, and let Θ be the angle AB makes with the horizontal. Show that (Π, Θ) has joint density

    f(p, θ) = \frac{1}{\pi^2\sqrt{1 − p^2}},   0 ≤ p ≤ 1, 0 ≤ θ < 2π.

2. Let S_1 and S_2 be disjoint convex shapes with boundaries of length b(S_1), b(S_2), as illustrated in the figure beneath. Let b(H) be the length of the boundary of the convex hull of S_1 and S_2, incorporating their exterior tangents, and b(X) the length of the crossing curve using the interior tangents to loop round S_1 and S_2. Show that the probability that a random line crossing S_1 also crosses S_2 is {b(X) − b(H)}/b(S_1). (See Example (4.13.2) for an explanation of the term 'random line'.) How is this altered if S_1 and S_2 are not disjoint?

[Figure: the circles are the shapes S_1 and S_2; the shaded regions are denoted A and B, and b(X) is the sum of the perimeter lengths of A and B.]

3. Let S_1 and S_2 be convex figures such that S_2 ⊆ S_1. Show that the probability that two independent random lines λ_1 and λ_2, crossing S_1, meet within S_2 is 2π|S_2|/b(S_1)², where |S_2| is the area of S_2 and b(S_1) is the length of the boundary of S_1. (See Example (4.13.2) for an explanation of the term 'random line'.)
4. Let Z be the distance between two points picked independently at random in a disk of radius a. Show that E(Z) = 128a/(45π), and E(Z²) = a².

5. Pick two points A and B independently at random in a ball with centre O. Show that the probability that the angle AOB is obtuse is ½. Compare this with the corresponding result for two points picked at random in a circle.
6. A triangle is formed by A, B, and a point P picked at random in a set S with centre of gravity G. Show that E|ABP| = |ABG|.


7. A point D is fixed on the side BC of the triangle ABC. Two points P and Q are picked independently at random in ABD and ADC respectively. Show that E|APQ| = |AG_1G_2| = (2/9)|ABC|, where G_1 and G_2 are the centres of gravity of ABD and ADC.

8. From the set of all triangles that are similar to the triangle ABC, similarly oriented, and inside ABC, one is selected uniformly at random. Show that its mean area is (1/10)|ABC|.

9. Two points X and Y are picked independently at random in the interval (0, a). By varying a, show that F(z, a) = P(|X − Y| ≤ z) satisfies

    \frac{\partial F}{\partial a} + \frac{2}{a}F = \frac{2z}{a^2},   0 ≤ z ≤ a,

and hence find F(z, a). Let r ≥ 1, and show that m_r(a) = E(|X − Y|^r) satisfies

Hence find m_r(a).
10. Lines are laid down independently at random on the plane, dividing it into polygons. Show that the average number of sides of this set of polygons is 4. [Hint: Consider n random great circles of a sphere of radius R; then let R and n increase.]
11. A point P is picked at random in the triangle ABC. The lines AP, BP, CP, produced, meet BC, AC, AB respectively at L, M, N. Show that E|LMN| = (10 − π²)|ABC|.
12. Sylvester's problem. If four points are picked independently at random inside the triangle ABC, show that the probability that no one of them lies inside the triangle formed by the other three is ⅔.
13. If three points P, Q, R are picked independently at random in a disk of radius a, show that E|PQR| = 35a²/(48π). [You may find it useful that ∫_0^π ∫_0^π sin³x sin³y sin|x − y| dx dy = 35π/128.]
14. Two points A and B are picked independently at random inside a disk C. Show that the probability that the circle having centre A and radius |AB| lies inside C is ⅙.
15. Two points A and B are picked independently at random inside a ball S. Show that the probability that the sphere having centre A and radius |AB| lies inside S is 1/20.

4.14 Problems

1. (a) Show that ∫_{−∞}^{∞} e^{−x²} dx = √π, and deduce that

    f(x) = \frac{1}{σ\sqrt{2π}}\exp\left\{−\frac{(x − μ)^2}{2σ^2}\right\},   −∞ < x < ∞,

is a density function if σ > 0.
(b) Calculate the mean and variance of a standard normal variable.
(c) Show that the N(0, 1) distribution function Φ satisfies

    (x^{−1} − x^{−3})φ(x) < 1 − Φ(x) < x^{−1}φ(x),   x > 0.

These bounds are of interest because Φ has no closed form.
(d) Let X be N(0, 1), and a > 0. Show that P(X > x + a/x | X > x) → e^{−a} as x → ∞.


2. Let X be continuous with density function f(x) = C(x − x²), where α < x < β and C > 0.
(a) What are the possible values of α and β?
(b) What is C?
3. Let X be a random variable which takes non-negative values only. Show that

    \sum_{i=1}^{\infty}(i − 1)I_{A_i} ≤ X < \sum_{i=1}^{\infty} i\,I_{A_i},

where A_i = {i − 1 ≤ X < i}. Deduce that

    \sum_{i=1}^{\infty} P(X ≥ i) ≤ E(X) < 1 + \sum_{i=1}^{\infty} P(X ≥ i).

4. (a) Let X have a continuous distribution function F. Show that
(i) F(X) is uniformly distributed on [0, 1],
(ii) −log F(X) is exponentially distributed.
(b) A straight line l touches a circle with unit diameter at the point P which is diametrically opposed on the circle to another point Q. A straight line QR joins Q to some point R on l. If the angle PQR between the lines PQ and QR is a random variable with the uniform distribution on [−½π, ½π], show that the length of PR has the Cauchy distribution (this length is measured positive or negative depending upon which side of P the point R lies).
5. Let X have an exponential distribution. Show that P(X > s + x | X > s) = P(X > x), for x, s ≥ 0. This is the 'lack of memory' property again. Show that the exponential distribution is the only continuous distribution with this property. You may need to use the fact that the only non-negative monotonic solutions of the functional equation g(s + t) = g(s)g(t) for s, t ≥ 0, with g(0) = 1, are of the form g(s) = e^{μs}. Can you prove this?
6. Show that X and Y are independent continuous variables if and only if their joint density function f factorizes as the product f(x, y) = g(x)h(y) of functions of the single variables x and y alone.
7. Let X and Y have joint density function f(x, y) = 2e^{−x−y}, 0 < x < y < ∞. Are they independent? Find their marginal density functions and their covariance.
8. Bertrand's paradox extended. A chord of the unit circle is picked at random. What is the probability that an equilateral triangle with the chord as base can fit inside the circle if:
(a) the chord passes through a point P picked uniformly in the disk, and the angle it makes with a fixed direction is uniformly distributed on [0, 2π),
(b) the chord passes through a point P picked uniformly at random on a randomly chosen radius, and the angle it makes with the radius is uniformly distributed on [0, 2π)?
9. Monte Carlo. It is required to estimate J = ∫_0^1 g(x) dx where 0 ≤ g(x) ≤ 1 for all x, as in Example (2.6.3). Let X and Y be independent random variables with common density function f(x) = 1 if 0 < x < 1, f(x) = 0 otherwise. Let U = I_{{Y ≤ g(X)}}, the indicator function of the event that Y ≤ g(X), and let V = g(X), W = ½{g(X) + g(1 − X)}. Show that E(U) = E(V) = E(W) = J, and that var(W) ≤ var(V) ≤ var(U), so that, of the three, W is the most 'efficient' estimator of J.
10. Let X_1, X_2, ..., X_n be independent exponential variables, parameter λ. Show by induction that S = X_1 + X_2 + ... + X_n has the Γ(λ, n) distribution.
11. Let X and Y be independent variables, Γ(λ, m) and Γ(λ, n) respectively.
(a) Use the result of Problem (4.14.10) to show that X + Y is Γ(λ, m + n) when m and n are integral (the same conclusion is actually valid for non-integral m and n).


(b) Find the joint density function of X + Y and X/(X + Y), and deduce that they are independent.
(c) If Z is Poisson with parameter λt, and m is integral, show that P(Z < m) = P(X > t).
(d) If 0 < m < n and B is independent of Y with the beta distribution with parameters m and n − m, show that YB has the same distribution as X.
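The three estimators in Problem (4.14.9) can be compared empirically. The sketch below is an illustration only: the test function g(x) = x² and the function name estimate are ours, not the book's, and the exact variances depend on the chosen g.

```python
import random

def estimate(g, n=100_000):
    """Compare the estimators U, V, W of J = integral of g over [0, 1]."""
    u_vals, v_vals, w_vals = [], [], []
    for _ in range(n):
        x, y = random.random(), random.random()
        u_vals.append(1.0 if y <= g(x) else 0.0)   # U: hit-or-miss indicator
        v_vals.append(g(x))                         # V: crude Monte Carlo
        w_vals.append(0.5 * (g(x) + g(1 - x)))      # W: antithetic average
    def mean_var(a):
        m = sum(a) / len(a)
        return m, sum((t - m) ** 2 for t in a) / (len(a) - 1)
    return {name: mean_var(a) for name, a in (("U", u_vals), ("V", v_vals), ("W", w_vals))}

if __name__ == "__main__":
    for name, (m, v) in estimate(lambda x: x * x).items():
        print(f"{name}: mean ~ {m:.4f}, variance ~ {v:.4f}")
```

With g(x) = x² the printed sample variances should come out in the order var(W) ≤ var(V) ≤ var(U), in line with the claim of the problem.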

12. Let X₁, X₂, ..., Xₙ be independent N(0, 1) variables.
(a) Show that X₁² is χ²(1).
(b) Show that X₁² + X₂² is χ²(2) by expressing its distribution function as an integral and changing to polar coordinates.
(c) More generally, show that X₁² + X₂² + ... + Xₙ² is χ²(n).

13. Let X and Y have the bivariate normal distribution with means μ₁, μ₂, variances σ₁², σ₂², and correlation ρ. Show that
(a) E(X | Y) = μ₁ + ρσ₁(Y − μ₂)/σ₂,
(b) the variance of the conditional density function f_{X|Y} is var(X | Y) = σ₁²(1 − ρ²).

14. Let X and Y have joint density function f. Find the density function of Y / X .

15. Let X and Y be independent variables with common density function f. Show that tan⁻¹(Y/X) has the uniform distribution on (−½π, ½π) if and only if

∫_{−∞}^{∞} f(x) f(xy) |x| dx = 1/{π(1 + y²)},   y ∈ ℝ.

Verify that this is valid if either f is the N(0, 1) density function or f(x) = a(1 + x⁴)⁻¹ for some constant a.

16. Let X and Y be independent N(0, 1) variables, and think of (X, Y) as a random point in the plane. Change to polar coordinates (R, Θ) given by R² = X² + Y², tan Θ = Y/X; show that R² is χ²(2), tan Θ has the Cauchy distribution, and R and Θ are independent. Find the density of R. Find E(X²/R²) and

E( min{|X|, |Y|} / max{|X|, |Y|} ).

17. If X and Y are independent random variables, show that U = min{X, Y} and V = max{X, Y} have distribution functions

F_U(u) = 1 − {1 − F_X(u)}{1 − F_Y(u)},   F_V(v) = F_X(v)F_Y(v).

Let X and Y be independent exponential variables, parameter 1. Show that
(a) U is exponential, parameter 2,
(b) V has the same distribution as X + ½Y. Hence find the mean and variance of V.

18. Let X and Y be independent variables having the exponential distribution with parameters λ and μ respectively. Let U = min{X, Y}, V = max{X, Y}, and W = V − U.
(a) Find P(U = X) = P(X ≤ Y).
(b) Show that U and W are independent.

19. Let X and Y be independent non-negative random variables with continuous density functions on (0, (0). (a) If, given X + Y = u , X is uniformly distributed on [0, u] whatever the value of u , show that X

and Y have the exponential distribution.


(b) If, given that X + Y = u, X/u has a given beta distribution (parameters α and β, say) whatever the value of u, show that X and Y have gamma distributions.

You may need the fact that the only non-negative continuous solutions of the functional equation g(s + t) = g(s)g(t) for s, t ≥ 0, with g(0) = 1, are of the form g(s) = e^{μs}. Remember Problem (4.14.5).

20. Show that it cannot be the case that U = X + Y where U is uniformly distributed on [0, 1] and X and Y are independent and identically distributed. You should not assume that X and Y are continuous variables.

21. Order statistics. Let X₁, X₂, ..., Xₙ be independent identically distributed variables with a common density function f. Such a collection is called a random sample. For each ω ∈ Ω, arrange the sample values X₁(ω), ..., Xₙ(ω) in non-decreasing order X₍₁₎(ω) ≤ X₍₂₎(ω) ≤ ... ≤ X₍ₙ₎(ω), where (1), (2), ..., (n) is a (random) permutation of 1, 2, ..., n. The new variables X₍₁₎, X₍₂₎, ..., X₍ₙ₎ are called the order statistics. Show, by a symmetry argument, that the joint distribution function of the order statistics satisfies

P(X₍₁₎ ≤ y₁, ..., X₍ₙ₎ ≤ yₙ) = n! P(X₁ ≤ y₁, ..., Xₙ ≤ yₙ, X₁ < X₂ < ... < Xₙ)
  = ∫_{x₁ ≤ y₁} ⋯ ∫_{xₙ ≤ yₙ} L(x₁, ..., xₙ) n! f(x₁) ⋯ f(xₙ) dx₁ ⋯ dxₙ,

where L is given by

L(x) = 1 if x₁ < x₂ < ... < xₙ, and L(x) = 0 otherwise, with x = (x₁, x₂, ..., xₙ).

Deduce that the joint density function of X₍₁₎, ..., X₍ₙ₎ is g(y) = n! L(y) f(y₁) ⋯ f(yₙ).

22. Find the marginal density function of the kth order statistic X₍ₖ₎ of a sample of size n:
(a) by integrating the result of Problem (4.14.21),
(b) directly.

23. Find the joint density function of the order statistics of n independent uniform variables on [0, T].

24. Let X₁, X₂, ..., Xₙ be independent and uniformly distributed on [0, 1], with order statistics X₍₁₎, X₍₂₎, ..., X₍ₙ₎.
(a) Show that, for fixed k, the density function of nX₍ₖ₎ converges as n → ∞, and find and identify the limit function.
(b) Show that log X₍ₖ₎ has the same distribution as −Σᵢ₌ₖⁿ i⁻¹Yᵢ, where the Yᵢ are independent random variables having the exponential distribution with parameter 1.
(c) Show that Z₁, Z₂, ..., Zₙ, defined by Zₖ = (X₍ₖ₎/X₍ₖ₊₁₎)ᵏ for k < n and Zₙ = (X₍ₙ₎)ⁿ, are independent random variables with the uniform distribution on [0, 1].

25. Let X₁, X₂, X₃ be independent variables with the uniform distribution on [0, 1]. What is the probability that rods of lengths X₁, X₂, and X₃ may be used to make a triangle? Generalize your answer to n rods used to form a polygon.

26. Let X₁ and X₂ be independent variables with the uniform distribution on [0, 1]. A stick of unit length is broken at points distance X₁ and X₂ from one of the ends. What is the probability that the three pieces may be used to make a triangle? Generalize your answer to a stick broken in n places.

27. Let X, Y be a pair of jointly continuous variables.
(a) Hölder's inequality. Show that if p, q > 1 and p⁻¹ + q⁻¹ = 1 then

E|XY| ≤ {E|Xᵖ|}^{1/p} {E|Y^q|}^{1/q}.


Set p = q = 2 to deduce the Cauchy–Schwarz inequality E(XY)² ≤ E(X²)E(Y²).
(b) Minkowski's inequality. Show that, if p ≥ 1, then

{E(|X + Y|ᵖ)}^{1/p} ≤ {E|Xᵖ|}^{1/p} + {E|Yᵖ|}^{1/p}.

Note that in both cases your proof need not depend on the continuity of X and Y; deduce that the same inequalities hold for discrete variables.

28. Let Z be a random variable. Choose X and Y appropriately in the Cauchy–Schwarz (or Hölder) inequality to show that g(p) = log E|Zᵖ| is a convex function of p on the interval of values of p such that E|Zᵖ| < ∞. Deduce Lyapunov's inequality:

{E|Zʳ|}^{1/r} ≥ {E|Zˢ|}^{1/s}   whenever r ≥ s > 0.

You have shown in particular that, if Z has finite rth moment, then Z has finite sth moment for all positive s ≤ r.

29. Show that, using the obvious notation, E{E(X | Y, Z) | Y} = E(X | Y).

30. Motor cars of unit length park randomly in a street in such a way that the centre of each car, in turn, is positioned uniformly at random in the space available to it. Let m(x) be the expected number of cars which are able to park in a street of length x. Show that

m(x + 1) = (1/x) ∫₀ˣ {m(y) + m(x − y) + 1} dy.

It is possible to deduce that m(x) is about as big as ¾x when x is large.

31. Buffon's needle revisited: Buffon's noodle.

(a) A plane is ruled by the lines y = nd (n = 0, ±1, ...). A needle of length L (< d) is cast randomly onto the plane. Show that the probability that the needle intersects a line is 2L/(πd).

(b) Now fix the needle and let C be a circle of diameter d centred at the midpoint of the needle. Let Λ be a line whose direction and distance from the centre of C are independent and uniformly distributed on [0, 2π] and [0, ½d] respectively. This is equivalent to 'casting the ruled plane at random'. Show that the probability of an intersection between the needle and Λ is 2L/(πd).

(c) Let S be a curve within C having finite length L(S). Use indicators to show that the expected number of intersections between S and Λ is 2L(S)/(πd).

This type of result is used in stereology, which seeks knowledge of the contents of a cell by studying its cross sections.

32. Buffon's needle ingested. In the excitement of calculating π, Mr Buffon (no relation) inadvertently swallows the needle and is X-rayed. If the needle exhibits no preference for direction in the gut, what is the distribution of the length of its image on the X-ray plate? If he swallowed Buffon's cross (see Exercise (4.5.3)) also, what would be the joint distribution of the lengths of the images of the two arms of the cross?

33. Let X₁, X₂, ..., Xₙ be independent exponential variables with parameter λ, and let X₍₁₎ ≤ X₍₂₎ ≤ ... ≤ X₍ₙ₎ be their order statistics. Show that

Y₁ = nX₍₁₎,   Yᵣ = (n + 1 − r)(X₍ᵣ₎ − X₍ᵣ₋₁₎)  for 1 < r ≤ n,

are also independent and have the same joint distribution as the Xᵢ.

34. Let X₍₁₎, X₍₂₎, ..., X₍ₙ₎ be the order statistics of a family of independent variables with common continuous distribution function F. Show that

1 ≤ r < n,


are independent and uniformly distributed on [0, 1]. This is equivalent to Problem (4.14.33). Why?

35. Secretary/marriage problem. You are permitted to inspect the n prizes at a fête in a given order, at each stage either rejecting or accepting the prize under consideration. There is no recall, in the sense that no rejected prize may be accepted later. It may be assumed that, given complete information, the prizes may be ranked in a strict order of preference, and that the order of presentation is independent of this ranking. Find the strategy which maximizes the probability of accepting the best prize, and describe its behaviour when n is large.
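For Problem (4.14.35), the classical optimal strategy is of threshold type: reject roughly the first n/e prizes and then accept the first prize better than every one seen so far, succeeding with probability tending to 1/e. A small simulation sketch (function names and parameters are illustrative, not from the text) for checking the success probability of such a rule:

```python
import math
import random

def best_prize_prob(n, k, trials=20_000):
    """Estimate P(accept the best prize) when the first k prizes are rejected
    and the next prize beating all earlier ones is accepted."""
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))               # 0 denotes the best prize
        random.shuffle(ranks)
        threshold = min(ranks[:k]) if k > 0 else n
        chosen = None
        for r in ranks[k:]:
            if r < threshold:                # first prize beating all rejected ones
                chosen = r
                break
        wins += (chosen == 0)
    return wins / trials

if __name__ == "__main__":
    n = 100
    k = round(n / math.e)
    print(f"n={n}, k={k}: success ~ {best_prize_prob(n, k):.3f}, 1/e ~ {1/math.e:.3f}")
```

Note that a candidate beating the best of the rejected block necessarily beats everything seen so far, since any earlier such candidate would already have stopped the scan.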

36. Fisher's spherical distribution. Let R² = X² + Y² + Z² where X, Y, Z are independent normal random variables with means λ, μ, ν, and common variance σ², where (λ, μ, ν) ≠ (0, 0, 0). Show that the conditional density of the point (X, Y, Z) given R = r, when expressed in spherical polar coordinates relative to an axis in the direction e = (λ, μ, ν), is of the form

f(θ, φ) = {a/(4π sinh a)} e^{a cos θ} sin θ,   0 ≤ θ < π, 0 ≤ φ < 2π,

where a = r|e|.

37. Let φ be the N(0, 1) density function, and define the functions Hₙ, n ≥ 0, by H₀ = 1 and (−1)ⁿHₙφ = φ⁽ⁿ⁾, the nth derivative of φ. Show that:
(a) Hₙ(x) is a polynomial of degree n having leading term xⁿ, and

∫_{−∞}^{∞} Hₘ(x)Hₙ(x)φ(x) dx = 0 if m ≠ n, and n! if m = n.

38. Lancaster's theorem. Let X and Y have a standard bivariate normal distribution with zero means, unit variances, and correlation coefficient ρ, and suppose U = u(X) and V = v(Y) have finite variances. Show that |ρ(U, V)| ≤ |ρ|. [Hint: Use Problem (4.14.37) to expand the functions u and v. You may assume that u and v lie in the linear span of the Hₙ.]

39. Let X₍₁₎, X₍₂₎, ..., X₍ₙ₎ be the order statistics of n independent random variables, uniform on [0, 1]. Show that:

(a) E(X₍ᵣ₎) = r/(n + 1),
(b) cov(X₍ᵣ₎, X₍ₛ₎) = r(n − s + 1)/{(n + 1)²(n + 2)}   for r ≤ s.

40. (a) Let X, Y, Z be independent N(0, 1) variables, and set R = √(X² + Y² + Z²). Show that X²/R² has a beta distribution with parameters ½ and 1, and is independent of R².
(b) Let X, Y, Z be independent and uniform on [−1, 1] and set R = √(X² + Y² + Z²). Find the density of X²/R² given that R² ≤ 1.

41. Let φ and Φ be the standard normal density and distribution functions. Show that:
(a) Φ(x) = 1 − Φ(−x),
(b) f(x) = 2φ(x)Φ(λx), −∞ < x < ∞, is the density function of some random variable (denoted by Y), and that |Y| has density function 2φ,
(c) Let X be a standard normal random variable independent of Y, and define Z = (X + λ|Y|)/√(1 + λ²). Write down the joint density of Z and |Y|, and deduce that Z has density function f.

42. The six coordinates (Xᵢ, Yᵢ), 1 ≤ i ≤ 3, of three points A, B, C in the plane are independent N(0, 1). Show that the probability that C lies inside the circle with diameter AB is ¼.


43. The coordinates (Xᵢ, Yᵢ, Zᵢ), 1 ≤ i ≤ 3, of three points A, B, C are independent N(0, 1). Show that the probability that C lies inside the sphere with diameter AB is ⅓ − √3/(4π).

44. Skewness. Let X have variance σ² and write mₖ = E(Xᵏ). Define the skewness of X by skw(X) = E[(X − m₁)³]/σ³. Show that:
(a) skw(X) = (m₃ − 3m₁m₂ + 2m₁³)/σ³,
(b) skw(Sₙ) = skw(X₁)/√n, where Sₙ = Σᵣ₌₁ⁿ Xᵣ is a sum of independent identically distributed random variables,
(c) skw(X) = (1 − 2p)/√(npq), when X is bin(n, p) where p + q = 1,
(d) skw(X) = 1/√λ, when X is Poisson with parameter λ,
(e) skw(X) = 2/√t, when X is gamma Γ(λ, t), and t is integral.

45. Kurtosis. Let X have variance σ² and E(Xᵏ) = mₖ. Define the kurtosis of X by kur(X) = E[(X − m₁)⁴]/σ⁴. Show that:
(a) kur(X) = 3, when X is N(μ, σ²),
(b) kur(X) = 9, when X is exponential with parameter λ,
(c) kur(X) = 3 + λ⁻¹, when X is Poisson with parameter λ,
(d) kur(Sₙ) = 3 + {kur(X₁) − 3}/n, where Sₙ = Σᵣ₌₁ⁿ Xᵣ is a sum of independent identically distributed random variables.

46. Extreme value. Fisher–Gumbel–Tippett distribution. Let Xᵣ, 1 ≤ r ≤ n, be independent and exponentially distributed with parameter 1. Show that X₍ₙ₎ = max{Xᵣ : 1 ≤ r ≤ n} satisfies

lim_{n→∞} P(X₍ₙ₎ − log n ≤ x) = exp(−e⁻ˣ).

Hence show that ∫₀^∞ {1 − exp(−e⁻ˣ)} dx = γ where γ is Euler's constant.

47. Squeezing. Let S and X have density functions satisfying b(x) ≤ f_S(x) ≤ a(x) and f_S(x) ≤ f_X(x). Let U be uniformly distributed on [0, 1] and independent of X. Given the value X, we implement the following algorithm:

if U f_X(X) > a(X), reject X; otherwise:
if U f_X(X) < b(X), accept X; otherwise:
if U f_X(X) ≤ f_S(X), accept X; otherwise: reject X.

Show that, conditional on ultimate acceptance, X is distributed as S. Explain when you might use this method of sampling.

48. Let X, Y, and {Uᵣ : r ≥ 1} be independent random variables, where:

P(X = x) = (e − 1)e⁻ˣ,   P(Y = y) = 1/{(e − 1) y!},   for x, y = 1, 2, ...,

and the Uᵣ are uniform on [0, 1]. Let M = max{U₁, U₂, ..., U_Y}, and show that Z = X − M is exponentially distributed.

49. Let U and V be independent and uniform on [0, 1]. Set X = −a⁻¹ log U and Y = −log V where a > 0.
(a) Show that, conditional on the event Y ≥ ½(X − a)², X has density function f(x) = √(2/π) e^{−x²/2} for x > 0.


(b) In sampling from the density function f, it is decided to use a rejection method: for given a > 0, we sample U and V repeatedly, and we accept X the first time that Y ≥ ½(X − a)². What is the optimal value of a?

(c) Describe how to use these facts in sampling from the N(O, 1 ) distribution.
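Problem (4.14.49) describes an exponential-envelope rejection sampler for the half-normal density; attaching a random sign to the accepted value gives N(0, 1) samples, and a = 1 is the classical answer to part (b). A minimal sketch under those assumptions (illustrative code, not from the text):

```python
import math
import random

def half_normal_sample(a=1.0):
    """Rejection sampling as in Problem (4.14.49): X = -log(U)/a, Y = -log(V),
    accept X when Y >= (X - a)**2 / 2; the accepted X has density
    sqrt(2/pi) * exp(-x**2 / 2) on x > 0."""
    while True:
        x = -math.log(random.random()) / a
        y = -math.log(random.random())
        if y >= 0.5 * (x - a) ** 2:
            return x

def standard_normal_sample():
    """Attach a random sign to a half-normal sample to obtain N(0, 1)."""
    x = half_normal_sample()
    return x if random.random() < 0.5 else -x

if __name__ == "__main__":
    samples = [standard_normal_sample() for _ in range(50_000)]
    mean = sum(samples) / len(samples)
    var = sum(s * s for s in samples) / len(samples) - mean ** 2
    print(f"sample mean ~ {mean:.3f}, sample variance ~ {var:.3f}")
```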

50. Let S be a semicircle of unit radius on a diameter D.
(a) A point P is picked at random on D. If X is the distance from P to S along the perpendicular to D, show that E(X) = π/4.
(b) A point Q is picked at random on S. If Y is the perpendicular distance from Q to D, show that E(Y) = 2/π.

51. (Set for the Fellowship examination of St John's College, Cambridge in 1858.) 'A large quantity of pebbles lies scattered uniformly over a circular field; compare the labour of collecting them one by one:
(i) at the centre O of the field,
(ii) at a point A on the circumference.'
To be precise, if L_O and L_A are the respective labours per stone, show that E(L_O) = ⅔a and E(L_A) = 32a/(9π) for some constant a.

(iii) Suppose you take each pebble to the nearer of two points A or B at the ends of a diameter. Show in this case that the labour per stone satisfies

E(L_AB) = {4a/(3π)} { 16/3 − 17/(3√2) + ½ log(1 + √2) } ≈ 1.13 × ⅔a.

(iv) Finally suppose you take each pebble to the nearest vertex of an equilateral triangle ABC inscribed in the circle. Why is it obvious that the labour per stone now satisfies E(L_ABC) < E(L_O)? Enthusiasts are invited to calculate E(L_ABC).

52. The lines L, M, and N are parallel, and P lies on L . A line picked at random through P meets M at Q. A line picked at random through Q meets N at R. What is the density function of the angle e that RP makes with L? [Hint: Recall Exercise (4.8 .2) and Problem (4. 14.4).]

53. Let Δ denote the event that you can form a triangle with three given parts of a rod R.
(a) R is broken at two points chosen independently and uniformly. Show that P(Δ) = ¼.
(b) R is broken in two uniformly at random, then the longer part is broken in two uniformly at random. Show that P(Δ) = log(4/e).
(c) R is broken in two uniformly at random, and a randomly chosen part is broken into two equal parts. Show that P(Δ) = ½.
(d) In case (c) show that, given Δ, the triangle is obtuse with probability 3 − 2√2.
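The values ¼ and ½ claimed in parts (a) and (c) of Problem (4.14.53) are easy to check by simulation. A short sketch (illustrative code, not part of the original text):

```python
import random

def is_triangle(a, b, c):
    """Three lengths form a triangle iff each is less than the sum of the others."""
    return a < b + c and b < a + c and c < a + b

def case_a(trials=200_000):
    """Break the rod at two independent uniform points."""
    hits = 0
    for _ in range(trials):
        u, v = sorted((random.random(), random.random()))
        hits += is_triangle(u, v - u, 1 - v)
    return hits / trials

def case_c(trials=200_000):
    """Break in two uniformly, then halve a randomly chosen part."""
    hits = 0
    for _ in range(trials):
        u = random.random()
        part = u if random.random() < 0.5 else 1 - u
        hits += is_triangle(part / 2, part / 2, 1 - part)
    return hits / trials

if __name__ == "__main__":
    print(f"case (a): ~ {case_a():.3f}  (exact 1/4)")
    print(f"case (c): ~ {case_c():.3f}  (exact 1/2)")
```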

54. You break a rod at random into two pieces. Let R be the ratio of the lengths of the shorter to the longer piece. Find the density function fR , together with the mean and variance of R.

55. Let R be the distance between two points picked at random inside a square of side a. Show that E(R²) = ⅓a², and that R²/a² has density function

f(r) = r − 4√r + π   if 0 ≤ r ≤ 1,
f(r) = 4√(r − 1) − 2 − r + 2 sin⁻¹√(1/r) − 2 sin⁻¹√(1 − 1/r)   if 1 ≤ r ≤ 2.

56. Show that a sheet of paper of area A cm² can be placed on the square lattice with period 1 cm in such a way that at least ⌈A⌉ points are covered.


57. Show that it is possible to position a convex rock of surface area S in sunlight in such a way that its shadow has area at least ¼S.

58. Dirichlet distribution. Let {Xᵣ : 1 ≤ r ≤ k + 1} be independent Γ(λ, βᵣ) random variables (respectively).
(a) Show that Yᵣ = Xᵣ/(X₁ + ... + Xᵣ), 2 ≤ r ≤ k + 1, are independent random variables.
(b) Show that Zᵣ = Xᵣ/(X₁ + ... + X_{k+1}), 1 ≤ r ≤ k, have the joint Dirichlet density

Γ(β₁ + ... + β_{k+1}) / {Γ(β₁) ⋯ Γ(β_{k+1})} · z₁^{β₁−1} ⋯ z_k^{β_k−1} (1 − z₁ − ... − z_k)^{β_{k+1}−1}.

59. Hotelling's theorem. Let Xᵣ = (X_{1r}, X_{2r}, ..., X_{mr}), 1 ≤ r ≤ n, be independent multivariate normal random vectors having zero means and the same covariance matrix V = (v_ij). Show that the two random variables

T_ij = Σ_{r=1}^{n−1} X_{ir}X_{jr},   S_ij = Σ_{r=1}^{n} (X_{ir} − X̄_i)(X_{jr} − X̄_j),   where X̄_i = n⁻¹ Σ_{r=1}^{n} X_{ir},

are identically distributed.

60. Choose P, Q, and R independently at random in the square S(a) of side a. Show that E|PQR| = 11a²/144. Deduce that four points picked at random in a parallelogram form a convex quadrilateral with probability (5/6)².

61. Choose P, Q, and R uniformly at random within the convex region C illustrated beneath. By considering the event that four randomly chosen points form a triangle, or otherwise, show that the mean area of the shaded region is three times the mean area of the triangle PQR.

62. Multivariate normal sampling. Let V be a positive-definite symmetric n × n matrix, and L a lower-triangular matrix such that V = L′L; this is called the Cholesky decomposition of V. Let X = (X₁, X₂, ..., Xₙ) be a vector of independent random variables distributed as N(0, 1). Show that the vector Z = μ + XL has the multivariate normal distribution with mean vector μ and covariance matrix V.
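Problem (4.14.62) is precisely how multivariate normal vectors are generated in practice. The sketch below uses numpy, whose cholesky routine returns a lower-triangular L with V = LLᵀ, so the sample is written μ + Lx (the column-vector analogue of the problem's μ + XL); the particular μ and V are illustrative only.

```python
import numpy as np

def multivariate_normal_sample(mu, V, rng=np.random.default_rng()):
    """Sample from N(mu, V) via the Cholesky decomposition V = L L^T."""
    L = np.linalg.cholesky(V)          # lower-triangular factor
    x = rng.standard_normal(len(mu))   # independent N(0, 1) coordinates
    return mu + L @ x

if __name__ == "__main__":
    mu = np.array([1.0, -2.0])
    V = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
    samples = np.array([multivariate_normal_sample(mu, V) for _ in range(20_000)])
    print("sample mean:", samples.mean(axis=0))
    print("sample covariance:\n", np.cov(samples.T))
```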

63. Verifying matrix multiplications. We need to decide whether or not AB = C where A, B, C are given n × n matrices, and we adopt the following random algorithm. Let x be a random {0, 1}ⁿ-valued vector, each of the 2ⁿ possibilities being equally likely. If (AB − C)x = 0, we decide that AB = C, and otherwise we decide that AB ≠ C. Show that

P(the decision is correct) = 1 if AB = C, and ≥ ½ if AB ≠ C.

Describe a similar procedure which results in an error probability which may be made as small as desired.
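The procedure in Problem (4.14.63) is Freivalds' randomized verification, and the 'similar procedure' with smaller error is simply independent repetition. A sketch (illustrative, with integer matrices so that exact comparison is safe):

```python
import numpy as np

def probably_equal(A, B, C, repeats=20, rng=np.random.default_rng()):
    """Freivalds-style check of AB = C. Each round draws a random {0,1} vector x
    and tests A(Bx) = Cx, costing O(n^2) instead of the full O(n^3) product.
    A false 'True' is returned with probability at most 2**(-repeats)."""
    n = C.shape[0]
    for _ in range(repeats):
        x = rng.integers(0, 2, size=n)          # random vector in {0,1}^n
        if not np.array_equal(A @ (B @ x), C @ x):
            return False                         # certainly AB != C
    return True                                  # AB = C with high probability

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(50, 50))
    B = rng.integers(-3, 4, size=(50, 50))
    C = A @ B
    print(probably_equal(A, B, C))               # True
    C[0, 0] += 1
    print(probably_equal(A, B, C))               # almost surely False
```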


5

Generating functions and their applications

5.1 Exercises. Generating functions

1. Find the generating functions of the following mass functions, and state where they converge. Hence calculate their means and variances.
(a) f(m) = (n + m − 1 choose m) pⁿ(1 − p)ᵐ, for m ≥ 0.
(b) f(m) = {m(m + 1)}⁻¹, for m ≥ 1.
(c) f(m) = (1 − p)p^{|m|}/(1 + p), for m = ..., −1, 0, 1, ....
The constant p satisfies 0 < p < 1.

2. Let X (≥ 0) have probability generating function G and write t(n) = P(X > n) for the 'tail' probabilities of X. Show that the generating function of the sequence {t(n) : n ≥ 0} is T(s) = (1 − G(s))/(1 − s). Show that E(X) = T(1) and var(X) = 2T′(1) + T(1) − T(1)².

3. Let G_{X,Y}(s, t) be the joint probability generating function of X and Y. Show that G_X(s) = G_{X,Y}(s, 1) and G_Y(t) = G_{X,Y}(1, t). Show that

E(XY) = ∂²G_{X,Y}(s, t)/∂s∂t evaluated at s = t = 1.

4. Find the joint generating functions of the following joint mass functions, and state for what values of the variables the series converge.
(a) f(j, k) = (1 − a)(p − a)aʲp^{k−j−1}, for 0 ≤ k ≤ j, where 0 < a < 1, a < p.
(b) f(j, k) = (e − 1)e^{−(2k+1)} kʲ/j!, for j, k ≥ 0.
(c) f(j, k) = (k choose j) p^{j+k}(1 − p)^{k−j} / [k log{1/(1 − p)}], for 0 ≤ j ≤ k, k ≥ 1, where 0 < p < 1.
Deduce the marginal probability generating functions and the covariances.

5. A coin is tossed n times, and heads turns up with probability p on each toss. Assuming the usual independence, show that the joint probability generating function of the numbers H and T of heads and tails is G_{H,T}(x, y) = {px + (1 − p)y}ⁿ. Generalize this conclusion to find the joint probability generating function of the multinomial distribution of Exercise (3.5.1).

6. Let X have the binomial distribution bin(n, U), where U is uniform on (0, 1). Show that X is uniformly distributed on {0, 1, 2, ..., n}.

7. Show that

G(x, y, z, w) = ⅛(xyzw + xy + yz + zw + zx + yw + xz + 1)


is the joint generating function of four variables that are pairwise and triplewise independent, but are nevertheless not independent.

8. Let pᵣ > 0 and aᵣ ∈ ℝ for 1 ≤ r ≤ n. Which of the following is a moment generating function, and for what random variable?

(a) M(t) = 1 + Σᵣ₌₁ⁿ pᵣtʳ,   (b) M(t) = Σᵣ₌₁ⁿ pᵣe^{aᵣt}.

9. Let G₁ and G₂ be probability generating functions, and suppose that 0 ≤ a ≤ 1. Show that G₁G₂, and aG₁ + (1 − a)G₂ are probability generating functions. Is G(as)/G(a) necessarily a probability generating function?

5.2 Exercises. Some applications

1. Let X be the number of events in the sequence A₁, A₂, ..., Aₙ which occur. Let Sₘ = E(X choose m), the mean value of the random binomial coefficient (X choose m), and show that

P(X ≥ i) = Σ_{j=i}^{n} (−1)^{j−i} (j − 1 choose i − 1) S_j,   for 1 ≤ i ≤ n,

where

Sₘ = Σ_{j=m}^{n} (j − 1 choose m − 1) P(X ≥ j),   for 1 ≤ m ≤ n.

2. Each person in a group of n people chooses another at random. Find the probability:
(a) that exactly k people are chosen by nobody,
(b) that at least k people are chosen by nobody.

3. Compounding.
(a) Let X have the Poisson distribution with parameter Y, where Y has the Poisson distribution with parameter μ. Show that G_{X+Y}(x) = exp{μ(xe^{x−1} − 1)}.
(b) Let X₁, X₂, ... be independent identically distributed random variables with the logarithmic mass function

f(k) = (1 − p)ᵏ / {k log(1/p)},   k = 1, 2, ...,

where 0 < p < 1. If N is independent of the Xᵢ and has the Poisson distribution with parameter μ, show that Y = Σᵢ₌₁^N Xᵢ has a negative binomial distribution.

4. Let X have the binomial distribution with parameters n and p, and show that

E(1/(1 + X)) = {1 − (1 − p)^{n+1}} / {(n + 1)p}.

Find the limit of this expression as n → ∞ and p → 0, the limit being taken in such a way that np → λ where 0 < λ < ∞. Comment.

5. A coin is tossed repeatedly, and heads turns up with probability p on each toss. Let hₙ be the probability of an even number of heads in the first n tosses, with the convention that 0 is an even number. Find a difference equation for the hₙ and deduce that they have generating function ½{(1 + 2ps − s)⁻¹ + (1 − s)⁻¹}.


6. An unfair coin is flipped repeatedly, where P(H) = p = 1 − q. Let X be the number of flips until HTH first appears, and Y the number of flips until either HTH or THT appears. Show that E(s^X) = p²qs³/{1 − s + pqs² − pq²s³} and find E(s^Y).

7. Matching again. The pile of (by now dog-eared) letters is dropped again and enveloped at random, yielding Xₙ matches. Show that P(Xₙ = j) = (j + 1)P(Xₙ₊₁ = j + 1). Deduce that the derivatives of the Gₙ(s) = E(s^{Xₙ}) satisfy G′ₙ₊₁ = Gₙ, and hence derive the conclusion of Example (3.4.3), namely:

P(Xₙ = r) = (1/r!) ( 1/2! − 1/3! + ... + (−1)^{n−r}/(n − r)! ).

8. Let X have a Poisson distribution with parameter Λ, where Λ is exponential with parameter μ. Show that X has a geometric distribution.

9. Coupons. Recall from Exercise (3.3.2) that each packet of an overpriced commodity contains a worthless plastic object. There are four types of object, and each packet is equally likely to contain any of the four. Let T be the number of packets you open until you first have the complete set. Find E(s^T) and P(T = k).
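Exercise (5.2.9) can also be checked numerically: with four equally likely types, E(T) = 4(1 + ½ + ⅓ + ¼) = 25/3. A small simulation sketch (illustrative only):

```python
import random

def packets_until_complete_set(n_types=4):
    """Number of packets opened until all n_types objects have been collected."""
    seen, packets = set(), 0
    while len(seen) < n_types:
        seen.add(random.randrange(n_types))
        packets += 1
    return packets

if __name__ == "__main__":
    trials = 100_000
    mean = sum(packets_until_complete_set() for _ in range(trials)) / trials
    print(f"estimated E(T) ~ {mean:.3f}  (exact 25/3 ~ 8.333)")
```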

5.3 Exercises. Random walk

1. For a simple random walk S with S₀ = 0 and p = 1 − q < ½, show that the maximum M = max{Sₙ : n ≥ 0} satisfies P(M ≥ r) = (p/q)ʳ, for r ≥ 0.
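The geometric law P(M ≥ r) = (p/q)ʳ of Exercise (5.3.1) can be checked by simulating walks with negative drift; truncating each walk after finitely many steps slightly underestimates M, but the drift makes the error negligible. An illustrative sketch:

```python
import random

def max_of_walk(p, steps=2_000):
    """Maximum of a simple random walk with P(step = +1) = p, started at 0."""
    s, m = 0, 0
    for _ in range(steps):
        s += 1 if random.random() < p else -1
        m = max(m, s)
    return m

if __name__ == "__main__":
    p, q, trials = 0.3, 0.7, 5_000
    maxima = [max_of_walk(p) for _ in range(trials)]
    for r in range(4):
        est = sum(m >= r for m in maxima) / trials
        print(f"P(M >= {r}) ~ {est:.3f}   vs (p/q)^{r} = {(p / q) ** r:.3f}")
```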

2. Use generating functions to show that, for a symmetric random walk,
(a) 2k f₀(2k) = P(S_{2k−2} = 0) for k ≥ 1, and
(b) P(S₁S₂⋯S_{2n} ≠ 0) = P(S_{2n} = 0) for n ≥ 1.

3. A particle performs a random walk on the corners of the square ABCD. At each step, the probability of moving from corner c to corner d equals p_{cd}, where

p_{AB} = p_{BA} = p_{CD} = p_{DC} = α,   p_{AD} = p_{DA} = p_{BC} = p_{CB} = β,

and α, β > 0, α + β = 1. Let G_A(s) be the generating function of the sequence (p_{AA}(n) : n ≥ 0), where p_{AA}(n) is the probability that the particle is at A after n steps, having started at A. Show that

G_A(s) = ½ { 1/(1 − s²) + 1/(1 − (β − α)²s²) }.

Hence find the probability generating function of the time of the first return to A.

4. A particle performs a symmetric random walk in two dimensions starting at the origin: each step is of unit length and has equal probability ¼ of being northwards, southwards, eastwards, or westwards. The particle first reaches the line x + y = m at the point (X, Y) and at the time T. Find the probability generating functions of T and X − Y, and state where they converge.

5. Derive the arc sine law for sojourn times, Theorem (3.10.21), using generating functions. That is to say, let L_{2n} be the length of time spent (up to time 2n) by a simple symmetric random walk to the right of its starting point. Show that

P(L_{2n} = 2k) = P(S_{2k} = 0) P(S_{2n−2k} = 0)   for 0 ≤ k ≤ n.

6. Let {Sₙ : n ≥ 0} be a simple symmetric random walk with S₀ = 0, and let T = min{n > 0 : Sₙ = 0}. Show that

E(min{T, 2m}) = 2E|S_{2m}| = 4m P(S_{2m} = 0)   for m ≥ 0.


7. Let Sₙ = Σᵣ₌₀ⁿ Xᵣ be a left-continuous random walk on the integers with a retaining barrier at zero. More specifically, we assume that the Xᵣ are identically distributed integer-valued random variables with X₁ ≥ −1, P(X₁ = 0) ≠ 0, and

Sₙ₊₁ = Sₙ + Xₙ₊₁ if Sₙ > 0,   Sₙ₊₁ = Sₙ + Xₙ₊₁ + 1 if Sₙ = 0.

Show that the distribution of S₀ may be chosen in such a way that E(z^{Sₙ}) = E(z^{S₀}) for all n, if and only if E(X₁) < 0, and in this case

E(z^{Sₙ}) = (1 − z)E(X₁)E(z^{X₁}) / {1 − E(z^{X₁})}.

8. Consider a simple random walk starting at 0 in which each step is to the right with probability p (= 1 − q). Let T_b be the number of steps until the walk first reaches b where b > 0. Show that E(T_b | T_b < ∞) = b/|p − q|.

5.4 Exercises. Branching processes

1. Let Zₙ be the size of the nth generation in an ordinary branching process with Z₀ = 1, E(Z₁) = μ, and var(Z₁) > 0. Show that E(ZₙZₘ) = μ^{n−m}E(Zₘ²) for m ≤ n. Hence find the correlation coefficient ρ(Zₘ, Zₙ) in terms of μ.

2. Consider a branching process with generation sizes Zₙ satisfying Z₀ = 1 and P(Z₁ = 0) = 0. Pick two individuals at random (with replacement) from the nth generation and let L be the index of the generation which contains their most recent common ancestor. Show that P(L = r) = E(Zᵣ⁻¹) − E(Zᵣ₊₁⁻¹) for 0 ≤ r < n. What can be said if P(Z₁ = 0) > 0?

3. Consider a branching process whose family sizes have the geometric mass function f(k) = qpᵏ, k ≥ 0, where p + q = 1, and let Zₙ be the size of the nth generation. Let T = min{n : Zₙ = 0} be the extinction time, and suppose that Z₀ = 1. Find P(T = n). For what values of p is it the case that E(T) < ∞?

4. Let Zₙ be the size of the nth generation of a branching process, and assume Z₀ = 1. Find an expression for the generating function Gₙ of Zₙ, in the cases when Z₁ has generating function given by:
(a) G(s) = 1 − α(1 − s)^β, 0 < α, β < 1.
(b) G(s) = f⁻¹{P(f(s))}, where P is a probability generating function, and f is a suitable function satisfying f(1) = 1.
(c) Suppose in the latter case that f(x) = xᵐ and P(s) = s{γ − (γ − 1)s}⁻¹ where γ > 1. Calculate the answer explicitly.

5. Branching with immigration. Each generation of a branching process (with a single progenitor) is augmented by a random number of immigrants who are indistinguishable from the other members of the population. Suppose that the numbers of immigrants in different generations are independent of each other and of the past history of the branching process, each such number having probability generating function H(s). Show that the probability generating function Gₙ of the size of the nth generation satisfies Gₙ₊₁(s) = Gₙ(G(s))H(s), where G is the probability generating function of a typical family of offspring.

6. Let Zₙ be the size of the nth generation in a branching process with E(s^{Z₁}) = (2 − s)⁻¹ and Z₀ = 1. Let Vᵣ be the total number of generations of size r. Show that E(V₁) = ⅙π², and E(2V₂ − V₃) = ⅙π² − (1/90)π⁴.


5.5 Exercises. Age-dependent branching processes

1. Let Zₙ be the size of the nth generation in an age-dependent branching process Z(t), the lifetime distribution of which is exponential with parameter λ. If Z(0) = 1, show that the probability generating function G_t(s) of Z(t) satisfies

∂G_t(s)/∂t = λ{G(G_t(s)) − G_t(s)}.

Show in the case of 'exponential binary fission', when G(s) = s², that

G_t(s) = s e^{−λt} / {1 − s(1 − e^{−λt})},

and hence derive the probability mass function of the population size Z(t) at time t.

2. Solve the differential equation of Exercise (1) when λ = 1 and G(s) = ½(1 + s²), to obtain

G_t(s) = {2s + t(1 − s)} / {2 + t(1 − s)}.

Hence find P(Z(t) ≥ k), and deduce that

P(Z(t)/t ≥ x | Z(t) > 0) → e^{−2x}   as t → ∞.

5.6 Exercises. Expectation revisited

1. Jensen's inequality. A function u : ℝ → ℝ is called convex if for all real a there exists λ, depending on a, such that u(x) ≥ u(a) + λ(x − a) for all x. (Draw a diagram to illustrate this definition.) Show that, if u is convex and X is a random variable with finite mean, then E(u(X)) ≥ u(E(X)).

2. Let X₁, X₂, ... be random variables satisfying E(Σᵢ₌₁^∞ |Xᵢ|) < ∞. Show that

E(Σᵢ₌₁^∞ Xᵢ) = Σᵢ₌₁^∞ E(Xᵢ).

3. Let {Xₙ} be a sequence of random variables satisfying Xₙ ≤ Y a.s. for some Y with E|Y| < ∞. Show that

E(lim sup_{n→∞} Xₙ) ≥ lim sup_{n→∞} E(Xₙ).

4. Suppose that E|Xʳ| < ∞ where r > 0. Deduce that xʳP(|X| ≥ x) → 0 as x → ∞. Conversely, suppose that xʳP(|X| ≥ x) → 0 as x → ∞ where r ≥ 0, and show that E|Xˢ| < ∞ for 0 ≤ s < r.

5. Show that E|X| < ∞ if and only if the following holds: for all ε > 0, there exists δ > 0, such that E(|X| I_A) < ε for all A such that P(A) < δ.


5.7 Exercises. Characteristic functions

1. Find two dependent random variables X and Y such that φ_{X+Y}(t) = φ_X(t)φ_Y(t) for all t.

2. If φ is a characteristic function, show that Re{1 − φ(t)} ≥ ¼ Re{1 − φ(2t)}, and deduce that 1 − |φ(2t)| ≤ 8{1 − |φ(t)|}.

3. The cumulant generating function K_X(θ) of the random variable X is defined by K_X(θ) = log E(e^{θX}), the logarithm of the moment generating function of X. If the latter is finite in a neighbourhood of the origin, then K_X has a convergent Taylor expansion:

K_X(θ) = Σ_{n=1}^{∞} kₙ(X) θⁿ/n!,

and kₙ(X) is called the nth cumulant (or semi-invariant) of X.
(a) Express k₁(X), k₂(X), and k₃(X) in terms of the moments of X.
(b) If X and Y are independent random variables, show that kₙ(X + Y) = kₙ(X) + kₙ(Y).

4. Let X be N(0, 1), and show that the cumulants of X are k₂(X) = 1, kₘ(X) = 0 for m ≠ 2.

5. The random variable X is said to have a lattice distribution if there exist a and b such that X takes values in the set L(a, b) = {a + bm : m = 0, ±1, ...}. The span of such a variable X is the maximal value of b for which there exists a such that X takes values in L(a, b).
(a) Suppose that X has a lattice distribution with span b. Show that |φ_X(2π/b)| = 1, and that |φ_X(t)| < 1 for 0 < t < 2π/b.
(b) Suppose that |φ_X(θ)| = 1 for some θ ≠ 0. Show that X has a lattice distribution with span 2πk/θ for some integer k.

6. Let X be a random variable with density function f. Show that |φ_X(t)| → 0 as t → ±∞.

7. Let X₁, X₂, ..., Xₙ be independent variables, Xᵢ being N(μᵢ, 1), and let Y = X₁² + X₂² + ... + Xₙ². Show that the characteristic function of Y is

φ_Y(t) = (1 − 2it)^{−n/2} exp{ itθ / (1 − 2it) },

where θ = μ₁² + μ₂² + ... + μₙ². The random variable Y is said to have the non-central chi-squared distribution with n degrees of freedom and non-centrality parameter θ, written χ²(n; θ).

8. Let X be N(μ, 1) and let Y be χ²(n), and suppose that X and Y are independent. The random variable T = X/√(Y/n) is said to have the non-central t-distribution with n degrees of freedom and non-centrality parameter μ. If U and V are independent, U being χ²(m; θ) and V being χ²(n), then F = (U/m)/(V/n) is said to have the non-central F-distribution with m and n degrees of freedom and non-centrality parameter θ, written F(m, n; θ).
(a) Show that T² is F(1, n; μ²).
(b) Show that

E(F) = n(m + θ)/{m(n − 2)}   if n > 2.

9. Let X be a random variable with density function f and characteristic function φ. Show, subject to an appropriate condition on f, that

∫_{−∞}^{∞} f(x)² dx = (1/2π) ∫_{−∞}^{∞} |φ(t)|² dt.


10. If X and Y are continuous random variables, show that

∫_{−∞}^{∞} φ_X(y) f_Y(y) e^{−ity} dy = ∫_{−∞}^{∞} φ_Y(x − t) f_X(x) dx.

11. Tilted distributions.
(a) Let X have distribution function F and let τ be such that M(τ) = E(e^{τX}) < ∞. Show that F_τ(x) = M(τ)⁻¹ ∫_{−∞}^{x} e^{τy} dF(y) is a distribution function, called a 'tilted distribution' of X, and find its moment generating function.
(b) Suppose X and Y are independent and E(e^{τX}), E(e^{τY}) < ∞. Find the moment generating function of the tilted distribution of X + Y in terms of those of X and Y.

5.8 Exercises. Examples of characteristic functions

1. If φ is a characteristic function, show that φ̄, φ², |φ|², Re(φ) are characteristic functions. Show that |φ| is not necessarily a characteristic function.

2. Show that

P(X ≥ x) ≤ inf_{t≥0} { e^{−tx} M_X(t) },

where M_X is the moment generating function of X.

3. Let X have the Γ(λ, m) distribution and let Y be independent of X with the beta distribution with parameters n and m − n, where m and n are non-negative integers satisfying n ≤ m. Show that Z = XY has the Γ(λ, n) distribution.

4. Find the characteristic function of X² when X has the N(μ, σ²) distribution.

5. Let X₁, X₂, ... be independent N(0, 1) variables. Use characteristic functions to find the distribution of: (a) X₁², (b) Σᵢ₌₁ⁿ Xᵢ², (c) X₁/X₂, (d) X₁X₂, (e) X₁X₂ + X₃X₄.

6. Let X₁, X₂, ..., Xₙ be such that, for all a₁, a₂, ..., aₙ ∈ ℝ, the linear combination a₁X₁ + a₂X₂ + ... + aₙXₙ has a normal distribution. Show that the joint characteristic function of the Xₘ is exp(itμ′ − ½tVt′), for an appropriate vector μ and matrix V. Deduce that the vector (X₁, X₂, ..., Xₙ) has a multivariate normal density function so long as V is invertible.

7. Let X and Y be independent N(0, 1) variables, and let U and V be independent of X and Y. Show that Z = (UX + VY)/√(U² + V²) has the N(0, 1) distribution. Formulate an extension of this result to cover the case when X and Y have a bivariate normal distribution with zero means, unit variances, and correlation ρ.

8. Let X be exponentially distributed with parameter λ. Show by elementary integration that E(e^{itX}) = λ/(λ − it).

9. Find the characteristic functions of the following density functions: (a) f(x) = ½e^{−|x|} for x ∈ ℝ, (b) f(x) = ½|x|e^{−|x|} for x ∈ ℝ.

10. Is it possible for X, Y, and Z to have the same distribution and satisfy X = U(Y + Z), where U is uniform on [0, 1], and Y, Z are independent of U and of one another? (This question arises in modelling energy redistribution among physical particles.)

11. Find the joint characteristic function of two random variables having a bivariate normal distribution with zero means. (No integration is needed.)


5.9 Exercises. Inversion and continuity theorems

1. Let Xₙ be a discrete random variable taking values in {1, 2, ..., n}, each possible value having probability n⁻¹. Show that, as n → ∞, P(n⁻¹Xₙ ≤ y) → y, for 0 ≤ y ≤ 1.

2. Let Xₙ have distribution function

Fₙ(x) = x − sin(2nπx)/(2nπ),   0 ≤ x ≤ 1.

(a) Show that Fₙ is indeed a distribution function, and that Xₙ has a density function.
(b) Show that, as n → ∞, Fₙ converges to the uniform distribution function, but that the density function of Fₙ does not converge to the uniform density function.

3. A coin is tossed repeatedly, with heads turning up with probability p on each toss. Let N be the minimum number of tosses required to obtain k heads. Show that, as p ↓ 0, the distribution function of 2Np converges to that of a gamma distribution.

4. If X is an integer-valued random variable with characteristic function φ, show that

P(X = k) = (1/2π) ∫_{−π}^{π} e^{−itk} φ(t) dt.

What is the corresponding result for a random variable whose distribution is arithmetic with span λ (that is, there is probability one that X is a multiple of λ, and λ is the largest positive number with this property)?

5. Use the inversion theorem to show that

∫_{−∞}^{∞} {sin(at) sin(bt)}/t² dt = π min{a, b}.

6. Stirling's formula. Let fₙ(x) be a differentiable function on ℝ with a global maximum at a > 0, and such that ∫₀^∞ exp{fₙ(x)} dx < ∞. Laplace's method of steepest descent (related to Watson's lemma and saddlepoint methods) asserts under mild conditions that

∫₀^∞ exp{fₙ(x)} dx ∼ ∫₀^∞ exp{fₙ(a) + ½(x − a)² fₙ″(a)} dx   as n → ∞.

By setting fₙ(x) = n log x − x, prove Stirling's formula: n! ∼ nⁿe⁻ⁿ√(2πn).

7. Let X = (X₁, X₂, ..., Xₙ) have the multivariate normal distribution with zero means, and covariance matrix V = (v_ij) satisfying |V| > 0 and v_ij > 0 for all i, j. Show that the density function f satisfies

∂f/∂v_ij = ∂²f/(∂x_i ∂x_j) if i ≠ j,   ∂f/∂v_ii = ½ ∂²f/∂x_i² if i = j,

and deduce that P(max_{k≤n} X_k ≤ u) ≥ Π_{k=1}^{n} P(X_k ≤ u).

8. Let X₁, X₂ have a bivariate normal distribution with zero means, unit variances, and correlation ρ. Use the inversion theorem to show that

∂/∂ρ P(X₁ > 0, X₂ > 0) = 1/(2π√(1 − ρ²)).

Hence find P(X₁ > 0, X₂ > 0).


5.10 Exercises. Two limit theorems

1. Prove that, for x ::: 0, as n --+ 00,

(a)

(b) Σ_{k: |k−n| ≤ x√n} nᵏ/k! ∼ eⁿ ∫_{−x}^{x} (2π)^{−1/2} e^{−u²/2} du.

2. It is well known that infants born to mothers who smoke tend to be small and prone to a range of ailments. It is conjectured that also they look abnormal. Nurses were shown selections of photographs of babies, one half of whom had smokers as mothers; the nurses were asked to judge from a baby's appearance whether or not the mother smoked. In 1500 trials the correct answer was given 910 times. Is the conjecture plausible? If so, why?

3. Let X have the Γ(1, s) distribution; given that X = x, let Y have the Poisson distribution with parameter x. Find the characteristic function of Y, and show that

{Y − E(Y)}/√var(Y) → N(0, 1) in distribution as s → ∞.

Explain the connection with the central limit theorem.

4. Let X₁, X₂, ... be independent random variables taking values in the positive integers, whose common distribution is non-arithmetic, in that gcd{n : P(X₁ = n) > 0} = 1. Prove that, for all integers x, there exist non-negative integers r = r(x), s = s(x), such that

P(X₁ + ... + Xᵣ − Xᵣ₊₁ − ... − Xᵣ₊ₛ = x) > 0.

5. Prove the local central limit theorem for sums of random variables taking integer values. You may assume for simplicity that the summands have span 1, in that gcd{|x| : P(X = x) > 0} = 1.

6. Let X₁, X₂, ... be independent random variables having common density function f(x) = 1/{2|x|(log|x|)²} for |x| < e⁻¹. Show that the Xᵢ have zero mean and finite variance, and that the density function fₙ of X₁ + X₂ + ... + Xₙ satisfies fₙ(x) → ∞ as x → 0. Deduce that the Xᵢ do not satisfy the local limit theorem.

7. First-passage density. Let X have the density function f(x) = (2πx³)^{−1/2} exp(−{2x}⁻¹), x > 0. Show that φ(is) = E(e^{−sX}) = e^{−√(2s)}, s > 0, and deduce that X has characteristic function

φ(t) = exp{−(1 − i)√t} if t ≥ 0,   φ(t) = exp{−(1 + i)√|t|} if t ≤ 0.

[Hint: Use the result of Problem (5.12.18).]

8. Let {Xᵣ : r ≥ 1} be independent with the distribution of the preceding Exercise (7). Let Uₙ = n⁻¹Σᵣ₌₁ⁿ Xᵣ, and Tₙ = n⁻¹Uₙ. Show that:
(a) P(Uₙ < c) → 0 for any c < ∞,
(b) Tₙ has the same distribution as X₁.

9. A sequence of biased coins is flipped; the chance that the rth coin shows a head is Θᵣ, where Θᵣ is a random variable taking values in (0, 1). Let Xₙ be the number of heads after n flips. Does Xₙ obey the central limit theorem when:
(a) the Θᵣ are independent and identically distributed?
(b) Θᵣ = Θ for all r, where Θ is a random variable taking values in (0, 1)?


5.11 Exercises. Large deviations

1. A fair coin is tossed n times, showing heads Hₙ times and tails Tₙ times. Let Sₙ = Hₙ − Tₙ. Show that

P(Sₙ > an)^{1/n} → 1/√{(1 + a)^{1+a}(1 − a)^{1−a}}   if 0 < a < 1.

What happens if a ≥ 1?

2. Show that

Tₙ^{1/n} → 2/√{(1 + a)^{1+a}(1 − a)^{1−a}}

as n → ∞, where 0 < a < 1 and

Tₙ = Σ_{k: |k − n/2| > an/2} (n choose k).

Find the asymptotic behaviour of Tₙ^{1/n} where

Tn = where a > o.

3. Show that the moment generating function of X is finite in a neighbourhood of the origin if and only if X has exponentially decaying tails, in the sense that there exist positive constants λ and μ such that P(|X| ≥ a) ≤ μe^{−λa} for a > 0. [Seen in the light of this observation, the condition of the large deviation theorem (5.11.4) is very natural.]

4. Let X₁, X₂, ... be independent random variables having the Cauchy distribution, and let Sₙ = X₁ + X₂ + ... + Xₙ. Find P(Sₙ > an).

5.12 Problems

1. A die is thrown ten times. What is the probability that the sum of the scores is 27?

2. A coin is tossed repeatedly, heads appearing with probability p on each toss.
(a) Let X be the number of tosses until the first occasion by which three heads have appeared successively. Write down a difference equation for f(k) = P(X = k) and solve it. Now write down an equation for E(X) using conditional expectation. (Try the same thing for the first occurrence of HTH.)
(b) Let N be the number of heads in n tosses of the coin. Write down G_N(s). Hence find the probability that: (i) N is divisible by 2, (ii) N is divisible by 3.

3. A coin is tossed repeatedly, heads occurring on each toss with probability p. Find the probability generating function of the number T of tosses before a run of n heads has appeared for the first time.

4. Find the generating function of the negative binomial mass function

f(k) = (k − 1 choose r − 1) pʳ(1 − p)^{k−r},   k = r, r + 1, ...,

where 0 < p < 1 and r is a positive integer. Deduce the mean and variance.

5. For the simple random walk, show that the probability p₀(2n) that the particle returns to the origin at the (2n)th step satisfies p₀(2n) ∼ (4pq)ⁿ/√(πn), and use this to prove that the walk is persistent if and only if p = ½. You will need Stirling's formula: n! ∼ n^{n+½}e⁻ⁿ√(2π).

6. A symmetric random walk in two dimensions is defined to be a sequence of points {(Xₙ, Yₙ) : n ≥ 0} which evolves in the following way: if (Xₙ, Yₙ) = (x, y) then (Xₙ₊₁, Yₙ₊₁) is one of the four points (x ± 1, y), (x, y ± 1), each being picked with equal probability ¼. If (X₀, Y₀) = (0, 0):
(a) show that E(Xₙ² + Yₙ²) = n,
(b) find the probability p₀(2n) that the particle is at the origin after the (2n)th step, and deduce that the probability of ever returning to the origin is 1.

7. Consider the one-dimensional random walk {Sₙ} given by

Sₙ₊₁ = Sₙ + 2 with probability p,   Sₙ₊₁ = Sₙ − 1 with probability q = 1 − p,

where 0 < p < 1. What is the probability of ever reaching the origin starting from S₀ = a where a > 0?

8. Let X and Y be independent variables taking values in the positive integers such that

P(X = k | X + Y = n) = (n choose k) pᵏ(1 − p)^{n−k}

for some p and all 0 ≤ k ≤ n. Show that X and Y have Poisson distributions.

9. In a branching process whose family sizes have mean μ and variance σ², find the variance of Zₙ, the size of the nth generation, given that Z₀ = 1.

10. Waldegrave's problem. A group {A₁, A₂, ..., Aᵣ} of r (> 2) people play the following game. A₁ and A₂ wager on the toss of a fair coin. The loser puts £1 in the pool, the winner goes on to play A₃. In the next wager, the loser puts £1 in the pool, the winner goes on to play A₄, and so on. The winner of the (r − 1)th wager goes on to play A₁, and the cycle recommences. The first person to beat all the others in sequence takes the pool.
(a) Find the probability generating function of the duration of the game.
(b) Find an expression for the probability that Aₖ wins.
(c) Find an expression for the expected size of the pool at the end of the game, given that Aₖ wins.
(d) Find an expression for the probability that the pool is intact after the nth spin of the coin.
This problem was discussed by Montmort, Bernoulli, de Moivre, Laplace, and others.

11. Show that the generating function Hn of the total number of individuals in the first n generations of a branching process satisfies Hn (s ) = sG (Hn_ 1 (s» . 12. Show that the number Zn of individuals i n the nth generation of a branching process satisfies JP'(Zn > N I Zm = O) .::s Gm (O)N for n < m . 13. (a) A hen lays N eggs where N is Poisson with parameter A. The weight o f the nth egg is Wn , where WI , W2 , ' " are independent identically distributed variables with common probability generating function G(s) . Show that the generating function Gw of the total weight W = 2:�1 Wj is given by Gw(s) = exp{ -A + AG (S ) } . W is said to have a compound Poisson distribution. Show further that, for any positive integral value of n, Gw (s ) l /n is the probability generating function of some random variable ; W (or its distribution) is said to be infinitely divisible in this regard.


(b) Show that if R(s) is the probability generating function of some infinitely divisible distribution on the non-negative integers then R(s) = exp{−λ + λG(s)} for some λ (> 0) and some probability generating function G(s).

14. The distribution of a random variable X is called infinitely divisible if, for all positive integers n, there exists a sequence Y₁⁽ⁿ⁾, Y₂⁽ⁿ⁾, ..., Yₙ⁽ⁿ⁾ of independent identically distributed random variables such that X and Y₁⁽ⁿ⁾ + Y₂⁽ⁿ⁾ + ... + Yₙ⁽ⁿ⁾ have the same distribution.
(a) Show that the normal, Poisson, and gamma distributions are infinitely divisible.
(b) Show that the characteristic function φ of an infinitely divisible distribution has no real zeros, in that φ(t) ≠ 0 for all real t.

15. Let X₁, X₂, ... be independent variables each taking the values 0 or 1 with probabilities 1 − p and p, where 0 < p < 1. Let N be a random variable taking values in the positive integers, independent of the Xⱼ, and write S = X₁ + X₂ + ... + X_N. Write down the conditional generating function of N given that S = N, in terms of the probability generating function G of N. Show that N has a Poisson distribution if and only if E(x^N)^p = E(x^N | S = N) for all p and x.

16. If X and Y have joint probability generating function

where p₁ + p₂ ≤ 1,

find the marginal mass functions of X and Y, and the mass function of X + Y. Find also the conditional probability generating function G_{X|Y}(s | y) = E(s^X | Y = y) of X given that Y = y. The pair X, Y is said to have the bivariate negative binomial distribution.

17. If X and Y have joint probability generating function

G_{X,Y}(s, t) = exp{ α(s − 1) + β(t − 1) + γ(st − 1) },

find the marginal distributions of X, Y, and the distribution of X + Y, showing that X and Y have the Poisson distribution, but that X + Y does not unless γ = 0.

18. Define I(a, b) = ∫₀^∞ exp(−a²x² − b²x⁻²) dx for a, b > 0. Show that
(a) I(a, b) = a⁻¹I(1, ab),
(b) ∂I/∂b = −2I(1, ab),
(c) I(a, b) = √π e^{−2ab}/(2a).
(d) If X has density function (d/√x) e^{−c/x − gx} for x > 0, then

E(e^{−tX}) = d √{π/(g + t)} exp(−2√{c(g + t)}),   t > −g.

(e) If X has density function (2πx³)^{−1/2} e^{−1/(2x)} for x > 0, then X has moment generating function given by E(e^{−tX}) = exp{−√(2t)}, t ≥ 0. [Note that E(Xⁿ) = ∞ for n ≥ 1.]

19. Let X, Y, Z be independent N(0, 1) variables. Use characteristic functions and moment generating functions (Laplace transforms) to find the distributions of
(a) U = X/Y,
(b) V = X⁻²,
(c) W = XYZ/√(X²Y² + Y²Z² + Z²X²).


20. Let X have density function f and characteristic function φ, and suppose that ∫_{−∞}^{∞} |φ(t)| dt < ∞. Deduce that

f(x) = (1/2π) ∫_{−∞}^{∞} e^{−itx} φ(t) dt.

21. Conditioned branching process. Consider a branching process whose family sizes have the geometric mass function f(k) = qpᵏ, k ≥ 0, where μ = p/q > 1. Let Zₙ be the size of the nth generation, and assume Z₀ = 1. Show that the conditional distribution of Zₙ/μⁿ, given that Zₙ > 0, converges as n → ∞ to the exponential distribution with parameter 1 − μ⁻¹.

22. A random variable X is called symmetric if X and −X are identically distributed. Show that X is symmetric if and only if the imaginary part of its characteristic function is identically zero.

23. Let X and Y be independent identically distributed variables with means 0 and variances 1. Let φ(t) be their common characteristic function, and suppose that X + Y and X − Y are independent. Show that φ(2t) = φ(t)³φ(−t), and deduce that X and Y are N(0, 1) variables.

More generally, suppose that X and Y are independent and identically distributed with means 0 and variances 1, and furthermore that E(X − Y | X + Y) = 0 and var(X − Y | X + Y) = 2. Deduce that φ(s)² = φ′(s)² − φ(s)φ″(s), and hence that X and Y are independent N(0, 1) variables.

24. Show that the average Z = n⁻¹Σᵢ₌₁ⁿ Xᵢ of n independent Cauchy variables has the Cauchy distribution too. Why does this not violate the law of large numbers?

25. Let X and Y be independent random variables each having the Cauchy density function f(x) = {π(1 + x²)}⁻¹, and let Z = ½(X + Y).
(a) Show by using characteristic functions that Z has the Cauchy distribution also.
(b) Show by the convolution formula that Z has the Cauchy density function. You may find it helpful to check first that

f(x)f(y − x) = {f(x) + f(y − x)}/{π(4 + y²)} + g(y){x f(x) + (y − x) f(y − x)},

where g(y) = 2/{πy(4 + y²)}.

26. Let X₁, X₂, ..., Xₙ be independent variables with characteristic functions φ₁, φ₂, ..., φₙ. Describe random variables which have the following characteristic functions:

(a) φ₁(t)φ₂(t)⋯φₙ(t), (b) |φ₁(t)|², (c) Σⱼ pⱼφⱼ(t) where pⱼ ≥ 0 and Σⱼ pⱼ = 1, (d) (2 − φ₁(t))⁻¹, (e) ∫₀^∞ φ₁(ut)e⁻ᵘ du.

27. Find the characteristic functions corresponding to the following density functions on (−∞, ∞):
(a) 1/cosh(πx), (b) (1 − cos x)/(πx²), (c) exp(−x − e⁻ˣ), (d) ½e^{−|x|}.

Show that the mean of the 'extreme-value distribution' in part (c) is Euler's constant γ.

28. Which of the following are characteristic functions: (a) φ(t) = 1 − |t| if |t| ≤ 1, φ(t) = 0 otherwise, (b) φ(t) = (1 + t⁴)⁻¹, (c) φ(t) = exp(−t⁴), (d) φ(t) = cos t, (e) φ(t) = 2(1 − cos t)/t².

29. Show that the characteristic function φ of a random variable X satisfies |1 − φ(t)| ≤ E|tX|.

30. Suppose X and Y have joint characteristic function φ(s, t). Show that, subject to the appropriate conditions of differentiability,

i^{m+n} E(XᵐYⁿ) = ∂^{m+n}φ(s, t)/∂sᵐ∂tⁿ evaluated at s = t = 0,

for any positive integers m and n.

31. If X has distribution function F and characteristic function φ, show that for t > 0

(a)

(b)

32. Let X₁, X₂, ... be independent variables which are uniformly distributed on [0, 1]. Let Mₙ = max{X₁, X₂, ..., Xₙ} and show that n(1 − Mₙ) converges in distribution to X, where X is exponentially distributed with parameter 1. You need not use characteristic functions.

33. If X is either (a) Poisson with parameter λ, or (b) Γ(1, λ), show that the distribution of Y_λ = (X − EX)/√(var X) approaches the N(0, 1) distribution as λ → ∞.
(c) Show that

e⁻ⁿ( 1 + n + n²/2! + ... + nⁿ/n! ) → ½   as n → ∞.

34. Coupon collecting. Recall that you regularly buy quantities of some ineffably dull commodity. To attract your attention, the manufacturers add to each packet a small object which is also dull, and in addition useless, but there are n different types. Assume that each packet is equally likely to contain any one of the different types, as usual. Let Tₙ be the number of packets bought before you acquire a complete set of n objects. Show that n⁻¹(Tₙ − n log n) converges in distribution to T, where T is a random variable with distribution function P(T ≤ x) = exp(−e⁻ˣ), −∞ < x < ∞.

35. Find a sequence (φ_n) of characteristic functions with the property that the limit given by φ(t) = lim_{n→∞} φ_n(t) exists for all t, but such that φ is not itself a characteristic function.

36. Use generating functions to show that it is not possible to load two dice in such a way that the sum of the values which they show is equally likely to take any value between 2 and 12. Compare with your method for Problem (2.7.12).

37. A biased coin is tossed N times, where N is a random variable which is Poisson distributed with parameter λ. Prove that the total number of heads shown is independent of the total number of tails. Show conversely that if the numbers of heads and tails are independent, then N has the Poisson distribution.

38. A binary tree is a tree (as in the section on branching processes) in which each node has exactly two descendants. Suppose that each node of the tree is coloured black with probability p, and white otherwise, independently of all other nodes. For any path π containing n nodes beginning at the root of the tree, let B(π) be the number of black nodes in π, and let X_n(k) be the number of such paths π for which B(π) ≥ k. Show that there exists β_c such that

    E{X_n(βn)} → 0 if β > β_c,   E{X_n(βn)} → ∞ if β < β_c,

and show how to determine the value β_c. Prove that

    P(X_n(βn) ≥ 1) → 0 if β > β_c,   P(X_n(βn) ≥ 1) → 1 if β < β_c.

39. Use the continuity theorem (5.9.5) to show that, as n → ∞, (a) if X_n is bin(n, λ/n) then the distribution of X_n converges to a Poisson distribution,


(b) if Y_n is geometric with parameter p = λ/n then the distribution of Y_n/n converges to an exponential distribution.
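
Both limits are easy to inspect numerically. The sketch below (Python; λ, n and the evaluation points are arbitrary, and the geometric law is taken on {1, 2, . . . } for the tail comparison) compares the bin(n, λ/n) mass function with the Poisson one, and the tail of Y_n/n with the exponential tail.

    from math import comb, exp, factorial

    lam, n = 2.0, 1000                     # arbitrary illustrative choices
    p = lam / n
    for k in range(5):
        binom = comb(n, k) * p**k * (1 - p)**(n - k)
        poisson = exp(-lam) * lam**k / factorial(k)
        print(k, round(binom, 5), round(poisson, 5))

    # P(Y_n/n > x) for a geometric variable on {1,2,...} versus exp(-lam*x)
    for x in (0.5, 1.0, 2.0):
        print(x, round((1 - p) ** int(n * x), 5), round(exp(-lam * x), 5))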

40. Let X_1, X_2, . . . be independent random variables with zero means and such that E|X_j³| < ∞ for all j. Show that S_n = X_1 + X_2 + · · · + X_n satisfies S_n/√var(S_n) → N(0, 1) in distribution as n → ∞ if

    σ(n)^{−3} Σ_{j=1}^n E|X_j³| → 0   as n → ∞, where σ(n)² = var(S_n).

The following steps may be useful. Let σ_j² = var(X_j), σ(n)² = var(S_n), ρ_j = E|X_j³|, and let φ_j and ψ_n be the characteristic functions of X_j and S_n/σ(n) respectively.
(i) Use Taylor's theorem to show that |φ_j(t) − 1| ≤ 2t²σ_j² and |φ_j(t) − 1 + ½σ_j²t²| ≤ |t|³ρ_j for j ≥ 1.
(ii) Show that |log(1 + z) − z| ≤ |z|² if |z| ≤ ½, where the logarithm has its principal value.
(iii) Show that σ_j³ ≤ ρ_j, and deduce from the hypothesis that max_{1≤j≤n} σ_j/σ(n) → 0 as n → ∞, implying that max_{1≤j≤n} |φ_j(t/σ(n)) − 1| → 0.
(iv) Deduce an upper bound for |log φ_j(t/σ(n)) − ½t²σ_j²/σ(n)²|, and sum to obtain that log ψ_n(t) → −½t².

41. Let X_1, X_2, . . . be independent variables each taking values +1 or −1 with probabilities ½ and ½. Show that

    √(3/n³) Σ_{k=1}^n k X_k → N(0, 1) in distribution as n → ∞.

42. Normal sample. Let X_1, X_2, . . . , X_n be independent N(μ, σ²) random variables. Define X̄ = n^{−1} Σ_{i=1}^n X_i and Z_i = X_i − X̄. Find the joint characteristic function of X̄, Z_1, Z_2, . . . , Z_n, and hence prove that X̄ and S² = (n − 1)^{−1} Σ_{i=1}^n (X_i − X̄)² are independent.
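
As a sanity check on the independence of X̄ and S², one can estimate their sample correlation by simulation; the following sketch (Python, arbitrary parameter choices) does so and should print a value close to 0.

    import random, statistics

    random.seed(1)
    mu, sigma, n, reps = 1.0, 2.0, 10, 5000      # arbitrary illustrative choices
    means, variances = [], []
    for _ in range(reps):
        xs = [random.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(xs))
        variances.append(statistics.variance(xs))   # the (n-1)-denominator S^2
    print(statistics.correlation(means, variances))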

43. Log-normal distribution. Let X be N(0, 1), and let Y = e^X; Y is said to have the log-normal distribution. Show that the density function of Y is

    f(x) = (1/(x√(2π))) exp{−½(log x)²},   x > 0.

For |a| ≤ 1, define f_a(x) = {1 + a sin(2π log x)} f(x). Show that f_a is a density function with finite moments of all (positive) orders, none of which depends on the value of a. The family {f_a : |a| ≤ 1} contains density functions which are not specified by their moments.

44. Consider a random walk whose steps are independent and identically distributed integer-valued random variables with non-zero mean. Prove that the walk is transient.

45. Recurrent events. Let {X_r : r ≥ 1} be the integer-valued identically distributed intervals between the times of a recurrent event process. Let L be the earliest time by which there has been an interval of length a containing no occurrence time. Show that, for integral a,

46. A biased coin shows heads with probability p (= 1 − q). It is flipped repeatedly until the first time W_n by which it has shown n consecutive heads. Let E(s^{W_n}) = G_n(s). Show that G_n = psG_{n−1}/(1 − qsG_{n−1}), and deduce that

    G_n(s) = p^n s^n (1 − ps) / (1 − s + q p^n s^{n+1}).

47. In n flips of a biased coin which shows heads with probability p (= 1 − q), let L_n be the length of the longest run of heads. Show that, for r ≥ 1,

48. The random process {X_n : n ≥ 1} decays geometrically fast in that, in the absence of external input, X_{n+1} = ½X_n. However, at any time n the process is also increased by Y_n with probability ½, where {Y_n : n ≥ 1} is a sequence of independent exponential random variables with parameter λ. Find the limiting distribution of X_n as n → ∞.

49. Let G(s) = E(s^X) where X ≥ 0. Show that E{(X + 1)^{−1}} = ∫_0^1 G(s) ds, and evaluate this when X is (a) Poisson with parameter λ, (b) geometric with parameter p, (c) binomial bin(n, p), (d) logarithmic with parameter p (see Exercise (5.2.3)). Is there a non-trivial choice for the distribution of X such that E{(X + 1)^{−1}} = {E(X + 1)}^{−1}?

50. Find the density function of Σ_{r=1}^N X_r, where {X_r : r ≥ 1} are independent and exponentially distributed with parameter λ, and N is geometric with parameter p and independent of the X_r.

51. Let X have finite non-zero variance and characteristic function φ(t). Show that

is a characteristic function, and find the corresponding distribution.

52. Let X and Y have joint density function

    |x| < 1,  |y| < 1.

Show that φ_X(t)φ_Y(t) = φ_{X+Y}(t), and that X and Y are dependent.


6

Markov chains

6.1 Exercises. Markov processes

1. Show that any sequence of independent random variables taking values in the countable set S is a Markov chain. Under what condition is this chain homogeneous?

2. A die is rolled repeatedly. Which of the following are Markov chains? For those that are, supply the transition matrix. (a) The largest number X_n shown up to the nth roll. (b) The number N_n of sixes in n rolls. (c) At time r, the time C_r since the most recent six. (d) At time r, the time B_r until the next six.

3. Let {S_n : n ≥ 0} be a simple random walk with S_0 = 0, and show that X_n = |S_n| defines a Markov chain; find the transition probabilities of this chain. Let M_n = max{S_k : 0 ≤ k ≤ n}, and show that Y_n = M_n − S_n defines a Markov chain. What happens if S_0 ≠ 0?

4. Let X be a Markov chain and let {n_r : r ≥ 0} be an unbounded increasing sequence of positive integers. Show that Y_r = X_{n_r} constitutes a (possibly inhomogeneous) Markov chain. Find the transition matrix of Y when n_r = 2r and X is: (a) simple random walk, and (b) a branching process.

5. Let X be a Markov chain on S, and let f : S^n → {0, 1}. Show that the distribution of X_n, X_{n+1}, . . . , conditional on {f(X_1, . . . , X_n) = 1} ∩ {X_n = i}, is identical to the distribution of X_n, X_{n+1}, . . . conditional on {X_n = i}.

6. Strong Markov property. Let X be a Markov chain on S, and let T be a random variable taking values in {0, 1, 2, . . . } with the property that the indicator function I_{{T=n}}, of the event that T = n, is a function of the variables X_1, X_2, . . . , X_n. Such a random variable T is called a stopping time, and the above definition requires that it is decidable whether or not T = n with a knowledge only of the past and present, X_0, X_1, . . . , X_n, and with no further information about the future.

Show that

    P(X_{T+m} = j | X_k = x_k for 0 ≤ k < T, X_T = i) = P(X_{T+m} = j | X_T = i)

for m ≥ 0, i, j ∈ S, and all sequences (x_k) of states.

7. Let X be a Markov chain with state space S, and suppose that h : S → T is one-one. Show that Y_n = h(X_n) defines a Markov chain on T. Must this be so if h is not one-one?

8. Let X and Y be Markov chains on the set ℤ of integers. Is the sequence Z_n = X_n + Y_n necessarily a Markov chain?

9. Let X be a Markov chain. Which of the following are Markov chains?


(a) X_{m+r} for r ≥ 0.
(b) X_{2m} for m ≥ 0.
(c) The sequence of pairs (X_n, X_{n+1}) for n ≥ 0.

10. Let X be a Markov chain. Show that, for 1 < r < n,

    P(X_r = k | X_i = x_i for i = 1, 2, . . . , r − 1, r + 1, . . . , n) = P(X_r = k | X_{r−1} = x_{r−1}, X_{r+1} = x_{r+1}).

11. Let {X_n : n ≥ 1} be independent identically distributed integer-valued random variables. Let S_n = Σ_{r=1}^n X_r, with S_0 = 0, Y_n = X_n + X_{n−1} with X_0 = 0, and Z_n = Σ_{r=0}^n S_r. Which of the following constitute Markov chains: (a) S_n, (b) Y_n, (c) Z_n, (d) the sequence of pairs (S_n, Z_n)?

12. A stochastic matrix P is called doubly stochastic if Σ_i p_ij = 1 for all j. It is called sub-stochastic if Σ_i p_ij ≤ 1 for all j. Show that, if P is stochastic (respectively, doubly stochastic, sub-stochastic), then Pⁿ is stochastic (respectively, doubly stochastic, sub-stochastic) for all n.

6.2 Exercises. Classification of states

1. Last exits. Let l_ij(n) = P(X_n = j, X_k ≠ i for 1 ≤ k < n | X_0 = i), the probability that the chain passes from i to j in n steps without revisiting i. Writing

    L_ij(s) = Σ_{n=1}^∞ s^n l_ij(n),

show that P_ij(s) = P_ii(s)L_ij(s) if i ≠ j. Deduce that the first passage times and last exit times have the same distribution for any Markov chain for which P_ii(s) = P_jj(s) for all i and j. Give an example of such a chain.

2. Let X be a Markov chain containing an absorbing state s with which all other states i communicate, in the sense that p_is(n) > 0 for some n = n(i). Show that all states other than s are transient.

3. Show that a state i is persistent if and only if the mean number of visits of the chain to i, having started at i, is infinite.

4. Visits. Let V_j = |{n ≥ 1 : X_n = j}| be the number of visits of the Markov chain X to j, and define η_ij = P(V_j = ∞ | X_0 = i). Show that:
(a) η_ii = 1 if i is persistent, and η_ii = 0 if i is transient,
(b) η_ij = P(T_j < ∞ | X_0 = i) if j is persistent, and η_ij = 0 if j is transient, where T_j = min{n ≥ 1 : X_n = j}.

5. Symmetry. The distinct pair i, j of states of a Markov chain is called symmetric if

    P(T_j < T_i | X_0 = i) = P(T_i < T_j | X_0 = j),

where T_i = min{n ≥ 1 : X_n = i}. Show that, if X_0 = i and the pair i, j is symmetric, the expected number of visits to j before the chain revisits i is 1.


6.3 Exercises. Classification of chains

1. Let X be a Markov chain on {0, 1, 2, . . . } with transition matrix given by p_{0j} = a_j for j ≥ 0, p_{ii} = r and p_{i,i−1} = 1 − r for i ≥ 1. Classify the states of the chain, and find their mean recurrence times.

2. Determine whether or not the random walk on the integers having transition probabilities p_{i,i+2} = p, p_{i,i−1} = 1 − p, for all i, is persistent.

3. Classify the states of the Markov chains with transition matrices

    (a)  ( 1−2p   2p     0   )          (b)  (  0     p      0    1−p )
         (  p    1−2p    p   )               ( 1−p    0      p     0  )
         (  0     2p    1−2p )               (  0    1−p     0     p  )
                                             (  p     0     1−p    0  )

In each case, calculate p_ij(n) and the mean recurrence times of the states.

4. A particle performs a random walk on the vertices of a cube. At each step it remains where it is with probability ¼, or moves to one of its neighbouring vertices each having probability ¼. Let v and w be two diametrically opposite vertices. If the walk starts at v, find:

(a) the mean number of steps until its first return to v ,

(b) the mean number of steps until its first visit to w ,

(c) the mean number of visits to w before its first return to v .
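
Answers such as (a) are easy to check by simulation. In the sketch below (Python), vertices of the cube are coded as 3-bit integers and a step either stays put or flips one coordinate, each with probability ¼, in line with the reading of the exercise adopted above; the mean return time should be close to 8.

    import random

    def step(v):
        # stay with probability 1/4, otherwise flip one of the 3 coordinates
        r = random.randrange(4)
        return v if r == 3 else v ^ (1 << r)

    def mean_return_time(reps=20000):
        total = 0
        for _ in range(reps):
            v, t = step(0), 1
            while v != 0:
                v, t = step(v), t + 1
            total += t
        return total / reps

    print(mean_return_time())    # close to 8 = 1/pi_v for the uniform stationary law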

5. Visits. With the notation of Exercise (6.2.4), show that
(a) if i ↔ j and i is persistent, then η_ij = η_ji = 1,
(b) η_ij = 1 if and only if P(T_j < ∞ | X_0 = i) = P(T_j < ∞ | X_0 = j) = 1.

6. First passages. Let T_A = min{n ≥ 0 : X_n ∈ A}, where X is a Markov chain and A is a subset of the state space S, and let η_j = P(T_A < ∞ | X_0 = j). Show that

    η_j = 1 if j ∈ A,    η_j = Σ_{k∈S} p_jk η_k if j ∉ A.

Show further that if x = (x_j : j ∈ S) is any non-negative solution of these equations then x_j ≥ η_j for all j.

7. Mean first passage. In the notation of Exercise (6), let ρ_j = E(T_A | X_0 = j). Show that

    ρ_j = 0 if j ∈ A,    ρ_j = 1 + Σ_{k∈S} p_jk ρ_k if j ∉ A,

and that if x = (x_j : j ∈ S) is any non-negative solution of these equations then x_j ≥ ρ_j for all j.

8. Let X be an irreducible Markov chain and let A be a subset of the state space. Let S_r and T_r be the successive times at which the chain enters A and visits A respectively. Are the sequences {X_{S_r} : r ≥ 1}, {X_{T_r} : r ≥ 1} Markov chains? What can be said about the times at which the chain exits A?


9. (a) Show that for each pair i, j of states of an irreducible aperiodic chain, there exists N = N(i, j) such that p_ij(r) > 0 for all r ≥ N.
(b) Show that there exists a function f such that, if P is the transition matrix of an irreducible aperiodic Markov chain with n states, then p_ij(r) > 0 for all states i, j, and all r ≥ f(n).
(c) Show further that f(4) ≤ 6 and f(n) ≤ (n − 1)(n − 2).
[Hint: The postage stamp lemma asserts that, for a, b coprime, the smallest n such that all integers strictly exceeding n have the form αa + βb for some integers α, β ≥ 0 is (a − 1)(b − 1).]

10. An urn initially contains n green balls and n + 2 red balls. A ball is picked at random: if it is green then a red ball is also removed and both are discarded; if it is red then it is replaced together with an extra red and an extra green ball. This is repeated until there are no green balls in the urn. Show that the probability the process terminates is 1/(n + 1).

Now reverse the rules: if the ball is green, it is replaced together with an extra green and an extra red ball; if it is red it is discarded along with a green ball. Show that the expected number of iterations until no green balls remain is Σ_{j=1}^n (2j + 1) = n(n + 2). [Thus, a minor perturbation of a simple symmetric random walk can be non-null persistent, whereas the original is null persistent.]

6.4 Exercises. Stationary distributions and the limit theorem

1. The proof copy of a book is read by an infinite sequence of editors checking for mistakes. Each mistake is detected with probability P at each reading; between readings the printer corrects the detected mistakes but introduces a random number of new errors (errors may be introduced even if no mistakes were detected) . Assuming as much independence as usual, and that the numbers of new errors after different readings are identically distributed, find an expression for the probability generating function of the stationary distribution of the number Xn of errors after the nth editor-printer cycle, whenever this exists . Find it explicitly when the printer introduces a Poisson-distributed number of errors at each stage.

2. Do the appropriate parts of Exercises (6.3 . 1 )-(6.3 .4) again, making use of the new techniques at your disposal.

3. Dams. Let X n be the amount of water in a reservoir at noon on day n. During the 24 hour period beginning at this time, a quantity Yn of water flows into the reservoir, and just before noon on each day exactly one unit of water is removed (if this amount can be found). The maximum capacity of the reservoir is K, and excessive inflows are spilled and lost. Assume that the Yn are independent and identically distributed random variables and that, by rounding off to some laughably small unit of volume, all numbers in this exercise are non-negative integers. Show that (Xn ) is a Markov chain, and find its transition matrix and an expression for its stationary distribution in terms of the probability generating function G of the Yn .

Find the stationary distribution when Y has probability generating function G(s) = p(1 − qs)^{−1}.

4. Show by example that chains which are not irreducible may have many different stationary distributions.

5. Diagonal selection. Let (x_i(n) : i, n ≥ 1) be a bounded collection of real numbers. Show that there exists an increasing sequence n_1, n_2, . . . of positive integers such that lim_{r→∞} x_i(n_r) exists for all i. Use this result to prove that, for an irreducible Markov chain, if it is not the case that p_ij(n) → 0 as n → ∞ for all i and j, then there exists a sequence (n_r : r ≥ 1) and a vector α (≠ 0) such that p_ij(n_r) → α_j as r → ∞ for all i and j.

6. Random walk o n a graph. A particle performs a random walk on the vertex set of a connected graph G, which for simplicity we assume to have neither loops nor multiple edges. At each stage it moves to a neighbour of its current position, each such neighbour being chosen with equal probability.


If G has η (< ∞) edges, show that the stationary distribution is given by π_v = d_v/(2η), where d_v is the degree of vertex v.
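
The degree formula is easy to verify numerically for a small graph. The sketch below (Python, with an arbitrary four-vertex graph and exact rational arithmetic) checks that d_v/(2η) is invariant under one step of the walk.

    from fractions import Fraction

    graph = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}      # arbitrary small graph
    edges = sum(len(nbrs) for nbrs in graph.values()) // 2

    pi = {v: Fraction(len(graph[v]), 2 * edges) for v in graph}
    new = {v: sum(pi[u] * Fraction(1, len(graph[u])) for u in graph if v in graph[u])
           for v in graph}
    print(pi == new)          # True: d_v/(2*eta) is invariant for the walk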

7. Show that a random walk on the infinite binary tree is transient.

8. At each time n = 0, 1, 2, . . . a number Y_n of particles enters a chamber, where {Y_n : n ≥ 0} are independent and Poisson distributed with parameter λ. Lifetimes of particles are independent and geometrically distributed with parameter p. Let X_n be the number of particles in the chamber at time n. Show that X is a Markov chain, and find its stationary distribution.

9. A random sequence of convex polygons is generated by picking two edges of the current polygon at random, joining their midpoints, and picking one of the two resulting smaller polygons at random to be the next in the sequence. Let Xn + 3 be the number of edges of the nth polygon thus constructed. Find lE(Xn) in terms of Xo , and find the stationary distribution of the Markov chain X.

10. Let s be a state of an irreducible Markov chain on the non-negative integers. Show that the chain is persistent if there exists a solution y to the equations y_i ≥ Σ_{j: j≠s} p_ij y_j, i ≠ s, satisfying y_i → ∞.

11. Bow ties. A particle performs a random walk on a bow tie ABCDE drawn beneath on the left, where C is the knot. From any vertex its next step is equally likely to be to any neighbouring vertex. Initially it is at A. Find the expected value of: (a) the time of first return to A, (b) the number of visits to D before returning to A, (c) the number of visits to C before returning to A, (d) the time of first return to A, given no prior visit by the particle to E, (e) the number of visits to D before returning to A, given no prior visit by the particle to E.

[Figure: the bow-tie graph ABCDE with knot C (left), and the graph for Exercise 12 (right).]

12. A particle starts at A and executes a symmetric random walk on the graph drawn above on the right. Find the expected number of visits to B before it returns to A.

6.5 Exercises. Reversibility

1. A random walk on the set {0, 1, 2, . . . , b} has transition matrix given by p_00 = 1 − λ_0, p_bb = 1 − μ_b, p_{i,i+1} = λ_i and p_{i+1,i} = μ_{i+1} for 0 ≤ i < b, where 0 < λ_i, μ_i < 1 for all i, and λ_i + μ_i = 1 for 1 ≤ i < b. Show that this process is reversible in equilibrium.

2. Kolmogorov's criterion for reversibility. Let X be an irreducible non-null persistent aperiodic Markov chain. Show that X is reversible in equilibrium if and only if

    p_{j_1 j_2} p_{j_2 j_3} · · · p_{j_{n−1} j_n} p_{j_n j_1} = p_{j_1 j_n} p_{j_n j_{n−1}} · · · p_{j_2 j_1}

for all n and all finite sequences j_1, j_2, . . . , j_n of states.

3. Let X be a reversible Markov chain, and let C be a non-empty subset of the state space S. Define the Markov chain Y on S by the transition matrix Q = (q_ij) where

    q_ij = βp_ij if i ∈ C and j ∉ C,    q_ij = p_ij otherwise,


for i "# j, and where f3 is a constant satisfying 0 < f3 < 1 . The diagonal tenns qjj are arranged so that Q is a stochastic matrix. Show that Y is reversible in equilibrium, and find its stationary distribution. Describe the situation in the limit as f3 -J.. o.

4. Can a reversible chain be periodic?

5. Ehrenfest dog-flea model. The dog-flea model of Example (6.5.5) is a Markov chain X on the state space {0, 1, . . . , m} with transition probabilities

    p_{i,i+1} = 1 − i/m,   p_{i,i−1} = i/m,   for 0 ≤ i ≤ m.

Show that, if X_0 = i,

    E(X_n − ½m) = (i − ½m)(1 − 2/m)ⁿ → 0   as n → ∞.
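
A short simulation (Python; the values of m, i, n and the number of repetitions are arbitrary) comparing the empirical value of E(X_n − ½m) with (i − ½m)(1 − 2/m)ⁿ might look like this.

    import random

    m, i, n, reps = 10, 9, 5, 20000           # arbitrary illustrative choices
    total = 0.0
    for _ in range(reps):
        x = i
        for _ in range(n):
            # move up with probability 1 - x/m, down with probability x/m
            x += 1 if random.random() < 1 - x / m else -1
        total += x - m / 2
    print(total / reps, (i - m / 2) * (1 - 2 / m) ** n)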

6. Which of the following (when stationary) are reversible Markov chains?

(a) The chain X = {X_n} having transition matrix P = ( 1−α  α ; β  1−β ) where α + β > 0.

(b) The chain Y = {Y_n} having transition matrix P = ( 0  p  1−p ; 1−p  0  p ; p  1−p  0 ) where 0 < p < 1.

(c) Z_n = (X_n, Y_n), where X_n and Y_n are independent and satisfy (a) and (b).

7. Let X_n, Y_n be independent simple random walks. Let Z_n be (X_n, Y_n) truncated to lie in the region X_n ≥ 0, Y_n ≥ 0, X_n + Y_n ≤ a where a is integral. Find the stationary distribution of Z_n.

8. Show that an irreducible Markov chain with a finite state space and transition matrix P is reversible in equilibrium if and only if P = DS for some symmetric matrix S and diagonal matrix D with strictly positive diagonal entries. Show further that for reversibility in equilibrium to hold, it is necessary but not sufficient that P has real eigenvalues.

9. Random walk on a graph. Let G be a finite connected graph with neither loops nor multiple edges, and let X be a random walk on G as in Exercise (6.4.6) . Show that X is reversible in equilibrium.

6.6 Exercises. Chains with finitely many states

The first two exercises provide proofs that a Markov chain with finitely many states has a stationary distribution.

1. The Markov–Kakutani theorem asserts that, for any convex compact subset C of ℝⁿ and any linear continuous mapping T of C into C, T has a fixed point (in the sense that T(x) = x for some x ∈ C). Use this to prove that a finite stochastic matrix has a non-negative non-zero left eigenvector corresponding to the eigenvalue 1.

2. Let T be an m × n matrix and let v ∈ ℝⁿ. Farkas's theorem asserts that exactly one of the following holds:
(i) there exists x ∈ ℝ^m such that x ≥ 0 and xT = v,
(ii) there exists y ∈ ℝⁿ such that yv′ < 0 and Ty′ ≥ 0.
Use this to prove that a finite stochastic matrix has a non-negative non-zero left eigenvector corresponding to the eigenvalue 1.

3. Arbitrage. Suppose you are betting on a race with m possible outcomes. There are n bookmakers, and a unit stake with the ith bookmaker yields t_ij if the jth outcome of the race occurs. A vector


x = (x_1, x_2, . . . , x_n), where x_r ∈ (−∞, ∞) is your stake with the rth bookmaker, is called a betting scheme. Show that exactly one of (a) and (b) holds:
(a) there exists a probability mass function p = (p_1, p_2, . . . , p_m) such that Σ_{j=1}^m t_ij p_j = 0 for all values of i,
(b) there exists a betting scheme x for which you surely win, that is, Σ_{i=1}^n x_i t_ij > 0 for all j.

4. Let X be a Markov chain with state space S = {1, 2, 3} and transition matrix

    P = ( 1−p   p    0  )
        (  0   1−p   p  )
        (  p    0   1−p )

where 0 < p < 1. Prove that

where a_{1n} + ωa_{2n} + ω²a_{3n} = (1 − p + pω)ⁿ, ω being a complex cube root of 1.

5. Let P be the transition matrix of a Markov chain with finite state space. Let I be the identity matrix, U the |S| × |S| matrix with all entries unity, and 1 the row |S|-vector with all entries unity. Let π be a non-negative vector with Σ_i π_i = 1. Show that πP = π if and only if π(I − P + U) = 1. Deduce that if P is irreducible then π = 1(I − P + U)^{−1}.

6. Chess. A chess piece performs a random walk on a chessboard; at each step it is equally likely to make any one of the available moves. What is the mean recurrence time of a corner square if the piece is a: (a) king? (b) queen? (c) bishop? (d) knight? (e) rook?
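
The identity π = 1(I − P + U)^{−1} of Exercise 5 is also a convenient computational recipe. The sketch below (Python, exact rational arithmetic, an arbitrary irreducible 3-state chain) solves π(I − P + U) = 1 by elimination.

    from fractions import Fraction

    F = Fraction
    P = [[F(1, 2), F(1, 2), F(0)],
         [F(1, 4), F(1, 2), F(1, 4)],
         [F(0), F(1, 3), F(2, 3)]]                 # arbitrary irreducible chain
    n = len(P)
    A = [[(1 if r == c else 0) - P[r][c] + 1 for c in range(n)] for r in range(n)]

    # Solve pi A = 1, i.e. A^T pi^T = 1^T, by Gauss-Jordan elimination.
    M = [[A[c][r] for c in range(n)] + [F(1)] for r in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    pi = [M[r][n] for r in range(n)]
    print(pi, sum(pi))        # e.g. [2/9, 4/9, 1/3] here, summing to 1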

7. Chess continued. A rook and a bishop perform independent symmetric random walks with synchronous steps on a 4 × 4 chessboard (16 squares). If they start together at a corner, show that the expected number of steps until they meet again at the same corner is 448/3.

8. Find the n-step transition probabilities Pij (n) for the chain X having transition matrix

1 :2 1 4 1 4 �2 ) . 12

6.7 Exercises. Branching processes revisited

1. Let Z_n be the size of the nth generation of a branching process with Z_0 = 1 and P(Z_1 = k) = 2^{−(k+1)} for k ≥ 0. Show directly that, as n → ∞, P(Z_n ≤ 2yn | Z_n > 0) → 1 − e^{−2y}, y > 0, in agreement with Theorem (6.7.8).

2. Let Z be a supercritical branching process with Z_0 = 1 and family-size generating function G. Assume that the probability η of extinction satisfies 0 < η < 1. Find a way of describing the process Z, conditioned on its ultimate extinction.

3. Let Z_n be the size of the nth generation of a branching process with Z_0 = 1 and P(Z_1 = k) = qp^k for k ≥ 0, where p + q = 1 and p > ½. Use your answer to Exercise (2) to show that, if we condition on the ultimate extinction of Z, then the process grows in the manner of a branching process with generation sizes Ẑ_n satisfying Ẑ_0 = 1 and P(Ẑ_1 = k) = pq^k for k ≥ 0.


4. (a) Show that E(X | X > 0) ≤ E(X²)/E(X) for any random variable X taking non-negative values.
(b) Let Z_n be the size of the nth generation of a branching process with Z_0 = 1 and P(Z_1 = k) = qp^k for k ≥ 0, where p > ½. Use part (a) to show that E(Z_n/μⁿ | Z_n > 0) ≤ 2p/(p − q), where μ = p/q.
(c) Show that, in the notation of part (b), E(Z_n/μⁿ | Z_n > 0) → p/(p − q) as n → ∞.

6.8 Exercises. Birth processes and the Poisson process

1. Superposition. Flies and wasps land on your dinner plate in the manner of independent Poisson processes with respective intensities λ and μ. Show that the arrivals of flying objects form a Poisson process with intensity λ + μ.

2. Thinning. Insects land in the soup in the manner of a Poisson process with intensity λ, and each such insect is green with probability p, independently of the colours of all other insects. Show that the arrivals of green insects form a Poisson process with intensity λp.
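
Both facts are easy to see empirically. The following sketch (Python; the rates, thinning probability and time horizon are arbitrary) merges two simulated Poisson streams and then thins the superposition, comparing the observed rates with λ + μ and (λ + μ)p.

    import random

    lam, mu, p, T = 2.0, 3.0, 0.4, 1000.0      # arbitrary illustrative choices

    def poisson_times(rate, horizon):
        t, times = 0.0, []
        while True:
            t += random.expovariate(rate)
            if t > horizon:
                return times
            times.append(t)

    flies, wasps = poisson_times(lam, T), poisson_times(mu, T)
    arrivals = sorted(flies + wasps)                        # superposition
    green = [t for t in arrivals if random.random() < p]    # thinning
    print(len(arrivals) / T, lam + mu)
    print(len(green) / T, (lam + mu) * p)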

3. Let T_n be the time of the nth arrival in a Poisson process N with intensity λ, and define the excess lifetime process E(t) = T_{N(t)+1} − t, being the time one must wait subsequent to t before the next arrival. Show by conditioning on T_1 that

    P(E(t) > x) = e^{−λ(t+x)} + ∫_0^t P(E(t − u) > x) λe^{−λu} du.

Solve this integral equation in order to find the distribution function of E(t). Explain your conclusion.

4. Let B be a simple birth process (6.8.11b) with B(0) = I; the birth rates are λ_n = nλ. Write down the forward system of equations for the process and deduce that

    P(B(t) = k) = ((k−1) choose (I−1)) e^{−Iλt} (1 − e^{−λt})^{k−I},   k ≥ I.

Show also that E(B(t)) = I e^{λt} and var(B(t)) = I e^{2λt}(1 − e^{−λt}).

5. Let B be a process of simple birth with immigration (6.8.11c) with parameters λ and ν, and with B(0) = 0; the birth rates are λ_n = nλ + ν. Write down the sequence of differential-difference equations for p_n(t) = P(B(t) = n). Without solving these equations, use them to show that m(t) = E(B(t)) satisfies m′(t) = λm(t) + ν, and solve for m(t).

6. Let N be a birth process with intensities λ_0, λ_1, . . . , and let N(0) = 0. Show that p_n(t) = P(N(t) = n) is given by

provided that λ_i ≠ λ_j whenever i ≠ j.

7. Suppose that the general birth process of the previous exercise is such that Σ_n λ_n^{−1} < ∞. Show that λ_n p_n(t) → f(t) as n → ∞ where f is the density function of the random variable T = sup{t : N(t) < ∞}. Deduce that E(N(t) | N(t) < ∞) is finite or infinite depending on the convergence or divergence of Σ_n nλ_n^{−1}.

Find the Laplace transform of f in closed form for the case when λ_n = (n + ½)², and deduce an expression for f.


6.9 Exercises. Continuous-time Markov chains

1. Let λ, μ > 0 and let X be a Markov chain on {1, 2} with generator

(a) Write down the forward equations and solve them for the transition probabilities p_ij(t), i, j = 1, 2.
(b) Calculate Gⁿ and hence find Σ_{n=0}^∞ (tⁿ/n!)Gⁿ. Compare your answer with that to part (a).
(c) Solve the equation πG = 0 in order to find the stationary distribution. Verify that p_ij(t) → π_j as t → ∞.

2. As a continuation of the previous exercise, find:
(a) P(X(t) = 2 | X(0) = 1, X(3t) = 1),
(b) P(X(t) = 2 | X(0) = 1, X(3t) = 1, X(4t) = 1).

3. Jobs arrive in a computer queue in the manner of a Poisson process with intensity λ. The central processor handles them one by one in the order of their arrival, and each has an exponentially distributed runtime with parameter μ, the runtimes of different jobs being independent of each other and of the arrival process. Let X(t) be the number of jobs in the system (either running or waiting) at time t, where X(0) = 0. Explain why X is a Markov chain, and write down its generator. Show that a stationary distribution exists if and only if λ < μ, and find it in this case.

4. Pasta property. Let X = {X(t) : t ≥ 0} be a Markov chain having stationary distribution π. We may sample X at the times of a Poisson process: let N be a Poisson process with intensity λ, independent of X, and define Y_n = X(T_n+), the value taken by X immediately after the epoch T_n of the nth arrival of N. Show that Y = {Y_n : n ≥ 0} is a discrete-time Markov chain with the same stationary distribution as X. (This exemplifies the 'Pasta' property: Poisson arrivals see time averages.) [The full assumption of the independence of N and X is not necessary for the conclusion. It suffices that {N(s) : s ≥ t} be independent of {X(s) : s ≤ t}, a property known as 'lack of anticipation'. It is not even necessary that X be Markov; the Pasta property holds for many suitable ergodic processes.]
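
For Exercise 3 the stationary distribution turns out to be geometric when λ < μ, namely π_n = (1 − ρ)ρⁿ with ρ = λ/μ; a quick numerical check that πG = 0 for the birth-death generator, truncated at an arbitrary level N, is sketched below (Python).

    lam, mu, N = 1.0, 2.0, 50                  # arbitrary illustrative choices
    rho = lam / mu
    pi = [(1 - rho) * rho**n for n in range(N + 1)]

    def piG(n):
        # (pi G)_n = lam*pi_{n-1} - (lam+mu)*pi_n + mu*pi_{n+1}, boundary at 0
        left = lam * pi[n - 1] if n > 0 else 0.0
        out = (lam + mu if n > 0 else lam) * pi[n]
        right = mu * pi[n + 1] if n < N else 0.0
        return left - out + right

    print(max(abs(piG(n)) for n in range(N)))   # essentially zero away from the cut-off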

5. Let X be a continuous-time Markov chain with generator G satisfying g_i = −g_ii > 0 for all i. Let H_A = inf{t ≥ 0 : X(t) ∈ A} be the hitting time of the set A of states, and let η_j = P(H_A < ∞ | X(0) = j) be the chance of ever reaching A from j. By using properties of the jump chain, which you may assume to be well behaved, show that Σ_k g_jk η_k = 0 for j ∉ A.

6. In continuation of the preceding exercise, let μ_j = E(H_A | X(0) = j). Show that the vector μ is the minimal non-negative solution of the equations

    μ_j = 0 if j ∈ A,    1 + Σ_{k∈S} g_jk μ_k = 0 if j ∉ A.

7. Let X be a continuous-time Markov chain with transition probabilities p_ij(t) and define F_i = inf{t > T_1 : X(t) = i} where T_1 is the time of the first jump of X. Show that, if g_ii ≠ 0, then P(F_i < ∞ | X(0) = i) = 1 if and only if i is persistent.

8. Let X be the simple symmetric random walk on the integers in continuous time, so that

    p_{i,i+1}(h) = p_{i,i−1}(h) = ½λh + o(h).

Show that the walk is persistent. Let T be the time spent visiting m during an excursion from 0. Find the distribution of T.


9. Let i be a transient state of a continuous-time Markov chain X with X (0) = i . Show that the total time spent in state i has an exponential distribution.

10. Let X be an asymmetric simple random walk in continuous time on the non-negative integers with retention at 0, so that

    p_{i,j}(h) = λh + o(h) if j = i + 1, i ≥ 0,
    p_{i,j}(h) = μh + o(h) if j = i − 1, i ≥ 1.

Suppose that X(0) = 0 and λ > μ. Show that the total time V_r spent in state r is exponentially distributed with parameter λ − μ.

Assume now that X(0) has some general distribution with probability generating function G. Find the expected amount of time spent at 0 in terms of G.

11. Let X = {X(t) : t ≥ 0} be a non-explosive irreducible Markov chain with generator G and unique stationary distribution π. The mean recurrence time μ_k is defined as follows. Suppose X(0) = k, and let U = inf{s : X(s) ≠ k}. Then μ_k = E(inf{t > U : X(t) = k}). Let Z = {Z_n : n ≥ 0} be the imbedded 'jump chain' given by Z_0 = X(0) and Z_n is the value of X just after its nth jump.
(a) Show that Z has stationary distribution π̂ satisfying

    π̂_j = π_j g_j / Σ_i π_i g_i,

where g_j = −g_jj, provided Σ_j π_j g_j < ∞. When is it the case that π̂ = π?
(b) Show that π_j = 1/(μ_j g_j) if μ_j < ∞, and that the mean recurrence time μ̂_k of the state k in the jump chain Z satisfies μ̂_k = μ_k Σ_j π_j g_j if the last sum is finite.

12. Let Z be an irreducible discrete-time Markov chain on a countably infinite state space S, having transition matrix H = (h_ij) satisfying h_ii = 0 for all states i, and with stationary distribution ν. Construct a continuous-time process X on S for which Z is the imbedded chain, such that X has no stationary distribution.

6.11 Exercises. Birth-death processes and imbedding

1. Describe the jump chain for a birth-death process with rates λ_n and μ_n.

2. Consider an immigration-death process X, being a birth-death process with birth rates λ_n = λ and death rates μ_n = nμ. Find the transition matrix of the jump chain Z, and show that it has as stationary distribution

    π_n = (ρⁿ e^{−ρ} / (2 n!)) (1 + n/ρ),

where ρ = λ/μ. Explain why this differs from the stationary distribution of X.

3. Consider the birth-death process X with λ_n = nλ and μ_n = nμ for all n ≥ 0. Suppose X(0) = 1 and let η(t) = P(X(t) = 0). Show that η satisfies the differential equation

    η′(t) = μ − (λ + μ)η(t) + λη(t)².

Hence find η(t), and calculate P(X(t) = 0 | X(u) = 0) for 0 < t < u.

4. For the birth-death process of the previous exercise with λ < μ, show that the distribution of X(t), conditional on the event {X(t) > 0}, converges as t → ∞ to a geometric distribution.


5. Let X be a birth-death process with λ_n = nλ and μ_n = nμ, and suppose X(0) = 1. Show that the time T at which X(t) first takes the value 0 satisfies

    E(T | T < ∞) = (1/λ) log(μ/(μ − λ)) if λ < μ,     (1/μ) log(λ/(λ − μ)) if λ > μ.

What happens when λ = μ?

6. Let X be the birth-death process of Exercise (5) with λ ≠ μ, and let V_r(t) be the total amount of time the process has spent in state r ≥ 0, up to time t. Find the distribution of V_1(∞) and the generating function Σ_r s^r E(V_r(t)). Hence show in two ways that E(V_1(∞)) = [max{λ, μ}]^{−1}. Show further that E(V_r(∞)) = λ^{r−1} r^{−1} [max{λ, μ}]^{−r}.

7. Repeat the calculations of Exercise (6) in the case λ = μ.

6.12 Exercises. Special processes

1. Customers entering a shop are served in the order of their arrival by the single server. They arrive in the manner of a Poisson process with intensity λ, and their service times are independent exponentially distributed random variables with parameter μ. By considering the jump chain, show that the expected duration of a busy period B of the server is (μ − λ)^{−1} when λ < μ. (The busy period runs from the moment a customer arrives to find the server free until the earliest subsequent time when the server is again free.)

2. Disasters. Immigrants arrive at the instants of a Poisson process of rate ν, and each independently founds a simple birth process of rate λ. At the instants of an independent Poisson process of rate δ, the population is annihilated. Find the probability generating function of the population X(t), given that X(0) = 0.

3. More disasters. In the framework of Exercise (2), suppose that each immigrant gives rise to a simple birth-death process of rates λ and μ. Show that the mean population size stays bounded if and only if δ > λ − μ.

4. The queue M/G/∞. (See Section 11.1.) An ftp server receives clients at the times of a Poisson process with parameter λ, beginning at time 0. The ith client remains connected for a length S_i of time, where the S_i are independent identically distributed random variables, independent of the process of arrivals. Assuming that the server has an infinite capacity, show that the number of clients being serviced at time t has the Poisson distribution with parameter λ∫_0^t [1 − G(x)] dx, where G is the common distribution function of the S_i.

6.13 Exercises. Spatial Poisson processes

1. In a certain town at time t = 0 there are no bears. Brown bears and grizzly bears arrive as independent Poisson processes B and G with respective intensities β and γ.
(a) Show that the first bear is brown with probability β/(β + γ).
(b) Find the probability that between two consecutive brown bears, there arrive exactly r grizzly bears.
(c) Given that B(1) = 1, find the expected value of the time at which the first bear arrived.

2. Campbell–Hardy theorem. Let Π be the points of a non-homogeneous Poisson process on ℝ^d with intensity function λ. Let S = Σ_{x∈Π} g(x) where g is a smooth function which we assume for


convenience to be non-negative. Show that E(S) = ∫_{ℝ^d} g(u)λ(u) du and var(S) = ∫_{ℝ^d} g(u)² λ(u) du, provided these integrals converge.

3. Let Π be a Poisson process with constant intensity λ on the surface of the sphere of ℝ³ with radius 1. Let P be the process given by the (X, Y) coordinates of the points projected on a plane passing through the centre of the sphere. Show that P is a Poisson process, and find its intensity function.

4. Repeat Exercise (3), when Π is a homogeneous Poisson process on the ball {(x_1, x_2, x_3) : x_1² + x_2² + x_3² ≤ 1}.

5. You stick pins in a Mercator projection of the Earth in the manner of a Poisson process with constant intensity λ. What is the intensity function of the corresponding process on the globe? What would be the intensity function on the map if you formed a Poisson process of constant intensity λ of meteorite strikes on the surface of the Earth?

6. Shocks. The rth point T_r of a Poisson process N of constant intensity λ on ℝ⁺ gives rise to an effect X_r e^{−α(t−T_r)} at time t ≥ T_r, where the X_r are independent and identically distributed with finite variance. Find the mean and variance of the total effect S(t) = Σ_{r=1}^{N(t)} X_r e^{−α(t−T_r)} in terms of the first two moments of the X_r, and calculate cov(S(s), S(t)).

What is the behaviour of the correlation ρ(S(s), S(t)) as s → ∞ with t − s fixed?

7. Let N be a non-homogeneous Poisson process on ℝ⁺ with intensity function λ. Find the joint density of the first two inter-event times, and deduce that they are not in general independent.

8. Competition lemma. Let {N_r(t) : r ≥ 1} be a collection of independent Poisson processes on ℝ⁺ with respective constant intensities {λ_r : r ≥ 1}, such that Σ_r λ_r = λ < ∞. Set N(t) = Σ_r N_r(t), and let I denote the index of the process supplying the first point in N, occurring at time T. Show that

    P(I = i, T ≥ t) = P(I = i)P(T ≥ t) = (λ_i/λ) e^{−λt},   i ≥ 1.

6.14 Exercises. Markov chain Monte Carlo

1. Let P be a stochastic matrix on the finite set Θ with stationary distribution π. Define the inner product ⟨x, y⟩ = Σ_{k∈Θ} x_k y_k π_k, and let l²(π) = {x ∈ ℝ^Θ : ⟨x, x⟩ < ∞}. Show, in the obvious notation, that P is reversible with respect to π if and only if ⟨x, Py⟩ = ⟨Px, y⟩ for all x, y ∈ l²(π).

2. Barker's algorithm. Show that a possible choice for the acceptance probabilities in Hastings's general algorithm is

    b_ij = π_j g_ji / (π_i g_ij + π_j g_ji),

where G = (g_ij) is the proposal matrix.
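
A minimal sketch of Hastings's algorithm run with Barker's acceptance probabilities (Python; the target distribution and the nearest-neighbour proposal with holding at the boundary are arbitrary illustrative choices) is:

    import random

    pi = [1, 2, 3, 2, 1]                     # unnormalized target on {0,...,4}

    def g(i, j):
        # symmetric nearest-neighbour proposal, holding at the two end states
        if abs(i - j) == 1:
            return 0.5
        if i == j and i in (0, 4):
            return 0.5
        return 0.0

    def step(i):
        j = i + random.choice((-1, 1))
        if j < 0 or j > 4:
            return i                         # self-proposal at the boundary
        accept = pi[j] * g(j, i) / (pi[i] * g(i, j) + pi[j] * g(j, i))
        return j if random.random() < accept else i

    x, counts, n = 0, [0] * 5, 200000
    for _ in range(n):
        x = step(x)
        counts[x] += 1
    print([c / n for c in counts], [p / sum(pi) for p in pi])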

3. Let S be a countable set. For each j ∈ S, the sets A_jk, k ∈ S, form a partition of the interval [0, 1]. Let g : S × [0, 1] → S be given by g(j, u) = k if u ∈ A_jk. The sequence {X_n : n ≥ 0} of random variables is generated recursively by X_{n+1} = g(X_n, U_{n+1}), n ≥ 0, where {U_n : n ≥ 1} are independent random variables with the uniform distribution on [0, 1]. Show that X is a Markov chain, and find its transition matrix.
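
The recursion X_{n+1} = g(X_n, U_{n+1}) is exactly how one simulates a chain in practice. A sketch (Python, with an arbitrary transition matrix and the sets A_jk taken as consecutive subintervals of [0, 1]) is:

    import random

    # rows of an arbitrary transition matrix, stored as (state, probability) pairs
    P = {0: [(0, 0.5), (1, 0.5)],
         1: [(0, 0.25), (1, 0.25), (2, 0.5)],
         2: [(1, 1.0)]}

    def g(j, u):
        # return k such that u lies in A_jk (consecutive subintervals of [0, 1])
        acc = 0.0
        for k, prob in P[j]:
            acc += prob
            if u <= acc:
                return k
        return P[j][-1][0]

    x, path = 0, [0]
    for _ in range(10):
        x = g(x, random.random())
        path.append(x)
    print(path)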

4. Dobrushin's bound. Let U = (u_st) be a finite |S| × |T| stochastic matrix. Dobrushin's ergodic coefficient is defined to be

    d(U) = ½ sup_{i,j∈S} Σ_{t∈T} |u_it − u_jt|.

(a) Show that, if V is a finite |T| × |U| stochastic matrix, then d(UV) ≤ d(U)d(V).


(b) Let X and Y be discrete-time Markov chains with the same transition matrix P, and show that

    Σ_k |P(X_n = k) − P(Y_n = k)| ≤ d(P)ⁿ Σ_k |P(X_0 = k) − P(Y_0 = k)|.
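
Both the coefficient and the submultiplicativity in part (a) are easy to compute for small matrices; a sketch (Python, arbitrary 2 × 2 stochastic matrices) follows.

    def d(U):
        # Dobrushin's ergodic coefficient: half the largest l1-distance between rows
        return 0.5 * max(sum(abs(a - b) for a, b in zip(ri, rj))
                         for ri in U for rj in U)

    def matmul(U, V):
        return [[sum(U[i][k] * V[k][j] for k in range(len(V)))
                 for j in range(len(V[0]))] for i in range(len(U))]

    U = [[0.9, 0.1], [0.2, 0.8]]
    V = [[0.5, 0.5], [0.3, 0.7]]
    print(d(matmul(U, V)), "<=", d(U) * d(V))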

6.15 Problems

1. Classify the states of the discrete-time Markov chains with state space S =

transition matrices ( 1 2 0 D ( ! 1 1 i ) -"3" "3" 2: 2: 1 1 0 0 0 (a) 2: 2: (b) 1 0 1 0 0 4 4 0 0 0 0 1

{ I , 2, 3, 4} and

In case (a), calculate h4(n) , and deduce that the probability of ultimate absorption in state 4, starting from 3, equals �. Find the mean recurrence times of the states in case (b).

2. A transition matrix is called doubly stochastic if all its column sums equal 1, that is, if Σ_i p_ij = 1 for all j ∈ S.
(a) Show that if a finite chain has a doubly stochastic transition matrix, then all its states are non-null persistent, and that if it is, in addition, irreducible and aperiodic then p_ij(n) → N^{−1} as n → ∞, where N is the number of states.
(b) Show that, if an infinite irreducible chain has a doubly stochastic transition matrix, then its states are either all null persistent or all transient.

3. Prove that intercommunicating states of a Markov chain have the same period.

4. (a) Show that for each pair i, j of states of an irreducible aperiodic chain, there exists N = N(i, j) such that p_ij(n) > 0 for all n ≥ N.
(b) Let X and Y be independent irreducible aperiodic chains with the same state space S and transition matrix P. Show that the bivariate chain Z_n = (X_n, Y_n), n ≥ 0, is irreducible and aperiodic.
(c) Show that the bivariate chain Z may be reducible if X and Y are periodic.

5. Suppose {X_n : n ≥ 0} is a discrete-time Markov chain with X_0 = i. Let N be the total number of visits made subsequently by the chain to the state j. Show that

    P(N = 0) = 1 − f_ij,   P(N = r) = f_ij f_jj^{r−1}(1 − f_jj) for r ≥ 1,

and deduce that P(N = ∞) = 1 if and only if f_ij = f_jj = 1.

6. Let i and j be two states of a discrete-time Markov chain. Show that if i communicates with j, then there is positive probability of reaching j from i without revisiting i in the meantime. Deduce that, if the chain is irreducible and persistent, then the probability f_ij of ever reaching j from i equals 1 for all i and j.

7. Let {X_n : n ≥ 0} be a persistent irreducible discrete-time Markov chain on the state space S with transition matrix P, and let x be a positive solution of the equation x = xP.
(a) Show that

    q_ij(n) = (x_j/x_i) p_ji(n),   i, j ∈ S, n ≥ 1,


defines the n-step transition probabilities of a persistent irreducible Markov chain on S whose first-passage probabilities are given by

    g_ij(n) = (x_j/x_i) l_ji(n),   i ≠ j, n ≥ 1,

where l_ji(n) = P(X_n = i, T > n | X_0 = j) and T = min{m > 0 : X_m = j}.
(b) Show that x is unique up to a multiplicative constant.
(c) Let T_j = min{n ≥ 1 : X_n = j} and define h_ij = P(T_j ≤ T_i | X_0 = i). Show that x_i h_ij = x_j h_ji for all i, j ∈ S.

8. Renewal sequences. The sequence u = {u_n : n ≥ 0} is called a 'renewal sequence' if

    u_0 = 1,   u_n = Σ_{i=1}^n f_i u_{n−i}   for n ≥ 1,

for some collection f = {f_n : n ≥ 1} of non-negative numbers summing to 1.
(a) Show that u is a renewal sequence if and only if there exists a Markov chain X on a countable state space S such that u_n = P(X_n = s | X_0 = s), for some persistent s ∈ S and all n ≥ 1.
(b) Show that if u and v are renewal sequences then so is {u_n v_n : n ≥ 0}.

9. Consider the symmetric random walk in three dimensions on the set of points {(x, y, z) : x, y, z = 0, ±1, ±2, . . . }; this process is a sequence {X_n : n ≥ 0} of points such that P(X_{n+1} = X_n + ε) = 1/6 for ε = (±1, 0, 0), (0, ±1, 0), (0, 0, ±1). Suppose that X_0 = (0, 0, 0). Show that

    P(X_{2n} = (0, 0, 0)) = (1/6)^{2n} Σ_{i+j+k=n} (2n)!/(i! j! k!)² = (1/2)^{2n} (2n choose n) Σ_{i+j+k=n} {n!/(3ⁿ i! j! k!)}²,

and deduce by Stirling's formula that the origin is a transient state.

10. Consider the three-dimensional version of the cancer model (6. 12. 1 6) . If K = 1 , are the empires of Theorem (6. 12. 1 8) inevitable in this case?

11. Let X be a discrete-time Markov chain with state space S = {1, 2}, and transition matrix

    P = ( 1−α   α  )
        (  β   1−β ).

Classify the states of the chain. Suppose that αβ > 0 and αβ ≠ 1. Find the n-step transition probabilities and show directly that they converge to the unique stationary distribution as n → ∞. For what values of α and β is the chain reversible in equilibrium?

12. Another diffusion model. N black balls and N white balls are placed in two urns so that each contains N balls. After each unit of time one ball is selected at random from each urn, and the two balls thus selected are interchanged. Let the number of black balls in the first urn denote the state of the system. Write down the transition matrix of this Markov chain and find the unique stationary distribution. Is the chain reversible in equilibrium?

13. Consider a Markov chain on the set S = {0, 1, 2, . . . } with transition probabilities p_{i,i+1} = a_i, p_{i,0} = 1 − a_i, i ≥ 0, where (a_i : i ≥ 0) is a sequence of constants which satisfy 0 < a_i < 1 for all i. Let b_0 = 1, b_i = a_0 a_1 · · · a_{i−1} for i ≥ 1. Show that the chain is
(a) persistent if and only if b_i → 0 as i → ∞,
(b) non-null persistent if and only if Σ_i b_i < ∞, and write down the stationary distribution if the latter condition holds.


Let A and β be positive constants and suppose that a_i = 1 − Ai^{−β} for all large i. Show that the chain is
(c) transient if β > 1,
(d) non-null persistent if β < 1.
Finally, if β = 1 show that the chain is
(e) non-null persistent if A > 1,
(f) null persistent if A ≤ 1.

14. Let X be a continuous-time Markov chain with countable state space S and standard semigroup {P_t}. Show that p_ij(t) is a continuous function of t. Let g(t) = −log p_ii(t); show that g is a continuous function, g(0) = 0, and g(s + t) ≤ g(s) + g(t). We say that g is 'subadditive', and a well known theorem gives the result that

    lim_{t↓0} g(t)/t = λ   exists, and   λ = sup_{t>0} g(t)/t ≤ ∞.

Deduce that g_ii = lim_{t↓0} t^{−1}{p_ii(t) − 1} exists, but may be −∞.

15. Let X be a continuous-time Markov chain with generator G = (g_ij) and suppose that the transition semigroup P_t satisfies P_t = exp(tG). Show that X is irreducible if and only if for any pair i, j of states there exists a sequence k_1, k_2, . . . , k_n of states such that g_{i,k_1} g_{k_1,k_2} · · · g_{k_n,j} ≠ 0.

16. (a) Let X = {X(t) : −∞ < t < ∞} be a Markov chain with stationary distribution π, and suppose that X(0) has distribution π. We call X reversible if X and Y have the same joint distributions, where Y(t) = X(−t).
(i) If X(t) has distribution π for all t, show that Y is a Markov chain with transition probabilities p′_ij(t) = (π_j/π_i)p_ji(t), where the p_ji(t) are the transition probabilities of the chain X.
(ii) If the transition semigroup {P_t} of X is standard with generator G, show that π_i g_ij = π_j g_ji (for all i and j) is a necessary condition for X to be reversible.
(iii) If P_t = exp(tG), show that X(t) has distribution π for all t and that the condition in (ii) is sufficient for the chain to be reversible.
(b) Show that every irreducible chain X with exactly two states is reversible in equilibrium.

(c) Show that every birth-death process X having a stationary distribution is reversible in equilibrium.

17. Show that not every discrete-time Markov chain can be imbedded in a continuous-time chain. More precisely, let

    P = (  α    1−α )
        ( 1−α    α  )      for some 0 < α < 1

be a transition matrix. Show that there exists a uniform semigroup {P_t} of transition probabilities in continuous time such that P_1 = P, if and only if ½ < α < 1. In this case show that {P_t} is unique and calculate it in terms of α.

18. Consider an immigration-death process X(t), being a birth-death process with rates λ_n = λ, μ_n = nμ. Show that its generating function G(s, t) = E(s^{X(t)}) is given by

    G(s, t) = {1 + (s − 1)e^{−μt}}^I exp{ρ(s − 1)(1 − e^{−μt})},

where ρ = λ/μ and X(0) = I. Deduce the limiting distribution of X(t) as t → ∞.

19. Let N be a non-homogeneous Poisson process on ℝ⁺ = [0, ∞) with intensity function λ. Write down the forward and backward equations for N, and solve them.

Let N(0) = 0, and find the density function of the time T until the first arrival in the process. If λ(t) = c/(1 + t), show that E(T) < ∞ if and only if c > 1.


20. Successive offers for my house are independent identically distributed random variables X_1, X_2, . . . , having density function f and distribution function F. Let Y_1 = X_1, let Y_2 be the first offer exceeding Y_1, and generally let Y_{n+1} be the first offer exceeding Y_n. Show that Y_1, Y_2, . . . are the times of arrivals in a non-homogeneous Poisson process with intensity function λ(t) = f(t)/(1 − F(t)). The Y_i are called 'record values'.

Now let Z_1 be the first offer received which is the second largest to date, and let Z_2 be the second such offer, and so on. Show that the Z_i are the arrival times of a non-homogeneous Poisson process with intensity function λ.

21. Let N be a Poisson process with constant intensity λ, and let Y_1, Y_2, . . . be independent random variables with common characteristic function φ and density function f. The process N*(t) = Y_1 + Y_2 + · · · + Y_{N(t)} is called a compound Poisson process. Y_n is the change in the value of N* at the nth arrival of the Poisson process N. Think of it like this. A 'random alarm clock' rings at the arrival times of a Poisson process. At the nth ring the process N* accumulates an extra quantity Y_n. Write down a forward equation for N* and hence find the characteristic function of N*(t). Can you see directly why it has the form which you have found?

22. If the intensity function λ of a non-homogeneous Poisson process N is itself a random process, then N is called a doubly stochastic Poisson process (or Cox process). Consider the case when λ(t) = Λ for all t, and Λ is a random variable taking either of two values λ_1 or λ_2, each being picked with equal probability ½. Find the probability generating function of N(t), and deduce its mean and variance.

23. Show that a simple birth process X with parameter λ is a doubly stochastic Poisson process with intensity function λ(t) = λX(t).

24. The Markov chain X = {X(t) : t ≥ 0} is a birth process whose intensities λ_k(t) depend also on the time t and are given by

    P(X(t + h) = k + 1 | X(t) = k) = ((1 + μk)/(1 + μt)) h + o(h)

as h ↓ 0. Show that the probability generating function G(s, t) = E(s^{X(t)}) satisfies

    ∂G/∂t = ((s − 1)/(1 + μt)) {G + μs ∂G/∂s},   0 < s < 1.

Hence find the mean and variance of X(t) when X(0) = I.

25. (a) Let X be a birth-death process with strictly positive birth rates λ_0, λ_1, . . . , and death rates μ_1, μ_2, . . . . Let η_i be the probability that X(t) ever takes the value 0 starting from X(0) = i. Show that

    λ_j η_{j+1} − (λ_j + μ_j)η_j + μ_j η_{j−1} = 0,   j ≥ 1,

and deduce that η_i = 1 for all i so long as Σ_{j=1}^∞ e_j = ∞ where e_j = μ_1μ_2 · · · μ_j/(λ_1λ_2 · · · λ_j).
(b) For the discrete-time chain on the non-negative integers with

find the probability that the chain ever visits 0, starting from 1.

26. Find a good necessary condition and a good sufficient condition for the birth-death process X of Problem (6.15.25a) to be honest.

27. Let X be a simple symmetric birth-death process with λ_n = μ_n = nλ, and let T be the time until extinction. Show that

    P(T ≤ x | X(0) = I) = (λx/(1 + λx))^I,


and deduce that extinction is certain if P(X(0) < ∞) = 1.

Show that P(λT/I ≤ x | X(0) = I) → e^{−1/x} as I → ∞.

28. Immigration-death with disasters. Let X be an immigration-death-disaster process, that is, a birth-death process with parameters λ_i = λ, μ_i = iμ, and with the additional possibility of 'disasters' which reduce the population to 0. Disasters occur at the times of a Poisson process with intensity δ, independently of all previous births and deaths.
(a) Show that X has a stationary distribution, and find an expression for the generating function of this distribution.
(b) Show that, in equilibrium, the mean of X(t) is λ/(δ + μ).

29. With any sufficiently nice (Lebesgue measurable, say) subset B of the real line ℝ is associated a random variable X(B) such that
(i) X(B) takes values in {0, 1, 2, . . . },
(ii) if B_1, B_2, . . . , B_n are disjoint then X(B_1), X(B_2), . . . , X(B_n) are independent, and furthermore X(B_1 ∪ B_2) = X(B_1) + X(B_2),
(iii) the distribution of X(B) depends only on B through its Lebesgue measure ('length') |B|, and

    P(X(B) ≥ 1) / P(X(B) = 1) → 1   as |B| → 0.

Show that X is a Poisson process.

30. Poisson forest. Let N be a Poisson process in ℝ² with constant intensity λ, and let R_(1) < R_(2) < · · · be the ordered distances from the origin of the points of the process.
(a) Show that R_(1)², R_(2)², . . . are the points of a Poisson process on ℝ⁺ = [0, ∞) with intensity λπ.
(b) Show that R_(k) has density function

    f(r) = 2πλr (λπr²)^{k−1} e^{−λπr²} / (k − 1)!,   r > 0.

31. Let X be an n-dimensional Poisson process with constant intensity λ. Show that the volume of the largest (n-dimensional) sphere centred at the origin which contains no point of X is exponentially distributed. Deduce the density function of the distance R from the origin to the nearest point of X. Show that E(R) = Γ(1/n)/{n(λc)^{1/n}} where c is the volume of the unit ball of ℝⁿ and Γ is the gamma function.

32. A village of N + 1 people suffers an epidemic. Let X(t) be the number of ill people at time t, and suppose that X(0) = 1 and X is a birth process with rates λ_i = λi(N + 1 − i). Let T be the length of time required until every member of the population has succumbed to the illness. Show that

    E(T) = (1/λ) Σ_{k=1}^N 1/{k(N + 1 − k)}

and deduce that

    E(T) = 2(log N + γ)/{λ(N + 1)} + O(N^{−2}),

where γ is Euler's constant. It is striking that E(T) decreases with N, for large N.

33. A particle has velocity V(t) at time t, where V(t) is assumed to take values in {n + ½ : n ≥ 0}. Transitions during (t, t + h) are possible as follows:

    P(V(t + h) = w | V(t) = v) = (v + ½)h + o(h) if w = v + 1,
                                 1 − 2vh + o(h)  if w = v,
                                 (v − ½)h + o(h) if w = v − 1.


Initially V(0) = ½. Let

    G(s, t) = Σ_{n=0}^∞ s^n P(V(t) = n + ½).

(a) Show that

    ∂G/∂t = (1 − s)² ∂G/∂s − (1 − s)G,

and deduce that G(s, t) = {1 + (1 − s)t}^{−1}.
(b) Show that the expected length m_n(T) of time for which V = n + ½ during the time interval [0, T] is given by

    m_n(T) = ∫_0^T P(V(t) = n + ½) dt,

and that, for fixed k, m_k(T) − log T → −Σ_{i=1}^k i^{−1} as T → ∞.

(c) What is the expected velocity of the particle at time t?

34. A random sequence of non-negative integers {X_n : n ≥ 0} begins X_0 = 0, X_1 = 1, and is produced by

    X_{n+1} = X_n + X_{n−1}    with probability ½,
    X_{n+1} = |X_n − X_{n−1}|  with probability ½.

Show that Y_n = (X_{n−1}, X_n) is a transient Markov chain, and find the probability of ever reaching (1, 1) from (1, 2).

35. Take a regular hexagon and join opposite corners by straight lines meeting at the point C. A particle performs a symmetric random walk on these 7 vertices, starting at A. Find:

(a) the probability of return to A without hitting C,

(b) the expected time to return to A, (c) the expected number of visits to C before returning to A,

(d) the expected time to return to A, given that there is no prior visit to C.

36. Diffusion, osmosis. Markov chains are defined by the following procedures at any time n:
(a) Bernoulli model. Two adjacent containers A and B each contain m particles; m are of type I and m are of type II. A particle is selected at random in each container. If they are of opposite types they are exchanged with probability α if the type I is in A, or with probability β if the type I is in B. Let X_n be the number of type I particles in A at time n.
(b) Ehrenfest dog-flea model. Two adjacent containers contain m particles in all. A particle is selected at random. If it is in A it is moved to B with probability α, if it is in B it is moved to A with probability β. Let Y_n be the number of particles in A at time n.

In each case find the transition matrix and stationary distribution of the chain.

37. Let X be an irreducible continuous-time Markov chain on the state space S with transition probabilities p_jk(t) and unique stationary distribution π, and write P(X(t) = j) = a_j(t). If c(x) is a concave function, show that d(t) = Σ_{j∈S} π_j c(a_j(t)/π_j) increases to c(1) as t → ∞.

38. With the notation of the preceding problem, let u_k(t) = P(X(t) = k | X(0) = 0), and suppose the chain is reversible in equilibrium (see Problem (6.15.16)). Show that u_0(2t) = Σ_j (π_0/π_j) u_j(t)², and deduce that u_0(t) decreases to π_0 as t → ∞.

39. Perturbing a Poisson process. Let Π be the set of points in a Poisson process on ℝ^d with constant intensity λ. Each point is displaced, where the displacements are independent and identically distributed. Show that the resulting point process is a Poisson process with intensity λ.


40. Perturbations continued. Suppose for convenience in Problem (6. 1 5 .39) that the displacements have a continuous distribution function and finite mean, and that d = 1 . Suppose also that you are at the origin originally, and you move to a in the perturbed process. Let LR be the number of points formerly on your left that are now on your right, and RL the number of points formerly on your right that are now on your left. Show that lE(LR) = lE(RL) if and only if a = J1, where J1, is the mean displacement of a particle.

Deduce that if cars enter the start of a long road at the instants of a Poisson process, having independent identically distributed velocities, then, if you travel at the average speed, in the long run the rate at which you are overtaken by other cars equals the rate at which you overtake other cars.

41. Ants enter a kitchen at the instants of a Poisson process N of rate A.; they each visit the pantry and then the sink, and leave. The rth ant spends time Xr in the pantry and Yr in the sink (and Xr + Yr in the kitchen altogether), where the vectors Vr = (Xr , Yr ) and Vs are independent for r #- s . At time t = 0 the kitchen is free of ants. Find the joint distribution of the numbers A(t) of ants in the pantry and B(t) of ants in the sink at time t . Now suppose the ants arrive in pairs at the times of the Poisson process, but then separate to behave independently as above. Find the joint distribution of the numbers of ants in the two locations.

42. Let {Xr : r :::: I } be independent exponential random variables with parameter A., and set Sn =

E�=l Xr . Show that:

(a) Yk = Ski Sn , 1 � k � n - 1 , have the same distribution as the order statistics of independent variables {Uk : 1 � k � n - I } which are uniformly distributed on (0, 1 ) ,

(b) Zk = Xkl Sn , 1 � k � n, have the same joint distribution as the coordinates of a point (UI , . . . , Un ) chosen uniformly at random on the simplex E�=l Ur = 1 , Ur :::: 0 for all r .

43. Let X be a discrete-time Markov chain with a finite number of states and transition matrix P = (Pij ) where Pij > 0 for all i , j . Show that there exists A. E (0, 1 ) such that I Pij (n) - lZ"j I < A. n , where 1C is the stationary distribution.

44. Under the conditions of Problem (6. 1 5 .43), let Vi (n) = E�,;;;J I(Xr=i J be the number of visits of the chain to i before time n . Show that

Show further that, if f is any bounded function on the state space, then ( / 1 n- l / 2)

lE ;; L f(Xr ) - � f (i )7ri -+ o. r=O I ES

45. Conditional entropy. Let A and B = (Bo , BI , . . . , Bn) be a discrete random variable and vector, respectively. The conditional entropy of A with respect to B is defined as H(A I B) = lE (lE{- log f(A I B) I B}) where f(a I b) = JP(A = a I B = b) . Let X be an aperiodic Markov chain on a finite state space. Show that

and that H(Xn+1 I Xn ) -+ -L 7ri L Pij 10g Pij as n -+ 00,

j

if X is aperiodic with a unique stationary distribution 1C .

46. Coupling. Let X and Y be independent persistent birth-death processes with the same parameters (and no explosions). It is not assumed that Xo = yo . Show that:

82

Page 92: One Thousand Exercises in Probability

Problems Exercises [6.15.47)-[ 6.15.49J

(a) for any A � JR, IIP'(Xt E A) - 1P'(Yt E A) I -+ 0 as t -+ 00,

(b) if lP'(Xo :::; yo) = 1 , then lE[g(Xt)] :::; lE[g(Yt )] for any increasing function g. 47. Resources. The number of birds in a wood at time t i s a continuous-time Markov process X. Food resources impose the constraint 0 :::; X (t) :::; n . Competition entails that the transition probabilities obey

Pk,k+ 1 (h) = >"(n - k)h + o(h) , Pk,k- I (h) = /Lkh + o(h ) .

Find lE(sX(t») , together with the mean and variance of X(t) , when X(O) = r . What happens as t -+ oo?

48. Parrando's paradox. A counter performs an irreducible random walk on the vertices 0, 1 , 2 of the triangle in the figure beneath, with transition matrix

P = ( �I P2

Po qo ) o PI q2 0

where Pi + qi = 1 for all i . Show that the stationary distribution 11: has

1 - q2PI 7ro = , 3 - ql PO - q2PI - qOP2

with corresponding formulae for 7r1 , 7r2 .

o 2 Suppose that you gain one peseta for each clockwise step of the walk, and you lose one peseta

for each anticlockwise step. Show that, in equilibrium, the mean yield per step is

Y = E (2Pi - 1)7ri = 3 (2pOP IP2 - POPI - PI P2 - P2PO + Po + PI + P2 - 1)

. . 3 - ql Po - q2PI - QOP2 I

Consider now three cases of this process:

A. We have Pi = i - a for each i , where a > O. Show that the mean yield per step satisfies YA < O.

B. We have that Po = to - a, PI = P2 = i - a, where a > O. Show that JIB < 0 for sufficiently small a .

C. At each step the counter i s equally likely to move according to the transition probabilities of case A or case B, the choice being made independently at every step. Show that, in this case,

Po = to - a, PI = P2 = i - a. Show that YC > 0 for sufficiently small a. The fact that two systematically unfavourable games may be combined to make a favourable game is called Parrando's paradox. Such bets are not available in casinos.

49. Cars arrive at the beginning of a long road in a Poisson stream of rate >.. from time t = 0 onwards . A car has a fixed velocity V > 0 which is a random variable. The velocities of cars are independent and identically distributed, and independent of the arrival process. Cars can overtake each other freely. Show that the number of cars on the first x miles of the road at time t has the Poisson distribution with parameter >"lE[V- 1 min{x , Vt}] .

83

Page 93: One Thousand Exercises in Probability

[6.15.50]-[6.15.51] Exercises Marlwv chains

50. Events occur at the times of a Poisson process with intensity A, and you are offered a bet based on the process. Let t > O. You are required to say the word 'now' immediately after the event which you think will be the last to occur prior to time t. You win if you succeed, otherwise you lose. If no events occur before t you lose. If you have not selected an event before time t you lose.

Consider the strategy in which you choose the first event to occur after a specified time s, where 0 < s < t . (a) Calculate an expression for the probability that you win using this strategy.

(b) Which value of s maximizes this probability?

(c) If At � 1 , show that the probability that you win using this value of s is e - 1 . 51. A new Oxbridge professor wishes to buy a house, and can afford to spend up to one million pounds. Declining the services of conventional estate agents, she consults her favourite internet property page on which houses are announced at the times of a Poisson process with intensity A per day. House prices may be assumed to be independent random variables which are uniformly distributed over the interval (800,000, 2,000,000) . She decides to view every affordable property announced during the next 30 days. The time spent viewing any given property is uniformly distributed over the range ( 1 , 2) hours. What is the moment generating function of the total time spent viewing houses?

84

Page 94: One Thousand Exercises in Probability

7

Convergence of random variables

7.1 Exercises. Introduction

1. Let r ::: 1 , and define I I X l l r = {E l xr l } l/r . Show that: (a) I l eX l l r = l e i · I I X l i r for e E R, (b) I I X + Y I I r � I I X l i r + I I Y l l r , (c) I I X l i r = 0 if and only if JP>(X = 0) = 1 .

This amounts to saying that II . II r is a norm on the set of equivalence classes of random variables on a given probability space with finite rth moment, the equivalence relation being given by X � Y if and only if JP>(X = Y) = 1 .

2. Define (X, Y ) = E(XY) for random variables X and Y having finite variance, and define I I X I I =

J(X, X) . Show that: (a) (aX + bY, Z) = a (X, Z) + b (Y, Z) ,

(b) I I X + Y 1 1 2 + I I X - Y 1 1 2 = 2( I I X 1 I 2 + I I Y I I 2) , the paralleiogram property, (c) if (Xi , Xj ) = 0 for all i =f:. j then

3. Let E > O. Let g , h : [0, 1 ] --+ R, and define dE (g , h) = IE dx where E = {u E [0, 1 ] : I g (u) - h (u) 1 > d. Show that dE does not satisfy the triangle inequality. 4. Levy metric. For two distribution functions F and G, let

d(F, G) = inf{ 8 > 0 : F(x - 8) - 8 � G(x) � F (x + 8) + Hor all x E R} .

Show that d is a metric on the space of distribution functions. 5. Find random variables X, Xl , X2 , . . . such that E( I Xn - X 12) --+ 0 as n --+ 00, but E IXn l = 00

for all n.

7.2 Exercises. Modes of convergence

1. (a) Suppose Xn � X where r ::: 1 . Show that E IX� I --+ E l xr l . I (b) Suppose Xn -+ X. Show that E(Xn) --+ E(X) . Is the converse true? 2 (c) Suppose Xn -+ X. Show that var(Xn ) --+ var(X) .

85

Page 95: One Thousand Exercises in Probability

[7.2.2]-[7.3.3] Exercises Convergence 01 random variables

2. Dominated convergence. Suppose l Xn l ::: Z for all n , where E(Z) < 00. Prove that if Xn � X I then Xn --+ X.

3. Give a rigorous proof that E(XY) = E(X)E(Y) for any pair X, Y of independent non-negative random variables on (Q , J", lP) with finite means. [Hint: For k � 0, n � 1 , define Xn = kin if kin ::: X < (k + l )ln , and similarly for Yn . Show that Xn and Yn are independent, and Xn ::: X, and Yn ::: Y . Deduce that EXn -+ EX and EYn -+ EY, and also E(Xn Yn ) -+ E(XY). ] 4. Show that convergence in distribution is equivalent to convergence with respect to the Levy metric of Exercise (7 . 1 .4).

5. (a) Suppose that Xn S X and Yn � c, where c is a constant. Show that Xn Yn S cX, and that Xnl Yn S Xlc if c # O.

(b) Suppose that Xn S 0 and Yn � Y, and let g : JR2 -+ JR be such that g(x , y) is a continuous p function of y for all x, and g (x , y) is continuous at x = 0 for all y . Show that g (Xn , Yn) --+

g(O, Y) . [These results are sometimes referred to as 'Slutsky's theorem(s) ' . J 6. Let Xl , X2 , . . . be random variables on the probability space (Q , :F, lP) . Show that the set A = {w E Q : the sequence Xn (w) converges} is an event (that is, lies in :F), and that there exists a random variable X (that is, an F-measurable function X : Q -+ JR) such that Xn (w) -+ X(w) for w E A. 7. Let {Xn } be a sequence of random variables, and let {en } be a sequence of reals converging to the limit c. For convergence almost surely, in rth mean, in probability, and in distribution, show that the convergence of Xn to X entails the convergence of cnXn to cX. 8. Let {Xn } be a sequence of independent random variables which converges in probability to the limit X. Show that X is almost surely constant. 9. Convergence in total variation. The sequence of discrete random variables Xn , with mass functions In , is said to converge in total variation to X with mass function I if

L l in (x) - l (x) 1 -+ 0 as n -+ 00 .

x

Suppose Xn -+ X in total variation, and u : JR -+ JR is bounded. Show that E(u (Xn )) -+ E(u (X)) . 10. Let {Xr : r � I } be independent Poisson variables with respective parameters {Ar : r � I } . Show that L:�l Xr converges or diverges almost surely according as L:�l Ar converges or diverges.

7.3 Exercises. Some ancillary results

1. (a) Suppose that Xn � X. Show that {Xn } is Cauchy convergent in probability in that, for all E > 0, lP( I Xn - Xm I > E) -+ 0 as n , m -+ 00. In what sense is the converse true?

(b) Let {Xn } and {Yn } be sequences of random variables such that the pairs (Xj , Xj ) and (Yj , Yj ) have the same distributions for all i, j . If Xn � X, show that Yn converges in probability to some limit Y having the same distribution as X .

2. Show that the probability that infinitely many of the events {An : n � I } occur satisfies lP(An i.o.) � lim sUPn�oo lP(An ) . 3 . Let {Sn : n � O} be a simple random walk which moves to the right with probability p at each step, and suppose that So = O. Write Xn = Sn - Sn- l .

86

Page 96: One Thousand Exercises in Probability

Some ancillary results

(a) Show that {Sn = ° i .o.} is not a tail event of the sequence {Xn } .

(b) Show that JP>(Sn = ° i.o.) = ° if p i= � . (c) Let Tn = Sn/ -/ii, and show that

{ lim inf Tn .:s -x } n { lim sup Tn ::: x } n�oo n�oo

Exercises [7.3.4]-[7.3.10]

is a tail event of the sequence {Xn } , for all x > 0, and deduce directly that JP>(Sn = ° i.o.) = I if

P _ 1 - 2' 4. Hewitt-Savage zerQ-()ne law. Let Xl , X 2 , . . . be independent identically distributed random variables. The event A, defined in terms of the Xn , is called exchangeable if A is invariant un­der finite permutations of the coordinates, which is to say that its indicator function fA satisfies fA (X1 , X2 , . . . , Xn , . . . ) = fA (Xi 1 ' Xi2 ' . . . , Xin , Xn+1 , . . . ) for all n ::: 1 and all permutations (i I , i2 , . . . , in ) of ( 1 , 2, . , . , n ) . Show that all exchangeable events A are such that either JP>(A) = ° or JP>(A) = 1 .

5. Returning to the simple random walk S of Exercise (3), show that {Sn = ° i .o.} i s an exchangeable event with respect to the steps of the walk, and deduce from the Hewitt-Savage zero--one law that it has probability either ° or 1 .

6. Weierstrass's approximation theorem. Let I : [0, 1 ] -+ lR be a continuous function, and let Sn be a random variable having the binomial distribution with parameters n and x. Using the formula lE(Z) = lE(ZfA) + lE(ZfAc ) with Z = I(x) - l(n- 1 Sn) and A = { In- 1 Sn - x l > 8} , show that

lim sup I / (X) - t I(k/n) (:) xk ( 1 - x)n-k I = 0. n�oo O:'Sx:'S l k=O

You have proved Weierstrass 's approximation theorem, which states that every continuous function on [0, 1] may be approximated by a polynomial uniformly over the interval.

7. Complete convergence. A sequence Xl , X 2 , . " of random variables is said to be completely convergent to X if

for all E > 0.

Show that, for sequences of independent variables, complete convergence is equivalent to a.s. conver­gence. Find a sequence of (dependent) random variables which converges a.s . but not completely.

8. Let Xl , X2 , ' " be independent identically distributed random variables with common mean f..t and finite variance. Show that

as n -+ 00 .

9. Let {Xn : n ::: I } be independent and exponentially distributed with parameter 1 . Show that

( . Xn ) JP> hm sup -- = 1 = 1 .

n�oo log n

10. Let {Xn : n ::: I } be independent N(O, 1 ) random variables. Show that:

( . IXn l r;;) (a) JP> lim sup r.;;;:-:: = '" 2 = 1 ,

n�oo ", log n

87

Page 97: One Thousand Exercises in Probability

[7.3.11J-[7.5.1J Exercises

{ ° if En JP(XI > an) < 00, (b) JP(Xn > an i.o.) =

1 if En JP(XI > an) = 00.

Convergence of random variables

11. Construct an example to show that the convergence in distribution of Xn to X does not imply the convergence of the unique medians of the sequence Xn .

12. (i) Let {Xr : r ::: I } be independent, non-negative and identically distributed with infinite mean. Show that lim supr .... oo Xrlr = 00 almost surely.

(ii) Let {Xr } be a stationary Markov chain on the positive integers with transition probabilities

Pjk =

{ -j- if k = . + 1

j + 2 J ,

j ! 2 if k = 1 .

(a) Find the stationary distribution of the chain, and show that it has infinite mean.

(b) Show that lim supr .... 00 X r I r � 1 almost surely.

13. Let {X r : I � r � n } be independent and identically distributed with mean J1, and finite variance 2 -

- 1 n a . Let X = n Er=1 Xr . Show that

t (Xr - J1,) / t (Xr - X)2

r=1 r=1

converges in distribution to the N(O, 1 ) distribution as n -+ 00.

7.4 Exercise. Laws of large numbers

1. Let X2 , X3 , . . . be independent random variables such that

1 JP(Xn = n) = JP(Xn = -n) = -- , 2n log n

1 JP(Xn = 0) = 1 - -- .

n log n

Show that this sequence obeys the weak law but not the strong law, in the sense that n- 1 E1 Xi converges to ° in probability but not almost surely.

2. Construct a sequence {Xr : r ::: I } of independent random variables with zero mean such that n - 1 E�=1 Xr -+ -00 almost surely, as n -+ 00.

3. Let N be a spatial Poisson process with constant intensity )., in ]Rd, where d ::: 2. Let S be the ball of radius r centred at zero. Show that N(S)/ I S I -+ )., almost surely as r -+ 00, where l S I is the volume of the ball.

7.5 Exercises. The strong law

1. Entropy. The interval [0, 1] is partitioned into n disjoint sub-intervals with lengths PI , P2 , . . . , Pn , and the entropy of this partition is defined to be

n

h = - L Pi log Pi . i=1

88

Page 98: One Thousand Exercises in Probability

Martingales Exercises [7.5.2]-[7.7.4]

Let Xl , X2 , . . . be independent random variables having the uniform distribution on [0, 1 ] , and let Zm (i ) be the number of the Xl , X2 , . . . , Xm which lie in the i th interval of the partition above. Show that n

R - II zm (i) m - Pi i=l

satisfies m-l log Rm --+ -h almost surely as m --+ 00 .

2. Recurrent events. Catastrophes occur at the times Tl , T2 , . . . where Ti = Xl + X 2 + . . . + Xi and the Xi are independent identically distributed positive random variables. Let N(t) = max.{n : Tn � t } be the number of catastrophes which have occurred by time t. Prove that if lE(Xl ) < 00 then N(t) --+ 00 and N(t)/t --+ l /lE(X 1 ) as t --+ 00, almost surely. 3. Random walk. Let Xl , X2 , . . . be independent identically distributed random variables taking values in the integers Z and having a finite_mean. Show that the Markov chain S = {Sn } given by Sn = �1 Xi is transient if lE(X 1 ) =1= O.

7.6 Exercise. Law of the iterated logarithm

1. A function cp (x) is said to belong to the 'upper class ' if, in the notation of this section, IP'(Sn > cp (n)..jn i .o.) = O. A consequence of the law of the iterated logarithm is that .Ja log log x is in the upper class for all a > 2. Use the first Borel-Cantelli lemma to prove the much weaker fact that cp(x) = .Ja log x is in the upper class for all a > 2, in the special case when the Xi are independent N(O, 1) variables.

7.7 Exercises. Martingales

1. Let Xl , X2 , . . . be random variables such that the partial sums Sn = Xl + X2 + . . . + Xn determine a martingale. Show that lE(Xi Xj ) = 0 if i =1= j .

2. Let Zn b e the size of the nth generation of a branching process with immigration, i n which the family sizes have mean J1, (=1= 1 ) and the mean number of immigrants in each generation is m . Suppose that lE(Zo) < 00, and show that

is a martingale with respect to a suitable sequence of random variables. 3. Let Xo , XI . X2 , . . . be a sequence of random variables with finite means and satisfying lE(Xn+l I XO , XI . . . . , Xn) = aXn + bXn- l for n ::: 1 , where 0 < a , b < 1 and a + b = 1 . Find a value of a for which Sn = aXn + Xn-l , n ::: 1 , defines a martingale with respect to the sequence X . 4. Let Xn be the net profit to the gambler of betting a unit stake on the nth play in a casino; the Xn may be dependent, but the game is fair in the sense that lE(Xn+l I Xl , X2 , . . . , Xn) = 0 for all n . The gambler stakes Y on the first play, and thereafter stakes fn (X 1 , X2 , . . . , Xn) on the (n + l)th play, where fl , h, . . . are given functions. Show that her profit after n plays is

n Sn = "L: Xi fi- l (Xl , X2 , · · · , Xi-d ,

i=l

where fo = Y. Show further that the sequence S = {Sn } satisfies the martingale condition lE(Sn+l I Xl , X2 , . . . , Xn) = Sn , n ::: 1 , if Y is assumed to be known throughout.

89

Page 99: One Thousand Exercises in Probability

[7.8.1]-[7.9.5] Exercises Corwergence of random variables

7.8 Exercises. Martingale convergence theorem

1. Kolrnogorov's inequality. Let Xl , X2 , . . . be independent random variables with zero means and finite variances, and let Sn = Xl + X2 + . . . + Xn . Use the Doob-Kolmogorov inequality to show that

IP ( m� I Sj l > E) ::: ; t var(Xj ) I�J �n E . 1 J = for E > O.

2. Let Xl , X2 , . . . be independent random variables such that L:n n-2 var(Xn) < 00. Use Kol­mogorov's inequality to prove that

t Xi - �(Xi ) � Y i=l I

for some finite random variable Y, and deduce that

as n -+ 00,

as n -+ 00.

(You may find Kronecker's lemma to be useful: if (an ) and (bn ) are real sequences with bn t 00 and L:i ai/bi < 00, then b;l L:7=1 ai -+ 0 as n -+ 00.)

3. Let S be a martingale with respect to X, such that JE(S�) < K < 00 for some K E R. Suppose that var(Sn) -+ 0 as n -+ 00, and prove that S = limn--+oo Sn exists and is constant almost surely.

7.9 Exercises. Prediction and conditional expectation

1. Let Y be uniformly distributed on [- 1 , 1 ] and let X = y2 . (a) Find the best predictor of X given Y, and of Y given X. (b) Find the best linear predictor of X given Y, and of Y given X.

2. Let the pair (X, Y) have a general bivariate normal distribution. Find JE(Y I X). 3. Let Xl , X2 , • . . , Xn be random variables with zero means and covariance matrix V = (Vij ) , and let Y have finite second moment. Find the linear function h of the Xi which minimizes the mean squared error JE{(Y - h (X 1 , . . . , Xn))2 } . 4. Verify the following properties of conditional expectation. You may assume that the relevant expectations exist. (i) JE{JE(Y I fl.) } = JE(Y). (ii) JE(aY + (3Z I fl.) = aJE(Y I fl.) + (3JE(Z I fl.) for a, (3 E R.

(iii) JE(Y I fl.) ::: 0 if Y ::: o.

(iv) JE(Y I fl.) = JE{JE(Y I :If) I fl.} if fl. � :H. (v) JE(Y I fl.) = JE(Y) if Y is independent of IG for every G E fl..

(vi) Jensen's inequality. g{JE(Y I fl.) } ::: JE{g(Y) I fl.} for all convex functions g . (vii) If Yn � Y and I Yn I ::: Z a.s. where JE(Z) < 00, then JE(Yn I fl.) � JE(Y I fl.) . (Statements (ii)-{vi) are of course to be interpreted 'almost surely' .)

5. Let X and Y have joint mass function f(x , y) = {x (x + 1 ) }- 1 for x = y = 1 , 2, . . . . Show that JE(Y I X) < 00 while JE(Y) = 00.

90

Page 100: One Thousand Exercises in Probability

Problems Exercises [7.9.6]-[7.11.5]

6. Let (n, 37, 1P) be a probability space and let g. be a sub-u-field of :F. Let H be the space of g.-measurable random variables with finite second moment.

(a) Show that H is closed with respect to the norm II . 1 1 2 .

(b) Let Y be a random variable satisfying JE(y2) < 00, and show the equivalence of the following

two statements for any M E H: (i) JE{ (Y - M)Z} = 0 for all Z E H,

(ii) JE{ (Y - M)/G } = 0 for all G E g..

7.10 Exercises. Uniform integrability

1. Show that the sum {Xn + Yn } of two uniformly integrable sequences {Xn } and {Yn } gives a uniformly integrable sequence.

2. (a) Suppose that Xn � X where r � 1 . Show that { I Xn l r : n � I } is uniformly integrable, and deduce that JE(X�) -+ JE(Xr ) if r is an integer.

(b) Conversely, suppose that { l Xn l r : n � I } is uniformly integrable where r � 1 , and show that

Xn � X if Xn � X.

3. Let g : [0, 00) -+ [0, 00) be an increasing function satisfying g (x)/x -+ 00 as x -+ 00. Show that the sequence {Xn : n � I } is uniformly integrable if supn JE{g ( IXn In < 00.

4. Let {Zn : n � O} be the generation sizes of a branching process with Zo = 1 , JE(Zl ) = 1 , var(Zl ) =1= O. Show that {Zn : n � O} is not uniformly integrable.

5. Pratt's lemma. Suppose that Xn ::::: Yn ::::: Zn where Xn � X, Yn � Y, and Zn � Z. If JE(Xn) -+ JE(X) and JE(Zn ) -+ JE(Z), show that JE(Yn ) -+ JE(Y).

6. Let {Xn : n � I} be a sequence of variables satisfying JE(suPn I Xn l ) < 00. Show that {Xn } is uniformly integrable.

1. Let Xn have density function

7.11 Problems

n � 1 .

With respect to which modes of convergence does Xn converge as n -+ oo?

2. (i) Suppose that Xn � X and Yn � Y, and show that Xn + Yn � X + Y. Show that the corresponding result holds for convergence in rth mean and in probability, but not in distribution.

(li) Show that if Xn � X and Yn � Y then Xn Yn � XY. Does the corresponding result hold for the other modes of convergence?

3. Let g : JR -+ JR be continuous. Show that g (Xn ) � g(X) if Xn � X.

4. Let Yl , Y2 , . . . be independent identically distributed variables, each of which can take any value in {O, 1 , . . . , 9} with equal probability 10 . Let Xn = l:i=l Yj lO-j . Show by the use of characteristic

functions that Xn converges in distribution to the uniform distribution on [0, 1 ] . Deduce that Xn � Y for some Y which is uniformly distributed on [0, 1 ] .

5. Let N(t) be a Poisson process with constant intensity on JR. 9 1

Page 101: One Thousand Exercises in Probability

[7.11.6]-[7.11.13] Exercises Convergence of random variables

(a) Find the covariance of N(s) and N(t) .

(b) Show that N is continuous in mean square, which is to say that E ({N (t + h) - N (t) }2 ) � 0 as h � O.

(c) Prove that N is continuous in probability, which is to say that lP' ( IN (t + h) - N(t) 1 > E) � 0 as h � 0, for all E > o.

(d) Show that N is differentiable in probability but not in mean square.

6. Prove that n - 1 2:7=1 Xj � 0 whenever the Xi are independent identically distributed variables

with zero means and such that E(Xt) < 00.

7. Show that Xn � X whenever 2:n E( IXn - Xn < 00 for some r > O.

8. Show that if Xn S X then aXn + b S aX + b for any real a and b . 9. If X has zero mean and variance 0"2, show that

0"2 lP'(X ;::: t) ::s: �+ 2 for t > O. 0" t

10. Show that Xn � 0 if and only if

E C ���n l ) � 0 as n � 00 .

11. The sequence {Xn } is said to be mean-square Cauchy convergent if E{(Xn - Xm )2 } � 0 as m , n � 00. Show that {Xn } converges in mean square to some limit X if and only if it is mean-square Cauchy convergent. Does the corresponding result hold for the other modes of convergence?

12. Suppose that {Xn } is a sequence of uncorrelated variables with zero means and uniformly bounded

variances. Show that n- 1 2:7=1 Xi � O. 13. Let Xl , X2 , . . . be independent identically distributed random variables with the common dis­tribution function F, and suppose that F(x) < 1 for all x. Let Mn = max {Xl , X2 , . . . , Xn } and suppose that there exists a strictly increasing unbounded positive sequence aI , a2 , . . . such that lP' (Mn/an ::s: x) � H(x) for some distribution function H. Let us assume that H is continuous with 0 < H ( 1 ) < 1 ; substantially weaker conditions suffice but introduce extra difficulties.

(a) Show that n [ 1 - F(anx) ] � - log H(x) as n � 00 and deduce that

(b) Deduce that if x > 0

1 - F(anx) log H(x) ---- � if x > O. 1 - F(an ) log H(I)

1 - F(tx) log H(x) � --=--'--1 - F (t) log H(1 )

as t � 00 .

(c) Set x = Xl x2 and make the substitution

(x) = log H(eX ) g log H(1 )

to find that g (x + y) = g (x)g(y) , and deduce that

H(x) = { e

o

xp(-ax-fJ ) if x ;::: 0,

if x < 0,

92

Page 102: One Thousand Exercises in Probability

Problems Exercises [7.11.14]-[7.11.19]

for some non-negative constants a and {J. You have shown that H is the distribution function of y- I , where Y has a Weibull distribution. 14. Let X I , X 2 , . . . , X n be independent and identically distributed random variables with the Cauchy distribution. Show that Mn = max{XI , X2 , . . . , Xn } is such that ]l' Mn/n converges in distribution, the limiting distribution function being given by H (x) = e- I /x if x � O.

15. Let XI , X2 , . . . be independent and identically distributed random variables whose common characteristic function ,p satisfies ,p' (0) = ilL . Show that n -I E'j =1 Xj � IL. 16. Total variation distance. The total variation distance dTV (X, Y) between two random variables X and Y is defined by

dTV (X, Y) = sup IE(u (X» - E(u(Y» I u : llu lloo=1

where the supremum is over all (measurable) functions u : lR -+ lR such that l I u l ioo = supx l u (x ) 1 satisfies l I u l loo = l . (a) If X and Y are discrete with respective masses In and gn at the points Xn , show that

dTV (X, Y) = L l in - gn l = 2 sup IIP'(X E A) - 1P'(Y E A) I · n A�1R

(b) If X and Y are continuous with respective density functions I and g, show that

dTV(X, Y) = 100

I I (x) - g(x) 1 dx = 2 sup I IP'(X E A) - 1P'(Y E A) I . -00 A�1R

(c) Show that dTV (Xn , X) -+ 0 implies that Xn -+ X in distribution, but that the converse is false. (d) Maximal coupling. Show that IP'(X '" Y) � !dTV(X, Y), and that there exists a pair X', yl

having the same marginals for which equality holds . (e) If Xi , Yj are independent random variables, show that

17. Let g : lR -+ lR be bounded and continuous. Show that

00 (n).,)k L g (k/n)-k ,-e-nJ... -+ g ().,) as n -+ 00. k=O .

18. Let X n and Y m be independent random variables having the Poisson distribution with parameters n and m, respectively. Show that

(Xn - n) - (Ym - m) D ...:.....--'--�;::::====;:;:=---'- -+ N(O, 1 ) as m , n -+ 00.

,JXn + Ym

19. (a) Suppose that X I , X 2 , . . . is a sequence of random variables, each having a normal distribution, and such that Xn � X. Show that X has a normal distribution, possibly degenerate.

(b) For each n � 1 , let (Xn , Yn) be a pair of random variables having a bivariate normal distribution. Suppose that Xn � X and Yn � Y, and show that the pair (X, Y) has a bivariate normal distribution.

93

Page 103: One Thousand Exercises in Probability

[7.11.20]-[7.11.26] Exercises Convergence of random variables

20. Let X I , X2 , . . . be random variables satisfying var(Xn) < c for all n and some constant c. Show that the sequence obeys the weak law, in the sense that n- l �1 (Xj - lEXj ) converges in probability to 0, if the correlation coefficients satisfy either of the following: (i) p (Xj , Xj ) ::0 0 for all i =1= j , (ii) p (Xj , Xj ) � 0 as I i - j l � 00. 2t. Let X I , X2 , . . . be independent random variables with common density function { 0

f(x) = c x2 10g Ix l

if Ix l ::0 2, if Ix l > 2,

where c is a constant. Show that the Xj have no mean, but n- l �?=l Xj � 0 as n � 00. Show that convergence does not take place almost surely.

22. Let Xn be the Euclidean distance between two points chosen independently and unifonnly from the n-dimensional unit cube. Show that lE(Xn)/Jn � 1/J6 as n � 00. 23. Let Xl , X2 , . . . be independent random variables having the unifonn distribution on [- 1 , 1 ] . Show that

24. Let X I , X2 , . . . be independent random variables, each Xk having mass function given by

1 IP'(Xk = k) = IP'(Xk = -k) = -2 ' 2k IP'(Xk = 1 ) = IP'(Xk = - 1 ) = � ( 1 -

k; ) if k > 1 .

Show that Un = �1 Xj satisfies Un/ In � N(O, 1 ) but var(Un/ In) � 2 as n � 00. 25. Let Xl , X2 , . . . be random variables, and let NI , N2 , . . . be random variables taking values in the positive integers such that Nk � 00 as k � 00. Show that:

(i) if Xn � X and the Xn are independent of the Nb then XNk � X as k � 00,

(ii) if Xn � X then XNk � X as k � 00. 26. Stirling's formula.

(a) Let a (k, n) = nk / (k - I ) ! for i ::0 k ::0 n + 1 . Use the fact that I - x ::0 e-x if x � 0 to show that

a (n - k, n) < e-k2 j(2n) a (n + 1 , n) - if k � O.

(b) Let Xl , X2 , . . . be independent Poisson variables with parameter 1 , and let Sn = Xl + . . . + Xn . Define the function g : lR. � lR. by

{ -x g (x) =

0

if O � x � -M, otherwise,

where M is large and positive. Show that, for large n,

lE (g { SnJn n }) = :; {a (n + 1 , n) - a (n - k, n) }

94

Page 104: One Thousand Exercises in Probability

Problems Exercises [7.11.27]-[7.11.33]

where k = lMnl/2J . Now use the central limit theorem and (a) above, to deduce Stirling's formula:

as n � 00.

27. A bag contains red and green balls. A ball i s drawn from the bag, its colour noted, and then it is returned to the bag together with a new ball of the same colour. Initially the bag contained one ball of each colour. If Rn denotes the number of red balls in the bag after n additions, show that Sn = Rnl (n + 2) is a martingale. Deduce that the ratio of red to green balls converges almost surely to some limit as n � 00. 28. Anscombe's theorem. Let {Xi : i :::: I } be independent identically distributed random variables with zero mean and finite positive variance a2 , and let Sn = �'i Xi . Suppose that the integer-valued

random process M (t) satisfies t -1 M (t) � e as t � 00, where e is a positive constant. Show that

SM(t) .s N(O, 1 ) and a$t

SM(t) .s N(O, 1 ) aJM(t)

You should not assume that the process M is independent of the Xi .

as t � 00.

29. Kolmogorov's inequality. Let Xl , X2 , ' " be independent random variables with zero means, and Sn = Xl + X2 + . . · + Xn . Let Mn = maxl:;;k:;;n I Sk l and show that lF.(S;IAk ) > c2lP'(Ak) where Ak = {Mk-l :::: c < Mk } and c > O. Deduce Kolmogorov's inequality:

c > O.

30. Let Xl , X2 , . . . be independent random variables with zero means, and let Sn = Xl + X2 + . . . + Xn . Using Kolmogorov's inequality or the martingale convergence theorem, show that: (i) ��l Xi converges almost surely if ��l lF.(Xt) < 00,

(ii) if there exists an increasing real sequence (bn ) such that bn � 00, and satisfying the inequality ��l lF.(Xt )lbt < 00, then b;; l ��l Xk � 0 as n � 00.

31. Estimating the transition matrix. The Markov chain XO , Xl , . . . , Xn has initial distribution fi = lP'(Xo = i ) and transition matrix P. The log-likelihood function )., (P) is defined as )., (P) = log(fxo PXO,Xj PXj ,X2 ' " PXn_ j ,Xn ) ' Show that: (a) ).,(P) = log fxo + �i,j Nij log Pij where Nij is the number of transitions from i to j ,

(b) viewed as a function of the Pij ' )., (P) i s maximal when Pij = Pij where Pij = Nij I�k Nib (c) if X is irreducible and ergodic then Pij � Pij as n � 00.

32. Ergodic theorem in discrete time. Let X be an irreducible discrete-time Markov chain, and let /Li be the mean recurrence time of state i . Let Vi (n) = ��;;;J I{Xr=iJ be the number of visits to i up to n - 1 , and let f be any bounded function on S. Show that: (a) n-l Vi (n) � /Lil as n � 00, (b) if /Li < 00 for all i , then

1 n- l

- L f(Xr ) � L f(i )//Li as n � 00. n ,=0 ieS

33. Ergodic theorem in continuous time. Let X be an irreducible persistent continuous-time Markov chain with generator G and finite mean recurrence times /Lj .

95

Page 105: One Thousand Exercises in Probability

[7.11.34]-[7.11.37] Exercises

h th 1 lot d a.s. 1

(a) S ow at - I{X (s)=j } s -+ -- as t --+ 00; t 0 /-Ljgj

Convergence of random variables

(b) deduce that the stationary distribution :n: satisfies 11) = 1 / (/-Ljgj ) ; (c) show that, if f i s a bounded function on S,

� r f(X (s)) ds � L 1fi/(i ) as t --+ 00. t Jo i

34. Tail equivalence. Suppose that the sequences {Xn : n � I } and {Yn : n � I } are tail equivalent, which is to say that E�I lP'(Xn =F Yn) < 00. Show that: (a) E�1 Xn and E�1 Yn converge or diverge together, (b) E�1 (Xn - Yn ) converges almost surely,

(c) if there exist a random variable X and a sequence an such that an t oo and a; l E�=1 Xr � X, then

1 n - L Yr � X. an r=1

35. Three series theorem. Let {Xn : n � I } be independent random variables. Show that E�1 Xn converges a.s. if, for some a > 0, the following three series all converge: (a) En lP'( IXn l > a) , (b) En var(Xn II lXn l:::;a } ) , (c) En E(XnII lXn l:::a} ) · [The converse holds also, but is harder to prove.]

36. Let {Xn : n � I} be independent random variables with continuous common distribution function F. We call Xk a record value for the sequence if Xk > Xr for 1 :::: r < k, and we write h for the indicator function of the event that X k is a record value. (a) Show that the random variables h are independent. (b) Show that Rm = Ek=l lr satisfies Rm /log m � 1 as m --+ 00.

37. Random harmonic series. Let {Xn : n � I } be a sequence of independent random variables with lP'(Xn = 1 ) = lP'(Xn = -1 ) = ! . Does the series E�=1 Xr /r converge a.s. as n --+ oo?

96

Page 106: One Thousand Exercises in Probability

8 Random processes

8.2 Exercises. Stationary processes

1. Flip-flop. Let {Xn } be a Markov chain on the state space S = {O, I } with transition matrix

( l - a a ) p = f3 I - f3 '

where a + f3 > O. Find: (a) the correlation p (Xm , Xm+n ) , and its limit as m --+ 00 with n remaining fixed, (b) limn-+oo n- 1 E�=l JP>(Xr = 1 ) . Under what condition i s the process strongly stationary?

2. Random telegraph. Let (N(t) : t � O} be a Poisson process of intensity A, and let To be an independent random variable such that JP>(To = ±1 ) = 1 . Define T(t) = To (- I)N(t) . Show that (T(t) : t � O} is stationary and find: (a) p(T(s ) , T (s + t» , (b) the mean and variance of X(t) = Ici T(s) ds. 3. Korolyuk-Khinchin theorem. An integer-valued counting process (N(t) : t � O} with N(O) = o is called crudely stationary if Pk (S , t) = JP>(N(s + t) - N(s) = k) depends only on the length t and not on the location s . It is called simple if, almost surely, it has jump discontinuities of size 1 only. Show that, for a simple crudely stationary process N, limt,l.O t- 1JP>(N(t) > 0) = lE(N(1» .

8.3 Exercises. Renewal processes

1. Let (fn : n � 1) be a probability distribution on the positive integers, and define a sequence (un : n � 0) by uo = 1 and Un = E�=l frun-r , n � 1 . Explain why such a sequence is called a renewal sequence, and show that u is a renewal sequence if and only if there exists a Markov chain U and a state s such that Un = JP>(Un = s I Uo = s) . 2. Let {Xj : i � I } be the inter-event times of a discrete renewal process on the integers. Show that the excess lifetime Bn constitutes a Markov chain. Write down the transition probabilities of the sequence {Bn } when reversed in eqUilibrium. Compare these with the transition probabilities of the chain U of your solution to Exercise ( 1 ). 3. Let (un : n � 1 ) satisfy Uo = 1 and Un = E�=l frun-r for n � 1 , where (fr : r � 1 ) is a non-negative sequence. Show that: (a) Vn = pnun is a renewal sequence if p > 0 and E�l pn fn = 1 , (b) as n --+ 00, pnun converges to some constant c .

97

Page 107: One Thousand Exercises in Probability

[8.3.4]-[8.4.5] Exercises Random processes

4. Events occur at the times of a discrete-time renewal process N (see Example (5 .2. 15» . Let Un be the probability of an event at time n, with generating function U(s) , and let F(s) be the probability generating function of a typical inter-event time. Show that, if I s I < 1 :

5. Prove Theorem (8.3.5): Poisson processes are the only renewal processes that are Markov chains.

8.4 Exercises. Queues

1. The two tellers in a bank each take an exponentially distributed time to deal with any customer; their parameters are A and /.t respectively. You arrive to find exactly two customers present, each occupying a teller. (a) You take a fancy to a randomly chosen teller, and queue for that teller to be free; no later switching

is permitted. Assuming any necessary independence, what is the probability p that you are the last of the three customers to leave the bank?

(b) If you choose to be served by the quicker teller, find p. (c) Suppose you go to the teller who becomes free first. Find p.

2. Customers arrive at a desk according to a Poisson process of intensity A. There is one clerk, and the service times are independent and exponentially distributed with parameter /.t. At time 0 there is exactly one customer, currently in service. Show that the probability that the next customer arrives before time t and finds the clerk busy is

3. Vehicles pass a crossing at the instants of a Poisson process of intensity A; you need a gap of length at least a in orde.r to cross. Let T be the first time at which you could succeed in crossing to the other side. Show that lE(T) = (eaA - 1 )/A , and find lE(e9T) .

Suppose there are two lanes to cross, carrying independent Poissonian traffic with respective rates A and /.t. Find the expected time to cross in the two cases when: (a) there is an island or refuge between the two lanes, (b) you must cross both in one go. Which is the greater?

4. Customers arrive at the instants of a Poisson process of intensity A, and the single server has exponential service times with parameter /.t. An arriving customer who sees n customers present (including anyone in service) will join the queue with probability (n + 1 )/ (n + 2) , otherwise leaving for ever. Under what condition is there a stationary distribution? Find the mean of the time spent in the queue (not including service time) by a customer who joins it when the queue is in equilibrium. What is the probability that an arrival joins the queue when in equilibrium?

5. Customers enter a shop at the instants of a Poisson process of rate 2. At the door, two represen­tatives separately demonstrate a new corkscrew. This typically occupies the time of a customer and the representative for a period which is exponentially distributed with parameter 1 , independently of arrivals and other demonstrators. If both representatives are busy, customers pass directly into the shop. No customer passes a free representative without being stopped, and all customers leave by another door. If both representatives are free at time 0, show the probability that both are busy at time t is � - �e-2t + �e-5t 5 3 15 ·

98

Page 108: One Thousand Exercises in Probability

Problems Exercises [8.5.1]-[8.7.4]

8.S Exercises. The Wiener process

1. For a Wiener process W with W (0) = 0, show that

lP'(W(s) > 0, Wet) > 0) = ! + 2- sin- l ([. for s < t . 4 2n V i

Calculate lP'(W(s) > 0, Wet) > 0, W(u) > 0) when s < t < u . 2. Let W be a Wiener process. Show that, for s < t < u , the conditional distribution of W(t) given W(s) and W(u) is normal

N ( (U - t)W(s) + (t - s )W(u) , (u - t) (t - S) ) .

u - s u - s

Deduce that the conditional correlation between W(t) and W(u) , given W(s) and W(v), where s < t < u < v, is

(v - u) (t - s) (v - t) (u - s)

3. For what values of a and b is aWl + bW2 a standard Wiener process, where WI and W2 are independent standard Wiener processes?

4. Show that a Wiener process W with variance parameter a2 has finite quadratic variation, which is to say that

n- l L {W ((j + l )t/n) - W(jt/n) }2 � a2t as n --+ 00. j=O

5. Let W be a Wiener process. Which of the following define Wiener processes? (a) -W(t) , (b) ,JtW(l) , (c) W(2t) - W(t) .

8.7 Problems

1. Let {Zn } be a sequence of un correlated real-valued variables with zero means and unit variances, and define the 'moving average' r

Yn = L aiZn-i , i=O

for constants aO , aI , . . . , ar . Show that Y is stationary and find its autocovariance function.

2. Let {Zn } be a sequence of un correlated real-valued variables with zero means and unit variances. Suppose that {Yn } is an 'autoregressive' stationary sequence in that it satisfies Yn = aYn- 1 + Zn , -00 < n < 00, for some real a satisfying l a l < 1 . Show that Y has autocovariance function c(m) = a 1m l / (l - a2) . 3 . Let {Xn } be a sequence of independent identically distributed Bernoulli variables, each taking values 0 and 1 with probabilities 1 - p and p respectively. Find the mass function of the renewal process N(t) with interarrival times {Xn } . 4. Customers arrive in a shop in the manner of a Poisson process with parameter A. There are infinitely many servers, and each service time is exponentially distributed with parameter JL. Show that the number Q(t) of waiting customers at time t constitutes a birth-death process. Find its stationary distribution.

99

Page 109: One Thousand Exercises in Probability

[8.7.5]-[8.7.7] Exercises Random processes

5. Let X(t) = Y cos(Ot) + Z sin(Ot) where Y and Z are independent N(O, 1) random variables, and let X (t) = R cos(Ot + qt) where R and qt are independent. Find distributions for R and qt such that the processes X and X have the same fdds.

6. Bartlett's theorem. Customers arrive at the entrance to a queueing system at the instants of an inhomogeneous Poisson process with rate function ).. (t) . Their subsequent service histories are independent of each other, and a customer arriving at time s is in state A at time s + t with prob­ability p(s , t) . Show that the number of customers in state A at time t is Poisson with parameter too ).. (u)p(u , t - u) duo 7. In a Prague teashop (U Mysaka), long since bankrupt, customers queue at the entrance for a blank bill. In the shop there are separate counters for coffee, sweetcakes, pretzels, milk, drinks, and ice cream, and queues form at each of these. At each service point the customers ' bills are marked appropriately. There is a restricted number N of seats, and departing customers have to queue in order to pay their bills. If interarrival times and service times are exponentially distributed and the process is in equilibrium, find how much longer a greedy customer must wait if he insists on sitting down. Answers on a postcard to the authors, please.

100

Page 110: One Thousand Exercises in Probability

9 Stationary processes

9.1 Exercises. Introduction

1. Let . . . , Z- 1 ' Zo , Z1 , Z2 , . . . be independent real random variables with means 0 and variances 1, and let a, f3 E lR. Show that there exists a (weakly) stationary sequence {Wn } satisfying Wn = aWn-1 + f3Wn-2 + Zn , n = . . . , - 1 , 0, 1 , . . . , if the (possibly complex) zeros of the quadratic equation z2 - az - f3 = 0 are smaller than 1 in absolute value.

2. Let U be uniformly distributed on [0, 1] with binary expansion U = L:�1 Xiri . Show that the sequence

00 Vn = E Xi+n2-i ,

i= 1 n :::: 0,

is strongly stationary, and calculate its autocovariance function.

3. Let {Xn : n = . . . , - 1 , 0, 1 , . . . } be a stationary real sequence with mean 0 and autocovariance function c(m) . (i) Show that the infinite series L:�o an Xn converges almost surely, and in mean square, whenever

L:�o lan l < 00. (ii) Let

00 Yn = E akXn-k ,

k=O n = . . . , - 1 , 0, 1 , . . .

where L:�o lak l < 00. Find an expression for the autocovariance function cy of Y, and show that

00

E I cy (m) 1 < 00. m=-oo

4. Let X = {Xn : n 2: O} be a discrete-time Markov chain with countable state space S and stationary distribution 11: , and suppose that Xo has distribution 11: . Show that the sequence {f (Xn ) : n :::: O} is strongly stationary for any function f : S --+ lR.

9.2 Exercises. Linear prediction

1. Let X be a (weakly) stationary sequence with zero mean and autocovariance function c(m) . (i) Find the best linear predictor Xn+1 of Xn+1 given Xn .

(ii) Find the best linear predictor Xn+ 1 of Xn+1 given Xn and Xn- 1 .

101

Page 111: One Thousand Exercises in Probability

[9.2.2]-[9.4.2] Exercises Stationary processes

(iii) Find an expression for D = lE{(Xn+l - Xn+l )2} - lE{(Xn+l - Xn+l )2} , and evaluate this expression when: (a) Xn = cos(nU) where U is uniform on [-n, n) , (b) X i s an autoregressive scheme with c(k) = a lk l where la l < 1 .

2. Suppose l a l < 1 . Does there exist a (weakly) stationary sequence {Xn : -00 < n < oo} with zero means and autocovariance function

c(k) = { � 1 + a2

o

if k = 0,

if I k l = 1 ,

if I k l > 1 .

Assuming that such a sequence exists, find the best linear predictor Xn of Xn given Xn-l , Xn-2 , . . . , and show that the mean squared error of prediction is (1 + a2)- 1 . Verify that {Xn } is (weakly) stationary.

9.3 Exercises. Autocovariances and spectra

1. Let xn = A cos(n.i..) + B sin(n.i..) where A and B are uncorre1ated random variables with zero means and unit variances. Show that X is stationary with a spectrum containing exactly one point.

2. Let U be uniformly distributed on (-n, n) , and let V be independent of U with distribution func­tion F. Show that Xn = ei (U-Vn) defines a stationary (complex) sequence with spectral distribution function F. 3. Find the autocorrelation function of the stationary process (X (t) : -00 < t < oo} whose spectral density function is: (i) N(O, l ) , (ii) f(x) = i e- lx i , -00 < x < 00.

4. Let Xl , X 2 , . . . be a real-valued stationary sequence with zero means and autocovariance function c(m) . Show that

.

var (� tXj) = C(O) j ( S

i�(n.i..j� )

2 dF(.i..) n j=l (-1r,1rj n sm(.i..

where F is the spectral distribution function. Deduce that n- 1 'L-'j=l Xj � 0 if and only if F(O) - F(O-) = 0, and show that

1 n- l c(O) {F(O) - F(O-)} = lim - '" c(j) . n-+oo n L j=O

9.4 Exercises. Stochastic integration and the spectral representation

1. Let S be the spectral process of a stationary process X with zero mean and unit variance. Show that the increments of S have zero means.

2. Moving average representation. Let X be a discrete-time stationary process having zero means, continuous strictly positive spectral density function f, and with spectral process S. Let

1 einJ... Yn = dS (.i..) . (-1r,1rj J2nf(.i..)

102

Page 112: One Thousand Exercises in Probability

Gaussian processes Exercises [9.4.3]-[9.6.2]

Show that . . . , Y - 1 , Yo , Yl , . . . is a sequence of uncorrelated random variables with zero means and unit variances.

Show that Xn may be represented as a moving average Xn = L:�-oo aj Yn-j where the aj are constants satisfying

00 J2nf(A) = L aj e-ijJ,.

j=-oo for A E (-n, n ] .

3 . Gaussian process. Let X be a discrete-time stationary sequence with zero mean and unit vari­ance, and whose fdds are of the multivariate-normal type. Show that the spectral process of X has independent increments having normal distributions.

9.S Exercises. The ergodic theorem

1. Let T = { I , 2, . . . } and let 1 be the set of invariant events of (liT , ST) . Show that 1 is a 0" -field.

2. Assume that Xl , X2 , . . . is a stationary sequence with autocovariance function c(m) . Show that

(I n ) 2 n j- l (0)

var - L Xi = 2 L L c(i ) - � . n i= l n j= l i=O n

Assuming that r l L:{�� c(i) --+ 0"2 as j --+ 00, show that

as n --+ 00 .

3. Let Xl , X 2 , . . . be independent identically distributed random variables with zero mean and unit variance. Let

00 Yn = L aiXn+i for n ::: 1

i=O where the ai are constants satisfying L:i aT < 00. Use the martingale convergence theorem to show that the above summation converges almost surely and in mean square. Prove that n- 1 L:i=l Yi --+ 0 a.s. and in mean, as n --+ 00.

9.6 Exercises. Gaussian processes

1. Show that the function c(s , t) = min is , t} is positive definite. That is, show that

n L C(tk , tj )Z/Zk > 0

j,k=l

for all 0 :::: tl < t2 < . . . < tn and all complex numbers Z l , Z2 , . . . , Zn at least one of which is non-zero.

2. Let Xl , X2 , . . . be a stationary Gaussian sequence with zero means and unit variances which satisfies the Markov property. Find the spectral density function of the sequence in terms of the constant p = COV(Xl , X2) .

103

Page 113: One Thousand Exercises in Probability

[9.6.3]-[9.7.7] Exercises Stationary processes

3. Show that a Gaussian process is strongly stationary if and only if it is weakly stationary.

4. Let X be a stationary Gaussian process with zero mean, unit variance, and autocovariance function c(t) . Find the autocovariance functions of the processes X2 = {X(t)2 : -00 < t < oo} and X3 = {X(t)3 : -00 < t < oo}.

9.7 Problems

1. Let . . . , X- I , Xo , X I , . . . be uncorrelated random variables with zero means and unit variances, and define

00 Yn = Xn + ex L fJi- 1 Xn-i

i= 1 for - 00 < n < 00,

where ex and fJ are constants satisfying I fJ l < 1 , I fJ - ex l < 1 . Find the best linear predictor of Yn+1 given the entire past Yn , Yn- I , . . . .

2. Let {Yk : -00 < k < oo} be a stationary sequence with variance o} , and let

r Xn = L akYn-k .

k=O -00 < n < 00,

where QQ , aI , . . . , ar are constants. Show that X has spectral density function

where fy is the spectral density function of Y, 0-1 = var(X I ) , and G a (z) = E�=o akzk . Calculate this spectral density explicitly in the case of 'exponential smoothing' , when r = 00,

ak = ILk ( 1 - IL), and 0 < IL < 1 .

3. Suppose that Yn+ 1 = exYn + fJYn- 1 is the best linear predictor of Yn+ 1 given the entire past Yn , Yn- I , . . . of the stationary sequence {Yk : -00 < k < oo}. Find the spectral density function of the sequence.

4. Recurrent events (5.2.15). Meteorites fall from the sky at integer times TI , T2 , . . . where Tn = XI + X2 + . . . + Xn . We assume that the Xi are independent, X2 , X3 , . . . are identically distributed, and the distribution of X I is such that the probability that a meteorite falls at time n is constant for all n . Let Yn be the indicator function of the event that a meteorite falls at time n . Show that {Yn l is stationary and find its spectral density function in terms of the characteristic function of X2. 5. Let X = {Xn : n 2:: I } be given by Xn = cos(n U) where U is uniformly distributed on [-Jr, Jr] . Show that X is stationary but not strongly stationary. Find the autocorrelation function of X and its spectral density function.

6. (a) Let N be a Poisson process with intensity A, and let ex > O. Define X(t) = N(t + ex) - N(t) for t 2:: O. Show that X is strongly stationary, and find its spectral density function.

(b) Let W be a Wiener process and define X = {X(t) : t 2:: I } by X(t) = W(t) - Wet - 1 ) . Show that X i s strongly stationary and find its autocovariance function. Find the spectral density function of X.

7. Let Z I , Z2 , . . . be uncorrelated variables, each with zero mean and unit variance. (a) Define the moving average process X by Xn = Zn + exZn- 1 where ex is a constant. Find the

spectral density function of X.

104

Page 114: One Thousand Exercises in Probability

Problems Exercises [9.7.8]-[9.7.16]

(b) More generally, let Yn = 1:/=0 OIi Zn-i , where 010 = I and 011 , . . . , 00r are constants. Find the spectral density function of Y.

8. Show that the complex-valued stationary process X = {X(t) : - 00 < t < oo} has a spectral density function which is bounded and uniformly continuous whenever its autocorrelation function p is continuous and satisfies Jo

oo I p (t) 1 dt < 00.

9. Let X = {Xn : n :::: I} be stationary with constant mean IL = JE(Xn ) for all n , and such that cov(Xo , Xn) � 0 as n � 00. Show that n- I 1:J=1 Xj � IL.

10. Deduce the strong law of large numbers from an appropriate ergodic theorem.

11. Let Q be a stationary measure on OR T , 93T) where T = { I , 2, . . . } . Show that Q is ergodic if and only if

1 n - L Yi � JE(Y) n i=1

a.s. and in mean

for all Y : ll�T � llHor which JE(Y) exists, where Yi : JRT � JR is given by Yi (X) = y(.i- I (x)) . As usual, • is the natural shift operator on JR T • 12. The stationary measure Q on (JRT , 93T) is called strongly mixing if Q(A n .-n B) � Q(A)Q(B) as n � 00, for all A, B E 93T ; as usual, T = { I , 2 , . . . } and . is the shift operator on JRT . Show that every strongly mixing measure is ergodic.

13. Ergodic theorem. Let (0 , :F, lP) be a probability space, and let T : 0 � 0 be measurable and measure preserving (i.e., lP(T-I A) = lP(A) for all A E :F). Let X : 0 � JR be a random variable, and let Xi be given by Xi (W) = X(Ti- l (w) ) . Show that

1 n - L Xi � JE(X 1 1) n i=1

a.s. and in mean

where 1 is the (J' -field of invariant events of T . If T is ergodic (in that lP(A) equals 0 or 1 whenever A is invariant), prove that JE(X 1 1) = JE(X)

almost surely.

14. Consider the probability space (0 , :F, lP) where 0 = [0, 1 ) , :Fis the set of Borel subsets, and lP is Lebesgue measure. Show that the shift T : 0 � 0 defined by T(x) = 2x (mod 1 ) is measurable, measure preserving, and ergodic (in that lP(A) equals 0 or 1 if A = T-I A).

Let X : 0 � JR be the random variable given by the identity mapping X (w) = w. Show that the proportion of 1 's, in the expansion of X to base 2, equals ! almost surely. This is sometimes called 'Borel's normal number theorem' .

15. Let g : JR � JR be periodic with period 1 , and uniformly continuous and integrable over [0, 1 ] . Define Zn = g (X + ( n - 1 )01) , n :::: 1 , where X i s uniform on [0, 1 ] and 01 i s irrational. Show that, as n � 00,

1 n rl

- L Zj � 10 g (u) du n j=1 0 a.s.

16. Let X = {X (t) : t :::: O} be a non-decreasing random process such that: (a) X (0) = 0, X takes values in the non-negative integers, (b) X has stationary independent increments, (c) the sample paths {X(t , w) : t :::: O} have only jump discontinuities of unit magnitude.

Show that X is a Poisson process.

1 05

Page 115: One Thousand Exercises in Probability

[9.7.17]-[9.7.22] Exercises Stationary processes

17. Let X be a continuous-time process. Show that: (a) if X has stationary increments and m(t) = lE(X (t)) is a continuous function of t, then there exist

a and {J such that m(t) = a + {Jt , (b) if X has stationary independent increments and v(t) = var(X (t) - X (0)) is a continuous function

of t then there exists (12 such that var(X (s + t) - X(s)) = (12t for all s . 18. A Wiener process W i s called standard if W(O) = 0 and W(I) has unit variance. Let W be a standard Wiener process, and let a be a positive constant. Show that: (a) a W (t / a2) is a standard Wiener process, (b) W(t + a) - W (a) is a standard Wiener process, (c) the process V, given by V(t) = tW( 1 /t) for t > 0, V(O) = 0, is a standard Wiener process, (d) the process W ( 1 ) - W (1 - t) is a standard Wiener process on [0, 1 ] .

19. Let W be a standard Wiener process. Show that the stochastic integrals

X (t) = l dW(u) , Y(t) = fot e- (t-u) dW(u) , t 2: 0,

are well defined, and prove that X (t) = W (t) , and that Y has autocovariance function cov(Y (s) , Y (t)) = ! (e- Is-t l _ e-s-t) , s < t . 20. Let W be a standard Wiener process. Find the means of the following processes, and the autoco­variance functions in cases (b) and (c) : (a) X (t) = I W (t) l , (b) Y (t) = eW(t) , (c) Z(t) = fci W(u) du .

Which of these are Gaussian processes? Which of these are Markov processes?

21. Let W be a standard Wiener process. Find the conditional joint density function of $W(t_2)$ and $W(t_3)$ given that $W(t_1) = W(t_4) = 0$, where $t_1 < t_2 < t_3 < t_4$.

Show that the conditional correlation of $W(t_2)$ and $W(t_3)$ is
$$\rho = \sqrt{\frac{(t_4 - t_3)(t_2 - t_1)}{(t_4 - t_2)(t_3 - t_1)}}.$$

22. Empirical distribution function. Let $U_1, U_2, \dots$ be independent random variables with the uniform distribution on $[0, 1]$. Let $I_j(x)$ be the indicator function of the event $\{U_j \le x\}$, and define
$$F_n(x) = \frac{1}{n}\sum_{j=1}^{n} I_j(x), \qquad 0 \le x \le 1.$$
The function $F_n$ is called the 'empirical distribution function' of the $U_j$.
(a) Find the mean and variance of $F_n(x)$, and prove that $\sqrt{n}(F_n(x) - x) \Rightarrow Y(x)$ as $n \to \infty$, where $Y(x)$ is normally distributed.
(b) What is the (multivariate) limit distribution of a collection of random variables of the form $\{\sqrt{n}(F_n(x_i) - x_i) : 1 \le i \le k\}$, where $0 \le x_1 < x_2 < \dots < x_k \le 1$?
(c) Show that the autocovariance function of the asymptotic finite-dimensional distributions of $\sqrt{n}(F_n(x) - x)$, in the limit as $n \to \infty$, is the same as that of the process $Z(t) = W(t) - tW(1)$, $0 \le t \le 1$, where W is a standard Wiener process. The process Z is called a 'Brownian bridge' or 'tied-down Brownian motion'.
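A Monte Carlo sketch of part (c) (illustration only; sample sizes and the time points s, t are our choices): the covariance of the scaled empirical process at two points should approach the Brownian bridge value min(s, t) - st.

```python
import numpy as np

def empirical_process(u: np.ndarray, x: float) -> np.ndarray:
    # one value of sqrt(n)*(F_n(x) - x) per row of u
    n = u.shape[1]
    return np.sqrt(n) * ((u <= x).mean(axis=1) - x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.uniform(size=(10_000, 500))        # 10000 samples of (U_1,...,U_500)
    s, t = 0.3, 0.7
    zs, zt = empirical_process(u, s), empirical_process(u, t)
    print("simulated covariance:", np.cov(zs, zt)[0, 1])
    print("bridge covariance   :", min(s, t) - s * t)    # 0.3 - 0.21 = 0.09
```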


10 Renewals

In the absence of indications to the contrary, $\{X_n : n \ge 1\}$ denotes the sequence of interarrival times of either a renewal process N or a delayed renewal process $N^d$. In either case, $F^d$ and F are the distribution functions of $X_1$ and $X_2$ respectively, though $F^d \ne F$ only if the renewal process is delayed. We write $\mu = E(X_2)$, and shall usually assume that $0 < \mu < \infty$. The functions m and $m^d$ denote the renewal functions of N and $N^d$. We write $T_n = \sum_{i=1}^{n} X_i$, the time of the nth arrival.

10.1 Exercises. The renewal equation

1. Prove that $E(e^{\theta N(t)}) < \infty$ for some strictly positive $\theta$ whenever $E(X_1) > 0$. [Hint: Consider the renewal process with interarrival times $\epsilon I_{\{X_i \ge \epsilon\}}$ for some suitable $\epsilon$.]

2. Let N be a renewal process and let W be the waiting time until the length of some interarrival time has exceeded s. That is, $W = \inf\{t : C(t) > s\}$, where $C(t)$ is the time which has elapsed (at time t) since the last arrival. Show that
$$F_W(x) = \begin{cases} 0 & \text{if } x < s, \\ 1 - F(s) + \int_0^s F_W(x-u)\,dF(u) & \text{if } x \ge s, \end{cases}$$
where F is the distribution function of an interarrival time. If N is a Poisson process with intensity $\lambda$, show that
$$E(e^{\theta W}) = \frac{\lambda - \theta}{\lambda - \theta e^{(\lambda - \theta)s}} \qquad \text{for } \theta < \lambda,$$
and $E(W) = (e^{\lambda s} - 1)/\lambda$. You may find it useful to rewrite the above integral equation in the form of a renewal-type equation.

3. Find an expression for the mass function of $N(t)$ in a renewal process whose interarrival times are: (a) Poisson distributed with parameter $\lambda$, (b) gamma distributed, $\Gamma(\lambda, b)$.

4. Let the times between the events of a renewal process N be uniformly distributed on $(0, 1)$. Find the mean and variance of $N(t)$ for $0 \le t \le 1$.
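A simulation sketch for Exercise 10.1.2 (illustration only; the parameter values and function name are ours): for a Poisson process of intensity lam, the mean waiting time until an interarrival gap exceeds s should be (exp(lam*s) - 1)/lam.

```python
import math
import random

def wait_for_gap(lam: float, s: float) -> float:
    t = 0.0
    while True:
        gap = random.expovariate(lam)
        if gap > s:
            return t + s          # the age first exceeds s at (last arrival) + s
        t += gap

if __name__ == "__main__":
    random.seed(1)
    lam, s, n = 2.0, 1.0, 200_000
    est = sum(wait_for_gap(lam, s) for _ in range(n)) / n
    print("simulated E(W):", est)
    print("formula       :", (math.exp(lam * s) - 1) / lam)   # about 3.19
```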

10.2 Exercises. Limit theorems

1. Planes land at Heathrow airport at the times of a renewal process with interarrival time distribution function F. Each plane contains a random number of people with a given common distribution and finite mean. Assuming as much independence as usual, find an expression for the rate of arrival of passengers over a long time period.

2. Let $Z_1, Z_2, \dots$ be independent identically distributed random variables with mean 0 and finite variance $\sigma^2$, and let $T_n = \sum_{i=1}^{n} Z_i$. Let M be a finite stopping time with respect to the $Z_i$ such that $E(M) < \infty$. Show that $\mathrm{var}(T_M) = E(M)\sigma^2$.


3. Show that $E(T_{N(t)+k}) = \mu(m(t) + k)$ for all $k \ge 1$, but that it is not generally true that $E(T_{N(t)}) = \mu m(t)$.

4. Show that, using the usual notation, the family $\{N(t)/t : 0 \le t < \infty\}$ is uniformly integrable. How might one make use of this observation?

5. Consider a renewal process N having interarrival times with moment generating function M, and let T be a positive random variable which is independent of N. Find lE(sN(T» ) when: (a) T is exponentially distributed with parameter v, (b) N is a Poisson process with intensity J.. , in terms of the moment generating function of T . What

is the distribution of N(T) in this case, if T has the gamma distribution r (v, b)?

10.3 Exercises. Excess life

1. Suppose that the distribution of the excess lifetime E(t) does not depend on t . Show that the renewal process is a Poisson process.

2. Show that the current and excess lifetime processes, C (t) and E(t) , are Markov processes.

3. Suppose that $X_1$ is non-arithmetic with finite mean $\mu$.
(a) Show that $E(t)$ converges in distribution as $t \to \infty$, the limit distribution function being
$$H(x) = \int_0^x \frac{1}{\mu}\,[1 - F(y)]\,dy.$$
(b) Show that the rth moment of this limit distribution is given by
$$\int_0^\infty x^r\,dH(x) = \frac{E(X_1^{r+1})}{\mu(r+1)},$$
assuming that this is finite.
(c) Show that
$$E(E(t)^r) = E\bigl(\{(X_1 - t)^+\}^r\bigr) + \int_0^t h(t-x)\,dm(x)$$
for some suitable function h to be found, and deduce by the key renewal theorem that $E(E(t)^r) \to E(X_1^{r+1})/\{\mu(r+1)\}$ as $t \to \infty$, assuming this limit is finite.

4. Find an expression for the mean value of the excess lifetime E(t) conditional on the event that the current lifetime C(t) equals x .

5. Let $M(t) = N(t) + 1$, and suppose that $X_1$ has finite non-zero variance $\sigma^2$.
(a) Show that $\mathrm{var}(T_{M(t)} - \mu M(t)) = \sigma^2(m(t) + 1)$.
(b) In the non-arithmetic case, show that $\mathrm{var}(M(t))/t \to \sigma^2/\mu^3$ as $t \to \infty$.

10.4 Exercise. Applications

1. Find the distribution of the excess lifetime for a renewal process each of whose interarrival times is the sum of two independent exponentially distributed random variables having respective parameters J.. and JL. Show that the excess lifetime has mean

$$\frac{1}{\mu} + \frac{\mu + \lambda e^{-(\lambda+\mu)t}}{\lambda(\lambda + \mu)}.$$


10.5 Exercises. Renewal-reward processes

1. If $X(t)$ is an irreducible persistent non-null Markov chain, and $u(\cdot)$ is a bounded function on the integers, show that
$$\frac{1}{t}\int_0^t u(X(s))\,ds \xrightarrow{\ \text{a.s.}\ } \sum_{i \in S} \pi_i u(i),$$
where $\pi$ is the stationary distribution of $X(t)$.

2. Let $M(t)$ be an alternating renewal process, with interarrival pairs $\{X_r, Y_r : r \ge 1\}$. Show that
$$\frac{1}{t}\int_0^t I_{\{M(s)\ \text{is even}\}}\,ds \xrightarrow{\ \text{a.s.}\ } \frac{E X_1}{E X_1 + E Y_1} \qquad \text{as } t \to \infty.$$

3. Let $C(s)$ be the current lifetime (or age) of a renewal process $N(t)$ with a typical interarrival time X. Show that
$$\frac{1}{t}\int_0^t C(s)\,ds \xrightarrow{\ \text{a.s.}\ } \frac{E(X^2)}{2E(X)} \qquad \text{as } t \to \infty.$$
Find the corresponding limit for the excess lifetime.

4. Let j and k be distinct states of an irreducible discrete-time Markov chain X with stationary distribution $\pi$. Show that
$$P(T_j < T_k \mid X_0 = k) = \frac{1/\pi_k}{E(T_j \mid X_0 = k) + E(T_k \mid X_0 = j)},$$
where $T_i = \min\{n \ge 1 : X_n = i\}$ is the first passage time to the state i. [Hint: Consider the times of return to k having made an intermediate visit to j.]
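A simulation sketch for Exercise 3 above (illustration only; exponential(1) interarrivals are our choice, giving the limit E(X^2)/(2E(X)) = 1).

```python
import random

def average_age(t_max: float) -> float:
    t, area = 0.0, 0.0
    while t < t_max:
        x = random.expovariate(1.0)
        seg = min(x, t_max - t)        # time spent in this interarrival before t_max
        area += seg * seg / 2.0        # integral of the age over that segment
        t += x
    return area / t_max

if __name__ == "__main__":
    random.seed(3)
    print(average_age(1_000_000.0))    # close to 1.0
```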

10.6 Problems

1. (a) Show that $P(N(t) \to \infty$ as $t \to \infty) = 1$. (b) Show that $m(t) < \infty$ if $\mu \ne 0$. (c) More generally show that, for all $k > 0$, $E(N(t)^k) < \infty$ if $\mu \ne 0$.

2. Let $v(t) = E(N(t)^2)$. Show that
$$v(t) = m(t) + 2\int_0^t m(t-s)\,dm(s).$$
Find $v(t)$ when N is a Poisson process.

3. Suppose that $\sigma^2 = \mathrm{var}(X_1) > 0$. Show that the renewal process N satisfies
$$\frac{N(t) - (t/\mu)}{\sqrt{t\sigma^2/\mu^3}} \Rightarrow N(0, 1) \qquad \text{as } t \to \infty.$$

4. Find the asymptotic distribution of the current life $C(t)$ of N as $t \to \infty$ when $X_1$ is not arithmetic.

5. Let N be a Poisson process with intensity $\lambda$. Show that the total life $D(t)$ at time t has distribution function $P(D(t) \le x) = 1 - (1 + \lambda\min\{t, x\})e^{-\lambda x}$ for $x \ge 0$. Deduce that $E(D(t)) = (2 - e^{-\lambda t})/\lambda$.

6. A Type 1 counter records the arrivals of radioactive particles. Suppose that the arrival process is Poisson with intensity $\lambda$, and that the counter is locked for a dead period of fixed length T after


each detected arrival. Show that the detection process is a renewal process with interarrival time distribution function $\tilde F(x) = 1 - e^{-\lambda(x-T)}$ for $x \ge T$. Find an expression for $P(\tilde N(t) \ge k)$.

7. Particles arrive at a Type 1 counter in the manner of a renewal process N; each detected arrival locks the counter for a dead period of random positive length. Show that
$$P(\tilde X_1 \le x) = \int_0^x [1 - F(x - y)]\,F_L(y)\,dm(y),$$
where $F_L$ is the distribution function of a typical dead period.

8. (a) Show that $m(t) = \frac{1}{2}\lambda t - \frac{1}{4}(1 - e^{-2\lambda t})$ if the interarrival times have the gamma distribution $\Gamma(\lambda, 2)$.
(b) Radioactive particles arrive like a Poisson process, intensity $\lambda$, at a counter. The counter fails to register the nth arrival whenever n is odd, but suffers no dead periods. Find the renewal function $\tilde m$ of the detection process.

9. Show that Poisson processes are the only renewal processes with non-arithmetic interarrival times having the property that the excess lifetime E(t) and the current lifetime C(t) are independent for each choice of t . 10. Let Nl be a Poisson process, and let N2 be a renewal process which i s independent of Nl with non-arithmetic interarrival times having finite mean. Show that N(t) = Nl (t) + N2 (t) is a renewal process if and only if N2 is a Poisson process.

11. Let N be a renewal process, and suppose that F is non-arithmetic and that $\sigma^2 = \mathrm{var}(X_1) < \infty$. Use the properties of the moment generating function $F^*(-\theta)$ of $X_1$ to deduce the formal expansion
$$m^*(\theta) = \frac{1}{\mu\theta} + \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1) \qquad \text{as } \theta \downarrow 0.$$
Invert this Laplace-Stieltjes transform formally to obtain
$$m(t) = \frac{t}{\mu} + \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1) \qquad \text{as } t \to \infty.$$
Prove this rigorously by showing that
$$m(t) = \frac{t}{\mu} - F_E(t) + \int_0^t [1 - F_E(t-x)]\,dm(x),$$
where $F_E$ is the asymptotic distribution function of the excess lifetime (see Exercise (10.3.3)), and applying the key renewal theorem. Compare the result with the renewal theorems.

12. Show that the renewal function $m^d$ of a delayed renewal process satisfies
$$m^d(t) = F^d(t) + \int_0^t m(t - x)\,dF^d(x),$$
where m is the renewal function of the renewal process with interarrival times $X_2, X_3, \dots$.


13. Let $m(t)$ be the mean number of living individuals at time t in an age-dependent branching process with exponential lifetimes, parameter $\lambda$, and mean family size $\nu$ (> 1). Prove that $m(t) = I e^{(\nu-1)\lambda t}$, where I is the number of initial members.

14. Alternating renewal process. The interarrival times of this process are $Z_0, Y_1, Z_1, Y_2, \dots$, where the $Y_i$ and $Z_i$ are independent with respective common moment generating functions $M_Y$ and $M_Z$. Let $p(t)$ be the probability that the epoch t of time lies in an interval of type Z. Show that the Laplace-Stieltjes transform $p^*$ of p satisfies
$$p^*(\theta) = \frac{1 - M_Z(-\theta)}{1 - M_Y(-\theta)M_Z(-\theta)}.$$

15. Type 2 counters. Particles are detected by a Type 2 counter of the following sort. The incoming particles constitute a Poisson process with intensity A. The jth particle locks the counter for a length Yj of time, and annuls any after-effect of its predecessors. Suppose that Y 1 , Y2 , . . . are independent of each other and of the Poisson process, each having distribution function G . The counter is unlocked at time o.

Let L be the (maximal) length of the first interval of time during which the counter is locked. Show that H(t) = lP'(L > t) satisfies

$$H(t) = e^{-\lambda t}[1 - G(t)] + \int_0^t H(t-x)[1 - G(x)]\,\lambda e^{-\lambda x}\,dx.$$

Solve for H in terms of G, and evaluate the ensuing expression in the case $G(x) = 1 - e^{-\mu x}$ where $\mu > 0$.

16. Thinning. Consider a renewal process N, and suppose that each arrival is 'overlooked' with probability q, independently of all other arrivals. Let $M(t)$ be the number of arrivals which are detected up to time $t/p$, where $p = 1 - q$.

(a) Show that M is a renewal process whose interarrival time distribution function $F_p$ is given by $F_p(x) = \sum_{r=1}^{\infty} p q^{r-1} F_r(x/p)$, where $F_r$ is the distribution function of the time of the rth arrival in the original process N.

(b) Find the characteristic function of $F_p$ in terms of that of F, and use the continuity theorem to show that, as $p \downarrow 0$, $F_p(s) \to 1 - e^{-s/\mu}$ for $s > 0$, so long as the interarrival times in the original process have finite mean $\mu$. Interpret!

(c) Suppose that $p < 1$, and M and N are processes with the same fdds. Show that N is a Poisson process.

17. (a) A PC keyboard has 100 different keys and a monkey is tapping them (uniformly) at random. Assuming no power failure, use the elementary renewal theorem to find the expected number of keys tapped until the first appearance of the sequence of fourteen characters 'W. Shakespeare' . Answer the same question for the sequence 'omo' .

(b) A coin comes up heads with probability p on each toss. Find the mean number of tosses until the first appearances of the sequences (i) HHH, and (ii) HTH.
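A simulation sketch for Problem 17(b) with a fair coin (illustration only; the trial count and seed are ours): the renewal-theoretic argument gives mean waiting times 14 for HHH and 10 for HTH when p = 1/2.

```python
import random

def tosses_until(pattern: str, p: float = 0.5) -> int:
    seq = ""
    while not seq.endswith(pattern):
        seq += "H" if random.random() < p else "T"
    return len(seq)

if __name__ == "__main__":
    random.seed(5)
    n = 100_000
    for pat in ("HHH", "HTH"):
        print(pat, sum(tosses_until(pat) for _ in range(n)) / n)  # ~14 and ~10
```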

18. Let N be a stationary renewal process. Let s be a fixed positive real number, and define X(t) = N(s + t) - N(t) for t ::: O. Show that X is a strongly stationary process.

19. Bears arrive in a village at the instants of a renewal process; they are captured and confined at a cost of $c per unit time per bear. When a given number B bears have been captured, an expedition (costing $d) is organized to remove and release them a long way away. What is the long-run average cost of this policy?


11 Queues

11.2 Exercises. M/M/1

1. Consider a random walk on the non-negative integers with a reflecting barrier at 0, and which moves rightwards or leftwards with respective probabilities $\rho/(1+\rho)$ and $1/(1+\rho)$; when at 0, the particle moves to 1 at the next step. Show that the walk has a stationary distribution if and only if $\rho < 1$, and in this case the unique such distribution $\pi$ is given by $\pi_0 = \frac{1}{2}(1-\rho)$, $\pi_n = \frac{1}{2}(1-\rho^2)\rho^{n-1}$ for $n \ge 1$.

2. Suppose now that the random walker of Exercise (1) delays its steps in the following way. When at the point n, it waits a random length of time having the exponential distribution with parameter $\theta_n$ before moving to its next position; different 'holding times' are independent of each other and of further information concerning the steps of the walk. Show that, subject to reasonable assumptions on the $\theta_n$, the ensuing continuous-time process settles into an equilibrium distribution $\nu$ given by $\nu_n = C\pi_n/\theta_n$ for some appropriate constant C.

By applying this result to the case when $\theta_0 = \lambda$, $\theta_n = \lambda + \mu$ for $n \ge 1$, deduce that the equilibrium distribution of the M($\lambda$)/M($\mu$)/1 queue is $\nu_n = (1-\rho)\rho^n$, $n \ge 0$, where $\rho = \lambda/\mu < 1$.

3. Waiting time. Consider an M($\lambda$)/M($\mu$)/1 queue with $\rho = \lambda/\mu$ satisfying $\rho < 1$, and suppose that the number $Q(0)$ of people in the queue at time 0 has the stationary distribution $\pi_n = (1-\rho)\rho^n$, $n \ge 0$. Let W be the time spent by a typical new arrival before he begins his service. Show that the distribution of W is given by $P(W \le x) = 1 - \rho e^{-x(\mu-\lambda)}$ for $x \ge 0$, and note that $P(W = 0) = 1 - \rho$.

4. A box contains i red balls and j lemon balls, and they are drawn at random without replacement. Each time a red (respectively lemon) ball is drawn, a particle doing a walk on $\{0, 1, 2, \dots\}$ moves one step to the right (respectively left); the origin is a retaining barrier, so that leftwards steps from the origin are suppressed. Let $\pi(n; i, j)$ be the probability that the particle ends at position n, having started at the origin. Write down a set of difference equations for the $\pi(n; i, j)$, and deduce that
$$\pi(n; i, j) = A(n; i, j) - A(n+1; i, j) \qquad \text{for } i \le j + n,$$
where $A(n; i, j) = \binom{i+j}{j+n}\big/\binom{i+j}{j}$.

5. Let Q be an M($\lambda$)/M($\mu$)/1 queue with $Q(0) = 0$. Show that $p_n(t) = P(Q(t) = n)$ satisfies
$$p_n(t) = \sum_{i, j \ge 0} \pi(n; i, j)\,\frac{(\lambda t)^i e^{-\lambda t}}{i!}\,\frac{(\mu t)^j e^{-\mu t}}{j!},$$
where the $\pi(n; i, j)$ are given in the previous exercise.

6. Let $Q(t)$ be the length of an M($\lambda$)/M($\mu$)/1 queue at time t, and let $Z = \{Z_n\}$ be the jump chain of Q. Explain how the stationary distribution of Q may be derived from that of Z, and vice versa.
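A simulation sketch for the M(lambda)/M(mu)/1 equilibrium of Exercises 2-3 above (illustration only; rates, horizon and seed are our choices): the long-run fraction of time in state n should be close to (1 - rho) * rho**n.

```python
import random

def mm1_time_in_states(lam: float, mu: float, t_max: float, n_states: int):
    t, q = 0.0, 0
    time_in = [0.0] * n_states
    while t < t_max:
        rate = lam + (mu if q > 0 else 0.0)
        dt = random.expovariate(rate)
        if q < n_states:
            time_in[q] += dt
        t += dt
        if random.random() < lam / rate:
            q += 1                     # arrival
        else:
            q -= 1                     # departure (only possible when q > 0)
    return [x / t for x in time_in]

if __name__ == "__main__":
    random.seed(6)
    lam, mu = 1.0, 2.0                 # rho = 0.5
    est = mm1_time_in_states(lam, mu, 500_000.0, 5)
    rho = lam / mu
    for n, p in enumerate(est):
        print(n, round(p, 4), round((1 - rho) * rho**n, 4))
```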


7. Tandem queues. Two queues have one server each, and all service times are independent and exponentially distributed, with parameter $\mu_i$ for queue i. Customers arrive at the first queue at the instants of a Poisson process of rate $\lambda$ $(< \min\{\mu_1, \mu_2\})$, and on completing service immediately enter the second queue. The queues are in equilibrium. Show that:

(a) the output of the first queue is a Poisson process with intensity $\lambda$, and that the departures before time t are independent of the length of the queue at time t,

(b) the waiting times of a given customer in the two queues are not independent.

11.3 Exercises. M/G/1

1. Consider M($\lambda$)/D(d)/1 where $\rho = \lambda d < 1$. Show that the mean queue length at moments of departure in equilibrium is $\frac{1}{2}\rho(2-\rho)/(1-\rho)$.

2. Consider M($\lambda$)/M($\mu$)/1, and show that the moment generating function of a typical busy period is given by
$$M_B(s) = \frac{(\lambda + \mu - s) - \sqrt{(\lambda + \mu - s)^2 - 4\lambda\mu}}{2\lambda}$$
for all sufficiently small but positive values of s.

3. Show that, for an M/G/1 queue, the sequence of times at which the server passes from being busy to being free constitutes a renewal process.

11.4 Exercises. G/M/1

1. Consider G/M($\mu$)/1, and let $\alpha_j = E((\mu X)^j e^{-\mu X}/j!)$ where X is a typical interarrival time. Suppose the traffic intensity $\rho$ is less than 1. Show that the equilibrium distribution $\pi$ of the imbedded chain at moments of arrivals satisfies
$$\pi_n = \sum_{i=0}^{\infty} \alpha_i \pi_{n+i-1} \qquad \text{for } n \ge 1.$$
Look for a solution of the form $\pi_n = \theta^n$ for some $\theta$, and deduce that the unique stationary distribution is given by $\pi_j = (1-\eta)\eta^j$ for $j \ge 0$, where $\eta$ is the smallest positive root of the equation $s = M_X(\mu(s-1))$.

2. Consider a G/M($\mu$)/1 queue in equilibrium. Let $\eta$ be the smallest positive root of the equation $x = M_X(\mu(x-1))$, where $M_X$ is the moment generating function of an interarrival time. Show that the mean number of customers ahead of a new arrival is $\eta(1-\eta)^{-1}$, and the mean waiting time is $\eta\{\mu(1-\eta)\}^{-1}$.

3. Consider D(1)/M($\mu$)/1 where $\mu > 1$. Show that the continuous-time queue length $Q(t)$ does not converge in distribution as $t \to \infty$, even though the imbedded chain at the times of arrivals is ergodic.

11.5 Exercises. G/G/1

1. Show that, for a G/G/l queue, the starting times of the busy periods of the server constitute a renewal process.

2. Consider a G/M($\mu$)/1 queue in equilibrium, together with the dual (unstable) M($\mu$)/G/1 queue. Show that the idle periods of the latter queue are exponentially distributed. Use the theory of duality


of queues to deduce for the former queue that: (a) the waiting-time distribution is a mixture of an exponential distribution and an atom at zero, and (b) the equilibrium queue length is geometric.

3. Consider G/M($\mu$)/1, and let G be the distribution function of $S - X$ where S and X are typical (independent) service and interarrival times. Show that the Wiener-Hopf equation
$$F(x) = \int_{-\infty}^{x} F(x-y)\,dG(y), \qquad x \ge 0,$$
for the limiting waiting-time distribution F is satisfied by $F(x) = 1 - \eta e^{-\mu(1-\eta)x}$, $x \ge 0$. Here, $\eta$ is the smallest positive root of the equation $x = M_X(\mu(x-1))$, where $M_X$ is the moment generating function of X.

11.6 Exercise. Heavy traffic

1. Consider the M(>")/M(JL)/1 queue with p = >"/JL < 1 . Let Qp be a random variable with the equilibrium queue distribution, and show that ( 1 - p) Q p converges in distribution as p t 1 , the limit distribution being exponential with parameter 1 .

11.7 Exercises. Networks of queues

1. Consider an open migration process with c stations, in which individuals arrive at station j at rate $\nu_j$, individuals move from i to j at rate $\lambda_{ij}\phi_i(n_i)$, and individuals depart from i at rate $\mu_i\phi_i(n_i)$, where $n_i$ denotes the number of individuals currently at station i. Show when $\phi_i(n_i) = n_i$ for all i that the system behaves as though the customers move independently through the network. Identify the explicit form of the stationary distribution, subject to an assumption of irreducibility, and explain a connection with the Bartlett theorem of Problem (8.7.6).

2. Let Q be an M(>")/M(JL)/s queue where >.. < SJL, and assume Q is in equilibrium. Show that the process of departures is a Poisson process with intensity >.. , and that departures up to time t are independent of the value of Q(t) . 3. Customers arrive in the manner of a Poisson process with intensity >.. in a shop having two servers. The service times of these servers are independent and exponentially distributed with respective parameters JLI and JL2 . Arriving customers form a single queue, and the person at the head of the queue moves to the first free server. When both servers are free, the next arrival is allocated a server chosen according to one of the following rules:

(a) each server is equally likely to be chosen,

(b) the server who has been free longer is chosen.

Assume that $\lambda < \mu_1 + \mu_2$, and the process is in equilibrium. Show in each case that the process of departures from the shop is a Poisson process, and that departures prior to time t are independent of the number of people in the shop at time t.

4. Difficult customers. Consider an M($\lambda$)/M($\mu$)/1 queue modified so that on completion of service the customer leaves with probability $\delta$, or rejoins the queue with probability $1 - \delta$. Find the distribution of the total time a customer spends being served. Hence show that equilibrium is possible if $\lambda < \delta\mu$, and find the stationary distribution. Show that, in equilibrium, the departure process is Poisson, but if the rejoining customer goes to the end of the queue, the composite arrival process is not Poisson.

5. Consider an open migration process in eqUilibrium. If there is no path by which an individual at station k can reach station j , show that the stream of individuals moving directly from station j to station k forms a Poisson process.


11.8 Problems

1. Finite waiting room. Consider M(J...)/M(J.L)lk with the constraint that arriving customers who see N customers in the line ahead of them leave and never return. Find the stationary distribution of queue length for the cases k = 1 and k = 2. 2. Baulking. Consider M(J...)/M(J.L)/1 with the constraint that if an arriving customer sees n customers in the line ahead of him, he joins the queue with probability p(n) and otherwise leaves in disgust.

(a) Find the stationary distribution of queue length if $p(n) = (n+1)^{-1}$.

(b) Find the stationary distribution $\pi$ of queue length if $p(n) = 2^{-n}$, and show that the probability that an arriving customer joins the queue (in equilibrium) is $\mu(1 - \pi_0)/\lambda$.

3. Series. In a Moscow supermarket customers queue at the cash desk to pay for the goods they want; then they proceed to a second line where they wait for the goods in question. If customers arrive in the shop like a Poisson process with parameter J... and all service times are independent and exponentially distributed, parameter J.Ll at the first desk and J.L2 at the second, find the stationary distributions of queue lengths, when they exist, and show that, at any given time, the two queue lengths are independent in equilibrium.

4. Batch (or bulk) service. Consider M/G/1 , with the modification that the server may serve up to m customers simultaneously. If the queue length is less than m at the beginning of a service period then she serves everybody waiting at that time. Find a formula which is satisfied by the probability generating function of the stationary distribution of queue length at the times of departures, and evaluate this generating function explicitly in the case when m = 2 and service times are exponentially distributed.

5. Consider M($\lambda$)/M($\mu$)/1 where $\lambda < \mu$. Find the moment generating function of the length B of a typical busy period, and show that $E(B) = (\mu - \lambda)^{-1}$ and $\mathrm{var}(B) = (\lambda + \mu)/(\mu - \lambda)^3$. Show that the density function of B is
$$f_B(x) = \frac{1}{x\sqrt{\rho}}\,e^{-(\lambda+\mu)x}\,I_1\bigl(2x\sqrt{\lambda\mu}\bigr), \qquad x > 0,$$
where $I_1$ is a modified Bessel function and $\rho = \lambda/\mu$.

6. Consider M($\lambda$)/G/1 in equilibrium. Obtain an expression for the mean queue length at departure times. Show that the mean waiting time in equilibrium of an arriving customer is $\frac{1}{2}\lambda E(S^2)/(1-\rho)$ where S is a typical service time and $\rho = \lambda E(S)$.

Amongst all possible service-time distributions with given mean, find the one for which the mean waiting time is a minimum.
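A simulation sketch for Problem 6 (illustration only; the arrival rate, uniform service times, and sample size are our choices): the mean waiting time computed from Lindley's recursion should match the Pollaczek-Khinchine value lam*E(S^2)/(2*(1 - rho)).

```python
import random

def mean_wait(lam: float, n_customers: int) -> float:
    w, total = 0.0, 0.0
    for _ in range(n_customers):
        total += w
        s = random.random()                  # service time ~ Uniform(0,1)
        a = random.expovariate(lam)          # next interarrival time
        w = max(0.0, w + s - a)              # Lindley's recursion
    return total / n_customers

if __name__ == "__main__":
    random.seed(7)
    lam = 1.5                                # rho = lam * E(S) = 0.75
    rho, es2 = lam * 0.5, 1.0 / 3.0
    print("simulated :", mean_wait(lam, 2_000_000))
    print("PK formula:", lam * es2 / (2 * (1 - rho)))    # = 1.0
```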

7. Let $W_t$ be the time which a customer would have to wait in an M($\lambda$)/G/1 queue if he were to arrive at time t. Show that the distribution function $F(x; t) = P(W_t \le x)$ satisfies
$$\frac{\partial F}{\partial t} = \frac{\partial F}{\partial x} - \lambda F + \lambda P(W_t + S \le x),$$
where S is a typical service time, independent of $W_t$.

Suppose that $F(x, t) \to H(x)$ for all x as $t \to \infty$, where H is a distribution function satisfying $0 = h - \lambda H + \lambda P(U + S \le x)$ for $x > 0$, where U is independent of S with distribution function H, and h is the density function of H on $(0, \infty)$. Show that the moment generating function $M_U$ of U satisfies
$$M_U(\theta) = \frac{(1 - \rho)\theta}{\lambda + \theta - \lambda M_S(\theta)},$$
where $\rho$ is the traffic intensity. You may assume that $P(S = 0) = 0$.


8. Consider a G/G/1 queue in which the service times are constantly equal to 2, whilst the interarrival times take either of the values 1 and 4 with equal probability $\frac{1}{2}$. Find the limiting waiting time distribution.

9. Consider an extremely idealized model of a telephone exchange having infinitely many channels available. Calls arrive in the manner of a Poisson process with intensity $\lambda$, and each requires one channel for a length of time having the exponential distribution with parameter $\mu$, independently of the arrival process and of the duration of other calls. Let $Q(t)$ be the number of calls being handled at time t, and suppose that $Q(0) = I$.

Determine the probability generating function of $Q(t)$, and deduce $E(Q(t))$, $P(Q(t) = 0)$, and the limiting distribution of $Q(t)$ as $t \to \infty$.

Assuming the queue is in equilibrium, find the proportion of time that no channels are occupied, and the mean length of an idle period. Deduce that the mean length of a busy period is $(e^{\lambda/\mu} - 1)/\lambda$.

10. Customers arrive in a shop in the manner of a Poisson process with intensity $\lambda$, where $0 < \lambda < 1$. They are served one by one in the order of their arrival, and each requires a service time of unit length. Let $Q(t)$ be the number in the queue at time t. By comparing $Q(t)$ with $Q(t+1)$, determine the limiting distribution of $Q(t)$ as $t \to \infty$ (you may assume that the quantities in question converge). Hence show that the mean queue length in equilibrium is $\lambda(1 - \frac{1}{2}\lambda)/(1 - \lambda)$.

Let W be the waiting time of a newly arrived customer when the queue is in equilibrium. Deduce from the results above that $E(W) = \frac{1}{2}\lambda/(1 - \lambda)$.

11. Consider M($\lambda$)/D(1)/1, and suppose that the queue is empty at time 0. Let T be the earliest time at which a customer departs leaving the queue empty. Show that the moment generating function $M_T$ of T satisfies
$$\log\Bigl(1 - \frac{s}{\lambda}\Bigr) + \log M_T(s) = (s - \lambda)\bigl(1 - M_T(s)\bigr),$$
and deduce the mean value of T, distinguishing between the cases $\lambda < 1$ and $\lambda \ge 1$.

12. Suppose A < J1" and consider a M(A)/M(J1,)11 queue Q in eqUilibrium.

(a) Show that Q is a reversible Markov chain.

(b) Deduce the eqUilibrium distributions of queue length and waiting time.

(c) Show that the times of departures of customers form a Poisson process, and that Q(t) is indepen­dent of the times of departures prior to t .

(d) Consider a sequence of K single-server queues such that customers arrive at the first in the manner of a Poisson process, and (for each j) on completing service in the jth queue each customer moves to the (j + l )th. Service times in the jth queue are exponentially distributed with parameter J1,j ' with as much independence as usual. Determine the Ooint) equilibrium distribution of the queue lengths, when A < J1,j for all j .

13. Consider the queue M($\lambda$)/M($\mu$)/k, where $k \ge 1$. Show that a stationary distribution $\pi$ exists if and only if $\lambda < k\mu$, and calculate it in this case.

Suppose that the cost of operating this system in equilibrium is
$$Ak + B\sum_{n=k}^{\infty}(n - k + 1)\pi_n,$$
the positive constants A and B representing respectively the costs of employing a server and of the dissatisfaction of delayed customers.

Show that, for fixed $\mu$, there is a unique value $\lambda^*$ in the interval $(0, \mu)$ such that it is cheaper to have $k = 1$ than $k = 2$ if and only if $\lambda < \lambda^*$.

14. Customers arrive in a shop in the manner of a Poisson process with intensity A. They form a single queue. There are two servers, labelled 1 and 2, server i requiring an exponentially distributed


time with parameter $\mu_i$ to serve any given customer. The customer at the head of the queue is served by the first idle server; when both are idle, an arriving customer is equally likely to choose either.

(a) Show that the queue length settles into equilibrium if and only if $\lambda < \mu_1 + \mu_2$.

(b) Show that, when in equilibrium, the queue length is a time-reversible Markov chain.

(c) Deduce the equilibrium distribution of queue length.

(d) Generalize your conclusions to queues with many servers.

15. Consider the D(1)/M($\mu$)/1 queue where $\mu > 1$, and let $Q_n$ be the number of people in the queue just before the nth arrival. Let $Q_\mu$ be a random variable having as distribution the stationary distribution of the Markov chain $\{Q_n\}$. Show that $(1 - \mu^{-1})Q_\mu$ converges in distribution as $\mu \downarrow 1$, the limit distribution being exponential with parameter 2.

16. Taxis arrive at a stand in the manner of a Poisson process with intensity $\tau$, and passengers arrive in the manner of an (independent) Poisson process with intensity $\pi$. If there are no waiting passengers, the taxis wait until passengers arrive, and then move off with the passengers, one to each taxi. If there is no taxi, passengers wait until they arrive. Suppose that initially there are neither taxis nor passengers at the stand. Show that the probability that n passengers are waiting at time t is
$$(\pi/\tau)^{n/2}\,e^{-(\pi+\tau)t}\,I_n\bigl(2t\sqrt{\pi\tau}\bigr),$$
where $I_n(x)$ is the modified Bessel function, i.e., the coefficient of $z^n$ in the power series expansion of $\exp\{\frac{1}{2}x(z + z^{-1})\}$.

17. Machines arrive for repair as a Poisson process with intensity A . Each repair involves two stages, the ith machine to arrive being under repair for a time Xi + Yi , where the pairs (Xi , Yi ) , i = 1 , 2, . . . , are independent with a common joint distribution. Let U (t) and V (t) be the numbers of machines in the X -stage and Y -stage of repair at time t. Show that U (t) and V (t) are independent Poisson random variables.

18. Ruin. An insurance company pays independent and identically distributed claims $\{K_n : n \ge 1\}$ at the instants of a Poisson process with intensity $\lambda$, where $\lambda E(K_1) < 1$. Premiums are received at constant rate 1. Show that the maximum deficit M the company will ever accumulate has moment generating function
$$E(e^{\theta M}) = \frac{(1 - \rho)\theta}{\lambda + \theta - \lambda E(e^{\theta K})},$$
where $\rho = \lambda E(K_1)$.

19. (a) Erlang's loss formula. Consider M($\lambda$)/M($\mu$)/s with baulking, in which a customer departs immediately if, on arrival, he sees all the servers occupied ahead of him. Show that, in equilibrium, the probability that all servers are occupied is
$$\pi_s = \frac{\rho^s/s!}{\sum_{j=0}^{s}\rho^j/j!}, \qquad \text{where } \rho = \lambda/\mu.$$

(b) Consider an M($\lambda$)/M($\mu$)/$\infty$ queue with channels (servers) numbered 1, 2, .... On arrival, a customer will choose the lowest numbered channel that is free, and be served by that channel. Show in the notation of part (a) that the fraction $p_c$ of time that channel c is busy is $p_c = \rho(\pi_{c-1} - \pi_c)$ for $c \ge 2$, and $p_1 = \pi_1$.
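A small computational sketch of Erlang's loss formula from part (a) (illustration only; the offered load of 5 erlangs and the channel counts are our choices).

```python
import math

def erlang_b(rho: float, s: int) -> float:
    # equilibrium probability that all s servers are busy, rho = lam/mu
    terms = [rho**j / math.factorial(j) for j in range(s + 1)]
    return terms[s] / sum(terms)

if __name__ == "__main__":
    for s in (1, 3, 5, 7, 10):
        print(s, round(erlang_b(5.0, s), 4))
```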


12 Martingales

12.1 Exercises. Introduction

1. (i) If $(Y, \mathcal{F})$ is a martingale, show that $E(Y_n) = E(Y_0)$ for all n.

(ii) If $(Y, \mathcal{F})$ is a submartingale (respectively supermartingale) with finite means, show that $E(Y_n) \ge E(Y_0)$ (respectively $E(Y_n) \le E(Y_0)$).

2. Let $(Y, \mathcal{F})$ be a martingale, and show that $E(Y_{n+m} \mid \mathcal{F}_n) = Y_n$ for all $n, m \ge 0$.

3. Let $Z_n$ be the size of the nth generation of a branching process with $Z_0 = 1$, having mean family size $\mu$ and extinction probability $\eta$. Show that $Z_n\mu^{-n}$ and $\eta^{Z_n}$ define martingales.

4. Let $\{S_n : n \ge 0\}$ be a simple symmetric random walk on the integers with $S_0 = k$. Show that $S_n$ and $S_n^2 - n$ are martingales. Making assumptions similar to those of de Moivre (see Example (12.1.4)), find the probability of ruin and the expected duration of the game for the gambler's ruin problem.
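A simulation sketch for Exercise 4 (illustration only; the barriers, starting point and trial count are our choices): optional stopping applied to S_n and S_n**2 - n gives P(ruin) = 1 - k/N and mean duration k*(N - k).

```python
import random

def ruin_run(k: int, n_barrier: int):
    s, steps = k, 0
    while 0 < s < n_barrier:
        s += random.choice((-1, 1))
        steps += 1
    return (s == 0), steps

if __name__ == "__main__":
    random.seed(8)
    k, n_barrier, trials = 3, 10, 100_000
    runs = [ruin_run(k, n_barrier) for _ in range(trials)]
    print("P(ruin)    :", sum(r for r, _ in runs) / trials, " exact:", 1 - k / n_barrier)
    print("E(duration):", sum(s for _, s in runs) / trials, " exact:", k * (n_barrier - k))
```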

5. Let $(Y, \mathcal{F})$ be a martingale with the property that $E(Y_n^2) < \infty$ for all n. Show that, for $i \le j \le k$, $E\{(Y_k - Y_j)Y_i\} = 0$, and $E\{(Y_k - Y_j)^2 \mid \mathcal{F}_i\} = E(Y_k^2 \mid \mathcal{F}_i) - E(Y_j^2 \mid \mathcal{F}_i)$. Suppose there exists K such that $E(Y_n^2) \le K$ for all n. Show that the sequence $\{Y_n\}$ converges in mean square as $n \to \infty$.

6. Let Y be a martingale and let u be a convex function mapping $\mathbb{R}$ to $\mathbb{R}$. Show that $\{u(Y_n) : n \ge 0\}$ is a submartingale provided that $E(u(Y_n)^+) < \infty$ for all n.

Show that $|Y_n|$, $Y_n^2$, and $Y_n^+$ constitute submartingales whenever the appropriate moment conditions are satisfied.

7. Let Y be a submartingale and let u be a convex non-decreasing function mapping $\mathbb{R}$ to $\mathbb{R}$. Show that $\{u(Y_n) : n \ge 0\}$ is a submartingale provided that $E(u(Y_n)^+) < \infty$ for all n.

Show that (subject to a moment condition) $Y_n^+$ constitutes a submartingale, but that $|Y_n|$ and $Y_n^2$ need not constitute submartingales.

8. Let X be a discrete-time Markov chain with countable state space S and transition matrix P. Suppose that $\psi : S \to \mathbb{R}$ is bounded and satisfies $\sum_{j \in S} p_{ij}\psi(j) \le \lambda\psi(i)$ for some $\lambda > 0$ and all $i \in S$. Show that $\lambda^{-n}\psi(X_n)$ constitutes a supermartingale.

9. Let $G_n(s)$ be the probability generating function of the size $Z_n$ of the nth generation of a branching process, where $Z_0 = 1$ and $\mathrm{var}(Z_1) > 0$. Let $H_n$ be the inverse function of the function $G_n$, viewed as a function on the interval $[0, 1]$, and show that $M_n = \{H_n(s)\}^{Z_n}$ defines a martingale with respect to the sequence Z.


12.2 Exercises. Martingale differences and Hoeffding's inequality

1. Knapsack problem. It is required to pack a knapsack to maximum benefit. Suppose you have n objects, the ith object having volume $V_i$ and worth $W_i$, where $V_1, V_2, \dots, V_n, W_1, W_2, \dots, W_n$ are independent non-negative random variables with finite means, and $W_i \le M$ for all i and some fixed M. Your knapsack has volume c, and you wish to maximize the total worth of the objects packed in it. That is, you wish to find the vector $z_1, z_2, \dots, z_n$ of 0's and 1's such that $\sum_{i=1}^{n} z_iV_i \le c$ and which maximizes $\sum_{i=1}^{n} z_iW_i$. Let Z be the maximal possible worth of the knapsack's contents, and show that
$$P(|Z - EZ| \ge x) \le 2\exp\{-x^2/(2nM^2)\} \qquad \text{for } x > 0.$$

2. Graph colouring. Given n vertices $v_1, v_2, \dots, v_n$, for each $1 \le i < j \le n$ we place an edge between $v_i$ and $v_j$ with probability p; different pairs are joined independently of each other. We call $v_i$ and $v_j$ neighbours if they are joined by an edge. The chromatic number $\chi$ of the ensuing graph is the minimal number of pencils of different colours which are required in order that each vertex may be coloured differently from each of its neighbours. Show that $P(|\chi - E\chi| \ge x) \le 2\exp\{-\frac{1}{2}x^2/n\}$ for $x > 0$.
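A minimal numerical sketch of the Azuma-Hoeffding bound that underlies both exercises (this is not itself Exercise 1 or 2; we apply the bound to the simplest martingale with differences bounded by 1, a plus-or-minus-one random walk, since the chromatic number and knapsack optimum are expensive to compute exactly).

```python
import math
import random

def walk(n: int) -> int:
    return sum(random.choice((-1, 1)) for _ in range(n))

if __name__ == "__main__":
    random.seed(9)
    n, trials = 100, 50_000
    samples = [walk(n) for _ in range(trials)]
    for x in (10, 20, 30):
        empirical = sum(abs(s) >= x for s in samples) / trials
        bound = 2 * math.exp(-x * x / (2 * n))     # Azuma-Hoeffding tail bound
        print(f"x={x}: empirical {empirical:.4f}  bound {bound:.4f}")
```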

12.3 Exercises. Crossings and convergence

1. Give a reasonable definition of a downcrossing of the interval $[a, b)$ by the random sequence $Y_0, Y_1, \dots$.

(a) Show that the number of downcrossings differs from the number of upcrossings by at most 1.

(b) If $(Y, \mathcal{F})$ is a submartingale, show that the number $D_n(a, b; Y)$ of downcrossings of $[a, b)$ by Y up to time n satisfies
$$E D_n(a, b; Y) \le \frac{E\{(Y_n - b)^+\}}{b - a}.$$

2. Let $(Y, \mathcal{F})$ be a supermartingale with finite means, and let $U_n(a, b; Y)$ be the number of upcrossings of the interval $[a, b)$ up to time n. Show that
$$E U_n(a, b; Y) \le \frac{E\{(Y_n - a)^-\}}{b - a}.$$
Deduce that $E U_n(a, b; Y) \le a/(b - a)$ if Y is non-negative and $a \ge 0$.

3. Let X be a Markov chain with countable state space S and transition matrix P. Suppose that X is irreducible and persistent, and that $\psi : S \to \mathbb{R}$ is a bounded function satisfying $\sum_{j \in S} p_{ij}\psi(j) \le \psi(i)$ for $i \in S$. Show that $\psi$ is a constant function.

4. Let $Z_1, Z_2, \dots$ be independent random variables such that
$$Z_n = \begin{cases} a_n & \text{with probability } \tfrac{1}{2}n^{-2}, \\ 0 & \text{with probability } 1 - n^{-2}, \\ -a_n & \text{with probability } \tfrac{1}{2}n^{-2}, \end{cases}$$
where $a_1 = 2$ and $a_n = 4\sum_{j=1}^{n-1} a_j$. Show that $Y_n = \sum_{j=1}^{n} Z_j$ defines a martingale. Show that $Y = \lim Y_n$ exists almost surely, but that there exists no M such that $E|Y_n| \le M$ for all n.


12.4 Exercises. Stopping times

1. If TI and T2 are stopping times with respect to a filtration :F, show that TI + T2 , max{TI , T2}, and min {TI , T2 } are stopping times also.

2. Let Xl , X2 , ' " be a sequence of non-negative independent random variables and let N(t) = max{n : Xl + X2 + . . . + Xn ::: t } . Show that N(t) + I is a stopping time with respect to a suitable filtration to be specified.

3. Let $(Y, \mathcal{F})$ be a submartingale and $x > 0$. Show that
$$P\Bigl(\max_{0 \le m \le n} Y_m \ge x\Bigr) \le \frac{1}{x}E(Y_n^+).$$

4. Let $(Y, \mathcal{F})$ be a non-negative supermartingale and $x > 0$. Show that
$$P\Bigl(\max_{0 \le m \le n} Y_m \ge x\Bigr) \le \frac{1}{x}E(Y_0).$$

5. Let (Y, :F) be a submartingale and let S and T be stopping times satisfying 0 ::: S ::: T ::: N for some deterministic N. Show that lEYo ::: lEYs ::: lEYT ::: lEYN .

6. Let $\{S_n\}$ be a simple random walk with $S_0 = 0$ such that $0 < p = P(S_1 = 1) < \frac{1}{2}$. Use de Moivre's martingale to show that $E(\sup_m S_m) \le p/(1 - 2p)$. Show further that this inequality may be replaced by an equality.

7. Let $\mathcal{F}$ be a filtration. For any stopping time T with respect to $\mathcal{F}$, denote by $\mathcal{F}_T$ the collection of all events A such that, for all n, $A \cap \{T \le n\} \in \mathcal{F}_n$. Let S and T be stopping times.

(a) Show that $\mathcal{F}_T$ is a $\sigma$-field, and that T is measurable with respect to this $\sigma$-field.

(b) If $A \in \mathcal{F}_S$, show that $A \cap \{S \le T\} \in \mathcal{F}_T$.

(c) Let S and T satisfy $S \le T$. Show that $\mathcal{F}_S \subseteq \mathcal{F}_T$.

12.5 Exercises. Optional stopping

1. Let $(Y, \mathcal{F})$ be a martingale and T a stopping time such that $P(T < \infty) = 1$. Show that $E(Y_T) = E(Y_0)$ if either of the following holds:

(a) $E(\sup_n |Y_{T\wedge n}|) < \infty$, (b) $E(|Y_{T\wedge n}|^{1+\delta}) \le c$ for some $c, \delta > 0$ and all n.

2. Let $(Y, \mathcal{F})$ be a martingale. Show that $(Y_{T\wedge n}, \mathcal{F}_n)$ is a uniformly integrable martingale for any finite stopping time T such that either:

(a) $E|Y_T| < \infty$ and $E(|Y_n|\,I_{\{T>n\}}) \to 0$ as $n \to \infty$, or

(b) $\{Y_n\}$ is uniformly integrable.

3. Let $(Y, \mathcal{F})$ be a uniformly integrable martingale, and let S and T be finite stopping times satisfying $S \le T$. Prove that $Y_T = E(Y_\infty \mid \mathcal{F}_T)$ and that $Y_S = E(Y_T \mid \mathcal{F}_S)$, where $Y_\infty$ is the almost sure limit as $n \to \infty$ of $Y_n$.

4. Let $\{S_n : n \ge 0\}$ be a simple symmetric random walk with $0 < S_0 < N$ and with absorbing barriers at 0 and N. Use the optional stopping theorem to show that the mean time until absorption is $E\{S_0(N - S_0)\}$.

5. Let $\{S_n : n \ge 0\}$ be a simple symmetric random walk with $S_0 = 0$. Show that
$$Y_n = \frac{\cos\{\lambda[S_n - \frac{1}{2}(b - a)]\}}{(\cos\lambda)^n}$$


constitutes a martingale if $\cos\lambda \ne 0$. Let a and b be positive integers. Show that the time T until absorption at one of two absorbing barriers at $-a$ and b satisfies
$$E\bigl\{(\cos\lambda)^{-T}\bigr\} = \frac{\cos\{\frac{1}{2}\lambda(b - a)\}}{\cos\{\frac{1}{2}\lambda(b + a)\}}, \qquad 0 < \lambda < \frac{\pi}{b + a}.$$

6. Let $\{S_n : n \ge 0\}$ be a simple symmetric random walk on the positive and negative integers, with $S_0 = 0$. For each of the three following random variables, determine whether or not it is a stopping time and find its mean:
$$U = \min\{n \ge 5 : S_n = S_{n-5} + 5\}, \qquad V = U - 5, \qquad W = \min\{n : S_n = 1\}.$$

7. Let $S_n = a + \sum_{r=1}^{n} X_r$ be a simple symmetric random walk. The walk stops at the earliest time T when it reaches either of the two positions 0 or K, where $0 < a < K$. Show that $M_n = \sum_{r=0}^{n} S_r - \frac{1}{3}S_n^3$ is a martingale and deduce that $E(\sum_{r=0}^{T} S_r) = \frac{1}{3}(K^2 - a^2)a + a$.

8. Gambler's ruin. Let $X_i$ be independent random variables each equally likely to take the values $\pm 1$, and let $T = \min\{n : S_n \in \{-a, b\}\}$. Verify the conditions of the optional stopping theorem (12.5.1) for the martingale $S_n^2 - n$ and the stopping time T.

12.7 Exercises. Backward martingales and continuous-time martingales

1. Let X be a continuous-time Markov chain with finite state space S and generator G. Let $\eta = \{\eta(i) : i \in S\}$ be a root of the equation $G\eta' = 0$. Show that $\eta(X(t))$ constitutes a martingale with respect to $\mathcal{F}_t = \sigma(\{X(u) : u \le t\})$.

2. Let N be a Poisson process with intensity $\lambda$ and $N(0) = 0$, and let $T_a = \min\{t : N(t) = a\}$, where a is a positive integer. Assuming that $E\{\exp(\psi T_a)\} < \infty$ for sufficiently small positive $\psi$, use the optional stopping theorem to show that $\mathrm{var}(T_a) = a\lambda^{-2}$.

3. Let $S_m = \sum_{r=1}^{m} X_r$, $m \le n$, where the $X_r$ are independent and identically distributed with finite mean. Denote by $U_1, U_2, \dots, U_n$ the order statistics of n independent variables which are uniformly distributed on $(0, t)$, and set $U_{n+1} = t$. Show that $R_m = S_m/U_{m+1}$, $0 \le m \le n$, is a backward martingale with respect to a suitable sequence of $\sigma$-fields, and deduce that
$$P(R_m \ge 1 \text{ for some } m \le n \mid S_n = y) \le \min\{y/t, 1\}.$$

12.9 Problems

1. Let $Z_n$ be the size of the nth generation of a branching process with immigration, in which the mean family size is $\mu$ ($\ne 1$) and the mean number of immigrants per generation is m. Show that
$$Y_n = \mu^{-n}\Bigl\{Z_n - m\,\frac{\mu^n - 1}{\mu - 1}\Bigr\}$$
defines a martingale.

2. In an age-dependent branching process, each individual gives birth to a random number of off­spring at random times. At time 0, there exists a single progenitor who has N children at the subsequent


times $B_1 \le B_2 \le \dots \le B_N$; his family may be described by the vector $(N, B_1, B_2, \dots, B_N)$. Each subsequent member x of the population has a family described similarly by a vector $(N(x), B_1(x), \dots, B_{N(x)}(x))$ having the same distribution as $(N, B_1, \dots, B_N)$ and independent of all other individuals' families. The number $N(x)$ is the number of his offspring, and $B_i(x)$ is the time between the births of the parent and the ith offspring. Let $\{B_{n,r} : r \ge 1\}$ be the times of births of individuals in the nth generation. Let $M_n(\theta) = \sum_r e^{-\theta B_{n,r}}$, and show that $Y_n = M_n(\theta)/E(M_1(\theta))^n$ defines a martingale with respect to $\mathcal{F}_n = \sigma(\{B_{m,r} : m \le n, r \ge 1\})$, for any value of $\theta$ such that $E M_1(\theta) < \infty$.

3. Let $(Y, \mathcal{F})$ be a martingale with $E Y_n = 0$ and $E(Y_n^2) < \infty$ for all n. Show that
$$P\Bigl(\max_{1 \le m \le n} Y_m \ge x\Bigr) \le \frac{E(Y_n^2)}{E(Y_n^2) + x^2}, \qquad x > 0.$$

4. Let $(Y, \mathcal{F})$ be a non-negative submartingale with $Y_0 = 0$, and let $\{c_n\}$ be a non-increasing sequence of positive numbers. Show that
$$P\Bigl(\max_{1 \le m \le n} c_m Y_m \ge x\Bigr) \le \frac{1}{x}\sum_{k=1}^{n} c_k E(Y_k - Y_{k-1}), \qquad x > 0.$$

Such an inequality is sometimes named after subsets of Hajek, Renyi, and Chow. Deduce Kolmogorov's inequality for the sum of independent random variables. [Hint: Work with the martingale $Z_n = c_nY_n - \sum_{k=1}^{n} c_kE(X_k \mid \mathcal{F}_{k-1}) + \sum_{k=1}^{n}(c_{k-1} - c_k)Y_{k-1}$, where $X_k = Y_k - Y_{k-1}$.]

5. Suppose that the sequence $\{X_n : n \ge 1\}$ of random variables satisfies $E(X_n \mid X_1, X_2, \dots, X_{n-1}) = 0$ for all n, and also $\sum_{k=1}^{\infty} E(|X_k|^r)/k^r < \infty$ for some $r \in [1, 2]$. Let $S_n = \sum_{j=1}^{n} Z_j$ where $Z_j = X_j/j$, and show that

x > 0.

Deduce that $S_n$ converges a.s. as $n \to \infty$, and hence that $n^{-1}\sum_{k=1}^{n} X_k \to 0$. [Hint: In the case $1 < r \le 2$, prove and use the fact that $h(u) = |u|^r$ satisfies $h(v) - h(u) \le (v - u)h'(u) + 2h((v - u)/2)$. Kronecker's lemma is useful for the last part.]

6. Let $X_1, X_2, \dots$ be independent random variables with
$$X_n = \begin{cases} 1 & \text{with probability } (2n)^{-1}, \\ 0 & \text{with probability } 1 - n^{-1}, \\ -1 & \text{with probability } (2n)^{-1}. \end{cases}$$
Let $Y_1 = X_1$ and for $n \ge 2$
$$Y_n = \begin{cases} X_n & \text{if } Y_{n-1} = 0, \\ nY_{n-1}|X_n| & \text{if } Y_{n-1} \ne 0. \end{cases}$$
Show that $Y_n$ is a martingale with respect to $\mathcal{F}_n = \sigma(Y_1, Y_2, \dots, Y_n)$. Show that $Y_n$ does not converge almost surely. Does $Y_n$ converge in any way? Why does the martingale convergence theorem not apply?

7. Let $X_1, X_2, \dots$ be independent identically distributed random variables and suppose that $M(t) = E(e^{tX_1})$ satisfies $M(t) = 1$ for some $t > 0$. Show that $P(S_k \ge x$ for some $k) \le e^{-tx}$ for $x > 0$ and such a value of t, where $S_k = X_1 + X_2 + \dots + X_k$.

8. Let $Z_n$ be the size of the nth generation of a branching process with family-size probability generating function $G(s)$, and assume $Z_0 = 1$. Let $\eta$ be the smallest positive root of $G(s) = s$.


Use the martingale convergence theorem to show that, if $0 < \eta < 1$, then $P(Z_n \to 0) = \eta$ and $P(Z_n \to \infty) = 1 - \eta$.

9. Let $(Y, \mathcal{F})$ be a non-negative martingale, and let $Y_n^* = \max\{Y_k : 0 \le k \le n\}$. Show that
$$E(Y_n^*) \le \frac{e}{e - 1}\bigl\{1 + E(Y_n\log^+ Y_n)\bigr\}.$$

[Hint: $a\log^+ b \le a\log^+ a + b/e$ if $a, b \ge 0$, where $\log^+ x = \max\{0, \log x\}$.]

10. Let $X = \{X(t) : t \ge 0\}$ be a birth-death process with parameters $\lambda_i$, $\mu_i$, where $\lambda_i = 0$ if and only if $i = 0$. Define $h(0) = 0$, $h(1) = 1$, and
$$h(j) = 1 + \sum_{i=1}^{j-1} \frac{\mu_1\mu_2\cdots\mu_i}{\lambda_1\lambda_2\cdots\lambda_i}, \qquad j \ge 2.$$
Show that $h(X(t))$ constitutes a martingale with respect to the filtration $\mathcal{F}_t = \sigma(\{X(u) : 0 \le u \le t\})$, whenever $E h(X(t)) < \infty$ for all t. (You may assume that the forward equations are satisfied.)

Fix n, and let $m < n$; let $\pi(m)$ be the probability that the process is absorbed at 0 before it reaches size n, having started at size m. Show that $\pi(m) = 1 - h(m)/h(n)$.

11. Let $(Y, \mathcal{F})$ be a submartingale such that $E(Y_n^+) \le M$ for some M and all n.

(a) Show that $M_n = \lim_{m\to\infty} E(Y_{n+m}^+ \mid \mathcal{F}_n)$ exists (almost surely) and defines a martingale with respect to $\mathcal{F}$.

(b) Show that $Y_n$ may be expressed in the form $Y_n = X_n - Z_n$ where $(X, \mathcal{F})$ is a non-negative martingale, and $(Z, \mathcal{F})$ is a non-negative supermartingale. This representation of Y is sometimes termed the 'Krickeberg decomposition'.

(c) Let $(Y, \mathcal{F})$ be a martingale such that $E|Y_n| \le M$ for some M and all n. Show that Y may be expressed as the difference of two non-negative martingales.

12. Let £$Y_n$ be the assets of an insurance company after n years of trading. During each year it receives a total (fixed) income of £P in premiums. During the nth year it pays out a total of £$C_n$ in claims. Thus $Y_{n+1} = Y_n + P - C_{n+1}$. Suppose that $C_1, C_2, \dots$ are independent $N(\mu, \sigma^2)$ variables and show that the probability of ultimate bankruptcy satisfies
$$P(Y_n \le 0 \text{ for some } n) \le \exp\Bigl\{-\frac{2(P - \mu)Y_0}{\sigma^2}\Bigr\}.$$

13. Pólya's urn. A bag contains red and blue balls, with initially r red and b blue where $rb > 0$. A ball is drawn from the bag, its colour noted, and then it is returned to the bag together with a new ball of the same colour. Let $R_n$ be the number of red balls after n such operations.

(a) Show that $Y_n = R_n/(n + r + b)$ is a martingale which converges almost surely and in mean.

(b) Let T be the number of balls drawn until the first blue ball appears, and suppose that $r = b = 1$. Show that $E\{(T + 2)^{-1}\} = \frac{1}{4}$.

(c) Suppose $r = b = 1$, and show that $P(Y_n \ge \frac{3}{4}$ for some $n) \le \frac{2}{3}$.
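A simulation sketch for part (b) with r = b = 1 (illustration only; the sample size and seed are ours): estimate E{1/(T+2)} and compare with the value 1/4 given by the martingale argument.

```python
import random

def first_blue(r: int = 1, b: int = 1) -> int:
    draws = 0
    while True:
        draws += 1
        if random.random() < b / (r + b):
            return draws            # a blue ball was drawn
        r += 1                      # red drawn: return it plus one new red

if __name__ == "__main__":
    random.seed(10)
    n = 200_000
    est = sum(1.0 / (first_blue() + 2) for _ in range(n)) / n
    print("simulated E{1/(T+2)}:", est, " exact: 0.25")
```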

14. Here is a modification of the last problem. Let {An : n � I } be a sequence of random variables, each being a non-negative integer. We are provided with the bag of Problem ( 12.9. 13), and we add balls according to the following rules. At each stage a ball is drawn from the bag, and its colour noted; we assume that the distribution of this colour depends only on the current contents of the bag and not on any further information concerning the An . We return this ball together with An new balls of the same colour. Write Rn and Bn for the numbers of red and blue balls in the urn after n operations, and


let $\mathcal{F}_n = \sigma(\{R_k, B_k : 0 \le k \le n\})$. Show that $Y_n = R_n/(R_n + B_n)$ defines a martingale. Suppose $R_0 = B_0 = 1$, let T be the number of balls drawn until the first blue ball appears, and show that
$$E\Bigl(\frac{1 + A_T}{2 + \sum_{i=1}^{T} A_i}\Bigr) = \frac{1}{2},$$
so long as $\sum_n (2 + \sum_{i=1}^{n} A_i)^{-1} = \infty$ a.s.

15. Labouchere system. Here is a gambling system for playing a fair game. Choose a sequence Xl , X2 , • . . , Xn of positive numbers.

Wager the sum of the first and last numbers on an evens bet. If you win, delete those two numbers; if you lose, append their sum as an extra term xn+l (= Xl + Xn) at the right-hand end of the sequence.

You play iteratively according to the above rule. If the sequence ever contains one term only, you wager that amount on an evens bet. If you win, you delete the term, and if you lose you append it to the sequence to obtain two terms.

Show that, with probability 1, the game terminates with a profit of $\sum_{i=1}^{n} x_i$, and that the time until termination has finite mean.

This looks like another clever strategy. Show that the mean size of your largest stake before winning is infinite. (When Henry Labouchere was sent down from Trinity College, Cambridge, in 1 852, his gambling debts exceeded £6000.)

16. Here is a martingale approach to the question of determining the mean number of tosses of a coin before the first appearance of the sequence HHH. A large casino contains infinitely many gamblers $G_1, G_2, \dots$, each with an initial fortune of $1. A croupier tosses a coin repeatedly. For each n, gambler $G_n$ bets as follows. Just before the nth toss he stakes his $1 on the event that the nth toss shows heads. The game is assumed fair, so that he receives a total of $p^{-1}$ if he wins, where p is the probability of heads. If he wins this gamble, then he repeatedly stakes his entire current fortune on heads, at the same odds as his first gamble. At the first subsequent tail he loses his fortune and leaves the casino, penniless. Let $S_n$ be the casino's profit (losses count negative) after the nth toss. Show that $S_n$ is a martingale. Let N be the number of tosses before the first appearance of HHH; show that N is a stopping time and hence find $E(N)$.

Now adapt this scheme to calculate the mean time to the first appearance of the sequence HTH.

17. Let $\{(X_k, Y_k) : k \ge 1\}$ be a sequence of independent identically distributed random vectors such that each $X_k$ and $Y_k$ takes values in the set $\{-1, 0, 1, 2, \dots\}$. Suppose that $E(X_1) = E(Y_1) = 0$ and $E(X_1Y_1) = c$, and furthermore $X_1$ and $Y_1$ have finite non-zero variances. Let $U_0$ and $V_0$ be positive integers, and define $(U_{n+1}, V_{n+1}) = (U_n, V_n) + (X_{n+1}, Y_{n+1})$ for each $n \ge 0$. Let $T = \min\{n : U_nV_n = 0\}$ be the first hitting time by the random walk $(U_n, V_n)$ of the axes of $\mathbb{R}^2$. Show that $E(T) < \infty$ if and only if $c < 0$, and that $E(T) = -E(U_0V_0)/c$ in this case. [Hint: You might show that $U_nV_n - cn$ is a martingale.]

18. The game 'Red Now' may be played by a single player with a well shuffled conventional pack of 52 playing cards. At times $n = 1, 2, \dots, 52$ the player turns over a new card and observes its colour. Just once in the game he must say, just before exposing a card, "Red Now". He wins the game if the next exposed card is red. Let $R_n$ be the number of red cards remaining face down after the nth card has been turned over. Show that $X_n = R_n/(52 - n)$, $0 \le n < 52$, defines a martingale. Show that there is no strategy for the player which results in a probability of winning different from $\frac{1}{2}$.

19. A businessman has a redundant piece of equipment which he advertises for sale, inviting "offers over £1000". He anticipates that, each week for the foreseeable future, he will be approached by one prospective purchaser, the offers made in week $0, 1, \dots$ being £1000$X_0$, £1000$X_1$, ..., where $X_0, X_1, \dots$ are independent random variables with a common density function f and finite mean. Storage of the equipment costs £1000c per week and the prevailing rate of interest is $\alpha$ (> 0) per
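A simulation sketch for Problem 18 (illustration only; the stopping rule tried here is a hypothetical one of our own choosing): whatever rule the player uses, the winning probability comes out close to 1/2.

```python
import random

def play_once() -> bool:
    deck = ["R"] * 26 + ["B"] * 26
    random.shuffle(deck)
    for n, card in enumerate(deck):
        remaining = deck[n:]                 # composition is known from cards seen
        reds = remaining.count("R")
        # call "Red Now" if reds are over-represented, or at the last card
        if reds / len(remaining) > 0.55 or len(remaining) == 1:
            return card == "R"
    return False                             # unreachable

if __name__ == "__main__":
    random.seed(11)
    n = 200_000
    print(sum(play_once() for _ in range(n)) / n)   # close to 0.5
```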


week. Explain why a sensible strategy for the businessman is to sell in the week T, where T is a stopping time chosen so as to maximize
$$\mu(T) = E\Bigl\{(1 + \alpha)^{-T}X_T - \sum_{n=1}^{T}(1 + \alpha)^{-n}c\Bigr\}.$$
Show that this problem is equivalent to maximizing $E\{(1 + \alpha)^{-T}Z_T\}$ where $Z_n = X_n + c/\alpha$. Show that there exists a unique positive real number $\gamma$ with the property that
$$\alpha\gamma = \int_{\gamma}^{\infty} P(Z_n > y)\,dy,$$
and that, for this value of $\gamma$, the sequence $V_n = (1 + \alpha)^{-n}\max\{Z_n, \gamma\}$ constitutes a supermartingale. Deduce that the optimal strategy for the businessman is to set a target price $\tau$ (which you should specify in terms of $\gamma$) and sell the first time he is offered at least this price.

In the case when $f(x) = 2x^{-3}$ for $x \ge 1$, and $c = \alpha = Ja$, find his target price and the expected number of weeks he will have to wait before selling.

20. Let Z be a branching process satisfying $Z_0 = 1$, $E(Z_1) < 1$, and $P(Z_1 \ge 2) > 0$. Show that $E(\sup_n Z_n) \le \eta/(\eta - 1)$, where $\eta$ is the largest root of the equation $x = G(x)$ and G is the probability generating function of $Z_1$.

21. Matching. In a cloakroom there are K coats belonging to K people who make an attempt to leave by picking a coat at random. Those who pick their own coat leave, the rest return the coats and try again at random. Let N be the number of rounds of attempts until everyone has left. Show that $EN = K$ and $\mathrm{var}(N) \le K$.

22. Let W be a standard Wiener process, and define
$$M(t) = \int_0^t W(u)\,du - \tfrac{1}{3}W(t)^3.$$

Show that $M(t)$ is a martingale, and deduce that the expected area under the path of W until it first reaches one of the levels a (> 0) or b (< 0) is $-\frac{1}{3}ab(a + b)$.

23. Let $W = (W_1, W_2, \dots, W_d)$ be a d-dimensional Wiener process, the $W_i$ being independent one-dimensional Wiener processes with $W_i(0) = 0$ and variance parameter $\sigma^2 = d^{-1}$. Let $R(t)^2 = W_1(t)^2 + W_2(t)^2 + \dots + W_d(t)^2$, and show that $R(t)^2 - t$ is a martingale. Deduce that the mean time to hit the sphere of $\mathbb{R}^d$ with radius a is $a^2$.

24. Let W be a standard one-dimensional Wiener process, and let $a, b > 0$. Let T be the earliest time at which W visits either of the two points $-a$, b. Show that $P(W(T) = b) = a/(a + b)$ and $E(T) = ab$. In the case $a = b$, find $E(e^{-sT})$ for $s > 0$.
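A simulation sketch for Problem 24 (illustration only; the step size, barriers and run count are our choices), using a fine random-walk approximation of the Wiener process.

```python
import math
import random

def hit(a: float, b: float, dt: float = 1e-3):
    w, t, s = 0.0, 0.0, math.sqrt(dt)
    while -a < w < b:
        w += random.gauss(0.0, s)
        t += dt
    return (w >= b), t

if __name__ == "__main__":
    random.seed(12)
    a, b, n = 1.0, 2.0, 5_000
    runs = [hit(a, b) for _ in range(n)]
    print("P(hit b):", sum(r for r, _ in runs) / n, " exact:", a / (a + b))
    print("E(T)    :", sum(t for _, t in runs) / n, " exact:", a * b)
```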

13 Diffusion processes

13.3 Exercises. Diffusion processes

1. Let $X = \{X(t) : t \ge 0\}$ be a simple birth-death process with parameters $\lambda_n = n\lambda$ and $\mu_n = n\mu$. Suggest a diffusion approximation to X.

2. Bartlett's equation. Let D be a diffusion with instantaneous mean and variance $a(t, x)$ and $b(t, x)$, and let $M(t, \theta) = E(e^{\theta D(t)})$, the moment generating function of $D(t)$. Use the forward diffusion equation to derive Bartlett's equation:
$$\frac{\partial M}{\partial t} = \theta\, a\Bigl(t, \frac{\partial}{\partial\theta}\Bigr)M + \tfrac{1}{2}\theta^2\, b\Bigl(t, \frac{\partial}{\partial\theta}\Bigr)M,$$
where we interpret
$$g\Bigl(t, \frac{\partial}{\partial\theta}\Bigr)M = \sum_{n=0}^{\infty} \gamma_n(t)\,\frac{\partial^n M}{\partial\theta^n} \qquad \text{if } g(t, x) = \sum_{n=0}^{\infty} \gamma_n(t)x^n.$$

3. Write down Bartlett's equation in the case of the Wiener process D having drift m and instantaneous variance 1, and solve it subject to the boundary condition $D(0) = 0$.

4. Write down Bartlett's equation in the case of an Ornstein-Uhlenbeck process D having instantaneous mean $a(t, x) = -x$ and variance $b(t, x) = 1$, and solve it subject to the boundary condition $D(0) = 0$.

5. Bessel process. If $W_1(t), W_2(t), W_3(t)$ are independent Wiener processes, then $R(t)$ defined by $R^2 = W_1^2 + W_2^2 + W_3^2$ is the three-dimensional Bessel process. Show that R is a Markov process. Is this result true in a general number n of dimensions?

6. Show that the transition density for the Bessel process defined in Exercise (5) is
$$f(t, y \mid s, x) = \frac{\partial}{\partial y}P(R(t) \le y \mid R(s) = x) = \frac{y/x}{\sqrt{2\pi(t-s)}}\Bigl\{\exp\Bigl(-\frac{(y-x)^2}{2(t-s)}\Bigr) - \exp\Bigl(-\frac{(y+x)^2}{2(t-s)}\Bigr)\Bigr\}.$$

7. If W is a Wiener process and the function $g : \mathbb{R} \to \mathbb{R}$ is continuous and strictly monotone, show that $g(W)$ is a continuous Markov process.

8. Let W be a Wiener process. Which of the following define martingales?

(a) $e^{aW(t)}$,  (b) $cW(t/c^2)$,  (c) $tW(t) - \int_0^t W(s)\,ds$.


9. Exponential martingale, geometric Brownian motion. Let W be a standard Wiener process and define S(t) = eat+bW(t) . Show that:

(a) S is a Markov process,

(b) S is a martingale (with respect to the filtration generated by W) if and only if a + ½b² = 0, and in this case E(S(t)) = 1.
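As an informal check of part (b), one may simulate S(t) = e^{at+bW(t)} in the martingale case a = −½b² and verify that the sample mean of S(t) is close to 1. A minimal Python sketch, with arbitrary parameter choices:

import math, random

rng = random.Random(2)
b = 0.8
a = -0.5 * b * b          # the martingale case, a + b^2/2 = 0
t, n = 2.0, 200000
mean_S = sum(math.exp(a * t + b * rng.gauss(0.0, math.sqrt(t)))
             for _ in range(n)) / n
print(mean_S)             # close to 1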

10. Find the transition density for the Markov process of Exercise (9a).

13.4 Exercises. First passage times

1. Let W be a standard Wiener process and let X(t) = exp{iθW(t) + ½θ²t} where i = √(−1). Show that X is a martingale with respect to the filtration given by F_t = σ({W(u) : u ≤ t}).

2. Let T be the (random) time at which a standard Wiener process W hits the 'barrier' in space–time given by y = at + b where a < 0, b ≥ 0; that is, T = inf{t : W(t) = at + b}. Use the result of Exercise (1) to show that the moment generating function of T is given by E(e^{ψT}) = exp{−b(√(a² − 2ψ) + a)} for ψ < ½a². You may assume that the conditions of the optional stopping theorem are satisfied.

3. Let W be a standard Wiener process, and let T be the time of the last zero of W prior to time t. Show that ℙ(T ≤ u) = (2/π) sin^{-1}√(u/t), 0 ≤ u ≤ t.
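A crude simulation of the arc sine law of Exercise 3: the Python sketch below approximates the last zero before time t by the last sign change of a discretised path, and compares the empirical value of ℙ(T ≤ u) with (2/π) sin^{-1}√(u/t). It is illustrative only; the grid size and sample size are arbitrary, and the discretisation misses some zeros.

import math, random

def last_zero(t, steps, rng):
    # Estimate the last sign change of a discretised Wiener path before time t.
    dt = t / steps
    w, last, sd = 0.0, 0.0, math.sqrt(dt)
    for i in range(1, steps + 1):
        w_new = w + rng.gauss(0.0, sd)
        if w == 0.0 or w * w_new <= 0.0:
            last = i * dt
        w = w_new
    return last

rng = random.Random(3)
t, u, n = 1.0, 0.3, 2000
count = sum(last_zero(t, 1000, rng) <= u for _ in range(n))
print(count / n, (2 / math.pi) * math.asin(math.sqrt(u / t)))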

13.5 Exercise. Barriers

1. Let D be a standard Wiener process with drift m starting from D(0) = d > 0, and suppose that there is a reflecting barrier at the origin. Show that the density function f^r(t, y) of D(t) satisfies f^r(t, y) → 0 as t → ∞ if m ≥ 0, whereas f^r(t, y) → 2|m|e^{-2|m|y} for y > 0, as t → ∞ if m < 0.

13.6 Exercises. Excursions and the Brownian bridge

1. Let W be a standard Wiener process. Show that the conditional density function of W(t), given that W(u) > 0 for 0 < u < t, is g(x) = (x/t)e^{-x²/(2t)}, x > 0.

2. Show that the autocovariance function of the Brownian bridge is c(s, t) = min{s, t} − st, 0 ≤ s, t ≤ 1.

3. Let W be a standard Wiener process, and let Ŵ(t) = W(t) − tW(1). Show that {Ŵ(t) : 0 ≤ t ≤ 1} is a Brownian bridge.

4. If W is a Wiener process with W(0) = 0, show that Ŵ(t) = (1 − t)W(t/(1 − t)) for 0 ≤ t < 1, Ŵ(1) = 0, defines a Brownian bridge.

5. Let 0 < s < t < 1. Show that the probability that the Brownian bridge has no zeros in the interval (s, t) is (2/π) cos^{-1}√((t − s)/[t(1 − s)]).
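The covariance formula of Exercise 2 may be checked numerically using the construction of Exercise 3, namely Ŵ(t) = W(t) − tW(1). A minimal Python sketch with arbitrary s, t and sample size:

import math, random

rng = random.Random(4)
s, t, n = 0.3, 0.7, 100000
acc = 0.0
for _ in range(n):
    # build W at times s, t, 1 from independent Gaussian increments
    ws = rng.gauss(0.0, math.sqrt(s))
    wt = ws + rng.gauss(0.0, math.sqrt(t - s))
    w1 = wt + rng.gauss(0.0, math.sqrt(1 - t))
    bs, bt = ws - s * w1, wt - t * w1     # the bridge of Exercise 3
    acc += bs * bt
print(acc / n, min(s, t) - s * t)         # both close to 0.09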

13.7 Exercises. Stochastic calculus

1. Doob's L² inequality. Let W be a standard Wiener process, and show that E( max_{0≤s≤t} |W(s)|² ) ≤ 4E(W(t)²) = 4t.


2. Let W be a standard Wiener process. Fix t > 0, n ≥ 1, and let δ = t/n. Show that Z_n = Σ_{j=0}^{n-1} (W_{(j+1)δ} − W_{jδ})² satisfies Z_n → t in mean square as n → ∞.

3. Let W be a standard Wiener process. Fix t > 0, n ≥ 1, and let δ = t/n. Let V_j = W_{jδ} and Δ_j = V_{j+1} − V_j. Evaluate the limits of the following as n → ∞:

(a) I_1(n) = Σ_j V_j Δ_j,  (b) I_2(n) = Σ_j V_{j+1} Δ_j,  (c) I_3(n) = Σ_j ½(V_{j+1} + V_j)Δ_j,  (d) I_4(n) = Σ_j W_{(j+½)δ} Δ_j.
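The four sums of Exercise 3 can be compared on a single simulated path. The Python sketch below evaluates them on one fine grid and prints each next to the values ½(W_t² − t), ½(W_t² + t), ½W_t², ½W_t², which are the mean-square limits of (a)–(d); the grid size is an arbitrary choice.

import math, random

rng = random.Random(5)
t, n = 1.0, 20000
dt = t / n
sd = math.sqrt(dt / 2)
# sample W on a grid of spacing dt/2 so that midpoints are available
W = [0.0]
for _ in range(2 * n):
    W.append(W[-1] + rng.gauss(0.0, sd))
V = W[0::2]                  # W at the grid points j*dt
M = W[1::2]                  # W at the midpoints (j + 1/2)*dt
D = [V[j + 1] - V[j] for j in range(n)]
I1 = sum(V[j] * D[j] for j in range(n))
I2 = sum(V[j + 1] * D[j] for j in range(n))
I3 = sum(0.5 * (V[j] + V[j + 1]) * D[j] for j in range(n))
I4 = sum(M[j] * D[j] for j in range(n))
wt = V[-1]
print(I1, 0.5 * (wt ** 2 - t))   # forward (Ito) sum
print(I2, 0.5 * (wt ** 2 + t))   # backward sum
print(I3, 0.5 * wt ** 2)         # trapezoidal (Stratonovich) sum
print(I4, 0.5 * wt ** 2)         # midpoint sum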

4. Let W be a standard Wiener process. Show that U(t) = e^{-βt} W(e^{2βt}) defines a stationary Ornstein–Uhlenbeck process.

5. Let W be a standard Wiener process. Show that U_t = W_t − β ∫_0^t e^{-β(t-s)} W_s ds defines an Ornstein–Uhlenbeck process.

13.8 Exercises. The Itô integral

In the absence of any contrary indication, W denotes a standard Wiener process, and F_t is the smallest σ-field containing all null events with respect to which every member of {W_u : 0 ≤ u ≤ t} is measurable.

1. (a) Verify directly that ∫_0^t s dW_s = tW_t − ∫_0^t W_s ds.

(b) Verify directly that ∫_0^t W_s² dW_s = (1/3)W_t³ − ∫_0^t W_s ds.

(c) Show that E[ (∫_0^t W_s dW_s)² ] = ∫_0^t E(W_s²) ds.
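Part (a) is an identity between path functionals, so it can be checked on a single simulated path. A minimal Python sketch using left Riemann sums on a fine grid; the two sides differ only by a discretisation error of order δ:

import math, random

rng = random.Random(6)
t, n = 1.0, 100000
dt = t / n
W = [0.0]
for _ in range(n):
    W.append(W[-1] + rng.gauss(0.0, math.sqrt(dt)))
lhs = sum(j * dt * (W[j + 1] - W[j]) for j in range(n))   # approximates the Ito integral of s
rhs = t * W[-1] - sum(W[j] * dt for j in range(n))        # t*W_t minus the time integral of W
print(lhs, rhs)    # agree up to discretisation error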

2. Let X_t = ∫_0^t W_s ds. Show that X is a Gaussian process, and find its autocovariance and autocorrelation function.

3. Let (Ω, F, ℙ) be a probability space, and suppose that X_n → X in mean square as n → ∞. If G ⊆ F is a σ-field, show that E(X_n | G) → E(X | G) in mean square.

4. Let ψ_1 and ψ_2 be predictable step functions, and show that

E( ∫_0^t ψ_1(s) dW_s · ∫_0^t ψ_2(s) dW_s ) = ∫_0^t E(ψ_1(s)ψ_2(s)) ds

whenever both sides exist.

5. Assuming that Gaussian white noise G_t = dW_t/dt exists in sufficiently many senses to appear as an integrand, show by integrating the stochastic differential equation dX_t = −βX_t dt + dW_t that

X_t = ∫_0^t e^{-β(t-s)} dW_s

if X_0 = 0.

6. Let ψ be an adapted process with ‖ψ‖ < ∞. Show that ‖I(ψ)‖_2 = ‖ψ‖.


13.9 Exercises. Itô's formula

In the absence of any contrary indication, W denotes a standard Wiener process, and F_t is the smallest σ-field containing all null events with respect to which every member of {W_u : 0 ≤ u ≤ t} is measurable.

1. Let X and Y be independent standard Wiener processes. Show that, with R_t² = X_t² + Y_t²,

Z_t = ∫_0^t (X_s/R_s) dX_s + ∫_0^t (Y_s/R_s) dY_s

is a Wiener process. [Hint: Use Theorem (13.8.13).] Hence show that R² satisfies

R_t² = 2 ∫_0^t R_s dW_s + 2t.

Generalize this conclusion to n dimensions.

2. Write down the SDE obtained via Itô's formula for the process Y_t = W_t⁴, and deduce that E(W_t⁴) = 3t².
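The moment identity E(W_t⁴) = 3t² is easy to confirm by Monte Carlo, since W_t is N(0, t). A short illustrative Python check, with t and the sample size chosen arbitrarily:

import math, random

rng = random.Random(7)
t, n = 1.5, 400000
m4 = sum(rng.gauss(0.0, math.sqrt(t)) ** 4 for _ in range(n)) / n
print(m4, 3 * t * t)   # Monte Carlo estimate versus 3t^2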

3. Show that Y_t = tW_t is an Itô process, and write down the corresponding SDE.

4. Wiener process on a circle. Let Y_t = e^{iW_t}. Show that Y = X_1 + iX_2 is a process on the unit circle satisfying

dX_1 = −½X_1 dt − X_2 dW_t,   dX_2 = −½X_2 dt + X_1 dW_t.

5. Find the SDEs satisfied by the processes: (a) X_t = W_t/(1 + t), (b) X_t = sin W_t, (c) [Wiener process on an ellipse] X_t = a cos W_t, Y_t = b sin W_t, where ab ≠ 0.

13.10 Exercises. Option pricing

In the absence of any contrary indication, W denotes a standard Wiener process, and F_t is the smallest σ-field containing all null events with respect to which every member of {W_u : 0 ≤ u ≤ t} is measurable. The process S_t = exp((μ − ½σ²)t + σW_t) is a geometric Brownian motion, and r ≥ 0 is the interest rate.

1. (a) Let Z have the N(γ, τ²) distribution. Show that

E( (ae^Z − K)^+ ) = ae^{γ+½τ²} Φ( (log(a/K) + γ + τ²)/τ ) − K Φ( (log(a/K) + γ)/τ ),

where Φ is the N(0, 1) distribution function.

(b) Let ℚ be a probability measure under which σW is a Wiener process with drift r − μ and instantaneous variance σ². Show for 0 ≤ t ≤ T that

E_ℚ( e^{-r(T-t)}(S_T − K)^+ | F_t ) = S_t Φ(d_1(t, S_t)) − Ke^{-r(T-t)} Φ(d_2(t, S_t)),

where

d_1(t, x) = [log(x/K) + (r + ½σ²)(T − t)] / (σ√(T − t)),   d_2(t, x) = d_1(t, x) − σ√(T − t).
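The right-hand side of part (b) is the Black–Scholes formula, and it is convenient to have it as a small function when experimenting with the monotonicity properties of Exercise 5 below. The following Python sketch implements the displayed formula; the numerical inputs in the final line are arbitrary illustrative values, not taken from the exercise.

import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_value(t, x, K, r, sigma, T):
    # Black-Scholes value at time t of a European call with strike K and
    # expiry T, when the current stock price is x, as in part (b).
    tau = T - t
    d1 = (math.log(x / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return x * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

print(call_value(0.0, 100.0, 100.0, 0.05, 0.2, 1.0))   # roughly 10.45

Varying the initial price, expiry date, interest rate, volatility and strike in this function gives a quick empirical view of Exercise 5.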


2. Consider a portfolio which, at time t, holds ξ(t, S) units of stock and ψ(t, S) units of bond, and assume these quantities depend only on the values of S_u for 0 ≤ u ≤ t. Find the function ψ such that the portfolio is self-financing in the three cases:

(a) ξ(t, S) = 1 for all t, S,
(b) ξ(t, S) = S_t,
(c) ξ(t, S) = ∫_0^t S_v dv.

3. Suppose the stock price S_t is itself a Wiener process and the interest rate r equals 0, so that a unit of bond has unit value for all time. In the notation of Exercise (2), which of the following define self-financing portfolios?

(a) ξ(t, S) = ψ(t, S) = 1 for all t, S,
(b) ξ(t, S) = 2S_t, ψ(t, S) = −S_t² − t,
(c) ξ(t, S) = −t, ψ(t, S) = ∫_0^t S_s ds,
(d) ξ(t, S) = ∫_0^t S_s ds, ψ(t, S) = −∫_0^t S_s² ds.

4. An 'American call option' differs from a European call option in that it may be exercised by the buyer at any time up to the expiry date. Show that the value of the American call option is the same as that of the corresponding European call option, and that there is no advantage to the holder of such an option to exercise it strictly before its expiry date.

5. Show that the Black–Scholes value at time 0 of the European call option is an increasing function of the initial stock price, the exercise date, the interest rate, and the volatility, and is a decreasing function of the strike price.

13.11 Exercises. Passage probabilities and potentials

1. Let G be the closed sphere with radius ε and centre at the origin of ℝ^d where d ≥ 3. Let W be a d-dimensional Wiener process starting from W(0) = w ∉ G. Show that the probability that W visits G is (ε/r)^{d-2}, where r = |w|.

2. Let G be an infinite connected graph with finite vertex degrees. Let Δ_n be the set of vertices x which are distance n from 0 (that is, the shortest path from x to 0 contains n edges), and let N_n be the total number of edges joining pairs x, y of vertices with x ∈ Δ_n, y ∈ Δ_{n+1}. Show that a random walk on G is persistent if Σ_i N_i^{-1} = ∞.

3. Let G be a connected graph with finite vertex degrees, and let H be a connected subgraph of G. Show that a random walk on H is persistent if a random walk on G is persistent, but that the converse is not generally true.

13.12 Problems

1. Let W be a standard Wiener process, that is, a process with independent increments and continuous sample paths such that W(s + t) − W(s) is N(0, t) for t > 0. Let a be a positive constant. Show that:

(a) aW(t/a²) is a standard Wiener process,

(b) W(t + a) − W(a) is a standard Wiener process,

(c) the process V, given by V(t) = tW(1/t) for t > 0, V(0) = 0, is a standard Wiener process.

2. Let X = {X(t) : t ≥ 0} be a Gaussian process with continuous sample paths, zero means, and autocovariance function c(s, t) = u(s)v(t) for s ≤ t, where u and v are continuous functions. Suppose


that the ratio r(t) = u(t)/v(t) is continuous and strictly increasing with inverse function r^{-1}. Show that W(t) = X(r^{-1}(t))/v(r^{-1}(t)) is a standard Wiener process on a suitable interval of time. If c(s, t) = s(1 − t) for s ≤ t < 1, express X in terms of W.

3. Let β > 0, and show that U(t) = e^{-βt} W(e^{2βt} − 1) is an Ornstein–Uhlenbeck process if W is a standard Wiener process.

4. Let V = {V(t) : t ≥ 0} be an Ornstein–Uhlenbeck process with instantaneous mean a(t, x) = −βx where β > 0, with instantaneous variance b(t, x) = σ², and with V(0) = u. Show that V(t) is N(ue^{-βt}, σ²(1 − e^{-2βt})/(2β)). Deduce that V(t) is asymptotically N(0, ½σ²/β) as t → ∞, and show that V is strongly stationary if V(0) is N(0, ½σ²/β).

Show that such a process is the only stationary Gaussian Markov process with continuous autocovariance function, and find its spectral density function.
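The Gaussian law N(ue^{-βt}, σ²(1 − e^{-2βt})/(2β)) stated in Problem 4 can be checked by an Euler–Maruyama simulation of the defining dynamics dV = −βV dt + σ dW. A minimal Python sketch with arbitrary parameter choices:

import math, random

rng = random.Random(8)
beta, sigma, u, t = 1.2, 0.7, 2.0, 1.0
steps, paths = 500, 4000
dt = t / steps
sd = sigma * math.sqrt(dt)
finals = []
for _ in range(paths):
    v = u
    for _ in range(steps):
        v += -beta * v * dt + rng.gauss(0.0, sd)   # Euler-Maruyama step
    finals.append(v)
mean = sum(finals) / paths
var = sum((x - mean) ** 2 for x in finals) / paths
print(mean, u * math.exp(-beta * t))
print(var, sigma ** 2 * (1 - math.exp(-2 * beta * t)) / (2 * beta))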

5. Let D = {D(t) : t ≥ 0} be a diffusion process with instantaneous mean a(t, x) = αx and instantaneous variance b(t, x) = βx, where α and β are positive constants. Let D(0) = d. Show that the moment generating function of D(t) is

M(t, θ) = exp{ 2αdθe^{αt} / ( βθ(1 − e^{αt}) + 2α ) }.

Find the mean and variance of D(t), and show that ℙ(D(t) = 0) → e^{-2dα/β} as t → ∞.

6. Let D be an Ornstein–Uhlenbeck process with D(0) = 0, and place reflecting barriers at −c and d where c, d > 0. Find the limiting distribution of D as t → ∞.

7. Let X_0, X_1, ... be independent N(0, 1) variables, and show that

W(t) = (t/√π) X_0 + √(2/π) Σ_{k=1}^∞ (sin(kt)/k) X_k

defines a standard Wiener process on [0, π].

8. Let W be a standard Wiener process with W(0) = 0. Place absorbing barriers at −b and b, where b > 0, and let W^a be W absorbed at these barriers. Show that W^a(t) has density function

f^a(y, t) = (1/√(2πt)) Σ_{k=-∞}^∞ (−1)^k exp( −(y − 2kb)²/(2t) ),   −b < y < b,

which may also be expressed as

f^a(y, t) = Σ_{n=1}^∞ a_n e^{-λ_n t} sin( nπ(y + b)/(2b) ),   −b < y < b,

where a_n = b^{-1} sin(½nπ) and λ_n = n²π²/(8b²).

Hence calculate ℙ( sup_{0≤s≤t} |W(s)| > b ) for the unrestricted process W.

9. Let D be a Wiener process with drift m, and suppose that D(0) = 0. Place absorbing barriers at the points x = −a and x = b, where a and b are positive real numbers. Show that the probability p_a that the process is absorbed at −a is given by

p_a = (e^{2mb} − 1) / (e^{2m(a+b)} − 1).
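The absorption probability of Problem 9 can be estimated by simulating the drifted process in small time steps. The Python sketch below is illustrative only; the parameters and step size are arbitrary, and the discretisation causes a slight overshoot bias.

import math, random

rng = random.Random(9)
m, a, b = 0.5, 1.0, 1.5
dt, paths = 1e-3, 2000
sd = math.sqrt(dt)
hits_minus_a = 0
for _ in range(paths):
    d = 0.0
    while -a < d < b:
        d += m * dt + rng.gauss(0.0, sd)   # drifted Gaussian increment
    hits_minus_a += d <= -a
pa = (math.exp(2 * m * b) - 1) / (math.exp(2 * m * (a + b)) - 1)
print(hits_minus_a / paths, pa)            # empirical frequency versus p_a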


10. Let W be a standard Wiener process and let F(u, v) be the event that W has no zero in the interval (u, v).
(a) If ab > 0, show that ℙ(F(0, t) | W(0) = a, W(t) = b) = 1 − e^{-2ab/t}.
(b) If W(0) = 0 and 0 < t_0 ≤ t_1 ≤ t_2, show that

(c) Deduce that, if W(0) = 0 and 0 < t_1 ≤ t_2, then ℙ(F(0, t_2) | F(0, t_1)) = √(t_1/t_2).

11. Let W be a standard Wiener process. Show that

ℙ( sup_{0≤s≤t} |W(s)| ≥ w ) ≤ 2ℙ( |W(t)| ≥ w ) ≤ (2/w)√(2t/π),   for w > 0.

Set t = 2^n and w = 2^{2n/3} and use the Borel–Cantelli lemma to show that t^{-1}W(t) → 0 a.s. as t → ∞.

12. Let W be a two-dimensional Wiener process with W(0) = w, and let F be the unit circle. What is the probability that W visits the upper semicircle G of F before it visits the lower semicircle H?

13. Let W_1 and W_2 be independent standard Wiener processes; the pair W(t) = (W_1(t), W_2(t)) represents the position of a particle which is experiencing Brownian motion in the plane. Let l be some straight line in ℝ², and let P be the point on l which is closest to the origin O. Draw a diagram. Show that

(a) the particle visits l, with probability one,

(b) if the particle hits l for the first time at the point R, then the distance PR (measured as positive or negative as appropriate) has the Cauchy density function f(x) = d/{π(d² + x²)}, −∞ < x < ∞, where d is the distance OP,

(c) the angle ∠POR is uniformly distributed on [−½π, ½π].

14. Let φ(x + iy) = u(x, y) + iv(x, y) be an analytic function on the complex plane with real part u(x, y) and imaginary part v(x, y), and assume that |φ′| = 1.

Let (W_1, W_2) be the planar Wiener process of Problem (13) above. Show that the pair u(W_1, W_2), v(W_1, W_2) is also a planar Wiener process.

15. Let M(t) = max_{0≤s≤t} W(s), where W is a standard Wiener process. Show that M(t) − W(t) has the same distribution as M(t).

16. Let W be a standard Wiener process, u ∈ ℝ, and let Z = {t : W(t) = u}. Show that Z is a null set (i.e., has Lebesgue measure zero) with probability one.

17. Let M(t) = max_{0≤s≤t} W(s), where W is a standard Wiener process. Show that M(t) is attained at exactly one point in [0, t], with probability one.

18. Sparre Andersen theorem. Let s_0 = 0 and s_m = Σ_{j=1}^m x_j, where (x_j : 1 ≤ j ≤ n) is a given sequence of real numbers. Of the n! permutations of (x_j : 1 ≤ j ≤ n), let A_r be the number of permutations in which exactly r values of (s_m : 0 ≤ m ≤ n) are strictly positive, and let B_r be the number of permutations in which the maximum of (s_m : 0 ≤ m ≤ n) first occurs at the rth place. Show that A_r = B_r for 0 ≤ r ≤ n. [Hint: Use induction on n.]


19. Arc sine laws. For the standard Wiener process W, let A be the amount of time u during the time interval [0, t] for which W(u) > 0; let L be the time of the last visit to the origin before t; and let R be the time when W attains its maximum in [0, t]. Show that A, L, and R have the same distribution function F(x) = (2/π) sin^{-1}√(x/t) for 0 ≤ x ≤ t. [Hint: Use the results of Problems (13.12.15)–(13.12.18).]

20. Let W be a standard Wiener process, and let U_x be the amount of time spent below the level x (≥ 0) during the time interval (0, 1), that is, U_x = ∫_0^1 1{W(t) < x} dt. Show that U_x has density function

f_{U_x}(u) = ( 1/(π√(u(1 − u))) ) exp( −x²/(2u) ),   0 < u < 1.

Show also that

V_x = sup{ t ≤ 1 : W_t = x } if this set is non-empty, and V_x = 1 otherwise,

has the same distribution as U_x.

21. Let sign(x) = 1 if x > 0 and sign(x) = −1 otherwise. Show that V_t = ∫_0^t sign(W_s) dW_s defines a standard Wiener process if W is itself such a process.

22. After the level of an industrial process has been set at its desired value, it wanders in a random fashion. To counteract this the process is periodically reset to this desired value, at times 0, T, 2T, .... If W_t is the deviation from the desired level, t units of time after a reset, then {W_t : 0 ≤ t < T} can be modelled by a standard Wiener process. The behaviour of the process after a reset is independent of its behaviour before the reset. While W_t is outside the range (−a, a) the output from the process is unsatisfactory and a cost is incurred at rate C per unit time. The cost of each reset is R. Show that the period T which minimises the long-run average cost per unit time is T*, where

R = C ∫_0^{T*} ( a/√(2πt) ) exp( −a²/(2t) ) dt.

23. An economy is governed by the Black–Scholes model in which the stock price behaves as a geometric Brownian motion with volatility σ, and there is a constant interest rate r. An investor likes to have a constant proportion γ (∈ (0, 1)) of the current value of her self-financing portfolio in stock and the remainder in the bond. Show that the value function of her portfolio has the form V_t = f(t)S_t^γ, where f(t) = c exp{ (1 − γ)(½γσ² + r)t } for some constant c depending on her initial wealth.

24. Let u(t, x) be twice continuously differentiable in x and once in t, for x ∈ ℝ and t ∈ [0, T]. Let W be the standard Wiener process. Show that u is a solution of the heat equation

∂u/∂t = ½ ∂²u/∂x²

if and only if the process U_t = u(T − t, W_t), 0 ≤ t ≤ T, has zero drift.


1 Events and their probabilities

1.2 Solutions. Events as sets

1. (a) Let a E (U Ai )c . Then a ¢ U Ai , so that a E A� for all i . Hence (U Ai )C £ n A� . Conversely, if a E n A� , then a ¢ Ai for every i . Hence a ¢ U Ai , and so n A� £ (U Ai )c . The first De Morgan law follows.

(b) Applying part (a) to the family {A� : i E l}, we obtain that (Ui A�t = ni (A�)C = ni Ai · Taking the complement of each side yields the second law.

2. Clearly

(i) A n B = (AC U BC)C , (ii) A \ B = A n BC = (AC U B)c ,

(iii) A b,. B = (A \ B) U (B \ A) = (AC U B)c U (A U BC)c . Now :F is closed under the operations of countable unions and complements, and therefore each of these sets lies in :F.

3. Let us number the players 1 , 2, . . . , 2n in the order in which they appear in the initial table of draws. The set of victors in the first round is a point in the space Vn = { 1 , 2} x {3 , 4} x . . . x {2n - l , 2n } . Renumbering these victors in the same way as done for the initial draw, the set o f second-round victors can be thought of as a point in the space Vn- l , and so on. The sample space of all possible outcomes of the tournament may therefore be taken to be Vn x Vn- l x . . . x Vb a set containing

22n- 1 22n-2 21 22n - 1 . . . . = pomts.

Should we be interested in the ultimate winner only, we may take as sample space the set { I , 2, . . . , 2n } of all possible winners.

4. We must check that fJ, satisfies the definition of a a-field:

(a) 0 E :F, and therefore 0 = 0 n B E fJ" (b) if AI , A2 , . · · E :F, then Ui (Ai n B) = (Ui Ai ) n B E fJ" (c) if A E :F, then AC E :Fso that B \ (A n B) = AC n B E fJ,.

Note that fJ, i s a a-field of subsets of B but not a a -field of subsets of n, since C E fJ, does not imply that CC = n \ C E fJ,. 5. (a), (b), and (d) are identically true; (c) is true if and only if A £ C.

1.3 Solutions. Probability

1. (i) We have (using the fact that lP' is a non-decreasing set function) that

ℙ(A ∩ B) = ℙ(A) + ℙ(B) − ℙ(A ∪ B) ≥ ℙ(A) + ℙ(B) − 1 = 1/12.


Also, since A ∩ B ⊆ A and A ∩ B ⊆ B, ℙ(A ∩ B) ≤ min{ℙ(A), ℙ(B)} = 1/3. These bounds are attained in the following example. Pick a number at random from {1, 2, ..., 12}.

Taking A = {1, 2, ..., 9} and B = {9, 10, 11, 12}, we find that A ∩ B = {9}, and so ℙ(A) = 3/4, ℙ(B) = 1/3, ℙ(A ∩ B) = 1/12. To attain the upper bound for ℙ(A ∩ B), take A = {1, 2, ..., 9} and B = {1, 2, 3, 4}.

(ii) Likewise we have in this case ℙ(A ∪ B) ≤ min{ℙ(A) + ℙ(B), 1} = 1, and ℙ(A ∪ B) ≥ max{ℙ(A), ℙ(B)} = 3/4. These bounds are attained in the examples above.

2. (i) We have (using the continuity property of JP') that

JP'(no head ever) = lim JP'(no head in first n tosses) = lim 2-n = 0, n-+oo n-+oo

so that JP'(some head turns up) = I - JP'(no head ever) = 1 .

(ii) Given a fixed sequence s of heads and tails of length k , we consider the sequence of tosses arranged in disjoint groups of consecutive outcomes, each group being of length k. There is probability 2-k

that any given one of these is s, independently of the others. The event {one of the first n such groups is s } is a subset of the event {s occurs in the first nk tosses} . Hence (using the general properties of probability measures) we have that

JP'(s turns up eventually) = lim JP'(s occurs in the first nk tosses) n-+oo :::: lim JP'(s occurs as one of the first n groups) n-+oo = I - lim JP'(none of the first n groups is s) n-+oo

= I - lim ( 1 - 2-k )n = 1 . n-+oo

3. Lay out the saucers in order, say as RRWWSS. The cups may be arranged in 6 ! ways, but since each pair of a given colour may be switched without changing the appearance, there are 6 ! -:- (2 ! )3 = 90 distinct arrangements. By assumption these are equally likely. In how many such arrangements is no cup on a saucer of the same colour? The only acceptable arrangements in which cups of the same colour are paired off are WWSSRR and SSRRWW; by inspection, there are a further eight arrangements in which the first pair of cups is either SW or WS, the second pair is either RS or SR, and the third either RW or WR. Hence the required probability is 10/90 = � . 4. We prove this by induction on n , considering first the case n = 2 . Certainly B = (A n B) U (B \ A) is a union of disjoint sets, so that JP'(B) = JP'(A n B) + JP'(B \ A) . Similarly A U B = A U (B \ A), and so

JP'(A U B) = JP'(A) + JP'(B \ A) = JP'(A) + {JP'(B) - JP'(A n B) } .

Hence the result is true for n = 2. Let m :::: 2 and suppose that the result is true for n :::::: m . Then it is true for pairs of events, so that

JP'(ul

Ai) = JP'(U Ai) + JP'(Am+l ) - JP'{ (U Ai) n Am+l } 1 1 1

= JP'(U Ai) + JP'(Am+l ) - JP'{ U(Ai n Am+l ) } . 1 1

Using the induction hypothesis, we may expand the two relevant terms on the right-hand side to obtain the result.


Let AI , A2 , and A3 be the respective events that you fail to obtain the ultimate, penultimate, and ante-penultimate Vice-Chancellors. Then the required probability is, by symmetry,

3 1 - JP>( U Ai) = 1 - 3JP>(AI ) + 3JP>(A I n A2) - JP>(A I n A2 n A3)

1 = 1 - 3(�)6 + 3(�)6 _ (�)6 .

5. By the continuity of JP>, Exercise ( 1 .2. 1) , and Problem ( 1 . 8 . 1 1 ) ,

= 1 - lim JP> (Un

A�) :::: 1 - lim � JP>(A�) = 1 . n .... oo n .... oo L--r=I r=I

6. We have that 1 = JP>(U Ar) = L JP>(Ar ) - L JP>(Ar n As ) = np - in (n - 1)q . Hence 1 r r<s

p :::: n- I , and �n (n - l)q = np - 1 ::::: n - 1 . 7. Since at least one of the Ar occurs,

1 = JP> (U Ar) = L JP>(Ar ) - L JP>(Ar n As ) + L JP>(Ar n As n At ) 1 r r<s r<s<t

Since at least two of the events occur with probability i ,

� = JP>( U (Ar n As ») = L JP>(Ar n As ) - � r<s r<s

L JP>(Ar n As n At n Au ) + . . . . r<s t<u

(r,s)#(t ,u)

By a careful consideration of the first three terms in the latter series, we find that

Hence � = np - (3)x , so that p :::: 3/(2n) . Also, (�)q = 2np - � , whence q ::::: 4/n .

1.4 Solutions. Conditional probability

1. By the definition of conditional probability,

JP>(A I B) -JP>(A n B) _ JP>(B n A) JP>(A) _ JP>(B I A) JP>(A)

- JP>(B) - JP>(A) JP>(B) - JP>(B)


if lP'(A)lP'(B) =f: O. Hence lP'(A I B)

lP'(A)

lP'(B I A)

lP'(B) ,

whence the last part is immediate.

2. Set Ao = n for notational convenience. Expand each term on the right-hand side to obtain

3. Let M be the event that the first coin is double-headed, R the event that it is double-tailed, and N the event that it is normal. Let Ht be the event that the lower face is a head on the ith toss, T� the event that the upper face is a tail on the i th toss, and so on. Then, using conditional probability ad nauseam, we find:

(i)

(ii)

(iii)

(iv)

lP'(Hh = �lP'(H/ I M) + !lP'(H/ I R) + �lP'(Hll I N) = � + 0 + � . � = l I I lP'(H/ n HJ ) lP'(M) 2 / 3 2 lP' H, H - - -- - -( 1 I u ) - lP'(HJ ) - lP'(H/ ) - 5" 5" - 3 ·

lP'(H? I HJ ) = l · lP'(M I HJ ) + �lP'(N I HJ )

= lP'(Hli I HJ ) + 1 ( 1 - lP'(H1I I HJ )) = j + � . 1 = � .

2 I 2 lP'(HI2 n Hu

l n Hu2) lP'(M) � 4

lP'(HI I H n H ) - - - -"- - -u u - lP'(HJ n HJ) - 1 . lP'(M) + ! . lP'(N) - � + Ih - 5 ·

(v) From (iv), the probability that he discards a double-headed coin is � , the probability that he

discards a normal coin is ! . (There is of course no chance of it being double-tailed.) Hence, by conditioning on the discard,

4. The final calculation of j refers not to a single draw of one ball from an urn containing three, but rather to a composite experiment comprising more than one stage (in this case, two stages). While it is true that {two black, one white} is the only fixed collection of balls for which a random choice is black with probability j , the composition of the urn is not determined prior to the final draw.

After all, if Carroll's argument were correct then it would apply also in the situation when the urn originally contains just one ball, either black or white. The final probability is now i , implying that the original ball was one half black and one half white ! Carroll was himself aware of the fallacy in this argument.

5. (a) One cannot compute probabilities without knowing the rules governing the conditional prob­abilities. If the first door chosen conceals a goat, then the presenter has no choice in the door to be opened, since exactly one of the remaining doors conceals a goat. If the first door conceals the car, then a choice is necessary, and this is governed by the protocol of the presenter. Consider two 'extremal' protocols for this latter situation.

(i) The presenter opens a door chosen at random from the two available.

(ii) There is some ordering of the doors (left to right, perhaps) and the presenter opens the earlier door in this ordering which conceals a goat.

Analysis of the two situations yields p = � under (i), and p = � under (ii) .


Let ex E [ � , � 1, and suppose the presenter possesses a coin which falls with heads upwards with probability 13 = 6a - 3. He flips the coin before the show, and adopts strategy (i) if and only if the coin shows heads. The probability in question is now � 13 + � ( 1 - 13) = ex .

You never lose by swapping, but whether you gain depends on the presenter's protocol.

(b) Let D denote the first door chosen, and consider the following protocols:

(iii) If D conceals a goat, open it. Otherwise open one of the other two doors at random. In this case p = O.

(iv) If D conceals the car, open it. Otherwise open the unique remaining door which conceals a goat. In this case p = 1 .

As in part (a), a randomized algorithm provides the protocol necessary for the last part.

6. This is immediate by the definition of conditional probability.

7. Let Cj be the colour of the i th ball picked, and use the obvious notation.

(a) Since each urn contains the same number n - 1 of balls, the second ball picked is equally likely to be any of the n (n - 1 ) available. One half of these balls are magenta, whence lP'(C2 = M) = � . (b) By conditioning on the choice of urn,

lP'(C2 = M I Cl = M) = lP'(Cl , C2 = M) = t (n - r) (n - r - 1 ) /� = � .

lP'(C I = M) r= 1 n (n - l ) (n - 2) 2 3

1. Clearly

1.5 Solutions. Independence

lP'(AC n B) = lP'(B \ {A n B}) = lP'(B) - lP'(A n B) = lP'(B) - lP'(A)lP'(B) = lP'(AC)lP'(B) .

For the final part, apply the first part to the pair B, A C • 2. Suppose i < j and m < n . If j < m, then Aij and Amn are determined by distinct independent rolls, and are therefore independent. For the case j = m we have that

lP'(Aij n Ajn) = lP'(i th, jth, and nth rolls show s!lIlle number)

6 = L ilP' (jth and nth rolls both show r I i th shows r ) = -l6 = lP'(Aij )lP'(Ajn ) , r=1

as required. However, if i =F j =F k,

3. That (a) implies (b) is trivial. Suppose then that (b) holds. Consider the outcomes numbered i I , i2 , · · · , im , and let Uj E {H, T} for 1 ::s j ::s m. Let Sj be the set of all sequences of length M = max{ij : 1 ::s j ::s m} showing Uj in the ij th position. Clearly I Sj l = 2M- 1 and I nj Sj l = 2M-m . Therefore,


so that JP (nj Sj) = I1j JP(Sj ) .

4 . Suppose I A I = a, I B I = b , I A n B I = c, and A and B are independent. Then JP(A n B) = JP(A)JP(B), which is to say that clp = (alp) . (blp), and hence ab = pc. If ab of:. 0 then p l ab (i.e. , p divides ab). However, p is prime, and hence either p I a or p I b. Therefore, either A = n or B = n (or both).

5. (a) Flip two coins; let A be the event that the first shows H, let B be the event that the second shows H, and let C be the event that they show the same. Then A and B are independent, but not conditionally independent given C .

(b) Roll two dice; let A b e the event that the smaller i s 3, let B be the event that the larger i s 6, and let C be the event that the smaller score is no more than 3, and the larger is 4 or more. Then A and B are conditionally independent given C, but not independent.

(c) The definitions are equivalent if JP(C) = 1 . 6. (-&)7 < � .

7. (a) JP(A n B) = k = 1 . � = JP(A)JP(B), and JP(B n C) = i = � . i = JP(B)JP(C) .

(b) JP(A n C) = 0 of:. JP(A)JP(C) .

(c) Only in the trivial cases when children are either almost surely boys or almost surely girls.

(d) No.

S. No. JP(all alike) = 1 . 9. JP(l st shows r and sum is 7) = :k = � . � = JP(l st shows r)JP(sum is 7) .

1.7 Solutions. Worked examples

1. Write EF for the event that there is an open road from E to F, and EPC for the complement of this event; write E � F if there is an open route from E to F, and E + F if there is none. Now {A � C} = AB n Be, so that

JP(AB I A + e) = JP(AB , A + C) = JP(AB, B + C) = ( 1 - p2)p2 . JP(A + e) 1 - JP(A � C) 1 - (1 - p2)2

By a similar calculation (or otherwise) in the second case, one obtains the same answer:

2. Let A be the event of exactly one ace, and KK be the event of exactly two kings. Then JP(A I KK) = JP(A n KK)/JP(KK) . Now, by counting acceptable combinations,

so the required probability is

C(4,1) C(4,2) C(44,10) / [ C(4,2) C(48,11) ] = (7 · 11 · 37)/(3 · 46 · 47) ≈ 0.44.
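A quick numerical confirmation of this ratio, computed with exact binomial coefficients under the reading of the counts used above (a thirteen-card hand, which is an inference from the arithmetic rather than something stated explicitly here):

from math import comb

p = comb(4, 1) * comb(4, 2) * comb(44, 10) / (comb(4, 2) * comb(48, 11))
print(p, 7 * 11 * 37 / (3 * 46 * 47))   # both about 0.439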


3. First method: Suppose that the coin is being tossed by a special machine which is not switched off when the walker is absorbed. If the machine ever produces N heads in succession, then either the game finishes at this point or it is already over. From Exercise ( 1 .3 .2), such a sequence of N heads must (with probability one) occur sooner or later.

Alternative method: Write down the difference equations for Pk o the probability the game finishes at o having started at k, and for Jh , the corresponding probability that the game finishes at N; actually these two difference equations are the same, but the respective boundary conditions are different. Solve these equations and add their solutions to obtain the total 1 . 4. It is a tricky question. One of the present authors is in agreement, since if lP'(A I C ) > lP'(B I C) and lP'(A I CC) > lP'(B I CC) then

lP'(A) = lP'(A I C)lP'(C) + lP'(A I CC)lP'(CC)

> lP'(B I C)lP'(C) + lP'(B I CC)lP'(Cc) = lP'(B) .

The other author is more suspicious of the question, and points out that there is a difficulty arising from the use of the word 'you' . In Example ( 1 .7 . 10) , Simpson's paradox, whilst drug I is preferable to drug II for both males and females, it is drug II that wins overall.

5. Let Lk be the label of the kth card. Then, using symmetry,

lP'(Lk = m) 1 / 1 lP'(Lk = m i Lk > Lr for 1 :s r < k) = £ ) = - -k = kim . lP'(Lk > Lr or 1 :s r < k m

1.8 Solutions to problems

1. (a) Method I: There are 36 equally likely outcomes, and just 10 of these contain exactly one six. The answer is therefore � = is. Method II : Since the throws have independent outcomes,

lP'(first is 6, second is not 6) = lP'(first is 6)lP'(second is not 6) = � . i = � . There is an equal probability of the event {first is not 6, second is 6} .

(b) A die shows an odd number with probability ! ; by independence, lP'(both odd) = ! . ! = ! . (c) Write S for the sum, and {i, j } for the event that the first is i and the second j . Then lP'(S = 4) = lP'( 1 , 3) + lP'(2, 2) + lP'(3, 1 ) = �. (d) Similarly

lP'(S divisible by 3) = lP'(S = 3) + lP'(S = 6) + lP'(S = 9) + lP'(S = 12)

= {lP'(I , 2) + lP'(2, I ) }

+ {lP'( 1 , 5) + lP'(2, 4) + lP'(3, 3) + lP'(4, 2) + lP'(5, I ) }

+ {lP'(3, 6) + lP'(4, 5) + lP'(5, 4) + lP'(6, 3) } + lP'(6, 6)

= �� = t · 2. (a) By independence, lP'(n - 1 tails, followed by a head) = rn . (b) If n is odd, lP'(# heads = # tails) = 0; #A denotes the cardinality of the set A. If n is even, there are (n/2) sequences of outcomes with !n heads and !n tails. Any given sequence of heads and tails

has probability 2-n ; therefore lP'(# heads = # tails) = 2-n (n/2) . 141


(c) There are (�) sequences containing 2 heads and n - 2 tails. Each sequence has probability 2-n , and therefore Jl>(exactly two heads) = (�) 2-n . (d) Clearly

Jl>(at least 2 heads) = 1 - Jl>(no heads) - Jl>(exactly one head) = 1 - 2-n - (�) 2-n •

3. (a) Recall De Morgan's Law (Exercise ( 1 .2. 1 )) : ni Ai = (Ui A�t , which lies in J=' since it is the complement of a countable union of complements of sets in :F.

(b) Je is a a-field because:

(i) 0 E J='and 0 E 9.; therefore 0 E Je.

(ii) If AI , A2 , . . . i s a sequence of sets belonging to both J='and 9., then their union lies in both J='and 9., which is to say that Je is closed under the operation of taking countable unions.

(iii) Likewise AC is in Je if A is in both J='and 9.. (c) We display an example. Let

Ω = {a, b, c},  F = { {a}, {b, c}, ∅, Ω },  G = { {a, b}, {c}, ∅, Ω }.

Then H = F ∪ G is given by H = { {a}, {c}, {a, b}, {b, c}, ∅, Ω }. Note that {a} ∈ H and {c} ∈ H, but the union {a, c} is not in H, which is therefore not a σ-field.
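The counterexample can be verified mechanically; a tiny illustrative Python check:

# The union H of the two sigma-fields on {a, b, c} is not closed under
# unions: {a} and {c} belong to H but {a, c} does not.
F = {frozenset(s) for s in [{"a"}, {"b", "c"}, set(), {"a", "b", "c"}]}
G = {frozenset(s) for s in [{"a", "b"}, {"c"}, set(), {"a", "b", "c"}]}
H = F | G
print(frozenset({"a"}) in H, frozenset({"c"}) in H)   # True True
print(frozenset({"a", "c"}) in H)                     # False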

4. In each case J='may be taken to be the set of all subsets of Q, and the probability of any member of J=' is the sum of the probabilities of the elements therein.

(a) Q = {H, T}3 , the set of all triples of heads (H) and tails (T). With the usual assumption of independence, the probability of any given triple containing h heads and t = 3 - h tails is ph ( 1 _ p)t , where p is the probability of heads on each throw.

(b) In the obvious notation, Q = {U, V}2 = {UU, VV, UV, VU} . Also Jl>(UU) = Jl>(VV) = � . � and

Jl>(UV) = Jl>(VU) = � . �. (c) Q is the set of finite sequences of tails followed by a head, {Tn H : n � O}, together with the infinite sequence TOO of tails. Now, Jl>(TnH) = ( 1 - p)n p, and Jl>(Too) = limn .... oo{1 - p)n = 0 if p #- O. 5. As usual, Jl>(A /::,. B) = Jl> ( A U B) \ Jl>(A n B)) = Jl>(A U B) - Jl>(A n B) .

6. Clearly, by Exercise ( 1 .4.2),

Jl>(A U B U C) = Jl> ( AC n BC n CC)C ) = 1 - Jl>(AC n Bc n CC)

= 1 - Jl>(Ac I BC n CC)Jl>(Bc I CC)Jl>(Cc) .

7 . (a) If A i s independent of itself, then Jl>(A) = Jl>(A n A ) = Jl>(A)2 , s o that Jl>(A) = 0 or 1 . (b) If Jl>(A) = 0 then 0 = Jl>(A n B ) = Jl>(A)Jl>(B) for all B . If Jl>(A) = 1 then Jl>(A n B ) = Jl>(B), so that Jl>(A n B) = Jl>(A)Jl>(B) .

8. Q U 0 = Q and Q n 0 = 0, and therefore 1 = Jl>(Q U 0) = Jl>(Q) + Jl>(0) = 1 + Jl>(0) , implying that Jl>(0) = O. 9. (i) Q(0) = Jl>(0 I B) = O. Also Q(Q) = Jl>(Q I B) = Jl>(B)/Jl>(B) = 1 . (ii) Let AI , A2 , • . . be disjoint members of :F. Then {Ai n B : i � I } are disjoint members of :F, implying that


Finally, since Q is a probability measure,

Q(A n C) P(A n C I B) P(A n B n C) Q(A I C) = Q(C) = P(C I B) = P(B n C) = P(A I B n C) .

The order of the conditioning (C before B, or vice versa) is thus irrelevant.

10. As usual,

11. The first inequality is trivially true if n = 1 . Let m :::: 1 and assume that the inequality holds for n :::; m. Then

:::; P (U Ai) + P(Am+1 ) :::; 'I: P(Ai ) , 1 1

by the hypothesis. The result follows by induction. Secondly, by the first part,

12. We have that

P(0 Ai) = p ( (Y Afr) = 1 - P(Y Af)

13. Clearly,

= 1 - �P(AD + �P(Af n Ai) - . . . + (- l )np (n Af) by Exercise ( 1 .3 .4) I I <J 1

= 1 - n + LP(Ai ) + (n) - LP(Ai U Aj ) _ (n) + ' " . 2 . . 3 I I <J

+ (- It (:) - (-l )np (Y Ai) using De Morgan's laws again

= (1 - 1 )n + � P(Ai ) - . . . - (_1 )np ( Y Ai) by the binomial theorem.

P(Nk) = L p(n Ai n Ai) . SS;; ( 1 , 2, . . . ,nj ieS us

IS I=k For any such given S, we write As = ni eS Ai . Then

p( n Ai n Ai) = P(As) - L P(Asu{j } ) + L P(Asu{j,k} ) - . . . i eS us us j <k

j,krtS


by Exercise ( 1 .3 .4). Hence

where a typical summation is over all subsets S of { I , 2, . . . , n } having the required cardinality.

Let Ai be the event that a copy of the i th bust is obtained. Then, by symmetry,

where aj is the probability that the j most recent Vice-Chancellors are obtained. Now a3 is given in Exercise ( 1 . 3 .4), and a4 and as may be calculated similarly.

14. Assuming the conditional probabilities are defined,

15. (a) We have that

P(N = 2 I S = 4) = P({N = 2} n {S = 4}) = P(S = 4 I N = 2)P(N = 2) P(S = 4) L:i P(S = 4 I N = i)P(N = i )

(b) Secondly,

1 1 U · 4" = 1 1 1 1 3 1 1 1 · 6 · 2 + U · 4" + m; · s + 64 · 16

P(S = 4 I N = 2) 1 + P(S = 4 I N = 4)ft P(S = 4 I N even) = --------'-------�

P(N even)

-b . 1 + t-r . ft 4233 + 1 _ 6 _ _ --,-.--,.,.-- 1 1 - 4 3 · 4" + 16 + . . . 4 3

(c) Writing D for the number shown by the first die,

1 1 1 P(N = 2, S = 4, D = 1) 0 · 0 · 4" P(N = 2 I S = 4, D = 1 ) = = 1 1 1 1 2 1 1 1 · P(S = 4, D = 1 ) 6 . 6 . 4" + 0 . 30 . s + 64 . 16

(d) Writing M for the maximum number shown, if 1 :::; r :::; 6,

P(M :::; r) = fp(M :::; r I N = j)2-j = f (�) j Ij = � ( 1 - � r1 = _r_.

j=1 j=1 6 2 12 12 12 - r

Finally, P(M = r) = P(M :::; r) - P(M :::; r - 1) . 16. (a) w E B if and only if, for all n, W E U�n Ai , which is to say that W belongs to infinitely many of the An . (b) W E e if and only if, for some n, W E n�n Ai , which is to say that W belongs to all but a finite number of the An .


(c) It suffices to note that B is a countable intersection of countable unions of events, and is therefore an event. (d) We have that 00 00

Cn = n Ai � An � U Ai = Bn , i=n i=n

and therefore lP'(Cn) ::: IP'(An) ::: IP'(Bn) . By the continuity of probability measures ( 1 .3 .5), if Cn -+ C then IP'(Cn) -+ IP'(C), and if Bn -+ B then IP'(Bn) -+ IP'(B) . If B = C = A then

IP'(A) = IP'(C) ::: lim IP'(An) ::: IP'(C) = IP'(A) . n-+oo

17. If Bn and Cn are independent for all n then, using the fact that Cn � Bn ,

as n -+ 00,

and also IP'(Bn)IP'(Cn) -+ IP'(B)IP'(C) as n -+ 00, so that IP'(C) = IP'(B)IP'(C) , whence either IP'(C) = 0 or IP'(B) = l or both. In any case IP'(B n C) = IP'(B)IP'(C) .

If An -+ A then A = B = C so that IP'(A) equals 0 or 1 . 18. It is standard (Lemma ( 1 .3 .5» that IP' is continuous if it is countably additive. Suppose then that IP' is finitely additive and continuous. Let AI , A2 , . . . be disjoint events . Then Ui'" Ai = limn-+oo U1 Ai , so that, by continuity and finite-additivity,

19. The network of friendship is best represented as a square with diagonals, with the comers labelled A, B, C, and D. Draw a diagram. Each link of the network is absent with probability p. We write EF for the event that a typical link EF is present, and EP: for its complement. We write A � D for the event that A is connected to D by present links.

(d) IP'(A � D I ADc) = IP'(A � D I ADc n BCc)p + 1P'(A � D I ADc n BC) (1 - p) = { I - (1 - (1 - p)2)2 }p + (1 _ p2)2 ( 1 - p) .

(c) IP'(A � D I BCc) = 1P'(A � D I ADc n BCC)p + IP'(A � D I BCc n AD)( I - p) = { I - (1 - (1 - p)2)2 }p + ( 1 - p) .

(b) IP'(A � D I ABc) = IP'(A � D I ABC n ADc)p + 1P'(A � D I ABc n AD) ( 1 - p) = (1 - p) { 1 - p(1 - ( 1 - p)2) } P + (1 - p) .

(a) IP'(A � D) = IP'(A � D I ADc)p + IP'(A � D I AD)( I - p) = { I - (1 - (1 - p)2)2 }p2 + (1 - p2)2 p(1 - p) + ( 1 - p) .

20. We condition on the result of the first toss . If this is a head, then we require an odd number of heads in the next n - 1 tosses. Similarly, if the first toss is a tail, we require an even number of heads in the next n - 1 tosses. Hence

Pn = p(1 - Pn-} ) + ( 1 - P)Pn-1 = ( 1 - 2P)Pn- 1 + P

with Po = 1 . As an alternative to induction, we may seek a solution of the form Pn = A + BAn . Substitute this into the above equation to obtain

A + BAn = (1 - 2p)A + (1 - 2p)BAn-1 + p


and A + B = 1 . Hence A = ! , B = ! , A = 1 - 2p. 21. Let A = {run of r heads precedes run of s tails}, B = {first toss is a head}, and C = {first s tosses are tails}. Then

where p = 1 - q is the probability of heads on any single toss. Similarly P(A I B) = pr- l + P(A I BC) ( I _pr- l ) . We solve for P(A I B) and P(A I BC) , and use the fact that P(A) = P(A I B)p +P(A I BC)q, to obtain

22. (a) Since every cherry has the same chance to be this cherry, notwithstanding the fact that five are now in the pig, the probability that the cherry in question contains a stone is fa = ! . (b) Think: about it the other way round. First a random stone is removed, and then the pig chooses his fruit. This does not change the relevant probabilities. Let C be the event that the removed cherry contains a stone, and let P be the event that the pig gets at least one stone. Then P(P I C) is the probability that out of 19 cherries, 15 of which are stoned, the pig gets a stone. Therefore

P(P I C) = 1 - P (pig chooses only stoned cherries I C) = 1 - M . M . # . }g . H·

23. Label the seats 1 , 2, . . . , 2n clockwise. For the sake of definiteness, we dictate that seat 1 be occupied by a woman; this determines the sex of the occupant of every other seat. For 1 :s k :s 2n, let Ak be the event that seats k, k + 1 are occupied by one of the couples (we identify seat 2n + 1 with seat 1) . The required probability is

Now, P(Aj ) = n (n - 1) !2 /n !2 , since there are n couples who may occupy seats i and i + 1, (n - I) ! ways of distributing the remaining n - 1 women, and (n - I) ! ways of distributing the remaining n - 1 men. Similarly, if 1 :s i < j :s 2n, then { (n - 2) !2

p(Aj n Aj ) =

o

n(n - l) n !2 if l i - j l # 1

if I i - j l = 1 ,

subject to P(A 1 n A2n) = O. In general,

(n - k) !

n !

if i l < i2 < . . , < h and ij+ l - ij � 2 for 1 :s j < k, and 2n + i l - ik � 2 ; otherwise this probability is O. Hence

(2n ) n (n - k) ' P n Af = I)-Ii " Sk,n

1 k=O n .

where Sk,n is the number of ways of choosing k non-overlapping pairs of adjacent seats.

Finally, we calculate Sk,n ' Consider first the number Nk,m of ways of picking k non-overlapping pairs of adjacent seats from a line (rather than a circle) of m seats labelled 1 , 2, . . . , m. There is a one­one correspondence between the set of such arrangements and the set of (m - k)-vectors containing


k l 's and (m - 2k) O's. To see this, take such an arrangement of seats, and count 0 for an unchosen seat and 1 for a chosen pair of seats; the result is such a vector. Conversely take such a vector, read its elements in order, and construct the arrangement of seats in which each 0 corresponds to an unchosen

seat and each 1 corresponds to a chosen pair. It follows that Nk,m = (m;;k) . Turning to Sk,n , either the pair 2n, 1 is chosen or it is not. If it is chosen, we require another

k - 1 pairs out of a line of 2n - 2 seats. If it is not chosen, we require k pairs out of a line of 2n seats. Therefore

(2n - k - 1) (2n - k) (2n - k) 2n Sk,n = Nk-l ,2n-2 + Nk,2n = k - l + k = k 2n - k '

24. Think about the experiment as laying down the b + r balls from left to right in a random order. The number of possible orderings equals the number of ways of placing the blue balls, namely (br) . The number of ways of placing the balls so that the first k are blue, and the next red, is the number of ways of placing the red balls so that the first is in position k + 1 and the remainder are amongst the r + b - k - 1 places to the right, namely (+�=�-1 ) . The required result follows.

The probability that the last ball is red is r / (r + b) , the same as the chance of being red for the ball in any other given position in the ordering.

25. We argue by induction on the total number of balls in the urn . Let Pac be the probability that the last ball is azure, and suppose that Pac = � whenever a, C ::: 1 , a + c � k. Let a and a be such that a, a ::: 1, a + a = k + 1 . Let Ai be the event that i azure balls are drawn before the first carmine ball, and let Cj be the event that j carmine balls are drawn before the first azure ball. We have, by taking conditional probabilities and using the induction hypothesis, that

a 0'

PaO' = L Pa-i,O'lP'(Ai ) + L Pa,O'-jlP'(Cj ) i=1 j=1

a- I 0'-1 = Po,O'lP'(Aa ) + Pa,OlP'(CO' ) + � L lP'(Ai ) + � L lP'(Cj ) .

i=1 j=1 Now po,O' = 0 and Pa,O = 1 . Also, by an easy calculation,

lP'(Aa) = _a- . a - I a l a ! - lP'(C ) a + a a + a - I ' " a + 1 =

(a + a) ! - 0' .

It follows from the above two equations that

PaO' = � (i= lP'(Ai ) + t lP'(Cj )) + ! (lP'(CO' ) - lP'(Aa) ) = ! . i=1 j=1

26. (a) If she says the ace of hearts is present, then this imparts no information about the other card, which is equally likely to be any of the three other possibilities .

(b) In the given protocol, interchange hearts and diamonds.

27. Writing A if A tells the truth, and A C otherwise, etc. , the only outcomes consistent with D telling the truth are ABCD, ABcCcD, NBCcD, and NBcCD, with a total probability of H . Likewise, the only outcomes consistent with D lying are A cBcCcDc, A cBCDc , ABcCDc, and ABCcDc , with a total probability of � . Writing S for the given statement, we have that

S 11"(0 n S)

lP'(D I ) = lP'(D n S) + lP'(DC n S)


1 3 1 3 lIT = _ .ll + 28 41 81 lIT


Eddington himself thought the answer to be #; hence the 'controversy' . He argued that a truthful denial leaves things unresolved, so that if, for example, B truthfully denies that C contradicts D, then we cannot deduce that C supports D. He deduced that the only sequences which are inconsistent with the given statement are ABcCD and ABcCcDc, and therefore

Which side are you on?

if 25 lP(D I S) = 25 46 = 7 1

. 81 + 81

28. Let Br be the event that the rth vertex of a randomly selected cube is blue, and note that lP(Br) = to. By Boole's inequality,

8 8 1P ( U Br) ::: L lP(Br) = !o < 1 ,

r=l r=l so at least 20 per cent of such cubes have only red vertices.

29. (a) lP(B I A) = lP(A n B)/lP(A) = lP(A I B)lP(B)/lP(A) > lP(B) . (b) lP(A I BC) = lP(A n BC)/lP(BC) = {lP(A) - lP(A n B)}/lP(BC) < lP(A) . (c) No. Consider the case A n C = 0.

30. The number of possible combinations of birthdays of m people is 365m ; the number of combina­tions of different birthdays is 365 ! / (365 - m) ! . Use your calculator for the final part.

32. In the obvious notation, lP(wS, xH, yD, zC) = (!;) (�) (�) (�3) / (W . Now use your calcula­

tor. Thrning to the 'shape vector' (w, x , y , z) with w ::: x ::: y ::: z,

lP(w, x , y , z) = { 41P(wS, xH, yD, zC) if w =j:. x = y = z,

121P(wS, xH, yD, zC) if w = x =j:. y =j:. z,

on counting the disjoint ways of obtaining the shapes in question.

33. Use your calculator, and divide each of the following by (552) .


34. Divide each of the following by 65 .

6 ! 5 ! 6 ! 5 !

3 ! (2 !)2 ' 3 ! (2 !) 3 ' 6 ! ,

6 ! 5 !

(5 !)2 ·

6 ! 5 !

6 ! 5 !

2 ! (3 !)2 '

6 ! 5 !

(4 !)2 '


35. Let 8r denote the event that you receive r similar answers, and T the event that they are correct. Denote the event that your interlocutor is a tourist by V . Then T n Vc = 0, and

Hence:

lP'(T I 8 ) = lP'(T n V n 8r) = lP'(T n 8r I V)lP'(V) . r lP'(8r) lP'(8r)

(a) lP'(T I 81 ) = i x VI = ! . (b) lP'(T I 82) = (i )2 . V [{ <i )2 + ( i)2H + j] = ! . (c) lP'(T I 83) = ( i )3 . V [{ (i )3 + (i)3H + j] = iu. (d) lP'(T I 84) = (i )4 . V [{ (i )4 + ( i)4H + j] = � .

(e) If the last answer differs, then the speaker is surely a tourist, so the required probability is

36. Let E (respectively W) denote the event that the answer East (respectively West) is given.

(a) Using conditional probability,

(Bas I E) ElP'(E I East correct) E · � · i lP' t correct =

lP'(E) =

!E + (� . i + j) ( I + E) = E,

E (� · i + 1) lP'(East correct I W) = 1 1 2 3 = E .

E (iJ + 3) + 3 < 4 ( 1 - E)

(b) Likewise, one obtains for the answer EE,

and for the answer WW,

(c) Similarly for EEE,


and for WWW, E { (�) (i )3 + H l IE

E [(� ) ( i )3 + 1] + (1 - E) � (i)3 = 9 + 2E ·

Then for E = -lu, the first is lJ,:; the second is i , as you would expect if you look at Problem (1 .8.35) . 37. Use induction. The inductive step employs Boole's inequality and the fact that

38. We propose to prove by induction that

lP' ( U Ar) � t lP'(Ar) - L lP'(Ar n AI ) · r=l r=l 2�r�n

There is nothing special about the choice of A I in this inequality, which will therefore hold with any suffix k playing the role of the suffix 1 . Kounias's inequality is then implied.

The above inequality holds trivially when n = 1 . Assume that it holds for some value of n (� 1) . We have that

lP'CLJ Ar) = lP'( U Ar) + lP'(An+l ) - lP' ( An+! n U Ar) r=l r=l r=l

� t lP'(Ar) - L lP'(Ar n AI ) + lP'(An+l ) - lP'( An+l n U Ar) r=l 2�r�n r=l n+l

� LlP'(Ar) - L lP'(Ar n AI ) r=l 2�r�n+1

since lP'(An+1 n AI ) � lP'(An+1 n U�=l Ar ) . 39. We take n � 2. We may assume without loss of generality that the seats are labelled 1 , 2, . . . , n, and that the passengers are labelled by their seat assignments. Write F for the event that the last passenger finds his assigned seat to be free. Let K (� 2) be the seat taken by passenger 1 , so that lP'(F) = (n- l)- l Lk=2 ak where ak = lP'(F I K = k) . Note that an = O. Passengers 2, 3, . . . , K- l occupy their correct seats. Passenger K either occupies seat 1 , in which case all subsequent passengers take their correct seats, or he occupies some seat L satisfying L > K. In the latter case, passengers K + 1 , K + 2, . . . , L - 1 are correctly seated. We obtain thus that

2 � k < n .

Therefore ak = i for 2 � k < n , by induction, and so lP'(F) = ! (n - 2)/(n - 1 ) .


2 Random variables and their distributions

2.1 Solutions. Random variables

1. (i) If a > 0, x E JR, then {w : aX(w) ::; x } = {w : X (w) ::; x/a} E F' since X is a random variable. If a < 0,

{w : aX(w) ::; x} = {w : X (w) � x/a} = { U1 {w : X (w) ::; � - � } r n�

which lies in F'since it is the complement of a countable union of members of :F. If a = 0,

{ 0 if x < 0, {w : aX(w) ::; x} = ,... •• ifx � O;

in either case, the event lies in :F.

(ii) For W E n, X(w) - X(w) = 0, so that X - X is the zero random variable (that this is a random variable follows from part (i) with a = 0). Similarly X(w) + X(w) = 2X (w). 2. Set Y = aX + b. We have that

{ IP'(X ::; (y - b)/a) = F ( y - b)/a) if a > 0, IP'(Y < y) =

- IP' (X � (y - b)/a) = 1 - lirnxt (y -b)/a F(x) if a < 0.

Finally, if a = 0, then Y = b, so that IP'(Y ::; y) equals ° if b > y and 1 if b ::; y .

3 . Assume that any specified sequence o f heads and tails with length n has probability 2-n . There are exactly (k) such sequences with k heads.

If heads occurs with probability p then, assuming the independence of outcomes, the probability of any given sequence of k heads and n -k tails is pk ( 1 -p )n-k . The answer is therefore (k) pk ( 1 -p )n-k . 4. Write H = )"F + (1 - )")G . Then 1imx--*-oo H(x) = 0, limx--*oo H(x) = 1 , and clearly H is non-decreasing and right-continuous. Therefore H is a distribution function.

S. The function g(F(x)) is a distribution function whenever g is continuous and non-decreasing on [0, 1 ] , with g (O) = 0, g( l ) = 1 . This is easy to check in each special case.


2.2 Solutions. The law of averages

1. Let p be the potentially embarrassed fraction of the population, and suppose that each sampled individual would truthfully answer "yes" with probability p independently of all other individuals. In the modified procedure, the chance that someone says yes is p + � ( 1 - p) = � ( 1 + p) . If the proportion of yes's is now ¢, then 2¢ - 1 is a decent estimate of p.

The advantage of the given procedure is that it allows individuals to answer ''yes'' without their being identified with certainty as having the embarrassing property. 2. Clearly Hn + Tn = n , so that (Hn - Tn)/n = (2Hn/n) - 1 . Therefore

lP' (2P - 1 - E � � (Hn - Tn) � 2p - 1 + E) = lP' ( I �Hn - p i � n --+ I as n --+ 00, by the law of large numbers (2.2. 1 ) . 3. Let In (x) bethe indicatorfunction ofthe event {Xn � x) . Bythe lawofaverages, n- l L�=l Ir (x) converges in the sense of (2.2. 1 ) and (2.2.6) to lP'(Xn � x) = F(x) .

2.3 Solutions. Discrete and continuous variables

1. With 8 = supm l am - am- I I , we have that I F(x) - G(x) 1 � I F(am ) - F(am-d l � I F(x + 8) - F(x - 8) 1

for x E [am- I , am) . Hence G(x) approaches F(x) for any x at which F is continuous. 2. For y lying in the range of g, {Y � y} = {X � g- l (y) } E y:: 3. Certainly Y is a random variable, using the result of the previous exercise (2). Also

lP'(Y � y) = lP' (F- I (X) � y) = lP' (X � F(y)) = F(y)

as required. If F is discontinuous then F-I (x) is not defined for all x, so that Y is not well defined. If F is non-decreasing and continuous, but not strictly increasing, then F- I (x) is not always defined uniquely. Such difficulties may be circumvented by defining F-I (x) = inf{y : F(y) � x} . 4. The function AI + ( 1 -A)g is non-negative and integrable over lIUo 1 . Finally, I g is not necessarily a density, though it may be: e.g., if 1 = g = 1 , 0 � x � 1 then I(x)g(x) = 1 , 0 � x � 1 . 5. (a) If d > 1 , then floo cx-d dx = c/(d - 1 ) . Therefore I is a density function if c = d - 1 , and F(x) = 1 - x-(d- l) when this holds. If d � 1 , then I has infinite integral and cannot therefore be a density function. . (b) By differentiating F(x) = eX /( 1 + eX ) , we see that F is the distribution function, and c = 1 .

2.4 Solutions. Worked examples

1. (a) If y � 0,

lP'(X2 � y) = lP'(X � ..jY) - lP'(X < -..jY) = F(..jY) - F(-..jY). (b) We must assume that X � O . If y � 0,


(c) If - 1 ::; y ::; 1 , 00

lP'(sin X ::; y) = L lP'( 2n + 1 )11" - sin- 1 y ::; X ::; (2n + 2)11" + sin- 1 y) n=-oo

00 = L {F ( 2n + 2)11" + sin -1 y) - F ( (2n + 1 )11" - sin -1 y) } .

n=-oo

(d) lP'(G-1 (X) ::; y) = lP'(X ::; G(y» = F(G (y» . (e) If 0 ::; y ::; 1, then lP'(F(X) ::; y) = lP'(X ::; F- 1 (y» = F(F- 1 (y» = y . There is a small difficulty if F is not strictly increasing, but this is overcome by defining F- 1 (y) = sup{x : F (x) = y} . (t) lP'(G-1 (F(X» ::; y) = lP'(F(X) ::; G(y» = G(y) . 2. It is the case that, for x E R, Fy (x) and Fz (x) approach F(x) as a � -00, b � 00.

2.5 Solutions. Random vectors

1. Write fxw = lP'(X = x , W = w) . Then foo = hI = ! , flO = � , and fxw = 0 for other pairs x, w. 2. (a) We have that

(b) Secondly, { 1 - P fx.z (x , z) = �

if (x , y) = ( 1 , 0) , if (x , y) = (0, 1 ) , otherwise.

if (x , z) = (0, 0) , if (x , z) = ( 1 , 0) , otherwise.

3. Differentiating gives fx. y (x , y) = e-x /{11" ( 1 + y2) } , x 2: 0, Y E R.

4. Let A = {X ::; b, c < Y ::; d}, B = {a < X ::; b, Y ::; d} . Clearly lP'(A) = F(b, d) - F(b , c) , lP'(B) = F(b, d) - F(a, d), lP'(A U B) = F(b, d) - F(a , c) ;

now lP'(A n B) = lP'(A) + lP'(B) - lP'(A U B), which gives the answer. Draw a map of R2 and plot the regions of values of (X, Y) involved. 5. The given expression equals

lP'(X = x , Y ::; y) - lP'(X = x , Y ::; y - 1) = lP'(X = x , Y = y) .

Secondly, for $1 \le x \le y \le 6$,
$$\mathbb{P}\bigl(\min\{X,Y\} = x,\ \max\{X,Y\} = y\bigr) = \begin{cases} \frac{2}{36} & \text{if } x < y, \\ \frac{1}{36} & \text{if } x = y. \end{cases}$$

6. No, because $F$ is twice differentiable with $\partial^2 F/\partial x\,\partial y < 0$.


2.7 Solutions to problems

1. By the independence of the tosses,

$$\mathbb{P}(X > m) = \mathbb{P}(\text{first } m \text{ tosses are tails}) = (1-p)^m.$$
Hence
$$\mathbb{P}(X \le x) = \begin{cases} 1 - (1-p)^{\lfloor x \rfloor} & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}$$
Remember that $\lfloor x \rfloor$ denotes the integer part of $x$.

2. (a) If $X$ takes values $\{x_i : i \ge 1\}$ then $X = \sum_{i=1}^{\infty} x_i I_{A_i}$ where $A_i = \{X = x_i\}$.
(b) Partition the real line into intervals of the form $[k2^{-m}, (k+1)2^{-m})$, $-\infty < k < \infty$, and define $X_m = \sum_{k=-\infty}^{\infty} k2^{-m} I_{k,m}$, where $I_{k,m}$ is the indicator function of the event $\{k2^{-m} \le X < (k+1)2^{-m}\}$. Clearly $X_m$ is a random variable, and $X_m(\omega) \uparrow X(\omega)$ as $m \to \infty$ for all $\omega$.
(c) Suppose $\{X_m\}$ is a sequence of random variables such that $X_m(\omega) \uparrow X(\omega)$ for all $\omega$. Then $\{X \le x\} = \bigcap_m \{X_m \le x\}$, which is a countable intersection of events and therefore lies in $\mathcal{F}$.

3. (a) We have that

$$\{X + Y \le x\} = \bigcap_{n=1}^{\infty} \bigcup_{r \in \mathbb{Q}} \bigl(\{X \le r\} \cap \{Y \le x - r + n^{-1}\}\bigr),$$
where the union is over the set $\mathbb{Q}$ of rationals. In the second case, if $XY$ is a positive function, then $XY = \exp\{\log X + \log Y\}$; now use Exercise (2.3.2) and the above. For the general case, note first that $|Z|$ is a random variable whenever $Z$ is a random variable, since $\{|Z| \le a\} = \{Z \le a\} \setminus \{Z < -a\}$ for $a \ge 0$. Now, if $a \ge 0$, then $\{XY \le a\} = \{XY < 0\} \cup \{|XY| \le a\}$ and

$$\{XY < 0\} = \bigl(\{X < 0\} \cap \{Y > 0\}\bigr) \cup \bigl(\{X > 0\} \cap \{Y < 0\}\bigr).$$
Similar relations are valid if $a < 0$. Finally $\{\min\{X, Y\} > x\} = \{X > x\} \cap \{Y > x\}$, the intersection of events.

(b) It is enough to check that $\alpha X + \beta Y$ is a random variable whenever $\alpha, \beta \in \mathbb{R}$ and $X$, $Y$ are random variables. This follows from the argument above.

If $\Omega$ is finite, we may take as a basis the set $\{I_A : A \in \mathcal{F}\}$ of all indicator functions of events.

4. (a) $F(\frac32) - F(\frac12) = \frac12$. (b) $F(2) - F(1) = \frac12$. (c) $\mathbb{P}(X^2 \le X) = \mathbb{P}(X \le 1) = \frac12$. (d) $\mathbb{P}(X \le 2X^2) = \mathbb{P}(X \ge \frac12) = \frac34$. (e) $\mathbb{P}(X + X^2 \le \frac34) = \mathbb{P}(X \le \frac12) = \frac14$. (f) $\mathbb{P}(\sqrt{X} \le z) = \mathbb{P}(X \le z^2) = \frac12 z^2$ if $0 \le z \le \sqrt{2}$.

5. $\mathbb{P}(X = -1) = 1-p$, $\mathbb{P}(X = 0) = 0$, $\mathbb{P}(X \ge 1) = \frac12 p$.

6. There are 6 intervals of 5 minutes preceding the arrival times of buses. Each such interval has probability $\frac{5}{60} = \frac{1}{12}$, so the answer is $6 \cdot \frac{1}{12} = \frac12$.


7. Let $T$ and $B$ be the numbers of people on given typical flights of TWA and BA. From Exercise (2.1.3),
$$\mathbb{P}(T = k) = \binom{10}{k}\Bigl(\frac{9}{10}\Bigr)^k\Bigl(\frac{1}{10}\Bigr)^{10-k}, \qquad \mathbb{P}(B = k) = \binom{20}{k}\Bigl(\frac{9}{10}\Bigr)^k\Bigl(\frac{1}{10}\Bigr)^{20-k}.$$
Now
$$\mathbb{P}(\text{TWA overbooked}) = \mathbb{P}(T = 10) = \Bigl(\frac{9}{10}\Bigr)^{10}, \qquad \mathbb{P}(\text{BA overbooked}) = \mathbb{P}(B \ge 19) = 20\Bigl(\frac{9}{10}\Bigr)^{19}\Bigl(\frac{1}{10}\Bigr) + \Bigl(\frac{9}{10}\Bigr)^{20},$$
of which the latter is the larger.

8. Assuming the coins are fair, the chance of getting at least five heads is $6(\frac12)^6 + (\frac12)^6 = \frac{7}{64}$.

9. (a) We have that
$$\mathbb{P}(X^+ \le x) = \begin{cases} 0 & \text{if } x < 0, \\ F(x) & \text{if } x \ge 0. \end{cases}$$
(b) Secondly,
$$\mathbb{P}(X^- \le x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 - \lim_{y \uparrow -x} F(y) & \text{if } x \ge 0. \end{cases}$$
(c) $\mathbb{P}(|X| \le x) = \mathbb{P}(-x \le X \le x)$ if $x \ge 0$. Therefore
$$\mathbb{P}(|X| \le x) = \begin{cases} 0 & \text{if } x < 0, \\ F(x) - \lim_{y \uparrow -x} F(y) & \text{if } x \ge 0. \end{cases}$$
(d) $\mathbb{P}(-X \le x) = 1 - \lim_{y \uparrow -x} F(y)$.

10. By the continuity of probability measures (1.3.5),

$$\mathbb{P}(X = x_0) = \lim_{y \uparrow x_0} \mathbb{P}(y < X \le x_0) = F(x_0) - \lim_{y \uparrow x_0} F(y) = F(x_0) - F(x_0-),$$
using general properties of $F$. The result follows.

11. Define $m = \sup\{x : F(x) < \frac12\}$. Then $F(y) < \frac12$ for $y < m$, and $F(m) \ge \frac12$ (if $F(m) < \frac12$ then $F(m') < \frac12$ for some $m' > m$, by the right-continuity of $F$, a contradiction). Hence $m$ is a median, and is smallest with this property.

A similar argument may be used to show that $M = \sup\{x : F(x) \le \frac12\}$ is a median, and is largest with this property. The set of medians is then the closed interval $[m, M]$.

12. Let the dice show $X$ and $Y$. Write $S = X + Y$ and $f_i = \mathbb{P}(X = i)$, $g_i = \mathbb{P}(Y = i)$. Assume that $\mathbb{P}(S = 2) = \mathbb{P}(S = 7) = \mathbb{P}(S = 12) = \frac{1}{11}$. Now
$$\mathbb{P}(S = 2) = \mathbb{P}(X = 1)\mathbb{P}(Y = 1) = f_1 g_1, \qquad \mathbb{P}(S = 12) = \mathbb{P}(X = 6)\mathbb{P}(Y = 6) = f_6 g_6,$$
$$\mathbb{P}(S = 7) \ge \mathbb{P}(X = 1)\mathbb{P}(Y = 6) + \mathbb{P}(X = 6)\mathbb{P}(Y = 1) = f_1 g_6 + f_6 g_1.$$
Hence
$$\frac{1}{11} = \mathbb{P}(S = 7) \ge f_1 g_1\Bigl(\frac{g_6}{g_1} + \frac{f_6}{f_1}\Bigr) = \frac{1}{11}\Bigl(x + \frac1x\Bigr),$$
where $x = g_6/g_1$. However $x + x^{-1} > 1$ for all $x > 0$, a contradiction.

13. (a) Clearly $d_L$ satisfies (i). As for (ii), suppose that $d_L(F, G) = 0$. Then
$$F(x) \le \lim_{\epsilon \downarrow 0}\{G(x+\epsilon) + \epsilon\} = G(x)$$
and
$$F(y) \ge \lim_{\epsilon \downarrow 0}\{G(y-\epsilon) - \epsilon\} = G(y-).$$

Now G(y-) :::: G(x) if y > x; taking the limit as y ,j.. x we obtain

F(x) :::: lim G(y-) :::: G(x) , y-I-x

implying that F(x) = G(x) for all x . Finally, if F(x) � G(x + E) + E and G(x) � H(x + 8) + 8 for all x and some E, 8 > 0 , then

F(x) � H(x + 8 + E) + E + 8 for all x . A similar lower bound for F(x) is valid, implying that ddF, H) � ddF, G) + ddG, H). (b) Clearly dTV satisfies (i), and dTV(X, Y) = 0 if and only if lP'(X = Y) = 1 . By the usual triangle inequality,

1 lP'(X = k) - lP'(Z = k) 1 � 1 lP'(X = k) - lP'(Y = k) 1 + 1 lP'(Y = k) - lP'(Z = k) l , and (iii) follows by summing over k.

We have that

2 1lP'(X E A) - lP'(Y E A) I = I (lP'(X E A) - lP'(Y E A)) - (lP'(X E AC) - lP'(Y E AC)) I

= 1� (lP'(X = k) - lP'(Y = k)) JA (k) / where J A (k) equals 1 if k E A and equals - 1 if k E A c . Therefore,

2 1 lP'(X E A) - lP'(Y E A) I � }]lP'(X = k) - lP'(Y = k) I · I JA (k) 1 � tfrv(X, Y) . k

Equality holds if A = (k : lP'(X = k) > lP'(Y = k)} . 14. (a) Note that

82F -- = _e-x-y < 0, 8x8y x , y > 0,

so that F is not a joint distribution function. (b) In this case

and in addition

82F = { e-Y 8x8y 0 if 0 � x � y, if O � y � x,

1000 1000 82 F -- dx dy = 1 . o 0 8x8y Hence F is a joint distribution function, and easy substitutions reveal the marginals:

FX (x) = lim F(x , y) = 1 - e-x , x :::: 0, y-*oo Fy (y) = lim F(x , y) = 1 - e-Y - ye-Y , y :::: O.

x-*oo


15. Suppose that, for some i =1= j , we have Pi < Pj and Bi is to the left of Bj . Write m for the position of Bi and r for the position of Bj ' and consider the effect of interchanging Bi and Bj . For k ::; m and k > r, JP(T � k) is unchanged by the move. For m < k ::; r, JP(T � k) is decreased by an amount Pj - Pi , since this i s the increased probability that the search is successful at the mth position. Therefore the interchange of Bi and Bj is desirable.

It follows that the only ordering in which JP(T � k) can be reduced for no k is that ordering in which the books appear in decreasing order of probability. In the event of ties, it is of no importance how the tied books are placed. 16. Intuitively, it may seem better to go first since the first person has greater choice. This conclusion is in fact false. Denote the coins by Cl , C2 , C3 in order, and suppose you go second. If your opponent chooses Cl then you choose C3 , because JP(C3 beats Cl ) = � + � . � = � > i . Likewise JP(CI beats C2) = JP(C2 beats C3) = � > � . Whichever coin your opponent picks, you can arrange to have a better than evens chance of winning. 17. Various difficulties arise in sequential decision theory, even in simple problems such as this one. The following simple argument yields the optimal policy. Suppose that you have made a unsuccessful searches "ahead" and b unsuccessful searches "behind" (if any of these searches were successful, then there is no further problem). Let A be the event that the correct direction is ahead. Then

$$\mathbb{P}(A \mid \text{current knowledge}) = \frac{\mathbb{P}(\text{current knowledge} \mid A)\,\mathbb{P}(A)}{\mathbb{P}(\text{current knowledge})} = \frac{(1-p)^a \alpha}{(1-p)^a \alpha + (1-p)^b (1-\alpha)},$$
which exceeds $\frac12$ if and only if $(1-p)^a \alpha > (1-p)^b (1-\alpha)$. The optimal policy is to compare $(1-p)^a \alpha$ with $(1-p)^b (1-\alpha)$. You search ahead if the former is larger and behind otherwise; in the event of a tie, do either.

18. (a) There are $\binom{64}{8}$ possible layouts, of which $8+8+2$ are linear. The answer is $18\big/\binom{64}{8}$.
(b) Each row and column must contain exactly one pawn. There are 8 possible positions in the first row. Having chosen which of these is occupied, there are 7 admissible positions in the second row, 6 in the third, and so on. The answer is $8!\big/\binom{64}{8}$.

19. (a) The density function is $f(x) = F'(x) = 2x e^{-x^2}$, $x \ge 0$.
(b) The density function is $f(x) = F'(x) = x^{-2} e^{-1/x}$, $x > 0$.
(c) The density function is $f(x) = F'(x) = 2(e^x + e^{-x})^{-2}$, $x \in \mathbb{R}$.
(d) This is not a distribution function because $F'(1) < 0$.

20. We have that
$$\mathbb{P}(U = V) = \iint_{\{(u,v)\,:\,u = v\}} f_{U,V}(u,v)\,du\,dv = 0.$$
The random variables $X$, $Y$ are continuous but not jointly continuous: there exists no integrable function $f : [0,1]^2 \to \mathbb{R}$ such that
$$\mathbb{P}(X \le x, Y \le y) = \int_{u=0}^{x}\int_{v=0}^{y} f(u,v)\,du\,dv, \qquad 0 \le x, y \le 1.$$
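The counting argument of Problem (2.7.18) is easy to evaluate numerically. The sketch below is only an illustration of that solution's two answers, $18\big/\binom{64}{8}$ and $8!\big/\binom{64}{8}$; the variable names are not from the book.

```python
import math

total = math.comb(64, 8)                    # placements of 8 indistinguishable pawns
linear = 18                                 # 8 rows + 8 columns + 2 long diagonals
one_per_row_and_col = math.factorial(8)     # choose a distinct column for each row

print("P(all 8 pawns in a line)               =", linear / total)
print("P(exactly one pawn per row and column) =", one_per_row_and_col / total)
```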


3

Discrete random variables

3.1 Solutions. Probability mass functions

1. (a) $C^{-1} = \sum_{k=1}^{\infty} 2^{-k} = 1$. (b) $C^{-1} = \sum_{k=1}^{\infty} 2^{-k}/k = \log 2$. (c) $C^{-1} = \sum_{k=1}^{\infty} k^{-2} = \pi^2/6$. (d) $C^{-1} = \sum_{k=1}^{\infty} 2^k/k! = e^2 - 1$.

2. (i) $\frac12$; $1 - (2\log 2)^{-1}$; $1 - 6\pi^{-2}$; $(e^2-3)/(e^2-1)$. (ii) 1; 1; 1; 1 and 2. (iii) It is the case that $\mathbb{P}(X \text{ even}) = \sum_{k=1}^{\infty} \mathbb{P}(X = 2k)$, and the answers are therefore (a) $\frac13$, (b) $1 - (\log 3)/(\log 4)$, (c) $\frac14$. (d) We have that
$$\sum_{k=1}^{\infty} \frac{2^{2k}}{(2k)!} = \sum_{i=0}^{\infty} \frac{2^i + (-2)^i}{2(i!)} - 1 = \tfrac12\bigl(e^2 + e^{-2}\bigr) - 1,$$
so the answer is $\tfrac12(1 - e^{-2})$.

3. The number $X$ of heads on the second round is the same as if we toss all the coins twice and count the number which show heads on both occasions. Each coin shows heads twice with probability $p^2$, so $\mathbb{P}(X = k) = \binom{n}{k} p^{2k} (1-p^2)^{n-k}$.

4. Let $D_k$ be the number of digits (to base 10) in the integer $k$. Then

5. (a) The assertion follows for the binomial distribution because $k(n-k) \le (n-k+1)(k+1)$. The Poisson case is trivial.

(b) This follows from the fact that kg ::: (k2 - 1)4 . (c) The geometric mass function f (k) = qpk , k ::: o.
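The identity in Solution 3 — that the second-round count of heads is binomial with parameters $n$ and $p^2$ — is easy to check by simulation. The following sketch is an illustration only, with arbitrarily chosen values of $n$ and $p$.

```python
import random
from collections import Counter
from math import comb

def second_round_heads(n, p, rng):
    """Toss n coins; those showing heads are tossed again.
    Return the number of heads on the second round."""
    first = sum(rng.random() < p for _ in range(n))
    return sum(rng.random() < p for _ in range(first))

n, p, trials, rng = 6, 0.4, 200_000, random.Random(3)
counts = Counter(second_round_heads(n, p, rng) for _ in range(trials))
for k in range(n + 1):
    exact = comb(n, k) * (p**2)**k * (1 - p**2)**(n - k)   # bin(n, p^2) mass function
    print(k, round(counts[k] / trials, 4), round(exact, 4))
```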

3.2 Solutions. Independence

1. We have that

$$\mathbb{P}(X = 1, Z = 1) = \mathbb{P}(X = 1, Y = 1) = \tfrac14 = \mathbb{P}(X = 1)\mathbb{P}(Z = 1).$$


This, together with three similar equations, shows that $X$ and $Z$ are independent. Likewise, $Y$ and $Z$ are independent. However
$$\mathbb{P}(X = 1, Y = 1, Z = -1) = 0 \ne \tfrac18 = \mathbb{P}(X = 1)\mathbb{P}(Y = 1)\mathbb{P}(Z = -1),$$
so that $X$, $Y$, and $Z$ are not independent.

2. (a) If $x \ge 1$,
$$\mathbb{P}(\min\{X,Y\} \le x) = 1 - \mathbb{P}(X > x, Y > x) = 1 - \mathbb{P}(X > x)\mathbb{P}(Y > x) = 1 - 2^{-x}\cdot 2^{-x} = 1 - 4^{-x}.$$
(b) $\mathbb{P}(Y > X) = \mathbb{P}(Y < X)$ by symmetry. Also $\mathbb{P}(Y > X) + \mathbb{P}(Y < X) + \mathbb{P}(Y = X) = 1$. Since
$$\mathbb{P}(Y = X) = \sum_x \mathbb{P}(Y = X = x) = \sum_x 2^{-x}\cdot 2^{-x} = \tfrac13,$$
we have that $\mathbb{P}(Y > X) = \tfrac13$.
(c) $\tfrac13$, by part (b).
(d)
$$\mathbb{P}(X \ge kY) = \sum_{y=1}^{\infty} \mathbb{P}(X \ge kY,\ Y = y) = \sum_{y=1}^{\infty} \mathbb{P}(X \ge ky)\mathbb{P}(Y = y) = \sum_{y=1}^{\infty}\sum_{x=0}^{\infty} 2^{-ky-x}\, 2^{-y} = \frac{2}{2^{k+1} - 1}.$$
(e)
$$\mathbb{P}(X \text{ divides } Y) = \sum_{k=1}^{\infty} \mathbb{P}(Y = kX) = \sum_{k=1}^{\infty}\sum_{x=1}^{\infty} \mathbb{P}(Y = kx,\ X = x) = \sum_{k=1}^{\infty}\sum_{x=1}^{\infty} 2^{-kx}\, 2^{-x} = \sum_{k=1}^{\infty} \frac{1}{2^{k+1} - 1}.$$
(f) Let $r = m/n$ where $m$ and $n$ are coprime. Then
$$\mathbb{P}(X = rY) = \sum_{k=1}^{\infty} \mathbb{P}(X = km,\ Y = kn) = \sum_{k=1}^{\infty} 2^{-km}\, 2^{-kn} = \frac{1}{2^{m+n} - 1}.$$
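Several of the answers in Exercise (3.2.2), with $\mathbb{P}(X = x) = 2^{-x}$ for $x \ge 1$, are easy to confirm by simulation. The sketch below is an illustration only; it samples $X$ as the waiting time for the first head in fair-coin tossing.

```python
import random

def geom_half(rng):
    """Sample X with P(X = x) = 2**(-x), x = 1, 2, ...  (first head in fair-coin tossing)."""
    x = 1
    while rng.random() < 0.5:
        x += 1
    return x

rng, trials = random.Random(4), 300_000
gt = eq = divides = 0
for _ in range(trials):
    x, y = geom_half(rng), geom_half(rng)
    gt += (y > x)
    eq += (x == y)
    divides += (y % x == 0)

print("P(Y > X)       ~", gt / trials, "(exact 1/3)")
print("P(X = Y)       ~", eq / trials, "(exact 1/3)")
exact_div = sum(1 / (2**(k + 1) - 1) for k in range(1, 60))
print("P(X divides Y) ~", divides / trials, "(exact %.6f)" % exact_div)
```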

3. (a) We have that JP(XI < X2 < X3) = 2: ( 1 - PI ) ( I - P2) ( 1 - P3)P�- 1 p�- I p�- I

i<j<k "'(I ) ( 1 ) i- I j- I j = � - PI - P2 PI P2 P3 i<j

= 2: ( 1 - PI ) ( I - P2)pi- 1 (P2P3)i P3

. 1 - P2P3 I

( 1 - PI ) ( 1 - P2)P2P� (I - P2P3) ( 1 - PIP2P3)


( 1 - Pl ) ( 1 - P2) ( 1 - P2P3) ( 1 - PI P2P3) '

4. (a) Either substitute PI = P2 = P3 = � in the result of Exercise (3b), or argue as follows, with the obvious notation. The event {A < B < C} occurs only if one of the following occurs on the first round: (i) A and B both rolled 6, (ii) A rolled 6, B and C did not, (iii) none rolled 6. Hence, using conditional probabilities,

In calculating lP'(B < C) we may ignore A's rolls, and an argument similar to the above tells us that

lP'(B < C) = ( � ) 2lP'(B < C) + i . Hence lP'(B < C ) = fr' yielding lP'(A < B < C) = 1�1 ' (b) One may argue as above. Alternatively, let N be the total number of rolls before the first 6 appears. The probability that A rolls the first 6 is

(N ( I 4 7 }) " ( 5 )k- l 1 - 36 lP' E ' " . . . = L 6 6 - lIT ' k=I ,4,7 • . . . Once A has thrown the first 6, the game restarts with the players rolling in order BCABCA . . . . Hence the probability that B rolls the next 6 is � also, and similarly for the probability that C throws the third 6. The answer is therefore (�) 3 . 5. The vector (-Xr : 1 � r � n) has the same joint distribution as (Xr : 1 � r � n), and the clairn follows.

Let X + 2 and Y + 2 have joint mass function f, where fi. j is the (i , j)th entry in the matrix

Then

( 1 1 6 TI o 1 6 1 1 6 TI t ) , TI

1 � i , j � 3.

lP'(X = - 1) = lP'(X = 1 ) = lP'(Y = - 1) = lP'(Y = 1 ) = 1 , lP'(X = 0) = lP'(Y = 0) = 1 , lP'(X + Y = -2) = i =1= 1\ = lP'(X + Y = 2) .


3.3 Solutions. Expectation

1. (a) No! (b) Let X have mass function: 1(- 1) = � , / ( 1 ) = � , /(2) = � . Then

E(X) = -� + � + 3 = 1 = - � + 3 + � = EO/X) .

2 . (a) If you have already j distinct types of object, the probability that the next packet contains a different type is (c - j)/c , and the probability that it does not is j /c . Hence the number of days required has the geometric distribution with parameter (c - j)/c; this distribution has mean c/(c - j) .

(b) The time required to collect all the types is the sum of the successive times to collect each new type. The mean is therefore

$$\sum_{j=0}^{c-1} \frac{c}{c-j} = c \sum_{k=1}^{c} \frac{1}{k}.$$
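The coupon-collector mean $c\sum_{k=1}^{c} 1/k$ of Solution 2 is readily checked by simulation. The following sketch is not part of the original solution, and the value of $c$ is illustrative only.

```python
import random

def days_to_collect(c, rng):
    """Number of packets needed to obtain all c types, one uniform type per packet."""
    seen, days = set(), 0
    while len(seen) < c:
        seen.add(rng.randrange(c))
        days += 1
    return days

c, trials, rng = 10, 50_000, random.Random(5)
sim = sum(days_to_collect(c, rng) for _ in range(trials)) / trials
exact = c * sum(1 / k for k in range(1, c + 1))
print(sim, exact)       # both should be near 29.29 when c = 10
```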

3. (a) Let Iij be the indicator function of the event that players i and j throw the same number. Then 6

E(lij ) = If"(lij = 1 ) = L (i)2 = i , i i j. i=1

The total score of the group is S = I:i <j Iij , so

E(S) = L E(lij ) = � (n) .

. . 6 2 I <J We claim that the family {lij : i < j } is pairwise independent. The crucial calculation for this is

as follows: if i < j < k then

Hence

6 E(lij Ijk) = If"(i , j , and k throw same number) = L ( i ) 3 = � = E(lij )E(ljk) ·

r= l

var(S) = var ( � Iij ) = � var(lij ) = (;) var(l12) I <J I <J

by symmetry. But var(l12) = i ( 1 - i ) · (b) Let X ij be the common score of players i and j, so that X ij = 0 if their scores are different. This time the total score is S = I:i<j Xij ' and

E(S) = (;) E(X 12) = (;) � . � = 172 (; ) . The Xij are not pairwise independent, and you have to slog it out thus:


4. The expected reward is $\sum_{k=1}^{\infty} 2^{-k}\cdot 2^k = \infty$. If your utility function is $u$, then your 'fair' entrance fee is $\sum_{k=1}^{\infty} 2^{-k} u(2^k)$. For example, if $u(k) = c(1 - k^{-a})$ for $k \ge 1$, where $c, a > 0$, then the fair fee is
$$c \sum_{k=1}^{\infty} 2^{-k}\bigl(1 - 2^{-ak}\bigr) = c\Bigl(1 - \frac{1}{2^{a+1} - 1}\Bigr).$$
This fee is certainly not 'fair' for the person offering the wager, unless possibly he is a noted philanthropist.

5. We have that $\mathbb{E}(X^a) = \sum_{x=1}^{\infty} x^a/\{x(x+1)\}$, which is finite if and only if $a < 1$.

6. Clearly
$$\operatorname{var}(a + X) = \mathbb{E}\bigl(\{(a+X) - \mathbb{E}(a+X)\}^2\bigr) = \mathbb{E}\bigl(\{X - \mathbb{E}(X)\}^2\bigr) = \operatorname{var}(X).$$

7. For each $r$, bet $\{1 + \pi(r)\}^{-1}$ on horse $r$. If the $r$th horse wins, your payoff is $\{\pi(r) + 1\}\{1 + \pi(r)\}^{-1} = 1$, which is in excess of your total stake $\sum_k \{\pi(k) + 1\}^{-1}$.

8. We may assume that: (a) after any given roll of the die, your decision whether or not to stop depends only on the value $V$ of the current roll; (b) if it is optimal to stop for $V = r$, then it is also optimal to stop when $V > r$.

Consider the strategy: stop the first time that the die shows $r$ or greater. Let $S(r)$ be the expected score achieved by following this strategy. By elementary calculations,
$$S(6) = 6\cdot\mathbb{P}(\text{6 appears before 1}) + 1\cdot\mathbb{P}(\text{1 appears before 6}) = \tfrac72,$$
and similarly $S(5) = 4$, $S(4) = 4$, $S(3) = \tfrac{19}{5}$, $S(2) = \tfrac72$. The optimal strategy is therefore to stop at the first throw showing 4, 5, or 6. Similar arguments may be used to show that 'stop at 5 or 6' is the rule to maximize the expected squared score.

9. Proceeding as in Exercise (8), we find the expected returns for the same strategies to be:
$$S(6) = \tfrac72 - 3c, \quad S(5) = 4 - 2c, \quad S(4) = 4 - \tfrac32 c, \quad S(3) = \tfrac{19}{5} - \tfrac65 c, \quad S(2) = \tfrac72 - c.$$
If $c = \tfrac13$, it is best to stop when the score is at least 4; if $c = 1$, you should stop when the score is at least 3. The respective expected scores are $\tfrac72$ and $\tfrac{13}{5}$.
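The quantities $S(r)$ in Solutions 8 and 9 can be computed mechanically. The sketch below is an illustration only; it assumes the game as described above (you may keep rolling, but a throw of 1 ends the game with score 1, and each throw costs $c$), so the mean terminal value and the geometric number of throws give $S(r)$ exactly.

```python
from fractions import Fraction

def S(r, c=Fraction(0)):
    """Expected net score for the rule 'stop at the first throw showing r or more',
    when a throw of 1 also ends the game and each throw costs c."""
    stopping = [1] + list(range(r, 7))                   # faces on which the game ends
    mean_score = Fraction(sum(stopping), len(stopping))  # value of the terminal throw
    mean_rolls = Fraction(6, len(stopping))              # mean of a geometric number of throws
    return mean_score - c * mean_rolls

for c in (Fraction(0), Fraction(1, 3), Fraction(1)):
    values = {r: S(r, c) for r in range(2, 7)}
    best = max(values, key=values.get)
    print("c =", c, {r: str(v) for r, v in values.items()}, "best threshold:", best)
```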

3.4 Solutions. Indicators and matching

1. Let lj be the indicator function of the event that the outcome of the (j + I)th toss is different from the outcome of the jth toss. The number R of distinct runs is given by R = 1 + L:j�t lj . Hence

JE:(R) = 1 + (n - I)JE:(l} ) = 1 + (n - I)2pq ,

where q = 1 - p. Now remark that lj and h are independent if I i - kl > 1 , so that

JE:{(R - I )2 } = JE:{ (� lj ) 2 } = (n - 1)JE:(l} ) + 2(n - 2)JE:(lt h)

J=I + { (n _ 1)2 - (n - 1) - 2(n - 2) }JE:(lI )2 .


Now lE(lt ) = 2pq and lE(llh) = p2q + pq2 = pq, and therefore


var(R) = var(R - 1) = (n - l)lE(lt ) + 2(n - 2)lE(lt h) - { (n - 1 ) + 2(n - 2) }lE(lI )2 = 2pq (2n - 3 - 2pq (3n - 5)) .

2. The required total is T = I:�=1 Xi , where Xi is the number shown on the ith ball. Hence lE(T) = klE(Xd = !k(n + 1) . Now calculate, boringly,

Hence

k � .2 + k(k - 1) 2 " . . = - � J � IJ n 1 n (n - 1) . . I >J

= � { In (n + l) (n + 2) - !n(n + 1) }

k(k - 1) n + n(n _ l ) � j{n(n + l) - j (j + l ) }

J=1 = ik(n + 1) (2n + 1) + -b,k(k - 1) (3n + 2) (n + 1) .

var(T) = k(n + l) { i (2n + 1) + -b, (k - 1) (3n + 2) - ik(n + I) } = f,, (n + l)k(n - k) .

3. Each couple survives with probability

so the required mean is n ( I - �) (1 - �) . 2n 2n - l

4. Any given red ball is in urn R after stage k if and only if it has been selected an even number of times. The probability of this is

m �n (�) (�r ( 1 - � t-m = � { [ (1 - �) + � r + [ ( 1 - �) - � r} = � { I + ( I - �t } ,

and the mean number of such red balls is n times this probability. 5. Label the edges and vertices as in Figure 3 . 1 . The structure function is

� (X) = Xs + ( 1 - Xs) { ( 1 - Xl )X4 [X3 + ( 1 - X3)X2X6]

+Xl [X2 + ( 1 - X2) (X3 (X6 + X4( 1 - X6) )) ] } .


Figure 3.1. The network with source s and sink t.

For the reliability, see Problem ( 1 .8 . 19a). 6. The structure function is I{S�k} ' the indicator function of {S :::: k} where S = E�=l Xc. The reliability is therefore Et=k (7) pi (1 _ p)n-i .

7. Independently colour each vertex livid or bronze with probability ! each, and let L be the random set of livid vertices. Then EN L = ! 1 E I . There must exist one or more possible values of N L which are at least as large as its mean.

8. Let $I_r$ be the indicator function of the event that the $r$th pair have opposite polarity, so that $X = 1 + \sum_{r=1}^{n-1} I_r$. We have that $\mathbb{P}(I_r = 1) = \frac12$ and $\mathbb{P}(I_r = I_{r+1} = 1) = \frac14$, whence $\mathbb{E}X = \frac12(n+1)$ and $\operatorname{var} X = \frac14(n-1)$.

9. (a) Let $A_i$ be the event that the integer $i$ remains in the $i$th position. Then
$$\mathbb{P}\Bigl(\bigcup_i A_i\Bigr) = n\cdot\frac1n - \binom{n}{2}\frac{1}{n(n-1)} + \cdots + (-1)^{n-1}\frac{1}{n!}.$$
Therefore the number $M$ of matches satisfies
$$\mathbb{P}(M = 0) = \frac{1}{2!} - \frac{1}{3!} + \cdots + (-1)^n\frac{1}{n!}.$$
Now
$$\mathbb{P}(M = r) = \binom{n}{r}\,\mathbb{P}(r \text{ given numbers match, and the remaining } n-r \text{ are deranged}) = \frac{n!}{r!\,(n-r)!}\cdot\frac{(n-r)!}{n!}\Bigl(\frac{1}{2!} - \frac{1}{3!} + \cdots + (-1)^{n-r}\frac{1}{(n-r)!}\Bigr).$$
(b)
$$d_{n+1} = \sum_{r=2}^{n+1} \#\{\text{derangements with 1 in the } r\text{th place}\} = n\bigl\{\#\{\text{derangements which swap 1 with 2}\} + \#\{\text{derangements in which 1 is in the 2nd place and 2 is not in the 1st place}\}\bigr\} = n d_{n-1} + n d_n,$$
where $\#A$ denotes the cardinality of the set $A$. By rearrangement, $d_{n+1} - (n+1)d_n = -(d_n - n d_{n-1})$. Set $u_n = d_n - n d_{n-1}$ and note that $u_2 = 1$, to obtain $u_n = (-1)^n$, $n \ge 2$, and hence
$$d_n = \frac{n!}{2!} - \frac{n!}{3!} + \cdots + (-1)^n\frac{n!}{n!}.$$
Now divide by $n!$ to obtain the results above.
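The recursion $d_{n+1} = n(d_n + d_{n-1})$ and the mass function $\mathbb{P}(M = r)$ from Solution 9 can be cross-checked exactly; choosing the matched set and deranging the rest must give the same probabilities. The sketch below is an illustration only (the value $n = 7$ is arbitrary).

```python
from math import factorial
from fractions import Fraction

def derangements(n):
    """d_n via the recursion d_{n+1} = n (d_n + d_{n-1}), with d_0 = 1, d_1 = 0, d_2 = 1."""
    if n <= 1:
        return 1 - n          # d_0 = 1, d_1 = 0
    d_prev, d = 0, 1          # d_1, d_2
    for m in range(2, n):
        d_prev, d = d, m * (d + d_prev)
    return d

def prob_matches(n, r):
    """P(M = r) from Solution 9(a): (1/r!) * (1/2! - 1/3! + ... ) truncated at (n-r)!."""
    return Fraction(1, factorial(r)) * sum(Fraction((-1)**j, factorial(j)) for j in range(n - r + 1))

n = 7
for r in range(n + 1):
    direct = Fraction(factorial(n) // (factorial(r) * factorial(n - r)) * derangements(n - r),
                      factorial(n))           # choose the matched set, derange the remainder
    print(r, prob_matches(n, r), direct, prob_matches(n, r) == direct)
```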

3.5 Solutions. Examples of discrete variables

1. There are $n!/(n_1!\,n_2!\cdots n_t!)$ sequences of outcomes in which the $i$th possible outcome occurs $n_i$ times for each $i$. The probability of any such sequence is $p_1^{n_1} p_2^{n_2}\cdots p_t^{n_t}$, and the result follows.

2. The total number $H$ of heads satisfies
$$\mathbb{P}(H = x) = \sum_{n=x}^{\infty} \mathbb{P}(H = x \mid N = n)\mathbb{P}(N = n) = \sum_{n=x}^{\infty} \binom{n}{x} p^x (1-p)^{n-x}\,\frac{\lambda^n e^{-\lambda}}{n!} = \frac{(\lambda p)^x e^{-\lambda p}}{x!} \sum_{n=x}^{\infty} \frac{\{\lambda(1-p)\}^{n-x} e^{-\lambda(1-p)}}{(n-x)!}.$$
The last summation equals 1, since it is the sum of the values of the Poisson mass function with parameter $\lambda(1-p)$.

3. $dp_n/d\lambda = p_{n-1} - p_n$ where $p_{-1} = 0$. Hence $(d/d\lambda)\,\mathbb{P}(X \le n) = -p_n(\lambda)$.

4. The probability of a marked animal in the $n$th place is $a/b$. Conditional on this event, the chance of the $n-1$ preceding places containing $m-1$ marked and $n-m$ unmarked animals is
$$\binom{a-1}{m-1}\binom{b-a}{n-m}\bigg/\binom{b-1}{n-1},$$
as required. Now let $X_j$ be the number of unmarked animals between the $(j-1)$th and $j$th marked animals, if all were caught. By symmetry, $\mathbb{E}X_j = (b-a)/(a+1)$, whence $\mathbb{E}X = m(\mathbb{E}X_1 + 1) = m(b+1)/(a+1)$.
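The thinning property in Solution 2 — a Poisson($\lambda$) number of coins, each showing heads with probability $p$, yields a Poisson($\lambda p$) number of heads — is simple to verify numerically. The sketch below is an illustration only; it uses Knuth's elementary Poisson sampler, which is adequate for small $\lambda$.

```python
import math, random
from collections import Counter

def sample_H(lam, p, rng):
    """N ~ Poisson(lam) coin tosses, each a head with probability p; return the number of heads."""
    L, n, prod = math.exp(-lam), 0, rng.random()   # Knuth's method for a Poisson sample
    while prod > L:
        n += 1
        prod *= rng.random()
    return sum(rng.random() < p for _ in range(n))

lam, p, trials, rng = 3.0, 0.4, 200_000, random.Random(6)
counts = Counter(sample_H(lam, p, rng) for _ in range(trials))
for k in range(8):
    exact = math.exp(-lam * p) * (lam * p)**k / math.factorial(k)   # Poisson(lam*p) mass
    print(k, round(counts[k] / trials, 4), round(exact, 4))
```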

3.6 Solutions. Dependence

1. Remembering Problem (2.7.3b), it suffices to show that var(aX + bY) < 00 if a , b E lR and var(X) , var(Y) < 00. Now,

var(aX + bY) = E ({aX + bY - E(aX + bY) }2) = a2 var(X) + 2ab cov(X, Y) + b2 var(Y) ::: a2 var(X) + 2abvvar(X) var(Y) + b2 var(Y) = (avvar(X) + bvvar(Y) ) 2

where we have used the Cauchy-Schwarz inequality (3.6.9) applied to X - E(X) , Y - E(Y). 2. Let Ni be the number of times the ith outcome occurs. Then Ni has the binomial distribution with parameters n and Pi .


3. For x = 1 , 2, . . . , 00

IP'(X = x) = L IP'(X = X, Y = y) y=1


OO C { I I } = :; 2" (x + y - l ) (x + y) - (x + y) (x + y + 1 )

C C ( 1 1 ) = 2x (x + 1 ) = 2" � - x + 1 '

and hence C = 2. Clearly Y has the same mass function. Finally E(X) = E�1 (x + 1 )- 1 = 00, so the covariance does not exist. '

4. $\max\{u, v\} = \frac12(u+v) + \frac12|u - v|$, and therefore
$$\mathbb{E}\bigl(\max\{X^2, Y^2\}\bigr) = \tfrac12\mathbb{E}(X^2 + Y^2) + \tfrac12\mathbb{E}\bigl|(X-Y)(X+Y)\bigr| \le 1 + \tfrac12\sqrt{\mathbb{E}\bigl((X-Y)^2\bigr)\,\mathbb{E}\bigl((X+Y)^2\bigr)} = 1 + \tfrac12\sqrt{(2-2\rho)(2+2\rho)} = 1 + \sqrt{1 - \rho^2},$$
where we have used the Cauchy–Schwarz inequality.

5. (a) $\log y \le y - 1$ with equality if and only if $y = 1$. Therefore,
$$\mathbb{E}\Bigl(\log \frac{f_Y(X)}{f_X(X)}\Bigr) \le \mathbb{E}\Bigl[\frac{f_Y(X)}{f_X(X)} - 1\Bigr] = 0,$$
with equality if and only if $f_Y = f_X$.
(b) This holds likewise, with equality if and only if $f(x,y) = f_X(x) f_Y(y)$ for all $x, y$, which is to say that $X$ and $Y$ are independent.

6. (a) $a + b + c = \mathbb{E}\bigl\{I_{\{X>Y\}} + I_{\{Y>Z\}} + I_{\{Z>X\}}\bigr\} \le 2$, whence $\min\{a, b, c\} \le \frac23$. Equality is attained, for example, if the vector $(X, Y, Z)$ takes only three values with probabilities $f(2,1,3) = f(3,2,1) = f(1,3,2) = \frac13$.
(b) $\mathbb{P}(X < Y) = \mathbb{P}(Y < X)$, etc.
(c) We have that $c = a = p$ and $b = 1 - p^2$. Also $\sup_p \min\{p, 1-p^2\} = \frac12(\sqrt5 - 1)$.

7. We have for $1 \le x \le 9$ that
$$f_X(x) = \sum_{y=0}^{9} \log_{10}\Bigl(1 + \frac{1}{10x + y}\Bigr) = \log_{10} \prod_{y=0}^{9}\Bigl(1 + \frac{1}{10x + y}\Bigr) = \log_{10}\Bigl(1 + \frac1x\Bigr).$$
By calculation, $\mathbb{E}X \approx 3.44$.
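The 'calculation' behind $\mathbb{E}X \approx 3.44$ in Solution 7 is a four-line computation. The sketch below is an illustration only, evaluating the first-digit mass function $f(x) = \log_{10}(1 + 1/x)$ directly.

```python
import math

# First-digit mass function from Solution (3.6.7): f(x) = log10(1 + 1/x), x = 1, ..., 9.
f = {x: math.log10(1 + 1 / x) for x in range(1, 10)}
print(sum(f.values()))                    # the masses sum to 1
print(sum(x * p for x, p in f.items()))   # the mean, approximately 3.44
```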

8. (i) f ( .) = " !.- j � + � = e J a a 00 { . k k j k } a ( . + ) j

X J c � . , a k ' k " , c . , . k=O J . . . J . J .

(ii) 1 = L fx (i) = 2ace2a , whence c = e-2a j (2a) . j

r crar crar 2r (ii) fx+y (r) = L . , . , = --, - , r � 1 . . o J . (r - J ) . r . J =


. . . � cr (r - l ) (2at 1 (Ill) lE(X + Y - 1) = L.J , = 2a . Now lE(X) = lE(Y) , and therefore lE(X) = a + � . r= 1 r .

3.7 Solutions. Conditional distributions and conditional expectation

1. (a) We have that

lE(aY + bZ I X = x) = L (ay + bz)lP'(Y = y , Z = z I X = x) y,z

Y ,z y ,z

= a L YlP'(Y = y I X = x) + b L zlP'(Z = z I X = x) . y z

Parts (b)-(e) are verified by similar trivial calculations. Turning to (f),

E{E(Y I X, Zl I X = x } = � {� YP(Y = Y I X = x , Z = ,)P(X = x , Z = , I X = X) } " " lP'(Y = y , X = x , Z = z) lP'(X = x , Z = z) = L.J L.J y .

z y lP'(X = x , Z = z) lP'(X = x)

= L YlP'(Y = y I X = x) = lE(Y I X = x) y

= lE{ lE(Y I X) I X = x , Z = z} , by part (e).

2. If <I> and 1/1 are two such functions then lE ( (<I> (X) - 1/I (X))g (X) ) = 0 for any suitable g . Setting g(X) = I{x=xJ for any x E R such that lP'(X = x) > 0, we obtain <I> (x) = 1/1 (x) . Therefore lP'(<I> (X) = 1/I(X)) = 1 . 3. We do not seriously expect you to want to do this one. However, if you insist, the method is to check in each case that both sides satisfy the appropriate definition, and then to appeal to uniqueness, deducing that the sides are almost surely equal (see Williams 1991 , p. 88).

4. The natural definition is given by

Now,

var(Y) = lE({Y _ lEy}2) = lE [lE ( { Y - lE(Y I X) + lE(Y I X) - lEY} 2 1 X) ]

= lE(var(Y I X)) + var (lE(Y I X))

since the mean of lE(Y I X) is lEY, and the cross product is, by Exercise (Ie),

2lE [lE( {Y - lE(Y I X) } {lE(Y I X) - lEy} 1 X) ]

= 2lE [{lE(Y I X) - lEY}lE{Y - lE(Y I X) 1 X } ] = 0


since JE{Y - JE(Y I X) I X} = JE(Y I X) - JE(Y I X) = o. 5. We have that

(a)

(b)

6. Clearly

00 00 IP'(T > t + r) JE(T - t i T > t) = E IP'(T > t + r I T > t) = E . r=O r=O IP'(T > t) N-t N - t - r JE(T - t i T > t) = E = � (N - t + I) . N - t r=O

00 2-(t+r) JE(T - t i T > t) = E � = 2.

r=O

JE(S I N = n) = JE (t Xi) = /Ln , 1= 1

and hence JE(S I N) = /LN. It follows that JE(S) = JE{JE(S I N)} = JE(/LN). 7. A robot passed is in fact faulty with probability 7r = (q, (1 - 8) }/(1 - q,8). Thus the number of faulty passed robots, given Y, is bin(n - Y, 7r) , with mean (n - Y){q, (1 - 8)}/(1 - q,8). Hence

(n - y)A. (1 - 8) JE(X I Y) = Y + 'I' . 1 - q,8

8. (a) Let m be the family size, q,r the indicator that the rth child is female, and /Lr the indicator of a male. The numbers G, B of girls and boys satisfy

m B = E /Lr , JE(G) = �m = JE(B).

r= 1 r= 1

(It will be shown later that the result remains true for random m under reasonable conditions.) We have not used the property of independence. (b) With M the event that the selected child is male,

(m-l

JE(G 1 M) = JE E q,r) = � (m - I) = JE(B) . r=1

The independence is necessary for this argument.

9. Conditional expectation is defined in terms of the conditional distribution, so the first step is not justified. Even if this step were accepted, the second equality is generally false.

10. By conditioning on Xn- l ,

where Xn has the same distribution as Xn . Hence JEXn = (1 + JEXn- l )/P. Solve this subject to JEX1 = P-1 .
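The decomposition of Exercise (3.7.4), $\operatorname{var}(Y) = \mathbb{E}(\operatorname{var}(Y \mid X)) + \operatorname{var}(\mathbb{E}(Y \mid X))$, can be confirmed numerically on any small joint mass function. The sketch below is an illustration only; the joint mass function used is hypothetical and not taken from the book.

```python
# Joint mass function f(x, y), chosen purely for illustration.
f = {(0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
     (1, 0): 0.05, (1, 1): 0.25, (1, 2): 0.30}

xs = sorted({x for x, _ in f})
fX = {x: sum(p for (a, _), p in f.items() if a == x) for x in xs}          # marginal of X

EY  = sum(y * p for (_, y), p in f.items())
EY2 = sum(y * y * p for (_, y), p in f.items())
var_Y = EY2 - EY**2

m = {x: sum(y * p for (a, y), p in f.items() if a == x) / fX[x] for x in xs}              # E(Y | X=x)
v = {x: sum(y * y * p for (a, y), p in f.items() if a == x) / fX[x] - m[x]**2 for x in xs}  # var(Y | X=x)

E_var = sum(v[x] * fX[x] for x in xs)
var_E = sum(m[x]**2 * fX[x] for x in xs) - sum(m[x] * fX[x] for x in xs)**2

print(var_Y)            # var(Y)
print(E_var + var_E)    # E(var(Y|X)) + var(E(Y|X)); the two numbers agree
```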


3.8 Solutions. Sums of random variables

1. By the convolution theorem,

JI»(X + Y = z) = LJI»(X = k)JI»(Y = z - k) k

k + l (m + 1 ) (n + 1) (m 1\ n) + 1

(m + l ) (n + 1) m + n + l - k (m + 1 ) (n + 1)

i f ° :::: k :::: m 1\ n,

if m 1\ n < k < m V n ,

i f m V n :::: k :::: m + n ,

where m 1\ n = min{m , n } and m v n = max{m , n} . 2. If z � 2,

Also, if z � 0,

00 c JI»(X + Y = z) = L JI»(X = k, Y = z - k) = .

k=l z(z + 1)

00 JI»(X - Y = z) = L JI»(X = k + z, Y = k)

k=l 00 1

- CL-----------------k=l (2k + z - 1) (2k + z) (2k + z + 1)

OO { I I } = �C {; (2k + z - 1) (2k + z) - (2k + z) (2k + z + 1)

I 00 (-IY+1 = � C � (r + z) (r + z + 1) .

By symmetry, if z :::: 0, JI»(X - Y = z) = JI»(X - Y = -z) = JI»(X - Y = I z l } . z- l afJ { ( 1 fJ)Z- 1 ( 1 a)Z- I } 3. L a(1 - a)r- l fJ( 1 - fJ)z-r- 1 = - - - . r� a - fJ

4. Repeatedly flip a coin that shows heads with probability p. Let Xr be the number of flips after the r - Ith head up to, and including, the rth. Then Xr is geometric with parameter p. The number of flips Z to obtain n heads is negative binomial, and Z = E�=l Xr by construction.

5. Sam. Let Xn be the number of sixes shown by 6n dice, so that Xn+l = Xn + Y where Y has the same distribution as X I and is independent of Xn . Then,

6 JI»(Xn+1 � n + 1) = LJI»(Xn � n + 1 - r)JI»(Y = r)

r=O 6 = JI»(Xn � n) + L [JI»(Xn � n + 1 - r) - JI»(Xn � n)] JI»(Y = r) . r=O

We set g(k) = JI»(Xn = k) and use the fact, easily proved, that g(n) � g(n - 1) � . . . � g(n - 5) to find that the last sum is no bigger than

6 g(n) L(r - 1)JI»(Y = r) = g(n) (lE(Y) - 1 ) .

r=O


The claim follows since lE(Y) = 1 . 00 00 (n)e-A 6. (i) LHS = L ng(n)e-A).,n In ! = )., L g ).,n- l = RHS. n=O n=l (n - I ) !

(ii) Conditioning on N and X N ,

LHS = lE (lE(Sg(S) I N)) = lE{NlE(XNg(S) I N) }


-A).,n J

( (n- l ) ) = L (: - I) ! xlE g L Xr + x dF(x)

n r=l

= )., J xlE (g(S + x)) dF(x) = RHS.
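Solution 5 shows that $\mathbb{P}(X_{n+1} \ge n+1) \le \mathbb{P}(X_n \ge n)$, where $X_n$ is the number of sixes shown by $6n$ dice, so Sam's bet on $n = 1$ is best. The exact probabilities are easy to compute; the sketch below is an illustration only (this is the classical Newton–Pepys calculation).

```python
from math import comb
from fractions import Fraction

def prob_at_least_n_sixes(n):
    """P(at least n sixes when 6n fair dice are rolled)."""
    p, q = Fraction(1, 6), Fraction(5, 6)
    return sum(comb(6 * n, k) * p**k * q**(6 * n - k) for k in range(n, 6 * n + 1))

for n in (1, 2, 3, 4):
    print(n, float(prob_at_least_n_sixes(n)))
# 0.6651, 0.6187, 0.5973, 0.5845, ...: decreasing in n, as the solution asserts.
```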

3.9 Solutions. Simple random walk

1. (a) Consider an infinite sequence of tosses of a coin, any one of which turns up heads with probability p. With probability one there will appear a run of N heads sooner or later. If the coin tosses are 'driving' the random walk, then absorption occurs no later than this run, so that ultimate absorption is (almost surely) certain. Let S be the number of tosses before the first run of N heads. Certainly JP(S > Nr) :::: (1 - pNt, since Nr tosses may be divided into r blocks of N tosses, each of which is such a run with probability pN . Hence JP(S = s ) :::: ( 1 - pN) Ls / N J , and in particular lE(Sk) < 00 for all k :::: 1 . By the above argument, lE(Tk) < 00 also.

2. If So = k then the first step X I satisfies

JP(XI = 1 I W) = JP(XI = l)JP(W I Xl = 1) = PPHI . JP(W) Pk

Let T be the duration of the walk. Then

as required.

h = lE(T I So = k, W) = lE(T I So = k, W, Xl = l)JP(XI = 1 I So = k, W)

+ lE(T I So = k, W, Xl = - 1)JP(XI = - 1 I So = k, W)

= (1 + h+l ) Pk+I P + ( 1 + Jk- l )

(1 _ Pk+I P )

Pk Pk

= 1 + PPHI h+l + (Pk - PPHI ) Jk- l ,

Pk Pk

Certainly $J_0 = 0$. If $p = \frac12$ then $p_k = 1 - (k/N)$, so the difference equation becomes
$$(N-k-1)J_{k+1} - 2(N-k)J_k + (N-k+1)J_{k-1} = 2(k - N)$$
for $1 \le k \le N-1$. Setting $u_k = (N-k)J_k$, we obtain
$$u_{k+1} - 2u_k + u_{k-1} = 2(k - N),$$
with general solution $u_k = A + Bk - \frac13(N-k)^3$ for constants $A$ and $B$. Now $u_0 = u_N = 0$, and therefore $A = \frac13 N^3$, $B = -\frac13 N^2$, implying that $J_k = \frac13\{N^2 - (N-k)^2\}$, $0 \le k < N$.


3. The recurrence relation may be established as in Exercise (2). Set Uk = (pk - pN)Jk and use the fact that Pk = (pk - pN)/(I _ pN) where p = q/p , to obtain

The solution is

pUk+1 - (1 - r)uk + qUk- 1 = pN _ pk .

_ A + B k + k(pk + pN) Uk - P , p - q for constants A and B. The boundary conditions, Uo = UN = 0, yield the answer.

4. Conditioning in the obvious way on the result of the first toss, we obtain

Pmn = PPm-I ,n + ( 1 - P)Pm,n- l , if m , n :::: 1 . The boundary conditions are PmO = 0, POn = 1 , if m, n :::: 1 . 5. Let Y be the number of negative steps of the walk up to absorption. Then lE(X + Y) = Dk and

{ N - k if the walk is absorbed at N, X - Y = -k if the walk is absorbed at 0.

Hence lE(X - Y) = (N - k) (1 - Pk) - kpk o and solving for lEX gives the result.
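The conditional mean duration $J_k = \frac13\{N^2 - (N-k)^2\}$ obtained in Exercise (3.9.2) for the symmetric walk, given absorption at 0, lends itself to a Monte Carlo check. The sketch below is an illustration only, with arbitrary values of $N$ and $k$.

```python
import random

def duration_given_hit_zero(N, k, trials, rng):
    """Estimate E(T | absorbed at 0) for a symmetric walk on {0,...,N} started at k."""
    total, count = 0, 0
    for _ in range(trials):
        pos, steps = k, 0
        while 0 < pos < N:
            pos += 1 if rng.random() < 0.5 else -1
            steps += 1
        if pos == 0:                  # condition on the event W = {absorbed at 0}
            total += steps
            count += 1
    return total / count

N, k, rng = 10, 3, random.Random(7)
print(duration_given_hit_zero(N, k, 40_000, rng))
print((N**2 - (N - k)**2) / 3)        # J_k = (N^2 - (N-k)^2)/3 = 17 when N = 10, k = 3
```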

3.10 Solutions. Random walk: counting sample paths

1. Conditioning on the first step X I ,

lP'(T = 2n) = 1lP'(T = 2n I X l = 1) + !lP'(T = 2n I X l = - 1) = 1 f- I (2n - 1) + ! fI (2n - 1)

where fb(m) is the probability that the first passage to b of a symmetric walk, starting from 0, takes place at time m. From the hitting time theorem (3.10. 14),

fI (2n - 1) = f- I (2n - 1) = _1 -lP'(S2n_ 1 = 1) = _1_ (2n - 1) 2-<2n- 1 ) , 2n - l 2n - l n

which therefore is the value of lP'(T = 2n) . For the last part, note first that �f lP'(T = 2n) = 1 , which is to say that lP'(T < 00) = 1 ; either

appeal to your favourite method in order to see this, or observe that lP'(T = 2n) is the coefficient of s2n in the expansion of F(s) = 1 - �. The required result is easily obtained by expanding the binomial coefficient using Stirling's formula.

2. By equation (3 . 10. 13) of PRP, for r :::: 0, lP'(Mn = r) = lP'(Mn :::: r) - lP'(Mn :::: r + 1 )

= 2lP'(Sn :::: r + 1) + lP'(Sn = r) - 2lP'(Sn :::: r + 2) - lP'(Sn = r + 1) = lP'(Sn = r) + lP'(Sn = r + 1 ) = max{lP'(Sn = r) , lP'(Sn = r + I ) }

since only one of these two terms is non-zero.

3. By considering the random walk reversed, we see that the probability of a first visit to S2n at time 2k is the same as the probability of a last visit to So at time 2n - 2k. The result is then immediate from the arc sine law (3 . 10. 19) for the last visit to the origin.


3.11 Solutions to problems

1. (a) Clearly, for all a , b E JR,

lP(g(X) = a , h (Y) = b) = lP(X = x , Y = y) x ,y :

g(x)=a,h(y)=b

L lP(X = x)lP(Y = y) x ,y:

g(x)=a,h(y)=b

= L lP(X = x) L lP(Y = y) x:g(x)=a y :h(y)=b

= lP(g (X) = a)lP(h (Y) = b) .

(b) See the definition (3.2. 1 ) of independence. (c) The only remaining part which requires proof is that X and Y are independent if fx,Y (x , y) =

g(x)h (y) for all x , y E JR. Suppose then that this holds. Then

fx(x) = L fx,y (x , y) = g(x) L h(y) , fy (y) = L fx,y (x , y) = h(y) L g(x) .

Now

so that

y y x x

1 = L fx(x) = L g(x) L h(y) , x x y

fx(x)fy (y) = g(x)h (y) L g(x) L h(y) = g(x)h (y) = fx, Y (x , y) . x y

2. If lE(X2) = Ex x2lP(X = x) = 0 then lP(X = x) = 0 for x ::j: O. Hence lP(X = 0) = 1 . Therefore, i f var(X) = 0, it follows that lP(X - lEX = 0 ) = 1 .

3. (a)

lE (g(X») = LylP(g(X) = y) = L L ylP(X = x) = Lg(x)lP(X = x)

as required.

(b)

y y x:g(x)=y x

lE(g(X)h (Y») = Lg(x)h (Y)fx, y (x , y) by Leroma(3 .6.6) x ,y

= Lg(x)h (y)fx (x)fy (y) by independence x ,y

= Lg(x)fx (x) L h(y)fy (y) = lE(g(X» lE(h (Y» . x y

4. (a) Clearly fx (i ) = fy (i ) = l for i = 1 , 2, 3 .

(b) (X+ Y)(w} ) = 3, (X+y)(W2) = 5, (X+Y)(WJ ) = 4, and therefore fx+y (i ) = l for i = 3 , 4, 5.

(c) (XY)(w} ) = 2, (XY)(W2) = 6, (XY)(W3 ) = 3, and therefore fXy(i ) = t for i = 2, 3 , 6.


(d) Similarly /X/y (i ) = ! for i = i , � , 3 . lP'(Y = 2, Z = 2) lP'(WI ) 1

(e) fy1Z(2 I 2) = lP'(Z = 2) = lP'(W I U W2) = 2 '

and similarly /Y IZ (3 I 2) = i , fylz( l I 1) = 1 , and other values are O. (f) Likewise fzly(2 I 2) = fzly(2 I 3) = fzly (l 1 1 ) = 1 .

00 k 00 { 1 I} 5. (a) L = k L - - -- = k, and therefore k = 1 .

n=1 n(n + 1) n=1 n n + 1 (b) ��I kna = k� ( -a) where � is the Riemann zeta function, and we require a < - 1 for the sum to converge. In this case k = � ( -a) - I . 6. (a) We have that

(b)

n n e-A).,n-k e-I1'1J,k lP'(X + Y = n) = LlP'(X = n - k)lP'(Y = k) = L . --

k=O k=O (n - k) ! k !

= e-A-IL t (n) ).,n-k JLk =

e-A-IL ()., + JL)n

n ! k=O k n !

lP'(X = k, X + Y = n) lP'(X = k I X + Y = n) = �-lP'(=X=--+--=Y-=-n-'--)----'-

= lP'(X = k)lP'(Y = n - k) = (n) ).,k JLn-k . lP'(X + Y = n) k ()., + JL)n

Hence the conditional distribution is bin(n , ).,/()., + JL)) . 7. (i) We have that

lP'(X = n + k, X > n) lP'(X = n + k I X > n) = ) lP'(X > n

p( l - p)n+k- I k- I �oo ' - I = p(l - p) = lP'(X = k) . L.Jj=n+1 p( 1 - p)J

(ii) Many random variables of interest are 'waiting times ' , i.e., the time one must wait before the occurrence of some event A of interest. If such a time is geometric, the lack-of-memory property states that, given that A has not occurred by time n , the time to wait for A starting from n has the same distribution as it did to start with. With sufficient imagination this can be interpreted as a failure of memory by the process giving rise to A. (iii) No. This is because, by the above, any such process satisfies G(k + n) = G(k)G(n) where G(n) = lP'(X > n) . Hence G(k + 1) = G(l )k+1 and X is geometric.

8. Clearly, k

lP'(X + Y = k) = LlP'(X = k - j, Y = j) j=O

= t ( : .) pk-jqm-k+j (�) pjqn-j , ---'\ k J J J=v

= k m+n-k � ( m ) (n) = k m+n-k (m + n) p q � k - ' . P q k j=O J J


which is bin(m + n , p) . 9. Turning immediately to the second request, by the binomial theorem,

as required. Now,

Yf(N even) = L (:) pk ( 1 - p)n-k k even

= H (p + 1 - p)n + ( 1 - p - p)n } = � { I + (I - 2p)n }

in agreement with Problem ( 1 . 8.20). 10. There are (�) ways of choosing k blue balls, and (�=k) ways of choosing n - k red balls. The total number of ways of choosing n balls is (�) , and the claim follows. Finally,

(n) b ! (N - b) ! (N - n) ! Yf(B = k) = k (b - k) ! · (N - b - n + k) ! · N !

= (:) {! . b; I . . . b -�+ I }

x { N; b . . . N - b -; + k + 1 } { Z . . . N -; + 1 }-I

� (:) pk ( 1 _ p)n-k as N � oo.

11. Using the result of Problem (3 . 1 1 .8),

12. (a) JE(X) = c + d, JE(Y) = b + d, and JE(XY) = d, so cov(X, Y) = d - (c + d)(b + d) , and X and Y are uncorrelated if and only if this equals O. (b) For independence, we require f(i , j) = Yf(X = i )Yf(Y = j) for all i , j, which is to say that

a = (a + b)(a + c) , b = (a + b)(b + d), c = (c + d) (a + c) , d = (b + d)(c + d) .

Now a + b + c + d = I , and with a little work one sees that any one of these relations implies the other three. Therefore X and Y are independent if and only if d = (b + d)(c + d), the same condition as for uncorrelatedness.

13. (a) We have that

00 00 m- I 00 00 00 JE(X) = L mYf(X = m) = L L Yf(X = m) = L L Yf(X = m) = L Yf(X > n) .

m=O m=O n=O n=O m=n+ 1 n=O


(b) First method. Let N be the number of balls drawn. Then, by (a),

r r

lE(N) = L JJ»(N > n) = L JJ»(first n balls are red) n=O n=O

r r r - l r - n + l r r ! (b + r - n) ! = L b + r b + r - 1 . . .

b + r - n + 1 = L (b + r) ! (r - n) ! n=O n=O

= � t (n + b) = b + r + l (b + r) ! n=O b b + l '

where we have used the combinatorial identity E�=o (ntb) = (rt!tl) . To see this, either use the simple identity (r': l ) + (�) = (x;:l ) repeatedly, or argue as follows. Changing the order of summation, we find that

� xr � (n ; b) = 1 � x � xn (n

; b)

= ( 1 - x)- (b+2) = "fxr (b + r + 1) r=O b + 1

by the (negative) binomial theorem. Equating coefficients of xr , we obtain the required identity. Second method. Writing m (b , r) for the mean in question, and conditioning on the colour of the first ball, we find that

b r m (b, r) = -

b- + { I + m (b, r - 1) } -

b- '

+ r + r With appropriate boundary conditions and a little effort, one may obtain the result. Third method. Withdraw all the balls, and let Ni be the number of red balls drawn between the i th and (i + l)th blue ball (No = N, and Nb is defined analogously). Think: of a possible 'colour sequence' as comprising r reds, split by b blues into b + 1 red sequences. There is a one-one correspondence between the set of such sequences with No = i , Nm = j (for given i, j, m) and the set of such sequences with No = j, Nm = i ; just interchange the 'Oth' red run with the mth red run. In particular lE(No) = lE(Nm) for all m . Now No + Nl + . . . + Nb = r , so that lE(Nm) = r /(b + 1 ) , whence the claim is immediate. (c) We use the notation just introduced. In addition, let Br be the number of blue balls remaining after the removal of the last red ball. The length of the last 'colour run' is Nb + Br , only one of which is non-zero. The answer is therefore r/(b + 1) + b/(r + 1 ) , by the argument of the third solution to part (b).

14. (a) We have that lE(Xk) = Pk and var(Xk) = Pk (1 - Pk) , and the claims follow in the usual way, the first by the linearity oflE and the second by the independence of the Xi ; see Theorems (3.3.8) and (3 .3 . 1 1 ). (b) Let s = Ek Pb and let Z be a random variable taking each of the values PI , P2 , . . . , Pn with equal probability n- 1 . Now lE(Z2) _ lE(Z)2 = var(Z) 2:: 0, so that

L .!.pf 2:: (L .!.Pk)

2 = s�

k n k n n

with equality if and only if Z is (almost surely) constant, which is to say that PI = P2 = . . . = Pn . Hence


with equality if and only if P I = P2 = . . . = Pn . Essentially the same route may be followed using a Lagrange multiplier.

(c) This conclusion is not contrary to informed intuition, but experience shows it to be contrary to much uninformed intuition.

15. A matrix V has zero determinant if and only if it is singular, that is to say if and only if there is a non-zero vector x such that xVx' = O. However,

Hence, by the result of Problem (3. 1 1 .2), Ek Xk (Xk - lEXk) is constant with probability one, and the result follows.

16. The random variables X + Y and IX - Y I are uncorrelated since

However,

cov {X + Y, I X - Y I ) = lE{ (X + Y) IX - Y I } - lE(X + Y)lE( IX - YD

= ! + ! - 1 . ! = O.

! = lP(X + Y = 0, IX - Y I = 0) # lP(X + Y = O)lP( IX - Y I = 0) = ! . ! = l , so that X + Y and I X - Y I are dependent.

17. Let h be the indicator function of the event that there is a match in the kth place. Then lP(h = 1) = n- I , and for k # j,

1 lP(h = 1 , Ij = 1) = lP(/j = 1 I h = 1)lP(h = 1) = -n('-n--

-l-:-)

var(X) = lE(X2) - (lEX)2 = lE ( E h) 2 - 1

1

= E lE(h)2 + LlE(/jh) - I = I + 2 (n) 1 - 1 = 1 .

1 j# 2 n(n - 1)

We have by the usual (mis)matching argument of Example (3.4.3) that

$$\mathbb{P}(X = r) = \frac{1}{r!} \sum_{j=0}^{n-r} \frac{(-1)^j}{j!}, \qquad 0 \le r \le n-2,$$
which tends to $e^{-1}/r!$ as $n \to \infty$.

18. (a) Let YI , Y2 , . . . , Yn be Bemoulli with parameter P2 , and ZI , Z2 , . . . , Zn Bernoulli withparam­eter PI I P2 , and suppose the usual independence. Define Aj = Yj Zj , a Bernoulli random variable that has parameter lP(Aj = 1 ) = lP(Yj = 1)lP(Zj = 1 ) = Pl . Now (A I , A2 , . . . , An ) � (YI , Y2 , . . . , Yn) so that f(A) � f(Y) . Hence e(PI ) = lE(f(A)) � lE(f(Y)) = e(p2) '

(b) Suppose first that n = 1 , and let X and X' be independent Bernoulli variables with parameter p. We claim that

{J(X) - f(X') } {g(X) - g (X') } 2: 0;


to see this consider the three cases X = X', X < X', X > X' separately, using the fact that I and g are increasing. Taking expectations, we obtain

lE ({f(X) - I(X') } {g(X) - g(X') }) ::: 0,

which may be expanded to find that

o :::; lE(J(X)g(X») - lE (J(X')g(X») - lE (J(X)g(X'») + lE (J(X')g(X'»)

= 2{ lE (J(X)g(X») - lE(f(X» lE(g(X» }

by the properties of X and X' . Suppose that the result is valid for all n satisfying n < k where k ::: 2. Now

lE (J(X)g(X») = lE {lE (J(X)g(X) I Xl , X2 , . . . , Xk- t ) } ;

here, the conditional expectation given X I , X2 , . . . , Xk- I is defined in very much the same way as in Definition (3 .7.3), with broadly similar properties, in particular Theorem (3 .7.4); see also Exercises (3.7 . 1 , 3). If Xl , X2 , " " Xk- I are given, then I(X) and g(X) may be thought of as increasing functions of the single remaining variable Xb and therefore

by the induction hypothesis. Furthermore

are increasing functions of the k - 1 variables X I , X 2 , . . . , X k- I , implying by the induction hypothesis that lE(f'(X)g' (X» ::: lE(f' (X» lE(g' (X» . We substitute this into (*) to obtain

lE (J(X)g(X») ::: lE(f' (X» lE(g' (X» = lE(f(X» lE(g(X»

by the definition of I' and g' .

19. Certainly R(p) = lE(/A) = La> IA (w)lP'(w) and lP'(w) = pN(a» qm-N(a» where p + q = 1 . Differentiating, we obtain

Applying the Cauchy-Schwarz inequality (3 .6.9) to the latter covariance, we find that R' (p) :::; (pq)-I '/var(/A) var(N) . However IA is Bernoulli with parameter R(p), so that var(/A) = R(p) ( 1 -R(p» , and finally N is bin(m , p) so that var(N) = mp(1 - p), whence the upper bound for R' (p) follows.

As for the lower bound, use the general fact that cov(X + Y, Z) = cov(X, Z) +cov(Y, Z) to deduce that COV(/A , N) = COV(/A , IA) + COV(/A , N - IA) . Now IA and N - IA are increasing functions of w, in the sense of Problem (3. 1 1 . 1 8); you should check this. Hence cov(/ A , N) ::: var(/ A) + 0 by the result of that problem. The lower bound for R' (p) follows.


20. (a) Let each edge be blue with probability PI and yellow with probability P2 ; assume these two events are independent of each other and of the colourings of all other edges. Call an edge green if it is both blue and yellow, so that each edge is green with probability PI P2 . IT there is a working green connection from source to sink, then there is also a blue connection and a yellow connection. Thus

lP'(green connection) ::: lP'(blue connection, and yellow connection) = lP'(blue connection)lP'(yellow connection)

so that R(PI P2) ::: R(p} )R (P2) . (b) This is somewhat harder, and may be proved by induction on the number n of edges of G. IT n = 1 then a consideration of the two possible cases yields that either R (p) = 1 for all P, or R (p) = P for all p. In either case the required inequality holds.

Suppose then that the inequality is valid whenever n < k where k ::: 2, and consider the case when G has k edges. Let e be an edge of G and write w(e) for the state of e; w (e) = 1 if e is working, and w(e) = 0 otherwise. Writing A for the event that there is a working connection from source to sink, we have that

R (pY ) = lP'pY (A I w(e) = l )pY + lP'pY (A I w(e) = 0) (1 - pY) ::: lP'p (A I w(e) = I )Y pY + lP'p (A I w(e) = O)Y (1 - pY)

where lP'0! is the appropriate probability measure when each edge is working with probability a. The inequality here is valid since, if w (e) is given, then the network G is effectively reduced in size by one edge; the induction hypothesis is then utilized for the case n = k - 1 . It is a minor chore to check that

xYpY + yY (1 - p)Y ::: {xp + y ( l - p) }Y if x ::: y ::: 0;

to see this, check that equality holds when x = y ::: 0 and that the derivative of the left-hand side with respect to x is at most the corresponding derivative of the right-hand side when x , y ::: O. Apply the latter inequality with x = lP'p (A I w(e) = 1 ) and y = lP'p (A I w(e) = 0) to obtain

R(pY ) ::: {lP'p (A I w(e) = l )p + lP'p (A I w(e) = 0) (1 - p) Y = R(p)Y .

21. (a) The number X of such extraordinary individuals has the bin(107 , 10-7) distribution. Hence lEX = 1 and

lP'(X > 1 I X > 1) = lP'(X > 1 ) = 1 - lP'(X = 0) - lP'(X = 1 ) - lP'(X > 0) 1 - lP'(X = 0)

1 - (1 _ 10-7) 107 _ 107 . 10-7 (1 _ 10-7) 107- 1 1 - (1 - 10-7) 107

1 - 2e- 1 - 1 � 0.4. l - e-

(Shades of (3 .5 .4) here: X is approximately Poisson distributed with parameter 1 .) (b) Likewise

1 - 2e-1 - l e-1 lP'(X > 2 I X ::: 2) � � � 0.3 . 1 - 2e-(c) Provided m « N = 107 ,

N ! ( 1 ) m ( 1 ) N-m e-1 lP'(X = m) = - 1 - - � - , m ! (N - m) ! N N m !


the Poisson distribution. Assume that "reasonably confident that n is all" means that IP'(X > n I X � n) ::: r for some suitable small number r. Assuming the Poisson approximation, IP'(X > n) ::: r IP'(X � n) if and only if

00 1 00 1 e- 1 '"' _ < re- 1 '"' - . L...J k ! - L...J k ! k=n+l k=n For any given r , the smallest acceptable value of n may be determined numerically. If r is small, then very roughly n � l/r will do (e.g. , if r = 0.05 then n � 20). (d) No level p of improbability is sufficiently small for one to be sure that the person is specified uniquely. If p = 10-7 a, then X is bin ( 107 , 10-7 a), which is approximately Poisson with parameter a. Therefore, in this case,

1 - e-a - ae-a IP'(X > 1 I X � 1) � _ = p , say. 1 - e a An acceptable value of p for a very petty offence might be p � 0.05, in which case a � 0. 1 and so p = 10-8 might be an acceptable level of improbability. For a capital offence, one would normally require a much smaller value of p . We note that the rules of evidence do not allow an overt discussion along these lines in a court of law in the United Kingdom. 22. The number G of girls has the binomial distribution bin(2n , p) . Hence

2n (2 ) IP'(G � 2n - G) = IP'(G � n) = L ; pkq2n-k k=n (2n) � k 2n-k (2n) n n q

::: L...J p q = p q -- , n k=n n q - p

where we have used the fact that e:) ::: (�) for all k. With p = 0.485 and n = 104 , we have using Stirling's formula (Exercise (3 . 10. 1 )) that (2n)

pn qn _q _ ::: � { ( 1 _ 0.03)( 1 + 0.03) } n 00

.50135

n q - p 'V (mr ) .

0.5 15 ( 9 ) 104

-5 = 3.,fii 1 - 104 ::: 1 .23 x 10 .

It follows that the probability that boys outnumber girls for 82 successive years is at least ( 1 - 1 .23 x 10-5)82 � 0.99899. 23. Let M be the number of such visits. If k =F 0, then M � 1 if and only if the particle hits 0 before it hits N, an event with probability 1 - kN-1 by equation ( 1 .7.7). Having hit 0, the chance of another visit to 0 before hitting N is 1 - N-l , since the particle at 0 moves immediately to 1 whence there is probability 1 - N-1 of another visit to 0 before visiting N. Hence

so that

( k ) ( l )r- l IP'(M � r I So = k) = 1 - N 1 - N ' r � 1 ,

IP'(M = j I So = k) = IP'(M � j I So = k) - IP'(M � j + 1 I So = 0)

= ( 1 _ �) ( 1 _ �)

j- l � , j � 1 .


24. Either read the solution to Exercise (3.9.4), or the following two related solutions neither of which uses difference equations. First method. Let Tk be the event that A wins and exactly k tails appear. Then k < n so that IP'(A wins) = 2:k:� IP'(Tk) . However IP'(Tk) is the probability that m

+ k tosses yield m heads, k tails,

and the last toss is heads. Hence

whence the result follows. Second method. Suppose the coin is tossed m

+ n - 1 times. If the number of heads is m or more,

then A must have won; conversely if the number of heads is m - 1 or less, then the number of tails is n or more, so that B has won. The number of heads is bin(m

+ n - 1 , p) so that

m+n-l ( + 1) IP'(A wins) = L

m : - pkqm+n-I-k . k=m

25. The chance of winning, having started from k, is

1 - (qjp)k 1 - (qjp)N which may be written as

see Example (3.9.6). If k and N are even, doubling the stake is equivalent to playing the original game with initial fortune � k and the price of the Jaguar set at � N. The probability of winning is now

which is larger than before, since the final term in the above display is greater than 1 (when p < � ). If p = � , doubling the stake makes no difference to the chance of winning. If p > � , it is better

to decrease the stake. 26. This is equivalent to taking the limit as N -+ 00 in the previous Problem (3. 1 1 .25). In the limit when p f= � , the probability of ultimate bankruptcy is

lim (qjp)k - (qjp)N =

{ (qjp)k N"""*oo 1 - (qjp)N 1

if I p > 2:-if I P < 2 '

where p + q = 1 . If p = � , the corresponding limit is limN"""*oo (1 - kjN) = 1 .

27. Using the technique of reversal, we have that IP'(Rn = Rn-l

+ 1) = IP'(Sn- 1 f= Sn , Sn-2 f= Sn , . . . , So f= Sn)

= IP'(Xn f= 0, Xn-l + Xn f= 0, . . . , Xl

+ . . . + Xn f= 0) = IP'(XI f= 0, X2

+ Xl f= 0, . . . , Xn

+ . . . + Xl f= 0) = IP'(SI f= 0, S2 f= 0, . . . , Sn f= 0) = IP'(SI S2 · · · Sn f= 0) .

It follows that lE:(Rn) = lE:(Rn- l ) + IP'(SI S2 · · · Sn f= 0) for n :::: 1 , whence

1 1 { n

} -lE:(Rn ) = - 1 +

L IP'(SI S2 · · · Sm f= 0) -+ IP'(Sk f= 0 for all k :::: 1 ) n n m=l


since 1P'(Sl S2 . . . Sm =j: 0) {- lP'(Sk =j: 0 for all k � 1) as m � 00 .

There are various ways of showing that the last probability equals I p - q l , and here is one. Suppose p > q. If Xl = 1, the probability of never subsequently hitting the origin equals 1 - (q/ p), by the calculation in the solution to Problem (3 . 1 1 .26) above. If Xl = - 1 , the probability of staying away from the origin subsequently is O. Hence the answer is p ( l - (q / p)) + q . 0 = p - q .

If q > p , the same argument yields q - p, and if p = q = 1 the answer i s O.

28. Consider first the event that M2n is first attained at time 2k. This event occurs if and only if: (i) the walk: makes a first passage to S2k (> 0) at time 2k, and (ii) the walk thereafter does not exceed S2k . These two events are independent. The chance of (i) is, by reversal and symmetry,

1P'(S2k-l < S2k, S2k-2 < S2k > . . . , So < S2k) = 1P'(X2k > 0, X2k- l + X2k > 0, . . . , Xl + . . . + X2k > 0) = lP'(Xl > 0, Xl + X2 > 0, . . . , Xl + . . . + X2k > 0) = lP'(Sj > 0 for 1 :s i :s 2k) = 11P'(Sj =j: 0 for 1 :s i :s 2k) = 11P'(S2k = 0) by equation (3 . 10.23) .

As for the second event, we may translate S2k to the origin to obtain the probability of (ii):

where we have used the result of Exercise (3 . 10.2). The answer is therefore as given. The probabilities of (i) and (ii) are unchanged in the case i = 2k + 1 ; the basic reason for this is

that S2r is even, and S2r+ 1 odd, for all r .

29. Let Uk = lP'(Sk = 0) , fk = lP'(Sk = 0 , Sj ::j: 0 for 1 :s i < k) , and use conditional probability (or recall from equation (3 . 10.25)) to obtain

n U2n = L U2n-2khk ·

k=l

Now Nl = 2, and therefore it suffices to prove that lE(Nn ) = lE(Nn-d for n � 2. Let N�_l be the number of points visited by the walk Sl , S2 , . . . , Sn exactly once (we have removed So). Then { N�_l + 1 if Sk =j: So for 1 :s k :s n ,

Nn = N�_l - 1 if Sk = So for exactly one k in { l , 2, . . . , n } , N�_l otherwise.

Hence, writing an = lP'(Sk ::j: 0 for 1 :s k :s n) ,

lE(Nn) = lE(N�_l ) + an - lP'(Sk = So exactly once) = lE(Nn-l ) + an - {J2an-2 + f4an-4 + . . . + hLn/2J }

where LxJ is the integer part of x . NOW a2m = a2m+l = U2m by equation (3 . 10.23). If n = 2k is even, then

If n = 2k + 1 is odd, then


In either case the claim is proved. 30. (a) Not much. (b) The rhyme may be interpreted in any of several ways. Interpreting it as meaning that families stop at their first son, we may represent the sample space of a typical family as {B , GB, c¥B, . . . }, with IP'(GnB) = 2-(n+1) . The mean number of girls is E�l nlP'(anB) = E�l n2-(n+l) = 1 ; there is exactly one boy.

The empirical sex ratio for large populations will be near to 1 : 1 , by the law of large numbers. However the variance of the number of girls in a typical family is var(#girls) = 2, whilst var(#boys) = 0; #A denotes the cardinality of A. Considerable variation from 1 : 1 is therefore possible in smaller populations, but in either direction. In a large number of small popUlations, the number of large predominantly female families would be balanced by a large number of male singletons. 31. Any positive integer m has a unique factorization in the form m = Il pr(i) for non-negative integers m(1) , m (2) , . . . . Hence,

II ( . . ) II ( 1 ) 1 (II _m(i») f3 C IP'(M = m) = IP' N (z ) = m(z ) = 1 - fJ f3m(i) = C Pi = mf3 i i Pi Pi i

where C = IIi (1 - pif3) . Now Em IP'(M = m) = 1 , so that c-1 = Em m-f3 . 32. Number the plates 0, 1 , 2, . . . , N where 0 is the starting plate, fix k satisfying 0 < k ::; N, and let Ak be the event that plate number k is the last to be visited. In order to calculate IP'(Ak), we cut the table open at k, and bend its outside edge into a line segment, along which the plate numbers read k, k + 1 , . . . , N, 0, 1 , . . . , k in order. It is convenient to relabel the plates as -(N + 1 - k) , -(N -k), . . . , - 1 , 0, 1 , . . . , k. Now Ak occurs if and only if a symmetric random walk, starting from 0, visits both -(N - k) and k - 1 before it visits either -(N + 1 - k) or k. Suppose it visits -(N - k) before it visits k - 1 . The (conditional) probability that it subsequently visits k - 1 before visiting -(N + 1 - k) is the same as the probability that a symmetric random walk, starting from 1, hits N before it hits 0, a probability of N-1 by ( 1 .7.7). The same argument applies if the cake visits k - 1 before it visits -(N - k) . Therefore IP'(Ak) = N-1 . 33. With j denoting the jth best vertex, the walk: has transition probabilities Pjk = (j - 1 )-1 for 1 ::; k < j . By conditional expectation,

1 j- l r · = 1 + -- '"' rk J . 1 L.." ,

] - k=l rt = 0.

Induction now supplies the result. Since rj � log j for large j , the worst-case expectation is about log (�) . 34. Let Pn denote the required probability. If (mr , mr+l ) is first pair to make a dimer, then ml i s ultimately uncombined with probability Pr- l . By conditioning on the first pair, we find that Pn = (PI + P2 + . . . + Pn-2)/(n - 1) , giving n (Pn+l - Pn) = -(Pn - Pn-l ) . Therefore, n ! (Pn+l - Pn) = (_ 1 )n- l (P2 - p} ) = (_ 1 )n , and the claim follows by summing.

Finally,
$$\mathbb{E}U_n = \sum_{r=1}^{n} \mathbb{P}(m_r \text{ is uncombined}) = p_n + p_1 p_{n-1} + \cdots + p_{n-1} p_1 + p_n,$$
since the $r$th molecule may be thought of as an end molecule of two sequences of lengths $r$ and $n - r + 1$. Now $p_n \to e^{-1}$ as $n \to \infty$, and it is an easy exercise of analysis to obtain that $n^{-1}\mathbb{E}U_n \to e^{-2}$.


35. First,

where the last summation is over all subsets {q , . . . , rk } of k distinct elements of { I , 2, . . . , n } . Secondly,

Hence

Ak :::: k ! L Pr, Pr2 · · · Prk + (�) L Pf L Prl Pr2 . . . Prk-2 {rl , . . . , rk } i rl , .. · . rk_2

:::: k! L Pq Pr2 · · · Prk + (�) rnr-X Pi (�

Pj ) k- l

. rl . · · · . rk J

By Taylor's theorem applied to the function log (I-x) , there exist Br satisfying 0 < Br < {2( 1-c)2) }- 1 such that

n (**) II (1 - Pr) = II exp{ -Pr - Br p; } = exp{ -A - AO (rnr-X Pi ) } .

r=1 r

Finally,

The claim follows from (*) and (**). 36. It is elementary that

We write $Y - \mathbb{E}(Y)$ as the mixture of indicator variables thus:
$$Y - \mathbb{E}(Y) = \sum_{r=1}^{N} \frac{x_r}{n}\Bigl(I_r - \frac{n}{N}\Bigr).$$
It follows from the fact


that
[Figure 3.2 appears here: a probability tree whose branches lead to the four outcomes $H \cap C$, $H \cap \bar{C}$, $\bar{H} \cap C$, $\bar{H} \cap \bar{C}$, with branch probabilities such as $(1-\gamma)\alpha$.]
Figure 3.2. The tree of possibility and probability in Problem (3.11.37). The presence of the disease is denoted by $C$, and hospitalization by $H$; their negations are denoted by $\bar{C}$ and $\bar{H}$.

$$\operatorname{var}(Y) = \sum_{r=1}^{N} \frac{x_r^2}{n^2}\,\mathbb{E}\Bigl\{\Bigl(I_r - \frac{n}{N}\Bigr)^2\Bigr\} + \sum_{i \ne j} \frac{x_i x_j}{n^2}\,\mathbb{E}\Bigl\{\Bigl(I_i - \frac{n}{N}\Bigr)\Bigl(I_j - \frac{n}{N}\Bigr)\Bigr\}$$
$$= \sum_{r=1}^{N} \frac{x_r^2}{n^2}\cdot\frac{n}{N}\Bigl(1 - \frac{n}{N}\Bigr) + \sum_{i \ne j} \frac{x_i x_j}{n^2}\Bigl\{\frac{n}{N}\cdot\frac{n-1}{N-1} - \frac{n^2}{N^2}\Bigr\}$$
$$= \sum_{r=1}^{N} x_r^2\,\frac{N-n}{N^2 n} - \sum_{i \ne j} x_i x_j\,\frac{N-n}{n(N-1)N^2}$$
$$= \frac{N-n}{n(N-1)N}\Bigl\{\sum_{r=1}^{N} x_r^2 - \frac{1}{N}\sum_{r=1}^{N} x_r^2 - \frac{1}{N}\sum_{i \ne j} x_i x_j\Bigr\}$$
$$= \frac{N-n}{n(N-1)}\cdot\frac{1}{N}\Bigl\{\sum_{r=1}^{N} x_r^2 - N\bar{x}^2\Bigr\} = \frac{N-n}{n(N-1)}\cdot\frac{1}{N}\sum_{r=1}^{N}(x_r - \bar{x})^2.$$

37. The tree in Figure 3.2 illustrates the possibilities and probabilities. If $G$ contains $n$ individuals, $X$ is $\operatorname{bin}(n, \gamma p + (1-\gamma)\alpha)$ and $Y$ is $\operatorname{bin}(n, \gamma p)$. It is not difficult to see that $\operatorname{cov}(X, Y) = n\gamma p(1 - \nu)$ where $\nu = \gamma p + (1-\gamma)\alpha$. Also, $\operatorname{var}(Y) = n\gamma p(1 - \gamma p)$ and $\operatorname{var}(X) = n\nu(1 - \nu)$. The result follows from the definition of correlation.
38. (a) This is an extension of Exercise (3.5.2). With $\mathbb{P}_n$ denoting the probability measure conditional on $N = n$, we have that


where $s = n - \sum_{j=1}^{k} r_j$. Therefore,
$$\mathbb{P}(X_i = r_i \text{ for } 1 \le i \le k) = \sum_{n=0}^{\infty} \mathbb{P}_n(X_i = r_i \text{ for } 1 \le i \le k)\,\mathbb{P}(N = n) = \prod_{i=1}^{k}\Bigl\{\frac{(\nu f(i))^{r_i}}{r_i!}\,e^{-\nu f(i)}\Bigr\} \sum_{s=0}^{\infty} \frac{\nu^s(1 - F(k))^s}{s!}\,e^{-\nu(1-F(k))}.$$
The final sum is a Poisson sum, and equals 1.
(b) We use an argument relevant to Wald's equation. The event $\{T \le n-1\}$ depends only on the random variables $X_1, X_2, \dots, X_{n-1}$, and these are independent of $X_n$. It follows that $X_n$ is independent of the event $\{T \ge n\} = \{T \le n-1\}^{\mathrm{c}}$. Hence,
$$\mathbb{E}(S) = \sum_{i=1}^{\infty}\mathbb{E}(X_i I_{\{T \ge i\}}) = \sum_{i=1}^{\infty}\mathbb{E}(X_i)\,\mathbb{E}(I_{\{T \ge i\}}) = \sum_{i=1}^{\infty}\mathbb{E}(X_i)\,\mathbb{P}(T \ge i) = \sum_{i=1}^{\infty} \nu f(i) \sum_{t=i}^{\infty}\mathbb{P}(T = t) = \nu\sum_{t=1}^{\infty}\mathbb{P}(T = t)\sum_{i=1}^{t} f(i) = \nu\sum_{t=1}^{\infty}\mathbb{P}(T = t)F(t) = \nu\,\mathbb{E}(F(T)).$$

39. (a) Place an absorbing barrier at a + 1 , and let Pa be the probability that the particle is absorbed at O. By conditioning on the first step, we obtain that

1 ::; n ::; a .

The boundary conditions are Po = 1 , Pa+1 = O. It follows that Pn+1 - Pn = (n + I) (Pn - Pn-} ) for 2 ::; n ::; a. We have also that P2 - PI = PI - 1 , and

Pn+l - Pn = ! (n + I ) ! (P2 - PI ) = ! (n + I ) ! (PI - po) ·

Setting n = a we obtain that -Pa = ! (a + I ) ! (PI - 1) . By summing over 2 ::; n < a,

a Pa - PI = (PI - PO) + ! (Pl - PO) L j ! ,

j=3

and we eliminate PI to conclude that

(a + I ) ! Pa = 4 + 3 ! + 4 ! + . . . + (a + l) ! ·

It is now easy to see that, for given r , Pr = Pr (a) � 1 as a � 00, so that ultimate absorption at 0 is (almost) certain, irrespective of the starting point. (b) Let Ar be the probability that the last step is from 1 to 0, having started at r . Then

Al = 1 ( 1 + Al + A2) , (r + 2)Ar = Al + A2 + · · · + Ar+l ,


r ::: 2.


It follows that

whence

1 Ar - Ar- l = --1 (Ar+l - Ar) , r +


r:::: 3 .

Letting r --+ 00 , we deduce that A3 = A2 so that Ar = A2 for r ::: 2. From (**) with r = 2 , A2 = �Ab and from (*) Al = � . (c) Let /-Lr be the mean duration of the walk starting from r. As above, /-Lo = 0, and

r ::: 1 ,

whence $\mu_{r+1} - \mu_r = (r+1)(\mu_r - \mu_{r-1}) - 1$ for $r \ge 2$. Therefore, $v_{r+1} = (\mu_{r+1} - \mu_r)/(r+1)!$ satisfies $v_{r+1} - v_r = -1/(r+1)!$ for $r \ge 2$, and some further algebra yields the value of $\mu_1$.
40. We label the vertices $1, 2, \dots, n$, and we let $\pi$ be a random permutation of this set. Let $K$ be the set of vertices $v$ with the property that $\pi(w) > \pi(v)$ for all neighbours $w$ of $v$. It is not difficult to see that $K$ is an independent set, whence $\alpha(G) \ge |K|$. Therefore, $\alpha(G) \ge \mathbb{E}|K| = \sum_v \mathbb{P}(v \in K)$. For any vertex $v$, a random permutation $\pi$ is equally likely to assign any given ordering to the set comprising $v$ and its neighbours. Also, $v \in K$ if and only if $v$ is the earliest element in this ordering, whence $\mathbb{P}(v \in K) = 1/(d_v + 1)$. The result follows.
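The bound $\alpha(G) \ge \sum_v 1/(d_v + 1)$ lends itself to a quick simulation. The following sketch (in Python; the names are ours, and the block is an illustration rather than part of the solution) draws the set $K$ from a uniform random permutation and checks its expected size against the formula on a small graph.

```python
import random

def random_permutation_independent_set(adj, rng=random):
    """One draw of the set K in the solution to Problem (3.11.40): order the
    vertices by a uniform random permutation and keep each vertex that comes
    before all of its neighbours.  K is always an independent set."""
    order = list(adj)
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    return {v for v in adj if all(rank[v] < rank[w] for w in adj[v])}

# Check E|K| = sum_v 1/(d_v + 1) on a 4-cycle: every degree is 2, so E|K| = 4/3.
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
sizes = [len(random_permutation_independent_set(adj)) for _ in range(20000)]
print(sum(sizes) / len(sizes))   # approximately 1.333
```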


4 Continuous random variables

4.1 Solutions. Probability density functions

1. (a) $\{x(1-x)\}^{-1/2}$ is the derivative of $\sin^{-1}(2x - 1)$, and therefore $C = \pi^{-1}$. (b) $C = 1$, since
(c) Substitute $v = (1 + x^2)^{-1}$ to obtain
$$\int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^m} = \int_0^1 v^{m - \frac32}(1 - v)^{-\frac12}\,dv = B\bigl(\tfrac12, m - \tfrac12\bigr),$$
where $B(\cdot, \cdot)$ is a beta function; see paragraph (4.4.8) and Exercise (4.4.2). Hence, if $m > \tfrac12$,
$$C^{-1} = B\bigl(\tfrac12, m - \tfrac12\bigr) = \frac{\Gamma(\tfrac12)\,\Gamma(m - \tfrac12)}{\Gamma(m)}.$$

2. (i) The distribution function Fy of Y is

Fy (y) = JP(Y ::: y) = JP(aX ::: y) = JP(X ::: yja) = Fx (yja) .

So, differentiating, fy (y) = a-I fx (yja) . (ii) Certainly

F_X (x) = JP(-X ::: x) = JP(X ::: -x) = 1 - JP(X ::: -x) since JP(X = -x) = 0. Hence f-x (x) = fx (-x) . If X and -X have the same distribution function then f-x(x) = fx (x) , whence the claim follows. Conversely, if fx(-x) = fx (x) for all x, then, by substituting u = -x,

JP(-X ::: y) = JP(X ::: -y) = L7 fx (x) dx = /:00

fx (-u) du = /:00

fx (u) du = JP(X ::: y),

whence X and -X have the same distribution function. 3. Since a ::: 0, f ::: 0, and g ::: 0, it follows that af + (1 - a)g ::: 0. Also

l {af + (1 - a)g} dx = a l f dx + (1 - a) l g dx = a + 1 - a = 1 .


If X is a random variable with density f, and Y a random variable with density g, then a f +(1 -a)g is the density of a random variable Z which takes the value X with probability a and Y otherwise.

Some minor technicalities are necessary in order to find an appropriate probability space for such a Z. If X and Y are defined on the probability space (0, !F, lP'), it is necessary to define the product space (0 , .1=", lP') x (� , g., Q) where � = to, I } , g. is the set of all subsets of �, and Q(O) = a, Q(I) = 1 - a . For w x (1 E ° x �, we define

Z(w x (1) = { X(W) if (1 = 0, Y(w) if (1 = 1 .

. . . 1 F (x + h) - F(x) f(x) 4. (a) By defimtion, r (x) = hm -h 1 ( ) = 1 ( ) h,l.O - F x - F x (b) We have that

H(x) = !:.-. { .!. (X r (y) dY } = r (x) _ � r r (y) dy = � r [r (x) - r(y)] dy, x dx x Jo x x Jo x Jo which is non-negative if r is non-increasing. (c) H(x)/x is non-decreasing if and only if, for 0 ::; a ::; 1 ,

1 1 -H(ax) ::; -H(x) for all x ::: 0, ax x which is to say that _a-l log [1 - F(ax)] ::; - log[ 1 - F(x)]. We exponentiate to obtain the claim. (d) Likewise, if H(x)/x is non-decreasing then H(at) ::; aH(t) for ° ::; a ::; 1 and t ::: 0, whence H(at) + H(t - at) ::; H(t) as required.

4.2 Solutions. Independence

1. Let N be the required number. Then lP'(N = n) = F(K)n- 1 [ 1 - F(K)] for n ::: 1 , the geometric distribution with mean [ 1 - F(K)rl . 2. (i) Max{X, Y} ::; v if and only if X ::; v and Y ::; v. Hence, by independence,

lP' (max{X, Y} ::; v) = lP'(X ::; v, Y ::; v) = lP'(X ::; v)lP'(Y ::; v) = F(v)2 . Differentiate to obtain the density function of V = max{X, Y}. (ii) Similarly minIX, Y} > u if and only if X > u and Y > u. Hence

lP'(U ::; u) = 1 - lP'(U > u) = 1 - lP'(X > u)lP'(Y > u) = 1 - [ 1 - F (u) ]2 , giving fu (u) = 2f(u) [ 1 - F (u) ] . 3. The 24 permutations of the order statistics are equally likely by symmetry, and thus have equal probability. Hence lP'(XI < X2 < X3 < X4) = i4, and lP'(XI > X2 < X3 < X4) = i4, by enumerating the possibilities. 4. lP'(Y(y) > k) = F(y)k for k ::: 1 . Hence lEY(y) = F(y)/[ 1 - F(y)] --+ 00 as y --+ 00.

Therefore, lP' (Y(y) > lEY(y)) = { I - [1 - F(y)1 } IF(y)/[l -F(y)]J

'" exp { I - [1 - F(y)] l F(y) J } --+ e-l as y --+ 00 .

1 - F(y)


4.3 Solutions. Expectation

I. (a) lE(Xa ) = Jooo xae-x dx < 00 if and only if a > - 1 .

(b) In this case

if and only if - I < a < 2m - 1 . 2. We have that

(En x. ) n 1 = lE _1_1 = LlE(Xi /Sn) . Sn i=1


By symmetry, lE(Xi / Sn) = lE(X 1 I Sn) for all i , and hence 1 = nlE(X 1 I Sn) . Therefore m

lE(SmISn) = LlE(Xi /Sn) = mlE(Xl /Sn) = min . i=1

3. Either integrate by parts or use Fubini's theorem:

r roo xr- 1JP(X > x) dx = r roo xr- 1 { roo f(y) dY } dx k k h� = roo f(y) { [Y rxr- 1 dX } dy = roo yr f(y) dy . 1y=0 1x=o 10

An alternative proof is as follows. Let Ix be the indicator of the event that X > x , so that Jooo Ix dx = X. Taking expectations, and taking a minor liberty with the integral which may be made rigorous, we obtain lEX = Jooo lE(lx ) dx . A similar argument may be used for the more general case.

4. We may suppose without loss of generality that J1, = 0 and (1 = 1 . Assume further that m > 1 . In this case, at least half the probability mass lies to the right of 1 , whence lE(XI{X;:::m} ) ::: � . Now 0 = lE(X) = lE{X[l{X;:::m} + I{X<m} ] } , implying that lE(XI{x<m}) ::: -� . Likewise,

2 ) 1 lE(X I{X;:::m} ::: 2 ' By the definition of the median, and the fact that X is continuous,

lE(X I X < m) ::: - 1 , lE(X2 I X < m) ::: 1 . It follows that var(X I X < m ) ::: 0, which implies in tum that, conditional on {X < m } , X is almost surely concentrated at a single value. This contradicts the continuity of X, and we deduce that m ::: 1 . The possibility m < - 1 may be ruled out similarly, or by considering the random variable -X. S. It is a standard to write X = X+ - X- where X+ = max{X , O} and X- = - min IX , OJ. Now X+ and X-are non-negative, and so, by Lemma (4.3 .4),

J1, = lE(X) = lE(X+) - lE(X-) = 1000 JP(X > x) dx - 10

00 JP(X < -x) dx

= roo[l _ F(x)] dx _ roo F(-x) dx = roo [1 - F(x)] dx _ rO F(x) dx . 10 10 10 1-00

It is a triviality that

J1, = loP. F(x) dx + loP. [ 1 - F(x)] dx and the equation follows with a = J1,. It is easy to see that it cannot hold with any other value of a, since both sides are monotonic functions of a.


4.4 Solutions. Examples of continuous variables

1. (i) Integrating by parts,

r(t) = 1000 X,- l e-x dx = (t - 1) 1000 xt-2e-x dx = (t - 1 )f(t - 1 ) .

If n is an integer, then i t follows that r (n) = (n - l )f (n - l) = . . . = (n - l ) ! r ( 1 ) where r(1 ) = 1 . (ii) We have, using the substitution u2 = x, that

r (� )2 = {1000 x- ! e-x dx } 2 = {1000 2e-u2 dU } 2 = 4 roo

e-u2 du roo e-v2 dv = 4100 rr/2 e-r2 r dr df} = 7f

Jo Jo r=O J(}=o as required. For integral n,

1 1 1 1 3 1 1 (2n) ! '-r (n + 2" ) = (n - 2")r(n - 2") = . . . = (n - 2) (n - 2) · · · 2" r (2 ) = 4nn ! v7f •

2. By the definition of the gamma function,

r(a)r (b) = 1000 xa- 1e-x dx 1000 l-le-y dy = 1000 1000 e-(x+Y)xa-1 yb- l dx dy .

Now set u = x + y, v = x/(x + y) , obtaining

roo r 1 e-uua+b- l va- l ( 1 _ v)b- l dv du

Ju=O Jv=O = 1000 ua+b- 1e-u du 101 va-l ( 1 - v)b- l dv = r(a + b)B(a, b).

3. If g is strictly decreasing then JP(g(X) ::: y) = JP(X ::: g- l (y» = 1 - g- l (y) so long as o ::: g- l (y) ::: 1 . Therefore JP(g(X) ::: y) = 1 - e-Y , y ::: 0, if and only if g- 1 (y) = e-Y , which is to say that g (x) = - log x for 0 < x < 1 . 4. We have that

Also,

jx 1 1 1 JP(X < x) = 2

du = - + - tan- 1 x . - -00 7f(1 + u ) 2 7f

is finite if and only if la I < 1 . S. Writing <I> for the N(O, 1) distribution function, JP(Y ::: y) = JP(X ::: log y) = <I> (log y). Hence

fy (y) = .!. fx (log y) = _1_e -! (logy)2 , 0 < Y < 00 . y y../2ii 6. Integrating by parts,

LHS = i: g (x ) { (X - tL)�¢ (X : tL ) } dx

= - [g(x)a¢ e : tL) ] �oo + i: g' (x)a¢ e: tL) dx = RHS.


7. (a) r (x ) = a{3xfJ-1 . (b) r (x) = A .


Aae-)..x + J.t( 1 - a)e-/.LX (c) r (x) = ).. , which approaches mintA , J.t} as x � 00. ae- x + ( 1 - a)e-/.LX 8. Clearly q>' = -xifJ. Using this identity and integrating by parts repeatedly,

100 100 ifJ'(u) ifJ (x) 100 ifJ' (u) 1 - <I>(x) = ifJ (u) du = - -- du = - + -3- du x x u x x u

= ifJ (x) _ ifJ (x) _ roo 3ifJ' (u) du = ifJ (x) _ ifJ (x) + 3ifJ (x) _ roo 15ifJ (u) du o x x3 ix u5 x x3 x5 ix u6

4.5 Solutions. Dependence

1. (i) As the product of non-negative continuous functions, f is non- negative and continuous. Also

g (x) = �e- ix i e- :Zx Y dy = �e- ix i 100 1 1 2 2

-00 J27rx-2

if x I- 0, since the integrand is the N(O, x-2 ) density function. It is easily seen that g(O) = 0, so that g is discontinuous, while

(ii) Clearly f Q ::: ° and

100 g (x) dx = 100 �e- Ix l dx = 1 . -00 -00

100 100 00

-00 -00 fQ (x , y) dx dy = ?; ( �r · 1 = 1 .

Also f Q is the uniform limit of continuous functions on any subset of JR.2 of the form [ -M, M] x JR.; hence f Q is continuous. Hence f Q is a continuous density function. On the other hand

100 00 ( l )n fQ (x , y) dY = L 2" g (x - qn ) , -00 n=l

where g is discontinuous at 0. (iii) Take Q to be the set of the rationals, in some order. 2. We may assume that the centre of the rod is uniformly positioned in a square of size a x b, while the acute angle between the rod and a line of the first grid is uniform on [0, � rr] . If the latter angle is o then, with the aid of a diagram, one finds that there is no intersection if and only if the centre of the rod lies within a certain inner rectangle of size (a - r cos 0) x (b - r sin 0) . Hence the probability of an intersection is

2 101C/2 2r - {ab - (a - r cos O)(b - r sin O) } dO = -(a + b - � r) . rrab 0 rrab
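A direct simulation of this calculation is easy to write. The sketch below (Python; the function name and parameters are ours, not part of the text) throws the centre uniformly in an $a \times b$ cell and the acute angle uniformly on $[0, \pi/2]$, and compares the empirical intersection frequency with $2r(a + b - \frac12 r)/(\pi ab)$.

```python
import math
import random

def rod_meets_grid_probability(a, b, r, n, rng=random):
    """Monte Carlo estimate for Exercise (4.5.2): a rod of length r (< a, b)
    thrown onto a rectangular grid with spacings a and b.  The rod misses every
    line exactly when its centre falls in the inner rectangle described above."""
    hits = 0
    for _ in range(n):
        x, y = a * rng.random(), b * rng.random()      # centre within one cell
        theta = 0.5 * math.pi * rng.random()           # acute angle to the grid
        dx = 0.5 * r * math.cos(theta)                 # half-projections of the rod
        dy = 0.5 * r * math.sin(theta)
        if not (dx <= x <= a - dx and dy <= y <= b - dy):
            hits += 1
    return hits / n

a, b, r = 1.0, 1.0, 0.5
print(rod_meets_grid_probability(a, b, r, 200000),
      2 * r * (a + b - r / 2) / (math.pi * a * b))
```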

3. (i) Let I be the indicator of the event that the first needle intersects a line, and let J be the indicator that the second needle intersects a line. By the result of Exercise (4.5 .2), lE(l) = lE(J) = 2/rr ; hence Z = I + J satisfies lE(� Z) = 2/rr .


(ii) We have that

In the notation of (4.5 .8), if 0 < 8 < �rr , then two intersections occur if z < � min{sin 8 , cos 8 } or 1 - z < i min {sin 8 , cos 8 } . With a similar component when � rr :5 8 < rr , we find that

lE(l J) = lP(two intersections) = � II (z ,9) : dz d8

O<z < i min {sin 9,cos 9 }

O<9< �1I'

and hence

4 1011'/2 4 10

11'/4 4 (

1 ) = - � min{sin 8 , cos 8 } d8 = - sin 8 d8 = - 1 - M ' rr 0 rr 0 rr '\1 2

1 1 4 1 M 3 - ../2 4 var(z Z) = ; -rr2

+ ;(2 - '\12) = -rr- -

rr2·

(iii) For Buffon's needle, the variance of the number of intersections is (2/ rr) - (2/ rr)2 which exceeds var(i Z). You should therefore use Buffon's cross. 4. (i) Fu (u) = 1 - ( 1 - u)( 1 - u) if 0 < u < l , and so lE(U) = J6 2u ( 1 - u) du = t . (Alternatively, place three points independently at random on the circumference of a circle of circumference 1 . Measure the distances X and Y from the first point to the other two, along the circumference clockwise. Clearly X and Y are independent and uniform on [0, 1 ] . Hence by circular symmetry, lE(U) = lE(V - U) = lE(1 - V) = t .) (ii) Clearly UV = XY, so that lE(UV) = lE(X)lE(Y) = 1 . Hence

cov(U, V) = lE(UV) - lE(U)lE(V) = 1 - t ( 1 - t) = -lo, since lE(V) = 1 - lE(U) by 'symmetry' . 5. (i) If X and Y are independent then, by (4.5.6) and independence,

lE(g (X)h (Y» = II g (x)h (y)fx, Y (x , y) dx dy

= I g (x)fx (x) dx I h (y)fy (y) dy = lE(g (X» lE(h (Y» .

(ii) By independence

6. If 0 is the centre of the circle, take the radius OA as origin of coordinates. That is, A = ( 1 , 0), B = ( 1 , e), C = ( 1 , <1» , in polar coordinates, where we choose the labels in such a way that 0 :5 e :5 <1>. The pair e , <I> has joint density function f (8 ,tjJ) = (2rr2)- 1 for O < 8 < t/J < 2rr .

The three angles of ABC are � e, � (<I> - e), rr - � <1>. You should plot in the 81t/J plane the set of pairs (8 , t/J) such that 0 < 8 < t/J < 2rr and such that at least one of the three angles exceeds xrr.


Then integrate f over this region to obtain the result. The shape of the region depends on whether or not x < � . The density function g of the largest angle is given by differentiation:

{ 6(3x - 1) if l ::: x ::: � , g(x) = 6(I - x) if � ::: x ::: l .

The expectation is found to be H-rr. 7 . We have that JE(X') = j.t, and therefore JE(Xr - X) = O. Furthermore,

8. The condition is that JE(Y) var(X) + JE(X) var(Y) = O. 9. If X and Y are positive, then S positive entails T positive, which displays the dependence. Finally, S2 = X and T2 = Y.

4.6 Solutions. Conditional distributions and conditional expectation

1. The point is picked according to the uniform distribution on the surface of the unit sphere, which is to say that, for any suitable subset C of the surface, the probability the point lies in C is the surface integral fe (4rr)-1 dS. Changing to polar coordinates, x = cos O cos cp, y = sin O cos cp, z = sin cp, subject to x2 + y2 + z2 = I , this surface integral becomes (4rr)-1 Ie I cos cp l dO dcp, whence the joint density function of e and ell is

1 f(O, cp) = -I cos cp l , 4rr I cp l ::: �rr, 0 ::: 0 < 2rr.

The marginals are then fe (O) = (2rr)-1 , f4>(cp) = � I cos cp l , and the conditional density functions are

for appropriate 0 and cp. Thus e and ell are independent. The fact that the conditional density functions are different from each other is sometimes referred to as 'Borel's paradox' . 2. We have that

and therefore

t/f(x) = 100 y fx,Y (x , y) dy

-00 fx (x)

100 100 fx y(x y) JE (t/f(X)g (X») = y i ') g (x)fx (x) dx dy -00 -� x (x

= i: i: (y g(x)}fx, Y (x , y) dx dy = JE(Yg(X» .

3. Take Y to be a random variable with mean 00, say fy (y) = y -2 for i ::: y < 00, and let X = Y. Then JE(Y I X) = X which is (almost surely) finite.


4. (a) We have that


!x (x) = 100 'A2e-J..y dy = 'Ae-J..x ,

so that !Y IX (y I x ) = 'AeJ..{x-y) , for 0 .::: x .::: y < 00.

0 .::: x < 00,

(b) Similarly, !x (x) = fooo xe-x(y+l) dy = e-x ,

so that !Y lx (y I x ) = xe-xy , for 0 .::: y < 00.

0 ::: x < 00,

5. We have that fo 1 fo 1 (n) xa- 1 ( l - x)b- 1 IP'(Y = k) = IP'(Y = k I X = x)!x (x) dx = k x

k (l - x)n-k b dx o 0 B(a, )

= (n) B(a + k, n - k + b) . k B(a , b)

In the special case a = b = 1 , this yields

IP' Y = k) = (n) r(k + l)r (n - k + 1 ) =

_1 _ ( k r (n + 2) n + 1 '

whence Y is uniformly distributed. We have in general that

0 .::: k '::: n ,

fo 1 na lE(Y) = lE(Y I X = x)!x (x) dx = -- , o a + b

and, by a similar computation oflE(y2) , nab(a + b + n) var(Y) = (a + b)2 (a + b + 1 ) ·

6. By conditioning on Xl ,

Gn (x) = IP'(N > n) = foX Gn- 1 (x - u) du = fo

X Gn- 1 (v) dv .

Now Go (v) = 1 for all v E (0, 1 ] , and the result follows by induction. Now,

More generally,

00 lEN = L IP'(N > n) = eX .

n==O

G N (S) = L snlP'(N = n) = L sn x - ::...- = (s - l)eSX + 1 , 00 00 ( n- 1 n )

n=l n=l (n - I) ! n !


7. We may assume without loss of generality that lEX = lEY = O. By the Cauchy-Schwarz inequality,

Hence,

lE(Xy)2 lE(var(Y I X)) = lE(y2) - lE(lE(Y I X)2) < lEy2 - 2 = (1 - p2) var(Y) . - lE(X )

8. One way is to evaluate

Another way is to observe that min{Y, Z} is exponentially distributed with parameter /1- + v , whence JP(X < min{Y, Z}) = )../ ().. + /1- + v) . Similarly, JP(Y < Z) = /1-/(/1- + v) , and the product of these two terms is the required answer. 9. By integration, for x , y > 0,

fy(y) = loy f(x , y) dx = gcy3e-Y , fx (x) = 100 f(x , y) dy = cxe-x ,

whence c = 1 . It is simple to check the values of fXIY (x I y) = f(x , y)/ fy (y) and fylX (y I x), and then deduce by integration that lE(X I Y = y) = � Y and lE(Y I X = x) = x + 2. 10. We have that N > n if and only if Xo is largest of {Xo , Xl , . . . , Xn } , an event having probability 1/ (n + 1) . Therefore, JP(N = n) = 1/ {n (n + 1 ) } for n 2: 1 . Next, on the event {N = n} , Xn is the largest, whence

00 F(x)n+1 00 F(x)n+1 00 F(x)n+1 JP(XN ::::: x) = L: n (n + 1) = L: n - L: n + 1 + F(x) , n=l n=l n=l

as required. Finally,

1 1 JP(M = m) = JP(Xo 2: Xl 2: . . . 2: Xm- l ) - JP(Xo 2: Xl 2: . . . 2: Xm) = m ! - (m + I ) ! '

4.7 Solutions. Functions of random variables

1. We observe that, if 0 ::::: u ::::: 1,

JP(XY ::::: u) = JP(XY ::::: u , Y ::::: u) + JP(XY ::::: u , Y > u) = JP(Y ::::: u) + JP(X ::::: u/Y, Y > u) 1 1 u = u + - dy = u ( I - log u) .

u y

By the independence of XY and Z,

JP(XY ::::: u , Z2 ::::: v) = JP(XY ::::: u)JP(Z ::::: ../V) = u../V(1 - log u) , 0 < u , v < 1 .


Differentiate to obtain the joint density function

Hence

) 10g(1/u) g (u , v = 2....(V , 0 :::: u , v :::: 1 .

IP'(XY :::: Z2) = If

10;%U) du dv = a .

O:;:u:;:v:;: l Arguing more directly,

lP'(XY :::: Z2) = fff

dx dy dz = � . O:;:x.y . z:;: l

xy:;:z2

2. The transformation x = uv, y = u - uv has Jacobian

J = 1 1 v u

I = -u . - v -u

Hence I J I = l u i , and therefore fu. v (u , v) = ue-u , for 0 :::: u < 00, 0 :::: v :::: 1 . Hence U and V are independent, and fv (v) = I on [0, 1 ] as required. 3. Arguing directly,

lP'(sin X :::: y) = lP'(X :::: sin- 1 y) = � sin- 1 y , rr 0 :::: y :::: 1 ,

so that fy (y) = 2/ ( rr 0-?), for 0 :::: y :::: 1 . Alternatively, make a one-dimensional change of variables. 4. (a) lP'(sin- 1 X :::: y) = lP'(X :::: sin y) = sin y , for 0 :::: y :::: !rr . Hence fy (y) = cos y, for O :::: y :::: irr . (b) Similarly, lP'(sin- 1 X :::: y) = ! (1 + sin y ) , for -!rr :::: y :::: !rr , s o that fy (y) = ! cos y, for -!rr < y < !rr . 2 - - 2 5. Consider the mapping w = x, Z = (y - px)/v'1=IJ2 with inverse x = w, y = pw +zJl - p2 and Jacobian

J = ll

�I = �· p V 1 _ p2 Y 1 - p-

The mapping is one--one, and therefore W (= X) and Z satisfy

implying that W and Z are independent N(O, 1) variables. Now

{X > 0, Y > O} = { W > 0, Z > -W p / .J 1 - p2 } , and therefore, moving to polar coordinates,

ill" 00 1 1 2 �ll" 1 lP'(X > 0, Y > 0) = f 1 _e--,; r r dr dO = f - dO J9=a r=O 2rr Ja 2rr


where a = _ tan- 1 (p/v'l=f)2) = - sin- 1 p . 6. We confine ourselves to the more interesting case when p =1= 1 . Writing X = U, Y = pU + VI - p2V, we have that U and V are independent N(O, 1) variables. It is easy to check that Y > X if and only if (1 - p) U < VI - p2 V. Turning to polar coordinates,

JE:(max{X, Y}) = 1000 � [£Vt+7r {pr cos e + rH sin e } de + £�7r r cos e de] dr

where tan 1fr = .J (1 - p) / (1 + p). Some algebra yields the result. For the second part,

JE:(max{X, y}2) = JE:(X2I{x>Y}) + JE:(y2I{y>X} ) = JE:(X2I{x<y}) + JE:(y2 I{y<x}) ,

by the symmetry of the marginals of X and Y. Adding, we obtain 2JE:(max{X, y}2) = JE:(X2) + JE:(y2) = 2. 7. We have that

A lP'(X < Y, Z > z) = lP'(z < X < Y) = __ e-(A+/L)z = lP'(X < Y)lP'(Z > z) . A + JL A (a) lP'(X = Z) = lP'(X < Y) = -- . A + JL

(b) By conditioning on Y, A lP' ( X - Y)+ = 0) = lP'(X :s Y) = A + JL ' lP' ( X - Y)+ > w) = �e-AW for w > O. A + JL

By conditioning on X,

lP'(V > v) = lP'(IX - Y I > v) = 1000 lP'(Y > v + x)fx (x) dx + 100 lP'(Y < x - v)fx (x) dx

v > o.

(c) By conditioning on X, the required probability is found to be

8. Either make a change of variables and find the Jacobian, or argue directly. With the convention that V r2 - u2 = 0 when r2 - u2 < 0, we have that

F(r, x) = lP'(R :s r, X :s x) = - Vr2 - u2 du, 2 jX 7r -r

82F 2r f(r, x) = - = r-;---;:; ' 8r8x 7r V r2 - x2

9. As in the previous exercise,

Ix l < r < 1 .

3 jZ lP'(R :s r, Z :s z) = -4 7r(r2 - w2) dw . 7r -r


Hence f(r, z) = �r for I z l < r < 1 . This question may be solved in spherical polars also.

10. The transformation s = x + y, r = xl(x + y), has inverse x = rs, y = (1 - r)s and Jacobian J = s . Therefore,

fR (r) = (X) fR S (r, s) ds = [00 fx y (rs, ( 1 - r)s) s ds

Jo ' Jo ' O � r � 1 .

11. We have that

whence 1 fy (y) = 2fx (-v(aly) - 1 ) = rr Jy(a _ y) ' o � y � a .

12. Using the result of Example (4.6.7), and integrating by parts, we obtain

IP(X > a, Y > b) = 100

ifJ (x) { 1 - <I> ( n) } dx = [ 1 - <I> (a)] [ 1 - <I> (c)] + [00

[1 - <I> (x)]ifJ (�) h dx. Ja 1 - p2 1 - p2

Since [ 1 - <I> (x)]/ifJ (x) is decreasing, the last term on the right is no greater than

1 - <I> (a) [00 ifJ ( )ifJ ( b - px ) P d ifJ (a) Ja

x VI _ p2 VI _ p2 x ,

which yields the upper bound after an integration.

13. The random variable Y is symmetric and, for a > 0,

Ioa-I du la _v-2 dv IP(Y > a) = IP(O < X < a- I ) = 2 = 2 ' o rr(1 + u ) 00 rr(1 + v- ) by the transformation v = l/u . For another example, consider the density function

f(x) = 2 ' { lx-2 if x > 1 � if O � x � 1 .

14. The transformation w = x + y, z = x I (x + y) has inverse x = wz, y = (1 - z)w, and Jacobian J = w, whence

A(Awz)a- I e-J..wz A(A( 1 - z)w)fJ- 1 e-J.. ( l-z)w f(w, z) = W · r(a) · r ({3)

w > 0, 0 < z < 1 .

Hence W and Z are independent, and Z is beta distributed with parameters a and {3.


4.8 Solutions. Sums of random variables

1. By the convolution formula (4.8 .2), Z = X + Y has density function

fz(z) = rz A/Le-J...xe-/L(z-x) dx = � (e-J...z _ e-/Lz) , 10 /L - A

if A =j: /L. What happens if A = /L? (Z has a gamma distribution in this case.)

2. Using the convolution formula (4.8 .2), W = otX + {3Y has density function

z � 0,

100 1 1 fw(w) = . dx , -00 not ( 1 + (x/ot)2) n{3 ( 1 + {(w - x)/{3}2 )

which equals the limit of a complex integral:

lim { ot{3 . __ 1_ . 1 dz R-+oo 1 D n2 z2 + ot2 (z - w)2 + {32

where D is the semicircle in the upper complex plane with diameter [- R, R] on the real axis. Evalu­ating the residues at z = iot and z = w + i{3 yields

ot{32n i { 1 1 I I } fw(w) = -n-2- -2i-ot ' (iot - w)2 + {32 + -2i-{3 . -(w-+-j-{3')2'+-ot"2

1 1 = n(ot + {3) . 1 + {w/(ot + {3) }2 ' -00 < w < 00

after some manipulation. Hence W has a Cauchy distribution also.

3. Using the convolution formula (4.8.2),

rz 1 -z 1 2 -z fz(z) = 10 2"ze dy = 2"Z e ,

4. Let fn be the density function of Sn . By convolution,

This leads to the guess that

z � o.

n � 2,

which may be proved by induction as follows. Assume that (*) holds for n ::: N. Then


for some constant A. We integrate over x to find that

N N+l A A I = E IT -s- + - ,

r=1 s=1 As - Ar AN+l si'r

and (*) follows with n = N + 1 on solving for A. 5. The density function of X + Y is , by convolution,

h(x) =

Therefore, for 1 � x � 2,

{ X if O � x � l ,

2 - x if 1 � x � 2.


101

11

Iox- 1

!3(x) = h(x - y) dy = (x - y) dy + (2 - x + y) dY = i - (x _ � )2 . o x- I 0

Likewise,

A simple induction yields the last part.

6. The covariance satisfies cov(U, V) = JE(X2 - y2) = 0, as required. If X and Y are symmetric random variables taking values ±1 , then

JP>(U = 2, V = 2) = ° but JP>(U = 2)JP>(V = 2) > 0.

If X and Y are independent N(O, 1 ) variables, fu. v (u , v) = (4rr)- l e- ! (u2+v2) , which factorizes as a function of u multiplied by a function of v.

7. From the representation X = apU + a VI - p2 V, Y = r:U, where U and V are independent N(O, 1 ) , we learn that

Similarly,

apy JE(X I Y = y) = JE(apU I U = y/r:) = - .

r:

whence var(X I Y) = 0'2( 1 - p2) . For parts (c) and (d), simply calculate that cov(X, X + Y) = 0'2 + pa r:, var(X + Y) = 0'2 + 2pa r: + r:2 , and

8. First recall that JP>(I X I � y) = 2<f>(y) - 1 . We shall use the fact that U = (X + Y)/..[2, V = (X - Y)/..[2 are independent and N(O, 1) distributed. Let 8 be the triangle of lR2 with vertices (0, 0) , (0, Z), (Z, 0) . Then

JP>(Z � z I X > 0, Y > 0) = 4JP>( X, Y) E 8) = JP> ( l U I � z/...fi, I V I � z/...fi) by symmetry

= 2{2<f> (z/...fi) _ 1 }2 ,


whence the conditional density function is

f(z) = 2�{2<f> (Z/�) - 1 }cf> (Z/�) .

Finally,

1&(Z I X > 0, Y > 0) = 21&(X I X > 0, Y > 0)

4.9 Solutions. Multivariate normal distribution

1. Since V is symmetric, there exists a non-singular matrix M such that M' = M-l and V = MAM-1 , where A is the diagonal matrix with diagonal entries the eigenvalues A 1 , A2 , • . . , An of V.

1 1 Let A! be the diagonal matrix with diagonal entries $t, ..;>:2, . . . , $n; A! is well defined since

1 V is non-negative definite. Writing W = MA! M', we have that W = W' and also

1 as required. Clearly W is non-singular if and only if A ! is non-singular. This happens if and only if Ai > ° for all i , which is to say that V is positive definite.

2. By Theorem (4.9.6), Y has the multivariate normal distribution with mean vector 0 and covariance matrix

3. Clearly Y = (X - p.)a' + p.a' where a = (al ' a2 , . . . , an ) . Using Theorem (4.9.6) as in the previous solution, (X - p. )a' is univariate normal with mean 0 and variance aVa' . Hence Y is normal with mean p.a' and variance a Va' .

4. Make the transformation u = x + y, v = x - y, with inverse x = i (u + v) , y = i (u - v) , so

that I J I = i . The exponent of the bivariate normal density function is

and therefore U = X + Y, V = X - Y have joint density

1 { u2 v2 } f(u , v) =

4n VI _ p2 exp - 4(1 + p) - 4(1 _ p)

,

whence U and V are independent with respective distributions N(O, 2(1 + p» and N(O, 2(1 - p» .

5. That Y is N(O, 1) follows by showing that JP'(Y � y) = JP'(X � y) for each of the cases y � -a, I y l < a, y 2: a.

Secondly,

p(a) = 1&(XY) = x2cf> (X) dx - x2cf> (X) dx - x2cf> (X) dx = 1 - 4 x2cf> (x) dx . ja j-a

100

100

-a -00

a a


The answer to the final part is no; X and Y are N(O, 1 ) variables, but the pair (X, Y) is not bivariate normal. One way of seeing this is as follows. There exists a root a of the equation p(a) = O. With this value of a, if the pair X, Y is bivariate normal, then X and Y are independent. This conclusion is manifestly false: in particular, we have that JP'(X > a , Y > a) ¥- JP'(X > a)JP'(Y > a). 6. Recall from Exercise (4.8.7) that for any pair of centred normal random variables

JE(X I Y) = cov(X, Y) Y, var(X I Y) = { I - p (X, y)2} var X.

var Y

The first claim follows immediately. Likewise,

7. As in the above exercise, we calculate $a = \mathbb{E}(X_1 \mid \sum_i X_i)$ and $b = \operatorname{var}(X_1 \mid \sum_i X_i)$ using the facts that $\operatorname{var} X_1 = v_{11}$, $\operatorname{var}(\sum_i X_i) = \sum_{ij} v_{ij}$, and $\operatorname{cov}(X_1, \sum_i X_i) = \sum_r v_{1r}$.
8. Let $p = \mathbb{P}(X > 0, Y > 0, Z > 0) = \mathbb{P}(X < 0, Y < 0, Z < 0)$. Then
$$1 - p = \mathbb{P}(\{X > 0\} \cup \{Y > 0\} \cup \{Z > 0\}) = \mathbb{P}(X > 0) + \mathbb{P}(Y > 0) + \mathbb{P}(Z > 0) + p - \mathbb{P}(X > 0, Y > 0) - \mathbb{P}(Y > 0, Z > 0) - \mathbb{P}(X > 0, Z > 0)$$
$$= \tfrac32 + p - \Bigl[\tfrac34 + \frac{1}{2\pi}\bigl(\sin^{-1}\rho_1 + \sin^{-1}\rho_2 + \sin^{-1}\rho_3\bigr)\Bigr].$$
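Solving the display for $p$ gives $p = \frac18 + \frac{1}{4\pi}(\sin^{-1}\rho_1 + \sin^{-1}\rho_2 + \sin^{-1}\rho_3)$, which is easy to check by simulation. The sketch below (Python; the construction of the trivariate normal follows the representation used in the next solution and assumes a positive definite correlation matrix; all names are ours) compares the empirical orthant frequency with this formula.

```python
import math
import random

def orthant_probability_check(rho1, rho2, rho3, n=200000, rng=random):
    """Monte Carlo check of Problem (4.9.8).  Here rho1 = corr(X,Y),
    rho2 = corr(Y,Z), rho3 = corr(X,Z); X, Y, Z are built from independent
    N(0,1) variables U, V, W as in Problem (4.9.9)."""
    a = (rho2 - rho1 * rho3) / math.sqrt(1.0 - rho1 ** 2)
    b = math.sqrt(1.0 - rho3 ** 2 - a ** 2)     # requires positive definiteness
    hits = 0
    for _ in range(n):
        u, v, w = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        x = u
        y = rho1 * u + math.sqrt(1.0 - rho1 ** 2) * v
        z = rho3 * u + a * v + b * w
        if x > 0 and y > 0 and z > 0:
            hits += 1
    exact = 0.125 + (math.asin(rho1) + math.asin(rho2) + math.asin(rho3)) / (4 * math.pi)
    return hits / n, exact

print(orthant_probability_check(0.5, 0.3, 0.2))   # the two values should agree
```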

9. Let U, V, W be independent N(O, 1 ) variables, and represent X, Y, Z as X = U, Y = PI U +

J1 - p?V,

Z = P3U + P2 - PIP3 V + J1 - p?

1 2 2 2 2 - PI - P2 - P3 + PIP2P3 2 W.

( 1 - PI )

We have that U = X, V = (Y - PI X) / J 1 - p? and JE(Z I X, Y) follows immediately, as does the conditional variance.

4.10 Solutions. Distributions arising from the normal distribution

1. First method. We have from (4.4.6) that the x2(m) density function is

1 1 1 1 � (x) = ___ 2-m/2x 'lm- e- 'lx Jm r (m/2) , x 2: O.

The density function of Z = X I + X 2 is, by the convolution formula,

g (z) = c 10Z Am- le- !x (z - x) !n- le- ! (z-x) dx

= cz � (m+n)- l e- !Z 101 u !m- l ( 1 _ u) !n- l du


by the substitution u = xlz , where c is a constant. Hence g(z) = c'z ! (m+n)- l e- !z for z ;:: 0, for an appropriate constant c', as required. Second method. If m and n are integral, the following argument is neat. Let Zl , Z2 , . . . , Zm+n be independent N(O, 1 ) variables. Then Xl has the same distribution as Zi + Z� + . . . + Z; , and X2 the same distribution as Z;+ 1 + Z;+2 + . . . + Z;+n (see Problem (4. 14. 12» . Hence Xl + X2 has the same distribution as Zi + . . . + Z;+n ' i.e., the x2 (m + n) distribution. 2. (i) The t (r) distribution is symmetric with finite mean, and hence this mean is 0. (ii) Here is one way. Let U and V be independent x 2 (r) and X 2 (s ) variables (respectively) . Then

lE ( Ulr ) = �lE(U)lE(V- l ) Vis r

by independence. Now U is r (- � , !r) and V is r (-� , is) , so that lE(U) = r and

if s > 2, since the integrand is a density function. Hence

( Ulr ) s lE Vis = s - 2

(iii) If s :::: 2 then lE(V-l ) = 00.

3. Substitute r = 1 into the t (r) density function.

if s > 2.

4. First method. Find the density function of X I Y, using a change of variables. The answer is F(2, 2) . Second method. X and Y are independent X2 (2) variables (just check the density functions), and hence XIY is F(2, 2) . 5. The vector (X, Xl - X, X2 - X, . . . , Xn - X) has, by Theorem (4.9.6), a multivariate normal distribution. We have as in Exercise (4.5.7) that cov(X, Xr - X) = ° for all r, which implies that X is independent of each Xr . Using the form of the multivariate normal density function, it follows that X is independent of the family {Xr - X : 1 :::: r :::: n} , and hence of any function of these variables. Now S2 = (n - 1 )- 1 Er (Xr - X)2 is such a function. 6. The choice of fixed vector is immaterial, since the joint distribution of the Xj is spherically symmetric, and we therefore take this vector to be (0, 0, . . . , 0, 1 ) . We make the change of variables U2 = Q2 + X�, tan IJI = QI Xn , where Q2 = E�:t X; and Q ;:: 0. Since Q has the x2 (n - 1) distribution, and is independent of Xn , the pair Q, Xn has joint density function

X E JR, q > 0.

The theory is now slightly easier than the practice. We solve for U, IJI, find the Jacobian, and deduce the joint density function !U,IV (U , 1/1) of u, IJI. We now integrate over u, and choose the constant so that the total integral is 1 .


4.11 Solutions. Sampling from a distribution

1. Uniform on the set { I , 2, . . . , n } .

2. The result holds trivially when $n = 2$, and more generally by induction on $n$.
3. We may assume without loss of generality that $\lambda = 1$ (since $Z/\lambda$ is $\Gamma(\lambda, t)$ if $Z$ is $\Gamma(1, t)$). Let $U$, $V$ be independent random variables which are uniformly distributed on $[0, 1]$. We set $X = -t\log V$ and note that $X$ has the exponential distribution with parameter $1/t$. It is easy to check that
$$\frac{x^{t-1}e^{-x}}{\Gamma(t)} \le c\cdot\frac{1}{t}e^{-x/t} \qquad\text{for } x > 0,$$
where $c = t^t e^{-t+1}/\Gamma(t)$. Also, conditional on the event $A$ that
$$U \le \frac{X^{t-1}e^{-X}}{\Gamma(t)}\Big/\Bigl(c\cdot\frac{1}{t}e^{-X/t}\Bigr),$$
$X$ has the required gamma distribution. This observation may be used as a basis for sampling using the rejection method. We note that $A = \{\log U \le (t-1)(\log(X/t) - (X/t) + 1)\}$. We have that $\mathbb{P}(A) = 1/c$, and therefore there is a mean number $c$ of attempts before a sample of size 1 is obtained.
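As a sketch of the resulting algorithm (Python; the names are ours, and we assume $t \ge 1$ so that the exponential envelope dominates the gamma density):

```python
import math
import random

def sample_gamma(t, lam=1.0, rng=random):
    """Rejection sampling for a Gamma(lam, t) variable with shape t >= 1,
    following Exercise (4.11.3): propose X = -t*log(V), exponential with mean t,
    and accept when log U <= (t - 1)*(log(X/t) - X/t + 1).  The mean number of
    proposals per accepted sample is c = t**t * exp(1 - t) / Gamma(t)."""
    while True:
        u, v = rng.random(), rng.random()
        x = -t * math.log(1.0 - v)              # exponential with mean t
        if x <= 0.0:
            continue                            # guard against the measure-zero x = 0
        if math.log(1.0 - u) <= (t - 1.0) * (math.log(x / t) - x / t + 1.0):
            return x / lam                      # Gamma(1, t)/lam is Gamma(lam, t)

samples = [sample_gamma(2.5) for _ in range(20000)]
print(sum(samples) / len(samples))              # approximately t = 2.5
```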

4. Use your answer to Exercise (4.11.3) to sample $X$ from $\Gamma(1, \alpha)$ and $Y$ from $\Gamma(1, \beta)$. By Exercise (4.7.14), $Z = X/(X + Y)$ has the required distribution.
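A sketch of this recipe in Python (we use the standard library's `gammavariate` for the two gamma draws; the rejection sampler above would serve equally well for shape parameters at least 1):

```python
import random

def sample_beta(a, b, rng=random):
    """Beta(a, b) via the ratio X/(X + Y) of independent Gamma(1, a) and
    Gamma(1, b) variables, as in Exercise (4.11.4) with Exercise (4.7.14)."""
    x = rng.gammavariate(a, 1.0)
    y = rng.gammavariate(b, 1.0)
    return x / (x + y)

samples = [sample_beta(2.0, 3.0) for _ in range(20000)]
print(sum(samples) / len(samples))   # approximately a/(a+b) = 0.4
```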

5. (a) This is the beta distribution with parameters 2, 2. Use the result of Exercise (4). (b) The required r ( 1 , 2) variables may be more easily obtained and used by forming X = - log( U I U2) and Y - log(U3 U4) where {Ui : 1 ::: i ::: 4} are independent and uniform on [0, 1 ] . (c) Let UI , U2 , U3 be as in (b) above, and let Z be the second order statistic U(2) . That is, Z is the middle of the three values taken by the Uj ; see Problem (4. 14.21 ). The random variable Z has the required distribution. (d) As a slight variant, take Z = max{UI , U2} conditional on the event {Z ::: U3 } . (e) Finally, let X = ../Ui/(../Ui + ../U2), Y = ../Ui + ../Ui.. The distribution of X, conditional on the event {Y ::: I }, is as required.

6. We use induction. The result is obvious when $n = 2$. Let $n \ge 3$ and let $\mathbf{p} = (p_1, p_2, \dots, p_n)$ be a probability vector. Since $\mathbf{p}$ sums to 1, its minimum entry $p_{(1)}$ and maximum entry $p_{(n)}$ must satisfy
$$p_{(1)} \le \frac{1}{n} < \frac{1}{n-1}, \qquad p_{(1)} + p_{(n)} \ge p_{(1)} + \frac{1 - p_{(1)}}{n-1} = \frac{1 + (n-2)p_{(1)}}{n-1} \ge \frac{1}{n-1}.$$
We relabel the entries of the vector $\mathbf{p}$ such that $p_1 = p_{(1)}$ and $p_2 = p_{(n)}$, and set $\mathbf{v}_1 = \bigl((n-1)p_1,\ 1 - (n-1)p_1,\ 0, \dots, 0\bigr)$. Then
$$\mathbf{p} = \frac{1}{n-1}\mathbf{v}_1 + \frac{n-2}{n-1}\mathbf{p}_{n-1}, \qquad\text{where } \mathbf{p}_{n-1} = \frac{n-1}{n-2}\Bigl(0,\ p_1 + p_2 - \frac{1}{n-1},\ p_3, \dots, p_n\Bigr)$$
is a probability vector with at most $n-1$ non-zero entries. The induction step is complete.
It is a consequence that sampling from a discrete distribution may be achieved by sampling from a collection of Bernoulli random variables.
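The decomposition above is, in effect, the 'alias method': a distribution on $n$ points is written as a uniform mixture of two-point distributions, so that a sample requires one uniform choice of component and one Bernoulli trial. The Python sketch below is our own packaging of that idea (the standard table-based implementation), not taken from the text.

```python
import random

def build_alias_table(p):
    """Preprocess a probability vector p into (prob, alias) tables: column i
    holds mass prob[i] for outcome i and mass 1 - prob[i] for outcome alias[i]."""
    n = len(p)
    scaled = [n * q for q in p]
    small = [i for i, q in enumerate(scaled) if q < 1.0]
    large = [i for i, q in enumerate(scaled) if q >= 1.0]
    prob, alias = [1.0] * n, list(range(n))
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    return prob, alias

def alias_sample(prob, alias, rng=random):
    """Choose a column uniformly, then perform a single Bernoulli trial."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]

prob, alias = build_alias_table([0.5, 0.3, 0.2])
draws = [alias_sample(prob, alias) for _ in range(30000)]
print([draws.count(k) / len(draws) for k in range(3)])   # near [0.5, 0.3, 0.2]
```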

7. It is an elementary exercise to show that $\mathbb{P}(R^2 \le 1) = \frac14\pi$, and that, conditional on this event, the vector $(T_1, T_2)$ is uniformly distributed on the unit disk. Assume henceforth that $R^2 \le 1$, and write $(R, \Theta)$ for the point $(T_1, T_2)$ expressed in polar coordinates. We have that $R$ and $\Theta$ are independent with joint density function $f_{R,\Theta}(r, \theta) = r/\pi$, $0 \le r \le 1$, $0 \le \theta < 2\pi$. Let $(Q, \Psi)$ be the polar


coordinates of $(X, Y)$, and note that $\Psi = \Theta$ and $e^{-\frac12 Q^2} = R^2$. The random variables $Q$ and $\Psi$ are independent, and, by a change of variables, $Q$ has density function $f_Q(q) = q e^{-\frac12 q^2}$, $q > 0$. We recognize the distribution of $(Q, \Psi)$ as that of the polar coordinates of $(X, Y)$ where $X$ and $Y$ are independent N(0, 1) variables. [Alternatively, the last step may be achieved by a two-dimensional change of variables.]
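In algorithmic form this is Marsaglia's polar method. A minimal Python sketch (our own phrasing of the steps above) is:

```python
import math
import random

def polar_normal_pair(rng=random):
    """Exercise (4.11.7) as an algorithm: sample (T1, T2) uniformly on the
    square [-1, 1]^2, reject unless it lies in the unit disk, then stretch the
    radius R to Q = sqrt(-2*log(R^2)) so that (X, Y) are independent N(0,1)."""
    while True:
        t1 = 2.0 * rng.random() - 1.0
        t2 = 2.0 * rng.random() - 1.0
        r2 = t1 * t1 + t2 * t2
        if 0.0 < r2 <= 1.0:
            factor = math.sqrt(-2.0 * math.log(r2)) / math.sqrt(r2)   # Q / R
            return t1 * factor, t2 * factor

pairs = [polar_normal_pair() for _ in range(20000)]
print(sum(x for x, _ in pairs) / len(pairs),            # near 0
      sum(x * x for x, _ in pairs) / len(pairs))        # near 1
```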

8. We have that

9. The polar coordinates (R, 8) of (X, Y) have joint density function

2r fR 9 (r, 0) = - , . 7r

Make a change of variables to find that Y / X = tan 8 has the Cauchy distribution.

10. By the definition of $Z$,
$$\mathbb{P}(Z = m) = h(m)\prod_{r < m}\bigl(1 - h(r)\bigr) = \mathbb{P}(X > 0)\,\mathbb{P}(X > 1 \mid X > 0)\cdots\mathbb{P}(X = m \mid X > m-1) = \mathbb{P}(X = m).$$
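One may read this as a sequential sampling scheme: perform independent Bernoulli trials with success probabilities $h(1), h(2), \dots$ and let $Z$ be the index of the first success. A Python sketch (our own wrapper; it assumes a distribution on $\{1, 2, \dots\}$ whose mass function sums to 1) is:

```python
import random

def sample_by_hazard(pmf, rng=random):
    """Sequential sampling in the spirit of Exercise (4.11.10): test Bernoulli
    trials with success probabilities h(m) = P(X = m | X > m - 1) in turn and
    return the index of the first success."""
    tail = 1.0                              # running value of P(X > m - 1)
    m = 1
    while True:
        h = min(pmf(m) / tail, 1.0)         # hazard; min guards round-off drift
        if rng.random() < h:
            return m
        tail -= pmf(m)
        m += 1

p = 0.3
geometric_pmf = lambda m: p * (1 - p) ** (m - 1)
draws = [sample_by_hazard(geometric_pmf) for _ in range(20000)]
print(sum(draws) / len(draws))              # approximately 1/p = 3.33
```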

11. Suppose $g$ is increasing, so that $h(x) = -g(1-x)$ is increasing also. By the FKG inequality of Problem (3.11.18b), $\kappa = \operatorname{cov}(g(U), -g(1-U)) \ge 0$, yielding the result.
Estimating $I$ by the average $(2n)^{-1}\sum_{r=1}^{2n} g(U_r)$ of $2n$ random vectors $U_r$ requires a sample of size $2n$ and yields an estimate with variance $\sigma^2/(2n)$. If we estimate $I$ by the average $(2n)^{-1}\sum_{r=1}^{n}\{g(U_r) + g(1 - U_r)\}$, we require a sample of size only $n$, and we obtain an estimate with the smaller variance $(\sigma^2 - \kappa)/(2n)$.
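A short numerical illustration of the variance reduction (Python; the increasing function $g$ here is an arbitrary choice of ours):

```python
import random

def plain_estimate(g, two_n, rng=random):
    """Crude Monte Carlo estimate of I = integral_0^1 g(u) du from 2n uniforms."""
    return sum(g(rng.random()) for _ in range(two_n)) / two_n

def antithetic_estimate(g, n, rng=random):
    """Antithetic estimate of Exercise (4.11.11): pair each U_r with 1 - U_r."""
    total = 0.0
    for _ in range(n):
        u = rng.random()
        total += g(u) + g(1.0 - u)
    return total / (2 * n)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

g = lambda u: u ** 3                        # increasing, so the pairing helps
plain = [plain_estimate(g, 200) for _ in range(2000)]
anti = [antithetic_estimate(g, 100) for _ in range(2000)]
print(variance(plain), variance(anti))      # the antithetic variance is smaller
```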

12. (a) By the law of the unconscious statistician,
$$\mathbb{E}\biggl[\frac{g(Y) f_X(Y)}{f_Y(Y)}\biggr] = \int \frac{g(y) f_X(y)}{f_Y(y)}\,f_Y(y)\,dy = I.$$
(b) This is immediate from the fact that the variance of a sum of independent variables is the sum of their variances; see Theorem (3.3.11b).
(c) This is an application of the strong law of large numbers, Theorem (7.5.1).
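As a sketch of the estimator in code (Python; the densities, the proposal sampler, and the test function in the example are ours, chosen purely for illustration):

```python
import math
import random

def importance_estimate(g, f_target, f_proposal, proposal_sampler, n, rng=random):
    """Estimate I = E[g(X)], X having density f_target, from n samples of the
    proposal density, as in Exercise (4.11.12): average g(Y)*f_target(Y)/f_proposal(Y)."""
    total = 0.0
    for _ in range(n):
        y = proposal_sampler(rng)
        total += g(y) * f_target(y) / f_proposal(y)
    return total / n

# Example: E[X^2] for X ~ Exp(1) (exact value 2), using an Exp(1/2) proposal.
f_target = lambda x: math.exp(-x)
f_proposal = lambda x: 0.5 * math.exp(-0.5 * x)
proposal_sampler = lambda rng: -2.0 * math.log(1.0 - rng.random())
print(importance_estimate(lambda x: x * x, f_target, f_proposal, proposal_sampler, 50000))
```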

13. (a) If $U$ is uniform on $[0, 1]$, then $X = \sin(\frac12\pi U)$ has the required distribution. This is an example of the inverse transform method.
(b) If $U$ is uniform on $[0, 1]$, then $1 - U^2$ has density function $g(x) = \{2\sqrt{1-x}\}^{-1}$, $0 \le x \le 1$. Now $g(x) \ge (\pi/4) f(x)$, which fact may be used as a basis for the rejection method.
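For part (a), a one-line sampler suffices. The sketch below (Python) also checks the sample mean against $2/\pi$ for the density $f(x) = 2/(\pi\sqrt{1 - x^2})$ on $[0, 1]$; taking this as the target is an assumption on our part, inferred from the solution, since the exercise statement is not reproduced here.

```python
import math
import random

def sample_arcsine_type(rng=random):
    """Inverse transform of Exercise (4.11.13a): X = sin(pi*U/2), which has
    density f(x) = 2/(pi*sqrt(1 - x^2)) on [0, 1]."""
    return math.sin(0.5 * math.pi * rng.random())

draws = [sample_arcsine_type() for _ in range(50000)]
print(sum(draws) / len(draws), 2 / math.pi)   # both approximately 0.6366
```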

4.12 Solutions. Coupling and Poisson approximation

1. Suppose that JE(u (X)) � JE(u(Y)) for all increasing functions u . Let C E IR and set u = Ie where

( ) _ { I if x > c,

Ie x - o if x � c,


to find that lP(X > c) = lE(Ic(X)) � lE(Ic(Y)) = lP(Y > c) . Conversely, suppose that X �st Y. We may assume by Theorem (4. 12.3) that X and Y are

defined on the same sample space, and that lP(X � Y) = 1 . Let u be an increasing function. Then lP(u (X) � u (Y)) � lP(X � Y) = 1 , whence lE(u (X) - u (Y)) � 0 whenever this expectation exists .

2. Let a = 11-/>" , and let {lr : r � I } be independent Bernoulli random variables with parameter a.

Then Z = 2:;=1 Ir has the Poisson distribution with parameter >..a = 11-, and Z .::: X. 3. Use the argument in the solution to Problem (2.7. 13).

4. For any A � JR,

lP(X #- Y) � lP(X E A, Y E AC) = lP(X E A) - lP(X E A, Y E A) � lP(X E A) - lP(Y E A),

and similarly with X and Y interchanged. Hence,

lP(X #- Y) � sup I lP(X E A) - lP(Y E A) I = 1dTV (X, Y) . A�1R

5. For any positive $x$ and $y$, we have that $(y - x)^+ + x \wedge y = y$, where $x \wedge y = \min\{x, y\}$. It follows that
$$\sum_k \{f_X(k) - f_Y(k)\}^+ = \sum_k \{f_Y(k) - f_X(k)\}^+ = 1 - \sum_k f_X(k) \wedge f_Y(k),$$
and by the definition of $d_{\mathrm{TV}}(X, Y)$ that the common value in this display equals $\frac12 d_{\mathrm{TV}}(X, Y) = \delta$. Let $U$ be a Bernoulli variable with parameter $1 - \delta$, and let $V$, $W$, $Z$ be independent integer-valued variables with
$$\mathbb{P}(V = k) = \{f_X(k) - f_Y(k)\}^+/\delta, \quad \mathbb{P}(W = k) = \{f_Y(k) - f_X(k)\}^+/\delta, \quad \mathbb{P}(Z = k) = \{f_X(k) \wedge f_Y(k)\}/(1 - \delta).$$
Then $X' = UZ + (1 - U)V$ and $Y' = UZ + (1 - U)W$ have the required marginals, and $\mathbb{P}(X' = Y') = \mathbb{P}(U = 1) = 1 - \delta$. See also Problem (7.11.16d).
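The construction is easy to carry out numerically. The following Python sketch (the function names are ours) builds one draw of the pair $(X', Y')$ from two given mass functions and verifies that the agreement probability is $1 - \delta$.

```python
import random

def draw_weighted(weights, rng):
    """Sample a key from a dict of non-negative weights (normalised internally)."""
    u = rng.random() * sum(weights.values())
    acc = 0.0
    for k, w in weights.items():
        acc += w
        if u <= acc:
            return k
    return k                                   # guard against round-off

def maximal_coupling(fx, fy, rng=random):
    """One draw (X', Y') of the coupling in Exercise (4.12.5): the marginals are
    fx and fy, and P(X' = Y') = 1 - delta with delta = 1 - sum_k min(fx, fy)."""
    keys = set(fx) | set(fy)
    overlap = {k: min(fx.get(k, 0.0), fy.get(k, 0.0)) for k in keys}
    delta = 1.0 - sum(overlap.values())
    if rng.random() < 1.0 - delta:             # the event {U = 1}: use Z twice
        z = draw_weighted(overlap, rng)
        return z, z
    v = draw_weighted({k: fx.get(k, 0.0) - overlap[k] for k in keys}, rng)
    w = draw_weighted({k: fy.get(k, 0.0) - overlap[k] for k in keys}, rng)
    return v, w

fx = {0: 0.5, 1: 0.5}
fy = {0: 0.25, 1: 0.25, 2: 0.5}
pairs = [maximal_coupling(fx, fy) for _ in range(50000)]
print(sum(x == y for x, y in pairs) / len(pairs))   # approximately 1 - delta = 0.5
```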

6. Evidently dTV (X, Y) = Ip - q l , and we may assume without loss of generality that p � q. We have from Exercise (4. 12.4) that lP(X = Y) .::: 1 - (p - q) . Let U and Z be independent Bernoulli variables with respective parameters 1 -p+q andq/ ( 1 -p+q) . The pair X' = U(Z- I)+ 1 , Y' = UZ has the same marginal distributions as the pair X, Y, and lP(X' = Y') = lP(U = 1 ) = 1 - P + q as required.

To achieve the minimum, we set X" = 1 - X' and Y" = Y', so that lP(X" = Y") = 1 - lP(X' = Y') = P - q.

4.13 Solutions. Geometrical probability

1. The angular coordinates \II and 1: of A and B have joint density f (1/1, 0") = (27Z' ) -2 . We make the change of variables from (p, () ) 1-+ (1/1, 0") by p = cos{ 1 (0" - 1/1 )} , () = 1 (7Z' + 0" + 1/1), with inverse

I I ()

I + - I 1/1 = () - '!7Z' - cos- p, 0" = - '!7Z' cos p,

and Jacobian $|J| = 2/\sqrt{1 - p^2}$.


2. Let A be the left shaded region and B the right shaded region in the figure. Writing >.. for the random line, by Example (4. 1 3 .2),

lP'(>.. meets both Sl and S2) = lP'(>.. meets both A and B) = lP'(>.. meets A) + lP'(>.. meets B) - lP'(>.. meets either A or B) ex b(A) + b(B) - b(H) = b(X) - b(H),

whence lP'(>.. meets S2 I >.. meets Sl ) = [b(X) - b(H)]/b(Sl ) . The case when S2 � Sl is treated in Example (4. 1 3 .2). When Sl n S2 =I- 0 and Sl /::,. S2 =I- 0,

the argument above shows the answer to be [b(Sl ) + b(S2) - b(H)]fb(Sd. 3. With I I I the length of the intercept I of >" 1 with S2, we have that lP'(>"2 meets I) = 2 1 / 1 /b(Sl ) , by the Buffon needle calculation (4. 1 3.2). The required probability is

1 r21r 100 2 1 1 1 dp dB r21f I S2 1 21l' I S2 1 2 Jo -00 b(Sl ) . b(Sl )

= Jo b(Sl )2

dB = b(Sl )2 ·

4. If the two points are denoted P = (Xl , Yl ) , and Q = (X2 , Y2) , then

We use Crofton's method in order to calculate 1&(Z). Consider a disc D of radius x surrounded by an annulus A of width h. We set A(x) = 1&(Z I P, Q E D), and find that

Now

whence

>.. (x + h) = >.. (x) ( 1 - � - O(h)) + 21&(Z I P E D, Q E A) c: + O(h)) .

2 1o !1f 102x cos O 32x 1&(Z I P E D, Q E A) = -2 r2 dr dO + 0(1) = -, 1l'X 0 0 91l' d>" 4>.. 128 - = -- +-, dx x 91l'

which is easily integrated subject to >"(0) = 0 to give the result. S. (i) We may assume without loss of generality that the sphere has radius 1 . The length X = IAOI has density function f (x) = 3x2 for 0 :::: x :::: 1 . The triangle includes an obtuse angle if B lies either in the hemisphere opposite to A, or in the sphere with centre ! X and radius ! X, or in the segment cut off by the plane through A perpendicular to AO. Hence,

lP'(obtuse) = i + 1&( iX)3 ) + (11l')- 11& (I: 1l'(I - y2) dY) = i + rt; + (1)- 11&(� - X + 1X3) = i ·

(ii) In the case of the circle, X has density function 2x for 0 :::: x :::: I , and similar calculations yield 1 1 1 r;--;;;; 3 lP'(obtuse) = 2 + "8 + ;1&(cos-

l X - Xv 1 - X2) = 4 ·

6. Choose the x-axis along AB. With P = (X, Y) and G = (Yl , Y2),

1&IABPI = i IAB I 1&(Y) = ! IAB IY2 = IABGI ·


7. We use Exercise (4. 13 .6). First fix P, and then Q, to find that


With b = IAB I and h the height of the triangle ABC on the base AB, we have that IGI G2 1 = 1b and

the height of the triangle AGI G2 is � h . Hence,

8. Let the scale factor for the random triangle be X, where X E (0, 1 ) . For a triangle with scale factor x, any given vertex can lie anywhere in a certain triangle having area (1 - x)2 IABCj . Picking one at random from all possible such triangles amounts to supposing that X has density function f(x) = 3(1 - x)2, 0 ::::: x ::::: 1 . Hence the mean area is

9. We have by conditioning that, for 0 ::::: z ::::: a ,

( a ) 2 2a da F(z , a + da) = F(z , a) --

d- + IP'(X ::: a - z) . 2 + o(da)

a + a (a + da)

( 2 da ) z 2da = F(z, a) 1 - - + - . - + o(da), a a a

and the equation follows by taking the limit as da ..t- O. The boundary condition may be taken to be F (a, a) = 1 , and we deduce that

2z ( Z ) 2 F(z, a) = -;; - -;; , 0 ::::: z ::::: a .

Likewise, by use of conditional expectation,

Now, lE( (a -Xn = ar / (r + 1) , yielding the required equation. The boundary condition is mr (0) = 0, and therefore

2ar mr (a) =

(r + l ) (r + 2) . 10. If n great circles meet each other, not more than two at any given point, then there are 2(�) intersections. It follows that there are 4 (�) segments between vertices, and Euler's formula gives the number of regions as n (n - 1 ) + 2. We may think of the plane as obtained by taking the limit as R -+ 00 and 'stretching out' the sphere. Each segment is a side of two polygons, so the average number of sides satisfies

4n (n - 1) -::---'-:--'-: -+ 4 2 + n (n - 1)

as n -+ 00.

11 . By making an affine transformation, we may without loss of generality assume the triangle has vertices A = (0, 1 ) , B = (0, 0) , C = ( 1 , 0) . With P = (X, Y), we have that

L = C : Y ' 0)

, M = (X � Y ' X : Y

), N = (0,

1 � X)

.


Hence,

1 xy

101 ( x ) rr2 3

JEIBLNI = 2 dx dy = -x - -- Iog x dx = - - - , ABC 2(1 - x)( 1 - y) 0 1 - x 6 2

and likewise JE ICLMI = JEIANMI = irr2 - � . It follows that JEILMNI = � ( 10 - rr2) = ( 10 -rr2) IABCI .

12. Let the points be P, Q , R , S. By Example (4. 1 3 .6),

1P'( one lies inside the triangle formed by the other three) = 41P'(S E PQR) = 4 . -b. . 13. We use Crofton's method. Let m(a) be the mean area, and condition on whether points do or do not fall in the annulus with internal and external radii a, a + h . Then

m(a + h) = m(a) C : hr + [� + O(h)] m(a) ,

where m(a) is the mean area of a triangle having one vertex P on the boundary of the circle. Using polar coordinates with P as origin,

Letting h ..J.. 0 above, we obtain dm 6m 6 35a2 - = - - + - ' -- , da a a 36rr

whence m(a) = (35a2)/(48rr) . 14. Let a be the radius of C, and let R be the distance of A from the centre. Conditional on R, the required probability is (a - R)2 /a2 , whence the answer is JE« a - R)2/a2) = JJ ( 1 - r)22r dr = i . 15. Let a be the radius of C, and let R be the distance of A from the centre. As in Exercise (4. 13 . 14), the answer is JE« a - R)3 /a3) = JJ (1 - r)33r2 dr = 10.

4.14 Solutions to problems

1. (a) We have that

Secondly, f � 0, and it is easily seen that J�oo f(x) dx = 1 using the substitution y = (x -I-L)/(u..ti) .

1 1 2 1 2 (b) The mean is J�oo x (2rr) - � e - � x dx , which equals 0 since x e - � x is an odd integrable function.

1 1 2 The variance is J�oo x2 (2rr) - � e - � x dx , easily integrated by parts to obtain 1 .


(c) Note that

1 2 and also 1 - 3y-4 < 1 < 1 + y-2. Multiply throughout these inequalities by e- �Y 1../2ii, and integrate over [x , (0), to obtain the required inequalities. More extensive inequalities may be found in Exercise (4.4.8). (d) The required probability is a (x) = [ 1 - ct> (x + alx)]/[1 - ct> (x)] . By (c),

2. Clearly f � 0 if and only if 0 ::5 a < /3 ::5 1 . Also

as x -+ 00.

C -1 = lfJ (X - x2) dx = ! (/32 _ a2) _ � (/33 _ a

3) .

3. The Ai partition the sample space, and i - I ::5 X (w) < i if w E Ai . Taking expectations and using the fact that lE(Ii ) = lP(Ai ), we find that S ::5 lE(X) ::5 1 + S where

00 00 i- I 00 00 00

S = L )i - l)lP(Ai ) = L L 1 · lP(Ai ) = L L lP(Ai ) = L lP(X � j) . i=2 i=2j=1 j=1 i=j+l j=1

4. (a) (i) Let $F^{-1}(y) = \sup\{x : F(x) = y\}$, so that
$$\mathbb{P}(F(X) \le y) = \mathbb{P}(X \le F^{-1}(y)) = F(F^{-1}(y)) = y, \qquad 0 \le y \le 1.$$
(ii) $\mathbb{P}(-\log F(X) \le y) = \mathbb{P}(F(X) \ge e^{-y}) = 1 - e^{-y}$ if $y \ge 0$.
(b) Draw a picture. With $D = \mathrm{PR}$,
Differentiate to obtain the result.
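Both parts of (a) are easy to illustrate numerically. The sketch below (Python; the exponential example is our choice) checks that $F(X)$ behaves like a uniform variable and $-\log F(X)$ like an exponential variable with parameter 1.

```python
import math
import random

def integral_transform_check(lam=2.0, n=100000, rng=random):
    """Empirical check of Problem (4.14.4a) with X exponential(lam), so that
    F(x) = 1 - exp(-lam*x)."""
    u_mean, e_mean = 0.0, 0.0
    for _ in range(n):
        x = -math.log(1.0 - rng.random()) / lam    # X by inversion
        f = 1.0 - math.exp(-lam * x)               # F(X), uniform on [0, 1]
        u_mean += f
        e_mean += -math.log(f)                     # exponential with mean 1
    print("mean of F(X):", u_mean / n, "(expect 0.5)")
    print("mean of -log F(X):", e_mean / n, "(expect 1.0)")

integral_transform_check()
```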

5. Clearly

lP(X > s + x) e-A(s+x) -Ax lP(X > s + x I X > s) =

lP(X > s) =

e-AS = e

if x , s � 0, where A is the parameter of the distribution. Suppose that the non-negative random variable X has the lack-of-memory property. Then G (x) =

lP(X > x) is monotone and satisfies G (O) = 1 and G(s + x) = G(s)G(x) for s , x � O. Hence G(s) = e-AS for some A; certainly A > 0 since G (s) ::5 1 for all s .

Let us prove the hint. Suppose that g is monotone with g (O) = 1 and g (s + t) = g(s)g(t) for s , t � O. For an integer m, g (m) = g(1 )g(m - 1) = . . . = g( 1 )m . For rational x = min, g (x)n = g (m) = g(1 )m so that g (x) = g(I )X ; all such powers are interpreted as exp{x 10g g(1) } . Finally, if x i s irrational, and g is monotone non-increasing (say), then g(u) ::5 g (x) ::5 g (v) for all


rationals u and v satisfying v :::: x :::: u. Hence g ( 1 )U :::: g (x) :::: g ( 1 )v . Take the limits as u .,!.. x and v t x through the rationals to obtain g (x) = eiLX where J.L = log g ( 1 ) .

6. If X and Y are independent, we may take g = fx and h = fy . Suppose conversely that f(x , y) = g (x)h (y) . Then

fx (x) = g (x) i: h (y) dy , fy (y) = h (y) i: g (x) dx

and

1 = i: fy (y) dy = i: g (x) dx i: h (y) dy .

Hence fx(x)fy (y) = g(x)h (y) = f(x , y ) , so that X and Y are independent.

7. They are not independent since lP'(Y < 1 , X > 1 ) = O whereas lP'(Y < 1 ) > O and lP'(X > 1) > O. As for the marginals,

fx(x) = 100 2e-X-y dy = 2e-2x , fy (y) = loy 2e-X-Y dx = 2e-Y ( 1 - e-Y) ,

for x, y ::: O . Finally,

JE(XY) = foo foo xy2e-x-y dx dy = 1 ix=o iy=x

and JE(X) = ! , JE(Y) = � , implying that cov(X, Y) = i . 8. As in Example (4. 1 3. 1 ) , the desired property holds if and only if the length X of the chord satisfies X :::: .../3. Writing R for the distance from P to the centre 0, and e for the acute angle between the chord

and the line OP, we have that X = 2";1 - R2 sin2 e, and therefore IP'(X :::: ...(3) = IP'(R sin e ::: ! ) . The answer i s therefore

IP' (R ::: _.1_) = � f !1f IP' (R ::: -�-) dB ,

2 sm e n io 2 sm B which equals � - .../3/ (2n) in case (a) and § + n - 1 10g tan(n 1 12) in case (b) .

9. Evidently,

Secondly,

JE(U) = IP'(Y :::: g (X» = 11 dx dy = 101 g (x) dx ,

O::o;x ,y::O; l 0 y::O;g(x)

JE(V) = JE(g (X» = 101 g(x) dx ,

JE(W) = ! 101 {g (x) + g ( 1 - x) } dx = 10

1 g(x) dx .

JE(U2) = JE(U) = J, JE(V2) = 101 g (x)2 dx :::: J since g :::: 1 ,

JE(W2) = i { 210 1 g (x)2 dx + 210

1 g(x)g(l - x) dX }

= JE(V2) - ! 101 g (x) {g (x) - g ( 1 - x) } dx

fl 2 = JE(V2) - i io {g (x) - g( 1 - x) } dx :::: JE(V2) .


Hence var(W) :::: var(V) :::: var(U).

10. Clearly the claim is true for n = 1 , since the r (A , 1 ) distribution is the exponential distribution. Suppose it is true for n :::: k where k ::: 1 , and consider the case n = k + 1 . Writing in for the density function of Sn , we have by the convolution formula (4.8.2) that

which is easily seen to be the r (A , k + 1) density function.

11. (a) Let Zl , Z2 , . . . , Zm+n be independent exponential variables with parameter A. Then, by Problem (4. 14. 10), X' = Zl + . . . + Zm is r (A , m), yl = Zm+l + . . . + Zm+n is r (A , n) , and X' + yl is r (A , m + n) . The pair (X, Y) has the same joint distribution as the pair (X' , yl), and therefore X + Y has the same distribution as X, + yl, i.e., r (A , m + n) . (b) Using the transformation u = x + y, v = x/(x + y), with inverse x = uv, y = u( 1 - v), and Jacobian

J = I v u 1 = -u, 1 - v -u

we find that U = X + y, V = X / (X + Y) have joint density function

for u ::: 0, 0 :::: v :::: 1 . Hence U and V are independent, U being r (A , m + n), and V having the beta distribution with parameters m and n. (c) Integrating by parts,

lP'(X > t) = 100 Am

xm- 1 e-Ax dx t (m - I) !

I\. m- l -Ax I\. m-2 -Ax d [ , m- l ] 00 100 , m- l

= -(m-_-l)-! X e t

+ t (m _ 2) ! X e x

= e-At (M)m- l + lP'(X' > t) (m - I) !

where X, is r (A , m - 1 ) . Hence, by induction,

m- l (Ati lP'(X > t) = L e-J...t -k l = lP'(Z < m).

k=O .

(d) This may be achieved by the usual change of variables technique. Alternatively, reflect that, using the notation and result of part (b), the invertible mapping u = x + y, v = x / (x + y) maps a pair X, Y of independent (r (A , m) and r (A , n» variables to a pair U, V of independent (r (A , m + n) and B(m , n» variables. Now U V = X, so that (figuratively)

"r (A , m + n) x B (m , n) = r (A , m)" .

Replace n by n - m to obtain the required conclusion.


12. (a) Z = Xr satisfies

the r(1 , 1 ) or x 2 (1 ) density function.

(b) H z ::: 0, Z = Xr + X� satisfies

the X2(2) distribution function.

z ::: 0,

(c) One way is to work in n-dimensional polar coordinates ! Alternatively use induction. It suffices to show that if X and Y are independent, X being x 2(n) and Y being X 2(2) where n ::: I , then Z = X + Y is x 2(n + 2) . However, by the convolution formula (4.8 .2),

z ::: 0,

for some constant c. This is the x 2 (n + 2) density function as required.

13. Concentrate on where x occurs in fx I y (x I y) ; any multiplicative constant can be sorted out later:

f ( I ) fx, y (x , y) ( ) { I ( x2 2xIL I 2px (y - IL2) ) }

XIY x y = = CI Y exp - - - -- -fy (y) 2(1 - p2) ar ar 0'1 0'2

by Example (4.5 .9), where C} (y) depends on y only. Hence

x E JR,

for some C2 (y). This is the normal density function with mean ILl + pal (y - IL2)/a2 and variance ar ( 1 - p2) . See also Exercise (4.8.7).

14. Set u = y/x , v = x, with inverse x = v, y = uv, and I J I = I v l . Hence the pair U = Y/ X, V = X has joint density fu, v (u , v) = fx, y (v , uv) l v l for -00 < u , v < 00 . Therefore fu (u) = J�oo f(v, uv) l v l dv.

15. By the result of Problem (4. 14. 14), U = Y/ X has density function

fu (u) = L: f (y)f(uy) l y l dy ,

and therefore it suffices to show that U has the Cauchy distribution if and only if Z = tan - I U is uniform on (- 11T, 11T) . Clearly

lP'(Z � 0) = lP'(U � tan 0) , - 11T < 0 < 11T,


whence fz (O) = fu (tan 0) sec2 O . Therefore fz (O ) = Jr - 1 (for 1 0 1 < iJr ) if and only if

1 fu (u) =

Jr ( 1 + u2) ' -00 < u < 00.

When f is the N (0, 1) density,

f(x)f(xy) lx l dx = 2 _e- �x ( l+y )x dx , 100 looo 1 1 2 2

-00

0 2:n-

which is easily integrated directly to obtain the Cauchy density. In the second case, we have the following integral:

100 a2 1x l -00

( 1 + x4) ( 1 + x4y4) dx .

In this case, make the substitution z = x2 and expand as partial fractions.
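For the normal case the conclusion is simple to check by simulation; here is a short Python sketch (illustration only, sample size arbitrary).

    import numpy as np

    # Problem (4.14.15), normal case: if X and Y are independent N(0,1),
    # then U = Y/X is Cauchy, so Z = arctan(U) is uniform on (-pi/2, pi/2).
    rng = np.random.default_rng(1)
    N = 200_000
    U = rng.standard_normal(N) / rng.standard_normal(N)
    Z = np.arctan(U)
    qs = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
    print(np.quantile(Z, qs))        # empirical quantiles of Z
    print(np.pi * (qs - 0.5))        # quantiles of uniform(-pi/2, pi/2)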

16. The transformation x = r cos 0, y = r sin 0 has Jacobian J = r, so that

r > 0, 0 :::: 0 < 2Jr.

Therefore R and e are independent, e being uniform on [0, 2Jr) , and R2 having distribution function

this is the exponential distribution with parameter i (otherwise known as r(i , 1 ) or X2 (2)) . The 1 2

density function of R is fR (r) = re- � r for r > O. Now, by symmetry, ( X2 ) _ � (X2 + Y2 ) _ �

lE R2 - 2

lE R2 - 2 ·

In the first octant, i.e., in { (x , y) : 0 :::: y :::: x} , we have min {x , y } = y, max{x , y} = x . The joint density fx, Y is invariant under rotations, and hence the expectation in question is

lo y /e1r/4 1°O tan O 1 2 2 8 -fX, y (x , y) dx dy = 8 -

2-re- �r dr dO = - log 2 .

O�y�x x 9=0 r=O Jr Jr

17. (i) Using independence,

P(U ≤ u) = 1 − P(X > u, Y > u) = 1 − (1 − F_X(u))(1 − F_Y(u)).

Similarly P(V ≤ v) = P(X ≤ v, Y ≤ v) = F_X(v) F_Y(v).
(ii) (a) By (i), P(U ≤ u) = 1 − e^{−2u} for u ≥ 0.
(b) Also, Z = X + ½Y satisfies

P(Z > v) = ∫_0^∞ P(Y > 2(v − x)) f_X(x) dx = ∫_0^v e^{−2(v−x)} e^{−x} dx + ∫_v^∞ e^{−x} dx
         = e^{−2v}(e^v − 1) + e^{−v} = 1 − (1 − e^{−v})² = P(V > v).


Therefore E(V) = E(X) + ½E(Y) = 3/2, and var(V) = var(X) + ¼ var(Y) = 5/4, by independence.
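Part (ii) is easily illustrated numerically; the following Python sketch (not part of the solution) uses unit-rate exponentials.

    import numpy as np

    # Problem (4.14.17)(ii): for independent Exp(1) variables X and Y,
    # U = min{X, Y} is Exp(2), and V = max{X, Y} has mean 3/2, variance 5/4.
    rng = np.random.default_rng(2)
    X, Y = rng.exponential(size=(2, 500_000))
    U, V = np.minimum(X, Y), np.maximum(X, Y)
    print(U.mean(), 1 / 2)     # mean of Exp(2)
    print(V.mean(), 3 / 2)
    print(V.var(), 5 / 4)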

18. (a) We have that

(b) Clearly, for w > 0,

Now

JP(U :::: u , W > w) = JP(U :::: u , W > w, X :::: Y) + JP(U :::: u , W > w, X > Y).

JP(U :::: u , W > w, X :::: Y) = JP(X :::: u , Y > X + w) = iou Ae-AXe-JL(x+w) dx

= _A_e-JLW ( I _ e-(A+JL)U) A + JL

and similarly

JP(U < U w > w X > Y) = �e-AW (1 _ e-(A+JL)U) . - , , A + JL

Hence, for 0 :::: u :::: u + w < 00,

JP(U < U w > w) = ( 1 - e-(A+JL)U ) (_A_e-JLW + � e-AW) - ,

A + JL A + JL ' an expression which factorizes into the product of a function of u with a function of w. Hence U and W are independent.

19. U = X + Y, V = X have joint density function fy (u - v) fx (v) , 0 :::: v :::: u . Hence

fv u (v I u) = fu, v (u , v) = U fy (u - v) fx (v)

. I fu (u) Jo fy (u - y)fx (y) dy

(a) We are given that fv l U (v I u) = u-l for 0 :::: v :::: u ; then

1 iou fy (u - v)fx (v) = - fy (u - y)fx (y) dy u 0

is a function of u alone, implying that

fy (u - v) fx (v) = fy (u)fx (O) = fy (O) fx (u)

by setting v = 0

by setting v = u .

In particular fy (u) and fx (u) differ only by a multiplicative constant; they are both density functions, implying that this constant is 1 , and fx = fy . Substituting this throughout the above display, we find that g(x) = fx(x)lfx (O) satisfies g (O) = 1 , g is continuous, and g (u - v)g(v) = g (u) for 0 :::: v :::: u . From the hint, g(x) = e-AX for x 2: 0 for some A > 0 (remember that g is integrable). (b) Arguing similarly, we find that

C f" fy (u - v) fx (v) = ua+.8- 1 v

a-leu - v).8- 1

Jo fy (u - y)fx (y) dy


for 0 :::: v :::: u and some constant c. Setting fx (v) = X (v)va- 1 , fy (y) = 7J (y)yfJ- 1 , we obtain 7J (u - v)X (v) = h (u) for 0 :::: v :::: u , and for some function h . Arguing as before, we find that 7J and X are proportional to negative exponential functions, so that X and Y have gamma distributions.

20. We are given that U is uniform on [0, 1], so that 0 :::: X, Y :::: 1 almost surely. For 0 < E < ! , E = lP(X + Y < E) :::: lP(X < E, Y < E) = lP(X < E)2 ,

and similarly

E = lP(X + Y > 1 - E) :::: lP(X > 1 - E, Y > ! - E) = lP(X > 1 - E)2 ,

implying that lP(X < E) :::: ./E and lP(X > 1 - E) :::: ./E. Now

2E = lP(! - E < X + Y < 1 + E) :::: lP(X > 1 - E, Y < E) + lP(X < E, Y > 1 - E) = 2lP(X > 1 - E)lP(X < E) :::: 2(./E)2 .

Therefore all the above inequalities are in fact equalities, implying that lP(X < E) = lP(X > 1 - E) = ./E if 0 < E < ! . Hence a contradiction:

21. Evidently

lP(X(l ) :::: Yl , · · · , X(n) :::: Yn ) = L lP (XlI'\ :::: Yl o · · · , XlI'n :::: Yn , XlI'\ < . . . < XlI'n ) 11'

where the sum is over all permutations :n: = (:n:l , :n:2 , . • • , :n:n) of ( 1 , 2, . . . , n) . By symmetry, each term in the sum is equal, whence the sum equals

The integral form is then immediate. The joint density function is, by its definition, the integrand.

22. (a) In the notation of Problem (4. 14.21 ), the joint density function of X(2) , . . . , X(n) is

(Y2 g2 (Y2 , · · · , Yn) =

J-oo g (Yl , · · · , Yn) dYl

= n ! L (Y2 , . . . , Yn)F(Y2) f(Y2)f(Y3) · · · f (Yn)

where F is the common distribution function of the Xj . Similarly X (3) , • • • , X (n) have joint density

and by iteration, X(k) , . . . , X(n) have joint density

We now integrate over Yn , Yn- l o . . . , Yk+l in turn, arriving at

n ! k-l n-k fX(k) (Yk) = (k _ I ) ! (n _ k) ! F(Yk) { I - F(Yk) } f (Yk) ·


(b) It is neater to argue directly. Fix x, and let Ir be the indicator function of the event {Xr ::: xl, and let S = It + 12 + . . . + In . Then S is distributed as bin(n , F(x)), and

lP'(X(k) ::: x) = lP'(S :::: k) = t (;) F(x)l (1 - F(x))n-l . l=k

Differentiate to obtain, with F = F(x),

by successive cancellation of the terms in the series. 23. Using the result of Problem (4. 14.21) , the joint density function is g(y) = n ! L(y)T-n for ° ::: Yi ::: T, 1 ::: i ::: n, where y = (YI , Y2 , . . . , Yn ) . 24. (a) We make use of Problems (4. 14.22)-(4. 14.23). The density function of X(k) is fk (x) = k (�)xk-l (1 - x )n-k for ° ::: x ::: 1 , so that the density function of nX (k) is

(1/n) f_k(x/n) = (k/n) (n! / (k!(n − k)!)) (x/n)^{k−1} (1 − x/n)^{n−k} = ( n(n − 1)···(n − k + 1) / ((k − 1)! n^k) ) x^{k−1} (1 − x/n)^{n−k} → x^{k−1} e^{−x} / (k − 1)!

as n → ∞. The limit is the Γ(1, k) density function.
(b) For an increasing sequence X_(1), X_(2), ..., X_(n) in [0, 1], we define the sequence U_n = −n log X_(n), U_k = −k log(X_(k)/X_(k+1)) for 1 ≤ k < n. This mapping has inverse

X_(k) = X_(k+1) e^{−U_k/k} = exp{ −Σ_{i=k}^{n} U_i / i },

with Jacobian J = (−1)^n e^{−u₁−u₂−···−u_n} / n!. Applying the mapping to the sequence X_(1), X_(2), ..., X_(n), we obtain a family U₁, U₂, ..., U_n of random variables with joint density g(u₁, u₂, ..., u_n) = e^{−u₁−u₂−···−u_n} for u_i ≥ 0, 1 ≤ i ≤ n, yielding that the U_i are independent and exponentially distributed, with parameter 1. Finally, log X_(k) = −Σ_{i=k}^{n} U_i / i.
(c) In the notation of part (b), Z_k = exp(−U_k) for 1 ≤ k ≤ n, a collection of independent variables. Finally, U_k is exponential with parameter 1, and therefore

P(Z_k ≤ z) = P(U_k ≥ −log z) = e^{log z} = z,   0 < z ≤ 1.
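These distributional facts about uniform order statistics are easy to check by simulation. A small Python sketch follows (n, k, and the sample size are arbitrary choices).

    import numpy as np

    # Problems (4.14.22)-(4.14.24): for n independent U(0,1) variables, the
    # k-th order statistic X_(k) has mean k/(n+1); for fixed k and large n,
    # n*X_(k) is approximately Gamma(1, k), with mean and variance near k.
    rng = np.random.default_rng(3)
    n, k, N = 500, 3, 20_000
    Xk = np.sort(rng.random((N, n)), axis=1)[:, k - 1]
    print(Xk.mean(), k / (n + 1))        # exact mean of X_(k)
    print((n * Xk).mean(), k)            # approximate Gamma(1, k) mean
    print((n * Xk).var(), k)             # approximate Gamma(1, k) variance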

25. (i) (Xl , X2 , X3) is uniformly distributed over the unit cube ofR3 , and the answer is therefore the volume of that set of points (Xl , x2 , X3) of the cube which allow a triangle to be formed. A triangle is impossible if Xl :::: x2 + x3 , or x2 :::: xl + x3 , or x3 :::: xl + X2 . This defines three regions of the cube which overlap only at the origin. Each of these regions is a tetrahedron; for example, the region X3 :::: Xl + X2 is an isosceles tetrahedron with vertices (0, 0, 0) , ( 1 , 0, 1 ) , (0, 1 , 1 ) , (0, 0, 1 ) , with volume i . Hence the required probability is 1 - 3 . i = � . (ii) The rods of length xl , X2 , . . . , Xn fail to form a polygon if either Xl :::: x2 +x3 + . . . +xn or any of the other n - 1 corresponding inequalities hold. We therefore require the volume of the n-dimensional hypercube with n corners removed. The inequality Xl :::: X2 + X3 + . . . + Xn corresponds to the convex hull of the points (0, 0, . . . , 0) , ( 1 , 0, . . . , 0) , ( 1 , 1 , 0, . . . , 0) , (1 , 0, 1 , 0, . . . , 0) , . . . , ( 1 , 0, . . . , 0, 1 ) .


Mapping Xl 1-+ 1 - Xl , we see that this has the same volume Vn as has the convex hull of the origin o together with the n unit vectors el , e2 , . . . , en . Clearly V2 = � , and we claim that Vn = l in ! . Suppose this holds for n < k, and consider the case n = k. Then

where Vk-l (0, xl e2 , . . . , xl ek) is the (k - I )-dimensional volume of the convex hull of 0, xl e2 , . . . , xl ek . Now

so that

v: - -l-- dx - -101 x

k-l 1

k - 0 (k - I) ! 1 - k ! '

The required probability is therefore 1 - nl (n !) = 1 - { (n _ l ) ! }- l .

26. (i) The lengths of the pieces are U = min{Xl , X2 } , V = IXI - X2 1 , W = 1 - U - V, and we require that U < V + W, etc, as in the solution to Problem (4. 14.25). In terms of the Xi we require

either :

or :

I XI - X2 1 < � , IX l - X2 1 < � ,

1 - X2 < � , l - Xl < � '

Plot the corresponding region of R2. One then sees that the area of the region is i , which is therefore the probability in question.

(ii) The pieces may form a polygon if no piece is as long as the sum of the lengths of the others. Since the total length is 1, this requires that each piece has length less than ½. Neglecting certain null events, this fails to occur if and only if the disjoint union of events A₀ ∪ A₁ ∪ ··· ∪ A_n occurs, where

A₀ = {no break in (0, ½]},    A_k = {no break in (X_k, X_k + ½]} for 1 ≤ k ≤ n;

remember that there is a permanent break at 1. Now P(A₀) = (½)^n, and for k ≥ 1,

P(A_k) = ∫_0^{½} (½)^{n−1} dx_k = (½)^n.

Hence P(A₀ ∪ A₁ ∪ ··· ∪ A_n) = (n + 1)2^{−n}, whence the required probability is 1 − (n + 1)2^{−n}.
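This formula is simple to test by direct simulation; the Python sketch below (illustration only) uses n = 4 breaks.

    import numpy as np

    # Problem (4.14.26)(ii): break a unit stick at n independent uniform
    # points; the n+1 pieces form a polygon (every piece shorter than 1/2)
    # with probability 1 - (n+1) 2^(-n).
    rng = np.random.default_rng(4)
    n, N = 4, 200_000
    breaks = np.sort(rng.random((N, n)), axis=1)
    pieces = np.diff(np.hstack([np.zeros((N, 1)), breaks, np.ones((N, 1))]), axis=1)
    ok = pieces.max(axis=1) < 0.5
    print(ok.mean(), 1 - (n + 1) * 2.0 ** (-n))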

27. (a) The function g (t) = (tP I p) + (t-q Iq) , for t > 0, has a unique minimum at t = I , and hence g (t) 2: g( l ) = 1 for t > O. Substitute t = x l/qy- l/p where

(we may as well suppose that JP(XY = 0) =1= 1 ) to find that

IX j P I Y lq I XY I plE lXP I

+ qlEl yq l

2: {lE IXP l } l /p {lE l yq l } l /q

'

HOlder's inequality follows by taking expectations.


(b) We have, with Z = IX + Y I .

lE(ZP) = lE(Z . Zp-l ) ::: lE( IX I ZP- I ) + lE(I Y I ZP- I ) ::: {lEIXP I } I /p {lE(ZP) } I /q + {lEl yP I } I /p{lE(ZP) } I/q

by Holder's inequality, where p- l + q- l = 1 . Divide by {lE(ZP) } I /q to get the result.

28. Apply the Cauchy-Schwarz inequality to I Z I � (b-a) and I Z I � (b+a) , where 0 ::: a ::: b, to obtain {lEI Zb I }2 ::: lE IZb-a l lE l zb+a l . Now take logarithms: 2g (b) ::: g (b - a) + g (b + a) for O ::: a ::: b. Also g(p) --+ g (O) = I as p .j.. 0 (by dominated convergence). These two properties of g imply that g is convex on intervals of the form [0, M) if it is finite on this interval. The reference to dominated convergence may be avoided by using Holder instead of Cauchy-Schwarz.

By convexity, g (x)/x is non-decreasing in x , and therefore g (r)/r ;:: g (s )/s if 0 < s ::: r .

29. Assume that X, Y, Z are jointly continuously distributed with joint density function f. Then

Hence

lE(X I Y = y , Z = z) = jXfXI Y,Z (X I y , z) dx = jx f(x , y , z) dx . !y.z (y , z)

lE{lE(X I Y, Z) I Y = y } = j lE(X I Y = y , Z = z)fzIY (z I y) dz = {[ x f(x , y , z) !y.z (y , z) dx dz }} !y.z (y , z) fy (y) �r f(x , y , z) = x dx dz = lE(X I Y = y) . fy (y)

Alternatively, use the general properties of conditional expectation as laid down in Section 4.6.

30. The first car to arrive in a car park of length x + I effectively divides it into two disjoint parks of length Y and x - Y, where Y is the position of the car's leftmost point. Now Y is uniform on [0, x) , and the formula follows by conditioning on Y. Laplace transforms are the key to exploring the asymptotic behaviour of m(x)/x as x --+ 00. 31. (a) If the needle were of length d, the answer would be 2/rr as before. Think about the new needle as being obtained from a needle of length d by dividing it into two parts, an 'active' part of length L, and a 'passive' part of length d - L, and then counting only intersections involving the active part. The chance of an 'active intersection' is now (2/rr) (L/d) = 2L/(rrd) .

(b) As in part (a), the angle between the line and the needle i s independent of the distance between the line and the needle's centre, each having the same distribution as before. The answer is therefore unchanged.

(c) The following argument lacks a little rigour, which may be supplied as a consequence of the statement that S has finite length. For E > 0, let Xl , X2 , . . . , Xn be points on S, taken in order along S, such that xo and Xn are the endpoints of S, and IXi+1 - Xi I < E for 0 ::: i < n ; Ix - y l denotes the Euclidean distance from x to y . Let Ji be the straight line segment joining Xi to Xi+! , and let lj be the indicator function of {Ji n A =;f 0}. If E is sufficiently small, the total number of intersections between Jo U It U · · · U In- l and S has mean


by part (b) . In the limit as E � 0, we have that Ei lE(li ) approaches the required mean, while

2 n-l 2L (S)

- L IXi+ l - xi i � -- . rrd i=O rrd

32. (i) Fix Cartesian axes within the gut in question. Taking one end of the needle as the origin, the other end is uniformly distributed on the unit sphere of R3 . With the X-ray plate parallel to the (x , y)-plane, the projected length V of the needle satisfies V :::: v if and only if lZ I � �, where Z is the (random) z-coordinate of the 'loose' end of the needle. Hence, for ° � v � I ,

since 4rr � is the surface area of that part of the unit sphere satisfying Iz l � � (use Archimedes's theorem of the circumscribing cylinder, or calculus). Therefore V has density function

fv (v) = v/� for O � v � 1 . (ii) Draw a picture, if you can. The lengths of the projections are determined by the angle e between the plane of the cross and the X-ray plate, together with the angle \11 of rotation of the cross about an axis normal to its arms . Assume that e and \11 are independent and uniform on [0, �rr] . If the axis system has been chosen with enough care, we find that the lengths A, B of the projections of the arms are given by

with inverse

_1 �- A2 \11 = tan - -2 . I - B

Some slog is now required to calculate the Jacobian J of this mapping, and the answer will be fA, B (a , b) = 4 1 J lrr-2 for 0 < a , b < 1 , a2 + b2 > 1 .

33. The order statistics of the Xi have joint density function

on the set I of increasing sequences of positive reals. Define the one-one mapping from I onto (0, oo)n by

Y l = nX l , Yr = (n + 1 - r) (xr - Xr- l ) for 1 < r � n ,

with inverse Xr = Ek= l Yk/ (n - k + 1 ) for r :::: 1 . The Jacobian is (n !)-l , whence the joint density function of Yl , Y2 , . . . , Yn is

34. Recall Problem (4. 14.4). First, Zi = F(Xi ), 1 � i � n, is a sequence of independent variables with the uniform distribution on [0, 1 ] . Secondly, a variable U has the latter distribution if and only if - log U has the exponential distribution with parameter 1 .


It follows that L_i = −log F(X_i), 1 ≤ i ≤ n, is a sequence of independent variables with the exponential distribution. The order statistics L_(1), ..., L_(n) are in order −log F(X_(n)), ..., −log F(X_(1)), since the function −log F(·) is non-increasing. Applying the result of Problem (4.14.33),

E₁ = −n log F(X_(n))    and    E_r = −(n + 1 − r){ log F(X_(n+1−r)) − log F(X_(n+2−r)) },   1 < r ≤ n,

are independent with the exponential distribution. Therefore exp(−E_r), 1 ≤ r ≤ n, are independent with the uniform distribution.

35. One may be said to be in state j if the first j − 1 prizes have been rejected and the jth prize has just been viewed. There are two possible decisions at this stage: accept the jth prize if it is the best so far (there is no point in accepting it if it is not), or reject it and continue. The mean return of the first decision equals the probability j/n that the jth prize is the best so far, and the mean return of the second is the maximal probability f(j) that one may obtain the best prize having rejected the first j. Thus the maximal mean return V(j) in state j satisfies

V(j) = max{ j/n, f(j) }.

Now j/n increases with j, and f(j) decreases with j (since a possible strategy is to reject the (j + 1)th prize also). Therefore there exists J such that j/n ≥ f(j) if and only if j ≥ J. This confirms the optimal strategy as having the following form: reject the first J prizes out of hand, and accept the subsequent prize which is the best of those viewed so far. If there is no such prize, we pick the last prize presented.

Let Π_J be the probability of achieving the best prize by following the above strategy. Let A_k be the event that you pick the kth prize, and B the event that the prize picked is the best. Then,

Π_J = Σ_{k=J+1}^{n} P(B | A_k) P(A_k) = Σ_{k=J+1}^{n} (k/n)(J/(k − 1))(1/k) = (J/n) Σ_{k=J+1}^{n} 1/(k − 1),

and we choose the integer J which maximizes this expression. When n is large, we have the asymptotic relation Π_J ≈ (J/n) log(n/J). The maximum of the function h_n(x) = (x/n) log(n/x) occurs at x = n/e, and we deduce that J ≈ n/e. [A version of this problem was posed by Cayley in 1875. Our solution is due to Lindley (1961).]
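The strategy and the 1/e limit are easy to illustrate by simulation; the Python sketch below is an illustration only, with arbitrary values of n and the number of trials.

    import numpy as np

    # Problem (4.14.35): reject the first J prizes, then accept the next
    # prize that is the best seen so far (or the last prize if none is).
    # With J near n/e, the probability of obtaining the best prize is
    # close to 1/e.
    rng = np.random.default_rng(5)
    n, trials = 50, 20_000
    J = round(n / np.e)
    wins = 0
    for _ in range(trials):
        ranks = rng.permutation(n)        # larger value = better prize
        best_seen = ranks[:J].max()
        pick = ranks[-1]                  # default: the last prize
        for r in ranks[J:]:
            if r > best_seen:
                pick = r
                break
        wins += (pick == n - 1)           # n - 1 is the best possible rank
    print(wins / trials, 1 / np.e)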

36. The joint density function of (X, Y, Z) is

f(x, y, z) = (2π)^{−3/2} exp{ −½ ( r² − 2λx − 2μy − 2νz + λ² + μ² + ν² ) },

where r² = x² + y² + z². The conditional density of X, Y, Z given R = r is therefore proportional to exp{λx + μy + νz}. Now choosing spherical polar coordinates with axis in the direction (λ, μ, ν), we obtain a density function proportional to exp(a cos θ) sin θ, where a = r√(λ² + μ² + ν²). The constant is chosen in such a way that the total probability is unity.

37. (a) φ′(x) = −xφ(x), so H₁(x) = x. Differentiate the equation for H_n to obtain H_{n+1}(x) = xH_n(x) − H_n′(x), and use induction to deduce that H_n is a polynomial of degree n as required. Integrating by parts gives, when m ≤ n,

∫_{−∞}^{∞} H_m(x) H_n(x) φ(x) dx = (−1)^n ∫_{−∞}^{∞} H_m(x) φ^{(n)}(x) dx
    = (−1)^{n−1} ∫_{−∞}^{∞} H_m′(x) φ^{(n−1)}(x) dx
    = ··· = (−1)^{n−m} ∫_{−∞}^{∞} H_m^{(m)}(x) φ^{(n−m)}(x) dx,


and the claim follows by the fact that H�m) (x) = m I . (b) �y Taylor's theorem and the first part,

whence

00 tn 00 ( t)n ¢ (x) E -Hn (x) = E �¢(n) (x) = ¢ (x - t) , O n ! 0 n ! n= n=

38. The polynomials of Problem (4. 14.37) are orthogonal, and there are unique expansions (subject to mild conditions) of the form u (x) = E�o arHr (x) and v(x) = E�o br Hr (x) . Without loss of generality, we may assume that JE(U) = JE(V) = 0, whence, by Problem (4. 14.37a), ao = bo = O. By (4. 14.37a) again,

00 00 var(U) = JE(u (X)2 ) = E a;r ! , var(V) = E b;r ! .

r=1 r=1 By (4. 14.37b),

JE(� Hm�)Sm � Hn��)tn ) = JE (exp{sX _ �s2 + tY _ � t2 }) = estP .

By considering the coefficient of sm tn ,

and so

where we have used the Cauchy-Schwarz inequality at the last stage.

39. (a) Let Yr = X(r) - X(r-l ) with the convention that X(O) = 0 and X(n+l) = 1 . By Problem (4. 14.21) and a change of variables, we may see that Yl , Y2 , . . . , Yn+l have the distribution of a point chosen uniformly at random in the simplex of non-negative vectors y = (Yl , Y2 , . . . , Yn+ 1 ) with sum 1 . [This may also be derived using a Poisson process representation and Theorem (6. 12.7).] Conse­quently, the Yj are identically distributed, and their joint distribution is invariant under permutations

of the indices of the Yj . Now E�if Yr = 1 and, by taking expectations, (n + I )JE(Yl ) = 1 , whence JE(X(r» ) = rJE(Yl ) = r/ (n + 1 ) . (b) We have that

101 2 JE(Yf ) = x2n (1 - x)n- l dx =

2 ' o (n + l ) (n + )

1 = JE [ (I: Yr) 2] = (n + I )JE(Yf) + n (n + I )JE(YI Y2) ,

r=1


implying that

and also

1 JE(YI Y2) = (n + 1 ) (n + 2)

,

2 r es + 1 ) JE(X(r)X(s» = rJE(YI ) + r es - I)JE(YI Y2) = (n + 1 ) (n + 2)

The required covariance follows.

40. (a) By paragraph (4.4.6), X2 is r (� , �) and y2 + Z2 is r(1 , 1 ) . Now use the results of Exercise (4.7. 14). (b) Since the distribution of X2 / R2 is independent of the value of R2 = X2 + y2 + Z2 , it is valid also if the three points are picked independently and uniformly within the sphere.

41. (a) Immediate, because the N(O, 1) distribution is symmetric. (b) We have that

i: 2¢ (x)<'P (AX) dx = i: {4> (x)<'P (Ax) + 4> (x) [ 1 - <'P (-Ax)] } dx

= i: 4> (x) dx + i: 4> (X) [<'P (AX) - <'P (-Ax)] dx = 1 ,

because the second integrand is odd. Using the symmetry of 4>, the density function of I Y I is

4> (x) + 4> (x) { <'P (AX) - <'P( -AX) } + 4> ( -x) + 4> ( -x) { <'P (-AX) - <'P(AX) } = 2¢ (x) .

(c) Finally, make the change of variables W = I Y I , Z = (X + A I Y I ) /yh + A2, with inverse Iy l = w, x = zVl + A2 - AW, and Jacobian VI + A2 . Then

tw,Z (W , z) = VI + A2 tX, I Y I (zVl + A2 - AW , w) = VI + A2 . 4> (zVl + A2 - AW) . 24> (w) ,

The result follows by integrating over W and using the fact that

W > 0, X E R.

tx) 4> (zVl + A2 _ AW)4> (W) dw = 4>� . 10 1 + A2

42. The required probability equals

where UI , U2 are N(O, �) , VI , V2 are N(O, �) , and UI , U2 , VI , V2 are independent. The answer is therefore

p = lP'(� (Nf + Ni) ::::: i (Nf + Nl)) where the Nt are independent N(O, 1) = lP'(K I ::::: 1 K2) where the Kj are independent chi-squared X2 (2)

( KI 1 ) I I = lP' < - = lP'(B < - ) = .,. KI + K2 - 4 - 4 ..


where we have used the result of Exercise (4.7. 14), and B is a beta-distributed random variable with parameters 1 , 1 . 43. The argument of Problem (4. 14.42) leads to the expression

JP>(U[ + ui + uj .:::: Vf + vi + vi) = JP>(KI .:::: 1 K2) where the Kj are X2 (3)

1 1 .../3 = JP>(B < 7 ) = - - -- .. 3 4JZ"

where B is beta-distributed with parameters � , � . 44. (a) Simply expand thus: lE[(X - J.L)3 ] = lE[X3 - 3X2J.L + 3XJ.L2 - J.L3] where J.L = lE(X).

(b) var(Sn) = na2 and lE[(Sn - nJ.L)3 ] = nlE[(X 1 - J.L)3 ] plus terms which equal zero because lE(Xl - J.L) = o. (c) If Y is Bernoulli with parameter p, then skw(Y) = ( 1 - 2p)/..(jifj, and the claim follows by (b).

(d) ml = A, m2 = A + A2 , m3 = A3 + 3A2 + A, and the claim follows by (a).

(e) Since AX is r ( 1 , t), we may as well assume that A = 1 . It is immediate thatlE(Xn ) = r(t+n)/ r(t), whence

t (t + l ) (t + 2) - 3t . t (t + l) + 2t3 2 skw(X) = t3/2 = "ft.

45. We find as above that kur(X) = (m4 - 4m3m l + 6m2mr - 3mt)/a4 where mk = lE(Xk) . (a) m4 = 3a4 for the N(O, (2) distribution, whence kur(X) = 3a4/a4. (b) mr = r !/Ar , and the result follows.

(c) In this case, m4 = A 4 + 6A 3 + 7A 2 + A, m3 = A 3 + 3A 2 + A, m2 = A 2 + A, and m 1 = A.

(d) (var S_n)² = n²σ⁴ and E[(S_n − nm₁)⁴] = nE[(X₁ − m₁)⁴] + 3n(n − 1)σ⁴.

46. We have as n → ∞ that

P(X_(n) ≤ x + log n) = { 1 − e^{−(x+log n)} }^n = ( 1 − e^{−x}/n )^n → e^{−e^{−x}}.

By the lack-of-memory property,

E(X_(1)) = 1/n,    E(X_(2)) = 1/n + 1/(n − 1),

whence, by Lemma (4.3.4),

E(X_(n)) = 1/n + 1/(n − 1) + ··· + 1,

and therefore

lim_{n→∞} E(X_(n) − log n) = lim_{n→∞} ( 1/n + 1/(n − 1) + ··· + 1 − log n ) = γ.
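A Monte Carlo illustration of both limits (the limiting distribution function exp(−e^{−x}) and the appearance of Euler's constant) is below; it is a sketch only, with arbitrary values of n and the sample size.

    import numpy as np

    # Problem (4.14.46): for n independent Exp(1) variables, the recentred
    # maximum X_(n) - log n is approximately Gumbel for large n; its mean
    # approaches Euler's constant 0.5772...
    rng = np.random.default_rng(6)
    n, N = 500, 20_000
    M = rng.exponential(size=(N, n)).max(axis=1) - np.log(n)
    print(M.mean())                                   # about 0.5772
    print((M <= 1.0).mean(), np.exp(-np.exp(-1.0)))   # limit d.f. at x = 1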

47. By the argument presented in Section 4. 1 1 , conditional on acceptance, X has density function Is . You might use this method when Is is itself particularly tiresome or expensive to calculate. If a (x) and b(x) are easy to calculate and are close to Is, much computational effort may be saved.

48. M = max{Ul , U2 , . . . , Uy } satisfies

y et - 1 JP>(M ':::: t) = lE(t ) = -- . e - l


Thus, lP'(Z � z) = lP'(X � LzJ + 2) + lP'(x = LzJ + 1 , Y � LzJ + 1 - z)

(e - l )e- LzJ -2 L J 1 e LZJ +I -Z - 1 =

1 _ e-1 + (e - l )e- Z - .

e _ 1 = e-z

49. (a) Y has density function e^{−y} for y > 0, and X has density function f_X(x) = ae^{−ax} for x > 0. Now Y ≥ ½(X − a)² if and only if

e^{−Y} ≤ e^{−½(X − a)²},

which is to say that f(X) ≥ a′ f_X(X) e^{−Y}, where a′ = a^{−1} e^{½a²} √(2/π) and f(x) = √(2/π) e^{−½x²} for x > 0. Recalling the argument of Example (4.11.5), we conclude that, conditional on this event occurring, X has density function f.
(b) The number of rejections is geometrically distributed with mean a′ − 1, so the optimal value of a is that which minimizes a^{−1} e^{½a²} √(2/π), that is, a = 1.
(c) Setting

Z = { +X with probability ½, −X with probability ½ },   conditional on Y ≥ ½(X − a)²,

we obtain a random variable Z with the N(0, 1) distribution.
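The sampler of parts (a)-(c), with the optimal choice a = 1, takes only a few lines of Python; the sketch below (with an illustrative helper function of our own naming) shows the method rather than forming part of the solution.

    import numpy as np

    # Problem (4.14.49): rejection sampling for N(0,1) from an Exp(1)
    # proposal.  Draw X, Y independent Exp(1); accept X when
    # Y >= (X - 1)^2 / 2, then attach a random sign.
    rng = np.random.default_rng(7)

    def standard_normal_sample(size):
        out = []
        while len(out) < size:
            x, y = rng.exponential(), rng.exponential()
            if y >= 0.5 * (x - 1.0) ** 2:
                out.append(x if rng.random() < 0.5 else -x)
        return np.array(out)

    Z = standard_normal_sample(100_000)
    print(Z.mean(), Z.var())      # close to 0 and 1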

50. (a) JE(X) = fol �du = 11:/4.

2 fot1r (b) JE(Y) = - sin 0 dO = 2/11: . 11: 0 51. You are asked to calculate the mean distance of a randomly chosen pebble from the nearest collection point. Running through the cases, where we suppose the circle has radius a and we write P for the position of the pebble,

(i)

(ii)

(iii)

(iv)

1 fo21f foa 2 2a JEIOPI = -2 r dr dO = -. 11: a 0 0 3

2 fot1r fo2a cOS 9 32a JEIAPI = -2 r2 dr dO = -. 11: a 0 0 911: 4 [fo!1r foa sec 9 ht1r fo2a COS 9 ]

JE( IAPI /\ IBPI ) = -2 r2 dr dO + r2 dr dO 11: a 0 0 !1r 0 = - - - -./2+ - log(1 +./2) ::::::: - x 1 . 1 3 .

4a { 1 6 17 1 } 2a 311: 3 6 2 3

6 {foj1r foX �t1r fo2a COS 9 } JE( IAPI /\ IBPI /\ ICPI ) = -2 r2 dr dO + r2 dr dO 11: a 0 0 j1r 0

where x = a sin(111:) cosec(j11: - 0) 2a {lot1r 1 11: �t1r

} = - -3..J3cosec3 (- + 0) dO + 8 cos3 0 dO 11: 0 8 3 j1r

= 2a { 1 6 - �..J3 + 3..J3 10g � } ::::::: 2a x 0.67. 11: 3 4 16 2 3


52. By Problem (4. 14.4), the displacement of R relative to P is the sum of two independent Cauchy random variables. By Exercise (4.8 .2), this sum has also a Cauchy distribution, and inverting the transformation shows that e is uniformly distributed.

53. We may assume without loss of generality that R has length 1. Note that Δ, the event that the pieces form a triangle, occurs if and only if the sum of any two parts exceeds the length of the third part.
(a) If the breaks are at X, Y, where 0 < X < Y < 1, then Δ occurs if and only if 2Y > 1, and 2(Y − X) < 1, and 2X < 1. These inequalities are satisfied with probability ¼.
(b) The length X of the shorter piece has density function f_X(x) = 2 for 0 ≤ x ≤ ½. The other pieces are of length (1 − X)Y and (1 − X)(1 − Y), where Y is uniform on (0, 1). The event Δ occurs if and only if 2Y < 2XY + 1 and X + Y − XY > ½, and this has probability

2 ∫_0^{½} { 1/(2(1 − x)) − (1 − 2x)/(2(1 − x)) } dx = log(4/e).

(c) The three lengths are X, ½(1 − X), ½(1 − X), where X is uniform on (0, 1). The event Δ occurs if and only if X < ½.
(d) This triangle is obtuse if and only if

X > (1 − X)/√2,

which is to say that X > √2 − 1. Hence,

P(obtuse | Δ) = P(√2 − 1 < X < ½) / P(X < ½) = 3 − 2√2.
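Parts (a) and (d) are easily checked by simulation; the Python sketch below (illustration only) estimates both probabilities.

    import numpy as np

    # Problem (4.14.53): (a) two uniform breaks give a triangle with
    # probability 1/4; (d) in the folding scheme of part (c), the triangle
    # is obtuse, given that it exists, with probability 3 - 2*sqrt(2).
    rng = np.random.default_rng(8)
    N = 500_000

    # part (a)
    u, v = rng.random((2, N))
    x, y = np.minimum(u, v), np.maximum(u, v)
    a, b, c = x, y - x, 1 - y
    tri = (a < b + c) & (b < a + c) & (c < a + b)
    print(tri.mean(), 0.25)

    # part (d): pieces X, (1-X)/2, (1-X)/2 with X uniform on (0,1)
    X = rng.random(N)
    triangle = X < 0.5
    obtuse = X > np.sqrt(2) - 1
    print((obtuse & triangle).sum() / triangle.sum(), 3 - 2 * np.sqrt(2))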

54. The shorter piece has density function Ix (x) = 2 for 0 ::: x ::: � . Hence,

lP'(R < r) = lP' -- < r = --( X ) 2r - I - X - l + r '

with density function I R (r) = 2/ ( 1 - r)2 for 0 ::: r ::: 1 . Therefore,

101 101 1 r lE(R) = lP'(R > r) dr = -- dr = 2 10g 2 - 1 , o 0 1 + r 101 101 2r ( 1 - r) lE(R2) = 2rlP'(R > r) dr = dr = 3 - 4 10g 2,

o 0 1 + r

and var(R) = 2 - (2 log 2)2 . 55. With an obvious notation,

By a natural re-scaling, we may assume that a = 1 . Now, Xl - X2 and YI - Y2 have the same triangular density symmetric on (- 1 , 1 ) , whence (Xl - X2)2 and (Yl - Y2)2 have distribution


1 function F(z) = 2...;z - Z and density function fz(z) = z- ! - 1 , for 0 ::: Z ::: 1 . Therefore R2 has the density f given by { r (_

1 _ 1) (_1 _ 1) dz 10 ...;z � f(r) = 1 1 (_

1 - 1) (_1

- 1) dz

The claim follows since

r- l ...;z �

lb l id 2 ( . - 1 � . -1 l) ---- z = sm - - sm -

a ...;z � r r

if 0 ::: r ::: 1 ,

if 1 ::: r ::: 2.

for O ::: a ::: b ::: 1 .

56. We use an argument similar to that used for Buffon's needle. Dropping the paper at random amounts to dropping the lattice at random on the paper. The mean number of points of the lattice in a small element of area dA is dA. By the additivity of expectations, the mean number of points on the paper is A. There must therefore exist a position for the paper in which it covers at least r A 1 points.

57. Consider a small element of surface d S. Positioning the rock at random amounts to shining light at this element from a randomly chosen direction. On averaging over all possible directions, we see that the mean area of the shadow cast by d S is proportional to the area of d S. We now integrate over the surface of the rock, and use the additivity of expectation, to deduce that the area A of the random shadow satisfies lE(A) = cS for some constant C which is independent of the shape of the rock. By considering the special case of the sphere, we find C = ! . It follows that at least one orientation of

the rock gives a shadow of area at least ! S.

58. (a) We have from Problem (4. 14. 1 1b) that Yr = Xr/(Xl + . . . + Xr ) is independent of Xl + . . . + Xr , and therefore of the variables Xr+l , Xr+2 , ' " , Xk+1 ' Xl + . . . + Xk+1 ' Therefore Yr is independent of {Yr+s : s � I } , and the claim follows. (b) Let S = Xl + . . . + Xk+l ' The inverse transformation Xl = Z lS , X2 = Z2S , . . . , Xk = ZkS, xk+1 = S - Zl S - Z2S - . . . - ZkS has Jacobian

S 0 0 Z l 0 S 0 Z2

J = = sk

.

0 0 0 Zk -s -s -s 1 - Z l - . . . - Zk

The joint density function of Xl , X2 , . . . , Xk o S is therefore (with C1 = E�;!;� fJr ) , { IT j.}r (zrs)fJr- l e-'Azrs } . j.}k+1 (s ( 1 - Z l - . . . - Zk) }fJk+l - l e-'As(1 -z 1 -" '-Zk)

r=l r (fJr ) r (fJk+d k

= f('A , {j, s) (IT zfr-l) (1 - Z l - . . . - Zk )fJk+l - l , r=l

where f is a function of the given variables. The result follows by integrating over s .

59. Let C = (crs ) be an orthogonal n x n matrix with Cni = 1/.;ri for 1 ::: i ::: n . Let Yir = E�=l XisCrs , and note that the vectors Yr = (Ylr > Y2r > . . . , Ynr ) , 1 ::: r ::: n, are multivariate normal. Clearly lEYir = 0, and

lE(Yir Yjs ) = L CrtCsulE(XitXju ) = L Crtcsu8tu Vij = L CrtCst Vjj = 8rs vjj , t , u t , u t


where 8tu is the Kronecker delta, since C is orthogonal. It follows that the set of vectors Y r has the same joint distribution as the set of Xr . Since C is orthogonal, Xir = E�=1 Csr Yis ' and therefore

1 1 1 Sij = E CsrCtr Yis Yjt - - E Xir E Xjr = E 8st Yis Yjt - r.:: E Xir r.:: E Xjr r,s , t n r r s , t "I n r "I n r

n- l = E Yis Yjs - Yin Yjn = E YiS YjS '

s s=1

This has the same distribution as 1ij because the Y r and the Xr are identically distributed.

60. We sketch this. Let lE/PQR/ = m (a), and use Crofton's method. A point randomly dropped in S(a + da) lies in S(a) with probability

Hence

( a ) 2 2da -- = 1 - - + o(da) . a + da a

dm 6m 6mb - = - - + -- , da a a

where mb (a) is the conditional mean of /PQR/ given that P is constrained to lie on the boundary of S (a). Let b(x) be the conditional mean of /PQR/ given that P lies a distance x down one vertical edge.

x

P 1--------;

R2

By conditioning on whether Q and R lie above or beneath P we find, in an obvious notation, that

By Exercise (4. 1 3 .6) (see also Exercise (4. 1 3 .7» , mR" R2 = ! (!a) (!a) = la2 . In order to find mR" we condition on whether Q and R lie in the triangles Tl or T2 , and use an obvious notation.

Recalling Example (4. 13 .6), we have that m T, = m T2 = iT . !ax . Next, arguing as we did in that example,

1 4 { 1 I I } m T" T2 = :1 . lJ ax - 4ax - 4ax - gax .

Hence, by conditional expectation,

1 4 1 1 4 1 1 4 3 13 m R, = 4 . 'l"i . :1ax + 4 . 'l"i . :1 ax + :1 . lJ . gax = TIlirax .

We replace x by a - x to find m R2 ' whence in total

_ (:,) 2 1 3ax (a - X ) 2 13a (a - x) 2x (a - x) . a2 _ � 2 _ 12ax 12x2 b (x) - a lOS + a lOS + a2 S - lOS

a lOS + lOS .


Since the height of P is uniformly distributed on [0, a], we have that

We substitute this into the differential equation to obtain the solution m (a) = tka2 . Turning to the last part, by making an affine transfonnation, we may without loss of generality

take the parallelogram to be a square. The points form a convex quadrilateral when no point lies inside the triangle formed by the other three, and the required probability is therefore 1 - 4m (a) I a2 = 1 - & = � . 61. Choose four points P, Q, R, S uniformly at random inside C, and let T be the event that their convex hull is a triangle. By considering which of the four points lies in the interior of the convex hull of the other three, we see that lP'(T) = 4lP'(S E PQR) = 4lEIPQRI/ I C I . Having chosen P, Q, R, the four points form a triangle if and only if S lies in either the triangle PQR or the shaded region A. Thus, lP'(T) = { I A I + lEIPQRI }/ I C I , and the claim follows on solving for lP'(T). 62. Since X has zero means and covariance matrix I, we have that lE(Z) = /L + lE(X)L = /L, and the covariance matrix of Z is lE(L'X'XL) = L'IL = V. 63. Let D = (dij ) = AD - C. The claim is trivial if D = 0, and so we assume the converse. Choose i , k such that dik i= 0, and write Yi = 'L,}=l dij Xj = S + dikXk . Now lP'(Yi = 0) = lE (lP'(Xk = -S I dik I S») . For any given S, there is probability at least � that Xk i= -S I dik ' and the second claim follows.

Let xl , X2 , . . . , Xm be independent random vectors with the given distribution. If D i= 0, the probability that Dxs = ° for 1 .::: s .::: m is at most ( � )m , which may be made as small as required by choosing m sufficiently large.


5

Generating functions and their applications

5.1 Solutions. Generating functions

1. (a) If l s i < (1 _ p)- l ,

G (s) = [;sm (n +: - 1) pn ( 1 _ p)m = { 1 - S� _ p)}n .

Therefore the mean is G'(1) = n ( 1 - p)lp. The variance is G"(1) +G'( 1 ) -G'(1 )2 = n(1 - p)lp2. (b) If l s i < 1 ,

G(s) = f sm (� _ _ 1 _) = 1 + ( 1 - S ) log( 1 - s) . m=l m m + 1 s

Therefore G' ( 1 ) = 00, and there exist no moments of order 1 or greater.

(c) If p < l s i < p-l ,

G(s) = f sm ( 1 - P) p lm l = 1 - P { I + � + pis } . 1 + p 1 + p 1 - sp 1 - (pis) m=-oo

The mean is G'(1 ) = 0, and the variance is G"(1) = 2p(1 - p)-2 . 2. (i) Either hack it out, or use indicator functions fA thus:

00 ( 00 ) (X-l

) ( 1 -X) I - G( ) T (s) = ?; snp(X > n) = lE ?; sn f{n<XJ = lE ?; sn = lE 1 _

ss = 1 _ s

S .

(ii) It follows that

T( 1 ) = lim { 1 - G(S) } = lim

G'(s) = G'(1) = lE(X) st l 1 - s st l 1

by L'Hopital's rule. Also,

T'( 1 ) = lim { - (1 - s)G' (s) + 1 - G(S) } st l (1 - s)2 = �G"( I ) = Hvar(X) - G'( 1 ) + G' ( 1 )2 }


whence the claim is immediate.

3. (i) We have that Gx, Y (s , t) = lE(sxty) , whence Gx,Y (s , 1) = Gx (s) and GX, y ( 1 , t) = Gy (t) . (ii) If lE lXY I < 00 then

lE(XY) = lE (XYSX-1 tY- 1 ) I = --Gx,Y (s , t) . a2

I 8=t=1 as at

4. We write G (s , t) for the joint generating function.

(a) 00 i

G (s , t) = :E :�::>i tk ( 1 - a)(f3 - a)ai f3k-j-1 j=O k=O

8=t=1

= f= (as ) j (1 - a)(f3 - a) . 1 - (f3t)i+1

j=O f3 f3 1 - f3t if f3 l t l < 1

( 1 - a)(f3 - a) { 1 f3t } = ( 1 - f3t)f3 1 - (as/f3) -

1 - ast ( 1 - a)(f3 - a)

( 1 - ast)(f3 - as)

(the condition a l s t l < 1 is implied by the other two conditions on s and t). The marginal generating functions are

( 1 - a)(f3 - a) 1 - a G (s , 1) = (1 ) (f3 ) ' G ( 1 , t) = -

1 - ,

- as - as - at and the covariance is easily calculated by the conclusion of Exercise (5. 1 .3) as a(1 - a)-2 . (b) Arguing similarly, we obtain G(s , t) = (e - l)/{e(1 - te8-2) } if I t l e8-2 < 1 , with marginals

and covariance e(e - 1)-2 . (c) Once again,

1 - e- 1 1 - e- 1 G (s , 1) = 2 ' G ( 1 , t) = l ' 1 - e8- 1 - te-

log{ l - tp( 1 - P + sp) } G(s , t) = log( 1 _ p) if I tp ( 1 - p + sp) 1 < 1 .

The marginal generating functions are

G(s , 1) = log{ 1 - p + p2 ( 1 - s) } ,

log(1 - p)

and the covariance is

G( l , t) = log( 1 - tp) , log( 1 - p)

p2 {p + log( 1 - p)} ( 1 - p)2 {log( 1 - p)}2 ·

5. (i) We have that


where p + q = 1 . (ii) More generally, if each toss results in one of t possible outcomes, the i th of which has probability Pi , then the corresponding quantity is a function of t variables, Xl , X2 , . . . , xI , and is found to be (PlXl + P2X2 + . . . + ptXt )n . 6. We have that

101 1 1 sn+1

JE(sX) = JE{JE(sX I U)} = ( 1 + u (s - l) }n du = --1 ' � , o n + - s

the probability generating function of the uniform distribution. See also Exercise (4.6.5). 7. We have that

GX, Y,Z (X , y , z) = G (x , y , z , 1 ) = i (xyz + xy + yz + zx + x + y + z + 1) = ! (x + 1 ) 1 (y + 1 ) 1 (z + 1) = Gx (x)Gy (y)Gz (z) ,

whence X, Y, Z are independent. The same conclusion holds for any other set of exactly three random variables. However, G(x , y , z , w) i= Gx (x)Gy (y)Gz (z)Gw (w) . 8. (a) We have by differentiating that JE(X2n) = 0, whence lP(X = 0) = 1 . This is not a moment generating function. (b) This is a moment generating function if and only if Lr Pr = 1 , in which case it is that of a random variable X with lP(X = ar) = Pr . 9. The coefficients of sn in both combinations of G l o G2 are non-negative and sum to 1 . They are therefore probability generating functions, as is G(as)/G (a) for the same reasons.

5.2 Solutions. Some applications

1. Let G(s) = JE(sx) and Gs(s) = Lj=o sj Sj . By the result of Exercise (5. 1 .2),

Now,

so that

� � k s ( 1 - G(s» 1 - sG(s) T (s) = L smlP(X � m) = I + s L s lP(X > k) = I + = . m=O k=O 1 - s 1 - s

T(s) - T(O) Gs(s - 1) - Gs (O) s s - 1

where we have used the fact that T (0) = G s (0) = 1 . Therefore

n n L si- llP(X � i) = L (s - l)j- l Sj . i=l j=l

Equating coefficients of si- l , we obtain as required that

P(X ≥ i) = Σ_{j=i}^{n} (−1)^{j−i} \binom{j − 1}{i − 1} S_j,    1 ≤ i ≤ n.


Similarly, Gs(s) - Gs (O) T(1 + s) - T(O)

s 1 + s whence the second fonnula follows.

2. Let Ai be the event that the ith person is chosen by nobody, and let X be the number of events A I , A2 , " " An which occur. Clearly

( n - j ) j (n - j - 1 ) n- j

JP(Ai! n Ai2 n · · · n Ai · ) = --1 '} n - n - l

if i I #: i2 ::f; . . . ::f; i j , since this event requires each of i i , . . . , i j to choose from a set of n - j people, and each of the others to choose from a set of size n - j - 1 . Using Waring's Theorem (Problem ( 1 .8. 13) or equation (5.2. 14» ,

where

JP(X = k) = t(- I)j-k (j) Sj

. k k J =

Sj = (�) (n - j ) j (n - j - 1) n- j .

J n - l n - l Using the result of Exercise (5 .2. 1 ),

1 ::: k ::: n ,

while JP(X ?: 0) = 1 . 3. (a)

lE(xX+Y) = lE{lE(xX+Y I Y) } = lE{xYeY(x- I) } = lE{ (xeX-ll } = exp{t-t(xeX- 1 - I ) } .

(b) The probability generating function of X I is

� (s ( 1 - p)}k 10g{ 1 - s ( 1 - p)} G (s) = L.J = . k= 1 k log( 1 /p) log p

Using the 'compounding theorem' (5 . 1 .25),

4. Clearly,

Gy (s) = GN (G (S» = eJJ-(G(s)- I ) = p . ( ) -JJ-/ IOg p 1 - s ( 1 - p)

lE -- = lE tX dt = lE(tX) dt = (q + pt)n dt = - q ( 1 ) (10 1 )

10 1 10 1 1 n+1 1 + X 0 0 0 pen + 1 )

where q = 1 - p. In the limit,

( 1 ) 1 - ( 1 - 'A/n)n+1 1 - e-').. lE 1 + X = 'A(n + 1)/n + 0( 1 ) -+ 'A '


the corresponding moment of the Poisson distribution with parameter A.

5. Conditioning on the outcome of the first toss, we obtain h_n = qh_{n−1} + p(1 − h_{n−1}) for n ≥ 1, where q = 1 − p and h₀ = 1. Multiply throughout by s^n and sum to find that H(s) = Σ_{n=0}^{∞} s^n h_n satisfies H(s) − 1 = (q − p)sH(s) + ps/(1 − s), and so

H(s) = (1 − qs) / [ (1 − s){1 − (q − p)s} ] = ½ { 1/(1 − s) + 1/(1 − (q − p)s) }.
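Extracting coefficients gives h_n = ½{1 + (q − p)^n}. The short Python sketch below compares the recursion with this closed form; it assumes (consistently with the recursion and h₀ = 1) that h_n is the probability of an even number of heads in n tosses.

    # Exercise (5.2.5) illustration: iterate h_n = q h_{n-1} + p (1 - h_{n-1})
    # and compare with the closed form (1 + (q - p)^n) / 2.
    p = 0.3
    q = 1 - p
    h = 1.0
    for n in range(1, 11):
        h = q * h + p * (1 - h)
        print(n, h, (1 + (q - p) ** n) / 2)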

6. By considering the event that HTH does not appear in n tosses and then appears in the next three, we find that JP(X > n)p2q = JP(X = n + l)pq + JP(X = n + 3) . We multiply by sn+3 and sum over n to obtain

1 - JE(sx) _-----'-_p2qs3 = pqs2JE(sX) + JE(sx) , l - s which may be solved as required. Let Z be the time at which THT first appears, so Y = min{X , Z}. By a similar argument,

JP(Y > n)p2q = JP(X = Y = n + l)pq + JP(X = Y = n + 3) + JP(Z = Y = n + 2)p, JP(Y > n)q2 p = JP(Z = Y = n + l)pq + JP(Z = Y = n + 3) + JP(X = Y = n + 2)q .

We multipy by sn+l , sum over n, and use the fact that JP(Y = n) = JP(X = Y = n) + JP(Z = Y = n). 7. Suppose there are n + 1 matching letter/envelope pairs, numbered accordingly. Imagine the envelopes lined up in order, and the letters dropped at random onto these envelopes. Assume that exactly j + 1 letters land on their correct envelopes. The removal of any one of these j + 1 letters, together with the corresponding envelope, results after re-ordering in a sequence of length n in which exactly j letters are correctly placed. It is not difficult to see that, for each resulting sequence of length n, there are exactly j + 1 originating sequences of length n + 1 . The first result follows. We multiply by si and sum over j to obtain the second. It is evident that G 1 (s) = s . Either use induction, or integrate repeatedly, to find that Gn (s) = E�=o (s - or Ir ! . 8. We have for Is I < JL + 1 that

JE(SX) = JE{JE(sx I A) } = JE(eA(s- l ) = JL

= � f (_S _) k JL - (s - 1) JL + 1 k=O JL + 1

9. Since the waiting times for new objects are geometric and independent,

E(s^T) = s · ( 3s/(4 − s) ) · ( s/(2 − s) ) · ( s/(4 − 3s) ).

Using partial fractions, the coefficient of s^k is (3/32){ ½(¼)^{k−4} − 4(½)^{k−4} + (9/2)(¾)^{k−4} }, for k ≥ 4.
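As a check, one can simulate the collection of the four objects and compare the empirical distribution of T with these coefficients; a small Python sketch follows (illustration only).

    import numpy as np

    # Exercise (5.2.9): T is the number of draws needed to collect all four
    # equally likely objects.  Compare simulation with the coefficient of
    # s^k obtained by partial fractions.
    rng = np.random.default_rng(9)
    N = 50_000
    counts = {}
    for _ in range(N):
        seen, t = set(), 0
        while len(seen) < 4:
            seen.add(int(rng.integers(4)))
            t += 1
        counts[t] = counts.get(t, 0) + 1
    for k in range(4, 10):
        exact = (3/32) * (0.5*0.25**(k-4) - 4*0.5**(k-4) + 4.5*0.75**(k-4))
        print(k, counts.get(k, 0) / N, exact)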

5.3 Solutions. Random walk

1. Let A_k be the event that the walk ever reaches the point k. Then A_k ⊇ A_{k+1} if k ≥ 0, so that

P(M ≥ r) = P(A_r) = P(A₀) ∏_{k=0}^{r−1} P(A_{k+1} | A_k) = (p/q)^r,    r ≥ 0,



since P(A_{k+1} | A_k) = P(A₁ | A₀) = p/q for k ≥ 0, by Corollary (5.3.6).
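For p < q the formula is simple to verify by simulating long walks; a Python sketch (truncating each walk at a fixed large number of steps) is given below as an illustration only.

    import numpy as np

    # Exercise (5.3.1): for a simple random walk with p < q = 1 - p started
    # at 0, the maximum M satisfies P(M >= r) = (p/q)^r.
    rng = np.random.default_rng(10)
    p, steps, N = 0.3, 1000, 10_000
    increments = np.where(rng.random((N, steps)) < p, 1, -1)
    M = np.maximum(np.cumsum(increments, axis=1).max(axis=1), 0)
    for r in range(1, 5):
        print(r, (M >= r).mean(), (p / (1 - p)) ** r)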

2. (a) We have by Theorem (5.3 . 1c) that


00 2 00 ", 2k I s 2 ", 2k L...J s 2kfo (2k) = s Fo (s) = � = s IPo (s) = L...J s po (2k - 2) , k=l V I - s- k=l

and the claim follows by equating the coefficients of s2k . (b) It is the case that an = IP(SI S2 ' . . S2n =1= 0) satisfies

00

an = L fo(k) , k=2n+2 k even

with the convention that ao = 1 . We have used the fact that ultimate return to 0 occurs with probability 1 . This sequence has generating function given by

00 00 00 !k- l L s2n L fo(k) = L fo(k) L s2n n=O k=2n+2 k=2 n=O

k even k even _ 1 - Fo (s) _ 1

by Theorem (5 .3. 1c) - l - s2 - � 00

= IPo(s) = L s2nIP(S2n = 0) . n=O

Now equate the coefficients of s2n . (Alternatively, use Exercise (5. 1 .2) to obtain the generating function of the an directly.)

3. Draw a diagram of the square with the letters ABCD in clockwise order. Clearly PAA (m) = 0 if m is odd. The walk is at A after 2n steps if and only if the numbers of leftward and rightward steps are equal and the numbers of upward and downward steps are equal. The number of ways of choosing 2k horizontal steps out of 2n is (�) . Hence

with generating function

Writing FA (s) for the probability generating function of the time T of first return, we use the argument which leads to Theorem (5.3 . 1 a) to find that GA(S) = 1 + FA(S)GA(S) , and therefore FA(S) = 1 - GA(S)- l .

4. Write (Xn , Yn) for the position of the particle at time n. It is an elementary calculation to show that the relations Un = Xn + Yn , Vn = Xn - Yn define independent simple symmetric random walks

U and V. Now T = min{n : Un = m}, and therefore G T (S) = {s- l ( 1 - �) }m for l s i ::: 1 by Theorem (5 .3.5).


Now X - Y = VT , so that

where we have used the independence of U and V. This converges if and only if I ! (s + s - 1 ) I � 1 , which is to say that s = ± l . Note that GT (S) converges in a non-trivial region of the complex plane.

S. Let T be the time of the first return of the walk S to its starting point O. During the time-interval (0, T), the walk is equally likely to be to the left or to the right of 0, and therefore

{ T R + L' if T � 2n , L2n = 2nR if T > 2n ,

where R is Bernoulli with parameter ! , L' has the distribution of L2n-T , and R and L' are independent. It follows that G2n (S) = JE(sL2n ) satisfies

n G2n (S) = L ! ( 1 + s2k)G2n_2k (s)f(2k) + L ! ( 1 + s2n)f(2k)

k=1 k>n

where f(2k) = lP'(T = 2k) . (Remember that L2n and T are even numbers.) Let H(s, t) I:�o t2nG2n (S) . Multiply through the above equation by t2n and sum over n, to find that

H(s, t) = !H(s , t) (F(t) + F(st} } + ! (J (t) + J (st} }

where F(x) = I:�o x2k f(2k) and

00 1 J (x) = L x2n L f(2k) = �'

n=O k>n 1 x Ix l < 1 ,

by the calculation in the solution to Exercise (5 .3 .2). Using the fact that F(x) = 1 - \11 - x2, we deduce that H (s , t) = 1 / vi ( 1 - t2) ( 1 - s2t2) . The coefficient of s2kt2n is

6. We show that all three terms have the same generating function, using various results established for simple symmetric random walk. First, in the usual notation,

00 2 � 2m ' 2s L...J 4mlP'(S2m = O)s = 2s PO (s) = 2 3/2 ' m=O ( 1 - s )

Secondly, by Exercise (5 .3 .2a, b),

m m JE(T 1\ 2m) = 2mlP'(T > 2m) + L 2kfo (2k) = 2mlP'(S2m = 0) + L lP'(S2k-2 = 0) .

k=1 k=1




Hence,


00 2m s2 Po (s) I 182 L s lE(T 1\ 2m) = --2- + s PO(s) = 2 3/2 · m=O 1 - s ( 1 - s )

Finally, using the hitting time theorem (3 . 10. 14), (5.3.7), and some algebra at the last stage,

7. Let In be the indicator of the event {Sn = OJ, so that Sn+l = Sn + Xn+l + In . In equilibrium, lE(So) = lE(So) + lE(Xl ) + lE(lo) , which implies that lP'(So = 0) = lE(lo) = -lE(Xl ) and entails lE(Xl ) :::; O. Furthermore, it is impossible that lP'(So = 0) = 0 since this entails lP'(So = a) = 0 for all a < 00. Hence lE(X 1 ) < 0 if S is in equilibrium. Next, in equilibrium,

Now,

Hence

lE{ zSn+Xn+! +In ( 1 - In) } = lE(zSn I Sn > O)lE(zX! )lP'(Sn > 0) lE(zSn+Xn+! +In In) = zlE(zX! )lP'(Sn = 0) .

lE(ZSO ) = lE(zX! ) [{lE(zSO ) - lP'(So = O) } + zlP'(So = 0)] which yields the appropriate choice for lE(zSO) . 8. The hitting time theorem (3 . 10. 14), (5.3 .7), states that lP'(TOb = n) = ( lb l jn)lP'(Sn = b) , whence

b lE(TOb I TOb < (0) = ( L lP'(Sn = b) . lP' TOb < (0) n

The walk is transient if and only if p ¥= � , and therefore lE(TOb I TOb < (0) < 00 if and only if

p ¥= ! . Suppose henceforth that p ¥= ! . The required conditional mean may be found by conditioning on the first step, or alternatively

as follows. Assume first that p < q , so that lP'(TOb < (0) = (pjq)b by Corollary (5.3.6). Then L:n lP'(Sn = b) is the mean of the number N of visits of the walk to b. Now

lP'(N = r) = (�)bpr- l ( l _ p) , r � l ,

where p = lP'(Sn = 0 for some n � 1) = l - Ip - q l . Therefore lE(N) = (pjq)b j lp - q l and

b (pjq)b lE(TOb I TOb < (0) = (pjq)b

. I p _ q l ·

We have when p > q that lP'(TOb < (0) = 1 , and lE(TOl ) = (p - q)- l . The result follows from the fact that lE(TOb) = blE(Tol ) .


5.4 Solutions. Branching processes

1. Clearly 1&(Zn I Zm) = ZmtLn-m since, given Zm , Zn is the sum of the numbers of (n - m)th generation descendants of Zm progenitors. Hence 1&(ZmZn I Zm) = Z�tLn-m and 1&(ZmZn) =

1&{1&(ZmZn I Zm)} = 1&(Z� )tLn-m . Hence

cov(Zm , Zn) = tLn-m1&(Z� ) - 1&(Zm)1&(Zn ) = tLn-m var(Zm) ,

and, by Lemma (5.4.2),

2. Suppose 0 .:::: r .:::: n, and that everything is known about the process up to time r. Conditional on this information, and using a symmetry argument, a randomly chosen individual in the nth generation has probability 1/Zr of having as rth generation ancestor any given member of the rth generation. The chance that two individuals from the nth generation, chosen randomly and independently of each other, have the same rth generation ancestor is therefore 1 /Zr • Therefore

IP(L < r) = 1&{IP(L < r I Zr) } = 1&(1 - Z;l )

and so

O ':::: r < n .

If 0 < IP(ZI = 0) < 1 , then almost the same argument proves that IP(L = r I Zn > 0) =

TJr - TJr+l for 0 .:::: r < n, where TJr = 1&(Z;l I Zn > 0) . 3. The number Zn of nth generation decendants satisfies

whence, for n � 1 ,

if p = q ,

if p #- q , { 1 n (n + 1 )

IP(T = n) = IP(Zn = 0) - IP(Zn- l = 0) = n-l n 2 p q (p - q)

It follows that 1&(T) < 00 if and only if p < q. 4. (a) As usual,

if p = q ,

if p #- q .

This suggests that Gn (s) = 1 - a 1+p+· ·+pn- l ( 1 - s )pn for n � 1 ; this formula may be proved easily by induction, using the fact that Gn (s) = G(Gn-l (s)) . (b) As in the above part (a),


where P2 (s) = P(P(s)) . Similarly Gn (s) = j- I (Pn (f (s))) for n 2: I , where Pn (s) = P(Pn- I (S) ) . (c) With P(s) = as/{ I - (I - a)s} where a = y- l , i t i s an easy exercise to prove, by induction, that Pn (s) = an s/ { I - ( 1 - an )s } for n 2: 1 , implying that

5. Let Zn be the number of members of the nth generation. The (n + I)th generation has size Cn+l + In+l where Cn+l is the number of natural offspring of the previous generation, and In+l is the number of immigrants. Therefore by the independence,

whence Gn+l (s) = JE(sZn+l ) = JE{G (s)Zn }H(s) = Gn (G (s))H(s) .

6. By Example (5.4.3),

Z n - (n - I)s n - 1 1 JE(s n ) = = -- + 2 I ' n 2: o. n + 1 - ns n n ( 1 + n- - s)

Differentiate and set s = 0 to find that

Similarly,

00 00 1 JE(VI ) = E IP'(Zn = 1) = E 2 = !n2 • n=O n=O (n + 1 )

00 n 00 1 00 1 I 2 00 1 JE(V2) = E (n + 1)3 = E (n + 1 )2 - E (n + 1)3 = (in - E (n + 1)3 ' n=O n=O n=O n=O

_ � n2 _ � (n + 1)2 - 2(n + 1) + 1 _ I 2 I 4 � 1 JE(V3) - L....- (n + 1)4 - L....- (n + 1 )4 - (in + 9Qn - 2 L....- (n + 1)3 . n=O n=O n=O

The conclusion is obtained by eliminating L:n (n + 1) -3 .

5.5 Solutions. Age-dependent branching processes

1. (i) The usual renewal argument shows as in Theorem (5 .5 . 1 ) that

Gt (s) = l G (Gt-u (s))fr (u) du + 100 sfr (u) du .

Differentiate with respect to t, to obtain

a r a at Gt (s) = G(Go(s))fr (t) + 10 at (G(Gt-u (s))} fr(u) du - sfr (t) ·

Now Go(s) = s, and a a at (G (Gt-u (s))} = - au (G(Gt-u (s)) } ,


so that, using the fact that h (u) = 'Ae-)...u if u � 0,

r � {G (Gt-u (s))} h (u) du = - [G(Gt-u (s» h (u)] t - 'A r G(Gt-u (s»fT (u) du , Jo at 0 Jo

having integrated by parts. Hence

:t Gt (s) = G(s)'Ae-)...t + { -G (s)'Ae-At + 'AG(Gt (s» } - 'A { Gt (s) -[X) sfT (u) dU } - s'Ae-At

= 'A {G(Gt (s» - Gt (s} } .

(ii) Substitute G(s) = s2 into the last equation to obtain

∂G_t/∂t = λ( G_t² − G_t ),

with boundary condition G₀(s) = s. Integrate to obtain λt + c(s) = log{1 − G_t^{−1}} for some function c(s). Using the boundary condition at t = 0, we find that c(s) = log{1 − G₀^{−1}} = log{1 − s^{−1}}, and hence G_t(s) = se^{−λt} / {1 − s(1 − e^{−λt})}. Expand in powers of s to find that Z(t) has the geometric distribution P(Z(t) = k) = (1 − e^{−λt})^{k−1} e^{−λt} for k ≥ 1.
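With exponential lifetimes and binary splitting, Z(t) is a pure birth (Yule) process with rate λk in state k, which makes the geometric law easy to check by simulation. A Python sketch follows (the helper function and parameter values are illustrative choices, not from the text).

    import numpy as np

    # Exercise (5.5.1)(ii): population size of a binary-splitting process
    # with Exp(lambda) lifetimes; P(Z(t) = k) = (1 - q)^{k-1} q, q = e^{-lambda t}.
    rng = np.random.default_rng(11)
    lam, t, N = 1.0, 1.2, 20_000

    def population_at(lam, t, rng):
        k, clock = 1, rng.exponential(1 / lam)   # time of the first split
        while clock <= t:
            k += 1
            clock += rng.exponential(1 / (lam * k))
        return k

    sizes = np.array([population_at(lam, t, rng) for _ in range(N)])
    q = np.exp(-lam * t)
    for k in range(1, 5):
        print(k, (sizes == k).mean(), (1 - q) ** (k - 1) * q)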

2. The equation becomes

with boundary condition Go(s) = s . This differential equation is easily solved with the result

Gt (s) = 2s + t ( 1 - s) =

4/t 2 + t ( 1 - s) 2 + t ( 1 - s)

2 - t

We pick out the coefficient of sn to obtain

and therefore

IP Z t - n - --- --4 ( t ) n

( ( ) - ) - t (2 + t) 2 + t ' n � 1 ,

00 4 ( t ) n 2 ( t ) k IP Z t > k - --- -- - - --( ( ) - ) - L t (2 + t) 2 + t - t 2 + t '

n=k

It follows that, for x > 0 and in the limit as t -+ 00,

IP(Z(t) > xt) ( t ) LxtJ - I ( 2) I - LxtJ IP(Z(t) � xt I Z (t) > 0) = IP(Z(t); 1) = 2 + t = 1 + t -+ e-2x •


5.6 Solutions. Expectation revisited

1. Set a = E(X) to find that u (X) 2: u (EX) + A(X - EX) for some fixed A. Take expectations to obtain the result. 2. Certainly Zn = E7=1 Xi and Z = E�l IXi I are such that I Zn I :::: Z, and the result follows by dominated convergence. 3. Apply Fatou's lemma to the sequence {-Xn : n 2: I } to find that

E (lim sup Xn) = -E (lim inf -Xn) 2: - lim inf E(-Xn) = lim sup E(Xn) . n�oo n�oo n�oo n�oo

4. Suppose that EIXr I < 00 where r > O. We have that, if x > 0,

xrJP'( IX I 2: x) :::: f ur dF(u) --+ 0 as x --+ 00, hx ,oo)

where F is the distribution function of I X I . Conversely suppose that xrJP'( IX I 2: x ) --+ 0 where r 2: 0 , and let 0 :::: s < r . Now E I Xs l =

limM-+oo foM US dF(u) and, by integration by parts,

The first term on the right-hand side is negative. The integrand in the second term satisfies sus -1JP'( 1 X I > u) :::: sus- I . u-r for all large u . Therefore the integral is bounded uniformly in M, as required. 5. Suppose first that, for all E > 0, there exists 8 = 8 (E) > 0, such that E( IX I IA) < E for all A satisfying JP'(A) < 8. Fix E > 0, and find x (> 0) such that JP'(IX I > x) < 8 (E) . Then, for y > x,

/:y

lu i dFx (u) :::: /:x

l u i dFx (u) + E ( IX I I£ IX I >x} ) :::: i: l u i dFx (u) + E .

Hence f!.y l u i dFx (u) converges as y --+ 00, whence E IX I < 00. Conversely suppose that E IX I < 00. It follows that E ( IX I 1nXI >y} ) --+ 0 as y --+ 00. Let E > 0,

and find y such that E ( IX I I£ IX I >y} ) < iE . For any event A, IA :::: IAnBC + IB where B = { IX I > y } . Hence

E( IX I IA) :::: E ( IX I IAnBc ) + E ( IX I IB) :::: yJP'(A) + iE . Writing 8 = E/(2y) , we have that E( IX I IA) < E if JP'(A) < 8 .

5.7 Solutions. Characteristic functions

1. Let X have the Cauchy distribution, with characteristic function r/l (s) = e- 1s l . Setting Y = X, we have that r/lx+y (t) = r/l (2t) = e-2l t l = r/lx (t)r/ly (t) . However, X and Y are certainly dependent. 2. (i) It is the case that Re{r/l (t) } = E(cos tX), so that, in the obvious notation,

Re{ l - r/l (2t) } = i: ( l - cos(2tx) } dF(x) = 2 i: ( I - cos(tx)}{ l + cos (tx) } dF(x)

:::: 4 i: (I - cos(tx) } dF(x) = 4 Re{ 1 - r/l (t) } .


(ii) Note first that, if X and Y are independent with common characteristic function cJ>, then X - Y has characteristic function

1fr (t) = JE(eitX )JE(e-itY ) = cJ> (t)cJ> (-t) = cJ> (t)cJ> (t) = 1cJ> (t) 1 2 .

Apply the result of part (i) to the function 1fr to obtain that 1 - 1cJ> (2t) 1 2 :::: 4( 1 - 1cJ> (t) 1 2) . However 1cJ> (t) 1 :::: 1 , so that

1 - 1cJ> (2t) 1 :::: 1 - 1cJ> (2t) 1 2 :::: 4(1 - 1cJ> (t) 1 2) :::: 8( 1 - 1cJ> (t) l ) .

3. (a) With mk = JE(Xk) , we have that

say, and therefore, for sufficiently small values of e , 00 ( W+l

Kx (e) = E - S (el . r=l r

Expand S (e? in powers of e , and equate the coefficients of e, e2, e3 , in turn, to find that kl (X) = m I ,

k2 (X) = m2 - mf , k3 (X) = m3 - 3m lm2 + 2mI ' (b) If X and Y are independent, Kx+y (e) = 10g{JE(eox)JE(eOY) } = Kx(e) + Ky (e) , whence the claim is immediate.

' 02 4. The N(O, 1) variable X has moment generating function JE(eox) = e 2 , so that Kx(e) = �e2. 5. (a) Suppose X takes values in L (a , b). Then

$$|\phi_X(2\pi/b)| = \Bigl|\sum_x e^{2\pi i x/b}\,\mathbb{P}(X=x)\Bigr| = |e^{2\pi i a/b}|\,\Bigl|\sum_m e^{2\pi i m}\,\mathbb{P}(X=a+bm)\Bigr| = 1$$
since only numbers of the form $x = a + bm$ make non-zero contributions to the sum.

Suppose in addition that $X$ has span $b$, and that $|\phi_X(T)| = 1$ for some $T \in (0, 2\pi/b)$. Then $\phi_X(T) = e^{ic}$ for some $c \in \mathbb{R}$. Now
$$\mathbb{E}\bigl(\cos(TX - c)\bigr) = \tfrac12\mathbb{E}\bigl(e^{iTX-ic} + e^{-iTX+ic}\bigr) = 1,$$
using the fact that $\mathbb{E}(e^{-iTX}) = \overline{\phi_X(T)} = e^{-ic}$. However $\cos x \le 1$ for all $x$, with equality if and only if $x$ is a multiple of $2\pi$. It follows that $TX - c$ is a multiple of $2\pi$, with probability 1, and hence that $X$ takes values in the set $L(c/T, 2\pi/T)$. However $2\pi/T > b$, which contradicts the maximality of the span $b$. We deduce that no such $T$ exists.
(b) This follows by the argument above.

6. This is a form of the 'Riemann-Lebesgue lemma'. It is a standard result of analysis that, for $\epsilon > 0$, there exists a step function $g_\epsilon$ such that $\int_{-\infty}^{\infty}|f(x) - g_\epsilon(x)|\,dx < \epsilon$. Let $\phi_\epsilon(t) = \int_{-\infty}^{\infty} e^{itx}g_\epsilon(x)\,dx$. Then
$$|\phi_X(t) - \phi_\epsilon(t)| \le \int_{-\infty}^{\infty}|f(x) - g_\epsilon(x)|\,dx < \epsilon.$$
If we can prove that, for each $\epsilon$, $|\phi_\epsilon(t)| \to 0$ as $t \to \pm\infty$, then it will follow that $|\phi_X(t)| < 2\epsilon$ for all large $t$, and the claim then follows.


Now $g_\epsilon(x)$ is a finite linear combination of functions of the form $cI_A(x)$ for reals $c$ and intervals $A$, that is $g_\epsilon(x) = \sum_{k=1}^{K} c_k I_{A_k}(x)$; elementary integration yields
$$\phi_\epsilon(t) = \sum_{k=1}^{K} c_k\,\frac{e^{itb_k} - e^{ita_k}}{it},$$
where $a_k$ and $b_k$ are the endpoints of $A_k$. Therefore
$$|\phi_\epsilon(t)| \le \frac{2}{|t|}\sum_{k=1}^{K}|c_k| \to 0, \qquad \text{as } t \to \pm\infty.$$

7. If $X$ is $N(\mu, 1)$, then the moment generating function of $X^2$ is
$$M_{X^2}(s) = \mathbb{E}(e^{sX^2}) = \int_{-\infty}^{\infty} e^{sx^2}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac12(x-\mu)^2}\,dx = \frac{1}{\sqrt{1-2s}}\exp\Bigl(\frac{\mu^2 s}{1-2s}\Bigr),$$
if $s < \frac12$, by completing the square in the exponent. It follows that
$$M_Y(s) = \prod_{j=1}^{n}\Bigl\{\frac{1}{\sqrt{1-2s}}\exp\Bigl(\frac{\mu_j^2 s}{1-2s}\Bigr)\Bigr\} = \frac{1}{(1-2s)^{n/2}}\exp\Bigl(\frac{s\theta}{1-2s}\Bigr).$$
It is tempting to substitute $s = it$ to obtain the answer. This procedure may be justified in this case using the theory of analytic continuation.

8. (a) $T^2 = X^2/(Y/n)$, where $X^2$ is $\chi^2(1; \mu^2)$ by Exercise (5.7.7), and $Y$ is $\chi^2(n)$. Hence $T^2$ is $F(1, n; \mu^2)$.
(b) $F$ has the same distribution function as
$$Z = \frac{(A^2 + B)/m}{V/n}$$
where $A$, $B$, $V$ are independent, $A$ being $N(\sqrt{\theta}, 1)$, $B$ being $\chi^2(m-1)$, and $V$ being $\chi^2(n)$. Therefore
$$\mathbb{E}(Z) = \frac{1}{m}\,\mathbb{E}(A^2 + B)\,\mathbb{E}\Bigl(\frac{n}{V}\Bigr) = \frac{1}{m}\bigl\{(1+\theta) + (m-1)\bigr\}\frac{n}{n-2} = \frac{n(m+\theta)}{m(n-2)},$$
where we have used the fact (see Exercise (4.10.2)) that the $F(r, s)$ distribution has mean $s/(s-2)$ if $s > 2$.

9. Let $\tilde X$ be independent of $X$ with the same distribution. Then $|\phi|^2$ is the characteristic function of $X - \tilde X$ and, by the inversion theorem,
$$\frac{1}{2\pi}\int_{-\infty}^{\infty}|\phi(t)|^2 e^{-itx}\,dt = f_{X-\tilde X}(x) = \int_{-\infty}^{\infty} f(y)f(x+y)\,dy.$$
Now set $x = 0$. We require that the density function of $X - \tilde X$ be differentiable at 0.


10. By definition,
$$e^{-ity}\phi_X(y) = \int_{-\infty}^{\infty} e^{iy(x-t)} f_X(x)\,dx.$$
Now multiply by $f_Y(y)$, integrate over $y \in \mathbb{R}$, and change the order of integration with an appeal to Fubini's theorem.

11. (a) We adopt the usual convention that integrals of the form $\int_u^v g(y)\,dF(y)$ include any atom of the distribution function $F$ at the upper endpoint $v$ but not at the lower endpoint $u$. It is a consequence that $F_\tau$ is right-continuous, and it is immediate that $F_\tau$ increases from 0 to 1. Therefore $F_\tau$ is a distribution function. The corresponding moment generating function is
$$M_\tau(t) = \int_{-\infty}^{\infty} e^{tx}\,dF_\tau(x) = \frac{1}{M(\tau)}\int_{-\infty}^{\infty} e^{tx+\tau x}\,dF(x) = \frac{M(t+\tau)}{M(\tau)}.$$
(b) The required moment generating function is
$$\frac{M_{X+Y}(t+\tau)}{M_{X+Y}(\tau)} = \frac{M_X(t+\tau)}{M_X(\tau)}\cdot\frac{M_Y(t+\tau)}{M_Y(\tau)},$$
the product of the moment generating functions of the individual tilted distributions.
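As an illustration of part (a), and not part of the original solution, the tilting identity can be checked for a concrete case. If $X$ is exponential with parameter 1 then $M(t) = 1/(1-t)$ for $t < 1$, and the tilted distribution $F_\tau$ is exponential with parameter $1-\tau$; the values of $\tau$, $t$ and the sample size below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
tau, t = 0.3, 0.25                      # assumed illustrative values with t + tau < 1
M = lambda u: 1.0 / (1.0 - u)           # mgf of Exp(1), valid for u < 1

# Tilting Exp(1) by exp(tau*x)/M(tau) gives an Exp(1 - tau) distribution,
# so we can sample from F_tau directly.
x_tilted = rng.exponential(scale=1.0 / (1.0 - tau), size=1_000_000)

mc = np.mean(np.exp(t * x_tilted))      # Monte Carlo estimate of M_tau(t)
print("M(t+tau)/M(tau) =", M(t + tau) / M(tau))
print("Monte Carlo M_tau(t) ~", mc)
```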

5.8 Solutions. Examples of characteristic functions

1. (i) We have that $\overline{\phi(t)} = \overline{\mathbb{E}(e^{itX})} = \mathbb{E}(e^{-itX}) = \phi_{-X}(t)$.
(ii) If $X_1$ and $X_2$ are independent random variables with common characteristic function $\phi$, then $\phi_{X_1+X_2}(t) = \phi_{X_1}(t)\phi_{X_2}(t) = \phi(t)^2$.
(iii) Similarly, $\phi_{X_1-X_2}(t) = \phi_{X_1}(t)\phi_{-X_2}(t) = \phi(t)\overline{\phi(t)} = |\phi(t)|^2$.
(iv) Let $X$ have characteristic function $\phi$, and let $Z$ be equal to $X$ with probability $\frac12$ and to $-X$ otherwise. The characteristic function of $Z$ is given by
$$\phi_Z(t) = \tfrac12\phi(t) + \tfrac12\phi_{-X}(t) = \tfrac12\bigl\{\phi(t) + \overline{\phi(t)}\bigr\} = \mathrm{Re}\{\phi(t)\},$$
where we have used the argument of part (i) above.
(v) If $X$ is Bernoulli with parameter $p \ne \frac12$, then its characteristic function is $\phi(t) = q + pe^{it}$ where $q = 1-p$. Suppose $Y$ is a random variable with characteristic function $\psi(t) = |\phi(t)|$. Then $\psi(t)^2 = \phi(t)\phi(-t)$. Written in terms of random variables this asserts that $Y_1 + Y_2$ has the same distribution as $X_1 - X_2$, where the $Y_i$ are independent with characteristic function $\psi$, and the $X_i$ are independent with characteristic function $\phi$. Now $X_i \in \{0, 1\}$, so that $X_1 - X_2 \in \{-1, 0, 1\}$, and therefore $Y_i \in \{-\frac12, \frac12\}$. Write $\alpha = \mathbb{P}(Y_i = \frac12)$. Then
$$\mathbb{P}(Y_1 + Y_2 = 1) = \alpha^2 = \mathbb{P}(X_1 - X_2 = 1) = pq, \qquad \mathbb{P}(Y_1 + Y_2 = -1) = (1-\alpha)^2 = \mathbb{P}(X_1 - X_2 = -1) = pq,$$
implying that $\alpha^2 = (1-\alpha)^2$, so that $\alpha = \frac12$, contradicting the fact that $\alpha^2 = pq \ne \frac14$. We deduce that no such variable $Y$ exists.

2. For $t \ge 0$,

Now minimize over $t \ge 0$.


3. The moment generating function of Z is

MZ(t) = JE {JE(etXY I Y) } = JE{Mx (ty)} = JE { C. � tY ) m}

101 ( A

)m yn- l ( l _ y)m-n- l

= -- dy. o A - ty B (n , m - n)

Substitute v = 1 / y and integrate by parts to obtain that

satisfies

iOO (v _ 1 )m-n- l Imn = dv

1 (AV - t)m

I = - I = c m n A I [ 1 (V _ 1)m-n- l ] OO m - n - l mn A (m - 1) (AV - t)m- l

1 + A(m _ 1 ) m-l ,n ( , , ) m- l ,n

for some c(m , n , A) . We iterate this to obtain

, 'ioo dv Imn = c In+l n = c +1 ' 1 (AV - t)n c'

for some c' depending onm, n , A . Therefore Mz(t) = C"(A - t)-n for some c" depending on m , n , A . However Mz(O) = 1 , and hence c" = An , giving that Z is r(A , n) . Throughout these calculations we have assumed that t is sufficiently small and positive. Alternatively, we could have set t = is and used characteristic functions. See also Problem (4. 14. 12). 4. We have that

$$\mathbb{E}\bigl(e^{itX^2}\bigr) = \int_{-\infty}^{\infty} e^{itx^2}\,\frac{1}{\sqrt{2\pi\sigma^2}}\exp\Bigl(-\frac{(x-\mu)^2}{2\sigma^2}\Bigr)dx
= \int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\Bigl(-\frac{\{x-\mu(1-2\sigma^2 it)^{-1}\}^2}{2\sigma^2(1-2\sigma^2 it)^{-1}}\Bigr)\exp\Bigl(\frac{it\mu^2}{1-2\sigma^2 it}\Bigr)dx
= \frac{1}{\sqrt{1-2\sigma^2 it}}\exp\Bigl(\frac{it\mu^2}{1-2\sigma^2 it}\Bigr).$$
The integral is evaluated by using Cauchy's theorem when integrating around a sector in the complex plane. It is highly suggestive to observe that the integrand differs only by a multiplicative constant from a hypothetical normal density function with (complex) mean $\mu(1-2\sigma^2 it)^{-1}$ and (complex) variance $\sigma^2(1-2\sigma^2 it)^{-1}$.

5. (a) Use the result of Exercise (5.8.4) with $\mu = 0$ and $\sigma^2 = 1$: $\phi_{X^2}(t) = (1-2it)^{-\frac12}$, the characteristic function of the $\chi^2(1)$ distribution.
(b) From (a), the sum $S$ has characteristic function $\phi_S(t) = (1-2it)^{-\frac12 n}$, the characteristic function of the $\chi^2(n)$ distribution.
(c) We have that


Now

JE (exp{- i t2/XH) = tXJ � exp (_ t22 _ x2 ) dx . 1-00 'V 2n 2x 2 There are various ways of evaluating this integral. Using the result of Problem (5. 12. 1 8c), we find that the answer is e - It I , whence Xl / X 2 has the Cauchy distribution. (d) We have that

JE(eitX1 X2 ) = JE{JE(eitX1X2 I X2) } = JE(cPXl (tX2» ) = JE (e- ! t2X� ) = roo

� exp{- ix2 ( 1 + t2) } dx = k' 1-00 'V 2n 1 + t2

1 on observing that the integrand differs from the N (0, ( 1 + t2) - 2 ) density function only by a multi-plicative constant. Now, examination of a standard work of reference, such as Abramowitz and Stegun ( 1965, Section 9.6.2 1 ), reveals that

roo cos(xt) dt = Ko(x) , Jo Ji+t2 where Ko (x) is the second kind of modified Bessel function. Hence the required density, by the inversion theorem, is f(x) = Ko( lx i ) /n . Note that, for small x , Ko(x) '"" - log x , and for large positive x , Ko (x) '"" e-x .jnxj2.

As a matter of interest, note that we may also invert the more general characteristic function cP (t) = ( 1 - i t)-a (1 + i t)-P . Setting 1 - i t = -z/x in the integral gives

1 joo e-itx e-x xa-1 j-X+iXoo e-z dz f(x) = - dt = ---;;,---2n -00 ( 1 - i t)a ( 1 + i t)P 2P2ni -x-ixoo (_z)a ( 1 + z/(2x»P

eX (2x) ! <p-a) r(a) W! <a_P> . ! (1 _a_P) (2x)

where W is a confluent hypergeometric function. When a = f3 this becomes 1

(x/2)a-2 f(x) = r(a).y'n" Ka_! (x)

where K is a Bessel function of the second kind. (e) Using (d), we find that the required characteristic function is cPX1 X2 (t)cPX3X4 (t) = ( 1 + t2)- 1 . In order to invert this, either use the inversion theorem for the Cauchy distribution to find the required density to be f(x) = ie- Ix l for -00 < x < 00, or alternatively express (1 + t2)- 1 as partial fractions, ( 1 + t2)- 1 = i { ( l - i t )- l + (1 + i t)- l } , and recall that ( 1 - i t )- l is the characteristic function of an exponential distribution.
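Part (e) invites a quick simulation check, which is not part of the original solution: the empirical distribution of $X_1X_2 + X_3X_4$ for independent $N(0,1)$ variables should agree with the double exponential distribution whose density is $\frac12 e^{-|x|}$. The sample size and evaluation points below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
x1, x2, x3, x4 = rng.standard_normal((4, 1_000_000))
w = x1 * x2 + x3 * x4

def laplace_cdf(x):
    # distribution function of the density f(x) = 0.5 * exp(-|x|)
    return np.where(x < 0, 0.5 * np.exp(x), 1 - 0.5 * np.exp(-x))

for q in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(f"x={q:+.1f}:  empirical {np.mean(w <= q):.4f}   Laplace {laplace_cdf(q):.4f}")
```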

6. The joint characteristic function of $\mathbf{X} = (X_1, X_2, \dots, X_n)$ satisfies $\phi_{\mathbf{X}}(\mathbf{t}) = \mathbb{E}(e^{i\mathbf{t}\mathbf{X}'}) = \mathbb{E}(e^{iY})$ where $\mathbf{t} = (t_1, t_2, \dots, t_n) \in \mathbb{R}^n$ and $Y = \mathbf{t}\mathbf{X}' = t_1X_1 + \dots + t_nX_n$. Now $Y$ is normal with mean and variance
$$\mathbb{E}(Y) = \sum_{j=1}^{n} t_j\,\mathbb{E}(X_j) = \mathbf{t}\boldsymbol{\mu}', \qquad \operatorname{var}(Y) = \sum_{j,k=1}^{n} t_j t_k\operatorname{cov}(X_j, X_k) = \mathbf{t}\mathbf{V}\mathbf{t}',$$
where $\boldsymbol{\mu}$ is the mean vector of $\mathbf{X}$, and $\mathbf{V}$ is the covariance matrix of $\mathbf{X}$. Therefore $\phi_{\mathbf{X}}(\mathbf{t}) = \phi_Y(1) = \exp(i\mathbf{t}\boldsymbol{\mu}' - \tfrac12\mathbf{t}\mathbf{V}\mathbf{t}')$ by (5.8.5).


Let $\mathbf{Z} = \mathbf{X} - \boldsymbol{\mu}$. It is easy to check that the vector $\mathbf{Z}$ has joint characteristic function $\phi_{\mathbf{Z}}(\mathbf{t}) = e^{-\frac12\mathbf{t}\mathbf{V}\mathbf{t}'}$, which we recognize by (5.8.6) as being that of the $N(\mathbf{0}, \mathbf{V})$ distribution.
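A short numerical sketch, not in the original text, comparing $\exp(i\mathbf{t}\boldsymbol{\mu}' - \frac12\mathbf{t}\mathbf{V}\mathbf{t}')$ with a Monte Carlo estimate of $\mathbb{E}(e^{i\mathbf{t}\mathbf{X}'})$; the mean vector, covariance matrix, argument $\mathbf{t}$ and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
mu = np.array([1.0, -0.5])                     # assumed mean vector
V = np.array([[2.0, 0.6], [0.6, 1.0]])         # assumed covariance matrix
t = np.array([0.4, -0.7])                      # a fixed argument t

theory = np.exp(1j * t @ mu - 0.5 * t @ V @ t)

X = rng.multivariate_normal(mu, V, size=1_000_000)
mc = np.mean(np.exp(1j * X @ t))

print("theory      :", theory)
print("Monte Carlo :", mc)
```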

7. We have that $\mathbb{E}(Z) = 0$, $\mathbb{E}(Z^2) = 1$, and $\mathbb{E}(e^{tZ}) = \mathbb{E}\{\mathbb{E}(e^{tZ} \mid U, V)\} = \mathbb{E}(e^{\frac12 t^2}) = e^{\frac12 t^2}$. If $X$ and $Y$ have the bivariate normal distribution with correlation $\rho$, then the random variable $Z = (UX + VY)/\sqrt{U^2 + 2\rho UV + V^2}$ is $N(0, 1)$.

8. By definition, $\mathbb{E}(e^{itX}) = \mathbb{E}(\cos(tX)) + i\,\mathbb{E}(\sin(tX))$. By integrating by parts,
$$\int_0^\infty \cos(tx)\,\lambda e^{-\lambda x}\,dx = \frac{\lambda^2}{\lambda^2 + t^2}
\qquad\text{and}\qquad
\int_0^\infty \sin(tx)\,\lambda e^{-\lambda x}\,dx = \frac{\lambda t}{\lambda^2 + t^2},$$
whence
$$\phi(t) = \frac{\lambda^2 + i\lambda t}{\lambda^2 + t^2} = \frac{\lambda}{\lambda - it}.$$

9. (a) We have that $e^{-|x|} = e^{-x}I_{\{x \ge 0\}} + e^{x}I_{\{x < 0\}}$, whence the required characteristic function is
$$\phi(t) = \frac12\Bigl(\frac{1}{1-it} + \frac{1}{1+it}\Bigr) = \frac{1}{1+t^2}.$$
(b) By a similar argument applied to the $\Gamma(1, 2)$ distribution, we have in this case that
$$\phi(t) = \frac12\Bigl(\frac{1}{(1-it)^2} + \frac{1}{(1+it)^2}\Bigr) = \frac{1-t^2}{(1+t^2)^2}.$$

10. Suppose $X$ has moment generating function $M(t)$. The proposed equation gives
$$M(t) = \int_0^1 M(ut)^2\,du = \frac{1}{t}\int_0^t M(v)^2\,dv.$$
Differentiate to obtain $tM' + M = M^2$, with solution $M(t) = \lambda/(\lambda - t)$. Thus the exponential distribution has the stated property.

11. We have that
$$\phi_{X,Y}(s, t) = \mathbb{E}(e^{isX + itY}) = \phi_{sX+tY}(1).$$
Now $sX + tY$ is $N(0, s^2\sigma^2 + 2st\sigma\tau\rho + t^2\tau^2)$ where $\sigma^2 = \operatorname{var}(X)$, $\tau^2 = \operatorname{var}(Y)$, $\rho = \operatorname{corr}(X, Y)$, and therefore
$$\phi_{X,Y}(s, t) = \exp\bigl\{-\tfrac12(s^2\sigma^2 + 2st\sigma\tau\rho + t^2\tau^2)\bigr\}.$$
The fact that $\phi_{X,Y}$ may be expressed in terms of the characteristic function of a single normal variable is sometimes referred to as the Cramér-Wold device.

5.9 Solutions. Inversion and continuity theorems

1. Clearly, for $0 \le y \le 1$, $\mathbb{P}(X_n \le ny) = n^{-1}\lfloor ny\rfloor \to y$ as $n \to \infty$.

2. (a) The derivative of $F_n$ is $f_n(x) = 1 - \cos(2n\pi x)$, for $0 \le x \le 1$. It is easy to see that $f_n$ is non-negative and $\int_0^1 f_n(x)\,dx = 1$. Therefore $F_n$ is a distribution function with density function $f_n$.
(b) As $n \to \infty$,
$$|F_n(x) - x| = \Bigl|\frac{\sin(2n\pi x)}{2n\pi}\Bigr| \le \frac{1}{2n\pi} \to 0,$$


and so $F_n(x) \to x$ for $0 \le x \le 1$. On the other hand, $\cos(2n\pi x)$ does not converge unless $x \in \{0, 1\}$, and therefore $f_n(x)$ does not converge on $(0, 1)$.

3. We may express $N$ as the sum $N = T_1 + T_2 + \dots + T_k$ of independent variables each having the geometric distribution $\mathbb{P}(T_j = r) = pq^{r-1}$ for $r \ge 1$, where $p + q = 1$. Therefore
$$\phi_N(t) = \Bigl(\frac{pe^{it}}{1 - qe^{it}}\Bigr)^k,$$
implying that $Z = 2Np$ has characteristic function
$$\phi_Z(t) = \phi_N(2pt) = \Bigl\{\frac{pe^{2pit}}{1 - (1-p)e^{2pit}}\Bigr\}^k = \Bigl\{\frac{p(1 + 2pit + \mathrm{o}(p))}{p(1 - 2it + \mathrm{o}(1))}\Bigr\}^k \to (1 - 2it)^{-k}$$
as $p \downarrow 0$, the characteristic function of the $\Gamma(\frac12, k)$ distribution. The result follows by the continuity theorem (5.9.5).

4. All you need to know is the fact, easily proved, that $\psi_m(t) = e^{itm}$ satisfies

$$\int_{-\pi}^{\pi}\psi_j(t)\psi_k(t)\,dt = \begin{cases} 2\pi & \text{if } j + k = 0,\\ 0 & \text{if } j + k \ne 0, \end{cases}$$
for integers $j$ and $k$. Now, $\phi(t) = \sum_{j=-\infty}^{\infty} e^{itj}\,\mathbb{P}(X = j)$, so that
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-itk}\phi(t)\,dt = \frac{1}{2\pi}\sum_{j=-\infty}^{\infty}\mathbb{P}(X = j)\int_{-\pi}^{\pi}\psi_j(t)\psi_{-k}(t)\,dt = \mathbb{P}(X = k).$$
If $X$ is arithmetic with span $\lambda$, then $X/\lambda$ is integer valued, whence
$$\mathbb{P}(X = k\lambda) = \frac{\lambda}{2\pi}\int_{-\pi/\lambda}^{\pi/\lambda} e^{-itk\lambda}\phi_X(t)\,dt.$$
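The lattice inversion formula is easy to test numerically; this check is not part of the original solution. Taking $X$ Poisson with an assumed parameter $\lambda = 2.5$, so that $\phi(t) = \exp\{\lambda(e^{it} - 1)\}$, a simple quadrature of $(2\pi)^{-1}\int_{-\pi}^{\pi} e^{-itk}\phi(t)\,dt$ should reproduce the Poisson probabilities.

```python
import numpy as np
from math import exp, factorial

lam = 2.5                                    # assumed Poisson parameter
t = np.linspace(-np.pi, np.pi, 200_001)
dt = t[1] - t[0]
phi = np.exp(lam * (np.exp(1j * t) - 1))     # characteristic function of Poisson(lam)

for k in range(6):
    vals = np.exp(-1j * t * k) * phi
    integral = (vals.sum() - 0.5 * (vals[0] + vals[-1])) * dt / (2 * np.pi)   # trapezoidal rule
    exact = exp(-lam) * lam**k / factorial(k)
    print(f"k={k}:  inversion {integral.real:.6f}   exact {exact:.6f}")
```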

5. Let $X$ be uniformly distributed on $[-a, a]$, $Y$ be uniformly distributed on $[-b, b]$, and let $X$ and $Y$ be independent. Then $X$ has characteristic function $\sin(at)/(at)$, and $Y$ has characteristic function $\sin(bt)/(bt)$. We apply the inversion theorem (5.9.1) to the characteristic function of $X + Y$ to find that
$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\phi_{X+Y}(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{\sin(at)\sin(bt)}{ab\,t^2}\,dt = f_{X+Y}(0) = \frac{a \wedge b}{2ab}.$$

6. It is elementary that

In addition, a = n , f//(a) = _n-1 , and

and Stirling's formula follows.


7. The vector X has joint characteristic function c/> (t) = exp( - itVt/) . By the multidimensional version of the inversion theorem (5 .9 . 1 ), the joint density function of X is

Therefore, if i #- j ,

t (X) = _1_ r exp ( -itx' - itVt/) dt.

(2rr)n JRn

at 1 1m ( . tx' 1 ' ) a2 t - = -- ti t · exp -/ - ztVt dt = -- , aVij (2rr )n Rn J aXi aXj

and similarly when i = j . When i #- j ,

a 1 at -lP' (max Xk ::'S U) = - dx where Q = {X : Xk ::'S u for k = 1 , 2, . . . , n } aVij k Q aVij

where J' . dX' is an integral over the variables Xk for k #- i, j . Therefore, lP'(maxk Xk ::'S u ) increases in every parameter Vij , and i s therefore greater than its

value when Vij = 0 for i #- j, namely Ih lP'(Xk ::'S u) .

8. By a two-dimensional version of the inversion theorem (5 .9 . 1 ) applied to JE(eitX' ) , t = (tl , t2 ) ,

�lP'(XI > 0, X2 > 0) = � tX) rOO {� rr eXP (-itx' - itVt/) dt} dX ap ap Jo Jo 4rr JJR2 a 1 rr exp(- itvt/) = ap 4rr2 JJR2 (i t} ) (i t2)

dt

1 �� 1 1 2rr.JiV=IT 1 = -2 exp(- 2tVt ) dt = 2 = �2 ' 4rr R2 4rr 2rr y 1 - p-

We integrate with respect to p to find that, in agreement with Exercise (4.7 .5),

1 1 1 lP'(XI > 0, X2 > 0) = 4 + 2rr sin- p .

5.10 Solutions. Two limit theorems

1. (a) Let $\{X_i : i \ge 1\}$ be a collection of independent Bernoulli random variables with parameter $\frac12$. Then $S_n = \sum_{1}^{n} X_i$ is binomially distributed as $\mathrm{bin}(n, \frac12)$. Hence, by the central limit theorem,
$$\sum_{k \,\le\, \frac12(n + x\sqrt{n})}\binom{n}{k}\Bigl(\frac12\Bigr)^n = \mathbb{P}\Bigl(\frac{S_n - \frac12 n}{\frac12\sqrt{n}} \le x\Bigr) \to \Phi(x),$$
where $\Phi$ is the $N(0, 1)$ distribution function.


(b) Let $\{X_i : i \ge 1\}$ be a collection of independent Poisson random variables, each with parameter 1. Then $S_n = \sum_{1}^{n} X_i$ is Poisson with parameter $n$, and by the central limit theorem
$$\sum_{k:\,|k-n| \le x\sqrt{n}}\frac{e^{-n} n^k}{k!} = \mathbb{P}\Bigl(\frac{|S_n - n|}{\sqrt{n}} \le x\Bigr) \to \Phi(x) - \Phi(-x),$$
as above.
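A short numerical illustration, not part of the original solution, comparing the exact Poisson probability with $\Phi(x) - \Phi(-x)$; the choices $n = 400$ and $x = 1$ are arbitrary.

```python
import numpy as np
from math import ceil, erf, floor, lgamma, sqrt

n, x = 400, 1.0                                     # assumed illustrative values
ks = np.arange(ceil(n - x * sqrt(n)), floor(n + x * sqrt(n)) + 1)
log_pmf = -n + ks * np.log(n) - np.array([lgamma(k + 1) for k in ks])
exact = np.exp(log_pmf).sum()                       # P(|S_n - n| <= x*sqrt(n)), S_n ~ Poisson(n)

Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
print("exact Poisson probability  :", round(exact, 4))
print("normal limit Phi(x)-Phi(-x):", round(Phi(x) - Phi(-x), 4))
```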

2. A superficially plausible argument asserts that, if all babies look the same, then the number $X$ of correct answers in $n$ trials is a random variable with the $\mathrm{bin}(n, \frac12)$ distribution. Then, for large $n$,
$$\mathbb{P}\Bigl(\frac{X - \frac12 n}{\frac12\sqrt{n}} > 3\Bigr) \simeq 1 - \Phi(3) \approx 0.001,$$
by the central limit theorem. For the given values of $n$ and $X$,
$$\frac{X - \frac12 n}{\frac12\sqrt{n}} = \frac{910 - 750}{\frac12\sqrt{1500}} \approx 8.$$
Now we might say that the event $\{X - \frac12 n > \frac32\sqrt{n}\}$ is sufficiently unlikely that its occurrence casts doubt on the original supposition that babies look the same.
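For readers who want the numbers: the following sketch is not in the original text, and it assumes the figures implicit above, namely $n = 1500$ trials (so that $\frac12 n = 750$) with $X = 910$ correct. It computes the exact binomial tail together with the two normal-approximation tails.

```python
from math import erf, exp, lgamma, log, sqrt

n, x_obs = 1500, 910                          # assumed values, inferred from '910 - 750' above

def log_binom_pmf(k):
    # log of P(bin(n, 1/2) = k)
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1) + n * log(0.5)

tail_exact = sum(exp(log_binom_pmf(k)) for k in range(x_obs, n + 1))

Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
z_obs = (x_obs - n / 2) / (0.5 * sqrt(n))

print("exact P(X >= 910)         :", tail_exact)       # astronomically small
print("normal tail 1 - Phi(z_obs):", 1 - Phi(z_obs))    # essentially 0 at double precision
print("1 - Phi(3)                :", 1 - Phi(3.0))      # about 0.00135
```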

A statistician would level a good many objections at drawing such a clear cut decision from such murky data, but this is beyond our scope to elaborate. 3. Clearly

¢y (t) = lE {lE(eitY I X) } = lE{exp (X(eit - I ») } = ( �

t ) S = (_I _.

t) S

1 - (el - 1 ) 2 - e'

It follows that I , lE(Y) = -:-¢Y (0) = s , I

whence var(Y) = 2s . Therefore the characteristic function of the normalized variable Z = (Y -lEY)/ .Jvar(Y) is

Now,

10g {¢y (t/.J2S) } = -s log (2 - eit/$s) = s (eit/$s - 1 ) + �s (eit/$s _ 1 )2 + 0(1 )

= i t# - ! t2 - ! t2 + 0( 1 ) ,

where the 0( 1 ) terms are as s --+ 00. Hence log{¢z (t) } --+ - � t2 as s --+ 00, and the result follows by the continuity theorem (5 .9.5) .
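As a numerical aside, not in the original solution: the mixed variable $Y$ can be simulated directly by drawing $X$ as a sum of $s$ independent exponential variables and then $Y$ as Poisson with parameter $X$; for large $s$ the normalized variable $Z = (Y - s)/\sqrt{2s}$ should be approximately $N(0, 1)$. The value of $s$ and the sample size below are arbitrary choices.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(6)
s = 400                                           # assumed (large) value of the parameter s
x = rng.gamma(shape=s, scale=1.0, size=500_000)   # X = sum of s independent Exp(1) variables
y = rng.poisson(x)                                # Y | X is Poisson with parameter X
z = (y - s) / np.sqrt(2 * s)                      # normalized as in the solution: E(Y)=s, var(Y)=2s

Phi = lambda u: 0.5 * (1 + erf(u / sqrt(2)))
for u in (-1.0, 0.0, 1.5):
    print(f"u={u:+.1f}:  P(Z<=u) ~ {np.mean(z <= u):.4f}   Phi(u) = {Phi(u):.4f}")
```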

Let PI , P2 , . . . be an infinite sequence of independent Poisson variables with parameter 1 . Then Sn = PI + P2 + . . . + Pn is Poisson with parameter n . Now Y has the Poisson distribution with parameter X, and so Y is distributed as Sx . Also, X has the same distribution as the sum of s independent exponential variables, implying that X --+ 00 as s --+ 00, with probability 1 . This suggests by the central limit theorem that Sx (and hence Y also) is approximately normal in the limit as s --+ 00. We have neglected the facts that s and X are not generally integer valued. 4. Since Xl is non-arithmetic, there exist integers n 1 , n2 , . . . , nk with greatest common divisor 1 and such that lP'(X 1 = ni ) > 0 for 1 .::; i .::; k. There exists N such that, for all n � N, there exist non­negative integers aI , a2 , . . . , ak such that n = a1n 1 + . . . + aknk . If x is a non-negative integer, write


N = fhn l + . . . + f3knk, N +x = Yln l + . . . + Yknk for non-negative integers f3l , . . . , 13k . YI . . . , Yk . Now Sn = Xl + . . . + Xn i s such that

k lI"(SB = N) � lI" (Xj = nj for Bj- l < j � Bj , 1 � i � k) = II lI"(XI = nj)fJj > 0

j= l

where Bo = 0, Bj = f31 + 132 + . . . + f3i , B = Bk . Similarly lI"(SG = N + x) > 0 where G = YI + Y2. + . . . + Yk · Therefore

lI"(SG - SG, B+G = x) � lI"(SG = N + X)lI"(SB = N) > 0

where SG, B+G = �f=+J'+1 Xi · Also, lI"(SB - SB , B+G = -x) > 0 as required.

S. Let Xl , X2 , . . . be independent integer-valued random variables with mean 0, variance 1 , span 1 , 1 2

and common characteristic function 1/>. We are required to prove that Jnll"(Un = x) � e- 2x /$ as n � 00 where

1 Xl + X2 + . . . + Xn Un = JnSn = In and x is any number of the form k/ In for integral k. The case of general JL and 0'2 is easily derived from this.

By the result of Exercise (5.9.4), for any such x,

1 jrr.jTi . lI"(Un = x) = 2 r.; e-rtxl/>un (t) dt,

1I:",n -rr.jTi

since Un is arithmetic. Arguing as in the proof of the local limit theorem (6),

211: IJnll"(Un = x) - f(x) 1 � In + In

where f is the N(O, 1 ) density function, and

Now In = 2$ ( 1 - cI> (11: In) ) � 0 as n � 00, where cI> is the N(O, 1 ) distribution function. As for In , pick 8 E (0, 11:) . Then

1 2 The final term involving e -2t is dealt with as was In . By Exercise (5 .7.5a), there exists ').. E (0, 1 ) such that I I/> (t) 1 < ').. if 8 � I t I � 11: . This implies that

and it remains only to show that

as n � 00.


The proof of this is considerably simpler if we make the extra (though unnecessary) assumption that m3 = JE IXi l < 00, and we assume this henceforth. It is a consequence of Taylor's theorem (see Theorem (5 .7.4)) that q, (t) = 1 - � t2 - ii t3m3 +0(t3 ) as t � o. It follows that q, (t) = e -!t2+t38(t) for some finite 9 (t) . Now lex - 1 1 � Ix le 1x l , and therefore

Let K8 = sup{ 1 9 (u ) 1 : l u i � 8} , noting that K8 < 00, and pick 8 sufficiently small that 0 < 8 < 11: and 8K8 < :1 . For I t I < 8Jn,

1 r.::\n I t2 1 I t l 3 2 1 2 I t l 3 I t2 q, (t/v n, - e- '; � K8 In exp (t 8K8 - 'I t ) � K8 Jne-4 ,

as n � oo

as required. 6. The second moment of the Xi is

/oe-I x2 1-1 e2u 2 dx = -du

o 2x (log x)2 -00 u2

(substitute x = eU ), a finite integral. Therefore the X's have finite mean and variance. The density function is symmetric about 0, and so the mean is O.

By the convolution formula, if 0 < x < e- 1 , e- I x x

fz(x) = 1

f(y)f(x - y) dy � ( f(y)f(x - y) dy � f(x) ( f(y) dy, -e-I Jo Jo

since f(x - y), viewed as a function of y, is increasing on [0, x] . Hence

fz(x) > � = 1 - 2 log Ix I 4 1x I (log Ix 1)3

for 0 < x < e -1 . Continuing this procedure, we obtain kn fn (x) � Ix I (log Ix l )n+l '

for some positive constant kn . Therefore fn (x) � 00 as x � 0, and in particular the density function of (X 1 + . . . + Xn)/ In does not converge to the appropriate normal density at the origin. 7. We have for s > 0 that

q, (is) = � roo exp ( _(2x)- 1 _ xs )x-3/2 dx ", 211: Jo

= � (OO exp (_ �y2 _ Sy-2) 2 dy by substituting x = y-2 ", 211: Jo

= exp(-..;2S),


by the result of Problem (5 . 12 . 1 8c), or by consulting a table of integrals. The required conclusion follows by analytic continuation in the upper half-plane. See Moran 1968, p. 27 1 . 8. (a) The sum Sn = E�=l Xr has characteristic function JE(eitSn ) = ¢(t)n = ¢(tn2), whence Un = Sn/n has characteristic function ¢(tn) = JE(eitnXl ) . Therefore,

lP'(Sn < c) = lP'(nXI < c) = lP' (Xl < �) -+ 0 as n -+ 00.

(b) JE(eitTn ) = ¢(t) = JE(eitXI ) . 9. (a) Yes, because Xn is the sum of independent identically distributed random variables with non-zero variance. (b) It cannot in general obey what we have called the central limit theorem, because var(Xn ) = (n2 - n) var(8) + nJE(8) ( I - JE(8)) and n var(XI ) = nJE(8) (I - JE(8)) are different whenever var(8) =1= O. Indeed the right 'normalization' involves dividing by n rather than...;n. It may be shown when var(8) =1= 0 that the distribution of Xn/n converges to that of the random variable 8.

5.11 Solutions. Large deviations

1. We may write Sn = E7 Xi where the Xi have moment generating function M (t) = i (et + e -t ) . Applying the large deviation theorem (5. 1 1 .4), we obtain that, for 0 < a < 1 , lP'(Sn > an) l /n -+ inft>o{g(t)} where g(t) = e-at M(t). Now g has a minimum when et = J(l + a)/( l - a), where it takes the value 1/ J (1 + a) 1+a (1 - a) l-a as required. If a � 1 , then lP'(Sn > an) = 0 for all n . 2 . (i) Let Yn have the binomial distribution with parameters n and i . Then 2Yn - n has the same distribution as the random variable Sn in Exercise (5 . 1 1 . 1 ) . Therefore, if 0 < a < 1 ,

and similarly for lP'(Yn - in < - ian), by symmetry. Hence

(ii) This time let Sn = X 1 + ' . . + Xn , the sum of independent Poisson variables with parameter 1 . Then Tn = enlP'(Sn > n(I + a)). The moment generating function of Xl - 1 is M(t) = exp(et - 1 - t) , and the large deviation theorem gives that Tl /n -+ e inft>o{g(t)} where g(t) = e-at M(t) . Now g'(t) = (et - a - I ) exp(et - at - t - I) whence g has a minimum at t = log(a + 1 ) . Therefore Tl/n -+ eg(log(I + a)) = {e/(a + I ) }a+l . 3. Suppose that M(t) = JE(etX) is finite on the interval [-8 , 8] . Now, for a > 0, M(8) � e8alP'(X > a), so that lP'(X > a) .:::: M(8)e-8a . Similarly, lP'(X < -a) .:::: M(-8)e-8a .

Suppose conversely that such A , /-L exist. Then

M(t) .:::: JE(e 1 tX 1 ) = r e 1 t 1x dF(x) iro,oo) where F is the distribution function of IX I . Integrate by parts to obtain

M(t) .:::: 1 + [-e 1 t 1x [ 1 - F(x)]] � + 1000 I t l e 1 t 1x [ 1 - F(x)] dx


(the term ' I ' takes care of possible atoms at 0). However 1 - F(x) � JLe-Ax , so that M(t) < 00 if I t I is sufficiently small.

4. The characteristic function of $S_n/n$ is $\{e^{-|t/n|}\}^n = e^{-|t|}$, and hence $S_n/n$ is Cauchy. Hence
$$\mathbb{P}(S_n > an) = \int_a^{\infty}\frac{dx}{\pi(1 + x^2)} = \frac{1}{\pi}\Bigl(\frac{\pi}{2} - \tan^{-1} a\Bigr).$$
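The striking feature is that this probability does not depend on $n$, so $\mathbb{P}(S_n > an)^{1/n} \to 1$: there is no exponential decay, in contrast with the situation covered by the large deviation theorem, which requires a finite moment generating function. A small simulation sketch, not part of the original solution, with arbitrary choices of $a$ and the sample sizes:

```python
import numpy as np

rng = np.random.default_rng(4)
a = 1.0
exact = (np.pi / 2 - np.arctan(a)) / np.pi        # P(S_n > a*n), the same for every n

for n in (1, 10, 100):
    sums = rng.standard_cauchy((200_000, n)).sum(axis=1)
    print(f"n={n:3d}:  simulated {np.mean(sums > a * n):.4f}   exact {exact:.4f}")
```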

S.12 Solutions to problems

1. The probability generating function of the sum is
$$\Bigl\{\frac16\sum_{j=1}^{6} s^j\Bigr\}^{10} = \Bigl(\frac{s}{6}\Bigr)^{10}\Bigl\{\frac{1 - s^6}{1 - s}\Bigr\}^{10} = \Bigl(\frac{s}{6}\Bigr)^{10}\bigl(1 - 10s^6 + \cdots\bigr)\bigl(1 + 10s + \cdots\bigr).$$
The coefficient of $s^{27}$ is
$$\frac{1}{6^{10}}\Bigl\{\binom{26}{9} - 10\binom{20}{9} + 45\binom{14}{9}\Bigr\}.$$
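The count can be confirmed by brute-force convolution; this check is not part of the original solution.

```python
import numpy as np
from math import comb

die = np.ones(6) / 6                       # faces 1..6, each with probability 1/6
dist = np.array([1.0])                     # distribution of the empty sum
for _ in range(10):
    dist = np.convolve(dist, die)          # dist[k] is now P(total = k + number_of_dice)

p_conv = dist[27 - 10]                     # P(total of 10 dice equals 27)
p_formula = (comb(26, 9) - 10 * comb(20, 9) + 45 * comb(14, 9)) / 6**10
print(p_conv, p_formula)                   # the two values should agree (about 0.0254)
```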

2. (a) The initial sequences T, HT, HHT, HHH induce a partition of the sample space. By conditioning

on this initial sequence, we obtain f(k) = qf(k - 1) + pqf(k - 2) + p2qf(k - 3) for k > 3,

where p + q = 1 . Also f( 1 ) = f (2) = 0, f(3) = p3 . In principle, this difference equation may be solved in the usual way (see Appendix I). An alternative is to use generating functions.

Set G(s) = E�1 sk f(k) , multiply throughout the difference equation by sk

and sum, to find that

G(s) = p3s3/ { l - qs - pqs2 - p2qs3 } . To find the coefficient of sk , factorize the denominator, expand in partial fractions, and use the binomial series.

Another equation for f(k) is obtained by observing that X = k if and only if X > k - 4 and the last four tosses were THHH. Hence

( k-4 ) f(k) = qp3 1 - � f(i) ,

1= 1 k > 3 .

Applying the first argument to the mean, we find that JL = JE(X) satisfies JL = q( 1 + JL) + pq(2 + JL) + p2q (3 + JL) + 3p3 and hence JL = ( I + p + p2)/p3 .

As for HTH, consider the event that HTH does not occur in n tosses, and in addition the next three tosses give HTH. The number Y until the first occurrence of HTH satisfies

lP'(Y > n)p2q = lP'(Y = n + I)pq + lP'(Y = n + 3), n � 2.

Sum over n to obtain JE(Y) = (pq + 1 )/ (p2q) . (b) G N (S) = (q + ps)

n, in the obvious notation.

(i) lP'(2 divides N) = i {GN ( 1 ) + GN (- 1) } , since only the coefficients of the even powers of s contribute to this probability.

(ii) Let (J) be a complex cube root of unity. Then the coefficient of lP'(X = k) in j {G N ( 1 ) + G N ({J)) + G N ({J)2) } is

j { l + {J)3 + {J)6} = I , if k = 3r,

j { l + (J) + {J)2} = 0, if k = 3r + I , j { l + {J)2 + {J)4} = 0, if k = 3r + 2,


for integers r . Hence §- {G N ( 1 ) + G N (W) + G N (W2) } = E;!�j ff>(N = 3r ) , the probability that N is a multiple of 3 . Generalize this conclusion.

3. We have that T = k if no run of n heads appears in the first k - n - 1 throws, then there is a tail, and then a run of n heads. Therefore ff>(T = k) = ff>(T > k - n - l)qpn for k � n + 1 where

p + q = 1 . Finally ff>(T = n) = pn . Multiply by sk and sum to obtain a formula for the probability generating function G of T:

00 00 � G(s) - pnsn = qpn L sk L ff>(T = j) = qpn L ff>(T = j ) L sk

Therefore

k=n+l j >k-n- l j=1 k=n+l n n+l 00 n n+ l

= qp s '"' ff>(T = j) ( 1 - sj ) = qp s ( 1 - G(s» . l - s LJ l - s j=1

4. The required generating function is
$$G(s) = \sum_{k=r}^{\infty} s^k\binom{k-1}{r-1} p^r(1-p)^{k-r} = \Bigl(\frac{ps}{1-qs}\Bigr)^{r},$$
where $p + q = 1$. The mean is $G'(1) = r/p$ and the variance is $G''(1) + G'(1) - \{G'(1)\}^2 = rq/p^2$.

5. It is standard (5.3.3) that $p_0(2n) = \binom{2n}{n}(pq)^n$. Using Stirling's formula,

(2n)2n+ ! e-2n$ (4pq)n po(2n) � 1 (pq)n = '-= ' {nn+2 e-n$}2 '\I nn

The generating function Fo (s) for the first return time is given by Fo (s) = 1 - PO(s)- 1 where

Po (s) = En s2n Po (2n) . Therefore the probability of ultimate return is Fo( 1 ) = 1 - A. - 1 where, by Abel's theorem,

A. = L Po(2n) -{ - OO n < 00

Hence Fo(1 ) = 1 if and only if p = � .

6. (a) Rn = X� + Y; satisfies

Hence Rn = n + Ro = n .

if 1 P = q = 2 ' if p =/= q .

(b) The quick way i s to argue as i n the solution to Exercise (5.3 .4). Let Un = Xn + Yn , Vn = Xn - Yn . Then U and V are simple symmetric random walks, and furthermore they are independent. Therefore

PO (2n) = ff>(U2n = 0, V2n = 0) = ff>(U2n = 0)ff>(V2n = 0) = { (D 2n e:) }

2,


by (5 .3.3). Using Stirling's formula, po (2n) � (mr)- I , and therefore 2:n po (2n) = 00, implying that the chance of eventual return is 1 .

A longer method is as follows. The walk is at the origin at time 0 if and only if it has taken equal numbers of leftward and rightward steps, and also equal numbers of upward and downward steps. Therefore

( 1 ) 2n n (2n) ! ( 1 ) 4n (2n) 2 Po (2n) = 4 E (m ! )2 { (n - m) ! }2

= 2" n

7. Let eij be the probability the walk ever reaches j having started from i . Clearly eao = ea ,a- l ea- l , a-2 · · · e lO , since a passage to 0 from a requires a passage to a - 1 , then a passage to a - 2, and so on. By homogeneity, eao = (elO)a .

By conditioning on the value of the first step, we find that elO = pe30 + qeOO = pero + q. The

cubic equation x = px3 + q has roots x = 1 , c, d, where

-p - Jp2 + 4pq c = --"-----'--=----=----=-

2p d = -p + Jp2 + 4pq . 2p

Now I c l > 1 , and Id l � 1 if and only if p2 + 4pq � 9p2 which is to say that p .:::: j . It follows that

elO = 1 if p .:::: j , so that eao = 1 if p .:::: i . When p > 1 , we have that d < 1 , and it is actually the case that elO = d, and hence

'f 1 1 p > 3 '

In order to prove this, it suffices to prove that eao < 1 for all large a ; this is a minor but necessary chore. Write Tn = Sn - So = 2:1=1 Xj , where Xj is the value of the i th step. Then

eaO = lP(Tn .:::: -a for some n � 1 ) = lP(n/-L - Tn � n/-L + a for some n � 1 ) 00

.:::: L lP(n/-L - Tn � n/-L + a) n=1

where /-L = lE(Xl ) = 2p - q > O. As in the theory of large deviations, for t > 0,

where X is a typical step. Now lE(et (tL-X) ) = 1 + o(t) as t t 0, and therefore we may pick t > 0 such that e (t) = e-ttLlE(et (tL-X) ) < 1 . It follows that eao .:::: 2:�1 e-tae (t)n which is less than 1 for all large a, as required.

8. We have that

where p + q = 1 . Hence Gx, y (s , t) = G (ps + qt) where G is the probability generating function of X + Y. Now X and Y are independent, so that

G(ps + qt) = Gx (s)Gy (t) = Gx, y (s , 1 )Gx, y ( 1 , t) = G (ps + q)G(p + qt) .


Write f(u) = G( 1 + u) , x = s - 1 , y = t - 1 , to obtain f(px + qy) = f(px)f(qy), a functional equation valid at least when -2 < x , y � O. Now f is continuous within its disc of convergence,

and also f(O) = 1 ; the usual argument (see Problem (4. 14.5» implies that f(x) = eAX for some

A, and therefore G(s) = f (s - 1 ) = eA(s- I ) . Therefore X + Y has the Poisson distribution with

parameter A. Furthermore, Gx(s) = G (ps + q) = eAp (s- I ) , whence X has the Poisson distribution with parameter Ap. Similarly Y has the Poisson distribution with parameter Aq . 9. In the usual notation, Gn+1 (s) = Gn(G(s» . It follows that G�+I ( 1 ) = G� ( 1 )G' ( 1 )2 + G� ( 1 )G" ( 1 ) so that, after some work, var(Zn+l ) = JL2 var(Zn ) + JLna2 . Iterate to obtain

a2JLn ( 1 - JLn+l ) var(Zn+1 ) = a2 (JLn + JLn+1 + . . . + JL2n) = , n � 0, 1 - JL

for the case JL =1= 1 . If JL = 1 , then var(Zn+l ) = a2 (n + 1 ) . 10. (a) Since the coin i s unbiased, we may assume that each player, having won a round, continues to back the same face (heads or tails) until losing. The duration D of the game equals k if and only if k is the first time at which there has been either a run of r - 1 heads or a run of r - 1 tails; the probability of this may be evaluated in a routine way. Alternatively, argue as follows. We record S (for 'same') each time a coin shows the same face as its predecessor, and we record C (for 'change' ) otherwise; start with a C. It is easy to see that each symbol in the resulting sequence is independent of earlier symbols and is equally likely to be S or C . Now D = k if and only if the first run of r - 2 S's is completed at time k. It is immediate from the result of Problem (5 . 1 2.3) that

(b) The probability that Ak wins is

( l s/-2 (1 _ I s ) GD (S) = 2 2 . 1 - s + (�sy- l

00 7rk = E lP'(D = n (r - 1 ) + k - 1 ) .

n= l

Let W be a complex (r - 1 )th root of unity, and set

1 { 1 1 2 Wk (S) = r _ 1 G D (S) + wk- l G D (WS ) + w2(k- l ) G D (W s)

1 r-2 } + . . . + w(r-2) (k- l) G D (W s) .

It may be seen (as for Problem (5 . 12.2» that the coefficient of s i in Wk(S) is lP'(D = i ) if i is of the form n (r - 1 ) + (k - 1) for some n , and is 0 otherwise. Therefore lP'(Ak wins) = Wk ( 1 ) . (c) The pool contains £D when it is won. The required mean is therefore

E(DI . ) W' ( 1 ) E(D I Ak wins) = {Ak WIns} = _k_ . lP'(Ak wins) Wk(1 )

(d) Using the result of Exercise (5 . 1 .2), the generating function of the sequence lP'(D > k) , k � 0, is T(s) = ( 1 - G D (s»/( 1 - s) . The required probability is the coefficient of sn in T(s ) . 11 . Let Tn be the total number of people in the first n generations. B y considering the size Zl o f the first generation, we see that

Zl Tn = 1 + E Tn-l (i )

i=l


where Tn-l ( 1 ) , Tn-l (2) , . . . are independent random variables, each being distributed as Tn-I . Using the compounding formula (5 . 1 .25), Hn (s) = sG(Hn- l (s» . 12. We have that

JP(Z > N I Z = 0) = JP(Zn > N, Zm = 0) n m JP(Zm = 0)

= "f JP(Zm = 0 I Zn = N + r)lP(Zn = N + r)

r=1 JP(Zm = 0)

= "f JP(Zm-n = o)N+rJP(Zn = N + r)

r=1 JP(Zm = 0)

JP(Zm = O)N+l 00 N N � JP(Zm = 0) L JP(Zn = N + r) � JP(Zm = 0) = Gm(O) . r=1

13. (a) We have that Gw (s) = GN (G (S» = eA(G(s)- l ) . Also, Gw (s) l /n = eA«G(s)- I)/n, the same probability generating function as Gw but with A replaced by Aln . (b) We can suppose that H (0) < 1 , since if H (0) = 1 then H (s) = 1 for all s , and we may take A = 0 and G (s) = 1 . We may suppose also that H (0) > O. To see this, suppose instead that H (0) = 0 so that H(s) = sr "L-i=o sjhj+r for some sequence (hk) and some r ::: 1 such that hr > O. Find a

positive integer n such that r In is non-integral; then H(s) l /n is not a power series, which contradicts the assumption that H is infinitely divisible.

Thus we take 0 < H(O) < 1 , and so 0 < 1 - H(s) < 1 for 0 � s < 1 . Therefore

10g H(s) = 10g ( 1 - { l - H(s)J) = A (- 1 + A (s»)

where A = - log H(O) and A(s) is a power series with A (O) = 0, A(I) 00

• "L-j=1 ajsJ , we have that

1 . Writing A(s) =

as n -+ 00. Now H(s) l /n is a probability generating function, so that each such expression is non­negative. Therefore aj � 0 for all j , implying that A(s) is a probability generating function, as required.

14. It is clear from the definition of infinite divisibility that a distribution has this property if and only if, for each n , there exists a characteristic function 1frn such that t/> (t) = 1frn (t)n for all t . (a) The characteristic functions in question are

N(/-L, a2) : Poisson (A) :

r (A , /-L) :

. 1 2 2 t/> (t) = eitlL- '1U t

·t t/> (t) = eA(el - 1 )

t/> (t) = (_A. ) IL .

A - I t

In these respective cases, the 'nth root' 1frn of t/> is the characteristic function of the N(/-Lln , a2 In) , Poisson (Aln) , and r (A , /-LIn) distributions.


(b) Suppose that ¢ is the characteristic function of an infinitely divisible distribution, and let 1/In be a characteristic function such that ¢ (t) = 1/In (t)n . Now 1¢ (t) 1 � 1 for all t, so that

1 1/1 (t) 1 = 1¢ (t) 1 1 /n -+ { I if 1 ¢ (t) 1 =1= 0,

n 0 if 1¢ (t) 1 = o.

For any value of t such that ¢ (t) =1= 0, it is the case that 1/In (t) -+ 1 as n -+ 00. To see this, suppose

instead that there exists e satisfying 0 < e < 2n such that 1/In (t) -+ ei(} along some subsequence. Then 1/In (t)n does not converge along this subsequence, a contradiction. It follows that

1/I(t) = lim 1/In (t) = { I � ¢ (t) =1= 0, n-+oo 0 if ¢ (t) = O.

Now ¢ i s a characteristic function, so that ¢ (t) =1= 0 on some neighbourhood of the origin. Hence 1/I(t) = 1 on some neighbourhood of the origin, so that 1/1 is continuous at the origin. Applying the continuity theorem (5.9.5), we deduce that 1/1 is itself a characteristic function. In particular, 1/1 is continuous, and hence 1/I(t) = 1 for all t, by (*). We deduce that ¢ (t) =1= 0 for all t . 15. We have that

lP'(S = n I N = n)lP'(N = n) pnlP'(N = n) lP'(N = n I S = N) = Ek lP'(S = k i N = k)lP'(N = k) = E�l pklP'(N = k) .

Hence E(xN I S = N) = G(px)/G(p) . If N is Poisson with parameter >.., then

eA(px- l) E(xN I S = N) = eA(p- l ) = eAp(x- l ) = G(x)P .

Conversely, suppose that E(xN I S = N) = G(x)P . Then G(px) = G(p)G (x)P , valid for Ix l � 1 , 0 < P < 1 . Therefore f(x) = 10gG(x ) satisfies f(px) = f (p) + pf(x) , and in addition f has a

power series expansion which is convergent at least for 0 < x � 1 . Substituting this expansion into the

above functional equation for f, and equating coefficients of pi xj , we obtain that f(x) = ->"( 1 - x) for some >.. � o. It follows that N has a Poisson distribution.

16. Certainly

( 1 - (PI + P2» ) n ( 1 - (PI + P2» ) n Gx(s) = Gx, Y (s , 1 ) = 1 ' Gy (t) = Gx'yO , t) = 1 ' - P2 - PI s - PI - P2t ( 1 - (PI + P2) ) n Gx+y (s) = Gx, Y (s , s) = 1 _ (PI + P2)S

'

giving that X, Y, and X + Y have distributions similar to the negative binomial distribution. More specifically,

lP'(X = k) = (n + : - l) akO _ a)n , lP'(Y = k) = (n + : - l) fJk ( l _ fJ)n ,

lP'(X + Y = k) = (n + : - 1) yk ( l _ y)n ,

for k � 0, where a = PI /( 1 - P2) , fJ = P2/( 1 - PI ) , y = PI + P2 ·


Now JE(SX J } ) A JE(sX I Y = y) = (y=y JP(Y = y) B

where A is the coefficient of tY in G X, y (s , t) and B is the coefficient of tY in Gy (t) . Therefore

17. As in the previous solution,

18. (a) Substitute u = y la to obtain

(b) Differentiating through the integral sign,

aJ 1000 { 2b 2 2 2 2 } - = - - exp(-a u - b u- ) du ab 0 u2

= -1000 2 exp(-a2b2y-2 - y2) dy = -2J ( I , ab) ,

by the substitution u = b I y . (c) Hence aJ lab = -2aJ , whence J = c(a)e-2ab where

(d) We have that

by the substitution x = y2 . (e) Similarly

by substituting x = y-2 . 19. (a) We have that

rOO _a2u2 .fii c(a) = J (a , O) = Jo

e du = �.


in the notation of Problem (5 . 1 2 . 1 8). Hence U has the Cauchy distribution.

(b) Similarly

for t > O. Using the result of Problem (5. 1 2. 1 8e), V has density function

f(x) = _1_e-I/ (2x) , x > O .

.J271:x3

(c) We have that W-2 = X-2 + y-2 + Z-2 . Therefore, using (b),

for t > O. It follows that W-2 has the same distribution as 9V = 9X-2 , and so W2 has the same

distribution as � X2 . Therefore, using the fact that both X and W are symmetric random variables, W has the same distribution as 1 x, that is N(O, � ) . 20. It follows from the inversion theorem that

F(x + h) - F(x) _ 1 1 . jN 1 - e-ith -itxA. ( ) d ----,--- - - 1m e 'I' t t . h 271: N-+oo -N i t

Since I ¢ I is integrable, we may use the dominated convergence theorem to take the limit as h + 0 within the integral:

f(x) = � lim jN

e-itx¢ (t) dt. 271: N-+oo -N The condition that ¢ be absolutely integrable is stronger than necessary; note that the characteristic function of the exponential distribution fails this condition, in reflection of the fact that its density function has a discontinuity at the origin.

21. Let Gn denote the probability generating function of Zn . The (conditional) characteristic function of Zn/JL

n is

It is a standard exercise (or see Example (5 .4.3)) that

whence by an elementary calculation

as n --+ 00,

the characteristic function of the exponential distribution with parameter 1 - JL -I . 22. The imaginary part of ¢x (t) satisfies

H ¢x (t) - ¢x (t) } = H ¢x (t) - ¢x ( -t) } = HlE(eitX) - lE(e-itX) } = 0


for all t, if and only if X and -X have the same characteristic function, or equivalently the same distribution.

23. (a) U = X + Y and V = X - Y are independent, so that ¢u + V = ¢u¢v , which is to say that

¢2x = ¢x+Y¢X-y , or

¢(2t) = {¢ (t)2 } {¢ (t)¢ (-t) } = ¢ (t)3¢ (-t) .

Write 1/I(t) = ¢ (t)/¢ ( -t) . Then

Therefore

1/I(2t) = ¢ (2t) = ¢ (t)3¢ ( -t) = 1/1 (t)2 . ¢ ( -2t) ¢ ( -t)3¢ (t)

1/I (t) = 1/I(� t)2 = 1/I (it )4 = . . . = 1/I(t/2n )2n for n :::: 0. However, as h -+ 0,

¢ (h) I - 1 h2 + 0(h2) 2 .I· (h) - --_ :2 - 1 + o(h )

'I' - ¢ (-h) - 1 - �h2 + 0(h2) -,

so that 1/I(t) = { I + 0(t2 /22n) } 2n -+ 1 as n -+ 00, whence 1/I(t) ¢ ( -t) = ¢ (t) . It follows that

1 for all t, giving that

¢ (t) = ¢ (� t)3¢ (_ �t ) = ¢ (� t)4 = ¢ (t/2n )22n for n :::: 1

= { 1 - ! . � + 0(t2/22n) } 22n -+ e- ! t2

2 22n

so that X and Y are N(O, 1 ) .

as n -+ 00,

(b) With U = X + Y and V = X - Y, we have that 1/I (s , t) = JE(eisU+it V ) satisfies

Using what is given,

However, by (*),

1/I(s , t) = JE(ei (s+t)X+i(s-t)Y) = ¢ (s + t)¢ (s - t) .

a2t / = 2{¢"(s)¢ (s) - ¢' (s)2 } , at t=O yielding the required differential equation, which may be written as

d , d/¢ /¢) = - 1 .

1 2 Hence log ¢(s ) = a + bs - �s2 for constants a, b, whence ¢ (s) = e- Zs .

24. (a) Using characteristic functions, ¢z (t) = ¢x (t/n)n = e- it i . (b) JE IXd = 00.


25. (a) See the solution to Problem (5 . 12 .24) .

(b) This is much longer. Having established the hint, the rest follows thus:

fx+Y (Y) = L: f(x)f (y - x) dx

where

Finally,

I 100 2 = n(4 + y2) -00 { j

(x) + f(y - x) } dx + J g(y) = n(4 + y2) + J g(y)

J = L: {xf(x) + (y - x)f(y - x) } dx

= lim [�{IOg( l + X2) - log ( 1 + (y _ X)2 ) }] N

= O. M. N�oo 2n -M

1 fz(z) = 2fx+y (2z) = n(l + Z2) '

26. (a) Xl + X2 + " , + Xn . (b) X I - Xl ' where X I and Xl are independent and identically distributed. (c) XN , where N is a random variable with lP'(N = j) = Pj for 1 ::: j ::: n , independent of Xl , X2 , · · " Xn . (d) I:j'!,1 Zj where ZI , Z2 , . . . are independent and distributed as X I , and M is independent of the Zj with lP'(M = m) = ( i )m+I for m � O. ( e) Y X I , where Y is independent of X I with the exponential distribution parameter I .

27. (a) We require 100 2eitx

¢ (t) = 1fX + -1fX dx . -00 e e First method. Consider the contour integral

1 2eitz IK = dz C e1rZ + e 1rZ

where C is a rectangular contour with vertices at ±K, ±K + i . The integrand has a simple pole at 1

z = 1 i , with residue e-2 t l (in ) . Hence, by Cauchy's theorem,

as K -+ 00.

Second method. Expand the denominator to obtain

1 00

---,--,----,- = L (- I )k exp{ - (2k + l )n lx l } . cosh(nx) k=O

Multiply by eitx and integrate term by term.


(b) Define ¢(t) = 1 - I t I for I t I ::s 1 , and ¢ (t) = 0 otherwise. Then

� 100 e-itx¢ (t) dt = � 11 e-itx ( 1 - l tD dt 2]"( -00 2]"( -1

1 101 1 = - ( 1 - t) cos(tx) dt = -2 (1 - cos x) . ]"( 0 ]"(x

Using the inversion theorem, ¢ is the required characteristic function. (c) In this case,

100

eitx e-x-e-X dx = roo

y-ite-y dy = r( 1 - i t ) -00 Jo where r is the gamma function. (d) Similarly,

L: �eitxe- Ix l dx = i {foOO eitx e-x dx + 1000 e-itx e-x dX } 1 { I I } 1 = 2" 1 - i t + 1 + i t =

1 + t2 .

(e) We have that lE(X) = -i¢'(O) = -r'( I ) . Now, Euler's product for the gamma function states that

n ' nZ r (z) = lim --_._--n-+oo z (z + 1 ) · . . (z + n) where the convergence is uniform on a neighbourhood of the point z = 1 . By differentiation,

r' ( 1 ) = lim {_n_ (IOg n - 1 - ! _ . . . _ _ 1_) } = _ y o n-+oo n + 1 2 n + 1

28. (a) See Problem (5. 12.27b). (b) Suppose ¢ is the characteristic function of X. Since ¢' (0) = ¢" (0) = ¢'" (0) = 0, we have that lE(X) = var(X) = 0, so that JI>(X = 0) = 1 , and hence ¢(t) = 1 , a contradiction. Hence ¢ is not a characteristic function. (c) As for (b). (d) We have that cos t = i (eit + e-it ) , whence ¢ is the characteristic function of a random variable taking values ± 1 each with probability i . (e) By the same working as in the solution to Problem (5. 12.27b), ¢ is the characteristic function of the density function

29. We have that

f(x) = { 1 - Ix l if Ix l � 1 , o otherwIse.

1 1 - ¢ (t) 1 ::s lEl l - eitX I = lEJ( 1 - eitX) ( 1 - e-itX) = lEv'2{ 1 - cos(tX)} ::s lEl tX I

since 2(1 - cos x) ::s x2 for all x . 30. This i s a consequence of Taylor's theorem for functions of two variables :


where ¢mn is the derivative of ¢ in question, and RMN is the remainder. However, subject to appro­priate conditions,

whence the claim follows. 31. (a) We have that

if Ix l :s I , and hence

x2 x2 x4 - < - - - < 1 - cos x 3 - 2 ! 4 ! -

{ (tx)2 dF(x) :s 1 3 { 1 - cos(tx) } dF(x) It-t- I , t- I ] [-t- I , t- I ]

:s 3 i: { I - cos(tx) } dF(x) = 3 { 1 - Re ¢ (t ) } .

(b) Using Fubini's theorem,

! r { l - Re ¢ (v)} dv = roo ! r { I - cos(vx) } dv dF(x) t 10 1x=-00 t 1v=0 = 1

00 ( 1 - Sin(tX) ) dF(x) -00 tx

� {

x : (1 - Sin(tX) ) dF(x) 1l tx l � 1 tx

since 1 - (tx)- I sin(tx) � 0 if I tx l < 1 . Also, sin (tx) :s (tx) sin 1 for I tx l � I , whence the last integral is at least

{x : ( 1 - sin l ) dF(x) � �JI>( IX I � t- I ) . 1l tx l� 1

32. It is easily seen that, if y > 0 and n is large,

33. (a) The characteristic function of Y .. is

t/I .. (t) = lE{ exp (i t (X - I..) /JA) } = exp{ A (eit /.JI - 1) - i tJA} = exp{ _ ! t2 + 0( 1 ) }

as A -+ 00. Now use the continuity theorem. (b) In this case,

so that, as A -+ 00,

( . ) ( . 2 ) I t I t t 1 1 2 log t/I .. (t) = -itJA - Hog 1 - - = -itJA + A - - - + 0(1.. - ) -+ - zt . v'I v'I 21..


(c) Let Zn be Poisson with parameter n. By part (a),

11" ( Z'Jn n ::s 0) -HI>(O) = i

where 4> is the N(O, 1 ) distribution function. The left hand side equals lI"(Zn ::s n) = "E,'k=O e-nnk /kL 34. If you are in possession of r - 1 different types, the waiting time for the acquisition of the next new type is geometric with probability generating function

( ) (n - r + l)s Gr s = . n - (r - l)s

Therefore the characteristic function of Un = (Tn - n log n)/n is n n { ( + 1 ) it/n } -it ,

1/In (t) = e-it log n Gr (eit/n ) = n-it n - r e. = n � . . II II n - (r - 1)el t/n rr-1 (ne-l t/n - r) r=l r=l r=O

The denominator satisfies n- l n- l II (ne-it/n - r) = ( 1 + 0(1» II (n - i t - r) r=O r=O

as n -+ 00, by expanding the exponential function, and hence

n-itn ' lim 1/In (t) = lim 1 . = r ( 1 - i t ) , n---+oo n---+oo rr�,:o (n - i t - r)

where we have used Euler's product for the gamma function:

n ! nZ

rrn -+ r(z) as n -+ 00

r=O (z + r)

the convergence being uniform on allY region of the complex plane containing no singularity of r. The claim now follows by the result of Problem (5. 1 2.27c). 35. Let Xn be uniform on [-n , n] , with characteristic function

if t :;6 0,

if t = O.

It follows that, as n -+ 00, ¢n (t) -+ OOt , the Kronecker delta. The limit function is discontinuous at t = 0 and is therefore not itself a characteristic function. 36. Let G j (s) be the probability generating function of the number shown by the i th die, and suppose that

12 2 ( 1 1 1 ) '"" 1 k s - s G l (S)G2 (S) = f=5. ITS = 1 1 ( 1 - s) ,

so that 1 - s l 1 = 1 1 ( 1 - S)Hl (S)H2 (S) where Hi (S) = s- I Gj (s ) is a real polynomial of degree 5. However

5 1 - s l 1 = (1 - s) II (Wk - S) (Wk - s)

k=l


where wI , wI , . . . , w5 , W5 . are the ten complex eleventh roots of unity. The wk come in conjugate pairs, and therefore no five of the ten tenns in rr�=l (Wk - S) (Wk - s) have a product which is a real polynomial. This is a contradiction. 37. (a) Let H and T be the numbers of heads and tails. The joint probability generating function of H and T is

where p = I - q is the probability of heads on each throw. Hence

G H,T (S , t) = GN (qt + ps) = exp {A (qt + ps - I ) } .

It follows that

so that GH,T (S , t) = GH(S)GT (t) , whence H and T are independent. (b) Suppose conversely that H and T are independent, and write G for the probability generating function of N. From the above calculation, G H,T (S , t) = G (qt + ps) , whence G H (S) = G(q + ps) and GT (t) = G (qt + p), so that G(qt + ps) = G (q + ps)G(qt + p) for all appropriate s , t . Write f(x) = G( 1 - x) to obtain f(x + y) = f(x)f(y) , valid at least for all 0 ::: x , y ::: min{p, q } . The only continuous solutions to this functional equation which satisfy f (0) = 1 are of the form f(x) = eJ1-X for some {.t, whence it is immediate that G (x) = eA(x- l ) where A = -{.t. 38. The number of such paths rr containing exactly n nodes is 2n- l , and each such rr satisfies lP(B(rr) � k) = lP(Sn � k) where Sn = YI + Y2 + . . . + Yn is the sum of n independent Bernoulli variables having parameter p (= 1 - q) . Therefore lE{Xn (k) } = 2n-1lP(Sn � k) . We set k = nf3 , and need to estimate lP(Sn � n(3) . It is a consequence of the large deviation theorem (5 . 1 1 .4) that, if p ::: f3 < I ,

lP(Sn � n(3) I /n -+ inf {e-tf3 M(t) } t >O

where M(t) = lE(etY1 ) = (q + pet ) . With the aid of a little calculus, we find that

Hence

where

( ) f3 ( 1 ) 1-f3 lP(Sn � n(3) I /n -+ � 1 = � ,

lE{Xn (f3n) } -+ { O �f y (f3) < 1 ,

00 If y (f3) > 1 ,

y (f3) = 2 (�r C = �r

-f3

P ::: f3 < 1 .

is a decreasing function of f3 . If p < ! , there is a unique f3c E [p, 1) such that y (f3c) = 1 ; if p � 1 then y (f3) > 1 for all f3 E [p, 1) so that we may take f3c = 1 .

Turning to the final part,
$$\mathbb{P}\bigl(X_n(\beta n) \ge 1\bigr) \le \mathbb{E}\{X_n(\beta n)\} \to 0 \qquad \text{if } \beta > \beta_{\mathrm{c}}.$$
As for the other case, we shall make use of the inequality
$$\mathbb{P}(N \ne 0) \ge \frac{\mathbb{E}(N)^2}{\mathbb{E}(N^2)}$$

for any N taking values in the non-negative integers. This is easily proved: certainly

whence

var(N I N =1= 0) = lE(N2 I N =1= 0) - lE(N I N =1= 0)2 2:: 0,

lE(N2) lE(N)2 lP'(N =1= 0) 2::

lP'(N =1= 0)2 . We have that lE{Xn CBn)2 } = Err,p lE(Irr lp) where the sum is over all such paths Jr, p, and Irr is the indicator function of the event {B(Jr ) 2:: fJn}. Hence

lE{Xn (fJn)2 } = L lE(Irr ) + L lE(Irr lp) = lE{Xn (fJn)} + 2n- 1 L lE(hlp) rr rr#p p#L

where L is the path which always takes the left fork (there are 2n-1 choices for Jr , and by symmetry each provides the same contribution to the sum). We divide up the last sum according to the number of nodes in common to p and L, obtaining E�-:;\ 2n-m-1 lE(hIM) where M is a path having exactly m nodes in common with L. Now

where Tn-m has the bin(n - m , p) distribution (the 'most value' to 1M of the event {h = I } is obtained when all m nodes in L n M are black). However

so that lE(hIM) � p-mlE(h)lE(IM) . It follows that N = Xn (fJn) satisfies

whence, by (*), 1 lP'(N =1= 0) > . - lE(N)- 1 + � E�-:'\ (2p)-m

If fJ < fJe then lE(N) -+ 00 as n -+ 00. It is immediately evident that lP'(N =1= 0) -+ 1 if p � � . Suppose finally that p > � and fJ < fJe . By the above inequality,

lP'(Xn (fJn) > 0) 2:: c(fJ) for all n

where c(fJ) is some positive constant. Find E > 0 such that fJ + E < fJe . Fix a positive integer m, and let :Pm be a collection of 2m disjoint paths each of length n - m starting from depth m in the tree. Now

lP'(Xn (fJn) = 0) � lP'(B(v) < fJn for all v E :Pm ) = lP'(B(v) < fJn) 2m

where v E :Pm . However

lP'(B(v) < fJn) � lP'( B(v) < (fJ + E) (n - m)) if fJn < (fJ + E) (n - m), which is to say that n 2:: (fJ + E)m/E . Hence, for all large n ,


by (**); we let n --+ 00 and m --+ 00 in that order, to obtain JI>(Xn (fJn) = 0) --+ ° as n --+ 00. 39. (a) The characteristic function of Xn satisfies

the characteristic function of the Poisson distribution. (b) Similarly,

peit/n A lE(eitYn /n) = . --+ __ I - ( 1 - p)e1 t/n A - i t

as n --+ 00, the limit being the characteristic function of the exponential distribution. 40. If you cannot follow the hints, take a look at one or more of the following: Moran 1968 (p. 389), Breiman 1968 (p. 1 86), Loeve 1977 (p. 287), Laha and Rohatgi 1979 (p. 288). 41. With Yk = kXb we have that lE(Yk) = 0, var(Yk ) = k2 , lE I Y,? 1 = k3 . Note that Sn = YI + Y2 + . . . + Yn is such that

1 � 3 n4 -{ v-ar-(S-n"--'-) }"3/"'2 f:J. lEI Yk I '" c n 9 /2 --+ °

as n --+ 00, where c is a positive constant. Applying the central limit theorem « 5 . 10.5) or Problem (5 . 12.40» , we find that

Sn D r.;;;:c;- � N(O, 1 ) , as n --+ 00, ",var Sn

where var Sn = 2:Z=1 k2 '" 1n3 as n --+ 00. 42. We may suppose that J-L = ° and (J = 1 ; i f this i s not so, then replace Xi by Yi = (Xi - J-L)/(J .

Let t = (to , tl , t2 , . . . , tn) E IRn+l , and set I = n- I 2:7=1 tj . The joint characteristic function of the n + 1 variables X, Z} , Z2 , . . . , Zn is

¢ (t) = lE{ exp (i toX + t i tj Zj) } = lE{ fI exp (i [� + tj - I] Xj) } J =I J=I

= fl exp (- � [� + tj - If) by independence. flence

( I n [t ]

2) {

t2 1 n } ¢ (t) = exp - - E J1. + (tj - 7) = exp _...Q.. - - E(tj _ 7)2 2 j=1 n 2n 2 j=1

where we have used the fact that 2:7=1 (tj - 7) = 0. Therefore

whence X is independent of the collection Z I , Z2 , . . . , Zn . It follows that X is independent of S2 = (n - 1)- 1 2:J=1 ZJ . Compare with Exercise (4. 10.5) .


43. (i) Clearly, lP(Y :s y) = lP(X :s log y) = <I> (log y) for y > 0, where <I> is the N(O, 1) distribution function. The density function of Y follows by differentiating. (ii) We have that fa (x) � 0 if l a l :s 1 , and

a sin (2n log x) r,:o=e- z og x) dx = r,:o=a sin(2ny)e- zY dy = 0 10

00 1 1 (I 2 100 1 1 2

o x...; 2n -00 ...; 2n

since sine is an odd function. Therefore J�oo fa (x) dx = 1 , so that each such fa is a density function. For any positive integer k, the kth moment of fa is J�oo xk f (x) dx + la (k) where

100 1 k 1 2 la Ck) = r,:o=a sin(2ny)e Y-zY dy = 0

-00 ...; 2n since the integrand is an odd function of y - k. It follows that each fa has the same moments as f. 44. Here is one way of proving this. Let Xl , X2 , . . . be the steps of the walk, and let Sn be the position of the walk after the nth step. Suppose J1, = lE(X I ) satisfies J1, < 0, and let em = lP(Sn = o for some n � 1 I So = -m) where m > O. Then em :s E�l lP(Tn > m) where Tn = Xl + X2 + . . . + Xn = Sn - SO . Now, for t > 0,

lP(Tn > m) = lP(Tn - nJ1, > m - nJ1,) :s e-t (m-ntl-)lE(et (Tn-ntl-» ) = e-tm {ettl-M(t) r where M(t) = lE(et (XI -tl-» ) . Now M(t) = 1 +0(t2) as t -+ 0, and therefore there exists t (> 0) such that B (t) = ettl-M(t) < 1 (remember that J1, < 0). With this choice of t, em :s E�l e-tmB (t)n -+ 0 as m -+ 00, whence there exists K such that em < i for m � K.

Finally, there exist 8, E > 0 such that lP(X I < -8) > E, implying that lP(SN < -K I So = 0) > EN where N = r K / 81 , and therefore

lP(Sn ;6 0 for all n � 1 I So = 0) � ( l - e K )EN � iEN ;

therefore the walk is transient. This proof may be shortened by using the Borel-Cantelli lemma. 45. Obviously,

{ a if XI > a, L =

X I + L if X I :s a , where L has the same distribution as L. Therefore,

46. We have that

a lE(sL) = salP(XI > a) + L srlE(sL)lP(XI = r) .

r=l

{ Wn- I + 1 with probability p, Wn = -

Wn- l + 1 + Wn with probability q , where Wn is independent ofWn_ 1 and has the same distribution as Wn . Hence Gn (s) = psGn_ I (S) + qsGn- 1 (s)Gn (s) . Now Go(s) = 1 , and the recurrence relation may be solved by induction. (Alter­natively use Problem (5 . 1 2.45) with appropriate Xi .) 47. Let Wr be the number of flips until you first see r consecutive heads, so that lP(Ln < r) = lP(Wr > n) . Hence,


where lEes wr ) = Gr (s) is given in Problem (5 . 1 2.46). 48. We have that

{ iXn with probability � , Xn+l = 1 Z Xn + Yn with probability � .

Hence the characteristic functions satisfy

. itX 1 1 1 1 A 4>n+l (t) = lE(e n+J ) = z4>n (z t) + z4>n (z t) A _ i t

1 A - ii t 1 A - ,\ i t -n A - i t2-n A = 4>n (z t) A - i t = 4>n- l (4 t) A - i t = . . · = 4>I (t2 ) A - i t -+ A - i t

as n -+ 00. The limiting distribution i s exponential with parameter A. 49. We have that

(a) ( l-e-A)/A , (b) -(p/q2 ) (q+1og p) , (c) ( l_qn+l )/ [(n+ 1 )p] , (d) - [ 1+(p/q) log p]/ log p. (e) Not if lP(X + 1 > 0) = 1 , by Jensen's inequality (see Exercise (5 .6. 1)) and the strict concavity of the function f(x) = l/x . If X + 1 is permitted to be negative, consider the case when lP(X + 1 = - 1) = lP(X + 1 = 1 ) = i . 50. By compounding, as in Theorem (5 . 1 .25), the sum has characteristic function

G t _ p4>x (t) - � N (4)X ( )) - 1 - q4>x (t) - AP - i t '

whence the sum is exponentially distributed with parameter Ap. 51. Consider the function G (x ) = (lE(X2) }- 1 J�oo y2 dF(y) . This function is right-continuous and increases from 0 to 1 , and is therefore a distribution function. Its characteristic function is

52. By integration, fx (x) = fy (y) = � , Ix l < 1 , I y l < 1 . Since f(x , y) =1= fx (x)fy (y) , X and Y are not independent. Now,

11 { ,\ (z + 2) if - 2 < z < 0,

fx+Y (z) = f(x , z - x) dx = 1 - 1 4 (2 - z) if O < z < 2 ,

the 'triangular' density function on (-2, 2) . This is the density function of the sum of two independent random variables uniform on (- 1 , 1 ) .


6

Markov chains

6.1 Solutions. Markov Processes

1. The sequence Xl , X2 , . . . of independent random variables satisfies

lP'(Xn+ l = j I Xl = i l , · · · , Xn = in ) = lP'(Xn+l = j) ,

whence the sequence i s a Markov chain. The chain i s homogeneous if the Xi are identically distributed. 2. (a) With Yn the outcome of the nth throw, Xn+ l = max{Xn , Yn+ l } , so that

for 1 ::: i , j ::: 6. Similarly,

{ o if j < i 1 · . . .

Pij = 6 ' If J = ,

i if j > i , { o if j < i Pij (n) =

( -61 ,. )n .f . . I J = l .

If j > i , then Pij (n) = lP'(Zn = j) , where Zn = max{Yl , Y2 , . . . , Yn } , and an elementary calculation yields ( . ) n ( . l ) n

Pij (n) = =k - J � , i < j ::: 6.

(b) Nn+l - Nn is independent of Nl , N2 , . . . , Nn , so that N is Markovian with { 1 ·f · . + 1 6 I J = , , Pij = � if j = i , o otherwise.

(c) The evolution of $C$ is given by
$$C_{r+1} = \begin{cases} 0 & \text{if the die shows } 6,\\ C_r + 1 & \text{otherwise,}\end{cases}$$
whence $C$ is Markovian with
$$p_{ij} = \begin{cases} \tfrac{1}{6} & \text{if } j = 0,\\ \tfrac{5}{6} & \text{if } j = i + 1,\\ 0 & \text{otherwise.}\end{cases}$$


(d) This time,
$$B_{r+1} = \begin{cases} B_r - 1 & \text{if } B_r > 0,\\ Y_r & \text{if } B_r = 0,\end{cases}$$
where $Y_r$ is a geometrically distributed random variable with parameter $\tfrac{1}{6}$, independent of the sequence $B_0, B_1, \dots, B_r$. Hence $B$ is Markovian with
$$p_{ij} = \begin{cases} 1 & \text{if } j = i - 1 \ge 0,\\ (\tfrac{5}{6})^{j-1}\tfrac{1}{6} & \text{if } i = 0,\ j \ge 1,\\ 0 & \text{otherwise.}\end{cases}$$
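The $n$-step formula in part 2(a) is easy to confirm by raising the one-step matrix to the $n$th power; the following short Python check is a sketch only (the value of $n$ is arbitrary).

import numpy as np

# Transition matrix of X_n = largest score so far, states 1..6.
P = np.zeros((6, 6))
for i in range(1, 7):
    for j in range(1, 7):
        if j == i:
            P[i-1, j-1] = i / 6
        elif j > i:
            P[i-1, j-1] = 1 / 6

n = 5
Pn = np.linalg.matrix_power(P, n)
# Compare with the closed form (j/6)^n - ((j-1)/6)^n for j > i.
for i in range(1, 7):
    for j in range(i + 1, 7):
        assert abs(Pn[i-1, j-1] - ((j/6)**n - ((j-1)/6)**n)) < 1e-12
print("n-step formula verified for n =", n)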

3. (i) If $X_n = i$, then $X_{n+1} \in \{i-1, i+1\}$. Now, for $i \ge 1$,
$$(*)\qquad \mathbb{P}(X_{n+1} = i+1 \mid X_n = i, B) = \mathbb{P}(X_{n+1} = i+1 \mid S_n = i, B)\mathbb{P}(S_n = i \mid X_n = i, B) + \mathbb{P}(X_{n+1} = i+1 \mid S_n = -i, B)\mathbb{P}(S_n = -i \mid X_n = i, B),$$
where $B = \{X_r = i_r \text{ for } 0 \le r < n\}$ and $i_0, i_1, \dots, i_{n-1}$ are integers. Clearly
$$\mathbb{P}(X_{n+1} = i+1 \mid S_n = i, B) = p, \qquad \mathbb{P}(X_{n+1} = i+1 \mid S_n = -i, B) = q,$$
where $p\ (= 1 - q)$ is the chance of a rightward step. Let $T$ be the time of the last visit to $0$ prior to the time $n$, $T = \max\{r : i_r = 0\}$. During the time-interval $(T, n]$, the path lies entirely in either the positive integers or the negative integers. If the former, it is required to follow the route prescribed by the event $B \cap \{S_n = i\}$, and if the latter by the event $B \cap \{S_n = -i\}$. The absolute probabilities of these two routes are $\pi_1$ and $\pi_2$, where $\pi_1/\pi_2 = p^i/q^i$, whence
$$\mathbb{P}(S_n = i \mid X_n = i, B) = \frac{\pi_1}{\pi_1 + \pi_2} = \frac{p^i}{p^i + q^i} = 1 - \mathbb{P}(S_n = -i \mid X_n = i, B).$$
Substitute into $(*)$ to obtain
$$\mathbb{P}(X_{n+1} = i+1 \mid X_n = i, B) = \frac{p^{i+1} + q^{i+1}}{p^i + q^i} = 1 - \mathbb{P}(X_{n+1} = i-1 \mid X_n = i, B).$$
Finally $\mathbb{P}(X_{n+1} = 1 \mid X_n = 0, B) = 1$.
(ii) If $Y_n > 0$, then $Y_n - Y_{n+1}$ equals the $(n+1)$th step, a random variable which is independent of the past history of the process. If $Y_n = 0$ then $S_n = M_n$, so that $Y_{n+1}$ takes the values $0$ and $1$ with respective probabilities $p$ and $q$, independently of the past history. Therefore $Y$ is a Markov chain with transition probabilities
$$p_{ij} = \begin{cases} p & \text{if } j = i - 1,\\ q & \text{if } j = i + 1,\end{cases}\quad\text{for } i > 0, \qquad\qquad p_{0j} = \begin{cases} p & \text{if } j = 0,\\ q & \text{if } j = 1.\end{cases}$$
The sequence $Y$ is a random walk with a retaining barrier at $0$.
4. For any sequence $i_0, i_1, \dots$ of states,
$$\mathbb{P}(Y_{k+1} = i_{k+1} \mid Y_r = i_r \text{ for } 0 \le r \le k) = \frac{\mathbb{P}(X_{n_s} = i_s \text{ for } 0 \le s \le k+1)}{\mathbb{P}(X_{n_s} = i_s \text{ for } 0 \le s \le k)} = \frac{\prod_{s=0}^{k} p_{i_s, i_{s+1}}(n_{s+1} - n_s)}{\prod_{s=0}^{k-1} p_{i_s, i_{s+1}}(n_{s+1} - n_s)} = p_{i_k, i_{k+1}}(n_{k+1} - n_k) = \mathbb{P}(Y_{k+1} = i_{k+1} \mid Y_k = i_k),$$


where $p_{ij}(n)$ denotes the appropriate $n$-step transition probability of $X$.
(a) With the usual notation, the transition matrix of $Y$ is given by
$$\pi_{ij} = \begin{cases} p^2 & \text{if } j = i + 2,\\ 2pq & \text{if } j = i,\\ q^2 & \text{if } j = i - 2.\end{cases}$$
(b) With the usual notation, the transition probability $\pi_{ij}$ is the coefficient of $s^j$ in $G(G(s))^i$.
5. Writing $\mathbf{X} = (X_1, X_2, \dots, X_n)$, we have that
$$\mathbb{P}\bigl(F \mid I(\mathbf{X}) = 1, X_n = i\bigr) = \frac{\mathbb{P}\bigl(F, I(\mathbf{X}) = 1, X_n = i\bigr)}{\mathbb{P}\bigl(I(\mathbf{X}) = 1, X_n = i\bigr)},$$
where $F$ is any event defined in terms of $X_n, X_{n+1}, \dots$. Let $A$ be the set of all sequences $\mathbf{x} = (x_1, x_2, \dots, x_{n-1}, i)$ of states such that $I(\mathbf{x}) = 1$. Then
$$\mathbb{P}\bigl(F, I(\mathbf{X}) = 1, X_n = i\bigr) = \sum_{\mathbf{x}\in A}\mathbb{P}(F, \mathbf{X} = \mathbf{x}) = \sum_{\mathbf{x}\in A}\mathbb{P}(F \mid X_n = i)\mathbb{P}(\mathbf{X} = \mathbf{x})$$
by the Markov property. Divide through by the final summation to obtain $\mathbb{P}(F \mid I(\mathbf{X}) = 1, X_n = i) = \mathbb{P}(F \mid X_n = i)$.
6. Let $H_n = \{X_k = x_k \text{ for } 0 \le k < n,\ X_n = i\}$. The required probability may be written as
$$\mathbb{P}(X_{T+m} = j \mid H_T) = \frac{\mathbb{P}(X_{T+m} = j, H_T)}{\mathbb{P}(H_T)} = \frac{\sum_n \mathbb{P}(X_{T+m} = j, H_T, T = n)}{\mathbb{P}(H_T)}.$$
Now $\mathbb{P}(X_{T+m} = j \mid H_T, T = n) = \mathbb{P}(X_{n+m} = j \mid H_n, T = n)$. Let $I$ be the indicator function of the event $H_n \cap \{T = n\}$, an event which depends only upon the values of $X_1, X_2, \dots, X_n$. Using the result of Exercise (6.1.5),
$$\mathbb{P}(X_{n+m} = j \mid H_n, T = n) = \mathbb{P}(X_{n+m} = j \mid X_n = i) = p_{ij}(m).$$
Hence
$$\mathbb{P}(X_{T+m} = j \mid H_T) = p_{ij}(m) = \mathbb{P}(X_{T+m} = j \mid X_T = i).$$

7. Clearly
$$\mathbb{P}(Y_{n+1} = j \mid Y_r = i_r \text{ for } 0 \le r \le n) = \mathbb{P}(X_{n+1} = b \mid X_r = a_r \text{ for } 0 \le r \le n),$$
where $b = h^{-1}(j)$, $a_r = h^{-1}(i_r)$; the claim follows by the Markov property of $X$. It is easy to find an example in which $h$ is not one--one, for which $X$ is a Markov chain but $Y$ is not. The first part of Exercise (6.1.3) describes such a case if $S_0 \ne 0$.
8. Not necessarily! Take as example the chains $S$ and $Y$ of Exercise (6.1.3). The sum is $S_n + Y_n = M_n$, which is not a Markov chain.
9. All of them.
(a) Using the Markov property of $X$,
$$\mathbb{P}(X_{m+r} = k \mid X_m = i_m, \dots, X_{m+r-1} = i_{m+r-1}) = \mathbb{P}(X_{m+r} = k \mid X_{m+r-1} = i_{m+r-1}).$$


(b) Let $\{\text{even}\} = \{X_{2r} = i_{2r} \text{ for } 0 \le r \le m\}$ and $\{\text{odd}\} = \{X_{2r+1} = i_{2r+1} \text{ for } 0 \le r \le m-1\}$. Then,
$$\mathbb{P}(X_{2m+2} = k \mid \text{even}) = \sum\frac{\mathbb{P}(X_{2m+2} = k, X_{2m+1} = i_{2m+1}, \text{even}, \text{odd})}{\mathbb{P}(\text{even})} = \sum\frac{\mathbb{P}(X_{2m+2} = k, X_{2m+1} = i_{2m+1} \mid X_{2m} = i_{2m})\mathbb{P}(\text{even}, \text{odd})}{\mathbb{P}(\text{even})} = \mathbb{P}(X_{2m+2} = k \mid X_{2m} = i_{2m}),$$
where the sum is taken over all possible values of $i_s$ for odd $s$.
(c) With $Y_n = (X_n, X_{n+1})$,
$$\mathbb{P}\bigl(Y_{n+1} = (k, l) \mid Y_0 = (i_0, i_1), \dots, Y_n = (i_n, k)\bigr) = \mathbb{P}\bigl(Y_{n+1} = (k, l) \mid X_{n+1} = k\bigr) = \mathbb{P}\bigl(Y_{n+1} = (k, l) \mid Y_n = (i_n, k)\bigr),$$
by the Markov property of $X$.

10. We have by Lemma (6.1.8) that, with $\mu^{(1)}_j = \mathbb{P}(X_1 = j)$,
$$\text{LHS} = \frac{\mu^{(1)}_{x_1} p_{x_1 x_2}\cdots p_{x_{r-1},k}\,p_{k,x_{r+1}}\cdots p_{x_{n-1}x_n}}{\mu^{(1)}_{x_1} p_{x_1 x_2}\cdots p_{x_{r-1}x_{r+1}}(2)\cdots p_{x_{n-1}x_n}} = \frac{p_{x_{r-1},k}\,p_{k,x_{r+1}}}{p_{x_{r-1}x_{r+1}}(2)} = \text{RHS}.$$

11. (a) Since $S_{n+1} = S_n + X_{n+1}$, a sum of independent random variables, $S$ is a Markov chain.
(b) We have that
$$\mathbb{P}(Y_{n+1} = k \mid Y_i = x_i + x_{i-1} \text{ for } 1 \le i \le n) = \mathbb{P}(Y_{n+1} = k \mid X_n = x_n)$$
by the Markov property of $X$. However, conditioning on $X_n$ is not generally equivalent to conditioning on $Y_n = X_n + X_{n-1}$, so $Y$ does not generally constitute a Markov chain.
(c) $Z_n = nX_1 + (n-1)X_2 + \cdots + X_n$, so $Z_{n+1}$ is the sum of $X_{n+1}$ and a certain linear combination of $Z_1, Z_2, \dots, Z_n$, and so cannot be Markovian.
(d) Since $S_{n+1} = S_n + X_{n+1}$ and $Z_{n+1} = Z_n + S_n + X_{n+1}$, and $X_{n+1}$ is independent of $X_1, \dots, X_n$, this is a Markov chain.
12. With $\mathbf{1}$ a row vector of 1's, a matrix $\mathbf{P}$ is stochastic (respectively, doubly stochastic, sub-stochastic) if $\mathbf{P}\mathbf{1}' = \mathbf{1}'$ (respectively, $\mathbf{1}\mathbf{P} = \mathbf{1}$, $\mathbf{P}\mathbf{1}' \le \mathbf{1}'$), with inequalities interpreted coordinatewise. By recursion, $\mathbf{P}$ satisfies any of these equations if and only if $\mathbf{P}^n$ satisfies the same equation.

6.2 Solutions. Classification of states

1. Let $A_k$ be the event that the last visit to $i$, prior to $n$, took place at time $k$. Suppose that $X_0 = i$, so that $A_0, A_1, \dots, A_{n-1}$ form a partition of the sample space. It follows, by conditioning on the $A_i$, that
$$p_{ij}(n) = \sum_{k=0}^{n-1} p_{ii}(k)\,l_{ij}(n-k)$$
for $i \ne j$. Multiply by $s^n$ and sum over $n\ (\ge 1)$ to obtain $P_{ij}(s) = P_{ii}(s)L_{ij}(s)$ for $i \ne j$. Now $P_{ij}(s) = F_{ij}(s)P_{jj}(s)$ if $i \ne j$, so that $F_{ij}(s) = L_{ij}(s)$ whenever $P_{ii}(s) = P_{jj}(s)$.
As examples of chains for which $P_{ii}(s)$ does not depend on $i$, consider a simple random walk on the integers, or a symmetric random walk on a complete graph.


2. Let $i\ (\ne s)$ be a state of the chain, and define $n_i = \min\{n : p_{is}(n) > 0\}$. If $X_0 = i$ and $X_{n_i} = s$ then, with probability one, $X$ makes no visit to $i$ during the intervening period $[1, n_i - 1]$; this follows from the minimality of $n_i$. Now $s$ is absorbing, and hence
$$\mathbb{P}(\text{no return to } i \mid X_0 = i) \ge \mathbb{P}(X_{n_i} = s \mid X_0 = i) > 0.$$
3. Let $I_k$ be the indicator function of the event $\{X_k = i\}$, so that $N = \sum_{k=0}^{\infty} I_k$ is the number of visits to $i$. Then
$$\mathbb{E}(N) = \sum_{k=0}^{\infty}\mathbb{E}(I_k) = \sum_{k=0}^{\infty} p_{ii}(k),$$
which diverges if and only if $i$ is persistent. There is another argument which we shall encounter in some detail when solving Problem (6.15.5).
4. We write $\mathbb{P}_i(\cdot) = \mathbb{P}(\cdot \mid X_0 = i)$. One way is as follows; another is via the calculation of Problem (6.15.5). Note that $\mathbb{P}_j(V_j \ge 1) = \mathbb{P}_j(T_j < \infty)$.
(a) We have that
$$\mathbb{P}_j(V_j \ge n + 1) = \mathbb{P}_j(V_j \ge n)\,\mathbb{P}_j(V_j \ge 1)$$
by the strong Markov property (Exercise (6.1.6)) applied at the stopping time $T_j$. By iteration, $\mathbb{P}_j(V_j \ge n) = \mathbb{P}_j(V_j \ge 1)^n$, and allowing $n \to \infty$ gives the result.
(b) Suppose $i \ne j$. For $m \ge 1$,
$$\mathbb{P}_i(V_j \ge m) = \mathbb{P}_i(V_j \ge m \mid T_j < \infty)\mathbb{P}_i(T_j < \infty) = \mathbb{P}_j(V_j \ge m - 1)\mathbb{P}_i(T_j < \infty)$$
by the strong Markov property. Now let $m \to \infty$, and use the result of (a).
5. Let $e = \mathbb{P}(T_j < T_i \mid X_0 = i) = \mathbb{P}(T_i < T_j \mid X_0 = j)$, and let $N$ be the number of visits to $j$ before visiting $i$. Then
$$\mathbb{P}(N \ge 1 \mid X_0 = i) = \mathbb{P}(T_j < T_i \mid X_0 = i) = e.$$
Likewise, $\mathbb{P}(N \ge k \mid X_0 = i) = e(1-e)^{k-1}$ for $k \ge 1$, whence
$$\mathbb{E}(N \mid X_0 = i) = \sum_{k=1}^{\infty} e(1-e)^{k-1} = 1.$$

6.3 Solutions. Classification of chains

1. If $r = 1$, then state $i$ is absorbing for $i \ge 1$; also, $0$ is transient unless $a_0 = 1$. Assume $r < 1$ and let $J = \sup\{j : a_j > 0\}$. The states $0, 1, \dots, J$ form an irreducible persistent class; they are aperiodic if $r > 0$. All other states are transient. For $0 \le i \le J$, the recurrence time $T_i$ of $i$ satisfies $\mathbb{P}(T_i = 1) = r$. If $T_i > 1$ then $T_i$ may be expressed as the sum of
$T_i^{(1)} :=$ time to reach $0$, given that the first step is leftwards,
$T_i^{(2)} :=$ time spent in excursions from $0$ not reaching $i$,
$T_i^{(3)} :=$ time taken to reach $i$ in the final excursion.


It is easy to see that $\mathbb{E}(T_i^{(1)}) = 1 + (i-1)/(1-r)$ if $i \ge 1$, since the waiting time at each intermediate point has mean $(1-r)^{-1}$. The number $N$ of such 'small' excursions has mass function $\mathbb{P}(N = n) = \alpha_i(1 - \alpha_i)^n$, $n \ge 0$, where $\alpha_i = \sum_{j\ge i} a_j$; hence $\mathbb{E}(N) = (1 - \alpha_i)/\alpha_i$. Each such small excursion has mean duration
$$\sum_{j=0}^{i-1}\Bigl(1 + \frac{j}{1-r}\Bigr)\frac{a_j}{1 - \alpha_i} = 1 + \sum_{j=0}^{i-1}\frac{j a_j}{(1 - \alpha_i)(1-r)},$$
and therefore $\mathbb{E}(T_i^{(2)})$ equals $\mathbb{E}(N)$ multiplied by this quantity. By a similar argument,
$$\mathbb{E}(T_i^{(3)}) = \frac{1}{\alpha_i}\sum_{j=i}^{\infty}\Bigl(1 + \frac{j-i}{1-r}\Bigr)a_j.$$
Combining this information, we obtain an expression for $\mathbb{E}(T_i)$, and a similar argument yields $\mathbb{E}(T_0) = 1 + \sum_j j a_j/(1-r)$. The apparent simplicity of these formulae suggests the possibility of an easier derivation; see Exercise (6.4.2). Clearly $\mathbb{E}(T_i) < \infty$ for $i \le J$ whenever $\sum_j j a_j < \infty$, a condition which certainly holds if $J < \infty$.

2. Assume that $0 < p < 1$. The mean jump-size is $3p - 1$, whence the chain is persistent if and only if $p = \tfrac{1}{3}$; see Theorem (5.10.17).
3. (a) All states are absorbing if $p = 0$. Assume henceforth that $p \ne 0$. Diagonalize $\mathbf{P}$ to obtain $\mathbf{P} = B\Lambda B^{-1}$, where $\Lambda = \mathrm{diag}(1,\ 1-2p,\ 1-4p)$ and the columns of $B$ are corresponding right eigenvectors. Therefore
$$\mathbf{P}^n = B\Lambda^n B^{-1},$$
whence $p_{ij}(n)$ is easily found. In particular,
$$p_{11}(n) = \tfrac{1}{4} + \tfrac{1}{2}(1-2p)^n + \tfrac{1}{4}(1-4p)^n, \qquad p_{22}(n) = \tfrac{1}{2} + \tfrac{1}{2}(1-4p)^n,$$
and $p_{33}(n) = p_{11}(n)$ by symmetry.
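The transition matrix itself is not reproduced above. As a numerical check of the displayed formulae, the Python sketch below assumes the three-state matrix written in the code, which has the eigenvalues $1$, $1-2p$, $1-4p$ and stationary distribution $(\tfrac14,\tfrac12,\tfrac14)$ used in this solution; it may differ from the matrix of Problem (6.3.3) only in labelling, and the check itself is generic.

import numpy as np

p = 0.3   # any 0 < p < 1/2 is fine for the check
# Assumed 3-state chain with eigenvalues 1, 1-2p, 1-4p (illustrative, see lead-in).
P = np.array([[1-2*p, 2*p,   0.0],
              [p,     1-2*p, p  ],
              [0.0,   2*p,   1-2*p]])
for n in range(1, 8):
    Pn = np.linalg.matrix_power(P, n)
    p11 = 0.25 + 0.5*(1-2*p)**n + 0.25*(1-4*p)**n
    p22 = 0.5 + 0.5*(1-4*p)**n
    assert abs(Pn[0, 0] - p11) < 1e-12 and abs(Pn[1, 1] - p22) < 1e-12
# Mean recurrence times are 1/pi_i with pi = (1/4, 1/2, 1/4), i.e. 4, 2, 4.
print("closed forms for p11(n), p22(n) verified")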


Now $F_{ii}(s) = 1 - P_{ii}(s)^{-1}$, where
$$P_{11}(s) = P_{33}(s) = \frac{1}{4(1-s)} + \frac{1}{2\{1 - s(1-2p)\}} + \frac{1}{4\{1 - s(1-4p)\}}, \qquad P_{22}(s) = \frac{1}{2(1-s)} + \frac{1}{2\{1 - s(1-4p)\}}.$$
After a little work one obtains the mean recurrence times $\mu_i = F_{ii}'(1)$: $\mu_1 = \mu_3 = 4$, $\mu_2 = 2$.
(b) The chain has period 2 (if $p \ne 0$), and all states are non-null and persistent. By symmetry, the mean recurrence times $\mu_i$ are equal. One way of calculating their common value (we shall encounter an easier way in Section 6.4) is to observe that the sequence of visits to any given state $j$ is a renewal process (see Example (5.2.15)). Suppose for simplicity that $p \ne 0$. The times between successive visits to $j$ must be even, and therefore we work on a new time-scale in which one new unit equals two old units. Using the renewal theorem (5.2.24), we obtain
$$p_{ij}(2n) \to \frac{2}{\mu_j} \text{ if } |j - i| \text{ is even}, \qquad p_{ij}(2n+1) \to \frac{2}{\mu_j} \text{ if } |j - i| \text{ is odd};$$
note that the mean recurrence time of $j$ in the new time-scale is $\tfrac{1}{2}\mu_j$. Now $\sum_j p_{ij}(m) = 1$ for all $m$, and so, letting $m = 2n \to \infty$, we find that $4/\mu = 1$ where $\mu$ is a typical mean recurrence time.

There is insufficient space here to calculate $p_{ij}(n)$. One way is to diagonalize the transition matrix. Another is to write down a family of difference equations of the form $p_{12}(n) = p\cdot p_{22}(n-1) + (1-p)\cdot p_{42}(n-1)$, and solve them.
4. (a) By symmetry, all states have the same mean recurrence time. Using the renewal-process argument of the last solution, the common value equals 8, being the number of vertices of the cube. Hence $\mu_v = 8$.
Alternatively, let $s$ be a neighbour of $v$, and let $t$ be a neighbour of $s$ other than $v$. In the obvious notation, by symmetry,
$$\mu_v = 1 + \tfrac{3}{4}\mu_{sv}, \qquad \mu_{sv} = 1 + \tfrac{1}{4}\mu_{sv} + \tfrac{1}{2}\mu_{tv}, \qquad \mu_{tv} = 1 + \tfrac{1}{2}\mu_{sv} + \tfrac{1}{4}\mu_{tv} + \tfrac{1}{4}\mu_{wv}, \qquad \mu_{wv} = 1 + \tfrac{1}{4}\mu_{wv} + \tfrac{3}{4}\mu_{tv},$$
a system of equations which may be solved to obtain $\mu_v = 8$.
(b) Using the above equations, $\mu_{wv} = \tfrac{40}{3}$, whence $\mu_{vw} = \tfrac{40}{3}$ by symmetry.
(c) The required number $X$ satisfies $\mathbb{P}(X = n) = \theta^{n-1}(1-\theta)^2$ for $n \ge 1$, where $\theta$ is the probability that the first return of the walk to its starting point precedes its first visit to the diametrically opposed vertex. Therefore
$$\mathbb{E}(X) = \sum_{n=1}^{\infty} n\theta^{n-1}(1-\theta)^2 = 1.$$
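The linear system displayed in part (a) is easily solved numerically; the following Python sketch does exactly that and confirms $\mu_v = 8$ and $\mu_{wv} = \tfrac{40}{3}$.

import numpy as np

# Unknowns x = (mu_sv, mu_tv, mu_wv), taken from the equations in part (a);
# then mu_v = 1 + (3/4) mu_sv.
A = np.array([[ 3/4, -1/2,  0  ],
              [-1/2,  3/4, -1/4],
              [ 0,   -3/4,  3/4]])
b = np.array([1.0, 1.0, 1.0])
mu_sv, mu_tv, mu_wv = np.linalg.solve(A, b)
print("mu_v  =", 1 + 0.75 * mu_sv)   # 8.0
print("mu_wv =", mu_wv)              # 13.333... = 40/3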

5. (a) Let $\mathbb{P}_i(\cdot) = \mathbb{P}(\cdot \mid X_0 = i)$. Since $i$ is persistent,
$$1 = \mathbb{P}_i(V_i = \infty) = \mathbb{P}_i(V_j = 0, V_i = \infty) + \mathbb{P}_i(V_j > 0, V_i = \infty) \le \mathbb{P}_i(V_j = 0) + \mathbb{P}_i(T_j < \infty, V_i = \infty) \le 1 - \mathbb{P}_i(T_j < \infty) + \mathbb{P}_i(T_j < \infty)\mathbb{P}_j(V_i = \infty),$$
by the strong Markov property. Since $i \to j$, we have that $\mathbb{P}_i(T_j < \infty) > 0$, so the above forces $\mathbb{P}_j(V_i = \infty) \ge 1$, which implies $\eta_{ji} = 1$. Also $\mathbb{P}_i(T_j < \infty) = 1$, and hence $j \to i$ and $j$ is persistent. This implies $\eta_{ij} = 1$.
(b) This is an immediate consequence of Exercise (6.2.4b).


6. Let $\mathbb{P}_i(\cdot) = \mathbb{P}(\cdot \mid X_0 = i)$. It is trivial that $\eta_j = 1$ for $j \in A$. For $j \notin A$, condition on the first step and use the Markov property to obtain
$$\eta_j = \sum_{k\in S} p_{jk}\mathbb{P}(T_A < \infty \mid X_1 = k) = \sum_k p_{jk}\eta_k.$$
If $x = (x_j : j \in S)$ is any non-negative solution of these equations, then $x_j = 1 \ge \eta_j$ for $j \in A$. For $j \notin A$,
$$x_j = \sum_{k\in S} p_{jk}x_k = \sum_{k\in A} p_{jk} + \sum_{k\notin A} p_{jk}x_k = \mathbb{P}_j(T_A = 1) + \sum_{k\notin A} p_{jk}x_k = \mathbb{P}_j(T_A = 1) + \sum_{k\notin A} p_{jk}\Bigl\{\sum_{i\in A} p_{ki} + \sum_{i\notin A} p_{ki}x_i\Bigr\} = \mathbb{P}_j(T_A \le 2) + \sum_{k\notin A} p_{jk}\sum_{i\notin A} p_{ki}x_i.$$
We obtain by iteration that, for $j \notin A$,
$$x_j = \mathbb{P}_j(T_A \le n) + \sum p_{jk_1}p_{k_1k_2}\cdots p_{k_{n-1}k_n}x_{k_n} \ge \mathbb{P}_j(T_A \le n),$$
where the sum is over all $k_1, k_2, \dots, k_n \notin A$. We let $n \to \infty$ to find that $x_j \ge \mathbb{P}_j(T_A < \infty) = \eta_j$.
7. The first part follows as in Exercise (6.3.6). Suppose $x = (x_j : j \in S)$ is a non-negative solution to the equations. As above, for $j \notin A$,
$$x_j = 1 + \sum_{k\notin A} p_{jk}x_k = \mathbb{P}_j(T_A \ge 1) + \sum_{k\notin A} p_{jk}\Bigl(1 + \sum_{i\notin A} p_{ki}x_i\Bigr) = \mathbb{P}_j(T_A \ge 1) + \mathbb{P}_j(T_A \ge 2) + \cdots + \mathbb{P}_j(T_A \ge n) + \sum p_{jk_1}p_{k_1k_2}\cdots p_{k_{n-1}k_n}x_{k_n} \ge \sum_{m=1}^{n}\mathbb{P}_j(T_A \ge m),$$
where the penultimate sum is over all paths of length $n$ that do not visit $A$. We let $n \to \infty$ to obtain that $x_j \ge \mathbb{E}_j(T_A) = \rho_j$.
8. Yes, because the $S_r$ and $T_r$ are stopping times whenever they are finite. Whether or not the exit times are stopping times depends on their exact definition. The times $U_r = \min\{k > U_{r-1} : X_k \in A,\ X_{k+1} \notin A\}$ are not stopping times, but the times $U_r + 1$ are stopping times.
9. (a) Using the aperiodicity of $j$, there exist integers $r_1, r_2, \dots, r_s$ having highest common factor 1 and such that $p_{jj}(r_k) > 0$ for $1 \le k \le s$. There exists a positive integer $M$ such that, if $r \ge M$, then $r = \sum_{k=1}^{s} a_k r_k$ for some sequence $a_1, a_2, \dots, a_s$ of non-negative integers. Now, by the Chapman--Kolmogorov equations,
$$p_{jj}(r) \ge \prod_{k=1}^{s} p_{jj}(r_k)^{a_k} > 0,$$
so that $p_{jj}(r) > 0$ for all $r \ge M$. Finally, find $m$ such that $p_{ij}(m) > 0$. Then
$$p_{ij}(m + r) \ge p_{ij}(m)p_{jj}(r) > 0 \qquad\text{if } r \ge M.$$
(b) Since there are only finitely many pairs $i, j$, the maximum $R(\mathbf{P}) = \max\{N(i, j) : i, j \in S\}$ is finite. Now $R(\mathbf{P})$ depends only on the positions of the non-zero entries in the transition matrix $\mathbf{P}$.


There are only finitely many subsets of entries of $\mathbf{P}$, and so there exists $f(n)$ such that $R(\mathbf{P}) \le f(n)$ for all relevant $n \times n$ transition matrices $\mathbf{P}$.
(c) Consider the two chains with diagrams in the figure beneath. In the case on the left, we have that $p_{11}(5) = 0$, and in the case on the right, we may apply the postage stamp lemma with $a = n$ and $b = n - 1$.
[The two chain diagrams are not reproduced here.]

10. Let $X_n$ be the number of green balls after $n$ steps. Let $e_j$ be the probability that $X_n$ is ever zero when $X_0 = j$. By conditioning on the first removal,
$$e_j = \frac{j+2}{2(j+1)}e_{j+1} + \frac{j}{2(j+1)}e_{j-1}, \qquad j \ge 1,$$
with $e_0 = 1$. Solving recursively gives
$$(*)\qquad e_j = 1 - (1 - e_1)\Bigl\{1 + \frac{q_1}{p_1} + \cdots + \frac{q_1q_2\cdots q_{j-1}}{p_1p_2\cdots p_{j-1}}\Bigr\},$$
where
$$p_j = \frac{j+2}{2(j+1)}, \qquad q_j = 1 - p_j = \frac{j}{2(j+1)}.$$
It is easy to see that
$$\sum_{r=0}^{j-1}\frac{q_1q_2\cdots q_r}{p_1p_2\cdots p_r} = 2 - \frac{2}{j+1} \to 2 \qquad\text{as } j \to \infty.$$
By the result of Exercise (6.3.6), we seek the minimal non-negative solution $(e_j)$ to $(*)$, which is attained when $2(1 - e_1) = 1$, that is, $e_1 = \tfrac{1}{2}$. Hence
$$e_j = 1 - \tfrac{1}{2}\sum_{r=0}^{j-1}\frac{q_1q_2\cdots q_r}{p_1p_2\cdots p_r} = \frac{1}{j+1}.$$
For the second part, let $d_j$ be the expected time until $j - 1$ green balls remain, having started with $j$ green balls and $j + 2$ red. We condition as above to obtain
$$d_j = 1 + \frac{j}{2(j+1)}\{d_{j+1} + d_j\}.$$
We set $\epsilon_j = d_j - (2j+1)$ to find that $(j+2)\epsilon_j = j\epsilon_{j+1}$, whence $\epsilon_j = \tfrac{1}{2}j(j+1)\epsilon_1$. The expected time to remove all the green balls is
$$\sum_{j=1}^{n} d_j = \sum_{j=1}^{n}\{\epsilon_j + (2j+1)\} = n(n+2) + \epsilon_1\sum_{j=1}^{n}\tfrac{1}{2}j(j+1).$$
The minimal non-negative solution is found by setting $\epsilon_1 = 0$, and the conclusion follows by Exercise (6.3.7).
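Both identities used above — that $e_j = 1/(j+1)$ satisfies the recursion, and that the partial sums of $q_1\cdots q_r/(p_1\cdots p_r)$ equal $2 - 2/(j+1)$ — may be checked with exact arithmetic. A short Python sketch (the range $j \le 49$ is arbitrary):

from fractions import Fraction

# e_j = 1/(j+1) should satisfy e_j = ((j+2)e_{j+1} + j e_{j-1}) / (2(j+1)), e_0 = 1.
e = lambda j: Fraction(1, j + 1)
for j in range(1, 50):
    rhs = ((j + 2) * e(j + 1) + j * e(j - 1)) / (2 * (j + 1))
    assert e(j) == rhs

# Partial sums of q_1...q_r / (p_1...p_r), with q_j/p_j = j/(j+2).
total, ratio_prod = Fraction(0), Fraction(1)
for j in range(1, 50):
    total += ratio_prod                 # adds the r = j-1 term
    assert total == 2 - Fraction(2, j + 1)
    ratio_prod *= Fraction(j, j + 2)
print("identities verified for j = 1, ..., 49")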


6.4 Solutions. Stationary distributions and the limit theorem

1. Let Yn be the number of new errors introduced at the nth stage, and let G be the common probability generating function of the Yn . Now Xn+1 = Sn + Yn+1 where Sn has the binomial distribution with parameters Xn and q (= 1 - p). Thus the probability generating function Gn of Xn satisfies

Gn+I (S) = G(s)JE.(sSn ) = G(s)JE.{JE.(sSn I Xn) } = G(s)JE.{ (p + qs)Xn } = G(s)Gn (p + qs) = G(s)Gn ( 1 - q( 1 - s)) .

Therefore, for s < 1 ,

Gn (s) = G(s)G (1 - q ( 1 - s) ) Gn-2( 1 - q2 (1 - s)) = . . . n- l 00

= Go ( l - qn ( 1 - s)) II G ( 1 - qr ( 1 - s)) � II G ( 1 - qr ( 1 - s)) r=O r=O

as n � 00, assuming q < 1 . This infinite product is therefore the probability generating function of the stationary distribution whenever this exists . If G(s) = eA(s- l ) , then

IT G ( 1 - qr ( 1 - s)) = exp{ J.. (s - 1 ) f qr } = eA(s- I )/p , r=O r=O

so that the stationary distribution is Poisson with parameter J.. / p. 2. (6.3 . 1 ) : Assume for simplicity that sup{j : aj > O} = 00 . The chain is irreducible if r < 1 . Look for a stationary distribution 1C with probability generating function G . We have that

Hence sG(s) = 1CosA(s) + rs (G (s) - 1Co) + ( 1 - r ) (G (s) - 1CO)

where A(s) = 'L.1=o ajsj , and therefore

G(s) = 1Co ( SA(S) - (1 - r + sr) ) . ( 1 - r ) (s - 1 )

Taking the limit as s t 1 , we obtain by L'Hopital's rule that

(A' (1 ) + 1 - r) G(1 ) = 1Co . l - r

There exists a stationary distribution if and only if r < 1 and A' ( 1 ) < 00, in which case

sA(s) - (l - r + sr) G (s ) = --:-(s------:-:-l)--:-(A-,-,':-:-(1:-:-) -+-I:-----:-r) "

Hence the chain is non-null persistent if and only if r < 1 and A' ( 1 ) < 00. The mean recurrence time itj is found by expanding G and setting JLj = I /TCj . (6.3 .2) : Assume that 0 < p < 1 , and suppose first that p =1= � . Look for a solution {Yj : j =1= O} of the equations

i =1= 0,


as in (6.4. 10) . Away from the origin, this equation is Yi = qYi- l + PYi+Z where P + q = 1 , with auxiliary equation pe3 - e + q = O. Now pe3 - e + q = p(e - l) (e - a)(e - (3) where

-p - VpZ + 4pq a = < - 1 2p , f3 = -p + VpZ + 4pq > 0. 2p

Note that 0 < f3 < 1 if p > 1 , while f3 > 1 if p < 1 . For p > 1 , set

{ A + Bf3i if i � 1 , Yi =

C + Dai if i :::; - 1 , the constants A , B , C , D being chosen in such a manner as to 'patch over' the omission of 0 in the equations (*):

Y-z = qY-3 , Y- l = qy-Z + PYl , Yl = PY3 ·

The result is a bounded non-zero solution {Yj } to (*), and it follows that the chain is transient. For p < 1 , follow the same route with

{ 0 if i � 1 , Yi =

C + Dai + Ef3i if i :::; - 1 , the constants being chosen such that Y-Z = qY-3 , Y- l = qy-z ·

Finally suppose that p = 1 , so that a = -2 and f3 = 1 . The general solution to (*) is

. _ { A + Bi + Cai if i � 1 , YI - D + Ei + Fai if i :::; - I ,

subject to (**). Any bounded solution has B = E = C = 0, and (**) implies that A = D = F = o. Therefore the only bounded solution to (*) is the zero solution, whence the chain is persistent. The equation x = xP is satisfied by the vector x of 1 's ; by an appeal to (6.4.6), the walk is null. (6.3 .3): (a) Solve the equation n = nP to find a stationary distribution n = ( ! , � , ! ) when p oF O. Hence the chain is non-null and persistent, with /.Ll = ni l = 4, and similarly /.Lz = 2, /.L3 = 4. (b) Similarly, n = ( ! , ! , ! , ! ) is a stationary distribution, and /.Li = nj- l = 4. (6.3.4): (a) The stationary distribution may be found to be nj = � for all i , so that /.Lv = 8.

3. The quantities X 1 . X z ' . . . , Xn depend only on the initial contents of the reservoir and the rainfalls Yo , Yl , . . . , Yn- l . The contents on day n + 1 depend only on the value Xn of the previous contents and the rainfall Yn . Since Yn is independent of all earlier rainfalls, the process X is a Markov chain. Its state space is S = {O , 1 , 2, . . . , K - I } and it has transition matrix [ gO + gl gZ g3 . . .

gO gl gZ . . . IP' =

0 gO gl . . . · " . · . , . · . . . o 0 0

gK- l GK 1 gK-Z GK- l gK-3 GK-Z

· . · . · . gO G l

where gj = IP'(YI = i ) and Gj = 'L-1=j gj . The equation n = n P is as follows :

Jro = Jro (gO + gl ) + JrlgO , Jrr = JrOgr+l + Jrlgr + . . . + Jrr+ l gO ,

0 < r < K - 1 , JrK- l = JrOGK + JrlGK- l + . . . + JrK- I G l ·


The final equation is a consequence of the previous ones, since ��OI 1T:i = 1 . Suppose then that v = (VI , V2 , . . . ) is an infinite vector satisfying

Vo = vo (gO + gl ) + vIgO , Vr = vOgr+1 + vI gr + . . . + vr+l gO for r > O.

Multiply through the equation for Vr by sr+ l , and sum over r to find (after a little work) that

00

N(s) = L Vj Si , i=O

00

G(s) = L gisi i=O

satisfy sN(s) = N(s)G(s) + vOgo (s - 1 ) , and hence

1 go (s - 1 ) -N(s) = . Vo s - G (s)

The probability generating function of the 1T:j i s therefore a constant mUltiplied by the coefficients of so , s l , . . . , sK - I in go (s - l)/(s - G(s» , the constant being chosen in such a way that ��OI 1T:i = 1 .

When G(s) = p(1 - qs)- I , then gO = P and

go (s - 1 ) p( 1 - qs) q ----'----:-:---:---:- = = p + ---,--:-:-s - G(s) p - qs l - (qs/p)

The coefficient of sO is 1 , and of si is qi+ 1 / pi if i � 1 . The stationary distribution is therefore given by 1T:j = q1T:O (q/p)i for i � 1 , where

1T:o = 1

. _ p - q 1 + �f-I q (q/p)1 - P - q + q2 ( 1 - (q/p)K- I )

if p # q, and 1T:O = 2/(K + 1 ) if p = q = i . 4. The transition matrices

have respective stationary distributions 1T: I = (p, 1 - p) and 1T: 2 = ( �p , � p, � (1 - p) , � (1 - p») for any 0 � p � 1 . 5. (a) Set i = 1 , and find an increasing sequence n l ( 1 ) , n l (2) , . . . along which xI (n) converges. Now set i = 2, and find a subsequence of (n l (j) : j � 1 ) along which x2 (n) converges ; denote this subsequence by n2 ( 1 ) , n2 (2) , . . . . Continue inductively to obtain, for each i , a sequence Di =

(ni (j) : j � 1 ) , noting that: (i) Di+ I is a subsequence of Dj , and (ii) limr-+oo Xj (ni (r» exists for all i . Finally, define mk = nk (k) . For each i � 1 , the sequence mj , mi+ ! , . . . is a subsequence of Di , and therefore limr-+oo Xi (mr ) exists . (b) Let S be the state space of the irreducible Markov chain X. There are countably many pairs i , j of states, and part (a) may be applied to show that there exists a sequence (nr : r � 1 ) and a family (Olij : i , j E S), not all zero, such that pjj (nr ) -+ Olij as r -+ 00.


Now X is persistent, since otherwise Pij (n) � 0 for all i, j . The coupling argument in the proof of the ergodic theorem (6.4. 17) is valid, so that Paj (n) - Pbj (n) � 0 as n � 00, implying that fXaj = fXbj for all a , b , j . 6. Just check that 1r satisfies 1r = 1rP and Ev TCv = 1 . 7. Let Xn be the Markov chain which takes the value r if the walk is at any of the 2r nodes at level r . Then Xn executes a simple random walk with retaining barrier having P = 1 - q = i , and it is thus transient by Example (6.4. 1 5). 8. Assume that Xn includes particles present just after the entry of the fresh batch Yn . We may write

Xn Xn+l = L Bi,n + Yn

i= 1 where the Bi,n are independent Bernoulli variables with parameter 1 - p. Therefore X is a Markov chain. It follows also that

Gn+l (s) = lE(sXn+l ) = Gn (p + qs)eA(s- I ) .

In equilibrium, Gn+! = Gn = G, where G(s) = G(p + qs)eA(s- I) . There is a unique stationary distribution, and it is easy to see that G (s ) = eA(s - 1) / P must therefore be the solution. The answer is the Poisson distribution with parameter Alp. 9. The Markov chain X has a uniform transition distribution Pjk = l /(j + 2) , 0 � k � j + 1 . Therefore,

lE(Xn) = lE (lE(Xn I Xn- l ») = � ( 1 + lE(Xn- l ») = . . .

= 1 - (� )n + ( � )nXo .

The equilibrium probability generating function satisfies

whence

x X { 1 - sXn+2 } G(s) = lE(s n ) = lE (lE(s n I Xn_d) = lE ) 2) ' ( 1 - s (Xn +

d ds { ( 1 - s)G(s) } = -sG(s) ,

subject to G(l ) = 1 . The solution is G(s) = es- 1 , which is the probability generating function of the Poisson distribution with parameter 1 . 10. This is the claim of Theorem (6.4. 1 3) . Without loss of generality we may take s = 0 and the Yj to be non-negative (since if the Yj solve the equations, then so do Yj + c for any constant c). Let T be the matrix obtained from P by deleting the row and column labelled 0, and write Tn = (tij (n) : i , j 1= 0) . Then Tn includes all the n-step probabilities of paths that never visit zero.

We claim first that, for all i , j it is the case that tij (n) � 0 as n � 00. The quantity tij (n) may be thought of as the n-step transition probability from i to j in an altered chain in which s has been made absorbing. Since the original chain is assumed irreducible, all states communicate with s, and therefore all states other than s are transient in the altered chain, implying by the summability of tij (n) (Corollary (6.2.4» that tij (n) � 0 as required.

Iterating the inequality y ;:: Ty yields y ;:: Tny, which is to say that 00 00

Yi ;:: L tij (n)Yj ;:: min{Yr+s } L tij (n) , i ;:: 1 . j= 1 s:::: 1 j=r+l


Let An = {Xk :;f O for k � n } . For i ;:: 1 ,

00

IP'(Aoo I Xo = i ) = lim IP'(An I Xo = i ) = � tij (n) n-+oo L...J

Let E > 0, and pick R such that

j= 1

1 · ( ) Yi { r } < 1m ti · n + . . - n-+oo .r; J mtns;::: dYr+s }

Yi ----,--- < E .

mins;::: dYR+s }

Take r = R and let n -+ 00, implying that IP'(Aoo I Xo = i ) = o. It follows that 0 is persistent, and by irreducibility that all states are persistent.

11. By Exercise (6.4.6), the stationary distribution is 7r:A = 7r:B = 7rD = 7r:E = j; , 7r:C = 1 · (a) By Theorem (6.4.3), the answer is J1-A = 1 /7r:A = 6. (b) By the argument around Lemma (6.4.5), the answer is PD (A) = 7rDJ1-A = 7r:D/7r:A = 1 . (c) Using the same argument, the answer is pc(A) = 7r:C/7r:A = 2. (d) LetlP'i ( ·) = IP'(. I Xo = i), let 1j be the time of the first passage to state j , and let vi = lP'i (h < TE) . By conditioning on the first step,

·th 1 . 5 3 1 1 WI so utlon VA = 8 ' VB = 'I> Vc = 2 ' vn = 4 · A typical conditional transition probability Lij = lP'i (X 1 = j I TA < TE) is calculated as follows :

and similarly,

2 2 1 1 3 1 1 LAC = S ' LBA = 3 ' iBC = 3 ' LCA = 2 ' LCB = "8" ' LCD = 8 ' LDC = . We now compute the conditional expectations Tii = lEi (TA I TA < TE) by conditioning on the first step of the conditioned process . This yields equations of the form TiA = 1 + � TiB + � Tic , whose solution gives TiA = 1

54 •

(e) Either use the stationary distribution of the conditional transition matrix T , or condition on the first step as follows. With N the number of visits to D, and TJi = lEi (N I TA < TE) , we obtain

3 2 TJA = S TJB + s TJC ,

whence in particular TJ A = lo · - 0 1 TJB - + 3 TJC ,

12. By Exercise (6.4.6), the stationary distribution has 7r:A = �, 7r:B around Lemma (6.4.5), the answer is PB (A) = 7r:BJ1-A = 7r:B/7r:A = 2.

285

TJD = TJc ,

t . Using the argument


6.5 Solutions. Reversibility

1. Look for a solution to the equations 1T:i Pij = 1T:j Pji . The only non-trivial cases of interest are those with j = i + 1 , and therefore Ai1T:i = tLi+ l1T:i+ l for 0 :::; i < b, with solution

0 :::; i :::; b,

an empty product being interpreted as 1 . The constant 1T:o i s chosen in order that the 1T:i sum to 1 , and the chain is therefore time-reversible. 2. Let1T: be the stationary distribution of X, and suppose X is reversible. We have that 1T:i Pij = Pji1T:j for all i , j , and furthermore 1T:i > 0 for all i . Hence

1T:iPij PjkPki = Pji1T:jPjkPki = Pji Pkj1T:kPki = Pji Pkj Pik1T:i

as required when n = 3. A similar calculation is valid when n > 3 . Suppose conversely that the given display holds for all finite sequences of states . Sum over all

values of the subsequence h, . . . , jn- l to deduce that Pij (n - 1 )Pji = Pij Pj i (n - 1 ) , where i = iI , j = jn . Take the limit as n -+ 00 to obtain 1T:j Pji = Pij 1T:i as required for time-reversibility. 3. With 1T: the stationary distribution of X, look for a stationary distribution v of Y of the form

There are four cases.

Vi = { c{31T:i C1T:i

if i fl C, if i E C.

(a) i E C, j fI C: Vi qjj = C1T:i{3Pij = c{31T:jPji = Vjqj i , (b) i, j E C : Viqjj = C1T:i Pij = C1T:jPji = Vjqji , (c) i fI c, j E C: Vi qjj = c{31T:i Pij = c{31T:j Pji = Vjqj i , (d) i , j fI C : viqij = c{31T:i Pij = c{31T:jPji = Vjqj i · Hence the modified chain is reversible in equilibrium with stationary distribution v, when

In the limit as {3 + 0, the chain Y never leaves the set C once it has arrived in it. 4. Only if the period is 2, because of the detailed balance equations. 5. With Yn = Xn - im,

Now iterate. 6. (a) The distribution 1T:l = f3!(a + {3) , 1T:2 = a/(a + {3) satisfies the detailed balance equations, so this chain is reversible. (b) By symmetry, the stationary distribution is Jr: = ( 1 , 1 , 1 ) , which satisfies the detailed balance equations if and only if P = i . (c) This chain is reversible if and only if P = i .


7. A simple random walk which moves rightwards with probability P has a stationary measure 7rn = A(p/q)n , in the sense that 1r: is a vector satisfying 1r: = 1r:P. It is not necessarily the case that this 1r: has finite sum. It may then be checked that the recipe given in the solution to Exercise (6.5.3) yields 7r(i , j) = p{p4 ; L:(r,s )ec PI P� as stationary distribution for the given process, where C is the relevant region of the plane, and Pi = Pi /qi and Pi (= 1 - qi ) is the chance that the i th walk moves rightwards on any given step. S. Since the chain is irreducible with a finite state space, we have that tri > 0 for all i . Assume the chain is reversible. The balance equations tri Pij = 7rj Pji give Pij = 7rj Pji /7ri . Let D be the matrix with entries 1 /7ri on the diagonal, and S the matrix (7rj Pji ) , and check that P = DS.

Conversely, ifP = DS, then dj- l Pij = dT I Pj i , whence tri = di- l / L:k d;; I satisfies the detailed balance equations.

Note that

Pij = � � Pij.j1fj.

If the chain is reversible in equilibrium, the matrix M = (J7ri /7rj Pij ) is symmetric, and therefore M, and, by the above, P, has real eigenvalues. An example of the failure of the converse is the transition matrix

p = ( i � i ) , 1 0 0

which has real eigenvalues 1 , and - i (twice), and stationary distribution 1r: = ( � , � , � ) . However, 7rI P13 = 0 i= � = 7r3P3 l > so that such a chain is not reversible. 9. Simply check the detailed balance equations tri Pij = 7rj Pji .
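A direct numerical test of reversibility is to compute the stationary distribution and check the detailed balance equations $\pi_i p_{ij} = \pi_j p_{ji}$. The Python sketch below does this for two illustrative matrices of my own choosing (not the ones in the exercises): a birth--death chain, which is reversible as in Exercise (6.5.1), and a chain with a preferred direction of circulation, which is not.

import numpy as np

def stationary(P):
    # Left eigenvector of P for eigenvalue 1, normalised to a distribution.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    return pi / pi.sum()

def is_reversible(P, tol=1e-10):
    pi = stationary(P)
    flow = pi[:, None] * P            # matrix (pi_i p_ij)
    return bool(np.all(np.abs(flow - flow.T) < tol))

# Birth-death chain on {0,...,5} with illustrative up/down probabilities.
b, up, down = 5, 0.3, 0.2
P1 = np.zeros((b + 1, b + 1))
for i in range(b + 1):
    if i < b: P1[i, i + 1] = up
    if i > 0: P1[i, i - 1] = down
    P1[i, i] = 1 - P1[i].sum()

# Three-state chain that circulates preferentially one way round.
P2 = np.array([[0.0, 0.9, 0.1],
               [0.1, 0.0, 0.9],
               [0.9, 0.1, 0.0]])

print(is_reversible(P1), is_reversible(P2))   # True False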

6.6 Solutions. Chains with finitely many states

1. Let P = (Pij : I � i, j � n) be a stochastic matrix and let C be the subset of JRn containing all vectors x = (X l , x2 , . . . , xn ) satisfying Xi ;::: 0 for all i and L:i= l Xi = 1 ; for x E C, let I l x l l = maxj {Xj } . Define the linear mapping T : C --+ JRn by T(x) = xP. Let us check that T is a continuous function from C into C. First,

I I T (x) 1 1 = jX{ 2; Xi Pij } � a l lx l l

I where

hence I I T (x) - T (y) 1 1 � a l lx - Y I I . Secondly, T (x)j ;::: 0 for all j , and

L T (x)j = L L Xi Pij = L Xi L Pij = 1 . j j i j

Applying the given theorem, there exists a point 7r in C such that T (7r) = 7r , which is to say that 7r = nP. 2. Let P be a stochastic m x m matrix and let T be the m x (m + I) matrix with (i , j)th entry

{ Pi · - 8i · t . . - J J IJ -

I

if j � m , if j = m + 1 ,


where 8ij is the Kronecker delta. Let v = (0, 0, . . . , 0, 1 ) E jRm+ 1 . If statement (ii) of the question is valid, there exists y = (YI , Y2 , . . . , Ym+I ) such that

this implies that

Ym+I < 0,

m

m � (pjj - 8ij )Yj + Ym+I � ° for 1 � i � m; j= I

� Pij Yj � Yi - Ym+I > Yi j= I

for all i ,

and hence the impossibility that L:j=I Pij Yj > maxi {yj } . It follows that statement (i) holds, which is to say that there exists a non-negative vector x = (Xl , x2 , . . . , xm ) such that x(P - I) = 0 and L:i=I Xi = 1 ; such an x is the required eigenvector. 3. Thinking of xn+ 1 as the amount you may be sure of winning, you seek a betting scheme x such that xn+1 is maximized subject to the inequalities

n xn+I � � xi tij for 1 � j � m.

i=I Writing aij = -tij for 1 � i � n and an+I , j = 1 , we obtain the linear program:

n+I maximize xn+I subject to � Xiaij � ° for 1 � j � m.

i= I The dual linear program is:

m minimize ° subject to � aij Yj = ° for 1 � i � n ,

j= I m � an+1 , jYj = 1 , Yj � ° for 1 � j � m . j=I

Re-expressing the aij in terms of the tij as above, the dual program takes the form: m

minimize ° subject to � tij Pj = ° for 1 � i � n , j= I m � Pj = 1 , Pj � ° for 1 � j � m . j=I

The vector x = 0 i s a feasible solution of the primal program. The dual program has a feasible solution if and only if statement (a) holds. Therefore, if (a) holds, the dual program has minimal value 0, whence by the duality theorem of linear programming, the maximal value of the primal program is 0, in contradiction of statement (b) . On the other hand, if (a) does not hold, the dual has no feasible solution, and therefore the primal program has no optimal solution. That is, the objective function of the primal is unbounded, and therefore (b) holds . [This was proved by De Finetti in 1937 .] 4. Use induction, the claim being evidently true when n = 1 . Suppose it is true for n = m. Certainly pm+I is of the correct form, and the equation pm+lx' = p(pmx') with x = ( 1 , W, w2) yields in its first row


as required. 5. The first part follows from the fact that K if = 1 if and only if KU = 1 . The second part follows from the fact that 1{j > 0 for all i if P is finite and irreducible, since this implies the invertibility of J - P + U.

6. The chessboard corresponds to a graph with 8 x 8 = 64 vertices, pairs of which are connected by edges when the corresponding move is legitimate for the piece in question. By Exercises (6.4.6), (6.5.9), we need only check that the graph is connected, and to calculate the degree of a comer vertex. (a) For the king there are 4 vertices of degree 3, 24 of degree 5, 36 of degree 8. Hence, the number of edges is 210 and the degree of a comer is 3. Therefore JL(king) = 420/3 = 140. (b) JL(queen) = (28 x 21 + 20 x 23 + 12 x 25 + 4 x 27)/21 = 208/3 . (c) We restrict ourselves to the set of 32 vertices accessible from a given comer. Then JL(bishop) =

( 14 x 7 + 10 x 9 + 6 x 1 1 + 2 x 1 3)/7 = 40. (d) JL(knight) = (4 x 2 + 8 x 3 + 20 x 4 + 16 x 6 + 16 x 8)/2 = 168. (e) JL(rook) = 64 x 14/ 14 = 64. 7. They are walking on a product space of 8 x 16 vertices. Of these, 6 x 16 have degree 6 x 3 and 16 x 2 have degree 6 x 5. Hence

JL(C) = (6 x 16 x 6 x 3 + 16 x 2 x 6 x 5)/ 1 8 = 448/3.
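The mean recurrence times in solution 6 are all of the form (sum of degrees)/(degree of the corner), by Exercises (6.4.6) and (6.5.9). The short Python sketch below recounts the degrees directly from the standard move sets for the knight and the king, as a numerical check of parts (a) and (d).

def corner_recurrence_time(moves):
    deg = {}
    for x in range(8):
        for y in range(8):
            deg[(x, y)] = sum(0 <= x + dx < 8 and 0 <= y + dy < 8
                              for dx, dy in moves)
    return sum(deg.values()) / deg[(0, 0)]

knight = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]
king = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

print("knight:", corner_recurrence_time(knight))   # 168.0
print("king:  ", corner_recurrence_time(king))     # 140.0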

8. IP - Al l = (A - l ) (A + � ) (A + j;) . Tedious computation yields the eigenvectors, and thus ( 1 1 pn = 1 1 1

1 1

6.7 Solutions. Branching processes revisited

1. We have (using Theorem (5.4.3), or the fact that $G_{n+1}(s) = G(G_n(s))$) that the probability generating function of $Z_n$ is
$$G_n(s) = \frac{n - (n-1)s}{n + 1 - ns},$$
so that
$$\mathbb{P}(Z_n = k) = \Bigl(\frac{n}{n+1}\Bigr)^{k+1} - \Bigl(\frac{n-1}{n+1}\Bigr)\Bigl(\frac{n}{n+1}\Bigr)^{k-1} = \frac{n^{k-1}}{(n+1)^{k+1}}$$
for $k \ge 1$. Therefore, for $y > 0$, as $n \to \infty$,
$$\mathbb{P}(Z_n \le 2yn \mid Z_n > 0) = \frac{1}{1 - G_n(0)}\sum_{k=1}^{\lfloor 2yn\rfloor}\frac{n^{k-1}}{(n+1)^{k+1}} = 1 - \Bigl(1 + \frac{1}{n}\Bigr)^{-\lfloor 2yn\rfloor} \to 1 - e^{-2y}.$$
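The convergence to $1 - e^{-2y}$ can be watched numerically using the exact mass function above; a minimal Python sketch (the value $y = 0.7$ is an arbitrary choice):

import math

def cond_prob(n, y):
    # P(Z_n <= 2yn | Z_n > 0), using P(Z_n = k) = (n/(n+1))^{k-1}/(n+1)^2
    # and P(Z_n > 0) = 1/(n+1).
    kmax = int(2 * y * n)
    p = sum((n / (n + 1)) ** (k - 1) for k in range(1, kmax + 1)) / (n + 1) ** 2
    return p * (n + 1)

y = 0.7
for n in (10, 100, 1000):
    print(n, round(cond_prob(n, y), 4))
print("limit:", round(1 - math.exp(-2 * y), 4))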

2. Using the independence of different lines of descent,

z . . � skp(Zn = k, extinction) � skp(Zn = k)r/ 1 ( E(s n I extinction) = L...J . . = L...J = -Gn S 1/) ,

k=O P(extinction) k=O 1/ 1/

where Gn is the probability generating function of Zn .


3. We have that TJ = G(TJ) . In this case G(s ) = q ( 1 - ps )- 1 , and therefore TJ = q / p. Hence

which is Gn (s) with p and q interchanged. 4. (a) Using the fact that var(X I X > 0) � 0,

E(X2) = E(X2 I X > O)IP'(X > 0) � E(X I X > 0)21P'(X > 0) = E(X)E(X I X > 0) .

(b) Hence n E(Z�) 2 E(Zn/ /-L I Zn > 0) � /-LnE(Zn ) = E(Wn )

where Wn = Zn/E(Zn ) . By an easy calculation (see Lemma (5.4.2)),

where a2 = var(ZI ) = p/q2 . (c) Doing the calculation exactly,

E Z n Z > 0 _ E(Zn//-Ln ) _ 1 1 ( n //-L I n ) - IP'(Zn > 0) - 1 - Gn (O) -+ 1 - TJ

where TJ = lP'(ultimate extinction) = q/p.

6.8 Solutions. Birth processes and the Poisson process

1. Let F and W be the incoming Poisson processes, and let N(t) = F(t)+ W(t) . Certainly N(O) = 0 and N is non-decreasing. Arrivals of flies during [0, s] are independent of arrivals during (s , t] , if s < t ; similarly for wasps . Therefore the aggregated arrival process during [0, s] is independent of the aggregated process during (s , t] . Now

where

IP' (N(t + h) = n + 1 1 N(t) = n) = IP'(A f:. B)

A = {one fly arrives during (t , t + hJ ) , B = {one wasp arrives during (t, t + hJ ) .

We have that

IP'(A f:. B) = IP'(A) + IP'(B) - IP'(A n B) = Ah + /-Lh - (Ah) (/-Lh) + o(h) = (A + /-L)h + o(h) .

Finally IP' (N(t + h) > n + 1 1 N(t) = n) � IP'(A n B) + IP'(C U D) ,


where C = {two or more flies arrive in (t , t + h] ) and D = {two or more wasps arrive in (t , t + h] ) . This probability is no greater than (Ah) (f.Lh) + o(h) = o (h) . 2. Let 1 be the incoming Poisson process, and let G be the process of arrivals of green insects. Matters of independence are dealt with as above. Finally,

lP' (G(t + h) = n + 1 1 G (t) = n ) = plP' (I (t + h) = n + 1 1 l (t) = n ) + o(h) = pAh + o(h) , lP' (G(t + h) > n + 1 1 G (t) = n ) � lP' (I (t + h) > n + 1 1 I (t) = n ) = o(h) .

3. Conditioning on $T_1$ and using the time-homogeneity of the process,
$$\mathbb{P}(E(t) > x \mid T_1 = u) = \begin{cases} \mathbb{P}(E(t-u) > x) & \text{if } u \le t,\\ 0 & \text{if } t < u \le t + x,\\ 1 & \text{if } u > t + x,\end{cases}$$
(draw a diagram to help you see this). Therefore
$$\mathbb{P}(E(t) > x) = \int_0^{\infty}\mathbb{P}(E(t) > x \mid T_1 = u)\lambda e^{-\lambda u}\,du = \int_0^{t}\mathbb{P}(E(t-u) > x)\lambda e^{-\lambda u}\,du + \int_{t+x}^{\infty}\lambda e^{-\lambda u}\,du.$$
You may solve the integral equation using Laplace transforms. Alternatively you may guess the answer and then check that it works. The answer is $\mathbb{P}(E(t) \le x) = 1 - e^{-\lambda x}$, the exponential distribution. Actually this answer is obvious, since $E(t) > x$ if and only if there is no arrival in $[t, t+x]$, an event having probability $e^{-\lambda x}$.
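The exponential form of the excess lifetime is also easy to confirm by simulation; a minimal Python sketch (parameter values are illustrative):

import random, math

def excess(lam, t):
    # Run a Poisson process of rate lam past time t and return the excess lifetime.
    s = 0.0
    while True:
        s += random.expovariate(lam)
        if s > t:
            return s - t

random.seed(1)
lam, t, n, x = 2.0, 5.0, 200_000, 0.5
samples = [excess(lam, t) for _ in range(n)]
empirical = sum(e > x for e in samples) / n
print(round(empirical, 3), round(math.exp(-lam * x), 3))   # both close to exp(-1)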

with boundary conditions Pij (O) = 8ij ' the Kronecker delta. We write Gi (S , t) = Lj sj Pij (t) , the probability generating function of B(t) conditional on B(O) = i . Multiply through the differential equation by sj and sum over j :

a partial differential equation with boundary condition G i (s , 0) = s i . This may be solved in the usual way to obtain G i (s , t) = g (eAt (l - s - 1 )) for some function g. Using the boundary condition, we find that g(1 - s- l ) = s i and so g(u) = (1 - u)-i , yielding

The coefficient of sj is, by the binomial series,

j ?:. i ,

as required.


Alternatively use induction. Set j = i to obtain Pli (t) = -Ai PH (t) (remember Pi, i- l (t) = 0), and therefore PH (t) = e-Ait . Rewrite the differential equation as

d A · t A · t dt (Pij (t)e J ) = A(j - l )Pi, j- l (t)e J •

Set j = i + 1 and solve to obtain Pi, i+ 1 (t) = i e-Ait ( 1 - e-At) . Hence (*) holds, by induction. The mean is

a I

At E(B(t)) = -G[ (s , t) = Ie , as s=1 by an easy calculation. Similarly var(B(t)) = A + E(B(t)) - E(B(t))2 where

Alternatively, note that B(t) has the negative binomial distribution with parameters e-At and I .

5. The forward equations are

p� (t) = An-1 Pn- l (t) - AnPn (t ) , n ?: 0,

where Ai = i A + v . The process i s honest, and therefore m(t) = En npn (t) satisfies 00 00

m' (t) = 2: n [(n - l)A + v 1 Pn- l (t) - 2: n (nA + V)Pn (t) n=1 n=O 00

= 2: {A[(n + l )n - n2] + v [(n + 1 ) - n1 } Pn (t) n=O 00

= 2: (An + V)Pn (t) = Am(t) + v . n=O

Solve subject to m(O) = 0 to obtain m(t) = v (eAt - l )/A. 6. Using the fact that the time to the nth arrival is the sum of exponential interarrival times (or using equation (6 .8. 1 5)), we have that

is given by

� () _ � IIn � Pn ( ) - A A . + () n i=O I which may be expressed, using partial fractions, as

where n A . a · - II -:l­I - A · - A . j=o :l I

j#.i


so long as Ai #- Aj whenever i #- j. The Laplace transform Pn may now be inverted as

See also Exercise (4.8.4) . 7. Let Tn be the time of the nth arrival, and let T = limn--*oo Tn = sup{t : N(t) < oo}. Now, as in Exercise (6.8.6), n A ' AnPn (e ) = II __ I - = E(e-BTn )

i=O Ai + e

since Tn = Xo + XI + . . . + Xn where Xk is the (k + l )th interarrival time, a random variable which is exponentially distributed with parameter Ak . Using the continuity theorem, E(e-BTn ) -+ E(e-BT ) as n -+ 00, whence AnPn (e ) -+ E(e-BT ) as n -+ 00, which may be inverted to obtain AnPn (t) -+ f(t) as n -+ 00 where f is the density function of T . Now

which converges or diverges according to whether or not En npn (t) converges. However Pn (t) � A;:;-1 f(t) as n -+ 00, so that En npn (t) < 00 if and only if En nA;:;- 1

< 00 .

When An = (n + � )2 , we have that

E(e-BT) = II 1 + e 1 2 = sech (rrv'e) .

00 { } -1

n=O (n + 2 )

Inverting the Laplace transform (or consulting a table of such transforms) we find that

where el is the first Jacobi theta function.

6.9 Solutions. Continuous-time Markov chains

1. (a) We have that
$$p_{11}'(t) = -\mu p_{11}(t) + \lambda p_{12}(t), \qquad p_{22}'(t) = -\lambda p_{22}(t) + \mu p_{21}(t),$$
where $p_{12} = 1 - p_{11}$, $p_{21} = 1 - p_{22}$. Solve these subject to $p_{ij}(0) = \delta_{ij}$, the Kronecker delta, to obtain that the matrix $\mathbf{P}_t = (p_{ij}(t))$ is given by
$$\mathbf{P}_t = \frac{1}{\lambda + \mu}\begin{pmatrix} \lambda + \mu e^{-(\lambda+\mu)t} & \mu - \mu e^{-(\lambda+\mu)t}\\ \lambda - \lambda e^{-(\lambda+\mu)t} & \mu + \lambda e^{-(\lambda+\mu)t}\end{pmatrix}.$$
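The closed form for $\mathbf{P}_t$ may be checked against the matrix exponential $e^{t\mathbf{G}}$. The Python sketch below assumes the generator written in the code, which is the one consistent with the forward equations above; the rates are illustrative.

import numpy as np
from scipy.linalg import expm

lam, mu, t = 1.3, 0.7, 0.9     # illustrative rates and time
G = np.array([[-mu, mu],
              [lam, -lam]])    # assumed generator (g_11 = -mu, g_22 = -lam)
s = lam + mu
closed = np.array([[lam + mu * np.exp(-s * t), mu - mu * np.exp(-s * t)],
                   [lam - lam * np.exp(-s * t), mu + lam * np.exp(-s * t)]]) / s
print(np.allclose(expm(t * G), closed))   # True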

(b) There are many ways of calculating $\mathbf{G}^n$; let us use generating functions. Note first that $\mathbf{G}^0 = \mathbf{I}$, the identity matrix. Write
$$\mathbf{G}^n = \begin{pmatrix} a_n & b_n\\ c_n & d_n\end{pmatrix}, \qquad n \ge 0,$$


and use the equation $\mathbf{G}^{n+1} = \mathbf{G}\cdot\mathbf{G}^n$ to find that
$$a_{n+1} = -\mu a_n + \mu c_n, \qquad c_{n+1} = \lambda a_n - \lambda c_n, \qquad n \ge 0.$$
Hence $a_{n+1} = -(\mu/\lambda)c_{n+1}$ for $n \ge 0$, and the first difference equation becomes $a_{n+1} = -(\lambda + \mu)a_n$, $n \ge 1$, which, subject to $a_1 = -\mu$, has solution $a_n = (-1)^n\mu(\lambda+\mu)^{n-1}$, $n \ge 1$. Therefore $c_n = (-1)^{n+1}\lambda(\lambda+\mu)^{n-1}$ for $n \ge 1$, and one may see similarly that $b_n = -a_n$, $d_n = -c_n$ for $n \ge 1$. Using the facts that $a_0 = d_0 = 1$ and $b_0 = c_0 = 0$, we deduce that $\sum_{n=0}^{\infty}(t^n/n!)\mathbf{G}^n = \mathbf{P}_t$ where $\mathbf{P}_t$ is given in part (a).
(c) With $\boldsymbol{\pi} = (\pi_1, \pi_2)$, we have that $-\mu\pi_1 + \lambda\pi_2 = 0$ and $\mu\pi_1 - \lambda\pi_2 = 0$, whence $\pi_1 = (\lambda/\mu)\pi_2$. In addition, $\pi_1 + \pi_2 = 1$ when $\pi_1 = \lambda/(\lambda+\mu) = 1 - \pi_2$.
2. (a) The required probability is
$$\frac{\mathbb{P}(X(t) = 2, X(3t) = 1 \mid X(0) = 1)}{\mathbb{P}(X(3t) = 1 \mid X(0) = 1)} = \frac{p_{12}(t)p_{21}(2t)}{p_{11}(3t)},$$
using the Markov property and the homogeneity of the process.
(b) Likewise, the required probability is
$$\frac{p_{12}(t)p_{21}(2t)p_{11}(t)}{p_{11}(3t)\,p_{11}(t)} = \frac{p_{12}(t)p_{21}(2t)}{p_{11}(3t)},$$
the same as in part (a).

3. The interarrival times and runtimes are independent and exponentially distributed. It is the lack­of-memory property which guarantees that X has the Markov property.

The state space is S = {O , 1 , 2 , . . . } and the generator is ( -A A tL -(A + tL)

G = 0 tL · . · . · . Solutions of the equation 1rG = 0 satisfy

0 A

- (A + tL)

0

" " ) 0 . . . A

-Ano + tLJrl = 0, AJrj- l - (A + tL)Jrj + tLJrj+1 = 0 for j � 1 ,

with solution Jri = Jro(AltL)i . We have in addition that I:i Jri = 1 in < tL and Jro = { 1 - (AltL)}- l . 4. One may use the strong Markov property. Alternatively, by the Markov property,

IP'(Yn+ l = j I Yn = i , Tn = t , B) = IP'(Yn+1 = j I Yn = i, Tn = t)

for any event B defined in terms of {X(s) : s � Tn } . Hence

IP'(Yn+l = j I Yn = i, B) = 1000 IP'(Yn+1 = j I Yn = i, Tn = t)!Tn (t) dt

= IP'(Yn+1 = j I Yn = i ) ,

so that Y i s a Markov chain. Now qij = IP'(Yn+l = j I Yn = i ) is given by


by conditioning on the (n + l )th interarrival time of N; here, as usual, Pij (t) is a transition probability of X. Now

5. The jump chain Z = {Zn : n � O} has transition probabilities hij = gij / gi , i i= j . The chance that Z ever reaches A from j is also TJj ' and TJj = Ek hjkTJk for j ¢. A, by Exercise (6.3.6). Hence -gj TJj = Ek gjkTJb as required. 6. Let Tl = inf{t : X (t) i= X (O) } , and more generally let Tm be the time of the mth change in value of X. For j ¢. A,

/-Lj = Ej (Tl ) + L hjk/-Lb kt.j

where Ej denotes expectation conditional on Xo = j . Now Ej (Tl ) = (; 1 , and the given equations follow. Suppose next that (ak : k E S) is another non-negative solution of these equations. With Ui = Ti+1 - Ii and R = min{n � I : Zn E A}, we have for j ¢. A that

where 1; is a sum of non-negative terms. It follows that

aj � Ej (Uo) + Ej (UI I{R> l } ) + . . . + Ej (Un I{R>n} )

= Ej (t UrI{R>r}) = Ej (min{Tn , HA l) -+ Ej (HA) r=O

as n -+ 00, by monotone convergence. Therefore, JL is the minimal non-negative solution. 7. First note that i is persistent if and only if it is also a persistent state in the jump chain Z. The integrand being positive, we can write

where {Tn : n � I } are the times of the jumps of X . The right side equals

00 I 00 L E(TI I X (O) = i )h i i (n) = -: L hii (n) n=O gl n=O

where H = (hij ) is the transition matrix of Z. The sum diverges if and only if i is persistent for Z. 8. Since the imbedded jump walk is persistent, so is X. The probability of visiting m during an excursion is a = (2m) - 1 , since such a visit requires an initial step to the right, followed by a visit to m before 0, cf. Example (3.9.6). Having arrived at m, the chance of returning to m before visiting 0 is I - a, by the same argument with 0 and m interchanged. In this way one sees that the number N of visits to m during an excursion from 0 has distribution given by lP'(N � k) = a( 1 - a)k- l , k � l . The 'total jump rate' from any state is A , whence T may be expressed as E�o Vi where the Vi are exponential with parameter A. Therefore,


The distribution of T is a mixture of an atom at 0 and the exponential distribution with parameter IXA. 9. The number N of sojourns in i has a geometric distribution lP'(N = k) = fk- l ( 1 - f) , k � 1 , for some f < 1 . The length of each of these sojourns has the exponential distribution with some parameter gi . By the independence of these lengths, the total time T in state i has moment generating function

E(eeT) = � fk- I ( 1 - f) (�)k

= gi ( 1 - f) . f=r. gi - () gi ( 1 - f) - ()

The distribution of T is exponential with parameter gi (1 - f) . 10. The jump chain i s the simple random walk with probabilities A/(A + JL) and JLI(A + JL) , and with PO I = 1 . By Corollary (5 .3 .6), the chance of ever hitting 0 having started at 1 is JLIA, whence the probability of returning to 0 having started there is f = JLIA . By the result of Exercise (6.9.9),

as required. Having started at 0, the walk visits the state r � 1 with probability 1 . The probability of returning to r having started there is

and each sojourn is exponentially distributed with parameter gr = A + JL. Now gr (I - fr) = A - JL, whence, as above,

e fT A - JL E(e rr ) = -----'­A - JL - ()

The probability of ever reaching 0 from X(O) is (JLIA)X (O) , and the time spent there subsequently is exponential with parameter A - JL. Therefore, the mean total time spent at 0 is

11. (a) The imbedded chain has transition probabilities

where gk = -gkk . Therefore, for any state j ,

where we have used the fact thatKG = O . Also nk � O and Ek nk = 1 , and thereforen is a stationary distribution of Y.

Clearly nk = 1{k for all k if and only if gk = Ei 1{i gi for all k, which is to say that gi = gk for all pairs i , k. This requires that the 'holding times ' have the same distribution. (b) Let Tn be the time of the nth change of value of X, with To = 0, and let Un = Tn+ I - Tn . Fix a state k, and let H = min{n � 1 : Zn = k} . Let Yi (k) be the mean time spent in state i between two consecutive visits to k, and let )li (k) be the mean number of visits to i by the jump chain in between two


visits to k (so that, in particular, Yk (k) = gk" l and n (k) = 1 ) . With Ej and IP'j denoting expectation and probability conditional on X (0) = j , we have that

Yi (k) = Ek (f Un I{Zn=i, H>nj) = f Ek (Un I I{Zn=i j )lP'k (Zn = i , H > n) n=O n=O

00 1 1 = L.:: -lP'k(Zn = i , H > n) = -Yi (k) .

n=O gi gi

The vector y(k) = (Yj (k) : i E S) satisfies y (k)H = y(k) , by Lemma (6.4.5), where H is the transition matrix of the jump chain Z. That is to say,

for j E S,

whence �i n (k)gij = o for all j E S. If ILk = �i n (k) < 00, the vector (n (k)/ILk ) is a stationary distribution for X, whence nj = Yi (k)/lLk for all i . Setting i = k we deduce that nk = 1 / (gkILk) .

Finally, if �j nigj < 00, then

� 1 �i ni gi " ILk = -;;::- = = ILk L.J nigi . nk nkgk i

12. Define the generator G by gii = -Vi , gij = Vj hij , so that the imbedded chain has transition matrix H. A root of the equation 1I'G = 0 satisfies

0 = L.:: nigij = -nj vj + L.:: (nj Vj )h ij i i : i#.j

whence the vector t = (nj vj : j E S) satisfies t = tH. Therefore t = av, which is to say that nj vj = aVj , for some constant a . Now Vj > o for all j , so that nj = a, which implies that �j nj #- 1 . Therefore the continuous-time chain X with generator G has no stationary distribution.

6.11 Solutions. Birth-death processes and imbedding

1. The jump chain is a walk {Zn } on the set S = {O, 1 , 2, . . . } satisfying, for i ::: 1 ,

IP'(Zn+1 = j I Zn = i ) = { Pi if j = i + 1 ,

1 - Pi if j = i - I ,

where Pi = Ai ! (').i + ILj ) · Also IP'(Zn+1 = 1 I Zn = 0) = 1 . 2. The transition matrix H = (hij ) of Z is given by { ilL 'f ' . 1 -- 1 J = 1 - ,

h . . _ A + iIL IJ - A --.- if j = i + 1 . A + I IL

To find the stationary distribution of Y, either solve the equation 11' = 1I'Q, or look for a solution of the detailed balance equations nj hi, i+ 1 = ni+l hi+l , i . Following the latter route, we have that

hOlh 12 ' " h i- l , i nj = no , i ::: 1 , hi, i- l " · h2lh lO


whence ni = nopi ( l + i / p)/ i ! for i ::: 1 . Choosing no accordingly, we obtain the result. It is a standard calculation that X has stationary distribution v given by Vi = pi e-P / i ! for i ::: O.

The difference between n and v arises from the fact that the holding-times of X have distributions which depend on the current state. 3. We have, by conditioning on X(h) , that

TJ (t + h) = E{IP'(X (t + h) = 0 I X (h») } = /-Lh . 1 + (1 - Ah - /-Lh)TJ (t) + Ah� (t) + o(h)

where � (t) = IP'(X (t) = 0 I X (0) = 2) . The process X may be thought of as a collection of particles each of which dies at rate /-L and divides at rate A, different particles enjoying a certain independence; this is a consequence of the linearity of An and /-Ln . Hence � (t) = TJ (t)2 , since each of the initial pair is required to have no descendants at time t. Therefore

subject to TJ (0) = O. Rewrite the equation as

TJ' ----'---- = 1 (1 - TJ) (/-L - ATJ) and solve using partial fractions to obtain

if A = /-L,

Finally, if 0 < t < u ,

IP'(X(t) = 0) TJ (t) IP'(X(t) = 0 I X(u) = 0) = IP'(X(u) = 0 I X(t) = 0) = - . IP'(X(u) = 0) TJ (u)

4. The random variable X (t) has generating function

/-L(l - s) - (/-L - As)e-t (J..-/L) G (s , t) = -A-(l---S)---(/-L---A-s)-e--....,t (J..-:----/L"""'""")

as usual. The generating function of X(t) , conditional on {X(t) > O}, is therefore

� sn IP'(X(t) = n) =

G(s , t) - G(O, t) . � IP'(X(t) > 0) 1 - G(O, t)

Substitute for G and take the limit as t -+ 00 to obtain as limit

(/-L - A)S 00 n H(s) = = 2: s Pn /-L - AS n=l

where, with p = A//-L, we have that Pn = pn- l ( l - p) for n ::: 1 .


5. Extinction is certain if A < /-L, and in this case, by Theorem (6. 1 1 . 10) ,

E(T) = 1000 IP'(T > t) dt = 1000 { 1 - E(sX (t) ) l s=O } dt

roo (/-L - A)e(J..-/L)t 1 ( /-L ) = Jo /-L - Ae(J..-/L)t dt = i log /-L - A .

If A > /-L then IP'(T < (0) = /-L/A, so

E(T I T < (0) = roo { I _ �E(sX(t) ) 1 -o} dt = roo (A - /-L)�(/L_��:t dt = .!.. log (_A_) . Jo /-L s- Jo A - /-Le /L /-L A - /-L

In the case A = /-L, IP'(T < (0) = 1 and E(T) = 00. 6. By considering the imbedded random walk, we find that the probability of ever returning to 1 is max{A , /-L}/(A + /-L), so that the number of visits is geometric with parameter min {A , /-L}/ (A + /-L) . Each visit has an exponentially distributed duration with parameter A + /-L, and a short calculation using moment generating functions shows that VI (00) is exponential with parameter min {A , /-L} .

Next, by a change of variables, Theorem (6. 1 1 . 10) , and some calculation,

� srE(Vr (t)) = E (� l sr I{X (u)=r) dU) = E (l sX (u) dU)

= E(s U ) du = - - - log lot X ( ) /-Lt 1 { A ( 1 - s) - (/-L - As )e-(J..-/L)t } o A A A - /-L 1 { As (ept - 1 ) } = - - log 1 - t + terms not involving s , A /-LeP - A

where P = /-L - A. We take the limit as t -+ 00 and we pick out the coefficient of sr . 7. If A = /-L then, by Theorem (6. 1 1 . 10),

and

X(t) M( 1 - s) + s l - s E(s ) = = 1 - --,----.,------:-M(1 - s) + 1 At ( 1 - s) + 1

r E(sX (u» ) du = t - ! 10g{At ( 1 - s) + I } Jo A 1 I { I AtS } . I · = -- og - -- + terms not mvo vmg s . A 1 + At

Letting t -+ 00 and picking out the coefficient of sr gives E(Vr (oo)) = (rA)- I . An alternative method utilizes the imbedded simple random walk and the exponentiality of the sojourn times.

6.12 Solutions. Special processes

1. The jump chain is simple random walk with step probabilities A / (A + /-L) and /-L / (A + /-L). The expected time /-L1O to pass from 1 to 0 satisfies


whence J-L1O = (J-L + A)/(J-L - A). Since each sojourn is exponentially distributed with parameter J-L + A, the result follows by an easy calculation. See also Theorem (1 1 .3 . 1 7) . 2. We apply the method of Theorem (6. 1 2. 1 1 ) with

the probability generating function of the population size at time u in a simple birth process. In the absence of disasters, the probability generating function of the ensuing population size at time v is

The individuals alive at time t arose subsequent to the most recent disaster at time t - D, where D has density function 8e-8x , x > 0. Therefore,

3. The mean number of descendants after time t of a single progenitor at time 0 is e^{(λ−μ)t}. The expected number due to the arrival of a single individual at a uniformly distributed time in the interval [0, x] is therefore

(1/x) ∫_0^x e^{(λ−μ)u} du = (e^{(λ−μ)x} − 1) / ((λ − μ)x).

The aggregate effect at time x of N earlier arrivals is the same, by Theorem (6.12.7), as that of N arrivals at independent times which are uniformly distributed on [0, x]. Since E(N) = νx, the mean population size at time x is ν[e^{(λ−μ)x} − 1]/(λ − μ). The most recent disaster occurred at time t − D, where D has density function δe^{−δx}, x > 0, and it follows that

This is bounded as t --+ 00 if and only if 8 > A - J-L. 4. Let N be the number of clients who arrive during the interval [0 , t ] . Conditional on the event {N = n} , the arrival times have, by Theorem (6. 1 2.7), the same joint distribution as n independent variables chosen uniformly from [0, t ] . The probability that an arrival at a uniform time in [0, t] is still in service at time t is f3 = fri [ 1 - G(t - x ) ]t-1 dx , whence, conditional on {N = n} , the total number M still in service is bin(n , (3) . Therefore,

whence M has the Poisson distribution with parameter Af3t = A fri [ 1 - G(x)] dx . Note that this parameter approaches AE(S) as t --+ 00 .
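A small simulation sketch (illustrative only, not part of the solution): with λ = 2, t = 5 and, purely for definiteness, G taken to be the exponential distribution with parameter 1, the number still in service at time t should have mean and variance close to λ∫_0^t [1 − G(x)] dx = λ(1 − e^{−t}).

import math, random

lam, t, trials = 2.0, 5.0, 100000          # illustrative parameters; G is Exp(1) here
still = []
for _ in range(trials):
    n, s, arrivals = 0, random.expovariate(lam), []
    while s <= t:                           # Poisson arrival times on [0, t]
        arrivals.append(s)
        s += random.expovariate(lam)
    for a in arrivals:
        if a + random.expovariate(1.0) > t:  # still in service at time t
            n += 1
    still.append(n)
mean = sum(still) / trials
var = sum((x - mean) ** 2 for x in still) / trials
print(mean, var, lam * (1 - math.exp(-t)))   # Poisson: mean and variance both near the last value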


6.13 Solutions. Spatial Poisson processes

1. It is easy to check from the axioms that the combined process N(t) = B(t) + G(t) is a Poisson process with intensity β + γ.
(a) The time S (respectively, T) until the arrival of the first brown (respectively, grizzly) bear is exponentially distributed with parameter β (respectively, γ), and these times are independent. Now,

P(S < T) = ∫_0^∞ βe^{−βs} e^{−γs} ds = β/(β + γ).

(b) Using (a), and the lack-of-memory of the process, the required probability is

(γ/(β + γ))^n · β/(β + γ).

(c) Using Theorem (6.12.7), conditional on {B(1) = 1} the arrival time of the brown bear is uniform on (0, 1), whence

E(min{S, T} | B(1) = 1) = ∫_0^1 (1 − t)e^{−γt} dt = (γ − 1 + e^{−γ})/γ^2.
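For instance (a toy check, not part of the solution), part (a) is easily confirmed by simulating the two independent exponential times with the illustrative intensities β = 1 and γ = 2:

import random

beta, gamma, trials = 1.0, 2.0, 200000       # illustrative intensities
hits = sum(random.expovariate(beta) < random.expovariate(gamma) for _ in range(trials))
print(hits / trials, beta / (beta + gamma))   # both close to 1/3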

2. Let B_r be the ball with centre 0 and radius r, and let N_r = |Π ∩ B_r|. We have by Theorem (6.13.11) that S_r = Σ_{x∈Π∩B_r} g(x) satisfies

E(S_r | N_r) = N_r ∫_{B_r} g(u) λ(u)/Λ(B_r) du,

where Λ(B) = ∫_{y∈B} λ(y) dy. Therefore, E(S_r) = ∫_{B_r} g(u)λ(u) du, implying by monotone convergence that E(S) = ∫_{R^d} g(u)λ(u) du. Similarly,

E(S_r^2 | N_r) = E( [ Σ_{x∈Π∩B_r} g(x) ]^2 | N_r )
= E( Σ_{x∈Π∩B_r} g(x)^2 | N_r ) + E( Σ_{x≠y, x,y∈Π∩B_r} g(x)g(y) | N_r )
= N_r ∫_{B_r} g(u)^2 λ(u)/Λ(B_r) du + N_r(N_r − 1) ∫∫_{u,v∈B_r} g(u)g(v) λ(u)λ(v)/Λ(B_r)^2 du dv,

whence

E(S_r^2) = ∫_{B_r} g(u)^2 λ(u) du + ( ∫_{B_r} g(u)λ(u) du )^2.

By monotone convergence,

E(S^2) = ∫_{R^d} g(u)^2 λ(u) du + ( ∫_{R^d} g(u)λ(u) du )^2,

and the formula for the variance follows.
3. If B_1, B_2, . . . , B_n are disjoint regions of the disc, then the numbers of projected points therein are Poisson-distributed and independent, since they originate from disjoint regions of the sphere. By


elementary coordinate geometry, the intensity function in plane polar coordinates is 2λ/√(1 − r^2), 0 ≤ r ≤ 1, 0 ≤ θ < 2π.

4. The same argument is valid with resulting intensity function 2A� .

5. The Mercator projection represents the spherical coordinates (0 , ¢) as Cartesian coordinates in the range 0 ::::: ¢ < 2n , 0 ::::: 0 ::::: n . (Recall that 0 is the angle made with the axis through the north pole.) Therefore a uniform intensity on the globe corresponds to an intensity .function 'A sin 0 on the map. Likewise, a uniform intensity on the map corresponds to an intensity 'AI sin 0 on the globe. 6. Let the Xr have characteristic function ¢. Conditional on the value of N(t) , the corresponding arrival times have the same distribution as N(t) independent variables with the uniform distribution, whence

lE(ei8S(t» ) = lE{lE(ei8S(t) I N(t)) } = lE{lE(ei8xe-au )N(t) }

= exp {At (lE(ei8Xe-aU ) - I) } = exp{ 'A fot {¢ (Oe-au) - 1 } dU } ,

where U is uniformly distributed on [0, t ] . By differentiation,

'A lE(S(t)) = -i¢�(t/O) = -lE(X) (1 - e-at ) , a

'AlE(X2) lE(S(t)2) = -¢�(t) (O) = lE(S(t))2 + -- ( 1 - e-2at) . 2a

Now, for s < t , Set) = S(s)e-a(t-s) + Set - s) where Set - s) is independent of S(s) with the same distribution as Set - s ) . Hence, for s < t ,

'AlE(X2) 'AlE(X2) cov (S(s) , S et) ) = �(t-S) = � (1 - e-2as )e-a(t-s) --+ �e-av

as s --+ 00 with v = t - s fixed. Therefore, p (S(s ) , S(s + v)) --+ e-av as s --+ 00. 7. The first two arrival times Tl , T2 satisfy

Differentiate with respect to x and y to obtain the joint density function 'A(x)'A(x + y)e-A(x+y) , x , y � O. Since this does not generally factorize as the product of a function of x and a function of y , Tl and T2 are dependent in general. 8. Let Xi be the time of the first arrival in the process Ni . Then


6.14 Solutions. Markov chain Monte Carlo

1. If P is reversible then

RHS = Σ_i ( Σ_j p_ij x_j ) y_i π_i = Σ_{i,j} π_i p_ij x_j y_i = Σ_{i,j} π_j p_ji y_i x_j = Σ_j π_j x_j ( Σ_i p_ji y_i ) = LHS.

Suppose conversely that (x, Py) = (Px, y) for all x, y E 12 (:7T) . Choose x, y to be unit vectors with 1 in the i th and jth place respectively, to obtain the detailed balance equations :7Ti Pij = :7Tj Pji . 2. Just check that 0 :::: bij :::: 1 and that the Pij = gij bij satisfy the detailed balance equations (6. 14.3). 3. It is immediate that Pjk = I Ajk l , the Lebesgue measure of Ajk . This is a method for simulating a Markov chain with a given transition matrix. 4. (a) Note first from equation (4. 1 2.7) that d(U) = � sUPi;6j dTV (Ui. , Uj . ) , where Ui . is the mass function Uit , t E T . The required inequality may be hacked out, but instead we will use the maximal coupling of Exercises (4. 1 2.4, 5); see also Problem (7 . 1 1 . 1 6) . Thus requires a little notation. For i, j E S, i =1= j, we find a pair (Xi , Xj ) of random variables taking values in T according to the marginal mass functions Ui . , Uj . , and such that IP'(Xi =1= Xj ) = !dTV (Ui . , Uj . ) . The existence of such a pair was proved in Exercise (4. 1 2.5). Note that the value of Xi depends on j, but this fact has been suppressed from the notation for ease of reading. Having found (Xi , Xj ) , we find a pair (Y(Xi ) ' Y(Xj )) taking values in U according to the marginal mass functions VXi " vXj" and such that IP'(Y(Xi ) =1= y(Xj ) I Xi , Xj ) = !dTV (VXi " vXF) ' Now, taking a further liberty with the notation,

whence

IP' (Y(Xi ) =1= Y(Xj )'�= L IP'(Xi = r, Xj = s)IP' (Y (r) =1= Y es) ) , SES #s

= L IP'(Xi = r, Xj = s) !dTV (Vr . , Vs · ) r,s ES ris

:::: g SUp dTV (Vr . , Vs . ) }IP'(Xi =1= Xj ) , ris

d(UV) = suP IP' (Y(Xi ) =1= Y (Xj ) ) :::: g sup dTV (Vr . , vs . ) } { suP IP'(Xi =1= Xj ) } i;6j r#S i, j

and the claim follows. (b) Write S = { I , 2, . . . , m } , and take

U = (IP'(Xo = 1 ) IP'(Xo = 2) IP'(Yo = 1 ) IP'(Yo = 2)

IP'(Xo = m) ) IP'(Yo = m) .

The claim follows by repeated application of the result of part (a) . It may be shown with the aid of a little matrix theory that the second largest eigenvalue of a finite

stochastic matrix P is no larger in modulus than d(P); cf. the equation prior to Theorem (6.14.9).
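The inequality of part (a) is easy to test numerically. The following sketch (illustrative, with randomly generated stochastic matrices) computes d(P) = (1/2) max_{i≠j} Σ_k |p_ik − p_jk| directly and checks that d(UV) ≤ d(U)d(V).

import random

def rand_stochastic(n):
    rows = []
    for _ in range(n):
        w = [random.random() for _ in range(n)]
        s = sum(w)
        rows.append([x / s for x in w])
    return rows

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def d(p):
    # d(P) = (1/2) max over i != j of the total variation distance between rows i and j
    n = len(p)
    return 0.5 * max(sum(abs(p[i][k] - p[j][k]) for k in range(n))
                     for i in range(n) for j in range(n) if i != j)

U, V = rand_stochastic(4), rand_stochastic(4)
print(d(matmul(U, V)), d(U) * d(V))   # the first value never exceeds the second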


6.15 Solutions to problems

1. (a) The state 4 is absorbing. The state 3 communicates with 4, and is therefore transient. The set {1, 2} is finite, closed, and aperiodic, and hence ergodic. We have that f_34(n) = (1/2)^{n−1} · 1/4, so that f_34 = Σ_n f_34(n) = 1/2.
(b) The chain is irreducible with period 2. All states are non-null persistent. Solve the equation π = πP to find the stationary distribution

π = (3/8, 3/16, 5/16, 1/8), whence the mean recurrence times are 8/3, 16/3, 16/5, 8, in order.
2. (a) Let P be the transition matrix, assumed to be doubly stochastic. Then

Σ_i p_ij(n) = Σ_i Σ_k p_ik(n − 1) p_kj = Σ_k ( Σ_i p_ik(n − 1) ) p_kj = Σ_k p_kj = 1,

whence, by induction, the n-step transition matrix pn is doubly stochastic for all n � 1 . If j i s not non-null persistent, then Pij (n) � 0 as n � 00, for all i , implying that Ei Pij (n) � 0,

a contradiction. Therefore all states are non-null persistent. If in addition the chain is irreducible and aperiodic then Pij (n) � 1fj ' where 1r is the unique

stationary distribution. However, it is easy to check that 1r = (N-1 , N- 1 , . . . , N-1 ) is a stationary distribution if P is doubly stochastic. (b) Suppose the chain is persistent. In this case there exists a positive root of the equation x = xP , this root being unique up to a multiplicative constant (see Theorem (6.4.6) and the forthcoming Problem (7)). Since the transition matrix is doubly stochastic, we may take x = 1, �vector of 1 's o By the above uniqueness of x, there can exist no stationary distribution, and there£ the chain is null. We deduce that the chain cannot be non-null persistent. 3. By the Chapman-Kolmogorov equations,

p_ii(m + r + n) ≥ p_ij(m) p_jj(r) p_ji(n),    m, r, n ≥ 0.

Choose two states i and j , and pick m and n such that a = Pij (m)pji (n) > O. Then

Pii (m + r + n) � apjj (r) .

Set r = 0 to find that Pii (m + n) > 0, and so d(i ) I (m + n) . If d(i ) f r then Pii (m + r + n) = 0, so that Pjj (r) = 0; therefore d(i ) I d(j) . Similarly d(j) I d(i ) , giving that d(i ) = d(j) .

4. (a) See the solution to Exercise (6.3 .9a). (b) Let i , j , r, s E S, and choose N(i , r) and N(j, s) according to part (a). Then

lP'(Zn = (r, s ) I Zo = (i, j ) ) = Pir (n)Pjs (n) > 0

if n � max{N (i , r ) , N(j, s ) } , so that the chain is irreducible and aperiodic. (c) Suppose S = { I , 2} and

P = ( 0  1 ; 1  0 ).

In this case {(1, 1), (2, 2)} and {(1, 2), (2, 1)} are closed sets of states for the bivariate chain.
5. Clearly P(N = 0) = 1 − f_ij, while, by conditioning on the time of the nth visit to j, we have that P(N ≥ n + 1 | N ≥ n) = f_jj for n ≥ 1, whence the answer is immediate. Now P(N = ∞) = 1 − Σ_{n=0}^∞ P(N = n), which equals 1 if and only if f_ij = f_jj = 1.
6. Fix i ≠ j and let m = min{n : p_ij(n) > 0}. If X_0 = i and X_m = j then there can be no intermediate visit to i (with probability one), since such a visit would contradict the minimality of m.


Suppose X_0 = i, and note that (1 − f_ji) p_ij(m) ≤ 1 − f_ii, since if the chain visits j at time m and subsequently does not return to i, then no return to i takes place at all. However f_ii = 1 if i is persistent, so that f_ji = 1.
7. (a) We may take S = {0, 1, 2, . . . }. Note that q_ij(n) ≥ 0, and

00

L qu (n) = 1 , qu (n + 1 ) = L qil ( l )qu (n) , j [=0

whence Q = (qu (1) ) is the transition matrix of a Markov chain, and QTI = (qu (n) ) . This chain is persistent since

for all i , n n

and irreducible since i communicates with j in the new chain whenever j communicates with i in the original chain.

That

i =/= j, n :::: l ,

is evident when n = 1 since both sides are qij ( 1 ) . Suppose it is true for n = m where m :::: 1 . Now

\ Ij i (m + 1 ) = L Ijk (m)Pkj , i =/= j, k:kij

so that X ' ""' � lj i (m + l) = L.J gkj (m)qid1 ) , i =/= j, Xi k:kfj which equals gij (m + 1 ) as required. (b) Sum (*) over n to obtain that

i =/= j ,

where Pi (j) i s the mean number of visits to i between two visits to j ; we have used the fact that L:n gij (n) = 1 , since the chain is persistent (see Problem (6. 1 5 .6)). It follows that Xi = XOPi (0) for all i , and therefore x is unique up to a multiplicative constant. (c) The claim is trivial when i = j, and we assume therefore that i =/= j . Let Ni (j) be the number of visits to i before reaching j for the first time, and write lP'k and Ek for probability and expectation conditional on Xo = k. Clearly, lP'j (Ni (j) :::: r) = hj i ( 1 - h ij y-l for r :::: 1 , whence

The claim follows by (**) . 8. (a) If such a Markov chain exists , then

u_n = Σ_{i=1}^{n} f_i u_{n−i},    n ≥ 1,


where ii is the probability that the first return of X to its persistent starting point s takes place at time i . Certainly Uo = 1 .

Conversely, suppose u is a renewal sequence with respect to the collection (fm : m :::: 1 ) . Let X be a Markov chain on the state space S = {O, 1 , 2, . . . } with transition matrix

p_ij = P(T ≥ i + 2 | T ≥ i + 1) if j = i + 1,    p_ij = 1 − P(T ≥ i + 2 | T ≥ i + 1) if j = 0,

where T is a random variable having mass function im = IP'(T = m) . With Xo = 0, the chance that the first return to 0 takes place at time n is

P( X_n = 0, X_i ≠ 0 for 1 ≤ i < n | X_0 = 0 ) = p_{01} p_{12} ⋯ p_{n−2,n−1} p_{n−1,0}
= ( 1 − G(n + 1)/G(n) ) ∏_{i=1}^{n−1} G(i + 1)/G(i) = G(n) − G(n + 1) = f_n,

where G(m) = IP'(T :::: m) = L:�m in . Now Vn = IP'(Xn = 0 I Xo = 0) satisfies

Vo = 1 , n

Vn = L ii Vn-i i=l

for n :::: 1 ,

whence Vn = U n for all n . (b) Let X and Y be the two Markov chains which are associated (respectively) with u and v in the above sense. We shall assume that X and Y are independent. The product (un vn : n :::: 1 ) is now the renewal sequence associated with the bivariate Markov chain (Xn , Yn ) . 9 . Of the first 2n steps, let there be i rightwards, j upwards, and k inwards. Now X2n = 0 if and only if there are also i leftwards, j downwards, and k outwards. The number of such possible combinations is (2n) ! / { (i ! j ! k ! )2 } , and each such combination has probability (! )2(i+ j+k) = (! )2n . The first equality follows, and the second is immediate.

Now

where

( 1 ) 2n (2n) n !

1P'(X2n = 0) � 2: M L 3n ' 1 ' 1 k 1 n i+j+k=n I . J . .

M = max { 3n .7 !' 1 k 1 : i , j , k :::: O, i + j + k = n } .

l . J . . It is not difficult to see that the maximum M is attained when i , j , and k are all closest to j n, so that

Furthermore the summation in (*) equals 1 , since the summand is the probability that, in allocating n balls randomly to three urns, the urns contain respectively i , j , and k balls. It follows tllat

IP'(X = 0) < (2n) ! 2n - 1 2nn ! nj-nJ ! )3


3 which, by an application of Stirling's formula, is no bigger than Cn-2 for some constant C. Hence :En 1P'(X2n = 0) < 00, so that the origin is transient.

10. No. The line of ancestors of any cell-state is a random walk in three dimensions. The difference between two such lines of ancestors is also a type of random walk, which in three dimensions is transient.

11. There are one or two absorbing states according as whether one or both of a and {3 equal zero. If a{3 =f. 0, the chain is irreducible and persistent. It is periodic if and only if a = {3 = 1 , in which case it has period 2.

If 0 < a{3 < 1 then

π = ( β/(α + β), α/(α + β) ) is the stationary distribution. There are various ways of calculating P^n; see Exercise (6.3.3) for example. In this case the answer is given by

proof by induction. Hence

as n -+ 00.

The chain is reversible in equilibrium if and only if 7rl Pl2 = 7r2P2l , which is to say that a{3 = {3a !

12. The transition matrix is given by

p_{i,i+1} = ((N − i)/N)^2,    p_{i,i−1} = (i/N)^2,    p_{i,i} = 1 − ((N − i)/N)^2 − (i/N)^2,

for 0 ≤ i ≤ N. This process is a birth-death process in discrete time, and by Exercise (6.5.1) is reversible in equilibrium. Its stationary distribution satisfies the detailed balance equation π_i p_{i,i+1} = π_{i+1} p_{i+1,i} for 0 ≤ i < N, whence π_i = π_0 (N choose i)^2 for 0 ≤ i ≤ N, where

π_0^{−1} = Σ_{i=0}^{N} (N choose i)^2 = (2N choose N).
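A quick numerical verification (illustrative, with N = 5) of the detailed balance equations and of the normalization Σ_i π_i = 1, using the Vandermonde identity Σ_i (N choose i)^2 = (2N choose N):

from math import comb

N = 5
pi = [comb(N, i) ** 2 / comb(2 * N, N) for i in range(N + 1)]
print(abs(sum(pi) - 1.0) < 1e-12)
for i in range(N):
    up = ((N - i) / N) ** 2      # p_{i, i+1}
    down = ((i + 1) / N) ** 2    # p_{i+1, i}
    print(abs(pi[i] * up - pi[i + 1] * down) < 1e-12)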

13. (a) The chain X is irreducible; all states are therefore of the same type. The state 0 is aperiodic, and so therefore is every other state. Suppose that Xo = 0, and let T be the time of the first return to O. Then IP'(T > n) = aOal . . . an- l = bn for n :::: 1 , so that 0 is persistent if and only if bn -+ 0 as n -+ 00. (b) The mean of T is

E(T) = Σ_{n≥0} P(T > n) = Σ_{n≥0} b_n.


The stationary distribution π satisfies

π_0 = Σ_{k=0}^{∞} π_k (1 − a_k),    π_n = π_{n−1} a_{n−1} for n ≥ 1.

Hence π_n = π_0 b_n and π_0^{−1} = Σ_{n=0}^{∞} b_n if this sum is finite.
(c) Suppose a_i has the stated form for i ≥ I. Then

b_n = b_I ∏_{i=I}^{n−1} (1 − A i^{−β}),    n ≥ I.


Hence bn --+ 0 if and only if L:i Ai -f3 = 00, which is to say that f3 � 1 . The chain is therefore persistent if and only if f3 � 1 . (d) We have that 1 - x � e-x for x ::: 0 , and therefore

00 00 { n- l } 00 L bn � bI L exp -A L i -f3 � bI L exp {-An . n-f3 } < 00 n=I n=I i=I n=I

(e) If f3 = 1 and A > 1 , there is a constant CI such that

if f3 < 1 .

00 00 { n- l I } 00 00 L bn � bI L exp -A L -;- � CI L exp {-A log n} = CI L n-A < 00, n=I n=I i=I I n=I n=I

giving that the chain is non-null. (!) If f3 = 1 and A � 1 ,

n- l ( A ) n- l ( . 1 ) ( I 1 ) bn = bI II 1 - --;- ::: bI II � = bI = . i=I I i=I I n 1

Therefore L:n bn = 00, and the chain is null.

14. Using the Chapman-Kolmogorov equations,

I Pij (t + h) - Pij (t ) 1 = /L (Pik (h) - 8ik ) Pkj (t) / � ( 1 - Pii (h)) pij (t) + L Pik (h) k koli

� ( 1 - Pii (h )) + ( 1 - Pii (h)) --+ 0

as h .J, 0, if the semigroup is standard. Now log x is continuous for 0 < x � 1 , and therefore g is continuous. Certainly g(O) = O. In

addition Pii (S + t) ::: Pii (S)Pii (t) for s , t ::: 0, whence g (s + t) � g(s) + g(t) , s , t ::: O. For the last part

1 get) Pii (t) - 1 - (Pii (t) - 1 ) = - . --+ -A t t - log{ 1 - ( 1 - Pii (t) ) )

as t .J, 0, since x /log(1 - x) --+ -1 as x .J, O .


15. Let i and j be distinct states, and suppose that Pij (t) > 0 for some t . Now

00 � 1 n n Pij (t) = L...J , t (G )ij ,

O n . n=

implying that (Gn)ij > 0 for some n, which is to say that

for some sequence kl ' k2 , . . . , kn of states. Suppose conversely that (*) holds for some sequence of states, and choose the minimal such value

of n . Then i , kl ' k2 , " " kn , j are distinct states, since otherwise n is not minimal. It follows that (Gn)ij > 0, while (Gm )ij = 0 for 0 � m < n . Therefore

00 n � 1 k k Pij (t) = t L...J k ' t (G )ij k=n .

is strictly positive for all sufficiently small positive values of t. Therefore i communicates with j . 16. (a) Suppose X is reversible, and let i and j be distinct states. Now

IP' (X (O) = i , X(t) = j) = IP' (X (t) = i , X (O) = j) ,

which is to say that 11:iPij (t) = 11:j Pji (t) . Divide by t and take the limit as t .J, 0 to obtain that 11:igij = 11:j gj i '

Suppose now that the chain is uniform, and X (0) has distribution 1r . If t > 0, then

so that X(t) has distribution 1r also. Now let t < 0, and suppose that X(t) has distribution IL. The distribution of X (s) for s � 0 is ILP s -t = 1r , a polynomial identity in the variable s - t, valid for all s � O. Such an identity must be valid for all s , and particularly for s = t , implying that IL = 1r .

Suppose in addition that 11: i g ij = 11:j gj i for all i , j . For any sequence kl , k2 ' . . . , kn of states,

11:igi,kl gkl , k2 " ' gkn ,j = gkl , i11:kl gkl ,k2 " ' gkn ,j = . . . = gkl , i gk2 ,kl . . · gj ,kn 11:j · Sum this expression over all sequences kl , k2 ' . . . , kn of length n, to obtain

11:i (Gn+1 )ij = 11:j (Gn+1 )j i '

It follows, by the fact that Pt = etG, that

for all i, j, t . For tl < t2 < . . , < tn ,

IP'(X (tl ) = i i , X (t2) = i2 , " " X (tn ) = in )

n � O.

= 11:il Pil . i2 (t2 - tl ) . . . Pin- l , in (tn - tn-d = Pi2 , il (t2 - tl )11:i2 Pi2 , i3 (t3 - t2) . . . Pin- l , in (tn - tn-d = . . . = Pi2 , i 1 (t2 - tl ) " . Pin , in- l (tn - tn-d11:in = IP' (Y(td = i i , Y (t2) = i2 , " " Y (tn ) = in ) ,


giving that the chain is reversible.
(b) Let S = {1, 2} and

G = ( −α  α ; β  −β ),

where αβ > 0. The chain is uniform with stationary distribution

π = ( β/(α + β), α/(α + β) ),

and therefore π_1 g_12 = π_2 g_21.


(c) Let X be a birth-death process with birth rates Ai and death rates P,i . The stationary distribution 7r satisfies

Therefore irk+ 1 P,k+ 1 = irkAk for k ;::: 0, the conditions for reversibility.

17. Consider the continuous-time chain with generator

G = (-f3 f3 ) . y -y

It i s a standard calculation (Exercise (6.9. 1 )) that the associated semigroup satisfies

(f3 + )P = ( Y + f3h (t) f3 ( 1 - h (t)) )

y t y (1 - h (t)) f3 + yh (t)

where h (t) = e-t ({J+y) . Now Pi = P if and only if y + f3h (1 ) = f3 + yh( 1 ) = a(f3 + y), which is to say that f3 = y = - � 10g(2a - 1 ) , a solution which requires that a > � . 18. The forward equations for Pn (t) = lP'(X (t) = n ) are

p_0'(t) = μ p_1 − λ p_0,    p_n'(t) = λ p_{n−1} − (λ + nμ) p_n + μ(n + 1) p_{n+1},    n ≥ 1.

In the usual way,

∂G/∂t = (s − 1)( λG − μ ∂G/∂s ),

with boundary condition G(s, 0) = s^I. The characteristics are given by

dt/1 = ds/{μ(s − 1)} = dG/{λ(s − 1)G},

and therefore G = e^{ρ(s−1)} f((s − 1)e^{−μt}), where ρ = λ/μ and f is a function determined by the boundary condition to satisfy e^{ρ(s−1)} f(s − 1) = s^I. The claim follows.

As t → ∞, G(s, t) → e^{ρ(s−1)}, the generating function of the Poisson distribution, parameter ρ.
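As an illustrative check (not part of the solution), the immigration-death process can be simulated directly; with the arbitrary choice λ = 3, μ = 1 and X(0) = 0, the mean and variance of X(t) at a large time t should both be close to ρ = λ/μ.

import random

lam, mu, t_end, trials = 3.0, 1.0, 10.0, 50000   # illustrative parameters, X(0) = 0
samples = []
for _ in range(trials):
    x, t = 0, 0.0
    while True:
        rate = lam + mu * x                # immigration at rate lam, each individual dies at rate mu
        dt = random.expovariate(rate)
        if t + dt > t_end:
            break
        t += dt
        x += 1 if random.random() < lam / rate else -1
    samples.append(x)
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials
print(mean, var, lam / mu)                 # Poisson(rho): mean and variance both near rho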

19. (a) The forward equations are

a at Pii (s , t) = -A(t)Pii (S , t ) , a at Pij (s , t) = -A(t) Pij (s , t) + A(t) Pi, j - 1 (t) , i < j .


Assume N(s) = i and s < t. In the usual way,

satisfies

00 G(s , t ; x) = L xjJP' (N (t) = j I N(s) = i )

j=i

aG at = A (t) (X - l )G .

We integrate subject to the boundary condition to obtain

G(s , t ; x) = xi exp { (x - 1 ) it A(U) du } ,

whence Pij (t) is found to be the probability that A = j - i where A has the Poisson distribution with parameter f: A (U) duo

The backward equations are

a as Pij (s , t) = A(S)Pi+ l ,j (S , t) - A(S)Pij (s , t ) ;

using the fact that Pi+l , j ( t ) = Pi,j - l (t) , we are led to

aG -- = A(S) (X - l )G. as

The solution is the same as above.
(b) We have that

P(T > t) = p_{00}(t) = exp{ −∫_0^t λ(u) du },

so that

f_T(t) = λ(t) exp{ −∫_0^t λ(u) du },    t ≥ 0.

In the case λ(t) = c/(1 + t),

E(T) = ∫_0^∞ P(T > t) dt = ∫_0^∞ du/(1 + u)^c,

which is finite if and only if c > 1.
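A short numerical illustration (not part of the solution): with the arbitrary choice c = 3, the time T of the first arrival may be sampled by inverting P(T > t) = (1 + t)^{−c}, and its empirical mean compared with ∫_0^∞ (1 + u)^{−c} du = 1/(c − 1).

import random

c, trials = 3.0, 500000               # illustrative value of c > 1
total = 0.0
for _ in range(trials):
    u = random.random()
    total += u ** (-1.0 / c) - 1.0     # inversion of P(T > t) = (1 + t)^(-c)
print(total / trials, 1.0 / (c - 1.0))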

20. Let S > O . Each offer has probability 1 - F (s) of exceeding s , and therefore the first offer exceeding S is the Mth offer overall, where JP'(M = m) = F(s)m- l [ 1 - F(s) ] , m � 1 . Conditional on {M = m} , the value of XM is independent of the values of Xl , X2 , " " XM- l , with

1 - F(u) JP'(XM > U I M = m) = ,

1 - F(s) 0 < s ::: u ,

and Xl , X 2 , . . . , X M - 1 have shared (conditional) distribution function

G(u | s) = F(u)/F(s),    0 ≤ u ≤ s.


For any event B defined in terms of Xl , X2 , " " XM-I , we have that

00 lP'(XM > u , B) = .2: lP'(XM > u , B I M = m)lP'(M = m)

m=l 00

= .2: lP'(XM > U I M = m)lP'(B I M = m)lP'(M = m) m=l

00 = lP'(XM > u) .2: lP'(B I M = m)lP'(M = m)

m=l = lP'(XM > u)lP'(B) , 0 < s ::S u ,


where we have used the fact that lP'(XM > u I M = m) is independent of m. It follows that the first record value exceeding s is independent of all record values not exceeding s . By a similar argument (or an iteration of the above) all record values exceeding s are independent of all record values not exceeding s .

The chance of a record value in (s , s + h] is

F(s + h) - F(s ) f(s)h lP'(s < XM ::s s + h) = 1 _ F(s )

= 1 _ F(s )

+ o(h) .

A very similar argument works for the runners-up. Let XMl ' XM2 ' . . . be the values, in order, of offers exceeding s . It may be seen that this sequence is independent of the sequence of offers not exceeding s , whence it follows that the sequence of runners-up is a non-homogeneous Poisson process. There is a runner-up in (s , s + h] if (neglecting terms of order o(h)) the first offer exceeding s is larger than s + h , and the second is in (s , s + h] . The probability of this is

( 1 - F(s + h) ) (F(S + h) - F(S) ) + o(h) = f(s )h + o(h) . 1 - F(s ) 1 - F(s) 1 - F(s)

21. Let Ft (x) = lP'(N* (t) ::s x) , and let A be the event that N has a arrival during (t , t + h) . Then

where

Hence

Ft+h (X) = AhlP' (N* (t + h) ::s x I A) + ( 1 - Ah)Ft (x) + o(h)

lP'(N* (t + h) ::s x I A) = L: Ft (x - y)f(y) dy .

� Ft (x) = -AFt (x) + A 100

Ft (x - y)f(y) dy . at -00 Take Fourier transforms to find that ¢t (8) = lE(eiON* (t) ) satisfies

a¢t at = -A¢t + A¢t¢ ,

an equation which may be solved subject to 4>0(8) = 1 to obtain ¢t (8) = eAt (t/J (O)- i ) . Alternatively, using conditional expectation,

¢t (8) = lE{lE (eiON* (t) I N(t)) } = lE{¢ (8)N(t) }

where N (t) is Poisson with parameter At.


22. We have that

E(SN(t» = E{E(sN(t) I A) } = 1 {eA j t (s- l ) + eA2t (s- I ) } ,

whence E(N(t» = � (A, 1 + A,2)t and var(N(t» = 1 (A, 1 + A,2)t + £ (A, I - A,2)2 t2 .

23. Conditional on {X (t) = i } , the next arrival in the birth process takes place at rate Ai . 24. The forward equations for Pn (t) = jp(X (t) = n) are

, 1 + p,(n - 1 ) 1 + p,n Pn (t) =

1 + p,t Pn- l (t) -1 + p,t P

n (t ) , n :::: 0,

with the convention that P- l (t) = O. Multiply by sn and sum to deduce that

as required.

aG 2 aG aG ( I + p,t) - = sG + p,s - - G - p,s -at as as

Differentiate with respect to s and take the limit as s t 1. If E(X (t)2 ) < 00, then

m (t) = E(X (t» = -aG

I as s=1

satisfies (1 + p,t)m' (t) = 1 + p,m(t) subject to m (O) = I . Solving this in the usual way, we obtain m (t) = I + (1 + p,I)t .

Differentiate again to find that

a2G

I n (t) = E (X(t) (X (t) - 1 ) ) = -2 as s=1

satisfies (1 + p,t)n' (t) = 2 (m(t) + p,m (t) + p,n (t» ) subject to n (O) = 1 (/ - 1 ) . The solution is

n (t) = 1 (/ - 1) + 21 ( 1 + p,I) t + (1 + p,I) ( 1 + p, + p,I)t2 .

The variance of X (t) is n (t) + m (t) - m(t)2 .

25. (a) Condition on the value of the first step:

j :::: 1 ,

as required. Set Xi = '1i+l - '1i to obtain A,jXj = P,jXj- l for j :::: 1 , so that

j :::: 1 .

It follows that j j '1j+l = '10 + L Xk = 1 + ('1 1 - 1) L ek ·

k=O k=O The '1j are probabilities, and lie in [0, 1 ] . If Z=f ek = 00 then we must have '1 1 = 1, which implies that '1j = 1 for all j .


(b) By conditioning on the first step, the probability TJj , of visiting 0 having started from j , satisfies

(j + 1 )2TJj+l + j2TJj_ 1 TJj = j2 + (j + 1 )2

Hence, (j + 1)2 (TJj+ l - TJj ) = f(TJj - TJj- l ) , giving (j + 1 )2 (TJj+1 - TJj ) = TJ l - TJo . Therefore,

as j -+ 00.

By Exercise (6.3 .6), we seek the minimal non-negative solution, which is achieved when TJ l = 1 -(6/rr2) . 26. We may suppose that X (O) = O . Let Tn = inf{t : X (t) = n } . Suppose Tn = T , and let Y = Tn+l - T. Condition on all possible occurrences during the interval (T, T + h) to find that

JE(Y) = O .. nh)h + /1-nh (h + JE(Y')) + (1 - Anh - /1-nh) (h + JE(Y)) + o(h) ,

where Y' i s the mean time which elapses before reaching n + 1 from n - I . Set mn = JE(Tn+ 1 - Tn) to obtain that

mn = /1-nh (mn- l + mn) + mn + h { l - (An + /1-n)mn } + o(h) . Divide by h and take the limit as h .,I. 0 to find that Anmn = 1 + /1-nmn- l , n :::: 1 . Therefore

1 /1-n 1 /1-n /1-n/1-n- l " . /1-1 mn = - + -mn- l = . . . = - + --- + . . . + , An An An AnAn- l AnAn- l . . . 1..0

since mo = 1..0 1 . The process is dishonest if z=�o mn < 00, since in this case Too = lim Tn has finite mean, so that JP'(Too < (0) = 1 .

On the other hand, the process grows no faster than a birth process with birth rates Ai , which is honest if z=�o l/An = 00. Can you find a better condition?

27. We know that, conditional on X (O) = I, X (t) has generating function

so that

( At ( 1 - S) + S ) 1 G (s , t) = At ( 1 _ s) + 1 '

JP'(T ::s x I X (0) = I) = JP'(X (x) = 0 I X (0) = I) = G(O, x) = Cx� 1) 1

It follows that, in the limit as x -+ 00,

� ( AX ) 1 ( AX ) JP'(T ::s x) = L..- -- JP'(X(O) = I) = GX(O) -- -+ 1 . 1=0 AX + 1 Ax + 1

For the final part, the required probability is {x I/(x I + 1) }1 = { I + (xI)- 1 }-I , which tends to e- 1/x as I -+ 00. 28. Let Y be an immigration-<ieath process without disasters, with Y (0) = o. We have from Problem (6. 1 5. 1 8) that yet) has generating function G(s , t) = exp {p (s - 1 ) ( 1 - e-1U ) } where p = 1..//1-. As seen earlier, and as easily verified by taking the limit as t -+ 00, Y has a stationary distribution.


From the process Y we may generate the process X in the following way. At the epoch of each disaster, we paint every member of the population grey. At any given time, the unpainted individuals constitute X, and the aggregate population constitutes Y. When constructed in this way, it is the case that Y(t) :s X(t) , so that Y is a Markov chain which is dominated by a chain having a stationary distribution. It follows that X has a stationary distribution 1r (the state 0 is persistent for X, and therefore persistent for Y also).

Suppose X is in equilibrium. The times of disasters form a Poisson process with intensity 8 . At any given time t, the elapsed time T since the last disaster is exponentially distributed with parameter 8 . At the time of this disaster, the value of X (t) is reduced to 0 whatever its previous value.

It follows by averaging over the value of T that the generating function H(s) = 2:�o Sn7rn of X (t) is given by

by the substitution x = e-/Lu . The mean of X (t) is

29. Let G( I B I , s) be the generating function of X(B) . If B nC = 0, then X(B UC) = X(B) +X(C) , so that G(or + ,8, s) = G(or, s )G(,8, s) for l s i :s 1 , or, ,8 � O. The only solutions to this equation which are monotone in or are of the form G(or, s) = eO!).. (s) for l s i :s 1 , and for some function A(S) . Now any interval may be divided into n equal sub-intervals, and therefore G(or, s) is the generating function of an infinitely divisible distribution. Using the result of Problem (5 . 12. l 3b), A(S) may be written in the form A(S) = (A(s) - I)A for some A and some probability generating function A(s) = 2:SO ai si . We now use (iii): if I B I = or,

J1D(X (B) � 1 ) 1 - eO!).. (ao- l ) ----._ = --+ 1 J1D(X (B) = 1) orAa l eO!).. (ao- 1 )

as or t O. Therefore ao + al = 1 , and hence A(s) = ao + ( 1 - ao)s , and X(B) has a Poisson distribution with parameter proportional to I B I . 30. (a) Let M(r, s) be the number of points of the resulting process on R+ lying in the interval (r, s ] . Since disjoint intervals correspond to disjoint annuli of the plane, the process M has independent increments in the sense that M('l , S l ) , M(r2 ' S2) , . . . , M(rn , sn ) are independent whenever '1 < Sl < r2 < . . . < rn < Sn . Furthermore, for r < s and k � 0,

(A7r (S - r)}ke-b(s-r) J1D(M(r, s) = k) = J1D(N has k points in the corresponding annulUS) = k ! .

(b) We have similarly that

00 (A7rX2Ye-bx2 J1D(R(k) :s x) = J1D(N has least k points in circle of radius x) = L , '

k r . r=

and the claim follows by differentiating, and utilizing the successive cancellation.

31. The number X(S) of points within the sphere with volume S and centre at the origin has the Poisson distribution with parameter AS. Hence J1D(X (S) = 0) = e-)..S , implying that the volume V of the largest such empty ball has the exponential distribution with parameter A .


It follows that IP'( R > r) = IP'( V > crn) = e -J...crn for r � 0, where c is the volume of the unit ball in n dimensions. Therefore

r � O.

Finally, lE(R) = Jooo e-J...crn dr , and we set v = Acrn .

32. The time between the kth and (k + 1)th infection has mean λ_k^{−1}, whence

E(T) = Σ_{k=1}^{N} 1/λ_k.

Now

Σ_{k=1}^{N} 1/(k(N + 1 − k)) = (1/(N + 1)) { Σ_{k=1}^{N} 1/k + Σ_{k=1}^{N} 1/(N + 1 − k) } = (2/(N + 1)) Σ_{k=1}^{N} 1/k = (2/(N + 1)) { log N + γ + O(N^{−1}) }.

It may be shown with more work (as in the solution to Problem (5 . 12.34)) that the moment generating function of A(N + 1 )T - 2 log N converges as N -+ 00, the limit being (r( 1 - B)f.
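A numerical check (illustrative, N = 1000) of the partial-fraction identity and of the logarithmic asymptotics used above:

import math

N = 1000
exact = sum(1.0 / (k * (N + 1 - k)) for k in range(1, N + 1))
harmonic = sum(1.0 / k for k in range(1, N + 1))
print(exact)
print(2.0 * harmonic / (N + 1))
print(2.0 * (math.log(N) + 0.5772156649) / (N + 1))   # Euler's constant gamma ~ 0.5772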

33. (a) The forward equations for Pn (t) = IP'( V (t) = n + ! ) are

with the convention that P- l (t) = O. It follows as usual that

- = - - 2s - + G + s - + sG aG aG ( aG ) ( 2 aG ) at as as as

as required. The general solution is

G (s , t) = _1_1 (t + _

1_)

1 - s 1 - s

n � 0,

for some function f . The boundary condition is G (s , 0) = 1 , and the solution is as given. (b) Clearly

mn (T) = 1E (loT Int dt) = loT lE(lnt ) dt

by Fubini's theorem, where Int is the indicator function of the event that V (t) = n + ! . As for the second part,

00

T '"' n (T) - 10 G( ) d _ log[1 + ( 1 - s)T] L..J s mn - s, t t - , o 1 - s n=O

so that, in the limit as T -+ 00,

� 1 ( 1 + ( 1 - S )T ) log(l - s) :Eoo L..J sn (mn (T) - log T) = -- log -+ = - snan 1 - s T 1 - s n=O n=l


if l s i < 1 , where an = 2:7=1 i - I , as required. (c) The mean velocity at time t is

[Figure: the transition diagram of the chain Y of Problem 34, showing states such as (0, 1), (1, 0), (2, 1), . . . ]

34. It is clear that Y is a Markov chain, and its possible transitions are illustrated in the above diagram. Let x and y be the probabilities of ever reaching ( 1 , 1 ) from ( 1 , 2) and (2, 1 ) , respectively. By conditioning on the first step and using the translational symmetry, we see that x = i y + i x2 and y = i + ixy . Hence x3 - 4x2 + 4x - 1 = 0, an equation with roots x = 1 , i (3 ± .J5). Since x is a probability, it must be that either x = l or x = i (3 - .J5), with the corresponding values of y = 1 and y = i (.J5 - 1) . Starting from any state to the right of ( 1 , 1) in the above diagram, we see by recursion that the chance of ever visiting ( 1 , 1 ) is of the form XIX yf3 for some non-negative integers a, p. The minimal non-negative solution is therefore achieved when x = i (3 - .J5) and y = i (.J5 - 1) . Since x < 1 , the chain is transient.

35. We write A, 1 , 2, 3 , 4, 5 for the vertices of the hexagon in clockwise order. Let Ti = min{n :::: 1 : Xn = i } and JlDi ( -) = JID(. I Xo = i ) . (a) By symmetry, the probabilities Pi = JlDi (TA < Tc) satisfy

2 1 1 1 1 2 PA = "3P1 , PI = "3 + "3P2 , P2 = "3P1 + 3P3 , P3 = "3P2 ,

whence PA = 17 ' (b) By Exercise (6.4.6), the stationary distribution is nc = 1 , lCi = � for i =f:. C, whence /LA = -1 8 lCA = .

(c) By the argument leading to Lemma (6.4.5) , this equals /LAlCC = 2. (d) We condition on the event E = {TA < Tc} as in the solution to Exercise (6.2.7). The probabilities bi = JlDi (E) satisfy


yielding h1 = ft' h2 = i , h3 = � 0 The transition probabilities conditional on E are now found by equations of the form

lP2 (E)p12 <12 = lP1 (E) d o 0 1 I 7 2 1 6 H °th th b O O an SlIm ar Y <2 1 = 9 ' <23 = 9 ' <32 = 2' <1A = 7 0 ence, WI e o VIOUS notation,

J-t2A = 1 + �J-t1A + �J-t3A ' J-t3A = 1 + J-t2A , J-t1A = 1 + tJ-t2A ,

giving J-t1A = W , and the required answer is 1 + J-t1A = 1 + � = 1;j- 0 36. (a) We have that

fJ(m - i )2 Pi, i+ i = m2

(X(i + 1 )2 Pi+1 , i = m2

Look for a solution to the detailed balance equations

to find the stationary distribution

(b) In this case, fJ (m - i )

Pi, i+ 1 = m

(X(i + 1 ) Pi+ 1 , i =

m Look for a solution to the detailed balance equations

yielding the stationary distribution

37. We have that

fJ (m - i ) (X (i + 1 ) lCi = lCi+ 1 m m

by the Chapman-Kolmogorov equations

by the concavity of C

� (a o (s) ) = L...J lCj C _J-o- = des) ,

j lCJ


where we have used the fact that 2:j lrjPjk (t) = lrk . Now aj (s ) � lrj as S � 00, and therefore d(t) � c(1 ) . 38. By the Chapman-Kolmogorov equations and the reversibility,

uo (2t) = L IP'(X(2t) = 0 I X(t) = j )IP'(X(t) = j I X(O) = 0) j

" lrO " ( U . (t) ) 2 = L...J -:-1P'(X(2t) = j I X(t) = O) Uj (t) = lrO L...J lrj _J _.

j lrJ j lrJ

The function c(x) = _x2 is concave, and the claim follows by the result of the previous problem.

39. This may be done in a variety of ways, by breaking up the distribution of a typical displacement and using the superposition theorem (6. 13 .5), by the colouring theorem (6. 1 3 . 14), or by Renyi's theorem (6. 1 3 . 17) as follows. Let B be a closed bounded region of ]Rd . We colour a point of IT at x E ]Rd black with probability lP'(x + X E B) , where X is a typical displacement. By the colouring theorem, the number of black points has a Poisson distribution with parameter

r AIP'(X + X E B) dx = A r dy r IP'(X E dy - x) llRd lYEB lXElRd = A r dy r IP'(X E dv) = A IB I , lYEB lVElRd

by the change of variables v = y - x. Therefore the probability that no displaced point lies in B is e-A1B 1 , and the claim follows by Renyi's theorem.

40. Conditional on the number N(s) of points originally in the interval (0, s ) , the positions of these points are jointly distributed as uniform random variables, so the mean number of these points which lie in (-00, a) after the perturbation satisfies

los 1 10

00 AS -IP'(X + U � a) du � A Fx (a - u) du = lE(Rd o s 0

as s � 00,

where X i s a typical displacement. Likewise, lE(LR) = A Jooo [ 1 - Fx (a + u)] du o Equality is valid if and only if

100

[ 1 - Fx (v)] dv = 1:00

Fx (v) dv ,

which is equivalent to a = lE(X) , by Exercise (4.3.5) . The last part follows immediately on setting Xr = Vrt , where Vr is the velocity of the rth car.

41. Conditional on the number N(t) of arrivals by time t, the arrival times of these ants are distributed as independent random variables with the uniform distribution. Let U be a typical arrival time, so that U is uniformly distributed on (0, t ) . The arriving ant is in the pantry at time t with probability lr = IP'(U + X > t) , or in the sink with probability p = IP'(U + X < t < U + X + Y), or departed with probability 1 - p - lr . Thus,

lE(xA(t) yB (t )) = lE{lE (xA(t) yB(t) I N (t) ) } = 1E{ (lrX + py + 1 - lr _ p )N(t) } = eA1r(x- l )eAp(y- l ) .

Thus A(t) and B(t) are independent Poisson-distributed random variables. If the ants arrive in pairs and then separate,


where y = 1 - 7r - p . Hence,

whence A(t) and B(t) are not independent in this case.

42. The sequence {Xr } generates a Poisson process N(t) = max{n : Sn ::: t} . The statement that Sn = t is equivalent to saying that there are n - 1 arrivals in (0, t) , and in addition an arrival at t . By Theorem (6. 12.7) or Theorem (6. 1 3 . 1 1 ) , the first n - 1 arrival times have the required distribution.

Part (b) follows similarly, on noting that .fu (u) depends on u = (U I , U2 , . . . , un ) only through the constraints on the Ur .

43. Let Y be a Markov chain independent of X, having the same transition matrix and such that Yo has the stationary distribution 7r . Let T = min{n � 1 : Xn = Yn } and suppose Xo = i . As in the proof of Theorem (6.4. 17),

|p_ij(n) − π_j| = | Σ_k π_k (p_ij(n) − p_kj(n)) | ≤ Σ_k π_k P(T > n) = P(T > n).

Now, P(T > r + 1 | T > r) ≤ 1 − ε^2 for r ≥ 0,

where ε = min_{i,j} p_ij > 0. The claim follows with λ = 1 − ε^2.
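A numerical illustration (with an arbitrary strictly positive transition matrix, not taken from the problem) of the resulting geometric bound max_{i,j} |p_ij(n) − π_j| ≤ λ^n with λ = 1 − ε^2:

P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.3, 0.4]]                  # an arbitrary strictly positive transition matrix

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(a, m):
    r = a
    for _ in range(m - 1):
        r = matmul(r, a)
    return r

pi = matpow(P, 200)[0]                 # a row of a high power approximates the stationary distribution
eps = min(min(row) for row in P)
lam = 1 - eps * eps
for n in (1, 5, 10):
    Pn = matpow(P, n)
    dev = max(abs(Pn[i][j] - pi[j]) for i in range(3) for j in range(3))
    print(n, dev, lam ** n)            # dev stays below lam ** n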

44. Let /k (n) be the indicator function ofa visit to k at time n, so that E(h (n)) = IP'(Xn = k) = ak (n) , say. By Problem (6. 15 .43), lak (n) - 7rk I ::: An . Now,

Let S = minIm , r } and t = 1m - r l . The last summation equals

= � I: I: { (aj (s) - 7rj ) (Pii (t) - 7rj ) + 7rj (Pii (t) - 7rj ) n r m

+ 7rj (aj (s) - 7rj ) - 7rj (aj (r) - 7rj ) - 7rj (aj (m) - 7rj ) }

as n -+ 00,

where 0 < A < 00 . For the last part, use the fact that :E�;:J f (Xr ) = :EjES f(i ) Vj (n) . The result is obtained by Minkowski's inequality (Problem (4. 14.27b)) and the first part.

45. We have by the Markov property that f(Xn+I I Xn , Xn- I , . . . , Xo) = f(Xn+1 I Xn) , whence

E (log f(Xn+1 I Xn , Xn- I , · · · , Xo) I Xn , . . . , Xo) = E (log f(Xn+I I Xn) I Xn) .


Taking the expectation of each side gives the result. Furthermore,

H(Xn+1 I Xn) = - L(Pij log Pij )1P'(Xn = i ) . i, j

Now X has a unique stationary distribution 1C , so that IP'(Xn = i ) --+ 7ri as n --+ 00 . The state space is finite, and the claim follows.

46. Let T = inf{t : Xt = Yd. Since X and Y are persistent, and since each process moves by distance 1 at continuously distributed times, it is the case that IP'(T < (0) = 1 . We define

{ Xt if t < T, Zt = Yt if t � T,

noting that the processes X and Z have the same distributions. (a) By the above remarks,

IIP'(Xt = k) - 1P'(Yt = k) 1 = IIP'(Zt = k) - 1P'(Yt = k) 1 ::::: \ 1P'(Zt = k, T ::::: t ) + IP'(Zt = k , T > t ) - 1P'(Yt = k , T ::::: t ) - 1P'(Yt = k , T > t) \ ::::: IP'(Xt = k, T > t) + IP'(Yt = k, T > t ) .

We sum over k E A, and let t --+ 00 .

(b) We have in this case that Zt ::::: Yt for all t . The claim follows from the fact that X and Z are

processes with the same distributions.

47. We reformulate the problem in the following way. Suppose there are two containers, W and N, containing n particles in all. During the time interval (t , t + dt), any particle in W moves to N with probability f.Ldt +o(dt), and any particle in N moves to W with probability Adt +o(dt) . The particles move independently of one another. The number Z (t) of particles in W has the same rules of evolution as the process X in the original problem. Now, Z(t) may be expressed as the sum of two independent random variables U and V, where U is bin(r, (}t ) , V is bin(n - r, 1/It ) , and (}t is the probability that a particle starting in W is in W at time t, 1/It is the probability that a particle starting in N at 0 is in W at t. By considering the two-state Markov chain of Exercise (6.9. 1 ) ,

θ_t = (λ + μ e^{−(λ+μ)t})/(λ + μ),    ψ_t = (λ − λ e^{−(λ+μ)t})/(λ + μ),

and therefore

E(s^{X(t)}) = E(s^U) E(s^V) = (sθ_t + 1 − θ_t)^r (sψ_t + 1 − ψ_t)^{n−r}.

Also, E(X(t)) = rθ_t + (n − r)ψ_t and var(X(t)) = rθ_t(1 − θ_t) + (n − r)ψ_t(1 − ψ_t). In the limit as t → ∞, the distribution of X(t) approaches the bin(n, λ/(λ + μ)) distribution.
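An illustrative simulation (not part of the solution), with the arbitrary values λ = 2, μ = 1, n = 10, r = 4, t = 3: each particle is run as an independent two-state chain, and the empirical mean of X(t) is compared with rθ_t + (n − r)ψ_t and with the limiting value nλ/(λ + μ).

import math, random

lam, mu, n, r, t_end = 2.0, 1.0, 10, 4, 3.0   # illustrative parameters

def in_W_at(t_end, start_in_W):
    # one particle: leaves W at rate mu, leaves N at rate lam
    state, t = start_in_W, 0.0
    while True:
        t += random.expovariate(mu if state else lam)
        if t > t_end:
            return state
        state = not state

trials, total = 100000, 0
for _ in range(trials):
    total += sum(in_W_at(t_end, i < r) for i in range(n))

a = lam + mu
theta = (lam + mu * math.exp(-a * t_end)) / a
psi = lam * (1 - math.exp(-a * t_end)) / a
print(total / trials, r * theta + (n - r) * psi, n * lam / a)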

48. Solving the equations

gives the first claim. We have that y = L:i (Pi - qi )7ri , and the formula for y follows. Considering the three walks in order, we have that: A. 7ri = � for each i , and YA = -2a < o.

B. Substitution in the formula for YB gives the numerator as 3 { -!ga + o(a) } , which is negative for small a whereas the denominator is positive.


C. The transition probabilities are the averages of those for A and B, namely, Po = � (-Ib -a) + � (� - a) = fu - a, and so on. The numerator in the formula for YC equals � + 0(1) , which is positive for small a .

49. Call a car green i f i t satisfies the given condition. The chance that a green car arrives on the scene during the time interval (u, u + h) is >"hJP(V < x/(t - u)) for u < t . Therefore, the arrival process of green cars is an inhomogeneous Poisson process with rate function

{ U (V < x/(t - u)) >.. (u) =

o

if u < t, if u ::: t .

Hence the required number has the Poisson distribution with mean

>.. fot JP (V <

t � u ) du = >.. fot JP (V < �) du

= >.. fot E(l{vu<x} ) du = >..E (V- 1 min{x , Vt }) .

50. The answer is the probability of exactly one arrival in the interval (s , t) , which equals g(s) =

>.. (t - s)e-J.. Ct-s) . By differentiation, g has its maximum at s = max{O, t - >..- 1 } , and g(S) = e-1

when t ::: >..- 1 .

51. We measure money in millions and time in hours . The number of available houses has the Poisson distribution with parameter 30>", whence the number A of affordable houses has the Poisson distribution with parameter i · 30>" = 5>.. (cf. Exercise (3 .5.2)). Since each viewing time T has moment generating function E(eBT) = (e2B - eB)/(J, the answer is


7

Convergence of random variables

7.1 Solutions. Introduction

1. (a) E I (cXY I = I c l r . { I I X l i r V . (b) This is Minkowski's inequality. (c) Let E > O. Certainly I X I � h where h is the indicator function of the event { I X I > E } . Hence EIXr I � El l; I = IP'(IX I > E) , implying that IP'( IX I > E) = 0 for all E > O. The converse is trivial.

2. (a) E({aX + bY}Z) = aE(XZ) + bE(YZ) . (b) E({X + Y}2) + E({X - y}2) = 2E(X2) + 2E(y2) . (c) Clearly

3. Let f(u) = �E , g(u) = 0, h (u) = -�E , for all u . Then dE (f, g) + dE (g , h) = 0 whereas dE (f, h) = 1 . 4. Either argue directly, or as follows. With any distribution function F, we may associate a graph F obtained by adding to the graph of F vertical line segments connecting the two endpoints at each discontinuity of F. By drawing a picture, you may see that .J2 d (F, G) equals the maximum distance between F and G measured along lines of slope - 1 . It is now clear that d(F, G) = 0 if and only if F = G, and that d (F, G) = d (G, F) . Finally, by the triangle inequality for real numbers, we have that d(F, H) � d(F, G) + d(G, H) .

5. Take X to be any random variable satisfying E(X2) = 00, and define Xn = X for all n .

7.2 Solutions. Modes of convergence

1. (a) By Minkowski's inequality,

let n --+ 00 to obtain lim infn�oo E IX� I � EIXr l . By another application of Minkowski's inequality,


(b) We have that IlE(Xn) - lE(X) 1 = IlE(Xn - X) I � lE lXn - X I -+ 0

as n -+ 00. The converse is clearly false. If each Xn takes the values ±I , each with probability i , then lE(Xn) = 0 , but lElXn - 0 1 = 1 .

(c) By part (a), lE(X;) -+ lE(X2) . Now Xn � X by Theorem (7.2.3), and therefore lE(Xn) -+ lE(X) by part (b) . Therefore var(Xn) = lE(X;) - lE(Xn)2 -+ var(X) .

2. Assume that Xn � X. Since IXn l � Z for all n, it is the case that I X I � Z a.s. Therefore Zn = IXn - X I satisfies Zn � 2Z a.s. In addition, if E > 0,

As n -+ 00, JP ( I Zn I > E) -+ 0, and therefore the last term tends to 0; to see this, use the fact that lE(Z) < 00, together with the result of Exercise (5.6.5). Now let E ..I- 0 to obtain that lE lZn l -+ 0 as n -+ 00.

3. We have that X - n - l � Xn � X, so that lE(Xn) -+ lE(X) , and similarly lE(Yn) -+ lE(Y) . By the independence of Xn and Yn ,

lE(Xn Yn ) = lE(Xn)lE(Yn ) -+ lE(X)lE(Y) .

Finally, (X - n - l ) (y - n - l ) � XnYn � XY, and

{ ( I

) ( I

) } lE(X) + lE(Y) I lE X - ;; Y - ;; = lE(XY) - n + n2 -+ lE(XY)

as n -+ 00, so that lE(Xn Yn ) -+ lE(XY) . 4. Let Fl , F2 , ' " be distribution functions. As in Section 5 .9, we write Fn -+ F if Fn (x) -+ F (x) for all x at which F is continuous. We are required to prove that Fn -+ F if and only if d (Fn , F) -+ O.

Suppose that d (Fn , F) -+ O. Then, for E > 0, there exists N such that

F(x - E) - E � Fn (x) � F(x + E) + E for all x .

Take the limits as n -+ 00 and E -+ 0 in that order, to find that Fn (x) -+ F(x) whenever F is continuous at x .

Suppose that Fn -+ F. Let E > 0 , and find real numbers a = X l < X2 < . . . < Xn = b , each being points of continuity of F, such that (i) Fj (a) < E for all i , F (b) > 1 - E , (ii) IXj+ ! - xi i < E for I � i < n . In order to pick a such that Fj (a) < E for all i , first choose a/ such that F (a') < iE and F is continuous at a/ , then find M such that IFm (a') - F (a') 1 < i E for m ::: M, and lastly find a continuity point a of F such that a � a/ and Fm (a) < E for I � m < M.

There are finitely many points Xj , and therefore there exists N such that I Fm (xj ) - F(xj ) 1 < E for all i and m ::: N. Now, if m ::: N and Xj � x < xi+ l ,

and similarly Fm (x) ::: Fm (xj ) > F(xj ) - E ::: F(x - E) - E .

Similar inequalities hold if x � a or x ::: b, and i t follows that d(Fm , F) < E if m ::: N. Therefore d(Fm , F) -+ 0 as m -+ 00 .


5. (a) Suppose C > 0 and pick li such that 0 < li < c. Find N such that 1P'( I Yn - c i > li) < li for n :::: N. Now, for x :::: 0,

IP'(Xn Yn ::::; x) ::::; IP' (Xn Yn ::::; x, I Yn - c l ::::; li) + 1P' ( l Yn - c l > li) ::::; IP' ( Xn ::::; c : li ) + li ,

and similarly

IP'(Xn Yn > x) ::::; IP' (Xn Yn > x , I Yn - c l ::::; li) + li ::::; IP' ( Xn > c : li ) + li .

Taking the limits as n � 00 and li + 0, we find that IP'(Xn Yn ::::; x) � IP'(X ::::; x/c) if x/c is a point of continuity of the distribution function of X. A similar argument holds if x < 0, and we conclude that XnYn .s cX if c > O. No extra difficulty arises if c < 0, and the case c = 0 is similar.

For the second part, it suffices to prove that yn- 1 � c- 1 if Yn � c (# 0). This is immediate from the fact that I y,;-l - c- 1 1 < E/{ lc l ( l c l - E) } if I Yn - c l < E « Ic l ) . (b) Let E > O. There exists N such that

1P'( IXn l > E) < E, 1P' ( I Yn - Y I > E) < E, if n :::: N,

and in addition IP'( I Y I > N) < E . By an elementary argument, g is uniformly continuous at points of the form (0, y) for Iy l ::::; N. Therefore there exists li (> 0) such that

I g (x/ , y/) - g(O, y) 1 < E if lx / I ::::; li , Iy' - y l ::::; li . If IXn l ::::; li, I Yn - Y I ::::; li, and I Y I ::::; N, then I g (Xn , Yn) - g (O, Y) I < E, so that

1P' ( l g (Xn , Yn) - g(O, Y) I :::: E) ::::; 1P'( IXn l > li) + 1P' ( I Yn - Y I > li) + IP'( I Y I > N) ::::; 3E, p for n :::: N. Therefore g(Xn , Yn) --* g(O, y) as n � 00 .

6. The subset A of the sample space Q may be expressed thus: 00 00 00

A = n u n { I Xn+m - Xn l < k- 1 } , k=l n=l m=l

a countable sequence of intersections and unions of events. For the last part, define

X (w) = { l

oimn400 Xn (w) if W E A

if W 1. A .

The function X is 9='-measurable since A E :F.

7. (a) If Xn (w) � X(w) then cnXn (w) � cX (w) . (b) We have by Minkowski's inequality that, as n � 00,

lE ( l cnXn - cXn ::::; I Cn I TlE ( I Xn - Xn + ICn - c l TlE lXT I � o.

(c) If c = 0, the claim is nearly obvious. Otherwise c # 0, and we may assume that c > O. For o < E < c, there exists N such that ICn - c l < E whenever n :::: N. By the triangle inequality, ICnXn - cX I ::::; I cn (Xn - X) I + I (cn - c)X I , so that, for n :::: N,

1P' ( l cnXn - cXI > E) ::::; lP' (cn lXn - X I > iE) + 1P' ( l cn - c l · I X I > iE)

::::; IP' ( I Xn - X I > E

) + IP' ( I X I > E

) 2(c + E) 2 1cn - c l � 0 as n � 00 .


(d) A neat way is to use the Skorokhod representation (7.2. 14) . If Xn S X, find random variables Yn , Y with the same distributions such that Yn � Y. Then cn Yn � cY, so that cnYn S cY, implying the same conclusion for the X's.

8. If X is not a.s. constant, there exist real numbers C and E such that 0 < E < � and JP(X < c) > 2E ,

JP(X > C + E) > 2E . Since Xn � X, there exists N such that

JP(Xn < c) > E , JP(Xn > C + E) > E, if n 2::: N.

Also, by the triangle inequality, I Xr - Xs i :::: IXr - X I + I Xs - X I ; therefore there exists M such that JP( IXr - Xs i > E) < E3 for r, s 2::: M. Assume now that the Xn are independent. Then, for r, s 2::: max{M, N}, r =1= s ,

E3 > JP( IXr - Xs i > E) 2::: JP (Xr < C, XS > C + E) = JP(Xr < c)JP(Xs > C + E) > E2 ,

a contradiction.

9. Either use the fact (Exercise (4. 12.3)) that convergence in total variation implies convergence in distribution, together with Theorem (7.2. 1 9), or argue directly thus. Since l uO I :::: K < 00,

IlE(u (Xn)) - lE(u (X)) 1 = II: u (k) {fn (k) - f(k) } 1 :::: K I: I fn (k) - f(k) I --+ o. k k

10. The partial sum Sn = E�=l Xr is Poisson-distributed with parameter an = E�=l Ar . For fixed x , the event {Sn :::: x } is decreasing in n, whence by Lemma ( 1 .3 .5), if an --+ a < 00 and x is a non-negative integer,

Hence if a < 00, E�l Xr converges to a Poisson random variable. On the other hand, if an --+ 00, then e-Cfn EJ=o a1 / j ! --+ 0, giving that JP(E�l Xr > x ) = 1 for all x, and therefore the sum diverges with probability 1 , as required.

7.3 Solutions. Some ancillary results

1. (a) If IXn - Xm I > E then either IXn - XI > �E or I Xm - X I > �E, so that

JP( IXn - Xm l > E) :::: JP ( IXn - X I > �E) + JP( IXm - X I > � E) --+ 0

as n , m --+ 00, for E > O. Conversely, suppose that {Xn } is Cauchy convergent in probability. For each positive integer k,

there exists nk such that

for n , m 2::: nk ·

The sequence (nk) may not be increasing, and we work instead with the sequence defined by Nl = n t . Nk+l = max{Nk + 1 , nk+i l . We have that

I:JP( IXNk+l - XNk l 2::: Tk) < 00, k


whence, by the first Borel-Cantelli lemma, a.s. only finitely many of the events { I X Nk+! -X Nk

I �

2-k } occur. Therefore, the expression

00

X = XNl + L(XNk+! - XNk ) k=l

converges absolutely on an event C having probability one. Define X (w) accordingly for w E C, and X (w) = 0 for w rf. C. We have, by the definition of X, that XNk

� X as k -+ 00. Finally, we 'fill in the gaps' . As before, for E > 0,

as n, k -+ 00, where we are using the assumption that {Xn } is Cauchy convergent in probability.

(b) Since Xn � X, the sequence {Xn } is Cauchy convergent in probability. Hence

JP ( I Yn - Ym l > E) = JP ( IXn - Xm l > E) -+ 0 as n , m -+ 00,

for E > O . Therefore {Yn } i s Cauchy convergent also, and the sequence converges in probability to some limit Y . Finally, Xn .!; X and Yn .!; X, so that X and Y have the same distribution.

2. Since An � U:;;'=n Am , we have that

lim sup JP(An ) ::: lim JP ( U Am) = JP ( lim U Am) = JP(An Lo.) , n .... oo n .... oo m=n n .... oo m=n

where we have used the continuity of JP. Alternatively, apply Fatou's lemma to the sequence IA� of indicator functions.

3. (a) Suppose X2n = 1 , X2n+l = - 1 , for n � 1 . Then {Sn = 0 Lo.} occurs if X l = - 1 , and not if X I = 1 . The event is therefore not in the tail a-field of the X's . (b) Here is a way. As usual, JP(S2n = 0) = (�) {p(1 - p) }n , so that

L JP(S2n = 0) < 00 if p # i , n

implying by the first Borel-Cantelli lemma that JP(Sn = 0 i.o.) = O. (c) Changing the values of any finite collection of the steps has no effect on I = lim inf Tn and J =

lim sup Tn , since such changes are extinguished in the limit by the denominator ' ..;n' . Hence I and J are tail functions, and are measurable with respect to the tail a -field. In particular, {l ::: -x} n {J � x } lies in the a -field.

Take x = 1 , say. Then, JP(l ::: - 1) = JP(J � 1) by symmetry; using Exercise (7 .3 .2) and the central limit theorem,

JP(J � 1 ) � JP(Sn � ..;n Lo.) � lim sup JP(Sn � ../ii) = 1 - <1> (1 ) > 0, n .... oo

where <I> is the N(O, 1) distribution function. Since {J � I } is a tail event of an independent sequence, it has probability either 0 or 1 , and therefore JP(l ::: - 1) = JP(J � 1) = 1 , and also JP(l ::: - 1 , J � 1) = 1 . That is, on an event having probability one, each visit of the walk to the left of -..;n is followed by a visit of the walk to the right of ..;n, and vice versa. It follows that the walk visits 0 infinitely often, with probability one.


4. Let $A$ be exchangeable. Since $A$ is defined in terms of the $X_i$, it follows by a standard result of measure theory that, for each $n$, there exists an event $A_n \in \sigma(X_1, X_2, \dots, X_n)$ such that $\mathbb{P}(A \,\triangle\, A_n) \to 0$ as $n \to \infty$. We may express $A_n$ and $A$ in the form
$$A_n = \{\mathbf{X}_n \in B_n\}, \qquad A = \{\mathbf{X} \in B\},$$
where $\mathbf{X}_n = (X_1, X_2, \dots, X_n)$, and $B_n$ and $B$ are appropriate subsets of $\mathbb{R}^n$ and $\mathbb{R}^{\infty}$. Let
$$A_n' = \{\mathbf{X}_n' \in B_n\}, \qquad A' = \{\mathbf{X}' \in B\},$$
where $\mathbf{X}_n' = (X_{n+1}, X_{n+2}, \dots, X_{2n})$ and $\mathbf{X}' = (X_{n+1}, X_{n+2}, \dots, X_{2n}, X_1, X_2, \dots, X_n, X_{2n+1}, X_{2n+2}, \dots)$.

Now $\mathbb{P}(A_n \cap A_n') = \mathbb{P}(A_n)\mathbb{P}(A_n')$, by independence. Also $\mathbb{P}(A_n) = \mathbb{P}(A_n')$, and therefore
$$\mathbb{P}(A_n \cap A_n') = \mathbb{P}(A_n)^2 \to \mathbb{P}(A)^2 \qquad \text{as } n \to \infty. \tag{*}$$
By the exchangeability of $A$, we have that $\mathbb{P}(A \,\triangle\, A_n') = \mathbb{P}(A' \,\triangle\, A_n')$, which in turn equals $\mathbb{P}(A \,\triangle\, A_n)$, using the fact that the $X_i$ are independent and identically distributed. Therefore,
$$|\mathbb{P}(A_n \cap A_n') - \mathbb{P}(A)| \leq \mathbb{P}(A \,\triangle\, A_n) + \mathbb{P}(A \,\triangle\, A_n') \to 0 \qquad \text{as } n \to \infty.$$
Combining this with (*), we obtain that $\mathbb{P}(A) = \mathbb{P}(A)^2$, and hence $\mathbb{P}(A)$ equals 0 or 1.

5. The value of $S_n$ does not depend on the order of the first $n$ steps, but only on their sum. If $S_n = 0$ i.o., then $S_n' = 0$ i.o. for all walks $\{S_n'\}$ obtained from $\{S_n\}$ by permutations of finitely many steps.

6. Since $f$ is continuous on a closed interval, it is bounded: $|f(y)| \leq c$ for all $y \in [0,1]$, for some $c$. Furthermore $f$ is uniformly continuous on $[0,1]$, which is to say that, if $\epsilon > 0$, there exists $\delta\,(>0)$ such that $|f(y) - f(z)| < \epsilon$ if $|y - z| \leq \delta$. With this choice of $\epsilon, \delta$, we have that $|\mathbb{E}(Z I_{A^c})| < \epsilon$, and
$$|\mathbb{E}(Z I_A)| \leq 2c\,\mathbb{P}(A) \leq 2c\cdot\frac{x(1-x)}{n\delta^2}$$
by Chebyshov's inequality. Therefore
$$|\mathbb{E}(Z)| < \epsilon + \frac{2c}{n\delta^2},$$
which is less than $2\epsilon$ for values of $n$ exceeding $2c/(\epsilon\delta^2)$.

7. If $\{X_n\}$ converges completely to $X$ then, by the first Borel-Cantelli lemma, $|X_n - X| > \epsilon$ only finitely often with probability one, for all $\epsilon > 0$. This implies that $X_n \to X$ a.s.; see Theorem (7.2.4c).

Suppose conversely that {Xn } is a sequence of independent variables which converges almost surely to X. By Exercise (7.2.8), X is almost surely constant, and we may therefore suppose that Xn � c where c E R. It follows that, for E > 0, only finitely many of the (independent) events { I Xn - c l > E } occur, with probability one. Using the second Borel-Cantelli lemma,

L JP> ( I Xn - c l > E) < 00. n

8. Of the various ways of doing this, here is one. We have that
$$\binom{n}{2}^{-1}\sum_{1 \leq i < j \leq n} X_i X_j = \frac{n}{n-1}\Bigl(\frac{1}{n}\sum_{i=1}^{n} X_i\Bigr)^2 - \frac{1}{n(n-1)}\sum_{i=1}^{n} X_i^2.$$


Now $n^{-1}\sum_{1}^{n} X_i \overset{D}{\to} \mu$ by the law of large numbers (5.10.2); hence $n^{-1}\sum_{1}^{n} X_i \overset{P}{\to} \mu$ (use Theorem (7.2.4a)). It follows that $(n^{-1}\sum_{1}^{n} X_i)^2 \overset{P}{\to} \mu^2$; to see this, either argue directly or use Problem (7.11.3). Now use Exercise (7.2.7) to find that
$$\frac{n}{n-1}\Bigl(\frac{1}{n}\sum_{i=1}^{n} X_i\Bigr)^2 \overset{P}{\to} \mu^2.$$
Arguing similarly,
$$\frac{1}{n(n-1)}\sum_{i=1}^{n} X_i^2 \overset{P}{\to} 0,$$
and the result follows by the fact (Theorem (7.3.9)) that the sum of these two expressions converges in probability to the sum of their limits.
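For an informal numerical check of this limit, the following sketch (assuming NumPy, and taking exponential variables with mean mu = 2 purely for illustration) evaluates the left-hand side via the identity displayed above.

```python
import numpy as np

rng = np.random.default_rng(1)
mu = 2.0  # common mean of the X_i (illustrative choice)

for n in (100, 1_000, 10_000):
    x = rng.exponential(scale=mu, size=n)
    s, s2 = x.sum(), np.sum(x ** 2)
    # (n choose 2)^{-1} sum_{i<j} X_i X_j = ((sum X_i)^2 - sum X_i^2) / (n(n-1))
    u = (s ** 2 - s2) / (n * (n - 1))
    print(n, round(u, 3), "target mu^2 =", mu ** 2)
```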

9. Evidently,
$$\mathbb{P}\Bigl(\frac{X_n}{\log n} \geq 1 + \epsilon\Bigr) = \frac{1}{n^{1+\epsilon}} \qquad \text{for } |\epsilon| < 1.$$
By the Borel-Cantelli lemmas, the events $A_n = \{X_n/\log n \geq 1 + \epsilon\}$ occur a.s. infinitely often for $-1 < \epsilon \leq 0$, and a.s. only finitely often for $\epsilon > 0$.

10. (a) Mills's ratio (Exercise (4.4.8) or Problem (4.14.1c)) informs us that $1 - \Phi(x) \sim x^{-1}\varphi(x)$ as $x \to \infty$. Therefore,
$$\mathbb{P}\bigl(|X_n| \geq \sqrt{2\log n\,(1+\epsilon)}\bigr) \sim \frac{2}{\sqrt{2\pi}\,\sqrt{2\log n\,(1+\epsilon)}\; n^{1+\epsilon}}.$$
The sum over $n$ of these terms converges if and only if $\epsilon > 0$, and the Borel-Cantelli lemmas imply the claim.
(b) This is an easy implication of the Borel-Cantelli lemmas.

11. Let X be uniformly distributed on the interval [- 1 , 1 ] , and define Xn = I{X::;(- INnj . The dis­tribution of Xn approaches the Bernoulli distribution which takes the values ± 1 with equal probability ! . The median of Xn is 1 if n is even and - 1 if n is odd.

12. (i) We have that

for $x > 0$. The result follows by the second Borel-Cantelli lemma.
(ii) (a) The stationary distribution $\pi$ is found in the usual way to satisfy
$$\pi_k = \frac{k-1}{k+1}\,\pi_{k-1} = \cdots = \pi_1\,\frac{2}{k(k+1)}, \qquad k \geq 2.$$
Hence $\pi_k = \{k(k+1)\}^{-1}$ for $k \geq 1$, a distribution with mean $\sum_{k=1}^{\infty}(k+1)^{-1} = \infty$.
(b) By construction, $\mathbb{P}(X_n \leq X_0 + n) = 1$ for all $n$, whence
$$\mathbb{P}\Bigl(\limsup_{n\to\infty}\frac{X_n}{n} \leq 1\Bigr) = 1.$$
It may in fact be shown that $\mathbb{P}\bigl(\limsup_{n\to\infty} X_n/n = 0\bigr) = 1$.


13. We divide the numerator and denominator by $\sqrt{n}\,\sigma$. By the central limit theorem, the former converges in distribution to the $N(0,1)$ distribution. We expand the new denominator, squared, as
$$\frac{1}{n\sigma^2}\sum_{r=1}^{n}(X_r - \mu)^2 - \frac{2}{n\sigma^2}(\overline{X} - \mu)\sum_{r=1}^{n}(X_r - \mu) + \frac{1}{\sigma^2}(\overline{X} - \mu)^2.$$
By the weak law of large numbers (Theorem (5.10.2), combined with Theorem (7.2.3)), the first term converges in probability to 1, and the other terms to 0. Their sum converges to 1, by Theorem (7.3.9), and the result follows by Slutsky's theorem, Exercise (7.2.5).

7.4 Solutions. Laws of large numbers

1. Let Sn = Xl + X2 + · · · + Xn . Then

n . 2 2 ", z n

JE(S ) = L.J - < -n i=2 10g i - log n

and therefore Sn/n � O. On the other hand, �i JP( I Xi i 2: i ) = 1 , so that IXi i 2: i i.o., with probability one, by the second Borel-Cantelli lemma. For such a value of i , we have that l Si - Si-l l 2: i , implying that Sn/n does not converge, with probability one.

2. Let the $X_n$ satisfy
$$\mathbb{P}(X_n = -n) = 1 - \frac{1}{n^2}, \qquad \mathbb{P}(X_n = n^3 - n) = \frac{1}{n^2},$$
whence they have zero mean. However,
$$\sum_n \mathbb{P}(X_n \neq -n) = \sum_n \frac{1}{n^2} < \infty,$$
implying by the first Borel-Cantelli lemma that $\mathbb{P}(X_n/n \to -1) = 1$. It is an elementary result of real analysis that $n^{-1}\sum_{i=1}^{n} x_i \to -1$ if $x_n \to -1$, and the claim follows.

3. The random variable $N(S)$ has mean and variance $\lambda|S|$, where $|S| = cr^d$ and $c$ is a constant depending only on $d$. By Chebyshov's inequality,
$$\mathbb{P}\Bigl(\Bigl|\frac{N(S)}{|S|} - \lambda\Bigr| \geq \epsilon\Bigr) \leq \frac{\lambda}{\epsilon^2|S|} = \frac{\lambda}{\epsilon^2 c r^d}.$$
By the first Borel-Cantelli lemma, $\bigl||S_k|^{-1}N(S_k) - \lambda\bigr| \geq \epsilon$ for only finitely many integers $k$, a.s., where $S_k$ is the sphere of radius $k$. It follows that $N(S_k)/|S_k| \to \lambda$ a.s. as $k \to \infty$. The same conclusion holds as $k \to \infty$ through the reals, since $N(S)$ is non-decreasing in the radius of $S$.


7.5 Solutions. The strong law

1. Let Iij be the indicator function of the event that X j lies in the i th interval. Then

n n m m 10g Rm = L Zm (i ) log Pi = L L Iij log Pi = L Yj

i=1 i=1 j=1 j=1

where, for 1 ::::: j ::::: m , Yj = L:7=1 Iij log Pi is the sum of independent identically distributed variables with mean n

lE(Yj ) = L Pi log Pi = -h . i=1

By the strong law, m- 1 log Rm � -h . 2. The following two observations are clear: (a) N(t) < n if and only if Tn > t , (b) TN(t) ::::: t < TN(t)+l ·

If $\mathbb{E}(X_1) < \infty$, then $\mathbb{E}(T_n) < \infty$, so that $\mathbb{P}(T_n > t) \to 0$ as $t \to \infty$. Therefore, by (a),
$$\mathbb{P}(N(t) < n) = \mathbb{P}(T_n > t) \to 0 \qquad \text{as } t \to \infty,$$
implying that $N(t) \to \infty$ a.s. as $t \to \infty$. Secondly, by (b),
$$\frac{T_{N(t)}}{N(t)} \leq \frac{t}{N(t)} < \frac{T_{N(t)+1}}{N(t)+1}\bigl(1 + N(t)^{-1}\bigr).$$
Take the limit as $t \to \infty$, using the fact that $T_n/n \to \mathbb{E}(X_1)$ a.s. by the strong law, to deduce that $t/N(t) \to \mathbb{E}(X_1)$ a.s.

3. By the strong law, $S_n/n \to \mathbb{E}(X_1) \neq 0$ a.s. In particular, with probability 1, $S_n = 0$ only finitely often.
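A small simulation illustrates the conclusion that $t/N(t)$ approaches $\mathbb{E}(X_1)$; the sketch assumes NumPy and uses exponential inter-event times with mean 2, an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)
mean_gap = 2.0  # E(X_1), illustrative

arrival_times = np.cumsum(rng.exponential(mean_gap, size=2_000_000))  # T_1, T_2, ...
for t in (1e2, 1e4, 1e6):
    n_t = np.searchsorted(arrival_times, t, side="right")  # N(t) = #{n : T_n <= t}
    print(f"t = {t:>9.0f}   t/N(t) = {t / n_t:.4f}")       # approaches E(X_1) = 2
```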

7.6 Solution. The law of the iterated logarithm

1. The sum Sn is approximately N(O, n), so that

-�a lP'(Sn > van log n ) = 1 - <I> (va log n ) < � alog n

for all large n , by the tail estimate of Exercise (4.4.8) or Problem (4. l4. 1c) for the normal distribution. This is summable if a > 2, and the claim follows by an application of the first Borel-Cantelli lemma.

1. Suppose i < j . Then

7.7 Solutions. Martingales

lE(XjXi ) = lE{ lE [(Sj - Sj- l )Xi I So, SI , · · · , Sj-d }

= lE{ Xi [lE(Sj I SO , s} , . . . , Sj- l ) - Sj-d } = 0


by the martingale property.

2. Clearly lEl Sn l < 00 for all n . Also, for n ::: 0,

1 { ( 1 - /-tn+l ) } lE(Sn+1 I Zo , Zl , . . . , Zn ) = /-tn+l lE(Zn+1 I ZO , . . . , Zn) - m 1 _ /-t

1 { ( 1 - /-tn+l ) } = /-tn+l m + /-tZn - m 1 - /-t = Sn .

3. Certainly lEl Sn l < 00 for all n . Secondly, for n ::: 1 ,

lE(Sn+l I Xo , Xl , . · · , Xn) = alE(Xn+l I Xo , · . · , Xn) + Xn = (aa + I )Xn + abXn- l ,

which equals Sn if a = ( 1 - a)-I . 4. The gambler stakes Zi = fi- l (Xl , . . . , Xi- } ) on the i th play, at a return of Xi per unit. Therefore Si = Si- l + Xi Zi for i ::: 2, with Sl = Xl Y. Secondly,

where we have used the fact that Zn+l depends only on Xl , X2 , . . . , Xn .

7.8 Solutions. Martingale convergence theorem

1. It is easily checked that Sn defines a martingale with respect to itself, and the claim follows from the Doob-Kolmogorov inequality, using the fact that

n lE(S;) = L var(Xj ) .

j=l

2. It would be easy but somewhat perverse to use the martingale convergence theorem, and so we give a direct proof based on Kolmogorov's inequality of Exercise (7 .8 . 1 ) . Applying this inequality to the sequence Zm , Zm+l , . . . where Zi = (Xi - lEXi )/ i , we obtain that Sn = Zl + Z2 + . . . + Zn satisfies, for E > 0,

$$\mathbb{P}\Bigl(\max_{m < n \leq r}|S_n - S_m| > \epsilon\Bigr) \leq \frac{1}{\epsilon^2}\sum_{n=m+1}^{r}\mathrm{var}(Z_n).$$
We take the limit as $r \to \infty$, using the continuity of $\mathbb{P}$, to obtain
$$\mathbb{P}\Bigl(\sup_{n \geq m}|S_n - S_m| > \epsilon\Bigr) \leq \frac{1}{\epsilon^2}\sum_{n=m+1}^{\infty}\frac{1}{n^2}\mathrm{var}(X_n).$$
Now let $m \to \infty$ to obtain (after a small step)
$$\mathbb{P}\Bigl(\lim_{m\to\infty}\sup_{n \geq m}|S_n - S_m| \leq \epsilon\Bigr) = 1 \qquad \text{for all } \epsilon > 0.$$
Any real sequence $(x_n)$ satisfying
$$\lim_{m\to\infty}\sup_{n \geq m}|x_n - x_m| \leq \epsilon \qquad \text{for all } \epsilon > 0,$$

is Cauchy convergent, and hence convergent. It follows that Sn converges a.s. to some limit Y. The last part is an immediate consequence, using Kronecker's lemma.
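Both conclusions are easy to visualise numerically. The sketch below (assuming NumPy, with exponential variables of mean 1 as an arbitrary choice for which the variance condition holds) tracks the partial sums of $(X_i - \mathbb{E}X_i)/i$, which settle down, and the averages $n^{-1}\sum(X_i - \mathbb{E}X_i)$, which drift to 0 as Kronecker's lemma predicts.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x = rng.exponential(1.0, size=n)     # E(X_i) = 1, var(X_i) = 1, so sum var(X_i)/i^2 < infinity

centred = x - 1.0
idx = np.arange(1, n + 1)
series = np.cumsum(centred / idx)    # partial sums of (X_i - EX_i)/i : converge a.s.
averages = np.cumsum(centred) / idx  # n^{-1} sum (X_i - EX_i) : tends to 0 by Kronecker

for k in (10**3, 10**4, 10**5, 10**6):
    print(k, round(series[k - 1], 4), round(averages[k - 1], 5))
```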

3. By the martingale convergence theorem, S = limn-+oo Sn exists a.s . , and Sn � S. Using Exercise (7.2. lc), var(Sn) -+ var(S), and therefore var(S) = O.

7.9 Solutions. Prediction and conditional expectation

1. (a) Clearly the best predictors are JE(X I Y) = y2 , JE(Y I X) = o.

(b) We have, after expansion, that

since JE(Y) = JE(y3 ) = O. This is a minimum when b = JE(y2) = j , and a = o. The best linear predictor of X given Y is therefore t .

Note that JE(Y I X) = 0 is a linear function of X ; it is therefore the best linear predictor of Y given X. 2. By the result of Problem (4. 14. 1 3), JE(Y I X) = 1-t2 + PCT2 (X - I-tl )/CTl , in the natural notation.

3. Write
$$g(\mathbf{a}) = \sum_{i=1}^{n} a_i X_i = \mathbf{a}\mathbf{X}',$$
and
$$v(\mathbf{a}) = \mathbb{E}\{(Y - g(\mathbf{a}))^2\} = \mathbb{E}(Y^2) - 2\mathbf{a}\,\mathbb{E}(Y\mathbf{X}') + \mathbf{a}V\mathbf{a}'.$$
Let $\widehat{\mathbf{a}}$ be a vector satisfying $V\widehat{\mathbf{a}}' = \mathbb{E}(Y\mathbf{X}')$. Then
$$v(\mathbf{a}) - v(\widehat{\mathbf{a}}) = \mathbf{a}V\mathbf{a}' - 2\mathbf{a}\,\mathbb{E}(Y\mathbf{X}') + 2\widehat{\mathbf{a}}\,\mathbb{E}(Y\mathbf{X}') - \widehat{\mathbf{a}}V\widehat{\mathbf{a}}' = \mathbf{a}V\mathbf{a}' - 2\mathbf{a}V\widehat{\mathbf{a}}' + \widehat{\mathbf{a}}V\widehat{\mathbf{a}}' = (\mathbf{a} - \widehat{\mathbf{a}})V(\mathbf{a} - \widehat{\mathbf{a}})' \geq 0,$$
since $V$ is non-negative definite. Hence $v(\mathbf{a})$ is a minimum when $\mathbf{a} = \widehat{\mathbf{a}}$, and the answer is $g(\widehat{\mathbf{a}})$. If $V$ is non-singular, $\widehat{\mathbf{a}} = \mathbb{E}(Y\mathbf{X})V^{-1}$.
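Numerically, the normal equations $V\widehat{\mathbf{a}}' = \mathbb{E}(Y\mathbf{X}')$ amount to a single linear solve. The sketch below assumes NumPy and invents a small illustrative model, $Y = X_1 + 2X_2 + \text{noise}$, so that the recovered coefficients should be close to $(1, 2)$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

X = rng.normal(size=(n, 2))
Y = X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=n)  # illustrative linear model

V = X.T @ X / n                  # estimate of V = E(X'X)
c = X.T @ Y / n                  # estimate of E(Y X')
a_hat = np.linalg.solve(V, c)    # normal equations: V a' = E(Y X')
print(a_hat)                     # approximately [1, 2]
```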

4. Recall that Z = JE(Y I g,) is the ( 'almost' ) unique g,-measurable random variable with finite mean and satisfying JE{(Y - Z)IG } = 0 for all G E g,. (i) n E g" and hence JE{JE(Y I g,)In } = JE(Zln ) = JE(Y In) . (ii) U = aJE(Y I g,) + ,BJE(Z I g,) satisfies

JE(UIG ) = aJE{JE(Y I g,)IG } + ,BJE{JE(Z I g,)IG } = aJE(Y IG) + ,BJE(ZIG) = JE{ (aY + ,BZ)IG } ,

Also, U is g,-measurable. (iii) Suppose there exists m (> 0) such that G = {JE(Y I g,) < -m } has strictly positive probability. Then G E g" and so JE(Y IG) = JE{JE(Y I g,)IG } . However Y IG ::: 0, whereas JE(Y I g,)IG < -m. We obtain a contradiction on taking expectations. (iv) Just check the definition of conditional expectation. (v) If Y is independent of g" then JE(Y IG) = JE(Y)lP'(G) for G E g,. Hence JE{(Y - JE(Y))IG } = 0 for G E g" as required.


(vi) If g is convex then, for all a E JR, there exists J.. (a) such that

g(y) :::: g (a) + (y - a)J.. (a) ;

furthermore J.. may be chosen to be a measurable function of a . Set a = lE(Y I 9,) and y = Y, to obtain

g (Y) :::: g {lE(Y I 9,)} + {Y - lE(Y I 9,) }J.. {lE(Y I 9,) } .

Take expectations conditional on 9" and use the fact that lE(Y I 9,) i s 9,-measurable. (vii) We have that

I lE(Yn I 9,) - lE(Y I 9,) 1 � lE{ l Yn - Y I I 9, } � Vn

where Vn = lE{ suPm�n I Ym - Y I I 9, } . Now { Vn : n :::: I } is non-increasing and bounded below. Hence V = limn-+oo Vn exists and satisfies V :::: O. Also

lE (V ) � lE (Vn ) = lE { sup I Ym - Y I } , m�n

which tends to 0 as m -+ 00, by the dominated convergence theorem. Therefore lE(V) = 0, and hence lP'(V = 0) = 1 . The claim follows.

5. lE(Y I X) = X. 6. (a) Let {Xn : n :::: I } be a sequence of members of H which is Cauchy convergent in mean square, that is, lE{ IXn - Xm 1 2 } -+ 0 as m, n -+ 00. By Chebyshov's inequality, {Xn : n :::: I } is Cauchy convergent in probability, and therefore converges in probability to some limit X (see Exercise (7.3. 1 )) . It follows that there exists a subsequence {Xnk : k :::: I } which converges to X almost surely. Since each Xnk is 9,-measurable, we may assume that X is 9,-measurable. Now, as n -+ 00,

where we have used Fatou's lemma and Cauchy convergence in mean square. Therefore Xn � X. That lE(X2) < 00 is a consequence of Exercise (7.2. l a). (b) That (i) implies (ii) is obvious, since IG E H. Suppose that (ii) holds. Any Z (E H) may be written as the limit, as n -+ 00, of random variables of the form

m(n) Zn = L aj (n) IGj (n)

j=l

for reals aj (n) and events Gj (n) in 9, ; furthermore we may assume that I Zn l � I Z I . It is easy to see that lE{ (Y - M)Zn } = 0 for all n . By dominated convergence, lE{(Y - M)Zn } -+ lE{(Y - M)Z}, and the claim follows.

7.10 Solutions. Uniform integrability

1. It is easily checked by considering whether Ix I � a or I y I � a that, for a > 0,

Now substitute x = Xn and y = Yn , and take expectations.


2. (a) Let $\epsilon > 0$. There exists $N$ such that $\mathbb{E}(|X_n - X|^r) < \epsilon$ if $n > N$. Now $\mathbb{E}|X^r| < \infty$, by Exercise (7.2.1a), and therefore there exists $\delta\,(>0)$ such that
$$\mathbb{E}(|X|^r I_A) < \epsilon, \qquad \mathbb{E}(|X_n|^r I_A) < \epsilon \quad \text{for } 1 \leq n \leq N,$$
for all events $A$ such that $\mathbb{P}(A) < \delta$. By Minkowski's inequality,
$$\{\mathbb{E}(|X_n|^r I_A)\}^{1/r} \leq \{\mathbb{E}(|X_n - X|^r I_A)\}^{1/r} + \{\mathbb{E}(|X|^r I_A)\}^{1/r} \leq 2\epsilon^{1/r} \qquad \text{if } n > N$$
and $\mathbb{P}(A) < \delta$. Therefore $\{|X_n|^r : n \geq 1\}$ is uniformly integrable.

If $r$ is an integer then $\{X_n^r : n \geq 1\}$ is uniformly integrable also. Also $X_n^r \overset{P}{\to} X^r$ since $X_n \overset{P}{\to} X$ (use the result of Problem (7.11.3)). Therefore $\mathbb{E}(X_n^r) \to \mathbb{E}(X^r)$ as required.

(b) Suppose now that the collection { I Xn l r : n � I } is uniformly integrable and Xn � X. We show first that lE lXr l < 00, as follows. There exists a subsequence {Xnk : k � I } which converges to X almost surely. By Fatou's lemma,

lE lXr l = lE (lim inf IXnk l r) :s lim inf lE IX�k l :S sup lE lX� 1 < 00. k--+oo k--+oo n

If E > 0, there exists 8 (> 0) such that

lE( IXr l fA ) < E, lE( IX� l fA ) < E for all n , whenever A is such that IP'(A) < 8. There exists N such that Bn (E) = { I Xn - X I > E} satisfies IP'(Bn (E» < 8 for n > N. Consequently

lE( IXn - Xn :s Er + lE ( IXn - X l r fBn (E» ) , n > N,

of which the final term satisfies

{lE ( IXn - X lr fBn (E» ) } l /r :s {lE ( IX� I fBn (E» ) }

l /r + {lE ( IXr I fBn (E» ) } l /r :s 2E 1 /r .

Therefore, Xn � X.

3. Fix E > 0, and find a real number a such that g (x) > X/E if x > a . If b � a, lE ( IXn l f{ lXn l >bJ ) < ElE{g ( IXn D } :s E sup lE{g ( IXn D } ' n

whence the left side approaches 0, unifonnly in n , as b -+ 00.

4. Here is a quick way. Extinction is (almost) certain for such a branching process, so that $Z_n \to 0$ a.s., and hence $Z_n \overset{P}{\to} 0$. If $\{Z_n : n \geq 0\}$ were uniformly integrable, it would follow that $\mathbb{E}(Z_n) \to 0$ as $n \to \infty$; however $\mathbb{E}(Z_n) = 1$ for all $n$.

5. We may suppose that $X_n$, $Y_n$, and $Z_n$ have finite means, for all $n$. We have that $0 \leq Y_n - X_n \leq Z_n - X_n$ where, by Theorem (7.3.9c), $Z_n - X_n \overset{P}{\to} Z - X$. Also
$$\mathbb{E}|Z_n - X_n| = \mathbb{E}(Z_n - X_n) \to \mathbb{E}(Z - X) = \mathbb{E}|Z - X|,$$
so that $\{Z_n - X_n : n \geq 1\}$ is uniformly integrable, by Theorem (7.10.3). It follows that $\{Y_n - X_n : n \geq 1\}$ is uniformly integrable. Also $Y_n - X_n \overset{P}{\to} Y - X$, and therefore by Theorem (7.10.3), $\mathbb{E}|Y_n - X_n| \to \mathbb{E}|Y - X|$, which is to say that $\mathbb{E}(Y_n) - \mathbb{E}(X_n) \to \mathbb{E}(Y) - \mathbb{E}(X)$; hence $\mathbb{E}(Y_n) \to \mathbb{E}(Y)$.

It is not necessary to use uniform integrability; try doing it using the 'more primitive' Fatou's lemma.

6. For any event A, lE( IXn l fA) :s lE(ZfA) where Z = sUPn IXn l . The uniform integrability follows by the assumption that lE(Z) < 00.


7.11 Solutions to problems

1. $\mathbb{E}|X_n^r| = \infty$ for $r \geq 1$, so there is no convergence in any mean. However, if $\epsilon > 0$,
$$\mathbb{P}(|X_n| > \epsilon) = 1 - \frac{2}{\pi}\tan^{-1}(n\epsilon) \to 0 \qquad \text{as } n \to \infty,$$
so that $X_n \overset{P}{\to} 0$.

You have insufficient information to decide whether or not Xn converges almost surely:

(a) Let X be Cauchy, and let Xn = X/n o Then Xn has the required density function, and Xn � O. (b) Let the Xn be independent with the specified density functions. For E > 0,

so that L:n lP'(IXn l > E) = 00. By the second Borel-Cantelli lemma, I Xn l > E i.o. with probability one, implying that Xn does not converge a.s. to O.

2. (i) Assume all the random variables are defined on the same probability space; otherwise it is meaningless to add them together. (a) Clearly Xn (w) + Yn (w) --+ X(w) + Y(w) whenever Xn (w) --+ X(w) and Yn (w) --+ Y(w) . Therefore

{Xn + Yn ,.. X + Y} � {Xn ,.. X} U {Yn ,.. Y} , a union of events having zero probability. (b) Use Minkowski's inequality to obtain that

{lE ( IXn + Yn - X - yn } l /r .::: {lE( IXn - Xn} l /r + {lE( I Yn - yn} l /r .

(c) If E > 0, we have that

{ IXn + Yn - X - Y I > E } � { IXn - X I > iE } U { I Yn - Y I > iE } ,

and the probability of the right side tends to 0 as n --+ 00.

(d) If Xn S X and the Xn are symmetric, then -Xn S X. However Xn + (-Xn) S 0, which generally differs from 2X in distribution. (ii) (e) Almost-sure convergence follows as in (a) above. (t) The corresponding statement for convergence in rth mean is false in general. Find a random variable Z such that lE lZr l < 00 but lEIZ2r l = 00, and define Xn = Yn = Z for all n .

p p (g) Suppose Xn -r X and Yn -r Y. Let E > O. Then

lP'( IXnYn - XY I > E) = lP' ( I (Xn - X)(Yn - Y) + (Xn - X)Y + X (Yn - Y) I > E) .::: lP' ( IXn - X I · I Yn - Y I > 1E) + lP'( IXn - X I · I Y I > 1E)

+ lP'( IX I · I Yn - Y I > jE) . Now, for 8 > 0,

$$\mathbb{P}(|X_n - X|\cdot|Y| > \tfrac{1}{3}\epsilon) \leq \mathbb{P}\bigl(|X_n - X| > \epsilon/(3\delta)\bigr) + \mathbb{P}(|Y| > \delta),$$

which tends to 0 in the limit as n -+ 00 and 8 -+ 00 in that order. Together with two similar facts, we obtain that XnYn � XY. (h) The example of (d) above indicates that the corresponding statement is false for convergence in distribution. 3. Let E > O. We may pick M such that 1P'( IX I � M) :::: E . The continuous function g is uniformly continuous on the bounded interval [-M, M]. There exists 8 > 0 such that

I g(x) - g(y) 1 :::: E if Ix - y l :::: 8 and Ix l :::: M.

If I g(Xn) - g(X) 1 > E , then either IXn - X I > 8 or IX I � M. Therefore

1P'( lg (Xn) - g(X) 1 > E) :::: 1P' ( IXn - X I > 8) + 1P'( IX I � M) -+ 1P'( IX I � M) :::: E ,

in the limit as n -+ 00. It follows that g(Xn) � g(X) . 4. Clearly

lE(eitXn ) = II lE(eitYj / IOJ ) = II - . - e. .

n . n { I 1 it / lOj- 1 } . . 10 1 _ el t/ lOJ J=l J=l

1 - eit 1 _ eit - -+ ----- IOn (l - eit/ 10" ) i t

as n -+ 00. The limit is the characteristic function of the uniform distribution on [0 , 1 ] . Now Xn :::: Xn+l :::: 1 for all n , s o that Y(w) = Iimn�oo Xn (w) exists for all w. Therefore

Xn � Y; hence Xn S Y, whence Y has the uniform distribution. 5. (a) Suppose s < t. Then

lE (N(s)N(t)) = lE(N(sf) + lE{ N(s) (N(t) - N(s)) } = lE(N(s)2) + lE(N(s))lE (N(t) - N(s) ) ,

since N has independent increments. Therefore

cov(N(s) , N(t)) = lE (N(s)N(t)) - lE(N(s))lE(N(t)) = (As)2 + AS + AS {A(t - s)} - (AS) (>..t) = AS .

In general, cov(N(s) , N(t)) = A min{s , t } . (b) N(t + h) - N(t) has the same distribution as N(h) , if h > O. Hence

which tends to 0 as h -+ o. (c) By Markov's inequality,

1P' ( IN(t + h) - N(t) 1 > E) :::: E12 lE ( {N(t + h) - N(t) } 2) ,

which tends to 0 as h -+ 0, if E > O. (d) Let E > O. For O < h < E- 1 ,

IP' (IN(t + h� - N(t)

I> E) = IP'(N(t + h) - N(t) � 1 ) = Ah + o(h) ,


which tends to ° as h -+ 0. On the other hand,

JE ( { N(t + � - N(t) f) = h12 { (Ah)2 + Ah }

which tends to 00 as h + 0.

6. By Markov's inequality, Sn = 2:i=l Xj satisfies

for £ > 0. Using the properties of the X 's,

since JE(Xj ) = ° for all i . Therefore there exists a constant C such that

implying (via the first Borel-Cantelli lemma) that n- 1Sn � 0.

7. We have by Markov 's inequality that

for £ > 0, so that Xn � X (via the first Borel-Cantelli lemma).

< 00

8. Either use the Skorokhod representation or characteristic functions. Following the latter route, the characteristic function of aXn + b is

where <Pn is the characteristic function of Xn . The result follows by the continuity theorem.

9. For any positive reals c, t ,

JE{(X + c)2} JP'(X ::: t) = JP'(X + c ::: t + c) � (t + c)2 .

Set c = (12 f t to obtain the required inequality.

10. Note that g(u) = ufO + u) is an increasing function on [0, (0). Therefore, for £ > 0,

( IXn l £ ) 1 + £ ( IXn l ) JP'( IXn l > £) = JP' 1 + IXn l > 1 + £ � -£- . JE

1 + IXn l


by Markov's inequality. If this expectation tends to 0 then Xn � O. p

Suppose conversely that Xn -+ O. Then

( I Xn l ) E E lE :::: -- . JP>( I Xn l :::: E) + 1 · JP>( IXn l > E) -+ --

1 + I Xn l 1 + E 1 + E

as n -+ 00, for E > O. However E is arbitrary, and hence the expectation has limit O.

11. (i) The argument of the solution to Exercise (7.9.6a) shows that {Xn } converges in mean square if it is mean-square Cauchy convergent. Conversely, suppose that Xn � X. By Minkowski's inequality,

as m , n -+ 00, so that {Xn } is mean-square Cauchy convergent. (ii) The corresponding result is valid for convergence almost surely, in rth mean, and in probability. For a.s. convergence, it is self-evident by the properties of Cauchy-convergent sequences of real numbers. For convergence in probability, see Exercise (7.3 . 1 ) . For convergence in rth mean (r ::: 1 ) , just adapt the argument of (i) above.

12. If $\mathrm{var}(X_i) \leq M$ for all $i$, the variance of $n^{-1}\sum_{i=1}^{n} X_i$ is
$$\frac{1}{n^2}\sum_{i=1}^{n}\mathrm{var}(X_i) \leq \frac{M}{n} \to 0 \qquad \text{as } n \to \infty.$$

13. (a) We have that

If x :::: 0 then F(anx)n -+ 0, so that H (x) = O. Suppose that x > o. Then

- log H(x) = - lim {n log [l - ( 1 - F(anx))] } = lim {n ( l - F (anx)) } n�oo n�oo

since -y- 1 log(1 - y) -+ 1 as y .J, O. Setting x = 1 , we obtain n ( 1 - F(an)) -+ - log H(I) , and the second limit follows. (b) This is immediate from the fact that it is valid for all sequences {an } . (c) We have that

1 - F(tex+Y) 1 - F(tex+y) 1 - F(teX ) log H (eY) log H (eX ) ----- = . -+ . --=---

1 - F(t) 1 - F(teX) 1 - F(t) log H( 1 ) log H( I )

a s t -+ 00. Therefore g(x + y) = g(x)g(y) . Now g i s non-increasing with g (O) = 1 . Therefore g(x) = e-fJx for some ,8 , and hence H(u) = exp(-au-fJ) for u > 0, where a = - log H( 1 ) .

14. Either use the result of Problem (7. 1 1 . 1 3) or do the calculations directly thus. We have that

$$\mathbb{P}(M_n \leq xn/\pi) = \Bigl\{\frac{1}{2} + \frac{1}{\pi}\tan^{-1}\Bigl(\frac{xn}{\pi}\Bigr)\Bigr\}^n = \Bigl\{1 - \frac{1}{\pi}\tan^{-1}\Bigl(\frac{\pi}{xn}\Bigr)\Bigr\}^n$$
if $x > 0$, by elementary trigonometry. Now $\tan^{-1} y = y + o(y)$ as $y \to 0$, and therefore
$$\mathbb{P}(M_n \leq xn/\pi) = \Bigl(1 - \frac{1}{xn} + o(n^{-1})\Bigr)^n \to e^{-1/x} \qquad \text{as } n \to \infty.$$
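A quick simulation of this limit, assuming NumPy and standard Cauchy samples, compares the empirical distribution function of $\pi M_n/n$ with $e^{-1/x}$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 10_000, 5_000

m = np.array([np.pi * rng.standard_cauchy(n).max() / n for _ in range(reps)])
for x in (0.5, 1.0, 2.0):
    print(x, round(np.mean(m <= x), 3), "limit exp(-1/x):", round(np.exp(-1.0 / x), 3))
```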


15. The characteristic function of the average satisfies

as n � 00.

By the continuity theorem, the average converges in distribution to the constant IL, and hence in probability also.

16. (a) With Un = u (xn) , we have that

IE(u(X)) - E(u (Y)) 1 :s L IUn l · l fn - gn l :s L Ifn - gn l n n

if l I u l ioo :s 1 . There is equality if Un equals the sign of fn - gn . The second equality holds as in Problem (2.7. 1 3) and Exercise (4. 1 2.3). (b) Similarly, if l I u l ioo :s 1 ,

IE(u(X)) - E(u(Y)) 1 :s i: l u (x) 1 . I f (x) - g(x) 1 dx :s i: I f (x) - g(x) 1 dx

with equality if u (x) is the sign of f(x) - g(x) . Secondly, we have that

where

I IP'(X E A) - 1P'(Y E A) I = � I E(u(X)) - E(u (Y)) 1 :s �dTV(X, Y) ,

{ I if x E A, u (x) =

- 1 if x ¢ A . Equality holds when A = { x E JR : f(x) � g(x) } . (c) Suppose dTV (Xn , X) � o . Fix a E JR, and let u be the indicator function of the interval (-00, a]. Then IE(u(Xn) ) - E(u(X)) 1 = IIP'(Xn :s a) - 1P'(X :s a) l , and the claim follows.

On the other hand, if Xn = n- 1 with probability one, then Xn .s O. However, by part (a), dTV(Xn , O) = 2 for all n . (d) This is tricky without a knowledge of Radon-Nikodym derivatives, and we therefore restrict ourselves to the case when X and Y are discrete. (The continuous case is analogous.) As in the solution to Exercise (4. 1 2.4) , IP'(X '# Y) � �dTV(X, Y). That equality is possible was proved for Exercise (4. 12.5), and we rephrase that solution here. Let ILn = min{fn , gn } and IL = I:n ILn , and note that

dTV (X, Y) = L Ifn - gn l = L{fn + gn - 21Ln } = 2( 1 - 1L) ·

It is easy to see that

n n

1 { I if IL = 0, '1dTV (X, Y) = IP'(X '# Y) = 0 if IL = 1 ,

and therefore we may assume that 0 < IL < 1 . Let U , V , W be random variables with mass functions

IP'(U - ) - ILn lP'(V _ ) _ max{fn - gn , O} IP'(W _ ) _ - min{fn - gn , O} - Xn - , - Xn - , - Xn - ,

IL l - IL l - IL

and let Z be a Bernoulli variable with parameter IL, independent of (U, V, W). We now choose the pair X', Y' by

(X' Y') = { (U, U) if Z = 1 , , (V, W) if Z = O.


It may be checked that X' and Y' have the same distributions as X and Y, and furthermore, lI"(X' =I­Y') = lI"(Z = 0) = 1 - J1, = idTV(X, Y) .

(e) By part (d), we may find independent pairs (X� , Y[) , 1 :::: i :::: n , having the same marginals as

(Xi , Yi ) , respectively, and such that ll"(X� =I- Y[) = �dTV(Xi ' Yi ) . Now,

17. If Xl , X2 , . . . are independent variables having the Poisson distribution with parameter A, then

Sn = Xl + X2 + . . . + Xn has the Poisson distribution with parameter An . Now n- I Sn � A, so that E(g(n- l Sn)) --+ g(A) for all bounded continuous g. The result follows.

18. The characteristic function o/mn of

satisfies

(Xn - n) - (Ym - m) U mn = r.:::;-;-;; vm + n

log o/mn (t) = n (eitf./m+n - 1 ) + m (e-itf./m+n - 1 ) + (m - n)i t --+ _ lt2 ../m + n 2

as m, n --+ 00, implying by the continuity theorem that Umn � N(O, 1 ) . Now Xn + Ym is Poisson­distributed with parameter m + n, and therefore

V. - J

Xn + Ym P 1 mn - � m + n as m , n --+ 00

D by the law oflarge numbers and Problem (3). It follows by Slutsky's theorem (7.2.5a) that U mn / V mn ---+ N(O, 1) as required.

19. (a) The characteristic function of Xn is ¢n (t) = exp{i J1,nt - iO';t2 } where J1,n and a; are 1 2

the mean and variance of Xn . Now, limn-+oo ¢n ( 1 ) exists. However ¢n ( 1 ) has modulus e- 20'n , and therefore 0'2 = limn-+oo a; exists . The remaining component eiJ.Ln t of ¢n (t) converges as n --+ 00, say eiJ.Ln t --+ O (t) as n --+ 00 where O (t) lies on the unit circle of the complex plane. Now

1 2 2 ¢n (t) --+ 0 (t)e- 20' t , which is required to be a characteristic function; therefore 0 is a continuous function of t. Of the various ways of showing that O (t) = eiJ.Lt for some J1" here is one. The sequence o/n (t) = eiJ.Ln t is a sequence of characteristic functions whose limit O (t) is continuous at t = 0. Therefore 0 is a characteristic function. However 0/ n is the characteristic function of the constant J1,n , which must converge in distribution as n --+ 00; it follows that the real sequence {J1,n } converges to some limit J1" and O (t) = eiJ.Lt as required.

This proves that ¢n (t) --+ exp{iJ1,t - �O'2t2 } , and therefore the limit X is N(J1" 0'2) . (b) Each linear combination s Xn + t Yn converges in probability, and hence in distribution, to s X + t Y . Now s Xn + t Yn has a normal distribution, implying by part (a) that s X + t Y i s normal. Therefore the joint characteristic function of X and Y satisfies

¢X, Y (s , t) = ¢sx+ty ( 1 ) = exp{ iE(sX + tY) - � var(sX + tY) } = exp{ i (sJ1,X + tJ1,y) - � (s2O'i + 2stPxyO'xO'y + t2O'¥ ) }


in the natural notation. Viewed as a function of s and t , this is the joint characteristic function of a bivariate normal distribution.

When working in such a context, the technique of using linear combinations of Xn and Yn is sometimes called the 'Cramer-Wold device' .

20. (i) Write Yj = Xi - lE(Xj ) and Tn = 2:1=1 Yi . It suffices to show that n- 1 Tn � O . Now, as n --+ 00,

2 2 1 n

2 nc lE(Tn In ) = 2" L var(Xi ) + 2" L COV(Xi , Xj ) :5 2" --+ o. n i= l n 1::5.i<j::s.n n

(ii) Lett:" > O. There exists I such that I p (Xj , Xj ) 1 :5 E if I i - j l � I . Now

n L cov(Xj , Xj ) :5 L cov(Xj , Xj ) + L COV(Xi , Xj ) :5 2nlc + n2Ec ,

i, j= l l i-j l::s.! 1::s.j, j5n l i-j l> !

1::s.i, j5n

since COV(Xi , Xj ) :5 I p (Xi ' Xj ) l ..jvar(Xi ) · var(Xj ) . Therefore,

2 2 2Ic lE(Tn In ) :5 - + EC --+ EC as n --+ 00. n

This is valid for all positive E, and the result follows.

21. The integral roo __ C_ dx J2 x log Ix l

diverges, and therefore lE(X 1 ) does not exist. The characteristic function ¢ of X 1 may be expressed as

whence

A. ) 2 hoo cos(tx)

d 'f' (t = C --- x 2 x2 log x

¢ (t) - ¢ (O) = _ roo 1 - cos(tx)

dx . 2c 12 x2 log x

Now 0 :5 1 - cos e :5 min{2, e2} , and therefore

Now

and

Therefore

I ¢ (t) - ¢ (O) I h 1/ t t2 100 2 < -- dx + --- dx 2c - 2 log x l /t x2 log x '

- -- --+ 0 1 hU dx u 2 log x

as u --+ 00,

100 2 1 100 2 2 --- dx < -- - dx = -- , U x2 log x - log u U x2 u log u

I ¢ (t) � ¢ (O) I = oCt) as t .J.. O.

if t > O.

u > 1 .

Now ¢ is an even function, and hence ¢' (O) exists and equals O. Use the result of Problem (7. 1 1 . 15) to deduce that n- 1 2:1 Xi converges in distribution to 0, and therefore in probability also, since 0 is constant. The Xi do not obey the strong law since they have no mean.


22. If the two points are U and V then

and therefore 1 2 1 � 2 P 1 -Xn = - �(Ui - Vi ) � -n n i= l 6

as n � 00,

by the independence of the components. It follows that Xn/.fii � 1 /./6 either by the result of Problem (7. 1 1 .3) or by the fact that

23. The characteristic function of Yj = Xi I is

lo l . . lo l 100 cos y cjJ (t) = 1 (el t/x + e-I t/X ) dx = cos(t/x) dx = I t I -- dy o 0 It I y2

by the substitution x = I t l /Y . Therefore

100 1 - cos y cjJ (t) = 1 - I t I 2 dy = 1 - / I t l + o( l t l )

I t I y

where, integrating by parts,

- laOO 1 - cos Y d _ loOO sin u _ Ti / - 2 Y - du - - . o Y O u 2

It follows that Tn = n- l L.l=l Xi I has characteristic function

as t � 0,

as t � 00, whence 2Tn/Ti is asymptotically Cauchy-distributed. In particular,

2 100 du 1 1P' ( 1 2Tn/Ti I > 1 ) � - --2 = -Ti I l + u 2 as t � 00.

24. Let mn be a non-decreasing sequence of integers satisfying 1 � mn < n , mn � 00, and define

noting that Ynk takes the value ±1 each with probability 1 whenever mn < k � n . Let Zn L.k=l Ynk · Then

n n 1 IP'(Un =1= Zn) � L IP'(IXk l � mn ) � L k2 � 0 k=l k=mn

as n � 00,


from �hich it follows that Un /.;n � N(O, 1 ) if and only if Zn/.;n � N(O, 1 ) . Now

mn Zn = L Ynk + Bn-mn

k=l

where Bn-mn is the sum of n - mn independent summands each of which takes the values ±1 , each possibility having probability � . Furthermore

1 1 mn

I 2

- L Ynk :::: mn

.;n k=l .;n

which tends to ° if mn is chosen to be mn = Ln 1/5 J ; with this choice for mn , we have that

n- l Bn-mn � N(O, 1 ) , and the result follows. Finally,

so that

var(Un) = t (2 - -\-) k=l k

1 n 1 var (Un /.,f1i) = 2 - - L "2 � 2. n k=l k

25. (i) Let ¢n and ¢ be the characteristic functions of Xn and X. The characteristic function 1{!k of XNk is

whence

00

1{!k (t) = L ¢j (t)IP'(Nk = j) j=l

00

l 1{!k (t) - ¢ (t) 1 :::: L I ¢j (t) - ¢ (t) IIP'(Nk = j) . j=l

Let E > 0. We have that ¢j (t) � ¢ (t) as j � 00, and hence for any T > 0, there exists J (T) such that I¢j (t) - ¢ (t) 1 < E if j � J (T) and I t I :::: T. Finally, there exists K(T) such that IP'(Nk :::: J (T» :::: E if k � K (T) . It follows that

if I t I :::: T and k � K (T) ; therefore 1{!k (t) � ¢ (t) as k � 00. (ii) Let Yn = sUPm�n I Xm - X I . For E > 0, n � 1 ,

1P' ( I XNk - X I > E ) :::: IP'(Nk :::: n) + 1P' ( I XNk - X I > E, Nk > n ) :::: IP'(Nk :::: n) + 1P'(Yn > E) � IP'(Yn > E)

Now take the limit as n � 00 and use the fact that Yn � 0.

26. (a) We have that

$$\frac{a(n-k, n)}{a(n+1, n)} = \prod_{i=0}^{k}\Bigl(1 - \frac{i}{n}\Bigr) \leq \exp\Bigl(-\sum_{i=0}^{k}\frac{i}{n}\Bigr)$$


as k � 00.


(b) The expectation is

En = L g ( j �

n ) ni��n

. "I n J . J

where the sum is over all j satisfying n - M.;n :::: j :::: n . For such a value of j ,

g ( j - n ) nie-n

= e-n (niH _ ni )

.;n j ! .;n j ! (j - I ) ! '

whence En has the form given. (c) Now g is continuous on the interval [-M, 0], and it follows by the central limit theorem that

Also,

e-n e-n e-n-k2/(2n) En ::::

.;na(n + 1 , n) :::: En +

.;na(n - k, n) :::: En +

.;n a(n + 1 , n)

where $k = \lfloor M\sqrt{n}\rfloor$. Take the limits as $n \to \infty$ and $M \to \infty$ in that order to obtain
$$\frac{1}{\sqrt{2\pi}} \leq \lim_{n\to\infty}\frac{n^{n+\frac{1}{2}}e^{-n}}{n!} \leq \frac{1}{\sqrt{2\pi}}.$$

27. Clearly
$$\mathbb{E}(R_{n+1} \mid R_0, R_1, \dots, R_n) = R_n + \frac{R_n}{n+2}$$
since a red ball is added with probability $R_n/(n+2)$. Hence
$$\mathbb{E}(S_{n+1} \mid R_0, R_1, \dots, R_n) = S_n,$$
and also $0 \leq S_n \leq 1$. Using the martingale convergence theorem, $S = \lim_{n\to\infty} S_n$ exists almost surely and in mean square.
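The convergence of $S_n$ is easy to see in simulation. The sketch below (assuming NumPy, with the urn started from one red and one blue ball) runs a few independent paths: each proportion settles down, but different paths settle at different limits.

```python
import numpy as np

rng = np.random.default_rng(7)

def polya_proportion(n_steps, rng):
    red, total = 1, 2                    # one red and one blue ball initially
    for _ in range(n_steps):
        if rng.random() < red / total:   # the drawn ball is red with probability R_n/(n+2)
            red += 1                     # a ball of the drawn colour is added
        total += 1
    return red / total                   # S_n = R_n / (n + 2)

print([round(polya_proportion(100_000, rng), 3) for _ in range(5)])
```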

28. Let 0 < E < 1 , and let

k(t) = LetJ , m et) = ro - E3)k(t)1 , n et) = L( 1 + E3 )k(t)J

and let Imn (t) be the indicator function of the event {m (t) :::: M(t) < n et) } . Since M(t)/ t � e , we may find T such that E(lmn (t)) > 1 - E for t � T.

We may approximate SM(t) by the random variable Sk(t) as follows. With Ai = { l Si - Sk(t) I > E..jk(t) } ,

lP' (AM(t) ) :::: lP' (AM(t) , Imn (t) = 1 ) + lP' (AM(t) , Imn (t) = 0) k(t)- l n(t)- l

:::: lP' ( U Ai) + lP' ( U Ai) + lP'(Imn (t) = O) j=m(t) i=k(t)

{k(t) - m(t) }0'2 {n (t) - k(t) }0'2 < + + E - E2k(t) E2k(t) :::: E ( 1 + 20'2) , if t � T,


by Kolmogorov's inequality (Exercise (7.8 . 1) and Problem (7. 1 1 .29» . Send t --+ 00 to find that

D = SM(t) - Sk(t) � 0 t Jk(t) as t --+ 00.

Now Sk(t)/ Jk(t) S N(O, (J'2) as t --+ 00, by the usual central limit theorem. Therefore

which implies the first claim, since k(t)/ «()t) --+ 1 (see Exercise (7.2.7» . The second part follows by Slutsky's theorem (7.2.5a) .

29. We have that Sn = Sk + (Sn - Sk) , and so, for n ::: k,

Now StJAk ::: c2IAk ; the second term on the right side is 0, by the independence of the X 's, and the third term is non-negative. The first inequality of the question follows. Summing over k, we obtain lE(S�) ::: c21P'(Mn > c) as required.

30. (i) With Sn = LI=1 Xi , we have by Kolmogorov's inequality that

(

) 1 m+n

IP' max I Sm+k - Sm l > E � 2" 2: lE(X�) l �k�n E k=m

for E > O. Take the limit as m, n --+ 00 to obtain in the usual way that {Sr : r ::: O} is a.s. Cauchy convergent, and therefore a.s. convergent, if Lj'" lE(X�) < 00. It is shorter to use the martingale convergence theorem, noting that Sn is a martingale with uniformly bounded second moments. (ii) Apply part (i) to the sequence Yk = Xk/bk to deduce that L�1 Xk/bk converges a.s . The claim now follows by Kronecker's lemma (see Exercise (7.8 .2» .

31. (a) This is immediate by the observation that
$$\lambda(\mathbf{P}) = f(X_0)\prod_{i,j} p_{ij}^{N_{ij}}.$$
(b) Clearly $\sum_j p_{ij} = 1$ for each $i$, and we introduce Lagrange multipliers $\{\mu_i : i \in S\}$ and write $V = \log\lambda(\mathbf{P}) + \sum_i \mu_i \sum_j p_{ij}$. Differentiating $V$ with respect to each $p_{ij}$ yields a stationary (maximum) value when $(N_{ij}/p_{ij}) + \mu_i = 0$. Hence $\sum_k N_{ik} = -\mu_i$, and
$$\widehat{p}_{ij} = -\frac{N_{ij}}{\mu_i} = \frac{N_{ij}}{\sum_k N_{ik}}.$$
(c) We have that $N_{ij} = \sum_r I_r$, the sum being over the $\sum_k N_{ik}$ transitions out of $i$, where $I_r$ is the indicator function of the event that the $r$th transition out of $i$ is to $j$. By the Markov property, the $I_r$ are independent with constant mean $p_{ij}$. Using the strong law of large numbers and the fact that $\sum_k N_{ik} \to \infty$ as $n \to \infty$, $\widehat{p}_{ij} \to \mathbb{E}(I_1) = p_{ij}$ a.s.
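The estimator $N_{ij}/\sum_k N_{ik}$ can be tried out directly; the following sketch assumes NumPy and a two-state chain with an arbitrarily chosen transition matrix, which is then recovered from the observed transition counts.

```python
import numpy as np

rng = np.random.default_rng(8)
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])   # illustrative transition matrix

n_steps = 200_000
counts = np.zeros((2, 2))
state = 0
for _ in range(n_steps):
    nxt = rng.choice(2, p=P[state])
    counts[state, nxt] += 1   # N_ij: transitions observed from i to j
    state = nxt

P_hat = counts / counts.sum(axis=1, keepdims=True)  # N_ij / sum_k N_ik
print(P_hat)                                        # close to P for long runs
```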

32. (a) If X is transient then Vi (n) < 00 a.s . , and ILi = 00, whence Vi (n)/n � 0 = ILII . If X is persistent, then without loss of generality we may assume X 0 = i . Let T (r) be the duration of the rth


excursion from i . By the strong Markov property, the T (r) are independent and identically distributed with mean iJ-i . Furthermore,

By the strong law of large numbers and the fact that Vi (n) � 00 as n --+ 00, the two outer terms sandwich the central term, and the result follows.

(b) Note that L�:J !(Xr ) = LieS !(i ) Vi (n) . With Q a finite subset of S, and 1!i = iJ-i 1 , the unique stationary distribution,

1: !(Xr ) - I: !(�) = II: ( Vi (n) - �) !(i ) 1 ---" n . iJ-1 . n iJ-1 r=v I I :::: {I: I Vj (n) - � I + I: ( Vi (n)

+ �) } I I ! l Ioo , i eQ n iJ-1 i r;. Q n iJ-1

where I l f l loo = sup{ I !(i ) 1 : i E S} . The sum over i E Q converges a.s. to 0 as n --+ 00, by part (a) . The other sum satisfies

I: ( Vi (n) + �) = 2 - I: ( Vi (n)

+ 1!i) i r;. Q n iJ-1 i eQ n

which approaches 0 a.s . , in the limits as n --+ 00 and Q t S. 33. (a) Since the chain is persistent, we may assume without loss of generality that Xo = j . Define the times Rl , R2 , . . . of return to j, the sojourn lengths SI , S2 , . . . in j , and the times VI , V2 , . . . between visits to j . By the Markov property and the strong law of large numbers,

1 1 � a.s. -Rn = - L...J Vr � iJ-j . n n r=1

Also, Rnl Rn+l � 1 , since iJ-j = lE(Rl ) < 00. If Rn < t < Rn+b then

� . L�=1 Sr < � lot I . ds < Rn+l . L�;:t Sr

R 1 �n V; - t 0 (X (s)=} l - R �n+l V; . n+ L."r=1 r n L."r=1 r

Let n --+ 00 to obtain the result. (b) Note by Theorem (6.9.2 1 ) that Pij (t) --+ 1!j as t --+ 00. We take expectations of the integral in part (a), and the claim follows as in Corollary (6.4.22) . (c) Use the fact that

l !(X (s)) ds = I: lot I(X(s)=j ) ds o

j eS 0

together with the method of solution of Problem (7 . 1 1 .32b).

34. (a) By the first Borel-Cantelli lemma, Xn = Yn for all but finitely many values of n, almost surely. Off an event of probability zero, the sequences are identical for all large n . (b) This follows immediately from part (a), since Xn - Yn = 0 for all large n , almost surely.

(c) By the above, a; 1 L� 1 (Xr - Yr) � 0, which implies the claim.

35. Let $Y_n = X_n I_{\{|X_n| \leq c\}}$. Then
$$\sum_n \mathbb{P}(X_n \neq Y_n) = \sum_n \mathbb{P}(|X_n| > c) < \infty$$
by assumption (a), whence $\{X_n\}$ and $\{Y_n\}$ are tail-equivalent (see Problem (7.11.34)). By assumption (b) and the martingale convergence theorem (7.8.1) applied to the partial sums $\sum_{n=1}^{N}(Y_n - \mathbb{E}(Y_n))$, the infinite sum $\sum_{n=1}^{\infty}(Y_n - \mathbb{E}(Y_n))$ converges almost surely. Finally, $\sum_{n=1}^{\infty}\mathbb{E}(Y_n)$ converges by assumption (c), and therefore $\sum_{n=1}^{\infty} Y_n$, and hence $\sum_{n=1}^{\infty} X_n$, converges a.s.

36. (a) Let $n_1 < n_2 < \cdots < n_r = n$. Since the $I_k$ take only two values, it suffices to show that
$$\mathbb{P}(I_{n_s} = 1 \text{ for } 1 \leq s \leq r) = \prod_{s=1}^{r}\mathbb{P}(I_{n_s} = 1).$$

Since F is continuous, the Xi take distinct values with probability 1 , and furthermore the ranking of XI , X 2 , . . . , Xn is equally likely to be any of the n ! available. Let X I , x2 , . . . , Xn be distinct reals, and write A = {Xi = Xi for 1 ::s i ::s n} . Now,

lP'(lns = 1 for 1 ::s s ::s r I A)

1 { (n - 1) . } { (n 1 - 1) } = , (n - 1 - ns- l ) ! s- (ns- l - 1 - ns-2) ! . . . (n l - I) ! n . ns- l ns-2 1 1 1 = - 0 -- . . . -

and the claim follows on averaging over the $x_i$.
(b) We have that $\mathbb{E}(I_k) = \mathbb{P}(I_k = 1) = k^{-1}$ and $\mathrm{var}(I_k) = k^{-1}(1 - k^{-1})$, whence $\sum_k \mathrm{var}(I_k/\log k) < \infty$. By the independence of the $I_k$ and the martingale convergence theorem (7.8.1), $\sum_{k=1}^{\infty}(I_k - k^{-1})/\log k$ converges a.s. Therefore, by Kronecker's lemma (see Exercise (7.8.2)),
$$\frac{1}{\log n}\sum_{j=1}^{n}\Bigl(I_j - \frac{1}{j}\Bigr) \overset{\mathrm{a.s.}}{\longrightarrow} 0 \qquad \text{as } n \to \infty.$$
The result follows on recalling that $\sum_{j=1}^{n} j^{-1} \sim \log n$ as $n \to \infty$.
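Part (b) says in effect that the number of records $I_1 + \cdots + I_n$ grows like $\log n$. A short simulation (assuming NumPy, with uniform samples standing in for any continuous $F$) makes this visible.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 1_000_000

x = rng.random(n)                                     # continuous distribution, so no ties a.s.
records = np.cumsum(x == np.maximum.accumulate(x))    # running count of records, sum_{j<=k} I_j

for k in (10**2, 10**4, 10**6):
    print(k, int(records[k - 1]), "log k =", round(np.log(k), 2))
```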

37. By an application of the three series theorem of Problem (7. 1 1 .35), the series converges almost surely.


8 Random processes

8.2 Solutions. Stationary processes

1. With aj (n) = IP'(Xn = i ) , we have that

cov(Xm , Xm+n) = IP'(Xm+n = 1 I Xm = 1 )IP'(Xm = 1) - 1P'(Xm+n = 1 )1P'(Xm = 1 ) = al (m)Pl 1 (n) - al (m)al (m + n) ,

and therefore,

al (m)pl 1 (n) - a l (m)a l (m + n) p (Xm , Xm+n) = -r=7��==��7=��r===7=�� Val (m) ( 1 - al (m))a l (m + n) ( 1 - al (m + n)) Now, al (m) � a/(a + fJ) as m � 00, and

a fJ n Pl 1 (n) = -- + -- ( 1 - a - fJ) , a + fJ a + fJ

whence p(Xm , Xm+n) � (1 - a - fJ)n as m � 00. Finally,

lim � t lP'(Xr = 1 ) = _a_ . n�oo n r=l a + fJ

The process is strictly stationary if and only if Xo has the stationary distribution.

2. We have that lE(T(t)) = 0 and var(T(t)) = var(To) = 1 . Hence:

(a) p(T(s) , T(s + t)) = lE(T (s )T (s + t)) = lE [(_ 1 )N(t+s)-N(s) ] = e-2M . (b) Evidently, lE(X (t)) = 0, and

lE[X (t)2] = lE (ll T(u)T (v) dU dV)

= 2 r lE (T(u)T (v) ) du dv = 2 t rv

e-2A(v-u) du dv }o<u<v<t }v=o }u=o 1 ( 1 1 -2M) t = i t -

2)" +

2)" e '" i as t � 00.

3. We show first the existence of the limit )., = limt-l-O g(t)/t , where g(t) = IP'(N(t) > 0) . Clearly,

g(x + y) = IP' (N(x + y) > 0)

= IP'(N(x) > 0) + IP'({N(x) = OJ n {N(x + y) - N(x) > OJ) ::::: g (x) + g(y) for x, y � O.


Such a function g is called subadditive, and the existence of A follows by the subadditive limit theorem discussed in Problem (6. 15 . 14). Note that A = 00 is a possibility.

Next, we partition the interval (0, 1 ] into n equal sub-intervals, and let In (r) be the indicator function of the event that at least one arrival lies in ( r - l)ln , r In] , 1 :s r :s n . Then E�=l In (r) t N(I) as n -+ 00, with probability 1 . By stationarity and monotone convergence,

E (N ( I )) = E ( lim � In (r)) = lim E (� In (r)) = lim ng(n- 1 ) = A . n->oo L.J n->oo L.J n->oo r= l r= l

8.3 Solutions. Renewal processes

1. See Problem (6. 1 5.8).

2. With X a certain inter-event time, independent of the chain so far,

{ X - 1 if Bn = 0, Bn+l = Bn - 1 if Bn > 0.

Therefore, B is a Markov chain with transition probabilities Pi, i- l = 1 for i > 0, and POj = /j+l for j ::: 0, where In = IP'(X = n) . The stationary distribution satisfies IT} = JIj+l + 7rO/j+l , j ::: 0, with solution 7rj = IP'(X > j)/E(X), provided E(X) is finite.

The transition probabilities of B when reversed in equilibrium are

_ 7rHl IP'(X > i + 1 ) Pi,Hl = ---;r;- = IP'(X > i ) ,

- IHI PiO = IP'(X > i ) , for i ::: 0.

These are the transition probabilities of the chain U of Exercise (8.3 . 1 ) with the /j as given.

3. We have that pnun = E�= l pn-kun_kpk /k, whence Vn = pnun defines a renewal sequence provided p > ° and En pn In = 1 . By Exercise (8 .3 . 1 ), there exists a Markov chain U and a state s such that Vn = IP'(Un = s) -+ 7rs , as n -+ 00, as required.

4. Noting that N (0) = 0,

00 00 r 00 00 L E(N (r))sr = L L ukSr = L Uk L sr r=O r=l k=l k=l r=k

= � UkSk = U(s) - 1 = F (s)U (s) . L.J l - s l - s l - s k= l

Let Sm = EZ'=l Xk and So = 0 . Then IP'(N(r) = n) = IP'(Sn :s r) - 1P'(Sn+l :s r ) , and

Now,

00 [ (N(t) + k) ] 00 00 ( + k) � SIE k = � SI � n

k (IP'(Sn :s t) - 1P'(Sn+l :s t))

00 [ 00 ( + k 1) ] = L / 1 + L n

k _ � IP'(Sn :s t) . 1=0 n= l


whence, by the negative binomial theorem,

00 t [ (N(t) + k) ] I U(S)k

� S JE k

= (1 - s) ( I - F(s))k = l - s ·

5. This is an immediate consequence of the fact that the interarrival times of a Poisson process are exponentially distributed, since this specifies the distribution of the process.

8.4 Solutions. Queues

1. We use the lack-of-memory property repeatedly, together with the fact that, if X and Y are independent exponential variables with respective parameters "A and /-t, then JP'(X < Y) = "A/("A + /-t) . (a) In this case,

I { "A /-t /-t } I { /-t "A "A } I 2"A/-t P = 2 "A + /-t

. "A + /-t

+ "A + /-t

+ 2 "A + /-t .

"A + /-t +

"A + /-t = 2 +

("A + /-t)2 .

(b) If "A < /-t, and you pick the quicker server, p = I _ (_/-t_)2.

"A + /-t 2"A/-t

(c) And finally, p = ("A + /-t)2

.

2. The given event occurs if the time X to the next arrival is less than t, and also less than the time Y of service of the customer present. Now,

3. By conditioning on the time of passage of the first vehicle,
$$\mathbb{E}(T) = \int_0^a (x + \mathbb{E}(T))\lambda e^{-\lambda x}\,dx + a e^{-\lambda a},$$
and the result follows. If it takes a time $b$ to cross the other lane, and so $a + b$ to cross both, then, with an obvious notation,
$$\text{(a)} \qquad \mathbb{E}(T_a) + \mathbb{E}(T_b) = \frac{e^{a\lambda} - 1}{\lambda} + \frac{e^{b\mu} - 1}{\mu},$$
$$\text{(b)} \qquad \mathbb{E}(T_{a+b}) = \frac{e^{(a+b)(\lambda+\mu)} - 1}{\lambda + \mu}.$$

The latter must be the greater, by a consideration of the problem, or by turgid calculation.
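A brief simulation of the single-lane case, assuming NumPy and interpreting $T$ as the instant at which the pedestrian reaches the far side (waiting time plus the crossing time $a$, as in the recursion above), agrees with the value $\mathbb{E}(T_a) = (e^{a\lambda} - 1)/\lambda$.

```python
import numpy as np

rng = np.random.default_rng(10)
lam, a = 1.0, 2.0                          # traffic intensity and time needed to cross

def crossing_time(lam, a, rng):
    t = 0.0
    while True:
        gap = rng.exponential(1.0 / lam)   # time until the next vehicle
        if gap >= a:
            return t + a                   # large enough gap: cross now
        t += gap                           # otherwise wait for this vehicle to pass

times = [crossing_time(lam, a, rng) for _ in range(100_000)]
print(np.mean(times), "theory:", (np.exp(lam * a) - 1) / lam)
```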

4. Look for a solution of the detailed balance equations

"A (n + 1 ) /-trrn+l =

n + 2 rrn , n ::: O.

to find that rrn = pnrro/(n + l ) is a stationary distribution ifp < I , in which case rro = -pj log( l - p) . Hence L::n nrrn = "Arro/(/-t - "A) , and by the lack-of-memory property the mean time spent waiting for service is prro/(/-t - "A). An arriving customer joins the queue with probability


5. By considering possible transitions during the interval (t, t + h) , the probability Pi (t) that exactly i demonstrators are busy at time t satisfies:

Hence,

P2 (t + h) = PI (t)2h + P2 (t) ( 1 - 2h) + o(h) , P I ( t + h) = po (t)2h + PI (t) ( 1 - h) ( 1 - 2h) + P2 (t)2h + o(h) , Po (t + h) = po (t)( 1 - 2h) + PI (t)h + o(h ) .

p� (t) = 2Pl (t) - 2P2 (t) , PI (t) = 2po (t) - 3Pl (t) + 2p2 (t) , po et) = -2po (t) + PI (t) ,

and therefore P2 (t) = a + be-2t + ce-5t for some constants a, b, c . By considering the values of P2 and its derivatives at t = 0, the boundary conditions are found to be a + b + c = 0, -2b - 5c = 0, 4b + 25c = 4, and the claim follows.

8.5 Solutions. The Wiener process

1. We might as well assume that W is standard, in that u2 = 1 . Because the joint distribution is multivariate normal, we may use Exercise (4.7 .5) for the first part, and Exercise (4.9.8) for the second, giving the answer

- + - sm - + sm - + sm - . 1 1 { . - 1 V; . - 1 Jr . - 1 Jr} 8 4n t u u

2. Writing W(s) = .fix, Wet) = ..jiz, and W(u) = ,J'UY, we obtain random variables X, Y, Z with the standard trivariate normal distribution, with correlations PI = ...jSfU, P2 = .filli, P3 = .fiTi. By the solution to Exercise (4.9.9),

var(Z I X, y) = (u - t)(t - s) , t (u - s) yielding var(W(t) I W(s) , W(u)) as required. Also,

{ [ (u - t)W(s) + (t - S)W(U) ] I } lE (W(t)W(u) I W(s) , W(v)) = lE u _ s W(u) W(s) , W(v) ,

which yields the conditional correlation after some algebra.

3. Whenever a2 + b2 = 1 .

4. Let Llj (n) = W«j + l )t/n) - W(jt/n) . By the independence of these increments,

5. They all have mean zero and variance t, but only (a) has independent normally distributed incre­ments.


8.7 Solutions to problems

1. lE(Yn) = 0, and cov(Ym , Ym+n) = Ei=o ajan+j for m , n :::: 0, with the convention that ak = 0 for k > r . The covariance does not depend on m, and therefore the sequence is stationary.

2. We have, by iteration, that Yn = Sn (m) + am+1 Yn-m- l where Sn (m) = Ej=o aj Zn-j . There are various ways of showing that the sequence {Sn (m) : m :::: I } converges in mean square and almost surely, and the shortest is as follows. We have that am+! Yn-m- l --+ 0 in m.s. and a.s. as m --+ 00; to see this, use the facts that var(am+! Yn-m- l ) = a2(m+1) var(Yo) , and

E > O.

It follows that Sn (m) = Yn - am+ 1 Yn-m- l converges in m.s. and a.s. as m --+ 00. A longer route to the same conclusion is as follows. For r < s ,

whence {Sn (m) : m :::: I } i s Cauchy convergent in mean square, and therefore converges in mean square. In order to show the almost sure convergence of Sn (m) , one may argue as follows. Certainly

whence E�o aj Zn-j is a.s. absolutely convergent, and therefore a.s. convergent also. We may

express limm .... Hlo Sn (m) as E�o ajzn-j . Also, am+! Yn_m_ l --+ 0 in mean square and a.s . as m --+ 00, and we may therefore express Yn as

00

Yn = E aj Zn-j a.s. j=o

It follows that lE(Yn) = limm�oo lE(Sn (m)) = O. Finally, for r > 0, the autocovariance function is given by

c(r) = cov(Yn , Yn-r ) = lE{ (aYn- l + Zn)Yn-r } = ac(r - 1 ) , whence

since c(O) = var(Yn ) .

a lr l c(r) = a lr l c(O) = --2 '

I - a r = . . . , - 1 , 0 , 1 , . . . ,

3. Ift is a non-negative integer, N(t) is the number of O's and 1 ' s preceding the (t + l )th 1 . Therefore N(t) + 1 has the negative binomial distribution with mass function

k :::: t + 1 .

If t is not an integer, then N(t) = N(ltJ ) .


4. We have that { Ah + o(h ) if j = i + 1 ,

lP (Q (t + h) = j I Q (t) = i ) = /1-ih + o(h) if j = i - I , 1 - (A + /1-i )h + o(h) if j = i ,

an immigration-death process with constant birth rate A and death rates /1-i = i /1-. Either calculate the stationary distribution in the usual way, or use the fact that birth-death

processes are reversible in equilibrium. Hence Ani = /1-(i + l )ni+l for i ::: 0, whence

n . = - - e-J..//L 1 ( A ) i I i ! /1- ' i ::: 0.

5. We have that X (t) = R cos (Ill) cos(Ot) - R sin(llI) sin(Ot) . Consider the transformation u = r cos 1{1, v = -r sin 1{1, which maps [0, (0) x [0, 2n) to R2 . The Jacobian is

au au ar a1{l av av = -r, ar a1{l

whence U = R cos IV , V = -R sin III have joint density function satisfying

r fu, v (r cos 1{1, -r sin 1{1) = fR, IJ/ (r, 1{1) .

1 ( 2 2 ) Substitute fu, v (u , v) = e-Z U +V /(2n) , to obtain

r > 0, ° � 1{1 < 2n.

Thus R and IV are independent, the latter being uniform on [0, 2n) . 6. A customer arriving at time u i s designated green i f he i s i n state A at time t , an event having probability p(u, t - u) . By the colouring theorem (6. 13 . 14), the arrival times of green customers form a non-homogeneous Poisson process with intensity function A (U)p(U , t - u) , and the claim follows.


9 Stationary processes

9.1 Solutions. Introduction

1. We examine sequences Wn of the form

00

Wn = L:: ak Zn-k k=O

for the real sequence {ak : k � OJ. Substitute, to obtain ao = 1 , al = a, ar = aar- l + f3ar-2 , r � 2, with solution

if a2 + 4f3 = 0,

otherwise,

where )'1 and A2 are the (possibly complex) roots of the quadratic x2 - ax - f3 = 0 (these roots are distinct if and only if a2 + 4f3 #- 0).

Using the method in the solution to Problem (8.7.2), the sum in (*) converges in mean square and almost surely if IA l l < 1 and IA2 1 < 1 . Assuming this holds, we have from (*) that E(Wn) = 0 and the autocovariance function is

c(m) = E(Wn Wn-m ) = ac(m - 1 ) + f3c(m - 2) , m � 1 ,

by the independence of the Zn . Therefore W is weakly stationary, and the autocovariance function may be expressed in terms of a and f3 .

2 . We adopt the convention that, if the binary expansion of U i s non-unique, then we take the (unique) non-terminating such expansion. It is clear that Xi takes values in {O, I } , and

lP' (Xn+l = 1 1 Xi = Xi for 1 � i � n) = �

for all Xl , x2 , . . . , Xn ; therefore the X 's are independent Bernoulli random variables. For any se­quence kl < k2 < . . . < kr , the joint distribution of Vkj ' Vk2 ' . . . , Vkr depends only on that of Xkj +l ' Xkj +2 , . . . . Since this distribution is the same as the distribution of Xl , X2 , . . . , we have that (Vkj , Vk2 ' . . . , Vkr ) has the same distribution as (Vo , Vk2 -kj , . . . , Vkr -kj ) ' Therefore V is strongly stationary.

Clearly E(Vn) = E(Vo) = i , and, by the independence of the Xi ,

00

cov(Vo , Vn) = L:: r2i-n var(Xi ) = -fi (i )n . i= l


3. (i) For mean-square convergence, we show that Sk = L:�=o anXn is mean-square Cauchy con­vergent as k --+ 00. We have that, for r < s ,

since I c(m) I :5 c(O) for all m, by the Cauchy-Schwarz inequality. The last sum tends to 0 as r, s --+ 00 if L:i lai I < 00. Hence Sk converges in mean square as k --+ 00.

Secondly,

E (t l akXk l) :5 t lak l · E IXk l :5 VE(X5) t lak l k=l k= l k=l

which converges as n --+ 00 if the l ak l are summable. It follows that L:k=l lakXk l converges absolutely (almost surely), and hence L:k=l akXk converges a.s . (ii) Each sum converges a . s . and in mean square, by part (i). Now

whence

00

cy (m) = L ajakc(m + k - j) j ,k=O

4. Clearly Xn has distribution 1C for all n, so that {f(Xn ) : n :::: m} has fdds which do not depend on the value of m . Therefore the sequence is strongly stationary.

1. (i) We have that

9.2 Solutions. Linear prediction

which is minimized by setting a = c( l )/c(O) . Hence Xn+! = c( 1 )Xn/c(O) . (ii) Similarly

(**) E{ (Xn+ ! - f3Xn - y Xn_ t >2 } = ( 1 + f32 + y2)c(O) + 2f3 (y - l)c(l ) - 2yc(2) ,

an expression which is minimized by the choice

c( l ) (c(O) - c(2») f3 = c(O)2 _ c( 1 )2 '

Xn+ l is given accordingly.

c(O)c(2) - c(1 )2 y - . - c(O)2 - c(1 )2 '

(iii) Substitute a, f3, y into (*) and (**), and subtract to obtain, after some manipulation,

{c( 1 )2 - c(O)c(2) }2 D - ------,.-----..,.-- c(O) {c(O)2 - c( 1 )2 } .


1 � -(a) In this case c(O) = 2;' and c( 1 ) = c (2) = 0. Therefore Xn+1 = Xn+1 = 0, and D = 0. (b) In this case D = ° also.

In both (a) and (b), little of substance is gained by using Xn+1 in place of Xn+1 . 2. Let {Zn : n = . . . , - 1 , 0, 1 , . . . } be independent random variables with zero means and unit variances, and define the moving-average process

Zn + aZn- 1 Xn = Vi + a2 .

It is easily checked that X has the required autocovariance function.

By the projection theorem, Xn - Xn is orthogonal to the collection {Xn-r : r > I } , so that � � 00 • E{(Xn - Xn)Xn-r } = 0, r � 1 . Set Xn = 2:s=1 bsXn-s to obtam that

for s � 2,

where a = a/( 1 + a2) . The unique bounded solution to the above difference equation is bs ( _ 1 )s+1 as , and therefore

00

Xn = L (_ 1 )s+1 as Xn-s · s= 1

The mean squared error of prediction is
$$\mathbb{E}\{(X_n - \widehat{X}_n)^2\} = \mathbb{E}\bigl\{(Z_n/\sqrt{1+a^2})^2\bigr\} = \frac{1}{1+a^2},$$
since $X_n - \widehat{X}_n = \sum_{s=0}^{\infty}(-a)^s X_{n-s} = Z_n/\sqrt{1+a^2}$.

Clearly $\mathbb{E}(\widehat{X}_n) = 0$ and
$$\mathrm{cov}(\widehat{X}_n, \widehat{X}_{n-m}) = \sum_{r,s=1}^{\infty} b_r b_s c(m + r - s), \qquad m \geq 0,$$
so that $\widehat{X}$ is weakly stationary.
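The predictor may be checked numerically. The sketch below assumes NumPy, takes $a = 0.5$ for illustration, and truncates the series after 20 terms (the coefficients decay geometrically); the empirical prediction error has variance close to $1/(1+a^2)$, in agreement with the mean squared error above.

```python
import numpy as np

rng = np.random.default_rng(11)
a, n = 0.5, 500_000

z = rng.normal(size=n + 1)
x = (z[1:] + a * z[:-1]) / np.sqrt(1 + a ** 2)   # the moving-average process X_n

k = 20                                           # truncation point of the predictor series
s = np.arange(1, k + 1)
b = (-1.0) ** (s + 1) * a ** s                   # b_s = (-1)^{s+1} a^s

pred = np.zeros(n)
for b_s, lag in zip(b, s):
    pred[k:] += b_s * x[k - lag:n - lag]         # X_hat_n = sum_s b_s X_{n-s}

err = x[k:] - pred[k:]
print(np.var(err), "theory 1/(1+a^2) =", 1 / (1 + a ** 2))
```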

9.3 Solutions. Autocovariances and spectra

1. It is clear that E(Xn) = ° and var(Xn ) = 1 . Also

cov(Xm , Xm+n) = cos(mA) cos{ (m + n)A} + sin(mA) sin{ (m + n)A } = cos(nA) ,

so that X is stationary, and the spectrum of X is the singleton {A } .

2. Certainly ¢u (t) = (eitrr - e-itrr )/ (27r it) , so that E(Xn) = ¢u ( 1 )¢v (n) = 0. Also

cov(Xm , Xm+n) = E(XmXm+n) = E (ei {U-Vm-U+V(m+n) } ) = ¢v (n) ,

whence X i s stationary. Finally, the autocovariance function is

c(n) = ¢v (n) = J einJ... dF(A) ,

whence F i s the spectral distribution function.


3. The characteristic functions of these distributions are

(i)

(ii)

1 2 p et) = e- '1. t , 1 ( 1 1 ) 1 p et) = 2" 1 - i t + 1 + i t = 1 + t2 .

4. (i) We have that

( I n ) 1 n c(O) j ( n . . ) var - L Xj = 2" L cov(Xj , Xk) = -2 L i (j-k)A dF()") .

n j=l n j,k= l n (-11',11'] j,k=l

The integrand is

whence

/ n eijA / 2 = ( ei�A _ l ) ( e-i�A _ l ) = l - cos(n)..)

, L elA - 1 e-1A - 1 1 - cos ).. j= l

( 1 � ) j ( Sin(n)"/2» ) 2

var - L..J Xj = c (O) dF()") . n j= l (-11',11'] n sin()"/2)

It is easily seen that I sin 8 I � 18 I , and therefore the integrand is no larger than

( ),,/2 ) 2 1 2

sin()"/2) � (�n) .

As n � 00, the integrand converges to the function which is zero everywhere except at the origin, where (by continuity) we may assign it the value 1 . It may be seen, using the dominated convergence theorem, that the integral converges to F (0) - F (0-), the size of the discontinuity of F at the origin, and therefore the variance tends to 0 if and only if F (0) - F (0-) = O.

Using a similar argument,

$\frac{1}{n}\sum_{j=0}^{n-1} c(j) = \frac{c(0)}{n}\int_{(-\pi,\pi]} \sum_{j=0}^{n-1} e^{ij\lambda}\, dF(\lambda) = c(0)\int_{(-\pi,\pi]} g_n(\lambda)\, dF(\lambda),$

where

$g_n(\lambda) = \begin{cases} 1 & \text{if } \lambda = 0, \\ \dfrac{e^{in\lambda} - 1}{n(e^{i\lambda} - 1)} & \text{if } \lambda \ne 0, \end{cases}$

is a bounded sequence of functions which converges as before to the Kronecker delta function $\delta_{\lambda 0}$. Therefore

$\frac{1}{n}\sum_{j=0}^{n-1} c(j) \to c(0)\bigl(F(0) - F(0-)\bigr) \quad \text{as } n \to \infty.$


9.4 Solutions. Stochastic integration and the spectral representation

1. Let H x be the space of all linear combinations of the Xi , and let H X be the closure of this space, that is, Hx together with the limits of all mean-square Cauchy-convergent sequences in Hx . All members of Hx have zero mean, and therefore all members of H X also. Now S(A) E H X for all A, whence lE(S(A) - S(p,)) = 0 for all A and J-L. 2. First, each Y m lies in the space H X containing all linear combinations of the X n and all limits of mean-square Cauchy-convergent sequences of the same form. As in the solution to Exercise (9.4. 1) , all members of H X have zero mean, and therefore lE(Ym ) = 0 for all m . Secondly,

As for the last part,

This proves that such a sequence Xn may be expressed as a moving average of an orthonormal sequence.

3. Let H X be the space of all linear combinations of the Xn , together with all limits of (mean­square) Cauchy-convergent sequences of such combinations. Using the result of Problem (7. 1 1 . 19), all elements in H X are normally distributed. In particular, all increments of the spectral process are normal. Similarly, all pairs in H X are jointly normally distributed, and therefore two members of H X are independent if and only if they are uncorrelated. Increments of the spectral process have zero means (by Exercise (9.4. 1 )) and are orthogonal. Therefore they are uncorre1ated, and hence independent.

9.5 Solutions. The ergodic theorem

1. With the usual shift operator . , it is obvious that .- 1 0 = 0, so that 0 E 1 . Secondly, if A E 1 , then .- 1 (AC ) = (.- 1 A)C = AC , whence AC E 1 . Thirdly, suppose A I , A2 , . . . E 1 . Then

so that Uf Ai E 1 .

( 00 ) 00 00 .- 1 U Ai = U .- 1 Ai = U Ai '

i=1 i=1 i= 1

2. The left-hand side is the sum of covariances, c(O) appearing n times, and c(i ) appearing 2(n - i ) times for 0 < i < n, in agreement with the right-hand side.

Let E > O. If e(j) = rl 'E{�� c(i ) -+ (12 as j -+ 00, there exists J such that le(j) - (12 1 < E when j � J . Now

2 n 2 { J n } "2 L jC(j) :::; "2 L je(j) + L j «(12 + E) -+ (12 + E

n j= 1 n j= 1 j=J+ 1

as n -+ 00. A related lower bound is proved similarly, and the claim follows since E (> 0) is arbitrary.


3. It is easily seen that Sm = 2::r=0 ai Xn+i constitutes a martingale with respect to the X's, and m 00

E(S; ) = l: a[E(X;+i ) � l: a[, i=O i=O

whence Sm converges a.s . and in mean square as m --+ 00. Since the Xn are independent and identically distributed, the sequence Yn is strongly stationary;

also E(Yn) = 0, and so n- 1 2::7=1 Yi --+ Z a.s. and in mean, for some random variable Z with mean zero. For any fixed m (� 1 ) , the contribution of XI , X2 , . . . , Xm towards 2::7=1 Yi is, for large n, no larger than

Cm = It (ao + al + . . . + aj_ l )Xj I · J=1

Now n- 1Cm --+ 0 as n --+ 00, so that Z i s defined in terms of the subsequence Xm+l ' Xm+2 , . . . for all m, which is to say that Z is a tail function of a sequence of independent random variables. Therefore Z is a.s. constant, and so Z = 0 a.s.

9.6 Solutions. Gaussian processes

1. The quick way is to observe that c is the autocovariance function of a Poisson process with intensity 1 . Alternatively, argue as follows. The sum is unchanged by taking complex conjugates, and hence is real. Therefore it equals

where to = O. 2. For s, t � 0, X(s) and X(s + t) have a bivariate normal distribution with zero means, unit variances, and covariance c(t) . It is standard (see Problem (4. 14. l 3)) that E(X(s + t) I X(s)) = c(t)X (s) . Now

c(s + t) = E (X (O)X (s + t) ) = E{ E (X(O)X(s + t) I X(O) , X (s)) } = E (X (O)c(t )X(s)) = c(s)c(t)

by the Markov property. Therefore c satisfies c(s + t) = c(s )c(t), c(O) = 1, whence c(s) = c( 1 ) ls l = p is i . Using the inversion formula, the spectral density function is

00 2 1 " . )" I - p f()..) = - L c(s)e-IS = .)" 2 ' 27r s=-oo 27r 1 1 - pel I

Note that X has the same autocovariance function as a certain autoregressive process. Indeed, stationary Gaussian Markov processes have such a representation. 3. If X is Gaussian and strongly stationary, then it is weakly stationary since it has a finite variance. Conversely suppose X is Gaussian and weakly stationary. Then c(s , t) = cov(X(s ) , X (t)) depends


on t - s only. The joint distribution of X(tI ) , X(t2) , . . " X(tn) depends only on the common mean and the covariances C(ti ' tj ) . Now C(ti ' tj ) depends on tj - ti only, whence X(tI ) , X (t2) , . . . , X (tn ) have the same joint distribution as Xes + tI ) , x es + t2 ) , . . . , xes + tn ) . Therefore X is strongly stationary.

4. (a) If s , t > 0, we have from Problem (4. 14. l3) that

whence

COV (X(S)2 , Xes + t)2 ) = E (X (s)2 x es + t)2 ) - 1 = E{ X(s)2E (X (s + t)2 1 Xes)) } - 1

= c(t)2E(X(s)4) + ( 1 - c(t)2 )E(X(s )2 ) - 1 = 2c(t)2

by an elementary calculation.

(b) Likewise cov(X(s)3 , x es + t)3 ) = 3 (3 + 2c(t)2 )c(t) .

9.7 Solutions to problems

1. It is easily seen that Yn = Xn + (a - P)Xn- I + PYn- I , whence the autocovariance function c of Y is given by { _1 _+_a_

2-,_.--p_2

'f k - 0 1 _ p2 1 - ,

c(k) = { 2 } p lk l - I a( 1 + ap - p )

if k :;6 O. 1 - p2

Set Yn+I = I:�o ai Yn-i and find the ai for which i t is the case that E{(Yn+I - Yn+I ) Yn-k } = 0 for k � O. These equations yield

00

c(k + 1 ) = l: aic(k - i ) , i=O

which have solution ai = a(p - ai for i � O. 2. The autocorrelation functions of X and Y satisfy

Therefore

r

O'ipx (n) = O'� l: ajakPy (n + k - j) . j ,k=O

0'2 00 r

O'ifx O.) = 2; l: e-inJ.. l: ajakPy (n + k - j) n=-oo j,k=O

2 r 00 = O'y "" a 'akei (k-j )J.. "" e-i (n+k-j)J.. py (n + k _ j) 2n L..J J L..J j ,k=O n=-oo

2 'J.. 2 = O'y lGa (ei ) 1 fy ()..) ·


In the case of exponential smoothing, Ga (eiA ) = ( 1 - j.t)/( 1 - j.teiA) , so that

fxO .. ) = c( l - j.t)2 fy ()..) ,

1 - 2j.t cos ).. + j.t2 1).. 1 < ](,

where c = a � / a i is a constant chosen to make this a density function.

3. Consider the sequence {Xn } defined by

Xn = Yn - Yn = Yn - aYn- l - {3Yn-2 ·

Now Xn is orthogonal to { Yn-k : k � I } , so that the Xn are uncorrelated random variables with spectral density function fx()..) = (2]( )- 1 , ).. E (-](, ]() . By the result of Problem (9.7.2),

whence

ai fx ()..) = a� l l - aeiA - {3e2iA 1 2 fy ()..) ,

a2 /a2 fy ()..) = x'A y n 2 ' 2]( 1 1 - ael - {3e I I -]( < ).. < ](.

4. Let {X� : n � I} be the interarrival times of such a process counted from a time at which a meteorite falls. Then Xl ' Xz , ' " are independent and distributed as X2 . Let Y� be the indicator function of the event {X:n = n for some m} . Then

E(Ym Ym+n) = lP'(Ym = 1 , Ym+n = 1 )

= lP'(Ym+n = 1 I Ym = l )lP'(Ym = 1 ) = lP'(Y� = l)a

where a = lP'(Ym = 1) . The autocovariance function of Y is therefore c(n) = a{lP'(Y� = 1 ) - a}, n � 0, and Y is stationary.

The spectral density function of Y satisfies

Now

1 � . A c(n) { I � . A I } fy ()..) = - L e-In = Re L eln c(n) - - . 2]( n=-oo a ( 1 - a) ](a( 1 - a) n=O 2](

00 00 L einA y� = L eiAT� n=O n=O

where Th = Xl + Xz + . . . + X� ; just check the non-zero terms. Therefore

when eiA :f. 1 , where ¢ is the characteristic function of X2 . It follows that

fy ()..) = Re - --. - - , 1 { I a

} 1

]( ( 1 - a) 1 - ¢ ()..) 1 - elA 2]( 1).. 1 < ](.

5. We have that

1:n; 1

E (cos(n U) ) = - cos(nu) du = 0, -:n; 2](


for n � 1 . Also

lE (cos(m U) cos(n U) ) = lEU (cos[(m + n)U] + cos[(m - n)Ul) } = 0

if m =1= n . Hence X is stationary with autocorrelation function p (k) = 8kO , and spectral density function f(A) = (2Jr ) - 1 for IA I < Jr . Finally

lE { cos(m U) cos(n U) cos(r U) } = ilE{ (cos[(m + n)U] + cos[(m - n)Ul) cos(r U) }

= Hp(m + n - r) + p (m - n - r) }

which takes different values in the two cases (m , n, r) = ( 1 , 2 , 3) , (2, 3 , 4) .

6. (a) The increments of N during any collection of intervals { (Ui , Vi ) : 1 :::: i :::: n } have the same fdds if all the intervals are shifted by the same constant. Therefore X is strongly stationary. Certainly lE(X (t)) = Aa for all t, and the autocovariance function is

c(t) = cov (X (O) , X (t) ) = { 0

A(a - t)

if t > a,

if O :::: t :::: a .

Therefore the autocorrelation function is

{ 0 if I t I > a, p et) = 1 - It/a l if I t I :::: a,

which we recognize as the characteristic function of the spectral density f (A) = { 1 -cos( aA) } / (aJr A 2) ; see Problems (S. 12.27b, 28a). (b) We have that lE(X(t)) = 0; furthermore, for s :::: t , the correlation of X (s ) and X (t) is

1 1 Z-cov (X (s) , X (t) ) = Z-cov (W(s) - W(s - 1 ) , Wet) - Wet - 1 )) (J (J

= s - min{s, t - I } - (s - 1 ) + (s - 1 )

{ I if s :::: t - 1 , = s - t + l if t - l < s :::: t .

This depends on t - s only, and therefore X i s stationary; X i s Gaussian and therefore strongly stationary also.

The autocorrelation function is

(h _ { 0 if I h l � 1 , p ) -1 - Ih l if I h l < 1 ,

which we recognize as the characteristic function of the density function $f(\lambda) = (1 - \cos\lambda)/(\pi\lambda^2)$.

7. We have from Problem (8.7.1) that the general moving-average process of part (b) is stationary with autocovariance function $c(k) = \sum_{j=0}^{r} a_j a_{k+j}$, $k \ge 0$, with the convention that $a_s = 0$ if $s < 0$ or $s > r$.
(a) In this case, the autocorrelation function is

$\rho(k) = \begin{cases} 1 & \text{if } k = 0, \\ \dfrac{a}{1 + a^2} & \text{if } |k| = 1, \\ 0 & \text{if } |k| > 1, \end{cases}$

whence the spectral density function is

$f(\lambda) = \frac{1}{2\pi}\bigl(\rho(0) + e^{i\lambda}\rho(1) + e^{-i\lambda}\rho(-1)\bigr) = \frac{1}{2\pi}\left(1 + \frac{2a\cos\lambda}{1 + a^2}\right), \qquad |\lambda| < \pi.$

(b) We have that

$f(\lambda) = \frac{1}{2\pi}\sum_{k=-\infty}^{\infty} e^{-ik\lambda}\rho(k) = \frac{1}{2\pi c(0)}\sum_j a_j e^{ij\lambda}\sum_k a_{k+j}\, e^{-i(k+j)\lambda} = \frac{|A(e^{i\lambda})|^2}{2\pi c(0)},$

where $c(0) = \sum_j a_j^2$ and $A(z) = \sum_j a_j z^j$. See Problem (9.7.2) also.

8. The spectral density function f is given by the inversion theorem (5.9. 1 ) as

f(x) = - e-1 txp (t) dt 1 100 .

2n - 00

under the condition fooo Ip (t) I dt < 00; see Problem (5. 12 .20) . Now

and

1 100 I f (x) 1 ::s - I p (t) 1 dt 2n -00

I f (x + h) - f(x ) 1 ::s _1 100 l eith - 1 1 . Ip (t) 1 dt . 2n -00

The integrand is dominated by the integrable function 2 Ip (t) l . Using the dominated convergence theorem, we deduce that I f (x + h) - f (x) I � 0 as h � 0, uniformly in x . 9 . By Exercise (9 .5 .2) , var (n- l '£J= l Xj ) � (]'2 if Cn = n- l ,£j:J cov(Xo, Xj ) � (]'2 . If cov(Xo, Xn) � 0 then Cn � 0, and the result follows.

10. Let Xl , X2 , ' " be independent identically distributed random variables with mean f..t . The se­quence X is stationary, and it is a consequence of the ergodic theorem that n- l '£J=l Xj � Z a.s. and in mean, where Z is a tail function of Xl , X2 , . . . with mean f..t . Using the zero--one law, Z is a.s. constant, and therefore IP'(Z = f..t) = 1 . 1 1 . We have from the ergodic theorem that n- l '£1=1 Yi � JE(Y 1 1) a.s. and in mean, where 1 is the (]'-fie1d of invariant events. The condition of the question is therefore

JE(Y 1 1 ) = JE(Y) a.s . , for all appropriate Y.

Suppose (*) holds. Pick A E 1 , and set Y = IA to obtain IA = Q(A) a.s. Now IA takes the values 0 and 1 , so that Q(A) equals 0 or 1 , implying that Q is ergodic. Conversely, suppose Q is ergodic. Then JE(Y 1 1) is measurable on a trivial (]'-fie1d, and therefore equals JE(Y) a.s .

12. Suppose Q is strongly mixing. If A is an invariant event then A = .. -n A. Therefore Q(A) = Q(A n .. -n A) � Q(A)2 as n � 00, implying that Q(A) equals 0 or 1 , and therefore Q is ergodic.

13. The vector X = (Xl , X2 , . . . ) induces a probability measure Q on (lRT , JRT) . Since T is measure­preserving, Q is stationary. Let Y : ]RT � ]R be given by Y(x) = Xl for x = (Xl , x2 , . . . ), and define Yi (X) = Y('ri- l (x)) where .. is the usual shift operator on ]RT . The vector Y = (Yl , Y2 , . . . ) has the same distributions as the vector X. By the ergodic theorem for Y, n- l '£i:: l Yi � JE(Y I 9.) a.s. and in mean, where 9. is the invariant (]'-field of .. . It follows that the limit

1 n Z = lim - " Xi n--+oo n L.-J i= l


exists a.s . and in mean. Now U = lim suPn�oo (n- I '2:'{ Xi ) is invariant, since

a.s . ,

implying that U(w) = U(Tw) a.s . It follows that U i s i -measurable, and i t i s the case that Z = U a.s. Take conditional expectations of (*), given 1 , to obtain U = lE(X 1 1 ) a.s .

If T is ergodic, then 1 is trivial, so that lE(X 1 1 ) is a.s . constant; therefore lE(X 1 1 ) = lE(X) a.s .

14. If (a , b) � [0, l ) , then T- I (a , b) = (1a , �b) U (� + �a , � + 1b), and therefore T is measurable. Secondly,

P (T- I (a , b) ) = 2(1 b - 1a) = b - a = P ((a , b) ) ,

so that T- I preserves the measure of intervals . The intervals generate 93, and it is then standard that T- I preserves the measures of all events.

Let A be invariant, in that A = T- I A. Let 0 ::::: w < 1 ; it is easily seen that T (w) = T (w + 1 ) .

Therefore W E A if and only if w + i E A, implying that A n [i , 1 ) = 1 + {A n [0, i ) } ; hence

P(A n E) = iP(A) = P(A)lP(E)

This proves that A is independent of both [0, 1) and [1 , 1 ) . A similar proof gives that A is independent of any set E which is, for some n, the union of intervals of the form [k2-n , (k+ 1)2-n ) for O ::::: k < 2n . It is a fundamental result of measure theory that there exists a sequence E I , E2 , . . . of events such that (a) En is of the above form, for each n , (b) P(A f:j. En) --+ 0 as n --+ 00. Choosing the En accordingly, i t follows that

P(A n En) = P(A)P(En) --+ P(A)2 by independence,

IP(A n En) - P(A) I ::::: P(A f:j. En) --+ o. Therefore peA) = P(A)2 so that peA) equals 0 or 1 .

For W E n, expand w in base 2 , w = O.WIWZ · · · , and define Yew) = WI . It is easily seen that Y(Tn-Iw) = Wn , whence the ergodic theorem (Problem (9.7 . 1 3)) yields that n- I '2:1=1 Wi --+ 1 as n --+ 00 for all w in some event of probability 1 .

15. We may as well assume that 0 < a < 1 . Let T : [0, 1 ) --+ [0, 1 ) be given by T (x) = x + a (mod 1) . It is easily seen that T is invertible and measure-preserving. Furthermore T (X) is uniform on [0, 1 ] , and it follows that the sequence ZI , Z2 , . . . has the same fdds as Z2 , Z3 , . . . , which is to say that Z is stationary. It therefore suffices to prove that T is an ergodic shift, since this will imply by the ergodic theorem that

1 n

r l - L Zj --+ lE(ZI ) = io g (u) du o n j= 1 0

We use Fourier analysis . Let A be an invariant subset of [0, 1 ) . The indicator function of A has a Fourier series:

where en (x) = e2rrinx and

00 IA (X) � L anen (x)

n=-oo

1 10 1 1 1 an = - IA (x)e-n (x) dx = - e-n (x) dx .

2rr 0 2rr A


Similarly the indicator function of T- 1 A has a Fourier series,

IT- I A (x) � L bnen (x) n

where, using the substitution y = T(x) ,

since em (y - a) = e-21rimaem (y) . Therefore IT- I A has Fourier series

I ( ) '"' -21rina ( ) T- I A x � L e anen x . n


Now IA = IT- l A since A is invariant. We compare the previous formula with that of (*), and deduce that an = e-21rina an for all n . Since a is irrational, it follows that an = 0 if n "# 0, and therefore IA has Fourier series ao, a constant. Therefore IA is a.s. constant, which is to say that either peA) = 0 or peA) = 1 .

16. Let Gt (z) = JE(zX(t») , the probability generating function of X(t) . Since X has stationary independent increments, for any n (2:: 1 ), X(t) may be expressed as the sum

n X(t) = L {X(it/n) - X« i - l )t/n) }

i= l

of independent identically distributed variables. Hence X(t) is infinitely divisible. By Problem (5. 12 . 1 3), we may write

Gt (z) = e-J.. (t)( 1-A(z»

for some probability generating function A, and some A. (t) . Similarly, X es + t) = X es) + (X es + t) - Xes)} , whence Gs+t (z) = Gs (z)Gt (z) , implying that

Gt (z) = eJ.l.(z)t for some JL(z) ; we have used a little monotonicity here. Combining this with (*), we obtain that Gt (z) = e-J..t ( 1 -A(z» for some A. .

Finally, X (t) has jumps of unit magnitude only, whence the probability generating function A is given by A (z) = z .

17. (a) We have that

X(t) - X (O) = {X (s) - X (0) } + {X(t) - X (s) } , O :::s s :::s t ,

whence, by stationarity,

{m et) - m (O) } = {m (s) - m (O) } + {m et - s) - m (O) } .

Now m is continuous, so that m et) - m (O) = {3t , t 2:: 0, for some {3 ; see Problem (4. 14.5). (b) Take variances of (*) to obtain v et) = v (s) + v et - s) , 0 :::s s :::s t , whence vet) = a2t for some a2 • 18. In the context of this chapter, a process Z is a standard Wiener process if it is Gaussian with Z(O) = 0, with zero means, and autocovariance function c(s , t) = min{s, t } .


(a) Z(t) = exW(t/ex2) satisfies Z(O) = 0, E(Z(t)) = 0, and

cov (Z(s ) , Z(t)) = ex2 min{s/ex2 , t/ex2 } = min{s , t } .

(b) The only calculation of any interest here is

cov(W(s + ex)-W(ex) , Wet + ex) - W(ex)) = c(s + ex, t + ex) - c(ex, t + ex) - c(s + ex, ex) + c(ex, ex) = (s + ex) - ex - ex + ex = s , s :'S. t .

(c) yeO) = 0, and E(V(t)) = o. Finally, if s , t > 0,

cov (V(s ) , Vet)) = stcov(W( 1 /s ) , W( l /t)) = st min{ l /s , 1 f t } = min{t, s } .

(d) Z(t) = W(1 ) - W(1 - t) satisfies Z(O) = 0, E(Z(t)) = O. Also Z is Gaussian, and

cov (Z(s ) , Z(t)) = 1 - ( 1 - s) - ( 1 - t) + min{ l - s , 1 - t } = min{s, t } , O :::; s , t :::; 1 .

19. The process W has stationary independent increments, and G(t) = E( l W(t) 12 ) satisfies G(t) = t -+ 0 as t -+ 0; hence Jooo cjJ (u) dW (u) is well defined for any cjJ satisfying

It is obvious that cjJ (u) = I[O, t] (u) and cjJ (u) = e-(t-u) I[O, t] (u) are such functions . Now X(t) is the limit (in mean-square) of the sequence

n- l Sn (t) = L {W((j + 1 )t/n) - W(jt/n) } , n � 1 .

j=O

However Sn (t) = Wet) for all n, and therefore Sn (t) � Wet) as n -+ 00. Finally, Y (s) i s the limit (in mean-square) of a sequence of normal random variables with mean

0, and therefore is Gaussian with mean o. If s < t ,

cov (Y(s ) , Y(t)) = 1000 (e-(s-u) I[O,s ] (u)) (e-(t-u) I[O, t] (u)) dG(u)

los 2u-s-t d 1 ( s-t -s-t ) = e u = :z e - e . o

Y is an Omstein-Uhlenbeck process.

20. (a) Wet) is N(O, t), so that

E I W(t) 1 = 100 �e- � (U2/t) du = J2t/rr , -00 y 2rrt

var( I W(t) 1 ) = E(W(t)2) - � = t ( 1 - �) .


The process X is never negative, and therefore it is not Gaussian. It is Markov since, if s < t and B is an event defined in terms of {X (u) : u :5 s } , then the conditional distribution function of X(t) satisfies

lP'(X (t) :5 Y I X es ) = x, B) = lP'(X (t) :5 Y I W(s ) = x, B)lP'(W(s ) = x I X es ) = x, B) + lP' (X (t) :5 Y I W(s ) = -x, B)lP'(W(s ) = -x I X es ) = x , B)

= HlP'(X (t) :5 y I W(s ) = x ) + lP'(X (t) :5 y I W(s ) = -x) } ,

which does not depend on B . (b) Certainly,

E(Y(t» = __ e-:Z (u It) du = e :z t . JOO eU 1 2 1

-00 -I2iit Secondly, W(s ) + Wet) = 2W(s ) + {Wet) - W(s ) } is N(O, 3s + t) if s < t, implying that

and therefore

E (Y(s )Y (t» ) = E (eW(s)+w(t») = e ! (3s+t) ,

1 1 cov (Y(s ) , Y et)) = e Z (3s+t) - eZ (s+t) , s < t .

W( I ) is N(O, I ) , and therefore Y(I ) has the log-normal distribution. Therefore Y is not Gaussian. It is Markov since W is Markov, and yet) is a one-one function of Wet) . (c) We shall assume that the random function W is a .s . continuous, a point to which we return in Chapter 13 . Certainly,

E(Z(t» = lot E(W(u» du = 0,

E (Z(s )Z(t») = � E (W(u)W(v») du dv }Jo<u<s o;v;t

= is { [U v dv + [t U dV} du = i s2 (3t - s ) , s < t , u=o }v=o }v=u

since E(W(u)W(v» = min{u, v} . Z i s Gaussian, as the following argument indicates. The single random variable Z(t) may be

expressed as a limit of the form

lim � (!.-) W(it/n) , n-+oo L...J n i=I

each such summation being normal. The limit of normal variables is normal (see Problem (7. 1 1 . 19» , and therefore Z(t) is normal. The limit in (*) exists a.s . , and hence in probability. By an appeal to (7 . 1 1 . 1 9b), pairs (Z(s ) , Z(t» are bivariate normal, and a similar argument is valid for all n-tuples of the Z(u) .

The process Z i s not Markov. An increment Z(t) - Z(s) depends very much on W(s) = Z'(s) , and the collection {Z (u) : u :5 s} contains much information about Z' (s) in excess of the information contained in the single value Z(s ) . 21. Let Ui = X (ti ) . The random variables A = Ub B = U2 - UI , C = U3 - U2 , D = U4 - U3 are independent and normal with zero means and respective variances tI , t2 - tI , t3 - t2 , t4 - t3 . The Jacobian of the transformation is I , and it follows that UI , U2 , U3 , U4 have joint density function

e- ! Q .fu(u) = (2rr)2,JtI (t2 - tI ) (t3 - t2) (t4 - t3 )


where


U1 (U2 - U I )2 (U3 - U2 )2 (U4 - U3 )2 Q = - + + + -'---'--�

tl t2 - tl t3 - t2 t4 - t3

Likewise UI and U4 have joint density function

where

Hence the joint density function of U2 and U3 , given UI = U4 = 0, is

where U� (U3 - U2)2 u�

s = -- + + -- . t2 - tl t3 - t2 t4 - t3

Now g is the density function of a bivariate normal distribution with zero means, marginal variances

and correlation

See also Exercise (8.5 .2).

(t4 - t3 ) (t2 - tI ) (t4 - t2) (t3 - tI )

22. (a) The random variables {Ij (x) : 1 ::s j ::s n } are independent, so that

lE(Fn (x)) = x , 1 x ( 1 - x) var(Fn (x)) = - var(lt (x)) = . n n

By the central limit theorem, .J1i{Fn (x) - x } � Y(x) , where Y(x) is N(O, x ( 1 - x)) . (b) The limit distribution is multivariate normal. There are general methods for showing this, and here is a sketch. If ° ::s Xl < x2 ::s 1 , then the number M2 (= nF(x2)) of the Ij not greater than X2 is approximately N(nx2 , nX2 (1 - X2)) . Conditional on {M2 = m} , the number MI = nF (x I ) is approximately N(mu , mu ( 1 - u)) where u = XI /X2 . It is now a small exercise to see that the pair (MI ' M2) is approximately bivariate normal with means nXI , nX2 , with variances nXI ( 1 - Xl ) , nX2 (1 - X2) , and such that

whence COV(MI , M2) � nXI (1 - X2) . It follows similarly that the limit of the general collection is multivariate normal with mean 0, variances Xi (1 - Xi ) , and covariances Cij = Xi (1 - Xj ) . (c) The autocovariance function of the limit distribution i s c(s , t) = mints, t } - st , whereas, for ° ::s s ::s t ::s 1 , we have that cov(Z(s) , Z(t)) = s - ts - st + st = mints, t } - st . It may be shown that the limit of the process {.J1i (Fn (x) - x) : n � I } exists as n � 00, in a certain sense, the limit being a Brownian bridge; such a limit theorem for processes is called a 'functional limit theorem' .


10 Renewals

10.1 Solutions. The renewal equation

1. Since E(Xd > 0, there existH (> 0) such that lP'(Xl � E) > E. Let Xi = El{Xk?:E} , and denote

by N' the related renewal process. Now N(t) � N'(t) , so that E(e(lN(t) � E(e(lN' (t) , for 0 > O. Let Zm be the number of renewals (in N') between the times at which N' reaches the values (m - I )E and mE. The Z's are independent with

€e(l E(e(lZm ) = , 1 - ( 1 - E)e(l if ( 1 - E)e(l < 1 ,

whence E(e(lN' (t) � (Ee(l { I - (1 - E)e(l } -1 ) t/E for sufficiently small positive O . 2 . Let X 1 be the time of the first arrival. IT Xl > s , then W = s . On the other hand if Xl < s , then the process starts off afresh at the new starting time Xl . Therefore, by conditioning on the value of X I .

Fw(x ) = foOO IP'(W � x I Xl = u) dF(u) = foS IP'(W � x - u) dF(u) + 100 1 · dF(u)

= fos IP'(W � x - u) dF(u) + { 1 - F(s ) }

i f x � s . It i s clear that Fw (x) = 0 if x < s . This integral equation for Fw may be written in the standard form

FW (x) = H(x) + foX Fw(x - u) dF(u)

where H and F are given by

H(x) = { � _ F(s) if x < s ,

if x � s ,

� { F(X) if x < s , F(x) = F(s) if x � s .

This renewal-type equation may be solved in the usual way by the method of Laplace-Stieltjes trans­forms. We have that FiV = H* + FiVF* , whence FiV = H* /( 1 - F*) . IT N is a Poisson process then F(x) = 1 - e-)..x . In this case

H* (O) = fooo e-(lx dH(x) = e-()..+(I)s ,

since H is constant apart from a jump at x = s . Similarly


so that * (A + 8 )e- (He )s

FW (8) = 8 + Ae-(He )s .

Finally, replace 8 with -8 , and differentiate to find the mean.

3. We have as usual that $P(N(t) = n) = P(S_n \le t) - P(S_{n+1} \le t)$. In the respective cases,

(a) $P(N(t) = n) = \displaystyle\sum_{r=0}^{\lfloor t\rfloor} \frac{1}{r!}\left\{ e^{-\lambda n}(\lambda n)^r - e^{-\lambda(n+1)}[\lambda(n+1)]^r \right\},$

(b) $P(N(t) = n) = \displaystyle\int_0^t \left\{ \frac{\lambda^{nb} x^{nb-1}}{\Gamma(nb)} - \frac{\lambda^{(n+1)b} x^{(n+1)b-1}}{\Gamma((n+1)b)} \right\} e^{-\lambda x}\, dx.$

4. By conditioning on $X_1$, $m(t) = E(N(t))$ satisfies

$m(t) = \int_0^t \bigl(1 + m(t - x)\bigr)\, dx = t + \int_0^t m(x)\, dx, \qquad 0 \le t \le 1.$

Hence $m' = 1 + m$, with solution $m(t) = e^t - 1$, for $0 \le t \le 1$. (For larger values of $t$, $m(t) = 1 + \int_0^1 m(t - x)\, dx$, and a tiresome iteration is in principle possible.)

With $v(t) = E(N(t)^2)$,

$v(t) = \int_0^t \bigl[v(t - x) + 2m(t - x) + 1\bigr]\, dx = t + 2(e^t - t - 1) + \int_0^t v(x)\, dx, \qquad 0 \le t \le 1.$

Hence $v' = v + 2e^t - 1$, with solution $v(t) = 1 - e^t + 2te^t$ for $0 \le t \le 1$.
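As a quick numerical illustration (not in the original solution), the following Python simulation estimates $m(0.7)$ for uniform interarrival times and compares it with $e^{0.7} - 1 \approx 1.014$; the sample size is an arbitrary assumption.

    import numpy as np

    rng = np.random.default_rng(1)
    t, reps = 0.7, 100_000            # a value of t in [0, 1]
    total = 0
    for _ in range(reps):
        s, n = 0.0, 0
        while True:
            s += rng.uniform()        # U(0, 1) interarrival times
            if s > t:
                break
            n += 1
        total += n
    print(total / reps, np.exp(t) - 1)   # both approximately 1.014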

10.2 Solutions. Limit theorems

1. Let Zi be the number of passengers in the i th plane, and assume that the Zj are independent of each other and of the arrival process. The number of passengers who have arrived by time t is Set) = ���) Zj . Now

1 N(t) S(t) lE(ZI ) -Set) = -- . -- � -- a.s. t t N(t) 11-

by the law of the large numbers, since N(t)lt � 1 /11- a.s . , and N(t) � 00 a.s.

2. We have that

lE(Tit-) = lE { (f Zi I{M:;:i l) 2} = f lE (ZTI{M:;:i} ) + 2 .� lE(Zi Zj I{M:;:j ))

1=1 1=1 1:;:: 1 <J < 00

since I{M:;:i} I{M:;:j) = I{M:;:iV j ) , where i V j = max{i , j} . Now

since {M ::::: i - I } is defined in terms of ZI , Z2 , ' . . ' Zj-l , and is therefore independent of Zj . Similarly lE(Zi Zj I{M:;:j ) = lE(Zj )lE(Zi I{M:;:j ) = 0 if i < j . It follows that

00 00 lE(Tit-) = L: lE(Z[)IP'(M ::: i ) = a2 L: 1P'(M ::: i ) = a2lE(M) .

i=1 i=1


3. (i) The shortest way is to observe that N(t) + k is a stopping time if k � 1 . Alternatively, we have by Wald's equation that JE(TN(t)+ I ) = t-t(m(t) + 1 ) . Also

JE(XN(t)+k) = JE{JE (XN(t)+k I N(t)) } = t-t, k � 2,

and therefore, for k � 1 , k

JE(TN(t)+k ) = JE(TN(t)+ I ) + L JE(XN(t)+j ) = t-t (m(t) + k) . j=2

(ii) Suppose p =f:. 1 and

lP'(XI = a) = { p if a = 1 ,

1 - p if a = 2. Then t-t = 2 - p =f:. 1 . Also

JE(TN(1 » = ( 1 - p)JE(TO I N(1) = 0) + pJE(TI I N(1 ) = 1) = p,

whereas m(1) = p. Therefore JE(TN(1» =f:. t-tm(I ) . 4. Let Vet) = N(t) + 1 , and let WI , W2 , . . . be defined inductively as follows. WI = V(I) , W2 is obtained similarly to WI but relative to the renewal process starting at the V( I )th renewal, i.e., at time TN(1 )+ I , and Wn is obtained similarly:

Wn = N(TXn_l + 1) - N(TXn_l ) + 1 , n � 2,

where Xm = WI + W2+ · ·+ Wm • Foreach n , Wn is independent of the sequence WI , W2 , · · · , Wn-I , and therefore the Wn are independent copies of V ( 1 ) . It is easily seen, by measuring the time-intervals covered, that Vet) ::s �J�I Wi , and hence

1 1 rtl - Vet) ::s - L Wi � JE(V ( 1 » a.s. and in mean, as t � 00. t t i= I

It follows that the family {m-I �i=I Wi : m � I } is uniformly integrable (see Theorem (7. 10.3» . Now N(t) ::s Vet) , and so {N(t)/t : t � O} is uniformly integrable also.

Since N(t)/ t � t-t- I , it follows by uniform integrability that there is also convergence in mean.

5. (a) Using the fact that $P(N(t) = k) = P(S_k \le t) - P(S_{k+1} \le t)$, we find that

$E(s^{N(T)}) = \int_0^\infty \Bigl(\sum_k s^k P(N(t) = k)\Bigr)\, \nu e^{-\nu t}\, dt = \sum_k s^k \Bigl\{\int_0^\infty [P(S_k \le t) - P(S_{k+1} \le t)]\, \nu e^{-\nu t}\, dt\Bigr\}.$

By integration by parts, $\int_0^\infty P(S_k \le t)\, \nu e^{-\nu t}\, dt = M(-\nu)^k$ for $k \ge 0$. Therefore,

$E(s^{N(T)}) = \sum_{k=0}^{\infty} s^k \bigl\{ M(-\nu)^k - M(-\nu)^{k+1} \bigr\} = \frac{1 - M(-\nu)}{1 - sM(-\nu)}.$

(b) In this case, $E(s^{N(T)}) = E(e^{\lambda T(s-1)}) = M_T(\lambda(s-1))$. When $T$ has the given gamma distribution, $M_T(\theta) = \{\nu/(\nu - \theta)\}^b$, and

$E(s^{N(T)}) = \left(\frac{\nu}{\nu + \lambda}\right)^b \left(1 - \frac{\lambda s}{\nu + \lambda}\right)^{-b}.$

The coefficient of $s^k$ may be found by use of the binomial theorem.
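As an informal check of part (b), not part of the original solution, the following Python sketch estimates $E(s^{N(T)})$ by simulation for one assumed set of parameters ($\lambda = 2$, $\nu = 3$, $b = 2$, $s = 0.5$) and compares it with the formula above.

    import numpy as np

    rng = np.random.default_rng(2)
    lam, nu, b, s = 2.0, 3.0, 2, 0.5
    T = rng.gamma(shape=b, scale=1 / nu, size=500_000)   # T with the given gamma distribution
    N = rng.poisson(lam * T)                             # N(T) for a Poisson process of intensity lam
    estimate = np.mean(s ** N)
    exact = (nu / (nu + lam)) ** b * (1 - lam * s / (nu + lam)) ** (-b)
    print(estimate, exact)                               # both approximately 0.5625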


10.3 Solutions. Excess life

1. Let g(y) = IP'(E(t) > y) , assumed not to depend on t . By the integral equation for the distribution of E(t),

$g(y) = 1 - F(t + y) + g(y)\int_0^t dF(x).$

Write h ex) = 1 - F(x) to obtain g (y)h (t) = h (t + y) , for y , t ::: O. With t = 0 , we have that g(y)h (O) = h ey) , whence g (y) = h (y)/ h (O) satisfies g et + y) = g(t)g(y) , for y , t ::: O. Now g is left-continuous, and we deduce as usual that g et) = e-At for some A. Hence F(t) = 1 - e-At , and the renewal process is a Poisson process. 2. (a) Examine a sample path of E . If E (t) = x , then the sample path decreases (with slope - 1 ) until it reaches the value 0 , at which point it jumps to a height X, where X i s the next interarrival time. Since X is independent of all previous interarrival times, the process is Markovian. (b) In contrast, C has sample paths which increase (with slope 1 ) until a renewal occurs, at which they drop to o. If C (s) = x and, in addition, we know the entire history of the process up to time s , the time of the next renewal depends only on the length of the spent period (i.e. , x) of the interarrival time in process. Hence C is Markovian. 3. (a) We have that

$P(E(t) \le y) = F(t + y) - \int_0^t G(t + y - x)\, dm(x)$

where $G(u) = 1 - F(u)$. Check the conditions of the key renewal theorem (10.2.7): $g(t) = G(t + y)$ satisfies: (i) $g(t) \ge 0$; (ii) $\int_0^\infty g(t)\, dt \le \int_0^\infty [1 - F(u)]\, du = E(X_1) < \infty$; (iii) $g$ is non-increasing. We conclude, by that theorem, that

$\lim_{t\to\infty} P(E(t) \le y) = 1 - \frac{1}{\mu}\int_0^\infty g(x)\, dx = \frac{1}{\mu}\int_0^y [1 - F(x)]\, dx.$

(b) Integrating by parts,

$\int_0^\infty \frac{x^r}{\mu}[1 - F(x)]\, dx = \frac{1}{\mu}\int_0^\infty \frac{x^{r+1}}{r+1}\, dF(x) = \frac{E(X^{r+1})}{\mu(r+1)}.$

See Exercise (4.3.3).
(c) As in Exercise (4.3.3), we have that $E(E(t)^r) = \int_0^\infty r y^{r-1} P(E(t) > y)\, dy$, implying by (*) that

$E(E(t)^r) = E\{((X_1 - t)^+)^r\} + \int_{x=0}^{t}\int_{y=0}^{\infty} r y^{r-1} P(X_1 > t + y - x)\, dy\, dm(x),$

whence the given integral equation is valid with

$h(u) = \int_0^\infty r y^{r-1} P(X_1 > u + y)\, dy = E\{((X_1 - u)^+)^r\}.$

Now $h$ satisfies the conditions of the key renewal theorem, whence

$\lim_{t\to\infty} E(E(t)^r) = \frac{1}{\mu}\int_0^\infty h(u)\, du = \frac{1}{\mu}\int_0^\infty y^r\, P(X_1 > y)\, dy = \frac{E(X_1^{r+1})}{\mu(r+1)}.$
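As a rough numerical illustration of the last limit (not part of the original solution), the following Python sketch simulates the excess lifetime at a large time $t$ for uniform interarrival times, for which $\mu = 1/2$ and $E(X^2) = 1/3$, so the limiting mean excess is $E(X^2)/(2\mu) = 1/3$; the choices of $t$ and of the sample size are assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    t, reps = 50.0, 20_000
    excess = np.empty(reps)
    for i in range(reps):
        s = 0.0
        while True:
            s += rng.uniform()        # U(0, 1) interarrival times
            if s > t:
                excess[i] = s - t     # excess lifetime E(t)
                break
    print(excess.mean(), 1 / 3)       # limiting mean excess E(X^2)/(2*mu) = 1/3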


4. We have that

1 - F(y + x) lP' (E (t» y / C (t) = x) = lP'(X, > y + x I X, > x) = I - F(x) '

whence

E (E(t) / C (t) = x) = t)O 1 - F(y + x) dy = E{(X, - x)+ } . Jo 1 - F(x) 1 - F(x)


5. (a) Apply Exercise ( 10.2.2) to the sequence Xi - /-L, 1 ::5 i < 00, to obtain var(TM(t) - /-LM(t)) = (12E(M(t)) . (b) Clearly TM(t) = t + E (t) , where E is excess lifetime, and hence /-LM(t) = (t + E(t)) - (TM(t) -/-LM(t)), implying in turn that

/-L2 var(M(t)) = var(E (t)) + var(SM(t» ) - 2cov (E (t) , SM(t» ) ,

where SM(t) = TM(t) - /-LM(t) . Now

if E(Xr ) < 00 (see Exercise ( lO.3 .3c)), implying that

1 - var(E(t)) � 0 t

as t � oo

as t � 00.

This i s valid under the weaker assumption that E(Xt) < 00, as the following argument shows. By Exercise ( lO.3 .3c),

E(E(t)2 ) = a(t) + lot a(t - u) dm (u) ,

where a(u) = E({(X, - u)+ }2) . Now use the key renewal theorem together with the fact that a(t) ::5 E(Xt I{Xl >t} ) � 0 as t � 00.

Using the Cauchy-Schwarz inequality,

! / cov (E (t) , SM(t» ) / ::5 !Jvar(E(t)) var(SM(t» ) � 0 t t

as t � 00, by part (a) and (**). Returning to (*), we have that

/-L2 { (12 } (12 - var(M(t)) � lim - (m(t) + 1 ) = - . t t-+oo t /-L


10.4 Solution. Applications

1. Visualize a renewal as arriving after two stages, type 1 stages being exponential parameter A and type 2 stages being exponential parameter t-t . The ' stage' process is the flip-flop two-state Markov process of Exercise (6.9. 1 ) . With an obvious notation,

A -(J..+ ,, ) t t-t Pl 1 (t) = --e ,.. + -- .

A + t-t A + t-t

Hence the excess lifetime distribution is a mixture of the exponential distribution with parameter t-t, and the distribution of the sum of two exponential random variables, thus,

where g(x) is the density function of a typical interarrival time. By Wald's equation,

lE(t + E (t)) = lE(SN(t)+ 1 ) = lE(X l )lE(N(t) + 1 ) = (� + �) (m(t) + 1 ) .

We substitute

( 1 1

) 1 1 Pl 1 (t)

lE(E(t)) = Pl 1 (t) - + - + (1 - Pl 1 (t)) - = - + --A t-t t-t t-t A

to obtain the required expression.

10.5 Solutions. Renewal-reward processes

1. Suppose, at time s, you are paid a reward at rate u (X(s)) . By Theorem ( 10.5. 10), equation ( 10.5 .7), and Exercise (6.9. 1 1b),

1 lot d a.s . 1

- I{X(s)=j } s -+ -- = 1tj . t 0 t-tj gj

Suppose I u (i ) I ::; K < 00 for all i E S, and let F be a finite subset of the state space. Then

"'11 lot

I ( t - Tt (F) )

::; K L..... - I{X (s)=i} ds - 1ti + K + K L 1ti , ieF t o t i¢F

where Tt (F) is the total time spent in F up to time t. Take the limit as t � 00 using (*) , and then as F t S, to obtain the required result.

2. Suppose you are paid a reward at unit rate during every interarrival time of type X, i.e., at all times $t$ at which $M(t)$ is even. By the renewal–reward theorem (10.5.1),

$\frac{1}{t}\int_0^t I_{\{M(s)\text{ is even}\}}\, ds \ \xrightarrow{\text{a.s.}}\ \frac{E(\text{reward during interarrival time})}{E(\text{length of interarrival time})} = \frac{E X_1}{E X_1 + E Y_1}.$
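As an informal illustration (not part of the original solution), the following Python sketch measures the long-run fraction of time for which $M(s)$ is even, with exponential $X$ and $Y$ of means 1 and 2 respectively (an assumption for the example); the answer should be close to $1/3$.

    import numpy as np

    rng = np.random.default_rng(4)
    horizon, t, time_even = 200_000.0, 0.0, 0.0
    while t < horizon:
        x = rng.exponential(1.0)          # type-X interval: M(s) is even, mean E(X) = 1
        y = rng.exponential(2.0)          # type-Y interval: M(s) is odd,  mean E(Y) = 2
        time_even += min(x, horizon - t)
        t += x + y
    print(time_even / horizon, 1.0 / 3.0)  # E(X)/(E(X) + E(Y)) = 1/3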

3. Suppose, at time t, you are paid a reward at rate C (t) . The expected reward during an interval (cycle) of length X is fl s ds = � X2 , since the age C is the same at the time s into the interval. The


result follows by the renewal-reward theorem ( 10.5. 1 ) and equation ( 10.5 .7). The same conclusion is valid for the excess lifetime E (s) , the integral in this case being foX (X - s) ds = � X2 . 4. Suppose Xo = j . Let VI = min {n � 1 : Xn = j, Xm = k for some 1 � m < n} , the first visit to j subsequent to a visit to k, and let Vr+l = min {n � Vr : Xn = j, Xm = k for some Vr + 1 :::: m < n} . The Vr are the times of a renewal process. Suppose a reward of one ecu is paid at every visit to k. By the renewal-reward theorem and equation ( 10.5.7),

By considering the time of the first visit to k,

E (VI I Xo = j) = E(Tk I Xo = j) + E(1j I Xo = k) .

The latter expectation in (*) is the mean of a random variable N having the geometric distribution JJ>(N = n) = p ( 1 - p)n- l for n � 1 , where p = JJ>(1j < Tk I Xo = k) . Since E(N) = p-l , we deduce as required that

1 /JJ>(1j < Tk I Xo = k) nk = ------�--------------E (Tk I Xo = j) + E(1j I Xo = k)

10.6 Solutions to problems

1. (a) For any n, JJ>(N (t) < n) � JJ>(Tn > t) --+ 0 as t --+ 00.

(b) Either use Exercise ( 10. 1 . 1 ) , or argue as follows. Since /.l > 0, there exists E (> 0) such that JJ>(XI > E) > O. For all n ,

JJ>(Tn � nE ) = 1 - JJ>(Tn > nE ) :::: 1 - JJ>(XI > E)n < 1 ,

so that, i f t > 0 , there exists n = n et) such that JJ>(Tn � t) < 1 . Fix t and let n be chosen accordingly. Any positive integer k may be expressed in the form

k = an + fJ where 0 � fJ < n . Now JJ>(Tk :::: t) � JJ>(Tn :::: t)a for an � k < (a + l )n , and hence

00 00

met) = L JJ>(Tk :::: t) � L nJJ>(Tn � t)a < 00. k=l a=O

(c) It is easiest to use Exercise ( 10. 1 . 1 ) , which implies the stronger conclusion that the moment generating function of N(t) is finite in a neighbourhood of the origin.

2. (i) Condition on X I to obtain

t 2 rt vet) = io E{ (N (t - u) + 1 ) } dF(u) = io

{ v (t - u) + 2m(t - u) + I } dF(u) .

Take Laplace-Stieltjes transforms to find that v* = (v* + 2m * + 1 ) F* , where m * = F* + m * F* as usual. Therefore v* = m* ( 1 + 2m*) , which may be inverted to obtain the required integral equation.

(ii) If N is a Poisson process with intensity A, then met) = At , and therefore vet) = ()..t)2 + At.

3. Fix x E R. Then


where a(t) = l(tIJL) + xVtu2/JL3J . Now,

) ( Ta(t) - JLa(t) t - JLa(t) ) lP'(Ta(t) ::::: t = lP' u J aCt) :::::

u J a Ct) .

However aCt) � t I JL as t --+ 00, and therefore t - JLa(t) --==- --+ -x u Ja(t) as t --+ 00,

implying by the usual central limit theorem that

lP' (N(t) - (tIJL) � x) --+ <I> (-x) Vtu2/JL3

where <I> is the N(O, 1 ) distribution function.

as t --+ 00

An alternative proof makes use of Anscombe's theorem (7 . 1 1 .28). 4. We have that, for y ::::: t ,

lP' (C(t) � y) = lP'(E (t - y) > y) --+ lim lP' (E (u) > y) u-+oo

= roo .!.. [ 1 - F(x)] dx Jy JL

as t --+ 00

by Exercise (l0.3.3a) . The current and excess lifetimes have the same asymptotic distributions. 5. Using the lack-of-memory property of the Poisson process, the current lifetime C (t) is independent of the excess lifetime E(t) , the latter being exponentially distributed with parameter A. To derive the density function of C(t) either solve (without difficulty in this case) the relevant integral equation, or argue as follows. Looking backwards in time from t, the arrival process looks like a Poisson process up to distance t (at the origin) where it stops. Therefore C (t) may be expressed as min{Z, t} where Z is exponential with parameter A; hence

{ Ae-J..s if s < t JC(t) (s) = ° .f

- , 1 S > t ,

and lP'(C (t) = t) = e-At . Now D(t) = C(t) + E (t) , whose distribution is easily found (by the convolution formula) to be as given. 6. The i th interarrival time may be expressed in the form T + Zj where Zj is exponential with parameter A. In addition, Z 1 , Z2 , . . . are independent, by the lack-of-memory property. Now

1 - F(x) = lP'(T + Zl > x) = lP'(Zl > x - T) = e-J.. (x-T) , x � T.

Taking into account the (conventional) dead period beginning at time 0, we have that k

lP'(N(t) � k) = lP' (kT + � Zj ::::: t) = lP'(N(t - kT) � k) , t � kT,

1= 1 where N is a Poisson process. 7. We have that Xl = L + E (L) where L is the length of the dead period beginning at 0, and E(L) is the excess lifetime at L . Therefore, conditioning on L,


We have that

lP'(E(t) ::: y) = F(t + y) - lot { 1 - F(t + y - x) } dm(x) .

By the renewal equation,

whence, by subtraction,

It follows that

t+y m(t + y) = F(t + y) +

Jo F(t + Y - x) dm(x) ,

It+y lP'(E (t) ::: y) =

t { 1 - F(t + y - x) } dm (x) .

lP' (Xl ::: x) = l�o i:l { I - F(x - y) } dm(y) dFL (l)


= [FL (l)IX { 1 - F(x - Y) } dm (y)] X + r FL (l) { I - F(x - l) } dm (l)

I �o k using integration by parts. The term in square brackets equals O.

8. (a) Each interarrival time has the same distribution as the sum of two independent random variables with the exponential distribution. Therefore N(t) has the same distribution as liM (t)J where M is a

Poisson process with intensity A. Therefore m(t) = �E(M(t)) - �lP'(M (t) is odd) . Now E(M(t)) = At , and

00 (M)2n+l e-At lP'(M (t) is odd) = '"' = 1 e-At (eAt _ e-At) . � (2n + I ) ! :2 n=O

With more work, one may establish the probability generating function of N(t) . (b) Doing part (a) as above, one may see that iii (t) = m(t) . 9. Clearly C (t) and E (t) are independent if the process N is a Poisson process. Conversely, suppose that C(t) and E(t) are independent, for each fixed choice of t . The event {C(t) � y} n {E(t) � x } occurs i f and only if E ( t - y) � x + y . Therefore

lP' (C(t) � y)lP'(E (t) � x) = lP'(E(t - y) � x + y) .

Take the limit as t --+ 00, remembering Exercise ( 10.3.3) and Problem ( 10.6.4), to obtain that G(y)G(x) = G (x + y) if x , y � 0, where

100 1 G (u) = - [ 1 - F(v)] dv .

u JL

Now 1 - G is a distribution function, and hence has the lack-of-memory property (Problem (4. 14.5)), implying that G (u) = e-J..u for some A. This implies in tum that [ 1 - F(u)]jJL = -G'(u) = Ae-J..u , whence JL = I /A and F(u) = 1 - e-J..u . 10. Clearly N is a renewal process if N2 is Poisson. Suppose that N is a renewal process, and write A for the intensity of Nl , and F2 for the interarrival time distribution of N2 . By considering the time X 1 to the first arrival of N,


Writing E, Ej for the excess lifetimes of N, Nj , we have that

lP'(ECt) > x ) = lP'(EI Ct) > x , E2 (t) > x ) = e-AxlP'(E2 (t) > x) .

Take the limit as t --+ 00, using Exercise ( 10.3 .3) , to find that

100 1 100 1 - [ 1 - F(u)] du = e-AX - [1 - F2 (U)] du ,

x /.l x /.l2

where /.l2 is the mean of F2 . Differentiate, and use (*), to obtain

1 Ax AX 100 1 e-AX -e- [ 1 - F2 (X)] = Ae- - [1 - F2 (U)] du + -- [ 1 - F2 (X) ] , /.l x /.l2 /.l2

which simplifies to give 1 - F2 (X) = C J:O [ 1 - F2(U)] du where c = A/.l/(/.l2 - /.l) ; this integral equation has solution F2 (X) = 1 - e-cx . 11. (i) Taking transforms of the renewal equation in the usual way, we find that

where

m* () = F* «() = 1 _ 1 ( ) 1 - F* «() 1 - F*«()

F* «() = JE(e-OXt ) = 1 - () /.l + �()2 (/.l2 + (]'2) + 0«()2) as () --+ O. Substitute this into the above expression to obtain

and expand to obtain the given expression. A formal inversion yields the expression for m. (ii) The transform of the right-hand side of the integral equation is

� - FE «() + m* «() - FE «() m* «() . /.l()

By Exercise (10.3.3), FE «() = [ 1 - F* «() ]f (/.l() , and (*) simplifies to m* «() - (m* - m* F* -F*)/(/.l() , which equals m* «() since the quotient is 0 (by the renewal equation).

Using the key renewal theorem, as t --+ 00,

lot 1 1000 JE(X2) (]'2 + /.l2 [1 - FE Ct - x)] dm (x) --+ - [ 1 - FE (u)] du = -2-1 = 2 o /.l 0 2/.l 2/.l

by Exercise ( lO.3.3b) . Therefore,

12. (i) Conditioning on XI , we obtain


Therefore md* = Fd* + m* Fd* . Also m* = F* + m* F* , so that

whence md* = Fd* + md* F* , the transform of the given integral equation. (ii) Arguing as in Problem ( 1 0.6.2), vd* = Fd*+2m* Fd* +v* Fd* where v* = F* ( 1 +2m*)/( 1 -F*) is the corresponding object in the ordinary renewal process. We eliminate v* to find that

by (*). Now invert.

13. Taking into account the structure of the process, it suffices to deal with the case I = 1 . Refer to Example ( 10.4.22) for the basic notation and analysis. It is easily seen that f3 = (v - 1)1.. . Now F(t) = 1 - e-vM . Solve the renewal equation ( 10.4.24) to obtain

g(t) = h (t) + lot h (t - x) diii (x)

where iii (x) = VAX is the renewal function associated with the interarrival time distribution F. Therefore g(t) = 1, and m(t) = ef3t . 14. We have from Lemma (1 0.4.5) that p* = 1 - Fz + p* F* , where F* = FYFz . Solve to obtain

* 1 - Fz P = 1 F*F* ' - Y Z

15. The first locked period begins at the time of arrival of the first particle. Since all future events may be timed relative to this arrival time, we may take this time to be O. We shall therefore assume that a particle arrives at 0; call this the Oth particle, with locking time Yo.

We shall condition on the time X I of the arrival of the next particle. Now

( I ) { JP'(Yo > t) if u > t , JP' L > t Xl = u = JP'(Yo > u)JP'(L' > t - u) if u .::: t ,

where L' has the same distribution as L ; the second part i s a consequence of the fact that the process 'restarts ' at each arrival. Therefore

JP'(L > t) = ( 1 - G(t»)JP'(XI > t) + lot JP'(L > t - u) ( I - G(u») fXj (u) du ,

the required integral equation. If G(x) = 1 - e-/LX , the solution is JP'(L > t) = e-/Lt , so that L has the same distribution as

the locking times of individual particles. This striking fact may be attributed to the lack-of-memory property of the exponential distribution.

16. (a) It is clear that M(tp) is a renewal process whose interarrival times are distributed as Xl + X2 + . . . + XR where JP'(R = r) = pqr- l for r � 1 . It follows that M(t) is a renewal process whose first interarrival time

$X(p) = \inf\{t : M(t) = 1\} = p \inf\{t : M(tp) = 1\}$


has distribution function 00 lP' (X (p) :::; x ) = L lP'(R = r)Fr (x/p) .

r= 1 (b) The characteristic function ¢p of Fp is given by

00 100 . 00 p¢ (pt) ¢p (t) = L pqr- 1 e1xt dFr (t/p) = L pqr- 1 ¢ (pt/ = ----

r=1 -00 r=1 1 - q¢ (pt)

where ¢ is the characteristic function of F. Now ¢ (pt) = 1 + iILPt + o(p) as p + 0, so that

( l + iILpt + O(P) 1 + 0( 1 )

¢ t ) = = -,------p 1 - iILt + 0(1 ) 1 - iILt as p + O. The limit is the characteristic function of the exponential distribution with mean IL, and the continuity theorem tells us that the process M converges in distribution as p + 0 to a Poisson process with intensity 1 / IL (in the sense that the interarrival time distribution converges to the appropriate limit). (c) If M and N have the same fdds, then ¢p (t) = ¢ (t) , which implies that ¢ (pt) = ¢ (t) / (p + q¢ (t)) . Hence 1/I (t) = ¢ (t)- 1 satisfies 1/I (pt) = q + p1/l (t) for t E R. Now 1/1 is continuous, and it follows as in the solution to Problem (5 . 1 2. 1 5) that 1/1 has the form 1/I (t) = l +t.lt , implying that¢ (t) = ( 1 +,Bt)- 1 for some ,B E C. The only characteristic function of this form is that of an exponential distribution, and the claim follows.

17. (a) Let $N(t)$ be the number of times the sequence has been typed up to the $t$th keystroke. Then $N$ is a renewal process whose interarrival times have the required mean $\mu$; we have that $E(N(t))/t \to \mu^{-1}$ as $t \to \infty$. Now each epoch of time marks the completion of such a sequence with probability $(\frac{1}{100})^{14}$, so that

$\frac{1}{t}E(N(t)) = \frac{1}{t}\sum_{n=14}^{t}\left(\frac{1}{100}\right)^{14} \to \left(\frac{1}{100}\right)^{14} \quad \text{as } t \to \infty,$

implying that $\mu = 10^{28}$.

The problem with 'omo' is 'omomo' (i.e., appearances may overlap). Let us call an epoch of time a 'renewal point' if it marks the completion of the word 'omo', disjoint from the words completed at previous renewal points. In each appearance of 'omo', either the first 'o' or the second 'o' (but not both) is a renewal point. Therefore the probability $u_n$, that $n$ is a renewal point, satisfies $(\frac{1}{100})^3 = u_n + u_{n-2}(\frac{1}{100})^2$. Average this over $n$ to obtain

$\left(\frac{1}{100}\right)^3 = \lim_{t\to\infty}\frac{1}{t}\sum_{n=1}^{t}\left\{u_n + u_{n-2}\left(\frac{1}{100}\right)^2\right\} = \frac{1}{\mu} + \frac{1}{\mu}\left(\frac{1}{100}\right)^2,$

and therefore $\mu = 10^6 + 10^2$.
(b) (i) Arguing as for 'omo', we obtain $p^3 = u_n + pu_{n-1} + p^2 u_{n-2}$, whence $p^3 = (1 + p + p^2)/\mu$.
(ii) Similarly, $p^2 q = u_n + pq\, u_{n-2}$, so that $\mu = (1 + pq)/(p^2 q)$.
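As a quick sanity check of the formula in (b)(ii), not part of the original solution, the following Python simulation estimates the mean waiting time for the pattern HTH in fair coin tossing; with $p = q = \frac12$ the formula $(1 + pq)/(p^2 q)$ gives 10.

    import numpy as np

    rng = np.random.default_rng(5)
    reps, total = 50_000, 0
    for _ in range(reps):
        recent, n = "", 0
        while recent != "HTH":
            recent = (recent + ("H" if rng.random() < 0.5 else "T"))[-3:]
            n += 1
        total += n
    print(total / reps, (1 + 0.25) / 0.125)   # both approximately 10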

19. We use the renewal-reward theorem. The mean time between expeditions is B IL, and this is the mean length of a cycle of the process. The mean cost of keeping the bears during a cycle is i B(B - l)cIL, whence the long-run average cost is {d + B(B - 1)cIL/2}/(BIL) .


11 Queues

11.2 Solutions. M/M/1

1. The stationary distribution satisfies $\pi = \pi P$ when it exists, where $P$ is the transition matrix. The equations

$\pi_n = \frac{\rho\pi_{n-1}}{1+\rho} + \frac{\pi_{n+1}}{1+\rho} \quad \text{for } n \ge 2, \qquad \pi_1 = \pi_0 + \frac{\pi_2}{1+\rho},$

with $\sum_{j=0}^\infty \pi_j = 1$, have the given solution. If $\rho \ge 1$, no such solution exists. It is slightly shorter to use the fact that such a walk is reversible in equilibrium, from which it follows that $\pi$ satisfies

(*)   $\pi_0 = \frac{\pi_1}{1+\rho}, \qquad \frac{\rho\pi_n}{1+\rho} = \frac{\pi_{n+1}}{1+\rho} \quad \text{for } n \ge 1.$
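As an informal numerical check (not part of the original solution), the following Python sketch verifies that the distribution quoted in Exercise (11.2.6), namely $\pi_0 = \tfrac12(1-\rho)$ and $\pi_n = \tfrac12(1-\rho^2)\rho^{n-1}$ for $n \ge 1$, satisfies $\pi = \pi P$ for a truncated version of the transition matrix; the truncation level and the value $\rho = 0.5$ are assumptions made for the example.

    import numpy as np

    rho, N = 0.5, 60
    P = np.zeros((N + 1, N + 1))
    P[0, 1] = 1.0                       # from an empty queue the next jump is an arrival
    for n in range(1, N):
        P[n, n + 1] = rho / (1 + rho)   # arrival before departure
        P[n, n - 1] = 1 / (1 + rho)     # departure before arrival
    P[N, N - 1] = 1.0                   # crude truncation at the boundary

    pi = np.empty(N + 1)
    pi[0] = 0.5 * (1 - rho)
    pi[1:] = 0.5 * (1 - rho**2) * rho ** np.arange(N)
    residual = np.abs(pi @ P - pi)[:N - 1]
    print(residual.max(), pi.sum())     # residual ~ 0 away from the cut-off, total mass ~ 1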

2. (i) This continuous-time walk is a Markov chain with generator given by gOI = eo , gn,n+l = enP/ ( l + p) and gn,n- l = en/ ( 1 + p) for n � 1 , other off-diagonal terms being O. Such a process is reversible in equilibrium (see Problem (6. 15 . 16», and its stationary distribution v must satisfy vngn,n+ l = vn+l gn+l , n ' These equations may be written as

for n � 1 .

These are identical to the equations labelled (*) in the previous solution, with 1rn replaced by vnen . It follows that Vn = C1rn/en for some positive constant C. (ii) If eo = A, en = A + JL for n � 1 , we have that

'" { 1r0 1 - 1r0 } C 1 = � vn = C J: + JL + A = 2A '

whence C = 2A and the result follows.

3. Let Q be the number of people ahead of the arriving customer at the time of his arrival. Using the lack-of-memory property of the exponential distribution, the customer in service has residual service­time with the exponential distribution, parameter JL, whence W may be expressed as S I + S2 + . . . + S Q , the sum of independent exponential variables, parameter JL. The characteristic function of W is


This is the characteristic function of the given distribution. The atom at 0 corresponds to the possibility that Q = O.

4. We prove this by induction on the value of i + j . If i + j = 0 then i = j = 0, and it is easy to check that Jl'(0; 0, 0) = 1 and A(O; 0, 0) = 1 , A(n ; 0, 0) = 0 for n ::: 1 . Suppose then that K ::: 1 , and that the claim is valid for all pairs (i , j ) satisfying i + j = K. Let i and j satisfy i + j = K + 1 . The last ball picked has probability i / ( i + j ) of being red; conditioning on the colour of the last ball, we have that

Jl'(n ; i , j ) = . +i . Jl'(n - l ; i - l , j ) + . +

j . Jl'(n + l ; i , j - l ) . I J I J

Now (i - 1) + j = K = i + (j - 1 ) . Applying the induction hypothesis, we find that

Jl'(n ; i , j) = i � j { A(n - 1 ; i - I , j) - A(n ; i - I , j) }

+ i � j {A(n + 1 ; i , j - 1) - A(n + 2; i , j - 1 ) } .

Substitute to obtain the required answer, after a little cancellation and collection of tenns. Can you see a more natural way?

5. Let A and B be independent Poisson process with intensities ).. and /L respectively. These processes generate a queue-process as follows. At each arrival time of A, a customer arrives in the shop. At each arrival-time of B, the customer being served completes his service and leaves ; if the queue is empty at this moment, then nothing happens. It is not difficult to see that this queue-process is M()..)/M(/L)/ 1 . Suppose that A(t) = i and B(t) = j . During the time-interval [0, t ] , the order of arrivals and departures follows the schedule of Exercise ( 1 1 .2.4), arrivals being marked as red balls and departures as lemon balls. The imbedded chain has the same distributions as the random walk of that exercise, and it follows that JP> (Q(t) = n I A(t) = i, B(t) = j) = Jl' (n ; i, j ) . Therefore Pn (t) = Ei, i Jl' (n ; i, j)JP>(A(t) = i )JP>(B(t) = j) . 6 . With p = ).. / /L, the stationary distribution of the imbedded chain is, as in Exercise ( 1 1 .2. 1 ),

$\hat\pi_n = \begin{cases} \tfrac12(1 - \rho) & \text{if } n = 0, \\ \tfrac12(1 - \rho^2)\rho^{n-1} & \text{if } n \ge 1. \end{cases}$

In the usual notation of continuous-time Markov chains, $g_0 = \lambda$ and $g_n = \lambda + \mu$ for $n \ge 1$, whence, by Exercise (6.10.11), there exists a constant $c$ such that

$\pi_0 = \frac{c(1 - \rho)}{2\lambda}, \qquad \pi_n = \frac{c(1 - \rho^2)\rho^{n-1}}{2(\lambda + \mu)} \quad \text{for } n \ge 1.$

7. (a) Let Qi (t) be the number of people in the i th queue at time t, including any currently in service. The process Q 1 is reversible in equilibrium, and departures in the original process correspond to arrivals in the reversed process. It follows that the departure process of the first queue is a Poisson process with intensity ).. , and that the departure process of Q 1 is independent of the current value of Q 1 . (b) We have from part (a) that, for any given t, the random variables Q l (t) , Q2 (t) are independent. Consider an arriving customer when the queues are in equilibrium, and let Wi be his waiting time (before service) in the i th queue. With T the time of arrival, and recalling Exercise ( 1 1 .2.3),

JP>(WI = 0, W2 = 0) > JP>(Qi (T) = 0 for i = 1 , 2) = JP>(Q I (T) = 0)JP>(Q2 (T) = 0) = (1 - Pl ) ( 1 - pz) = JP>(WI = 0)JP>(W2 = 0) .

Therefore WI and W2 are not independent. There is a slight complication arising from the fact that T is a random variable. However, T is independent of everybody who has gone before, and in particular of the earlier values of the queue processes Q i .


11.3 Solutions. M/G/1

1. In equilibrium, the queue-length Qn just after the nth departure satisfies

$Q_{n+1} = Q_n + A_n - h(Q_n),$

where An is the number of arrivals during the (n + l) th service period, and h (m) = 1 - 8mo . 'Now Qn and Qn+1 have the same distribution. Take expectations to obtain o = E(An ) - J1D(Qn > 0) ,

where E(An ) = Ad, the mean number of arrivals in an interval of length d. Next, square (*) and take expectations :

Use the facts that An is independent of Qn , and that Qnh (Qn ) = Qn , to find that o = { (Ad)2 + Ad} + J1D(Qn > 0) + 2{ (Ad - l)E(Qn) - AdJID(Qn > 0) }

and therefore, by (**),

2. From the standard theory, MB satisfies MB(S) = Ms(s - A + AMB (S)) , where Ms(O) JL/(JL - 0) . Substitute to find that x = MB (S) is a root of the quadratic Ax2 - X (A + JL - s) + JL = O. For some small positive s, M B (s ) is smooth and non-decreasing. Therefore M B (s) is the root given.

3. Let Tn be the instant of time at which the server is freed for the nth time. By the lack-of-memory property of the exponential distribution, the time of the first arrival after Tn is independent of all arrivals prior to Tn , whence Tn is a 'regeneration point' of the queue (so to say). It follows that the times which elapse between such regeneration points are independent, and it is easily seen that they have the same distribution.

11.4 Solutions. G/M/1

1. The transition matrix $P_A$ of the imbedded chain, obtained by observing queue-lengths just before arrivals, has rows determined by the probabilities $a_i$ that $i$ customers complete service during a typical interarrival time (when sufficiently many are present). The equation $\pi = \pi P_A$ may be written as

$\pi_n = \sum_{i=0}^{\infty} a_i \pi_{n+i-1} \quad \text{for } n \ge 1.$

It is easily seen, by adding, that the first equation is a consequence of the remaining equations, taken in conjunction with $\sum_{i=0}^\infty \pi_i = 1$. Therefore $\pi$ is specified by the equation for $\pi_n$, $n \ge 1$.


The indicated substitution gives

$\theta^n = \theta^{n-1}\sum_{i=0}^{\infty} a_i \theta^i,$

which is satisfied whenever $\theta$ satisfies

$\theta = \sum_{i=0}^{\infty} a_i \theta^i = A(\theta), \ \text{say}.$

It is easily seen that $A(\theta) = M_X(\mu(\theta - 1))$ is convex and non-decreasing on $[0, 1]$, and satisfies $A(0) > 0$, $A(1) = 1$. Now $A'(1) = \mu E(X) = \rho^{-1} > 1$, implying that there is a unique $\eta \in (0, 1)$ such that $A(\eta) = \eta$. With this value of $\eta$, the vector $\pi$ given by $\pi_j = (1 - \eta)\eta^j$, $j \ge 0$, is a stationary distribution of the imbedded chain. This $\pi$ is the unique such distribution because the chain is irreducible.
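As an illustration (not part of the original solution), the following Python sketch computes $\eta$ for the D/M/1 queue, in which interarrival times equal a constant $d$, so that $A(\theta) = M_X(\mu(\theta - 1)) = e^{\mu d(\theta - 1)}$; the values $\mu = 1$ and $d = 2$ are assumptions for the example.

    import math

    mu, d = 1.0, 2.0
    A = lambda x: math.exp(mu * d * (x - 1))   # A(theta) = M_X(mu(theta - 1)) for constant interarrivals

    eta = 0.0
    for _ in range(200):                       # fixed-point iteration converges to the root in (0, 1)
        eta = A(eta)
    print(eta, eta / (1 - eta))                # eta ~ 0.203; mean queue length eta/(1 - eta) ~ 0.26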

2. (i) The equilibrium distribution is 1tn = (1 _ 7j)7jn for n � 0, with mean I:�o n1tn = 7j/( I - 7j) . (ii) Using the lack-of-memory property of the service time in progress at the time of the arrival, we see that the waiting time may be expressed as W = Sl + S2 + . . . + SQ where Q has distribution 1r , given above, and the Sn are service times independent of Q. Therefore

7j/ JL E (W) = E(Sl )E(Q) = -- . ( 1 - 7j)

3. We have that Q (n+) = 1 + Q (n-) a.s. for each integer n , whence limt--+oo IP'(Q (t) = m) cannot exist.

Since the traffic intensity is less than 1 , the imbedded chain is ergodic with stationary distribution as in Exercise ( 1 1 .4. 1 ) .

11.5 Solutions. G/G/1

1. Let Tn be the starting time of the nth busy period. Then Tn is an arrival time, and also the beginning of a service period. Conditional on the value of Tn , the future evolution of the queue is independent of the past, whence the random variables {Tn+ 1 - Tn : n � I } are independent. It is easily seen that they are identically distributed.

2. If the server is freed at time T, the time I until the next arrival has the exponential distribution with parameter JL (since arrivals form a Poisson process).

By the duality theory of queues, the waiting time in question has moment generating function Mw(s) = (1 - s)/( 1 - sM/ (s» where M/ (s) = JL/ (JL - s) and s = IP'(W > 0) . Therefore,

MW(s) = sJL( I - S) + (1 - S ) , JL( 1 - S ) - s

the moment generating function of a mixture of an atom at ° and an exponential distribution with parameter JL(1 - S} -

If G i s the probability generating function of the equilibrium queue-length, then, using the lack­of-memory property of the exponential distribution, we have that M W (s) = G (JL / (JL - s», since W is the sum of the (residual) service times of the customers already present. Set u = JL/ (JL - s) to find that G(u) = (1 - S)/( 1 - su) , the generating function of the mass function f(k) = (1 - s >sk for k � 0.

385

Page 395: One Thousand Exercises in Probability

[11.5.3]-[11.7.2] Solutions Queues

It may of course be shown that s is the smallest positive root of the equation x = M X (t-t (x - 1)) , where X is a typical interarrival time.

3. We have that

1 - G(y) = J1D(S - X > y) = fooo

J1D(S > u + y) dFx (u) , y E R,

where S and X are typical (independent) service and interarrival times. Hence, formally,

dG(y) = - roo dJID(S > u + y) dFx (u) = dy 100 t-te-/L(U+Y) dFx (u) ,

Jo -Y

since fs (u + y) = e-/L(u+y) if u > -y, and is 0 otherwise. With F as given,

Ix F(x - y) dG(y) = rr { 1 - 17e-/L(I-I1) (x-Y) }t-te-/L(u+y) dFx (u) dy .

-00 JJ -oo<y�x -Y<U<OO

First integrate over y , then over u (noting that F X (u) = 0 for u < 0), and the double integral collapses to F(x) , when x � o.

11.6 Solution. Heavy traffic

1. Qp has characteristic function

00 A. ( ) ""' itn n ( 1 ) 1 - P 'l'p t = L...J e p - p = ·t . n=O 1 - pel

Therefore the characteristic function of (1 - p) Qp satisfies

1 - p 1 ¢p (( 1 - p)t ) =

1 _ pei (1 -p)t --+ 1 - i t

as p t l .

The limit characteristic function is that of the exponential distribution, and the result follows by the continuity theorem.

1 1.7 Solutions. Networks of queues

1. The first observation follows as in Example ( 1 1 .7 .4). The equilibrium distribution is given as in Theorem ( 1 1 .7 . 14) by

c ni -ClI · II 0/ . e I :7r (0) = --,1,---n · ' i=1 I ·

for 0 = (n l , n2 , . . . , nc) E 71/,

the product of Poisson distributions. This i s related to Bartlett's theorem (see Problem (8.7.6)) by defining the state A as 'being in station i at some given time' .

2. The number of customers in the queue is a birth-death process, and is therefore reversible in eqUilibrium. The claims follow in the same manner as was argued in the solution to Exercise ( 1 1 .2.7).

386

Page 396: One Thousand Exercises in Probability

Problems Solutions [11.7.31-111.8.1]

3. (a) We may take as state space the set {O, 1 ' , 1 " , 2, 3 , . . . } , where i E {O, 2, 3, . . . } is the state of having i people in the system including any currently in service, and l ' (respectively I") is the state of having exactly one person in the system, this person being served by the first (respectively second) server. It is straightforward to check that this process is reversible in equilibrium, whence the departure process is as stated, by the argument used in Exercise ( 1 1 .2.7). (b) This time, we take as state space the set {O' , 0" , 1', 1", 2, 3 , . . . } having the same states as in part (a) with the difference that 0' (respectively 0" ) is the state in which there are no customers present and the first (respectively second) server has been free for the shorter time. It is easily seen that transition from 0' to I" has strictly positive probability whereas transition from I" to 0' has zero probability, implying that the process is not reversible. By drawing a diagram of the state space, or otherwise, it may be seen that the time-reversal of the process has the same structure as the original, with the unique change that states 0' are 0" are interchanged. Since departures in the original process correspond to arrivals in the time-reversal, the required properties follow in the same manner as in Exercise ( 1 1 .2.7).

4. The total time spent by a given customer in service may be expressed as the sum of geometrically distributed number of exponential random variables, and this is easily shown to be exponential with parameter 8 J.L. The queue is therefore in effect a M(A )/M(8 J.L)/1 system, and the stationary distribution is the geometric distribution with parameter p = A I (8 J.L) , provided p < 1 . As in Exercise ( 1 1 .2.7), the process of departures is Poisson.

Assume that rejoining customers go to the end of the queue, and note that the number of customers present constitutes a Markov chain. However, the composite process of arrivals is not Poisson, since increments are no longer independent. This may be seen as follows . In equilibrium, the probability of an arrival of either kind during the time interval (t, t + h) is Ah + pJ.L(1 - 8)h + o(h) = (AI8)h + o(h) . If there were an arrival of either kind during ( t - h , t) , then (with conditional probability 1 - O(h)) the queue is non-empty at time t , whence the conditional probability of an arrival of either kind during (t, t + h) is "Ah + J.L( 1 - 8)h + o(h) ; this is of a larger order of magnitude than the earlier probability (AI8)h + o(h) .

5. For stations r , s , we write r -+ s i f an individual at r visits s at a later time with a strictly positive probability. Let C comprise the station j together with all stations i such that i -+ j . The process restricted to C is an open migration process in equilibrium. By Theorem (1 1 .7 . 1 9), the restricted process is reversible, whence the process of departures from C via j is a Poisson process with some intensity s . Individuals departing C via j proceed directly to k with probability

Ajk¢j (nj ) Ajk J.Lj¢j (nj ) + L.rfJC Ajr¢j (nj )

= J.Lj + L.rfJC Ajr '

independently of the number nj of individuals currently at j . Such a thinned Poisson process is a Poisson process also (cf. Exercise (6.8 .2)), and the claim follows.

11.8 Solutions to problems

1. Although the two cases may be done together, we choose to do them separately. When k = 1 , the equilibrium distribution 'Jf satisfies:

J.L1l'1 - A1l'O = 0, J.L1l'n+1 - (A + J.L)1l'n + A1l'n- l = 0,

-J.L1l'N + A1l'N- l = 0, l ::; n < N,

a system of equations with solution 1l'n = 1l'o (AI J.L)n for 0 ::; n ::; N, where (if A "I J.L)

N 1 (AI )N+l

1l'- 1 = '"' (AI )n = - J.L o � J.L 1 - (AIJ.L)

387

Page 397: One Thousand Exercises in Probability

[11.8.2]-[11.8.3] Solutions

Now let k = 2. The queue is a birth-death process with rates

{ A if i < N, A· -1 - 0 if i ?:. N,

{ JL if i = 1 , JLi = . . 2JL If I ?:. 2.

Queues

It is reversible in equilibrium, and its stationary distribution satisfies A(lfi = JLi+l lfi+l . We deduce that lfi = 2pilfO for 1 � i � N, where p = A/(2JL) and

N 1rO

I = 1 + L 2p i.

i= 1

2. The answer is obtainable in either case by following the usual method. It is shorter to use the fact that such processes are reversible in equilibrium. (a) The stationary distribution 1C satisfies 1rnAp(n) = 1rn+l JL for n ?:. 0, whence 1rn = 1ropn/n ! where P = A/JL. Therefore 1rn = pne-p In ! . (b) Similarly,

where

n- l 1rn = 1rOpn II p(m) = 1rOpn2- �n(n- 1 ) ,

m=O

00 I 1rO

I = L pn ( � ) 2;n (n- l ) . n=O

n ?:. 0,

At the instant of arrival of a potential customer, the probability q that she joins the queue is obtained by conditioning on its length:

00 00 I 00 I 1 q = L p(n)1rn = 1ro L pn2-n- zn(n- l ) = 1ro L pn2- zn(n+l ) = 1rO- {1rol - I } .

n=O n=O n=O p

3. First method. Let (Ql , Q2) be the queue-lengths, and suppose they are in equilibrium. Since Q 1 is a birth-death process, it is reversible, and we write c h (t) = Q 1 ( -t) . The sample paths of Q l have increasing jumps of size 1 at times of a Poisson process with intensity A; these jumps mark arrivals at the cash desk. By reversibility, 0 1 has the same property; such increasing jumps for 0 1 are decreasing jumps for Q 1 , and therefore the times of departures from the cash desk form a Poisson process with intensity A. Using the same argument, the quantity Q I (t) together with the departures prior to t have the same joint distribution as the quantity 0 1 (-t) together with all arrivals after -to However 0 1 (-t) is independent of its subsequent arrivals, and therefore Ql (t) is independent of its earlier departures .

It follows that arrivals at the second desk are in the manner of a Poisson process with intensity A, and that Q2 (t) is independent of Q l (t) . Departures from the second desk form a Poisson process also.

Hence, in equilibrium, Q I is M(A)IM(JLI )11 and Q2 is M(A)IM(JL2)11 , and they are independent at any given time. Therefore their joint stationary distribution is

1rmn = IP' (Q l (t) = m , Q2 (t) = n) = ( 1 - Pl ) ( l - P2)Pi"P2

where Pi = A/ JLi . Second method. The pair (Q l (t) , Q2 (t)) is a bivariate Markov chain. A stationary distribution (1rmn : m , n ?:. 0) satisfies

m , n ?:. l ,

388

Page 398: One Thousand Exercises in Probability

Problems Solutions [11.8.4]-[11.8.5]

together with other equations when m = 0 or n = O. It is easily checked that these equations have the solution given above, when Pi < 1 for i = 1 , 2. 4. Let Dn be the time of the nth departure, and let Q n = Q (Dn + ) be the number of waiting customers immediately after Dn . We have in the usual way that Qn+1 = An + Qn - h (Qn ) , where An is the number of arrivals during the (n + l) th service time, and h (x) = min{x , m } . Let G(s) = L:�o 1l'iS i be the equilibrium probability generating function of the Qn . Then, since Q n is independent of An ,

where E(SAn ) = lX> eAu (s- l ) fS (u) du = MS (J... (s - 1 )) ,

M s being the moment generating function of a service time, and

Combining these relations, we obtain that G satisfies

smG(s) = Ms (J... (s - 1 ) ) {G(S) + t (Sm - S i )1l'i } ' 1=0

whenever it exists. Finally suppose that m = 2 and Ms«(}) = p,/(p, - (}) . In this case,

G(s) = p,{1l'o (s + 1) + 1l'I S } p,(s + 1 ) - J...s2

Now G(I ) = 1 , whence p,(21l'0 + 1l'1 ) = 2p, - J... ; this implies in particular that 2p, - J... > O. Also G(s) converges for I s I :::: 1 . Therefore any zero of the denominator in the interval [- 1 , 1] is also a zero of the numerator. There exists exactly one such zero, since the denominator is a quadratic which takes the value -J... at s = - 1 and the value 2p, - J... at s = 1 . The zero in question is at

and it follows that 1l'0 + (1l'0 + 1l'1 )so = O. Solving for 1l'0 and 1l'1 , we obtain

I - a G(s) = -- , 1 - as

where a = 2J.../{p, + V p,2 + 4J...p,} . 5 . Recalling standard MlG/1 theory, the moment generating function MB satisfies

whence M B (s) is one of (J... + p, - s) ± V(J... + p, - s)2 - 4J...p,

2J...

389

Page 399: One Thousand Exercises in Probability

[11.8.6]-[11.8.7] Solutions Queues

Now M B (s) is non-decreasing in s , and therefore it is the value with the minus sign. The density function of B may be found by inverting the moment generating function; see Feller ( 197 1 , p. 482), who has also an alternative derivation of M B .

As for the mean and variance, either differentiate MB , or differentiate (*). Following the latter route, we obtain the following relations involving M (= MB) :

2AM M' + M + (s - A - I-L)M' = 0 , 2AM M" + 2A(M,)2 + 2M' + (s - A - I-L)M" = O .

Set s = O to obtain M'(O) = (I-L-A)- l and M"(O) = 21-L(I-L-A)-3 , whence the claims are immediate.

6. (i) This question is closely related to Exercise ( 1 1 .3 . 1 ) . With the same notation as in that solution, we have that

where h (x) = min{ l , x } . Taking expectations, we obtain lP'(Qn > 0) = lE(An) where

lE(An ) = lX> lE(An I S = s) dFs (s) = AlE(S) = p ,

and S is a typical service time. Square (*) and take expectations to obtain

p ( 1 - 2p) + lE(A�+1 ) lE(Qn) = 2( 1 _ p) ,

where lE(A� ) is found (as above) to equal p + A 2lE(S2) . (ii) If a customer waits for time W and i s served for time S, he leaves behind him a queue-length which is Poisson with parameter A (W + S) . In equilibrium, its mean satisfies AlE(W + S) = lE(Qn) , whence lE(W) i s given as claimed. (iii) lE(W) is a minimum when lE(S2) is minimized, which occurs when S is concentrated at its mean. Deterministic service times minimize mean waiting time.

7. Condition on arrivals in (t, t+h) . If there are no arrivals, then Wt+h ::: x if and only if Wt ::: x+h .

If there i s an arrival, and his service time i s S, then Wt+h ::: x if and only if Wt ::: x + h - S. Therefore

rx+h F(x ; t + h) = ( 1 - Ah )F (x + h ; t) + Ah

Jo F(x + h - s ; t) dFs (s) + o(h) .

Subtract F(x ; t ) , divide by h , and take the limit a s h -!- 0 , to obtain the differential equation. We take Laplace-Stieltjes transforms. Integrating by parts, for () ::: 0,

and therefore

r /Jx dh (x) = -h (O) - (} {Mu «(}) - H(O) } , J(O,oo) r /Jx dH(x) = Mu «(}) - H(O) ,

J(O,oo) r eOx dlP'(U + S ::: x) = Mu «(})Ms «(}) ,

J(O,oo)

0 = -h (O) - (} {Mu «(}) - H(O) } + AH(O) + AMU «(}) {Ms «(}) - I } .

390

Page 400: One Thousand Exercises in Probability

Problems Solutions [11.8.8]-[11.8.10]

Set () = 0 to obtain that h (O) = AH(O) , and therefore

Take the limit as () -+ 0, using L'Hopital's rule, to obtain H(O) = 1 - AE(S) = I - p . The moment generating function of U is given accordingly. Note that Mu is the same as the moment generating function of the equilibrium distribution of actual waiting time. That is to say, virtual and actual waiting times have the same equilibrium distributions in this case.

8. In this case U takes the values I and -2 each with probability 1 (as usual, U = S - X where S and X are typical (independent) service and interarrival times). The integral equation for the limiting waiting time distribution function F becomes

F(O) = 1 F(2) , F(x) = 1 {F(x - 1 ) + F(x + 2) } for x = I , 2, . . . .

The auxiliary equation is (}3 - 2() + 1 = 0, with roots 1 and - � ( 1 ± .J5). Only roots lying in [- 1 , 1 ] can contribute, whence

F(x) = A + B ( -1 ; .J5r for some constants A and B . Now F(x) -+ 1 as x -+ 00, since the queue is stable, and therefore A = 1 . Using the equation for F(O) , we find that B = 1 ( 1 - .J5) . 9. Q is a M(A)IM(JL)/oo queue, otherwise known as an imrnigration-death process (see Exercise (6. 1 1 .3) and Problem (6. 1 5 . 1 8)). As found in (6. 1 5 . 1 8) , Q (t) has probability generating function

where p = A/ JL. Hence

E(Q(t)) = I e-JLt + p ( 1 - e-JLt ) ,

IP'(Q (t) = 0 ) = (1 - e-JLt )1 exp{ -p(1 - e-JLt ) } , 1 _

IP'(Q(t) = n) -+ _pne P n ! as t -+ 00.

If E(l) and E(B) denote the mean lengths of an idle period and a busy period in equilibrium, we have that the proportion of time spent idle is E(l)/{E(l) + E(B)} . This equals limt-HlO IP'(Q(t) = 0) = e-p . Now E(l) = A-I , by the lack-of-memory property of the arrival process, so that E(B) = (eP - 1)/A.

10. We have in the usual way that

Q (t + 1) = At + Q (t) - min{ l , Q (t)}

where At has the Poisson distribution with parameter A. When the queue is in equilibrium, E(Q(t)) = E(Q(t + 1) ) , and hence

IP'(Q(t) > 0) = E (min{ l , Q (t) }) = E(At ) = A .

We have from (*) that the probability generating function G (s) of the equilibrium distribution of Q (t) (= Q) is

G(s) = E(sAt )E(s Q-min{ I , Q} ) = eA(s- I ) {E(s Q-l I{ Q� l } ) + IP'(Q = O) } .

391

Page 401: One Thousand Exercises in Probability

[11.8.11]-[11.8.13] Solutions

Also, G(s) = lE(s Q I{Q::: l } ) + IP'(Q = 0) ,

and hence

G(s) = eA(s- l ) { �G(S) + ( 1 - �) ( 1 - A) }

whence ( 1 - s ) ( 1 - A) G(s) = A( 1 ) . 1 - se- s-

Queues

The mean queue length is G'( I ) = iA(2 - A)/ ( I - A) . Since service times are of unit length, and arrivals form a Poisson process, the mean residual service time of the customer in service at an arrival time is i , so long as the queue is non-empty. Hence

lE(W) = lE(Q) - ilP'(Q > 0) = A

2(1 - A)

11. The length B of a typical busy period has moment generating function satisfying M B (s) =

exp{s - A + AMB (S) } ; this fact may be deduced from the standard theory of M/G/l , or alternatively by a random-walk approach. Now T may be expressed as T = I + B where I is the length of the first idle period, a random variable with the exponential distribution, parameter A . It follows that MT (S) = AMB (S)/(A - s) . Therefore, as required,

(A - S )MT (S) = A exp{s - A + (A - s )MT (S) } .

If A � 1 , the queue .. length at moments of departure is either null persistent or transient, and it follows that lE(T) = 00. If A < 1 , we differentiate (*) and set s = 0 to obtain AlE(T) - 1 = A 2lE(T), whence lE(T) = {A( I - A)}- I .

12. (a) Q is a birth-death process with parameters Ai = A, p'i = p" and is therefore reversible in equilibrium; see Problems (6. 1 5 . 1 6) and ( 1 1 .8 .3) . (b) The equilibrium distribution satisfies A7Ci = W'i+ l for i � 0, whence 7Ci = ( 1 - p)pi where p = A/ p,. A typical waiting time W is the sum of Q independent service times, so that

Mw(s) = GQ (Ms (s)) = 1 - P

= ( 1 - p) (p, - s) .

1 - pp,/(p, - s) p,(1 - p) - s

(c) See the solution to Problem ( 1 1 .8 .3) . (d) Follow the solution to Problem ( 1 1 .8 .3) (either method) to find that, at any time t in equilibrium, the queue lengths are independent, the jth having the eqUilibrium distribution of M(A)/M(p,j )/1 . The joint mass function is therefore

where Pj = A/p,j .

13. The size of the queue is a birth-death process with rates Ai = A, P,i = P, min{i, k} . Either solve the equilibrium equations in order to find a stationary distribution 7C , or argue as follows. The process is reversible in eqUilibrium (see Problem (6. 1 5 . 1 6)), and therefore Ai7Ci = P,i+I 7Ci+l for all i . These 'balance equations ' become

392

if 0 ::::: i < k, if i � k.

Page 402: One Thousand Exercises in Probability

Problems Solutions [11.8.14]-[11.8.14]

These are easily solved iteratively to obtain

{ 7roaJl i ! if O :s i :s k,

7rj = 7ro(a/ k)

j kk I k ! if i ?:. k

where a = AI J.t. Therefore there exists a stationary distribution if and only if A < kJ.t, and it is given accordingly, with

The cost of having k servers is

where 7r0 = 7ro(k) . One finds, after a little computation, that

Therefore

Ba Cl = A + -- , I - a 2Ba2

C2 = 2A + --2 ' 4 - a

a3(A - B) + a2 (2B - A) - 4a (A + B) + 4A C2 - Cl =

( 1 _ a)(4 _ (2) Viewed as a function of a, the numerator is a cubic taking the value 4A at a = ° and the value -3B at a = 1 . This cubic has a unique zero at some a* E (0, 1 ) , and Cl < C2 if and only if ° < a < a*. 14. The state of the system is the number Q (t ) of customers within it at time t . The state 1 may be divided into two sub-states, being al and a2 , where aj is the state in which server i is occupied but the other server is not. The state space is therefore S = {O, ai , a2 , 2, 3 , . . . } .

The usual way of finding the stationary distribution, when i t exists, i s to solve the equilibrium equations. An alternative is to argue as follows. If there exists a stationary distribution, then the process is reversible in equilibrium if and only if

for all sequences i i , i2 , . . . , ik of states, where G = (guv )u . v eS is the generator of the process (this may be shown in very much the same way as was the corresponding claim for discrete-time chains in Exercise (6.5 .3); see also Problem (6. 1 5 . 1 6» . It is clear that (*) is satisfied by this process for all sequences of states which do not include both al and a2 ; this holds since the terms guv are exactly those of a birth-death process in such a case. In order to see that (*) holds for a sequence containing both al and a2 , it suffices to perform the following calculation:

gO,uI gUI ,2g2,u2gu2 .0 = (1A)AJ.t2J.tl = gO,u2gu2 , 2g2,uI gUI ,0 ' Since the process is reversible in eqUilibrium, the stationary distribution 7r satisfies 7ruguv

7rvgvu for all u , v E S, U =1= v. Therefore

and hence

7ruA = 7ru+ l (J.t l + J.t2 ) , u ?:. 2 ,

A2 ( A ) U-2 for u ?:. 2. 7ru = 2J.tlJ.t2 J.t l + J.t2

7r0

393

Page 403: One Thousand Exercises in Probability

[11.8.15]-[11.8.17] Solutions Queues

This gives a stationary distribution if and only if A < P,l + P,2 , under which assumption Jro is easily calculated.

A similar analysis is valid if there are s servers and an arriving customer is equally likely to go to any free server, otherwise waiting in turn. This process also is reversible in equilibrium, and the stationary distribution is similar to that given above.

15. We have from the standard theory that Q/L has as mass function Jrj = ( I - 11)l1j , j :::: 0, where 11 is the smallest positive root of the equation x = e/L(x- l) . The moment generating function of ( 1 - p,- l ) Q/L is

M/L (O) = E (exp{O ( l - p,- l ) Q/L }) = 1 - 11

- I . I - l1eO ( l-/L )

Writing p, = 1 + f, we have by expanding e/L(t}- l) as a Taylor series that 11 = l1 (f) = 1 - 2f + O(f) as f -I- 0. This gives

M (0) = 2f + O(f)

= 2f + O(f)

� _2_ /L 1 - ( 1 - 2f) ( I + Of) + O(f) (2 - O )f + O(f) 2 - 0

as f -I- 0, implying the result, by the continuity theorem.

16. The numbers P (of passengers) and T (of taxis) up to time t have the Poisson distribution with respective parameters Jrt and r:t . The required probabilities Pn = lP'(P = T + n) have generating function

n=-oo

00 00 L L lP'(P = m + n)lP'(T = m)zn

n=-oo m=O 00

= L lP'(T = m)z-mG p (z) m=O

= GT (z- l )Gp (z) = e- (1T+T)te (1TZ+TZ- 1 )t ,

in which the coefficient of zn is easily found to be that given.

17. Let N(t) be the number of machines which have arrived by time t . Given that N(t) = n , the times Tl , T2 , . . . , Tn of their arrivals may be thought of as the order statistics of a family of independent uniform variables on [0, t ] , say VI , V2 , . . . , Vn ; see Theorem (6. 12.7). The machine which arrived at time Vj is, at time t ,

in the x-stage } { a(t) in the Y -stage with probability f3(t)

repaired 1 - a(t) - f3 (t)

where a(t) = lP'(V + X > t) and f3(t) = lP'(V + X � t < V + X + y) , where V is uniform on [0, t], and (X, Y) is a typical repair pair, independent of V . Therefore

lP' (V(t) = . V (t) = k I N(t) = n ) = n ! a (t)j f3 (t)k ( I - a(t) - f3 (t))n-k-j

) , · ' k ' ( - · - k) ' ' ) . . n ) .

implying that

00 e-J..t (At)n lP' (V(t) = j, V(t) = k) = L , lP' (V(t) = j, V(t) = k I N(t) = n)

O n . n=

. , ) .

394

k !

Page 404: One Thousand Exercises in Probability

Problems Solutions [11.8.18]-[11.8.19]

18. The maximum deficit Mn seen up to and including the time of the nth claim satisfies

Mn = max {Mn-l , t(Kj - Xj ) } = max{O, Ul , Ul + U2 , · · · , Ul + U2 + . . . + Un } , J=l

where the Xj are the inter-claim times, and Uj = Kj - Xj . We have as in the analysis of GIGll that Mn has the same distribution as Vn = max{O, Un , Un + Un- l , . . . , Un + Un-l + . . . + Ul l , whence Mn has the same distribution as the (n + l)th waiting time in a M(A)/G/l queue with service times Kj and interarrival times Xj . The result follows by Theorem ( 1 1 . 3 . 16) .

19. (a) Look for a solution to the detailed balance equations A7fi = ( i + l )JL7fHl , ° ::::: i < s , to find that the stationary distribution is given by 7fi = (pi / i ! )7fO . (b) Let Pc be the required fraction. We have by Little's theorem ( 10.5 . 1 8) that

A(7fc- l - 7fc) Pc = = P (7fc- l - 7fe> , c ::: 2 ,

JL

and P l = 7fl , where tfs is the probability that channels 1 , 2, . . . , s are busy in a queue MlM/s having the property that further calls are lost when all s servers are occupied.

395

Page 405: One Thousand Exercises in Probability

12 Martingales

12.1 Solutions. Introduction

1. (i) We have that E(Ym ) = E{E(Ym+l I J='m) } = E(Ym+l ) , and the result follows by induction. (ii) For a submartingale, E(Ym) � E{E(Ym+l I J='m) } = E(Ym+l ) , and the result for supermartingales follows similarly.

2. We have that

if m ::: I , since J='n � J='n+m- l . Iterate to obtain E(Yn+m I J='n) = E(Yn I J='n) = Yn · 3. (i) ZnJ-L-n has mean I , and

"" (Z -(n+ l) I (J"" ) -(n+ l)1L' (Z I (J"" ) -nZ ." n+ lJ-L .rn = J-L ." n+l .rn = J-L n ,

where J='n = cr (Zl , Z2 , . . . , Zn) . (ii) Certainly T/Zn � 1 , and therefore i t has finite mean. Also,

where the Xi are independent family sizes with probability generating function G. Now G (T/) = T/, and the claim follows.

4. (i) With Xn denoting the size of the nth jump,

where J='n = cr (Xl , X2 , . . . , Xn) . Also E I Sn l � n, so that {Sn } is a martingale. (ii) Similarly E(S�) = var(Sn ) = n, and

(iii) Suppose the walk starts at k, and there are absorbing barriers at 0 and N (::: k). Let T be the time at which the walk is absorbed, and make the assumptions that E(ST) = So, E(Si- - T) = S5 . Then the probability Pk of ultimate ruin satisfies o . Pk + N . (1 - Pk) = k,

396

Page 406: One Thousand Exercises in Probability

Introduction Solutions [12.1.5]-[12.1.9]

and therefore Pk = 1 - (k/ N) and lE(T) = keN - k) .

5. (i) By Exercise ( 12 . 1 .2), for r ::: i ,

lE(Yr Yj ) = lE{lE(Yr Yj I :J=i ) } = lE{YjlE(Yr I :J=i ) } = lE(yh ,

an answer which is independent of r . Therefore

if i ::::: j ::::: k.

(li) We have that

lE{ (Yk - Yj )2 1 :J=i } = lE(Yf l :J=i) - 2lE(Yk Yj I :J=i) + lE(Y/ 1 :Ft ) .

Now lE(YkYj 1 :Ft) = lE{lE(YkYj 1 Jj) � J='j} = lE(YJ I J='j) , and the c1aim follows. (iii) Taking expectations of the last conclusion,

j ::::: k.

Now {lE(Y;) : n ::: 1 } is non-decreasing and bounded, and therefore converges. Therefore, by (*) , { Yn : n ::: 1 } is Cauchy convergent in mean square, and therefore convergent in mean square, by Problem (7. 1 1 . 1 1 ) .

6. (i) Using Jensen's inequality (Exercise (7.9.4)),

lE (U(Yn+l ) I J='n ) ::: u (lE(Yn+l I J='n ) ) = u(Yn) .

(li) It suffices to note that lx i , x2 , and x+ are convex functions of x ; draw pictures if you are in doubt about these functions.

7. (i) This follows just as in Exercise ( 12 . 1 .6), using the fact that u {lE(Yn+ l I J='n ) } ::: u (Yn ) in this case. (ii) The function x+ is convex and non-decreasing. Finally, let {Sn : n ::: O} be a simple random walk whose steps are + 1 with probability P (= 1 - q > ! ) and - 1 otherwise. If Sn < 0, then

lE ( I Sn+l l l J='n ) = P( ISn l - 1 ) + q ( ISn l + 1) = I Sn l - (p - q) < I Sn l ;

note that IP'(Sn < 0) > 0 if n ::: 1 . The same example suffices in the remaining case.

S. Clearly lEli.. -n1/! (Xn) 1 ::::: i.. -n sup{ I1/! (j ) 1 : j E S} . Also,

lE(1/! (Xn+} ) I J='n) = L PXn , j 1/! (j) ::::: i..1/! (Xn ) j eS

where J='n = cr (X 1 , X2 , . . . , Xn) . Divide by i..n+1 to obtain that the given sequence is a supermartin­gale.

9. Since var(Zl ) > 0, the function G, and hence also Gn , is a strictly increasing function on [0, 1 ] . Since 1 = Gn+l (Hn+l (s)) = Gn (G(Hn+l (s))) and Gn (Hn (s)) = 1 , we have that G(Hn+l (s)) = Hn (s) . With J='m = cr (Zk : 0 ::::: k ::::: m) ,

lE (Hn+l (s )Zn+l I J='n ) = G (Hn+l (S))Zn = Hn (s )Zn .

397

Page 407: One Thousand Exercises in Probability

[12.2.1]-[12.3.2] Solutions Martingales

12.2 Solutions. Martingale differences and Hoeffding's inequality

1. Let J'f = a ({Vj ' Wj : I ::: j ::: i }) and Yi = lE(Z I J'f) . With Z(j) the maximal worth attainable without using the jth object, we have that

lE (Z(j) I :Fj ) = lE (Z(j) I :Fj-d , Z(j) ::: Z ::: Z(j) + M.

Take conditional expectations of the second inequality, given :Fj and given :Fj - 1 , and deduce that I Yj - Yj- l l ::: M. Therefore Y is a martingale with bounded differences, and Hoeffding's inequality yields the result.

2. Let J'f be the a-field generated by the (random) edges joining pairs (va , Vb) with I ::: a , b ::: i , and let Xi = lE(X I J'f) . We write X (j) for the minimal number of colours required in order to colour each vertex in the graph obtained by deleting Vj . The argument now follows that of the last exercise, using the fact that X (j) ::: X ::: X (j) + 1 .

12.3 Solutions. Crossings and convergence

1. Let Tl = rnin{n : Yn � b} , T2 = rnin{n > Tl : Yn ::: a } , and define Tk inductively by

T2k- l = rnin{n > T2k-2 : Yn � b} , T2k = min{n > T2k- l : Yn ::: a } .

The number of downcrossings by time n i s Dn (a , b ; Y) = max{k : T2k ::: n} . (a) Between each pair of upcrossings of [a , b] , there must be a downcrossing, and vice versa. Hence I Dn (a , b ; Y) - Un (a , b ; Y) I ::: 1 . (b) Let Ii be the indicator function of the event that i E (T2k- l ' T2k] for some k , and let

n Zn = L Ii (Yj - Yi- l ) , n � O.

i= 1

It is easily seen that Zn ::: - (b - a)Dn (a , b; Y) + (Yn - b)+ ,

whence

Now Ii is J'f_ l -measurable, since

Therefore,

{Ii = I } = U ({T2k- l ::: i - I } \ {T2k ::: i - ln · k

since In � 0 and Y is a submartingale. It follows that lE(Zn ) � lE(Zn- l ) � . . . � lE(Zo) = 0, and the final inequality follows from (*). 2. If Y is a supermartingale, then -Y is a submartingale. Upcrossings of [a , b] by Y correspond to downcrossings of [-b, -a] by -Y, so that

lE{(-Yn + a)+ } lE{(Yn - a)- } lEUn (a , b ; Y) = lEDn (-b , -a ; -Y) ::: b = b ' - a - a

398

Page 408: One Thousand Exercises in Probability

Stopping times Solutions [12.3.3]-[12.4.5]

by Exercise (12 .3 . 1 ) . If a, Yn :::: 0 then (Yn - a)- ::s a .

3 . The random sequence {1/I (Xn ) : n :::: 1 } i s a bounded supermartingale, which converges a.s . to some limit Y. The chain is irreducible and persistent, so that each state is visited infinitely often a.s . ; it follows that limn-+oo 1/1 (Xn ) cannot exist (a.s . ) unless 1/1 is a constant function.

4. Y is a martingale since Yn is the sum of independent variables with zero means. Also r:l'" lP'( Zn i= 0) = r:l'" n-2 < 00, implying by the Borel-Cantelli lemma that Zn = 0 except for finitely many values of n (a.s .) ; therefore the partial sum Yn converges a.s. as n -+ 00 to some finite limit.

It is easily seen that an = San- l and therefore an = 8 · sn-2 , if n :::: 3. It follows that I Yn I :::: ian if and only if I Zn I = an . Therefore

which tends to infinity as n -+ 00.

12.4 Solutions. Stopping times

1. We have that

n {TI + T2 = n} = U (fTl = k} n { T2 = n - k}) ,

k=O { max{TI ' T2 } ::s n } = {TI ::s n} n {T2 ::s n} , { min{TI ' T2 } ::s n } = {TI ::s n} U {T2 ::s n} .

Each event on the right-hand side lies in :Fn .

2. Let :Fn = a (XI , X2 , . . . , Xn) and Sn = Xl + X2 + . . . + Xn . Now

{N(t) + 1 = n} = {Sn- l ::s t} n {Sn > t} E :Fn .

3. (Y+ , s:) is a submartingale, and T = min{k : Yk :::: x} is a stopping time. Now 0 ::s T /\ n ::s n , so that lE(Yci ) ::s lE(Yil\n) ::s lE(Y,t ) , whence

4. We may suppose that lE(Yo) < 00. With the notation of the previous solution, we have that

5. It suffices to prove that lEYs ::s lEYT , since the other inequalities are of the same form but with different choices of pairs of stopping times. Let lm be the indicator function of the event {S < m ::s T} , and define n

Zn = L Im (Ym - Ym- l ) , O ::S n ::s N. m=l

Note that 1m is :Fm_I -measurable, so that

399

Page 409: One Thousand Exercises in Probability

[12.4.6]-[12.5.2] Solutions Martingales

since Y is a submartingale. Therefore E(ZN) 2: E(ZN- I ) 2: . . . 2: E(Zo) = O. On the other hand, ZN = YT - Ys , and therefore E(YT ) 2: E(Ys) ·

6. De Moivre's martingale is Yn = (q j p)Sn , where q = 1 - p. Now Yn 2: 0, and E(Yo) = 1 , and the maximal inequality gives that

lP' ( max Sm 2: x) = lP' ( max Ym 2: (qjP)x) ::: (pjq)x . O�m�n O�m�n

Take the limit as n -+ 00 to find that Soo = sUPm Sm satisfies

00 E(Soo) = L lP'(Soo 2: x) ::: -p- .

x=I q - P

We can calculate E(Soo) exactly as follows. It is the case that Soo 2: x if and only if the walk ever visits the point x , an event with probability fX for x 2: 0, where f = pjq (see Exercise (5 .3 . 1 » . The inequality of (*) may be replaced by equality.

7. (a) First, 0 n {T ::: n } = 0 E :Fn . Secondly, if A n {T ::: n } E :Fn then

AC n {T ::: n } = {T ::: n} \ (A n {T ::: n }) E :Fn .

Thirdly, if AI , A2 , ' " satisfy Ai n {T ::: n } E :Fn for each i , then

(l) Ai) n {T ::: n } = l) (Ai n {T ::: n}) E :Fn · I I

Therefore :FT is a a-field. For each integer m, it is the case that

{ {T < n} {T < m} n {T < n} = -- - {T ::: m}

an event lying in :Fn . Therefore {T ::: m } E :FT for all m. (b) Let A E :Fs . Then, for any n ,

n

if m > n , if m ::: n ,

(A n {S ::: T}) n {T ::: n } = U (A n {S ::: m }) n {T = m} , m=O

the union of events in :Fn, which therefore lies in :Fn . Hence A n {S ::: T} E :FT . (c) We have {S ::: T } = n, and (b) implies that A E :FT whenever A E :Fs .

12.5 Solutions. Optional stopping

1. Under the conditions of (a) or (b), the family {YT An : n 2: O} is uniformly integrable. Now T /\ n -+ T as n -+ 00, so that YT An -+ YT a.s. Using uniform integrability, E(YT An ) -+ E(YT ) , and the claim follows by the fact that E(YT An ) = E(Yo) .

2. It suffices to prove that {YT An : n 2: O} i s uniformly integrable. Recall that {Xn : n 2: O} is uniformly integrable if

400

Page 410: One Thousand Exercises in Probability

Optional stopping Solutions [12.5.3]-[12.5.5]

(a) Now,

E ( I YTAn I I{ l YTAn I �a } ) = E ( I YT I I{T:::n , I YT I�a } ) + E ( I Yn I I{T>n, I Yn l�a } ) :::: E ( I YT I I{ I YT I �a} ) + E ( I Yn I I{T>n} ) = g(a) + h en) ,

say. We have that g(a) � 0 as a � 00, since E I YT I < 00 . Also h en) � 0 as n � 00 ,

so that sUPn>N h en) may be made arbitrarily small by suitable choice of N. On the other hand, E ( I Yn I I{ I Yn l�a} ) � 0 as a � 00 uniformly in n E {O, 1 , . . . , N} , and the claim follows.

(b) Since Y;i defines a submartingale, we have that supn E(YiAn) :::: sUPn E(Y;i ) < 00, the second inequality following by the uniform integrability of {Yn } . Using the martingale convergence theorem, YT lin � YT a.s. where E I YT 1 < 00. Now

Also lP'(T > n) � 0 as n � 00, so that the final two terms tend to 0 (by the uniform integrability of

the Yi and the finiteness of EI YT 1 respectively) . Therefore YT An � YT , and the claim follows by the standard theorem (7. 10.3).

3. By uniform integrability, Y 00 = limn--+oo Yn exists a.s. and in mean, and Yn = E(Y 00 1 .'Fn). (a) On the event {T = n} it is the case that YT = Yn and E(Y 00 1 .'FT ) = E(Y 00 1 .'Fn) ; for the latter statement, use the definition of conditional expectation. It follows that YT = E(Y 00 1 .'FT) , irrespective of the value of T. (b) We have from Exercise ( 12 .4.7) that .'Fs � .'FT . Now Ys = E(Yoo 1 .'Fs) = E{E(Yoo 1 .'FT) 1 .'Fs) = E(YT 1 .'Fs) ·

4. Let T be the time until absorption, and note that {Sn } is a bounded, and therefore uniformly integrable, martingale. Also JP'(T < 00) = 1 since T is no larger than the waiting time for N consecutive steps in the same direction. It follows that E(So) = E(ST) = NJP'(ST = N), so that JP'(ST = N) = E(So)/ N. Secondly, {S� - n : n 2': O} is a martingale (see Exercise ( 12. 1 .4» , and the optional stopping theorem (if it may be applied) gives that

and hence E(T) = NE(So) - E(S5) as required. It remains to check the conditions of the optional stopping theorem. Certainly JP'(T < 00) = 1 ,

and in addition E(T2) < 00 by the argument above. We have that E I Si- - T I :::: N2 + E(T) < 00 .

Finally, E { (S� - n) I{T>n) } :::: (N2 + n)JP'(T > n) � 0

as n � 00, since E(T2) < 00 .

5. Let .'Fn = O' (Sl , S2 , . . . , Sn ) . I t is immediate from the identity cos(A + A) + cos(A - A) = 2 cos A cos A that

COS[A (Sn + 1 - ! (b - a))] + COS[A (Sn - 1 - i (b - a))] E(Yn+1 1 .'Fn) =

2(COS A)n+1 = Yn ,

and therefore Y is a martingale (it is easy to see that E I Yn 1 < 00 for all n). Suppose that 0 < A < rr/(a + b), and note that 0 :::: IA {Sn - i (b - a)} 1 < iA(a + b) < irr for

n :::: T . Now YT An constitutes a martingale which satisfies

401

Page 411: One Thousand Exercises in Probability

[12.5.6]-[12.5.8] Solutions Martingales

If we can prove that lE{(cos A)-T } < 00, it will follow that {YT An } is uniformly integrable. This will imply in turn that lE(YT ) = limn-+oo lE(YT An ) = lE(Yo) , and therefore

Cos{ �A(a + b)}lE{ (COS A)-T } = cos{ �A(b - a)}

as required. We have from (*) that

Now T /\ n --+ T as n --+ 00, implying by Fatou's lemma that

lE{ (COS A)-T } ::::: lE(Yo)

= Cos{ �A(a - b)}

. cos{ �A(a + b) } cos{ �A(a + b) }

6 . (a) The occurrence of the event {U = n } depends on S1 > S2 , . . . , Sn only, and therefore U i s a stopping time. Think of U as the time until the first sequence of five consecutive heads in a sequence of coins tosses. Using the renewal-theory argument of Problem ( 10.5 . 1 7), we find that lE(U) = 62. (b) Knowledge of Sl , S2 , . . . , Sn is insufficient to determine whether or not V = n, and therefore V is not a stopping time. Now lE(V) = lE(U) - 5 = 57. (c) W is a stopping time, since it is a first-passage time. Also lE(W) = 00 since the walk is null persistent.

7. With the usual notation,

lE(Mm+n I :Fm) = lE (t Sr + Y: Sr - j(Sm+n - Sm + Sm)3 1 :Fm) r==O r=m+1

= Mm + nSm - SmlE{(Sm+n - Sm)2} = Mm + nSm - nSmlE(Xi ) = Mm .

Thus {Mn : n ::: O} is a martingale, and evidently T is a stopping time. The conditions of the optional stopping theorem ( 12.5 . 1 ) hold, and therefore, by a result of Example (3.9.6),

8. We partition the sequence into consecutive batches of a + b flips. If any such batch contains only l 's, then the game is over. Hence JP>(T > n (a + b)) ::::: { I - (� )a+b }n --+ 0 as n --+ 00. Therefore,

lE l si - T I ::::: lE(Si ) + lE(T) ::::: (a + b)2 + lE(T) < 00,

and as n --+ 00.

402

Page 412: One Thousand Exercises in Probability

Problems Solutions [12.7.1]-[12.9.2]

12.7 Solutions. Backward martingales and continuous-time martingales

1. Let s ::: t. We have that lE(l1 (X (t)) I :Fs , Xs = i) = l:.j Pij (t - s )l1 (j ) . Hence

�lE (l1 (X (t)) I :Fs , Xs = i ) = (Pt-sGrl')l· = 0, dt so that lE(l1 (X (t)) I :Fs , Xs = i) = l1 (i ) , which is to say that lE(l1 (X (t)) I :Fs ) = l1 (X (S) ) .

2. Let Wet) = exp{-ON(t) + At ( l - e-li) } where 0 2: 0. I t may be seen that Wet /\ Ta ) , t 2: 0 , constitutes a martingale. Furthermore

IW (t /\ Ta ) 1 ::: exp{A (t /\ Ta ) ( 1 - e-li ) } t exp{ATa O - e-li ) } as t � 00,

where, by assumption, the limit has finite expectation for sufficiently small positive 0 (this fact may be checked easily). In this case, (Wet /\ Ta ) : t 2: o} is uniformly integrable. Now Wet /\ Ta ) � W(Ta ) a.s. as t � 00, and it follows by the optional stopping theorem that

1 = lE(W(O)) = lE (W(t 1\ Ta )) � lE(W(Ta )) = e-lialE{eATa (l -e-O ) } .

Write s = e-Ii to obtain s-a = lE{eATa (l -S) } . Differentiate at s = 1 to find that a = AlE(Ta ) and a(a + 1) = A 2lE(T;) , whence the claim is immediate.

3. Let 9.m be the a -field generated by the two sequences of random variables Sm , Sm+ I . . . , Sn and Um+l , Um+2 , . . . , Un . It is a straightforward exercise in conditional density functions to see that

-I IoUm+2 (m + l )xm-1 m + 1 lE(Um+1 I 9.m+l ) = (U )m+ l dx = -U-- , o m+2 m m+2 whence lE(Rm I 9.m+l ) = Rm+1 as required. [The integrability condition is elementary.]

Let T = maxIm : Rm 2: 1 } with the convention that T = 1 if Rm < 1 for all m . As in the closely related Example ( 12.7.6), T is a stopping time. We apply the optional stopping theorem 02.7.5) to the backward martingale R to obtain that lE(RT I 9.n ) = Rn = Snit . Now, RT 2: 1 on the event {Rm 2: 1 for some m ::: n } , whence

� = lE(RT I Sn = y) 2: lP'(Rm 2: 1 for some m ::: n I Sn = y) . t

[Equality may be shown to hold. See Karlin and Taylor 198 1 , pages 1 10-1 13 , and Example 02.7.6).]

12.9 Solutions to problems

1. Clearly lE(Zn) ::: (fJ, + m)n , and hence lE lYn I < 00. Secondly, Zn+ I may be expressed as

l:.f�l Xi + A, where Xl , X2 , . · . are the family sizes of the members of the nth generation, and A is the number of immigrants to the (n + l )th generation. Therefore lE(Zn+1 I Zn) = fJ,Zn + m , whence

1 { ( 1 - fJ,n+l ) } lE(Yn+1 I Zn) = fJ,n+1 fJ,Zn + m 1 -

1 _ fJ, = Yn ·

2. Each birth in the (n + l) th generation is to an individual, say the sth, in the nth generation. Hence, for each r, B(n+I) , r may be expressed in the form B(n+ 1) , r = Bn,s + Bj (s ) , where Bj (s ) is the age of the parent when its jth child is born. Therefore

lE{ L e -liB(n+l ) , r I :Fn } = lE{ � e -li (Bn,s+Bj (s)) I :Fn } = L e-liBn ,s MI (0 ) , r S , ] s

403

Page 413: One Thousand Exercises in Probability

[12.9.3]-[12.9.5] Solutions

which gives that E(Yn+1 I :Fn) = Yn . Finally, E(YI (0)) = 1 , and hence E(Yn (O)) = 1 .

3 . If x , C > 0 , then

Martingales

Now (Yk + c)2 is a convex function of Yk . and therefore defines a submartingale (Exercise ( 12. 1 .7)). Applying the maximal inequality to this sub martingale, we obtain an upper bound ofE{ (Yn + c)2} / (x + c)2 for the right-hand side of (*). We set c = E(Y;)/x to obtain the result. 4. (a) Note that Zn = Zn-l + Cn {Xn - E(Xn I :Fn- l ) } , so that (Z, F) is a martingale. Let T be the stopping time T = rnin{k : q Yk � x} . Then E(ZT/\n ) = E(Zo) = 0, so that

since the final term in the defiuition of Zn is non-negative. Therefore

n xJP'(T ::::: n) ::::: E{CT/\nYT/\n } ::::: I'>kE{E(Xk I :Fk-d } ,

k=l

where we have used the facts that Yn � ° and E(Xk I :Fk- l ) � 0. The claim follows. (b) Let Xl , X 2 , . . . be independent random variables, with zero means and fiuite variances, and let Yj = �{=l Xi · Then Y} defines a non-negative submartingale, whence

5. The function h (u) = l u l r is convex, and therefore Yi (m ) = l Si - Sm l r , i � m , defines a submartingale with respect to the filtration :Fi = a ({ Xj : 1 ::::: j ::::: i }) . Apply the HRC inequality of Problem ( 12.9 .4), with q = 1 , to obtain the required inequality.

If r = 1 , we have that

m+n E ( I Sm+n - Sm l ) ::::: L EIZk l

k=m+l

by the triangle iuequality. Let m , n -+ 00 to find, in the usual way, that the sequence {Sn } converges a.s . ; Kronecker's lemma (see Exercise (7 .8 .2)) then yields the final claim.

Suppose 1 < r ::::: 2, in which case a little more work is required. The function h is differentiable, and therefore

h (v) - h (u) = (v - u)h' (u) + fov-u {h'(u + x) - h'(u) } dx .

Now h'(y) = r l y l r- l sign(y) has a derivative decreasing in I y l . It follows (draw a picture) that h ' (u +x) - h'(u) ::::: 2h' (1x) if x � 0, and therefore the above integral is no larger than 2h ( � (v - u)) . Apply this with v = Sm+k+l - Sm and u = Sm+k - Sm , to obtaiu

404

Page 414: One Thousand Exercises in Probability

Problems Solutions [12.9.6]-[12.9.10]

Sum over k and use the fact that

to deduce that m+n E ( I Sm+n - Smn :::; 22-r L E ( I Zkn ·

k=m+l The argument is completed as after (*). 6. With It = I{Yk=O} , we have that

E(Yn I J="'n- l ) = E (XnIn- l + nYn- l IXn l ( 1 - In- I ) I J="'n- l ) = In-lE(Xn ) + nYn- l ( 1 - In- I )EIXn l = Yn-l

since E(Xn) = 0, EIXn l = n-l . Also E IYn l :::; E{ IXn l ( 1 + n l Yn-l D } and E lYl l < 00, whence EI Yn l < 00. Therefore (Y, !F) is a martingale.

Now Yn = 0 if and only if Xn = O. Therefore lP'(Yn = 0) = lP'(Xn = 0) = 1 - n- 1 � 1 as n � 00, implying that Yn � O. On the other hand, Ln lP'(Xn =1= 0) = 00, and therefore lP'(Yn =1= 0 i.o.) = 1 by the second Borel-Cantelli lemma. However, Yn takes only integer values, and therefore Yn does not converge to 0 a.s. The martingale convergence theorem is inapplicable since sUPn E IYn l = 00.

7. Assume that t > 0 and M(t) = 1 . Then Yn = etSn defines a positive martingale (with mean 1 ) with respect to J="'n = a (X 1 , X2 , . . . , Xn) . By the maximal inequality,

and the result follows by taking the limit as n � 00.

S. The sequence Yn = ;Zn defines a martingale; this may be seen easily, as in the solution to Exercise ( 12. 1 . 1 5) . Now {Yn } is uniformly bounded, and therefore Yoo = limn-+oo Yn exists a.s. and satisfies E(Y

00) = E(Yo) = ; . Suppose 0 < ; < 1 . In this case Zl is not a.s. zero, so that Zn cannot converge a.s. to a constant

c unless C E {O, oo}. Therefore the a.s. convergence of Yn entails the a.s. convergence of Zn to a limit random variable taking values 0 and 00. In this case, E(Yoo) = 1 · lP'(Zn � 0) + 0 · lP'(Zn � (0), implying that lP'(Zn � 0) = ; , and therefore lP'(Zn � (0) = 1 - ; . 9. It is a consequence of the maximal inequality that lP'(Y; � x) :::; x- 1E(Yn I{Y,i2:x} ) for x > O. Therefore

E(Y;) = 1000 lP'(Y; � x) dx :::; 1 + 100 lP'(Y; � x) dx

:::; 1 + E {Yn 100 x- I 1(1 , Y,ij (x) dX }

= 1 + E(Yn log+ Y;) :::; 1 + E(Yn log+ Yn) + E(Y;)/e .

10. (a) We have, as in Exercise ( 12 .7 . 1) , that

E(h (X(t)) I B, X(s) = i ) = L Pij (t)h(j) for s < t , j

405

Page 415: One Thousand Exercises in Probability

[12.9.11]-[12.9.13] Solutions Martingales

for any event B defined in terms of (X(u) : u ::: s } . The derivative of this expression, with respect to t , is (PtGh/) j , where Pt is the transition semigroup, G is the generator, and h = (h (j) : j � 0) . In this case,

(Gh/)j = L gjkh (k) = Aj {h (j + 1) - h (j) } - J-Lj {h (j) - h (j - I ) } = 0 k

for all j . Therefore the left side of (*) is constant for t � s, and is equal to its value at time s, i.e. X (s) . Hence h (X(t)) defines a martingale. (b) We apply the optional stopping theorem with T = min{t : X (t) E to, n } } to obtain lE(h (X(T))) =

lE(h (X(O))) , and therefore ( I - :n: (m))h (n) = h (m) as required. It is necessary but not difficult to check the conditions of the optional stopping theorem. 11. (a) Since Y is a submartingale, so is y+ (see Exercise ( 12. 1 .6)). Now

Therefore (lE(Y,i+m I :Fn) : m � O} is (a.s .) non-decreasing, and therefore converges (a.s.) to a limit Mn . Also, by monotone convergence of conditional expectation,

and furthermore lE(Mn) = limm-+oo lE(Y';;+n ) ::: M. It is the case that Mn is :Fn-measurable, and therefore it is a martingale. (b) We have that Zn = Mn - Yn is the difference of a martingale and a submartingale, and is therefore a supeI1lUllj:ingale. Also Mn � Y;t � 0, and the decomposition for Yn follows. (c) In thi; case Zn is a martingale, being the difference oftwo martingales. Also Mn � lE(Y;t I :Fn) =

Y;t � Yn a.s . , and the claim follows. 12. We may as well assume that J-L < P since the inequality is trivial otherwise. The moment generating function of P - Cl is M(t) = /(P-/L)+!u2t2 , and we choose t such that M(t) = 1 , i .e. , t = -2(P - J-L)/a2 . Now define Zn = min{etYn , I } and :Fn = a (Cl , C2 , . . . , Cn) . Certainly lE l Zn l < 00; also

lE(Zn+l I :Fn) ::: lE(etYn+ 1 I :Fn) = etYn M(t) = etYn

and lE(Zn+1 I :Fn) ::: 1 , implying that lE(Zn+l I :Fn) ::: Zn . Therefore (Zn , :Fn) is a positive supermartingale. Let T = inf{n : Yn ::: O} = inf{n : Zn = I } . Then T /\ m is a bounded stopping time, whence lE(Zo) � lE(ZT Am) � lP'(T ::: m) . Let m --+ 00 to obtain the result. 13. Let :Fn = a (Rl , R2 , . . . , Rn ) . (a) 0 ::: Yn ::: 1 , and Yn i s :Fn-measurable. Also

whence Yn satisfies lE(Yn+1 I :Fn) = Yn . Therefore {Yn : n � O} is a uniformly integrable martingale, and therefore converges a.s. and in mean. (b) In order to apply the optional stopping theorem, it suffices that lP'(T < 00) = 1 (since Y is uniformly integrable) . However lP'(T > n) = � . � . . . n�l = (n + 1 )- 1 --+ O. Using that theorem, lE(YT ) = lE(Yo) , which is to say that lE{T /(T + 2) } = � , and the result follows. (c) Apply the maximal inequality.

406

Page 416: One Thousand Exercises in Probability

Problems Solutions [12.9.14]-[12.9.17]

14. As in the previous solution, with 9.n the a-field generated by AI , A2 , . . . and :Fn ,

lE(Y. I 9. ) _ ( Rn + An ) ( Rn ) + ( Rn ) ( Bn ) n+1 n - Rn + Bn + An Rn + Bn Rn + Bn + An Rn + Bn Rn _

Y. Rn + Bn - n ,

so thatlE(Yn+1 I :Fn) = lE{lE(Yn+1 I 9.n ) I :Fn } = Yn · Also I Yn I ::: 1 , and therefore Yn is a martingale. We need to show that lP'(T < (0) = 1 . Let In be the indicator function of the event {T > n} . We

have by conditioning on the An that

n- I ( 1 ) 00 ( 1 ) lE(ln I A) = II 1 - -- --+ II 1 - --

j=O 2 + Sj j=O 2 + Sj

as n --+ 00, where Sj = 'E,{=I Ai . The infinite product equals 0 a.s . if and only if 'E,j (2 + Sj )- 1 = 00

a.s. By monotone convergence, lP'(T < (0) = 1 under this condition. If this holds, we may apply the optional stopping theorem to obtain that lE(YT ) = lE(Yo) , which is to say that

lE ( 1 _ 1 + AT ) = � .

2 + ST 2

15. At each stage k, let Lk be the length of the sequence 'in play' , and let Yk be the sum of its entries, so that Lo = n, Yo = 'E,?=I xi . If you lose the (k + l )th gamble, then Lk+1 = Lk + 1 and Yk+1 = Yk + Zk where Zk is the stake on that play, whereas if you win, then Lk+1 = Lk - 2 and Yk+1 = Yk - Zk ; we have assumed that Lk � 2, similar relations being valid if Lk = 1 . Note that Lk is a random walk with mean step size - 1 , implying that the first-passage time T to 0 is a.s. finite, and has all moments finite. Your profits at time k amount to Yo - Yk , whence your profit at time T is Yo , since YT = O.

Since the games are fair, Yk constitutes a martingale. Therefore lE(YT Am ) = lE(YO) =I- 0 for all m . However T 1\ m --+ T a.s. as m --+ 00, so that YT Am --+ YT a.s . Now lE(YT ) = 0 =I­limm--+oo lE(YT Am) , and it follows that {YTAm : m � I } is not uniformly integrable. Therefore lE(suPm YTAm) = 00; see Exercise (7. 10.6).

16. Since the game is fair, lE(Sn+1 I Sn ) = Sn . Also I Sn l ::: 1 + 2 + . . . + n < 00 . Therefore Sn is a martingale. The occurrence of the event {N = n} depends only on the outcomes of the coin-tosses up to and including the nth; therefore N is a stopping time.

A tail appeared at time N - 3, followed by three heads. Therefore the gamblers G I , G2 , . . . , G N -3 have forfeited their initial capital by time N, while G N -i has had i + 1 successful rounds for 0 ::: i ::: 2. Therefore SN = N - (p- I + p-2 + p-3 ) , after a little calculation. It is easy to check that N satisfies the conditions of the optional stopping theorem, and it follows that lE(S N) = lE(So) = 0, which is to say that lE(N) = p-I + p-2 + p-3 .

In order to deal with HTH, the gamblers are re-programmed to act as follows. If they win on their first bet, they bet their current fortune on tails, returning to heads thereafter. In this case, SN = N - (p-I + p-2q- l ) where q = 1 - p (remember that the game is fair), and therefore lE(N) = p-I + p-2q- l . 17. Let :Fn = a ({ Xi , Yi : 1 ::: i ::: n}) , and note that T is a stopping time with respect to this filtration. Furthermore lP'(T < (0) = 1 since T is no larger than the first-passage time to 0 of either of the two single-coordinate random walks, each of which has mean 0 and is therefore persistent.

Let al = var(XI ) and ai = var(YI ) · We have that Un - Uo and Vn - Vo are sums of independent summands with means 0 and variances al and ai respectively. It follows by considering

407

Page 417: One Thousand Exercises in Probability

[12.9.18]-[12.9.19] Solutions Martingales

the martingales (Un - UO)2 -na'f and (Vn - VO)2 -na:j (see equation ( 12.5. 14) and Exercise ( 10.2.2)) that

E{(UT - UO)2 } = a'fE(T) , E{(VT - VO)2 } = a:jE(T) . Applying the same argument to (Un + Vn) - (Uo + Yo), we obtain

E{ (UT + VT - Uo - VO)2 } = E(T)E{ (X 1 + Y1 )2 } = E(T) (a'f + 2c + a:j) .

Subtract the two earlier equations to obtain

E{ (UT - UO) (VT - Vo) } = cE(T)

if E(T) < 00. Now UT VT = 0, and in addition E(UT) = Uo, E(VT ) = Yo, by Wald's equation and the fact that E(X 1 ) = E(Y1 ) = O. It follows that -E(Uo yo) = cE(T) if E(T) < 00, in which case c < o.

Suppose conversely that c < O. Then (*) is valid with T replaced throughout by the bounded stopping time T /\ m, and hence

o � E(UT Am VTAm) = E(UO yO) + cE(T /\ m) .

Therefore E(T /\ m) � E(Uo Vo)/ (2 Ic l ) for all m, implying that E(T) = limm-+oo E(T /\ m) < 00,

and so E(T) = -E(Uo Vo)/c as before. 18. Certainly 0 � Xn � 1 , and in addition Xn is measurable with respect to the a-field :Fn =

a (R1 , R2 , . . . , Rn) . Also E(Rn+1 I Rn) = Rn - Rn/ (52 - n) , whence E(Xn+1 I :Fn) = Xn · Therefore Xn is a martingale. .

A strategy corresponds to a stopping time. If the player decides to call at the stopping time T, he wins with (conditional) probability XT, and therefore lP'(wins) = E(XT) , which equals E(Xo) (= � ) by the optional stopping theorem.

Here is a trivial solution to the problem. It may be seen that the chance of winning is the same for a player who, after calling "Red Now", picks the card placed at the bottom of the pack rather than that at the top. The bottom card is red with probability ! , irrespective of the strategy of the player. 19. (a) A sum s of money in week t is equivalent to a sum s / ( 1 +a)t in week 0, since the latter sum may be invested now to yield s in week t . If he sells in week t, his discounted costs are E�=l c/(1 + a)n and his discounted profit is Xt/ ( 1 + a)t . He wishes to find a stopping time for which his mean discounted gain is a maximum.

Now T

- L(1 + a)-nc = � { ( 1 + a)-T - I } , n=l a

so that JL(T) = E{ ( 1 + a)-T ZT } - (c/a) . (b) The function h (y) = ay - J; lP'(Zn > y) dy i s continuous and strictly increasing on [0, 60), with h (O) = -E(Zn ) < 0 and h (y) � 00 as y � 00. Therefore there exists a unique y (> 0) such that h (y) = 0, and we choose y accordingly. (c) Let :Fn = a (Zl , Z2 , . . . , Zn ) . We have that

E (max {Zn , y }) = y + loo

[ 1 - G(y)] dy = (1 + a)y

where G(y) = lP'(Zn � y) . Therefore E(Vn+1 I :Fn) = ( 1 + a)-ny � Vn , so that (Vn , :Fn) i s a non-negative supermartingale.

408

Page 418: One Thousand Exercises in Probability

Problems Solutions [12.9.20]-[12.9.21]

Let fL(-e) be the mean gain of following the strategy 'accept the first offer exceeding -e - (cia) , . The corresponding stopping time T satisfies lP'(T = n) = G(-e)n ( 1 - G(-e)) , and therefore

00

fL(-e) + (cia) = L E{ ( 1 + a)-T ZT I{T=nJ } n=O

00

= L ( 1 + a)-nG (-e)n ( 1 - G(-e))E(ZI I ZI > -e) n=O

= 1 + a { -e ( 1 _ G(-e)) + roo (1 - G(y)) dy } .

1 + a - G(-e) J,

Differentiate with care to find that the only value of -e lying in the support of Z 1 such that fL' ( -e ) = 0 is the value -e = y . Furthermore this value gives a maximum for fL (-e) . Therefore, amongst strategies of the above sort, the best is that with -e = y . Note that fL (Y) = y ( 1 + a) - (cia) .

Consider now a general strategy with corresponding stopping time T , where lP'(T < 00) = 1 . For any positive integerm, T I\m is abounded stopping time, whence E(VTAm ) ::::; E(Vo) = y ( 1+a) . Now I VT Am I ::::; I:�o l Vi I , and I:�o E l Vi i < 00. Therefore {VT Am : m � O} is uniformly integrable. Also VT Am � VT a.s. as m � 00, and it follows that E(VT Am) � E(VT) . We conclude that fL(T) = E(VT) - (cia) ::::; y ( 1 + a) - (cia) = fL(Y) . Therefore the strategy given above is optimal. (d) In the special case, lP'(Zl > y) = (y - 1 )-2 for y � 2, whence Y = 10. The target price is therefore 9, and the mean number of weeks before selling is G (y)/( 1 - G(y)) = 80.

20. Since G is convex on [0, 00) wherever it is finite, and since G ( I ) = 1 and G'( 1 ) < 1, there exists a unique value of TJ (> 1) such that G (TJ) = TJ. Furthermore, Yn = TJZn defines a martingale with mean E(Yo) = TJ. Using the maximal inequality ( 12.6.6),

1 lP'(sup Zn � k) = lP'(Sup Yn � TJk) ::::; k- l n n TJ for positive integers k. Therefore

00 1 E (suP Zn) ::::; L --. n k=1 TJ - 1

21. Let Mn be the number present after the nth round, so Mo = K, and Mn+l = Mn - Xn+l , n � 1 , where Xn is the number of matches in the nth round. By the result of Problem (3 . 1 1 . 1 7), EXn = 1 for all n, whence

E(Mn+l + n + 1 1 :Fn) = Mn + n , where :Fn is the a-field generated by Mo , Ml , . . . , Mn . Thus the sequence {Mn + n} is a martingale. Now, N is clearly a stopping time, and therefore K = Mo + 0 = E(MN + N) = EN.

We have that

E{ (Mn+1 + n + 1 )2 + Mn+1 1 :Fn } = (Mn + n)2 - 2(Mn + n)E(Xn+l - 1 ) + Mn + E{ (Xn+l - 1 )2 - Xn+l I :Fn } ::::; (Mn + n)2 + Mn ,

where we have used the fact that

{ I if Mn > 1 , var(Xn+l I :Fn ) = 0 if Mn = 1 .

409

Page 419: One Thousand Exercises in Probability

[12.9.22]-[12.9.24] Solutions Martingales

Hence the sequence {(Mn + n)2 + Mn } is a supermartingale. By an optional stopping theorem for supermartingales,

and therefore var(N) ::::: K .

22. In the usual notation,

lE (M(s + t) I J=S) = lE (IoS W(u) du + ls+t

W(u) du - H W(s + t) - W(s) + W(s) }3 1 J=S ) = M(s) + tW(s ) - W(s)lE ([W(s + t ) - W(s)]2 1 .1"'s ) = M(s)

as required. We apply the optional stopping theorem ( 12.7. 12) with the stopping time T = inf{u : W(u) E {a, b} } . The hypotheses of the theorem follow easily from the boundedness of the process for t E [0, T] , and it follows that

, Hence the required area A has mean

[We have used the optional stopping theorem twice actually, in that lE(W(T)) = 0 and therefore lP'(W(T) = a) = -b/ (a - b) .] 23. With .1"'s = a(W(u) : 0 ::::: u ::::: s ) , we have for s < t that

$$
\mathbb{E}\bigl(R(t)^2 \bigm| \mathcal{F}_s\bigr) = \mathbb{E}\bigl( |W(s)|^2 + |W(t) - W(s)|^2 + 2W(s)\cdot(W(t) - W(s)) \bigm| \mathcal{F}_s \bigr) = R(s)^2 + (t - s),
$$
and the first claim follows. We apply the optional stopping theorem (12.7.12) with $T = \inf\{u : |W(u)| = a\}$, as in Problem (12.9.22), to find that $0 = \mathbb{E}(R(T)^2 - T) = a^2 - \mathbb{E}(T)$.

24. We apply the optional stopping theorem to the martingale $W(t)$ with the stopping time $T$ to find that $\mathbb{E}(W(T)) = -a(1 - p_b) + b p_b = 0$, where $p_b = \mathbb{P}(W(T) = b)$. By Example (12.7.10), $W(t)^2 - t$ is a martingale, and therefore, by the optional stopping theorem again,
$$
\mathbb{E}(T) = \mathbb{E}\bigl(W(T)^2\bigr) = a^2(1 - p_b) + b^2 p_b,
$$
whence $\mathbb{E}(T) = ab$. For the final part, we take $a = b$ and apply the optional stopping theorem to the martingale $\exp[\theta W(t) - \tfrac{1}{2}\theta^2 t]$ to obtain
$$
1 = \mathbb{E}\bigl(\exp[\theta W(T) - \tfrac{1}{2}\theta^2 T]\bigr) = \cosh(b\theta)\,\mathbb{E}\bigl(e^{-\frac{1}{2}\theta^2 T}\bigr),
$$
on noting that the conditional distribution of $T$ given $W(T) = b$ is the same as that given $W(T) = -b$.

Therefore, $\mathbb{E}(e^{-\frac{1}{2}\theta^2 T}) = 1/\cosh(b\theta)$, and the answer follows by substituting $s = \tfrac{1}{2}\theta^2$.
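
The identities $\mathbb{E}(T) = ab$ and $p_b = a/(a+b)$ are easily tested numerically. The following Python sketch uses a crude Euler discretisation of the Wiener process (the step size and barrier values are arbitrary choices), so the agreement is only approximate.

import numpy as np

rng = np.random.default_rng(3)
a, b, dt = 1.0, 2.0, 1e-3          # barriers at -a and b; dt is a crude discretisation step
trials = 5_000

exit_times, hit_b = [], 0
for _ in range(trials):
    w, t = 0.0, 0.0
    while -a < w < b:
        w += np.sqrt(dt) * rng.standard_normal()
        t += dt
    exit_times.append(t)
    hit_b += (w >= b)

print(np.mean(exit_times), "vs", a * b)          # E(T) = ab
print(hit_b / trials, "vs", a / (a + b))         # P(W(T) = b) = a/(a+b)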


13

Diffusion processes

13.3 Solutions. Diffusion processes

1. It is easily seen that
$$
\mathbb{E}\bigl\{X(t+h) - X(t) \bigm| X(t)\bigr\} = (\lambda - \mu)X(t)h + o(h), \qquad
\mathbb{E}\bigl(\{X(t+h) - X(t)\}^2 \bigm| X(t)\bigr) = (\lambda + \mu)X(t)h + o(h),
$$

which suggest a diffusion approximation with instantaneous mean $a(t, x) = (\lambda - \mu)x$ and instantaneous variance $b(t, x) = (\lambda + \mu)x$.

2. The following method is not entirely rigorous (it is an argument of the following well-known type: it is valid when it works, and not otherwise). We have that
$$
\frac{\partial M}{\partial t} = \int_{-\infty}^{\infty} e^{\theta y}\,\frac{\partial f}{\partial t}\,dy
= \int_{-\infty}^{\infty} \Bigl\{ \theta\,a(t, y) + \tfrac{1}{2}\theta^2\,b(t, y) \Bigr\} e^{\theta y} f\,dy,
$$
by using the forward equation and integrating by parts. Assume that $a(t, y) = \sum_n a_n(t)y^n$, $b(t, y) = \sum_n \beta_n(t)y^n$. The required expression follows from the 'fact' that
$$
\int_{-\infty}^{\infty} y^n e^{\theta y} f\,dy = \int_{-\infty}^{\infty} \frac{\partial^n}{\partial \theta^n}\bigl(e^{\theta y}\bigr) f\,dy = \frac{\partial^n M}{\partial \theta^n}.
$$

3. Using Exercise (13.3.2) or otherwise, we obtain the equation
$$
\frac{\partial M}{\partial t} = \theta m M + \tfrac{1}{2}\theta^2 M
$$
with boundary condition $M(0, \theta) = 1$. The solution is $M(t) = \exp\{\tfrac{1}{2}\theta(2m + \theta)t\}$.

4. Using Exercise (13.3.2) or otherwise, we obtain the equation
$$
\frac{\partial M}{\partial t} = -\theta\,\frac{\partial M}{\partial \theta} + \tfrac{1}{2}\theta^2 M
$$
with boundary condition $M(0, \theta) = 1$. The characteristics of the equation are given by
$$
\frac{dt}{1} = \frac{d\theta}{\theta} = \frac{2\,dM}{\theta^2 M},
$$
with solution $M(t, \theta) = e^{\frac{1}{4}\theta^2} g(\theta e^{-t})$ where $g$ is a function satisfying $1 = e^{\frac{1}{4}\theta^2} g(\theta)$. Therefore $M = \exp\{\tfrac{1}{4}\theta^2(1 - e^{-2t})\}$.


5. Fix $t > 0$. Suppose we are given $W_1(s)$, $W_2(s)$, $W_3(s)$, for $0 \le s \le t$. By Pythagoras's theorem, $R(t+u)^2 = X_1^2 + X_2^2 + X_3^2$ where the $X_i$ are independent $N(W_i(t), u)$ variables. Using the result of Exercise (5.7.7), the conditional distribution of $R(t+u)^2$ (and hence of $R(t+u)$ also) depends only on the value of the non-centrality parameter $\theta = R(t)^2$ of the relevant non-central $\chi^2$ distribution. It follows that $R$ satisfies the Markov property. This argument is valid for the $n$-dimensional Bessel process.

6. By the spherical symmetry of the process, the conditional distribution of $R(s+a)$ given $R(s) = x$ is the same as that given $W(s) = (x, 0, 0)$. Therefore, recalling the solution to Exercise (13.3.5),
$$
\begin{aligned}
\mathbb{P}\bigl(R(s+a) \le y \bigm| R(s) = x\bigr)
&= \iiint_{(u,v,w):\,u^2+v^2+w^2 \le y^2} \frac{1}{(2\pi a)^{3/2}} \exp\Bigl\{ -\frac{(u-x)^2 + v^2 + w^2}{2a} \Bigr\}\,du\,dv\,dw \\
&= \int_{\rho=0}^{y} \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi} \frac{1}{(2\pi a)^{3/2}} \exp\Bigl\{ -\frac{\rho^2 - 2\rho x\cos\theta + x^2}{2a} \Bigr\}\,\rho^2 \sin\theta\,d\theta\,d\phi\,d\rho \\
&= \int_0^y \frac{\rho}{x\sqrt{2\pi a}} \Bigl\{ \exp\Bigl( -\frac{(\rho - x)^2}{2a} \Bigr) - \exp\Bigl( -\frac{(\rho + x)^2}{2a} \Bigr) \Bigr\}\,d\rho,
\end{aligned}
$$
and the result follows by differentiating with respect to $y$.

7. Continuous functions of continuous functions are continuous. The Markov property is preserved because $g(\cdot)$ is single-valued with a unique inverse.
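
The radial transition density derived in Exercise (13.3.6) can be checked by Monte Carlo. The sketch below (Python) uses arbitrary illustrative values of $x$, $a$ and $y$, simulates the three-dimensional Gaussian displacement directly, and compares $\mathbb{P}(R(s+a) \le y \mid R(s) = x)$ with a numerical integral of the density.

import numpy as np

rng = np.random.default_rng(4)
x, a = 1.5, 0.8          # illustrative values: R(s) = x, elapsed time a (assumptions)

# Radius of a 3-dimensional Gaussian displacement started from (x, 0, 0).
samples = rng.normal([x, 0.0, 0.0], np.sqrt(a), size=(200_000, 3))
r = np.linalg.norm(samples, axis=1)

def density(p):
    # density derived in Exercise (13.3.6)
    c = 1.0 / (x * np.sqrt(2 * np.pi * a))
    return p * c * (np.exp(-(p - x) ** 2 / (2 * a)) - np.exp(-(p + x) ** 2 / (2 * a)))

y = 2.0
step = y / 4000
grid = np.linspace(step / 2, y - step / 2, 4000)      # midpoint rule
print((r <= y).mean(), "vs", (density(grid) * step).sum())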

8. (a) Since $\mathbb{E}(e^{uW(t)}) = e^{\frac{1}{2}u^2 t}$, this is not a martingale.
(b) This is a Wiener process (see Problem (13.12.1)), and is certainly a martingale.
(c) With $\mathcal{F}_t = \sigma(W(s) : 0 \le s \le t)$ and $t, u > 0$,
$$
\mathbb{E}\Bigl\{ (t+u)W(t+u) - \int_0^{t+u} W(s)\,ds \Bigm| \mathcal{F}_t \Bigr\}
= (t+u)W(t) - \int_0^t W(s)\,ds - \int_t^{t+u} W(t)\,ds
= tW(t) - \int_0^t W(s)\,ds,
$$
whence this is a martingale. [The integrability condition is easily verified.]

9. (a) With $s < t$, $S(t) = S(s)\exp\{a(t-s) + b(W(t) - W(s))\}$. Now $W(t) - W(s)$ is independent of $\{W(u) : 0 \le u \le s\}$, and the claim follows.
(b) $S(t)$ is clearly integrable and adapted to the filtration $\mathcal{F} = (\mathcal{F}_t)$ so that, for $s < t$,
$$
\mathbb{E}\bigl(S(t) \bigm| \mathcal{F}_s\bigr) = S(s)\,\mathbb{E}\bigl(\exp\{a(t-s) + b(W(t) - W(s))\} \bigm| \mathcal{F}_s\bigr) = S(s)\exp\{a(t-s) + \tfrac{1}{2}b^2(t-s)\},
$$
which equals $S(s)$ if and only if $a + \tfrac{1}{2}b^2 = 0$. In this case, $\mathbb{E}(S(t)) = \mathbb{E}(S(0)) = 1$.

10. Either find the instantaneous mean and variance, and solve the forward equation, or argue directly as follows. With $s < t$,
$$
\mathbb{P}\bigl(S(t) \le y \bigm| S(s) = x\bigr) = \mathbb{P}\bigl(bW(t) \le -at + \log y \bigm| bW(s) = -as + \log x\bigr).
$$
Now $b(W(t) - W(s))$ is independent of $W(s)$ and is distributed as $N(0, b^2(t-s))$, and we obtain on differentiating with respect to $y$ that
$$
f(t, y \mid s, x) = \frac{1}{y\sqrt{2\pi b^2(t-s)}} \exp\biggl( -\frac{(\log(y/x) - a(t-s))^2}{2b^2(t-s)} \biggr), \qquad x, y > 0.
$$


13.4 Solutions. First passage times

1. Certainly $X$ has continuous sample paths, and in addition $\mathbb{E}|X(t)| < \infty$. Also, if $s < t$,
$$
\mathbb{E}\bigl(X(t) \bigm| \mathcal{F}_s\bigr) = X(s)\,e^{\frac{1}{2}\theta^2(t-s)}\,\mathbb{E}\bigl(e^{i\theta\{W(t)-W(s)\}} \bigm| \mathcal{F}_s\bigr)
= X(s)\,e^{\frac{1}{2}\theta^2(t-s)}\,e^{-\frac{1}{2}\theta^2(t-s)} = X(s)
$$
as required, where we have used the fact that $W(t) - W(s)$ is $N(0, t-s)$ and is independent of $\mathcal{F}_s$.

2. Apply the optional stopping theorem to the martingale $X$ of Exercise (13.4.1), with the stopping time $T$, to obtain $\mathbb{E}(X(T)) = 1$. Now $W(T) = aT + b$, and therefore $\mathbb{E}(e^{\psi T + i\theta b}) = 1$ where $\psi = ia\theta + \tfrac{1}{2}\theta^2$. Solve to find that
$$
\mathbb{E}(e^{\psi T}) = e^{-i\theta b} = \exp\Bigl\{ -b\Bigl( \sqrt{a^2 - 2\psi} + a \Bigr) \Bigr\}
$$
is the solution which gives a moment generating function.
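
Here is a minimal numerical check of this moment generating function (Python), under the reading that $T$ is the first time $W$ meets the line $b + at$ with $a, b > 0$; the parameter values and the Euler step are arbitrary choices, and paths that never hit the line contribute $e^{-sT} = 0$.

import numpy as np

rng = np.random.default_rng(5)
a, b, s = 0.5, 1.0, 1.0            # slope, intercept, Laplace argument (illustrative)
dt, horizon, trials = 1e-3, 30.0, 10_000
n = int(horizon / dt)

vals = np.zeros(trials)
for i in range(trials):
    w = np.cumsum(np.sqrt(dt) * rng.standard_normal(n))
    hit = np.nonzero(w >= b + a * dt * np.arange(1, n + 1))[0]
    if hit.size:
        vals[i] = np.exp(-s * dt * (hit[0] + 1))   # e^{-sT}; non-hitting paths contribute 0

# Formula above with psi = -s.
print(vals.mean(), "vs", np.exp(-b * (np.sqrt(a * a + 2 * s) + a)))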

3. We have that $T \le u$ if and only if there is no zero in $(u, t]$, an event with probability $1 - (2/\pi)\cos^{-1}\{\sqrt{u/t}\}$, and the claim follows on drawing a triangle.

13.5 Solution. Barriers

1. Solving the forward equation subject to the appropriate boundary conditions, we obtain as usual that
$$
p(t, y) = g(t, y \mid d) + e^{-2md} g(t, y \mid -d) - \int_{-\infty}^{-d} 2m e^{2mx} g(t, y \mid x)\,dx,
$$
where $g(t, y \mid x) = (2\pi t)^{-\frac{1}{2}} \exp\{-(y - x - mt)^2/(2t)\}$. The first two terms tend to 0 as $t \to \infty$, regardless of the sign of $m$. As for the integral, make the substitution $u = (x - y - mt)/\sqrt{t}$ to obtain, as $t \to \infty$,
$$
-2m e^{2my} \int_{-\infty}^{-(d+y+mt)/\sqrt{t}} \frac{e^{-\frac{1}{2}u^2}}{\sqrt{2\pi}}\,du \;\to\;
\begin{cases} 2|m| e^{-2|m| y} & \text{if } m < 0, \\ 0 & \text{if } m \ge 0. \end{cases}
$$

13.6 Solutions. Excursions and the Brownian bridge

1. Let $f(t, x) = (2\pi t)^{-\frac{1}{2}} e^{-x^2/(2t)}$. It may be seen that
$$
\mathbb{P}\bigl(W(t) > x \bigm| Z,\, W(0) = 0\bigr) = \lim_{w \downarrow 0} \mathbb{P}\bigl(W(t) > x \bigm| Z,\, W(0) = w\bigr)
$$
where $Z = \{\text{no zeros in } (0, t]\}$; the small missing step here may be filled by conditioning instead on the event $\{W(\epsilon) = w, \text{ no zeros in } (\epsilon, t]\}$, and taking the limit as $\epsilon \downarrow 0$. Now, if $w > 0$,
$$
\mathbb{P}\bigl(W(t) > x,\, Z \bigm| W(0) = w\bigr) = \int_x^{\infty} \bigl\{ f(t, y - w) - f(t, y + w) \bigr\}\,dy
$$
by the reflection principle, and
$$
\mathbb{P}\bigl(Z \bigm| W(0) = w\bigr) = 1 - 2\int_w^{\infty} f(t, y)\,dy = \int_{-w}^{w} f(t, y)\,dy
$$


by a consideration of the minimum value of $W$ on $(0, t]$. It follows that the density function of $W(t)$, conditional on $Z \cap \{W(0) = w\}$, where $w > 0$, is
$$
h_w(x) = \frac{f(t, x - w) - f(t, x + w)}{\int_{-w}^{w} f(t, y)\,dy}, \qquad x > 0.
$$
Divide top and bottom by $2w$, and take the limit as $w \downarrow 0$:
$$
\lim_{w \downarrow 0} h_w(x) = -\frac{1}{f(t, 0)}\,\frac{\partial f}{\partial x} = \frac{x}{t}\,e^{-x^2/(2t)}, \qquad x > 0.
$$

2. It is a standard exercise that, for a Wiener process $W$,
$$
\mathbb{E}\bigl\{W(t) \bigm| W(s) = a,\, W(1) = 0\bigr\} = a\,\frac{1-t}{1-s}, \qquad
\mathbb{E}\bigl\{W(s)^2 \bigm| W(0) = W(1) = 0\bigr\} = s(1-s),
$$
if $0 \le s \le t \le 1$. Therefore the Brownian bridge $B$ satisfies, for $0 \le s \le t \le 1$,
$$
\mathbb{E}\bigl(B(s)B(t)\bigr) = \mathbb{E}\bigl\{B(s)\,\mathbb{E}\bigl(B(t) \bigm| B(s)\bigr)\bigr\} = \frac{1-t}{1-s}\,\mathbb{E}\bigl(B(s)^2\bigr) = s(1-t)
$$
as required. Certainly $\mathbb{E}(B(s)) = 0$ for all $s$, by symmetry.

3. $\widehat{W}$ is a zero-mean Gaussian process on $[0, 1]$ with continuous sample paths, and also $\widehat{W}(0) = \widehat{W}(1) = 0$. Therefore $\widehat{W}$ is a Brownian bridge if it has the same autocovariance function as the Brownian bridge, that is, $c(s, t) = \min\{s, t\} - st$. For $s < t$,
$$
\operatorname{cov}\bigl(\widehat{W}(s), \widehat{W}(t)\bigr) = \operatorname{cov}\bigl(W(s) - sW(1),\, W(t) - tW(1)\bigr) = s - ts - st + st = s - st
$$
since $\operatorname{cov}(W(u), W(v)) = \min\{u, v\}$. The claim follows.
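
The covariance identity proved here is easy to confirm by simulation: build $\widehat{W}(t) = W(t) - tW(1)$ from discretised Wiener paths and compare an empirical covariance with $s(1-t)$. A minimal Python sketch (grid size and sample size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(6)
n, trials = 200, 20_000
t = np.arange(1, n + 1) / n

# Discretised Wiener paths on [0, 1]; the bridge is W(t) - t W(1).
W = np.cumsum(rng.standard_normal((trials, n)) / np.sqrt(n), axis=1)
B = W - W[:, [-1]] * t

s_idx, t_idx = int(0.3 * n) - 1, int(0.7 * n) - 1      # grid points at s = 0.3, t = 0.7
emp = (B[:, s_idx] * B[:, t_idx]).mean()
print(emp, "vs", 0.3 * (1 - 0.7))                      # covariance s(1 - t) = min{s,t} - st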

4. Either calculate the instantaneous mean and variance of $\widehat{W}$, or repeat the argument in the solution to Exercise (13.6.3). The only complication in this case is the necessity to show that $\widehat{W}(t)$ is a.s. continuous at $t = 1$, i.e., that $(1+u)^{-1} W(u) \to 0$ a.s. as $u \to \infty$. There are various ways to show this. Certainly it is true in the limit as $u \to \infty$ through the integers, since, for integral $u$, $W(u)$ may be expressed as the sum of $u$ independent $N(0, 1)$ variables (use the strong law). It remains to fill in the gaps. Let $n$ be a positive integer, let $x > 0$, and write $M_n = \max\{|W(u) - W(n)| : n \le u \le n+1\}$. We have by the stationarity of the increments that
$$
\sum_{n=0}^{\infty} \mathbb{P}(M_n \ge nx) = \sum_{n=0}^{\infty} \mathbb{P}(M_1 \ge nx) \le 1 + \frac{\mathbb{E}(M_1)}{x} < \infty,
$$
implying by the Borel–Cantelli lemma that $n^{-1} M_n \le x$ for all but finitely many values of $n$, a.s. Therefore $n^{-1} M_n \to 0$ a.s. as $n \to \infty$, implying that
$$
\lim_{u \to \infty} \frac{1}{u+1}\,|W(u)| \le \lim_{n \to \infty} \frac{1}{n}\bigl\{ |W(n)| + M_n \bigr\} = 0 \quad \text{a.s.}
$$

5. In the notation of Exercise (13.6.4), we are asked to calculate the probability that $W$ has no zeros in the time interval between $s/(1-s)$ and $t/(1-t)$. By Theorem (13.4.8), this equals
$$
1 - \frac{2}{\pi}\cos^{-1}\sqrt{\frac{s(1-t)}{t(1-s)}} = \frac{2}{\pi}\cos^{-1}\sqrt{\frac{t-s}{t(1-s)}}.
$$


13.7 Solutions. Stochastic calculus

1. Let $\mathcal{F}_s = \sigma(W_u : 0 \le u \le s)$. Fix $n \ge 1$ and define $X_n(k) = |W_{kt/2^n}|$ for $0 \le k \le 2^n$. By Jensen's inequality, the sequence $\{X_n(k) : 0 \le k \le 2^n\}$ is a non-negative submartingale with respect to the filtration $\mathcal{F}_{kt/2^n}$, with finite variance. Hence, by Exercise (4.3.3) and equation (12.6.2), $X_n^* = \max\{X_n(k) : 0 \le k \le 2^n\}$ satisfies
$$
\mathbb{E}(X_n^{*2}) = 2\int_0^{\infty} x\,\mathbb{P}(X_n^* > x)\,dx
\le 2\int_0^{\infty} \mathbb{E}\bigl( |W_t|\,I_{\{X_n^* \ge x\}} \bigr)\,dx
= 2\,\mathbb{E}\Bigl\{ |W_t| \int_0^{X_n^*} dx \Bigr\}
= 2\,\mathbb{E}\bigl( |W_t|\,X_n^* \bigr) \le 2\sqrt{\mathbb{E}(W_t^2)\,\mathbb{E}(X_n^{*2})}
$$
by the Cauchy–Schwarz inequality. Hence $\mathbb{E}(X_n^{*2}) \le 4\,\mathbb{E}(W_t^2)$. Now $X_n^{*2}$ is monotone increasing in $n$, and $W$ has continuous sample paths. By monotone convergence,
$$
\mathbb{E}\Bigl( \sup_{0 \le s \le t} |W_s|^2 \Bigr) \le 4\,\mathbb{E}(W_t^2).
$$

2. See the solution to Exercise (8.5 .4).

3. (a) We have that
$$
I_1(n) = \tfrac{1}{2}\Bigl\{ \sum_{j=0}^{n-1} \bigl(V_{j+1}^2 - V_j^2\bigr) - \sum_{j=0}^{n-1} (V_{j+1} - V_j)^2 \Bigr\}.
$$
The first summation equals $W_t^2$, by successive cancellation, and the mean-square limit of the second summation is $t$, by Exercise (8.5.4). Hence $\lim_{n\to\infty} I_1(n) = \tfrac{1}{2}W_t^2 - \tfrac{1}{2}t$ in mean square.

Likewise, we obtain the mean-square limits:
$$
\lim_{n\to\infty} I_2(n) = \tfrac{1}{2}W_t^2 + \tfrac{1}{2}t, \qquad
\lim_{n\to\infty} I_3(n) = \lim_{n\to\infty} I_4(n) = \tfrac{1}{2}W_t^2.
$$
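
These mean-square limits can be seen numerically by comparing Riemann-type sums for $\int_0^t W\,dW$. The sketch below (Python) takes $I_1$ and $I_2$ to be the left- and right-endpoint sums, which is the natural reading of the exercise, and estimates the mean-square errors against the limits above; both shrink as $n$ grows.

import numpy as np

rng = np.random.default_rng(7)
t, n, trials = 1.0, 2000, 200
dt = t / n

err_left = err_right = 0.0
for _ in range(trials):
    dW = np.sqrt(dt) * rng.standard_normal(n)
    W = np.concatenate(([0.0], np.cumsum(dW)))   # W at the grid points t_j
    left = np.sum(W[:-1] * dW)                   # I_1(n): evaluate at the left endpoint
    right = np.sum(W[1:] * dW)                   # I_2(n): evaluate at the right endpoint
    wt = W[-1]
    err_left += (left - (0.5 * wt ** 2 - 0.5 * t)) ** 2
    err_right += (right - (0.5 * wt ** 2 + 0.5 * t)) ** 2

print(err_left / trials, err_right / trials)     # both mean-square errors are small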

4. Clearly E(U(t)) = O. The process U is Gaussian with autocovariance function

Thus $U$ is a stationary Gaussian Markov process, namely the Ornstein–Uhlenbeck process. [See Example (9.6.10).]

5. Clearly $\mathbb{E}(U_t) = 0$. For $s < t$,
$$
\begin{aligned}
\mathbb{E}(U_s U_t) &= \mathbb{E}(W_s W_t) + \beta^2\,\mathbb{E}\Bigl( \int_{u=0}^{s}\int_{v=0}^{t} e^{-\beta(s-u)} W_u\, e^{-\beta(t-v)} W_v\,du\,dv \Bigr) \\
&\qquad - \mathbb{E}\Bigl( W_t\,\beta \int_0^s e^{-\beta(s-u)} W_u\,du \Bigr) - \mathbb{E}\Bigl( W_s\,\beta \int_0^t e^{-\beta(t-v)} W_v\,dv \Bigr) \\
&= s + \beta^2 e^{-\beta(s+t)} \int_{u=0}^{s}\int_{v=0}^{t} e^{\beta(u+v)} \min\{u, v\}\,du\,dv \\
&\qquad - \beta \int_0^s e^{-\beta(s-u)} \min\{u, t\}\,du - \beta \int_0^t e^{-\beta(t-v)} \min\{s, v\}\,dv \\
&= \frac{e^{2\beta s} - 1}{2\beta}\,e^{-\beta(s+t)}
\end{aligned}
$$


after prolonged integration. By the linearity of the definition of $U$, it is a Gaussian process. From the calculation above, it has autocovariance function $c(s, t) = (e^{-\beta(t-s)} - e^{-\beta(t+s)})/(2\beta)$ for $s \le t$. From this we may calculate the instantaneous mean and variance, and thus we recognize an Ornstein–Uhlenbeck process. See also Exercise (13.3.4) and Problem (13.12.4).

13.8 Solutions. The Itô integral

1. (a) Fix $t > 0$ and let $n \ge 1$ and $\delta = t/n$. We write $t_j = jt/n$ and $V_j = W_{t_j}$. By the absence of correlation of Wiener increments, and the Cauchy–Schwarz inequality,

Therefore,

(b) As $n \to \infty$,
$$
\begin{aligned}
\sum_{j=0}^{n-1} V_j^2 (V_{j+1} - V_j)
&= \tfrac{1}{3} \sum_{j=0}^{n-1} \bigl\{ V_{j+1}^3 - V_j^3 - 3V_j(V_{j+1} - V_j)^2 - (V_{j+1} - V_j)^3 \bigr\} \\
&= \tfrac{1}{3} W_t^3 - \sum_{j=0}^{n-1} \bigl[ V_j(t_{j+1} - t_j) + V_j\bigl\{ (V_{j+1} - V_j)^2 - (t_{j+1} - t_j) \bigr\} \bigr] - \tfrac{1}{3} \sum_{j=0}^{n-1} (V_{j+1} - V_j)^3 \\
&\to \tfrac{1}{3} W_t^3 - \int_0^t W(s)\,ds + 0 + 0.
\end{aligned}
$$
The fact that the last two terms tend to 0 in mean square may be verified in the usual way. For example,
$$
\mathbb{E}\biggl( \Bigl[ \sum_{j=0}^{n-1} (V_{j+1} - V_j)^3 \Bigr]^2 \biggr) = \sum_{j=0}^{n-1} \mathbb{E}\bigl[ (V_{j+1} - V_j)^6 \bigr]
= 15 \sum_{j=0}^{n-1} (t_{j+1} - t_j)^3 = 15 \sum_{j=0}^{n-1} \Bigl( \frac{t}{n} \Bigr)^3 \to 0
$$
as $n \to \infty$.


(c) It was shown in Exercise (13.7.3a) that $\int_0^t W_s\,dW_s = \tfrac{1}{2}W_t^2 - \tfrac{1}{2}t$. Hence the result follows because $\mathbb{E}(W_t^2) = t$.

2. Fix $t > 0$ and $n \ge 1$, and let $\delta = t/n$. We set $V_j = W_{jt/n}$. It is the case that $X_t = \lim_{n\to\infty} \sum_j V_j(t_{j+1} - t_j)$. Each term in the sum is normally distributed, and all partial sums are multivariate normal for all $\delta > 0$, and hence also in the limit as $\delta \to 0$. Obviously $\mathbb{E}(X_t) = 0$. For $s \le t$,
$$
\mathbb{E}(X_s X_t) = \int_0^t \int_0^s \mathbb{E}(W_u W_v)\,du\,dv = \int_0^t \int_0^s \min\{u, v\}\,du\,dv
= \int_0^s \tfrac{1}{2}u^2\,du + \int_0^s u(t - u)\,du = s^2\Bigl( \frac{t}{2} - \frac{s}{6} \Bigr).
$$
Hence $\operatorname{var}(X_t) = \tfrac{1}{3}t^3$, and the autocovariance function is $c(s, t) = \tfrac{1}{2}s^2 t - \tfrac{1}{6}s^3$ for $s \le t$.
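
A short simulation confirms $\operatorname{var}(X_t) = \tfrac{1}{3}t^3$ and the covariance just computed; the Python sketch below uses an arbitrary grid and checks the case $t = 2$, $s = 1$.

import numpy as np

rng = np.random.default_rng(8)
t, n, trials = 2.0, 500, 20_000
dt = t / n

W = np.cumsum(np.sqrt(dt) * rng.standard_normal((trials, n)), axis=1)
X = W.sum(axis=1) * dt                        # Riemann approximation to \int_0^t W(u) du
print(X.var(), "vs", t ** 3 / 3)              # var(X_t) = t^3/3

s_idx = n // 2 - 1                            # grid point closest to s = t/2 = 1
Xs = W[:, :s_idx + 1].sum(axis=1) * dt
print(np.mean(Xs * X), "vs", 0.5 * 1.0 ** 2 * t - 1.0 ** 3 / 6)   # c(s, t) = s^2 t/2 - s^3/6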

3. By the Cauchy-Schwarz inequality, as n -+ 00,

4. We square the equation $\|I(\psi_1 + \psi_2)\|_2 = \|\psi_1 + \psi_2\|$ and use the fact that $\|I(\psi_i)\|_2 = \|\psi_i\|$ for $i = 1, 2$, to deduce the result.

5. The question permits us to use the integrating factor $e^{\beta t}$ to give, formally,
$$
e^{\beta t} X_t = \int_0^t e^{\beta s}\,\frac{dW_s}{ds}\,ds = e^{\beta t} W_t - \beta \int_0^t e^{\beta s} W_s\,ds
$$
on integrating by parts. This is the required result, and substitution verifies that it satisfies the given equation.

6. Find a sequence $(\phi^{(n)})$ of predictable step functions such that $\|\phi^{(n)} - \psi\| \to 0$ as $n \to \infty$. By the argument before equation (13.8.9), $I(\phi^{(n)}) \to I(\psi)$ as $n \to \infty$. By Lemma (13.8.4), $\|I(\phi^{(n)})\|_2 = \|\phi^{(n)}\|$, and the claim follows.

13.9 Solutions. Itô's formula

1. The process $Z$ is continuous and adapted with $Z_0 = 0$. We have by Theorem (13.8.11) that $\mathbb{E}(Z_t - Z_s \mid \mathcal{F}_s) = 0$, and by Exercise (13.8.6) that
$$
\mathbb{E}\bigl( (Z_t - Z_s)^2 \bigm| \mathcal{F}_s \bigr) = t - s.
$$
The first claim follows by the Lévy characterization of a Wiener process (12.7.10).


We have in $n$ dimensions that $R^2 = X_1^2 + X_2^2 + \cdots + X_n^2$, and the same argument yields that $Z_t = \sum_i \int_0^t (X_i/R)\,dX_i$ is a Wiener process. By Example (13.9.7) and the above,
$$
d(R^2) = 2\sum_{i=1}^{n} X_i\,dX_i + n\,dt = 2R \sum_{i=1}^{n} \frac{X_i}{R}\,dX_i + n\,dt = 2R\,dW + n\,dt.
$$

2. Applying Itô's formula (13.9.4) to $Y_t = W_t^4$ we obtain $dY_t = 4W_t^3\,dW_t + 6W_t^2\,dt$. Hence, $\mathbb{E}(Y_t) = 6\int_0^t \mathbb{E}(W_s^2)\,ds = 3t^2$.

3. Apply Itô's formula (13.9.4) to obtain $dY_t = W_t\,dt + t\,dW_t$. Cf. Exercise (13.8.1).

4. Note that $X_1 = \cos W$ and $X_2 = \sin W$. By Itô's formula (13.9.4),
$$
dY = d(X_1 + iX_2) = dX_1 + i\,dX_2 = d(\cos W) + i\,d(\sin W)
= -\sin W\,dW - \tfrac{1}{2}\cos W\,dt + i\bigl( \cos W\,dW - \tfrac{1}{2}\sin W\,dt \bigr).
$$

5. We apply Itô's formula to obtain:
(a) $(1 + t)\,dX = -X\,dt + dW$,
(b) $dX = -\tfrac{1}{2}X\,dt + \sqrt{1 - X^2}\,dW$,

(c) d (�) = -� (�) dt + ( b�a -�b ) (�) dW.

13.10 Solutions. Option pricing

1. (a) We have that
$$
\begin{aligned}
\mathbb{E}\bigl( (ae^Z - K)^+ \bigr)
&= \int_{\log(K/a)}^{\infty} (ae^z - K)\,\frac{1}{\sqrt{2\pi\tau^2}} \exp\Bigl( -\frac{(z-\gamma)^2}{2\tau^2} \Bigr)\,dz \\
&= \int_{\alpha}^{\infty} \bigl(ae^{\gamma + \tau y} - K\bigr)\,\frac{e^{-\frac{1}{2}y^2}}{\sqrt{2\pi}}\,dy
\qquad \text{where } y = \frac{z - \gamma}{\tau},\ \alpha = \frac{\log(K/a) - \gamma}{\tau}, \\
&= ae^{\gamma + \frac{1}{2}\tau^2} \int_{\alpha}^{\infty} \frac{e^{-\frac{1}{2}(y-\tau)^2}}{\sqrt{2\pi}}\,dy - K\Phi(-\alpha) \\
&= ae^{\gamma + \frac{1}{2}\tau^2}\,\Phi(\tau - \alpha) - K\Phi(-\alpha).
\end{aligned}
$$
(b) We have that $S_T = ae^Z$ where $a = S_t$ and, under the relevant conditional $\mathbb{Q}$-distribution, $Z$ is normal with mean $\gamma = (r - \tfrac{1}{2}\sigma^2)(T-t)$ and variance $\tau^2 = \sigma^2(T-t)$. The claim now follows by the result of part (a).
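
The formula of part (a) is easy to verify by Monte Carlo, and is all that is needed to price the call numerically. A minimal Python sketch with arbitrary parameter values:

import numpy as np
from math import erf, exp, log, sqrt

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))   # standard normal distribution function

def expected_payoff(a, K, gamma, tau):
    # E((a e^Z - K)^+) for Z ~ N(gamma, tau^2), as derived in part (a).
    alpha = (log(K / a) - gamma) / tau
    return a * exp(gamma + 0.5 * tau * tau) * Phi(tau - alpha) - K * Phi(-alpha)

rng = np.random.default_rng(9)
a, K, gamma, tau = 100.0, 95.0, 0.02, 0.25          # arbitrary illustrative values
Z = rng.normal(gamma, tau, size=2_000_000)
print(np.maximum(a * np.exp(Z) - K, 0.0).mean(), "vs", expected_payoff(a, K, gamma, tau))

With $a = S_t$, $\gamma = (r - \tfrac{1}{2}\sigma^2)(T-t)$ and $\tau^2 = \sigma^2(T-t)$ as in part (b), discounting this expectation by $e^{-r(T-t)}$ yields the usual Black–Scholes value of the European call.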

2. (a) Set $\xi(t, S) = \xi(t, S_t)$ and $\psi(t, S) = \psi(t, S_t)$, in the natural notation. By Theorem (13.10.15), we have $\psi_x = \psi_t = 0$, whence $\psi(t, x) = c$ for all $t$, $x$, and some constant $c$.
(b) Recall that $dS = \mu S\,dt + \sigma S\,dW$. Now,
$$
d(\xi S + \psi e^{rt}) = d(S^2 + \psi e^{rt}) = (\sigma S)^2\,dt + 2S\,dS + e^{rt}\,d\psi + \psi r e^{rt}\,dt,
$$


by Example (13.9.7). By equation (13.10.4), the portfolio is self-financing if this equals $S\,dS + \psi r e^{rt}\,dt$, and thus we arrive at the SDE $e^{rt}\,d\psi = -S\,dS - \sigma^2 S^2\,dt$, whence

(c) Note first that $Z_t = \int_0^t S_u\,du$ satisfies $dZ_t = S_t\,dt$. By Example (13.9.8), $d(S_t Z_t) = Z_t\,dS_t + S_t^2\,dt$, whence
$$
d(\xi S + \psi e^{rt}) = Z_t\,dS_t + S_t^2\,dt + e^{rt}\,d\psi + \psi r e^{rt}\,dt.
$$
Using equation (13.10.4), the portfolio is self-financing if this equals $Z_t\,dS_t + \psi r e^{rt}\,dt$, and thus we require that $e^{rt}\,d\psi = -S_t^2\,dt$, which is to say that

3. We need to check equation (13.10.4) remembering that $dM_t = 0$. Each of these portfolios is self-financing.
(a) This case is obvious.
(b) $d(\xi S + \psi) = d(2S^2 - S^2 - t) = 2S\,dS + dt - dt = \xi\,dS$.
(c) $d(\xi S + \psi) = -S\,dt - t\,dS + S\,dt = \xi\,dS$.
(d) Recalling Example (13.9.8), we have that

4. The time of exercise of an American call option must be a stopping time for the filtration $(\mathcal{F}_t)$. The value of the option, if exercised at the stopping time $\tau$, is $V_\tau = (S_\tau - K)^+$, and it follows by the usual argument that the value at time 0 of the option exercised at $\tau$ is $\mathbb{E}_{\mathbb{Q}}(e^{-r\tau} V_\tau)$. Thus the value at time 0 of the American option is $\sup_\tau \{\mathbb{E}_{\mathbb{Q}}(e^{-r\tau} V_\tau)\}$, where the supremum is taken over all stopping times $\tau$ satisfying $\mathbb{P}(\tau \le T) = 1$. Under the probability measure $\mathbb{Q}$, the process $e^{-rt} V_t$ is a martingale, whence, by the optional stopping theorem, $\mathbb{E}_{\mathbb{Q}}(e^{-r\tau} V_\tau) = V_0$ for all stopping times $\tau$. The claim follows.

5. We rewrite the value at time 0 of the European call option, possibly with the aid of Exercise (13.10.1), as
$$
\mathbb{E}\Bigl( \bigl( S_0 e^{\sigma\sqrt{T}\,N - \frac{1}{2}\sigma^2 T} - K e^{-rT} \bigr)^+ \Bigr),
$$
where $N$ is an $N(0, 1)$ random variable. It is immediate that this is increasing in $S_0$ and $r$ and is decreasing in $K$. To show monotonicity in $T$, we argue as follows. Let $T_1 < T_2$ and consider the European option with exercise date $T_2$. In the corresponding American option we are allowed to exercise the option at the earlier time $T_1$. By Exercise (13.10.4), it is never better to stop earlier than $T_2$, and the claim follows.

Monotonicity in $\sigma$ may be shown by differentiation.


13.11 Solutions. Passage probabilities and potentials

1. Let $H$ be a closed sphere with radius $R$ ($> |w|$), and define $P_R(r) = \mathbb{P}(G \text{ before } H \mid |W(0)| = r)$. Then $P_R$ satisfies Laplace's equation in $\mathbb{R}^d$, and hence
$$
\frac{d}{dr}\Bigl( r^{d-1} \frac{dP_R}{dr} \Bigr) = 0
$$
since $P_R$ is spherically symmetric. Solve subject to the boundary equations $P_R(\epsilon) = 1$, $P_R(R) = 0$, to obtain
$$
P_R(r) = \frac{r^{2-d} - R^{2-d}}{\epsilon^{2-d} - R^{2-d}} \to (\epsilon/r)^{d-2} \quad \text{as } R \to \infty.
$$

2. The electrical resistance $R_n$ between 0 and the set $\Delta_n$ is no smaller than the resistance obtained by, for every $i = 1, 2, \dots$, 'shorting out' all vertices in the set $\Delta_i$. This new network amounts to a linear chain of resistances in series, points labelled $\Delta_i$ and $\Delta_{i+1}$ being joined by a resistance of size $N_i^{-1}$. It follows that
$$
R(G) = \lim_{n\to\infty} R_n \ge \sum_{i=0}^{\infty} \frac{1}{N_i}.
$$
By Theorem (13.11.18), the walk is persistent if $\sum_i N_i^{-1} = \infty$.

3. Thinking of $G$ as an electrical network, one may obtain the network $H$ by replacing the resistance of every edge $e$ lying in $G$ but not in $H$ by $\infty$. Let 0 be a vertex of $H$. By a well known fact in the theory of electrical networks, $R(H) \ge R(G)$, and the result follows by Theorem (13.11.19).

13.12 Solutions to problems

1. (a) $T(t) = aW(t/a^2)$ has continuous sample paths with stationary independent increments, since $W$ has these properties. Also $T(t)/a$ is $N(0, t/a^2)$, whence $T(t)$ is $N(0, t)$.
(b) As for part (a).
(c) Certainly $V$ has continuous sample paths on $(0, \infty)$. For continuity at 0 it suffices to prove that $tW(t^{-1}) \to 0$ a.s. as $t \downarrow 0$; this was done in the solution to Exercise (13.6.4).

If $(u, v)$, $(s, t)$ are disjoint time-intervals, then so are $(v^{-1}, u^{-1})$, $(t^{-1}, s^{-1})$; since $W$ has independent increments, so has $V$. Finally, $V(s+t) - V(s)$ is $N(0, \beta)$ if $s, t > 0$, where
$$
\beta = \frac{t^2}{s+t} + s^2\Bigl( \frac{1}{s} - \frac{1}{s+t} \Bigr) = t.
$$

2. Certainly $W$ is Gaussian with continuous sample paths and zero means, and it is therefore sufficient to prove that $\operatorname{cov}(W(s), W(t)) = \min\{s, t\}$. Now, if $s \le t$,

as required.


If $u(s) = s$, $v(t) = 1 - t$, then $r(t) = t/(1-t)$, and $r^{-1}(w) = w/(1+w)$ for $0 \le w < \infty$. In this case $X(t) = (1-t)W(t/(1-t))$.

3. Certainly $U$ is Gaussian with zero means, and $U(0) = 0$. Now, with $S_t = e^{2\beta t} - 1$,

$$
\mathbb{E}\bigl\{U(t+h) \bigm| U(t) = u\bigr\} = e^{-\beta(t+h)}\,\mathbb{E}\bigl\{W(S_{t+h}) \bigm| W(S_t) = ue^{\beta t}\bigr\} = ue^{-\beta(t+h)}e^{\beta t} = u - \beta u h + o(h),
$$
whence the instantaneous mean of $U$ is $a(t, u) = -\beta u$. Secondly, $S_{t+h} = S_t + 2\beta e^{2\beta t}h + o(h)$, and therefore
$$
\mathbb{E}\bigl\{U(t+h)^2 \bigm| U(t) = u\bigr\} = e^{-2\beta(t+h)}\,\mathbb{E}\bigl\{W(S_{t+h})^2 \bigm| W(S_t) = ue^{\beta t}\bigr\}
= e^{-2\beta(t+h)}\bigl( u^2 e^{2\beta t} + 2\beta e^{2\beta t}h + o(h) \bigr) = u^2 - 2\beta h(u^2 - 1) + o(h).
$$
It follows that
$$
\mathbb{E}\bigl\{ |U(t+h) - U(t)|^2 \bigm| U(t) = u \bigr\} = u^2 - 2\beta h(u^2 - 1) - 2u(u - \beta u h) + u^2 + o(h) = 2\beta h + o(h),
$$

and the instantaneous variance is $b(t, u) = 2\beta$.

4. Bartlett's equation (see Exercise (13.3.4)) for $M(t, \theta) = \mathbb{E}(e^{\theta V(t)})$ is
$$
\frac{\partial M}{\partial t} = -\beta\theta\,\frac{\partial M}{\partial \theta} + \tfrac{1}{2}\sigma^2\theta^2 M
$$
with boundary condition $M(0, \theta) = e^{\theta u}$. Solve this equation (as in the exercise given) to obtain the moment generating function of the given normal distribution. Now $M(t, \theta) \to \exp\{\tfrac{1}{2}\theta^2\sigma^2/(2\beta)\}$ as $t \to \infty$, whence by the continuity theorem $V(t)$ converges in distribution to the $N(0, \tfrac{1}{2}\sigma^2/\beta)$ distribution.

If $V(0)$ has this limit distribution, then so does $V(t)$ for all $t$. Therefore the sequence $(V(t_1), \dots, V(t_n))$ has the same joint distribution as $(V(t_1 + h), \dots, V(t_n + h))$ for all $h, t_1, \dots, t_n$, whenever $V(0)$ has this normal distribution.

In the stationary case, $\mathbb{E}(V(t)) = 0$ and, for $s \le t$,
$$
\operatorname{cov}\bigl(V(s), V(t)\bigr) = \mathbb{E}\bigl\{ V(s)\,\mathbb{E}\bigl(V(t) \bigm| V(s)\bigr) \bigr\} = \mathbb{E}\bigl\{ V(s)^2 e^{-\beta(t-s)} \bigr\} = c(0)e^{-\beta|t-s|}
$$
where $c(0) = \operatorname{var}(V(s))$; we have used the first part here. This is the autocovariance function of a stationary Gaussian Markov process (see Example (9.6.10)). Since all such processes have autocovariance functions of this form (i.e., for some choice of $\beta$), all such processes are stationary Ornstein–Uhlenbeck processes.

The autocorrelation function is $\rho(s) = e^{-\beta|s|}$, which is the characteristic function of the Cauchy density function
$$
f(x) = \frac{1}{\beta\pi\{1 + (x/\beta)^2\}}, \qquad x \in \mathbb{R}.
$$


5. Bartlett's equation (see Exercise (13.3.2)) for $M$ is
$$
\frac{\partial M}{\partial t} = a\theta\,\frac{\partial M}{\partial \theta} + \tfrac{1}{2}\beta\theta^2\,\frac{\partial M}{\partial \theta},
$$
subject to $M(0, \theta) = e^{\theta d}$. The characteristics satisfy
$$
\frac{dt}{1} = \frac{d\theta}{-(a\theta + \tfrac{1}{2}\beta\theta^2)} = \frac{dM}{0}.
$$
The solution is $M = g\bigl(\theta e^{at}/(a + \tfrac{1}{2}\beta\theta)\bigr)$ where $g$ is a function satisfying $g\bigl(\theta/(a + \tfrac{1}{2}\beta\theta)\bigr) = e^{\theta d}$. The solution follows as given.

By elementary calculations,

whence $\operatorname{var}(D(t)) = (\beta d/a)e^{at}(e^{at} - 1)$. Finally
$$
\mathbb{P}(D(t) = 0) = \lim_{\theta \to -\infty} M(t, \theta) = \exp\Bigl\{ \frac{2ade^{at}}{\beta(1 - e^{at})} \Bigr\},
$$
which converges to $e^{-2ad/\beta}$ as $t \to \infty$.

6. The equilibrium density function $g(y)$ satisfies the (reduced) forward equation
$$
-\frac{d}{dy}(ag) + \frac{1}{2}\frac{d^2}{dy^2}(bg) = 0, \qquad (*)
$$
where $a(y) = -\beta y$ and $b(y) = \sigma^2$ are the instantaneous mean and variance. The boundary conditions are
$$
-a(y)g(y) + \tfrac{1}{2}\frac{d}{dy}\bigl( b(y)g(y) \bigr) = 0, \qquad y = -c,\, d.
$$
Integrate $(*)$ from $-c$ to $y$, using the boundary conditions, to obtain
$$
\beta y g + \tfrac{1}{2}\sigma^2 \frac{dg}{dy} = 0, \qquad -c \le y \le d.
$$
Integrate again to obtain $g(y) = Ae^{-\beta y^2/\sigma^2}$. The constant $A$ is given by the fact that $\int_{-c}^{d} g(y)\,dy = 1$.

7. First we show that the series converges uniformly (along a subsequence), implying that the limit exists and is a continuous function of $t$. Set

We have that

n- 1 . � sm(kt)

Zmn (t) = L..J -k- Xk o k=m

Mmn = sup { I Zmn (t) I ; ° ::; t ::; rr } .

I

n- 1 eikt 1 2 n- 1 X2 n-m- 1

I

n-I- 1 X · X · 1 M2 < su � - X < � --.Ii + 2 � � J J+l mn - P L..J k k - L..J k2 L..J L..J . ( " + I) . 09�1r k=m k=m 1=1 j=m J J


The mean value of the final term is, by the Cauchy-Schwarz inequality, no larger than

n-m-1 2 E

1= 1 lE (

I

n�

1 �j�i+1 1 2) = 2

nf 1

j=m J {j + 1) 1=1

n-I-1 I �-m '" < 2(n - m) - -L p(j + 1)2 - m4 • j=m

Combine this with (*) to obtain

It follows that

2 2 3 lE(Mm,2m ) � lE(Mm 2m) � r.:;; ' , ",m

(00

) 00

6 lE E M2n-l ,2n � E 2n/2 < 00,

n= 1 n=1 implying that 2:� 1 M2n- l 2n < 00 a .s . Therefore the series which defines W converges uniformly with probability 1 (along a subsequence) , and hence W has (a.s .) continuous sample paths.

Certainly W is a Gaussian process since Wet) is the sum of normal variables (see Problem (7 . 1 1 . 19» . Furthermore lE(W(t» = 0, and

cov (W(s) , Wet») = � + � f sin(ks) ;in (kt) rr rr k=1 k

since the Xi are independent with zero means and unit variances. It is an exercise in Fourier analysis to deduce that cov(W(s) , Wet»� = min{s , t } .

8. We wish to find a solution $g(t, y)$ to the equation
$$
\frac{\partial g}{\partial t} = \frac{1}{2}\frac{\partial^2 g}{\partial y^2}, \qquad |y| < b, \qquad (*)
$$
satisfying the boundary conditions
$$
g(0, y) = \delta_{y0} \text{ if } |y| \le b, \qquad g(t, y) = 0 \text{ if } |y| = b.
$$
Let $g(t, y \mid d)$ be the $N(d, t)$ density function, and note that $g(\cdot, \cdot \mid d)$ satisfies $(*)$ for any 'source' $d$. Let
$$
g(t, y) = \sum_{k=-\infty}^{\infty} (-1)^k g(t, y \mid 2kb),
$$
a series which converges absolutely and is differentiable term by term. Since each summand satisfies $(*)$, so does the sum. Now $g(0, y)$ is a combination of Dirac delta functions, one at each multiple of $2b$. Only one such multiple lies in $[-b, b]$, and hence $g(0, y) = \delta_{y0}$. Also, setting $y = b$, the contributions from the sources at $-2(k-1)b$ and $2kb$ cancel, so that $g(t, b) = 0$. Similarly $g(t, -b) = 0$, and therefore $g$ is the required solution.

Here is an alternative method. Look for solutions to $(*)$ of the form $e^{-\lambda_n t}\sin\{\tfrac{1}{2}n\pi(y+b)/b\}$; such a sine function vanishes when $|y| = b$. Substitute into $(*)$ to obtain $\lambda_n = n^2\pi^2/(8b^2)$. A linear combination of such functions has the form
$$
g(t, y) = \sum_{n=1}^{\infty} a_n e^{-\lambda_n t} \sin\Bigl( \frac{n\pi(y + b)}{2b} \Bigr).
$$


We choose the constants $a_n$ such that $g(0, y) = \delta_{y0}$ for $|y| < b$. With the aid of a little Fourier analysis, one finds that $a_n = b^{-1}\sin(\tfrac{1}{2}n\pi)$.

Finally, the required probability equals the probability that $W^{\mathrm a}$ has been absorbed by time $t$, a probability expressible as $1 - \int_{-b}^{b} g(t, y)\,dy$. Using the second expression for $g$, this yields
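
The two expansions of $g$ obtained above (the method of images and the eigenfunction series) can be compared numerically; they should agree to many decimal places once a modest number of terms is kept. A minimal Python sketch with arbitrary values of $b$ and $t$:

import numpy as np

b, t = 1.0, 0.4                       # barrier and time, chosen arbitrarily
y = np.linspace(-b, b, 5)

def g_images(t, y, terms=50):
    k = np.arange(-terms, terms + 1)
    centres = 2.0 * b * k
    return np.sum((-1.0) ** np.abs(k) *
                  np.exp(-(y[:, None] - centres) ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t),
                  axis=1)

def g_fourier(t, y, terms=200):
    n = np.arange(1, terms + 1)
    a_n = np.sin(0.5 * n * np.pi) / b
    lam = n ** 2 * np.pi ** 2 / (8 * b ** 2)
    return np.sum(a_n * np.exp(-lam * t) * np.sin(n * np.pi * (y[:, None] + b) / (2 * b)), axis=1)

print(g_images(t, y))
print(g_fourier(t, y))    # the two expansions of the absorbed density should agree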

9. Recall that $U(t) = e^{-2mD(t)}$ is a martingale. Let $T$ be the time of absorption, and assume that the conditions of the optional stopping theorem are satisfied. Then $\mathbb{E}(U(0)) = \mathbb{E}(U(T))$, which is to say that $1 = e^{2ma} p_a + e^{-2mb}(1 - p_a)$.

10. (a) We may assume that $a, b > 0$. With
$$
p_t(b) = \mathbb{P}\bigl(W(t) > b,\, F(0, t) \bigm| W(0) = a\bigr),
$$
we have by the reflection principle that
$$
p_t(b) = \mathbb{P}\bigl(W(t) > b \bigm| W(0) = a\bigr) - \mathbb{P}\bigl(W(t) < -b \bigm| W(0) = a\bigr)
= \mathbb{P}\bigl(b - a < W(t) < b + a \bigm| W(0) = 0\bigr),
$$
giving that
$$
\frac{\partial p_t(b)}{\partial b} = f(t, b + a) - f(t, b - a)
$$
where $f(t, x)$ is the $N(0, t)$ density function. Now, using conditional probabilities,
$$
\mathbb{P}\bigl(F(0, t) \bigm| W(0) = a,\, W(t) = b\bigr) = -\frac{1}{f(t, b - a)}\,\frac{\partial p_t(b)}{\partial b} = 1 - e^{-2ab/t}.
$$
(b) We know that
$$
\mathbb{P}\bigl(F(s, t)\bigr) = 1 - \frac{2}{\pi}\cos^{-1}\bigl\{ \sqrt{s/t} \bigr\} = \frac{2}{\pi}\sin^{-1}\bigl\{ \sqrt{s/t} \bigr\}
$$
if $0 < s < t$. The claim follows since $F(t_0, t_2) \subseteq F(t_1, t_2)$.
(c) Remember that $\sin x = x + o(x)$ as $x \downarrow 0$. Take the limit in part (b) as $t_0 \downarrow 0$ to obtain $\sqrt{t_1/t_2}$.

11. Let $M(t) = \sup\{W(s) : 0 \le s \le t\}$ and recall that $M(t)$ has the same distribution as $|W(t)|$. By symmetry,
$$
\mathbb{P}\Bigl( \sup_{0 \le s \le t} |W(s)| \ge w \Bigr) \le 2\,\mathbb{P}(M(t) \ge w) = 2\,\mathbb{P}\bigl(|W(t)| \ge w\bigr).
$$
By Chebyshov's inequality,
$$
\mathbb{P}\Bigl( \sup_{0 \le s \le t} |W(s)| \ge w \Bigr) \le \frac{2t}{w^2}.
$$
Fix $\epsilon > 0$, and let
$$
A_n(\epsilon) = \bigl\{ |W(s)|/s > \epsilon \text{ for some } s \text{ satisfying } 2^{n-1} < s \le 2^n \bigr\}.
$$
Note that
$$
\mathbb{P}\bigl(A_n(\epsilon)\bigr) \le \mathbb{P}\Bigl( \sup_{0 \le s \le 2^n} |W(s)| \ge \epsilon 2^{n-1} \Bigr) \le \mathbb{P}\Bigl( \sup_{0 \le s \le 2^n} |W(s)| \ge 2^{2n/3} \Bigr)
$$


for all large $n$, and also
$$
\sum_{n=1}^{\infty} \mathbb{P}\Bigl( \sup_{0 \le s \le 2^n} |W(s)| \ge 2^{2n/3} \Bigr) \le \sum_{n=1}^{\infty} \frac{2^{n+1}}{2^{4n/3}} < \infty.
$$
Therefore $\sum_n \mathbb{P}(A_n(\epsilon)) < \infty$, implying by the Borel–Cantelli lemma that (a.s.) only finitely many of the $A_n(\epsilon)$ occur. Therefore $t^{-1}W(t) \to 0$ a.s. as $t \to \infty$. Compare with the solution to the relevant part of Exercise (13.6.4).

12. We require the solution to Laplace's equation $\nabla^2 p = 0$, subject to the boundary condition
$$
p(w) = \begin{cases} 0 & \text{if } w \in H, \\ 1 & \text{if } w \in G. \end{cases}
$$
Look for a solution in polar coordinates of the form
$$
p(r, \theta) = \sum_{n=0}^{\infty} r^n \bigl\{ a_n \sin(n\theta) + b_n \cos(n\theta) \bigr\}.
$$
Certainly combinations having this form satisfy Laplace's equation, and the boundary condition gives that
$$
H(\theta) = b_0 + \sum_{n=1}^{\infty} \bigl\{ a_n \sin(n\theta) + b_n \cos(n\theta) \bigr\}, \qquad |\theta| < \pi, \qquad (*)
$$
where
$$
H(\theta) = \begin{cases} 0 & \text{if } -\pi < \theta < 0, \\ 1 & \text{if } 0 < \theta < \pi. \end{cases}
$$

The collection $\{\sin(m\theta), \cos(m\theta) : m \ge 0\}$ are orthogonal over $(-\pi, \pi)$. Multiply through $(*)$ by $\sin(m\theta)$ and integrate over $(-\pi, \pi)$ to obtain $\pi a_m = \{1 - \cos(m\pi)\}/m$, and similarly $b_0 = \tfrac{1}{2}$ and $b_m = 0$ for $m \ge 1$.

13. The joint density function of two independent $N(0, t)$ random variables is $(2\pi t)^{-1}\exp\{-(x^2 + y^2)/(2t)\}$. Since this function is unchanged by rotations of the plane, it follows that the two coordinates of the particle's position are independent Wiener processes, regardless of the orientation of the coordinate system. We may thus assume that $l$ is the line $x = d$ for some fixed positive $d$.

The particle is bound to visit the line $l$ sooner or later, since $\mathbb{P}(W_1(t) < d \text{ for all } t) = 0$. The first-passage time $T$ has density function
$$
f_T(t) = \frac{d}{\sqrt{2\pi t^3}}\,e^{-d^2/(2t)}, \qquad t > 0.
$$
Conditional on $\{T = t\}$, $D = W_2(T)$ is $N(0, t)$. Therefore the density function of $D$ is
$$
f_D(u) = \int_0^{\infty} f_{D \mid T}(u \mid t)\,f_T(t)\,dt = \int_0^{\infty} \frac{d}{2\pi t^2}\,e^{-(u^2 + d^2)/(2t)}\,dt = \frac{d}{\pi(u^2 + d^2)}, \qquad u \in \mathbb{R},
$$
giving that $D/d$ has the Cauchy distribution. The angle $\Theta$ satisfies $\Theta = \tan^{-1}(D/d)$, whence
$$
\mathbb{P}(\Theta \le \theta) = \mathbb{P}(D \le d\tan\theta) = \frac{1}{2} + \frac{\theta}{\pi}, \qquad |\theta| < \tfrac{1}{2}\pi.
$$
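
The Cauchy law of $D$ is easy to reproduce by simulation. The sketch below (Python) uses the standard fact that the first-passage time to the line $x = d$ has the same law as $d^2/Z^2$ with $Z$ a standard normal (the density quoted above in disguise), then draws $D$ conditionally and checks $\mathbb{P}(\Theta \le \theta) = \tfrac{1}{2} + \theta/\pi$.

import numpy as np

rng = np.random.default_rng(10)
d, trials = 2.0, 1_000_000

Z = rng.standard_normal(trials)
T = d ** 2 / Z ** 2                       # first-passage time to the line x = d
D = np.sqrt(T) * rng.standard_normal(trials)   # D | T is N(0, T)

theta = 0.6
print((D <= d * np.tan(theta)).mean(), "vs", 0.5 + theta / np.pi)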


14. By an extension of Itô's formula to functions of two Wiener processes, $u = u(W_1, W_2)$ and $v = v(W_1, W_2)$ satisfy
$$
du = u_x\,dW_1 + u_y\,dW_2 + \tfrac{1}{2}(u_{xx} + u_{yy})\,dt, \qquad
dv = v_x\,dW_1 + v_y\,dW_2 + \tfrac{1}{2}(v_{xx} + v_{yy})\,dt,
$$
where $u_x$, $v_{yy}$, etc., denote partial derivatives of $u$ and $v$. Since $\phi$ is analytic, $u$ and $v$ satisfy the Cauchy–Riemann equations $u_x = v_y$, $u_y = -v_x$, whence $u$ and $v$ are harmonic in that $u_{xx} + u_{yy} = v_{xx} + v_{yy} = 0$. Therefore,
$$
du = u_x\,dW_1 + u_y\,dW_2, \qquad dv = -u_y\,dW_1 + u_x\,dW_2.
$$
The matrix $\begin{pmatrix} u_x & u_y \\ -u_y & u_x \end{pmatrix}$ is an orthogonal rotation of $\mathbb{R}^2$ when $u_x^2 + u_y^2 = 1$. Since the joint distribution of the pair $(W_1, W_2)$ is invariant under such rotations, the claim follows.

15. One method of solution uses the fact that the reversed Wiener process $\{W(t-s) - W(t) : 0 \le s \le t\}$ has the same distribution as $\{W(s) : 0 \le s \le t\}$. Thus $M(t) - W(t) = \max_{0 \le s \le t}\{W(s) - W(t)\}$ has the same distribution as $\max_{0 \le u \le t}\{W(u) - W(0)\} = M(t)$. Alternatively, by the reflection principle,
$$
\mathbb{P}\bigl(M(t) \ge x,\, W(t) \le y\bigr) = \mathbb{P}\bigl(W(t) \ge 2x - y\bigr) \qquad \text{for } x \ge \max\{0, y\}.
$$
By differentiation, the pair $M(t)$, $W(t)$ has joint density function $-2\phi'(2x - y)$ for $y \le x$, $x \ge 0$, where $\phi$ is the density function of the $N(0, t)$ distribution. Hence $M(t)$ and $M(t) - W(t)$ have the joint density function $-2\phi'(x + y)$. Since this function is symmetric in its arguments, $M(t)$ and $M(t) - W(t)$ have the same marginal distribution.
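
A quick simulation illustrates the equality in distribution of $M(t)$ and $M(t) - W(t)$: compare empirical quantiles computed from the same discretised paths. A minimal Python sketch (grid and sample sizes are arbitrary choices):

import numpy as np

rng = np.random.default_rng(11)
t, n, trials = 1.0, 500, 20_000

W = np.cumsum(np.sqrt(t / n) * rng.standard_normal((trials, n)), axis=1)
M = W.max(axis=1)
A, B = M, M - W[:, -1]                  # M(t) and M(t) - W(t)

for q in (0.25, 0.5, 0.75, 0.9):
    print(q, np.quantile(A, q), np.quantile(B, q))   # matching quantiles, up to Monte Carlo error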

16. The Lebesgue measure $\lambda(Z)$ is given by
$$
\lambda(Z) = \int_0^{\infty} I_{\{W(t) = u\}}\,dt,
$$
whence by Fubini's theorem (cf. equation (5.6.13)),
$$
\mathbb{E}\bigl(\lambda(Z)\bigr) = \int_0^{\infty} \mathbb{P}\bigl(W(t) = u\bigr)\,dt = 0.
$$

17. Let $0 < a < b < c < d$, and let $M(x, y) = \max_{x \le s \le y} W(s)$. Then
$$
M(c, d) - M(a, b) = \max_{c \le s \le d}\{W(s) - W(c)\} + \{W(c) - W(b)\} - \max_{a \le s \le b}\{W(s) - W(b)\}.
$$
Since the three terms on the right are independent and continuous random variables, it follows that $\mathbb{P}\bigl(M(c, d) = M(a, b)\bigr) = 0$. Since there are only countably many rationals, we deduce that $\mathbb{P}\bigl(M(c, d) \ne M(a, b) \text{ for all rationals } a < b < c < d\bigr) = 1$, and the result follows.

18. The result is easily seen by exhaustion to be true when n = 1 . Suppose it is true for all m :::: n - 1 where n 2: 2. (i) If Sn :::: 0, then (whatever the final term of the permutation) the number of positive partial sums and the position of the first maximum depend only on the remaining n - 1 terms. Equality follows by the induction hypothesis. (ii) If Sn > 0, then

n

Ar = L Ar- I (k) , k= 1


where Ar- l (k) is the number of permutations with Xk in the final place, for which exactly r - 1 of the first n - 1 terms are strictly positive. Consider a permutation n = (XiI ' Xi2 ' . . . , Xin_ 1 ' Xk) with Xk in the final place, andmove the position ofxk to obtain the newpermutation n' = (Xb XiI ' Xi2 ' . . . , Xin_ I ) . The first appearance of the maximum in n' i s at its rth place if and only if the first maximum of the reduced permutation (XiI ' Xi2 ' . . . , Xin_l ) is at its (r - l) th place. [Note that r = 0 is impossible since Sn > 0.] It follows that n

Br = L Br- l (k) , k=l

where Br- l (k) is the number of permutations with Xk in the final place, for which the first appearance of the maximum is at the (r - l )th place.

By the induction hypothesis, Ar- l (k) = Br- l (k) , since these quantities depend on the n - 1 terms excluding Xk . The result follows.

19. Suppose that Sm = L:j=l Xj , 0 ::::: m ::::: n , are the partial sums of n independent identically distributed random variables Xj . Let An be the number of strictly positive partial sums, and Rn the index of the first appearance of the value of the maximal partial sum. Each of the n ! permutations of (Xl , X2 , . . . , Xn ) has the same joint distribution. Consider the kth permutation, and let h be the indicator function of the event that exactly r partial sums are positive, and let h be the indicator function that the first appearance of the maximum is at the rth place. Then, using Problem ( 1 3 . 1 2. 1 8),

1 n ! 1 n l IP'(An = r) = ,. L lE(h) = ,. L lE(h) = IP'(Rn = r) . n . k= l n . k= l

We apply this with Xj = W(jt/n) - W«j - 1)t/n) , so that Sm = W(mt/n) . Thus An L:j I{W(jt/n» O} has the same distribution as

Rn = min {k 2': 0 : W(kt/n) = �x W(jt/n) } . O::s} ::sn

By Problem ( 13 . 12. 17), Rn � R as n --+ 00. By Problem ( 13 . 12 . 1 6), the time spent by W at zero is a.s. a null set, whence An � A. Hence A and R have the same distribution. We argue as follows to obtain that that L and R have the same distribution. Making repeated use of Theorem ( 13 .4.6) and the symmetry of W,

IP'(L < x) = IP' ( sup W(s) < 0) + IP' ( inf W(s) > 0) x::ss::st x::ss::st

= 21P' ( sup {W(s) - W(x) } < -W(x)) = 21P' ( IW (t) - W(x) 1 < W(x)) x::ss::st

= 1P'( I W(t) - W(x) 1 < I W (x) l ) = IP' ( sup {W(s) - W(x) } < sup {W(s) - W (X) }) = IP'(R ::::: x) .

x::ss::st O::ss::sx

Finally, by Problem ( 13 . 1 2. 15) and the circular symmetry of the joint density distribution of two independent N(O, 1) variables U, V,

1P' ( IW(t) - W(x) 1 < I W(x) l ) = 1P'(t - x) V2 < x U2 ) = IP' ( 2 V2

2 ::::: ::) = � sin- l �. - U + V t n V t 20. Let

{ inf {t ::::: 1 : W (t) = x } if this set i s non-empty, Tx = 1 otherwise,


and similarly Vx = sup{t .:::: 1 : Wet) = x } , with Vx = 1 if W(t) =f:. x for all t E [0, 1 ] . Recall that Uo and Vo have an arc sine distribution as in Problem ( 13 . 12. 19). On the event {UX < I }, we may write (using the re-scaling property of W)

Ux = Tx + ( 1 - Tx)Uo , Vx = Tx + ( 1 - Tx)Yo ,

where Uo and Yo are independent of Ux and Vx , and have the above arc sine distribution. Hence Ux and Vx have the same distribution. Now Tx has the first passage distribution of Theorem ( 13 .4.5), whence

Therefore,

and

fux (u) = iou irx,ux (t , u) du = 7r .JU(� _ u) exp ( -�:) , 0 < x < l .

21. Note that V is a martingale, by Theorem ( 13 .8 . 1 1 ) . Fix t and let 1/Is = sign(Ws) , 0 .:::: s ::'S t . We have that 1 1 1/1 1 1 = ..,fi, implying by Exercise ( 13 .8 .6) that lE(Vh = I I I (1/I) I I � = t . By a similar

calculation, lE(V? 1 :Fs ) = V} + t - s for 0 ::'S s .:::: t . That is to say, V? - t defines a martingale, and the result follows by the Levy characterization theorem of Example ( 12.7. 10).

22. The mean cost per unit time is

Differentiate to obtain that /-L' (T) = 0 if

R = 2C {loT <l>(a/.Ji) dt - T<l> (a/.JT) } = aC loT t- 1f/> (a/../t) dt,

where we have integrated by parts .

23. Consider the portfolio with � (t , St ) units of stock and 1/I(t , Sr) units of bond, having total value w(t , St ) = x� (t, x) + ert1/l (t , St) . By assumption,

(*) (1 - y)xW, x) = yert 1/I (t , x) .

Differentiate this equation with respect to x and substitute from equation ( 13 . 10. 16) to obtain the

differential equation ( 1 - y)� + x�x = 0, with solution � (t , x) = h (t)xy- 1 , for some function h (t) . We substitute this, together with (*), into equation ( 13 . 10. 17) to obtain that

h' - h ( 1 - y) ( i ya2 + r) = O.

It follows that h (t) = A exp{( 1 - y) ( ! ya2 + r) t } , where A is an absolute constant to be determined

according to the size of the initial investment. Finally, wet , x) = y- l x�(t , x) = y- l h (t)xY .

24. Using Ito's formula ( 13 .9.4), the drift term in the SDE for Ut is

(-U l (T - t , W) + !U22 (T - t , W») dt,

where U l and U22 denote partial derivatives of u . The drift function is identically zero if and only if

U l = i U22 '


Bibliography

A man will turn over half a library to make one book. Samuel Johnson

Abramowitz, M. and Stegun, I. A. (1965). Handbook of mathematical functions with formulas, graphs and mathematical tables. Dover, New York.

Billingsley, P. (1995). Probability and measure (3rd edn). Wiley, New York.

Breiman, L. (1968). Probability. Addison-Wesley, Reading, MA, reprinted by SIAM, 1992.

Chung, K. L. (1974). A course in probability theory (2nd edn). Academic Press, New York.

Cox, D. R. and Miller, H. D. (1965). The theory of stochastic processes. Chapman and Hall, London.

Doob, J. L. (1953). Stochastic processes. Wiley, New York.

Feller, W. (1968). An introduction to probability theory and its applications, Vol. 1 (3rd edn). Wiley, New York.

Feller, W. (1971). An introduction to probability theory and its applications, Vol. 2 (2nd edn). Wiley, New York.

Grimmett, G. R. and Stirzaker, D. R. (2001). Probability and random processes (3rd edn). Oxford University Press, Oxford.

Grimmett, G. R. and Welsh, D. J. A. (1986). Probability, an introduction. Clarendon Press, Oxford.

Hall, M. (1983). Combinatorial theory (2nd edn). Wiley, New York.

Harris, T. E. (1963). The theory of branching processes. Springer, Berlin.

Karlin, S. and Taylor, H. M. (1975). A first course in stochastic processes (2nd edn). Academic Press, New York.

Karlin, S. and Taylor, H. M. (1981). A second course in stochastic processes. Academic Press, New York.

Laha, R. G. and Rohatgi, V. K. (1979). Probability theory. Wiley, New York.

Loève, M. (1977). Probability theory, Vol. 1 (4th edn). Springer, Berlin.

Loève, M. (1978). Probability theory, Vol. 2 (4th edn). Springer, Berlin.

Moran, P. A. P. (1968). An introduction to probability theory. Clarendon Press, Oxford.

Stirzaker, D. R. (1994). Elementary probability. Cambridge University Press, Cambridge.

Stirzaker, D. R. (1999). Probability and random variables. Cambridge University Press, Cambridge.

Williams, D. (1991). Probability with martingales. Cambridge University Press, Cambridge.


Index

Abbreviations used in this index: c.f. characteristic function; distn distribution; eqn equation;

fn function; m.g.f. moment generating function; p.g.f. probability generating function; pro

process; r.v. random variable; r.w. random walk; s.r.w. simple random walk; thm theorem.

A absolute value of s.r. W. 6. 1.3

absorbing barriers: s.r.w. 1.7.3, 3.9. 1, 5, 3. 1 1.23, 25-26, 12.5 .4-5, 7; Wiener pro 12.9.22-3, 13. 12.8-9

absorbing state 6 .2 . 1 adapted process 13.8.6

affine transformation 4. 13. 11 ; 4. 14.60

age-dependent branching pro 5 .5 . 1-2; conditional 5 . 1.2; honest martingale 12.9.2; mean 10.6. 13

age, see current life airlines 1.8.39, 2.7.7

alarm clock 6.15.21 algorithm 3 .11 .33, 4. 14.63, 6 .14.2

aliasing method 4. 1 1.6 alternating renewal pro 10.5.2,

10.6 . 14 American call option 13. 10.4

analytic fn 13. 12. 14 ancestors in common 5 .4.2 anomalous numbers 3.6.7 Anscombe's theorem 7. 11 .28

antithetic variable 4. 1 1. 1 1 ants 6. 15.41

arbitrage 3.3.7, 6.6.3 Arbuthnot, I. 3. 1 1.22

arc sine distn 4. 1 1 . 13; sample from 4. 11 . 13

arc sine law density 4. 1. 1 arc sine laws for r.w. : maxima

3. 11 .28; sojourns 5 .3.5 ; visits 3. 10.3

arc sine laws for Wiener pro 13.4.3, 13. 12. 10, 13. 12.19

Archimedes's theorem 4. 1 1.32

arithmetic r.v. 5.9.4

attraction 1.8.29

autocorrelation function 9.3.3, 9 .7.5, 8

autocovariance function 9.3.3, 9.5 .2, 9 .7.6, 19-20, 22

autoregressive sequence 8.7.2, 9 . 1 . 1, 9.2. 1, 9 .7.3

average, see moving average

B babies 5 . 10.2

backward martingale 12.7.3

bagged balls 7. 1 1.27, 12.9. 13-14

balance equations 11 .7. 13

Bandrika 1.8.35-36, 4.2.3

bankruptcy, see gambler's ruin

Barker's algorithm 6. 14.2

barriers: absorbing/retaining in r.w. 1.7.3, 3.9. 1-2, 3. 1 1.23, 25-26; hitting by Wiener pro 13.4.2

Bartlett: equation 13.3.2-4; theorem 8.7.6, 1 1.7. 1

batch service 11 .8.4

baulking 8.4.4, 11 .8.2, 19

Bayes's formula 1.8. 14, 1.8.36

bears 6 .13. 1, 10.6. 19

Benford's distn 3.6.7

Berkson's fallacy 3 .1 1.37

Bernoulli: Daniel 3.3.4, 3.4.3-4; Nicholas 3.3.4


Bernoulli: mode1 6. 15 .36; renewal 8.7.3; shift 9. 17. 14; sum of r.v.s 3. 11 . 14, 35

Bertrand's paradox 4. 14.8 Bessel: function 5 .8.5, 11 .8.5,

1 1.8. 16; B. pro 12.9.23, 13.3.5-6, 13.9. 1

best predictor 7.9 .1 ; linear 7.9.3, 9.2. 1-2, 9.7. 1, 3

beta fn 4.4.2, 4. 10.6 beta distn: b.-binomial 4.6.5;

sample from 4. 11 .4-5 betting scheme 6.6.3 binary: expansion 9. 1.2; fission

5 .5 . 1 binary tree 5 . 12.38; r.w. on 6.4.7 binomial r.v. 2 . 1.3; sum of 3 .11 .8,

1 1 birth process 6.8.6; dishonest

6.8.7; forward eqns 6.8.4; divergent 6.8.7; with immigration 6.8.5; non-homogeneous 6 .15.24; see also simple birth

birth-death process: coupled 6. 15 .46; extinction 6 .11 .3, 5, 6 . 15 .25, 12.9. 10; honest 6. 15.26; immigration-death 6 .11 .3, 6. 13. 18, 28; jump chain 6 .11 .1 ; martingale 12.9. 10; queue 8.7.4; reversible 6.5 . 1, 6 .15 . 16; symmetric 6 .15 .27; see also simple birth-death

birthdays 1.8.30 bivariate: Markov chain 6. 15.4;

negative binomial distn 5. 12. 16; p.g.f. 5 . 1.3

bivariate normal distn 4.7.5-6, 12, 4.9 .4-5, 4. 14. 13, 16, 7.9.2,


7.11 .19 ; c.f. 5 .8. 11 ; positive part 4.7.5, 4.8.8, 5.9.8

Black-Scholes: model 13. 12.23; value 13. 10.5

Bonferroni's inequality 1.8.37 books 2.7. 15 Boo1e's inequalities 1.8. 1 1 Borel: normal number theorem

9.7. 14; paradox 4.6 . 1 Borel-Cantelli lemmas 7.6. 1,

13. 12. 11 bounded convergence 12.1.5 bow tie 6.4. 11 Box-Muller normals 4. 11.7 branching process: age-dependent

5.5. 1-2, 10.6. 13; ancestors 5.4.2; conditioned 5 . 12.21, 6.7. 1-4; convergence, 12.9 .8; correlation 5 .4. 1; critical 7. 10. 1; extinction 5 .4.3; geometric 5.4.3, 5 .4.6; imbedded in queue 11 .3.2, 11 .7.5, 11; with immigration 5 .4.5, 7.7.2; inequality 5 . 12. 12; martingale 12. 1 .3, 9, 12.9. 1-2, 8; maximum of 12.9.20; moments 5 .4. 1; p.g.f. 5 .4.4; supercritical 6.7.2; total population 5 . 12 .11 ; variance 5 . 12.9; visits 5 .4.6

bridge 1.8.32 Brownian bridge 9.7.22, 13.6.2-5 ;

autocovariance 9.7.22, 13.6.2; zeros of 13.6.5

Brownian motion; geometric 13.3.9; tied-down, see Brownian bridge

Buffon: cross 4.5.3; needle 4.5 .2, 4. 14.31-32; noodle 4. 14.31

busy period 6. 12. 1; in GIGII 1 1.5 . 1 ; in MlGll 11 .3.3; in MIMIl 11 .3.2, 11 .8.5; in MlMloo 11 .8.9

C cake, hot 3.1 1.32 call option: American 13. 10.4;

European 13. 10.4-5 Campbell-Hardy theorem 6 .13.2 capture-recapture 3.5.4 car, parking 4. 14.30 cards 1.7.2, 5, 1.8.33 Carroll, Lewis 1.4.4 casino 3.9 .6, 7.7.4, 12.9 . 16 Cauchy convergence 7.3. 1; in

m.s. 7. 11 . 11 Cauchy distn 4.4.4; maximum

7. 11 .14; moments 4.4.4;

sample from 4. 1 1.9; sum 4.8.2, 5 . 1 1.4, 5 . 12.24-25

Cauchy-Schwarz inequality 4. 14.27

central limit theorem 5. 10. 1, 3, 9, 5 . 12.33, 40, 7. 1 1 .26, 10.6.3

characteristic function 5 . 12.26-31; bivariate normal 5 .8. 11 ; continuity theorem 5. 12.39; exponential distn 5 .8.8; extreme-value distn 5 . 12.27; first passage distn 5 . 10.7-8; joint 5 . 12.30; law of large numbers 7. 11 . 15 ; multinormal distn 5 .8.6; tails 5 .7.6; weak law 7. 1 1. 15

Chebyshov's inequality, one-sided 7. 1 1.9

cherries 1.8.22

chess 6.6.6-7

chimeras 3. 1 1.36

chi-squared distn: non-central 5 .7.7; sum 4. 10. 1, 4. 14.12

Cholesky decomposition 4. 14.62

chromatic number 12.2.2

coins: double 1.4.3; fair 1.3.2; first head 1.3.2, 1 .8.2, 2 .7. 1; patterns 1.3.2, 5 .2.6, 5 . 12.2, 10.6. 17, 12.9 . 16; transitive 2.7. 16; see Poisson flips

colouring: graph 12.2.2; sphere 1.8.28; theorem 6.15.39

competition lemma 6. 13.8

complete convergence 7.3.7

complex-valued process 9.7.8

compound: Poisson pro 6. 15 .21; Poisson distn 3.8.6, 5 . 12 . 13

compounding 5 .2.3, 5.2.8

computer queue 6.9.3

concave fn 6 . 15 .37

conditional: birth-death pro 6. 1 1.4-5; branching pro 5 . 12.21, 6.7. 1-4; convergence 13.8.3; correlation 9.7.21; entropy 6.15 .45 ; expectation 3.7.2-3, 4.6.2, 4. 14. 13, 7.9 .4; independence 1.5.5; probability 1 .8.9 ; s.r.w. 3.9.2-3; variance 3.7.4, 4.6.7; Wiener pro 8.5.2, 9.7.21, 13.6 .1 ; see also regression

continuity of: distn fns 2.7. 10; marginals 4.5 . 1; probability measures 1.8. 16, 1.8. 18; transition probabilities 6 . 15 . 14

continuity theorem 5 . 12.35, 39


continuous r.v.: independence 4.5.5, 4. 14.6; limits of discrete r.v.s 2.3. 1

convergence: bounded 12. 1 .5; Cauchy 7.3. 1, 7. 11 . 11 ; complete 7.3.7; conditional 13.8.3; in distn 7.2.4, 7. 1 1.8, 16, 24; dominated 5.6.3, 7.2.2; event of 7.2.6; martingale 7.8.3, 12. 1.5, 12.9.6; Poisson pro 7. 11 .5 ; in probability 7.2.8, 7. 11 . 15 ; subsequence 7. 1 1.25; in total variation 7.2.9

convex: fn 5 .6 .1, 12. 1.6-7; rock 4. 14.47; shape 4. 13.2-3, 4. 14.61

corkscrew 8.4.5 Com Flakes 1.3.4, 1.8. 13 countable additivity 1.8. 18 counters 10.6.6-8, 15 coupling: birth-death pro 6 .15 .46;

maximal 4. 12.4-6, 7. 11 . 16 coupons 3.3.2, 5 .2.9, 5 . 12.34 covariance: matrix 3. 11 . 15, 7.9.3;

of Poisson pro 7. 1 1.5 Cox process 6. 15.22 Cp inequality Cramer-Wold device 7. 11 . 19,

5 .8. 1 1 criterion: irreducibility 6 .15 .15 ;

Kolmogorov's 6 .5 .2 ; for persistence 6.4. 10

Crofton's method 4. 13.9 crudely stationary 8.2.3 cube: point in 7. 11 .22; r.w. on

6.3.4 cumulants 5 .7.3-4 cups and saucers 1.3.3 current life 10.5 .4; and excess

10.6.9; limit 10.6.4; Markov 10.3.2; Poisson 10.5 .4

D dam 6.4.3 dead period 10.6.6-7 death-immigration pro 6 .1 1.3 decay 5 . 12.48, 6.4.8 decimal expansion 3.1 .4, 7. 1 1.4 decomposition: Cholesky 4. 14.62;

Krickeberg 12.9 . 1 1 degrees o f freedom 5 .7.7-8 delayed renewal pro 10.6. 12 de Moivre: martingale 12. 1.4,

12.4.6; trial 3.5 . 1 D e Morgan laws 1.2 . 1 density: arc sine 4. 11 . 13; arc

sine law 4. 1 . 1 ; betsa 4. 11 .4,


4. 14. 11, 19, 5 .8.3; bivariate normal 4.7.5-6, 12, 4.9.4-5, 4. 14. 13, 16, 7.9.2, 7. 11 . 19; Cauchy 4.4.4, 4.8.2, 4. 10.3, 4. 14.4, 16, 5 .7. 1, 5 . 1 1.4, 5 . 12. 19, 24-25, 7. 1 1 . 14; chi-squared 4. 10. 1, 4. 14. 12, 5 .7.7; Dirichlet 4. 14.58; exponential 4.4.3, 4.5.5, 4.7.2, 4.8. 1, 4. 10.4, 4. 14.4-5, 17-19, 24, 33, 5 . 12.32, 39, 6.7. 1 ; extreme-value 4. 1 . 1, 4. 14.46, 7. 11 . 13; F(r, s) 4. 10.2, 4, 5 .7.8; first passage 5 . 10.7-8, 5 . 12. 18-19; Fisher's spherical 4. 14.36; gamma 4. 14. 10-12, 5 .8.3, 5.9 .3, 5 . 10.3, 5 . 12 .14, 33; hypoexponential 4.8.4; log-normal 4.4.5, 5 . 12.43; multinormal 4.9 .2, 5 .8.6; normal 4.9 .3, 5, 4. 14. 1, 5.8.4-6, 5 . 12.23, 42, 7. 11 . 19; spectral 9.3.3; standard normal 4.7.5; Student's t 4. 10.2-3, 5 .7.8; uniform 4.4.3, 4.5 .4, 4.6.6, 4.7. 1, 3, 4, 4.8.5, 4. 1 1 . 1, 8, 4. 14.4, 15, 19, 20, 23-26, 5 . 12.32, 7. 11 .4, 9. 1.2, 9.7.5; WeibulI 4.4.7, 7. 11 . 13

departure pro 1 1.2.7, 1 1.7.2-4, 11 .8. 12

derangement 3.4.9 diagonal selection 6.4.5

dice 1.5 .2, 3.2.4, 3.3.3, 6. 1.2; weightedlloaded 2.7. 12, 5 . 12.36

difference eqns 1.8.20, 3.4.9, 5.2.5

difficult customers 11 . 7.4

diffusion: absorbing barrier 13. 12.8-9; Bessel pro 12.9.23, 13.3.5-6, 13.9 .1 ; Ehrenfest model 6.5 .5, 36; first passage 13.4.2 ; Ito pro 13.9.3; models 3.4.4, 6.5.5, 6 .15. 12, 36; Ornstein-Uhlenbeck pro 13.3.4, 13.7.4-5, 13. 12.3-4, 6; osmosis 6.15.36; reflecting barrier 13.5 .1, 13. 12.6; Wiener pro 12.7.22-23; zeros 13.4. 1, 13. 12. 10; Chapter 13 passim

diffusion approximation to birth-death pro 13.3 .1

dimer problem 3.11 .34

Dirichlet: density 4. 14.58; distn 3. 1 1.31

disasters 6. 12.2-3, 6. 15 .28 discontinuous marginal 4.5 . 1 dishonest birth pro 6.8.7

distribution: see also density; arc sine 4. 11 . 13; arithmetic 5.9.4; Benford 3.6.7; Bernoulli 3. 11 . 14, 35 ; beta 4. 11 .4; beta-binomial 4.6.5; binomial 2 .1 .3, 3. 1 1.8, 1 1, 5 . 12.39; bivariate normal 4.7.5-6, 12; Cauchy 4.4.4; chi-squared 4. 10. 1; compound 5.2.3; compound Poisson 5 . 12 .13; convergence 7.2.4; Dirichlet 3. 1 1.31, 4. 14.58; empirical 9 .7.22; exponential 4.4.3, 5 . 12.39; F(r, s) 4. 10.2-4; extreme-value 4. 1 .1, 4. 14.46, 7. 11 . 13; first passage 5 . 10.7-8. 5. 12. 18-19; gamma 4. 14. 10-12; Gaussian, see normal; geometric 3. 1 . 1, 3.2.2, 3.7.5, 3. 11 .7, 5 . 12.34, 39, 6 . 1 1.4; hypergeometric 3 .11 . 1 0-11 ; hypoexponential 4.8.4; infinitely divisible 5 . 12. 13-14; inverse square 3 .1 . 1 ; joint 2.5.4; lattice 5 .7.5 ; logarithmic 3 .1 . 1, 5 .2.3; log-normal 4.4.5, 5 . 12.43; maximum 4.2.2, 4. 14. 17; median 2.7. 1 1, 4.3.4, 7.3. 11 ; mixed 2 .1 .4, 2.3.4, 4. 1.3; modified Poisson 3 .1 .1 ; moments 5 . 11 .3; multinomial 3.5. 1, 3.6.2; multinormal 4.9.2; negative binomial 3.8.4, 5.2.3, 5 . 12.4, 16; negative hypergeometric 3.5.4; non-central 5 .7.7-8; normal 4.4.6, 8, 4.9 .3-5; Poisson 3. 1 . 1, 3.5.2-3, 3. 1 1.6, 4. 14. 1 1, 5.2.3, 5 . 10.3, 5 . 12.8, 14, 17, 33, 37, 39, 7. 11 . 18; spectral 9 .3.2, 4; standard normal 4.7.5; stationary 6.9. 11 ; Student's t 4. 10.2-3, 5 .7.8; symmetric 3.2.5; tails 5 . 1 .2, 5.6.4, 5 . 1 1.3; tilted 5 .7. 1 1 ; trapezoidal 3.8. 1; trinormal 4.9 .8-9; uniform 2.7.20, 3.7.5, 3.8. 1, 5 . 1.6, 9 .7.5 ; Weibu1l 4.4.7, 7. 11 . 13; zeta or Zipf 3. 1 1.5

divergent birth pro 6.8.7 divine providence 3.1 1.22

Dobrushin's bound and ergodic coefficient 6. 14.4

dog-flea model 6.5.5, 6 . 15 .36 dominated convergence 5.6.3,

7.2.2

Doob's L2 inequality 13.7. 1

Doob-Kolmogorov inequality 7.8. 1-2


doubly stochastic: matrix 6 .1 . 12, 6. 15.2; Poisson pro 6.15.22-23

downcrossings inequality 12.3. 1 drift 13.3.3, 13.5 . 1, 13.8.9,

13. 12.9 dual queue 1 1.5.2 duration of play 12. 1.4

E Eddington's controversy 1.8.27 editors 6.4. 1 eggs 5 . 12. 13 Ehrenfest model 6.5 .5, 6 .15 .36 eigenvector 6.6. 1-2, 6. 15.7 embarrassment 2.2.1 empires 6. 15. 10 empirical distn 9.7.22 entrance fee 3.3.4 entropy 7.5 .1 ; conditional

6. 15.45 ; mutual 3.6.5 epidemic 6. 15 .32 equilibrium, see stationary equivalence class 7. 1. 1 ergodic: coefficient 6. 14.4;

measure 9.7. 11 ; stationary measure 9.7. 1 1

ergodic theorem: Markov chain 6 .15 .44, 7. 11 .32; Markov pro 7. 11 .33, 10.5 . 1 ; stationary pro 9.7. 10-11, 13

Erlang's loss formula 11 .8.19 error 3.7.9; of prediction 9.2.2 estimation 2.2.3, 4.5.3, 4. 14.9,

7. 1 1.31 Euler: constant 5 . 12.27, 6.15.32;

product 5 . 12.34 European call option 13. 10.4-5 event: of convergence 7.2.6;

exchangeable 7.3.4-5; invariant 9 .5 .1 ; sequence 1.8.16; tail 7.3.3

excess life 10.5 .4; conditional 10.3.4; and current 10.6.9; limit 10.3.3; Markov 8.3.2, 10.3.2; moments 10.3.3; Poisson 6.8.3, 10.3. 1, 10.6.9; reversed 8.3.2; stationary 10.3.3

exchangeability 7.3.4-5 expectation: conditional

3.7.2-3, 4.6.2, 4. 14. 12, 7.9.4; independent r.v.s 7.2.3; linearity 5.6.2; tail integral 4.3.3, 5; tail sum 3 .11 .13, 4. 14.3

exponential distn: c.f. 5.8.8; holding time 1 1.2.2; in Poisson pro 6.8.3; lack-of-memory


property 4. 14.5 ; limit in branching pro 5.6.2, 5 . 12.21, 6.7 .1 ; limit of geometric distn 5. 12.39 ; heavy traffic limit 1 1 .6.1 ; distn of maximum 4. 14. 18; in Markov pro 6.8.3, 6.9.9; order statistics 4. 14.33; sample from 4. 14.48; sum 4.8. 1, 4, 4. 14. 10, 5 . 12.50, 6.15.42

exponential martingale 13.3.9 exponential smoothing 9 .7.2 extinction: of birth-death pro

6.11 .3, 6 .15 .25, 27, 12.9. 10; of branching pro 6.7.2-3

extreme-value distn 4. 1 . 1, 4. 14.46, 5 . 12.34, 7. 11 . 13; c.f. and mean 5 . 12.27

F F(r, s) distn 4. 10.2, 4;

non-central 5.7.8 fair fee 3.3.4 families 1.5 .7, 3.7.8 family, planning 3. 11 .30 Farkas's theorem 6.6.2 filter 9.7.2 filtration 12.4. 1-2, 7 fingerprinting 3.11 .21 finite: Markov chain 6.5 .8, 6.6.5,

6 .15 .43-44; stopping time 12.4.5 ; waiting room 1 1.8. 1

first passage: c.f. 5. 10.7-8; diffusion pro 13.4.2; distn 5 .10.7-8, 5 . 12 .18-19; Markov chain 6.2. 1, 6.3.6; Markov pro 6.9.5-6; mean 6.3.7; m.g.f. 5 .12 .18; s.r.w. 5.3.8; Wiener pro 13.4.2

first visit by s.r.w. 3. 10. 1, 3 Fisher: spherical distn 4. 14.36;

F.-Tippett-Gumbel distn 4. 14.46

FKG inequality 3.11 . 18, 4. 11 . 11 flip-flop 8.2. 1 forest 6. 15.30 Fourier: inversion thm 5.9.5;

series 9.7. 15, 13. 12.7 fourth moment strong law 7. 1 1.6 fractional moments 3.3.5, 4.3. 1 function, upper-class 7.6. 1 functional eqn 4. 14.5, 19

G Galton's paradox 1.5.8 gambler's ruin 3.11 .25-26, 12. 1.4,

12.5.8

gambling: advice 3.9.4; systems 7.7.4

gamma distn 4. 14. 10-12, 5 .8.3, 5.9 .3, 5 . 10.3, 5 . 12 . 14, 33; g. and Poisson 4. 14. 11 ; sample from 4. 1 1.3; sum 4. 14. 1 1

gamma fn 4.4. 1, 5 . 12.34 gaps: Poisson 8.4.3, 10. 1.2;

recurrent events 5 . 12.45 ; renewal 10. 1.2

Gaussian distn, see normal distn

Gaussian pro 9.6.2-4, 13. 12.2; Markov 9.6.2; stationary 9.4.3; white noise 13.8.5

generator 6.9 . 1 geometric Brownian motion,

Wiener pro 13.3.9 geometric distn 3.1 . 1, 3.2 .2,

3.7.5, 3 .11 .7, 5 . 12.34, 39; lack-of-memory property 3.11 .7; as limit 6 .1 1.4; sample from 4. 11 .8; sum 3.8.3-4

goat 1.4.5 graph: colouring 12.2.2; r.w.

6.4.6, 9, 13. 11 .2-3

H Hlijek-Renyi-Chow inequality

12.9 .4-5 Hall, Monty 1.4.5 Hastings algorithm 6. 14.2 Hawaii 2.7. 17 hazard rate 4. 1.4, 4.4.7; technique

4. 11 . 10 heat eqn 13. 12.24 Heathrow 10.2 . 1 heavy traffic 11 .6. 1, 11 .7. 16 hen, see eggs

Hewitt-Savage zero-one law 7.3.4-5

hitting time 6.9 .5-6; theorem 3.10.1, 5 .3.8

Hoeffding's inequality 12.2. 1-2 Holder's inequality 4. 14.27 holding time 1 1.2.2 homogeneous Markov chain 6. 1 . 1 honest birth-death pro 6. 15 .26 Hotelling's theorem 4. 14.59 house 4.2. 1, 6. 15 .20, 5 1 hypergeometric distn 3. 11 . 10-1 1 hypoexponential distn 4.8.4

I idle period 1 1 .5 .2, 11 .8.9 imbedding: jump chain 6.9 . 11 ;

Markov chain 6.9. 12, 6 .15 . 17; queues: DIMJI 11 .7. 16; GIMJI


11 .4. 1, 3; MlGIl 11 .3 .1, 11 .7.4; unsuccessful 6.15 . 17

immigration: birth-i. 6.8.5 ; branching 5 .4.5, 7.7.2; i.-death 6 .11 .2, 6 .15 . 18; with disasters 6. 12.2-3, 6. 15 .28

immoral stimulus 1.2 . 1 importance sampling 4. 11 . 12 inclusion-exclusion principle

1.3.4, 1.8. 12 increasing sequence: of events

1.8. 16; of r.v.s 2.7.2 increments: independent 9.7.6,

16-17; orthogonal 7.7. 1 ; spectral 9.4. 1, 3; stationary 9.7. 17; of Wiener pro 9.7.6

independence and symmetry 1.5.3 independent: conditionally 1.5.5;

continuous r.v.s 4.5.5, 4. 14.6; current and excess life 10.6.9; customers 11 .7. 1 ; discrete r.v.s 3.11 . 1, 3; events 1.5 . 1 ; increments 9 .7. 17; mean, variance of normal sample 4. 10.5, 5 . 12.42; normal distn 4.7.5; pairwise 1.5 .2, 3.2. 1, 5 . 1 .7; set 3.1 1.40; triplewise 5 .1 .7

indicators and matching 3. 11 . 17 inequality: bivariate normal

4.7. 12; Bonferroni 1.8.37; Boole 1.8. 11 ; Cauchy-Schwarz 4. 14.27; Chebyshov 7. 1 1 .9; Dobrushin 6. 14.4; Doob-Kolmogorov 7.8. 1; Doob L2 13.7. 1 ; downcrossings 12.3 . 1 ; FKG 3. 1 1 . 18; Hlijek-Renyi-Chow 12.9.4-5; Hoeffding 12.2. 1-2; Holder 4. 14.27; Jensen 5 .6. 1, 7.9.4; Kolmogorov 7.8. 1-2, 7. 1 1 .29-30; Kounias 1.8.38; Lyapunov 4. 14.28; maximal 12.4.3-4, 12.9.3, 5, 9; m.g.f. 5.8.2, 12.9.7; Minkowski 4. 14.27; triangle 7. 1 . 1, 3; upcrossings 12.3.2

infinite divisibility 5 . 12. 13-14 inner product 6. 14. 1, 7. 1.2 inspection paradox 10.6.5 insurance 1 1.8. 18, 12.9 . 12 integral: Monte Carlo 4. 14.9;

normal 4. 14. 1; stochastic 9.7. 19, 13.8. 1-2

invariant event 9.5 .1 inverse square distn 3.1 .1 inverse transform technique 2.3.3 inversion theorem 5.9.5; c.f.

5 . 12.20


irreducible Markov pr. 6.15.15
iterated logarithm 7.6.1
Itô: formula 13.9.2; process 13.9.3

J
Jaguar 3.11.25
Jensen's inequality 5.6.1, 7.9.4
joint: c.f. 5.12.30; density 2.7.20; distn 2.5.4; mass fn 2.5.5; moments 5.12.30; p.g.f. 5.1.3-5
jump chain: of M/M/1 11.2.6

K
key renewal theorem 10.3.3, 5, 10.6.11
Keynes, J. M. 3.9.6
knapsack problem 12.2.1
Kolmogorov: criterion 6.5.2; inequality 7.8.1-2, 7.11.29-30
Korolyuk-Khinchin theorem 8.2.3
Kounias's inequality 1.8.38
Krickeberg decomposition 12.9.11
Kronecker's lemma 7.8.2, 7.11.30, 12.9.5
kurtosis 4.14.45

L
L2 inequality 13.7.1
Labouchere system 12.9.15
lack of anticipation 6.9.4
lack-of-memory property: exponential distn 4.14.5; geometric distn 3.11.7
ladders, see records
Lancaster's theorem 4.14.38
large deviations 5.11.1-3, 12.9.7
last exits 6.2.1, 6.15.7
lattice distn 5.7.5
law: anomalous numbers 3.6.7; arc sine 3.10.3, 3.11.28, 5.3.5; De Morgan 1.2.1; iterated logarithm 7.6.1; large numbers 2.2.2; strong 7.4.1, 7.8.2, 7.11.6, 9.7.10; unconscious statistician 3.11.3; weak 7.4.1, 7.11.15, 20-21; zero-one 7.3.4-5
Lebesgue measure 6.15.29, 13.12.16
left-continuous r.w. 5.3.7, 5.12.7
level sets of Wiener pr. 13.12.16
Lévy metric 2.7.13, 7.1.4, 7.2.4
limit: binomial 3.11.10; binomial-Poisson 5.12.39; branching 12.9.8; central limit theorem 5.10.1, 3, 9, 5.12.33, 40, 7.11.26, 10.6.3; diffusion 13.3.1; distns 2.3.1; events 1.8.16; gamma 5.9.3; geometric-exponential 5.12.39; lim inf 1.8.16; lim sup 1.8.16, 5.6.3, 7.3.2, 9-10, 12; local 5.9.2, 5.10.5-6; martingale 7.8.3; normal 5.12.41, 7.11.19; Poisson 3.11.17; probability 1.8.16-18; r.v. 2.7.2; uniform 5.9.1
linear dependence 3.11.15
linear fn of normal r.v. 4.9.3-4
linear prediction 9.7.1, 3
local central limit theorem 5.9.2, 5.10.5-6
logarithmic distn 3.1.1, 5.2.3
log-convex 3.1.5
log-likelihood 7.11.31
log-normal r.v. 4.4.5, 5.12.43
lottery 1.8.31
Lyapunov's inequality 4.14.28

M
machine 11.8.17
magnets 3.4.8
marginal: discontinuous 4.5.1; multinomial 3.6.2; order statistics 4.14.22
Markov chain in continuous time: ergodic theorem 7.11.33, 10.5.1; first passage 6.9.5-6; irreducible 6.15.15; jump chain 6.9.11; martingale 12.7.1; mean first passage 6.9.6; mean recurrence time 6.9.11; renewal pr. 8.3.5; reversible 6.15.16, 38; stationary distn 6.9.11; two-state 6.9.1-2, 6.15.17; visits 6.9.9
Markov chain in discrete time: absorbing state 6.2.2; bivariate 6.15.4; convergence 6.15.43; dice 6.1.2; ergodic theorem 7.11.32; finite 6.5.8, 6.15.43; first passages 6.2.1, 6.3.6; homogeneous 6.1.1; imbedded 6.9.11, 6.15.17, 11.4.1; last exits 6.2.1, 6.15.7; martingale 12.1.8, 12.3.3; mean first passage 6.3.7; mean recurrence time 6.9.11; persistent 6.4.10; renewal 10.3.2; reversible 6.14.1; sampled 6.1.4, 6.3.8; simulation of 6.14.3; stationary distn 6.9.11; sum 6.1.8; two-state 6.15.11, 17, 8.2.1; visits 6.2.3-5, 6.3.5, 6.15.5, 44
Markov-Kakutani theorem 6.6.1


Markov process: Gaussian 9.6.2; reversible 6.15.16
Markov property 6.1.5, 10; strong 6.1.6
Markov renewal, see Poisson pr.
Markov time, see stopping time
Markovian queue, see M/M/1
marriage problem 3.4.3, 4.14.35
martingale: backward 12.7.3; birth-death 12.9.10; branching pr. 12.1.3, 9, 12.9.1-2, 8, 20; casino 7.7.4; continuous parameter 12.7.1-2; convergence 7.8.3, 12.1.5, 12.9.6; de Moivre 12.1.4, 12.4.6; exponential 13.3.9; finite stopping time 12.4.5; gambling 12.1.4, 12.5.8; Markov chain 12.1.8, 12.3.3, 12.7.1; optional stopping 12.5.1-8; orthogonal increments 7.7.1; partial sum 12.7.3; patterns 12.9.16; Poisson pr. 12.7.2; reversed 12.7.3; simple r.w. 12.1.4, 12.4.6, 12.5.4-7; stopping time 12.4.1, 5, 7; urn 7.11.27, 12.9.13-14; Wiener pr. 12.9.22-23
mass function, joint 2.5.5
matching 3.4.9, 3.11.17, 5.2.7, 12.9.21
matrix: covariance 3.11.15; definite 4.9.1; doubly stochastic 6.1.12, 6.15.2; multiplication 4.14.63; square root 4.9.1; stochastic 6.1.12, 6.14.1; sub-stochastic 6.1.12; transition 7.11.31; tridiagonal 6.5.1, 6.15.16
maximal: coupling 4.12.4-6, 7.11.16; inequality 12.4.3-4, 12.9.3, 5, 9
maximum of: branching pr. 12.9.20; multinormal 5.9.7; r.w. 3.10.2, 3.11.28, 5.3.1, 6.1.3, 12.4.6; uniforms 5.12.32; Wiener pr. 13.12.8, 11, 15, 17
maximum r.v. 4.2.2, 4.5.4, 4.14.17-18, 5.12.32, 7.11.14
mean: extreme-value 5.12.27; first passage 6.3.7, 6.9.6; negative binomial 5.12.4; normal 4.4.6; recurrence time 6.9.11; waiting time 11.4.2, 11.8.6, 10

measure: ergodic 9.7.11-12; Lebesgue 6.15.29, 13.12.16; stationary 9.7.11-12; strongly mixing 9.7.12

median 2.7.11, 4.3.4, 7.3.11
ménages 1.8.23
Mercator projection 6.13.5
meteorites 9.7.4
metric 2.7.13, 7.1.4; Lévy 2.7.13, 7.1.4, 7.2.4; total variation 2.7.13
m.g.f. inequality 5.8.2, 12.9.7
migration pr., open 11.7.1, 5
millionaires 3.9.4
Mills's ratio 4.4.8, 4.14.1
minimal solution 6.3.6-7, 6.9.6
Minkowski's inequality 4.14.27
misprints 6.4.1
mixing, strong 9.7.12
mixture 2.1.4, 2.3.4, 4.1.3, 5.1.9
modified renewal 10.6.12
moments: branching pr. 5.4.1; fractional 3.3.5, 4.3.1; generating fn 5.1.8, 5.8.2, 5.11.3; joint 5.12.30; problem 5.12.43; renewal pr. 10.1.1; tail integral 4.3.3, 5.11.3
Monte Carlo 4.14.9
Monty Hall 1.4.5
Moscow 11.8.3
moving average 8.7.1, 9.1.3, 9.4.2, 9.5.3, 9.7.1-2, 7; spectral density 9.7.7
multinomial distn 3.5.1; marginals 3.6.2; p.g.f. 5.1.5
multinormal distn 4.9.2; c.f. 5.8.6; conditioned 4.9.6-7; covariance matrix 4.9.2; maximum 5.9.7; sampling from 4.14.62; standard 4.9.2; transformed 4.9.3, 4.14.62
Murphy's law 1.3.2
mutual information 3.6.5

N
needle, Buffon's 4.5.2, 4.14.31-32
negative binomial distn 3.8.4; bivariate 5.12.16; moments 5.12.4; p.g.f. 5.1.1, 5.12.4
negative hypergeometric distn 3.5.4
Newton, I. 3.8.5
non-central distn 5.7.7-8
non-homogeneous: birth pr. 6.15.24; Poisson pr. 6.13.7, 6.15.19-20
noodle, Buffon's 4.14.31
norm 7.1.1, 7.9.6; equivalence class 7.1.1; mean square 7.9.6; rth mean 7.1.1
normal distn 4.4.6; bivariate 4.7.5-6, 12; central limit theory 5.10.1, 3, 9, 5.12.23, 40; characterization of 5.12.23; cumulants 5.7.4; limit 7.11.19; linear transformations 4.9.3-4; Mills's ratio 4.4.8, 4.14.1; moments 4.4.6, 4.14.1; multivariate 4.9.2, 5.8.6; regression 4.8.7, 4.9.6, 4.14.13, 7.9.2; sample 4.10.5, 5.12.42; simulation of 4.11.7, 4.14.49; square 4.14.12; standard 4.7.5; sum 4.9.3; sum of squares 4.14.12; trivariate 4.9.8-9; uncorrelated 4.8.6
normal integral 4.14.1
normal number theorem 9.7.14
now 6.15.50

O
occupation time for Wiener pr. 13.12.20
open migration 11.7.1, 5
optimal: packing 12.2.1; price 12.9.19; reset time 13.12.22; serving 11.8.13
optimal stopping: dice 3.3.8-9; marriage 4.14.35
optional stopping 12.5.1-8, 12.9.19; diffusion 13.4.2; Poisson 12.7.2
order statistics 4.14.21; exponential 4.14.33; general 4.14.21; marginals 4.14.22; uniform 4.14.23-24, 39, 6.15.42, 12.7.3
Ornstein-Uhlenbeck pr. 9.7.19, 13.3.4, 13.7.4-5, 13.12.3-4, 6; reflected 13.12.6
orthogonal: increments 7.7.1; polynomials 4.14.37
osmosis 6.15.36

P
pairwise independent: events 1.5.2; r.v.s 3.2.1, 3.3.3, 5.1.7
paradox: Bertrand 4.14.8; Borel 4.6.1; Carroll 1.4.4; Galton 1.5.8; inspection 10.6.5; Parrondo 6.15.48; St Petersburg 3.3.4; voter 3.6.6
parallel lines 4.14.52
parallelogram 4.14.60; property 7.1.2


parking 4.14.30
Parrondo's paradox 6.15.48
particle 6.15.33
partition of sample space 1.8.10
PASTA property 6.9.4
patterns 1.3.2, 5.2.6, 5.12.2, 10.6.17, 12.9.16
pawns 2.7.18
Pepys's problem 3.8.5
periodic state 6.5.4, 6.15.3
persistent: chain 6.4.10, 6.15.6; r.w. 5.12.5-6, 6.3.2, 6.9.8, 7.3.3, 13.11.2; state 6.2.3-4, 6.9.7
Petersburg, see St Petersburg
pig 1.8.22
points, problem of 3.9.4, 3.11.24
Poisson: approximation 3.11.35; coupling 4.12.2; flips 3.5.2, 5.12.37; sampling 6.9.4
Poisson distn 3.5.3; characterization of 5.12.8, 15; compound 3.8.6, 5.12.13; and gamma distn 4.14.11; limit of binomial 5.12.39; modified 3.1.1; sum 3.11.6, 7.2.10
Poisson pr. 6.15.29; age 10.5.4; arrivals 8.7.4, 10.6.8; autocovariance 7.11.5, 9.6.1; characterization 6.15.29, 9.7.16; colouring theorem 6.15.39; compound 6.15.21; conditional property 6.13.6; continuity in m.s. 7.11.5; covariance 7.11.5; differentiability 7.11.5; doubly stochastic 6.15.22-23; excess life 6.8.3, 10.3.1; forest 6.15.30; gaps 10.1.2; Markov renewal pr. 8.3.5, 10.6.9; martingales 12.7.2; non-homogeneous 6.13.7, 6.15.19-20; optional stopping 12.7.2; perturbed 6.15.39-40; renewal 8.3.5, 10.6.9-10; Rényi's theorem 6.15.39; repairs 11.7.18; sampling 6.9.4; spatial 6.15.30-31, 7.4.3; spectral density 9.7.6; sphere 6.13.3-4; stationary increments 9.7.6, 16; superposed 6.8.1; thinned 6.8.2; total life 10.6.5; traffic 6.15.40, 49, 8.4.3
poker 1.8.33; dice 1.8.34
Pólya's urn 12.9.13-14
portfolio 13.12.23; self-financing 13.10.2-3
positive definite 9.6.1, 4.9.1


positive state, see non-null
postage stamp lemma 6.3.9
potential theory 13.11.1-3
power series approximation 7.11.17
Pratt's lemma 7.10.5
predictable step fn 13.8.4
predictor: best 7.9.1; linear 7.9.3, 9.2.1-2, 9.7.1, 3
probabilistic method 1.8.28, 3.4.7
probability: continuity 1.8.16, 1.8.18; p.g.f. 5.12.4, 13; vector 4.11.6
problem: matching 3.4.9, 3.11.17, 5.2.7, 12.9.21; ménages 1.8.23; Pepys 3.8.5; of points 3.9.4, 3.11.24; Waldegrave 5.12.10
program, dual, linear, and primal 6.6.3
projected r.w. 5.12.6
projection theorem 9.2.10
proof-reading 6.4.1
proportion, see empirical ratio
proportional investor 13.12.23
prosecutor's fallacy 1.4.6
protocol 1.4.5, 1.8.26
pull-through property 3.7.1

Q
quadratic variation 8.5.4, 13.7.2
queue: batch 11.8.4; baulking 8.4.4, 11.8.2, 19; busy period 6.12.1, 11.3.2-3, 11.5.1, 11.8.5, 9; costs 11.8.13; departure pr. 11.2.7, 11.7.2-4, 11.8.12; difficult customer 11.7.4; D/M/1 11.4.3, 11.8.15; dual 11.5.2; Erlang's loss fn 11.8.19; finite waiting room 11.8.1; G/G/1 11.5.1, 11.8.8; G/M/1 11.4.1-2, 11.5.2-3; heavy traffic 11.6.1, 11.8.15; idle period 11.5.2, 11.8.9; imbedded branching 11.3.2, 11.8.5, 11; imbedded Markov pr. 11.2.6, 11.4.1, 3, 11.4.1; imbedded renewal 11.3.3, 11.5.1; imbedded r.w. 11.2.2, 5; Markov, see M/M/1; M/D/1 11.3.1, 11.8.10-11; M/G/1 11.3.3, 11.8.6-7; M/G/∞ 6.12.4, 11.8.9; migration system 11.7.1, 5; M/M/1 6.9.3, 6.12.1, 11.2.2-3, 5-6, 11.3.2, 11.6.1, 11.8.5, 12; M/M/k 11.7.2, 11.8.13; M/M/∞ 8.7.4; series 11.8.3, 12; supermarket 11.8.3; tandem 11.2.7, 11.8.3; taxicabs 11.8.16; telephone exchange 11.8.9; two servers 8.4.1, 5, 11.7.3, 11.8.14; virtual waiting 11.8.7; waiting time 11.2.3, 11.5.2-3, 11.8.6, 8, 10
quotient 3.3.1, 4.7.2, 10, 13-14, 4.10.4, 4.11.10, 4.14.11, 14, 16, 40, 5.2.4, 5.12.49, 6.15.42

R
radioactivity 10.6.6-8
random: bias 5.10.9; binomial coefficient 5.2.1; chord 4.13.1; dead period 10.6.7; harmonic series 7.11.37; integers 6.15.34; line 4.13.1-3, 4.14.52; paper 4.14.56; parameter 4.6.5, 5.1.6, 5.2.3, 5.2.8; particles 6.4.8; pebbles 4.14.51; permutation 4.11.2; perpendicular 4.14.50; polygons 4.13.10, 6.4.9; rock 4.14.57; rods 4.14.25-26, 53-54; sample 4.14.21; subsequence 7.11.25; sum 3.7.6, 3.8.6, 5.2.3, 5.12.50, 10.2.2; telegraph 8.2.2; triangle 4.5.6, 4.13.6-8, 11, 13; velocity 6.15.33, 40
random sample: normal 4.10.5; ordered 4.12.21
random variable: see also density and distribution; arc sine 4.11.13; arithmetic 5.9.4; Bernoulli 3.11.14, 35; beta 4.11.4; beta-binomial 4.6.5; binomial 2.1.3, 3.11.8, 11, 5.12.39; bivariate normal 4.7.5-6, 12; Cauchy 4.4.4; c.f. 5.12.26-31; chi-squared 4.10.1; compounding 5.2.3; continuous 2.3.1; Dirichlet 3.11.31, 4.14.58; expectation 5.6.2, 7.2.3; exponential 4.4.3, 5.12.39; extreme-value 4.1.1, 4.14.46; F(r, s) 4.10.2, 4, 5.7.8; gamma 4.11.3, 4.14.10-12; geometric 3.1.1, 3.11.7; hypergeometric 3.11.10-11; independent 3.11.1, 3, 4.5.5, 7.2.3; indicator 3.11.17; infinitely divisible 5.12.13-14; logarithmic 3.1.1, 5.2.3; log-normal 4.4.5; median 2.7.11, 4.3.4; m.g.f. 5.1.8, 5.8.2, 5.11.3; multinomial 3.5.1; multinormal 4.9.2; negative binomial 3.8.4; normal 4.4.6, 4.7.5-6, 12; orthogonal 7.7.1; p.g.f. 5.12.4, 13; Poisson 3.5.3; standard normal 4.7.5; Student's t 4.10.2-3, 5.7.8; symmetric 3.2.5, 4.1.2, 5.12.22; tails 3.11.13, 4.3.3, 5, 4.14.3, 5.1.2, 5.6.4, 5.11.3; tilted 5.1.9, 5.7.11; trivial 3.11.2; truncated 2.4.2; uncorrelated 3.11.12, 16, 4.5.7-8, 4.8.6; uniform 3.8.1, 4.8.4, 4.11.1, 9.7.5; waiting time, see queue; Weibull 4.4.7; zeta or Zipf 3.11.5

random walk: absorbed 3.11.39, 12.5.4-5; arc sine laws 3.10.3, 3.11.28, 5.3.5; on binary tree 6.4.7; conditional 3.9.2-3; on cube 6.3.4; first passage 5.3.8; first visit 3.10.3; on graph 6.4.6, 9, 13.11.2-3; on hexagon 6.15.35; imbedded in queue 11.2.2, 5; left-continuous 5.3.7, 5.12.7; martingale 12.1.4, 12.4.6, 12.5.4-5; maximum 3.10.2, 3.11.28, 5.3.1; persistent 5.12.5-6, 6.3.2; potentials 13.11.2-3; projected 5.12.6; range of 3.11.27; reflected 11.2.1-2; retaining barrier 11.2.4; returns to origin 3.10.1, 5.3.2; reversible 6.5.1; simple 3.9.1-3, 5, 3.10.1-3; on square 5.3.3; symmetric 1.7.3, 3.11.23; three dimensional 6.15.9-10; transient 5.12.44, 6.15.9, 7.5.3; truncated 6.5.7; two dimensional 5.3.4, 5.12.6, 12.9.17; visits 3.11.23, 6.9.8, 10; zero mean 7.5.3; zeros of 3.10.1, 5.3.2, 5.12.5-6
range of r.w. 3.11.27
rate of convergence 6.15.43
ratios 4.3.2; Mills's 4.4.8; sex 3.11.22
record: times 4.2.1, 4, 4.6.6, 10; values 6.15.20, 7.11.36
recurrence, see difference
recurrence time 6.9.11
recurrent: event 5.12.45, 7.5.2, 9.7.4; see persistent
red now 12.9.18
reflecting barrier: s.r.w. 11.2.1-2; drifting Wiener pr. 13.5.1; Ornstein-Uhlenbeck pr. 13.12.6
regeneration 11.3.3
regression 4.8.7, 4.9.6, 4.14.13, 7.9.2
rejection method 4.11.3-4, 13


reliability 3.4.5-6, 3.11.18-20
renewal: age, see current life; alternating 10.5.2, 10.6.14; asymptotics 10.6.11; Bernoulli 8.7.3; central limit theorem 10.6.3; counters 10.6.6-8, 15; current life 10.3.2, 10.5.4, 10.6.4; delayed 10.6.12; excess life 8.3.2, 10.3.1-4, 10.5.4; r. function 10.6.11; gaps 10.1.2; key r. theorem 10.3.3, 5, 10.6.11; Markov 8.3.5; m.g.f. 10.1.1; moments 10.1.1; Poisson 8.3.5, 10.6.9-10; r. process 8.3.4; r.-reward 10.5.1-4; r. sequence 6.15.8, 8.3.1, 3; stationary 10.6.18; stopping time 12.4.2; sum/superposed 10.6.10; thinning 10.6.16
Rényi's theorem 6.15.39
repairman 11.7.18
repulsion 1.8.29
reservoir 6.4.3
resources 6.15.47
retaining barrier 11.2.4
reversible: birth-death pr. 6.15.16; chain 6.14.1; Markov pr. 6.15.16, 38; queue 11.7.2-3, 11.8.12, 14; r.w. 6.5.1
Riemann-Lebesgue lemma 5.7.6
robots 3.7.7
rods 4.14.25-26, 53-54
ruin 11.8.18, 12.9.12; see also gambler's ruin
runs 1.8.21, 3.4.1, 3.7.10, 5.12.3, 46-47

S
σ-field 1.2.2, 4, 1.8.3, 9.5.1, 9.7.13; increasing sequence of 12.4.7
St John's College 4.14.51
St Petersburg paradox 3.3.4
sample: normal 4.10.5, 5.12.42; ordered 4.12.21
sampling 3.11.36; Poisson 6.9.4; with and without replacement 3.11.10
sampling from distn: arc sine 4.11.13; beta 4.11.4-5; Cauchy 4.11.9; exponential 4.14.48; gamma 4.11.3; geometric 4.11.8; Markov chain 6.14.3; multinormal 4.14.62; normal 4.11.7, 4.14.49; s.r.w. 4.11.6; uniform 4.11.1
secretary problem 3.11.17, 4.14.35
self-financing portfolio 13.10.2-3
semi-invariant 5.7.3-4
sequence: of c.f.s 5.12.35; of distns 2.3.1; of events 1.8.16; of heads and tails, see pattern; renewal 6.15.8, 8.3.1, 3; of r.v.s 2.7.2
series of queues 11.8.3, 12
shift operator 9.7.11-12
shocks 6.13.6
shorting 13.11.2
simple birth pr. 6.8.4-5, 6.15.23
simple birth-death pr.: conditioned 6.11.4-5; diffusion approximation 13.3.1; extinction 6.11.3, 6.15.27; visits 6.11.6-7

simple immigration-death pr. 6.11.2, 6.15.18
simple: process 8.2.3; r.w. 3.9.1-3, 5, 3.10.1-3, 3.11.23, 27-29, 11.2.1, 12.1.4, 12.4.6, 12.5.4-7
simplex 6.15.42; algorithm 3.11.33
simulation, see sampling
sixes 3.2.4
skewness 4.14.44
sleuth 3.11.21
Slutsky's theorem 7.2.5
smoothing 9.7.2
snow 1.7.1
space, vector 2.7.3, 3.6.1
span of r.v. 5.7.5, 5.9.4
Sparre Andersen theorem 13.12.18
spectral: density 9.3.3; distribution 9.3.2, 4, 9.7.2-7; increments 9.4.1, 3
spectrum 9.3.1
sphere 1.8.28, 4.6.1, 6.13.3-4, 12.9.23, 13.11.1; empty 6.15.31
squeezing 4.14.47
standard: bivariate normal 4.7.5; multinormal 4.9.2; normal 4.7.5; Wiener pr. 9.6.1, 9.7.18-21, 13.12.1-3
state: absorbing 6.2.2; persistent 6.2.3-4; symmetric 6.2.5; transient 6.2.4

stationary distn 6.9.1, 3-4, 11-12, 6.11.2; birth-death pr. 6.11.4; current life 10.6.4; excess life 10.3.3; Markov chain 9.1.4; open migration 11.7.1, 5; queue length 8.4.4, 8.7.4, 11.2.1-2, 6, 11.4.1, 11.5.2, and Section 11.8 passim; r.w. 11.2.1-2; waiting time 11.2.3, 11.5.2-3, 11.8.8

stationary excess life 10.3.3
stationary increments 9.7.17
stationary measure 9.7.11-12
stationary renewal pr. 10.6.18
Stirling's formula 3.10.1, 3.11.22, 5.9.6, 5.12.5, 6.15.9, 7.11.26
stochastic: integral 9.7.19, 13.8.1-2; matrix 6.1.12, 6.14.1, 6.15.2; ordering 4.12.1-2
stopping time 6.1.6, 10.2.2, 12.4.1-2, 5, 7; for renewal pr. 12.4.2
strategy 3.3.8-9, 3.11.25, 4.14.35, 6.15.50, 12.9.19, 13.12.22
strong law of large numbers 7.4.1, 7.5.1-3, 7.8.2, 7.11.6, 9.7.10
strong Markov property 6.1.6
strong mixing 9.7.12
Student's t distn 4.10.2-3; non-central 5.7.8
subadditive fn 6.15.14, 8.3.3
subgraph 13.11.3
sum of independent r.v.s: Bernoulli 3.11.14, 35; binomial 3.11.8, 11; Cauchy 4.8.2, 5.11.4, 5.12.24-25; chi-squared 4.10.1, 4.14.12; exponential 4.8.1, 4, 4.14.10, 5.12.50, 6.15.42; gamma 4.14.11; geometric 3.8.3-4; normal 4.9.3; p.g.f. 5.12.1; Poisson 3.11.6, 7.2.10; random 3.7.6, 3.8.6, 5.2.3, 5.12.50, 10.2.2; renewals 10.6.10; uniform 3.8.1, 4.8.5
sum of Markov chains 6.1.8
supercritical branching 6.7.2
supermartingale 12.1.8
superposed: Poisson pr. 6.8.1; renewal pr. 10.6.10
sure thing principle 1.7.4
survival 3.4.3, 4.1.4
Sylvester's problem 4.13.12, 4.14.60
symmetric: r.v. 3.2.5, 4.1.2, 5.12.22; r.w. 1.7.3; state 6.2.5
symmetry and independence 1.5.3
system 7.7.4; Labouchere 12.9.15


T
t, Student's 4.10.2-3; non-central 5.7.8
tail: c.f. 5.7.6; equivalent 7.11.34; event 7.3.3, 5; function 9.5.3; integral 4.3.3, 5; sum 3.11.13, 4.14.3
tail of distn: and moments 5.6.4, 5.11.3; p.g.f. 5.1.2
tandem queue 11.2.7, 11.8.3
taxis 11.8.16
telekinesis 2.7.8
telephone: exchange 11.8.9; sales 3.11.38
testimony 1.8.27
thinning 6.8.2; renewal 10.6.16
three-dimensional r.w. 6.15.10; transience of 6.15.9
three-dimensional Wiener pr. 13.11.1
three series theorem 7.11.35
tied-down Wiener pr. 9.7.21-22, 13.6.2-5
tilted distn 5.1.9, 5.7.11
time-reversibility 6.5.1-3, 6.15.16
total life 10.6.5
total variation distance 4.12.3-4, 7.2.9, 7.11.16
tower property 3.7.1, 4.14.29
traffic: gaps 5.12.45, 8.4.3; heavy 11.6.1, 11.8.15; Poisson 6.15.40, 49, 8.4.3
transform, inverse 2.3.3
transient: r.w. 5.12.44, 7.5.3; Wiener pr. 13.11.1
transition matrix 7.11.31
transitive coins 2.7.16
trapezoidal distn 3.8.1
trial, de Moivre 3.5.1
triangle inequality 7.1.1, 3
Trinity College 12.9.15
triplewise independent 5.1.7
trivariate normal distn 4.9.8-9
trivial r.v. 3.11.2
truncated: r.v. 2.4.2; r.w. 6.5.7
Turán's theorem 3.11.40
turning time 4.6.10
two-dimensional: r.w. 5.3.4, 5.12.6, 12.9.17; Wiener pr. 13.12.12-14
two server queue 8.4.1, 5, 11.7.3, 11.8.14
two-state Markov chain 6.15.11, 17, 8.2.1; Markov pr. 6.9.1-2, 6.15.16-17

Type: T. one counter 10.6.6-7; T. two counter 10.6.15

U
U Mys8ka 8.7.7
unconscious statistician 3.11.3
uncorrelated r.v. 3.11.12, 16, 4.5.7-8, 4.8.6
uniform integrability: Section 7.10 passim, 10.2.4, 12.5.1-2
uniform distn 3.8.1, 4.8.4, 4.11.1, 9.7.5; maximum 5.12.32; order statistics 4.14.23-24, 39, 6.15.42; sample from 4.11.1; sum 3.8.1, 4.8.4
uniqueness of conditional expectation 3.7.2
upcrossings inequality 12.3.2
upper class fn 7.6.1
urns 1.4.4, 1.8.24-25, 3.4.2, 4, 6.3.10, 6.15.12; Pólya's 12.9.13-14

V
value function 13.10.5
variance: branching pr. 5.12.9; conditional 3.7.4, 4.6.7; normal 4.4.6
vector space 2.7.3, 3.6.1
Vice-Chancellor 1.3.4, 1.8.13
virtual waiting 11.8.7
visits: birth-death pr. 6.11.6-7; Markov chain 6.2.3-5, 6.3.5, 6.9.9, 6.15.5, 44; r.w. 3.11.23, 29, 6.9.8, 10
voter paradox 3.6.6

W
waiting room 11.8.1, 15
waiting time: dependent 11.2.7; for a gap 10.1.2; in G/G/1 11.5.2, 11.8.8; in G/M/1 11.4.2; in M/D/1 11.8.10; in M/G/1 11.8.6; in M/M/1 11.2.3; stationary distn 11.2.3, 11.5.2-3; virtual 11.8.7
Wald's eqn 10.2.2-3
Waldegrave's problem 5.12.10
Waring's theorem 1.8.13, 5.2.1
weak law of large numbers 7.4.1, 7.11.15, 20-21
Weibull distn 4.4.7, 7.11.13
Weierstrass's theorem 7.3.6
white noise, Gaussian 13.8.5
Wiener process: absorbing barriers 13.13.8, 9; arc sine laws 13.4.3, 13.12.10, 13.12.19; area 12.9.22; Bartlett eqn 13.3.3; Bessel pr. 13.3.5; on circle 13.9.4; conditional 8.5.2, 9.7.21, 13.6.1; constructed 13.12.7; d-dimensional 13.7.1; drift 13.3.3, 13.5.1; on ellipse 13.9.5; expansion 13.12.7; first passage 13.4.2; geometric 13.3.9, 13.4.1; hits sphere 13.11.1; hitting barrier 13.4.2; integrated 9.7.20, 12.9.22, 13.3.8, 13.8.1-2; level sets 13.12.16; martingales 12.9.22-23, 13.3.8-9; maximum 13.12.8, 11, 15, 17; occupation time 13.12.20; quadratic variation 8.5.4, 13.7.2; reflected 13.5.1; sign 13.12.21; standard 9.7.18-21; three-dimensional 13.11.1; tied-down, see Brownian bridge; transformed 9.7.18, 13.12.1, 3; two-dimensional 13.12.12-14; zeros of 13.4.3, 13.12.10
Wiener-Hopf eqn 11.5.3, 11.8.8

X
X-ray 4.14.32

Z
zero-one law, Hewitt-Savage 7.3.4-5


FROM A REVIEW OF PROBABILITY AND RANDOM PROCESSES: PROBLEMS AND SOLUTIONS