
Lecture Notes on Stochastic Processes in Biostatistics: Applications to Infectious Diseases

Ira M. Longini, Jr.
Department of Biostatistics
Rollins School of Public Health
Emory University
Atlanta, GA

Michael G. Hudgens
Fred Hutchinson Cancer Research Center
Seattle, WA

January 1, 2003


CONTENTS

0.1 Preface

1 Introduction

2 Preliminaries
  2.1 Probability Generating Function
  2.2 Convolution
    2.2.1 Compound distributions
  2.3 Exercises

3 Galton-Watson (GW) Branching Process
  3.1 Probability distribution, Expectations, Extinction
  3.2 Inference on the GW branching process
    3.2.1 The explosion set
    3.2.2 Time Series Methods
    3.2.3 Likelihood-based methods
  3.3 Epidemics as GW branching process
  3.4 Exercises

4 Random Walks
  4.1 Simple Random Walks
  4.2 Difference Equations
  4.3 Gambling Systems
    4.3.1 Gambler's Ruin
    4.3.2 Expected Duration
    4.3.3 Discrete-time martingales
  4.4 Exercises

5 Discrete-time Markov chains
  5.1 Transition Probabilities, Classifications, Asymptotics
    5.1.1 Absorbing Chains
  5.2 Algebraic treatment
  5.3 Inference
    5.3.1 Inference on a single sequence
    5.3.2 Inference on multiple observed sequences
  5.4 The chain binomial model
  5.5 The Reed-Frost Model
    5.5.1 History
    5.5.2 Formulation
    5.5.3 Inference
  5.6 Life Tables
  5.7 HIV-progression model
  5.8 Endemic Reed-Frost Model
  5.9 Exercises

6 Continuous-time Markov chains
  6.1 Poisson process
  6.2 Birth and death processes
    6.2.1 Linear Birth Process
    6.2.2 Linear Death Process
    6.2.3 Linear Birth-Death Process
  6.3 Kolmogorov differential equations
  6.4 Algebraic Treatment
  6.5 Mean time to absorption
  6.6 Inference
    6.6.1 Inference on a single sequence
    6.6.2 Inference on birth and death processes
  6.7 HIV-progression models
  6.8 Exercises

7 Counting Processes
  7.1 Continuous Time Martingales
  7.2 Inference on continuous-time epidemics
  7.3 Martingale-based approach to estimating vaccine efficacy

8 Hidden Markov Chains

9 Gibbs Sampling

10 Appendix
  10.1 Series
  10.2 Inequalities
  10.3 Convergence of Sequences and Series
  10.4 Convergence in distribution
  10.5 Convergence in Probability
  10.6 Almost Sure Convergence


To see a World in a Grain of Sand
And a Heaven in a Wild Flower,
Hold Infinity in the palm of your hand
And Eternity in an hour.

W. Blake, Auguries of Innocence, 1803

0.1 Preface

These notes have grown out of a one-semester course in applied stochastic processes that I have taught over the last twenty-four years. I taught the course for the first time at the Universidad del Valle in Cali, Colombia, in 1977. At that time, I was also with the International Center for Medical Research and Training in Cali, where I worked on projects on tropical diseases. My formal education in stochastic processes was not adequate to deal with the statistics generated by these infectious disease problems. The application of the ideas of stochastic processes to infectious disease problems mostly involved modeling exercises rather than statistical inference from data. At that time, the only book on the mathematics of epidemics that had a good stochastic processes basis was The Mathematical Theory of Infectious Diseases by Norman Bailey [1]. For me, this book provided a foundation for the analytic study of infectious disease problems. My interpretation of Bailey's basic approach to solving the problems associated with the analysis of infectious disease statistics is: i) frame the problem mathematically, ii) carry out a qualitative analysis of the deterministic equations, iii) carry out a qualitative analysis of the stochastic equations, iv) carry out inference with the stochastic equations from field data, if available. Usually, the deterministic formulation of the process, as either differential or difference equations, is more tractable than the stochastic formulation. Thus, an analysis of the deterministic equations can lead to basic insights about the dynamics of the mean of the process. For linear processes, the solution to the deterministic process is exactly the mean of the analogous stochastic process. For nonlinear processes, the solution to the deterministic process may serve as an approximation. In some cases, the investigator can start with step iii. Generally, the stochastic equations are needed in order to formulate likelihood functions or other estimating functions used in inference. These notes are partially based on the sound approach of applying steps i-iv above.

More recent books, Analysis of Infectious Diseases [3] and Stochastic Epidemic Models and Their Statistical Analysis [?], make use of stochastic processes to analyze infectious disease data. However, neither book can serve as a general reference for stochastic processes. These lecture notes are intended to fill this gap. Infectious disease problems offer an excellent paradigm on which to teach applied stochastic processes in biostatistics. A susceptible individual makes the potential transition to infected through his or her interaction with infected individuals. If infection occurs, this individual then makes further transitions to other infected states with the possibility


of eventual recovery. These transitions, for both susceptible and infected individuals, evolve over time and space. Such a complex probabilistic process is best described through the use of stochastic processes. Through such illustrations, we present classical material on a host of stochastic processes including branching processes, Markov processes, birth and death processes and martingales. We develop material for a course in applied stochastic processes in a uniform, logical manner, moving from discrete to continuous processes, but we anchor the material with illustrations mostly from infectious disease problems.

The flow and material for the classical stochastic processes backbone of these lecture notes is similar to that in Chiang [4] and Karlin and Taylor [?]. Other important stochastic processes background texts are Bailey [?], Bhat [?] and Ross [?]. There are few books about inference on stochastic processes [?][?], and Basawa and Rao [?] is a good reference in this area.

Students taking this course should have a basic grounding in probability theory and mathematical statistics. In addition, some basic knowledge of real analysis and differential equations is helpful, but not necessary. A willingness to develop intuition about the nature of dynamic systems is important, and curiosity about the natural world is essential.

Ira M. Longini, Jr.
Atlanta
January 1, 2003


Chapter 1

INTRODUCTION

We give the formal definition of a stochastic process as follows:

Definition 1.0.1 We define a stochastic process as a collection of random variables $\{X(t),\ t \in T\}$, taking values in a state space (set) $S$, where each random variable is indexed by a parameter (the index parameter) $t$ which varies in an index set $T$.

Thus, the random variable $X(t)$ has a range that we will refer to as the state space, given by

Definition 1.0.2 We define the state space $S$ as the set of possible values $X(t)$ can take on.

Some examples of state spaces are as follows:

Example 1.0.3 $S = \{0, 1, 2, \ldots\}$, an integer-valued (discrete) state space.

Example 1.0.4 $S = \{x : -\infty < x < \infty\}$, a continuous state space.

Example 1.0.5 $S = \{x : x \in \mathbb{R}^k\}$, a $k$-vector process.

The domain, $t$, of the random variable $X(t)$ is the index parameter.

Definition 1.0.6 We define the index parameter $t$ as the index of the random variable $X(t)$, and the index set $T$ as the set of values $t$ can take on.

Example 1.0.7 $T = \{0, \pm 1, \pm 2, \ldots\}$, a discrete parameter process.

Example 1.0.8 $T = \{t : -\infty < t < \infty\}$, a continuous parameter process.

Based on the state space and the index set, the stochastic process $\{X(t),\ t \in T\}$ can be classified into four possible categories, as shown in Table 1.

Table 1: Categories of stochastic processes

                              Index Set T
                          Discrete    Continuous
State Space   Discrete       I            II
     S        Continuous     III          IV


Example 1.0.9 An example of a category I process is the Galton-Watson branching process. Let $Z_n$ be the random variable for the population size in the $n$th generation. Then $S = \{0, 1, 2, \ldots\}$ and $T = \{0, 1, 2, \ldots\}$.

Example 1.0.10 An example of a category II process is a linear birth and death process. Let $X(t)$ be the random variable for the population size at time $t$. Then $S = \{0, 1, 2, \ldots\}$ and $T = \{t : 0 \le t < \infty\}$.

Example 1.0.11 An example of a category IV process is a diffusion process (Brownian motion). Imagine a particle moving randomly in one dimension. Let $X(t)$ be the random variable for the position of the particle on the real line at time $t$. Then $S = \{x : -\infty < x < \infty\}$ and $T = \{t : 0 \le t < \infty\}$.

Examples of category III processes are relatively rare in biostatistical applications, but we will briefly encounter one in Chapter 9 on Gibbs sampling.


Chapter 2

PRELIMINARIES

2.1 Probability Generating Function

The probability generating function (pgf) is one of the major analytical tools we will use to work with stochastic processes on discrete state spaces.

Definition 2.1.1 Let $X$ be a nonnegative integer-valued random variable such that $P[X = k] = p_k$, $k = 0, 1, 2, \ldots$ is a probability mass function (pmf). Then the probability generating function is given by

\[ g_X(s) = E\left[s^X\right] = \sum_{k=0}^{\infty} s^k p_k. \qquad (2.1) \]

We note the following properties of the pgf:

\[ g_X(1) = \sum_{k=0}^{\infty} p_k = 1 \]

\[ g'_X(s) = \sum_{k=1}^{\infty} k s^{k-1} p_k, \qquad g'_X(0) = p_1 \]

\[ g''_X(s) = \sum_{k=2}^{\infty} k(k-1) s^{k-2} p_k, \qquad g''_X(0) = 2 p_2 \]

In general, $g^{(r)}_X(s) = \sum_{k=r}^{\infty} k(k-1)(k-2)\cdots(k-r+1)\, s^{k-r} p_k$ and $g^{(r)}_X(0) = r!\, p_r$, so that $p_r = g^{(r)}_X(0)/r!$, and we see how the pgf literally "generates" probabilities. Therefore the pmf and pgf have a one-to-one correspondence. Thus, if we are able to find the pgf of a stochastic process, then, at least theoretically, we can recover the pmf of the process. This is important since in some cases it is difficult or impossible to derive the pmf of a process directly, but it is possible to derive its pgf.

The moments are also easily attainable from the pgf:

\[ g'_X(1) = \sum_{k=1}^{\infty} k p_k = E(X) \]

\[ g''_X(1) = \sum_{k=2}^{\infty} k(k-1) p_k = E\left(X(X-1)\right) \]

implying that $Var(X) = g''_X(1) + g'_X(1) - \left[g'_X(1)\right]^2$. Similarly, for the $r$th factorial moment about the origin we have

\[ E\left[X(X-1)\cdots(X-r+1)\right] = g^{(r)}_X(1). \]

Thus, we can recover all the moments about the origin from the pgf.
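As a quick numerical sketch (with a made-up finite pmf, not from the notes), the identities $p_r = g^{(r)}_X(0)/r!$, $g'_X(1) = E(X)$, and the variance formula can be checked by treating the pgf of a finite-support pmf as a polynomial:

```python
from math import factorial, isclose

def deriv(coeffs):
    """Coefficients of the derivative of the polynomial sum_k c_k s^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

pmf = [0.1, 0.3, 0.4, 0.2]           # hypothetical p_0, ..., p_3
# p_r = g^(r)(0) / r! -- the pgf "generates" the probabilities
for r in range(len(pmf)):
    coeffs = list(pmf)               # pgf coefficients are exactly the pmf
    for _ in range(r):
        coeffs = deriv(coeffs)
    assert isclose(coeffs[0] / factorial(r), pmf[r])

# g'(1) = E(X) and Var(X) = g''(1) + g'(1) - g'(1)^2
g1 = deriv(pmf)
g2 = deriv(g1)
mean = sum(g1)                       # evaluating g'(s) at s = 1
var = sum(g2) + mean - mean**2
assert isclose(mean, 1.7) and isclose(var, 0.81)
```

Evaluating a derivative at $s = 1$ is just the sum of its coefficients, which is why no explicit polynomial evaluation is needed here.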

Example 2.1.2 Let $X$ be a Bernoulli random variable such that the pmf of $X$ is given by $p_1 = p$ and $p_0 = 1 - p$. Then the pgf of $X$ is given by

\[ g_X(s) = \sum_{k=0}^{\infty} s^k p_k = 1 - p + ps. \]

Example 2.1.3 Let $X$ have a Poisson distribution such that

\[ P[X = k] = \frac{e^{-\lambda}\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots \]

Then the pgf of $X$ is given by

\[ g_X(s) = \sum_{k=0}^{\infty} s^k p_k = e^{-\lambda(1-s)}. \]

2.2 Convolution

Convolutions are distributions of sums of random variables. Such distributions can be difficult to find directly, but the pgf of such a sum is just the product of the pgfs involved. This is generally a relatively simple operation.

Let $X, Y$ be independent random variables such that if $P[X = i] = p_i$ and $P[Y = j] = q_j$, then $P[X = i, Y = j] = p_{ij} = p_i q_j$. Let $Z = X + Y$. Then

\[ P[Z = k] = r_k = \sum_{i=0}^{k} p_i q_{k-i} = \sum_{j=0}^{k} p_{k-j} q_j. \]

We write a convolution as $\{r_k\} = \{p_k\} * \{q_k\}$. Note that

\begin{align*}
g_Z(s) = \sum_{k=0}^{\infty} s^k r_k &= \sum_{k=0}^{\infty} s^k \sum_{i=0}^{k} p_i q_{k-i} \\
&= \sum_{i=0}^{\infty} \sum_{k=i}^{\infty} s^k p_i q_{k-i} = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} s^{i+j} p_i q_j \\
&= \sum_{i=0}^{\infty} s^i p_i \sum_{j=0}^{\infty} s^j q_j.
\end{align*}

Therefore,

\[ g_Z(s) = g_X(s) \cdot g_Y(s). \]

In general, suppose $X_1, \ldots, X_n$ are mutually independent random variables and $Z_n = X_1 + \cdots + X_n$. Let $p_{kj} = P[X_j = k]$ and $r_k = P[Z_n = k]$. Then $Z_n$ has probability distribution given by $\{r_k\} = \{p_{k1}\} * \cdots * \{p_{kn}\}$ and pgf $g_{Z_n}(s) = g_{X_1}(s) \cdots g_{X_n}(s)$. Furthermore, if $X_1, \ldots, X_n$ are i.i.d. with pgf $g(s)$, then $g_{Z_n}(s) = [g(s)]^n$.
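The identity $g_Z(s) = g_X(s)\, g_Y(s)$ can be verified directly for finite supports (a sketch with made-up pmfs):

```python
from math import isclose

def convolve(p, q):
    """pmf of X + Y for independent X ~ {p_k}, Y ~ {q_k} (finite supports)."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def pgf(pmf, s):
    return sum(c * s**k for k, c in enumerate(pmf))

p = [0.5, 0.5]               # Bernoulli(1/2)
q = [0.25, 0.5, 0.25]        # sum of two Bernoulli(1/2) variables
r = convolve(p, q)
# pgf of the convolution equals the product of the pgfs at every s
for s in (0.0, 0.3, 0.7, 1.0):
    assert isclose(pgf(r, s), pgf(p, s) * pgf(q, s))
```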


Example 2.2.1 Let $Z_n$ be a binomial random variable, which is the sum of $n$ Bernoulli random variables. Then the pgf of $Z_n$ is given by

\[ g_{Z_n}(s) = [g(s)]^n = [1 - p + ps]^n. \]

2.2.1 Compound distributions

Now suppose $Z_N = X_1 + \cdots + X_N$, where $X_1, \ldots, X_N$ are i.i.d. with pgf $g(s)$ and $N$ is a random variable with pgf $h(s) = E\left[s^N\right] = \sum_{k=0}^{\infty} s^k P[N = k]$. Thus, $Z_N$ is a random sum of i.i.d. random variables. Then the pgf of $Z_N$ is given by

\begin{align*}
G(s) = E\left[s^{Z_N}\right] &= \sum_{k=0}^{\infty} s^k P[Z_N = k] \\
&= \sum_{k=0}^{\infty} s^k \sum_{j=0}^{\infty} P[Z_N = k \mid N = j] \cdot P[N = j] \\
&= \sum_{k=0}^{\infty} s^k \sum_{j=0}^{\infty} P[X_1 + \cdots + X_j = k] \cdot P[N = j] \\
&= \sum_{j=0}^{\infty} P[N = j] \sum_{k=0}^{\infty} P[X_1 + \cdots + X_j = k]\, s^k \\
&= \sum_{j=0}^{\infty} P[N = j] \, \left(g(s)\right)^j = h(g(s)).
\end{align*}
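The compound-pgf identity $G(s) = h(g(s))$ can be checked numerically by building the pmf of $Z_N$ as a mixture of $j$-fold convolutions (a sketch with small made-up distributions):

```python
from math import isclose

def pgf(pmf, s):
    return sum(c * s**k for k, c in enumerate(pmf))

def convolve(p, q):
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

x_pmf = [0.3, 0.7]            # each X_i ~ Bernoulli(0.7)
n_pmf = [0.2, 0.5, 0.3]       # N takes values 0, 1, 2
# pmf of Z_N by conditioning on N: mix the j-fold convolutions of x_pmf
z_pmf = [0.0] * ((len(n_pmf) - 1) * (len(x_pmf) - 1) + 1)
conv_j = [1.0]                # pmf of X_1 + ... + X_j, starting at j = 0
for j, pn in enumerate(n_pmf):
    for k, c in enumerate(conv_j):
        z_pmf[k] += pn * c
    conv_j = convolve(conv_j, x_pmf)

for s in (0.0, 0.4, 1.0):
    assert isclose(pgf(z_pmf, s), pgf(n_pmf, pgf(x_pmf, s)))
```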

2.3 Exercises

Exercise 2.3.1 Use the equation $G(s) = h(g(s))$ to show that $E[Z_N] = E[N] \cdot E[X]$ and $Var(Z_N) = E(N) \cdot Var(X) + Var(N) \cdot E[X]^2$.

Exercise 2.3.2 Consider the following experiment: A cell is given a single dose of pathogenic organisms, where the number of organisms in a dose is a random variable $N$ which follows a Poisson distribution with parameter $\lambda$. Given that a dose has been administered, each organism survives with probability $p$. In addition, the probability that any particular organism survives is independent of the fate of the other $N - 1$ organisms in the dose. Let $Y_N$ be a random variable for the number of surviving organisms in a dose.

a. Find the pmf of $Y_N$.

b. If the cell dies when more than $r$ organisms survive from a single dose, what is the probability that the cell dies when infected by a single dose?

c. Suppose $n$ cells are given a single dose each. What is the probability that $k$ of the cells survive (where $k \le n$)?


Chapter 3

GALTON-WATSON (GW) BRANCHING PROCESS

The theory of branching processes goes back to the middle of the nineteenth century. This theory has been applied to problems in genetics, evolution, physics and epidemic theory, just to name a few areas. In 1869, Francis Galton posed "the problem of the extinction of families" [?]. He presented this challenge [?] to his acquaintance, the Reverend H.W. Watson, who attacked the problem with generating functions and functional iteration [?]. Unfortunately, the good reverend got the wrong answer. Their joint efforts led to more difficulties [?], but they helped lay the groundwork for the correct solutions, as we see in the next section. A good history and rigorous modern treatment of the branching process can be found in the book by Guttorp [?]. An important earlier book on the topic is by Harris [?].

3.1 Probability distribution, Expectations, Extinction

Let $Z_n$ be the population size in generation $n$ and $X$ be the number of offspring for an individual, with pmf $P[X = k] = p_k$, $k = 0, 1, 2, \ldots$, $E[X] = \mu$, and $Var[X] = \sigma^2 < \infty$. Let $X_j$ be the random variable for the number of offspring of the $j$th individual, so that $Z_{n+1} = \sum_{j=1}^{Z_n} X_j$. Let the pgf of $X$ be given by $g(s) = \sum_{k=0}^{\infty} s^k p_k$ and let $g_n(s) = \sum_{k=0}^{\infty} P[Z_n = k]\, s^k$ be the pgf of $Z_n$ for $n = 1, 2, 3, \ldots$ It follows that

\[ g_{n+1}(s) = g_n(g(s)) = g(g_n(s)), \]

the latter equality holding only when $Z_0 = 1$. (Note that by $Z_0 = 1$ we mean $P[Z_0 = 1] = 1$.) We see that $g_0(s) = s$ when $Z_0 = 1$, $g_0(s) = s^2$ when $Z_0 = 2$, and in general, $g_0(s) = s^{i_0}$ when $Z_0 = i_0$. The expected value of $Z_n$ is given by

\[ E[Z_n] = g'_n(1) = \left[g'(1)\right]^n = \mu^n \]

since $g'_n(1) = g'(1) \cdot g'_{n-1}(g(1)) = g'(1) \cdot g'_{n-1}(1)$ implies that $g'_n(1) = \cdots = \left[g'(1)\right]^n$.

We could arrive at the same solution via conditional expectation by noting that


$E[Z_{n+1} \mid Z_n] = E[X_1 + \cdots + X_{Z_n} \mid Z_n] = \mu \cdot Z_n$. It follows that

\begin{align*}
E[Z_n] &= E\left[E[Z_n \mid Z_{n-1}]\right] \\
&= E[\mu \cdot Z_{n-1}] = \mu \cdot E[Z_{n-1}] \\
&= \mu \cdot E\left[E[Z_{n-1} \mid Z_{n-2}]\right] = \mu^2 \cdot E[Z_{n-2}] = \cdots = \mu^n.
\end{align*}

We note the underlying deterministic system given by $z_{n+1} = \mu \cdot z_n$, $z_0 = 1$, has the solution $z_n = \mu^n$. However, as $n$ tends to $\infty$, the deterministic and stochastic processes have very different behavior. The limiting value of $E[Z_n]$ is given by:

\[ \lim_{n \to \infty} E[Z_n] = \begin{cases} 0 & \text{if } \mu < 1 \text{ (subcritical)} \\ 1 & \text{if } \mu = 1 \text{ (critical)} \\ \infty & \text{if } \mu > 1 \text{ (supercritical)}. \end{cases} \]

It can also be shown that $Var(Z_{n+1} \mid Z_n) = \sigma^2 Z_n$, from which it follows that

\[ Var[Z_n] = \begin{cases} \sigma^2 \mu^{n-1} \dfrac{1 - \mu^n}{1 - \mu} & \text{if } \mu \ne 1 \\ n \sigma^2 & \text{if } \mu = 1. \end{cases} \]

Thus when $\mu = 1$, the mean remains constant while the variance tends to infinity, assuring extinction.
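These moment formulas can be checked exactly for a small offspring distribution by iterating the pgf recursion $g_{n+1}(s) = g_n(g(s))$ as polynomial composition (a sketch with a made-up pmf, not from the notes):

```python
from math import isclose

def polymul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """Coefficients of outer(inner(s)); used for g_{n+1}(s) = g_n(g(s))."""
    result, power = [0.0], [1.0]
    for c in outer:
        for i, v in enumerate(power):
            if i == len(result):
                result.append(0.0)
            result[i] += c * v
        power = polymul(power, inner)
    return result

g = [0.2, 0.5, 0.3]                  # offspring pmf: mu = 1.1, sigma^2 = 0.49
mu = sum(k * p for k, p in enumerate(g))
sigma2 = sum(k * k * p for k, p in enumerate(g)) - mu**2
gn = list(g)                         # pgf coefficients of Z_1 (with Z_0 = 1)
for n in range(1, 5):
    mean_n = sum(k * p for k, p in enumerate(gn))
    var_n = sum(k * k * p for k, p in enumerate(gn)) - mean_n**2
    assert isclose(mean_n, mu**n)
    assert isclose(var_n, sigma2 * mu**(n - 1) * (1 - mu**n) / (1 - mu))
    gn = compose(gn, g)              # exact pmf of Z_{n+1}
```

Because the offspring support is finite, the composed polynomial is the exact pmf of $Z_n$, so the checks hold to floating-point precision rather than only in simulation.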

For $0 < p_0 < 1$, let $q_n = P[Z_n = 0] = g_n(0)$ be the probability of extinction by the $n$th generation. Note $g_{n+1}(0) = g(g_n(0))$ implies that $q_{n+1} = g(q_n)$. Since $g(s)$ is a strictly monotone increasing function for $s \in (0, 1)$, it follows that

\[ q_1 = p_0 < g(q_1) = q_2 < g(q_2) = q_3 < \cdots < q_n < q_{n+1} < \cdots \]

Thus $\{q_n\}$ is a monotonically increasing sequence bounded above by 1, implying the existence of

\[ \pi = \lim_{n \to \infty} q_n \]

with $0 < \pi \le 1$. If $p_0 + p_1 < 1$, then $g''(s) = \sum_{k=2}^{\infty} k(k-1) s^{k-2} p_k > 0$. Thus the equation $s = g(s)$ has one or two solutions for $s \in (0, 1]$. (Insert graphs here.) We leave it as an exercise to show that the probability of extinction, $\pi$, is the smallest positive root of $s = g(s)$. Here we refer to graphs to show that $\pi = 1$ when $g'(1) = \mu \le 1$ and $\pi < 1$ when $g'(1) = \mu > 1$. To summarize the behavior of this process: if $\mu \le 1$, then eventual extinction is certain, but if $\mu > 1$ then the population goes extinct with probability $\pi$, the smallest positive root of $s = g(s)$, and does not go extinct with probability $1 - \pi$. In the latter case, the population size will go to infinity. This very fundamental result can be thought of as a type of folk theorem for population processes. The counterintuitive message is that finite populations may go extinct even if $\mu > 1$.


It is instructive to compare this behavior with the corresponding deterministic system. As we mentioned earlier, the path of the corresponding deterministic system is the same as the mean of the stochastic system, i.e., $z_n = \mu^n$. The behavior of the deterministic system is as follows: if $\mu < 1$, then extinction occurs, but if $\mu \ge 1$ then the population does not go extinct. This simpler behavior is not realistic for finite populations. Thus, the deterministic model is not realistic unless the population size becomes very large. At that point, it provides a good approximation to reality. This is a very common characteristic of deterministic formulations.

Thus far we have assumed that $Z_0 = 1$. Suppose now that instead $Z_0 = i_0$, where $i_0 > 1$. It follows that $g_0(s) = s^{i_0}$ and that $E[Z_n] = i_0 \mu^n$. In this case, if $\mu \le 1$ then extinction occurs with probability 1, while if $\mu > 1$ then the probability of extinction is $\pi^{i_0}$.

Example 3.1.1 Survival and Family Names (male lines): Let $p_k$ be the probability that a newborn boy becomes the progenitor of $k$ other boys. What is the probability that the family name goes extinct? Lotka uses the following estimated offspring distribution to solve the problem: $p_0 = 0.4825$, $p_k = (0.2126)(0.5893)^{k-1}$, $k \ge 1$.

Example 3.1.2 Genes and Mutation: Let the number of descendants of a mutant gene $K$ have a Poisson distribution with parameter $\lambda$. Then $g(s) = e^{\lambda(s-1)}$ and we can easily calculate the extinction probability. For example, if $\lambda = 2$, then $\pi = 0.203$.
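The extinction probability in Example 3.1.2 can be found numerically by iterating $q_{n+1} = g(q_n)$ from $q_0 = 0$, which converges to the smallest nonnegative root of $s = g(s)$ (a sketch; function names are ours, not from the notes):

```python
from math import exp, isclose

def extinction_prob(g, tol=1e-12):
    """Iterate q_{n+1} = g(q_n) from q_0 = 0; converges monotonically to
    the smallest nonnegative root of s = g(s), the extinction probability."""
    q = 0.0
    q_next = g(q)
    while abs(q_next - q) >= tol:
        q, q_next = q_next, g(q_next)
    return q_next

# Poisson offspring distribution with lambda = 2: g(s) = exp(2(s - 1))
pi = extinction_prob(lambda s: exp(2.0 * (s - 1.0)))
assert isclose(pi, 0.203, abs_tol=5e-4)
```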

Example 3.1.3 $S \to I \to R$ Epidemic (Susceptible, Infected, Removed): Let $Z_n$ be the number of people infected in generation $n$, and $\mu$ be the average number of people that an infected person infects during his or her infectious period in a fully susceptible population ($\mu$ is called the basic reproductive number). Let $X_n$ be the number of susceptible people in generation $n$, so that $X_{n+1} = X_n - Z_n$. Then one version of the celebrated threshold theorem of epidemics is as follows: if $\mu \le 1$, there is no epidemic with probability one. If $\mu > 1$, then there is no epidemic with probability $\pi^{i_0}$ and an epidemic with probability $1 - \pi^{i_0}$, where $i_0$ is the number of initial infected people.

3.2 Inference on the GW branching process

3.2.1 The explosion set

In this section we describe methods for estimating the offspring distribution and its moments. Generally such estimation will be done from an observed realization $\{Z_0, Z_1, \ldots, Z_n\}$. Technically, such estimators are consistent only if the observed realization is part of the explosion set. The explosion set is the set of sequences, for $\mu > 1$, that do not go extinct. In the following sections, we will assume that we are doing inference on the explosion set.


3.2.2 Time Series Methods

Our main goal is to do inference on the mean and variance of the offspring distribution, $\mu$ and $\sigma^2$. Based on the conditional expectation of the GW branching process, $E[Z_{n+1} \mid Z_n] = \mu \cdot Z_n$, we consider the first-order autoregressive process

\[ Z_n = \mu Z_{n-1} + Y_n \]

where $\{Y_n\}_{n=1}^{\infty}$ is a sequence of uncorrelated random variables such that $E(Y_i) = 0$ and $Var(Y_i) = \sigma^2$. We let

\[ SSE = \sum_{k=1}^{n} (Z_k - \mu Z_{k-1})^2 = \sum_{k=1}^{n} Y_k^2 \]

so that

\[ \frac{\partial SSE}{\partial \mu} = -\sum_{k=1}^{n} 2 (Z_k - \mu Z_{k-1}) Z_{k-1}. \]

Setting this to 0, we get the following estimate for $\mu$:

\[ \hat{\mu}_n = \frac{\sum_{k=1}^{n} Z_k Z_{k-1}}{\sum_{k=1}^{n} Z_{k-1}^2} \]

which is the lag-one serial correlation.

Recall that $E(Z_n \mid Z_{n-1}) = \mu Z_{n-1}$ and $Var(Z_n \mid Z_{n-1}) = \sigma^2 Z_{n-1}$, suggesting that we might consider the autoregressive-type model:

\[ Z_n = \mu Z_{n-1} + U_n \sqrt{Z_{n-1}}. \]

It follows that

\[ U_n = \left(\frac{Z_n - \mu Z_{n-1}}{\sqrt{Z_{n-1}}}\right) = \left(\frac{Z_n - E(Z_n \mid Z_{n-1})}{\sqrt{Var(Z_n \mid Z_{n-1})}}\right) \sigma. \]

Again we can minimize the error sum of squares to get an estimate of $\mu$:

\[ SSE = \sum_{k=1}^{n} U_k^2 = \sum_{k=1}^{n} \frac{(Z_k - \mu Z_{k-1})^2}{Z_{k-1}} \qquad (3.1) \]

\[ \frac{\partial SSE}{\partial \mu} = -\sum_{k=1}^{n} 2 (Z_k - \mu Z_{k-1}). \]

Setting this to 0, we get:

\[ \hat{\mu}_n = \frac{\sum_{k=1}^{n} Z_k}{\sum_{k=1}^{n} Z_{k-1}} \qquad (3.2) \]

However, we need maximum likelihood to get $Var(\hat{\mu}_n)$. Based on (3.1), an estimator of the variance of the offspring distribution is

\[ \hat{\sigma}_n^2 = \frac{SSE}{n} = \frac{1}{n} \sum_{k=1}^{n} \frac{(Z_k - \hat{\mu} Z_{k-1})^2}{Z_{k-1}}. \]
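The estimators above are one-liners in practice. A sketch with hypothetical generation counts (valid on the explosion set, where every $Z_{k-1} > 0$):

```python
def gw_estimates(z):
    """mu-hat of (3.2) and the sigma^2-hat estimator, from z = [Z_0, ..., Z_n]."""
    mu_hat = sum(z[1:]) / sum(z[:-1])
    n = len(z) - 1
    sigma2_hat = sum((z[k] - mu_hat * z[k - 1]) ** 2 / z[k - 1]
                     for k in range(1, n + 1)) / n
    return mu_hat, sigma2_hat

# made-up realization of a supercritical process
mu_hat, sigma2_hat = gw_estimates([1, 2, 3, 5, 8, 12])
assert abs(mu_hat - 30 / 19) < 1e-12     # sum(Z_1..Z_5) / sum(Z_0..Z_4)
```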


3.2.3 Likelihood-based methods

A clever idea by Harris [?] is as follows. Assume $Z_0 = 1$. Observe $Z_1, Z_2, \ldots$ and let $Z_{nr}$ be the number of individuals in the $n$th generation who produce $r$ offspring. Then $Z_n = \sum_{r=0}^{\infty} Z_{nr}$ and $Z_{n+1} = \sum_{r=0}^{\infty} r Z_{nr}$. The joint conditional density is an infinite multinomial:

\[ P[Z_{n0}, Z_{n1}, \ldots \mid Z_n] = \left(\frac{Z_n!}{\prod_{r=0}^{\infty} Z_{nr}!}\right) \prod_{r=0}^{\infty} p_r^{Z_{nr}} \]

which gives rise to the following likelihood function for $n$ generations:

\[ L(p_0, p_1, \ldots) = \prod_{k=0}^{n-1} \left[\left(\frac{Z_k!}{\prod_{r=0}^{\infty} Z_{kr}!}\right) \prod_{r=0}^{\infty} p_r^{Z_{kr}}\right] = c \prod_{k=0}^{n-1} \prod_{r=0}^{\infty} p_r^{Z_{kr}} \]

where $c$ is some constant. The MLE is given by

\[ \hat{p}_r = \frac{\sum_{k=0}^{n-1} Z_{kr}}{\sum_{k=0}^{n-1} Z_k}. \]

And since

\[ \mu = \sum_{r=0}^{\infty} r p_r, \]

it follows that the MLE of $\mu$ is

\[ \hat{\mu}_n = \sum_{r=0}^{\infty} r \hat{p}_r = \frac{\sum_{r=0}^{\infty} r \sum_{k=0}^{n-1} Z_{kr}}{\sum_{k=0}^{n-1} Z_k} = \frac{\sum_{k=0}^{n-1} \sum_{r=0}^{\infty} r Z_{kr}}{\sum_{k=0}^{n-1} Z_k} = \frac{\sum_{k=1}^{n} Z_k}{\sum_{k=1}^{n} Z_{k-1}}, \]

which is the same answer as equation (3.2). However, we still cannot determine the variance and distribution of this estimator, so we make a weak assumption: namely, that the offspring distribution follows a generalized power series distribution.
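The multinomial MLE can be illustrated with a small made-up table of offspring counts $Z_{kr}$ (all numbers hypothetical); the check confirms that $\hat{\mu}_n = \sum_r r \hat{p}_r$ reproduces the ratio estimator (3.2):

```python
from fractions import Fraction

# Hypothetical offspring counts: Z_kr[k][r] = number of generation-k
# individuals producing r offspring (consistent with Z_0=2, Z_1=3, Z_2=3)
Z_kr = [
    [0, 1, 1],
    [1, 1, 1],
    [1, 1, 1],
]
Zk = [sum(row) for row in Z_kr]                            # Z_0, Z_1, Z_2
p_hat = [Fraction(sum(row[r] for row in Z_kr), sum(Zk)) for r in range(3)]
mu_hat = sum(r * p for r, p in enumerate(p_hat))
# the same estimate via (3.2): sum of Z_1..Z_n over sum of Z_0..Z_{n-1}
Z_next = [sum(r * c for r, c in enumerate(row)) for row in Z_kr]
assert mu_hat == Fraction(sum(Z_next), sum(Zk))            # both equal 9/8
```

Exact rational arithmetic (`Fraction`) makes the algebraic identity between the two estimators visible without floating-point noise.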

Definition 3.2.1 A discrete random variable $X$ has a generalized power series distribution (GPSD) if:

\[ p_x = P[X = x] = \frac{a_x \theta^x}{f(\theta)} \quad \text{for } x \in T \]

where

\[ f(\theta) = \sum_{x \in T} a_x \theta^x, \qquad \theta > 0, \quad a_x \ge 0. \]


Example 3.2.2 Let $a_x = 1/x!$, $T = \{0, 1, 2, \ldots\}$, $f(\theta) = \sum_{x=0}^{\infty} \frac{\theta^x}{x!} = e^{\theta}$. Then $p_x = \frac{\theta^x}{x!} e^{-\theta}$, which we recognize as the Poisson probability mass function.

Example 3.2.3 Let $T = \{0, 1, 2, \ldots, n\}$ and $a_x = \binom{n}{x}$, which gives the binomial, where $\theta = \frac{p}{1-p}$.

Example 3.2.4 For some positive integer $c$, let $T = \{c, c+1, \ldots, n\}$ and $a_x = \binom{n}{x}$, which gives the truncated binomial.

If $X$ has a GPSD, then:

\[ \mu = E[X] = \frac{\theta f'(\theta)}{f(\theta)} \]

since

\[ f'(\theta) = \frac{1}{\theta} \sum_{x \in T} a_x x \theta^x. \]

It is also easy to show that $Var[X] = \sigma^2 = \theta \dfrac{d\mu}{d\theta}$.
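For the binomial written as a GPSD (Example 3.2.3, with $\theta = p/(1-p)$), the identities $\mu = \theta f'(\theta)/f(\theta)$ and $\sigma^2 = \theta\, d\mu/d\theta$ recover the familiar $np$ and $np(1-p)$. A numerical sketch, with the derivative $d\mu/d\theta$ approximated by a central difference:

```python
from math import comb, isclose

n, p = 6, 0.3
theta = p / (1 - p)                   # binomial as a GPSD
a = [comb(n, x) for x in range(n + 1)]

def f(t):
    return sum(ax * t**x for x, ax in enumerate(a))

def mean(t):                          # mu(theta) = theta f'(theta) / f(theta)
    fp = sum(ax * x * t**(x - 1) for x, ax in enumerate(a) if x > 0)
    return t * fp / f(t)

assert isclose(mean(theta), n * p)    # mu = np
h = 1e-6                              # sigma^2 = theta * d(mu)/d(theta)
sigma2 = theta * (mean(theta + h) - mean(theta - h)) / (2 * h)
assert isclose(sigma2, n * p * (1 - p), rel_tol=1e-4)
```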

Assuming $X$ has a GPSD, the likelihood for the branching process now becomes:

\begin{align*}
L_n(p_0, p_1, \ldots) &= c \prod_{k=0}^{n-1} \prod_{r=0}^{\infty} p_r^{Z_{kr}} = c \prod_{k=0}^{n-1} \prod_{r=0}^{\infty} \left[\frac{a_r \theta^r}{f(\theta)}\right]^{Z_{kr}} = c_1 \prod_{k=0}^{n-1} \prod_{r=0}^{\infty} \frac{\theta^{r Z_{kr}}}{f(\theta)^{Z_{kr}}} \\
&= c_1 \frac{\theta^{\sum_{k=0}^{n-1} \sum_{r=0}^{\infty} r Z_{kr}}}{f(\theta)^{\sum_{k=0}^{n-1} \sum_{r=0}^{\infty} Z_{kr}}} = c_1 \frac{\theta^{\sum_{k=1}^{n} Z_k}}{f(\theta)^{\sum_{k=1}^{n} Z_{k-1}}}.
\end{align*}

Taking the natural logarithm:

\[ l_n = c_2 + \sum_{k=1}^{n} \left[Z_k \ln \theta - Z_{k-1} \ln f(\theta)\right] \qquad (3.3) \]

thus

\[ \frac{\partial l_n}{\partial \mu} = \frac{\partial l_n}{\partial \theta} \frac{\partial \theta}{\partial \mu} = \frac{1}{\sigma^2} \sum_{k=1}^{n} \left[Z_k - Z_{k-1} \frac{\theta f'(\theta)}{f(\theta)}\right] = \frac{1}{\sigma^2} \sum_{k=1}^{n} \left[Z_k - Z_{k-1} \mu\right]. \]

Setting this to zero,

\[ \hat{\mu}_n = \frac{\sum_{k=1}^{n} Z_k}{\sum_{k=1}^{n} Z_{k-1}}, \]


which is the same as equation (3.2). The observed information is given by:

\[ I_n(\mu) = -\frac{\partial^2 \ln L_n}{\partial \mu^2} = \frac{\sum_{k=1}^{n} Z_{k-1}}{\sigma^2} \]

implying that asymptotically:

\[ Var(\hat{\mu}_n) = \frac{\sigma^2}{\sum_{k=1}^{n} Z_{k-1}} \]

where $\sigma^2$ is obtained from the time series methods, i.e.

\[ \hat{\sigma}^2 = \frac{1}{n} \sum_{k=1}^{n} \left(Z_k - \hat{\mu} Z_{k-1}\right)^2. \]

The expected or Fisher information is given by:

\[ I_n(\mu) = \frac{E\left[\sum_{k=1}^{n} Z_{k-1}\right]}{\sigma^2} = \frac{1}{\sigma^2} \sum_{k=0}^{n-1} \mu^k = \frac{1}{\sigma^2} \left(\frac{1 - \mu^n}{1 - \mu}\right) \]

implying

\[ Var(\hat{\mu}_n) \approx \sigma^2 \left(\frac{1 - \mu}{1 - \mu^n}\right). \]

If we further assume that the offspring distribution is Poisson, then

\[ Var(\hat{\mu}_n) \approx \mu \left(\frac{1 - \mu}{1 - \mu^n}\right). \]

3.3 Epidemics as GW branching process

This example is taken from Becker[?]. Let Zn be the number of infected people ingeneration n, with Z0 = i0: Let � be the basic reproductive number. That is, � isthe average number of individuals that an infective could infect in a fully susceptiblepopulation. The threshold theorem (see example 3.1.3 ) is as follows: For � � 1there is no epidemic, and for � > 1 there is an epidemic with probability 1 � �i0where � is the smallest root of g(s) = s: Suppose we want to estimate � from thesequence Z1; Z2; Z3 : : : . Since in �nite populations, the number of susceptibles tendsto decrease as the epidemic evolves, we must modify the Galton-Watson process sothat the o¤spring distribution changes with time accordingly.

Let N be the original number of susceptibles and $Y_n = \sum_{i=1}^{n} Z_i$ be the cumulative number of infections through the nth generation. Further let $\mathbf{Z}_n = (Z_1, Z_2, \ldots, Z_n)$ be the history of the number of infections in each generation up to the nth generation. We will assume that an infected person is infectious for only one generation and is subsequently removed from the pool of susceptibles. Then in the first generation we let the mean of the offspring distribution be

$$E_1[X] = \mu,$$

and for the nth generation, n > 1,

$$E_n[X] = \mu\cdot g_n(\mathbf{Z}_{n-1}) = \mu_n,$$

where $g_n(\cdot)$ can be any monotonically decreasing function in n and $\mu_n$ is the reproductive number for the nth generation. That is, $\mu_n$ is the average number of people a person in generation n infects over his or her infectious period. An intuitive choice for $g_n(\mathbf{Z}_{n-1})$ is the fraction of the population not infected at time $n-1$:

$$g_n(\mathbf{Z}_{n-1}) = \max\left(1 - \frac{Y_{n-1}}{N},\; 0\right).$$

Similarly let $Var(Z_n) = \sigma_n^2 = \sigma^2\cdot h_n(\mathbf{Z}_{n-1})$, where $\sigma^2 = Var(X)$. One choice for $h_n$ is $h_n \equiv g_n$, a Poisson-like assumption. Assume $X_i$ has a GPSD. Then we have

$$p_n(X_i) = \frac{a_n(x)\left[c_n(\theta)\right]^x}{f_n\left[c_n(\theta)\right]} = \frac{a_n c_n^x}{f_n},$$

the latter equality being notational convention. Now

$$f_n = \sum_x a_n c_n^x, \qquad f_n' = \frac{df_n}{d\theta} = \frac{c_n'}{c_n}\sum_x a_n x c_n^x,$$

so that

$$\mu_n = \frac{c_n}{c_n'}\cdot\frac{f_n'}{f_n} = \mu\cdot g_n(\mathbf{Z}_{n-1}), \qquad \sigma_n^2 = \frac{c_n}{c_n'}\cdot\mu_n' = \sigma^2 h_n(\mathbf{Z}_{n-1}).$$

Making the approximation

$$P\left[Z_n \mid \mathbf{Z}_{n-1}\right] \approx \frac{(c_n)^{Z_n}}{(f_n)^{Z_{n-1}}},$$

we approximate the log likelihood by

$$\ln L_n \approx \sum_{k=1}^{n}\left\{Z_k\ln(c_k) - Z_{k-1}\ln(f_k)\right\}.$$


Taking the derivative with respect to $\mu$ and setting it equal to zero, one can show that

$$\hat{\mu}_n = \frac{\sum_{k=1}^{n}\left(Z_k g_k / h_k\right)}{\sum_{k=1}^{n}\left(Z_{k-1} g_k^2 / h_k\right)}.$$

Further, it can be shown that the observed information is given by:

$$I_n(\mu) = \frac{1}{\hat{\mu}_n}\sum_{k=1}^{n}\frac{Z_k g_k}{\sigma_k^2}.$$

If we make the Poisson assumption that $h_k \equiv g_k$, then

$$\hat{\mu}_n = \frac{\sum_{k=1}^{n} Z_k}{\sum_{k=1}^{n} Z_{k-1} g_k} = \frac{Y_n}{\sum_{k=1}^{n} Z_{k-1} g_k} \tag{3.4}$$

and

$$I_n(\mu) = \frac{Y_n}{\hat{\mu}_n\sigma^2}.$$

Note that (3.4) reduces to the usual estimator (3.2) for $\mu$ when $g_k = 1$ for all k. Thus, the $g_k$ are weights that adjust for the change in susceptibles as the epidemic progresses.

Example 3.3.1 An outbreak of smallpox in a closed, unvaccinated community in Abakaliki, Nigeria, 1967. The incubation period for smallpox is 9-15 days (average of 12). The data are clustered into 12-day intervals as follows:

generation k:   0  1  2  3  4  5  6  7
# cases (Z_k):  1  1  7  6  3  8  4  0

The overall attack rate is $\frac{29}{119} = 0.24$. There were N = 119 people in the community, and by the last generation $Y_7 = 29$. It can be shown that $\hat{\mu} = 1.14$ with $Var(\hat{\mu}) = \frac{(1.14)^2}{29} = 0.045$. The corresponding 95% confidence interval is given by $\mu = 1.14 \pm 0.41$. The estimated extinction probability is $\hat{\pi} = 0.76$. Thus, smallpox is not very infectious, with a basic reproductive number barely above one.
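A numerical sketch reproducing the point estimate from (3.4) on these counts. One detail the text leaves implicit is how the susceptible fraction $g_k$ is depleted; the code below assumes all prior cases, including the initial case, are counted, which reproduces the figures above.

```python
N = 119                        # community size
Z = [1, 1, 7, 6, 3, 8, 4, 0]   # cases per 12-day generation, k = 0..7

cum = 0
num = den = 0.0
for k in range(1, len(Z)):
    cum += Z[k - 1]            # cumulative cases through generation k-1 (incl. index case)
    g_k = max(1.0 - cum / N, 0.0)   # fraction still susceptible entering generation k
    num += Z[k]                # Y_n = sum of Z_1..Z_n
    den += Z[k - 1] * g_k
mu_hat = num / den             # estimator (3.4) under the Poisson assumption
var_hat = mu_hat**2 / num      # Var(mu_hat) ~ mu_hat^2 / Y_n
```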

More examples of branching process-like estimation of $\mu$ can be found in Saunders [26], Longini [14] and Becker [3].

3.4 Exercises

Exercise 3.4.1 In a Galton-Watson branching process, the number of offspring per individual has a binomial distribution with parameters 2, p. Starting with a single individual, calculate:

a. the extinction probability
b. the probability that the population becomes extinct for the first time in the third generation


Exercise 3.4.2 For the Galton-Watson branching process, prove that $Var[Z_{n+1}] = \sigma^2\mu^n\sum_{j=0}^{n}\mu^j$ using identity (5.34) in Theorem 7 on page 19 of Chiang's book. (Hint: use induction.)

Exercise 3.4.3 A stochastic process $\{W_n\}_{n=0}^{\infty}$ is said to be a martingale process (with respect to itself) if $E[|W_n|] < \infty$ for all n and if $E[W_{n+1} \mid W_0, W_1, \ldots, W_n] = W_n$.

a. Let $X_1, X_2, \ldots$ be independent random variables with 0 mean and let $W_n = \sum_{i=1}^{n} X_i$. Assume $E[|X_i|] < \infty$. Then prove $\{W_n\}_{n=1}^{\infty}$ is a martingale.

b. Let $X_1, X_2, \ldots$ be independent random variables with $E[X_i] = 1$ and let $W_n = \prod_{i=1}^{n} X_i$. Then prove $\{W_n\}_{n=1}^{\infty}$ is a martingale.

c. Let $\{Z_n\}_{n=0}^{\infty}$ be generated by the Galton-Watson branching process and $W_n = Z_n/\mu^n$, $n = 0, 1, 2, \ldots$ Then prove $\{W_n\}_{n=1}^{\infty}$ is a martingale.

Exercise 3.4.4 For the Galton-Watson branching process, prove that the probability of extinction, $\pi$, is the smallest positive root of $s = g(s)$.

Exercise 3.4.5 Let X have a generalized power series distribution.

a. Find the probability generating function of X in terms of $f(\theta)$.
b. If $T = \{0, 1, 2, \ldots\}$, $a_x = 1$ for all $x \in T$, and $0 < \theta < 1$, then what is the pmf of X?
c. Given the random sample $X_1, X_2, \ldots, X_n$, what is the MLE of $\theta$?
d. Give the approximate variance of $\hat{\theta}$ from part c.

Exercise 3.4.6 Consider a subcritical Galton-Watson branching process that we wish to study when n is large. Since we know that the population will be extinct for large n, one approach is to study the process conditioned on non-extinction. Let $G_n(s)$ be the pgf of $Z_n$ conditioned on the event that the population is not extinct at generation n. Find the expression for $G_n(s)$ in terms of $g_n(s)$ and $g_n(0)$, where $g_n(s)$ is the unconditional pgf of $Z_n$.


Chapter 4

RANDOM WALKS

4.1 Simple Random Walks

Let $T = (0, 1, 2, \ldots)$, $S = (\ldots, -2, -1, 0, 1, 2, \ldots)$ and $Z_n$ = position of the particle after n jumps. Define $Z_n = X_1 + \cdots + X_n$ where

$$P[X_i = k] = \begin{cases} p & \text{if } k = 1 \\ q = 1 - p & \text{if } k = -1 \\ 0 & \text{otherwise.} \end{cases}$$

Assume $Z_0 = 0$. Then $E[X_i] = p - q$, $Var[X_i] = 4pq$, and $g_i(s) = \sum_{k=-\infty}^{\infty} s^k p_k = ps + qs^{-1}$. (Note that in our original definition of the pgf, k was restricted to the non-negative integers, whereas here we use the more general form which allows k to be any integer. Some of the properties derived earlier for the pgf do not necessarily continue to hold.) It follows that

$$G_{Z_n}(s) = (ps + qs^{-1})^n = \sum_{i=0}^{n}\binom{n}{i}(ps)^i(qs^{-1})^{n-i} = \sum_{i=0}^{n}\binom{n}{i}p^i q^{n-i} s^{2i-n} = \sum_{k}\binom{n}{\frac{n+k}{2}}p^{\frac{n+k}{2}}q^{\frac{n-k}{2}}s^k,$$

where the last sum is taken over $k = -n, -n+2, -n+4, \ldots, n-2, n$. Therefore,

$$P[Z_n = k] = \begin{cases}\dbinom{n}{\frac{n+k}{2}}p^{\frac{n+k}{2}}q^{\frac{n-k}{2}} & \text{for } k = -n, -n+2, -n+4, \ldots, n-2, n \\ 0 & \text{otherwise.}\end{cases}$$

Note that for any simple random walk $P[Z_n = 0] = \binom{n}{n/2}(pq)^{n/2}$ when n is even and 0 when n is odd. Thus we can only return to the origin on an even-numbered step.

Example 4.1.1 A symmetric random walk is a special case of a simple random walk and is given by $p = q = \frac{1}{2}$. In which case, $P[Z_n = k] = \binom{n}{\frac{n+k}{2}}\left(\frac{1}{2}\right)^n$.
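The displacement distribution above is easy to verify numerically. This sketch evaluates $P[Z_n = k]$ from the binomial formula and checks that it sums to one.

```python
from math import comb

def srw_pmf(n, k, p):
    """P[Z_n = k] for a simple random walk: nonzero only when n + k is even."""
    if (n + k) % 2 != 0 or abs(k) > n:
        return 0.0
    i = (n + k) // 2          # number of +1 steps
    return comb(n, i) * p**i * (1 - p) ** (n - i)

total = sum(srw_pmf(6, k, 0.3) for k in range(-6, 7))
```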


We can also define a general random walk by letting:

$$P[X_i = k] = \begin{cases} p_i & \text{if } k = 1 \\ r_i & \text{if } k = 0 \\ q_i = 1 - p_i - r_i & \text{if } k = -1 \\ 0 & \text{otherwise.} \end{cases} \tag{4.1}$$

A simple random walk is then just a special case of a general random walk where $r_i = 0$ and $p_i = p$ for all i.

4.2 Difference Equations

First we review solving difference equations. In general, a difference equation is given by:

$$x_{n+k} + a_{k-1}x_{n+k-1} + \cdots + a_0 x_n = g_n.$$

When $g_n = 0$ we have a set of homogeneous difference equations. Otherwise the equations are non-homogeneous. Define the characteristic polynomial as

$$c(\lambda) = \lambda^k + a_{k-1}\lambda^{k-1} + \cdots + a_0.$$

To solve, we let $c(\lambda) = 0$. The k solutions (or roots), $\lambda_1, \lambda_2, \ldots, \lambda_k$, give rise to the solution

$$x_n = c_1\lambda_1^n + \cdots + c_k\lambda_k^n,$$

where the $c_i$'s are found from the initial conditions.

Example 4.2.1 The Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, … is given by

$$x_{n+2} - x_{n+1} - x_n = 0, \quad x_1 = x_2 = 1, \quad n = 1, 2, 3, \ldots$$

Let $x_n = \lambda^n$ and solve the equation

$$\lambda^2 - \lambda - 1 = 0,$$

which yields two roots: $\lambda_1 = \frac{1+\sqrt{5}}{2} = 1.618$ and $\lambda_2 = \frac{1-\sqrt{5}}{2} = -0.618$. Next we solve the equations:

$$c_1\lambda_1 + c_2\lambda_2 = 1, \qquad c_1\lambda_1^2 + c_2\lambda_2^2 = 1,$$

and arrive at the solution:

$$x_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right].$$

Example 4.2.2 Let $x_{n+2} + x_n = 0$, $x_0 = 1$, $x_1 = 0$; i.e., the sequence 1, 0, -1, 0, 1, 0, -1, 0, …. Then $x_{n+2} = -x_n$ gives $\lambda^2 = -1$, implying the characteristic roots are $\lambda = \pm i$. Using the initial conditions, the closed form solution is

$$x_n = \frac{1}{2}\left[i^n + (-i)^n\right].$$
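Both closed forms above can be checked directly against their recursions; a small sketch:

```python
from math import sqrt

def fib_closed(n):
    """Binet form from the characteristic roots (1 +/- sqrt(5))/2 (Example 4.2.1)."""
    r1, r2 = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
    return (r1**n - r2**n) / sqrt(5)

def fib_recursive(n):
    """x_{n+2} = x_{n+1} + x_n with x_1 = x_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def osc_closed(n):
    """x_n = (i^n + (-i)^n)/2 from Example 4.2.2; real-valued: 1, 0, -1, 0, ..."""
    return ((1j**n) + ((-1j)**n)).real / 2
```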


4.3 Gambling Systems

4.3.1 Gambler's Ruin

We now define the scenario for gambler's ruin. Consider two gamblers, A and B, with respective initial fortunes a and b and probabilities of winning p and q such that $p + q = 1$. We wish to find $R_a$, the probability of gambler A's ruin if gambler A has initial wealth a and the game is played until either A or B is broke. Let

$$R_x = p\cdot R_{x+1} + q\cdot R_{x-1}, \quad x = 1, 2, \ldots, a+b-1,$$

where $R_0 = 1$ and $R_{a+b} = 0$. The above gives rise to the following homogeneous difference equation:

$$R_{x+2} - \frac{1}{p}R_{x+1} + \frac{q}{p}R_x = 0.$$

Thus we solve the characteristic polynomial $c(\lambda) = \lambda^2 - \frac{1}{p}\lambda + \frac{q}{p} = (\lambda - 1)\left(\lambda - \frac{q}{p}\right)$, which has two roots: $\lambda_1 = 1$ and $\lambda_2 = \frac{q}{p}$. If $\lambda_1 = \lambda_2$, the system is said to be indeterminate, so we assume $\lambda_1 \neq \lambda_2$ (i.e., $p \neq q$). In which case, $R_x = c_1 + c_2\left(\frac{q}{p}\right)^x$, where $c_1$ and $c_2$ are determined by the boundary conditions: $c_1 + c_2 = 1$ and $c_1 + c_2\left(\frac{q}{p}\right)^{a+b} = 0$. It can be shown that the probability of the gambler's ruin, given a start at a, is

$$R_a = \frac{\left(\frac{q}{p}\right)^a - \left(\frac{q}{p}\right)^{a+b}}{1 - \left(\frac{q}{p}\right)^{a+b}} \quad\text{when } q \neq p.$$

Via L'Hopital's rule, it can be shown that $R_a = \frac{b}{b+a}$ when $q = p = \frac{1}{2}$.

What if the opponent is infinitely wealthy?

$$\lim_{b\to\infty} R_a = \begin{cases} 1 & \text{if } p \leq q \\ \left(\frac{q}{p}\right)^a & \text{if } p > q. \end{cases}$$

We let $W_a = 1 - R_a$ be the probability of winning given a start at a.

Example 4.3.1 Roulette. A Vegas-style wheel has 18 red, 18 black and 1 green for a total of 37 slots. The gambler bets on red or black, thus $p = \frac{18}{37}$ and $q = \frac{19}{37}$. Furthermore, suppose the gambler has an initial wealth of \$100 (i.e., a = 100) and the house has \$1000 (i.e., b = 1000) (or, more realistically, the gambler will play until she is ruined or wins \$1000), and on each turn the gambler wagers \$1. Then $1 - R_{100} = W_{100} = 3.29\times 10^{-24}$ is the probability of the gambler winning. An interesting caveat of this example is that a slight change in the rules (Monaco style), namely that green delays a win or loss until the next spin, does not change the result above.


4.3.2 Expected Duration

Let the expected duration of the game if gambler A has wealth x be given by

$$D_x = p\cdot D_{x+1} + q\cdot D_{x-1} + 1, \quad D_0 = D_{a+b} = 0.$$

This gives rise to the following non-homogeneous difference equation

$$p\cdot D_{x+2} - D_{x+1} + q\cdot D_x = -1$$

or

$$D_{x+2} - \left(\frac{1}{p}\right)D_{x+1} + \left(\frac{q}{p}\right)D_x = -\frac{1}{p}.$$

The solution is of the form $D_x$ = general (homogeneous part) + particular $= c_1 + c_2\left(\frac{q}{p}\right)^x + f_x$. In general, $f_x$ is difficult to ascertain. However, when the non-homogeneous part is a constant, in this case $-\frac{1}{p}$, we have $f_x = \xi x$. From the following

$$p\cdot\xi(x+2) - \xi(x+1) + q\cdot\xi x = -1$$

we arrive at $\xi = \frac{1}{q-p}$ when $q \neq p$. Thus $f_x = \frac{x}{q-p}$, so that

$$D_x = c_1 + c_2\left(\frac{q}{p}\right)^x + \frac{x}{q-p}, \quad q \neq p,$$

where $c_1$ and $c_2$ are obtained from the boundary conditions. Similarly we resolve when $p = q$:

$$D_a = \begin{cases}\dfrac{a}{q-p} - \dfrac{a+b}{q-p}\left(\dfrac{1-\left(\frac{q}{p}\right)^a}{1-\left(\frac{q}{p}\right)^{a+b}}\right) & \text{when } q \neq p \\[2ex] ab & \text{when } q = p. \end{cases}$$

Example 4.3.2 Revisiting the roulette example, where $a + b = 1100$, we find that $D_{100} = 3700$.
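The duration formula can likewise be checked numerically; a sketch mirroring the ruin-probability computation (the fair-game branch uses the limit $D_a = ab$):

```python
def expected_duration(a, b, p):
    """Expected number of bets until someone is broke, starting from a."""
    q = 1.0 - p
    if abs(p - q) < 1e-15:
        return a * b                       # fair-game limiting case
    r = q / p
    W = (1.0 - r**a) / (1.0 - r**(a + b))  # probability A ever reaches a+b
    return a / (q - p) - (a + b) / (q - p) * W
```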

4.3.3 Discrete-time martingales

We motivate martingales by considering different betting strategies. Assume that there is some positive probability p > 0 of winning and let the outcome of each game be

$$X_i = \begin{cases} 1 & \text{if win } i\text{th game} \\ -1 & \text{if lose } i\text{th game.} \end{cases}$$

Let the amount wagered on the ith game be $W_i = W_i(\mathbf{X}_{i-1})$, where $\mathbf{X}_{i-1} = (X_1, X_2, \ldots, X_{i-1})$ is the history of wins and losses. Also let $Y_i$ be the fortune after the ith game, i.e.,

$$Y_n = Y_0 + \sum_{k=1}^{n} W_k X_k.$$


It follows that

$$E[Y_n] = Y_0 + E\left[\sum_{k=1}^{n} W_k X_k\right].$$

Now we consider two cases based on the gambler's initial wealth.

Case 1. Suppose the gambler has infinite wealth initially, i.e., $Y_0 = \infty$. Let $W = W_1$ be the initial wager, and henceforth wager the following:

$$W_k = \begin{cases} W\cdot 2^{k-1} & \text{if } X_i = -1 \text{ for } i = 1, 2, \ldots, k-1 \\ 0 & \text{otherwise,} \end{cases}$$

and suppose that the gambler wins for the first time on the nth game. Then the gambler wins $W\cdot 2^{n-1}$ and accrues a loss (from the first $n-1$ games) of $W + W\cdot 2 + \cdots + W\cdot 2^{n-2} = W\sum_{i=0}^{n-2} 2^i = W\left(\frac{1-2^{n-1}}{1-2}\right) = W(2^{n-1}-1)$, for a net winning of W.

Case 2. Now suppose $Y_0 = W(2^m - 1)$ and that the game is fair, i.e., $p = q = \frac{1}{2}$. Now let the betting strategy be

$$W_k = \begin{cases} W\cdot 2^{k-1} & \text{if } X_i = -1 \text{ for } i = 1, 2, \ldots, k-1 \text{ and } k \leq m \\ 0 & \text{otherwise.} \end{cases}$$

In which case,

$$E[\text{winnings}] = W\cdot P[\text{win at least one game}] - W(2^m - 1)\cdot P[\text{lose first } m \text{ games}] = W(1 - 2^{-m}) - W(2^m - 1)2^{-m} = 0.$$

For either case, it is easy to see that $Y_{n+1} = Y_n + W_{n+1}X_{n+1}$. It follows that

$$E(Y_{n+1}\mid\mathbf{X}_n) = E(Y_n\mid\mathbf{X}_n) + E(W_{n+1}X_{n+1}\mid\mathbf{X}_n) = Y_n + W_{n+1}E(X_{n+1}\mid\mathbf{X}_n) = Y_n + W_{n+1}E(X_{n+1}) = Y_n, \tag{4.2}$$

where the last equality holds when $p = q = \frac{1}{2}$. This is the principal driving force behind the martingale. In this case, it follows that

$$E(Y_{n+1}) = E(E(Y_{n+1}\mid\mathbf{X}_n)) = E(Y_n) = \cdots = Y_0.$$

Definition 4.3.3 A stochastic process $\{Y_n\}$ is said to be a martingale with respect to $\mathbf{X}_n = \{X_1, X_2, \ldots, X_n\}$ if $E(Y_{n+1}\mid\mathbf{X}_n) = Y_n$ and $E(|Y_n|) < \infty$ for all n. It follows that $E(Y_{n+k}\mid\mathbf{X}_n) = Y_n$ and, via reasoning similar to the above, that $E(Y_n) = Y_0$.

The martingale derives its name from a strap used to restrict the movement of a horse's head. We can see from the definition that the history $\mathbf{X}_n$ restricts the conditional expectation of $Y_n$. Note that in the betting strategies above, case 2 is a martingale and case 1 is not.


Theorem 4.3.4 (Martingale Convergence Theorem) If $\{Y_n\}$ is a martingale such that for some $M < \infty$, $E\{|Y_n|\} \leq M$ for all n, then with probability 1, $\lim_{n\to\infty} Y_n$ exists and is finite, i.e., $Y_n \to_{a.s.} Y$, or $Y_n$ converges almost surely to Y. (See Section 6.4 in Ross [?] for the proof.)

Definition 4.3.5 Zero Mean Martingale (ZMM): $\{Y_n\}$ is a martingale such that $E[Y_n] = 0$.

Theorem 4.3.6 If $\{Y_n\}$ is a ZMM, then $Y_n$ converges in distribution to a time-transformed Brownian motion as $n\to\infty$.

Example 4.3.7 Returning to the GW branching process, we let the score function be

$$Y_n(\mu) = \frac{1}{\sigma^2}\sum_{k=1}^{n}\left[Z_k - Z_{k-1}\mu\right] = \frac{1}{\sigma^2}\sum_{k=1}^{n}\left[Z_k - E(Z_k\mid Z_{k-1})\right]$$

and let $X_k = Z_k - \mu Z_{k-1}$, such that $E[X_k] = 0$. Then it follows that $E[Y_n(\mu)] = 0$, and hence $Y_n(\mu)$ is a ZMM. Thus, $Y_n(\mu)$ converges to a time-transformed Brownian motion with mean 0. This makes $Y_n(\mu)$ a good estimating equation. This implies that $\hat{\mu}_n \to_{a.s.} \mu$, and it is, therefore, a strongly consistent estimator. We will apply this theory in detail in Chapter 7.

For many sequential stochastic processes, the score functions can be written in the form

$$Y_n(\theta) = \frac{\partial\ln L_n}{\partial\theta} = \sum_{k=1}^{n}\left[Z_k - E(Z_k\mid Z_1, \ldots, Z_{k-1})\right],$$

and under mild regularity conditions $Y_n(\theta)$ will be a ZMM. This property can be used to find strongly consistent estimators of $\theta$.

4.4 Exercises

Exercise 4.4.1 Suppose we have a Galton-Watson branching process. We know that an estimator for $\mu$ is $\hat{\mu}_n = \frac{\sum_{k=1}^{n} Z_k}{\sum_{k=1}^{n} Z_{k-1}}$.

a. Show that $\hat{\mu}_n$ is a sufficient statistic for $\mu$ (assume a generalized power series distribution for the offspring distribution).

b. Show that $\hat{\mu}_n \to_{a.s.} \mu$. (Hint: Use the Martingale Convergence Theorem and the Toeplitz Lemma.)


Chapter 5

DISCRETE-TIME MARKOV CHAINS

5.1 Transition Probabilities, Classifications, Asymptotics

In this chapter we derive the properties of discrete-time Markov chains. Throughout we will assume the stochastic process moves through a discrete state space S over the discrete index set $T = \{0, 1, 2, \ldots\}$. First we start with the fundamental definitions of Markov processes.

Definition 5.1.1 A stochastic process $\{X_n\}$ is said to be a Markov chain if it has the following property: $P[X_t\mid X_0, \ldots, X_{t-1}] = P[X_t\mid X_{t-1}]$.

Definition 5.1.2 The one step transition probability: $P[X_{\tau+1} = i_{\tau+1}\mid X_\tau = i_\tau] = p_{i_\tau, i_{\tau+1}}$. For simplicity, in the time homogeneous case we write $P[X_{\tau+1} = j\mid X_\tau = i] = p_{ij}$. Note that $\sum_{i_{\tau+1}\in S} p_{i_\tau, i_{\tau+1}} = 1$.

Definition 5.1.3 The absolute probability is $P[X_\tau = i_\tau] = a_{i_\tau}$.

For a Markov chain $\{X_n\}$ and $\tau_0 < \tau_1 < \cdots < \tau_n < \tau$,

$$P[X_{\tau_0}, X_{\tau_1}, \ldots, X_{\tau_n}, X_\tau] = P[X_\tau\mid X_{\tau_0}, \ldots, X_{\tau_n}]\cdots P[X_{\tau_1}\mid X_{\tau_0}]\,P[X_{\tau_0}] = P[X_\tau\mid X_{\tau_n}]\cdots P[X_{\tau_1}\mid X_{\tau_0}]\,P[X_{\tau_0}] = a_{i_{\tau_0}}\, p_{i_{\tau_0}, i_{\tau_1}}\, p_{i_{\tau_1}, i_{\tau_2}}\cdots p_{i_{\tau_n}, i_\tau}. \tag{5.1}$$

Definition 5.1.4 For $S = \{1, 2, 3, \ldots\}$ the one step transition probability matrix is given by:

$$\mathbf{P} = \begin{bmatrix} p_{11} & p_{12} & \cdots \\ p_{21} & p_{22} & \\ \vdots & & \ddots \end{bmatrix}$$

Note that each row of $\mathbf{P}$ sums to unity, i.e., $\sum_{j\in S} p_{ij} = 1$.


Example 5.1.5 Recall in the Gambler's Ruin that the state space is $S = \{0, 1, 2, \ldots, a+b\}$. The $(a+b+1)\times(a+b+1)$ one step transition probability matrix is given by:

$$\mathbf{P} = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ q & 0 & p & 0 & & \\ 0 & q & 0 & p & & \\ & & & \ddots & & \\ & & & q & 0 & p \\ 0 & 0 & 0 & & 0 & 1 \end{bmatrix}$$

Example 5.1.6 Suppose we have a branching process with a Poisson offspring distribution. The state space is $S = \{0, 1, 2, \ldots\}$ and the corresponding one step transition probability matrix is given by:

$$\mathbf{P} = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots \\ p_{10} & p_{11} & p_{12} & p_{13} & \cdots \\ p_{20} & p_{21} & p_{22} & p_{23} & \cdots \\ \vdots & & & \ddots & \end{bmatrix}$$

Definition 5.1.7 The n step transition probability is $P[X_{\tau+n} = j\mid X_\tau = i] = p_{ij}(n)$, and $\mathbf{P}(n)$ is the n step transition probability matrix with ijth entry $p_{ij}(n)$. We employ the convention that $\mathbf{P}(1) = \mathbf{P}$.

We now derive the Chapman-Kolmogorov (CK) equation. For any $i, k \in S$,

$$\begin{aligned} p_{ik}(m+n) &= P[X_{\tau+m+n} = k\mid X_\tau = i] \\ &= \textstyle\sum_{j\in S} P[X_{\tau+m+n} = k,\, X_{\tau+m} = j\mid X_\tau = i] \\ &= \textstyle\sum_{j\in S} P[X_{\tau+m+n} = k\mid X_{\tau+m} = j,\, X_\tau = i]\,P[X_{\tau+m} = j\mid X_\tau = i] \\ &= \textstyle\sum_{j\in S} P[X_{\tau+m+n} = k\mid X_{\tau+m} = j]\,P[X_{\tau+m} = j\mid X_\tau = i] \\ &= \textstyle\sum_{j\in S} p_{ij}(m)\, p_{jk}(n). \end{aligned} \tag{5.2}$$

Therefore,

$$\mathbf{P}(m+n) = \mathbf{P}(m)\cdot\mathbf{P}(n).$$

We use the convention $p_{ii}(0) = 1$. Note that $\mathbf{P}(n) = \mathbf{P}\cdot\mathbf{P}(n-1) = \mathbf{P}^n$, which is not numerically stable in general.


Example 5.1.8 Consider the 2 step transition probability matrix for $S = \{1, 2\}$:

$$\mathbf{P}(2) = \begin{bmatrix} p_{11}(2) & p_{12}(2) \\ p_{21}(2) & p_{22}(2) \end{bmatrix} = \begin{bmatrix} p_{11}^2 + p_{12}p_{21} & p_{11}p_{12} + p_{12}p_{22} \\ p_{21}p_{11} + p_{22}p_{21} & p_{21}p_{12} + p_{22}^2 \end{bmatrix} = \mathbf{P}^2.$$

Example 5.1.9 Consider a Markov chain model for the transmission of binary code with $S = \{0, 1\}$ such that

$$\mathbf{P} = \begin{bmatrix} p & q \\ q & p \end{bmatrix},$$

where 0 < p < 1. It can be shown that

$$\mathbf{P}^n = \begin{bmatrix} \frac{1}{2} + \frac{1}{2}(p-q)^n & \frac{1}{2} - \frac{1}{2}(p-q)^n \\ \frac{1}{2} - \frac{1}{2}(p-q)^n & \frac{1}{2} + \frac{1}{2}(p-q)^n \end{bmatrix}.$$

Therefore

$$\lim_{n\to\infty}\mathbf{P}^n = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}.$$
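A short sketch verifying the closed form for $\mathbf{P}^n$ against brute-force matrix powers (pure Python, 2x2):

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow2(P, n):
    """Naive n-fold product P^n (the numerically direct route)."""
    M = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        M = matmul2(M, P)
    return M

p = 0.9
q = 1.0 - p
P = [[p, q], [q, p]]
Pn = matpow2(P, 5)
closed = 0.5 + 0.5 * (p - q) ** 5   # claimed diagonal entry of P^5
```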

Definition 5.1.10 State j is accessible from i if there exists some n > 0 such that $p_{ij}(n) > 0$. If state j is accessible from i, we will use the notation $i\to j$.

Definition 5.1.11 If $i\to j$ and $j\to i$, then i and j are said to communicate ($i\leftrightarrow j$).

Theorem 5.1.12 If $i\to j$ and $j\to k$, then $i\to k$.

Proof. We know there exists m such that $p_{ij}(m) > 0$ and there exists n such that $p_{jk}(n) > 0$. Thus by the CK equation, $p_{ik}(m+n) = \sum_{j'\in S} p_{ij'}(m)\,p_{j'k}(n) \geq p_{ij}(m)\,p_{jk}(n) > 0$.

Definition 5.1.13 A communicating class C(i) is defined to be the set of all states j which communicate with i. That is, $j\in C(i)$ iff $i\leftrightarrow j$.

Theorem 5.1.14 Communication is an equivalence relation:

1. Reflexivity: $i\leftrightarrow i$
2. Symmetry: If $i\leftrightarrow j$ then $j\leftrightarrow i$.
3. Transitivity: If $i\leftrightarrow j$ and $j\leftrightarrow k$, then $i\leftrightarrow k$.

Definition 5.1.15 The first passage time probability is given by

$$f_{ij}(n) = P[X_n = j,\; X_m\neq j,\; m = 1, \ldots, n-1\mid X_0 = i].$$


One step transition and first passage time probabilities are related as follows: $f_{ij}(1) = p_{ij}(1) = p_{ij}$. We also use the convention that $f_{ij}(0) = 0$ for $i\neq j$. In general,

$$p_{ij}(n) = \sum_{l=1}^{n} f_{ij}(l)\, p_{jj}(n-l).$$

Notice that

$$f_{ij}(n) = p_{ij}(n) - \sum_{l=1}^{n-1} f_{ij}(l)\, p_{jj}(n-l),$$

and that the probability we ever reach j from i is given by $f_{ij} = \sum_{n=1}^{\infty} f_{ij}(n)$, $0\leq f_{ij}\leq 1$. Often we are interested in when $f_{ij} = 1$ for $i\neq j$, in which case $\{f_{ij}(n)\}_{n=0}^{\infty}$ can be thought of as a proper distribution, and the mean first passage time can be calculated:

$$\mu_{ij} = \sum_{n=1}^{\infty} n\, f_{ij}(n).$$

If $f_{ij} < 1$, then the mean first passage time does not exist.
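The recursion for $f_{ij}(n)$ turns directly into code. This sketch computes first passage probabilities from the n-step transition probabilities of a small chain, and recovers $\mu_{00} = 2$ for the period-2 chain that appears below in Example 5.1.26.

```python
def n_step(P, n):
    """p_ij(n) by repeated multiplication (s x s nested lists)."""
    s = len(P)
    M = [[1.0 if i == j else 0.0 for j in range(s)] for i in range(s)]
    for _ in range(n):
        M = [[sum(M[i][k] * P[k][j] for k in range(s)) for j in range(s)]
             for i in range(s)]
    return M

def first_passage(P, i, j, n_max):
    """f_ij(n) = p_ij(n) - sum_{l<n} f_ij(l) p_jj(n-l), for n = 1..n_max."""
    p = [n_step(P, n) for n in range(n_max + 1)]
    f = [0.0]                      # f_ij(0) = 0 by convention
    for n in range(1, n_max + 1):
        f.append(p[n][i][j] - sum(f[l] * p[n - l][j][j] for l in range(1, n)))
    return f

P = [[0.0, 1.0], [1.0, 0.0]]       # the period-2 chain
f00 = first_passage(P, 0, 0, 10)
mu00 = sum(n * f00[n] for n in range(len(f00)))  # mean recurrence time
```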

Definition 5.1.16 State i is a transient state iff $f_{ii} < 1$.

Definition 5.1.17 State i is an absorbing state iff $p_{ii} = 1$.

Example 5.1.18 In the Gambler's ruin problem, states $1, 2, \ldots, a+b-1$ are transient states while states 0 and $a+b$ are absorbing states.

Definition 5.1.19 State i is a recurrent state iff $f_{ii} = 1$. Note that all absorbing states are recurrent.

Example 5.1.20 Consider a simple random walk. If p = q, then $f_{ii} = 1$, so that all states are recurrent. If $p\neq q$, then $f_{ii} < 1$, so that all states are transient.

Definition 5.1.21 A recurrent state i is nonnull iff $\mu_{ii} < \infty$.

Example 5.1.22 Recall the general random walk given by (4.1). A state i is a reflecting barrier on the left (on the right) if $q_i = 0$ ($q_i > 0$) and $p_i > 0$ ($p_i = 0$). Suppose we have a general random walk on $S = \{a, a+1, \ldots, b\}$ where a and b are left and right reflecting barriers respectively. Then all states in S are nonnull recurrent.

Definition 5.1.23 A recurrent state i is null iff $\mu_{ii} = \infty$.

Example 5.1.24 All states in a symmetric random walk without reflecting barriers are null recurrent.


Definition 5.1.25 Let $t = \gcd\{n : p_{ii}(n) > 0\}$, where gcd is the greatest common divisor. Then state i is periodic with period t if t > 1.

Example 5.1.26 Both states given by the following one step transition matrix

$$\mathbf{P} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

are periodic with period 2.

Example 5.1.27 A SRW with $p = q = \frac{1}{2}$ is also of period 2. Recall that $p_{00}(n) = \binom{n}{n/2}\left(\frac{1}{2}\right)^n$ when n is even and 0 when n is odd.

Definition 5.1.28 A recurrent state i is ergodic if $t = \gcd\{n : p_{ii}(n) > 0\} = 1$.

Example 5.1.29 For 0 < p < 1, both states given by the following one step transition probability matrix are ergodic:

$$\mathbf{P} = \begin{bmatrix} p & q \\ q & p \end{bmatrix}.$$

Example 5.1.30 A general random walk with at least one $r_i > 0$.

Thus states may be classified as follows: a state is either transient (e.g., the SRW with $p\neq q$) or recurrent; a recurrent state is either null (e.g., the SRW with $p = q = \frac{1}{2}$) or nonnull; and a recurrent nonnull state is either periodic (e.g., $\mathbf{P} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$) or ergodic (e.g., $\mathbf{P} = \begin{bmatrix} p & q \\ q & p \end{bmatrix}$).

Theorem 5.1.31 State i is transient iff $\sum_{n=0}^{\infty} p_{ii}(n) < \infty$. State i is recurrent iff $\sum_{n=0}^{\infty} p_{ii}(n) = \infty$.

Theorem 5.1.32 If state i is transient or recurrent null, then $\lim_{n\to\infty} p_{ii}(n) = 0$. If state i is recurrent nonnull with period t, then $\lim_{n\to\infty} p_{ii}(nt) = \frac{t}{\mu_{ii}}$. If state i is ergodic, then $\lim_{n\to\infty} p_{ii}(n) = \frac{1}{\mu_{ii}}$.

Example 5.1.33 Recall the period 2 transition probability matrix given in Example 5.1.26. Then by Theorem 5.1.32, $\lim_{n\to\infty} p_{ii}(2n) = \frac{2}{\mu_{ii}} = 1$ implies that $\mu_{ii} = 2$.


Theorem 5.1.34 If state j is transient or recurrent null, then for all i, $\lim_{n\to\infty} p_{ij}(n) = 0$. If state j is ergodic, then for all i, $\lim_{n\to\infty} p_{ij}(n) = \frac{1}{\mu_{jj}}$. (Note in the latter case, j must be reachable from i.)

Definition 5.1.35 A set C is closed iff $\sum_{j\in C} p_{ij} = 1$ for all $i\in C$.

Definition 5.1.36 A closed set of communicating states forms a class, or irreducible Markov chain.

Example 5.1.37 The following one step transition probability matrices give rise to irreducible Markov chains:

$$\mathbf{P} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \quad\text{and}\quad \mathbf{P} = \begin{bmatrix} p & q \\ q & p \end{bmatrix} \quad\text{for } 0 < p < 1.$$

Example 5.1.38 The gambler's ruin is not an irreducible Markov chain since states 0 and $a+b$ are absorbing states and do not communicate with the other states.

Definition 5.1.39 A probability distribution $\{\pi_i\}$ of a Markov chain C is stationary iff

$$\pi_j = \sum_{i\in C}\pi_i\, p_{ij} \;\text{ for } j\in C \quad\text{and}\quad \sum_{i\in C}\pi_i = 1.$$

Theorem 5.1.40 If C is an ergodic, irreducible Markov chain, then

$$\lim_{n\to\infty} p_{ij}(n) = \pi_j > 0$$

exists and is independent of state i. Furthermore, the limiting distribution $\{\pi_j\}$ is stationary. Conversely, if a stationary distribution of an irreducible Markov chain exists, then each state in C is ergodic and the stationary distribution is the limiting distribution of the chain.

In matrix notation, for $\boldsymbol{\pi} = [\pi_1, \pi_2, \ldots]$, we have $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ and $\boldsymbol{\pi}\mathbf{C} = 1$, where $\mathbf{C} = [1, 1, \ldots]^T$.

Example 5.1.41 For a 2 state process on $S = \{1, 2\}$ the stationary distribution can be found in general. That is,

$$[\pi_1, \pi_2]\begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix} = [\pi_1, \pi_2] \quad\text{and}\quad \pi_1 + \pi_2 = 1$$

implies

$$\pi_1 = \frac{p_{21}}{p_{21} + p_{12}} \quad\text{and}\quad \pi_2 = \frac{p_{12}}{p_{21} + p_{12}}.$$

Example 5.1.42 Societal classes: 1 upper, 2 middle, 3 lower.

$$\mathbf{P} = \begin{bmatrix} 0.448 & 0.484 & 0.068 \\ 0.054 & 0.699 & 0.247 \\ 0.011 & 0.503 & 0.486 \end{bmatrix}$$

yields $\boldsymbol{\pi} = [0.067, 0.624, 0.309]$ [?].
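The stationary distribution can be recovered numerically by repeated multiplication $\boldsymbol{\pi} \leftarrow \boldsymbol{\pi}\mathbf{P}$ (power iteration, one of several standard ways to solve $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$; this sketch uses the social-class matrix above):

```python
def stationary(P, iters=500):
    """Approximate the stationary distribution by power iteration pi <- pi P."""
    s = len(P)
    pi = [1.0 / s] * s
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(s)) for j in range(s)]
    return pi

P = [[0.448, 0.484, 0.068],
     [0.054, 0.699, 0.247],
     [0.011, 0.503, 0.486]]
pi = stationary(P)
```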


5.1.1 Absorbing Chains

Let N be a random variable for the number of steps to absorption. In this section we discuss inference on time until absorption, which relies on partitioning the one step transition probability matrix. First we consider the simpler, scalar example, which can be thought of as having one transient state and one absorbing state.

Example 5.1.43 Suppose N has a geometric mass function, i.e., $p_N(n) = q^{n-1}p$, $n = 1, 2, \ldots$, for 0 < p < 1, so that the pgf is given by $g_N(s) = ps\sum_{j=1}^{\infty} q^{j-1}s^{j-1} = \frac{ps}{1-qs} = (1-qs)^{-1}ps$. It follows that $E[N] = \frac{1}{p} = (1-q)^{-1}$ and that $Var[N] = \frac{q}{p^2} = q(1-q)^{-2}$.

Now consider the more general case where we have a reducible Markov chain with r states, s < r of which are transient and r - s absorbing. We let $C_1 = \{\text{transient states}\}$ and $C_2 = \{\text{absorbing states}\}$ and partition the one step transition probability matrix according to $C_1$ and $C_2$. We illustrate this partitioning in the following example:

Example 5.1.44 Consider the gambler's ruin scenario with $a + b = 4$, such that the transient states are $C_1 = \{1, 2, 3\}$, the absorbing states are $C_2 = \{0, 4\}$, r = 5, s = 3, and

$$\mathbf{P} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ q & 0 & p & 0 & 0 \\ 0 & q & 0 & p & 0 \\ 0 & 0 & q & 0 & p \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$

We then partition P (reordering the states as 1, 2, 3, 0, 4) as follows:

$$\mathbf{P} = \begin{bmatrix} \text{Trans.} & \text{Trans.}\to\text{Abs.} \\ \mathbf{0} & \text{Abs.} \end{bmatrix} \tag{5.3}$$

$$= \begin{bmatrix} 0 & p & 0 & q & 0 \\ q & 0 & p & 0 & 0 \\ 0 & q & 0 & 0 & p \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \mathbf{Q} & \mathbf{R} \\ \mathbf{0} & \mathbf{I} \end{bmatrix},$$

so that

$$\mathbf{P}^n = \begin{bmatrix} \mathbf{Q}^n & (\mathbf{I} + \mathbf{Q} + \cdots + \mathbf{Q}^{n-1})\mathbf{R} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}.$$

Note that Q is a substochastic matrix (i.e., the sum of at least one row is < 1), so that $\lim_{n\to\infty}\mathbf{Q}^n = \mathbf{0}$. Thus

$$\lim_{n\to\infty}\mathbf{P}^n = \begin{bmatrix} \mathbf{0} & (\mathbf{I} - \mathbf{Q})^{-1}\mathbf{R} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}.$$


We now calculate the mean time to absorption (note: this could be accomplished using difference equations as discussed earlier). Let $m_i$ be the mean time to absorption given that $X_0 = i$, $i\in C_1$; then

$$\begin{aligned} m_i &= E[N\mid X_0 = i] \\ &= \textstyle\sum_{j\in C}\, p_{ij}\, E(N\mid X_1 = j, X_0 = i) \\ &= \textstyle\sum_{j\in C_1} p_{ij}\, E(N\mid X_1 = j, X_0 = i) + \sum_{j\in C_2} p_{ij}\, E(N\mid X_1 = j, X_0 = i) \\ &= \textstyle\sum_{j\in C_1} p_{ij}(1 + m_j) + \sum_{j\in C_2} p_{ij}\cdot 1 \\ &= 1 + \textstyle\sum_{j\in C_1} p_{ij}\, m_j. \end{aligned}$$

In matrix form, we let $\mathbf{M} = [m_1, m_2, \ldots, m_s]^T$ and $\mathbf{C} = [1, 1, \ldots, 1]^T$. It follows that

$$\mathbf{I}_s\mathbf{M} = \mathbf{C} + \mathbf{Q}\mathbf{M},$$

where Q is the $s\times s$ submatrix for the transient states and $\mathbf{I}_s$ is the $s\times s$ identity matrix. This is equivalent to

$$\mathbf{M} = (\mathbf{I}_s - \mathbf{Q})^{-1}\mathbf{C}. \tag{5.4}$$

Example 5.1.45 Revisiting the gambler's ruin from Example 5.1.44,

$$\mathbf{I} - \mathbf{Q} = \begin{bmatrix} 1 & -p & 0 \\ -q & 1 & -p \\ 0 & -q & 1 \end{bmatrix},$$

so that for $p = \frac{2}{3}$, we get $\mathbf{M} = [3.4, 3.6, 2.2]^T$.
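Equation (5.4) is just a linear system. This sketch solves $(\mathbf{I} - \mathbf{Q})\mathbf{M} = \mathbf{C}$ with a small pure-Python elimination routine and reproduces Example 5.1.45.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    A = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

p, q = 2 / 3, 1 / 3
I_minus_Q = [[1.0, -p, 0.0],
             [-q, 1.0, -p],
             [0.0, -q, 1.0]]
M = solve(I_minus_Q, [1.0, 1.0, 1.0])   # mean steps to absorption from states 1, 2, 3
```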

It can also be shown that $Var[\mathbf{M}] = \mathbf{M}^{(2)} - \mathbf{M}^2$, where $\mathbf{M}^{(2)} = (\mathbf{I} - \mathbf{Q})^{-1}(2\mathbf{M} - \mathbf{C})$ and the square is taken elementwise.

We now present an alternative derivation using the properties of the pgf for a simpler case. Suppose we have r states such that state 0 is an absorbing state and states $\{1, 2, \ldots, r-1\}$ are communicating transient states. That is, $C_1 = \{1, 2, \ldots, r-1\}$ and $C_2 = \{0\}$. Let $p_i(n) = P[N = n\mid X_0 = i] = \sum_{j\in C_1} p_{ij}(n-1)\,p_{j0}$. Then

$$\mathbf{P}(n) = \begin{bmatrix} p_1(n) \\ p_2(n) \\ \vdots \\ p_{r-1}(n) \end{bmatrix} = \mathbf{Q}^{n-1}\mathbf{R},$$

so that the pgf for N is given by

$$\mathbf{f}(s) = [f_1(s), f_2(s), \ldots, f_{r-1}(s)]^T = \sum_{j=1}^{\infty}\mathbf{Q}^{j-1}s^j\mathbf{R} = \sum_{j=1}^{\infty}(\mathbf{Q}s)^{j-1}s\mathbf{R} = (\mathbf{I} - \mathbf{Q}s)^{-1}s\mathbf{R}. \tag{5.5}$$

We leave it as an exercise to derive the mean time to absorption from equation (5.5).

5.2 Algebraic treatment

As mentioned earlier, direct evaluation of $\mathbf{P}^n$ is numerically unstable. Here we use tools from linear algebra to derive a closed (i.e., numerically stable) form of the n-step transition probability matrix.

Let W be an $s\times s$ matrix

$$\mathbf{W} = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1s} \\ w_{21} & w_{22} & & \\ \vdots & & \ddots & \\ w_{s1} & w_{s2} & \cdots & w_{ss} \end{bmatrix}.$$

Definition 5.2.1 The cofactor $W_{ij}$ of the (i,j)th element of W is given by $(-1)^{i+j}$ times the determinant of the submatrix obtained by deleting the ith row and jth column of W.

Definition 5.2.2 The adjoint of W is given by

$$\mathbf{W}^+ = \begin{bmatrix} W_{11} & W_{21} & \cdots & W_{s1} \\ W_{12} & W_{22} & & W_{s2} \\ \vdots & & \ddots & \\ W_{1s} & W_{2s} & \cdots & W_{ss} \end{bmatrix}.$$

Definition 5.2.3 If there exist a non-zero vector t and a scalar $\lambda$ such that $\mathbf{W}\mathbf{t} = \lambda\mathbf{t}$, then $\lambda$ is called an eigenvalue of W and t the eigenvector. An equivalent form is given by $(\lambda\mathbf{I} - \mathbf{W})\mathbf{t} = \mathbf{0}$. Letting $\mathbf{A} = \lambda\mathbf{I} - \mathbf{W}$, then $\mathbf{A}\mathbf{t} = \mathbf{0}$ has a non-trivial solution if and only if A is singular; that is, if and only if $|\mathbf{A}| = 0$, where $|\mathbf{A}|$ is the determinant of A.

Definition 5.2.4 $|\mathbf{A}| = 0$ is the characteristic equation of W.

Thus eigenvalues are the roots of the characteristic equation. We will rely heavily on the following theorem.


Theorem 5.2.5 If $\lambda_1, \lambda_2, \ldots, \lambda_s$ are distinct eigenvalues of W, then the matrix of corresponding eigenvectors $\mathbf{T} = [\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_s]$ is invertible.

From the following equations

$$\mathbf{W}\mathbf{T} = [\mathbf{W}\mathbf{t}_1, \mathbf{W}\mathbf{t}_2, \ldots, \mathbf{W}\mathbf{t}_s] = [\lambda_1\mathbf{t}_1, \lambda_2\mathbf{t}_2, \ldots, \lambda_s\mathbf{t}_s]$$

we arrive at

$$\mathbf{T}^{-1}\mathbf{W}\mathbf{T} = [\lambda_1\mathbf{e}_1, \lambda_2\mathbf{e}_2, \ldots, \lambda_s\mathbf{e}_s] = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & \\ \vdots & & \ddots & \\ 0 & & & \lambda_s \end{bmatrix},$$

where $\mathbf{e}_i$ is a vector of 0's with a 1 in the ith position (e.g., $\mathbf{e}_1 = [1, 0, 0, \ldots, 0]^T$). Therefore,

$$\mathbf{W}^n = \mathbf{T}\begin{bmatrix} \lambda_1^n & 0 & \cdots & 0 \\ 0 & \lambda_2^n & & \\ \vdots & & \ddots & \\ 0 & & & \lambda_s^n \end{bmatrix}\mathbf{T}^{-1}. \tag{5.6}$$

Theorem 5.2.6 Let A (�j) = �jI�W be the characteristic matrix of �j: Then anynon-zero column of A+ (�j) is an eigenvector for �j:

Example 5.2.7 Suppose
$$\mathbf{P} = \begin{bmatrix} p & q \\ q & p \end{bmatrix} \quad \text{for } 0 < p < 1,$$
such that
$$\mathbf{A} = \begin{bmatrix} \lambda - p & -q \\ -q & \lambda - p \end{bmatrix}.$$
Setting the determinant to zero, i.e., $|\mathbf{A}| = 0$, gives rise to two roots: $\lambda_1 = 1$ and $\lambda_2 = p - q$. It follows that
$$\mathbf{A}^{+}(1) = \begin{bmatrix} q & q \\ q & q \end{bmatrix} \quad \text{and} \quad \mathbf{A}^{+}(p-q) = \begin{bmatrix} -q & q \\ q & -q \end{bmatrix}.$$
From Theorem 5.2.6, a matrix of eigenvectors is given by
$$\mathbf{T} = \begin{bmatrix} q & q \\ q & -q \end{bmatrix}.$$
Therefore,
$$\mathbf{P}(n) = \begin{bmatrix} q & q \\ q & -q \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & (p-q)^n \end{bmatrix} \frac{1}{2q}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 + (p-q)^n & 1 - (p-q)^n \\ 1 - (p-q)^n & 1 + (p-q)^n \end{bmatrix}.$$
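The computation in Example 5.2.7 is easy to check numerically. The following sketch (plain NumPy, not part of the original notes) builds $\mathbf{P}(n)$ from the eigendecomposition (5.6) and compares it with the closed form:

```python
import numpy as np

# Check Example 5.2.7: P(n) via the eigendecomposition (5.6) vs. the closed form
p = 0.3
q = 1 - p
n = 5
P = np.array([[p, q], [q, p]])

T = np.array([[q, q], [q, -q]])       # eigenvectors for lambda_1 = 1, lambda_2 = p - q
D = np.diag([1.0, (p - q) ** n])      # diag(lambda_1^n, lambda_2^n)
P_n = T @ D @ np.linalg.inv(T)

closed_form = 0.5 * np.array([[1 + (p - q) ** n, 1 - (p - q) ** n],
                              [1 - (p - q) ** n, 1 + (p - q) ** n]])
```

Both agree with the direct matrix power `np.linalg.matrix_power(P, n)`.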


Chiang derived the following explicit form for the $ij$th entry of $\mathbf{P}(n)$:
$$P_{ij}(n) = \sum_{l=1}^{s} A_{ji}(\lambda_l)\,\lambda_l^n\, \frac{1}{\prod_{\substack{m=1 \\ m\neq l}}^{s} (\lambda_l - \lambda_m)}, \tag{5.7}$$
where $A_{ji}(\lambda_l)$ denotes the $(j,i)$th element of the adjoint $\mathbf{A}^{+}(\lambda_l)$.

Example 5.2.8 Revisiting Example 5.2.7:
$$\begin{aligned}
P_{11}(n) &= \sum_{l=1}^{2} \frac{A_{11}(\lambda_l)\,\lambda_l^n}{\prod_{\substack{m=1\\ m\neq l}}^{2}(\lambda_l - \lambda_m)} \\
&= A_{11}(\lambda_1)\,\lambda_1^n\left(\frac{1}{\lambda_1 - \lambda_2}\right) + A_{11}(\lambda_2)\,\lambda_2^n\left(\frac{1}{\lambda_2 - \lambda_1}\right) \\
&= q\left(\frac{1}{1-p+q}\right) + (-q)(p-q)^n\left(\frac{1}{p-q-1}\right) \\
&= \frac{1}{2} + \frac{1}{2}(p-q)^n.
\end{aligned}$$

Furthermore, Chiang shows that
$$\lim_{n\to\infty} P_{ij}(n) = \frac{A_{jj}(1)}{\sum_{k=1}^{s} A_{kk}(1)}.$$
Finally, we note that for matrices with non-distinct eigenvalues, diagonalization as in equation (5.6) is not possible, and one must resort to the Jordan form.

5.3 Inference

In this section, we will introduce likelihood-based inference for homogeneous Markov chains with discrete state space and index set. Generally, one can perform inference on a single observed sequence over time or on many observed sequences over time.

5.3.1 Inference on a single sequence

We assume that we are estimating a set of parameters, $\theta$, from a single observed sequence $\{x_1, x_2, \ldots, x_n\}$ from a single population. Then the likelihood function is
$$L_n(\theta) = \prod_{k=1}^{n} P(X_k \mid X_{k-1}),$$
and the score functions are
$$S_n(\theta) = \frac{\partial \ln L_n(\theta)}{\partial \theta} = \sum_{k=1}^{n} P(X_k \mid X_{k-1})^{-1}\, \frac{\partial P(X_k \mid X_{k-1})}{\partial \theta}.$$

Example 5.3.1 Galton-Watson branching process. Here $P(Z_k \mid Z_{k-1}) \propto \theta^{Z_k} / f(\theta)^{Z_{k-1}}$, and the log likelihood is given by (3.3) in Section 3.2.3.

Example 5.3.2 The Reed-Frost model. See (5.26) in section 5.5.3.


5.3.2 Inference on multiple observed sequences

We let $n_{ij}$ be the number of observed transitions from state $i$ to state $j$, $(i,j) \in S$. Based on these data, we can estimate the elements of the one-step transition matrix, $\mathbf{P}$. Suppose the Markov chain is finite with $s$ states. Then the problem reduces to an $s \times s$ contingency table. The outcome $\{n_{i1}, n_{i2}, \ldots, n_{is}\}$ follows a multinomial distribution with probability
$$\frac{n_{i\cdot}!}{n_{i1}!\, n_{i2}! \cdots n_{is}!}\, p_{i1}^{n_{i1}}\, p_{i2}^{n_{i2}} \cdots p_{is}^{n_{is}},$$
where $n_{i\cdot} = \sum_{j=1}^{s} n_{ij}$. The likelihood function for $\mathbf{P}$, conditional on $\{n_{i1}, n_{i2}, \ldots, n_{is}\}$, is
$$L(\mathbf{P}) = c\prod_{i=1}^{s} \prod_{j=1}^{s} p_{ij}^{n_{ij}},$$
where $c$ is a constant. This is the likelihood for multinomial data. It follows directly that the maximum likelihood estimates are $\hat{p}_{ij} = n_{ij}/n_{i\cdot}$, with estimated variances $\widehat{\mathrm{var}}(\hat{p}_{ij}) \approx \hat{p}_{ij}(1 - \hat{p}_{ij})/n_{i\cdot}$, $(i,j) \in S$.
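As a concrete illustration, the MLEs $\hat{p}_{ij} = n_{ij}/n_{i\cdot}$ can be computed directly from observed sequences. This is a minimal sketch (the function name and the toy data are ours, not from the notes):

```python
import numpy as np

def estimate_transition_matrix(sequences, s):
    """MLE of a finite Markov chain's one-step transition matrix.

    Counts transitions n_ij pooled over the observed sequences, then sets
    p_hat[i, j] = n_ij / n_i. (row totals), with the multinomial variances.
    """
    counts = np.zeros((s, s))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    p_hat = counts / row_totals
    var_hat = p_hat * (1 - p_hat) / row_totals
    return p_hat, var_hat

# Two short observed sequences over states {0, 1}
p_hat, var_hat = estimate_transition_matrix([[0, 0, 1, 0, 1], [1, 1, 0, 0]], s=2)
```

Here the pooled counts are $n_{00}=2$, $n_{01}=2$, $n_{10}=2$, $n_{11}=1$, so the first row of `p_hat` is $(0.5, 0.5)$.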

5.4 The chain binomial model

Let time be discrete and indexed $t = 0, 1, \ldots$ Let $S_t$ be the number of individuals at risk for the event of interest (e.g., infection, death) at the beginning of time interval $t$, and $I_t$ be the number that experienced the event of interest at the beginning of time interval $t$. The event has a duration of at least one time interval. We let $p_t = 1 - q_t = f(t, \theta, I_t)$ be the probability that an at-risk individual has a new event at the beginning of time interval $t+1$, with parameters $\theta$. As shown, this probability can be a function, $f(\cdot)$, of $t$ and $I_t$. We usually start with a closed population of $n = S_0 + I_0$ individuals. Then $I_{t+1}$ is a binomial random variable that follows the conditional probability mass function
$$\Pr(I_{t+1} = i_{t+1} \mid S_t = s_t, p_t) = \binom{s_t}{i_{t+1}} p_t^{i_{t+1}} q_t^{s_t - i_{t+1}}, \quad s_t \geq i_{t+1}. \tag{5.8}$$
In many cases, $S_t$ is updated via the relationship
$$S_{t+1} = S_t - I_{t+1}, \tag{5.9}$$
although other relationships are possible (see below). The conditional expectation and variance of $I_{t+1}$, respectively, are
$$E(I_{t+1} \mid s_t, p_t) = s_t p_t, \tag{5.10}$$
$$\mathrm{var}(I_{t+1} \mid s_t, p_t) = s_t p_t q_t. \tag{5.11}$$


Equations (5.8)-(5.9) form the classical chain-binomial model. Formal mathematical treatment of the model involves formulation of the discrete, two-dimensional Markov chain $\{S_t, I_t\}_{t=0,1,\ldots}$. Here $I_t$ is the (binomial) random variable of interest, and $S_t$ is updated using (5.9). The probability of a particular chain, $\{i_0, i_1, i_2, \ldots, i_r\}$, is given by the product of conditional binomial probabilities from (5.8) as
$$\Pr(I_1 = i_1 \mid S_0 = s_0, p_0)\Pr(I_2 = i_2 \mid S_1 = s_1, p_1) \cdots \Pr(I_r = i_r \mid S_{r-1} = s_{r-1}, p_{r-1}) = \prod_{t=0}^{r-1} \binom{s_t}{i_{t+1}} p_t^{i_{t+1}} q_t^{s_t - i_{t+1}}.$$
The conditional expected value of $I_{t+1}$ (5.10) suggests the deterministic system of first-order difference equations
$$i_{t+1} = s_t p_t, \qquad s_{t+1} = s_t - i_{t+1}, \tag{5.12}$$
which can be analyzed as an approximation to the mean of the sample paths of the stochastic process $\{S_t, I_t\}_{t=0,1,\ldots}$. This system reduces to
$$s_t = s_{t-1} q_{t-1} = s_0 \prod_{\ell=0}^{t-1} q_\ell, \tag{5.13}$$
which is analyzed using methods from discrete mathematics (e.g., see Frauenthal [9] and Longini [14]).
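To make the mechanics concrete, here is a small simulation sketch of the chain-binomial model (ours, not from the notes; standard-library Python only), using the update (5.8)-(5.9):

```python
import random

def simulate_chain_binomial(s0, i0, prob_event, rng, max_steps=1000):
    """Simulate one chain {i_0, i_1, ...} of the chain-binomial model.

    prob_event(t, i_t) returns p_t, the per-individual event probability
    for interval t+1; S is updated by S_{t+1} = S_t - I_{t+1} (eq. 5.9).
    """
    chain = [i0]
    s, i = s0, i0
    for t in range(max_steps):
        p = prob_event(t, i)
        # I_{t+1} ~ Binomial(S_t, p_t): count events among those still at risk
        new_events = sum(rng.random() < p for _ in range(s))
        s -= new_events
        i = new_events
        chain.append(new_events)
        if s == 0 or new_events == 0:
            break
    return chain

rng = random.Random(1)
# Reed-Frost-type hazard with transmission probability p = 0.3: p_t = 1 - 0.7**I_t
chain = simulate_chain_binomial(s0=10, i0=1,
                                prob_event=lambda t, i: 1 - 0.7 ** i, rng=rng)
```

The simulated chain starts at $i_0 = 1$ and stops as soon as $S_t I_t = 0$, so the total of the chain can never exceed $n = 11$.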

5.5 The Reed-Frost Model

This section is taken from Longini [16].

5.5.1 History

The probabilistic form of the Reed-Frost epidemic model was introduced by the biostatistician Lowell J. Reed and the epidemiologist Wade Hampton Frost around 1930 as a teaching tool at Johns Hopkins University. It was developed as a mechanical model consisting of colored balls and wooden chutes. Although Reed and Frost never published their results, the work is described in articles and books by others (see Chapters 14 and 18 in Bailey [1] and Chapters 2 and 3 in Becker [3]). An excellent description of the early Reed-Frost model is given by Fine [8]. The deterministic version of the Reed-Frost model has been traced back to the Russian epidemiologist P. D. En'ko, who used the model to analyze epidemic data in the 1880s (see Dietz [7]). The Reed-Frost version of the chain binomial and its extensions are used to study the dynamics of epidemics in small populations, such as families or day care centers, and to estimate transmission probabilities from epidemic data.


5.5.2 Formulation

In this case, $S_t$ is the number of susceptible persons at the beginning of time interval $t$ and $I_t$ is the number of persons who were newly infected at the beginning of time interval $t$. An infected person is infectious for exactly one time interval and then is removed, i.e., becomes immune. Thus, a person infected at the beginning of time interval $t$ will be infectious to others until the beginning of time interval $t+1$. We let $R_t$ be the number of removed persons at the beginning of time interval $t$; then, by definition,
$$R_{t+1} = R_t + I_t = \sum_{r=0}^{t} I_r. \tag{5.14}$$
Since the population is closed, we have $S_t + I_t + R_t = n$ for all $t$. We let $p = 1 - q$ be the probability that any two specified people make sufficient contact to transmit the infection, if one is susceptible and the other infected, during one time interval. We note that $p$ is a form of the secondary attack rate. We assume random mixing. Then, if during time interval $t$ there are $I_t$ infectives, the probability that a susceptible will escape being infected over the time interval is $q^{I_t}$, and the probability that he or she will become a new case at the beginning of time interval $t+1$ is $1 - q^{I_t}$. Thus, $q_t = q^{I_t}$, and substituting into (5.8) yields
$$\Pr(I_{t+1} = i_{t+1} \mid S_t = s_t, I_t = i_t) = \binom{s_t}{i_{t+1}} \left(1 - q^{i_t}\right)^{i_{t+1}} q^{i_t(s_t - i_{t+1})}, \quad s_t \geq i_{t+1}. \tag{5.15}$$
The epidemic process starts with $I_0 > 0$ and terminates at stopping time $T$, where
$$T = \inf_{t \geq 0}\,\{t : S_t I_t = 0\}. \tag{5.16}$$
Table 1 shows the possible chains for a population of size 4 with one initial infective, i.e., $S_0 = 3$, $I_0 = 1$.

Table 1: Possible individual chains when $S_0 = 3$, $I_0 = 1$

  Chain $\{i_0, i_1, \ldots, i_T\}$   Probability     Final size $R_T$
  $\{1\}$                             $q^3$           1
  $\{1,1\}$                           $3pq^4$         2
  $\{1,1,1\}$                         $6p^2q^4$       3
  $\{1,2\}$                           $3p^2q^3$       3
  $\{1,1,1,1\}$                       $6p^3q^3$       4
  $\{1,1,2\}$                         $3p^3q^2$       4
  $\{1,2,1\}$                         $3p^3q(1+q)$    4
  $\{1,3\}$                           $p^3$           4
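The entries of Table 1 can be generated mechanically by enumerating every possible chain and multiplying the conditional binomial probabilities (5.15). A recursive sketch (ours, not from the notes; exact arithmetic via `fractions` so the probabilities sum to exactly one):

```python
from fractions import Fraction
from math import comb

def reed_frost_chains(s, i, q):
    """Enumerate all continuations of a Reed-Frost chain from state
    (S_t, I_t) = (s, i), yielding (subsequent case counts, probability)
    pairs according to the transition law (5.15)."""
    if s == 0 or i == 0:
        yield [], Fraction(1)
        return
    for j in range(s + 1):                        # j = i_{t+1}, the new cases
        pr = comb(s, j) * (1 - q ** i) ** j * q ** (i * (s - j))
        if j == 0:
            yield [], pr                          # chain stops: no new cases
        else:
            for tail, tail_pr in reed_frost_chains(s - j, j, q):
                yield [j] + tail, pr * tail_pr

q = Fraction(9, 10)                               # escape probability; p = 1/10
chains = {tuple([1] + c): pr for c, pr in reed_frost_chains(3, 1, q)}
total = sum(chains.values())
```

For $S_0 = 3$, $I_0 = 1$ this reproduces the eight chains of Table 1, e.g. `chains[(1,)]` equals $q^3$ and `chains[(1, 1)]` equals $3pq^4$.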


The probability of no epidemic is defined as the probability that there will be no further cases beyond the initial cases. This probability is
$$\Pr(I_1 = 0 \mid S_0 = s_0, p_0) = q^{i_0 s_0}. \tag{5.17}$$
For example, if $S_0 = 10$, $I_0 = 1$, and $p = 0.05$, then the probability of no further cases beyond the initial case is $0.599$. From (5.10), the conditional expected number of new cases in time interval $t$ is $E(I_{t+1} \mid s_t, p_t) = s_t(1 - q^{i_t})$. On average, the epidemic process will not progress very far if the expected number of cases in the first generation is less than or equal to one, i.e., $E(I_1 \mid s_0, p_0) = s_0(1 - q^{i_0}) \leq 1$. In many cases, $i_0 = 1$, so that there will be few secondary cases if $s_0 p \leq 1$. Then, for example, if $S_0 = 10$ and $I_0 = 1$, there will be few secondary cases if $p \leq 0.1$.

From (5.13), the deterministic counterpart of the Reed-Frost model is
$$s_t = s_0\, q^{\sum_{\ell=0}^{t-1} i_\ell}, \tag{5.18}$$
which has been thoroughly analyzed (e.g., see Frauenthal [9] and Longini [14]).

In some cases, the distribution of the total number of cases, $R_T$, is the random variable of interest. We let $J$ be the random variable for the total number of cases in addition to the initial cases, so that $R_T = J + I_0$. If we let $S_0 = k$ and $I_0 = i$, then the probability of interest is
$$\Pr(J = j \mid S_0 = k, I_0 = i) = m_{ijk}, \tag{5.19}$$
where $\sum_{j=0}^{k} m_{ijk} = 1$. Then, based on probability arguments (e.g., see Bailey [1]), we have the recursive expression
$$m_{ijk} = \binom{k}{j} m_{ijj}\, q^{(i+j)(k-j)}, \quad j < k, \tag{5.20}$$
and
$$m_{ikk} = 1 - \sum_{j=0}^{k-1} m_{ijk}. \tag{5.21}$$
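The recursion (5.20)-(5.21) translates directly into code. A short memoized sketch (ours, not from the notes), again using exact rationals:

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def final_size_dist(i, k, q):
    """Distribution of J, the number of cases beyond the I_0 = i initial
    cases, for S_0 = k susceptibles, via the recursion (5.20)-(5.21)."""
    @lru_cache(maxsize=None)
    def m(j, kk):
        if j < kk:
            # eq. (5.20): j of the kk susceptibles are infected, the rest escape
            return comb(kk, j) * m(j, j) * q ** ((i + j) * (kk - j))
        # j == kk, eq. (5.21): complement of all smaller final sizes
        return 1 - sum(m(jj, kk) for jj in range(kk))
    return [m(j, k) for j in range(k + 1)]

q = Fraction(9, 10)
dist = final_size_dist(i=1, k=3, q=q)
```

For $i = 1$, $k = 3$ this agrees with Table 1 after grouping chains by final size: `dist[0]` is $q^3$ and `dist[1]` is $3pq^4$.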

The Reed-Frost model has several extensions and special cases. If it is hypothesized that the probability that a susceptible becomes infected does not depend on the number of infectives that he or she is exposed to, then
$$p_t = \begin{cases} p, & \text{if } I_t > 0, \\ 0, & \text{if } I_t = 0. \end{cases} \tag{5.22}$$
This model is known as the Greenwood model [10].

Longini and Koopman [23] modified the Reed-Frost model for the common case where there is a constant source of infection from outside the population that does not depend on the number of infected persons in the population. We let $a_t = 1 - b_t$ be


the probability that a susceptible person is infected during interval $t$ due to contacts with infected persons outside the population, where
$$a_t > 0 \ \text{if } t \leq T, \qquad a_t = 0 \ \text{if } t > T,$$
and $T$ is a stopping time. Then $p_t = 1 - b_t q^{I_t}$. If we let $B = \prod_{t=0}^{T} b_t$, then $B$ is the probability that a person escapes infection from sources outside of the population over the entire period $[0, T]$. We then define $\mathrm{CPI} = 1 - B$ as the community probability of infection. Longini and Koopman derive the probability mass function
$$m_{ijk} = \binom{k}{j} m_{ijj}\, B^{(k-j)} q^{(i+j)(k-j)}, \quad j < k. \tag{5.23}$$

Usually, $i = 0$ for this model. This model reduces to (5.20) when $B = 1$.

Another extension of the Reed-Frost model is for infectious diseases that do not confer immunity following infection. In this case, there is no removed state, so that $S_t + I_t = n$. Then, since $S_{t+1} = n - I_{t+1}$, the model is a discrete, one-dimensional Markov chain $\{I_t\}_{t=0,1,\ldots}$. The transition probabilities for this process are
$$\Pr(I_{t+1} = i_{t+1} \mid I_t = i_t) = \binom{n - i_t}{i_{t+1}} \left(1 - q^{i_t}\right)^{i_{t+1}} q^{i_t(n - i_t - i_{t+1})}, \quad i_t + i_{t+1} \leq n. \tag{5.24}$$
In this case, the disease in question can become "endemic." An interesting analytical question involves the study of the mean stopping time for the endemic process. From (5.12), the deterministic counterpart of this model is
$$i_{t+1} = (n - i_t)\left(1 - q^{i_t}\right), \tag{5.25}$$
which is a form of the discrete logistic function. The stochastic behavior of (5.24) has been analyzed by Longini [13], and the dynamics of (5.25) have been analyzed by Cooke, et al. [6].

There are many other extensions of the Reed-Frost model depending on the particular infectious disease being analyzed, but a further key extension is to allow the infectious period to extend over several time intervals. In this case $p_t = f(t, \theta, I_0, I_1, \ldots, I_t)$, and $\{S_t, I_t\}_{t=0,1,\ldots}$ is not a Markov chain. Special methods are used to analyze this model (Saunders [26]).

5.5.3 Inference

Data are usually in the form of observed chains, $\{i_0, i_1, \ldots, i_r\}$, for one or more populations, or final sizes, $R_T$, for more than one population. With respect to the former data form, suppose that we have $N$ populations and let $\{i_{k0}, i_{k1}, \ldots, i_{kr}\}$ be the observed chain for the $k$th population. Then, from (5.1), the likelihood function for estimating $p = 1 - q$ is


$$L(p) = \prod_{k=1}^{N} \prod_{t=0}^{r-1} \binom{s_{kt}}{i_{k,t+1}} \left(1 - q^{i_{kt}}\right)^{i_{k,t+1}} q^{i_{kt}(s_{kt} - i_{k,t+1})}. \tag{5.26}$$
For final value data, let $a_{ijk}$ be the observed frequencies of the $m_{ijk}$ from (5.23), $i = 1, \ldots, I$, $k = 1, \ldots, K$, and $j = 1, \ldots, k$. Then the likelihood function for estimating $p$ and $B$ is
$$L(p, B) = \prod_{i=1}^{I} \prod_{k=1}^{K} \prod_{j=0}^{k} m_{ijk}^{a_{ijk}}. \tag{5.27}$$
The logarithms of (5.26)-(5.27) are maximized using standard scoring routines (e.g., Bailey [1], Becker [3], Longini, et al. [23], [24]), or the corresponding generalized linear model (see Becker [3], Haber, et al. [11]). Extensions involve making both $p$ and the CPI functions of covariates, such as age, level of susceptibility, or vaccination status. Bailey [1] (Sec. 14.3) gives an example where (5.26) is used to estimate $\hat{p} = 0.789 \pm 0.015$ (estimate $\pm 1$ standard error) for the household spread of measles among children. In the case of the household spread of influenza, Longini, et al. [24] use (5.27) to estimate $\hat{p} = 0.260 \pm 0.030$ for persons with no prior immunity and $\hat{p} = 0.021 \pm 0.026$ for persons with some prior immunity. In addition, they estimated $\widehat{\mathrm{CPI}} = 0.164 \pm 0.015$ and $\widehat{\mathrm{CPI}} = 0.092 \pm 0.013$ for persons with no and some prior immunity, respectively.

5.6 Life Tables

The chain binomial model forms the statistical underpinnings of the life table (see Chapter 10 in Chiang [5]). In this case, $p_t$ simply depends on the time interval. Here, $S_t$ is the random variable of interest, which is formulated in terms of the interval survival probabilities $q_t = 1 - p_t$. Many important life table indices are functions of the $q_t$. For example, the probability that an individual who starts in the cohort at time 0 is still alive at the end of time interval $r$, denoted $q_{0r}$, is $q_{0r} = \prod_{t=0}^{r} q_t$. The expected number alive at the beginning of time interval $r+1$ is $E(S_{r+1}) = s_0 q_{0r}$. This model is a discrete, one-dimensional Markov chain $\{S_t\}_{t=0,1,\ldots}$. From (5.8), we see that the chain binomial model for $S_t$ is simply
$$\Pr(S_{t+1} = s_{t+1} \mid S_t = s_t) = \binom{s_t}{s_{t+1}} q_t^{s_{t+1}} p_t^{s_t - s_{t+1}}, \quad s_t \geq s_{t+1}. \tag{5.28}$$

From (5.1), the probability of a particular chain $\{s_0, s_1, s_2, \ldots, s_r\}$ is
$$\Pr(S_1 = s_1 \mid S_0 = s_0)\Pr(S_2 = s_2 \mid S_1 = s_1) \cdots \Pr(S_r = s_r \mid S_{r-1} = s_{r-1}) = \prod_{t=0}^{r-1} \binom{s_t}{s_{t+1}} q_t^{s_{t+1}} p_t^{s_t - s_{t+1}}. \tag{5.29}$$


For an observed chain $\{s_0, s_1, s_2, \ldots, s_r\}$, (5.29) is the likelihood function for estimating $\{q_0, q_1, \ldots, q_r\}$. The maximum likelihood estimators are
$$\hat{q}_t = \frac{s_{t+1}}{s_t}, \tag{5.30}$$
while the approximate variances, for large $S_0$, are
$$\mathrm{var}(\hat{q}_t) \approx \frac{p_t q_t}{E(S_t)}. \tag{5.31}$$

In addition, the $\hat{q}_t$ are unique, unbiased estimates of the $q_t$, and $\mathrm{cov}(\hat{q}_t, \hat{q}_\ell) = 0$, $t \neq \ell$. Estimators of most of the life table functions are based on the estimators $\hat{q}_t$.

5.7 HIV-progression model

Longini [15] defined the stages of HIV infection based on T4 cell counts as shown in the table below:

Stages of HIV infection

  Stage   T4 cell count range
  1       > 899
  2       700 - 899
  3       500 - 699
  4       200 - 499
  5       0 - 200
  6       Deceased

We assume that the T4 cell decline is monotonically decreasing, so that the one-step transition matrix has the following form:
$$\mathbf{P} = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} & p_{15} & p_{16} \\ 0 & p_{22} & p_{23} & p_{24} & p_{25} & p_{26} \\ 0 & 0 & p_{33} & p_{34} & p_{35} & p_{36} \\ 0 & 0 & 0 & p_{44} & p_{45} & p_{46} \\ 0 & 0 & 0 & 0 & p_{55} & p_{56} \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$
An estimate of this matrix from U.S. Army data is
$$\mathbf{P} = \begin{bmatrix} 0.48 & 0.18 & 0.20 & 0.14 & 0 & 0 \\ 0 & 0.49 & 0.23 & 0.26 & 0.02 & 0 \\ 0 & 0 & 0.53 & 0.45 & 0.02 & 0 \\ 0 & 0 & 0 & 0.79 & 0.19 & 0.02 \\ 0 & 0 & 0 & 0 & 0.70 & 0.30 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$
The eigenvalues of this matrix are $\boldsymbol{\lambda} = [0.48, 0.49, 0.53, 0.79, 0.70, 1]$, and the


matrix of eigenvectors is
$$\mathbf{T} = \begin{bmatrix} 1 & 18.0 & 24.7 & 2.8419 & -13.28 & 1 \\ 0 & 1.0 & 5.75 & 2.1936 & -8.5101 & 1 \\ 0 & 0 & 1.0 & 1.7308 & -5.4706 & 1 \\ 0 & 0 & 0 & 1.000 & -2.1111 & 1 \\ 0 & 0 & 0 & 0 & 1.00 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$
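Because the estimated matrix is upper triangular, its eigenvalues can be read off the diagonal. A quick NumPy check (ours, not part of the notes):

```python
import numpy as np

# Estimated one-step transition matrix for the six HIV stages (U.S. Army data)
P = np.array([
    [0.48, 0.18, 0.20, 0.14, 0.00, 0.00],
    [0.00, 0.49, 0.23, 0.26, 0.02, 0.00],
    [0.00, 0.00, 0.53, 0.45, 0.02, 0.00],
    [0.00, 0.00, 0.00, 0.79, 0.19, 0.02],
    [0.00, 0.00, 0.00, 0.00, 0.70, 0.30],
    [0.00, 0.00, 0.00, 0.00, 0.00, 1.00],
])

# For an upper-triangular matrix the eigenvalues are just the diagonal entries
eigvals, T = np.linalg.eig(P)
```

The column of `T` associated with eigenvalue 1 is proportional to the all-ones vector, since every row of `P` sums to one.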

5.8 Endemic Reed-Frost Model

Consider the endemic model described in (5.24). This model is a one-dimensional Markov chain $\{I_t\}_{t=0,1,\ldots}$ on the state space $S = \{0, 1, \ldots, n\}$, with transition probabilities
$$p_{ij} = \begin{cases} \binom{n-i}{j}\left(1 - q^i\right)^j q^{i(n-i-j)}, & \text{if } i + j \leq n, \\ 0, & \text{if } i + j > n. \end{cases} \tag{5.32}$$
State 0 is an absorbing state and state $n$ is an isolated state. States $1, \ldots, n-1$ are transient states. Thus, given that the process starts in one of the transient states, it will eventually end up in state 0. The one-step transition matrix is
$$\mathbf{P} = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1r} & \cdots & p_{1,n-1} & p_{10} \\ p_{21} & p_{22} & \cdots & p_{2r} & \cdots & 0 & p_{20} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots \\ p_{r1} & p_{r2} & \cdots & p_{rr} & & \vdots & p_{r0} \\ \vdots & \vdots & & \vdots & & \vdots & \vdots \\ p_{n-1,1} & 0 & 0 & 0 & 0 & 0 & p_{n-1,0} \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad \text{where } r \approx \frac{n}{2}.$$

We are interested in the mean time to absorption. To study the absorbing chain, we have reordered the states as $S = \{1, 2, \ldots, n-1, 0\}$. We can now partition matrix $\mathbf{P}$ into the absorbing chain form (5.3), where
$$\mathbf{Q} = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1r} & \cdots & p_{1,n-1} \\ p_{21} & p_{22} & \cdots & p_{2r} & \cdots & 0 \\ \vdots & \vdots & & \vdots & & \vdots \\ p_{r1} & p_{r2} & \cdots & p_{rr} & & \vdots \\ \vdots & \vdots & & \vdots & & \vdots \\ p_{n-1,1} & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad \mathbf{R} = \begin{bmatrix} p_{10} \\ p_{20} \\ \vdots \\ p_{r0} \\ \vdots \\ p_{n-1,0} \end{bmatrix}, \quad \text{and } \mathbf{I} = 1.$$

Then, (5.4) can be used to compute the mean time to absorption, i.e., the length of the endemic process. For example, suppose $p = 0.05$ and $n = 5$; then given one initial infective, i.e., $I_0 = 1$, the mean time to absorption is $m_1 = 1.2$ time units; given two initial infectives, $m_2 = 1.3$; and for three initial infectives, $m_3 = 1.3$. The following table shows $m_1$ for different values of $n$:


Expected Length of the Endemic Process with One Initial Infective

  Population size $n$   Expected length $m_1$
  5                     1.2
  10                    1.6
  15                    2.3
  20                    3.8
  25                    8.7
  30                    45.2
  35                    839.6
  40                    52,385

Once the population size approaches around 40, the mean time to absorption jumps to a very long time. Thus, the infectious disease becomes endemic for population sizes of about 40 or larger. We can see this by examining the components of vector $\mathbf{R}$, $\{p_{i0}\}$, where $p_{i0} = q^{i(n-i)}$. When $n \to \infty$, then $p_{i0} \to 0$ for fixed $i$, and $\mathbf{R} \to \mathbf{0}$. Thus, the chain becomes recurrent, i.e., the disease becomes endemic, as $n$ gets large.
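The mean times in the table above follow from the fundamental-matrix calculation $\mathbf{m} = (\mathbf{I} - \mathbf{Q})^{-1}\mathbf{1}$. A sketch of that computation (ours, not from the notes; NumPy):

```python
import numpy as np
from math import comb

def mean_absorption_time(n, p):
    """Mean time to absorption of the endemic Reed-Frost chain (5.32),
    m = (I - Q)^{-1} 1, for each starting number of infectives 1..n-1."""
    q = 1 - p
    # Q restricted to the transient states {1, ..., n-1}; p_{ij} = 0 if i+j > n
    Q = np.zeros((n - 1, n - 1))
    for i in range(1, n):
        for j in range(1, n - i + 1):
            Q[i - 1, j - 1] = comb(n - i, j) * (1 - q ** i) ** j * q ** (i * (n - i - j))
    return np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))

m5 = mean_absorption_time(5, 0.05)
m10 = mean_absorption_time(10, 0.05)
```

With $p = 0.05$, `m5[0]` reproduces the value $m_1 \approx 1.2$ quoted in the text for $n = 5$, and the mean length grows with $n$.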

5.9 Exercises

Exercise 5.9.1 The Chapman-Kolmogorov equations (5.2) may be extended as follows:
$$p_{il}(n_1 + n_2 + n_3) = \sum_j \sum_k p_{ij}(n_1)\, p_{jk}(n_2)\, p_{kl}(n_3).$$
Verify this equation.

Exercise 5.9.2 Let $\{X_n\}_{n=0}^{\infty}$ be a homogeneous Markov chain.
a. Show that
$$P[X_m = j \mid X_{m+1} = i, X_{m+2} = i_2, \ldots, X_{m+n} = i_n] = P[X_m = j \mid X_{m+1} = i].$$
(Note that the $P[X_m = j \mid X_{m+1} = i]$ are the one-step transition probabilities for the "reverse" Markov chain.)
b. Let $b_{ij} = P[X_m = j \mid X_{m+1} = i]$ be the elements of the reverse one-step transition probability matrix $\mathbf{B}$. Find an expression for $b_{ij}$ in terms of $p_{ij}$ and $a_j(n)$ from the forward Markov chain.

Exercise 5.9.3 Consider a simple random walk. Prove that if $p \neq q$, then state 0 is transient (i.e., $f_{00} < 1$), and that if $p = \frac{1}{2}$, then state 0 is recurrent (i.e., $f_{00} = 1$). Use Theorem 5.1.31 and Stirling's formula to approximate the factorial term.


Exercise 5.9.4 Suppose we have a homogeneous, reducible Markov chain with $r$ states, one of which is an absorbing state, the others being communicating transient states. Let $N$ be the random variable for the number of steps to absorption. We showed in class that the pgf for $N$ is given by $\mathbf{f}(s) = (\mathbf{I} - s\mathbf{Q})^{-1} s\mathbf{R}$. Using $\mathbf{f}(s)$, show that
a. $\mathbf{M} = (\mathbf{I} - \mathbf{Q})^{-1}\mathbf{C}$
b. $\mathbf{M}^{(2)} = (\mathbf{I} - \mathbf{Q})^{-1}(2\mathbf{M} - \mathbf{C})$

Exercise 5.9.5 Suppose we have an endemic process for a group of size $n = 3$, $S = (0, 1, 2, 3)$.
a. Find the eigenvalues of the one-step transition matrix.
b. Write out $p_{11}(n)$, $p_{12}(n)$, and $p_{10}(n)$.

Exercise 5.9.6 Consider a staged disease process where an individual passes through stages $1, \ldots, k-1$ before reaching the absorbing state $k > 1$. The one-step transition probabilities are
$$p_{ij} = \begin{cases} p_i & \text{if } j = i+1, \\ q_i & \text{if } j = i, \\ 0 & \text{otherwise}, \end{cases}$$
with $p_i + q_i = 1$, $0 < p_i < 1$, $i = 1, 2, \ldots, k-1$, and $p_{kk} = 1$, $k > 1$.
a. Classify the states and the Markov chain.
b. Find the eigenvalues of the one-step transition probability matrix and interpret them.
c. Suppose $k = 3$; find $p_{11}(n)$, $p_{12}(n)$, and $p_{13}(n)$.

Exercise 5.9.7 The natural history of carcinoma of the cervix is as follows:

Dysplasia $\rightarrow$ Carcinoma in situ $\rightarrow$ Invasive cancer

Suppose a cohort of women with dysplasia are followed on a yearly basis for the rest of their lives. The following information is known about the yearly transitions between states:
i. 20% of those with dysplasia will convert to carcinoma in situ, while 80% remain dysplastic.
ii. 10% of those with carcinoma in situ will develop invasive cancer; 10% of those with carcinoma in situ will revert back to dysplasia; while the rest, i.e., 80%, maintain carcinoma in situ.
iii. All those with invasive cancer will remain in that state.
The progression of the women with dysplasia to other forms of the disease can be modeled as a homogeneous Markov process with three discrete states and in discrete time (one-year intervals).
a. Give the one-step transition probability matrix $\mathbf{P}$ for this process.
b. Find the eigenvalues of this system.


c. Give the mean number of years for a woman to develop invasive cancer from the dysplastic state.

Now assume that 80% of those women with invasive cancer will be successfully treated, and will return to the carcinoma in situ state in one year.

d. Classify the states of this Markov chain, and the chain itself.
e. On average, what proportion of the cohort will be in each of the three states after a long period of time?

Exercise 5.9.8 Consider the "endemic" process discussed in Section 5.5. That is, let $I_t$ be the number of infectives at time $t$, $n$ the population size, and $p = 1 - q$ the transmission probability. Then the deterministic system is
$$I_{t+1} = (n - I_t)\left(1 - q^{I_t}\right), \quad t = 1, 2, \ldots,$$
with $0 < I_0 < n$.
a. Describe the state space of the system.
b. Find the fixed points of the system.
c. Describe the behavior of the system as $t \to \infty$. Specifically, under what conditions does
$$\lim_{t\to\infty} I_t = 0 \quad \text{and} \quad \lim_{t\to\infty} I_t = I^* > 0?$$
Hint: Transform the system by $X_t = I_t/n$ and $\rho = -n \ln q$.


Chapter 6

CONTINUOUS-TIME MARKOV CHAINS

Suppose now that the stochastic process still assumes values on a discrete state space, e.g., $S = (0, 1, 2, \ldots)$, but that time is now continuous, i.e., $T = (t : 0 \leq t < \infty)$. Continuous-time processes of this type can be used to model birth and death type processes as well as infectious diseases. We first give the definition for the continuous-time analog of the discrete-time $n$-step transition probability.

Definition 6.0.1 Let the conditional probability that the process is in state $k$ at time $t$, given it is in state $i$ at time 0, be given by $p_{ik}(0, t) = P[X(t) = k \mid X(0) = i]$. For simplicity, we will use the notation $p_k(t) = p_{ik}(0, t)$.

6.1 Poisson process

Let $X(t)$ be the number of individuals alive at time $t$, with initial condition $X(0) = 0$. Then define
$$P[X(t+\Delta) = k \mid X(t) = j] = \begin{cases} \lambda\Delta + o(\Delta) & \text{if } k = j+1, \\ 1 - \lambda\Delta + o(\Delta) & \text{if } k = j, \\ o(\Delta) & \text{otherwise}, \end{cases}$$
where $o(\Delta)$ is defined such that $\lim_{\Delta\to 0} o(\Delta)/\Delta = 0$. It follows that
$$p_k(t+\Delta) = [1 - \lambda\Delta]\, p_k(t) + \lambda\Delta\, p_{k-1}(t) + o(\Delta) \quad \text{for } k \geq 1, \tag{6.1}$$
and
$$p_0(t+\Delta) = [1 - \lambda\Delta]\, p_0(t) + o(\Delta). \tag{6.2}$$

From equation (6.2) it follows that
$$\lim_{\Delta\to 0} \frac{p_0(t+\Delta) - p_0(t)}{\Delta} = -\lambda p_0(t),$$
implying
$$\frac{dp_0(t)}{dt} = -\lambda p_0(t),$$
and in turn
$$p_0(t) = e^{-\lambda t}.$$
Likewise, from equation (6.1) we would like to solve the differential equations
$$\frac{dp_k(t)}{dt} = -\lambda p_k(t) + \lambda p_{k-1}(t), \quad k = 1, 2, 3, \ldots$$
Note that
$$\frac{dp_1(t)}{dt} = -\lambda p_1(t) + \lambda p_0(t) = -\lambda p_1(t) + \lambda e^{-\lambda t},$$
from which one can show
$$p_1(t) = \lambda t\, e^{-\lambda t}.$$
Via induction, it follows that
$$p_k(t) = \frac{e^{-\lambda t}(\lambda t)^k}{k!}, \quad k = 0, 1, 2, \ldots$$

We now solve for the pgf of $X(t)$,
$$G_X(s, t) = \sum_{k=0}^{\infty} p_k(t)\, s^k.$$
Taking the derivative with respect to time $t$,
$$\begin{aligned}
\frac{\partial G_X(s,t)}{\partial t} &= \sum_{k=0}^{\infty} \frac{dp_k(t)}{dt}\, s^k \\
&= \sum_{k=0}^{\infty} -\lambda p_k(t)\, s^k + \sum_{k=1}^{\infty} \lambda p_{k-1}(t)\, s^k \\
&= -\lambda \sum_{k=0}^{\infty} p_k(t)\, s^k + \lambda s \sum_{k=1}^{\infty} p_{k-1}(t)\, s^{k-1} \\
&= -\lambda G_X(s,t) + \lambda s\, G_X(s,t) \\
&= -\lambda(1-s)\, G_X(s,t),
\end{aligned}$$
with initial condition $G_X(s, 0) = 1$. Integrating
$$\int \frac{\partial G_X(s,t)}{G_X(s,t)} = \int -\lambda(1-s)\, \partial t,$$
we arrive at
$$G_X(s,t) = e^{-\lambda(1-s)t},$$
which we recognize as the generating function of the Poisson distribution.


We may generalize the Poisson process by replacing $\lambda$ by $\lambda(t)$, in which case we no longer have a time-homogeneous Markov chain. Then
$$G_X(s,t) = \exp\left(-(1-s)\int_0^t \lambda(w)\, dw\right).$$
Often $\Lambda(t) = \int_0^t \lambda(w)\, dw$ is called the cumulative hazard function. In this case the following is easily shown:
$$p_k(t) = \frac{e^{-\Lambda(t)}\left[\Lambda(t)\right]^k}{k!},$$
with corresponding expectation
$$E[X(t)] = \Lambda(t).$$

6.2 Birth and death processes

A general birth and death process is defined as follows:
$$P[X(t+\Delta) = k \mid X(t) = j] = \begin{cases} \lambda_j\Delta + o(\Delta) & \text{if } k = j+1, \\ \mu_j\Delta + o(\Delta) & \text{if } k = j-1, \\ 1 - \lambda_j\Delta - \mu_j\Delta + o(\Delta) & \text{if } k = j, \\ o(\Delta) & \text{otherwise}. \end{cases} \tag{6.3}$$

Example 6.2.1 If we let $\lambda_j = \lambda$, $\mu_j = 0$, and $i = 0$, then we see that the Poisson process is a special case of the general birth and death process.

Equation (6.3) gives rise to the following differential equations:
$$\frac{dp_j(t)}{dt} = -\left(\lambda_j + \mu_j\right) p_j(t) + \lambda_{j-1}\, p_{j-1}(t) + \mu_{j+1}\, p_{j+1}(t) \quad \text{for } j > 0, \tag{6.4}$$
$$\frac{dp_0(t)}{dt} = -\left(\lambda_0 + \mu_0\right) p_0(t) + \mu_1 p_1(t),$$
with initial condition $p_i(0) = 1$.

To find the stationary distribution of a birth-death process, we take the limits of both sides of (6.4) to get
$$0 = -\left(\lambda_j + \mu_j\right)\pi_j + \lambda_{j-1}\pi_{j-1} + \mu_{j+1}\pi_{j+1},$$
with constraint
$$\sum_{j=0}^{\infty} \pi_j = 1,$$
which has a solution if and only if the process is ergodic.


6.2.1 Linear Birth Process

A pure linear birth process is a special case of (6.3) with $\mu_j = 0$ and $\lambda_j = j\lambda$ for $j = 0, 1, 2, \ldots$, which gives rise to
$$\frac{dp_j(t)}{dt} = -j\lambda\, p_j(t) + (j-1)\lambda\, p_{j-1}(t) \quad \text{for } j \geq 0.$$
We will attack this problem using pgf's. Let
$$G_X(s,t) = \sum_{j=0}^{\infty} p_j(t)\, s^j,$$
such that
$$\begin{aligned}
\frac{\partial G_X(s,t)}{\partial t} &= \sum_{j=0}^{\infty} \frac{dp_j(t)}{dt}\, s^j \\
&= -\lambda\sum_{j=0}^{\infty} j\, p_j(t)\, s^j + \lambda\sum_{j=1}^{\infty} (j-1)\, p_{j-1}(t)\, s^j \\
&= -\lambda s \sum_{j=0}^{\infty} j\, p_j(t)\, s^{j-1} + \lambda s^2 \sum_{j=1}^{\infty} (j-1)\, p_{j-1}(t)\, s^{j-2} \\
&= -\lambda s\, \frac{\partial G_X(s,t)}{\partial s} + \lambda s^2\, \frac{\partial G_X(s,t)}{\partial s}.
\end{aligned}$$
With the initial condition $G_X(s, 0) = s^i$, it can be shown that
$$G_X(s,t) = s^i\left[\frac{e^{-\lambda t}}{1 - s\left(1 - e^{-\lambda t}\right)}\right]^i,$$
which suggests that $X(t)$ is the sum of the constant $i$ and a negative binomial random variable with parameters $i$ and $e^{-\lambda t}$. That is,
$$p_j(t) = \binom{j-1}{j-i} e^{-\lambda t i}\left(1 - e^{-\lambda t}\right)^{j-i}, \quad j \geq i,$$
$$E[X(t)] = i e^{\lambda t},$$
$$\mathrm{Var}[X(t)] = i e^{\lambda t}\left(e^{\lambda t} - 1\right).$$
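The stated pmf and mean can be verified numerically. A small sketch (ours, not part of the notes; standard library only):

```python
import math

# Linear (Yule) birth process pmf:
# p_j(t) = C(j-1, j-i) e^{-lam*t*i} (1 - e^{-lam*t})^{j-i}, mean i e^{lam*t}
i, lam, t = 2, 0.5, 1.0
esc = math.exp(-lam * t)                     # e^{-lam*t}
pj = {j: math.comb(j - 1, j - i) * esc ** i * (1 - esc) ** (j - i)
      for j in range(i, 400)}                # truncate; tail is negligible here
total = sum(pj.values())
mean = sum(j * p for j, p in pj.items())
```

With $i = 2$, $\lambda t = 0.5$, the truncated sum is essentially 1 and the mean essentially $2e^{0.5} \approx 3.30$.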

6.2.2 Linear Death Process

A pure linear death process is a special case of (6.3) with $\mu_j = j\mu$ and $\lambda_j = 0$ for $j = 0, 1, 2, \ldots$ That is,
$$P[X(t+\Delta) = k \mid X(t) = j] = \begin{cases} j\mu\Delta + o(\Delta) & k = j-1, \\ 1 - j\mu\Delta + o(\Delta) & k = j, \\ o(\Delta) & \text{otherwise}. \end{cases}$$


It follows that
$$\frac{dp_j(t)}{dt} = -j\mu\, p_j(t) + (j+1)\mu\, p_{j+1}(t) \quad \text{for } j \geq 0,$$
with initial condition $p_i(0) = 1$. It can be shown that
$$\frac{\partial G_X(s,t)}{\partial t} = \mu(1-s)\, \frac{\partial G_X(s,t)}{\partial s}.$$
Using the initial condition $G_X(s, 0) = s^i$, it follows that
$$G_X(s,t) = \left(1 - e^{-\mu t} + e^{-\mu t}s\right)^i,$$
which we recognize as the pgf of a binomial distribution. Thus,
$$p_{ij}(t) = \binom{i}{j}\left(e^{-\mu t}\right)^j\left(1 - e^{-\mu t}\right)^{i-j}, \quad j = 0, 1, \ldots, i,$$
$$E[X(t)] = i e^{-\mu t},$$
$$\mathrm{Var}[X(t)] = i e^{-\mu t}\left(1 - e^{-\mu t}\right).$$
If we replace $\mu$ by $\mu(t)$, then the cumulative hazard function becomes
$$M(t) = \int_0^t \mu(w)\, dw,$$
and
$$p_j(t) = \binom{i}{j} e^{-M(t)j}\left(1 - e^{-M(t)}\right)^{i-j}, \quad j = 0, 1, \ldots, i.$$
When $i = 1$, the probability of surviving to time $t$ is $p_1(t)$, which is commonly known as the survival function
$$S(t) = e^{-M(t)}.$$

6.2.3 Linear Birth-Death Process

A linear birth-death process is also a special case of (6.3), with $\lambda_j = j\lambda$ and $\mu_j = j\mu$, such that
$$\frac{dp_j(t)}{dt} = -j(\lambda+\mu)\, p_j(t) + (j-1)\lambda\, p_{j-1}(t) + (j+1)\mu\, p_{j+1}(t) \quad \text{for } j \geq 1,$$
$$\frac{dp_0(t)}{dt} = \mu p_1(t),$$
which gives rise to
$$E[X(t)] = i e^{(\lambda-\mu)t}. \tag{6.5}$$


Note the underlying deterministic process $\frac{dX(t)}{X(t)} = (\lambda - \mu)\, dt$ yields $X(t) = i e^{(\lambda-\mu)t}$. Taking the limit as $t \to \infty$ of equation (6.5), we have
$$\lim_{t\to\infty} E[X(t)] = \begin{cases} \infty & \lambda > \mu, \\ i & \lambda = \mu, \\ 0 & \lambda < \mu. \end{cases}$$
Further, it can be shown that
$$\mathrm{Var}(X(t)) = \begin{cases} i\,\dfrac{\lambda+\mu}{\lambda-\mu}\, e^{(\lambda-\mu)t}\left(e^{(\lambda-\mu)t} - 1\right) & \text{if } \lambda \neq \mu, \\ 2i\lambda t & \text{if } \lambda = \mu. \end{cases} \tag{6.6}$$
Taking the limit as $t \to \infty$ of equation (6.6), we have
$$\lim_{t\to\infty} \mathrm{Var}(X(t)) = \begin{cases} \infty & \lambda \geq \mu, \\ 0 & \lambda < \mu. \end{cases}$$
Suppose we let $\rho = \frac{\lambda}{\mu}$ be the average number of offspring of an individual per lifetime. Then if $\rho \leq 1$, the extinction probability is one, whereas if $\rho > 1$, the probability of extinction is $\left(\frac{1}{\rho}\right)^i$.
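The extinction probability $(1/\rho)^i$ can be checked by simulation. For extinction it suffices to simulate the embedded jump chain, where each event is a birth with probability $\lambda/(\lambda+\mu)$ and a death otherwise. A Monte Carlo sketch (ours, not from the notes; the population cap is an assumption used to declare survival, since the residual extinction probability $(\mu/\lambda)^{\text{cap}}$ is negligible):

```python
import random

def extinction_frequency(lam, mu, i, n_sims, rng, cap=200):
    """Monte Carlo estimate of the extinction probability of a linear
    birth-death process started at X(0) = i, via the embedded jump chain."""
    p_birth = lam / (lam + mu)
    extinct = 0
    for _ in range(n_sims):
        x = i
        while 0 < x < cap:
            x += 1 if rng.random() < p_birth else -1
        extinct += (x == 0)       # paths that hit `cap` are counted as surviving
    return extinct / n_sims

rng = random.Random(2003)
freq = extinction_frequency(lam=2.0, mu=1.0, i=1, n_sims=5000, rng=rng)
# Theory: extinction probability (1/rho)^i = (mu/lam)^1 = 0.5
```

With $\rho = 2$ and $i = 1$, the simulated frequency is close to $0.5$.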

Example 6.2.2 Consider an S-I-R epidemic where $X(t)$, $Y(t)$, and $Z(t)$ are the numbers of susceptible, infected, and removed individuals in the population at time $t$, respectively. Let $X(t) + Y(t) + Z(t) = n + i$, $X(0) = n$, $Y(0) = i$, $Z(0) = 0$. Since the population size is fixed, this is a 2-dimensional stochastic process:
$$P[X(t+\Delta) = x-1,\, Y(t+\Delta) = y+1 \mid X(t) = x,\, Y(t) = y] = \beta X(t) Y(t)\Delta + o(\Delta),$$
$$P[Y(t+\Delta) = y-1,\, Z(t+\Delta) = z+1 \mid Y(t) = y,\, Z(t) = z] = \gamma Y(t)\Delta + o(\Delta),$$
where $\frac{1}{\gamma}$ is the average length of the infectious period. This system is intractable, but at the onset of the epidemic, when $X(t)$ is large and $Y(t)$ is small, we approximate the system by the linear birth and death theory developed earlier. Namely, let $\rho = \frac{\beta n}{\gamma}$, and again, if $\rho \leq 1$, the extinction probability is one, whereas if $\rho > 1$, the probability of extinction is $\left(\frac{1}{\rho}\right)^i$.

6.3 Kolmogorov differential equations

For any continuous-time Markov process, the Kolmogorov differential equations are given by
$$\frac{dp_k(t)}{dt} = \sum_{j\in S} p_j(t)\, v_{jk}, \tag{6.7}$$


with the restriction thatP

k2S vjk = 0 and vjk > 0 whenever j 6= k: Thus it followsimmediately that vjj = �

Pk2Sk 6=j

vjk: Therefore, we can write

P [X (t+�) = k j X (t) = j] =

8>>>><>>>>:vjk�+ o (�) if k 6= j

1�P

j 6=k vjk�+ o (�) if k = j

o (�) otherwise

Example 6.3.1 In the linear birth-death process,

    v_{jk} = jλ         if k = j+1,
             jμ         if k = j−1,
             −j(λ+μ)    if k = j,
             0          otherwise.

Definition 6.3.2 The infinitesimal generator V is given by

    V = [ v_00  v_01  ⋯ ]
        [ v_10  v_11     ]
        [  ⋮          ⋱  ].

Example 6.3.3 The infinitesimal generator of the linear birth-death process is

    V = [ 0     0        0        0   ⋯ ]
        [ μ   −(λ+μ)     λ        0     ]
        [ 0    2μ      −2(λ+μ)   2λ     ]
        [ 0    0                  ⋱     ].

Example 6.3.4 A time-homogeneous Poisson process has infinitesimal generator

    V = [ −λ   λ    0    0   ⋯ ]
        [  0  −λ    λ    0     ]
        [  0   0   −λ    λ     ]
        [  ⋮              ⋱    ].


6.4 Algebraic Treatment

Now suppose the state space is S = {1, 2, …, s}, and let P(t) be the s×s matrix with ij-th element p_ij(t). Similarly, let dP(t)/dt be the s×s matrix with ij-th element dp_ij(t)/dt. Note that P(0) = I, the identity matrix. The task at hand becomes solving

    dP(t)/dt = P(t)V,

which is the matrix equivalent of (6.7). We motivate the solution by first considering the one-dimensional analog:

    dp(t)/dt = v p(t),   p(0) = 1,

which has solution

    p(t) = e^{vt} = 1 + vt + (vt)²/2! + ⋯ .

This suggests that we let P(t) have the following expansion:

    P(t) = e^{Vt} = I + Vt + V²t²/2! + ⋯ + V^k t^k/k! + ⋯ = Σ_{n=0}^∞ Vⁿtⁿ/n!,     (6.8)

which can be shown to converge uniformly on bounded intervals of t provided that |v_ij| ≤ M for all i, j. Taking the derivative of P(t) with respect to t, we have

    dP(t)/dt = [ Σ_{n=1}^∞ V^{n−1}t^{n−1}/(n−1)! ] V = P(t)V.

We now take advantage of the expansion (6.8) to get a closed form for P(t). First, consider the eigenvalues λ_1, …, λ_s of V, such that

    Vⁿ = TΛⁿT^{−1},   where Λ = diag(λ_1, λ_2, …, λ_s)

and T is a matrix whose columns are the corresponding eigenvectors. Note that we assume that all the eigenvalues are


distinct, so that the inverse of T is known to exist. It follows that

    P(t) = e^{Vt} = Σ_{n=0}^∞ Vⁿtⁿ/n!
         = T [ Σ_{n=0}^∞ Λⁿtⁿ/n! ] T^{−1}
         = T diag(e^{λ_1 t}, e^{λ_2 t}, …, e^{λ_s t}) T^{−1}
         = T e^{Λ(t)} T^{−1},

where

    e^{Λ(t)} = diag(e^{λ_1 t}, e^{λ_2 t}, …, e^{λ_s t}).

It can be shown that the form P(t) = T e^{Λ(t)} T^{−1} gives rise to the following explicit solution:

    P_ij(t) = Σ_{l=1}^s A_ji(λ_l) e^{λ_l t} / Π_{m=1, m≠l}^s (λ_l − λ_m),     (6.9)

the continuous-time analog of equation (5.7). Further, it can be shown that the limiting transition probabilities are given by

    lim_{t→∞} P_ij(t) = π_j,

where the limiting distribution π = [π_1, …, π_s] satisfies 0 = πV.
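To make the series (6.8) and the spectral form concrete, here is a small sketch (my addition; the 3-state generator below is an arbitrary example whose eigenvalues are assumed distinct, as the derivation requires). It evaluates e^{Vt} both by truncating the series and via T e^{Λt} T^{−1}, and the two should agree.

```python
import numpy as np

# Arbitrary 3-state generator: rows sum to zero, off-diagonals nonnegative
V = np.array([[-1.0, 0.7, 0.3],
              [0.4, -0.9, 0.5],
              [0.2, 0.6, -0.8]])
t = 1.3

# Spectral form P(t) = T e^{Lambda t} T^{-1}
w, T = np.linalg.eig(V)
P_spec = T @ np.diag(np.exp(w * t)) @ np.linalg.inv(T)

# Truncated series  sum_n V^n t^n / n!
P_ser, term = np.eye(3), np.eye(3)
for n in range(1, 40):
    term = term @ (V * t) / n
    P_ser = P_ser + term
```

Each row of the resulting transition matrix sums to one, as it must for a stochastic matrix.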

Example 6.4.1 Consider the two-state continuous-time Markov process with infinitesimal generator

    V = [ −λ    λ ]
        [  μ   −μ ].

Then the characteristic matrix

    A(x) = [ x+λ   −λ  ]
           [ −μ    x+μ ]

gives rise to eigenvalues λ_1 = 0 and λ_2 = −(λ+μ). Then

    A(0) = [  λ  −λ ],   A⁺(0) = [ μ  λ ],
           [ −μ   μ ]            [ μ  λ ]

whose columns are proportional to [1, 1]^T, an eigenvector for the eigenvalue λ_1 = 0. Similarly,

    A(−(λ+μ)) = [ −μ  −λ ],   A⁺(−(λ+μ)) = [ −λ   λ ],
                [ −μ  −λ ]                  [  μ  −μ ]

whose columns are proportional to [λ, −μ]^T, an eigenvector for λ_2 = −(λ+μ). Thus

    T = [ 1    λ ]   and   T^{−1} = (1/(λ+μ)) [ μ   λ ],
        [ 1   −μ ]                            [ 1  −1 ]

and

    P(t) = T [ 1   0            ] T^{−1}
             [ 0   e^{−(λ+μ)t} ]

         = (1/(λ+μ)) [ μ + λe^{−(λ+μ)t}     λ(1 − e^{−(λ+μ)t}) ]
                     [ μ(1 − e^{−(λ+μ)t})   λ + μe^{−(λ+μ)t}   ],

implying

    lim_{t→∞} P(t) = (1/(λ+μ)) [ μ  λ ]
                                [ μ  λ ].
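A hedged numerical check of the closed form (my addition; the values of λ, μ, and t are arbitrary): compare the displayed P(t) against a direct series evaluation of e^{Vt}.

```python
import numpy as np

lam, mu, t = 1.5, 0.5, 2.0
V = np.array([[-lam, lam], [mu, -mu]])

# Closed form from the example
e = np.exp(-(lam + mu) * t)
P_closed = np.array([[mu + lam * e, lam * (1 - e)],
                     [mu * (1 - e), lam + mu * e]]) / (lam + mu)

# Independent check: series evaluation of e^{Vt}
P_ser, term = np.eye(2), np.eye(2)
for n in range(1, 60):
    term = term @ (V * t) / n
    P_ser = P_ser + term
```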

6.5 Mean time to absorption

We wish to investigate the mean time to absorption in continuous-time Markov chains. Again we motivate with the one-dimensional case. Let T be the random variable for the time to absorption, with exponential density f(t) = μe^{−μt} and corresponding moment generating function

    m(s) = μ ∫_0^∞ e^{−(μ−s)t} dt = μ/(μ − s) = μ(μ − s)^{−1}.

The rth moment of T is then found by evaluating

    d^r m(s)/ds^r |_{s=0} = r! μ^{−r}.

It follows that the mean time to absorption is

    E(T) = μ^{−1}.

Now suppose there are s transient states and one (super) absorbing state, so that the infinitesimal generator V is an (s+1) × (s+1) matrix which can be partitioned as follows:

    V = [ Q  R ]
        [ 0  0 ],

where Q is the s×s infinitesimal generator among the transient states. Define the s×1 vector f(t) = [f_1(t), f_2(t), …, f_s(t)]^T, where f_i(t) is the time-to-absorption density


from state i, so that f(t) = e^{Qt}R. Then the matrix moment generating function is given by

    M(s) = ∫_0^∞ e^{st} f(t) dt = [ ∫_0^∞ e^{st} e^{Qt} dt ] R = −(Q + sI)^{−1}R,

from which we can obtain the rth moment

    M_r = (−1)^r r! Q^{−r} C,   and in particular   M_1 = −Q^{−1}C,

where C is the s×1 vector of ones.

Example 6.5.1 Suppose we have s = 2 transient states and one absorbing state, and the infinitesimal generator is given by

    V = [ −(λ_1+μ)    λ_1    μ  ]
        [     0      −λ_2   λ_2 ]
        [     0        0     0  ].

Then

    Q^{−1} = (1/((λ_1+μ)λ_2)) [ −λ_2     −λ_1     ]
                              [   0    −(λ_1+μ)   ],

implying that the mean time to absorption is

    M_1 = −Q^{−1}C = [ (λ_1+λ_2)/(λ_2(λ_1+μ)) ]
                     [         1/λ_2          ].
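The formula M_1 = −Q^{−1}C is easy to verify numerically; the sketch below (my addition, with arbitrary rates) solves the linear system rather than inverting Q explicitly.

```python
import numpy as np

lam1, lam2, mu = 1.0, 0.5, 0.25    # arbitrary rates
Q = np.array([[-(lam1 + mu), lam1],
              [0.0, -lam2]])        # generator among the transient states
C = np.ones(2)

M1 = -np.linalg.solve(Q, C)         # mean absorption times, M1 = -Q^{-1}C

# Closed form from the example
m1 = (lam1 + lam2) / (lam2 * (lam1 + mu))
m2 = 1.0 / lam2
```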

Next we investigate the probability of absorption given that we start in state i. Our derivation involves embedded Markov chains.

Definition 6.5.2 A stochastic process {X(t)} is right continuous at t if there exists δ > 0 such that X(s) = X(t) for all t ≤ s ≤ t + δ.

Next, suppose we are in state i at some time, and let T_i be the sojourn time in state i. Since

    p_ii(Δ) = 1 + v_ii Δ + o(Δ),

we have

    S_i(t) = P[T_i > t] = lim_{Δ→0, n→∞} [p_ii(Δ)]^n
           = lim_{n→∞} [1 + v_ii (t/n) + o(t/n)]^n = e^{v_ii t},

where Δ = t/n. Then the probability that, when a jump is made, we go to state j is given by

    (v_ij Δ + o(Δ)) / (−v_ii Δ + o(Δ)),   where −v_ii = Σ_{j≠i} v_ij.

We then let

    p_ij = lim_{Δ→0} [v_ij + o(Δ)/Δ] / [−v_ii + o(Δ)/Δ] = v_ij / (−v_ii).

Thus, examining a continuous-time Markov process only at its transition times, we observe the discrete, embedded process defined by P = {p_ij}.

Example 6.5.3 Let {X(t)} be a continuous-time birth-death process. Then {Y_n}, the discrete embedded process, is defined by

    P = [ 1     0    0     0   ⋯ ]
        [ q_1   0   p_1    0     ]
        [ 0    q_2   0    p_2    ]
        [ ⋮                 ⋱    ],

where p_k = p_{k,k+1} = λ_k/(λ_k+μ_k) and q_k = p_{k,k−1} = μ_k/(λ_k+μ_k). If we let ω_i be the probability of absorption given that we start in state i, then

    ω_i = [λ_i/(λ_i+μ_i)] ω_{i+1} + [μ_i/(λ_i+μ_i)] ω_{i−1},   ω_0 = 1.

Furthermore, if we assume a linear birth-death process, i.e. p_k = λ/(λ+μ) and q_k = μ/(λ+μ), then the embedded process is very similar to the gambler's ruin scenario. We have

    ω_i = [λ/(λ+μ)] ω_{i+1} + [μ/(λ+μ)] ω_{i−1},   ω_0 = 1.

Letting θ = μ/λ,

    ω_{i+2} − (1+θ)ω_{i+1} + θω_i = 0,

which gives rise to

    ω_i = (μ/λ)^i   if λ > μ,
          1         if λ ≤ μ.

Now if m_i is the mean time to absorption from state i and p_ij is the one-step transition probability from the embedded Markov process, then

    m_i = Σ_{j=1}^s p_ij m_j − v_ii^{−1},   m_0 = 0,

where we assume that the state space is finite, i.e. S = {1, 2, …, s}.


Example 6.5.4 Revisiting the general birth-death process,

    m_i = [λ_i/(λ_i+μ_i)] m_{i+1} + [μ_i/(λ_i+μ_i)] m_{i−1} + 1/(λ_i+μ_i).

Now let τ = [−v_11^{−1}, …, −v_ss^{−1}]^T, M = [m_1, m_2, …, m_s]^T, and let Q be the one-step transition probability matrix of the embedded process among the transient states. Then

    IM = τ + QM,

implying

    M = (I − Q)^{−1}τ.
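The two routes to the mean absorption times — the continuous-time formula M_1 = −Q^{−1}C from earlier in this section and the embedded-chain formula M = (I − Q)^{−1}τ — can be cross-checked on a small finite birth-death chain. The example below is my own (state 0 absorbing, rates arbitrary, zero birth rate at the top state).

```python
import numpy as np

states = [1, 2, 3, 4]                       # transient states; 0 is absorbing
lam = {1: 1.0, 2: 1.0, 3: 1.0, 4: 0.0}      # birth rates (0 at the top state)
mu = {1: 2.0, 2: 2.0, 3: 2.0, 4: 2.0}       # death rates

Qc = np.zeros((4, 4))                       # continuous-time generator among transients
for a, i in enumerate(states):
    Qc[a, a] = -(lam[i] + mu[i])
    if a + 1 < 4:
        Qc[a, a + 1] = lam[i]
    if a - 1 >= 0:
        Qc[a, a - 1] = mu[i]                # death from state 1 leaks to absorbing 0

m_direct = -np.linalg.solve(Qc, np.ones(4))  # M1 = -Q^{-1}C

# Embedded-chain route: m = tau + Q_embedded m  =>  M = (I - Q)^{-1} tau
tau = np.array([1.0 / (lam[i] + mu[i]) for i in states])
Qe = np.zeros((4, 4))
for a in range(4):
    for b in range(4):
        if a != b:
            Qe[a, b] = Qc[a, b] / (-Qc[a, a])
m_embedded = np.linalg.solve(np.eye(4) - Qe, tau)
```

Both vectors should coincide, since transitions into the absorbing state contribute m_0 = 0 and therefore drop out of the embedded-chain recursion.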

6.6 Inference

In this section we introduce likelihood-based inference for time-homogeneous Markov chains with a discrete state space and a continuous index set. Generally, one can perform inference on a single observed sequence over time or on many observed sequences over time.

6.6.1 Inference on a single sequence

Assume that we plan to estimate a parameter vector θ whose dimension is less than the number of states. Let {X_(1), X_(2), …, X_(k)} be the observed successive states that the system passes through, with X_(k) ≠ X_(k+1), where the subscript (k) indexes the order in which the states are visited. We let T_k be the sojourn time in the kth state visited. We now have a bivariate discrete-index process {(X_(k), T_k); k = 1, 2, …}, which is a Markov process on S × [0, ∞). The transition probabilities for this process are

    P[X_(k+1) = j, T_{k+1} > t | X_(k) = i, T_k = u] = [v_ij/(−v_ii)] e^{v_jj t},

with densities

    lim_{Δ→0⁺} P[X_(k+1) = j, t < T_{k+1} ≤ t+Δ | X_(k) = i, T_k = u]/Δ = (v_ij/v_ii) v_jj e^{v_jj t},

    lim_{Δ→0⁺} P[t ≤ T_{k+1} < t+Δ | X_(k+1) = j]/Δ = −v_jj e^{v_jj t},

and survival functions

    P[T_k > t | X_(k) = j] = e^{v_jj t}.

Example 6.6.1 Suppose we observe the following sequence of (state, sojourn time) pairs: (1, t_1), (3, t_2), (2, t_3), (3, t_4), (2, t_5*), where the asterisk indicates that the final sojourn time is censored. Then the likelihood is given by

    L(θ) = c [v_13 e^{v_11 t_1}] [v_32 e^{v_33 t_2}] [v_23 e^{v_22 t_3}] [v_32 e^{v_33 t_4}] [e^{v_22 t_5}]
         = c v_13 v_32 v_23 v_32 e^{v_11 t_1} e^{v_22(t_3+t_5)} e^{v_33(t_2+t_4)},

where each completed sojourn in state i followed by a jump to state j contributes (−v_ii e^{v_ii t})[v_ij/(−v_ii)] = v_ij e^{v_ii t}, and the censored final sojourn contributes the survival term e^{v_22 t_5}.


In general, let n_ij be the observed number of transitions from state i to state j, and let τ_i be the total time spent in state i, where Σ_{i∈S} τ_i = t. Then the likelihood function for θ is

    L(θ) = c [ Π_{i≠j} v_ij^{n_ij} ] exp( Σ_{i∈S} v_ii τ_i ).

If the Markov process is ergodic, then

    E(τ_i/t) → π_i      as t → ∞,
    E(n_ij/t) → π_i v_ij   as t → ∞.

6.6.2 Inference on birth and death processes

We define the elements of the infinitesimal generator as

    v_ij = λ_i          if j = i+1,
           μ_i          if j = i−1,
           −(λ_i+μ_i)   if j = i,
           0            otherwise,      i = 0, 1, …,

where μ_0 = 0. The observations take the form b_i = n_{i,i+1}, the total number of births observed from state i; d_i = n_{i,i−1}, the total number of deaths observed from state i; and τ_i, the total time spent in state i. Then the likelihood function for λ = (λ_0, λ_1, …) and μ = (μ_1, μ_2, …) is

    L(λ, μ) = [ Π_{i=0}^∞ λ_i^{b_i} ] [ Π_{i=1}^∞ μ_i^{d_i} ] exp{ −Σ_{i=1}^∞ (λ_i+μ_i)τ_i − λ_0τ_0 }.

The score functions are

    ∂ln L/∂λ_i = b_i/λ_i − τ_i,   ∂ln L/∂μ_i = d_i/μ_i − τ_i,     (6.10)

resulting in the MLEs λ̂_i = b_i/τ_i, i = 0, 1, …, and μ̂_i = d_i/τ_i, i = 1, 2, …. The second partials are

    ∂²ln L/∂λ_i² = −b_i/λ_i²,   ∂²ln L/∂μ_i² = −d_i/μ_i²,   ∂²ln L/∂λ_i∂μ_i = 0.     (6.11)

Example 6.6.2 Simple immigration-emigration process: in this case λ_k = λ and μ_k = kμ, k = 0, 1, …. Using (6.10), the MLEs are λ̂ = b/t and μ̂ = d/Σ_{i=1}^∞ iτ_i, where b = Σ_i b_i and d = Σ_i d_i. Note that Σ_{i=1}^∞ iτ_i is the total person-time under observation in all the non-zero states. From the expected values of (6.11), the variances of the estimators based on Fisher's information are var(λ̂) = λ/t and var(μ̂) = μ/(ρt), where ρ = λ/μ. The form of var(λ̂) comes from the fact that the births part of the process is a pure birth process. The form of var(μ̂) comes from the fact that we have an ergodic process, where ρ is the average number of people in the system at equilibrium (see Exercise 6.8.3).
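As an illustration of these MLEs (a simulation sketch of my own; λ = 3, μ = 1, and the observation window are arbitrary), one can simulate the immigration-emigration process and recover λ̂ = b/t and μ̂ = d/Σ_i iτ_i:

```python
import random

random.seed(7)
lam, mu, t_end = 3.0, 1.0, 2000.0    # arbitrary true rates and window
x, t = 0, 0.0
births = deaths = 0
person_time = 0.0                     # accumulates sum_i i * tau_i

while True:
    rate = lam + x * mu               # total event rate in state x
    dt = random.expovariate(rate)
    if t + dt >= t_end:
        person_time += x * (t_end - t)
        break
    person_time += x * dt
    t += dt
    if random.random() < lam / rate:  # next event is a birth
        x += 1
        births += 1
    else:                             # next event is a death
        x -= 1
        deaths += 1

lam_hat = births / t_end              # MLE  b / t
mu_hat = deaths / person_time         # MLE  d / sum_i i * tau_i
```

Both estimates should land near the true rates, with spreads consistent with var(λ̂) = λ/t and var(μ̂) = μ/(ρt).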


6.7 HIV-progression models

Longini, et al. [18] modelled the progression of an infected individual through the stages of infection and ultimately death as a time-homogeneous Markov process in which stages 1 to 4 are transient states and stage 5 (death) is an absorbing state. Thus we have the following infinitesimal generator:

    V = [ −λ_1   λ_1    0     0     0  ]
        [   0   −λ_2   λ_2    0     0  ]
        [   0     0   −λ_3   λ_3    0  ]
        [   0     0     0   −λ_4   λ_4 ]
        [   0     0     0     0     0  ],

with eigenvalues −λ_1, −λ_2, −λ_3, −λ_4, and 0. It follows from equation (6.9) that the transition probabilities among the transient states are given by

    p_ik(t) = (−1)^{k−i} λ_i ⋯ λ_{k−1} Σ_{j=i}^k e^{−λ_j t} / Π_{l=i, l≠j}^k (λ_j − λ_l),   1 ≤ i ≤ k ≤ 4.

The transition probabilities from a transient state i to the absorbing state (death) are

    p_i5(t) = (−1)^{4−i} λ_i ⋯ λ_4 Σ_{j=i}^4 (1 − e^{−λ_j t}) / [ λ_j Π_{l=i, l≠j}^4 (λ_j − λ_l) ],   i = 1, 2, 3, 4.

See Longini, et al. [20][19][17][21] for further use of continuous-time Markov models for HIV progression and prediction.
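The staged-progression transition probabilities can be checked against a direct evaluation of e^{Vt} (my sketch; the rates below are hypothetical, chosen distinct as the formula requires):

```python
import numpy as np
from math import exp

lam = [0.9, 0.6, 0.4, 0.2]       # hypothetical stage-progression rates (distinct)
V = np.zeros((5, 5))
for i in range(4):
    V[i, i], V[i, i + 1] = -lam[i], lam[i]
t = 3.0

# Series evaluation of P(t) = e^{Vt}
P, term = np.eye(5), np.eye(5)
for n in range(1, 60):
    term = term @ (V * t) / n
    P = P + term

def p_ik(i, k, t):
    """Closed-form p_ik(t) among transient stages (1-based, i <= k <= 4)."""
    pref = (-1.0) ** (k - i)
    for l in range(i, k):                    # lambda_i ... lambda_{k-1}
        pref *= lam[l - 1]
    tot = 0.0
    for j in range(i, k + 1):
        den = 1.0
        for l in range(i, k + 1):
            if l != j:
                den *= lam[j - 1] - lam[l - 1]
        tot += exp(-lam[j - 1] * t) / den
    return pref * tot
```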

6.8 Exercises

Exercise 6.8.1 Pure death with annihilation process. For a time-homogeneous pure death process with death rate μ_k = kμ, k = 0, 1, …, suppose that the entire population may be subject independently at any time t to sudden annihilation. Let vΔ + o(Δ) be the probability that a population of any size will be annihilated in the time interval [t, t+Δ). Also, let p_i(0) = 1, i > 0.

a. Construct the state diagram for the process.
b. Classify the states and the Markov process.
c. Write out the system of differential-difference equations for p_k(t), k = 0, 1, ….
d. Solve the system for p_k(t). (Hint: use induction.)
e. Find the survival distribution, S_T(t) = 1 − F_T(t), for the population.

Note: An interesting application of this problem is as follows. A convoy of i ships is carrying supplies to a given destination in time of war. If convoy protection is ineffective, enemy attacks on the convoy may cause attrition or even total annihilation. Analysis of WWII data has shown that effective convoy protection would result in an attrition rate which is independent of the size of the convoy at a given time.

Exercise 6.8.2 Consider a time-dependent pure death process where μ_k(t) = kμ(t), k = 0, 1, …, i. Also define p_{0l} = exp(−∫_0^{t_l} μ(τ)dτ) and p_l = exp(−∫_{t_l}^{t_{l+1}} μ(τ)dτ).

a. Give a physical interpretation of p_{0i} and p_i, and show that p_{0,l+1} = p_{0l} p_l.
b. Consider a sequence of random variables X(t_1), …, X(t_ω) defined at the points 0 ≤ t_1 < t_2 < ⋯ < t_ω. Derive the joint pmf for X(t_1) = k_1, …, X(t_ω) = k_ω. (Assume P[X(t_0) = k_0] = 1.)
c. Now suppose that X(t_1) = k_1, …, X(t_ω) = k_ω is a set of observations made at points 0 ≤ t_1 < t_2 < ⋯ < t_ω with k_0 ≥ k_1 ≥ ⋯ ≥ k_ω. Give the likelihood function.
d. Find the maximum likelihood estimators p̂_0, p̂_1, …, p̂_{ω−1} of the probabilities p_0, p_1, …, p_{ω−1}. (Assume k_{ω−1} > 0.)
e. Give the asymptotic variances for the MLEs p̂_0, p̂_1, …, p̂_{ω−1}; also give the asymptotic variance-covariance matrix.

Exercise 6.8.3 Consider the special case of the general birth-death process where λ_k(t) = λ and μ_k(t) = kμ, k = 0, 1, …, given in Chiang, pages 229-233.

a. Construct the state diagram for the process.
b. Classify the states and identify the type of Markov chain.
c. Solve the P.D.E. (6.9) and verify that (6.17) is correct.
d. Find the stationary distribution {π_0, π_1, …}, if one exists.

Chapter 7

COUNTING PROCESSES

Let N(t) be the cumulative number of events by time t. Consider {N(t)}_{t=0}^∞, where S = {0, 1, 2, …} and N(0) = 0. Let F_t be the history of the process generated by N(t) up to time t, so that F_t is a right-continuous σ-field. Note that for τ < t, F_τ ⊆ F_t. Now

    P[dN(t) = 1 | F_t] = A(t)dt,
    P[dN(t) = 0 | F_t] = 1 − A(t)dt,

where A(t) is the intensity of the process.

Example 7.0.1 If A(t) = λ(t) (i.e., the intensity does not depend on F_t), then we have an ordinary Poisson process:

    P[dN(t) = 1 | N(t)] = λ(t)dt,
    P[dN(t) = 0 | N(t)] = 1 − λ(t)dt.

Example 7.0.2 S-I-R epidemic. Let N(t) = S(0) − S(t), F_t = σ(S(τ), I(τ): 0 ≤ τ ≤ t), and A(t) = βS(t)I(t), so that:

    P[dN(t) = 1 | S(t), I(t)] = βS(t)I(t)dt,
    P[dN(t) = 0 | S(t), I(t)] = 1 − βS(t)I(t)dt.

Example 7.0.3 Linear death process. Let X(t) = X(0) − N(t) and A(t) = X(t)μ(t).

7.1 Continuous Time Martingales

We would like to use martingale machinery. First note that

    E[dN(t) | F_t] = A(t)dt

and

    E[N(t)] = E[ ∫_0^t A(τ)dτ ],

where ∫_0^t A(τ)dτ is called the compensator. Similarly, one can show that

    Var[dN(t) | F_t] = A(t)dt[1 − A(t)dt] ≈ A(t)dt


and

    Var[N(t)] = E[ ∫_0^t A(τ)dτ ].

Define M(t) = N(t) − ∫_0^t A(τ)dτ, with M(0) = 0. Then it follows that

    E[dM(t) | F_t] = E[dN(t) | F_t] − A(t)dt = 0,

and it can also be shown that

    E[M(t)] = 0

and that

    Var[M(t)] ≈ E[ ∫_0^t A(τ)dτ ].

Definition 7.1.1 {M(t)}_{t=0}^∞ is a martingale with respect to F_t if

    i.  E[|M(t)|] < ∞,
    ii. E[M(τ) | F_t] = M(t) for all τ ≥ t.

Furthermore, if M(0) = 0, so that E[M(t)] = M(0) = 0, then M(t) is a zero mean martingale (ZMM).

Theorem 7.1.2 Martingale Convergence Theorem (MCT): M(t) →a.s. M, a limiting random variable.

Theorem 7.1.3 Zero Mean Martingale Central Limit Theorem (ZMMCLT): as t → ∞,

    M(t)/√Var(M(t)) →d N(0, 1).

Example 7.1.4 Time-homogeneous Poisson process. We have M(t) = N(t) − λt and Var[M(t)] = E[N(t)] = λt. It follows that M(t)/t →a.s. 0, or equivalently

    (N(t)/t − λ) →a.s. 0.

Therefore

    λ̂(t) = N(t)/t   and   λ̂ →a.s. λ.

From the ZMMCLT,

    M(t)/√(λt) = (N(t) − λt)/√(λt) = (λ̂(t) − λ)/√(λ/t) ∼ N(0, 1),

and thus

    Var(λ̂) ≈ λ/t.


Definition 7.1.5 A process {B(t)}_{t=0}^∞ is predictable if B(t) is determined by F_{t−}. Note: a left-continuous adapted process is predictable.

Define

    M*(t) = ∫_0^t B(τ)dM(τ) = ∫_0^t B(τ)dN(τ) − ∫_0^t B(τ)A(τ)dτ,

so that dM*(t) = B(t)dM(t). It follows that

    E[dM*(t) | F_{t−}] = E[B(t)dM(t) | F_{t−}] = B(t)E[dM(t) | F_{t−}] = 0,

so that both dM*(t) and M*(t) are ZMM. Finally, note that

    Var[M*(t)] = E[ ∫_0^t B(τ)² dN(τ) ].     (7.1)

Example 7.1.6 Linear pure death process. Start with a cohort of n individuals. Let X(t) be the number alive at time t and N(t) be the cumulative number of deaths by time t, so that X(t) = n − N(t). Let A(t) = X(t)μ(t) be the intensity of N(t). We want to estimate Λ(t) = ∫_0^t μ(τ)dτ, the cumulative hazard function, and in turn S(t) = e^{−Λ(t)}. Let

    M(t) = N(t) − ∫_0^t X(τ)μ(τ)dτ

and

    J(τ) = 1   if X(τ) > 0,
           0   if X(τ) = 0.

Then B(τ) = J(τ)/X(τ−) is a predictable process. Define

    M*(t) = ∫_0^t B(τ)dM(τ) = ∫_0^t [1/X(τ−)]dN(τ) − ∫_0^t μ(τ)dτ   if X(t) > 0
          = ∫_0^t [1/X(τ−)]dN(τ) − Λ(t).

Since M*(t) is a ZMM, this suggests the estimator

    Λ̂(t) = ∫_0^t [1/X(τ−)]dN(τ),

the Nelson-Aalen estimator:

    Λ̂(t) = 1/n + 1/(n−1) + ⋯ + 1/(n−N(t)+1) ≈ ln( n/(n−N(t)) ) = ln( n/X(t) ),


so that

    −Λ̂(t) = ln( X(t)/n ) = ln( Ŝ(t) ),   Ŝ(t) = e^{−Λ̂(t)}.

From (7.1), the variance of M*(t) is

    Var[M*(t)] = E[ 1/n² + 1/(n−1)² + ⋯ + 1/(n−N(t)+1)² ]     (7.2)
               ≈ E[ 1/(n−N(t)) − 1/n ] = E[ 1/X(t) − 1/n ] = E[ (1 − X(t)/n)/X(t) ].

Since Ŝ(t) = X(t)/n, then

    Var[M*(t)] ≈ (1 − Ŝ(t))/X(t),

and by the ZMMCLT,

    (Λ̂(t) − Λ(t))/√Var[M*(t)] ∼ N(0, 1).
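A small simulation sketch (my addition; the cohort size, hazard rate, and evaluation time are arbitrary) illustrates the Nelson-Aalen estimator and its logarithmic approximation on exponential survival data:

```python
import math
import random

random.seed(3)
n, mu, t0 = 400, 0.5, 1.0
death_times = sorted(random.expovariate(mu) for _ in range(n))

# Nelson-Aalen: add 1/X(tau-) at each death time up to t0
na, at_risk = 0.0, n
for d in death_times:
    if d > t0:
        break
    na += 1.0 / at_risk
    at_risk -= 1

log_approx = math.log(n / at_risk)    # ln(n / X(t)) approximation
true_cum_hazard = mu * t0             # Lambda(t0) for constant hazard mu
```

The estimate, its logarithmic approximation, and the true cumulative hazard μt₀ should all be close for a cohort of this size.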

Example 7.1.7 Linear birth-death process.

    P[dX(t) = 1 | F_t] = λX(t)dt + o(dt),
    P[dX(t) = −1 | F_t] = μX(t)dt + o(dt),
    P[dX(t) = 0 | F_t] = 1 − (λ+μ)X(t)dt + o(dt).

Then E[dX(t) | F_t] = (λ−μ)X(t)dt and

    E[X(t)] = X(0)e^{(λ−μ)t},

which is similar to the underlying deterministic system:

    dX(t)/dt = (λ−μ)X(t),   X(t) = X(0)e^{(λ−μ)t}.


7.2 Inference on continuous-time epidemics

Recall the S-I-R stochastic process {S(t), I(t)}. Let n = S(t) + I(t) + R(t) and F_t = σ{S(τ), I(τ): 0 ≤ τ ≤ t}, with

    P[dS(t) = −1, dI(t) = +1 | F_t] = βI(t)S(t)dt + o(dt),
    P[dS(t) = 0, dI(t) = −1 | F_t] = γI(t)dt + o(dt),
    P[dS(t) = 0, dI(t) = 0 | F_t] = 1 − [βS(t) + γ]I(t)dt + o(dt),

with β = cπ/n, where c is the number of contacts per unit time for an individual and π is the transmission probability per contact.

Counting process. Let N(t) = n − S(t) be the cumulative number infected and R(t) be the cumulative number recovered. Now F_t = σ{N(τ), R(τ): 0 ≤ τ ≤ t}, and

    P[dN(t) = 1, dR(t) = 0 | F_t] = βI(t)S(t)dt + o(dt),
    P[dN(t) = 0, dR(t) = 1 | F_t] = γI(t)dt + o(dt).

Let

    M_1(t) = N(t) − β ∫_0^t I(τ)S(τ)dτ,

    M_2(t) = R(t) − γ ∫_0^t I(τ)dτ,

    J(t) = 1 when S(t) > 0,  0 when S(t) = 0,     (7.3)

    B(t) = J(t−)/S(t−).     (7.4a)

Then let

    M*_1(t) = ∫_0^t B(τ)dM_1(τ) = ∫_0^t B(τ)dN(τ) − β ∫_0^t B(τ)I(τ)S(τ)dτ
            = ∫_0^t B(τ)dN(τ) − β ∫_0^t J(τ)I(τ)dτ,

where we evaluate

    ∫_0^t B(τ)dN(τ) = 1/n + 1/(n−1) + ⋯ + 1/(S(t)+1) ≈ −ln[1 − AR(t−)],     (7.5)

with AR(t) = N(t)/n the attack rate by time t.


If we let α = β/γ and define

    M(t) = M*_1(t) − αM_2(t)     (7.6)
         = ∫_0^t B(τ)dN(τ) − αR(t) + β ∫_0^t I(τ)[1 − J(τ)]dτ,

then define the stopping time

    T = inf{ t ≥ 0 : S(t)[N(t) − R(t)] = 0 }.

Then

    ∫_0^T I(τ)[1 − J(τ)]dτ = 0,

implying

    M(T) = −ln[1 − AR(T)] − αR(T).

Setting M(T) = 0 yields the estimator

    α̂ = −ln[1 − AR(T)] / R(T).

Since α = cπ/(γn) = θ/n, where θ = cπ/γ, and R(T) ≈ nAR(T) for the stopped process, we obtain

    θ̂ = nα̂ = −ln[1 − AR(T)] / AR(T).

Taking the variance of (7.6), we have

    Var[M(t)] = Var[M*_1(t)] + α²Var[M_2(t)]
              = E[ ∫_0^t B(τ)²dN(τ) ] + α²E[R(t)].

From (7.2) we have

    Var[M(t)] ≈ E[ AR(t)/S(t) ] + α²E[R(t)].

Then the estimated variance for the stopped process is

    Var̂[M(T)] ≈ AR(T)/S(T) + α̂²R(T).

By the ZMMCLT,

    (M*_1(T) − αM_2(T)) / √Var[M(T)] ∼ N(0, 1).

Since

    (M*_1(T) − αM_2(T)) / √Var̂[M(T)] = (α̂ − α) / [ √Var̂[M(T)] / R(T) ],

it follows that

    Var(α̂) ≈ [ AR(T)/S(T) + α̂²R(T) ] / R(T)².
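The estimator α̂ = −ln[1 − AR(T)]/R(T) can be sanity-checked against the deterministic final-size relation 1 − AR = e^{−θ·AR} (the deterministic S-I-R limit; the sketch below, with arbitrary n and θ, is my addition): plugging R(T) = n·AR into the estimator recovers α = θ/n exactly.

```python
import math

n, theta = 2000.0, 1.5         # theta = n*alpha = c*pi/gamma, assumed > 1

# Solve the deterministic final-size relation 1 - AR = exp(-theta * AR)
# by fixed-point iteration (a contraction for this theta)
ar = 0.5
for _ in range(200):
    ar = 1.0 - math.exp(-theta * ar)

R_T = n * ar                    # total ever infected, playing the role of R(T)
alpha_hat = -math.log(1.0 - ar) / R_T
```

At the fixed point −ln(1 − AR) = θ·AR, so the estimator returns exactly θ/n, as the derivation predicts.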


7.3 Martingale-based approach to estimating vaccine efficacy

This section is based on Longini, et al. [21]. We use the martingale approach to estimate vaccine efficacy. Consider an epidemic process in a fixed population of size n. We let n_ν be the number of people in vaccination stratum ν, where ν = 0 (unvaccinated) or ν = 1 (vaccinated). Then the fraction vaccinated is f = 1 − n_0/n. Let π_ν be the per-contact transmission probability for a person in stratum ν. Individuals in the population make an average of c contacts per unit of time; the scaled contact rate is λ = c/n. The relative risk φ = π_1/π_0 is the relative susceptibility of people in the vaccinated stratum as compared to those in the unvaccinated stratum. We define the vaccine efficacy for susceptibility as VE = 1 − φ. Once infected, people are assumed to remain infectious for an average of 1/γ time units. We work with the S-I-R infectious process, so that S_ν(t), I_ν(t), and R_ν(t) are the numbers of susceptible, infected, and removed (immune) people, respectively, in stratum ν at time t, where S_ν(t) + I_ν(t) + R_ν(t) = n_ν. The stochastic process of interest, {S_0(t), S_1(t), I(t); t ≥ 0}, lives on the discrete state space I³ and has a continuous index set. The transition probabilities are

    Pr{dS_0(t) = −1, dS_1(t) = 0, dI(t) = 1 | F_t} = λπ_0 I(t)S_0(t)dt + o(dt),
    Pr{dS_0(t) = 0, dS_1(t) = −1, dI(t) = 1 | F_t} = λπ_1 I(t)S_1(t)dt + o(dt),
    Pr{dS_0(t) = dS_1(t) = 0, dI(t) = −1 | F_t} = γI(t)dt + o(dt),
    Pr{dS_0(t) = dS_1(t) = dI(t) = 0 | F_t} = 1 − I(t){ [λ Σ_{ν=0}^1 π_ν S_ν(t)] + γ }dt + o(dt),     (7.7)

where F_t is the σ-field for the process (7.7) up to and including time t, so that F_t = σ{S_0(τ), S_1(τ), I(τ): 0 ≤ τ ≤ t}. The initial conditions are Pr{I_ν(0) = 1} = 1 for some ν, and the process has stopping times T_ν = inf_{t≥0}{t : S_ν(t)I(t) = 0} and T = max{T_0, T_1}. Our goal is to use counting-process-based methods to estimate φ from observations on the stochastic process.

The counting process of interest is the cumulative number of infected people in stratum ν by time t, which is N_ν(t) = S_ν(0) − S_ν(t). This counting process is the stochastic process {N_0(t), N_1(t); t ≥ 0} on the discrete space I², with continuous index set. The transition probabilities for the counting process are

    Pr{dN_0(t) = 1, dN_1(t) = 0 | F_t} = λπ_0 I(t)S_0(t)dt + o(dt),
    Pr{dN_0(t) = 0, dN_1(t) = 1 | F_t} = λπ_1 I(t)S_1(t)dt + o(dt),
    Pr{dN_0(t) = dN_1(t) = 0 | F_t} = 1 − I(t)[λ Σ_{ν=0}^1 π_ν S_ν(t)]dt + o(dt).

Then it follows directly that

    M_ν(t) = N_ν(t) − λπ_ν ∫_0^t S_ν(τ)I(τ)dτ,   ν = 0, 1,

is a zero mean martingale (ZMM) with respect to F_t. We construct the stochastic


integrals

    M*_ν(t) = ∫_0^t B_ν(τ)dM_ν(τ) = ∫_0^t B_ν(τ)dN_ν(τ) − λπ_ν ∫_0^t J_ν(τ)I(τ)dτ,     (7.8)

which are also ZMMs with respect to F_t, where J_ν(τ) and B_ν(τ) are defined by (7.3) and (7.4a), respectively, applied within stratum ν. In order to evaluate the integrals in (7.8), we note, as before with (7.5), that

    ∫_0^t B_ν(τ)dN_ν(τ) = 1/S_ν(0) + 1/(S_ν(0)−1) + ⋯ + 1/(S_ν(t)+1) ≈ −log_e[ S_ν(t−)/S_ν(0) ].     (7.9)

Substituting (7.9) into (7.8) yields

    M*_ν(t) ≈ −log_e[ S_ν(t−)/S_ν(0) ] − λπ_ν ∫_0^t J_ν(τ)I(τ)dτ.

Now consider the estimating equation

    M(t) = M*_1(t) − φM*_0(t),

which is a ZMM with respect to F_t, so that E{M*_1(t) − φM*_0(t)} = 0, which suggests the estimator

    φ̂ = M*_1(t)/M*_0(t).

By definition, the stratum-specific final attack rate is AR_ν = 1 − S_ν(T_ν)/S_ν(0). Evaluating this estimator for the stopped process yields

    φ̂ = log_e(1 − AR_1) / log_e(1 − AR_0).

From the variation process, the variance of M(t) is

    Var[M(t)] = Var[M*_1(t)] + φ²Var[M*_0(t)],     (7.10)

since M*_0(t) and M*_1(t) are orthogonal, where, as before,

    Var[M*_ν(t)] = E[ ∫_0^t B_ν(τ)²dN_ν(τ) ] = 1/S_ν(0)² + 1/(S_ν(0)−1)² + ⋯ + 1/(S_ν(t)+1)²     (7.11)
                 ≈ [ 1 − S_ν(t−)/S_ν(0) ] / S_ν(t−),   ν = 0, 1.     (7.12)

Define the stopped ZMM M = M(T). Then, evaluating (7.10) at (7.12) for the stopped process yields

    Var̂[M] = ÂR_1/Ŝ_1 + φ̂² ÂR_0/Ŝ_0,

where Ŝ_ν = S_ν(T_ν).


By the ZMMCLT (Theorem 7.1.3), we have

    (M*_1 − φM*_0) / Var[M]^{1/2} ∼ N(0, 1).     (7.13)

Recognizing that M*_1 − φM*_0 = −log_e(1 − AR_1) + φ log_e(1 − AR_0), and manipulating the right-hand side of (7.13), reveals that

    φ̂ ∼ N( φ, Var[M] / [log_e(1 − AR_0)]² ).

Thus, the large-sample variance of φ̂ is

    var̂(φ̂) = Var̂[M] / [log_e(1 − AR_0)]².

Some vaccine efficacy estimates and 95% confidence intervals are given in the table below.

Vaccine coverage (among children with vaccination cards), attack rates, and estimated vaccine efficacy from the measles outbreak in Muyinga, Burundi, July 1988 to January 1989:

    Age group (months)   Group size    f       AR_0    AR_1    VE       95% CI
    9-15                 199           0.452   0.560   0.178   0.761    [0.752, 0.770]
    16-36                533           0.842   0.405   0.134   0.723    [0.716, 0.731]
    37-60                432           0.956   0.158   0.080   0.515    [0.348, 0.683]
    Total                1,164                 Summary†        0.735    [0.648, 0.822]

    † Reciprocal-variance weighted average. Source: Longini, et al. [22].

See also [3], [12], [21].
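The reported efficacies can be reproduced from the attack rates in the table (a check I added; the tolerance reflects three-decimal rounding in the table):

```python
import math

def ve_hat(ar0, ar1):
    """Martingale estimate VE = 1 - log(1-AR1)/log(1-AR0)."""
    return 1.0 - math.log(1.0 - ar1) / math.log(1.0 - ar0)

# (AR_0, AR_1, reported VE) for the three age groups in the table
rows = [(0.560, 0.178, 0.761),
        (0.405, 0.134, 0.723),
        (0.158, 0.080, 0.515)]
checks = [abs(ve_hat(a0, a1) - ve) < 0.001 for a0, a1, ve in rows]
```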


Chapter 8

HIDDEN MARKOV CHAINS

How do we write the likelihood when transition times are not observed? For example, suppose we observe Y = {Y_0, Y_1, …, Y_m} at times τ = {τ_0, τ_1, …, τ_m}. Let p_{Y_k Y_{k+1}} = p_{Y_k Y_{k+1}}(τ_k, τ_{k+1}). Then

    L(Y) = Π_{k=0}^{m−1} p_{Y_k Y_{k+1}}.

But what if we do not observe Y, but rather something related to Y? That is, say we observe X, where X | Y ∼ f(x | y). Then

    L(X) = ∫_y f(x|y) f(y) dy.

Example 8.0.1 Suppose S = {1, 2}, Y_0 = 1, we observe X_0 = 1, X_1 = 1, X_2 = 2, and

    f(x = 1 | y = 1) = α,
    f(x = 2 | y = 1) = 1 − α,
    f(x = 1 | y = 2) = 1 − β,
    f(x = 2 | y = 2) = β.

Then

    L(X) = p_11 p_11 α(1−α) + p_11 p_12 αβ + p_12 p_22 (1−β)β + p_12 p_21 (1−β)(1−α)
         = Σ_{Y_1=1}^2 Σ_{Y_2=1}^2 p_{1Y_1} p_{Y_1Y_2} f(1|Y_1) f(2|Y_2)
         = Σ_{Y_1=1}^2 p_{1Y_1} f(1|Y_1) Σ_{Y_2=1}^2 p_{Y_1Y_2} f(2|Y_2).

Now suppose that Y_0 is not known, so that there are 2³ = 8 possible paths (note that previously there were 2² = 4 possible paths). We let p_{Y_0} be the initial distribution. Then

    L(X) = Σ_{Y_0=1}^2 p_{Y_0} f(1|Y_0) Σ_{Y_1=1}^2 p_{Y_0Y_1} f(1|Y_1) Σ_{Y_2=1}^2 p_{Y_1Y_2} f(2|Y_2).


In general, we have a chain with s states and m+1 observations X = (X_0, X_1, …, X_m), yielding

    L(X) = Σ_{Y_0} p_{Y_0} f(X_0|Y_0) Σ_{Y_1} p_{Y_0Y_1} f(X_1|Y_1) ⋯ Σ_{Y_m} p_{Y_{m−1}Y_m} f(X_m|Y_m),

which, due to Baum, et al. [2], can be simplified using matrix multiplication. Let

    f(X_0) = [ p_{Y_0=y_1} f(X_0|y_1) ]        C = [ 1 ]
             [ p_{Y_0=y_2} f(X_0|y_2) ]            [ 1 ]
             [          ⋮             ]            [ ⋮ ]
             [ p_{Y_0=y_s} f(X_0|y_s) ],           [ 1 ],

where f(X_0) and C are s×1 column vectors, and let T^{(j)}(X_j) be the s×s matrix having (k, ℓ) element p_{Y_{j−1}=k, Y_j=ℓ} f(X_j | Y_j = ℓ). Then

    L(X) = f(X_0)^T T^{(1)}(X_1) T^{(2)}(X_2) ⋯ T^{(m)}(X_m) C.

Example 8.0.2 Continuing the previous example, where X = (1, 1, 2),

    f(X_0) = [ p_1 f(1|1) ] = [ p_1 α     ],
             [ p_2 f(1|2) ]   [ p_2 (1−β) ]

    T^{(1)}(1) = [ p_11 α   p_12 (1−β) ],
                 [ p_21 α   p_22 (1−β) ]

    T^{(2)}(2) = [ p_11 (1−α)   p_12 β ].
                 [ p_21 (1−α)   p_22 β ]

Satten and Longini [25] analyze HIV progression as a hidden Markov chain.
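The matrix form can be checked against brute-force enumeration of the 2³ hidden paths (my sketch; the transition matrix, initial distribution, and emission accuracies α, β below are arbitrary illustrative numbers):

```python
import numpy as np
from itertools import product

alpha, beta = 0.9, 0.8                  # hypothetical emission accuracies
P = np.array([[0.7, 0.3], [0.4, 0.6]])  # hypothetical transition matrix
p0 = np.array([0.5, 0.5])               # initial distribution over Y_0
f = np.array([[alpha, 1 - alpha],       # f(x | y=1) for x = 1, 2
              [1 - beta, beta]])        # f(x | y=2)
X = [1, 1, 2]                           # observed symbols

# Brute force over the 2^3 hidden paths
brute = 0.0
for path in product([0, 1], repeat=3):
    pr = p0[path[0]] * f[path[0], X[0] - 1]
    for k in range(1, 3):
        pr *= P[path[k - 1], path[k]] * f[path[k], X[k] - 1]
    brute += pr

# Baum matrix form: L = f(X0)^T T1(X1) T2(X2) C
fx0 = p0 * f[:, X[0] - 1]
T1 = P * f[:, X[1] - 1]     # (k, l) element p_kl f(X1 | Y1 = l)
T2 = P * f[:, X[2] - 1]
L = fx0 @ T1 @ T2 @ np.ones(2)
```

The two evaluations should agree exactly (up to floating-point error), which is the point of the matrix factorization: it reduces an O(s^m) sum to m matrix-vector products.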


Chapter 9

GIBBS SAMPLING

Suppose we are given the joint density f(x, y_1, …, y_p) and would like to find the marginal density

    f(x) = ∫ ⋯ ∫ f(x, y_1, …, y_p) dy_1 ⋯ dy_p.     (9.1)

The Gibbs sampling technique generates x_1, x_2, …, x_m ∼ f(x) without requiring explicit calculation of the integrals in equation (9.1). Rather, the Gibbs sampler only requires knowledge of the conditional distributions.

Example 9.0.3 Bivariate case (p = 1).

    f_X(x) = ∫ f_{XY}(x, y) dy
           = ∫ f_{X|Y}(x|y) f_Y(y) dy
           = ∫ f_{X|Y}(x|y) [ ∫ f_{Y|X}(y|w) f_X(w) dw ] dy
           = ∫ [ ∫ f_{X|Y}(x|y) f_{Y|X}(y|w) dy ] f_X(w) dw
           = ∫ h(x, w) f_X(w) dw,

where

    h(x, w) = ∫ f_{X|Y}(x|y) f_{Y|X}(y|w) dy,

which is a fixed-point integral equation for which f_X(x) is the unique solution.

Gibbs sampling generates a random sample from f_X(x) by sampling from f_{X|Y}(x|y) and f_{Y|X}(y|x). That is, generate the sequence {X_0, Y_1, X_1, …, Y_k, X_k, …}, where we first choose X_0 and then sample as follows: Y_1 ∼ f(y|X_0), X_1 ∼ f(x|Y_1), Y_2 ∼ f(y|X_1), …, X_k ∼ f(x|Y_k). It can be shown that as k → ∞, X_k →d f_X(x).


Consider the simple case where X and Y are (marginally) Bernoulli random variables, with S = \{0, 1\} and T = \{0, 1, 2, \ldots\}. Since

f_{Y|X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)},

we define the transition matrix

P_{Y|X} = \begin{bmatrix} f(0,0)/f_X(0) & f(0,1)/f_X(0) \\ f(1,0)/f_X(1) & f(1,1)/f_X(1) \end{bmatrix}.

Similarly, we let

P_{X|Y} = \begin{bmatrix} f(0,0)/f_Y(0) & f(1,0)/f_Y(0) \\ f(0,1)/f_Y(1) & f(1,1)/f_Y(1) \end{bmatrix}.

For X_0 \to Y_1 \to X_1, it follows from the Chapman-Kolmogorov equations that

P[X_1 \mid X_0] = \sum_{Y_1} P[Y_1 \mid X_0]\, P[X_1 \mid Y_1],

and thus P_{X|X} = P_{Y|X} P_{X|Y}.

Let

f_X(k) = [f(X_k = 0), f(X_k = 1)]

so that

f_X(k) = f_X(k-1)\, P_{X|X} = f_X(0)\, P_{X|X}^k.

This irreducible, ergodic Markov chain has stationary distribution satisfying f_X = f_X P_{X|X}; thus X_k \to_d f_X(x).

Notice that f_X = f_X P_{X|X} is the discrete state space analog of

f_X(x) = \int h(x, w)\, f_X(w)\, dw.
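The two-state construction can be checked numerically. A minimal sketch, assuming an arbitrary illustrative joint distribution f(x, y), builds the two transition matrices, composes them into P_{X|X}, and iterates f_X(k) = f_X(k-1) P_{X|X}:

```python
import numpy as np

# Assumed illustrative joint distribution f(x, y) on {0,1} x {0,1};
# rows index x, columns index y.
f_xy = np.array([[0.30, 0.20],
                 [0.10, 0.40]])
f_x = f_xy.sum(axis=1)          # marginal f_X
f_y = f_xy.sum(axis=0)          # marginal f_Y

# Transition matrices of the Gibbs chain:
# P_{Y|X}[x, y] = f(x, y) / f_X(x) and P_{X|Y}[y, x] = f(x, y) / f_Y(y).
P_y_given_x = f_xy / f_x[:, None]
P_x_given_y = f_xy.T / f_y[:, None]

# One full Gibbs scan X_{k-1} -> Y_k -> X_k: P_{X|X} = P_{Y|X} P_{X|Y}.
P_x_given_xprev = P_y_given_x @ P_x_given_y

# Iterate f_X(k) = f_X(k-1) P_{X|X} from an arbitrary starting distribution.
dist = np.array([1.0, 0.0])
for _ in range(50):
    dist = dist @ P_x_given_xprev

print(dist, f_x)  # the iterates converge to the stationary marginal f_X
```

Because every entry of P_{X|X} is positive here, the chain is irreducible and aperiodic, and the iterates converge geometrically to the true marginal f_X, as the theory above asserts.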

Theorem 9.0.4 (Ergodic Theorem)

\frac{1}{m} \sum_{i=1}^{m} X_i \to_{a.s.} \int x\, f_X(x)\, dx = E(X) \quad as\ m \to \infty.

Example 9.0.5 The following example comes from Casella and George [?]. Suppose

f(x, y) = \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, y^{x+\alpha-1} (1-y)^{n-x+\beta-1}, \quad x = 0, 1, \ldots, n;\ 0 \le y \le 1,


where f(x \mid y) is binomial(n, y) and f(y \mid x) is beta(x+\alpha, n-x+\beta). Then we have the beta-binomial marginal

f(x) = \binom{n}{x} \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, \frac{\Gamma(x+\alpha)\, \Gamma(n-x+\beta)}{\Gamma(\alpha+\beta+n)}, \quad x = 0, 1, \ldots, n.

Multivariate case. S = \{0, 1, 2, \ldots, n\}, T = \{0, 1, 2, \ldots\}. Suppose there are three random variables X, Y, Z. Start with Y_0, Z_0, X_0 and generate the Gibbs sequence \{Y_0, Z_0, X_0, Y_1, Z_1, X_1, \ldots\} as follows:

Y_1 \sim f(y \mid X_0, Z_0)
Z_1 \sim f(z \mid X_0, Y_1)
X_1 \sim f(x \mid Y_1, Z_1)
Y_2 \sim f(y \mid X_1, Z_1)
\vdots

Example: Let N be Poisson with parameter \lambda. Then

f(x, y, n) \propto \binom{n}{x} y^{x+\alpha-1} (1-y)^{n-x+\beta-1}\, \frac{e^{-\lambda} \lambda^n}{n!}, \quad x = 0, 1, \ldots, n;\ 0 \le y \le 1.

Then f(x \mid y, n) \sim binomial(n, y), f(y \mid x, n) \sim beta(x+\alpha, n-x+\beta), and

f(n \mid x, y) \propto \frac{e^{-(1-y)\lambda}\, [(1-y)\lambda]^{n-x}}{(n-x)!}, \quad n = x, x+1, \ldots
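A sketch of the three-variable sampler, assuming illustrative values of lambda, alpha, beta. The conditional for n says n - x is Poisson with mean (1 - y) lambda; and summing the joint above over x shows that N is marginally Poisson(lambda) and Y is marginally beta(alpha, beta), so the chain's averages can be checked against E(N) = lambda and E(X) = lambda alpha / (alpha + beta):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed illustrative parameters.
lam, alpha, beta = 10.0, 2.0, 4.0

def gibbs_xyn(n_iter, burn_in=1000):
    """Systematic-scan Gibbs sampler for (x, y, n):
       y | x, n ~ Beta(x + alpha, n - x + beta)
       n | x, y  = x + Poisson((1 - y) * lam)
       x | y, n ~ Binomial(n, y)
    """
    x, n = 0, 10              # arbitrary starting point
    draws = []
    for k in range(n_iter):
        y = rng.beta(x + alpha, n - x + beta)
        n = x + rng.poisson((1.0 - y) * lam)
        x = rng.binomial(n, y)
        if k >= burn_in:
            draws.append((x, y, n))
    return np.array(draws)

draws = gibbs_xyn(60_000)

# Marginal checks: E(N) = lam and E(X) = lam * alpha / (alpha + beta).
print(draws[:, 2].mean(), lam)
print(draws[:, 0].mean(), lam * alpha / (alpha + beta))
```

Note that each scan conditions on the most recent value of every other variable, exactly as in the Y_1, Z_1, X_1 update scheme above.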

Continuous case. Suppose X and Y are continuous with S = \{(x, y) : -\infty < x < \infty,\ -\infty < y < \infty\}. Then

f_{X_1|X_0}(x_1 \mid x_0) = \int f_{X_1|Y_1}(x_1 \mid y)\, f_{Y_1|X_0}(y \mid x_0)\, dy

f_{X_k|X_0}(x \mid x_0) = \int f_{X_k|X_{k-1}}(x \mid t)\, f_{X_{k-1}|X_0}(t \mid x_0)\, dt.

As k \to \infty, this converges to a stationary point:

f_{X_k|X_0}(x \mid x_0) \to f_X(x), \qquad f_{X_k|X_{k-1}}(x \mid t) \to h(x, t),

so that

f_X(x) = \int h(x, t)\, f_X(t)\, dt,

the fixed-point integral equation from before.


Chapter 10

APPENDIX

10.1 Series

Newton's Formula:
\sum_{r=0}^{\infty} \binom{n}{r} x^r = (1+x)^n, \quad |x| < 1

Binomial:
\sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r} = (x+y)^n

Geometric:
\sum_{r=0}^{\infty} x^r = \frac{1}{1-x}, \quad |x| < 1; \qquad \sum_{r=0}^{n-1} x^r = \frac{1-x^n}{1-x}, \quad x \ne 1

Exponential:
\sum_{r=0}^{\infty} \frac{x^r}{r!} = e^x

Logarithmic:
\sum_{r=0}^{\infty} \frac{(-1)^r}{r+1}\, x^{r+1} = \ln(1+x), \quad |x| < 1
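These closed forms are easy to spot-check numerically. A small sketch that truncates each series at many terms (Newton's formula with a non-integer exponent uses the generalized binomial coefficient):

```python
import math

def comb_gen(n, r):
    """Generalized binomial coefficient C(n, r); n need not be a non-negative integer."""
    c = 1.0
    for i in range(r):
        c *= (n - i) / (i + 1)
    return c

x, n = 0.3, -2.5  # |x| < 1, illustrative non-integer exponent

newton = sum(comb_gen(n, r) * x**r for r in range(200))
geometric = sum(x**r for r in range(200))
exponential = sum(x**r / math.factorial(r) for r in range(30))
logarithmic = sum((-1)**r / (r + 1) * x**(r + 1) for r in range(200))

print(newton, (1 + x)**n)            # Newton:      (1+x)^n
print(geometric, 1 / (1 - x))        # Geometric:   1/(1-x)
print(exponential, math.exp(x))      # Exponential: e^x
print(logarithmic, math.log(1 + x))  # Logarithmic: ln(1+x)
```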

10.2 Inequalities

Cauchy-Schwarz: [E(XY)]^2 \le E(X^2)\, E(Y^2)

Cauchy-Schwarz (for sums):
\left| \sum_{r=0}^{n} X_r Y_r \right|^2 \le \sum_{r=0}^{n} |X_r|^2\, \sum_{r=0}^{n} |Y_r|^2

Absolute Values:
\left| \sum_{r=0}^{n} X_r \right| \le \sum_{r=0}^{n} |X_r|

Jensen: E[f(X)] \ge f[E(X)] for f(\cdot) a convex function
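A quick numeric spot-check of these inequalities on arbitrary random data (the expectations in the Cauchy-Schwarz and Jensen statements are replaced by sample averages, and Jensen uses the convex function f(x) = x^2):

```python
import random

random.seed(42)

# Arbitrary illustrative data.
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [random.uniform(-1, 1) for _ in range(100)]

# Cauchy-Schwarz for sums: |sum x_r y_r|^2 <= (sum |x_r|^2)(sum |y_r|^2)
lhs = sum(x * y for x, y in zip(xs, ys)) ** 2
rhs = sum(x * x for x in xs) * sum(y * y for y in ys)
print(lhs <= rhs)

# Absolute values: |sum x_r| <= sum |x_r|
print(abs(sum(xs)) <= sum(abs(x) for x in xs))

# Jensen with f(x) = x^2: average of x^2 >= (average of x)^2
mean = sum(xs) / len(xs)
print(sum(x * x for x in xs) / len(xs) >= mean ** 2)  # prints True
```

All three checks print True for any data, since the inequalities hold identically.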


10.3 Convergence of Sequences and Series

• \sum_{k=1}^{\infty} \frac{1}{k^p} converges if p > 1 and diverges if p \le 1.

• If \sum_{k=0}^{\infty} a_k converges, then \lim_{k\to\infty} a_k = 0.

• Fatou's Lemma: Let \{a_k(t)\} be a sequence of non-negative numbers where \lim_{k\to\infty} a_k(t) exists for each t in T = \{0, 1, \ldots\}. Then

\sum_{t \in T} \left\{ \lim_{k\to\infty} a_k(t) \right\} \le \lim_{k\to\infty} \left\{ \sum_{t \in T} a_k(t) \right\}

• The Renewal Theorem: Let \{a_k\}, \{b_k\}, and \{u_k\} be sequences of non-negative numbers with \sum_{k=0}^{\infty} a_k = 1, \sum_{k=0}^{\infty} b_k < \infty, and with the \{u_k\} sequence bounded. Assume \gcd\{k : a_k > 0\} = 1 and assume the renewal equation u_n - \sum_{k=0}^{n} a_{n-k} u_k = b_n holds for all n = 0, 1, \ldots. Then \lim_{n\to\infty} u_n exists. In fact,

\lim_{n\to\infty} u_n = \frac{\sum_{k=0}^{\infty} b_k}{\sum_{k=1}^{\infty} k a_k} \quad if\ \sum_{k=1}^{\infty} k a_k < \infty,

and \lim_{n\to\infty} u_n = 0 if \sum_{k=1}^{\infty} k a_k = \infty.

• Toeplitz Lemma: Let \{a_k\} be a sequence of numbers, and let b_n = \sum_{k=1}^{n} a_k, such that \lim_{n\to\infty} b_n = \infty. Also, let \{x_k\} be a sequence of numbers such that \lim_{k\to\infty} x_k = x < \infty. Then

\lim_{n\to\infty} \frac{1}{b_n} \sum_{k=1}^{n} a_k x_k = x.

10.4 Convergence in Distribution

Let \{F_n\} be a sequence of distribution functions (df). If there exists a df F such that, as n \to \infty, F_n(x) \to F(x) at every point x at which F is continuous, we say that F_n converges in law (weakly) to F, and we write F_n \to_L F. In addition, if \{X_n\} is a sequence of random variables and \{F_n\} is the corresponding sequence of df's, we say that X_n converges in distribution (or in law) to X if there exists a random variable X with df F such that F_n \to_L F. We write X_n \to_L X.

10.5 Convergence in Probability

Let \{X_n\} be a sequence of random variables defined on some probability space (\Omega, \mathcal{F}, P). We say that the sequence \{X_n\} converges in probability to the random variable X if, for every \varepsilon > 0,

\Pr\{|X_n - X| > \varepsilon\} \to 0 \quad as\ n \to \infty.

We write X_n \to_P X.


10.6 Almost Sure Convergence

Let \{X_n\} be a sequence of random variables. We say that \{X_n\} converges almost surely (a.s.) to the random variable X if and only if

\Pr\{\omega : X_n(\omega) \to X(\omega)\ as\ n \to \infty\} = 1,

and we write X_n \to_{a.s.} X or X_n \to X with probability 1. We also have the theorem that X_n \to_{a.s.} X if and only if

\lim_{n\to\infty} \Pr\{\sup_{m \ge n} |X_m - X| > \varepsilon\} = 0 \quad for\ all\ \varepsilon > 0.

We note that a.s. convergence is stronger than convergence in probability, which in turn is stronger than convergence in law, so that

X_n \to_{a.s.} X \implies X_n \to_P X \implies X_n \to_L X.


References

1. N. Bailey. The mathematical theory of infectious diseases. Griffin, London, second edition, 1975.

2. L. Baum, T. Petrie, G. Soules, and N. Weiss. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics, 41:164-171, 1970.

3. N. Becker. Analysis of infectious disease data. Chapman and Hall, New York, 1989.

4. C. Chiang. An Introduction to Stochastic Processes and Their Applications. Krieger, Huntington, New York, 1980.

5. C. Chiang. The Life Table and Its Applications. Krieger, Malabar, Florida, 1984.

6. K. Cooke, D. Calef, and E. Level. Stability or chaos in discrete epidemic models. In Nonlinear Systems and Applications: An International Conference. Academic Press, New York, 1977.

7. K. Dietz. The first epidemic model: A historical note on P.D. En'ko. Australian Journal of Statistics, 30A:56-65, 1988.

8. P. Fine. A commentary on the mechanical analogue to the Reed-Frost epidemic model. American Journal of Epidemiology, 106:87-100, 1977.

9. J. Frauenthal. Mathematical models in epidemiology. Springer-Verlag, Berlin, 1980.

10. M. Greenwood. The statistical measure of infectiousness. Journal of Hygiene (Cambridge), 31:336-351, 1931.

11. M. Haber, I. Longini, and G. Cotsonis. Models for the statistical analysis of infectious disease data. Biometrics, 44:163-173, 1988.

12. M. Halloran, M. Haber, and I. Longini. Interpretation and estimation of vaccine efficacy under heterogeneity. American Journal of Epidemiology, 136:328-343, 1992.

13. I. Longini. A chain binomial model of endemicity. Mathematical Biosciences, 50:85-93, 1980.

14. I. Longini. The generalized discrete-time epidemic model with immunity: A synthesis. Mathematical Biosciences, 82:19-41, 1986.

15. I. Longini. Modeling the decline of CD4+ T-lymphocyte counts in HIV-infected individuals. Journal of Acquired Immune Deficiency Syndromes, 3:930-931, 1990.

16. I. Longini. Chain binomial models. In P. Armitage and T. Colton, editors, The Encyclopedia of Biostatistics, pages 593-597. Wiley, New York, 1998.

17. I. Longini, R. Byers, N. Hessol, and W. Tan. Estimating the stage-specific numbers of HIV infection using a Markov model and back-calculation. Statistics in Medicine, 11:831-843, 1992.

18. I. Longini, W. Clark, R. Byers, G. Lemp, J. Ward, W. Darrow, and H. Hethcote. Statistical analysis of the stages of HIV infection using a Markov model. Statistics in Medicine, 8:831-843, 1989.

19. I. Longini, W. Clark, L. Gardner, and J. Brundage. The dynamics of CD4+ T-lymphocyte decline in HIV infected individuals: a Markov modeling approach. Journal of Acquired Immune Deficiency Syndromes, 4:1141-1147, 1991.

20. I. Longini, W. Clark, M. Haber, and R. Horsburgh. The stages of HIV infection: Waiting times and infection transmission probabilities. Lecture Notes in Biomathematics, 83:111-137, 1989.

21. I. Longini, M. Halloran, and M. Haber. Estimation of vaccine efficacy from epidemics of acute infectious agents under vaccine-related heterogeneity. Mathematical Biosciences, 117:271-281, 1993.

22. I. Longini, M. Halloran, M. Haber, and R. Chen. Measuring vaccine efficacy from epidemics of acute infectious agents. Statistics in Medicine, 12:249-263, 1993.

23. I. Longini and J. Koopman. Household and community transmission parameters from final distributions of infections in households. Biometrics, 38:115-126, 1982.

24. I. Longini, J. Koopman, M. Haber, and G. Cotsonis. Statistical inference for infectious diseases: Risk-specific household and community transmission parameters. American Journal of Epidemiology, 128:845-859, 1988.

25. G. Satten and I. Longini. Markov chains with measurement error: estimating the "true" course of a marker of HIV disease progression (with discussion). Applied Statistics, 45:275-309, 1996.

26. I. Saunders. An approximate maximum likelihood estimator for chain binomial models. Australian Journal of Statistics, 22:307-316, 1980.