
Source: people.esam.northwestern.edu/~kath/448/sdes_draft.pdf (draft dated 2020. 3. 2.)


Chapter 8. Stochastic differential equations

8.1 Preliminaries

Consider the example
$$\frac{dy}{dt} = -\beta y + \text{`noise'}\,.$$
If the noise is zero mean, then we can take the expectation, and since expectation is a linear operator,
$$\frac{d\langle y\rangle}{dt} = -\beta\,\langle y\rangle\,,$$
so the mean should decay exponentially.

What about the variance? Well, it's not immediately clear how to deal with this. To get some idea about what's going on, let's think about how we would simulate this numerically. If we do a simple Euler method for the deterministic part, and then add independent Gaussian noise at each step, we get
$$\frac{y_{n+1} - y_n}{\Delta t} = -\beta y_n + \sigma Z_n\,,$$
where each $Z_n$ is an i.i.d. standard normal or, equivalently,
$$y_{n+1} = (1 - \beta\,\Delta t)\,y_n + \sigma\,\Delta t\,Z_n\,.$$

Taking expectations of this, we get
$$\langle y_{n+1}\rangle = (1 - \beta\,\Delta t)\,\langle y_n\rangle\,,$$
so that
$$\langle y_n\rangle = (1 - \beta\,\Delta t)^n\,y_0\,.$$

If $\Delta t$ is small, this is, of course, $\langle y_n\rangle = e^{-\beta n\Delta t}\,y_0 + O(\Delta t)$, i.e., the numerical approximation of the exponential decay. (Not a very good approximation, though, since the error is first order in $\Delta t$.) This becomes the correct result, of course, if we take the limit $\Delta t \to 0$ while letting $n \to \infty$ and holding $n\,\Delta t = t$ constant.

For the discrete approximation,
$$y_{n+1} = (1 - \beta\,\Delta t)\,y_n + \sigma\,\Delta t\,Z_n\,,$$

©kath2020esam448notes


we can also compute the variance,
$$\mathrm{Var}[y_{n+1}] = (1 - \beta\,\Delta t)^2\,\mathrm{Var}[y_n] + \sigma^2\,(\Delta t)^2\,,$$
where we have used the fact that $y_n$ and $Z_n$ will be independent ($y_n$ depends upon the previous $Z_j$ values, and $Z_n$ and all previous values are independent), and also we have used $\mathrm{Var}[Z_n] = 1$, assuming it is a standard Gaussian random variable. The solution to this equation, assuming that the initial variance is zero, is
$$\mathrm{Var}[y_n] = \frac{\sigma^2\,\Delta t}{2\beta - \beta^2\,\Delta t}\left[1 - (1 - \beta\,\Delta t)^{2n}\right].$$

If we now take the limit $\Delta t \to 0$ with $t = n\,\Delta t$ fixed, we get a zero answer unless we assume $\sigma^2\,\Delta t = \tilde\sigma^2$ is held constant; then we get
$$\mathrm{Var}[y(t)] = \frac{\tilde\sigma^2}{2\beta}\left[1 - e^{-2\beta t}\right].$$

This means, of course, that our original discrete approximation should have been
$$\frac{y_{n+1} - y_n}{\Delta t} = -\beta y_n + \frac{\tilde\sigma}{\sqrt{\Delta t}}\,Z_n\,,$$
or, equivalently,
$$\Delta y = -\beta y\,\Delta t + \tilde\sigma\,\Delta W\,,$$
where $E[\Delta W] = 0$ and $E[\Delta W^2] = \Delta t$. Note $\Delta W = O(\sqrt{\Delta t}\,)$ and that this is what we found for the chemical Langevin equation at the end of the last chapter. This scaling of the noise like $\sqrt{\Delta t}$ is one of the main fundamental differences (and perhaps the most striking difference) between the numerical solution of deterministic differential equations and stochastic differential equations. Having a good intuition about this provides a good foundation for everything else.
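The $\sqrt{\Delta t}$ scaling is easy to check in a simulation. Below is a minimal Python sketch (NumPy assumed; the values $\beta = 0.5$, $\tilde\sigma = 1$, $\Delta t = 10^{-3}$, $y_0 = 2$, and the trial count are illustrative choices, not from the text) that advances the discrete recursion and compares the sample mean and variance at $t = n\,\Delta t$ with the limiting formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)
beta, sig_t, dt, nsteps, ntrials = 0.5, 1.0, 1e-3, 2000, 20000
y = np.full(ntrials, 2.0)                    # y_0 = 2 for every trial
for _ in range(nsteps):
    dW = np.sqrt(dt) * rng.standard_normal(ntrials)  # E[dW] = 0, E[dW^2] = dt
    y += -beta * y * dt + sig_t * dW

t = nsteps * dt                              # final time t = 2
mean_exact = 2.0 * np.exp(-beta * t)
var_exact = sig_t**2 / (2 * beta) * (1 - np.exp(-2 * beta * t))
print(y.mean(), mean_exact)                  # sample mean vs. y_0 e^{-beta t}
print(y.var(), var_exact)                    # sample variance vs. limit formula
```

With the noise scaled as $\tilde\sigma\sqrt{\Delta t}\,Z_n$ the statistics converge to the limiting formulas; scaling it as $\tilde\sigma\,\Delta t\,Z_n$ instead would make the variance vanish as $\Delta t \to 0$.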

8.2 Stochastic processes and the C.K.S. equation

The previous analysis is from the viewpoint of Monte-Carlo simulations, i.e., one picks random numbers $Z_n$ and generates one trial solution of the differential equation. After doing this many times one can build up statistics for the solution $y(t)$, i.e., $E[y(t)]$ and $\mathrm{Var}[y(t)]$. [And note, in the above example, this is sufficient to determine the entire probability distribution of $y(t)$; since at any time point the solution is just a sum of Gaussians, the distribution for $y(t)$ is also a Gaussian. Therefore, its mean and variance are sufficient to determine the entire distribution.]

Let's go back and start again, this time looking at things from a probabilistic point of view. Note that as opposed to previous problems, here we are interested in a variable $X(t)$ that is not just random, but is also a function of time. This is called a stochastic process. There are a


number of different cases: the values $X(t)$ takes on can be either discrete or continuous; also, the time $t$ can be either discrete or continuous. In the continuous case, in general the joint probability distribution for $X(t_1)$, $X(t_2)$, ..., $X(t_n)$, i.e.,
$$P(x_i < X(t_i) < x_i + dx_i) = p(x_1, t_1;\, x_2, t_2;\, \dots;\, x_n, t_n)\; dx_1\, dx_2 \cdots dx_n\,,$$

1. is positive,

2. satisfies $\int \cdots \int p(x_1, t_1; \dots; x_n, t_n)\, dx_1 \cdots dx_n = 1$,

3. satisfies $p(x_1, t_1;\, x_2, t_2;\, \dots;\, x_m, t_m) = \int \cdots \int p(x_1, t_1; \dots; x_n, t_n)\, dx_{m+1} \cdots dx_n$.

This is so cumbersome, though, that we immediately switch to transition probabilities, i.e.,
$$P(x_n, t_n \mid x_1, t_1;\, \dots;\, x_{n-1}, t_{n-1})$$
is the probability of being at $x_n$ at time $t_n$ given that we were at $x_1$ at time $t_1$, etc.

This is still too difficult, however, so we assume that this transition probability only depends upon the previous location and the previous time, i.e., if the times are ordered, $t_1 < t_2 < \dots < t_n$, the transition probability is
$$P(x_n, t_n \mid x_{n-1}, t_{n-1})\,.$$
This is called a Markov process. Furthermore, if this transition probability only depends upon time through the difference $t_n - t_{n-1}$, then the Markov process is said to be stationary.

For a Markov process, we will revise the notation a little and write this in the general vector case as
$$P(\mathbf{x}, t \mid \mathbf{s}, t_0)\, d\mathbf{x} = P\big(\mathbf{x} < \mathbf{X}(t) < \mathbf{x} + d\mathbf{x} \,\big|\, \mathbf{X}(t_0) = \mathbf{s}\big)\,.$$

Then we have the Chapman-Kolmogorov-Smoluchowski (C.K.S.) equation
$$P(\mathbf{x}, t \mid \mathbf{s}, t_0) = \int_{-\infty}^{\infty} P(\mathbf{x}, t \mid \mathbf{y}, \tau)\, P(\mathbf{y}, \tau \mid \mathbf{s}, t_0)\, d\mathbf{y}\,.$$
The idea behind this is that the probability of going from $\mathbf{s}$ at time $t_0$ to $\mathbf{x}$ at time $t$ is the probability of going from $\mathbf{s}$ at time $t_0$ to $\mathbf{y}$ at time $\tau$ and then going from $\mathbf{y}$ at time $\tau$ to $\mathbf{x}$ at time $t$, summed over all possible intermediate states $\mathbf{y}$. This can also be written
$$P(\mathbf{x}, t+\tau \mid \mathbf{s}, t_0) = \int_{-\infty}^{\infty} \psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}-\boldsymbol{\xi}, t)\, P(\mathbf{x}-\boldsymbol{\xi}, t \mid \mathbf{s}, t_0)\, d\boldsymbol{\xi}\,,$$
where $\psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}-\boldsymbol{\xi}, t) = P(\mathbf{x}, t+\tau \mid \mathbf{x}-\boldsymbol{\xi}, t)$ is the probability of taking a step of size $\boldsymbol{\xi}$ in time $\tau$ starting at position $\mathbf{x}-\boldsymbol{\xi}$ at time $t$.
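The C.K.S. equation can be checked numerically in the simplest case, a pure diffusion, where the transition density is an explicit Gaussian in the displacement with variance equal to the elapsed time. The sketch below (NumPy assumed; the grid and the times $t_0 = 0$, $\tau = 0.7$, $t = 2$ are illustrative choices) composes two transition densities by quadrature over the intermediate state and compares the result with the direct density:

```python
import numpy as np

# Transition density of a pure diffusion (Wiener) process: Gaussian in x - s
# with variance equal to the elapsed time t - t0.
def p(x, t, s, t0):
    return np.exp(-(x - s)**2 / (2 * (t - t0))) / np.sqrt(2 * np.pi * (t - t0))

y = np.linspace(-20, 20, 4001)        # intermediate states on a fine grid
dy = y[1] - y[0]
x, s, t0, tau, t = 1.3, 0.0, 0.0, 0.7, 2.0

direct = p(x, t, s, t0)
composed = np.sum(p(x, t, y, tau) * p(y, tau, s, t0)) * dy   # C.K.S. integral
print(direct, composed)
```

The two numbers agree because convolving Gaussians with variances $\tau - t_0$ and $t - \tau$ gives a Gaussian with variance $t - t_0$.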


8.3 Fokker-Planck equations

If $\tau$ is small, then we expect that most steps $\xi$ will be small. Thus, an expansion for small $\xi$ should work. Note, however, that if the probability distribution $\psi$ is something like a Gaussian, e.g., in the 1-D case,
$$\psi(\xi, \tau \mid y, t) = \frac{1}{\sqrt{2\pi\, a(y)\,\tau}}\, \exp\!\left(-\frac{(\xi - b(y)\,\tau)^2}{2\, a(y)\,\tau}\right),$$
then we can only expand the last argument of $\psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}-\boldsymbol{\xi}, t)$ with respect to $\boldsymbol{\xi}$, since the Gaussian decay with respect to the first argument is needed. (Also, note that the above Gaussian probability distribution $\psi$ has mean $b(y)\,\tau + \dots$ and variance $a(y)\,\tau + \dots$ for small $\tau$. We will make use of this shortly.) Doing the expansion, and suppressing the initial conditions, we have

$$P(\mathbf{x}, t+\tau \mid \mathbf{s}, t_0) = \int_{-\infty}^{\infty} \bigg[ P(\mathbf{x}, t \mid \mathbf{s}, t_0)\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t) - \frac{\partial}{\partial x_i}\big[P(\mathbf{x}, t \mid \mathbf{s}, t_0)\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t)\big]\,\xi_i$$
$$\qquad\qquad + \frac{1}{2}\,\frac{\partial^2}{\partial x_i\,\partial x_j}\big[P(\mathbf{x}, t \mid \mathbf{s}, t_0)\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t)\big]\,\xi_i\,\xi_j + \dots \bigg]\, d\boldsymbol{\xi}\,.$$

(Summation notation has been assumed here.) Now we use
$$\int_{-\infty}^{\infty} \psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t)\, d\boldsymbol{\xi} = 1\,,$$
$$\int_{-\infty}^{\infty} \xi_i\, \psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t)\, d\boldsymbol{\xi} = b_i(\mathbf{x}, t)\,\tau + O(\tau^2)\,,$$
$$\int_{-\infty}^{\infty} \xi_i\,\xi_j\, \psi(\boldsymbol{\xi}, \tau \mid \mathbf{x}, t)\, d\boldsymbol{\xi} = a_{ij}(\mathbf{x}, t)\,\tau + O(\tau^2)\,.$$
We also assume that all higher moments are $O(\tau^2)$; this can be verified directly when the random step has a Gaussian distribution. This is important, of course, because it means that in general we only have to consider the first two moments of the PDF associated with the random step.

We then have, omitting explicit writing of the dependence upon the initial conditions,
$$P(\mathbf{x}, t+\tau) = P(\mathbf{x}, t) - \frac{\partial}{\partial x_i}\big[\, b_i(\mathbf{x}, t)\, P(\mathbf{x}, t)\,\big]\,\tau + \frac{1}{2}\,\frac{\partial^2}{\partial x_i\,\partial x_j}\big[\, a_{ij}(\mathbf{x}, t)\, P(\mathbf{x}, t)\,\big]\,\tau + O(\tau^2)\,.$$

Finally, dividing by $\tau$ and taking the limit $\tau \to 0$, we obtain the forward Fokker-Planck equation
$$\frac{\partial P(\mathbf{x}, t)}{\partial t} = -\frac{\partial}{\partial x_i}\big[\, b_i(\mathbf{x}, t)\, P(\mathbf{x}, t)\,\big] + \frac{1}{2}\,\frac{\partial^2}{\partial x_i\,\partial x_j}\big[\, a_{ij}(\mathbf{x}, t)\, P(\mathbf{x}, t)\,\big]\,.$$
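One can verify directly that a known transition density satisfies this equation. For the linear problem of Section 8.1, $dX = -\beta X\,dt + \sigma\,dW$, the density is a Gaussian with mean $X_0 e^{-\beta t}$ and variance $\sigma^2(1 - e^{-2\beta t})/(2\beta)$ (a result derived again later in this chapter); the sketch below (NumPy assumed; the parameter values and grid are illustrative choices) evaluates both sides of the forward Fokker-Planck equation, with $b(x) = -\beta x$ and $a = \sigma^2$, by finite differences:

```python
import numpy as np

beta, sig, X0 = 0.5, 1.0, 1.0

def P(x, t):  # exact transition density of dX = -beta X dt + sig dW
    m = X0 * np.exp(-beta * t)
    v = sig**2 / (2 * beta) * (1 - np.exp(-2 * beta * t))
    return np.exp(-(x - m)**2 / (2 * v)) / np.sqrt(2 * np.pi * v)

x = np.linspace(-5.0, 5.0, 2001)
dx, dt, t = x[1] - x[0], 1e-5, 1.0

lhs = (P(x, t + dt) - P(x, t - dt)) / (2 * dt)   # dP/dt by central difference
# -d/dx[b(x) P] with b(x) = -beta x, plus (1/2) a d^2P/dx^2 with a = sig^2
rhs = np.gradient(beta * x * P(x, t), dx) \
    + 0.5 * sig**2 * np.gradient(np.gradient(P(x, t), dx), dx)
resid = np.max(np.abs(lhs - rhs))
print(resid)
```

The residual is at the level of the finite-difference truncation error, far smaller than $\partial P/\partial t$ itself.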


Again, the dependence with respect to the initial conditions, $\mathbf{s}$ and $t_0$, is still present and is merely suppressed; they show up in the initial condition for the probability distribution,
$$\lim_{t \downarrow t_0} P(\mathbf{x}, t \mid \mathbf{s}, t_0) = \delta(\mathbf{x} - \mathbf{s})\,.$$

Because this is a diffusion equation, processes for which both the mean and variance of an incremental step are $O(\tau)$ are called diffusion processes.

Similarly, we can derive a backward Fokker-Planck equation by taking a small step at the start,
$$P(\mathbf{x}, t \mid \mathbf{s}, t_0) = \int_{-\infty}^{\infty} \psi(\boldsymbol{\xi}, \tau \mid \mathbf{s}, t_0)\; P(\mathbf{x}, t \mid \mathbf{s}+\boldsymbol{\xi},\, t_0+\tau)\, d\boldsymbol{\xi}\,.$$

Expanding for small steps $\boldsymbol{\xi}$ now gives
$$P(\mathbf{x}, t \mid \mathbf{s}, t_0) = \int_{-\infty}^{\infty} \bigg[ P(\mathbf{x}, t \mid \mathbf{s}, t_0{+}\tau)\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{s}, t_0) + \frac{\partial}{\partial s_i}\Big[P(\mathbf{x}, t \mid \mathbf{s}, t_0{+}\tau)\Big]\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{s}, t_0)\,\xi_i$$
$$\qquad\qquad + \frac{1}{2}\,\frac{\partial^2}{\partial s_i\,\partial s_j}\Big[P(\mathbf{x}, t \mid \mathbf{s}, t_0{+}\tau)\Big]\,\psi(\boldsymbol{\xi}, \tau \mid \mathbf{s}, t_0)\,\xi_i\,\xi_j + \dots \bigg]\, d\boldsymbol{\xi}\,,$$

and doing the integrals and taking the limit $\tau \to 0$ gives the backward Fokker-Planck equation
$$-\frac{\partial P(\mathbf{x}, t \mid \mathbf{s}, t_0)}{\partial t_0} = b_i(\mathbf{s}, t_0)\,\frac{\partial}{\partial s_i}\Big[P(\mathbf{x}, t \mid \mathbf{s}, t_0)\Big] + \frac{1}{2}\, a_{ij}(\mathbf{s}, t_0)\,\frac{\partial^2}{\partial s_i\,\partial s_j}\Big[P(\mathbf{x}, t \mid \mathbf{s}, t_0)\Big]\,.$$

Note that derivatives here are with respect to the initial conditions. This looks like a backward heat equation, but it really isn't, since it's solved for $t_0 < t$ (i.e., $t_0$ decreasing) and the `initial' condition is applied at $t_0 = t$, i.e.,
$$\lim_{t_0 \uparrow t} P(\mathbf{x}, t \mid \mathbf{s}, t_0) = \delta(\mathbf{x} - \mathbf{s})\,.$$

Note also that the backward Fokker-Planck equation is the adjoint of the forward Fokker-Planck equation.

Aside: It's very important in the above that the mean and variance are the only moments which are $O(\tau)$; as a result, we can stop expanding after just the 2nd term. Intuitively, this happens because the width of the distribution $\psi(\boldsymbol{\xi}, \tau)$ is $O(\sqrt{\tau}\,)$, so that in general one gets $\langle \xi^n\rangle = O(\tau^{n/2})$. In the case of the mean, it turns out there is significant cancellation and so it ends up being smaller than expected, $O(\tau)$ rather than $O(\tau^{1/2})$.

It turns out that there is a result, Pawula's theorem, showing that for a Markov process, if the mean and variance associated with a small step are each $O(\Delta t)$, then all higher moments must be of smaller order. First, define
$$A_n = \lim_{\Delta t \to 0} \frac{E[(\Delta X)^n]}{\Delta t}\,.$$


Then, for $n \ge 3$ and $n$ odd, since by the Cauchy-Schwarz inequality
$$\big|E[(\Delta X)^n]\big|^2 = \big|E\big[(\Delta X)^{(n-1)/2}\,(\Delta X)^{(n+1)/2}\big]\big|^2 \le E\big[(\Delta X)^{n-1}\big]\; E\big[(\Delta X)^{n+1}\big]\,,$$
we have $A_n^2 \le A_{n-1}\,A_{n+1}$. Similarly, for $n$ even, $A_n^2 \le A_{n-2}\,A_{n+2}$.

Setting $n = r-1$ and $n = r+1$ in the first, and $n = r-2$ and $n = r+2$ in the second, where $r$ is an even integer, we obtain
$$A_{r-2}^2 \le A_{r-4}\,A_r\,, \quad r \ge 6\,,$$
$$A_{r-1}^2 \le A_{r-2}\,A_r\,, \quad r \ge 4\,,$$
$$A_{r+1}^2 \le A_r\,A_{r+2}\,, \quad r \ge 2\,,$$
$$A_{r+2}^2 \le A_r\,A_{r+4}\,, \quad r \ge 2\,.$$
These four inequalities imply that if there is any even $k > 3$ such that $A_k = 0$, then $A_n \equiv 0$ for $n \ge 3$. Thus, it really does make sense in the present case, where both the mean and variance of a small step are $O(\Delta t)$, to only look at these first two moments.

8.4 The connection to differentials

We still have something left to do, however, and that is to connect the above results to Monte-Carlo simulations of particular problems. In particular, we need to connect the statistics of the small step $\boldsymbol{\xi}$ (or rather, the first and second moments $b_i(\mathbf{x}, t)\,\tau + \dots$ and $a_{ij}(\mathbf{x}, t)\,\tau + \dots$) with some particular differential equation and its numerical implementation. In this section we will adopt a somewhat different approach from the one taken previously; here we will start with a deterministic differential equation
$$\frac{dx_i}{dt} = b_i(\mathbf{x}, t)$$
and convert it into a stochastic equation for a random variable $X$ by adding a random forcing. We can do this in a manner that is consistent with the above derivation of the Fokker-Planck equations by approximating the solution with small time steps, i.e., using a finite-difference approach.

Suppose that at time $t$ we are at position $x$ (restrict the problem to 1-D for the moment), and write
$$\Delta X \equiv X(t + \Delta t) - x = b(x, t)\,\Delta t + \Delta W\,,$$


where $\Delta W$ is the random forcing, which we might write as
$$\Delta W = \int_t^{t + \Delta t} F(\tau)\, d\tau\,.$$

Note one could be tempted to take the limit $\Delta t \to 0$ and write
$$\frac{dX}{dt} = b(X, t) + F(t)\,,$$
but we will see shortly that quite a bit of interpretation is necessary before doing so.

We now have to decide about the properties of the random forcing $\Delta W$. First, we assume that it is a mean-zero random variable, i.e.,
$$\langle \Delta W\rangle = 0\,.$$
Note this means that
$$\langle \Delta X\rangle = b(x, t)\,\Delta t\,,$$
so that the expected position follows the deterministic path, at least to this order. The random forcing thus gives a certain amount of spread around the deterministic path.

A measure of the amount of spread is
$$\sigma^2 = \Big\langle \big(\Delta X - \langle\Delta X\rangle\big)^2 \Big\rangle = \big\langle (\Delta W)^2 \big\rangle\,,$$
and we now are required to decide something about the statistics of the random forcing $\Delta W$. From our above derivation of the Fokker-Planck equations, we know we would like $\langle(\Delta W)^2\rangle = O(\Delta t)$, so that the variance of the fluctuations is of the same size as the drift produced by the deterministic terms. We therefore choose $\Delta W$ to have a probability distribution such as
$$\frac{1}{\sqrt{2\pi\,\Delta t}}\, \exp\!\left(-\frac{\xi^2}{2\,\Delta t}\right),$$

i.e., a Gaussian with mean zero and variance $\Delta t$. [Note we also have
$$\langle(\Delta W)^n\rangle = \frac{1}{\sqrt{2\pi\,\Delta t}} \int_{-\infty}^{\infty} \xi^n\, \exp\!\left(-\frac{\xi^2}{2\,\Delta t}\right) d\xi = \begin{cases} 0\,, & n \text{ odd,}\\[6pt] \dfrac{1}{\sqrt{\pi}}\,(2\,\Delta t)^{n/2}\,\Big(\dfrac{n-1}{2}\Big)!\,, & n \text{ even,}\end{cases}$$
so that $\langle(\Delta W)^n\rangle = o(\Delta t)$ for $n > 2$.]

This last result is especially important when one considers the random variable $\Delta W^2$. This random variable has mean $\Delta t$ and variance $\langle(\Delta W)^4\rangle - \langle(\Delta W)^2\rangle^2 = 2(\Delta t)^2$, and so $\Delta W^2 = \Delta t + \Delta t\, Z$, where $Z$ is a random variable with zero mean and $O(1)$ variance. If we add up


$N$ of these steps, the sum will have mean $N\,\Delta t$, but its standard deviation will be of size $\sqrt{N}\,\Delta t$. This is very different from the random increment $\Delta X$, which has mean of size $N\,\Delta t$ and standard deviation $\sqrt{N\,\Delta t}$. In the case of $\Delta W^2$, the standard deviation divided by the mean goes to zero as $N \to \infty$ and $\Delta t \to 0$ with $N\,\Delta t$ fixed, but for $\Delta X$ the standard deviation and the mean are both of the same size. This means that in the limit $\Delta t \to 0$ the random part of $\Delta W^2$ can be neglected and it can be replaced by the deterministic value $\Delta t$.
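This concentration of $\sum \Delta W^2$ about its mean can be seen directly. The sketch below (NumPy assumed; the path count and step counts are illustrative choices) sums the squared increments of discrete Brownian paths over $[0, t]$ for several partition sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
t = 1.0
for N in (10, 100, 10000):
    dt = t / N
    dW = np.sqrt(dt) * rng.standard_normal((5000, N))   # 5000 paths, N steps each
    s = (dW**2).sum(axis=1)                             # sum of squared increments
    print(N, s.mean(), s.std())   # mean stays near t; spread shrinks ~ 1/sqrt(N)
```

The mean stays at $t$ while the spread shrinks like $1/\sqrt{N}$, which is why $\Delta W^2$ may be replaced by $\Delta t$ in the limit.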

With this particular result we have now identified
$$\langle \Delta X\rangle = b(x, t)\,\Delta t + \dots \qquad\text{and}\qquad \big\langle(\Delta X - \langle\Delta X\rangle)^2\big\rangle = \Delta t + \dots\,,$$
where the ellipses indicate things we can neglect. Thus, we have determined the coefficients $b(x, t)$ and $a(x, t) = 1$ that appear in the Fokker-Planck equations (since they are the mean and variance of a small step, and that is precisely what we have been calculating). Stochastic differential equations handled in this manner are sometimes called Itô equations. It should also be noted that using differentials as described above gives a method for numerically computing solutions of such randomly forced equations. One just needs a method for calculating random increments $\Delta W$, and easy methods for doing so are available.

Obviously, the above is not yet general enough. We can modify things somewhat, however, by instead considering the problem
$$\Delta X = b(x, t)\,\Delta t + \sigma(x, t)\,\Delta W\,,$$
where, again, $x$ is the value of the solution before the increment $\Delta X$. Repeating the above calculation, we then have
$$\langle\Delta X\rangle = b(x, t)\,\Delta t + \dots$$
and
$$\big\langle(\Delta X - \langle\Delta X\rangle)^2\big\rangle = \sigma^2(x, t)\,\Delta t + \dots\,.$$
Thus, $b(x, t)$ is the same as before and we have now also identified $a(x, t) = \sigma^2(x, t)$.

In more dimensions we have
$$\Delta X_i = b_i(\mathbf{x}, t)\,\Delta t + \sigma_{ij}(\mathbf{x}, t)\,\Delta W_j\,,$$
where the summation convention is again assumed. Here, the $\Delta W_j$ are assumed to be independent, identically distributed (i.i.d.) Gaussian random variables. We then have
$$\langle\Delta X_i\rangle = b_i(\mathbf{x}, t)\,\Delta t + \dots$$
and
$$\langle\Delta X_i\,\Delta X_j\rangle = \sigma_{ik}(\mathbf{x}, t)\,\sigma_{jl}(\mathbf{x}, t)\,\langle\Delta W_k\,\Delta W_l\rangle\,.$$


If the $\Delta W_j$ are i.i.d., however, we have
$$\langle\Delta W_k\,\Delta W_l\rangle = \delta_{kl}\,\Delta t + \dots\,,$$
where $\delta_{ij}$ is the Kronecker delta, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \ne j$. Thus,
$$\langle\Delta X_i\,\Delta X_j\rangle = \sigma_{ik}(\mathbf{x}, t)\,\sigma_{jl}(\mathbf{x}, t)\,\delta_{kl}\,\Delta t + \dots = \sigma_{ik}(\mathbf{x}, t)\,\sigma_{jk}(\mathbf{x}, t)\,\Delta t + \dots\,,$$
and we have
$$a_{ij}(\mathbf{x}, t) = \sigma_{ik}(\mathbf{x}, t)\,\sigma_{jk}(\mathbf{x}, t)\,.$$

In vector form, the above is
$$\Delta\mathbf{X} = \mathbf{b}(\mathbf{x}, t)\,\Delta t + \boldsymbol{\sigma}(\mathbf{x}, t)\,\Delta\mathbf{W}\,.$$
The mean of this is
$$\langle\Delta\mathbf{X}\rangle = \mathbf{b}(\mathbf{x}, t)\,\Delta t\,,$$
and the covariance is
$$\langle\Delta\mathbf{X}\,\Delta\mathbf{X}^T\rangle = \boldsymbol{\sigma}(\mathbf{x}, t)\,\langle\Delta\mathbf{W}\,\Delta\mathbf{W}^T\rangle\,\boldsymbol{\sigma}(\mathbf{x}, t)^T + \dots = \boldsymbol{\sigma}(\mathbf{x}, t)\,\boldsymbol{\sigma}(\mathbf{x}, t)^T\,\Delta t + \dots\,,$$
since $\langle\Delta\mathbf{W}\,\Delta\mathbf{W}^T\rangle = I\,\Delta t + \dots$. Thus, $\mathbf{a}(\mathbf{x}, t) = \boldsymbol{\sigma}(\mathbf{x}, t)\,\boldsymbol{\sigma}(\mathbf{x}, t)^T$. Note that if there are $n$ $X$'s and $m$ $\Delta W$'s, both in column vectors, then $\boldsymbol{\sigma}$ is $n \times m$ and $\mathbf{a}(\mathbf{x}, t)$ is $n \times n$. The rank of $\mathbf{a}(\mathbf{x}, t)$ will be smaller than $n$ if $m < n$, however; in this case, the noise in the different components of $\mathbf{X}$ will be correlated.

8.5 The continuous limit: Itô calculus

So far, we have only talked about single increments. Let's now discuss what happens when we take the limit $\Delta t \to 0$ and talk about the continuous equation, e.g., in the 1-D case
$$dX = b(X, t)\, dt + \sigma(X, t)\, dW\,.$$

First of all, we have written the equation in this way because it will turn out to be incorrect to divide by $\Delta t$ before taking the limit: the $\sqrt{\Delta t}$ behavior of the increments $\Delta W$ means that $\lim_{\Delta t \to 0} (\Delta W/\Delta t)$ does not exist in any usual sense. The big advantage of all of the previous discussion, though, is that we now know what the terms $b(X, t)\,dt$ and $\sigma(X, t)\,dW$ mean: the mean of a small step starting at a now random position $X$ is $b(X, t)\,dt$, and the variance of that small step is $\sigma^2(X, t)\,dt$. [It is very useful to think of $dW$ as being $O(\sqrt{\Delta t}\,)$.]

Again thinking in terms of Monte-Carlo sampling, if we integrate this equation along one particular sample path $\omega$ we get
$$X_\omega(t) = X_\omega(t_0) + \int_{t_0}^{t} b(X_\omega(s), s)\, ds + \int_{t_0}^{t} \sigma(X_\omega(s), s)\, dW(s)\,.$$


The first integral makes perfect sense, but what about the second? Well, if we discretize the interval $(t_0, t)$ into $n$ pieces, $t_0 < t_1 < \dots < t_n = t$, then the way we have been thinking about this is that
$$\int_{t_0}^{t} \sigma(X_\omega(s), s)\, dW_\omega = \lim_{\max(\Delta t) \to 0} \sum_{j=0}^{n-1} \sigma(X_\omega(t_j), t_j)\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,.$$
In particular, note that $\sigma(X_\omega(t), t)$ is evaluated before the random step.

To figure out what this means, let's do a couple of examples. First, let's assume that $\sigma(X, t) = 1$ and start from $t_0 = 0$. Then we have
$$\int_0^t dW_\omega = W_\omega(t) = \lim_{\max(\Delta t) \to 0} \sum_{j=0}^{n-1} \big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,.$$

In the discrete case, each step is a Gaussian random variable with mean zero and variance $t_{j+1} - t_j$. Since the sum of a number of Gaussians is a Gaussian with mean equal to the sum of the means and variance equal to the sum of the variances, this means that the result $W_\omega(t)$ is a Gaussian with mean 0 and variance $\sum_j [t_{j+1} - t_j] = t_n - t_0 = t$. Since this result doesn't depend upon $n$, we have that
$$W_\omega(t) \text{ is a Gaussian with mean } 0 \text{ and variance } t\,.$$
This is known as the Wiener process; its formal time derivative is what is usually called white noise.

Next, let's consider the integral
$$I_\omega = \int_{t_0}^{t} f(s)\, dW_\omega = \lim_{n\to\infty} \sum_{j=0}^{n-1} f(t_j)\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,,$$
where $f(t)$ is a deterministic function. Clearly we have
$$E[I_\omega] = 0\,.$$

In addition, though, we also have
$$E[I_\omega^2] = \lim_{n\to\infty} \sum_{j=0}^{n-1} [f(t_j)]^2\,(t_{j+1} - t_j) = \int_{t_0}^{t} [f(s)]^2\, ds\,.$$
In addition, since we have a Gaussian random variable for any finite $n$, we expect that $I_\omega$ is a zero-mean Gaussian with the indicated variance.
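Both properties are easy to confirm by sampling. The sketch below (NumPy assumed; the choice $f(s) = \cos s$ on $[0, 1]$ and the counts are illustrative, not from the text) forms the non-anticipating sum for a deterministic integrand and compares the sample mean and variance with $0$ and $\int_{t_0}^{t} f^2\, ds$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials, t0, t = 1000, 40000, 0.0, 1.0
dt = (t - t0) / n
s = t0 + dt * np.arange(n)                 # left endpoints t_j
f = np.cos(s)                              # example deterministic integrand
dW = np.sqrt(dt) * rng.standard_normal((trials, n))
I = (f * dW).sum(axis=1)                   # sum f(t_j) [W(t_{j+1}) - W(t_j)]

var_exact = t / 2 + np.sin(2 * t) / 4      # int_0^1 cos^2(s) ds in closed form
print(I.mean(), I.var(), var_exact)
```

A histogram of `I` would also look Gaussian, consistent with the remark above that the limit is a zero-mean Gaussian.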


Integrals of a random function behave a little differently. As a specific example, let's consider
$$\int_0^t W_\omega(s)\, dW_\omega = \lim_{n\to\infty} \sum_{j=0}^{n-1} W_\omega(t_j)\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,.$$

Writing $W_\omega(t_j) = \tfrac{1}{2}\big[W_\omega(t_{j+1}) + W_\omega(t_j)\big] - \tfrac{1}{2}\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]$, the above sum becomes
$$\lim_{n\to\infty} \left\{ \frac{1}{2} \sum_{j=0}^{n-1} \big[W_\omega^2(t_{j+1}) - W_\omega^2(t_j)\big] - \frac{1}{2} \sum_{j=0}^{n-1} \big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]^2 \right\}.$$

The first sum telescopes to become
$$\lim_{n\to\infty} \frac{1}{2}\big[W^2(t_n) - W^2(t_0)\big] = \frac{1}{2}\,W^2(t)\,,$$

while the second is
$$-\lim_{n\to\infty} \frac{1}{2} \sum_{j=0}^{n-1} (\Delta W_j)^2\,.$$

We recall, however, that $(\Delta W_j)^2 = (t_{j+1} - t_j)(1 + Z_j)$, where $Z_j$ is a random variable with zero mean and $O(1)$ variance; adding up the above terms, we get
$$-\lim_{n\to\infty} \frac{1}{2} \sum_{j=0}^{n-1} (t_{j+1} - t_j) \;-\; \lim_{n\to\infty} \frac{1}{2} \sum_{j=0}^{n-1} (t_{j+1} - t_j)\, Z_j\,.$$
The first part is just $-\tfrac{1}{2}t$, and the second is a random variable with mean zero and variance of size $n\,\Delta t^2$, where $\Delta t = \max|t_{j+1} - t_j|$. When we take the limit $n \to \infty$, with $\Delta t = O(1/n)$, both the variance and standard deviation go to zero. Thus, in the limit we are left only with $-\tfrac{1}{2}t$. Finally, then,
$$\int_0^t W_\omega(s)\, dW_\omega = \frac{1}{2}\,W^2(t) - \frac{1}{2}\,t\,.$$

This, of course, is different from the standard calculus result, which doesn't have the extra $-\tfrac{1}{2}t$.
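The extra $-\tfrac{1}{2}t$ shows up cleanly in a simulation. The sketch below (NumPy assumed; the path and step counts are illustrative choices) forms the non-anticipating sum for $\int_0^t W\, dW$ on many paths and compares it pathwise with $\tfrac{1}{2}W^2(t) - \tfrac{1}{2}t$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials, t = 4000, 2000, 1.0
dt = t / n
dW = np.sqrt(dt) * rng.standard_normal((trials, n))
W = np.hstack([np.zeros((trials, 1)), np.cumsum(dW, axis=1)])  # W(t_0)=0, ..., W(t_n)

ito = (W[:, :-1] * dW).sum(axis=1)       # sum W(t_j) [W(t_{j+1}) - W(t_j)]
err = np.abs(ito - (0.5 * W[:, -1]**2 - 0.5 * t)).mean()
print(err)                               # -> 0 as the partition is refined
```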

Again, in general we have
$$\int_{t_0}^{t} \sigma(X_\omega(s), s)\, dW_\omega = \lim_{n\to\infty} \sum_{j=0}^{n-1} \sigma(X_\omega(t_j), t_j)\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,.$$
Because the function in the integrand only depends upon $t_j$, such integrals are said to be non-anticipating.


8.6 The Itô formula

Suppose we have a random variable $X$ and that
$$dX = b(X, t)\, dt + \sigma(X, t)\, dW\,.$$

In addition, suppose we have another random variable $Y$ given by $Y = U(X, t)$. What is the equation for $Y$? Using differentials,
$$\Delta Y = \frac{\partial U}{\partial t}\,\Delta t + \frac{\partial U}{\partial X}\,\Delta X + \frac{1}{2}\,\frac{\partial^2 U}{\partial X^2}\,\Delta X^2 + \dots$$
$$\phantom{\Delta Y} = \frac{\partial U}{\partial t}\,\Delta t + \frac{\partial U}{\partial X}\,\big[b(X, t)\,\Delta t + \sigma(X, t)\,\Delta W\big] + \frac{1}{2}\,\frac{\partial^2 U}{\partial X^2}\,\sigma^2\,\Delta t + \dots\,,$$
where in the above we have used $\Delta X^2 = \sigma^2\,\Delta W^2 + \dots = \sigma^2\,\Delta t + \dots$. Collecting terms and taking the limit, we have

$$dY = \left[\frac{\partial U}{\partial t} + b(X, t)\,\frac{\partial U}{\partial X} + \frac{1}{2}\,\sigma^2\,\frac{\partial^2 U}{\partial X^2}\right] dt + \sigma(X, t)\,\frac{\partial U}{\partial X}\, dW\,.$$

This is the chain rule for stochastic differential equations. Note that there is an additional term not present in the deterministic case.

As an example, in the above take $b = 0$, $\sigma = 1$, and $Y = X^2$. Then the above gives
$$d(W^2) = dt + 2W\, dW\,.$$
Rearranging, this gives $W\, dW = \tfrac{1}{2}\, d(W^2) - \tfrac{1}{2}\, dt$, which is the same as what we found directly for the Itô integral.
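The Itô formula can also be checked pathwise for a less trivial $U$. The sketch below (NumPy assumed; the choice $U(X) = \sin X$ with $b = 0$, $\sigma = 1$, so $X = W$, is illustrative) integrates $dY = -\tfrac{1}{2}\sin(W)\, dt + \cos(W)\, dW$ step by step and compares the result with $\sin(W(t))$ evaluated directly:

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials, t = 20000, 500, 1.0
dt = t / n
dW = np.sqrt(dt) * rng.standard_normal((trials, n))
W = np.hstack([np.zeros((trials, 1)), np.cumsum(dW, axis=1)])

# Ito formula for Y = sin(W):  dY = -0.5 sin(W) dt + cos(W) dW
Y = np.zeros(trials)
for j in range(n):
    Y += -0.5 * np.sin(W[:, j]) * dt + np.cos(W[:, j]) * dW[:, j]

err = np.abs(Y - np.sin(W[:, -1])).max()
print(err)      # worst-case pathwise difference, shrinking with dt
```

The agreement improves as $\Delta t$ decreases, consistent with the neglected terms being $o(\Delta t)$ per step.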

This can also be done in the vector case. Suppose we have
$$dX_i = b_i(\vec{X}, t)\, dt + \sigma_{ij}(\vec{X}, t)\, dW_j\,,$$

and we define $Y_k = U_k(\vec{X}, t)$. In terms of differentials, we have
$$\Delta Y_k = \frac{\partial U_k}{\partial t}\,\Delta t + \frac{\partial U_k}{\partial X_m}\,\Delta X_m + \frac{1}{2}\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}\,\Delta X_m\,\Delta X_n + \dots$$
$$\phantom{\Delta Y_k} = \frac{\partial U_k}{\partial t}\,\Delta t + \frac{\partial U_k}{\partial X_m}\,\big[b_m\,\Delta t + \sigma_{mj}\,\Delta W_j\big] + \frac{1}{2}\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}\,\big[\sigma_{mj}\,\Delta W_j\,\sigma_{np}\,\Delta W_p\big] + \dots\,,$$

keeping the biggest terms. Now we use $\Delta W_j\,\Delta W_p = \Delta t$ if $j = p$ and $0$ otherwise, rearrange, and take $\Delta t \to 0$ to get
$$dY_k = \left[\frac{\partial U_k}{\partial t} + b_m\,\frac{\partial U_k}{\partial X_m} + \frac{1}{2}\,\sigma_{mj}\,\sigma_{nj}\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}\right] dt + \sigma_{mj}\,\frac{\partial U_k}{\partial X_m}\, dW_j\,.$$


In addition to the chain rule, one can also derive a product rule. Suppose
$$dX_1 = b_1\, dt + \sigma_1\, dW\,, \qquad dX_2 = b_2\, dt + \sigma_2\, dW\,.$$
Then
$$d(X_1 X_2) = \lim \big[(X_1 + \Delta X_1)(X_2 + \Delta X_2) - X_1 X_2\big] = \lim \big[X_1\,\Delta X_2 + X_2\,\Delta X_1 + \Delta X_1\,\Delta X_2\big]$$
$$= (b_2 X_1 + b_1 X_2 + \sigma_1 \sigma_2)\, dt + (\sigma_2 X_1 + \sigma_1 X_2)\, dW\,,$$
where we have used $\Delta X_1\,\Delta X_2 = \sigma_1\sigma_2\,\Delta W^2 + \dots = \sigma_1\sigma_2\,\Delta t + \dots$. Additional generalizations are possible, of course.

8.7 Stratonovich integrals

Another way to define a stochastic integral is the Stratonovich form
$$S\!\int_{t_0}^{t} \sigma(X_\omega(s), s)\, dW_\omega = \lim_{n\to\infty} \sum_{j=0}^{n-1} \frac{1}{2}\big[\sigma(X_\omega(t_j), t_j) + \sigma(X_\omega(t_{j+1}), t_{j+1})\big]\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]\,.$$

This definition is called anticipating since, if this is part of the right-hand side of the integrated form of a stochastic differential equation, then the random increment in the solution depends upon the value at the new time step, as for an implicit numerical method.

The advantage of the Stratonovich form, however, is that the regular rules of calculus apply. For example,
$$S\!\int_0^t W_\omega\, dW_\omega = \lim_{n\to\infty} \sum_{j=0}^{n-1} \frac{1}{2}\big[W_\omega(t_{j+1}) + W_\omega(t_j)\big]\,\big[W_\omega(t_{j+1}) - W_\omega(t_j)\big]$$
$$= \lim_{n\to\infty} \sum_{j=0}^{n-1} \frac{1}{2}\big[W_\omega^2(t_{j+1}) - W_\omega^2(t_j)\big] = \frac{1}{2}\big[W_\omega^2(t_n) - W_\omega^2(t_0)\big] = \frac{1}{2}\,W_\omega^2(t)\,.$$
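The two conventions can be compared on the same sample paths. The sketch below (NumPy assumed; counts are illustrative choices) evaluates both discrete sums for $\int_0^t W\, dW$:

```python
import numpy as np

rng = np.random.default_rng(5)
n, trials, t = 4000, 2000, 1.0
dt = t / n
dW = np.sqrt(dt) * rng.standard_normal((trials, n))
W = np.hstack([np.zeros((trials, 1)), np.cumsum(dW, axis=1)])
Wt = W[:, -1]

ito = (W[:, :-1] * dW).sum(axis=1)                       # evaluate before the step
strat = (0.5 * (W[:, :-1] + W[:, 1:]) * dW).sum(axis=1)  # average of endpoints
ito_err = np.abs(ito - (0.5 * Wt**2 - 0.5 * t)).mean()   # Ito limit: W^2/2 - t/2
strat_err = np.abs(strat - 0.5 * Wt**2).mean()           # Stratonovich: W^2/2
print(ito_err, strat_err)
```

The Stratonovich sum telescopes exactly to $\tfrac{1}{2}W^2(t)$ even at finite $n$, while the Itô sum carries the extra $-\tfrac{1}{2}t$.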

Let's also check the chain rule. Suppose we have
$$\text{(S)}\quad dX_i = b_i(\vec{X}, t)\, dt + \sigma_{ij}(\vec{X}, t)\, dW_j\,.$$


The (S) means we are interpreting the differentials in the Stratonovich form; this means, in terms of finite differences,
$$\Delta X_i = b_i(\vec{X}, t)\,\Delta t + \frac{1}{2}\Big[\sigma_{ij}(\vec{X}, t) + \sigma_{ij}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\Big]\,\Delta W_j\,.$$

Note that it makes no difference whether we use $b_i(\vec{X}, t)\,\Delta t$ or $b_i(\vec{X} + \Delta\vec{X},\, t + \Delta t)\,\Delta t$, since the difference is $o(\Delta t)$. This is not the case for the random part. To simplify what follows, let
$$\bar\sigma_{ij}(\vec{X}, t) = \frac{1}{2}\Big[\sigma_{ij}(\vec{X}, t) + \sigma_{ij}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\Big]\,.$$

Note this means that the finite-difference Stratonovich form is
$$\Delta X_i = b_i(\vec{X}, t)\,\Delta t + \bar\sigma_{ij}(\vec{X}, t)\,\Delta W_j\,.$$

Now, let $Y_k = U_k(\vec{X}, t)$. In terms of differentials, we have
$$\Delta Y_k = \frac{\partial U_k}{\partial t}(\vec{X}, t)\,\Delta t + \frac{\partial U_k}{\partial X_m}(\vec{X}, t)\,\Delta X_m + \frac{1}{2}\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\Delta X_m\,\Delta X_n + \dots$$
$$= \frac{\partial U_k}{\partial t}(\vec{X}, t)\,\Delta t + \frac{\partial U_k}{\partial X_m}(\vec{X}, t)\,\Big[b_m(\vec{X})\,\Delta t + \bar\sigma_{mj}(\vec{X}, t)\,\Delta W_j\Big] + \frac{1}{2}\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\Big[\bar\sigma_{mj}(\vec{X}, t)\,\Delta W_j\,\bar\sigma_{np}(\vec{X}, t)\,\Delta W_p\Big] + \dots\,,$$

keeping the biggest terms, and suppressing the explicit $t$ dependence in the coefficients. Now we use $\Delta W_j\,\Delta W_p = \Delta t$ if $j = p$ and $0$ otherwise and rearrange, giving
$$\Delta Y_k = \left[\frac{\partial U_k}{\partial t}(\vec{X}, t) + b_m(\vec{X})\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t) + \frac{1}{2}\,\sigma_{mj}(\vec{X}, t)\,\sigma_{nj}(\vec{X}, t)\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\right]\Delta t + \bar\sigma_{mj}(\vec{X}, t)\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t)\,\Delta W_j\,.$$

Whether we have $\bar\sigma_{ij}(\vec{X}, t)$ or $\sigma_{ij}(\vec{X}, t)$ in the first term is of no importance; the difference between these two is $O(\Delta\vec{X})$, and since the whole term is multiplied by $\Delta t$ the net difference is smaller than any term we are keeping. In order for the final result to be in Stratonovich form, however, the last term needs to be the average of the function evaluated at the current point and the next point. In other words, the coefficient of $\Delta W_j$ in the last term should be

$$\frac{1}{2}\left[\sigma_{mj}(\vec{X}, t)\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t) + \sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\,\frac{\partial U_k}{\partial X_m}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\right],$$


rather than what we have,
$$\frac{1}{2}\Big[\sigma_{mj}(\vec{X}, t) + \sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\Big]\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t)\,.$$

To correct this, we note that the difference between what we have and what we want is
$$-\frac{1}{2}\,\sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\left[\frac{\partial U_k}{\partial X_m}(\vec{X} + \Delta\vec{X},\, t + \Delta t) - \frac{\partial U_k}{\partial X_m}(\vec{X}, t)\right],$$
and when we expand the term in the brackets in a Taylor series and keep only the biggest term, we get
$$\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\Delta X_n + \dots \approx \frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\bar\sigma_{np}(\vec{X}, t)\,\Delta W_p + \dots\,.$$

Putting back the multiplying factors, the difference then becomes
$$-\frac{1}{2}\,\sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\bar\sigma_{np}(\vec{X}, t)\,\Delta W_j\,\Delta W_p\,.$$

Again, the product $\Delta W_j\,\Delta W_p$ can be replaced by $\delta_{jp}\,\Delta t$, since all other contributions will be smaller than $\Delta t$. In addition, $\sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)$ can be replaced by $\sigma_{mj}(\vec{X}, t)$ since any corrections will also be smaller than $\Delta t$. Therefore, we finally get

$$\Delta Y_k \approx \left[\frac{\partial U_k}{\partial t}(\vec{X}, t) + b_m(\vec{X})\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t) + \frac{1}{2}\,\sigma_{mj}(\vec{X}, t)\,\sigma_{nj}(\vec{X}, t)\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\right]\Delta t$$
$$+ \frac{1}{2}\left[\sigma_{mj}(\vec{X}, t)\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t) + \sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\,\frac{\partial U_k}{\partial X_m}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\right]\Delta W_j$$
$$- \frac{1}{2}\,\sigma_{mj}(\vec{X}, t)\,\sigma_{nj}(\vec{X}, t)\,\frac{\partial^2 U_k}{\partial X_m\,\partial X_n}(\vec{X}, t)\,\Delta t + \dots\,.$$

We see that the terms involving the second derivatives of $U(\vec{X}, t)$ cancel, i.e.,
$$\Delta Y_k = \left[\frac{\partial U_k}{\partial t}(\vec{X}, t) + b_m(\vec{X})\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t)\right]\Delta t$$
$$+ \frac{1}{2}\left[\sigma_{mj}(\vec{X}, t)\,\frac{\partial U_k}{\partial X_m}(\vec{X}, t) + \sigma_{mj}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\,\frac{\partial U_k}{\partial X_m}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\right]\Delta W_j + \dots$$

so that in the limit $\Delta t \to 0$ we get
$$\text{(S)}\quad dY_k = \left[\frac{\partial U_k}{\partial t} + b_m\,\frac{\partial U_k}{\partial X_m}\right] dt + \sigma_{mj}\,\frac{\partial U_k}{\partial X_m}\, dW_j\,.$$


In other words, the Stratonovich form obeys the regular chain rule of calculus.

Here's another way to think about this: the increment $\Delta W$ is $O(\Delta t^{1/2})$. If we use a discretized version that agrees with the continuous equation up to second order in $\Delta W$ (i.e., up to and including terms that are $O(\Delta W^2)$), then any manipulations are always going to be overall correct to $O(\Delta t)$. By centering the argument of the stochastic increment, the truncation error in comparison with the continuous case is always going to be second order, and thus the regular rules of calculus will be obeyed.

Converting between Itô and Stratonovich versions. It's easy to do this conversion using the discrete forms:
$$\text{(S)}\quad dX_i = b_i(\vec{X}, t)\, dt + \sigma_{ij}(\vec{X}, t)\, dW_j$$
$$\Rightarrow\quad \Delta X_i = b_i(\vec{X}, t)\,\Delta t + \frac{1}{2}\Big[\sigma_{ij}(\vec{X}, t) + \sigma_{ij}(\vec{X} + \Delta\vec{X},\, t + \Delta t)\Big]\,\Delta W_j\,,$$

so expanding we get
$$\Delta X_i = b_i(\vec{X}, t)\,\Delta t + \sigma_{ij}(\vec{X}, t)\,\Delta W_j + \frac{1}{2}\,\frac{\partial \sigma_{ij}}{\partial X_k}(\vec{X}, t)\,\Delta X_k\,\Delta W_j + \dots\,.$$

But $\Delta X_k = \sigma_{km}(\vec{X})\,\Delta W_m + \dots$ and $\Delta W_j\,\Delta W_m = \delta_{jm}\,\Delta t + \dots$, so
$$\Delta X_i = \left[b_i(\vec{X}) + \frac{1}{2}\,\frac{\partial \sigma_{ij}}{\partial X_k}(\vec{X})\,\sigma_{kj}(\vec{X})\right]\Delta t + \sigma_{ij}(\vec{X})\,\Delta W_j + \dots\,.$$

Taking the limit, we therefore get
$$\text{(I)}\quad dX_i = \left[b_i(\vec{X}) + \frac{1}{2}\,\frac{\partial \sigma_{ij}}{\partial X_k}(\vec{X})\,\sigma_{kj}(\vec{X})\right] dt + \sigma_{ij}(\vec{X})\, dW_j\,.$$

Thus, to convert from Stratonovich to Itô only the drift term changes. Similarly, or by just redefining coefficients,
$$\text{(I)}\quad dX_i = b_i(\vec{X})\, dt + \sigma_{ij}(\vec{X})\, dW_j$$
leads to
$$\text{(S)}\quad dX_i = \left[b_i(\vec{X}) - \frac{1}{2}\,\frac{\partial \sigma_{ij}}{\partial X_k}(\vec{X})\,\sigma_{kj}(\vec{X})\right] dt + \sigma_{ij}(\vec{X})\, dW_j\,.$$
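The conversion rule can be exercised numerically. For the Itô equation $dX = -\beta X\, dt + \sigma X\, dW$ (taken up in the next section), the rule gives the Stratonovich drift $(-\beta - \tfrac{1}{2}\sigma^2)X$. The sketch below (NumPy assumed; parameter values are illustrative choices) integrates the converted (S) equation with a Heun predictor-corrector scheme, whose midpoint averaging matches the anticipating Stratonovich sum, and checks that the mean agrees with the Itô result $X_0 e^{-\beta t}$:

```python
import numpy as np

rng = np.random.default_rng(6)
beta, sig, X0, dt, nsteps, trials = 0.5, 0.5, 1.0, 1e-3, 1000, 50000
bS = -beta - 0.5 * sig**2        # (I) drift -beta X  ->  (S) drift bS X

X = np.full(trials, X0)
for _ in range(nsteps):
    dW = np.sqrt(dt) * rng.standard_normal(trials)
    Xp = X + bS * X * dt + sig * X * dW                            # predictor
    X = X + 0.5 * bS * (X + Xp) * dt + 0.5 * sig * (X + Xp) * dW   # Heun corrector

t = nsteps * dt
print(X.mean(), X0 * np.exp(-beta * t))  # Heun on (S) reproduces the Ito mean
```

Integrating the (S) equation with a plain non-anticipating Euler step instead would give the mean $X_0 e^{-(\beta + \sigma^2/2)t}$, i.e., it would solve the wrong (Itô) interpretation of those coefficients.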

8.8 Langevin analysis

One of the advantages of the Itô form of a stochastic differential equation is that the `random' increment $dW$ is independent of the current value of $X$ appearing in the coefficients. This allows means of the equation to be taken; since $\langle dW\rangle = 0$, all of the random terms drop out.


As an example, consider the linear equation

\[
dX = -\beta X\,dt + \sigma\,dW .
\]

First of all, we can solve this equation exactly. If we let \(Y = X e^{\beta t}\), we get

\[
dY = \beta X e^{\beta t}\,dt + e^{\beta t}\,dX = \sigma e^{\beta t}\,dW .
\]

Integrating, we find
\[
Y = X_0 + \sigma \int_0^t e^{\beta s}\,dW \quad\Rightarrow\quad X = X_0 e^{-\beta t} + \sigma e^{-\beta t} \int_0^t e^{\beta s}\,dW .
\]

We know that \(\int_0^t e^{\beta s}\,dW\) is a zero-mean Gaussian random variable with variance \(\int_0^t e^{2\beta s}\,ds = (e^{2\beta t} - 1)/(2\beta)\), so
\[
\langle X \rangle = X_0 e^{-\beta t}
\]
and
\[
\mathrm{Var}[X] = \frac{\sigma^2}{2\beta}\left(1 - e^{-2\beta t}\right).
\]

Since the original equation is in Itô form, however, it’s possible to obtain these results without actually solving the equation. Taking means of the X equation, we have

\[
d\langle X\rangle = -\beta \langle X\rangle\,dt \quad\Rightarrow\quad \langle X\rangle = X_0 e^{-\beta t} .
\]

Similarly, we can compute
\[
d(X^2) = 2X\,dX + dX^2 = \left(-2\beta X^2 + \sigma^2\right)dt + 2\sigma X\,dW .
\]

Taking averages, we then have
\[
d\langle X^2\rangle = -2\beta\langle X^2\rangle\,dt + \sigma^2\,dt \quad\Rightarrow\quad \langle X^2\rangle = \frac{\sigma^2}{2\beta}\left(1 - e^{-2\beta t}\right) + X_0^2\, e^{-2\beta t} .
\]

Thus,
\[
\mathrm{Var}[X] = \frac{\sigma^2}{2\beta}\left(1 - e^{-2\beta t}\right).
\]

These results allow a check on the numerics. Figure 8.1 shows one random solution of this equation, and Figure 8.2 shows the mean and variance of 100,000 trials.
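These moment formulas make a convenient unit test for a simulation. A minimal sketch (our own code, not the notes’; the parameter values are illustrative) that compares an Euler-Maruyama ensemble against the exact mean and variance:

```python
import numpy as np

# Monte Carlo check of the Langevin results for dX = -beta X dt + sigma dW
rng = np.random.default_rng(0)
beta, sigma, X0 = 0.5, 1.0, 2.0
T, dt, n_paths = 5.0, 0.01, 20000

# advance all paths at once; each step adds an independent N(0, dt) increment
X = np.full(n_paths, X0)
for _ in range(int(T / dt)):
    X += -beta * X * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

mean_exact = X0 * np.exp(-beta * T)                              # X0 e^{-beta t}
var_exact = sigma**2 / (2 * beta) * (1 - np.exp(-2 * beta * T))  # sigma^2 (1 - e^{-2 beta t}) / (2 beta)
print(X.mean() - mean_exact, X.var() - var_exact)                # both differences should be small
```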

As a second example, consider the equation

\[
dX = -\beta X\,dt + \sigma X\,dW .
\]


Figure 8.1: One solution of the equation \(dX = -\beta X\,dt + \sigma\,dW\) for \(\beta = 0.5\) and \(\sigma = 1.0\).

Figure 8.2: Numerically computed mean (blue) and variance (green) of \(dX = -\beta X\,dt + \sigma\,dW\) for \(\beta = 0.5\) and \(\sigma = 1.0\).

In this case, if we let \(Y = \ln X\), we have
\[
dY = \frac{1}{X}\,dX - \frac{1}{2X^2}\,\sigma^2 X^2\,dt = -\left(\beta + \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW ,
\]
which gives
\[
X = X_0 \exp\left[-(\beta + \sigma^2/2)\,t + \sigma W(t)\right].
\]

Another way to solve this is to convert the Itô equation for X into Stratonovich form and integrate it using standard methods, i.e., use

(S)
\[
dX = \left[b(X) - \frac{1}{2}\,\frac{\partial \sigma}{\partial X}(X)\,\sigma(X)\right]dt + \sigma(X)\,dW
\]


Figure 8.3: One solution of the equation \(dX = -\beta X\,dt + \sigma X\,dW\) for \(\beta = 0.0\) and \(\sigma = 0.5\).

with \(b(X) = -\beta X\) and \(\sigma(X) = \sigma X\). Therefore

(S)
\[
dX = -\left(\beta + \tfrac{1}{2}\sigma^2\right) X\,dt + \sigma X\,dW ,
\]

and integrating now using the regular rules of calculus we get the same solution.

As in the previous example, it’s probably easier to determine the moments

\[
d\langle X\rangle = -\beta\langle X\rangle\,dt \quad\Rightarrow\quad \langle X\rangle = X_0 e^{-\beta t} ,
\]

and
\[
d(X^2) = 2X\,dX + dX^2 = \left(-2\beta + \sigma^2\right) X^2\,dt + 2\sigma X^2\,dW ,
\]
which gives
\[
d\langle X^2\rangle = \left(-2\beta + \sigma^2\right)\langle X^2\rangle\,dt \quad\Rightarrow\quad \langle X^2\rangle = X_0^2\, e^{-2\beta t + \sigma^2 t} .
\]

Therefore,
\[
\mathrm{Var}[X] = X_0^2\, e^{-2\beta t}\left(e^{\sigma^2 t} - 1\right).
\]

In this case, the variance can grow even if the mean decays (if \(\sigma^2 > 2\beta\)). Figure 8.3 shows one random solution of this equation, and Figure 8.4 shows the mean and variance of 100,000 trials.
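The \(\sigma^2 > 2\beta\) regime is easy to see directly from these formulas. A quick numerical illustration (the function name and parameter values are ours):

```python
import numpy as np

def moments(X0, beta, sigma, t):
    """Exact mean and variance of dX = -beta X dt + sigma X dW,
    from the Langevin analysis above."""
    mean = X0 * np.exp(-beta * t)
    var = X0**2 * np.exp(-2.0 * beta * t) * (np.exp(sigma**2 * t) - 1.0)
    return mean, var

# with sigma^2 = 0.64 > 2 beta = 0.2 the mean decays while the variance grows
m1, v1 = moments(1.0, 0.1, 0.8, 1.0)
m5, v5 = moments(1.0, 0.1, 0.8, 5.0)
```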

8.9 Itô-Taylor expansions

Suppose we have a general Itô stochastic differential equation, such as

\[
dX = b(X, t)\,dt + \sigma(X, t)\,dW ,
\]


Figure 8.4: Numerically computed mean (blue) and variance (green) of \(dX = -\beta X\,dt + \sigma X\,dW\) for \(\beta = 0.0\) and \(\sigma = 0.5\).

or, in integrated form,
\[
X(t) = X(t_0) + \int_{t_0}^{t} b(X, s)\,ds + \int_{t_0}^{t} \sigma(X, s)\,dW(s) .
\]

We would like to approximate this with finite differences. The main difficulty is that we have to deal with finite approximations of the process W(t) a little differently.

To deal with this, we will iterate the above integral form using what’s called an Itô-Taylor expansion. The main idea is to write integrated forms for \(b(X, t)\) and \(\sigma(X, t)\) in the above. In general, if \(Y = f(X, t)\), from Itô’s lemma we have

\[
dY = \left(\frac{\partial f}{\partial t} + b\,\frac{\partial f}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial X^2}\right)dt + \sigma\,\frac{\partial f}{\partial X}\,dW ,
\]

which means

\[
Y(t) = Y(t_0) + \int_{t_0}^{t} \left(\frac{\partial f}{\partial t} + b\,\frac{\partial f}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial X^2}\right)ds + \int_{t_0}^{t} \sigma\,\frac{\partial f}{\partial X}\,dW(s) .
\]

First, we use \(f(X, t) = b(X, t)\), so that we have

\[
b(X, t) = b(X(t_0), t_0) + \int_{t_0}^{t} \left(\frac{\partial b}{\partial t} + b\,\frac{\partial b}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 b}{\partial X^2}\right)ds + \int_{t_0}^{t} \sigma\,\frac{\partial b}{\partial X}\,dW(s) .
\]

Next, we use \(f(X, t) = \sigma(X, t)\), so that

\[
\sigma(X, t) = \sigma(X(t_0), t_0) + \int_{t_0}^{t} \left(\frac{\partial \sigma}{\partial t} + b\,\frac{\partial \sigma}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 \sigma}{\partial X^2}\right)ds + \int_{t_0}^{t} \sigma\,\frac{\partial \sigma}{\partial X}\,dW(s) .
\]


Figure 8.5: Example solution with the Euler-Maruyama method

Substituting these expansions into the equation for \(X(t)\), we get

\[
\begin{aligned}
X(t) = X(t_0) &+ \int_{t_0}^{t} b(X(t_0), t_0)\,ds + \int_{t_0}^{t} \sigma(X(t_0), t_0)\,dW(s) \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \sigma\,\frac{\partial \sigma}{\partial X}\,dW(s')\,dW(s) \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \sigma\,\frac{\partial b}{\partial X}\,dW(s')\,ds + \int_{t_0}^{t}\!\int_{t_0}^{s} \left(\frac{\partial \sigma}{\partial t} + b\,\frac{\partial \sigma}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 \sigma}{\partial X^2}\right)ds'\,dW(s) \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \left(\frac{\partial b}{\partial t} + b\,\frac{\partial b}{\partial X} + \frac{1}{2}\sigma^2\,\frac{\partial^2 b}{\partial X^2}\right)ds'\,ds .
\end{aligned}
\]

In the above, when \(t - t_0\) is small, \(O(\Delta t)\), the first and the last terms are deterministic, \(O(\Delta t)\) and \(O(\Delta t^2)\), respectively. The other terms are random; the first is \(O(\Delta t^{1/2})\), the second is \(O(\Delta t)\), and the remaining two are \(O(\Delta t^{3/2})\). The advantage of doing this, of course, is that the integrands of the first two terms are constant, and thus those particular integrals can be done explicitly.

8.10 Euler-Maruyama method

If we keep the largest deterministic and random terms, doing the integrals we have

\[
X(t) = X(t_0) + b(X(t_0), t_0)\,(t - t_0) + \sigma(X(t_0), t_0)\left[W(t) - W(t_0)\right].
\]

This is the Euler-Maruyama method. The deterministic error per step is \(O(\Delta t^2)\) and the random error per step is \(O(\Delta t)\).

Figure 8.5 shows an example of the exact (solid) and numerical approximation (dashed) to the solution of the equation \(dX = -\beta X\,dt + \sigma X\,dW\) for \(\beta = -0.25\) and \(\sigma = 0.5\) with a stepsize


Figure 8.6: Example discretized version (dashed) of the Wiener process (solid)

of 0.125. To compute this solution, we have approximated the Wiener process W(t) with a stepsize of 0.125 as well; this is shown in Figure 8.6. Note that in this latter figure, we have first generated a Wiener process using a very small step size, and then sampled it at the larger step size. We will see shortly why this is done when we discuss convergence.

The numerical solution does not really have to be constructed in this way, however; if we are only interested in the solution for one particular step size (e.g., 0.125), we can use the result that the random increments \(W(t + \Delta t) - W(t)\) are i.i.d. Gaussian random variables with mean zero and variance \(\Delta t\).
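This second construction is a few lines of code. A sketch (the function name is ours; the example parameters match Figure 8.5):

```python
import numpy as np

def euler_maruyama(b, sigma, X0, T, dt, rng):
    """Euler-Maruyama for dX = b(X, t) dt + sigma(X, t) dW, returning the full path.
    The increments W(t + dt) - W(t) are drawn directly as independent N(0, dt)."""
    n = int(round(T / dt))
    X = np.empty(n + 1)
    X[0] = X0
    for i in range(n):
        t = i * dt
        dW = np.sqrt(dt) * rng.standard_normal()
        X[i + 1] = X[i] + b(X[i], t) * dt + sigma(X[i], t) * dW
    return X

# example: the linear SDE of Figure 8.5 with beta = -0.25, sigma = 0.5
path = euler_maruyama(lambda x, t: 0.25 * x, lambda x, t: 0.5 * x,
                      1.0, 10.0, 0.125, np.random.default_rng(0))
```

With \(\sigma \equiv 0\) this reduces to the usual deterministic Euler method.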

8.11 Strong vs. weak convergence

To discuss convergence, we first have to decide what type of convergence we want. Because solutions are random, it makes sense to talk about the average error in some sense. We can either compute the error first and then average it, or we can find the average solution first and then compute the error. These are not the same.

First of all, suppose we compute the error first and average, i.e., we approximate

\[
E\left[\,\left|X_{\text{num}}(t) - X_{\text{exact}}(t)\right|\,\right].
\]

This is known as strong convergence of the SDE. For any particular time step, we first construct a Wiener process using that time step, and then construct the numerical solution using the Euler-Maruyama formula. At a desired final time we can compare the numerical solution with the exact solution (assuming that we know the exact solution) and compute the error. The process is repeated, and the errors averaged.

To test the convergence of the method, though, we want to reduce the step size. Unfortunately, we can’t just generate a new Wiener process with a smaller step size, as the new Wiener process will not coincide with the previous one at the overlapping time steps unless we do something


special. One way to deal with this issue is to first generate a Wiener process with a very tiny time step, and then merely sample that Wiener process with different time steps: first with a large time step (as in Figure 8.6), and then with successively smaller values.

As an example, we consider again the SDE

\[
dX = -\beta X\,dt + \sigma X\,dW ,
\]

for which we know the exact answer

\[
X = X_0 \exp\left[-(\beta + \sigma^2/2)\,t + \sigma W(t)\right].
\]

If we solve this problem numerically 1,000 times using the Euler-Maruyama method with \(\beta = 0\) and \(\sigma = 0.5\), and take the average of the absolute value of the error between the numerical and exact solutions, we get the following:

For a stepsize of 0.12500 average strong error is 0.55174
For a stepsize of 0.06250 average strong error is 0.29705
For a stepsize of 0.03125 average strong error is 0.23295
For a stepsize of 0.01562 average strong error is 0.16959
For a stepsize of 0.00781 average strong error is 0.10437
For a stepsize of 0.00391 average strong error is 0.08480
For a stepsize of 0.00195 average strong error is 0.05813

From this it’s easy to see that the strong (pathwise) error is proportional to \(\sqrt{\Delta t}\) as \(\Delta t \to 0\); if the time step decreases by a factor of 4, the average error only decreases by roughly a factor of 2. A plot of this convergence is shown in Fig. 8.7.

Alternatively, we can look at the weak convergence of the numerical solutions, i.e., we approximate
\[
\left| E[X_{\text{num}}(t)] - E[X_{\text{exact}}(t)] \right| .
\]

In this case, we average the numerical solutions along different paths to get the mean, and compare it to the exact result for the mean (assuming it’s known).

For example, let’s again consider the previous SDE; we recall from the Langevin analysis that

\[
E[X(t)] = X_0 e^{-\beta t} .
\]

Performing 100,000 sample paths using \(\beta = 0.5\) and \(\sigma = 0.75\), we get the results shown in the table below, and in Fig. 8.8, for the weak error:


Figure 8.7: Plot showing strong convergence of the Euler-Maruyama method. The curves \(\Delta t\) and \(\Delta t^{1/2}\) are shown for comparison.

For a stepsize of 0.50000 average weak error is 0.51830
For a stepsize of 0.25000 average weak error is 0.27411
For a stepsize of 0.12500 average weak error is 0.11990
For a stepsize of 0.06250 average weak error is 0.06083
For a stepsize of 0.03125 average weak error is 0.01273
For a stepsize of 0.01562 average weak error is 0.04439
For a stepsize of 0.00781 average weak error is 0.01378

Here it appears that the average weak error is proportional to \(\Delta t\), at least for a while. The difficulty here is that in addition to the numerical errors, there are also variations because the averaging is not perfect, even with 100,000 samples. Once the numerical error drops to a size comparable with the statistical error associated with the number of samples used, the estimate for the average weak error becomes unreliable. To go farther down along the convergence curve, more samples are needed, and because the statistical convergence goes like \(1/\sqrt{N}\), one needs a factor of 100 more samples to see another factor of 10 in convergence.

8.12 Milstein’s method

If we take the next biggest term in the Itô-Taylor expansion,
\[
\int_{t_0}^{t}\!\int_{t_0}^{s} \sigma\,\frac{\partial \sigma}{\partial X}\,dW(s')\,dW(s) ,
\]

and expand this as well, it becomes

\[
\sigma\sigma_X\big|_{t_0} \int_{t_0}^{t}\!\int_{t_0}^{s} dW(s')\,dW(s) + \int_{t_0}^{t}\!\int_{t_0}^{s}\!\int_{t_0}^{s'} \sigma\left(\sigma\sigma_X\right)_X\,dW(s'')\,dW(s')\,dW(s) + O(\Delta t^2) .
\]


Figure 8.8: Plot showing weak convergence of the Euler-Maruyama method. The curves \(\Delta t\) and \(\Delta t^{1/2}\) are shown for comparison.

We can evaluate the first integral,
\[
\int_{t_0}^{t}\!\int_{t_0}^{s} dW(s')\,dW(s) = \frac{1}{2}\left\{\left[W(t) - W(t_0)\right]^2 - (t - t_0)\right\} .
\]

Adding this term to the approximation, we get
\[
X(t) = X\big|_{t_0} + b\big|_{t_0}\,\Delta t + \sigma\big|_{t_0}\,\Delta W + \frac{1}{2}\,\sigma\sigma_X\big|_{t_0}\left(\Delta W^2 - \Delta t\right).
\]

This is known as Milstein’s method. It has a strong (pathwise) error that is \(O(\Delta t)\). As an example, below is a table of average strong errors for \(dX = -\beta X\,dt + \sigma X\,dW\) with \(\beta = 0.5\), \(\sigma = 0.5\) and a final time \(t = 1\):

For a stepsize of 0.50000 average strong error is 0.24905
For a stepsize of 0.25000 average strong error is 0.11149
For a stepsize of 0.12500 average strong error is 0.05519
For a stepsize of 0.06250 average strong error is 0.02770
For a stepsize of 0.03125 average strong error is 0.01320
For a stepsize of 0.01562 average strong error is 0.00654
For a stepsize of 0.00781 average strong error is 0.00330

It’s easy to see that the strong error is now proportional to \(\Delta t\). Furthermore, the average weak error associated with Milstein’s method is also proportional to \(\Delta t\):

For a stepsize of 0.50000 average weak error is 0.87221


For a stepsize of 0.25000 average weak error is 0.43739
For a stepsize of 0.12500 average weak error is 0.21237
For a stepsize of 0.06250 average weak error is 0.11215
For a stepsize of 0.03125 average weak error is 0.03021
For a stepsize of 0.01562 average weak error is 0.00616
For a stepsize of 0.00781 average weak error is 0.01981

Again, the weak error at the end here is not very accurate because not enough independent samples were taken.
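A single Milstein step is only one extra term on top of Euler-Maruyama. A sketch for a scalar autonomous SDE (our own function; the derivative \(\sigma_X\) is passed in explicitly):

```python
import numpy as np

def milstein_step(X, b, sig, sig_x, dt, dW):
    """One Milstein step for dX = b(X) dt + sigma(X) dW:
    adds (1/2) sigma sigma_X (dW^2 - dt) to the Euler-Maruyama update."""
    return (X + b(X) * dt + sig(X) * dW
            + 0.5 * sig(X) * sig_x(X) * (dW**2 - dt))

# example step for dX = -0.5 X dt + 0.5 X dW
X1 = milstein_step(1.0, lambda x: -0.5 * x, lambda x: 0.5 * x,
                   lambda x: 0.5, 0.01, 0.2)
```

For \(\sigma(X) = \sigma X\) the correction term is \(\frac{1}{2}\sigma^2 X(\Delta W^2 - \Delta t)\); for constant \(\sigma\) it vanishes, so in the additive-noise case Milstein reduces to Euler-Maruyama.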

8.13 Methods for multidimensional systems

Let’s consider again a system of stochastic differential equations:

\[
dX_i = b_i(\vec X, t)\,dt + \sigma_{ij}(\vec X, t)\,dW_j .
\]

In integral form this is

\[
X_i(t) = X_i(t_0) + \int_{t_0}^{t} b_i(\vec X(s), s)\,ds + \int_{t_0}^{t} \sigma_{ij}(\vec X(s), s)\,dW_j(s) .
\]

As before, we apply the Itô formula to get equations for \(b_i(\vec X(t), t)\) and \(\sigma_{ij}(\vec X(t), t)\) and then substitute the integral forms of these equations into the one for \(X_i\). Keeping the biggest terms, we get

\[
\begin{aligned}
X_i(t) = X_i(t_0) &+ \int_{t_0}^{t} b_i(\vec X(t_0), t_0)\,ds + \int_{t_0}^{t} \sigma_{ij}(\vec X(t_0), t_0)\,dW_j(s) \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \frac{\partial \sigma_{ij}}{\partial X_k}(\vec X(t_0), t_0)\,\sigma_{k\ell}(\vec X(t_0), t_0)\,dW_\ell(s')\,dW_j(s) + \dots .
\end{aligned}
\]

The first two terms work as before; we have
\[
\int_{t_0}^{t} ds = t - t_0 \qquad\text{and}\qquad \int_{t_0}^{t} dW_j(s) = W_j(t) - W_j(t_0) .
\]

Therefore, the first terms of the numerical scheme give
\[
X_i(t) = X_i(t_0) + b_i(\vec X(t_0), t_0)\,(t - t_0) + \sigma_{ij}(\vec X(t_0), t_0)\left[W_j(t) - W_j(t_0)\right] + \dots ,
\]

which is the Euler-Maruyama method, so it works for systems, too.
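In vector form the step is essentially one line. A sketch (names are ours), with b returning an n-vector, sigma an n-by-m matrix, and dW an m-vector of \(N(0, \Delta t)\) increments:

```python
import numpy as np

def em_step_system(X, b, sigma, t, dt, dW):
    """One Euler-Maruyama step for dX_i = b_i(X, t) dt + sigma_ij(X, t) dW_j."""
    return X + b(X, t) * dt + sigma(X, t) @ dW

# 2-dimensional example with diagonal noise
X1 = em_step_system(np.array([1.0, 2.0]),
                    lambda X, t: -X,
                    lambda X, t: np.diag([0.1, 0.2]),
                    0.0, 0.5, np.array([0.3, -0.1]))
```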


If we try to evaluate the next term (in order to get a Milstein method for systems), however, we encounter a difficulty when we try to evaluate
\[
\int_{t_0}^{t}\!\int_{t_0}^{s} dW_\ell(s')\,dW_j(s) .
\]

If \(\ell = j\) this is the same integral that we calculated before (it evaluates to \(\frac{1}{2}(\Delta W_j^2 - \Delta t)\)), but if \(\ell \ne j\) we divide the interval \(\Delta t\) up into \(N\) smaller intervals \(\delta t\) and try to evaluate
\[
\lim_{\delta t \to 0}\; \delta t \sum_{n=1}^{N} Z_\ell^{(n)} \sum_{m=1}^{n} Z_j^{(m)} ,
\]

where \(Z_\ell^{(n)}\) and \(Z_j^{(m)}\) are i.i.d. standard normal random variables. Written out, this is

\[
\delta t \left[ Z_\ell^{(1)} Z_j^{(1)} + Z_\ell^{(2)}\left(Z_j^{(1)} + Z_j^{(2)}\right) + Z_\ell^{(3)}\left(Z_j^{(1)} + Z_j^{(2)} + Z_j^{(3)}\right) + \dots + Z_\ell^{(N)}\left(Z_j^{(1)} + Z_j^{(2)} + \dots + Z_j^{(N)}\right) \right].
\]

The problem now is that products of Gaussians are not Gaussian, and we are adding up terms that clearly will have complicated correlations with one another. It turns out that it is not possible to evaluate this limit in terms of a simple distribution that we can sample; in fact, determining the correct random increments when \(\ell\) and \(j\) are different requires computing a finite truncation of an infinite sum, where more and more terms are needed as \(\Delta t\) becomes smaller. For this reason, the multidimensional case is much more complicated past the Euler-Maruyama approximation than the case when only a single random increment \(\Delta W\) is present.
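The contrast with the \(\ell = j\) case is visible in the discrete sums themselves: for \(\ell = j\) the double sum collapses by the exact algebraic identity \(\left(\sum_n \delta W_n\right)^2 = \sum_n \delta W_n^2 + 2\sum_{m<n} \delta W_m \delta W_n\), which is why that case has the closed form \(\frac{1}{2}(\Delta W^2 - \Delta t)\) in the limit; no such collapse happens for \(\ell \ne j\). A quick check of the identity (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
dt_small = 1e-3
dW = np.sqrt(dt_small) * rng.standard_normal(1000)

# discrete double Ito sum for l = j: sum over n of (sum_{m<n} dW_m) * dW_n
W_prev = np.concatenate(([0.0], np.cumsum(dW)[:-1]))
double_sum = np.sum(W_prev * dW)

# exact algebraic identity: ((sum dW)^2 - sum dW^2) / 2,
# which tends to (DW^2 - Dt)/2 since sum dW^2 -> Dt as dt_small -> 0
closed_form = 0.5 * (dW.sum() ** 2 - np.sum(dW ** 2))
```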

8.14 Higher order methods

When there is only a single random increment dW present, the previous analysis can be extended to higher order. To simplify things a little, let’s consider this for the special case

\[
dX_i = b_i(\vec X)\,dt + \sigma_i\,dW .
\]

Here we are assuming that \(\vec b(\vec X)\) does not specifically depend upon t, that \(\vec\sigma\) is constant, and that there is only a single random increment dW. Then

\[
X_i(t) = X_i(t_0) + \int_{t_0}^{t} b_i(\vec X(s))\,ds + \int_{t_0}^{t} \sigma_i\,dW(s) .
\]

Right away we observe that the last term in the above can be integrated exactly; no additional approximations are needed. In addition, if we apply the Itô formula to \(b_i\), we get

\[
\begin{aligned}
db_i &= \frac{\partial b_i}{\partial X_j}\,dX_j + \frac{1}{2}\,\frac{\partial^2 b_i}{\partial X_j \partial X_k}\,dX_j\,dX_k + \dots \\
&= \left(\frac{\partial b_i}{\partial X_j}\,b_j + \frac{1}{2}\,\frac{\partial^2 b_i}{\partial X_j \partial X_k}\,\sigma_j \sigma_k\right)dt + \frac{\partial b_i}{\partial X_j}\,\sigma_j\,dW .
\end{aligned}
\]


If we construct the integral form of this equation and substitute it into the equation for \(X_i(t)\), we get

\[
\begin{aligned}
X_i(t) = X_i(t_0) &+ b_i(\vec X(t_0))\,\Delta t + \sigma_i\,\Delta W \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \left(\frac{\partial b_i}{\partial X_j}\,b_j + \frac{1}{2}\,\frac{\partial^2 b_i}{\partial X_j \partial X_k}\,\sigma_j \sigma_k\right)ds'\,ds \\
&+ \int_{t_0}^{t}\!\int_{t_0}^{s} \frac{\partial b_i}{\partial X_j}\,\sigma_j\,dW(s')\,ds .
\end{aligned}
\]

The first line is just the Euler-Maruyama method again. The second line is a deterministic term, and if we iterate again (using the Itô-Taylor formula on the term in parentheses), it becomes

\[
\frac{1}{2}\left(\frac{\partial b_i}{\partial X_j}\,b_j + \frac{1}{2}\,\frac{\partial^2 b_i}{\partial X_j \partial X_k}\,\sigma_j \sigma_k\right)\bigg|_{t_0}\,\Delta t^2 .
\]

This term, of course, is just the next term in the Taylor series of the deterministic part of the equation.

The last line is random, and if we iterate using the Itô-Taylor formula its leading-order approximation becomes

\[
\frac{\partial b_i}{\partial X_j}\,\sigma_j\,\bigg|_{t_0} \int_{t_0}^{t}\!\int_{t_0}^{s} dW(s')\,ds .
\]

We therefore have to figure out the value of
\[
\int_{t_0}^{t}\!\int_{t_0}^{s} dW(s')\,ds = \int_{t_0}^{t} \left[W(s) - W(t_0)\right] ds = \int_{t_0}^{t} \Delta W(s)\,ds .
\]

It turns out to be easiest to simultaneously work with the other term of the same size,
\[
\int_{t_0}^{t}\!\int_{t_0}^{s} ds'\,dW(s) = \int_{t_0}^{t} (s - t_0)\,dW(s) = \int_{t_0}^{t} \Delta s\,dW(s) .
\]

First of all, the sum of these two terms is
\[
\int_{t_0}^{t} \Delta W(s)\,ds + \int_{t_0}^{t} \Delta s\,dW(s) = \int_{t_0}^{t} d\left(\Delta W(s)\,\Delta s\right) = \Delta W(t)\,\Delta t .
\]

In addition, we know that
\[
\Delta Z = \int_{t_0}^{t} \Delta s\,dW(s)
\]
is a Gaussian random variable with mean zero and variance
\[
\int_{t_0}^{t} \Delta s^2\,ds = \frac{1}{3}\,\Delta t^3 .
\]


The one additional thing we have to figure out, however, is whether \(\Delta Z\) and \(\Delta W\) are correlated. Since both \(\Delta Z\) and \(\Delta W\) have zero mean, all we need to do is determine

\[
\lim_{\delta t \to 0} E\left[\Delta Z\,\Delta W\right].
\]

Going back to definitions, this is

\[
E\left[\sum_m (s_m - t_0)\,\delta W_m(s_m) \sum_n \delta W_n(s_n)\right] = \sum_n (s_n - t_0)\,\delta s = \frac{1}{2}\,(t - t_0)^2 = \frac{1}{2}\,\Delta t^2 ,
\]

where we have used \(E[\delta W_m(s_m)\,\delta W_n(s_n)] = \delta s\,\delta_{mn}\).

So, if
\[
\Delta W = \sqrt{\Delta t}\,Z_1 ,
\]

where Z1 is a zero mean, unit variance Gaussian R.V., we can try

\[
\Delta Z = a Z_1 + b Z_2 ,
\]

where Z2 is an independent zero mean, unit variance Gaussian R.V. We then need

\[
\mathrm{Var}(\Delta Z) = a^2 + b^2 = \frac{1}{3}\,\Delta t^3
\]
and
\[
E[\Delta Z\,\Delta W] = a\sqrt{\Delta t} = \frac{1}{2}\,\Delta t^2 .
\]

Therefore we have \(a = \frac{1}{2}\Delta t^{3/2}\) and then \(b^2 = \frac{1}{3}\Delta t^3 - \frac{1}{4}\Delta t^3 = \frac{1}{12}\Delta t^3\). As a result,
\[
\Delta Z = \frac{1}{2}\,\Delta t^{3/2}\left(Z_1 + \frac{1}{\sqrt{3}}\,Z_2\right).
\]
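The moment matching in the last few lines is worth verifying; a two-line check (our own arithmetic, mirroring the derivation):

```python
import numpy as np

dt = 0.1
a = 0.5 * dt ** 1.5                  # coefficient of Z1 in dZ = a Z1 + b Z2
b = dt ** 1.5 / np.sqrt(12.0)        # coefficient of Z2

var_dZ = a ** 2 + b ** 2             # reproduces Var(dZ) = dt^3 / 3
cov_dZ_dW = a * np.sqrt(dt)          # reproduces E[dZ dW] = dt^2 / 2, with dW = sqrt(dt) Z1
```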

Finally, the coefficient we are looking for is
\[
\int_{t_0}^{t} \Delta W(s)\,ds = \Delta W\,\Delta t - \Delta Z = \frac{1}{2}\,\Delta t^{3/2}\left(Z_1 - \frac{1}{\sqrt{3}}\,Z_2\right) = \frac{1}{2}\,\Delta W\,\Delta t - \frac{1}{\sqrt{12}}\,\Delta t^{3/2} Z_2 .
\]

Therefore, finally,
\[
\begin{aligned}
X_i(t) = X_i(t_0) &+ b_i(\vec X(t_0))\,\Delta t + \sigma_i\,\Delta W + \frac{1}{2}\left(\frac{\partial b_i}{\partial X_j}\,b_j + \frac{1}{2}\,\frac{\partial^2 b_i}{\partial X_j \partial X_k}\,\sigma_j \sigma_k\right)\bigg|_{t_0}\,\Delta t^2 \\
&+ \frac{\partial b_i}{\partial X_j}\,\sigma_j\,\bigg|_{t_0}\left(\frac{1}{2}\,\Delta W\,\Delta t - \frac{1}{\sqrt{12}}\,\Delta t^{3/2} Z_2\right).
\end{aligned}
\]


By direct analogy with the Euler-Maruyama method, the weak error of this method is \(O(\Delta t^2)\) and the strong error is \(O(\Delta t^{3/2})\).
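For constant \(\sigma\) the scheme above is straightforward to implement; here is a scalar sketch (our own function; Z1 and Z2 are the two independent standard normals from the \(\Delta Z\) construction):

```python
import numpy as np

def order_15_step_additive(X, b, b_x, b_xx, sig, dt, Z1, Z2):
    """One step of the strong order-3/2 scheme for dX = b(X) dt + sigma dW,
    sigma constant (scalar version of the formula above)."""
    dW = np.sqrt(dt) * Z1
    drift2 = b_x(X) * b(X) + 0.5 * b_xx(X) * sig ** 2       # b b_X + (1/2) sigma^2 b_XX
    coeff = 0.5 * dW * dt - dt ** 1.5 * Z2 / np.sqrt(12.0)  # the integral of DW(s) ds
    return X + b(X) * dt + sig * dW + 0.5 * drift2 * dt ** 2 + b_x(X) * sig * coeff

# with sigma = 0 this reduces to the second-order deterministic Taylor step
X1 = order_15_step_additive(1.0, lambda x: -x, lambda x: -1.0, lambda x: 0.0,
                            0.0, 0.1, 0.7, -0.3)
```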

There are additional terms that need to be added if \(\sigma\) depends upon X, however. In the scalar case, the full method is

\[
\begin{aligned}
X(t) = X(t_0) &+ b(X(t_0))\,\Delta t + \sigma(X(t_0))\,\Delta W \\
&+ \sigma\sigma_X\big|_{t_0}\,\frac{1}{2}\left(\Delta W^2 - \Delta t\right) + \sigma\left(\sigma\sigma_X\right)_X\big|_{t_0}\,\frac{1}{2}\,\Delta W\left(\frac{1}{3}\,\Delta W^2 - \Delta t\right) \\
&+ \sigma b_X\big|_{t_0}\left(\frac{1}{2}\,\Delta W\,\Delta t - \frac{1}{\sqrt{12}}\,\Delta t^{3/2} Z_2\right) \\
&+ \left(\sigma_t + b\,\sigma_X + \frac{1}{2}\,\sigma^2 \sigma_{XX}\right)\bigg|_{t_0}\left(\frac{1}{2}\,\Delta W\,\Delta t + \frac{1}{\sqrt{12}}\,\Delta t^{3/2} Z_2\right) \\
&+ \frac{1}{2}\left(b_t + b\,b_X + \frac{1}{2}\,\sigma^2 b_{XX}\right)\bigg|_{t_0}\,\Delta t^2 + \dots .
\end{aligned}
\]

This can be continued if necessary, of course.
