AN INTRODUCTION TO STOCHASTIC CALCULUS

Marta Sanz-Solé
Facultat de Matemàtiques i Informàtica
Universitat de Barcelona

October 15, 2017


Contents

1 A review of the basics on stochastic processes
  1.1 The law of a stochastic process
  1.2 Sample paths

2 The Brownian motion
  2.1 Equivalent definitions of Brownian motion
  2.2 A construction of Brownian motion
  2.3 Path properties of Brownian motion
  2.4 The martingale property of Brownian motion
  2.5 Markov property

3 Ito's calculus
  3.1 Ito's integral
  3.2 The Ito integral as a stochastic process
  3.3 An extension of the Ito integral
  3.4 A change of variables formula: Ito's formula
    3.4.1 One-dimensional Ito's formula
    3.4.2 Multidimensional version of Ito's formula

4 Applications of the Ito formula
  4.1 Burkholder-Davis-Gundy inequalities
  4.2 Representation of L² Brownian functionals
  4.3 Girsanov's theorem

5 Local time of Brownian motion and Tanaka's formula

6 Stochastic differential equations
  6.1 Examples of stochastic differential equations
  6.2 A result on existence and uniqueness of solution
  6.3 Some properties of the solution
  6.4 Markov property of the solution

7 Numerical approximations of stochastic differential equations

8 Continuous time martingales
  8.1 Doob's inequalities for martingales
  8.2 Quadratic variation of a continuous martingale

9 Stochastic integrals with respect to continuous martingales

10 Appendix 1: Conditional expectation

11 Appendix 2: Stopping times


1 A review of the basics on stochastic processes

This chapter introduces the notion of a stochastic process and some general definitions related to it. For a more complete account of the topic, we refer the reader to [12]. Let us start with a definition.

Definition 1.1 A stochastic process with state space S is a family {X_i, i ∈ I} of random variables X_i : Ω → S indexed by a set I.

For successful progress in the analysis of such an object, some further structure on the index set I and on the state space S is required. In this course we shall mainly deal with the particular cases I = N, Z_+, R_+, and S either a countable set or a subset of R^d, d ≥ 1.

The basic problem statisticians are interested in is the analysis of the probability law (mostly described by some parameters) of characters exhibited by populations. For a fixed character described by a random variable X, they use a finite number of independent copies of X, a sample of X. For many purposes it is interesting to have samples of any size, and therefore to consider sequences {X_n, n ≥ 1}. It is important here to insist on the word copies, meaning that the circumstances around the different outcomes of X do not change. It is a static world. Hence, they deal with stochastic processes {X_n, n ≥ 1} consisting of independent and identically distributed random variables.

This is not the setting we are interested in here. Instead, we would like to give stochastic models for phenomena of the real world which evolve as time goes by. Stochasticity is a choice that accounts for incomplete knowledge and extreme complexity. Evolution, in contrast with statics, is what we observe in most phenomena in Physics, Chemistry, Biology, Economics, Life Sciences, etc.

Stochastic processes are well suited for modeling stochastic evolution phenomena. The interesting cases correspond to families of random variables X_i which are not independent. In fact, the famous classes of stochastic processes are described by means of types of dependence between the variables of the process.

1.1 The law of a stochastic process

The probabilistic features of a stochastic process are gathered in the joint distributions of its variables, as given in the next definition.


Definition 1.2 The finite-dimensional joint distributions of the process {X_i, i ∈ I} consist of the multi-dimensional probability laws of the random vectors (X_{i_1}, ..., X_{i_m}), where i_1, ..., i_m ∈ I and m ≥ 1 is arbitrary.

Let us give an important example.

Example 1.1 A stochastic process {X_t, t ≥ 0} is said to be Gaussian if its finite-dimensional joint distributions are Gaussian laws. Remember that in this case, the law of the random vector (X_{t_1}, ..., X_{t_m}) is characterized by two parameters:

μ(t_1, ..., t_m) = E(X_{t_1}, ..., X_{t_m}) = (E(X_{t_1}), ..., E(X_{t_m})),

Λ(t_1, ..., t_m) = (Cov(X_{t_i}, X_{t_j}))_{1≤i,j≤m}.

If det Λ(t_1, ..., t_m) > 0, then the law of (X_{t_1}, ..., X_{t_m}) has a density, and this density is given by

f_{t_1,...,t_m}(x) = ((2π)^m det Λ_{t_1,...,t_m})^{-1/2} exp( -(1/2) (x − μ_{t_1,...,t_m})^T Λ_{t_1,...,t_m}^{-1} (x − μ_{t_1,...,t_m}) ).

In the sequel we shall assume that I ⊂ R_+ and S ⊂ R, either countable or uncountable, and denote by R^I the set of real-valued functions defined on I. A stochastic process {X_t, t ≥ 0} can be viewed as a random vector

X : Ω → R^I.

Putting the appropriate σ-field of events on R^I, say B(R^I), one can define, as for random variables, the law of the process as the mapping

P_X(B) = P(X^{-1}(B)),  B ∈ B(R^I).

Mathematical results from measure theory tell us that P_X is determined, by means of an extension procedure for measures on cylinder sets, by the family of all possible finite-dimensional joint distributions. This is a deep result.

In Example 1.1, we have defined a class of stochastic processes by means of the type of its finite-dimensional joint distributions. But does such an object exist? In other words, can one define a stochastic process by giving only its finite-dimensional joint distributions? Roughly speaking, the answer is yes, after adding some extra condition. The precise statement is a famous result by Kolmogorov that we now quote.


Theorem 1.1 Consider a family

{P_{t_1,...,t_n}, t_1 < ... < t_n, n ≥ 1, t_i ∈ I}  (1.1)

where:

1. P_{t_1,...,t_n} is a probability on R^n,

2. if {t_{i_1} < ... < t_{i_m}} ⊂ {t_1 < ... < t_n}, the probability law P_{t_{i_1},...,t_{i_m}} is the marginal distribution of P_{t_1,...,t_n}.

Then there exists a stochastic process {X_t, t ∈ I}, defined on some probability space, whose finite-dimensional joint distributions are given by (1.1). That is, the law of the random vector (X_{t_1}, ..., X_{t_n}) is P_{t_1,...,t_n}.

One can apply this theorem to Example 1.1 to show the existence of Gaussian processes, as follows. Let K : I × I → R be a symmetric, nonnegative definite function. That means:

• for any s, t ∈ I, K(t, s) = K(s, t);

• for any natural number n and arbitrary t_1, ..., t_n ∈ I and x_1, ..., x_n ∈ R,

∑_{i,j=1}^n K(t_i, t_j) x_i x_j ≥ 0.

Covariance functions provide an example of nonnegative definite functions, as we now illustrate. Consider a random vector (Y_1, ..., Y_n) whose components have mean zero and finite second order moment. Fix x_1, ..., x_n ∈ R. Then we have

∑_{j,k=1}^n x_j x_k Cov(Y_j, Y_k) = ∑_{j,k=1}^n x_j x_k E(Y_j Y_k) = E( ( ∑_{j=1}^n x_j Y_j )² ) ≥ 0.

There exists a Gaussian process {X_t, t ∈ I} such that E(X_t) = 0 for any t ∈ I and Cov(X_{t_i}, X_{t_j}) = K(t_i, t_j), for any t_i, t_j ∈ I. To prove this result, fix t_1, ..., t_n ∈ I and set μ = (0, ..., 0) ∈ R^n, Λ_{t_1...t_n} = (K(t_i, t_j))_{1≤i,j≤n} and

P_{t_1,...,t_n} = N(0, Λ_{t_1...t_n}).


We denote by (X_{t_1}, ..., X_{t_n}) a random vector with law P_{t_1,...,t_n}. For any subset {t_{i_1}, ..., t_{i_m}} of {t_1, ..., t_n}, it holds that

A (X_{t_1}, ..., X_{t_n}) = (X_{t_{i_1}}, ..., X_{t_{i_m}}),

with

A =
( δ_{t_1,t_{i_1}}  · · ·  δ_{t_n,t_{i_1}} )
(      · · ·       · · ·       · · ·      )
( δ_{t_1,t_{i_m}}  · · ·  δ_{t_n,t_{i_m}} ),

where δ_{s,t} denotes the Kronecker delta function.

By the properties of Gaussian vectors, the random vector (X_{t_{i_1}}, ..., X_{t_{i_m}}) has an m-dimensional normal distribution, zero mean, and covariance matrix A Λ_{t_1...t_n} A^T. By the definition of A, it is easy to check that

A Λ_{t_1...t_n} A^T = (K(t_{i_l}, t_{i_k}))_{1≤l,k≤m}.

Hence, the assumptions of Theorem 1.1 hold true and the result follows.
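The constructive argument above translates directly into a numerical experiment: from a symmetric, nonnegative definite kernel K one can build the covariance matrix (K(t_i, t_j)) and sample the corresponding Gaussian vector. A minimal sketch in Python/NumPy (the helper name `sample_gaussian_process` and the choice K(s, t) = s ∧ t, which anticipates Example 1.3, are ours, not from the text):

```python
import numpy as np

def sample_gaussian_process(K, ts, n_samples, rng):
    """Draw n_samples vectors (X_{t_1}, ..., X_{t_n}) of a centered Gaussian
    process with covariance kernel K, as in the existence proof above."""
    cov = np.array([[K(s, t) for t in ts] for s in ts])
    # Symmetry and nonnegative definiteness: exactly the hypotheses on K.
    assert np.allclose(cov, cov.T)
    assert np.linalg.eigvalsh(cov).min() > -1e-10
    return rng.multivariate_normal(np.zeros(len(ts)), cov, size=n_samples)

rng = np.random.default_rng(0)
ts = np.linspace(0.1, 1.0, 10)
X = sample_gaussian_process(lambda s, t: min(s, t), ts, 50_000, rng)

emp_cov = np.cov(X, rowvar=False)      # empirical covariance of the samples
true_cov = np.minimum.outer(ts, ts)    # K(t_i, t_j) = t_i ∧ t_j
max_err = np.abs(emp_cov - true_cov).max()
```

The empirical covariance of the samples should reproduce the kernel up to Monte Carlo error.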

1.2 Sample paths

In the previous discussion, stochastic processes are considered as randomvectors. In the context of modeling, what matters are the observed valuesof the process. Observations correspond to fixed values of ω ∈ Ω. This newpoint of view leads to the next definition.

Definition 1.3 The sample paths of a stochastic process {X_t, t ∈ I} are the family of functions indexed by ω ∈ Ω, X(ω) : I → S, defined by X(ω)(t) = X_t(ω).

Sample paths are also called trajectories.

Example 1.2 Consider random arrivals of customers at a store. We set our clock at zero and measure the time between two consecutive arrivals. These are random variables X_1, X_2, .... We assume X_i > 0, a.s. Set S_0 = 0 and S_n = ∑_{j=1}^n X_j, n ≥ 1. S_n is the time of the n-th arrival. Define N_t as the (random) number of customers who have visited the store during the time interval [0, t], t ≥ 0.

Clearly, N_0 = 0 and, for t > 0, N_t = k if and only if

S_k ≤ t < S_{k+1}.


The stochastic process {N_t, t ≥ 0} takes values in Z_+. Its sample paths are increasing right-continuous functions, with jumps of size one at the random times S_n, n ≥ 1. It is a particular case of a counting process. Sample paths of counting processes are always increasing right-continuous functions, and their jumps are natural numbers.
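The characterization N_t = k ⟺ S_k ≤ t < S_{k+1} is easy to check by simulation. A small sketch (exponential interarrival times are an illustrative assumption; the argument only needs X_i > 0):

```python
import numpy as np

def counting_process(interarrivals, t):
    """N_t = number of arrivals in [0, t]; uses N_t = k  <=>  S_k <= t < S_{k+1}."""
    S = np.cumsum(interarrivals)                     # arrival times S_1, S_2, ...
    return int(np.searchsorted(S, t, side="right"))  # number of S_k <= t

rng = np.random.default_rng(42)
X = rng.exponential(scale=1.0, size=1000)  # any law with X_i > 0 a.s. would do
S = np.cumsum(X)

t = 5.0
k = counting_process(X, t)
```

With `side="right"`, `searchsorted` counts exactly the arrival times S_k ≤ t, which is the defining property of N_t.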

Example 1.3 The evolution of prices of risky assets can be described by real-valued stochastic processes {X_t, t ≥ 0} with continuous, although very rough, sample paths. They are generalizations of the Brownian motion. The Brownian motion, also called Wiener process, is a Gaussian process {B_t, t ≥ 0} with the following parameters:

E(B_t) = 0,
E(B_s B_t) = s ∧ t.

This defines the finite-dimensional distributions and therefore the existence of the process, via Kolmogorov's theorem (see Theorem 1.1).

Before giving a heuristic motivation for the preceding definition of Brownianmotion, we introduce two further notions.

A stochastic process {X_t, t ∈ I} has independent increments if for any t_1 < t_2 < ... < t_k the random variables X_{t_2} − X_{t_1}, ..., X_{t_k} − X_{t_{k−1}} are independent.

A stochastic process {X_t, t ∈ I} has stationary increments if for any t_1 < t_2, the law of the random variable X_{t_2} − X_{t_1} is the same as that of X_{t_2−t_1}.

Brownian motion is named after Robert Brown, a British botanist who observed and reported in 1827 the irregular movements of pollen particles suspended in a liquid. Assume that, when starting the observation, the pollen particle is at position x = 0. Denote by B_t (one coordinate of) the position of the particle at time t > 0. For physical reasons, the trajectories must be continuous functions, and because of the erratic movement, it seems reasonable to say that {B_t, t ≥ 0} is a stochastic process. It also seems reasonable to assume that the change in position of the particle during the time interval [t, t + s] is independent of its previous positions at times τ < t, and therefore to assume that the process has independent increments. The fact that such an increment must be stationary is explained by kinetic theory, assuming that the temperature during the experiment remains constant.


The model for the law of B_t was given by Einstein in 1905. More precisely, Einstein's definition of Brownian motion is that of a stochastic process with independent and stationary increments such that the law of an increment B_t − B_s, s < t, is Gaussian with zero mean and E(B_t − B_s)² = t − s. This definition is equivalent to the one given before.


2 The Brownian motion

2.1 Equivalent definitions of Brownian motion

This chapter is devoted to the study of Brownian motion, the process introduced in Example 1.3, which we now recall.

Definition 2.1 The stochastic process {B_t, t ≥ 0} is a one-dimensional Brownian motion if it is Gaussian, zero mean, and with covariance function given by E(B_t B_s) = s ∧ t.

The existence of such a process is ensured by Kolmogorov's theorem. Indeed, it suffices to check that

(s, t) → Γ(s, t) = s ∧ t

is nonnegative definite. That means, for any t_i ≥ 0 and any real numbers a_i, i = 1, ..., m,

∑_{i,j=1}^m a_i a_j Γ(t_i, t_j) ≥ 0.

But

s ∧ t = ∫_0^∞ 1_{[0,s]}(r) 1_{[0,t]}(r) dr.

Hence,

∑_{i,j=1}^m a_i a_j (t_i ∧ t_j) = ∑_{i,j=1}^m a_i a_j ∫_0^∞ 1_{[0,t_i]}(r) 1_{[0,t_j]}(r) dr = ∫_0^∞ ( ∑_{i=1}^m a_i 1_{[0,t_i]}(r) )² dr ≥ 0.

Notice also that, since E(B_0²) = 0, the random variable B_0 is zero almost surely.

Each random variable B_t, t > 0, of the Brownian motion has a density, namely

p_t(x) = (1/√(2πt)) exp(−x²/(2t)),

while for t = 0, its "density" is a Dirac mass at zero, δ_0. Differentiating p_t(x) once with respect to t, and twice with respect to x, easily yields

∂_t p_t(x) = (1/2) ∂²/∂x² p_t(x),
p_0(x) = δ_0.


This is the heat equation on R with initial condition p_0(x) = δ_0. That means, as time evolves, the density of the random variables of the Brownian motion behaves like a diffusive physical phenomenon. There are equivalent definitions of Brownian motion, such as the one given in the next result.

Proposition 2.1 A stochastic process {X_t, t ≥ 0} is a Brownian motion if and only if

(i) X_0 = 0, a.s.,

(ii) for any 0 ≤ s < t, the random variable X_t − X_s is independent of the σ-field generated by {X_r, 0 ≤ r ≤ s}, σ(X_r, 0 ≤ r ≤ s), and X_t − X_s is a N(0, t − s) random variable.

Proof: Let us assume first that {X_t, t ≥ 0} is a Brownian motion. Then E(X_0²) = 0; thus X_0 = 0 a.s.

Let H_s and H̃_s be the vector subspaces of L²(Ω) spanned by (X_r, 0 ≤ r ≤ s) and (X_{s+u} − X_s, u ≥ 0), respectively. Since for any 0 ≤ r ≤ s

E(X_r (X_{s+u} − X_s)) = r ∧ (s+u) − r ∧ s = 0,

H_s and H̃_s are orthogonal in L²(Ω). Consequently, X_t − X_s is independent of the σ-field σ(X_r, 0 ≤ r ≤ s). Since linear combinations of Gaussian random variables are also Gaussian, X_t − X_s is normal, with E(X_t − X_s) = 0 and

E(X_t − X_s)² = t + s − 2s = t − s.

This ends the proof of properties (i) and (ii).

Assume now that (i) and (ii) hold. Then the finite-dimensional distributions of {X_t, t ≥ 0} are multidimensional normal, since they are obtained by linear transformation of random vectors with Gaussian independent components. Moreover, for 0 ≤ s ≤ t,

E(X_t X_s) = E((X_t − X_s + X_s) X_s) = E((X_t − X_s) X_s) + E(X_s²)
           = E(X_t − X_s) E(X_s) + E(X_s²) = E(X_s²) = s = s ∧ t.

Remark 2.1 We shall see later that Brownian motion has continuous sample paths. The description of the process given in the preceding proposition tells us that such a process is a model for a random evolution which starts from x = 0 at time t = 0, in which the law of an increment depends only on its length (stationary increments), and in which the future evolution of the process is independent of its past (Markov property).


Remark 2.2 The Brownian motion possesses several invariance properties. Let us mention some of them.

• If B = {B_t, t ≥ 0} is a Brownian motion, so is −B = {−B_t, t ≥ 0}.

• For any λ > 0, the process B^λ = {(1/λ) B_{λ²t}, t ≥ 0} is also a Brownian motion. This means that, zooming in or out, we will observe the same behaviour. This is called the scaling property of Brownian motion.

• For any a > 0, B^{+a} = {B_{t+a} − B_a, t ≥ 0} is a Brownian motion.
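The scaling property can be probed by Monte Carlo: the covariance of B^λ_u = (1/λ) B_{λ²u} should again be u ∧ v. A sketch, under the simplifying assumption that sampling the pair (B_{λ²s}, B_{λ²t}) from two independent Gaussian increments is an acceptable stand-in for a full path:

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam, s, t = 400_000, 3.0, 0.4, 0.9

# Simulate the pair (B_{λ²s}, B_{λ²t}) from independent Gaussian increments.
B_ls = rng.normal(0.0, np.sqrt(lam**2 * s), size=n)
B_lt = B_ls + rng.normal(0.0, np.sqrt(lam**2 * (t - s)), size=n)

# Rescaled process B^λ_u = (1/λ) B_{λ²u}: its covariance should be u ∧ v again.
Xs, Xt = B_ls / lam, B_lt / lam
cov_st = np.mean(Xs * Xt)   # expected: s ∧ t = 0.4
var_s = np.mean(Xs * Xs)    # expected: s = 0.4
var_t = np.mean(Xt * Xt)    # expected: t = 0.9
```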

2.2 A construction of Brownian motion

There are several ways to obtain a Brownian motion. Here we shall give P. Lévy's construction, which also provides the continuity of the sample paths. Before going through the details of this construction, we mention an alternative.

Brownian motion as a limit of random walks

Let {ξ_j, j ∈ N} be a sequence of independent, identically distributed random variables with mean zero and variance σ² > 0. Consider the sequence of partial sums defined by S_0 = 0, S_n = ∑_{j=1}^n ξ_j. The sequence {S_n, n ≥ 0} is a Markov chain, and also a martingale.

Let us consider the continuous time stochastic process defined by linear interpolation of {S_n, n ≥ 0}, as follows. For any t ≥ 0, let [t] denote its integer part. Then set

Y_t = S_{[t]} + (t − [t]) ξ_{[t]+1},  (2.1)

for any t ≥ 0. The next step is to scale the sample paths of {Y_t, t ≥ 0}. By analogy with the scaling in the statement of the central limit theorem, we set

B_t^{(n)} = (1/(σ√n)) Y_{nt},  (2.2)

t ≥ 0. A famous result in probability theory, Donsker's theorem, tells us that the sequence of processes {B_t^{(n)}, t ≥ 0}, n ≥ 1, converges in law to the Brownian motion. The reference sample space is the set of continuous functions vanishing at zero. Hence, proving the statement, we obtain continuity of the sample paths of the limit.

Donsker's theorem is the infinite-dimensional version of the above mentioned central limit theorem. Considering s = k/n, t = (k+1)/n, the increment B_t^{(n)} − B_s^{(n)} = (1/(σ√n)) ξ_{k+1} is a random variable with mean zero and variance t − s. Hence B^{(n)} is not that far from the Brownian motion, and this is what Donsker's theorem proves.
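Donsker's theorem itself concerns convergence of whole paths, but its marginal consequence is easy to observe numerically: B^(n)_1 should be approximately N(0, 1) for large n. A sketch with Rademacher (±1) steps, an illustrative choice with σ = 1, looking only at the grid points t = k/n where Y_{nt} = S_k:

```python
import numpy as np

def scaled_walk(n, rng):
    """Values of B^(n)_t = Y_{nt}/(σ√n) at the grid t = k/n, k = 0, ..., n (σ = 1)."""
    xi = rng.choice([-1.0, 1.0], size=n)       # i.i.d. steps, mean 0, variance 1
    S = np.concatenate(([0.0], np.cumsum(xi))) # partial sums S_0, ..., S_n
    return S / np.sqrt(n)

rng = np.random.default_rng(7)
n, n_paths = 1000, 4000
paths = np.stack([scaled_walk(n, rng) for _ in range(n_paths)])

# Marginals should approach those of Brownian motion.
mean_1 = paths[:, -1].mean()        # expected ≈ E(B_1) = 0
var_1 = paths[:, -1].var()          # expected ≈ Var(B_1) = 1
var_half = paths[:, n // 2].var()   # expected ≈ Var(B_{1/2}) = 1/2
```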

P. Lévy's construction of Brownian motion

An important ingredient in the procedure is a sequence of functions defined on [0, 1], termed Haar functions, defined as follows:

h_0(t) = 1,
h_n^k(t) = 2^{n/2} 1_{[2k/2^{n+1}, (2k+1)/2^{n+1})}(t) − 2^{n/2} 1_{[(2k+1)/2^{n+1}, (2k+2)/2^{n+1})}(t),

where n ≥ 1 and k ∈ {0, 1, ..., 2^n − 1}. The set of functions (h_0, h_n^k) is a complete orthonormal system (CONS) of L²([0,1], B([0,1]), λ), where λ stands for the Lebesgue measure. Consequently, for any f ∈ L²([0,1], B([0,1]), λ), we can write the expansion

f = ⟨f, h_0⟩ h_0 + ∑_{n=1}^∞ ∑_{k=0}^{2^n−1} ⟨f, h_n^k⟩ h_n^k,  (2.3)

where ⟨·,·⟩ denotes the inner product in L²([0,1], B([0,1]), λ). Using (2.3), we define an isometry between L²([0,1], B([0,1]), λ) and L²(Ω, F, P) as follows. Consider a family (N_0, N_n^k) of independent random variables with law N(0, 1). Then, for f ∈ L²([0,1], B([0,1]), λ), set

I(f) = ⟨f, h_0⟩ N_0 + ∑_{n=1}^∞ ∑_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k.

Clearly,

E(I(f)²) = ‖f‖²₂.

Hence I defines an isometry between L²([0,1], B([0,1]), λ) and the space of random variables L²(Ω, F, P). Moreover, since

I(f) = lim_{m→∞} ( ⟨f, h_0⟩ N_0 + ∑_{n=1}^m ∑_{k=0}^{2^n−1} ⟨f, h_n^k⟩ N_n^k ),

the random variable I(f) is N(0, ‖f‖²₂), and by Parseval's identity

E(I(f) I(g)) = ⟨f, g⟩,  (2.4)

for any f, g ∈ L²([0,1], B([0,1]), λ).


Theorem 2.1 The process B = {B_t = I(1_{[0,t]}), t ∈ [0,1]} defines a Brownian motion indexed by [0, 1]. Moreover, the sample paths are continuous, almost surely.

Proof: By construction B_0 = 0. Notice that for 0 ≤ s ≤ t ≤ 1, B_t − B_s = I(1_{(s,t]}). Hence, by virtue of (2.4), the random variable B_t − B_s is independent of any B_r, 0 < r < s, and B_t − B_s has a N(0, t − s) law. By Proposition 2.1, we obtain the first statement.

Our next aim is to prove that the series appearing in

B_t = I(1_{[0,t]}) = ⟨1_{[0,t]}, h_0⟩ N_0 + ∑_{n=1}^∞ ∑_{k=0}^{2^n−1} ⟨1_{[0,t]}, h_n^k⟩ N_n^k
    = g_0(t) N_0 + ∑_{n=1}^∞ ∑_{k=0}^{2^n−1} g_n^k(t) N_n^k  (2.5)

converges uniformly, a.s. In the last term we have introduced the Schauder functions, defined as follows:

g_0(t) = ⟨1_{[0,t]}, h_0⟩ = t,
g_n^k(t) = ⟨1_{[0,t]}, h_n^k⟩ = ∫_0^t h_n^k(s) ds,

for any t ∈ [0, 1]. By construction, for any fixed n ≥ 1, the functions g_n^k(t), k = 0, ..., 2^n − 1, are nonnegative, have disjoint supports, and satisfy

g_n^k(t) ≤ 2^{−n/2}.

Thus,

sup_{t∈[0,1]} | ∑_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k|.

The next step consists in proving that |N_n^k| is bounded by a constant depending on n such that, when multiplied by 2^{−n/2}, the resulting series converges. For this, we will use a result on large deviations for Gaussian measures along with the first Borel-Cantelli lemma.

Lemma 2.1 For any random variable X with law N(0, 1) and for any a ≥ 1,

P(|X| ≥ a) ≤ e^{−a²/2}.


Proof: We clearly have

P(|X| ≥ a) = (2/√(2π)) ∫_a^∞ e^{−x²/2} dx ≤ (2/√(2π)) ∫_a^∞ (x/a) e^{−x²/2} dx
           = (2/(a√(2π))) e^{−a²/2} ≤ e^{−a²/2},

where we have used that 1 ≤ x/a on the domain of integration and that 2/(a√(2π)) ≤ 1.

We now move to the Borel-Cantelli-based argument. By the preceding lemma,

P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) ≤ ∑_{k=0}^{2^n−1} P( |N_n^k| > 2^{n/4} ) ≤ 2^n exp(−2^{n/2−1}).

It follows that

∑_{n=1}^∞ P( sup_{0≤k≤2^n−1} |N_n^k| > 2^{n/4} ) < +∞,

and by the first Borel-Cantelli lemma

P( lim inf_{n→∞} { sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4} } ) = 1.

That is, a.s. there exists n_0, which may depend on ω, such that

sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{n/4}

for any n ≥ n_0. Hence, we have proved that

sup_{t∈[0,1]} | ∑_{k=0}^{2^n−1} g_n^k(t) N_n^k | ≤ 2^{−n/2} sup_{0≤k≤2^n−1} |N_n^k| ≤ 2^{−n/4},

a.s., for n big enough, which proves the a.s. uniform convergence of the series (2.5).

Next we discuss how, from Theorem 2.1, we can get a Brownian motion indexed by R_+. To this end, let us consider a sequence {B^k, k ≥ 1} of independent Brownian motions indexed by [0, 1]. That means, for each k ≥ 1, B^k = {B_t^k, t ∈ [0, 1]} is a Brownian motion, and for different values of k they are independent. Then we define a Brownian motion recursively as follows. Let k ≥ 0 (with the convention that the sum below is empty for k = 0); for t ∈ [k, k+1] set

B_t = B_1^1 + B_1^2 + · · · + B_1^k + B_{t−k}^{k+1}.

Such a process is Gaussian, zero mean, and E(B_t B_s) = s ∧ t. Hence it is a Brownian motion.

We end this section by giving the notion of d-dimensional Brownian motion, for a natural number d ≥ 1. For d = 1 it is the process we have seen so far. For d > 1, it is the process defined by

B_t = (B_t^1, B_t^2, ..., B_t^d),  t ≥ 0,

where the components are independent one-dimensional Brownian motions.

2.3 Path properties of Brownian motion

We already know that the trajectories of Brownian motion are a.s. continuousfunctions. However, since the process is a model for particles wanderingerratically, one expects rough behaviour. This section is devoted to provesome results that will make more precise these facts.Firstly, it is possible to prove that the sample paths of Brownian motionare γ-Holder continuous. The main tool for this is Kolmogorov’s continuitycriterion (see e.g. [11][Theorem 2.1]):

Proposition 2.2 Let Xt, t ≥ 0 be a stochastic process satisfying the fol-lowing property: for some positive real numbers α, β and C,

E (|Xt −Xs|α) ≤ C|t− s|1+β.

Then almost surely, the sample paths of the process are γ-Holder continuouswith γ < β

α.

The law of the random variable Bt−Bs is N(0, t− s). Thus, it is possible tocompute the moments, and we have

E((Bt −Bs)

2k)

=(2k)!

2kk!(t− s)k,

for any k ∈ N. Therefore, Proposition 2.2 yields that almost surely, thesample paths of the Brownian motion are γ–Holder continuous with γ ∈(0, 1

2).
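The moment formula just used can be confirmed numerically by integrating x^{2k} against the N(0, t − s) density; this is a deterministic check, with the grid size and integration width chosen ad hoc:

```python
import math
import numpy as np

def gaussian_even_moment(var, k, grid=200_001, width=12.0):
    """E(Z^(2k)) for Z ~ N(0, var), by trapezoidal integration against the density."""
    sd = math.sqrt(var)
    x = np.linspace(-width * sd, width * sd, grid)
    f = x ** (2 * k) * np.exp(-x ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    h = x[1] - x[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))  # trapezoidal rule

t_minus_s = 0.7
rel_errors = []
for k in (1, 2, 3):
    # closed form: (2k)!/(2^k k!) (t - s)^k
    closed_form = math.factorial(2 * k) / (2 ** k * math.factorial(k)) * t_minus_s ** k
    numeric = gaussian_even_moment(t_minus_s, k)
    rel_errors.append(abs(numeric - closed_form) / closed_form)
```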


Nowhere differentiability

We shall prove that the exponent γ = 1/2 above is sharp. As a consequence, we will obtain a celebrated result by Dvoretzky, Erdős and Kakutani telling us that a.s. the sample paths of Brownian motion are nowhere differentiable. We gather these results in the next theorem.

Theorem 2.2 Fix any γ ∈ (1/2, 1]; then a.s. the sample paths of {B_t, t ≥ 0} are nowhere Hölder continuous with exponent γ.

Proof: Let γ ∈ (1/2, 1] and assume that a sample path t → B_t(ω) is γ-Hölder continuous at some s ∈ [0, 1). Then

|B_t(ω) − B_s(ω)| ≤ C |t − s|^γ,

for any t ∈ [0, 1] and some constant C > 0. Let n be big enough and let i = [ns] + 1; by the triangle inequality,

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ |B_s(ω) − B_{j/n}(ω)| + |B_s(ω) − B_{(j+1)/n}(ω)|
                              ≤ C ( |s − j/n|^γ + |s − (j+1)/n|^γ ).

Hence, restricting to j = i, i+1, ..., i+N−1, we obtain

|B_{j/n}(ω) − B_{(j+1)/n}(ω)| ≤ C (N/n)^γ = M/n^γ.

Define

A_{M,n}^i = { |B_{j/n} − B_{(j+1)/n}| ≤ M/n^γ, j = i, i+1, ..., i+N−1 }.

We have seen that the set of trajectories for which t → B_t(ω) is γ-Hölder continuous at some s is included in

∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i.

Next we prove that this set has null probability. Indeed,

P( ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) ≤ lim inf_{n→∞} P( ∪_{i=1}^n A_{M,n}^i )
  ≤ lim inf_{n→∞} ∑_{i=1}^n P( A_{M,n}^i )
  ≤ lim inf_{n→∞} n ( P( |B_{1/n}| ≤ M/n^γ ) )^N,

where we have used that the random variables B_{j/n} − B_{(j+1)/n} are N(0, 1/n) and independent. But

P( |B_{1/n}| ≤ M/n^γ ) = √(n/(2π)) ∫_{−Mn^{−γ}}^{Mn^{−γ}} e^{−nx²/2} dx
  = (1/√(2π)) ∫_{−Mn^{1/2−γ}}^{Mn^{1/2−γ}} e^{−x²/2} dx ≤ C n^{1/2−γ}.

Hence, by taking N such that N(γ − 1/2) > 1,

P( ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) ≤ lim inf_{n→∞} n C^N [n^{1/2−γ}]^N = 0.

Since this holds for any k and M, we get

P( ∪_{M=1}^∞ ∪_{k=1}^∞ ∩_{n=k}^∞ ∪_{i=1}^n A_{M,n}^i ) = 0.

This ends the proof of the theorem.

Notice that if a sample path of Brownian motion were differentiable at some point, it would also be Hölder continuous with exponent γ = 1 at that point. This contradicts the preceding theorem.

What happens for γ = 1/2? The answer to this question comes as a consequence of Paul Lévy's modulus of continuity result. For a function f : [0, ∞) → R, the modulus of continuity is a way to describe its local smoothness. More precisely, let σ : [0, ∞) → [0, ∞) be strictly increasing with σ(0) = 0. The function σ is a modulus of continuity for the function f if for any 0 ≤ s ≤ t,

|f(s) − f(t)| ≤ σ(|t − s|).

Theorem 2.3 Let {B_t, t ≥ 0} be a Brownian motion. Then,

P( lim sup_{h→0} [ sup_{0≤t≤1−h} |B(t+h) − B(t)| ] / √(2h |log h|) = 1 ) = 1.

As a consequence, the sample paths of a Brownian motion are not Hölder continuous with exponent γ = 1/2. Indeed,

[ sup_{0≤t≤1−h} |B(t+h) − B(t)| ] / √h = ( [ sup_{0≤t≤1−h} |B(t+h) − B(t)| ] / √(2h |log h|) ) · √(2 |log h|),

which tends to ∞ as h → 0.

Quadratic variation

The notion of quadratic variation provides a measure of the roughness of a function. The existence of variations of different orders is also important in approximation procedures via Taylor expansions, and in the development of infinitesimal calculus. We will study here the existence of the quadratic variation, i.e. the variation of order two, of the Brownian motion. As shall be discussed in more detail in the next chapter, this provides an explanation of the fact that the rules of Ito's stochastic calculus differ from those of classical deterministic differential calculus.

Fix a finite interval [0, T] and consider a sequence of partitions Π_n = (0 = t_0^n ≤ t_1^n ≤ ... ≤ t_{r_n}^n = T), n ≥ 1. We assume that

lim_{n→∞} |Π_n| = 0,

where |Π_n| denotes the norm of the partition Π_n:

|Π_n| = sup_{j=0,...,r_n−1} (t_{j+1}^n − t_j^n).

Set Δ_k B = B_{t_k^n} − B_{t_{k−1}^n}. Under the preceding conditions on the sequence (Π_n)_{n≥1}, we have the following.

Proposition 2.3 The sequence { ∑_{k=1}^{r_n} (Δ_k B)², n ≥ 1 } converges in L²(Ω) to the constant T. That is,

lim_{n→∞} E[ ( ∑_{k=1}^{r_n} (Δ_k B)² − T )² ] = 0.

Proof: For the sake of simplicity, we shall omit the dependence on n. Set Δ_k t = t_k − t_{k−1}. Notice that the random variables (Δ_k B)² − Δ_k t, k = 1, ..., r_n, are independent and centered. Thus,

E[ ( ∑_{k=1}^{r_n} (Δ_k B)² − T )² ] = E[ ( ∑_{k=1}^{r_n} ((Δ_k B)² − Δ_k t) )² ]
  = ∑_{k=1}^{r_n} E[ ((Δ_k B)² − Δ_k t)² ]
  = ∑_{k=1}^{r_n} [ 3(Δ_k t)² − 2(Δ_k t)² + (Δ_k t)² ]
  = 2 ∑_{k=1}^{r_n} (Δ_k t)² ≤ 2T |Π_n|,

which clearly tends to zero as n tends to infinity.
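Proposition 2.3 is easy to observe in simulation; the sketch below samples the increments Δ_k B directly as independent N(0, Δ_k t) variables on a uniform partition, and also records the first variation ∑ |Δ_k B|, which grows without bound as the partition is refined:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2.0

def variations(n_steps, rng):
    """First and quadratic variation of a simulated Brownian path over the
    uniform n-step partition of [0, T]."""
    dB = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)  # Δ_k B ~ N(0, Δ_k t)
    return np.abs(dB).sum(), (dB ** 2).sum()

fv, qv = variations(100_000, rng)
# qv should be close to T; fv grows like √n and is already very large here.
```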

This proposition, together with the continuity of the sample paths of Brownian motion, yields

sup_n ∑_{k=1}^{r_n} |Δ_k B| = ∞,  a.s.

Therefore, a.s., Brownian motion has infinite variation. Indeed, assume that V := sup_n ∑_{k=1}^{r_n} |Δ_k B| < ∞. Then

∑_{k=1}^{r_n} (Δ_k B)² ≤ sup_k |Δ_k B| ( ∑_{k=1}^{r_n} |Δ_k B| ) ≤ V sup_k |Δ_k B|.

By the uniform continuity of the sample paths, sup_k |Δ_k B| → 0 as n → ∞, so we would obtain lim_{n→∞} ∑_{k=1}^{r_n} (Δ_k B)² = 0 a.s., which contradicts the result proved in Proposition 2.3.

Remark 2.3 In the particular case Π_n ⊂ Π_{n+1}, n ≥ 1, the result on the quadratic variation of Brownian motion given in the preceding proposition can be improved. In fact, the following stronger statement holds:

lim_{n→∞} ∑_{k=1}^{r_n} (Δ_k B)² = T,  a.s.  (2.6)

For example, one can consider Π_n = (kT 2^{−n}, k = 0, ..., 2^n).

Assume that the sequence of partitions (Π_n)_{n≥1} satisfies the assumptions of Proposition 2.3 and, in addition, that there exists γ ∈ (0, 1) such that ∑_{n≥1} |Π_n|^γ < ∞. Then (2.6) holds. Indeed, let λ > 0. Using Chebyshev's inequality and the computations of the proof of Proposition 2.3, we have

P( | ∑_{k=1}^{r_n} (Δ_k B)² − T | > λ ) ≤ λ^{−2} E[ | ∑_{k=1}^{r_n} (Δ_k B)² − T |² ] ≤ C λ^{−2} |Π_n|.

Choose λ = |Π_n|^{(1−γ)/2}. Then

∑_{n≥1} P( | ∑_{k=1}^{r_n} (Δ_k B)² − T | > |Π_n|^{(1−γ)/2} ) ≤ C ∑_{n≥1} |Π_n|^γ < ∞.

Thus, (2.6) follows from the Borel-Cantelli lemma.


The notion of quadratic variation presented before does not coincide with the usual notion of quadratic variation for real functions; in the latter case, there is no restriction on the partitions. Actually, for Brownian motion the following result holds:

sup_Π ∑_{t_k ∈ Π} (Δ_k B)² = +∞,  a.s.,

where the supremum is over the set of all partitions of the interval [0, T].

2.4 The martingale property of Brownian motion

We start this section by giving the definition of a martingale for continuous time stochastic processes. First, we introduce the appropriate notion of filtration, as follows. A family {F_t, t ≥ 0} of sub-σ-fields of F is termed a filtration if

1. F_0 contains all the sets of F of null probability,

2. for any 0 ≤ s ≤ t, F_s ⊂ F_t.

If in addition

∩_{s>t} F_s = F_t,

for any t ≥ 0, the filtration is said to be right-continuous.

Definition 2.2 A stochastic process {X_t, t ≥ 0} is a martingale with respect to the filtration {F_t, t ≥ 0} if each variable belongs to L¹(Ω) and moreover

1. X_t is F_t-measurable for any t ≥ 0,

2. for any 0 ≤ s ≤ t, E(X_t/F_s) = X_s.

If the equality in (2) is replaced by ≤ (respectively, ≥), we have a supermartingale (respectively, a submartingale). Given a stochastic process {X_t, t ≥ 0}, there is a natural way to define a filtration, by considering

F_t = σ(X_s, 0 ≤ s ≤ t),  t ≥ 0.

To ensure that property (1) of a filtration holds, one needs to complete the σ-fields. In general, there is no reason to expect right-continuity. However, for the Brownian motion, the natural filtration possesses this property.


A stochastic process X with X_0 constant, constant mean, and independent increments possesses the martingale property with respect to its natural filtration. Indeed, for 0 ≤ s ≤ t,

E(X_t − X_s/F_s) = E(X_t − X_s) = 0.

Hence, Brownian motion possesses the martingale property with respect to its natural filtration.

Other examples of martingales with respect to the same filtration, related to the Brownian motion, are

1. {B_t² − t, t ≥ 0},

2. {exp(aB_t − a²t/2), t ≥ 0}.

Indeed, for the first example, let us consider 0 ≤ s ≤ t. Then,

E(B_t²/F_s) = E((B_t − B_s + B_s)²/F_s)
  = E((B_t − B_s)²/F_s) + 2 E((B_t − B_s) B_s/F_s) + E(B_s²/F_s).

Since B_t − B_s is independent of F_s, owing to the properties of the conditional expectation, we have

E((B_t − B_s)²/F_s) = E((B_t − B_s)²) = t − s,
E((B_t − B_s) B_s/F_s) = B_s E(B_t − B_s/F_s) = 0,
E(B_s²/F_s) = B_s².

Consequently,

E(B_t² − B_s²/F_s) = t − s.

E

(exp

(aBt −

a2t

2

)/Fs)

= exp(aBs)E

(exp

(a(Bt −Bs)−

a2t

2

)/Fs)

= exp(aBs)E

(exp

(a(Bt −Bs)−

a2t

2

)).

Using the expression of the density of the random variable Bt−Bs, we write

E

(exp

(a(Bt −Bs)−

a2t

2

))=

1√2π(t− s)

∫R

exp

(ax− at2

2− x2

2(t− s)

)dx

= exp

(a2(t− s)

2− a2t

2

)= exp

(−a

2s

2

),

22

where the before last equality is obtained by using the identity

x2

2(t− s)− ax+

at2

2=

(x− a(t− s))2

2(t− s)+a2t

2− a2(t− s)

2. (2.7)

Therefore, we obtain

E

(exp

(aBt −

a2t

2

)/Fs)

= exp

(aBs −

a2s

2

).

Remark. The computation of the expression

E( exp(a(B_t − B_s) − a²t/2) )

could have been avoided by using the following property: if Z is a random variable with distribution N(0, σ²), then it has exponential moments, and

E(exp(Z)) = exp(σ²/2).

This property is proved by computations analogous to those above.
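Both the martingale normalization E(exp(aB_t − a²t/2)) = 1 and the exponential-moment identity of the remark can be checked by Monte Carlo (the sample size, seed, and the values of a and t are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(11)
a, t, n = 0.5, 1.5, 1_000_000

B_t = rng.normal(0.0, np.sqrt(t), size=n)  # B_t ~ N(0, t)

# Martingale normalization: E[exp(a B_t - a²t/2)] = exp(a B_0 - 0) = 1.
mean_M = np.exp(a * B_t - a ** 2 * t / 2).mean()

# The remark's identity: Z = a B_t ~ N(0, σ²) with σ² = a²t, so E[exp(Z)] = exp(σ²/2).
sigma2 = a ** 2 * t
mean_expZ = np.exp(a * B_t).mean()
target = np.exp(sigma2 / 2)
```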

2.5 Markov property

For any 0 ≤ s < t, x ∈ R and A ∈ B(R), we set

p(s, t, x, A) = (2π(t−s))^{−1/2} ∫_A exp( −|x − y|²/(2(t−s)) ) dy.  (2.8)

Actually, p(s, t, x, A) is the probability that a random variable, Normal, withmean x and variance t− s take values on a fixed set A.Let us prove the following identity:

PBt ∈ A/Fs := E(1Bt∈A|Fs

)= p(s, t, Bs, A), (2.9)

which means that, conditionally to the past of the Brownian motion untiltime s, the law of Bt at a future time t only depends on Bs.Let f : R → R be a bounded measurable function. Then, since Bs is Fs–measurable and Bt −Bs independent of Fs, we obtain

E (f(Bt)/Fs) = E (f (Bs + (Bt −Bs)) /Fs)

= E (f(x+Bt −Bs))∣∣∣x=Bs

.

The random variable $x + B_t - B_s$ is $N(x, t-s)$. Thus,
\[
E\big(f(x+B_t-B_s)\big) = \int_{\mathbb{R}} f(y)\,p(s,t,x,dy),
\]
and consequently,
\[
E(f(B_t)/\mathcal{F}_s) = \int_{\mathbb{R}} f(y)\,p(s,t,B_s,dy).
\]
This yields (2.9) by taking $f = 1\!\!1_A$.
With similar arguments, we can prove that
\[
P\{B_t\in A/\sigma(B_s)\} = p(s,t,B_s,A),
\]
which along with (2.9) yields
\[
P\{B_t\in A/\mathcal{F}_s\} = P\{B_t\in A/\sigma(B_s)\} = p(s,t,B_s,A).
\]

Going back to (2.8), we notice that the function $x \mapsto p(s,t,x,A)$ is measurable, and the mapping $A \mapsto p(s,t,x,A)$ is a probability.
Let us prove an additional property, called the Chapman–Kolmogorov equation: for any $0 \le s \le u \le t$,
\[
p(s,t,x,A) = \int_{\mathbb{R}} p(u,t,y,A)\,p(s,u,x,dy). \tag{2.10}
\]
We recall that the sum of two independent Normal random variables is again Normal, with mean the sum of the respective means and variance the sum of the respective variances. This is expressed in mathematical terms by the fact that
\[
\big[f_{N(x_1,\sigma_1)} * f_{N(x_2,\sigma_2)}\big](z) = \int_{\mathbb{R}} f_{N(x_1,\sigma_1)}(y)\,f_{N(x_2,\sigma_2)}(z-y)\,dy = f_{N(x_1+x_2,\sigma_1+\sigma_2)}(z),
\]
where the first expression denotes the convolution of the two densities ($\sigma_1,\sigma_2$ standing for the variances). Using this fact along with Fubini's theorem, we obtain
\begin{align*}
\int_{\mathbb{R}} p(u,t,y,A)\,p(s,u,x,dy) &= \int_{\mathbb{R}} p(s,u,x,dy)\int_A p(u,t,y,dz)\\
&= \int_A dz\,\Big(\int_{\mathbb{R}} dy\, f_{N(x,u-s)}(y)\,f_{N(0,t-u)}(z-y)\Big)\\
&= \int_A dz\,\big(f_{N(x,u-s)} * f_{N(0,t-u)}\big)(z)\\
&= \int_A dz\, f_{N(x,t-s)}(z) = p(s,t,x,A),
\end{align*}
proving (2.10).
This equation is the time continuous analogue of the property enjoyed by the transition probability matrices of a homogeneous Markov chain, that is,
\[
\Pi^{(m+n)} = \Pi^{(m)}\Pi^{(n)},
\]
meaning that evolutions in $m+n$ steps are obtained by concatenating $m$-step and $n$-step evolutions. In (2.10), $m+n$ is replaced by the real time $t-s$, $m$ by $t-u$, and $n$ by $u-s$, respectively.
We are now ready to give the definition of a Markov process. Consider a mapping
\[
p : \mathbb{R}_+\times\mathbb{R}_+\times\mathbb{R}\times\mathcal{B}(\mathbb{R}) \to \mathbb{R}_+,
\]
satisfying the properties

(i) for any fixed $s, t \in \mathbb{R}_+$ and $A \in \mathcal{B}(\mathbb{R})$, the function $x \mapsto p(s,t,x,A)$ is $\mathcal{B}(\mathbb{R})$–measurable,

(ii) for any fixed $s, t \in \mathbb{R}_+$ and $x \in \mathbb{R}$, the mapping $A \mapsto p(s,t,x,A)$ is a probability,

(iii) equation (2.10) holds.

Such a function $p$ is termed a Markovian transition function. Let us also fix a probability $\mu$ on $\mathcal{B}(\mathbb{R})$.

Definition 2.3 A real valued stochastic process $\{X_t, t\in\mathbb{R}_+\}$ is a Markov process with initial law $\mu$ and transition probability function $p$ if

(a) the law of $X_0$ is $\mu$,

(b) for any $0\le s\le t$,
\[
P\{X_t\in A/\mathcal{F}_s\} = p(s,t,X_s,A).
\]

Therefore, we have proved that the Brownian motion is a Markov process with initial law the Dirac measure at $0$ and transition probability function $p$ defined in (2.8).
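The Chapman–Kolmogorov equation (2.10) for the Brownian transition function can also be checked numerically. The following is a minimal sketch (not part of the notes), discretizing both integrals with midpoint sums for $A = [0,1]$:

```python
import numpy as np

def density(mean, var, y):
    """N(mean, var) density evaluated at y."""
    return np.exp(-(y - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

s, u, t, x = 0.0, 1.0, 3.0, 0.5
dz, dy = 2e-3, 5e-3
z = np.arange(0.0 + dz / 2, 1.0, dz)            # midpoints of A = [0, 1]
y = np.arange(x - 10 + dy / 2, x + 10, dy)      # effective support in y
lhs = density(x, t - s, z).sum() * dz           # p(s, t, x, A)
p_utA = density(y[:, None], t - u, z[None, :]).sum(axis=1) * dz  # p(u, t, y, A)
rhs = (p_utA * density(x, u - s, y)).sum() * dy
print(abs(lhs - rhs) < 1e-3)
```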

Strong Markov property

Throughout this section, $(\mathcal{F}_t, t\ge 0)$ will denote the natural filtration associated with a Brownian motion $\{B_t, t\ge 0\}$, and stopping times will always refer to this filtration.

Theorem 2.4 Let $T$ be a stopping time. Then, conditionally on $\{T<\infty\}$, the process defined by
\[
B^T_t = B_{T+t} - B_T,\quad t\ge 0,
\]
is a Brownian motion independent of $\mathcal{F}_T$.

Proof: Assume that $T<\infty$ a.s. We shall prove that for any $A\in\mathcal{F}_T$, any choice of parameters $0\le t_1<\cdots<t_p$ and any continuous and bounded function $f$ on $\mathbb{R}^p$, we have
\[
E\big[1\!\!1_A f\big(B^T_{t_1},\dots,B^T_{t_p}\big)\big] = P(A)\,E\big[f\big(B_{t_1},\dots,B_{t_p}\big)\big]. \tag{2.11}
\]
This suffices to prove all the assertions of the theorem. Indeed, by taking $A=\Omega$, we see that the finite dimensional distributions of $B$ and $B^T$ coincide. On the other hand, (2.11) states the independence of $1\!\!1_A$ and the random vector $\big(B^T_{t_1},\dots,B^T_{t_p}\big)$. By a monotone class argument, we get the independence of $1\!\!1_A$ and $B^T$.
The continuity of the sample paths of $B$ implies, a.s.,
\begin{align*}
f\big(B^T_{t_1},\dots,B^T_{t_p}\big) &= f\big(B_{T+t_1}-B_T,\dots,B_{T+t_p}-B_T\big)\\
&= \lim_{n\to\infty}\sum_{k=1}^{\infty} 1\!\!1_{\{(k-1)2^{-n}<T\le k2^{-n}\}}\, f\big(B_{k2^{-n}+t_1}-B_{k2^{-n}},\dots,B_{k2^{-n}+t_p}-B_{k2^{-n}}\big).
\end{align*}
Since $f$ is bounded, we can apply bounded convergence and write
\[
E\big[1\!\!1_A f\big(B^T_{t_1},\dots,B^T_{t_p}\big)\big] = \lim_{n\to\infty}\sum_{k=1}^{\infty} E\big[1\!\!1_{\{(k-1)2^{-n}<T\le k2^{-n}\}}\,1\!\!1_A\, f\big(B_{k2^{-n}+t_1}-B_{k2^{-n}},\dots,B_{k2^{-n}+t_p}-B_{k2^{-n}}\big)\big].
\]
Since $A\in\mathcal{F}_T$, the event $A\cap\{(k-1)2^{-n}<T\le k2^{-n}\}$ belongs to $\mathcal{F}_{k2^{-n}}$. Since Brownian motion has independent and stationary increments, we have
\[
E\big[1\!\!1_{\{(k-1)2^{-n}<T\le k2^{-n}\}}\,1\!\!1_A\, f\big(B_{k2^{-n}+t_1}-B_{k2^{-n}},\dots,B_{k2^{-n}+t_p}-B_{k2^{-n}}\big)\big] = P\big[A\cap\{(k-1)2^{-n}<T\le k2^{-n}\}\big]\,E\big[f\big(B_{t_1},\dots,B_{t_p}\big)\big].
\]
Summing over $k$ both sides of the preceding identity yields (2.11), and this finishes the proof if $T<\infty$ a.s.
If $P(T=\infty)>0$, we can argue as before and obtain
\[
E\big[1\!\!1_{A\cap\{T<\infty\}} f\big(B^T_{t_1},\dots,B^T_{t_p}\big)\big] = P\big(A\cap\{T<\infty\}\big)\,E\big[f\big(B_{t_1},\dots,B_{t_p}\big)\big].
\]

An interesting consequence of the preceding property is given in the next proposition.

Proposition 2.4 For any $t>0$, set $S_t = \sup_{s\le t} B_s$. Then, for any $a\ge 0$ and $b\le a$,
\[
P\{S_t\ge a, B_t\le b\} = P\{B_t\ge 2a-b\}. \tag{2.12}
\]
As a consequence, the probability laws of $S_t$ and $|B_t|$ are the same.

Proof: Consider the stopping time
\[
T_a = \inf\{t\ge 0,\ B_t = a\},
\]
which is finite a.s. We have
\[
P\{S_t\ge a, B_t\le b\} = P\{T_a\le t, B_t\le b\} = P\big\{T_a\le t,\ B^{T_a}_{t-T_a}\le b-a\big\}.
\]
Indeed, $B^{T_a}_{t-T_a} = B_t - B_{T_a} = B_t - a$, and $B$ and $B^{T_a}$ have the same law. Moreover, we know that these processes are independent of $\mathcal{F}_{T_a}$. This last property, along with the fact that $B^{T_a}$ and $-B^{T_a}$ have the same law, yields that $(T_a, B^{T_a})$ has the same distribution as $(T_a, -B^{T_a})$.
Define $H = \{(s,w)\in\mathbb{R}_+\times C(\mathbb{R}_+;\mathbb{R});\ s\le t,\ w(t-s)\le b-a\}$. Then
\begin{align*}
P\big\{T_a\le t,\ B^{T_a}_{t-T_a}\le b-a\big\} &= P\{(T_a,B^{T_a})\in H\} = P\{(T_a,-B^{T_a})\in H\}\\
&= P\big\{T_a\le t,\ -B^{T_a}_{t-T_a}\le b-a\big\} = P\{T_a\le t,\ 2a-b\le B_t\}\\
&= P\{2a-b\le B_t\}.
\end{align*}
Indeed, by definition of the process $\{B^{T_a}_t, t\ge 0\}$, the condition $-B^{T_a}_{t-T_a}\le b-a$ is equivalent to $2a-b\le B_t$; moreover, the inclusion $\{2a-b\le B_t\}\subset\{T_a\le t\}$ holds true. In fact, if $T_a>t$, then $B_t\le a$; since $b\le a$, this yields $B_t\le 2a-b$. This ends the proof of (2.12).
For the second assertion, we notice that $\{B_t\ge a\}\subset\{S_t\ge a\}$. This fact along with (2.12) (applied with $b=a$) yields the identities
\[
P\{S_t\ge a\} = P\{S_t\ge a, B_t\le a\} + P\{S_t\ge a, B_t\ge a\} = 2P\{B_t\ge a\} = P\{|B_t|\ge a\}.
\]
The proof is now complete.

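The identity in law between $S_t$ and $|B_t|$ lends itself to a simulation check. The following is a minimal sketch (not part of the notes); the running maximum of a discretized path slightly underestimates the true supremum, which is why the tolerance is generous:

```python
import numpy as np

# Monte Carlo comparison of P(S_t >= a) and P(|B_t| >= a), Proposition 2.4.
rng = np.random.default_rng(1)
n_paths, n_steps, t = 100_000, 2000, 1.0
dt = t / n_steps
B = np.zeros(n_paths)
S = np.zeros(n_paths)
for _ in range(n_steps):
    B += rng.normal(0.0, np.sqrt(dt), size=n_paths)
    np.maximum(S, B, out=S)                    # update running maximum
a = 0.8
print(abs((S >= a).mean() - (np.abs(B) >= a).mean()) < 0.03)
```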

3 Ito’s calculus

Ito’s calculus has been developed in the 50’ by Kyoshi Ito in an attempt togive rigourous meaning to some differential equations driven by the Brown-ian motion appearing in the study of some problems related with continuoustime Markov processes. Roughly speaking, one could say that Ito’s calculusis an analogue of the classical Newton and Leibniz calculus for stochastic pro-cesses. In fact, in classical mathematical analysis, there are several extensionsof the Riemann integral

∫f(x)dx. For example, if g is an increasing bounded

function (or the difference of two of these functions), Lebesgue-Stieltjes in-tegral gives a precise meaning to the integral

∫f(x)g(dx), for some set of

functions f . However, before Ito’s development, no theory allowing nowheredifferentiable integrators g was known. Brownian motion, introduced in thepreceding chapter, is an example of stochastic process whose sample paths, al-though continuous, are nowhere differentiable. Therefore, Lebesgue-Stieltjesintegral does not apply to the sample paths of Brownian motion.There are many motivations coming from a variety of disciplines to considerstochastic differential equations driven by a Brownian motion. Such an objectis defined as

dXt = σ(t,Xt)dBt + b(t,Xt)dt,

X0 = x0,

or in integral form,

Xt = x0 +

∫ t

0

σ(s,Xs)dBs +

∫ t

0

b(s,Xs)ds. (3.1)

The first notion to be introduced is that of stochastic integral. In fact, in (3.1)the integral

∫ t0b(s,Xs)ds might be defined pathwise, but this is not the case

for∫ t

0σ(s,Xs)dBs, because of the roughness of the paths of the integrator.

More explicitly, it is not possible to fix ω ∈ Ω, then to consider the pathσ(s,Xs(ω)), and finally to integrate with respect to Bs(ω).

3.1 Ito’s integral

Throughout this section, we will consider a Brownian motion B = Bt, t ≥ 0defined on a probability space (Ω,F , P ). We will also consider a filtration(Ft, t ≥ 0) satisfying the following properties:

1. B is adapted to (Ft, t ≥ 0),

2. the σ-field generated by Bu−Bt, u ≥ t is independent of (Ft, t ≥ 0).

28

Notice that these two properties are satisfied if (Ft, t ≥ 0) is the naturalfiltration associated to B.

We fix a finite time horizon $T$ and define $L^2_{a,T}$ as the set of stochastic processes $u = \{u_t, t\in[0,T]\}$ satisfying the following conditions:

(i) $u$ is adapted and jointly measurable in $(t,\omega)$ with respect to the product $\sigma$-field $\mathcal{B}([0,T])\otimes\mathcal{F}$,

(ii) $\int_0^T E(u_t^2)\,dt < \infty$.

This is a Hilbert space with the norm $\|u\|_{L^2_{a,T}} = \big[\int_0^T E(u_t^2)\,dt\big]^{\frac12}$, which coincides with the natural norm on the Hilbert space $L^2(\Omega\times[0,T], \mathcal{F}\otimes\mathcal{B}([0,T]), dP\times d\lambda)$ (here $\lambda$ stands for the Lebesgue measure on $\mathbb{R}$).

The notation $L^2_{a,T}$ evokes the two properties (adaptedness and square integrability) described before.

Consider first the subset of $L^2_{a,T}$ consisting of step processes, that is, stochastic processes which can be written as
\[
u_t = \sum_{j=1}^n u_j 1\!\!1_{[t_{j-1},t_j[}(t), \tag{3.2}
\]
with $0 = t_0 \le t_1 \le \cdots \le t_n = T$ and where $u_j$, $j=1,\dots,n$, are $\mathcal{F}_{t_{j-1}}$–measurable square integrable random variables. We shall denote by $\mathcal{E}$ the set of these processes.

For step processes, the Ito stochastic integral is defined by the very natural formula
\[
\int_0^T u_t\,dB_t = \sum_{j=1}^n u_j\big(B_{t_j} - B_{t_{j-1}}\big), \tag{3.3}
\]
which we may compare with the Lebesgue integral of simple functions. Notice that $\int_0^T u_t\,dB_t$ is a random variable. Of course, we would like to be able to consider more general integrands than step processes. Therefore, we must try to extend the definition (3.3). For this, we use a very natural idea from functional analysis: if we are able to prove that (3.3) defines a continuous functional between two metric spaces, then the stochastic integral defined for the very particular class of step stochastic processes can be extended to the more general class given by the closure of this set with respect to a suitable norm. This is possible because an isometry defined on a dense subspace of a Hilbert space extends uniquely to its closure.

The idea of continuity is made precise by the

Isometry property:
\[
E\left(\int_0^T u_t\,dB_t\right)^2 = E\left(\int_0^T u_t^2\,dt\right). \tag{3.4}
\]

Let us prove (3.4) for step processes. Clearly,
\[
E\left(\int_0^T u_t\,dB_t\right)^2 = \sum_{j=1}^n E\big(u_j^2(\Delta_jB)^2\big) + 2\sum_{j<k} E\big(u_ju_k(\Delta_jB)(\Delta_kB)\big),
\]
where $\Delta_jB = B_{t_j} - B_{t_{j-1}}$. The measurability property of the random variables $u_j$, $j=1,\dots,n$, implies that the random variables $u_j^2$ are independent of $(\Delta_jB)^2$. Hence, the contribution of the first term on the right-hand side of the preceding identity is equal to
\[
\sum_{j=1}^n E(u_j^2)(t_j - t_{j-1}) = \int_0^T E(u_t^2)\,dt.
\]
For the second term, we notice that for fixed $j$ and $k$, $j<k$, the random variables $u_ju_k\Delta_jB$ are independent of $\Delta_kB$. Therefore,
\[
E\big(u_ju_k(\Delta_jB)(\Delta_kB)\big) = E\big(u_ju_k(\Delta_jB)\big)\,E(\Delta_kB) = 0.
\]
Thus, we have (3.4).
This property tells us that the stochastic integral is a continuous functional defined on $\mathcal{E}$, endowed with the norm of $L^2(\Omega\times[0,T])$, taking values in the set $L^2(\Omega)$ of square integrable random variables.
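The isometry property can be observed in simulation. The following minimal sketch (not part of the notes) takes the step process $u_t = B_{t_{j-1}}$ on $[t_{j-1},t_j[$ and compares the two sides of (3.4) by Monte Carlo:

```python
import numpy as np

# Monte Carlo check of the isometry (3.4) for u_t = B_{t_{j-1}} on [t_{j-1}, t_j[.
rng = np.random.default_rng(2)
n_paths, n, T = 200_000, 10, 1.0
dB = rng.normal(0.0, np.sqrt(T / n), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
u = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])   # u_j = B_{t_{j-1}}
integral = (u * dB).sum(axis=1)              # sum_j u_j (B_{t_j} - B_{t_{j-1}})
lhs = (integral ** 2).mean()                 # E (int_0^T u dB)^2
rhs = (u ** 2).mean(axis=0).sum() * (T / n)  # approximates int_0^T E(u_t^2) dt
print(abs(lhs - rhs) < 0.05)
```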

Other properties of the stochastic integral of step processes

1. The stochastic integral is a centered random variable. Indeed,
\[
E\left(\int_0^T u_t\,dB_t\right) = E\left(\sum_{j=1}^n u_j\big(B_{t_j} - B_{t_{j-1}}\big)\right) = \sum_{j=1}^n E(u_j)\,E\big(B_{t_j} - B_{t_{j-1}}\big) = 0,
\]
where we have used that the random variables $u_j$ and $B_{t_j} - B_{t_{j-1}}$ are independent and, moreover, $E\big(B_{t_j} - B_{t_{j-1}}\big) = 0$.

2. Linearity: if $u^1, u^2$ are two step processes and $a, b \in \mathbb{R}$, then clearly $au^1 + bu^2$ is also a step process and
\[
\int_0^T (au^1 + bu^2)(t)\,dB_t = a\int_0^T u^1(t)\,dB_t + b\int_0^T u^2(t)\,dB_t.
\]

The next step consists of identifying a bigger set of random processes in which $\mathcal{E}$ is dense with respect to the norm of the Hilbert space $L^2(\Omega\times[0,T])$. Since $L^2_{a,T}$ is a Hilbert space with respect to this norm and contains $\mathcal{E}$, the closure of $\mathcal{E}$ is contained in $L^2_{a,T}$. The converse inclusion is also true. This is proved in the next proposition, which is a crucial fact in Ito's theory.

Proposition 3.1 For any $u\in L^2_{a,T}$ there exists a sequence $(u^n, n\ge 1)\subset\mathcal{E}$ such that
\[
\lim_{n\to\infty}\int_0^T E\big(u^n_t - u_t\big)^2\,dt = 0.
\]

Proof: It will be done in three steps.
Step 1. Assume that $u\in L^2_{a,T}$ is bounded and has continuous sample paths, a.s. An approximating sequence can be defined as follows:
\[
u^n(t) = \sum_{k=0}^{[nT]} u\Big(\frac kn\Big)\,1\!\!1_{[\frac kn,\frac{k+1}{n}[}(t),
\]
with the convention $\frac{[nT]+1}{n} := T$. Clearly, $u^n\in L^2_{a,T}$ and, by continuity,
\[
\int_0^T |u^n(t)-u(t)|^2\,dt = \sum_{k=0}^{[nT]}\int_{\frac kn}^{\frac{k+1}{n}\wedge T}\Big|u\Big(\frac kn\Big)-u(t)\Big|^2\,dt \le T\,\sup_k\,\sup_{t\in\Delta_k}|u^n(t)-u(t)|^2 \to 0,
\]
as $n\to\infty$, a.s., where $\Delta_k = \big[\frac kn,\frac{k+1}{n}\wedge T\big]$. Then the approximation result follows by bounded convergence applied to the sequence of random variables $Y_n\in L^1(\Omega,\mathcal{F},P)$,
\[
Y_n = \int_0^T |u^n(t)-u(t)|^2\,dt.
\]
Indeed, we have proved that $\lim_{n\to\infty} Y_n = 0$, a.s., and, by the assumptions on $u$, $\sup_{n\ge 1}|Y_n|\le Y$, with $Y\in L^1(\Omega,\mathcal{F},P)$.

Step 2. Assume that $u\in L^2_{a,T}$ is bounded. For any $n\ge 1$, let $\Psi_n(s) = n\,1\!\!1_{[0,\frac1n]}(s)$. The sequence $(\Psi_n)_{n\ge 1}$ is an approximation of the identity. Consider
\[
u^n(t) = \int_{-\infty}^{+\infty}\Psi_n(t-s)\,u(s)\,ds = (\Psi_n * u)(t) = n\int_{t-\frac1n}^{t} u(s)\,ds,
\]
where the symbol "$*$" denotes the convolution operator on $\mathbb{R}$. In the last integral we set $u(r)=0$ if $r<0$. With this, we define a stochastic process $u^n$ with continuous and bounded sample paths, a.s., and by the properties of the convolution (see e.g. [7][Corollary 2, p. 378]), we have
\[
\int_0^T |u^n(s)-u(s)|^2\,ds \to 0,\quad \text{a.s.}
\]
As in the preceding step, by bounded convergence,
\[
\lim_{n\to\infty} E\int_0^T |u^n(s)-u(s)|^2\,ds = 0.
\]

Step 3. Let $u\in L^2_{a,T}$ and define
\[
u^n(t) = \begin{cases} 0, & u(t) < -n,\\ u(t), & -n\le u(t)\le n,\\ 0, & u(t) > n.\end{cases}
\]
Clearly, $\sup_{\omega,t}|u^n(t)|\le n$ and $u^n\in L^2_{a,T}$. Moreover,
\[
E\int_0^T |u^n(s)-u(s)|^2\,ds = E\int_0^T |u(s)|^2\,1\!\!1_{\{|u(s)|>n\}}\,ds \to 0,
\]
where we have used that for a function $f\in L^1([0,T],\mathcal{B}([0,T]),dt)$,
\[
\lim_{n\to\infty}\int_0^T |f(t)|\,1\!\!1_{\{|f(t)|>n\}}\,dt = 0,
\]
and then applied the dominated convergence theorem on $L^1(\Omega,\mathcal{F},P)$, the domination being given by $\int_0^T |u(s)|^2\,ds$.

Using the approximation result provided by the preceding proposition, we can give the following definition.

Definition 3.1 The Ito stochastic integral of a process $u\in L^2_{a,T}$ is
\[
\int_0^T u_t\,dB_t := L^2(\Omega)-\lim_{n\to\infty}\int_0^T u^n_t\,dB_t. \tag{3.5}
\]

For this definition to make sense, one needs to make sure that if the process $u$ is approximated by two different sequences, say $u^{n,1}$ and $u^{n,2}$, the definitions of the stochastic integral using either $u^{n,1}$ or $u^{n,2}$ coincide. This is proved using the isometry property. Indeed,
\begin{align*}
E\left(\int_0^T u^{n,1}_t\,dB_t - \int_0^T u^{n,2}_t\,dB_t\right)^2 &= \int_0^T E\big(u^{n,1}_t - u^{n,2}_t\big)^2\,dt\\
&\le 2\int_0^T E\big(u^{n,1}_t - u_t\big)^2\,dt + 2\int_0^T E\big(u^{n,2}_t - u_t\big)^2\,dt \to 0.
\end{align*}
Consequently, denoting by $I^i(u) = L^2(\Omega)-\lim_{n\to\infty}\int_0^T u^{n,i}_t\,dB_t$, $i=1,2$, and using the triangular inequality, we have
\[
\|I^1(u)-I^2(u)\|_2 \le \Big\|I^1(u) - \int_0^T u^{n,1}_t\,dB_t\Big\|_2 + \Big\|\int_0^T u^{n,1}_t\,dB_t - \int_0^T u^{n,2}_t\,dB_t\Big\|_2 + \Big\|I^2(u) - \int_0^T u^{n,2}_t\,dB_t\Big\|_2,
\]
where $\|\cdot\|_2$ denotes the norm in $L^2(\Omega)$. The right-hand side of this inequality tends to zero as $n\to\infty$; hence the left-hand side is null.
By its very definition, the stochastic integral of Definition 3.1 satisfies the isometry property as well. Indeed, using again the notation $I(\cdot)$ for the stochastic integral, we have
\[
\|I(u)\|_{L^2(\Omega)} = \lim_{n\to\infty}\|I(u^n)\|_{L^2(\Omega)} = \lim_{n\to\infty}\|u^n\|_{L^2(\Omega\times[0,T])} = \|u\|_{L^2(\Omega\times[0,T])}.
\]

Moreover,

(a) stochastic integrals are centered random variables:
\[
E\left(\int_0^T u_t\,dB_t\right) = 0,
\]

(b) stochastic integration is a linear operator:
\[
\int_0^T (au_t + bv_t)\,dB_t = a\int_0^T u_t\,dB_t + b\int_0^T v_t\,dB_t.
\]

Remember that these facts are true for processes in $\mathcal{E}$, as has been mentioned before. The extension to processes in $L^2_{a,T}$ is done by applying Proposition 3.1. For the sake of illustration, we prove (a).
Consider an approximating sequence $u^n$ in the sense of Proposition 3.1. By the construction of the stochastic integral $\int_0^T u_t\,dB_t$, it holds that
\[
\lim_{n\to\infty} E\left(\int_0^T u^n_t\,dB_t\right) = E\left(\int_0^T u_t\,dB_t\right).
\]
Since $E\big(\int_0^T u^n_t\,dB_t\big) = 0$ for every $n\ge 1$, this concludes the proof.

We end this section with an interesting example.

Example 3.1 For the Brownian motion $B$, the following formula holds:
\[
\int_0^T B_t\,dB_t = \frac12\big(B_T^2 - T\big).
\]
Let us remark that we would rather expect $\int_0^T B_t\,dB_t = \frac12 B_T^2$, by analogy with the rules of deterministic calculus.
To prove this identity, we consider a particular sequence of approximating step processes based on the partition $\big\{\frac{jT}{n}, j=0,\dots,n\big\}$, as follows:
\[
u^n_t = \sum_{j=1}^n B_{t_{j-1}}\,1\!\!1_{]t_{j-1},t_j]}(t),
\]
with $t_j = \frac{jT}{n}$. Clearly, $u^n\in L^2_{a,T}$ and we have
\[
\int_0^T E\big(u^n_t - B_t\big)^2\,dt = \sum_{j=1}^n\int_{t_{j-1}}^{t_j} E\big(B_{t_{j-1}} - B_t\big)^2\,dt \le \frac Tn\sum_{j=1}^n\int_{t_{j-1}}^{t_j} dt = \frac{T^2}{n}.
\]
Therefore, $(u^n, n\ge 1)$ is an approximating sequence of $B$ in the norm of $L^2(\Omega\times[0,T])$.
According to Definition 3.1,
\[
\int_0^T B_t\,dB_t = \lim_{n\to\infty}\sum_{j=1}^n B_{t_{j-1}}\big(B_{t_j} - B_{t_{j-1}}\big),
\]
in the $L^2(\Omega)$ norm.
Clearly,
\begin{align*}
\sum_{j=1}^n B_{t_{j-1}}\big(B_{t_j}-B_{t_{j-1}}\big) &= \frac12\sum_{j=1}^n\big(B_{t_j}^2 - B_{t_{j-1}}^2\big) - \frac12\sum_{j=1}^n\big(B_{t_j}-B_{t_{j-1}}\big)^2\\
&= \frac12 B_T^2 - \frac12\sum_{j=1}^n\big(B_{t_j}-B_{t_{j-1}}\big)^2. \tag{3.6}
\end{align*}
We conclude by using Proposition 2.3.
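The convergence in Example 3.1 can be visualized numerically. The following is a minimal sketch (not part of the notes): the left-point Riemann sums $\sum_j B_{t_{j-1}}(B_{t_j}-B_{t_{j-1}})$ approach $(B_T^2-T)/2$ in $L^2(\Omega)$ as the partition is refined:

```python
import numpy as np

# Pathwise check of Example 3.1 on a fine partition.
rng = np.random.default_rng(3)
n_paths, n, T = 1000, 2000, 1.0
dB = rng.normal(0.0, np.sqrt(T / n), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])  # B_{t_{j-1}}
riemann = (B_left * dB).sum(axis=1)
closed_form = (B[:, -1] ** 2 - T) / 2
err = np.mean((riemann - closed_form) ** 2)              # L2(Omega) error
print(err < 0.01)
```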

3.2 The Ito integral as a stochastic process

The indefinite Ito stochastic integral of a process $u\in L^2_{a,T}$ is defined as follows:
\[
\int_0^t u_s\,dB_s := \int_0^T u_s\,1\!\!1_{[0,t]}(s)\,dB_s,\quad t\in[0,T]. \tag{3.7}
\]
For this definition to make sense, we need that for any $t\in[0,T]$, the process $\{u_s 1\!\!1_{[0,t]}(s), s\in[0,T]\}$ belongs to $L^2_{a,T}$. This is clearly true.
Obviously, the properties of the integral mentioned in the previous section, like zero mean, isometry and linearity, also hold for the indefinite integral.
The rest of the section is devoted to the study of important properties of the stochastic process given by an indefinite Ito integral.

Proposition 3.2 The process $\{I_t = \int_0^t u_s\,dB_s,\ t\in[0,T]\}$ is a martingale.

Proof: We first establish the martingale property for an approximating sequence
\[
I^n_t = \int_0^t u^n_s\,dB_s,\quad t\in[0,T],
\]
where $u^n$ converges to $u$ in $L^2(\Omega\times[0,T])$. This suffices to prove the proposition, since $L^2(\Omega)$–limits of martingales are again martingales (this fact follows from Jensen's inequality).
Let $\{u^n_t, t\in[0,T]\}$ be defined by the right-hand side of (3.2). Fix $0\le s\le t\le T$ and assume that $t_{k-1}<s\le t_k<t_l<t\le t_{l+1}$. Then
\[
I^n_t - I^n_s = u_k(B_{t_k}-B_s) + \sum_{j=k+1}^{l} u_j\big(B_{t_j}-B_{t_{j-1}}\big) + u_{l+1}(B_t - B_{t_l}).
\]
Using properties (g) and (f), respectively, of the conditional expectation (see Appendix 1) yields
\[
E(I^n_t - I^n_s/\mathcal{F}_s) = E\big(u_k(B_{t_k}-B_s)/\mathcal{F}_s\big) + \sum_{j=k+1}^{l} E\big(E\big(u_j\Delta_j B/\mathcal{F}_{t_{j-1}}\big)/\mathcal{F}_s\big) + E\big(u_{l+1}E(B_t-B_{t_l}/\mathcal{F}_{t_l})/\mathcal{F}_s\big) = 0.
\]
This finishes the proof of the proposition.

A proof not very different from that of Proposition 2.3 yields the following.

Proposition 3.3 For any bounded process $u\in L^2_{a,T}$,
\[
L^1(\Omega)-\lim_{n\to\infty}\sum_{j=1}^n\left(\int_{t_{j-1}}^{t_j} u_s\,dB_s\right)^2 = \int_0^t u_s^2\,ds,
\]
where $\{0 = t_0 < t_1 < \cdots < t_n = t\}$ runs over a sequence of partitions of $[0,t]$ whose mesh tends to zero.
That means, the "quadratic variation" of the indefinite stochastic integral is given by the process $\{\int_0^t u_s^2\,ds,\ t\in[0,T]\}$.

The isometry property of the stochastic integral can be extended in the following sense. Let $p\in[2,\infty[$. Then,
\[
E\left|\int_0^t u_s\,dB_s\right|^p \le C(p)\,E\left(\int_0^t u_s^2\,ds\right)^{\frac p2}. \tag{3.8}
\]
Here $C(p)$ is a positive constant depending on $p$. This is Burkholder's inequality.
A combination of Burkholder's inequality and Kolmogorov's continuity criterion allows us to deduce the continuity of the sample paths of the indefinite stochastic integral. Indeed, assume that $\int_0^T E|u_r|^p\,dr < \infty$ for any $p\in[2,\infty[$. Using first (3.8) and then Hölder's inequality (be smart!) implies
\[
E\left|\int_s^t u_r\,dB_r\right|^p \le C(p)\,E\left(\int_s^t u_r^2\,dr\right)^{\frac p2} \le C(p)\,|t-s|^{\frac p2 - 1}\int_s^t E|u_r|^p\,dr \le C(p)\,|t-s|^{\frac p2 - 1}.
\]
Since $p\ge 2$ is arbitrary, with Theorem 1.1 we conclude that the sample paths of $\{\int_0^t u_s\,dB_s,\ t\in[0,T]\}$ are $\gamma$–Hölder continuous for any $\gamma\in\,]0,\frac12[$.

3.3 An extension of the Ito integral

In Section 3.1 we introduced the set $L^2_{a,T}$ and defined the stochastic integral of processes of this class with respect to the Brownian motion. In this section we shall consider a larger class of integrands. The notations and underlying filtration are the same as in Section 3.1.
Let $\Lambda^2_{a,T}$ be the set of real valued processes $u$ adapted to the filtration $(\mathcal{F}_t, t\ge 0)$, jointly measurable in $(t,\omega)$ with respect to the product $\sigma$-field $\mathcal{B}([0,T])\otimes\mathcal{F}$ and satisfying
\[
P\left\{\int_0^T u_t^2\,dt < \infty\right\} = 1. \tag{3.9}
\]
Clearly $L^2_{a,T}\subset\Lambda^2_{a,T}$. Our aim is to define the stochastic integral for processes in $\Lambda^2_{a,T}$. For this we shall follow the same approach as in Section 3.1. Firstly, we start with step processes of the form (3.2) belonging to $\Lambda^2_{a,T}$ and define the integral as in (3.3). The extension to processes in $\Lambda^2_{a,T}$ needs two ingredients. The first one is an approximation result that we now state without giving a proof. The reader may consult for instance [1].

Proposition 3.4 Let $u\in\Lambda^2_{a,T}$. There exists a sequence of step processes $(u^n, n\ge 1)$ of the form (3.2) belonging to $\Lambda^2_{a,T}$ such that
\[
\lim_{n\to\infty}\int_0^T |u^n_t - u_t|^2\,dt = 0,\quad \text{a.s.}
\]

The second ingredient gives a connection between stochastic integrals of step processes in $\Lambda^2_{a,T}$ and their quadratic variation, as follows.

Proposition 3.5 Let $u$ be a step process in $\Lambda^2_{a,T}$. Then for any $\varepsilon>0$, $N>0$,
\[
P\left\{\left|\int_0^T u_t\,dB_t\right|>\varepsilon\right\} \le P\left\{\int_0^T u_t^2\,dt>N\right\} + \frac{N}{\varepsilon^2}. \tag{3.10}
\]

Proof: It is based on a truncation argument. Let $u$ be given by the right-hand side of (3.2) (here it is not necessary to assume that the random variables $u_j$ are in $L^2(\Omega)$). Fix $N>0$ and define
\[
v^N_t = \begin{cases} u_j, & \text{if } t\in[t_{j-1},t_j[ \text{ and } \sum_{i=1}^{j} u_i^2\,(t_i-t_{i-1})\le N,\\[4pt] 0, & \text{if } t\in[t_{j-1},t_j[ \text{ and } \sum_{i=1}^{j} u_i^2\,(t_i-t_{i-1})>N. \end{cases}
\]
The process $\{v^N_t, t\in[0,T]\}$ belongs to $L^2_{a,T}$. Indeed, by definition,
\[
\int_0^T |v^N_t|^2\,dt \le N.
\]
Moreover, if $\int_0^T u_t^2\,dt\le N$, necessarily $u_t = v^N_t$ for any $t\in[0,T]$. Then, by considering the decomposition
\[
\left\{\left|\int_0^T u_t\,dB_t\right|>\varepsilon\right\} = \left\{\left|\int_0^T u_t\,dB_t\right|>\varepsilon,\ \int_0^T u_t^2\,dt>N\right\} \cup \left\{\left|\int_0^T u_t\,dB_t\right|>\varepsilon,\ \int_0^T u_t^2\,dt\le N\right\},
\]
we obtain
\[
P\left\{\left|\int_0^T u_t\,dB_t\right|>\varepsilon\right\} \le P\left\{\left|\int_0^T v^N_t\,dB_t\right|>\varepsilon\right\} + P\left\{\int_0^T u_t^2\,dt>N\right\}.
\]
We finally apply Chebyshev's inequality along with the isometry property of the stochastic integral for processes in $L^2_{a,T}$ and get
\[
P\left\{\left|\int_0^T v^N_t\,dB_t\right|>\varepsilon\right\} \le \frac{1}{\varepsilon^2}\,E\left(\int_0^T v^N_t\,dB_t\right)^2 \le \frac{N}{\varepsilon^2}.
\]
This ends the proof of the result.

The extension

Fix $u\in\Lambda^2_{a,T}$ and consider a sequence of step processes $(u^n, n\ge 1)$ of the form (3.2) belonging to $\Lambda^2_{a,T}$ such that
\[
\lim_{n\to\infty}\int_0^T |u^n_t - u_t|^2\,dt = 0, \tag{3.11}
\]
in probability.
By Proposition 3.5, for any $\varepsilon>0$, $N>0$ we have
\[
P\left\{\left|\int_0^T (u^n_t - u^m_t)\,dB_t\right|>\varepsilon\right\} \le P\left\{\int_0^T (u^n_t-u^m_t)^2\,dt>N\right\} + \frac{N}{\varepsilon^2}.
\]
Given $\delta>0$, we may first take $N$ small enough so that $\frac{N}{\varepsilon^2}\le\frac{\delta}{2}$; then, using (3.11), for $n, m$ big enough,
\[
P\left\{\int_0^T (u^n_t-u^m_t)^2\,dt>N\right\} \le \frac{\delta}{2}.
\]
Consequently, we have proved that the sequence of stochastic integrals of step processes
\[
\left(\int_0^T u^n_t\,dB_t,\ n\ge 1\right) \tag{3.12}
\]
is Cauchy in probability. The space $L^0(\Omega)$ of classes of a.s. finite random variables, endowed with the convergence in probability, is a complete metric space. For example, a possible distance is
\[
d(X,Y) = E\left(\frac{|X-Y|}{1+|X-Y|}\right).
\]
Hence the sequence (3.12) does have a limit in probability. Then, we define
\[
\int_0^T u_t\,dB_t = P-\lim_{n\to\infty}\int_0^T u^n_t\,dB_t. \tag{3.13}
\]
It is easy to check that this definition is indeed independent of the particular approximating sequence used in the construction.

3.4 A change of variables formula: Ito's formula

As in Example 3.1, we can prove the following formula, valid for any $t\ge 0$:
\[
B_t^2 = 2\int_0^t B_s\,dB_s + t. \tag{3.14}
\]
If the sample paths of $\{B_t, t\ge 0\}$ were sufficiently smooth (for example, of bounded variation) we would rather have
\[
B_t^2 = 2\int_0^t B_s\,dB_s. \tag{3.15}
\]
Why is it so? Consider a decomposition similar to the one given in (3.6), obtained by restricting the time interval to $[0,t]$. More concretely, consider the partition of $[0,t]$ defined by $0 = t_0\le t_1\le\cdots\le t_n = t$. Then
\[
B_t^2 = \sum_{j=0}^{n-1}\big(B_{t_{j+1}}^2 - B_{t_j}^2\big) = 2\sum_{j=0}^{n-1} B_{t_j}\big(B_{t_{j+1}}-B_{t_j}\big) + \sum_{j=0}^{n-1}\big(B_{t_{j+1}}-B_{t_j}\big)^2, \tag{3.16}
\]
where we have used that $B_0 = 0$.
Consider a sequence of partitions of $[0,t]$ whose mesh tends to zero. We already know that
\[
\sum_{j=0}^{n-1}\big(B_{t_{j+1}}-B_{t_j}\big)^2 \to t,
\]
in $L^2(\Omega)$. This gives the extra contribution in the development of $B_t^2$ in comparison with the classical calculus approach.
Notice that, if $B$ were of bounded variation, then we could argue as follows:
\[
\sum_{j=0}^{n-1}\big(B_{t_{j+1}}-B_{t_j}\big)^2 \le \sup_{0\le j\le n-1}|B_{t_{j+1}}-B_{t_j}|\times\sum_{j=0}^{n-1}|B_{t_{j+1}}-B_{t_j}|.
\]
By the continuity of the sample paths of the Brownian motion, the first factor on the right-hand side of the preceding inequality tends to zero as the mesh of the partition tends to zero, while the second factor remains finite, by the bounded variation property; hence the quadratic variation would vanish and (3.15) would follow.
Summarising: differential calculus with respect to the Brownian motion should take into account second order differential terms. Roughly speaking,
\[
(dB_t)^2 = dt.
\]
A precise meaning of this formal formula is given in Proposition 2.3.
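The rule $(dB_t)^2 = dt$ can be illustrated in simulation. The following minimal sketch (not part of the notes) shows the sum of squared increments over a partition of $[0,t]$ concentrating around $t$ as the mesh goes to zero:

```python
import numpy as np

# The L2 distance of the quadratic variation sums to t shrinks with the mesh.
rng = np.random.default_rng(4)
t, n_paths = 2.0, 2000
errors = []
for n in (10, 100, 5000):
    dB = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
    qv = (dB ** 2).sum(axis=1)               # sum_j (B_{t_{j+1}} - B_{t_j})^2
    errors.append(np.sqrt(np.mean((qv - t) ** 2)))
print(errors[0] > errors[1] > errors[2] and errors[2] < 0.1)
```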

3.4.1 One dimensional Ito's formula

In this section, we shall extend formula (3.14) and write an expression for $f(t,X_t)$ for a class of functions $f$ and a family of stochastic processes which include $f(x)=x^2$ and the Brownian motion, respectively. To illustrate the method of proof, we start with the particular case given in the next assertion (see [6]).

Theorem 3.1 Consider a function $f:\mathbb{R}\to\mathbb{R}$ belonging to $C^2$, the class of real functions with continuous derivatives up to order two. Then, for any $0\le a\le t$,
\[
f(B_t) = f(B_a) + \int_a^t f'(B_s)\,dB_s + \frac12\int_a^t f''(B_s)\,ds, \tag{3.17}
\]
a.s.

The proof relies on two technical lemmas. In the sequel, we will consider a sequence of partitions $\Pi_n = \{0 = t_0\le t_1\le\cdots\le t_n = t\}$ such that $\lim_{n\to\infty}|\Pi_n| = 0$.

Lemma 3.1 Let $g$ be a real continuous function and $\lambda_i\in(0,1)$, $i=1,\dots,n$. There exists a subsequence (denoted by $(n)$, for simplicity) such that
\[
X_n := \sum_{i=1}^n\big[g\big(B_{t_{i-1}} + \lambda_i(B_{t_i}-B_{t_{i-1}})\big) - g\big(B_{t_{i-1}}\big)\big]\big(B_{t_i}-B_{t_{i-1}}\big)^2,\quad n\ge 1,
\]
converges to zero a.s.

Proof. Consider the random variable
\[
Y_n = \max_{1\le i\le n,\,0<\lambda<1}\big|g\big(B_{t_{i-1}}+\lambda(B_{t_i}-B_{t_{i-1}})\big) - g(B_{t_{i-1}})\big|.
\]
Clearly,
\[
|X_n| \le Y_n\sum_{i=1}^n\big(B_{t_i}-B_{t_{i-1}}\big)^2.
\]
By the continuity of the sample paths of Brownian motion, $(Y_n)_{n\ge 1}$ converges to zero a.s. Moreover, $\big\{\sum_{i=1}^n(B_{t_i}-B_{t_{i-1}})^2,\ n\ge 1\big\}$ converges in $L^2(\Omega)$ to $t$, hence a.s. along a subsequence. The lemma follows.

Lemma 3.2 The hypotheses are the same as in Lemma 3.1. The sequence
\[
S_n := \sum_{i=1}^n g(B_{t_{i-1}})\big[(B_{t_i}-B_{t_{i-1}})^2 - (t_i-t_{i-1})\big],\quad n\ge 1,
\]
converges in probability to zero.

Proof. We will introduce a localization in $\Omega$. With this, $g$ may be assumed to be bounded, and we will prove convergence to zero in $L^2(\Omega)$. Finally, we will remove the localization and obtain the result. This is a quite usual procedure in probability theory.
Fix $L>0$ and set
\[
A^{(L)}_{i-1} = \{|B_{t_l}|\le L,\ 0\le l\le i-1\},\quad 1\le i\le n.
\]
Notice that this is a decreasing family in the parameter $i$. Define
\[
S_{n,L} = \sum_{i=1}^n g(B_{t_{i-1}})\,1\!\!1_{A^{(L)}_{i-1}}\big[(B_{t_i}-B_{t_{i-1}})^2 - (t_i-t_{i-1})\big].
\]
This is a localization of $S_n$.
To simplify the notation, we set
\[
X_i = (B_{t_i}-B_{t_{i-1}})^2 - (t_i-t_{i-1}),\qquad Y_i = g(B_{t_{i-1}})\,1\!\!1_{A^{(L)}_{i-1}}\,X_i,
\]
$1\le i\le n$. Our aim is to prove first that $E\big(|\sum_i Y_i|^2\big)\to 0$.
Let $\mathcal{F}_t = \sigma(B_s, 0\le s\le t)$. Fix $1\le i<j\le n$. Then
\[
E(Y_iY_j) = E\big(E(Y_iY_j/\mathcal{F}_{t_{j-1}})\big) = E\big(Y_i\,g(B_{t_{j-1}})\,1\!\!1_{A^{(L)}_{j-1}}\,E(X_j/\mathcal{F}_{t_{j-1}})\big) = 0.
\]
We clearly have $Y_i^2\le\sup_{|x|\le L}|g(x)|^2\,X_i^2$ and therefore,
\[
E(Y_i^2) \le C(t_i-t_{i-1})^2\sup_{|x|\le L}|g(x)|^2.
\]
Consequently,
\[
E(S_{n,L}^2) = \sum_{i=1}^n E(Y_i^2) \le Ct\sup_{|x|\le L}|g(x)|^2\,|\Pi_n|,
\]
which converges to zero as $n\to\infty$.
Next, we fix $\varepsilon>0$ and write
\[
P\{|S_n|>\varepsilon\} = P\{|S_n|>\varepsilon,\ S_n=S_{n,L}\} + P\{|S_n|>\varepsilon,\ S_n\ne S_{n,L}\} \le P\{|S_{n,L}|>\varepsilon\} + P\{S_n\ne S_{n,L}\}.
\]
Chebyshev's inequality yields
\[
\lim_{n\to\infty} P\{|S_{n,L}|>\varepsilon\} \le \varepsilon^{-2}\lim_{n\to\infty} E(|S_{n,L}|^2) = 0.
\]
Moreover, if $\omega$ is such that $S_n(\omega)\ne S_{n,L}(\omega)$, there exists $i=1,\dots,n$ such that $\omega\in(A^{(L)}_{i-1})^c$. The family $(A^{(L)}_{i-1})_i$ decreases in $i$; consequently, $\omega\in(A^{(L)}_{n-1})^c$. Thus,
\[
P\{S_n\ne S_{n,L}\} \le P\big\{(A^{(L)}_{n-1})^c\big\} \le P\Big\{\sup_{0\le s\le t}|B_s|>L\Big\}.
\]
We proved in Proposition 2.4 that the laws of $\sup_{0\le s\le t}B_s$ and $|B_t|$ are the same. By the symmetry of Brownian motion,
\[
P\Big\{\sup_{0\le s\le t}|B_s|>L\Big\} \le 2P\Big\{\sup_{0\le s\le t}B_s>L\Big\} = 2P\{|B_t|>L\} \le 2L^{-1}E(|B_t|) = 2L^{-1}\sqrt{\frac{2t}{\pi}}.
\]
This yields
\[
\lim_{L\to\infty}\sup_{n} P\{S_n\ne S_{n,L}\} = 0,
\]
and ends the proof of the lemma.

Proof of Theorem 3.1

For simplicity, we take $a=0$. We fix $\omega$ and consider a Taylor expansion up to second order. This yields
\[
f(B_t) - f(0) = \sum_{i=1}^n f'(B_{t_{i-1}})\big(B_{t_i}-B_{t_{i-1}}\big) + \frac12\sum_{i=1}^n f''\big(B_{t_{i-1}}+\lambda_i(B_{t_i}-B_{t_{i-1}})\big)\big(B_{t_i}-B_{t_{i-1}}\big)^2,
\]
for some $\lambda_i\in(0,1)$.
The stochastic process $\{u_t = f'(B_t), t\ge 0\}$ has continuous sample paths, a.s. Let
\[
u^n(t) = \sum_{i=1}^n f'(B_{t_{i-1}})\,1\!\!1_{[t_{i-1},t_i[}(t).
\]
By continuity,
\[
\lim_{n\to\infty}\int_0^t |u^n(s)-u(s)|^2\,ds = 0,\quad \text{a.s.}
\]
Therefore,
\[
\lim_{n\to\infty}\sum_{i=1}^n f'(B_{t_{i-1}})\big(B_{t_i}-B_{t_{i-1}}\big) = \lim_{n\to\infty}\int_0^t u^n(s)\,dB_s = \int_0^t u(s)\,dB_s,
\]
in probability, and also a.s. along some subsequence.
Next we consider the terms with the second derivative. By the triangular inequality, we have
\begin{align*}
&\Big|\sum_{i=1}^n f''\big(B_{t_{i-1}}+\lambda_i(B_{t_i}-B_{t_{i-1}})\big)\big(B_{t_i}-B_{t_{i-1}}\big)^2 - \int_0^t f''(B_s)\,ds\Big|\\
&\quad\le \Big|\sum_{i=1}^n f''\big(B_{t_{i-1}}+\lambda_i(B_{t_i}-B_{t_{i-1}})\big)\big(B_{t_i}-B_{t_{i-1}}\big)^2 - \sum_{i=1}^n f''(B_{t_{i-1}})\big(B_{t_i}-B_{t_{i-1}}\big)^2\Big|\\
&\qquad + \Big|\sum_{i=1}^n f''(B_{t_{i-1}})\big[(B_{t_i}-B_{t_{i-1}})^2 - (t_i-t_{i-1})\big]\Big|\\
&\qquad + \Big|\sum_{i=1}^n f''(B_{t_{i-1}})(t_i-t_{i-1}) - \int_0^t f''(B_s)\,ds\Big|.
\end{align*}
The first term on the right-hand side of the preceding inequality converges to zero as $n\to\infty$, a.s.; indeed, this follows from Lemma 3.1 applied to the function $g := f''$. With the same choice of $g$, Lemma 3.2 yields the a.s. convergence to zero of a subsequence of the second term. Finally, the third term converges to zero as well, a.s., by the classical result on approximation of Riemann integrals by Riemann sums.
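Formula (3.17) can be checked numerically for a concrete $f$. The following minimal sketch (not part of the notes) takes $f(x)=x^3$, for which (3.17) reads $B_t^3 = 3\int_0^t B_s^2\,dB_s + 3\int_0^t B_s\,ds$, and approximates both integrals on a fine partition:

```python
import numpy as np

# Pathwise check of Ito's formula (3.17) for f(x) = x^3.
rng = np.random.default_rng(5)
n_paths, n, t = 1000, 4000, 1.0
dB = rng.normal(0.0, np.sqrt(t / n), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])
stoch_term = 3 * (B_left ** 2 * dB).sum(axis=1)   # int f'(B) dB, f' = 3x^2
drift_term = 3 * B_left.sum(axis=1) * (t / n)     # (1/2) int f''(B) ds, f'' = 6x
err = np.mean((B[:, -1] ** 3 - stoch_term - drift_term) ** 2)
print(err < 0.05)
```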

We now introduce the class of Ito processes, for which we will prove a more general version of the Ito formula.

Definition 3.2 Let $\{v_t, t\in[0,T]\}$ be an adapted stochastic process whose sample paths are almost surely Lebesgue integrable, that is, $\int_0^T |v_t|\,dt<\infty$, a.s. Consider a stochastic process $\{u_t, t\in[0,T]\}$ belonging to $\Lambda^2_{a,T}$ and a random variable $X_0$. The stochastic process defined by
\[
X_t = X_0 + \int_0^t u_s\,dB_s + \int_0^t v_s\,ds, \tag{3.18}
\]
$t\in[0,T]$, is termed an Ito process.

An alternative writing of (3.18) in differential form is
\[
dX_t = u_t\,dB_t + v_t\,dt.
\]
Let $C^{1,2}$ denote the set of functions on $[0,T]\times\mathbb{R}$ which are jointly continuous in $(t,x)$, continuously differentiable in $t$ and twice continuously differentiable in $x$, with jointly continuous derivatives. Our next aim is to prove an Ito formula for the stochastic process $\{f(t,X_t), t\in[0,T]\}$, $f\in C^{1,2}$. This will be an extension of (3.17).

Theorem 3.2 Let $f:[0,T]\times\mathbb{R}\to\mathbb{R}$ be a function in $C^{1,2}$ and let $X$ be an Ito process with the decomposition given in (3.18). The following formula holds:
\begin{align*}
f(t,X_t) &= f(0,X_0) + \int_0^t \partial_sf(s,X_s)\,ds + \int_0^t \partial_xf(s,X_s)u_s\,dB_s\\
&\quad + \int_0^t \partial_xf(s,X_s)v_s\,ds + \frac12\int_0^t \partial^2_{xx}f(s,X_s)u_s^2\,ds. \tag{3.19}
\end{align*}
Formula (3.19) can also be written as
\begin{align*}
f(t,X_t) &= f(0,X_0) + \int_0^t \partial_sf(s,X_s)\,ds + \int_0^t \partial_xf(s,X_s)\,dX_s\\
&\quad + \frac12\int_0^t \partial^2_{xx}f(s,X_s)\,(dX_s)^2, \tag{3.20}
\end{align*}
or, in differential form,
\[
df(t,X_t) = \partial_tf(t,X_t)\,dt + \partial_xf(t,X_t)\,dX_t + \frac12\partial^2_{xx}f(t,X_t)\,(dX_t)^2, \tag{3.21}
\]
where $(dX_t)^2$ is computed using the formal rules
\[
dB_t\times dB_t = dt,\qquad dB_t\times dt = dt\times dB_t = 0,\qquad dt\times dt = 0.
\]

Example 3.2 Consider the function
\[
f(t,x) = e^{\mu t - \frac{\sigma^2}{2}t + \sigma x},
\]
with $\mu,\sigma\in\mathbb{R}$. Applying formula (3.19) to $X_t := B_t$ (a Brownian motion) yields
\[
f(t,B_t) = 1 + \mu\int_0^t f(s,B_s)\,ds + \sigma\int_0^t f(s,B_s)\,dB_s.
\]
Hence, the process $Y_t = f(t,B_t)$, $t\ge 0$, satisfies the equation
\[
Y_t = 1 + \mu\int_0^t Y_s\,ds + \sigma\int_0^t Y_s\,dB_s.
\]
It is termed geometric Brownian motion. The equivalent differential form of this identity is the linear stochastic differential equation
\[
dY_t = \mu Y_t\,dt + \sigma Y_t\,dB_t,\qquad Y_0 = 1. \tag{3.22}
\]
Black and Scholes proposed, as a model of a market with a single risky asset with initial value $S_0 = 1$, the process $S_t = Y_t$. We have seen that such a process is in fact the solution to a linear stochastic differential equation (see (3.22)).
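Equation (3.22) can be simulated and compared with the closed form $Y_t = \exp(\mu t - \sigma^2 t/2 + \sigma B_t)$ obtained in Example 3.2. The following minimal sketch (not part of the notes) uses the Euler–Maruyama discretization, a scheme not discussed here, as the simplest way to integrate the SDE step by step:

```python
import numpy as np

# Euler-Maruyama simulation of dY = mu*Y dt + sigma*Y dB, Y_0 = 1,
# compared against the closed-form geometric Brownian motion.
rng = np.random.default_rng(6)
mu, sigma, t, n, n_paths = 0.1, 0.4, 1.0, 4000, 2000
dt = t / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
Y = np.ones(n_paths)                        # Euler-Maruyama iterates, Y_0 = 1
for k in range(n):
    Y = Y + mu * Y * dt + sigma * Y * dB[:, k]
exact = np.exp(mu * t - sigma ** 2 * t / 2 + sigma * dB.sum(axis=1))
print(np.mean(np.abs(Y - exact)) < 0.05)
```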

In the particular case where $f:\mathbb{R}\to\mathbb{R}$ is a function in $C^2$ (twice continuously differentiable), Theorem 3.2 gives the following version of Ito's formula:
\[
f(X_t) = f(X_0) + \int_0^t f'(X_s)u_s\,dB_s + \int_0^t f'(X_s)v_s\,ds + \frac12\int_0^t f''(X_s)u_s^2\,ds. \tag{3.23}
\]

Proof of Theorem 3.2

Let Πn = 0 = tn0 < · · · < tnpn = t be a sequence of increasing partitionssuch that limn→∞ |Πn| = 0. First, we consider the decompositionWe can write

f(t,Xt)− f(0, X0) =

pn−1∑i=0

[f(tni+1, X

nti+1

)− f(ti, Xnti

)]

=

pn−1∑i=0

[f(tni+1, X

nti

)− f(tni , Xnti

)]

+[f(tni+1, X

nti+1

)− f(tni+1, Xnti

)]

=

pn−1∑i=0

[∂sf(tni , X

nti

)(tni+1 − tni )]

+[∂xf(tni+1, X

nti

)(Xnti+1−Xn

ti)]

+1

2

pn−1∑i=0

∂2xxf(tni+1, X

ni )(Xn

ti+1−Xn

ti)2. (3.24)

with tni ∈]tni , tni+1[ and Xn

i an intermediate (random) point on the segmentdetermined by Xn

tiand Xn

ti+1.

In fact, this follows from a Taylor expansion of the function f up to the firstorder in the variable s (or the mean-value theorem), and up to the secondorder in the variable x. The asymmetry in the orders is due to the existenceof quadratic variation of the processes involved. The expresion (3.24) is theanalogue of (3.16). The former is much simpler for two reasons. Firstly, thereis no s-variable; secondly, f is a polynomial of second degree, and therefore ithas an exact Taylor expansion. But both formulas have the same structure.When passing to the limit as n→∞, we expect

\[
\begin{aligned}
\sum_{i=0}^{p_n-1}\partial_s f(\bar t_i^n, X_{t_i^n})(t_{i+1}^n - t_i^n) &\to \int_0^t \partial_s f(s,X_s)\,ds,\\
\sum_{i=0}^{p_n-1}\partial_x f(t_{i+1}^n, X_{t_i^n})(X_{t_{i+1}^n} - X_{t_i^n}) &\to \int_0^t \partial_x f(s,X_s)u_s\,dB_s + \int_0^t \partial_x f(s,X_s)v_s\,ds,\\
\sum_{i=0}^{p_n-1}\partial^2_{xx} f(t_{i+1}^n, \bar X_i^n)(X_{t_{i+1}^n} - X_{t_i^n})^2 &\to \int_0^t \partial^2_{xx} f(s,X_s)u_s^2\,ds,
\end{aligned}
\]

in some topology. This actually holds in the sense of a.s. convergence (by taking, if necessary, a subsequence). As in Theorem 3.1, the proof requires a localization in $\Omega$. However, this can be avoided under the following additional assumptions: the process $v$ is bounded; $u \in L^2_{a,T}$; the partial derivatives $\partial_x f$, $\partial^2_{xx} f$ are bounded.

We shall give a proof of the theorem under these additional hypotheses.

Checking the convergences

First term. We prove

\[
\sum_{i=0}^{p_n-1}\partial_s f(\bar t_i^n, X_{t_i^n})(t_{i+1}^n - t_i^n) \to \int_0^t \partial_s f(s,X_s)\,ds, \tag{3.25}
\]

a.s. Indeed,

\[
\begin{aligned}
&\Bigl|\sum_{i=0}^{p_n-1}\partial_s f(\bar t_i^n, X_{t_i^n})(t_{i+1}^n - t_i^n) - \int_0^t \partial_s f(s,X_s)\,ds\Bigr|
= \Bigl|\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl[\partial_s f(\bar t_i^n, X_{t_i^n}) - \partial_s f(s,X_s)\bigr]\,ds\Bigr|\\
&\qquad\le \sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl|\partial_s f(\bar t_i^n, X_{t_i^n}) - \partial_s f(s,X_s)\bigr|\,ds\\
&\qquad\le t \sup_{0\le i\le p_n-1}\ \sup_{s\in[t_i^n,t_{i+1}^n]}\bigl|\bigl(\partial_s f(\bar t_i^n, X_{t_i^n}) - \partial_s f(s,X_s)\bigr)\mathbf{1}_{[t_i^n,t_{i+1}^n]}(s)\bigr|.
\end{aligned}
\]

The continuity of $\partial_s f$ along with that of the process $X$ implies

\[
\lim_{n\to\infty}\ \sup_{0\le i\le p_n-1}\ \sup_{s\in[t_i^n,t_{i+1}^n]}\bigl|\bigl(\partial_s f(\bar t_i^n, X_{t_i^n}) - \partial_s f(s,X_s)\bigr)\mathbf{1}_{[t_i^n,t_{i+1}^n]}(s)\bigr| = 0.
\]

This gives (3.25).

Second term. We prove

\[
\sum_{i=0}^{p_n-1}\partial_x f(t_i^n, X_{t_i^n})(X_{t_{i+1}^n} - X_{t_i^n}) \to \int_0^t \partial_x f(s,X_s)\,dX_s \tag{3.26}
\]

in probability, which amounts to checking the two convergences

\[
\sum_{i=0}^{p_n-1}\partial_x f(t_i^n, X_{t_i^n})\int_{t_i^n}^{t_{i+1}^n} u_t\,dB_t \to \int_0^t \partial_x f(s,X_s)u_s\,dB_s, \tag{3.27}
\]

\[
\sum_{i=0}^{p_n-1}\partial_x f(t_i^n, X_{t_i^n})\int_{t_i^n}^{t_{i+1}^n} v_t\,dt \to \int_0^t \partial_x f(s,X_s)v_s\,ds. \tag{3.28}
\]

We start with (3.27). We have

\[
\begin{aligned}
&E\Bigl|\sum_{i=0}^{p_n-1}\partial_x f(t_i^n, X_{t_i^n})\int_{t_i^n}^{t_{i+1}^n} u_t\,dB_t - \int_0^t \partial_x f(s,X_s)u_s\,dB_s\Bigr|^2\\
&\qquad= E\Bigl|\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl[\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr]u_s\,dB_s\Bigr|^2\\
&\qquad= \sum_{i=0}^{p_n-1} E\Bigl|\int_{t_i^n}^{t_{i+1}^n}\bigl[\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr]u_s\,dB_s\Bigr|^2\\
&\qquad= \sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n} E\Bigl(\bigl[\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr]u_s\Bigr)^2\,ds,
\end{aligned}
\]

where we have applied successively that the stochastic integrals on disjoint intervals are centered and mutually orthogonal, along with the isometry property. By the continuity of $\partial_x f$ and of the process $X$,

\[
\sup_{0\le i\le p_n-1}\ \sup_{s\in[t_i^n,t_{i+1}^n]}\bigl|\bigl(\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr)\mathbf{1}_{[t_i^n,t_{i+1}^n]}(s)\bigr| \to 0, \tag{3.29}
\]

a.s. Then, by bounded convergence,

\[
\sup_{0\le i\le p_n-1}\ \sup_{s\in[t_i^n,t_{i+1}^n]} E\bigl|\bigl[\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr]u_s\mathbf{1}_{[t_i^n,t_{i+1}^n]}(s)\bigr|^2 \to 0.
\]

Therefore we obtain (3.27) in $L^2(\Omega)$.

For the proof of (3.28) we write

\[
\begin{aligned}
&\Bigl|\sum_{i=0}^{p_n-1}\partial_x f(t_i^n, X_{t_i^n})\int_{t_i^n}^{t_{i+1}^n} v_t\,dt - \int_0^t \partial_x f(s,X_s)v_s\,ds\Bigr|
= \Bigl|\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl[\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr]v_s\,ds\Bigr|\\
&\qquad\le \sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl|\partial_x f(t_i^n, X_{t_i^n}) - \partial_x f(s,X_s)\bigr|\,|v_s|\,ds.
\end{aligned}
\]

By virtue of (3.29) and bounded convergence, we obtain (3.28) in the sense of a.s. convergence.

Third term

Set $f_{n,i} = \partial^2_{xx} f(t_{i+1}^n, \bar X_i^n)$. We have to prove

\[
\sum_{i=0}^{p_n-1} f_{n,i}\,(X_{t_{i+1}^n} - X_{t_i^n})^2 \to \int_0^t \partial^2_{xx} f(s,X_s)u_s^2\,ds. \tag{3.30}
\]

This will be a consequence of the following convergences:

\[
\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)^2 \to \int_0^t \partial^2_{xx} f(s,X_s)u_s^2\,ds, \tag{3.31}
\]

\[
\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)\Bigl(\int_{t_i^n}^{t_{i+1}^n} v_s\,ds\Bigr) \to 0, \tag{3.32}
\]

\[
\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} v_s\,ds\Bigr)^2 \to 0, \tag{3.33}
\]

in the sense of a.s. convergence.

Let us start by arguing on (3.31). We have

\[
E\Bigl|\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)^2 - \int_0^t \partial^2_{xx} f(s,X_s)u_s^2\,ds\Bigr| \le T_1 + T_2,
\]

with

\[
\begin{aligned}
T_1 &= E\Bigl|\sum_{i=0}^{p_n-1} f_{n,i}\Bigl[\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)^2 - \int_{t_i^n}^{t_{i+1}^n} u_s^2\,ds\Bigr]\Bigr|,\\
T_2 &= E\Bigl|\sum_{i=0}^{p_n-1} f_{n,i}\int_{t_i^n}^{t_{i+1}^n} u_s^2\,ds - \int_0^t \partial^2_{xx} f(s,X_s)u_s^2\,ds\Bigr|.
\end{aligned}
\]

Since $\partial^2_{xx} f$ is bounded,

\[
T_1 \le C\,E\sum_{i=0}^{p_n-1}\Bigl|\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)^2 - \int_{t_i^n}^{t_{i+1}^n} u_s^2\,ds\Bigr|.
\]

This tends to zero as $n\to\infty$ (see Proposition 3.3).

As for $T_2$, we have

\[
T_2 = E\Bigl|\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl[f_{n,i} - \partial^2_{xx} f(s,X_s)\bigr]u_s^2\,ds\Bigr|
\le E\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n}\bigl|f_{n,i} - \partial^2_{xx} f(s,X_s)\bigr|\,u_s^2\,ds.
\]

Using the continuity property, we have

\[
\sup_{0\le i\le p_n-1}\ \sup_{s\in[t_i^n,t_{i+1}^n]}\bigl|f_{n,i} - \partial^2_{xx} f(s,X_s)\bigr| \to 0,
\]

a.s. Then, by bounded convergence, $T_2$ converges to zero as $n\to\infty$. Hence we have proved (3.31) in $L^1(\Omega)$ (and therefore also a.s. along some subsequence).

Next we prove (3.32) in $L^1(\Omega)$. Indeed,

\[
\begin{aligned}
&E\Bigl|\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr)\Bigl(\int_{t_i^n}^{t_{i+1}^n} v_s\,ds\Bigr)\Bigr|
\le C\sum_{i=0}^{p_n-1} E\Bigl(\Bigl|\int_{t_i^n}^{t_{i+1}^n} u_s\,dB_s\Bigr|\,\Bigl|\int_{t_i^n}^{t_{i+1}^n} v_s\,ds\Bigr|\Bigr)\\
&\qquad\le C\sum_{i=0}^{p_n-1}\Bigl(E\int_{t_i^n}^{t_{i+1}^n} u_s^2\,ds\Bigr)^{\frac12}\Bigl(E\Bigl(\int_{t_i^n}^{t_{i+1}^n}|v_s|\,ds\Bigr)^2\Bigr)^{\frac12}\\
&\qquad\le C\sum_{i=0}^{p_n-1}|t_{i+1}^n - t_i^n|^{\frac12}\Bigl(\int_{t_i^n}^{t_{i+1}^n} E|u_s|^2\,ds\Bigr)^{\frac12}\Bigl(E\int_{t_i^n}^{t_{i+1}^n}|v_s|^2\,ds\Bigr)^{\frac12}\\
&\qquad\le C\sup_i|t_{i+1}^n - t_i^n|^{\frac12}\Bigl(\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n} E|u_s|^2\,ds\Bigr)^{\frac12}\Bigl(\sum_{i=0}^{p_n-1}\int_{t_i^n}^{t_{i+1}^n} E|v_s|^2\,ds\Bigr)^{\frac12},
\end{aligned}
\]

which tends to zero as $n\to\infty$.

The proof of (3.33) is very easy. Indeed,

\[
\Bigl|\sum_{i=0}^{p_n-1} f_{n,i}\Bigl(\int_{t_i^n}^{t_{i+1}^n} v_s\,ds\Bigr)^2\Bigr| \le C\sup_i\Bigl(\int_{t_i^n}^{t_{i+1}^n}|v_s|\,ds\Bigr)\int_0^t |v_s|\,ds.
\]

The first factor on the right-hand side of this inequality tends to zero as $n\to\infty$, while the second one is bounded a.s. Therefore (3.33) holds in the sense of a.s. convergence. This ends the proof of the theorem.

3.4.2 Multidimensional version of Ito's formula

Consider an $m$-dimensional Brownian motion $(B^1_t,\dots,B^m_t)$, $t \ge 0$, and real-valued Ito processes

\[
dX^i_t = \sum_{l=1}^m u^{i,l}_t\,dB^l_t + v^i_t\,dt, \tag{3.34}
\]

$i = 1,\dots,p$. We assume that each of the processes $u^{i,l}$ belongs to $\Lambda^2_{a,T}$ and that $\int_0^T |v^i_t|\,dt < \infty$, a.s. Following a similar plan as for Theorem 3.2, we will prove the following:

Theorem 3.3 Let $f : [0,\infty)\times\mathbb{R}^p \to \mathbb{R}$ be a function of class $C^{1,2}$ and let $X = (X^1,\dots,X^p)$ be given by (3.34). Then

\[
\begin{aligned}
f(t,X_t) &= f(0,X_0) + \int_0^t \partial_s f(s,X_s)\,ds + \sum_{k=1}^p\int_0^t \partial_{x_k} f(s,X_s)\,dX^k_s\\
&\quad + \frac{1}{2}\sum_{k,l=1}^p\int_0^t \partial^2_{x_k x_l} f(s,X_s)\,dX^k_s\,dX^l_s, \tag{3.35}
\end{aligned}
\]

where, in order to compute $dX^k_s\,dX^l_s$, we have to apply the following rules:

\[
\begin{aligned}
dB^k_s\,dB^l_s &= \delta_{k,l}\,ds, \qquad\qquad\qquad (3.36)\\
dB^k_s\,ds &= 0,\\
(ds)^2 &= 0,
\end{aligned}
\]

where $\delta_{k,l}$ denotes the Kronecker symbol.

We remark that the identity (3.36) is a consequence of the independence of the components of the Brownian motion.

Example 3.3 Consider the particular case $m = 1$, $p = 2$ and $f(x,y) = xy$. That is, $f$ does not depend on $t$, and we have denoted a generic point of $\mathbb{R}^2$ by $(x,y)$. Then the above formula (3.35) yields

\[
X^1_t X^2_t = X^1_0 X^2_0 + \int_0^t X^1_s\,dX^2_s + \int_0^t X^2_s\,dX^1_s + \int_0^t u^1_s u^2_s\,ds. \tag{3.37}
\]
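Another added check: applying (3.37) to $X^1_t = t$ (an Ito process with $u^1 = 0$, $v^1 = 1$) and $X^2 = B$ gives the integration-by-parts identity $tB_t = \int_0^t s\,dB_s + \int_0^t B_s\,ds$, with no correction term since $u^1_s u^2_s = 0$. Numerically:

```python
import numpy as np

rng = np.random.default_rng(9)
T, n = 1.0, 100000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate(([0.0], np.cumsum(dB)))
t = np.linspace(0.0, T, n + 1)

lhs = T * B[-1]                                  # X^1_T X^2_T with X^1 = t, X^2 = B
rhs = np.sum(t[:-1] * dB) + np.sum(B[:-1] * dt)  # int t dB + int B ds
print(abs(lhs - rhs))
```

This is the same integration-by-parts identity used later in Example 4.1.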

4 Applications of the Ito formula

This chapter is devoted to some important results whose proofs make use of the Ito formula.

4.1 Burkholder-Davis-Gundy inequalities

Theorem 4.1 Let $u \in L^2_{a,T}$ and set $M_t = \int_0^t u_s\,dB_s$. Define

\[
M^*_t = \sup_{s\in[0,t]} |M_s|.
\]

Then, for any $p > 0$, there exist two positive constants $c_p$, $C_p$ such that

\[
c_p\,E\Bigl(\int_0^T u_s^2\,ds\Bigr)^{\frac p2} \le E\,(M^*_T)^p \le C_p\,E\Bigl(\int_0^T u_s^2\,ds\Bigr)^{\frac p2}. \tag{4.1}
\]

Proof: We will only prove here the right-hand side of (4.1), for $p \ge 2$. For this, we assume that the process $M_t$, $t \in [0,T]$, is bounded. This assumption can be removed by a localization argument.

Consider the function $f(x) = |x|^p$, for which we have

\[
f'(x) = p|x|^{p-1}\operatorname{sign}(x), \qquad f''(x) = p(p-1)|x|^{p-2},
\]

for $x \ne 0$. Then, according to (3.23), we obtain

\[
|M_t|^p = \int_0^t p|M_s|^{p-1}\operatorname{sign}(M_s)u_s\,dB_s + \frac{1}{2}\int_0^t p(p-1)|M_s|^{p-2}u_s^2\,ds.
\]

Applying the expectation operator to both terms of the above identity yields

\[
E\,(|M_t|^p) = \frac{p(p-1)}{2}\,E\Bigl(\int_0^t |M_s|^{p-2}u_s^2\,ds\Bigr). \tag{4.2}
\]

We next apply Holder's inequality to the expectation, with exponents $\frac{p}{p-2}$ and $q = \frac p2$, and get

\[
\begin{aligned}
E\Bigl(\int_0^t |M_s|^{p-2}u_s^2\,ds\Bigr) &\le E\Bigl((M^*_t)^{p-2}\int_0^t u_s^2\,ds\Bigr)\\
&\le \bigl[E\,(M^*_t)^p\bigr]^{\frac{p-2}{p}}\Bigl[E\Bigl(\int_0^t u_s^2\,ds\Bigr)^{\frac p2}\Bigr]^{\frac 2p}. \tag{4.3}
\end{aligned}
\]

Doob's inequality (see Theorem 8.1) implies

\[
E\,(M^*_t)^p \le \Bigl(\frac{p}{p-1}\Bigr)^p E(|M_t|^p).
\]

Hence, by applying (4.2), (4.3), we obtain

\[
\begin{aligned}
E\,(M^*_t)^p &\le \Bigl(\frac{p}{p-1}\Bigr)^p E(|M_t|^p)
\le \Bigl(\frac{p}{p-1}\Bigr)^p \frac{p(p-1)}{2}\,E\Bigl(\int_0^t |M_s|^{p-2}u_s^2\,ds\Bigr)\\
&\le \Bigl(\frac{p}{p-1}\Bigr)^p \frac{p(p-1)}{2}\,\bigl[E\,(M^*_t)^p\bigr]^{\frac{p-2}{p}}\Bigl[E\Bigl(\int_0^t u_s^2\,ds\Bigr)^{\frac p2}\Bigr]^{\frac 2p}.
\end{aligned}
\]

Since $1 - \frac{p-2}{p} = \frac 2p$, from this inequality we obtain

\[
E\,(M^*_t)^p \le \Bigl(\Bigl(\frac{p}{p-1}\Bigr)^p \frac{p(p-1)}{2}\Bigr)^{\frac p2} E\Bigl(\int_0^t u_s^2\,ds\Bigr)^{\frac p2}.
\]

This ends the proof of the upper bound.
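An added Monte Carlo illustration of the upper bound in (4.1), and of Doob's inequality used in its proof, for the simplest integrand $u \equiv 1$ (so $M = B$ and $\int_0^T u_s^2\,ds = T$); all numerical parameters are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, n_paths, p = 1.0, 500, 10000, 4
dt = T / n
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n)), axis=1)
M_star = np.abs(B).max(axis=1)          # discretized sup_{s<=T} |M_s|

# Ratio E (M*_T)^p / E(int u^2 ds)^{p/2}: a moderate constant, as (4.1) predicts
ratio = np.mean(M_star**p) / T ** (p / 2)
# Doob's L^2 inequality, used in the proof: E (M*_T)^2 <= 4 E(M_T^2)
doob_ok = np.mean(M_star**2) <= 4.0 * np.mean(B[:, -1] ** 2)
print(ratio, doob_ok)
```

The ratio stays between the trivial lower bound $E|B_T|^p/T^{p/2} = 3$ (for $p = 4$) and the Doob-derived constant, consistent with (4.1).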

4.2 Representation of L2 Brownian functionals

We already know that, for any process $u \in L^2_{a,T}$, the stochastic integral process $\int_0^t u_s\,dB_s$, $t \in [0,T]$, is a martingale. The next result is a kind of converse statement. In the proof we shall use a technical ingredient that we state without proof.

In the sequel we denote by $\mathcal{F}_T$ the $\sigma$-field generated by $(B_t,\ 0 \le t \le T)$.

Lemma 4.1 The vector space generated by the random variables

\[
\exp\Bigl(\int_0^T f(t)\,dB_t - \frac{1}{2}\int_0^T f^2(t)\,dt\Bigr),
\]

$f \in L^2([0,T])$, is dense in $L^2(\Omega,\mathcal{F}_T,P)$.

Theorem 4.2 Let $Z \in L^2(\Omega,\mathcal{F}_T)$. There exists a unique process $h \in L^2_{a,T}$ such that

\[
Z = E(Z) + \int_0^T h_s\,dB_s. \tag{4.4}
\]

Hence, for any martingale $M = \{M_t,\ t \in [0,T]\}$ bounded in $L^2$, there exist a unique process $h \in L^2_{a,T}$ and a constant $C$ such that

\[
M_t = C + \int_0^t h_s\,dB_s. \tag{4.5}
\]

Proof: We start with the proof of (4.4). Let $\mathcal{H}$ be the vector space consisting of random variables $Z \in L^2(\Omega,\mathcal{F}_T)$ such that (4.4) holds. Firstly, we argue the uniqueness of $h$. This is an easy consequence of the isometry of the stochastic integral. Indeed, if there were two processes $h$ and $h'$ satisfying (4.4), then

\[
E\Bigl(\int_0^T (h_s - h'_s)^2\,ds\Bigr) = E\Bigl(\int_0^T (h_s - h'_s)\,dB_s\Bigr)^2 = 0.
\]

This yields $h = h'$ in $L^2([0,T]\times\Omega)$.

We now turn to the existence of $h$. Any $Z \in \mathcal{H}$ satisfies

\[
E(Z^2) = (E(Z))^2 + E\Bigl(\int_0^T h_s^2\,ds\Bigr).
\]

From this it follows that if $(Z_n,\ n \ge 1)$ is a sequence of elements of $\mathcal{H}$ converging to $Z$ in $L^2(\Omega,\mathcal{F}_T)$, then the sequence $(h^n,\ n \ge 1)$ corresponding to the representations is Cauchy in $L^2_{a,T}$. Denoting by $h$ the limit, we have

\[
Z = E(Z) + \int_0^T h_s\,dB_s.
\]

Hence $\mathcal{H}$ is closed in $L^2(\Omega,\mathcal{F}_T)$. For any $f \in L^2([0,T])$, set

\[
\mathcal{E}^f_t = \exp\Bigl(\int_0^t f_s\,dB_s - \frac{1}{2}\int_0^t f_s^2\,ds\Bigr).
\]

The random variable $\int_0^t f_s\,dB_s$ is Gaussian, centered and with variance $\int_0^t f_s^2\,ds$. Hence, $E\bigl(\mathcal{E}^f_t\bigr) = 1$. Then, by the Ito formula,

\[
\mathcal{E}^f_t = 1 + \int_0^t \mathcal{E}^f_s f(s)\,dB_s.
\]

Consequently, the representation holds for $Z := \mathcal{E}^f_T$, and also any linear combination of such random variables belongs to $\mathcal{H}$. The conclusion follows from Lemma 4.1.

Let us now prove the representation (4.5). The random variable $M_T$ belongs to $L^2(\Omega)$. Hence, by applying the first part of the theorem, we have

\[
M_T = E(M_T) + \int_0^T h_s\,dB_s,
\]

for some $h \in L^2_{a,T}$. By taking conditional expectations we obtain

\[
M_t = E(M_T\,|\,\mathcal{F}_t) = E(M_T) + \int_0^t h_s\,dB_s, \qquad 0 \le t \le T.
\]

The proof of the theorem is now complete.

Example 4.1 Consider $Z = B_T^3$. In order to find the corresponding process $h$ in the integral representation, we first apply Ito's formula, yielding

\[
B_T^3 = \int_0^T 3B_t^2\,dB_t + 3\int_0^T B_t\,dt.
\]

An integration by parts gives

\[
\int_0^T B_t\,dt = TB_T - \int_0^T t\,dB_t.
\]

Thus,

\[
B_T^3 = \int_0^T 3\bigl[B_t^2 + T - t\bigr]\,dB_t.
\]

Notice that $E(B_T^3) = 0$. Then $h_t = 3\,[B_t^2 + T - t]$.
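The representation of Example 4.1 can be tested numerically (an added sketch): approximate $\int_0^T h_t\,dB_t$ with $h_t = 3(B_t^2 + T - t)$ by a left-endpoint Riemann sum along a simulated path and compare with $B_T^3$.

```python
import numpy as np

rng = np.random.default_rng(3)
T, n = 1.0, 100000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate(([0.0], np.cumsum(dB)))
t = np.linspace(0.0, T, n + 1)

# h_t = 3 (B_t^2 + T - t), evaluated at left endpoints (Ito integral)
h = 3.0 * (B[:-1] ** 2 + T - t[:-1])
stochastic_integral = np.sum(h * dB)

print(abs(B[-1] ** 3 - stochastic_integral))  # small discretization error
```

The agreement is pathwise, not only in expectation, as (4.4) asserts.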

4.3 Girsanov’s theorem

It is well known that if $X$ is a multidimensional Gaussian random variable, any affine transformation maps $X$ into a multidimensional Gaussian random variable as well. The simplest version of Girsanov's theorem extends this result to Brownian motion. Before giving the precise statement and the proof, let us introduce some preliminaries.

Lemma 4.2 Let $L$ be a positive random variable such that $E(L) = 1$. Set

\[
Q(A) = E(\mathbf{1}_A L), \qquad A \in \mathcal{F}. \tag{4.6}
\]

Then $Q$ defines a probability on $\mathcal{F}$, equivalent to $P$, with density given by $L$. Reciprocally, if $P$ and $Q$ are two probabilities on $\mathcal{F}$ with $Q \ll P$, then there exists a nonnegative random variable $L$ such that $E(L) = 1$ and (4.6) holds.


Proof: It is clear that $Q$ defines a $\sigma$-additive function on $\mathcal{F}$. Moreover, since

\[
Q(\Omega) = E(\mathbf{1}_\Omega L) = E(L) = 1,
\]

$Q$ is indeed a probability.

Let $A \in \mathcal{F}$ be such that $Q(A) = 0$. Since $L > 0$ a.s., we must have $P(A) = 0$. Reciprocally, for any $A \in \mathcal{F}$ with $P(A) = 0$, we have $Q(A) = 0$ as well. The second assertion of the lemma is the Radon-Nikodym theorem.

If we denote by $E_Q$ the expectation operator with respect to the probability $Q$ defined before, one has

\[
E_Q(X) = E(XL).
\]

Indeed, this formula is easily checked for simple random variables and then extended to any random variable $X \in L^1(\Omega)$ by the usual approximation argument.

Consider now a Brownian motion $B_t$, $t \in [0,T]$. Fix $\lambda \in \mathbb{R}$ and let

\[
L_t = \exp\Bigl(-\lambda B_t - \frac{\lambda^2}{2}t\Bigr). \tag{4.7}
\]

Notice that $L_t = \mathcal{E}^f_t$ with $f = -\lambda$ (see Section 4.2). Ito's formula yields

\[
L_t = 1 - \int_0^t \lambda L_s\,dB_s.
\]

Hence, the process $L_t$, $t \in [0,T]$, is a positive martingale and $E(L_t) = 1$ for any $t \in [0,T]$. Set

\[
Q(A) = E(\mathbf{1}_A L_T), \qquad A \in \mathcal{F}_T. \tag{4.8}
\]

By Lemma 4.2, the probability $Q$ is equivalent to $P$ on the $\sigma$-field $\mathcal{F}_T$. By the martingale property of $L_t$, $t \in [0,T]$, the same conclusion is true on $\mathcal{F}_t$, for any $t \in [0,T]$. Indeed, let $A \in \mathcal{F}_t$; then

\[
Q(A) = E(\mathbf{1}_A L_T) = E\bigl(E(\mathbf{1}_A L_T\,|\,\mathcal{F}_t)\bigr) = E\bigl(\mathbf{1}_A E(L_T\,|\,\mathcal{F}_t)\bigr) = E(\mathbf{1}_A L_t).
\]

Next, we give a technical result.


Lemma 4.3 Let $X$ be a random variable and let $\mathcal{G}$ be a sub-$\sigma$-field of $\mathcal{F}$ such that

\[
E\bigl(e^{iuX}\,|\,\mathcal{G}\bigr) = e^{-\frac{u^2\sigma^2}{2}}.
\]

Then the random variable $X$ is independent of the $\sigma$-field $\mathcal{G}$ and its probability law is Gaussian, with zero mean and variance $\sigma^2$.

Proof: By the definition of the conditional expectation, for any $A \in \mathcal{G}$,

\[
E\bigl(\mathbf{1}_A e^{iuX}\bigr) = P(A)\,e^{-\frac{u^2\sigma^2}{2}}.
\]

In particular, for $A := \Omega$, we see that the characteristic function of $X$ is that of an $N(0,\sigma^2)$. This proves the last assertion. Moreover, for any $A \in \mathcal{G}$ with $P(A) > 0$,

\[
E_A\bigl(e^{iuX}\bigr) = e^{-\frac{u^2\sigma^2}{2}},
\]

saying that the law of $X$ conditionally on $A$ is also $N(0,\sigma^2)$. Thus,

\[
P\bigl((X \le x) \cap A\bigr) = P(A)\,P_A(X \le x) = P(A)\,P(X \le x),
\]

yielding the independence of $X$ and $\mathcal{G}$.

Theorem 4.3 (Girsanov's theorem) Let $\lambda \in \mathbb{R}$ and set

\[
W_t = B_t + \lambda t.
\]

In the probability space $(\Omega,\mathcal{F}_T,Q)$, with $Q$ given in (4.8), the process $W_t$, $t \in [0,T]$, is a standard Brownian motion.

Proof: We will check that, in the probability space $(\Omega,\mathcal{F}_T,Q)$, any increment $W_t - W_s$, $0 \le s < t \le T$, is independent of $\mathcal{F}_s$ and has the $N(0,t-s)$ distribution. That is, for any $A \in \mathcal{F}_s$,

\[
E_Q\bigl(e^{iu(W_t-W_s)}\mathbf{1}_A\bigr) = E_Q\bigl(\mathbf{1}_A e^{-\frac{u^2}{2}(t-s)}\bigr) = Q(A)\,e^{-\frac{u^2}{2}(t-s)}.
\]

The conclusion will then follow from Lemma 4.3. Indeed, writing

\[
L_t = \exp\Bigl(-\lambda(B_t - B_s) - \frac{\lambda^2}{2}(t-s)\Bigr)\exp\Bigl(-\lambda B_s - \frac{\lambda^2}{2}s\Bigr),
\]

we have

\[
E_Q\bigl(e^{iu(W_t-W_s)}\mathbf{1}_A\bigr) = E\bigl(\mathbf{1}_A e^{iu(W_t-W_s)}L_t\bigr)
= E\Bigl(\mathbf{1}_A e^{iu(B_t-B_s)+iu\lambda(t-s)-\lambda(B_t-B_s)-\frac{\lambda^2}{2}(t-s)}L_s\Bigr).
\]

Since $B_t - B_s$ is independent of $\mathcal{F}_s$, the last expression is equal to

\[
\begin{aligned}
E(\mathbf{1}_A L_s)\,E\bigl(e^{(iu-\lambda)(B_t-B_s)}\bigr)\,e^{iu\lambda(t-s)-\frac{\lambda^2}{2}(t-s)}
&= Q(A)\,e^{\frac{(iu-\lambda)^2}{2}(t-s)+iu\lambda(t-s)-\frac{\lambda^2}{2}(t-s)}\\
&= Q(A)\,e^{-\frac{u^2}{2}(t-s)}.
\end{aligned}
\]

The proof is now complete.
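An added numerical illustration of the theorem: reweighting samples of $B$ by the density $L_T$ of (4.7)-(4.8) should turn $W_T = B_T + \lambda T$ into an $N(0,T)$ variable under $Q$, i.e. $E_Q(W_T) = E(W_T L_T) \approx 0$ and $E_Q(W_T^2) \approx T$.

```python
import numpy as np

rng = np.random.default_rng(4)
lam, T, n_paths = 0.7, 1.0, 400000
B_T = rng.normal(0.0, np.sqrt(T), n_paths)   # terminal value of B under P

# Girsanov density L_T = exp(-lambda*B_T - lambda^2*T/2), see (4.7)-(4.8)
L_T = np.exp(-lam * B_T - 0.5 * lam**2 * T)
W_T = B_T + lam * T

print(np.mean(L_T), np.mean(W_T * L_T), np.mean(W_T**2 * L_T))
```

The three printed quantities estimate $E(L_T) = 1$, $E_Q(W_T) = 0$ and $E_Q(W_T^2) = T$, up to Monte Carlo error.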


5 Local time of Brownian motion and Tanaka's formula

This chapter deals with a very particular extension of Ito's formula. More precisely, we would like to have a decomposition, in the spirit of the Ito formula, of the positive submartingale $|B_t - x|$, for some fixed $x \in \mathbb{R}$. Notice that the function $f(y) = |y - x|$ does not belong to $C^2(\mathbb{R})$. A natural way to proceed is to regularize the function $f$, for instance by convolution with an approximation of the identity, and then pass to the limit. Assuming that this is feasible, the question of identifying the limit involving the second order derivative remains open. This leads us to introduce a process, termed the local time of $B$ at $x$, introduced by Paul Levy.

Definition 5.1 Let $B = \{B_t,\ t \ge 0\}$ be a Brownian motion and let $x \in \mathbb{R}$. The local time of $B$ at $x$ is defined as the stochastic process

\[
\begin{aligned}
L(t,x) &= \lim_{\varepsilon\to 0}\frac{1}{2\varepsilon}\int_0^t \mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds\\
&= \lim_{\varepsilon\to 0}\frac{1}{2\varepsilon}\,\lambda\{s \in [0,t] : B_s \in (x-\varepsilon,x+\varepsilon)\}, \tag{5.1}
\end{aligned}
\]

where $\lambda$ denotes the Lebesgue measure on $\mathbb{R}$.

We see that $L(t,x)$ measures the time spent by the process $B$ at $x$ during a period of time of length $t$; actually, it is the density of this occupation time.

We shall see later that the above limit exists in $L^2$ (it also exists a.s.), a fact that is not at all obvious. Local time enters naturally in the extension of the Ito formula we alluded to before. In fact, we have the following result.
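As an added illustration, the occupation-time quotient in (5.1) can be evaluated along a simulated path; for small $\varepsilon$ (still large compared with the grid resolution $\sqrt{dt}$) the quotient stabilizes, which is the numerical shadow of the existence of the limit.

```python
import numpy as np

def occupation_quotient(B, dt, x, eps):
    """(1 / (2 eps)) * Lebesgue{s <= t : B_s in (x - eps, x + eps)},
    approximated on the simulation grid -- the quotient in (5.1)."""
    return np.sum(np.abs(B - x) < eps) * dt / (2.0 * eps)

rng = np.random.default_rng(5)
T, n = 1.0, 2000000
dt = T / n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

# The quotient at x = 0 for two small values of eps: roughly equal,
# suggesting convergence in (5.1).
print(occupation_quotient(B, dt, 0.0, 0.02),
      occupation_quotient(B, dt, 0.0, 0.01))
```

The two values fluctuate around the (random) local time $L(T,0)$ of the simulated path.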

Theorem 5.1 For any $t \ge 0$ and $x \in \mathbb{R}$, a.s.,

\[
(B_t - x)^+ = (B_0 - x)^+ + \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s + \frac{1}{2}L(t,x), \tag{5.2}
\]

where $L(t,x)$ is given by (5.1), the limit being in $L^2$.

Proof: The heuristics of formula (5.2) is the following. In the sense of distributions, $f(y) = (y - x)^+$ has first and second order derivatives $f'(y) = \mathbf{1}_{[x,\infty)}(y)$ and $f''(y) = \delta_x(y)$, respectively, where $\delta_x$ denotes the Dirac delta measure. Hence we expect a formula like

\[
(B_t - x)^+ = (B_0 - x)^+ + \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s + \frac{1}{2}\int_0^t \delta_x(B_s)\,ds.
\]

However, we have to give a meaning to the last integral.

Approximation procedure. We are going to approximate the function $f(y) = (y - x)^+$. For this, we fix $\varepsilon > 0$ and define

\[
f_{x,\varepsilon}(y) =
\begin{cases}
0, & \text{if } y \le x-\varepsilon,\\
\frac{(y-x+\varepsilon)^2}{4\varepsilon}, & \text{if } x-\varepsilon \le y \le x+\varepsilon,\\
y - x, & \text{if } y \ge x+\varepsilon,
\end{cases}
\]

which clearly has derivatives

\[
f'_{x,\varepsilon}(y) =
\begin{cases}
0, & \text{if } y \le x-\varepsilon,\\
\frac{y-x+\varepsilon}{2\varepsilon}, & \text{if } x-\varepsilon \le y \le x+\varepsilon,\\
1, & \text{if } y \ge x+\varepsilon,
\end{cases}
\qquad
f''_{x,\varepsilon}(y) =
\begin{cases}
0, & \text{if } y < x-\varepsilon,\\
\frac{1}{2\varepsilon}, & \text{if } x-\varepsilon < y < x+\varepsilon,\\
0, & \text{if } y > x+\varepsilon.
\end{cases}
\]

Let $\varphi_n$, $n \ge 1$, be a sequence of nonnegative $C^\infty$ functions whose compact supports shrink to $\{0\}$. For instance, we may consider the function

\[
\varphi(y) = c\,\exp\bigl(-(1-y^2)^{-1}\bigr)\mathbf{1}_{\{|y|<1\}},
\]

with the constant $c$ chosen so that $\int_{\mathbb{R}}\varphi(z)\,dz = 1$, and then take $\varphi_n(y) = n\varphi(ny)$. Set

\[
g_n(y) = [\varphi_n * f_{x,\varepsilon}](y) = \int_{\mathbb{R}} f_{x,\varepsilon}(y-z)\varphi_n(z)\,dz.
\]

It is well known that $g_n \in C^\infty$, that $g_n$ and $g'_n$ converge uniformly on $\mathbb{R}$ to $f_{x,\varepsilon}$ and $f'_{x,\varepsilon}$, respectively, and that $g''_n$ converges pointwise to $f''_{x,\varepsilon}$, except at the points $x+\varepsilon$ and $x-\varepsilon$.

We then have Ito's formula for $g_n$, as follows:

\[
g_n(B_t) = g_n(B_0) + \int_0^t g'_n(B_s)\,dB_s + \frac{1}{2}\int_0^t g''_n(B_s)\,ds. \tag{5.3}
\]

Convergence of the terms in (5.3) as $n\to\infty$. The function $f'_{x,\varepsilon}$ is bounded, and so is $g'_n$. Indeed,

\[
|g'_n(y)| = \Bigl|\int_{\mathbb{R}} f'_{x,\varepsilon}(y-z)\varphi_n(z)\,dz\Bigr| = \Bigl|\int_{-\frac1n}^{\frac1n} f'_{x,\varepsilon}(y-z)\varphi_n(z)\,dz\Bigr| \le \|f'_{x,\varepsilon}\|_\infty.
\]

Moreover,

\[
\bigl|g'_n(B_s)\mathbf{1}_{[0,t]}(s) - f'_{x,\varepsilon}(B_s)\mathbf{1}_{[0,t]}(s)\bigr| \to 0,
\]

uniformly in $s$ and in $\omega$. Hence, by bounded convergence,

\[
E\int_0^t |g'_n(B_s) - f'_{x,\varepsilon}(B_s)|^2\,ds \to 0.
\]

Then, the isometry property of the stochastic integral implies

\[
E\Bigl|\int_0^t \bigl[g'_n(B_s) - f'_{x,\varepsilon}(B_s)\bigr]\,dB_s\Bigr|^2 \to 0,
\]

as $n\to\infty$.

We next deal with the second order term. Since the law of each $B_s$, $s > 0$, has a density,

\[
P\{B_s = x+\varepsilon\} = P\{B_s = x-\varepsilon\} = 0.
\]

Thus, for any $s > 0$,

\[
\lim_{n\to\infty} g''_n(B_s) = f''_{x,\varepsilon}(B_s),
\]

a.s. Using Fubini's theorem, we see that this convergence also holds for almost every $s$, a.s. In fact,

\[
\int_0^t ds\int_\Omega dP\,\mathbf{1}_{\{f''_{x,\varepsilon}(B_s)\ne\lim_{n\to\infty} g''_n(B_s)\}}
= \int_\Omega dP\int_0^t ds\,\mathbf{1}_{\{f''_{x,\varepsilon}(B_s)\ne\lim_{n\to\infty} g''_n(B_s)\}} = 0.
\]

We also have

\[
\sup_{y\in\mathbb{R}} |g''_n(y)| \le \frac{1}{2\varepsilon}.
\]

Indeed,

\[
|g''_n(y)| = \frac{1}{2\varepsilon}\Bigl|\int_{\mathbb{R}}\varphi_n(z)\mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(y-z)\,dz\Bigr|
\le \frac{1}{2\varepsilon}\int_{y-x-\varepsilon}^{y-x+\varepsilon}\varphi_n(z)\,dz \le \frac{1}{2\varepsilon}.
\]

Then, by bounded convergence,

\[
\int_0^t g''_n(B_s)\,ds \to \int_0^t f''_{x,\varepsilon}(B_s)\,ds,
\]

a.s. and in $L^2$. Thus, passing to the limit in (5.3) yields

\[
f_{x,\varepsilon}(B_t) = f_{x,\varepsilon}(B_0) + \int_0^t f'_{x,\varepsilon}(B_s)\,dB_s + \frac{1}{2}\int_0^t \frac{1}{2\varepsilon}\mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds. \tag{5.4}
\]

Convergence of (5.4) as $\varepsilon\to 0$.

Since $f_{x,\varepsilon}(y) \to (y-x)^+$ as $\varepsilon\to 0$ and

\[
|f_{x,\varepsilon}(B_t) - f_{x,\varepsilon}(B_0)| \le |B_t - B_0|,
\]

we have

\[
f_{x,\varepsilon}(B_t) - f_{x,\varepsilon}(B_0) \to (B_t - x)^+ - (B_0 - x)^+,
\]

in $L^2$. Moreover,

\[
E\Bigl[\int_0^t \bigl(f'_{x,\varepsilon}(B_s) - \mathbf{1}_{[x,\infty)}(B_s)\bigr)^2\,ds\Bigr]
\le E\Bigl[\int_0^t \mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds\Bigr]
\le \int_0^t \frac{2\varepsilon}{\sqrt{2\pi s}}\,ds,
\]

which clearly tends to zero as $\varepsilon\to 0$. Hence, by the isometry property of the stochastic integral,

\[
\int_0^t f'_{x,\varepsilon}(B_s)\,dB_s \to \int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s,
\]

in $L^2$. Consequently, we have proved that

\[
\int_0^t \frac{1}{2\varepsilon}\mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds
\]

converges in $L^2$ as $\varepsilon\to 0$, and that formula (5.2) holds.

We state without proof two further properties of local time.

1. The role of local time as a density of occupation measure is made clear by the following identity, valid for any $t \ge 0$ and every $a \le b$:

\[
\int_a^b L(t,x)\,dx = \int_0^t \mathbf{1}_{(a,b)}(B_s)\,ds.
\]

2. The stochastic integral $\int_0^t \mathbf{1}_{[x,\infty)}(B_s)\,dB_s$ has a jointly continuous version in $(t,x) \in (0,\infty)\times\mathbb{R}$. Hence, by (5.2), so does the local time $\{L(t,x),\ (t,x) \in (0,\infty)\times\mathbb{R}\}$.

The next result, which follows easily from Theorem 5.1, is known as Tanaka's formula.

Theorem 5.2 For any $(t,x) \in [0,\infty)\times\mathbb{R}$, we have

\[
|B_t - x| = |B_0 - x| + \int_0^t \operatorname{sign}(B_s - x)\,dB_s + L(t,x). \tag{5.5}
\]

Proof: We will use the following relations: $|x| = x^+ + x^-$, $x^- = \max(-x,0) = (-x)^+$. Hence, by virtue of (5.2), we only need a formula for $(-B_t + x)^+$. Notice that we already have it, since the process $-B$ is also a Brownian motion. More precisely,

\[
(-B_t + x)^+ = (-B_0 + x)^+ + \int_0^t \mathbf{1}_{[-x,\infty)}(-B_s)\,d(-B_s) + \frac{1}{2}L^-(t,-x),
\]

where we have denoted by $L^-(t,-x)$ the local time of $-B$ at $-x$. We have the following facts:

\[
\int_0^t \mathbf{1}_{[-x,\infty)}(-B_s)\,d(-B_s) = -\int_0^t \mathbf{1}_{(-\infty,x]}(B_s)\,dB_s,
\]

\[
\begin{aligned}
L^-(t,-x) &= \lim_{\varepsilon\to 0}\frac{1}{2\varepsilon}\int_0^t \mathbf{1}_{(-x-\varepsilon,-x+\varepsilon)}(-B_s)\,ds\\
&= \lim_{\varepsilon\to 0}\frac{1}{2\varepsilon}\int_0^t \mathbf{1}_{(x-\varepsilon,x+\varepsilon)}(B_s)\,ds = L(t,x),
\end{aligned}
\]

where the limits are in $L^2(\Omega)$.

Thus, we have proved

\[
(B_t - x)^- = (B_0 - x)^- - \int_0^t \mathbf{1}_{(-\infty,x]}(B_s)\,dB_s + \frac{1}{2}L(t,x). \tag{5.6}
\]

Adding up (5.2) and (5.6) yields (5.5). Indeed,

\[
\mathbf{1}_{[x,\infty)}(B_s) - \mathbf{1}_{(-\infty,x]}(B_s) =
\begin{cases}
1, & \text{if } B_s > x,\\
-1, & \text{if } B_s < x,\\
0, & \text{if } B_s = x,
\end{cases}
\]

which is identical to $\operatorname{sign}(B_s - x)$.
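An added sanity check of (5.5) at $x = 0$ on a simulated path: estimate $L(t,0)$ once by the occupation quotient of (5.1) and once by the Tanaka expression $|B_t| - \int_0^t \operatorname{sign}(B_s)\,dB_s$, and compare.

```python
import numpy as np

rng = np.random.default_rng(6)
T, n, eps = 1.0, 2000000, 0.005
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
B = np.concatenate(([0.0], np.cumsum(dB)))

# Occupation-time estimate of L(T, 0), cf. (5.1)
L_occ = np.sum(np.abs(B) < eps) * dt / (2.0 * eps)

# Tanaka estimate: L(T, 0) = |B_T| - |B_0| - int_0^T sign(B_s) dB_s, cf. (5.5)
L_tanaka = np.abs(B[-1]) - np.sum(np.sign(B[:-1]) * dB)

print(L_occ, L_tanaka, abs(L_occ - L_tanaka))
```

Both estimators carry discretization error, so only rough agreement should be expected at this resolution.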


6 Stochastic differential equations

In this section we shall introduce stochastic differential equations driven by a multi-dimensional Brownian motion. Under suitable assumptions on the coefficients, we shall prove a result on existence and uniqueness of the solution. Then we shall establish properties of the solution, like the existence of moments of any order and the Holder property of the sample paths.

The setting

We consider a $d$-dimensional Brownian motion $B = \{B_t = (B^1_t,\dots,B^d_t),\ t \ge 0\}$, $B_0 = 0$, defined on a probability space $(\Omega,\mathcal{F},P)$, along with a filtration $(\mathcal{F}_t,\ t \ge 0)$ satisfying the following properties:

1. $B$ is adapted to $(\mathcal{F}_t,\ t \ge 0)$;

2. for every $t \ge 0$, the $\sigma$-field generated by $\{B_u - B_t,\ u \ge t\}$ is independent of $\mathcal{F}_t$.

We also consider functions

\[
b : [0,\infty)\times\mathbb{R}^m \to \mathbb{R}^m, \qquad \sigma : [0,\infty)\times\mathbb{R}^m \to L(\mathbb{R}^d;\mathbb{R}^m).
\]

When necessary we will use the coordinate description

\[
b(t,x) = \bigl(b^i(t,x)\bigr)_{1\le i\le m}, \qquad \sigma(t,x) = \bigl(\sigma^{i,j}(t,x)\bigr)_{1\le i\le m,\,1\le j\le d}.
\]

By a stochastic differential equation, we mean an expression of the form

\[
dX_t = \sigma(t,X_t)\,dB_t + b(t,X_t)\,dt, \quad t \in (0,\infty), \qquad X_0 = x, \tag{6.1}
\]

where $x$ is an $m$-dimensional random vector independent of the Brownian motion. We can also take any time $u \ge 0$ as the initial one; in this case we must write $t \in (u,\infty)$ and $X_u = x$ in (6.1). For the sake of simplicity, we will assume here that $x$ is deterministic. The formal expression (6.1) has to be understood as follows:

\[
X_t = x + \int_0^t \sigma(s,X_s)\,dB_s + \int_0^t b(s,X_s)\,ds, \tag{6.2}
\]

or, coordinate-wise,

\[
X^i_t = x^i + \sum_{j=1}^d \int_0^t \sigma^{i,j}(s,X_s)\,dB^j_s + \int_0^t b^i(s,X_s)\,ds,
\]

$i = 1,\dots,m$.

Strong existence and path-wise uniqueness

We now give the notions of existence and uniqueness of solution that will be considered throughout this chapter.

Definition 6.1 An $m$-dimensional, measurable and $\mathcal{F}_t$-adapted stochastic process $(X_t,\ t \ge 0)$ is a strong solution to (6.2) if the following conditions are satisfied:

1. The processes $(\sigma^{i,j}(s,X_s),\ s \ge 0)$ belong to $L^2_{a,\infty}$, for any $1 \le i \le m$, $1 \le j \le d$.

2. The processes $(b^i(s,X_s),\ s \ge 0)$ belong to $L^1_{a,\infty}$, for any $1 \le i \le m$.

3. Equation (6.2) holds true for the fixed Brownian motion defined before, for any $t \ge 0$, a.s.

Definition 6.2 The equation (6.2) has a path-wise unique solution if any two strong solutions $X^1$ and $X^2$ in the sense of the previous definition are indistinguishable, that is,

\[
P\{X^1(t) = X^2(t),\ \text{for any } t \ge 0\} = 1.
\]

Hypotheses on the coefficients

We shall refer to (H) for the following set of hypotheses:

1. Linear growth:

\[
\sup_t\,\bigl[|b(t,x)| + |\sigma(t,x)|\bigr] \le L(1 + |x|). \tag{6.3}
\]

2. Lipschitz in the $x$ variable, uniformly in $t$:

\[
\sup_t\,\bigl[|b(t,x) - b(t,y)| + |\sigma(t,x) - \sigma(t,y)|\bigr] \le L|x-y|. \tag{6.4}
\]

In (6.3), (6.4), $L$ stands for a positive constant.


6.1 Examples of stochastic differential equations

When the functions $\sigma$ and $b$ have a linear structure, the solution to (6.2) admits an explicit form. This is not surprising, as it is indeed the case for ordinary differential equations. We deal with this question in this section. More precisely, suppose that

\[
\sigma(t,x) = \Sigma(t) + F(t)x, \tag{6.5}
\]

\[
b(t,x) = c(t) + D(t)x. \tag{6.6}
\]

Example 1 Assume for simplicity $d = m = 1$, $\sigma(t,x) = \Sigma(t)$, $b(t,x) = c(t) + Dx$, $t \ge 0$, with $D \in \mathbb{R}$. Now equation (6.2) reads

\[
X_t = X_0 + \int_0^t \Sigma(s)\,dB_s + \int_0^t [c(s) + DX_s]\,ds,
\]

and has a unique solution given by

\[
X_t = X_0 e^{Dt} + \int_0^t e^{D(t-s)}\bigl(c(s)\,ds + \Sigma(s)\,dB_s\bigr). \tag{6.7}
\]

To check (6.7) we proceed as in the deterministic case. First we consider the equation

\[
dX_t = DX_t\,dt,
\]

with initial condition $X_0$, whose solution is $X_t = X_0 e^{Dt}$, $t \ge 0$. Then we use the variation of constants procedure and write

\[
X_t = X_0(t)e^{Dt}.
\]

A priori, $X_0(t)$ may be random. However, since $e^{Dt}$ is differentiable, the Ito differential of $X_t$ is given by

\[
dX_t = dX_0(t)\,e^{Dt} + X_0(t)e^{Dt}D\,dt.
\]

Equating the right-hand side of the preceding identity with

\[
\Sigma(t)\,dB_t + \bigl(c(t) + DX_t\bigr)\,dt
\]

yields

\[
dX_0(t)\,e^{Dt} + DX_t\,dt = \Sigma(t)\,dB_t + \bigl(c(t) + DX_t\bigr)\,dt,
\]

that is,

\[
dX_0(t) = e^{-Dt}\,\bigl[\Sigma(t)\,dB_t + c(t)\,dt\bigr].
\]

In integral form,

\[
X_0(t) = x + \int_0^t e^{-Ds}\,\bigl[\Sigma(s)\,dB_s + c(s)\,ds\bigr].
\]

Plugging the right-hand side of this equation into $X_t = X_0(t)e^{Dt}$ yields (6.7).

A particular example of the class of equations considered before is the Langevin equation:

\[
dX_t = \alpha\,dB_t - \beta X_t\,dt, \quad t > 0, \qquad X_0 = x_0 \in \mathbb{R},
\]

where $\alpha \in \mathbb{R}$ and $\beta > 0$. Here $X_t$ stands for the velocity at time $t$ of a free particle that performs a Brownian motion different from the $B_t$ in the equation. The solution to this equation is given by

\[
X_t = e^{-\beta t}x_0 + \alpha\int_0^t e^{-\beta(t-s)}\,dB_s.
\]

Notice that $\{X_t,\ t \ge 0\}$ defines a Gaussian process.
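An added simulation sketch: the explicit solution above implies the exact one-step recursion $X_{t+\Delta} = e^{-\beta\Delta}X_t + \alpha\sqrt{(1-e^{-2\beta\Delta})/(2\beta)}\,\xi$ with $\xi \sim N(0,1)$ independent of the past (a standard property of this Gaussian, Ornstein-Uhlenbeck-type process, stated here without proof); we use it to check the stationary variance $\alpha^2/(2\beta)$.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, beta, x0 = 1.0, 2.0, 0.0
T, n, n_paths = 5.0, 500, 20000
dt = T / n

decay = np.exp(-beta * dt)
noise_sd = alpha * np.sqrt((1.0 - decay**2) / (2.0 * beta))

X = np.full(n_paths, x0)
for _ in range(n):
    # exact transition of the Langevin solution on the grid
    X = decay * X + noise_sd * rng.normal(size=n_paths)

print(np.var(X))  # ≈ alpha^2 / (2 beta) = 0.25, the stationary variance
```

Because the transition is exact, the only error here is Monte Carlo noise, not time discretization.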

6.2 A result on existence and uniqueness of solution

This section is devoted to the proof of the following result.

Theorem 6.1 Assume that the functions $\sigma$ and $b$ satisfy the assumptions (H). Then there exists a path-wise unique strong solution to (6.2).

Before giving the proof of this theorem, we recall a version of Gronwall's lemma that will be used repeatedly in the sequel.

Lemma 6.1 Let $u, v : [\alpha,\beta] \to \mathbb{R}_+$ be functions such that $u$ is Lebesgue integrable and $v$ is measurable and bounded. Assume that

\[
v(t) \le c + \int_\alpha^t u(s)v(s)\,ds, \tag{6.8}
\]

for some constant $c \ge 0$ and for any $t \in [\alpha,\beta]$. Then

\[
v(t) \le c\,\exp\Bigl(\int_\alpha^t u(s)\,ds\Bigr). \tag{6.9}
\]


Proof of Theorem 6.1

Let us introduce the Picard iteration scheme

\[
\begin{aligned}
X^0_t &= x,\\
X^n_t &= x + \int_0^t \sigma(s,X^{n-1}_s)\,dB_s + \int_0^t b(s,X^{n-1}_s)\,ds, \quad n \ge 1, \tag{6.10}
\end{aligned}
\]

$t \ge 0$. Let us restrict the time interval to $[0,T]$, with $T > 0$. We shall prove that the sequence of stochastic processes defined recursively by (6.10) converges uniformly to a process $X$ which is a strong solution of (6.2). Eventually, we shall prove path-wise uniqueness.
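The scheme (6.10) can be imitated numerically (an added sketch, with the integrals replaced by left-endpoint Riemann sums on a fixed grid, for a fixed driving path, and with illustrative coefficients satisfying (H)): the sup-norm gaps between successive iterates shrink rapidly, mirroring the factorial bound established in Step 2 below.

```python
import numpy as np

rng = np.random.default_rng(8)
T, n_steps = 1.0, 5000
dt = T / n_steps
dB = rng.normal(0.0, np.sqrt(dt), n_steps)

# Illustrative Lipschitz coefficients (the hypotheses (H) hold)
sigma = lambda x: 0.3 * np.cos(x)
b = lambda x: -0.5 * x

x0 = 1.0
X = np.full(n_steps + 1, x0)      # X^0 ≡ x
gaps = []
for _ in range(8):
    # X^n_t = x + int_0^t sigma(X^{n-1}_s) dB_s + int_0^t b(X^{n-1}_s) ds
    increments = sigma(X[:-1]) * dB + b(X[:-1]) * dt
    X_new = np.concatenate(([x0], x0 + np.cumsum(increments)))
    gaps.append(float(np.max(np.abs(X_new - X))))
    X = X_new

print(gaps)  # sup-norm distances between successive Picard iterates
```

After a few iterations the gaps are negligible, and the final iterate is (up to discretization) the Euler approximation of the solution.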

Step 1: We prove by induction on $n$ that, for any $t \in [0,T]$,

\[
E\Bigl\{\sup_{0\le s\le t}|X^n_s|^2\Bigr\} < \infty. \tag{6.11}
\]

Indeed, this property is clearly true for $n = 0$, since in this case $X^0_t$ is constant and equal to $x$. Suppose that (6.11) holds true for $n = 0,\dots,m-1$. By applying Burkholder's and Holder's inequalities, we reach

\[
\begin{aligned}
E\Bigl\{\sup_{0\le s\le t}|X^m_s|^2\Bigr\} &\le C\Bigl[|x|^2 + E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s \sigma(u,X^{m-1}_u)\,dB_u\Bigr|^2\Bigr) + E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s b(u,X^{m-1}_u)\,du\Bigr|^2\Bigr)\Bigr]\\
&\le C\Bigl[|x|^2 + E\Bigl(\int_0^t |\sigma(u,X^{m-1}_u)|^2\,du\Bigr) + E\Bigl(\int_0^t |b(u,X^{m-1}_u)|^2\,du\Bigr)\Bigr]\\
&\le C\Bigl[|x|^2 + E\Bigl(\int_0^t (1 + |X^{m-1}_u|^2)\,du\Bigr)\Bigr]\\
&\le C\Bigl[|x|^2 + T + T\,E\Bigl(\sup_{0\le s\le T}|X^{m-1}_s|^2\Bigr)\Bigr].
\end{aligned}
\]

Hence (6.11) is proved. The assumptions (H) along with (6.11) imply that $\sigma(s,X^{n-1}_s) \in L^2_{a,T}$ and $b(s,X^{n-1}_s) \in L^2(\Omega\times[0,T])$.

Step 2: As in Step 1, we prove by induction on $n$ that

\[
E\Bigl\{\sup_{0\le s\le t}|X^{n+1}_s - X^n_s|^2\Bigr\} \le \frac{(Ct)^{n+1}}{(n+1)!}. \tag{6.12}
\]


Indeed, consider first the case $n = 0$, for which we have

\[
X^1_s - x = \int_0^s \sigma(u,x)\,dB_u + \int_0^s b(u,x)\,du.
\]

Burkholder's inequality yields

\[
E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s \sigma(u,x)\,dB_u\Bigr|^2\Bigr) \le C\int_0^t |\sigma(u,x)|^2\,du \le Ct(1 + |x|^2).
\]

Similarly,

\[
E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s b(u,x)\,du\Bigr|^2\Bigr) \le Ct\int_0^t |b(u,x)|^2\,du \le Ct^2(1 + |x|^2).
\]

With this, (6.12) is established for $n = 0$. Assume that (6.12) holds for natural numbers $m \le n-1$. Then, as we did for $n = 0$, we can consider the decomposition

\[
E\Bigl\{\sup_{0\le s\le t}|X^{n+1}_s - X^n_s|^2\Bigr\} \le 2\bigl(A(t) + B(t)\bigr),
\]

with

\[
\begin{aligned}
A(t) &= E\Bigl\{\sup_{0\le s\le t}\Bigl|\int_0^s \bigl(\sigma(u,X^n_u) - \sigma(u,X^{n-1}_u)\bigr)\,dB_u\Bigr|^2\Bigr\},\\
B(t) &= E\Bigl\{\sup_{0\le s\le t}\Bigl|\int_0^s \bigl(b(u,X^n_u) - b(u,X^{n-1}_u)\bigr)\,du\Bigr|^2\Bigr\}.
\end{aligned}
\]

Using first Burkholder's inequality and then Holder's inequality, along with the Lipschitz property of the coefficient $\sigma$, we obtain

\[
A(t) \le C(L)\int_0^t E\bigl(|X^n_s - X^{n-1}_s|^2\bigr)\,ds.
\]

By the induction assumption we can upper bound the last expression by

\[
C(L)\int_0^t \frac{(Cs)^n}{n!}\,ds \le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.
\]

Similarly, applying Holder's inequality along with the Lipschitz property of the coefficient $b$ and the induction assumption yields

\[
B(t) \le C(T,L)\,\frac{(Ct)^{n+1}}{(n+1)!}.
\]

Step 3: The sequence of processes $\{X^n_t,\ t \in [0,T]\}$, $n \ge 0$, converges uniformly in $t$ to a stochastic process $\{X_t,\ t \in [0,T]\}$ which satisfies (6.2).

Indeed, applying first Chebyshev's inequality and then (6.12), we have

\[
P\Bigl\{\sup_{0\le t\le T}\bigl|X^{n+1}_t - X^n_t\bigr| > \frac{1}{2^n}\Bigr\} \le 2^{2n}\,\frac{(CT)^{n+1}}{(n+1)!},
\]

which clearly implies

\[
\sum_{n=0}^\infty P\Bigl\{\sup_{0\le t\le T}\bigl|X^{n+1}_t - X^n_t\bigr| > \frac{1}{2^n}\Bigr\} < \infty.
\]

Hence, by the first Borel-Cantelli lemma,

\[
P\Bigl\{\liminf_n\Bigl\{\sup_{0\le t\le T}\bigl|X^{n+1}_t - X^n_t\bigr| \le \frac{1}{2^n}\Bigr\}\Bigr\} = 1.
\]

In other words, a.s. there exists a natural number $m_0(\omega)$ such that

\[
\sup_{0\le t\le T}\bigl|X^{n+1}_t - X^n_t\bigr| \le \frac{1}{2^n},
\]

for any $n \ge m_0(\omega)$. The Weierstrass criterion for the convergence of series of functions then implies that

\[
X^m_t = x + \sum_{k=0}^{m-1}\bigl[X^{k+1}_t - X^k_t\bigr]
\]

converges uniformly on $[0,T]$, a.s. Let us denote by $X = \{X_t,\ t \in [0,T]\}$ the limit. Obviously, the process $X$ has a.s. continuous paths.

To conclude the proof, we must check that $X$ satisfies equation (6.2) on $[0,T]$. The continuity properties of $\sigma$ and $b$ imply the convergences

\[
\sigma(t,X^n_t) \to \sigma(t,X_t), \qquad b(t,X^n_t) \to b(t,X_t),
\]

as $n\to\infty$, uniformly in $t \in [0,T]$, a.s.


Therefore,

\[
\Bigl|\int_0^t \bigl[b(s,X^n_s) - b(s,X_s)\bigr]\,ds\Bigr| \le L\int_0^t |X^n_s - X_s|\,ds \le LT\sup_{0\le s\le t}|X^n_s - X_s| \to 0,
\]

as $n\to\infty$, a.s. This proves the a.s. convergence of the sequence of path-wise integrals. As for the stochastic integrals, we will prove

\[
\int_0^t \sigma(s,X^n_s)\,dB_s \to \int_0^t \sigma(s,X_s)\,dB_s \tag{6.13}
\]

as $n\to\infty$, with the convergence in probability. Indeed, applying the extension of Lemma 3.5 to processes of $\Lambda_{a,T}$, we have, for each $\varepsilon, N > 0$,

\[
P\Bigl\{\Bigl|\int_0^t \bigl(\sigma(s,X^n_s) - \sigma(s,X_s)\bigr)\,dB_s\Bigr| > \varepsilon\Bigr\}
\le P\Bigl\{\int_0^t |\sigma(s,X^n_s) - \sigma(s,X_s)|^2\,ds > N\Bigr\} + \frac{N}{\varepsilon^2}.
\]

The first term on the right-hand side of this inequality converges to zero as $n\to\infty$. Since $\varepsilon, N > 0$ are arbitrary, this yields the convergence stated in (6.13).

Summarising, by considering if necessary a subsequence $\{X^{n_k}_t,\ t \in [0,T]\}$, we have proved the a.s. convergence, uniformly in $t \in [0,T]$, to a stochastic process $\{X_t,\ t \in [0,T]\}$ which satisfies (6.2), and moreover

\[
E\sup_{0\le t\le T}|X_t|^2 < \infty.
\]

In order to conclude that $X$ is a strong solution to (6.1), we have to check that the required measurability and integrability conditions hold. This is left as an exercise to the reader.

Step 4: Path-wise uniqueness.

Let $X^1$ and $X^2$ be two strong solutions to (6.1). Proceeding in a similar way as in Step 2, we easily get

\[
E\Bigl(\sup_{0\le u\le t}|X^1(u) - X^2(u)|^2\Bigr) \le C\int_0^t E\Bigl(\sup_{0\le u\le s}|X^1(u) - X^2(u)|^2\Bigr)\,ds.
\]

Hence, from Lemma 6.1 we conclude

\[
E\Bigl(\sup_{0\le u\le T}|X^1(u) - X^2(u)|^2\Bigr) = 0,
\]

proving that $X^1$ and $X^2$ are indistinguishable.

6.3 Some properties of the solution

We start this section by studying the $L^p$-moments of the solution to (6.2).

Theorem 6.2 Assume the same hypotheses as in Theorem 6.1, and suppose in addition that the initial condition is a random variable $X_0$, independent of the Brownian motion, with $E|X_0|^p < \infty$. Fix $p \in [2,\infty)$ and $t \in [0,T]$. There exists a positive constant $C = C(p,t,L)$ such that

\[
E\Bigl(\sup_{0\le s\le t}|X_s|^p\Bigr) \le C\,(1 + E|X_0|^p). \tag{6.14}
\]

Proof: From (6.2) it follows that

\[
E\Bigl(\sup_{0\le s\le t}|X_s|^p\Bigr) \le C(p)\Bigl[E|X_0|^p + E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s \sigma(u,X_u)\,dB_u\Bigr|^p\Bigr) + E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s b(u,X_u)\,du\Bigr|^p\Bigr)\Bigr].
\]

Applying first Burkholder's inequality and then Holder's inequality yields

\[
\begin{aligned}
E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s \sigma(u,X_u)\,dB_u\Bigr|^p\Bigr) &\le C(p)\,E\Bigl(\int_0^t |\sigma(s,X_s)|^2\,ds\Bigr)^{\frac p2}\\
&\le C(p,t)\,E\Bigl(\int_0^t |\sigma(s,X_s)|^p\,ds\Bigr)
\le C(p,L,t)\int_0^t \bigl(1 + E|X_s|^p\bigr)\,ds\\
&\le C(p,L,t)\Bigl(1 + \int_0^t E\Bigl(\sup_{0\le u\le s}|X_u|^p\Bigr)\,ds\Bigr).
\end{aligned}
\]

For the path-wise integral, we apply Holder's inequality to obtain

\[
\begin{aligned}
E\Bigl(\sup_{0\le s\le t}\Bigl|\int_0^s b(u,X_u)\,du\Bigr|^p\Bigr) &\le C(p,t)\,E\Bigl(\int_0^t |b(s,X_s)|^p\,ds\Bigr)\\
&\le C(p,L,t)\Bigl(1 + \int_0^t E\Bigl(\sup_{0\le u\le s}|X_u|^p\Bigr)\,ds\Bigr).
\end{aligned}
\]


Define

\[
\varphi(t) = E\Bigl(\sup_{0\le s\le t}|X_s|^p\Bigr).
\]

We have established that

\[
\varphi(t) \le C(p,L,t)\Bigl(E|X_0|^p + 1 + \int_0^t \varphi(s)\,ds\Bigr).
\]

Then, with Lemma 6.1, we end the proof of (6.14).

It is clear that the solution to (6.2) depends on the initial value $X_0$. Consider two initial conditions $X_0$, $Y_0$ (recall that these should be $m$-dimensional random vectors independent of the Brownian motion). Denote by $X(X_0)$, $X(Y_0)$ the corresponding solutions to (6.2). With a proof very similar to that of Theorem 6.2, we can obtain the following.

Theorem 6.3 The assumptions are the same as in Theorem 6.1. Then

\[
E\Bigl(\sup_{0\le s\le t}|X_s(X_0) - X_s(Y_0)|^p\Bigr) \le C(p,L,t)\,E|X_0 - Y_0|^p, \tag{6.15}
\]

for any $p \in [2,\infty)$, where $C(p,L,t)$ is some positive constant depending on $p$, $L$ and $t$.

The sample paths of the solution of a stochastic differential equation possess the same regularity as those of the Brownian motion. We next discuss this fact.

Theorem 6.4 The assumptions are the same as in Theorem 6.1. Let $p\in[2,\infty)$, $0\le s\le t\le T$. There exists a positive constant $C=C(p,L,T)$ such that
$$E(|X_t-X_s|^p)\le C(p,L,T)\,(1+E|X_0|^p)\,|t-s|^{\frac p2}. \qquad (6.16)$$

Proof: By virtue of (6.2) we can write
$$E(|X_t-X_s|^p)\le C(p)\Big[E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p+E\Big|\int_s^t b(u,X_u)\,du\Big|^p\Big].$$
Burkholder's inequality and then Hölder's inequality with respect to Lebesgue measure on $[s,t]$ yield
$$E\Big|\int_s^t \sigma(u,X_u)\,dB_u\Big|^p\le C(p)\,E\Big(\int_s^t |\sigma(u,X_u)|^2\,du\Big)^{\frac p2}\le C(p)\,|t-s|^{\frac p2-1}\,E\Big(\int_s^t |\sigma(u,X_u)|^p\,du\Big)$$
$$\le C(p,L,T)\,|t-s|^{\frac p2-1}\int_s^t \big(1+E(|X_u|^p)\big)\,du.$$
By using the estimate (6.14) of Theorem 6.2, we have
$$\int_s^t \big(1+E(|X_u|^p)\big)\,du\le C(p,T,L)\,|t-s|\Big(1+E\Big(\sup_{0\le u\le t}|X_u|^p\Big)\Big)\le C(p,T,L)\,|t-s|\,(1+E|X_0|^p). \qquad (6.17)$$
This ends the estimate of the $L^p$ moment of the stochastic integral. The estimate of the pathwise integral follows from Hölder's inequality and (6.17). Indeed, we have
$$E\Big|\int_s^t b(u,X_u)\,du\Big|^p\le |t-s|^{p-1}\int_s^t E(|b(u,X_u)|^p)\,du\le C(p,L)\,|t-s|^{p-1}\int_s^t \big(1+E(|X_u|^p)\big)\,du$$
$$\le C(p,T,L)\,|t-s|^p\,(1+E|X_0|^p).$$
Hence we have proved (6.16).

We can now apply Kolmogorov's continuity criterion (see Proposition 2.2) to prove the following.

Corollary 6.1 With the same assumptions as in Theorem 6.1, we have that the sample paths of the solution to (6.2) are Hölder continuous of degree $\alpha\in(0,\frac12)$.

Remark 6.1 Assume that in Theorem 6.3 the initial conditions are deterministic and are denoted by $x$ and $y$, respectively. An extension of Kolmogorov's continuity criterion to stochastic processes indexed by a multidimensional parameter yields that the sample paths of the stochastic process $X_t(x)$, $t\in[0,T]$, $x\in\mathbb{R}^m$, are jointly Hölder continuous in $(t,x)$, of degree $\alpha<\frac12$ in $t$ and $\beta<1$ in $x$, respectively.
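The increment bound (6.16) can be illustrated numerically in the simplest case $\sigma\equiv1$, $b\equiv0$, where the solution of (6.2) is $X_t=X_0+B_t$ and, for $p=4$, $E|X_t-X_s|^4=3|t-s|^2$. The sketch below is a Monte Carlo illustration only (the sample sizes and step sizes are arbitrary choices, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
# For sigma = 1, b = 0 the solution of (6.2) is X_t = X_0 + B_t, so the
# increment X_t - X_s is N(0, t - s) and E|X_t - X_s|^4 = 3 |t - s|^2,
# consistent with (6.16) for p = 4.
ratios = []
for h in (0.1, 0.01):
    incr = rng.normal(0.0, np.sqrt(h), size=200_000)  # samples of X_{s+h} - X_s
    ratios.append(float(np.mean(incr**4) / h**2))     # should be close to 3
```

The ratio $E|X_{s+h}-X_s|^4/h^2$ staying near 3 for both step sizes illustrates the $|t-s|^{p/2}$ scaling.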


6.4 Markov property of the solution

In Section 2.5 we discussed the Markov property of a real-valued Brownian motion. With the obvious changes ($\mathbb{R}$ into $\mathbb{R}^n$, with arbitrary $n\ge1$) we can see that the property extends to multi-dimensional Brownian motion. In this section we prove that the solution to the SDE (6.2) inherits the Markov property from Brownian motion. To establish this fact we need some preliminary results.

Lemma 6.2 Let $(\Omega,\mathcal F,P)$ be a probability space and $(E,\mathcal E)$ be a measurable space. Consider two independent sub-$\sigma$-fields of $\mathcal F$, denoted by $\mathcal G$, $\mathcal H$, respectively, along with mappings
$$X:(\Omega,\mathcal H)\to(E,\mathcal E),$$
and
$$\Psi:(E\times\Omega,\mathcal E\otimes\mathcal G)\to\mathbb{R}^m,$$
with $\omega\mapsto\Psi(X(\omega),\omega)$ in $L^1(\Omega)$. Then,
$$E(\Psi(X,\cdot)\,|\,\mathcal H)=\Phi(X), \qquad (6.18)$$
with $\Phi(x)=E(\Psi(x,\cdot))$.

Proof: Assume first that $\Psi(x,\omega)=f(x)Z(\omega)$ with a $\mathcal G$-measurable $Z$ and an $\mathcal E$-measurable $f$. Then, by the properties of the mathematical expectation,
$$E(\Psi(X,\cdot)\,|\,\mathcal H)=E(f(X(\cdot))Z(\cdot)\,|\,\mathcal H)=f(X(\cdot))\,E(Z).$$
Indeed, we use that $X$ is $\mathcal H$-measurable and that $\mathcal G$, $\mathcal H$ are independent. Clearly,
$$f(X(\cdot))\,E(Z)=f(x)E(Z)\big|_{x=X(\cdot)}=E(\Psi(x,\cdot))\big|_{x=X(\cdot)}=\Phi(X).$$
This yields (6.18). The result extends to any $\mathcal E\otimes\mathcal G$-measurable function $\Psi$ by a monotone class argument.

Lemma 6.3 Fix $u\ge0$ and let $\eta$ be an $\mathcal F_u$-measurable random variable in $L^2(\Omega)$. Consider the SDE
$$Y^\eta_t=\eta+\int_u^t \sigma(s,Y^\eta_s)\,dB_s+\int_u^t b(s,Y^\eta_s)\,ds,$$
with coefficients $\sigma$ and $b$ satisfying the assumptions (H). Then for any $t\ge u$,
$$Y^{\eta(\omega)}_t(\omega)=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)},$$
where $X^{x,u}_t$, $t\ge0$, denotes the solution to
$$X^{x,u}_t=x+\int_u^t \sigma(s,X^{x,u}_s)\,dB_s+\int_u^t b(s,X^{x,u}_s)\,ds. \qquad (6.19)$$

Proof: Suppose first that $\eta$ is a step function,
$$\eta=\sum_{i=1}^r c_i\mathbf{1}_{A_i},\qquad A_i\in\mathcal F_u.$$
By virtue of the local property of the stochastic integral, on the set $A_i$,
$$X^{c_i,u}_t(\omega)=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)}=Y^\eta_t(\omega).$$
Let now $(\eta_n,n\ge1)$ be a sequence of simple $\mathcal F_u$-measurable random variables converging in $L^2(\Omega)$ to $\eta$. By Theorem 6.3 we have
$$L^2(\Omega)\text{-}\lim_{n\to\infty}Y^{\eta_n}_t=Y^{\eta}_t.$$
By taking if necessary a subsequence, we may assume that the limit holds a.s. Then, a.s.,
$$Y^{\eta(\omega)}_t(\omega)=\lim_{n\to\infty}Y^{\eta_n(\omega)}_t(\omega)=\lim_{n\to\infty}X^{x,u}_t(\omega)\big|_{x=\eta_n(\omega)}=X^{x,u}_t(\omega)\big|_{x=\eta(\omega)},$$
where in the last equality we have applied the joint continuity in $(t,x)$ of $X^{x,u}_t$.

As a consequence of the preceding lemma, we have $X^{x,s}_t=X^{X^{x,s}_u,u}_t$ for any $0\le s\le u\le t$, a.s. For any $\Gamma\in\mathcal B(\mathbb{R}^m)$, set
$$p(s,t,x,\Gamma)=P\{X^{x,s}_t\in\Gamma\}, \qquad (6.20)$$
so, for fixed $0\le s\le t$ and $x\in\mathbb{R}^m$, $p(s,t,x,\cdot)$ is the law of the random variable $X^{x,s}_t$.

Theorem 6.5 The stochastic process $\{X^{x,s}_t,\,t\ge s\}$ is a Markov process with initial distribution $\mu=\delta_x$ and transition probability function given by (6.20).


Proof: According to Definition 2.3 we have to check that (6.20) defines a Markovian transition function and that
$$P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}=p(u,t,X^{x,s}_u,\Gamma). \qquad (6.21)$$
We start by proving this identity. For this, we shall apply Lemma 6.2 in the following setting:
$$(E,\mathcal E)=(\mathbb{R}^m,\mathcal B(\mathbb{R}^m)),\qquad \mathcal G=\sigma(B_{r+u}-B_u,\,r\ge0),\qquad \mathcal H=\mathcal F_u,$$
$$\Psi(x,\omega)=\mathbf{1}_\Gamma(X^{x,u}_t(\omega)),\ u\le t,\qquad X:=X^{x,s}_u,\ s\le u.$$
The property of independent increments of the Brownian motion clearly yields that the $\sigma$-fields $\mathcal G$ and $\mathcal H$ defined above are independent, and $X^{x,s}_u$ is $\mathcal F_u$-measurable. Moreover,
$$\Phi(x)=E(\Psi(x,\cdot))=E(\mathbf{1}_\Gamma(X^{x,u}_t))=P\{X^{x,u}_t\in\Gamma\}=p(u,t,x,\Gamma).$$
Thus, Lemma 6.3 and then Lemma 6.2 yield
$$P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}=P\big\{X^{X^{x,s}_u,u}_t\in\Gamma\,\big|\,\mathcal F_u\big\}=E\big(\mathbf{1}_\Gamma\big(X^{X^{x,s}_u,u}_t\big)\,\big|\,\mathcal F_u\big)=E(\Psi(X^{x,s}_u,\omega)\,|\,\mathcal F_u)=\Phi(X^{x,s}_u)=p(u,t,X^{x,s}_u,\Gamma).$$
Since $x\mapsto X^{x,s}_t$ is continuous a.s., the mapping $x\mapsto p(s,t,x,\Gamma)$ is also continuous and thus measurable. Moreover, by its very definition $\Gamma\mapsto p(s,t,x,\Gamma)$ is a probability. We now prove that the Chapman-Kolmogorov equation is satisfied (see (2.10)). Indeed, fix $0\le s\le u\le t$; by property (c) of the conditional expectation we have
$$p(s,t,x,\Gamma)=E(\mathbf{1}_\Gamma(X^{x,s}_t))=E\big(E(\mathbf{1}_\Gamma(X^{x,s}_t)\,|\,\mathcal F_u)\big)=E\big(P\{X^{x,s}_t\in\Gamma\,|\,\mathcal F_u\}\big).$$
By (6.21) this last expression is $E(p(u,t,X^{x,s}_u,\Gamma))$. But
$$E(p(u,t,X^{x,s}_u,\Gamma))=\int_{\mathbb{R}^m}p(u,t,y,\Gamma)\,\mathcal L_{X^{x,s}_u}(dy),$$
where $\mathcal L_{X^{x,s}_u}$ denotes the probability law of $X^{x,s}_u$. By definition,
$$\mathcal L_{X^{x,s}_u}(dy)=p(s,u,x,dy).$$
Therefore,
$$p(s,t,x,\Gamma)=\int_{\mathbb{R}^m}p(u,t,y,\Gamma)\,p(s,u,x,dy).$$
The proof of the theorem is now complete.
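For $\sigma\equiv1$, $b\equiv0$ the transition function (6.20) is explicit: $p(s,t,x,\cdot)$ is the $N(x,t-s)$ law, and the Chapman-Kolmogorov equation becomes the Gaussian convolution identity. The following Monte Carlo sketch is an illustration only (all concrete times and the set $\Gamma=(-\infty,1]$ are arbitrary choices, not from the notes):

```python
import numpy as np
from math import erf, sqrt

# For sigma = 1, b = 0, X^{x,s}_t = x + (B_t - B_s), so p(s,t,x,.) = N(x, t-s).
rng = np.random.default_rng(1)
x, s, u, t = 0.0, 0.0, 0.3, 1.0
Y = x + rng.normal(0.0, np.sqrt(u - s), size=500_000)  # Y ~ p(s,u,x,dy)
Z = Y + rng.normal(0.0, np.sqrt(t - u), size=Y.size)   # composed kernels p(u,t,y,.) p(s,u,x,dy)
# Left side of Chapman-Kolmogorov for Gamma = (-inf, 1]
lhs = float(np.mean(Z <= 1.0))
# Right side: p(s,t,x,Gamma), the exact N(x, t-s) cdf at 1
rhs = 0.5 * (1.0 + erf((1.0 - x) / sqrt(2.0 * (t - s))))
```

The two quantities agree up to Monte Carlo error, as the Chapman-Kolmogorov equation predicts.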

7 Numerical approximations of stochastic differential equations

In this section we consider a fixed time interval $[0,T]$. Let $\pi=\{0=\tau_0<\tau_1<\dots<\tau_N=T\}$ be a partition of $[0,T]$. The Euler-Maruyama scheme for the SDE (6.2) based on the partition $\pi$ is the stochastic process $X^\pi=\{X^\pi_t,\,t\in[0,T]\}$ defined iteratively as follows:
$$X^\pi_{\tau_{n+1}}=X^\pi_{\tau_n}+\sigma(\tau_n,X^\pi_{\tau_n})(B_{\tau_{n+1}}-B_{\tau_n})+b(\tau_n,X^\pi_{\tau_n})(\tau_{n+1}-\tau_n),\quad n=0,\dots,N-1,$$
$$X^\pi_0=x. \qquad (7.1)$$
Notice that the values $X^\pi_{\tau_n}$, $n=0,\dots,N-1$, are determined by the values of $B_{\tau_n}$, $n=1,\dots,N$. We can extend the definition of $X^\pi$ to any value of $t\in[0,T]$ by setting
$$X^\pi_t=X^\pi_{\tau_j}+\sigma(\tau_j,X^\pi_{\tau_j})(B_t-B_{\tau_j})+b(\tau_j,X^\pi_{\tau_j})(t-\tau_j), \qquad (7.2)$$
for $t\in[\tau_j,\tau_{j+1})$. The stochastic process $\{X^\pi_t,\,t\in[0,T]\}$ defined by (7.2) can be written as a stochastic differential equation. This notation will be suitable for comparing with the solution of (6.2). Indeed, for any $t\in[0,T]$ set $\pi(t)=\sup\{\tau_l\in\pi;\,\tau_l\le t\}$; then
$$X^\pi_t=x+\int_0^t\Big[\sigma\big(\pi(s),X^\pi_{\pi(s)}\big)\,dB_s+b\big(\pi(s),X^\pi_{\pi(s)}\big)\,ds\Big]. \qquad (7.3)$$
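The recursion (7.1) is straightforward to implement. The sketch below is an illustration only: the concrete coefficients $\sigma(t,x)=0.3$ and $b(t,x)=-x$ are arbitrary choices, not from the notes.

```python
import numpy as np

def euler_maruyama(sigma, b, x0, T, N, rng):
    """One path of the Euler-Maruyama scheme (7.1) on the
    uniform partition tau_n = n T / N."""
    dt = T / N
    X = np.empty(N + 1)
    X[0] = x0
    for n in range(N):
        t = n * dt
        dB = rng.normal(0.0, np.sqrt(dt))  # increment B_{tau_{n+1}} - B_{tau_n}
        X[n + 1] = X[n] + sigma(t, X[n]) * dB + b(t, X[n]) * dt
    return X

rng = np.random.default_rng(2)
# Example coefficients (arbitrary): dX = -X dt + 0.3 dB, X_0 = 1
path = euler_maruyama(lambda t, x: 0.3, lambda t, x: -x, 1.0, 1.0, 1000, rng)
```

Each step uses only the increment of $B$ over $[\tau_n,\tau_{n+1}]$, which is exactly the observation after (7.1) that $X^\pi$ is determined by the values of $B$ at the partition points.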

The next theorem gives the rate of convergence of the Euler-Maruyama scheme to the solution of (6.2) in the $L^p$ norm.


Theorem 7.1 We assume that the hypotheses (H) are satisfied. Moreover, we suppose that there exists $\alpha\in(0,1)$ such that
$$|\sigma(t,x)-\sigma(s,x)|+|b(t,x)-b(s,x)|\le C(1+|x|)|t-s|^\alpha, \qquad (7.4)$$
where $C$ is some positive constant. Then, for any $p\in[1,\infty)$,
$$E\Big(\sup_{0\le t\le T}|X^\pi_t-X_t|^p\Big)\le C(T,p,x)\,|\pi|^{\beta p}, \qquad (7.5)$$
where $|\pi|$ denotes the norm of the partition $\pi$ and $\beta=\frac12\wedge\alpha$.

Proof: We shall apply the following result, which can be argued in a similar way as in Theorem 6.2:
$$\sup_\pi E\Big(\sup_{0\le s\le t}|X^\pi_s|^p\Big)\le C(p,T). \qquad (7.6)$$
Set
$$Z_t=\sup_{0\le s\le t}|X^\pi_s-X_s|.$$
Applying Burkholder's and Hölder's inequalities we obtain
$$E(Z_t^p)\le 2^{p-1}\Big[t^{\frac p2-1}\int_0^t E\big(\big|\sigma(\pi(s),X^\pi_{\pi(s)})-\sigma(s,X_s)\big|^p\big)\,ds+t^{p-1}\int_0^t E\big(\big|b(\pi(s),X^\pi_{\pi(s)})-b(s,X_s)\big|^p\big)\,ds\Big].$$
The assumptions on the coefficients yield
$$\big|\sigma(\pi(s),X^\pi_{\pi(s)})-\sigma(s,X_s)\big|\le \big|\sigma(\pi(s),X^\pi_{\pi(s)})-\sigma(\pi(s),X_{\pi(s)})\big|+\big|\sigma(\pi(s),X_{\pi(s)})-\sigma(s,X_{\pi(s)})\big|+\big|\sigma(s,X_{\pi(s)})-\sigma(s,X_s)\big|$$
$$\le C_T\Big[\big|X^\pi_{\pi(s)}-X_{\pi(s)}\big|+\big(1+\big|X_{\pi(s)}\big|\big)|s-\pi(s)|^\alpha+\big|X_{\pi(s)}-X_s\big|\Big]$$
$$\le C_T\Big[Z_s+\big(1+\big|X_{\pi(s)}\big|\big)\big((s-\pi(s))^\alpha+\big|X_{\pi(s)}-X_s\big|\big)\Big],$$
and similarly for the coefficient $b$. Hence, we have
$$E\big(\big|\sigma(\pi(s),X^\pi_{\pi(s)})-\sigma(s,X_s)\big|^p\big)\le C(p,T)\Big[E(|Z_s|^p)+(1+|x|^p)\big(|\pi|^{\alpha p}+|\pi|^{\frac p2}\big)\Big]\le C(p,T)\Big[E(|Z_s|^p)+(1+|x|^p)\big(|\pi|^{\alpha}+|\pi|^{\frac12}\big)^p\Big],$$
and a similar estimate for $b$. Consequently,
$$E(Z_t^p)\le C(p,T,x)\Big[\int_0^t E(Z_s^p)\,ds+\big(|\pi|^{\alpha}+|\pi|^{\frac12}\big)^p\,T\Big].$$
With Gronwall's lemma we conclude
$$E(Z_t^p)\le C(p,T,x)\,|\pi|^{\beta p},$$
with $\beta=\frac12\wedge\alpha$.

Remark 7.1 If the coefficients $\sigma$ and $b$ do not depend on $t$, then with a similar proof we obtain $\beta=\frac12$ in (7.5).

Assume that the sequence $(\pi_n,n\ge1)$ of partitions of $[0,T]$ satisfies the following property: there exist $\gamma\in(0,\beta)$ and $p\ge1$ such that
$$\sum_{n\ge1}|\pi_n|^{(\beta-\gamma)p}<\infty. \qquad (7.7)$$
Then, Chebyshev's inequality and (7.5) imply
$$\sum_{n\ge1}P\Big\{|\pi_n|^{-\gamma}\sup_{0\le s\le T}|X^{\pi_n}_s-X_s|>\varepsilon\Big\}\le C(p,T,x)\,\varepsilon^{-p}\sum_{n\ge1}|\pi_n|^{(\beta-\gamma)p}<\infty.$$
Borel-Cantelli's lemma then yields
$$|\pi_n|^{-\gamma}\sup_{0\le s\le T}|X^{\pi_n}_s-X_s|\to 0,\quad\text{a.s.}, \qquad (7.8)$$
as $n\to\infty$. For example, for the sequence of dyadic partitions, $|\pi_n|=2^{-n}$ and (7.7) holds for any $\gamma\in(0,\beta)$ and $p\ge1$.
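The rate $\beta=\frac12$ of Remark 7.1 can be observed numerically for geometric Brownian motion $dX=\mu X\,dt+\sigma X\,dB$, whose exact solution $X_T=x\exp((\mu-\sigma^2/2)T+\sigma B_T)$ can be built from the same Brownian increments that drive the scheme. The following Monte Carlo sketch is an illustration, not a proof; all concrete parameter values are arbitrary.

```python
import numpy as np

def strong_error(N, M, mu=0.1, sig=0.4, x0=1.0, T=1.0, seed=3):
    """Monte Carlo estimate of E|X^pi_T - X_T| for geometric Brownian
    motion, with the Euler-Maruyama scheme and the exact solution
    driven by the same Brownian increments."""
    rng = np.random.default_rng(seed)
    dt = T / N
    dB = rng.normal(0.0, np.sqrt(dt), size=(M, N))
    X = np.full(M, x0)
    for n in range(N):
        X = X + mu * X * dt + sig * X * dB[:, n]
    B_T = dB.sum(axis=1)                          # B_T recovered from the increments
    exact = x0 * np.exp((mu - 0.5 * sig**2) * T + sig * B_T)
    return float(np.mean(np.abs(X - exact)))

e_coarse = strong_error(64, 20_000)
e_fine = strong_error(256, 20_000)
ratio = e_coarse / e_fine   # |pi| shrinks by 4, so roughly 4**(1/2) = 2 for beta = 1/2
```

Dividing the mesh by $4$ should roughly halve the strong error, reflecting $\beta=\frac12$.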


8 Continuous time martingales

In this chapter we shall study some properties of martingales (respectively, supermartingales and submartingales) whose sample paths are continuous. We consider a filtration $\{\mathcal F_t,t\ge0\}$ as has been introduced in Section 2.4 and refer to Definition 2.2 for the notion of martingale (respectively, supermartingale, submartingale). We notice that in fact this definition can be extended to families of random variables $\{X_t,t\in\mathcal T\}$ where $\mathcal T$ is an ordered set. In particular, we can consider discrete time parameter processes. We start by listing some elementary but useful properties.

1. For a martingale (respectively, supermartingale, submartingale) the function $t\mapsto E(X_t)$ is a constant (respectively, decreasing, increasing) function.

2. Let $\{X_t,t\ge0\}$ be a martingale and let $f:\mathbb{R}\to\mathbb{R}$ be a convex function. Assume further that $f(X_t)\in L^1(\Omega)$ for any $t\ge0$. Then the stochastic process $\{f(X_t),t\ge0\}$ is a submartingale. The same conclusion holds true for a submartingale if, additionally, the convex function $f$ is increasing.

The first assertion follows easily from property (c) of the conditional expectation. The second assertion can be proved using Jensen's inequality, as follows. Assume first that $\{X_t,t\ge0\}$ is a martingale, and fix $0\le s\le t$. Then, by applying the function $f$ to the identity $E(X_t|\mathcal F_s)=X_s$ along with the convexity of $f$, we obtain
$$E(f(X_t)|\mathcal F_s)\ge f(E(X_t|\mathcal F_s))=f(X_s).$$
If $\{X_t,t\ge0\}$ is a submartingale, we consider the inequality $E(X_t|\mathcal F_s)\ge X_s$. Since $f$ is increasing and convex, we have
$$E(f(X_t)|\mathcal F_s)\ge f(E(X_t|\mathcal F_s))\ge f(X_s).$$

8.1 Doob’s inequalities for martingales

In the first part of this section, we will deal with discrete parameter martingales indexed by $\{0,1,\dots,N\}$.

Proposition 8.1 Let $\{X_n,0\le n\le N\}$ be a submartingale. For any $\lambda>0$, the following inequalities hold:
$$\lambda\,P\Big(\sup_n X_n\ge\lambda\Big)\le E\big(X_N\mathbf{1}_{(\sup_n X_n\ge\lambda)}\big)\le E\big(|X_N|\mathbf{1}_{(\sup_n X_n\ge\lambda)}\big). \qquad (8.1)$$


Proof: Consider the stopping time
$$T=\inf\{n: X_n\ge\lambda\}\wedge N.$$
Then
$$E(X_N)\ge E(X_T)=E\big(X_T\mathbf{1}_{(\sup_n X_n\ge\lambda)}\big)+E\big(X_T\mathbf{1}_{(\sup_n X_n<\lambda)}\big)\ge\lambda\,P\Big(\sup_n X_n\ge\lambda\Big)+E\big(X_N\mathbf{1}_{(\sup_n X_n<\lambda)}\big).$$
By subtracting $E\big(X_N\mathbf{1}_{(\sup_n X_n<\lambda)}\big)$ from the first and last terms above we obtain the first inequality of (8.1). The second one is obvious.

As a consequence of this proposition we have the following.

Proposition 8.2 Let $\{X_n,0\le n\le N\}$ be either a martingale or a positive submartingale. Fix $p\in[1,\infty)$ and $\lambda\in(0,\infty)$. Then,
$$\lambda^p\,P\Big(\sup_n|X_n|\ge\lambda\Big)\le E(|X_N|^p). \qquad (8.2)$$
Moreover, for any $p\in\,]1,\infty)$,
$$E(|X_N|^p)\le E\Big(\sup_n|X_n|^p\Big)\le\Big(\frac{p}{p-1}\Big)^p E(|X_N|^p). \qquad (8.3)$$

Proof: Without loss of generality, we may assume that $E(|X_N|^p)<\infty$, since otherwise (8.2) holds trivially. According to property 2 above, the process $\{|X_n|^p,0\le n\le N\}$ is a submartingale and then, by Proposition 8.1 applied to the process $|X_n|^p$,
$$\mu\,P\Big(\sup_n|X_n|^p\ge\mu\Big)=\mu\,P\Big(\sup_n|X_n|\ge\mu^{\frac1p}\Big)=\lambda^p\,P\Big(\sup_n|X_n|\ge\lambda\Big)\le E(|X_N|^p),$$
where for any $\mu>0$ we have written $\lambda=\mu^{\frac1p}$. We now prove the second inequality of (8.3); the first is obvious. Set $X^*=\sup_n|X_n|$, for which we have
$$\lambda\,P(X^*\ge\lambda)\le E\big(|X_N|\mathbf{1}_{(X^*\ge\lambda)}\big).$$


Fix $k>0$. Fubini's theorem yields
$$E((X^*\wedge k)^p)=E\Big(\int_0^{X^*\wedge k}p\lambda^{p-1}\,d\lambda\Big)=\int_\Omega dP\int_0^\infty\mathbf{1}_{\{\lambda\le X^*\wedge k\}}\,p\lambda^{p-1}\,d\lambda=p\int_0^k d\lambda\,\lambda^{p-1}\int_{\{\lambda\le X^*\}}dP$$
$$=p\int_0^k d\lambda\,\lambda^{p-2}\,\lambda\,P(X^*\ge\lambda)\le p\int_0^k d\lambda\,\lambda^{p-2}\,E\big(|X_N|\mathbf{1}_{(X^*\ge\lambda)}\big)=p\,E\Big(|X_N|\int_0^{k\wedge X^*}\lambda^{p-2}\,d\lambda\Big)=\frac{p}{p-1}E\big(|X_N|(X^*\wedge k)^{p-1}\big).$$
Applying Hölder's inequality with exponents $\frac{p}{p-1}$ and $p$ yields
$$E((X^*\wedge k)^p)\le\frac{p}{p-1}\big[E((X^*\wedge k)^p)\big]^{\frac{p-1}{p}}\big[E(|X_N|^p)\big]^{\frac1p}.$$
Consequently,
$$\big[E((X^*\wedge k)^p)\big]^{\frac1p}\le\frac{p}{p-1}\big[E(|X_N|^p)\big]^{\frac1p}.$$
Letting $k\to\infty$ and using monotone convergence, we end the proof.

It is not difficult to extend the above results to martingales (submartingales) with continuous sample paths. In fact, for a given $T>0$ we define
$$D=\mathbb{Q}\cap[0,T],\qquad D_n=D\cap\Big\{\frac{k}{2^n},\,k\in\mathbb{Z}_+\Big\},$$
where $\mathbb{Q}$ denotes the set of rational numbers. We can now apply (8.2), (8.3) to the corresponding processes indexed by $D_n$. By letting $n$ tend to $\infty$ we obtain
$$\lambda^p\,P\Big(\sup_{t\in D}|X_t|\ge\lambda\Big)\le\sup_{t\in D}E(|X_t|^p),\quad p\in[1,\infty),$$
and
$$E\Big(\sup_{t\in D}|X_t|^p\Big)\le\Big(\frac{p}{p-1}\Big)^p\sup_{t\in D}E(|X_t|^p),\quad p\in\,]1,\infty).$$

By the continuity of the sample paths we can finally state the following result.

Theorem 8.1 Let $\{X_t,t\in[0,T]\}$ be either a continuous martingale or a continuous positive submartingale. Then
$$\lambda^p\,P\Big(\sup_{t\in[0,T]}|X_t|\ge\lambda\Big)\le\sup_{t\in[0,T]}E(|X_t|^p),\quad p\in[1,\infty), \qquad (8.4)$$
$$E\Big(\sup_{t\in[0,T]}|X_t|^p\Big)\le\Big(\frac{p}{p-1}\Big)^p\sup_{t\in[0,T]}E(|X_t|^p) \qquad (8.5)$$
$$=\Big(\frac{p}{p-1}\Big)^p E(|X_T|^p),\quad p\in\,]1,\infty). \qquad (8.6)$$

Inequality (8.4) is termed Doob's maximal inequality, while (8.6) is called Doob's $L^p$ inequality.
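A quick sanity check of the discrete version (8.3) with $p=2$ (a numerical illustration only; the simple random walk below is an arbitrarily chosen martingale, not from the notes): compare $E(\sup_n|X_n|^2)$ with $4\,E(|X_N|^2)$.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 50_000, 200
steps = rng.choice([-1.0, 1.0], size=(M, N))  # i.i.d. +-1 signs: partial sums form a martingale
X = np.cumsum(steps, axis=1)
lhs = float(np.mean(np.max(X**2, axis=1)))    # estimate of E(sup_n |X_n|^2)
rhs = 4.0 * float(np.mean(X[:, -1]**2))       # (p/(p-1))^p E(|X_N|^p) with p = 2
```

The empirical `lhs` sits between $E(|X_N|^2)$ and $4\,E(|X_N|^2)$, as (8.3) predicts.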

8.2 Quadratic variation of a continuous martingale

In this section we construct the quadratic variation of a continuous martingale. We start with a very simple consequence of the martingale property.

Lemma 8.1 Let $\{N_t,t\ge0\}$ be a martingale with respect to a filtration $(\mathcal F_t)_{t\ge0}$. The following identities hold: for any $0\le s\le t$,
$$E(N_t^2-N_s^2)=E\big((N_t-N_s)^2\big), \qquad (8.7)$$
$$E(N_t^2-N_s^2\,|\,\mathcal F_s)=E\big((N_t-N_s)^2\,\big|\,\mathcal F_s\big). \qquad (8.8)$$

Proof: By developing the square,
$$E\big((N_t-N_s)^2\big)=E(N_t^2+N_s^2-2N_sN_t).$$
From the martingale property,
$$E(N_sN_t)=E\big(E(N_sN_t\,|\,\mathcal F_s)\big)=E(N_s^2).$$
Hence (8.7) follows. The proof of (8.8) follows by similar arguments.

The next statement gives an idea of the roughness of the sample paths of a continuous local martingale.


Proposition 8.3 Let $N$ be a continuous bounded martingale, null at zero and with sample paths of bounded variation, a.s. Then $N$ is indistinguishable from the constant process $0$.

Proof: Fix $t>0$ and consider a partition $0=t_0<t_1<\dots<t_p=t$ of $[0,t]$. Then
$$E(N_t^2)=\sum_{i=1}^p E\big[N_{t_i}^2-N_{t_{i-1}}^2\big]=\sum_{i=1}^p E\big[(N_{t_i}-N_{t_{i-1}})^2\big]\le E\Big[\Big(\sup_i|N_{t_i}-N_{t_{i-1}}|\Big)\sum_{i=1}^p|N_{t_i}-N_{t_{i-1}}|\Big]\le C\,E\Big[\sup_i|N_{t_i}-N_{t_{i-1}}|\Big],$$
where the second identity above is a consequence of Lemma 8.1. By considering a sequence of partitions whose mesh tends to zero, the preceding estimate yields $E(N_t^2)=0$, by the continuity of the sample paths of $N$. This finishes the proof of the proposition.

Throughout this section we will consider a fixed $t>0$ and an increasing sequence of partitions of $[0,t]$ whose mesh tends to zero. Points of the $n$-th partition will be generically denoted by $t^n_k$, $k=0,1,\dots,p_n$. We will also consider a continuous martingale $M$ and define
$$\langle M\rangle^n_t=\sum_{k=1}^{p_n}\big(M_{t^n_k}-M_{t^n_{k-1}}\big)^2,\qquad (\Delta^n_kM)_t=M_{t^n_k}-M_{t^n_{k-1}}.$$

Theorem 8.2 Let $M$ be a continuous bounded martingale. Then the sequence $(\langle M\rangle^n_t,n\ge1)$, $t\in[0,T]$, converges uniformly in $t$, in probability, to a continuous, increasing process $\langle M\rangle=(\langle M\rangle_t,t\in[0,T])$ such that $\langle M\rangle_0=0$. That is, for any $\varepsilon>0$,
$$P\Big\{\sup_{t\in[0,T]}|\langle M\rangle^n_t-\langle M\rangle_t|>\varepsilon\Big\}\to0, \qquad (8.9)$$
as $n\to\infty$. The process $\langle M\rangle$ is the unique process satisfying the above conditions and such that $M^2-\langle M\rangle$ is a continuous martingale.


Proof: Uniqueness follows from Proposition 8.3. Indeed, assume there were two increasing processes $\langle M\rangle_i$, $i=1,2$, such that $M^2-\langle M\rangle_i$ is a continuous martingale. By taking the difference of these two processes we get that $\langle M\rangle_1-\langle M\rangle_2$ is of bounded variation and, at the same time, a continuous martingale. Hence $\langle M\rangle_1$ and $\langle M\rangle_2$ are indistinguishable.

The next objective is to prove that $(\langle M\rangle^n_t,n\ge1)$, $t\in[0,T]$, is, uniformly in $t$, a Cauchy sequence in probability.

Let $m>n$. We have the following:
$$E\big(|\langle M\rangle^n_t-\langle M\rangle^m_t|^2\big)=E\Bigg|\sum_{k=1}^{p_n}\Big((\Delta^n_kM)^2_t-\sum_{j:\,t^m_j\in[t^n_{k-1},t^n_k[}(\Delta^m_jM)^2_t\Big)\Bigg|^2$$
$$=4\,E\Bigg|\sum_{k=1}^{p_n}\sum_{j:\,t^m_j\in[t^n_{k-1},t^n_k[}(\Delta^m_jM)_t\big(M_{t^m_j}-M_{t^n_{k-1}}\big)\Bigg|^2$$
$$=4\,E\Bigg[\sum_{k=1}^{p_n}\sum_{j:\,t^m_j\in[t^n_{k-1},t^n_k[}(\Delta^m_jM)^2_t\big(M_{t^m_j}-M_{t^n_{k-1}}\big)^2\Bigg]$$
$$\le4\,E\Bigg[\Big(\sup_k\sup_{j:\,t^m_j\in[t^n_{k-1},t^n_k[}\big(M_{t^m_j}-M_{t^n_{k-1}}\big)^2\Big)\sum_{j=1}^{p_m}(\Delta^m_jM)^2_t\Bigg]$$
$$\le4\Bigg[E\Big(\sup_k\sup_{j:\,t^m_j\in[t^n_{k-1},t^n_k[}\big(M_{t^m_j}-M_{t^n_{k-1}}\big)^4\Big)\,E\Big(\sum_{j=1}^{p_m}(\Delta^m_jM)^2_t\Big)^2\Bigg]^{\frac12}.$$
Let us now consider the last expression. The first factor tends to zero as $n$ and $m$ tend to infinity, because $M$ is continuous and bounded. The second factor is easily seen to be bounded uniformly in $m$. Thus we have proved
$$\lim_{n,m\to\infty}E\big(|\langle M\rangle^n_t-\langle M\rangle^m_t|^2\big)=0, \qquad (8.10)$$
for any $t\in[0,T]$. One can easily check that for any $n\ge1$, $\{M_t^2-\langle M\rangle^n_t,\,t\in[0,T]\}$ is a continuous martingale. Indeed, let $0\le s\le t$; then
$$E(M_t^2-M_s^2\,|\,\mathcal F_s)=E\big((M_t-M_s)^2\,\big|\,\mathcal F_s\big)=E(\langle M\rangle^n_t-\langle M\rangle^n_s\,|\,\mathcal F_s).$$
Therefore $\{\langle M\rangle^n_t-\langle M\rangle^m_t,\,t\in[0,T]\}$ is a martingale for any $n,m\ge1$. Hence, Doob's inequality yields
$$E\Big(\sup_{t\in[0,T]}|\langle M\rangle^n_t-\langle M\rangle^m_t|^2\Big)\le4\,E\big(|\langle M\rangle^n_T-\langle M\rangle^m_T|^2\big).$$
Consequently,
$$P\Big\{\sup_{0\le t\le T}|\langle M\rangle^n_t-\langle M\rangle^m_t|>\varepsilon\Big\}\le\varepsilon^{-2}E\Big(\sup_{0\le t\le T}|\langle M\rangle^n_t-\langle M\rangle^m_t|^2\Big)\le4\varepsilon^{-2}E\big(|\langle M\rangle^n_T-\langle M\rangle^m_T|^2\big).$$
This last expression tends to zero as $n,m$ tend to infinity. Since the convergence in probability is metrizable, we see that there exists a process $\langle M\rangle$ satisfying the required conditions.

Remark By a stopping argument, the boundedness assumption on $M$ in Theorem 8.2 can be relaxed: the conclusion still holds for continuous martingales that are merely bounded in $L^2$.
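For Brownian motion itself, $\langle B\rangle_t=t$, and the sums $\langle B\rangle^n_t$ defined above can be computed directly. The following is a small numerical illustration only (not from the notes; for simplicity each level samples fresh increments rather than nested refinements of one path):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 1.0

def qv_sum(n):
    """<B>^n_T over the dyadic partition {k T / 2^n} for a sampled path:
    the sum of squared Brownian increments."""
    dB = rng.normal(0.0, np.sqrt(T / 2**n), size=2**n)
    return float(np.sum(dB**2))

approx = [qv_sum(n) for n in (4, 8, 12, 16)]  # should approach <B>_T = T = 1
```

As the mesh $T/2^n$ shrinks, the sums concentrate around $T$, in line with (8.9).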

Example Consider the stochastic integral process
$$M=\Big\{\int_0^t\varphi(s)\,dB_s,\;t\in[0,T]\Big\},$$
with $\varphi\in L^2_{a,T}$. It was proved in Section 3 that the process $M$ is a continuous martingale with respect to the filtration generated by the Brownian motion $B$. By applying the extension of Theorem 8.2 to continuous martingales bounded in $L^2$ we can see that
$$\langle M\rangle_t=\int_0^t\varphi(s)^2\,ds.$$
Indeed, the right-hand side of this equality defines an increasing process vanishing at $t=0$. Moreover, by Itô's formula,
$$M_t^2-\int_0^t\varphi(s)^2\,ds=2\int_0^tM_s\varphi(s)\,dB_s,$$
and the right-hand side of this identity defines a continuous martingale.

Given two continuous martingales $M$ and $N$ we define the cross variation by
$$\langle M,N\rangle_t=\frac12\big[\langle M+N\rangle_t-\langle M\rangle_t-\langle N\rangle_t\big], \qquad (8.11)$$
$t\in[0,T]$. From the properties of the quadratic variation (see Theorem 8.2) we see that the process $\{\langle M,N\rangle_t,t\in[0,T]\}$ is a process of bounded variation and it is the unique process (up to indistinguishability) such that $\{M_tN_t-\langle M,N\rangle_t,t\in[0,T]\}$ is a continuous martingale. It is also clear that
$$\lim_{n\to\infty}\sum_{k=1}^{p_n}\big(M_{t^n_k}-M_{t^n_{k-1}}\big)\big(N_{t^n_k}-N_{t^n_{k-1}}\big)=\langle M,N\rangle_t, \qquad (8.12)$$
uniformly in $t\in[0,T]$ in probability. This result together with Schwarz's inequality implies
$$|\langle M,N\rangle_t|\le\sqrt{\langle M\rangle_t\langle N\rangle_t},$$
and more generally, by setting $\langle M,N\rangle^t_s=\langle M,N\rangle_t-\langle M,N\rangle_s$ for $0\le s\le t\le T$,
$$|\langle M,N\rangle^t_s|\le\sqrt{\langle M\rangle^t_s\langle N\rangle^t_s}.$$
This inequality (a sort of Cauchy-Schwarz inequality) is a particular case of the result stated in the next proposition.

Proposition 8.4 Let $M$, $N$ be two continuous martingales and $H$, $K$ be two measurable processes. Then for any $t\ge0$,
$$\Big|\int_0^tH_sK_s\,d\langle M,N\rangle_s\Big|\le\Big(\int_0^tH_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^tK_s^2\,d\langle N\rangle_s\Big)^{\frac12}. \qquad (8.13)$$

Proof: Consider a finite partition of $[0,t]$, given by $0=t_0<t_1<\dots<t_r=t$, and assume first that the stochastic processes $H$, $K$ are step processes described by this partition, as follows:
$$H=H_0\mathbf{1}_{\{0\}}+H_1\mathbf{1}_{]0,t_1]}+\dots+H_r\mathbf{1}_{]t_{r-1},t_r]},$$
$$K=K_0\mathbf{1}_{\{0\}}+K_1\mathbf{1}_{]0,t_1]}+\dots+K_r\mathbf{1}_{]t_{r-1},t_r]},$$
with $H_i$, $K_i$, $i=0,\dots,r$, bounded measurable random variables. Then
$$\int_0^tH_sK_s\,d\langle M,N\rangle_s=\sum_{i=1}^rH_iK_i\,\langle M,N\rangle^{t_i}_{t_{i-1}}.$$
Thus,
$$\Big|\int_0^tH_sK_s\,d\langle M,N\rangle_s\Big|\le\sum_{i=1}^r|H_iK_i|\,\big|\langle M,N\rangle^{t_i}_{t_{i-1}}\big|\le\sum_{i=1}^r|H_iK_i|\big(\langle M\rangle^{t_i}_{t_{i-1}}\big)^{\frac12}\big(\langle N\rangle^{t_i}_{t_{i-1}}\big)^{\frac12}.$$
By applying Schwarz's inequality, the last expression is bounded by
$$\Big(\int_0^tH_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^tK_s^2\,d\langle N\rangle_s\Big)^{\frac12}.$$
The general case follows from an approximation argument.


The bounded variation process $\langle M,N\rangle$ gives rise to the total variation measure $d\|\langle M,N\rangle\|_s$ defined by
$$d\|\langle M,N\rangle\|_s=d(\langle M,N\rangle^+)(s)+d(\langle M,N\rangle^-)(s),$$
where $\langle M,N\rangle^+_s$, $\langle M,N\rangle^-_s$ denote the increasing functions such that
$$\langle M,N\rangle_s=\langle M,N\rangle^+_s-\langle M,N\rangle^-_s.$$
It is worth noticing that (8.13) can be extended to the following inequality.

Kunita-Watanabe inequality
$$\int_0^t|H_s|\,|K_s|\,d\|\langle M,N\rangle\|_s\le\Big(\int_0^tH_s^2\,d\langle M\rangle_s\Big)^{\frac12}\Big(\int_0^tK_s^2\,d\langle N\rangle_s\Big)^{\frac12}. \qquad (8.14)$$
More details can be found in [11].


9 Stochastic integrals with respect to continuous martingales

This chapter aims to give an outline of the main ideas of the extension of the Itô stochastic integral to integrators which are continuous martingales. We start by describing precisely the spaces involved in the construction of such a notion. Throughout the chapter, we consider a fixed probability space $(\Omega,\mathcal F,P)$ endowed with a filtration $(\mathcal F_t,t\ge0)$.

We denote by $\mathcal H^2$ the space of continuous martingales $M$, indexed by $[0,T]$, with $M_0=0$ a.s. and bounded in $L^2(\Omega)$. That is,
$$\sup_{t\in[0,T]}E(|M_t|^2)<\infty.$$
This is a Hilbert space endowed with the inner product
$$(M,N)_{\mathcal H^2}=E[\langle M,N\rangle_T].$$
A stochastic process $(X_t,t\ge0)$ is said to be progressively measurable if for any $t\ge0$, the mapping $(s,\omega)\mapsto X_s(\omega)$ defined on $[0,t]\times\Omega$ is measurable with respect to the $\sigma$-field $\mathcal B([0,t])\otimes\mathcal F_t$. For any $M\in\mathcal H^2$ we define $L^2(M)$ as the set of progressively measurable processes $H$ such that
$$E\Big[\int_0^\infty H_s^2\,d\langle M\rangle_s\Big]<\infty.$$
Notice that this is an $L^2$ space of measurable mappings defined on $\mathbb{R}_+\times\Omega$ with respect to the measure $dP\,d\langle M\rangle$. Hence it is also a Hilbert space, the natural inner product being
$$(H,K)_{L^2(M)}=E\Big[\int_0^\infty H_sK_s\,d\langle M\rangle_s\Big].$$
The standard Brownian motion belongs to the space $\mathcal H^2$, and the space $L^2(M)$ will play the same role as $L^2_{a,T}$ in the Itô theory of stochastic integration with respect to Brownian motion. Let $\mathcal E$ be the linear subspace of $L^2(M)$ consisting of processes of the form
$$H_s(\omega)=\sum_{i=0}^pH_i(\omega)\mathbf{1}_{]t_i,t_{i+1}]}(s), \qquad (9.1)$$
where $0=t_0<t_1<\dots<t_{p+1}$, and for each $i$, $H_i$ is an $\mathcal F_{t_i}$-measurable, bounded random variable. Stochastic processes belonging to $\mathcal E$ are termed elementary. They are related with $L^2(M)$ as follows.


Proposition 9.1 Fix $M\in\mathcal H^2$. The set $\mathcal E$ is dense in $L^2(M)$.

Proof: We will prove that if $K\in L^2(M)$ is orthogonal to $\mathcal E$ then $K=0$. For this, we fix $0\le s<t\le T$ and consider the process
$$H=F\mathbf{1}_{]s,t]},$$
with $F$ an $\mathcal F_s$-measurable and bounded random variable. Saying that $K$ is orthogonal to $H$ in $L^2(M)$ can be written as
$$E\Big[\int_0^TH_uK_u\,d\langle M\rangle_u\Big]=E\Big[F\int_s^tK_u\,d\langle M\rangle_u\Big]=0.$$
Consider the stochastic process
$$X_t=\int_0^tK_u\,d\langle M\rangle_u.$$
Notice that $X_t\in L^1(\Omega)$. In fact,
$$E|X_t|\le E\int_0^t|K_u|\,d\langle M\rangle_u\le\Big(E\int_0^t|K_u|^2\,d\langle M\rangle_u\Big)^{\frac12}\big(E\langle M\rangle_t\big)^{\frac12}.$$
We have thus proved that $E(F(X_t-X_s))=0$, for any $0\le s<t$ and any $\mathcal F_s$-measurable and bounded random variable $F$. This shows that the process $(X_t,t\ge0)$ is a martingale. At the same time, $(X_t,t\ge0)$ is also a process of bounded variation. Hence
$$\int_0^tK_u\,d\langle M\rangle_u=0,\quad\forall t\ge0,$$
which implies that $K=0$ in $L^2(M)$.

Stochastic integral of processes in E

Proposition 9.2 Let $M\in\mathcal H^2$, $H\in\mathcal E$ as in (9.1). Define
$$(H\cdot M)_t=\sum_{i=0}^pH_i\big(M_{t_{i+1}\wedge t}-M_{t_i\wedge t}\big).$$
Then

(i) $H\cdot M\in\mathcal H^2$.

(ii) The mapping $H\mapsto H\cdot M$ extends to an isometry from $L^2(M)$ to $\mathcal H^2$.

The stochastic process $\{(H\cdot M)_t,t\ge0\}$ is called the stochastic integral of the process $H$ with respect to $M$ and is also denoted by $\int_0^tH_s\,dM_s$.

Proof of (i): The martingale property follows from the measurability properties of $H$ and the martingale property of $M$. Moreover, since $H$ is bounded, $H\cdot M$ is bounded in $L^2(\Omega)$.

Proof of (ii): We prove first that the mapping
$$H\in\mathcal E\mapsto H\cdot M$$
is an isometry from $\mathcal E$ to $\mathcal H^2$. Clearly $H\mapsto H\cdot M$ is linear. Moreover, $H\cdot M$ is a finite sum of terms of the form
$$M^i_t=H_i\big(M_{t_{i+1}\wedge t}-M_{t_i\wedge t}\big),$$
each one being a martingale and orthogonal to each other. It is easy to check that
$$\langle M^i\rangle_t=H_i^2\big(\langle M\rangle_{t_{i+1}\wedge t}-\langle M\rangle_{t_i\wedge t}\big).$$
Hence,
$$\langle H\cdot M\rangle_t=\sum_{i=0}^pH_i^2\big(\langle M\rangle_{t_{i+1}\wedge t}-\langle M\rangle_{t_i\wedge t}\big).$$
Consequently,
$$E\langle H\cdot M\rangle_T=\|H\cdot M\|^2_{\mathcal H^2}=E\Big[\sum_{i=0}^pH_i^2\big(\langle M\rangle_{t_{i+1}\wedge T}-\langle M\rangle_{t_i\wedge T}\big)\Big]=E\Big[\int_0^TH_s^2\,d\langle M\rangle_s\Big]=\|H\|^2_{L^2(M)}.$$
Since $\mathcal E$ is dense in $L^2(M)$, this isometry extends to a unique isometry from $L^2(M)$ into $\mathcal H^2$. The extension is termed the stochastic integral of the process $H$ with respect to $M$.


10 Appendix 1: Conditional expectation

Roughly speaking, the conditional expectation of a random variable is its mean value with respect to a modified probability, after having incorporated some a priori information. The simplest case corresponds to conditioning with respect to an event $B\in\mathcal F$. In this case, the conditional expectation is the mathematical expectation computed on the modified probability space $(\Omega,\mathcal F,P(\cdot/B))$. However, in general, additional information cannot be described so easily. Assuming that we know about some events $B_1,\dots,B_n$, we also know about those that can be derived from them, like unions, intersections and complements. This explains the choice of a $\sigma$-field to keep the known information and to deal with it. In the sequel, we denote by $\mathcal G$ an arbitrary $\sigma$-field included in $\mathcal F$ and by $X$ a random variable with finite expectation ($X\in L^1(\Omega)$). Our final aim is to give a definition of the conditional expectation of $X$ given $\mathcal G$. However, in order to motivate this notion, we shall start with simpler situations.

Conditional expectation given an event

Let $B\in\mathcal F$ be such that $P(B)\neq0$. The conditional expectation of $X$ given $B$ is the real number defined by the formula
$$E(X/B)=\frac{1}{P(B)}E(\mathbf{1}_BX). \qquad (10.1)$$

It immediately follows that

• $E(X/\Omega)=E(X)$,

• $E(\mathbf{1}_A/B)=P(A/B)$.

With the definition (10.1), the conditional expectation coincides with the expectation with respect to the conditional probability $P(\cdot/B)$. We check this fact with a discrete random variable $X=\sum_{i=1}^\infty a_i\mathbf{1}_{A_i}$. Indeed,
$$E(X/B)=\frac{1}{P(B)}E\Big(\sum_{i=1}^\infty a_i\mathbf{1}_{A_i\cap B}\Big)=\sum_{i=1}^\infty a_i\frac{P(A_i\cap B)}{P(B)}=\sum_{i=1}^\infty a_iP(A_i/B).$$


Conditional expectation given a discrete random variable

Let $Y=\sum_{i=1}^\infty y_i\mathbf{1}_{A_i}$, $A_i=\{Y=y_i\}$. The conditional expectation of $X$ given $Y$ is the random variable defined by
$$E(X/Y)=\sum_{i=1}^\infty E(X/Y=y_i)\,\mathbf{1}_{A_i}. \qquad (10.2)$$

Notice that knowing $Y$ means knowing all the events that can be described in terms of $Y$. Since $Y$ is discrete, they can be described in terms of the basic events $\{Y=y_i\}$. This may explain the formula (10.2). The following properties hold:

(a) $E(E(X/Y))=E(X)$;

(b) if the random variables $X$ and $Y$ are independent, then $E(X/Y)=E(X)$.

For the proof of (a) we notice that, since $E(X/Y)$ is a discrete random variable,
$$E(E(X/Y))=\sum_{i=1}^\infty E(X/Y=y_i)P(Y=y_i)=E\Big(X\sum_{i=1}^\infty\mathbf{1}_{\{Y=y_i\}}\Big)=E(X).$$

Let us now prove (b). The independence of $X$ and $Y$ yields
$$E(X/Y)=\sum_{i=1}^\infty\frac{E(X\mathbf{1}_{\{Y=y_i\}})}{P(Y=y_i)}\mathbf{1}_{A_i}=\sum_{i=1}^\infty E(X)\mathbf{1}_{A_i}=E(X).$$

In the sequel, we shall denote by $\sigma(Y)$ the $\sigma$-field generated by a random variable $Y$. It consists of sets of the form $\{\omega:Y(\omega)\in B\}$, where $B$ is a Borel set of $\mathbb{R}$. The next proposition states two properties of the conditional expectation that motivate Definition 10.1.

Proposition 10.1 1. The random variable $Z:=E(X/Y)$ is $\sigma(Y)$-measurable; that is, for any Borel set $B\in\mathcal B$, $Z^{-1}(B)\in\sigma(Y)$;

2. for any $A\in\sigma(Y)$, $E(\mathbf{1}_AE(X/Y))=E(\mathbf{1}_AX)$.


Proof: Set $c_i=E(X/Y=y_i)$ and let $B\in\mathcal B$. Then
$$Z^{-1}(B)=\bigcup_{i:c_i\in B}\{\omega:Z(\omega)=c_i\}=\bigcup_{i:c_i\in B}\{\omega:Y(\omega)=y_i\}\in\sigma(Y),$$
proving the first property. To prove the second one, it suffices to take $A=\{Y=y_k\}$. In this case
$$E\big(\mathbf{1}_{\{Y=y_k\}}E(X/Y)\big)=E\big(\mathbf{1}_{\{Y=y_k\}}E(X/Y=y_k)\big)=E\Big(\mathbf{1}_{\{Y=y_k\}}\frac{E(X\mathbf{1}_{\{Y=y_k\}})}{P(Y=y_k)}\Big)=E(X\mathbf{1}_{\{Y=y_k\}}).$$

Conditional expectation given a σ-field

Definition 10.1 The conditional expectation of $X$ given $\mathcal G$ is a random variable $Z$ satisfying the properties

1. $Z$ is $\mathcal G$-measurable; that is, for any Borel set $B\in\mathcal B$, $Z^{-1}(B)\in\mathcal G$;

2. for any $G\in\mathcal G$, $E(Z\mathbf{1}_G)=E(X\mathbf{1}_G)$.

We will denote the conditional expectation $Z$ by $E(X/\mathcal G)$. Notice that the conditional expectation is not a number but a random variable. There is nothing strange in this, since conditioning depends on the observations. Condition (1) tells us that events that can be described by means of $E(X/\mathcal G)$ are in $\mathcal G$, whereas condition (2) tells us that on events in $\mathcal G$ the random variables $X$ and $E(X/\mathcal G)$ have the same mean value.

The existence of $E(X/\mathcal G)$ is not a trivial issue. You should trust mathematicians and believe that there is a theorem in measure theory (the Radon-Nikodym theorem) which ensures the existence and uniqueness of such a random variable (up to a set of probability zero). Before stating properties of the conditional expectation, we are going to explain how to compute it in two particular situations.

Example 10.1 Let $\mathcal G$ be the $\sigma$-field (actually, the field) generated by a finite partition $G_1,\dots,G_m$. Then
$$E(X/\mathcal G)=\sum_{j=1}^m\frac{E(X\mathbf{1}_{G_j})}{P(G_j)}\mathbf{1}_{G_j}. \qquad (10.3)$$


Formula (10.3) can be checked using Definition 10.1. It tells us that, on each generator of $\mathcal G$, the conditional expectation is constant; this constant is the mean of $X$ over the generator, that is, $E(X\mathbf{1}_{G_j})$ normalized by the mass $P(G_j)$ of the generator.
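Formula (10.3) is easy to test by simulation. The sketch below is an illustration only (the two-set partition by the sign of an auxiliary variable $U$ is an arbitrary choice): it builds $E(X/\mathcal G)$ by averaging $X$ on each generator and checks property (c), $E(E(X/\mathcal G))=E(X)$, on the sample.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=100_000)
U = X + rng.normal(size=X.size)   # auxiliary variable correlated with X
labels = U > 0                    # partition: G_1 = {U > 0}, G_2 = {U <= 0}
condE = np.empty_like(X)
for mask in (labels, ~labels):
    # E(X 1_{G_j}) / P(G_j): the constant value of E(X/G) on G_j
    condE[mask] = X[mask].mean()
lhs, rhs = float(condE.mean()), float(X.mean())  # property (c): these agree
```

As (10.3) says, `condE` takes exactly one value per generator, and its sample mean reproduces that of $X$.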

Example 10.2 Let $\mathcal G$ be the $\sigma$-field generated by random variables $Y_1,\dots,Y_m$, that is, the $\sigma$-field generated by events of the form $Y_1^{-1}(B_1),\dots,Y_m^{-1}(B_m)$, with $B_1,\dots,B_m$ arbitrary Borel sets. Assume in addition that the joint distribution of the random vector $(X,Y_1,\dots,Y_m)$ has a density $f$. Then
$$E(X/Y_1,\dots,Y_m)=\int_{-\infty}^\infty x\,f(x/Y_1,\dots,Y_m)\,dx, \qquad (10.4)$$
with
$$f(x/y_1,\dots,y_m)=\frac{f(x,y_1,\dots,y_m)}{\int_{-\infty}^\infty f(x,y_1,\dots,y_m)\,dx}. \qquad (10.5)$$
In (10.5), we recognize the conditional density of $X$ given $Y_1=y_1,\dots,Y_m=y_m$. Hence, in (10.4) we first compute the conditional expectation $E(X/Y_1=y_1,\dots,Y_m=y_m)$ and finally replace the real values $y_1,\dots,y_m$ by the random variables $Y_1,\dots,Y_m$.

We now list some important properties of the conditional expectation.

(a) Linearity: for any random variables $X$, $Y$ and real numbers $a$, $b$,
$$E(aX+bY/\mathcal G)=aE(X/\mathcal G)+bE(Y/\mathcal G).$$

(b) Monotony: if $X\le Y$ then $E(X/\mathcal G)\le E(Y/\mathcal G)$.

(c) The mean value of a random variable is the same as that of its conditional expectation: $E(E(X/\mathcal G))=E(X)$.

(d) If $X$ is a $\mathcal G$-measurable random variable, then $E(X/\mathcal G)=X$.

(e) Let $X$ be independent of $\mathcal G$, meaning that any set of the form $X^{-1}(B)$, $B\in\mathcal B$, is independent of $\mathcal G$. Then $E(X/\mathcal G)=E(X)$.

(f) Factorization: If Y is a bounded, G-measurable random variable,

E(Y X/G) = Y E(X/G).

(g) If Gi, i = 1, 2 are σ-fields with G1 ⊂ G2,

E(E(X/G1)/G2) = E(E(X/G2)/G1) = E(X/G1).


(h) Assume that $X$ is a random variable independent of $\mathcal G$ and $Z$ another $\mathcal G$-measurable random variable. For any measurable function $h(x,z)$ such that the random variable $h(X,Z)$ is in $L^1(\Omega)$,
$$E(h(X,Z)/\mathcal G)=E(h(X,z))\big|_{z=Z}.$$

We give some proofs. Property (a) follows from the definition of the conditional expectation and the linearity of the operator $E$. Indeed, the candidate $aE(X/\mathcal G)+bE(Y/\mathcal G)$ is $\mathcal G$-measurable. By property 2 of the conditional expectation and the linearity of $E$,
$$E\big(\mathbf{1}_G[aE(X/\mathcal G)+bE(Y/\mathcal G)]\big)=aE(\mathbf{1}_GX)+bE(\mathbf{1}_GY)=E\big(\mathbf{1}_G[aX+bY]\big).$$

Property (b) is a consequence of the monotonicity property of the operatorE and a result in measure theory telling that, for G-measurable randomvariables Z1 and Z2, satisfying

E(Z111G) ≤ E(Z211G),

for any G ∈ G, we have Z1 ≤ Z2. Indeed, under the standing assumptions,for any G ∈ G,

E(1GX) ≤ E(1GY ).

Then, property 2 of the conditional expectation yields

E(1GE(X/G)) = E(1GX) ≤ E(1GY ) = E(1GE(Y/G)).

By applying the above mentioned property to Z1 = E(X/G), Z2 = E(Y/G),we obtain the result.Taking G = Ω in condition (2) above, we prove (c). Property (d) is obvious.Constant random variables are measurable with respect to any σ-field. Ther-fore E(X) is G-measurable. Assuming that X is independent of G, yields

E(X1G) = E(X)E(1G) = E(E(X)1G).

This proves (e).

For the proof of (f), we first consider the case Y = 1G0, with G0 ∈ G. Claiming (f) means proposing 1G0E(X/G) as candidate for E(Y X/G). Clearly 1G0E(X/G) is G-measurable. Moreover, for any G ∈ G,

E(1G1G0E(X/G)) = E(1G∩G0E(X/G)) = E(1G∩G0X).


The validity of the property extends by linearity to simple random variables; then, by monotone convergence, to positive random variables; and finally, to random variables in L1(Ω), by the usual decomposition X = X+ − X−.

For the proof of (g), we notice that, since E(X/G1) is G1-measurable, it is G2-measurable as well. Then, by the very definition of the conditional expectation,

E(E(X/G1)/G2) = E(X/G1).

Next, we prove that

E(X/G1) = E(E(X/G2)/G1).

For this, we fix G ∈ G1 and apply the definition of the conditional expectation. This yields

E(1GE(E(X/G2)/G1)) = E(1GE(X/G2)) = E(1GX).
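On a finite space the tower property (g) can also be checked directly: with G1 generated by a coarser partition than G2, conditioning successively reduces to conditioning on the smaller σ-field. A minimal sketch (our illustration, assuming the uniform probability; the partitions and values are hypothetical):

```python
# Our finite-space check of the tower property (g), under the uniform
# probability. G1 is generated by a coarser partition than G2, so G1 ⊂ G2.
import numpy as np

def cond_exp(X, blocks):
    """E(X/G) under the uniform probability, G generated by `blocks`."""
    E = np.empty_like(X, dtype=float)
    for B in blocks:
        E[B] = X[B].mean()     # block average of X
    return E

X = np.array([1., 5., 2., 2., 7., 1., 0., 6.])
G2 = [[0, 1], [2, 3], [4, 5], [6, 7]]      # finer partition
G1 = [[0, 1, 2, 3], [4, 5, 6, 7]]          # coarser partition: G1 ⊂ G2

lhs = cond_exp(cond_exp(X, G2), G1)        # E(E(X/G2)/G1)
rhs = cond_exp(X, G1)                      # E(X/G1)
assert np.allclose(lhs, rhs)               # property (g)
```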

Property (h) is very intuitive: since X is independent of G, it does not enter the game of conditioning, while the measurability of Z means that, under conditioning, it can be treated as a constant.

Let us give a proof of this property in the particular case h(x, z) = 1A(x)1B(z), where A, B are Borel sets. In this case we have

E(h(X,Z)/G) = E(1A(X)1B(Z)/G) = 1B(Z)E(1A(X)/G) = 1B(Z)E(1A(X)).

Moreover,

E(h(X, z)) = E(1A(X)1B(z)) = 1B(z)E(1A(X)).

Therefore

E(h(X, z))|z=Z = 1B(Z)E(1A(X)).
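Property (h) can also be verified numerically on a finite product space. In the sketch below (our illustration; the laws of X and Z and the function h are hypothetical), X is independent of G = σ(Z), and E(h(X,Z)/G) computed from the joint law on each atom {Z = z} agrees with g(z) = E(h(X, z)) evaluated at z = Z:

```python
# A finite-space sketch of property (h). X is uniform on {-1, 1} and
# independent of Z, uniform on {0, 1, 2}; G is the sigma-field generated
# by Z, whose atoms are the events {Z = z}.
import math

xs = [-1, 1]                    # values of X, uniform, independent of Z
zs = [0, 1, 2]                  # values of Z, uniform
h = lambda x, z: x * z + z ** 2

# Joint law: uniform on the product space, by independence.
joint = [(x, z) for x in xs for z in zs]
p = 1.0 / len(joint)

def cond_on_atom(z):
    """E(h(X,Z)/G) on the atom {Z = z}, computed from the joint law."""
    atom = [(x, z2) for (x, z2) in joint if z2 == z]
    return sum(p * h(x, z2) for (x, z2) in atom) / (p * len(atom))

g = lambda z: sum(h(x, z) for x in xs) / len(xs)   # g(z) = E(h(X,z))

for z in zs:
    # the two computations agree on every atom {Z = z}
    assert math.isclose(cond_on_atom(z), g(z))
```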

11 Appendix 2: Stopping times

Throughout this section, we consider a fixed filtration (Ft, t ≥ 0) (see Section 2.4).

Definition 11.1 A mapping T : Ω → [0,∞] is termed a stopping time with respect to the filtration (Ft, t ≥ 0) if, for any t ≥ 0,

{T ≤ t} ∈ Ft.


It is easy to see that if S and T are stopping times with respect to the same filtration, then T ∧ S, T ∨ S and T + S are also stopping times.
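As an illustration (not part of the notes), the first time a process reaches a level is the basic example of a stopping time: whether {T ≤ n} has occurred is decided by the path up to time n alone. The sketch below, for a simple ±1 random walk and hypothetical levels a = 5 < b = 10, also exhibits T ∧ S concretely: since the walk moves by unit steps, it must cross the lower level first, so Ta ∧ Tb = Ta.

```python
# Our illustration: hitting times of a simple ±1 random walk. Whether
# {T <= n} has occurred is decided by path[0..n], which is what makes T
# a stopping time. The levels 5 and 10 are arbitrary choices.
import random

def hitting_time(path, level):
    """First n with path[n] >= level; len(path) if the level is never hit.
    Only path[0..n] is needed to decide {T <= n}: a stopping time."""
    for n, x in enumerate(path):
        if x >= level:
            return n
    return len(path)

random.seed(0)
path = [0]
for _ in range(10_000):
    path.append(path[-1] + random.choice((-1, 1)))

Ta = hitting_time(path, 5)    # hitting time of level a = 5
Tb = hitting_time(path, 10)   # hitting time of level b = 10
assert Ta <= Tb               # unit steps: the walk reaches 5 no later than 10
assert min(Ta, Tb) == Ta      # so Ta ∧ Tb = Ta here
```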

Definition 11.2 For a given stopping time T , the σ-field of events prior toT is the following

FT = {A ∈ F : A ∩ {T ≤ t} ∈ Ft, for all t ≥ 0}.

Let us prove that FT is actually a σ-field. By the definition of stopping time, Ω ∈ FT . Assume that A ∈ FT . Then

Ac ∩ {T ≤ t} = {T ≤ t} ∩ (A ∩ {T ≤ t})c ∈ Ft.

Hence, with any A, FT also contains Ac. Let now (An, n ≥ 1) ⊂ FT . We clearly have

(∪∞n=1An) ∩ {T ≤ t} = ∪∞n=1 (An ∩ {T ≤ t}) ∈ Ft.

This completes the proof.

Some properties related to stopping times

1. Any stopping time T is FT -measurable. Indeed, let s ≥ 0; then, for any t ≥ 0,

{T ≤ s} ∩ {T ≤ t} = {T ≤ s ∧ t} ∈ Fs∧t ⊂ Ft.

2. If (Xt, t ≥ 0) is a process with a.s. continuous sample paths which is (Ft)-adapted, then XT is FT -measurable. Indeed, the continuity implies

XT = lim_{n→∞} ∑_{i=0}^∞ X_{i2^{-n}} 1_{{i2^{-n} < T ≤ (i+1)2^{-n}}}.

Let us now check that, for any s ≥ 0, the random variable Xs1{s<T} is FT -measurable. This fact, along with the property {T ≤ (i + 1)2^{-n}} ∈ FT , shows the result.

Let A ∈ B(R) and t ≥ 0. The set

{Xs ∈ A} ∩ {s < T} ∩ {T ≤ t}

is empty if s ≥ t. Otherwise it is equal to {Xs ∈ A} ∩ {s < T ≤ t}, which belongs to Ft.
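The dyadic approximation above can be seen numerically: for each n, the sum selects the value of X at the left dyadic point i2^{-n} just below T, and continuity of t → Xt forces convergence to XT. A sketch (our example; the path X and value T are hypothetical):

```python
# A numerical sketch of the dyadic approximation used above: the n-th sum
# reduces to X evaluated at the unique left dyadic point below T, and the
# approximation error shrinks as n grows, by continuity of the path.
import math

def dyadic_approx(X, T, n):
    """Value of the n-th dyadic sum for a deterministic continuous path X."""
    i = math.ceil(T * 2 ** n) - 1   # the unique i with i2^-n < T <= (i+1)2^-n
    return X(i / 2 ** n)

X = math.sin                        # a continuous "sample path"
T = 0.7                             # a value taken by the stopping time
errors = [abs(dyadic_approx(X, T, n) - X(T)) for n in (2, 6, 10, 14)]
assert errors == sorted(errors, reverse=True)   # the error shrinks with n
assert errors[-1] < 1e-3
```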


References

[1] P. Baldi: Equazioni differenziali stocastiche e applicazioni. Quaderni dell'Unione Matematica Italiana 28. Pitagora Editrice, Bologna, 2000.

[2] K.L. Chung, R.J. Williams: Introduction to Stochastic Integration, 2nd Edition. Probability and Its Applications. Birkhauser, 1990.

[3] R. Durrett: Stochastic Calculus, a practical introduction. CRC Press, 1996.

[4] L.C. Evans: An Introduction to Stochastic Differential Equations. http://math.berkeley.edu/~evans/SDE.course.pdf

[5] I. Karatzas, S. Shreve: Brownian Motion and Stochastic Calculus. Springer Verlag, 1991.

[6] H.H. Kuo: Introduction to Stochastic Integration. Universitext. Springer, 2006.

[7] S. Lang: Real Analysis. Addison-Wesley, 1973.

[8] G. Lawler: Introduction to Stochastic Processes. Chapman and Hall/CRC, 2nd Edition, 2006.

[9] J.-F. Le Gall: Mouvement Brownien et calcul stochastique. Notes de Cours de DEA 1996-1997. http://www.dma.ens.fr/~legall/

[10] B. Øksendal: Stochastic Differential Equations: An Introduction with Applications. Springer Verlag, 1998.

[11] D. Revuz, M. Yor: Continuous martingales and Brownian motion.Springer Verlag, 1999.

[12] P. Todorovic: An Introduction to Stochastic Processes and Their Applications. Springer Series in Statistics. Springer Verlag, 1992.

[13] R.L. Schilling, L. Partzsch: Brownian Motion: An Introduction to Stochastic Processes. 2nd Edition. De Gruyter, 2014.
