
On the Convergence from Discrete to Continuous Time in an

Optimal Stopping Problem∗

Paul Dupuis and Hui Wang

Lefschetz Center for Dynamical Systems
Brown University
Providence, R.I. 02912
USA

Abstract

We consider the problem of optimal stopping for a one-dimensional diffusion process. Two classes of admissible stopping times are considered. The first class consists of all non-anticipating stopping times that take values in $[0, \infty]$, while the second class further restricts the set of allowed values to the discrete grid $\{nh : n = 0, 1, 2, \cdots, \infty\}$ for some parameter $h > 0$. The value functions for the two problems are denoted by $V(x)$ and $V^h(x)$, respectively. We identify the rate of convergence of $V^h(x)$ to $V(x)$ and the rate of convergence of the stopping regions, and provide simple formulas for the rate coefficients.

Keywords. Optimal stopping, continuous time, discrete time, diffusion process, rate of convergence, local time.

1 Introduction

One of the classical formulations of stochastic optimal control is that of optimal stopping. In optimal stopping, the only decision to be made is when to stop the process. When the process is stopped, a benefit is received (or a cost is paid), and the objective is to maximize the expected benefit (or minimize the expected cost). Although the problem formulation is very simple, this optimization problem has many practical applications. Examples include pricing problems in investment theory, the valuation of American options, the development of natural resources, etc.; see, e.g., [1, 2, 4, 5, 6, 9, 10, 15, 16, 17, 18].

The formulation of the optimal stopping problem requires the specification of the class of allowed stopping times. Typically, one assumes these to be non-anticipative in an appropriate sense, so that the control does not have knowledge of the future. Another important restriction is with regard to the actual time values at which one can stop, and here there are two important cases: continuous time and discrete time. In the first case, the stopping time is allowed to take values in the interval $[0, \infty]$, with $\infty$ corresponding to the decision to never stop. In the second case there is a fixed discrete set of times $D \subset [0, \infty]$, and the stopping time must be selected from this set. Typically, this discrete set is a regular grid, e.g., $D_h \doteq \{nh : n \in \mathbb{N}_0\}$, where $h > 0$ is the grid spacing.

∗Research supported in part by the National Science Foundation (NSF-DMS-0072004) and the Army Research Office (ARO-DAAD19-99-1-0223).

In the present paper we focus exclusively on the one-dimensional case. Although the statement of precise assumptions is deferred to Section 2.1, a rough description of the continuous and discrete time problems we consider is as follows.

Continuous time optimal stopping. We use the stochastic process model

$$\frac{dS_t}{S_t} = b(S_t)\,dt + \sigma(S_t)\,dW_t,$$

where $b$ and $\sigma$ are bounded continuous functions from $\mathbb{R}$ to $\mathbb{R}$. Although the results can be extended to cover other diffusion models as well, we focus on this model because of its wide use in optimal stopping problems that occur in economics and finance. We consider a payoff defined in terms of a convex nondecreasing function $\phi : \mathbb{R} \to [0, \infty)$. The payoff from stopping at time $t$ is $\phi(S_t)$, and the decision maker wants to maximize the expected present value by judiciously choosing a stopping time. This is modeled by the optimal stopping problem with value function

$$V(x) = \sup_{\tau \in \mathcal{S}} \mathbb{E}\left[ e^{-r\tau}\phi(S_\tau) \mid S_0 = x \right],$$

where $r$ is the discount rate and $\mathcal{S}$ is the set of all admissible stopping times, which are allowed to take values in $[0, \infty]$. The dynamic programming equation for this problem is as follows. Let

$$\mathcal{L}V(x) = \frac{1}{2}\sigma^2(x)x^2 V''(x) + b(x)x V'(x).$$

Then

$$\max\left[ \phi(x) - V(x),\; \mathcal{L}V(x) - rV(x) \right] = 0. \tag{1.1}$$

In the case where $\phi$ is convex and nondecreasing, it is often optimal to stop when the process $S_t$ first exceeds some fixed threshold $x_*$. In this case, the value function $V(x)$ equals $\phi(x)$ for $x \ge x_*$, and it satisfies the ordinary differential equation $-rV(x) + \mathcal{L}V(x) = 0$ for $x < x_*$. For the case where $\sigma$ and $b$ are constants, $V(x)$ takes the form $Ax^\beta$ for $x < x_*$. Here $\beta$ is the positive root of a quadratic equation, and $(A, x_*)$ are constants that can be computed explicitly using the principle of smooth fit, i.e., the value function is $C^1$ across the optimal exercise boundary $x_*$.
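For the constant-coefficient case this smooth-fit computation can be carried out in closed form (it reappears as the example in Section 2.2). The sketch below is an illustration only: the parameter values are arbitrary, and the formulas assume $b$, $\sigma$ constant, $\phi(x) = (x-k)^+$, and $r > b$ so that the problem is well posed.

```python
import math

def smooth_fit(b, sigma, r, k):
    """Smooth-fit constants for dS/S = b dt + sigma dW, payoff (x - k)^+.

    Solves (1/2) sigma^2 a(a - 1) + b a - r = 0 for its positive root a > 1,
    then matches the value and first derivative of B x^a with x - k at x*.
    """
    assert r > b, "value function is infinite when r <= b"
    q = 0.5 - b / sigma**2
    alpha = q + math.sqrt(q**2 + 2.0 * r / sigma**2)   # positive root, alpha > 1
    x_star = alpha * k / (alpha - 1.0)                  # optimal exercise boundary
    B = (x_star - k) / x_star**alpha                    # V(x) = B x^alpha for x < x*
    return alpha, x_star, B

alpha, x_star, B = smooth_fit(b=0.02, sigma=0.3, r=0.05, k=1.0)
# both smooth-fit residuals should be numerically zero
print(x_star, B * x_star**alpha - (x_star - 1.0), B * alpha * x_star**(alpha - 1) - 1.0)
```

By construction the value-matching residual is exactly zero and the slope residual is zero up to floating-point error, which is the $C^1$ condition at $x_*$.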

Discrete time optimal stopping. In this case the process model is the same as before, but the set of possible stopping times is restricted to those that take values in the time grid $D_h \doteq \{nh : n \in \mathbb{N}_0\}$. The optimal strategy is often similar to the continuous time case: stop the first time $S_{nh}$ exceeds some fixed threshold $x_*^h$. Let $V^h(x)$ denote the value function. The pair $(V^h(x), x_*^h)$ satisfies the dynamic programming equation [21]

$$V^h(x) = \begin{cases} \phi(x), & x \in [x_*^h, \infty) \\ e^{-rh}\,\mathbb{E}\left[ V^h(S_h) \mid S_0 = x \right], & x \in (0, x_*^h). \end{cases}$$

Closed-form solutions to this dynamic programming equation are not usually available.
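Although no closed form is available, any fixed threshold policy on the grid can still be evaluated by Monte Carlo, which gives a lower bound on $V^h(x)$. The sketch below is an illustration, not the paper's method: it assumes constant coefficients (so the grid increments of $S$ are exactly lognormal), payoff $(x-k)^+$, and an arbitrary threshold; paths that do not reach the threshold within the time horizon are truncated to zero payoff, a small downward bias.

```python
import math, random

def policy_value(x0, threshold, b, sigma, r, k, h,
                 n_paths=4000, n_max=2000, seed=1):
    """Monte Carlo value of the grid policy 'stop when S_{nh} >= threshold'
    for dS/S = b dt + sigma dW with payoff (x - k)^+ discounted at rate r."""
    rng = random.Random(seed)
    drift = (b - 0.5 * sigma**2) * h   # exact lognormal step parameters
    vol = sigma * math.sqrt(h)
    total = 0.0
    for _ in range(n_paths):
        s = x0
        for n in range(n_max + 1):
            if s >= threshold:
                total += math.exp(-r * n * h) * max(s - k, 0.0)
                break
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
        # paths that never reach the threshold within n_max steps contribute 0
    return total / n_paths

est = policy_value(x0=2.0, threshold=2.5, b=0.02, sigma=0.3, r=0.05, k=1.0, h=0.01)
print(est)
```

For comparison, the continuous-time value of the same threshold rule is $(y-k)(x_0/y)^{-\alpha}$ with $y$ the threshold, so the estimate should sit near that number, slightly perturbed by the grid overshoot and the truncation.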

The aim of the present paper is to examine the connection between these two optimal stopping problems as $h \to 0$. There are two questions of main interest:

• What is the convergence rate of the optimal exercise boundary $x_*^h$ to $x_*$, and what is the rate coefficient?

• What is the convergence rate of the value function $V^h(x)$ to $V(x)$, and what is the rate coefficient?

As we will see in Section 2, the optimal exercise boundaries converge with rate $\sqrt{h}$, while the value functions converge with rate $h$. In both cases there is a well defined rate coefficient. The coefficient in the case of the exercise boundary is defined in terms of the expected value of a functional of the local time of Brownian motion, while the coefficient for the value function involves both local time and excursions of Brownian motion. For problems where the continuous time problem can be more or less solved explicitly (e.g., the one-dimensional problems considered in the present work), these results allow one to explicitly compute accurate approximations for the discrete time problem. For problems where the continuous time problem does not have an explicit solution (e.g., multidimensional problems), the analogous information could possibly be used to improve the quality of approximation obtained using numerical approximations.

Few existing results are concerned with the rate of convergence of approximations for this class of problems. Lamberton [14] considers the binomial tree approximation for pricing American options and obtains upper and lower bounds (though not a rate of convergence) for the value function. In his approximation both the time and state variables are discretized. References to a few papers giving qualitatively similar results also appear in [14].

The outline of the paper is as follows. In Section 2 we introduce notation and define the basic optimization problems. We state the main result, give an illustrative example, and then lay out the main steps in the proof of the approximation theorem. The proofs of two key approximations which are intimately connected with the local time and excursions of Brownian motion are given in Section 3. The paper concludes with an Appendix in which a result on a conditional distribution of the exit time is proved.

2 The Approximation Theorem

2.1 Notation, assumptions and background

Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P}; \mathbb{F})$ with filtration $\mathbb{F} = (\mathcal{F}_t)$ satisfying the usual conditions: right-continuity and completion by $\mathbb{P}$-negligible sets. The state process $S = (S_t, \mathcal{F}_t)$ is modeled by

$$\frac{dS_t}{S_t} = b(S_t)\,dt + \sigma(S_t)\,dW_t, \qquad S_0 \equiv x. \tag{2.1}$$

Here $W = (W_t, \mathcal{F}_t)$ is a standard $\mathbb{F}$-Brownian motion.

Define the value function

$$V(x) = \sup_{\tau \in \mathcal{S}} \mathbb{E}\left[ e^{-r\tau}\phi(S_\tau) \mid S_0 = x \right],$$

where the supremum is over all stopping times with respect to the filtration $\mathbb{F}$. Define

$$V^h(x) = \sup_{\tau \in \mathcal{S}^h} \mathbb{E}\left[ e^{-r\tau}\phi(S_\tau) \mid S_0 = x \right],$$

where $\mathcal{S}^h$ is the set of all stopping times that take values in $D_h$.

The following assumptions will be used throughout the paper.

Condition 2.1. 1. The coefficients $b : \mathbb{R} \to \mathbb{R}$ and $\sigma : \mathbb{R} \to \mathbb{R}$ are bounded and continuous, with $\inf_{x \in \mathbb{R}} \sigma(x) > 0$. Furthermore, $xb(x)$ and $x\sigma(x)$ are Lipschitz continuous.

2. $\phi : \mathbb{R} \to [0, \infty)$ is non-decreasing, and both $\phi$ and its derivative $\phi'$ are of polynomial growth. Furthermore,

$$\sup_{t \ge 0}\, e^{-rt}\phi(S_t) \in \mathbb{L}^1, \qquad \lim_{t \to \infty} e^{-rt}\phi(S_t) = 0 \quad \text{a.s.}$$

3. The continuation region for the continuous-time optimal stopping problem takes the form $\{x : V(x) > \phi(x)\} = (0, x_*)$.

4. The continuation region for the discrete-time optimal stopping problem takes the form $\{x : V^h(x) > \phi(x)\} = (0, x_*^h)$.

5. The payoff function $\phi$ is twice continuously differentiable in a neighborhood of $x_*$.

6. The smooth-fit principle holds, that is, the value function $V$ is $C^1$ across the optimal exercise boundary $x_*$.

As noted in the Introduction, $V$ satisfies the dynamic programming equation

$$\max\left[ \phi(x) - V(x),\; \mathcal{L}V(x) - rV(x) \right] = 0.$$

Note that usually $V$ is only once continuously differentiable across the optimal exercise boundary $x = x_*$. Since $\phi(x) = V(x)$ if $x \in [x_*, \infty)$ and $\phi(x) < V(x)$ if $x \in (0, x_*)$, it follows that $V''(x_*-) \ge \phi''(x_*)$, where the minus denotes the limit from the left. Define

$$A \doteq \frac{V''(x_*-) - \phi''(x_*)}{\phi(x_*)} \ge 0. \tag{2.2}$$

Although one can construct examples where $A = 0$, it is typically the case that $A > 0$. We will assume this condition below, and merely note that the rate of convergence of the optimal threshold does not depend on $A$ at all.

Remark 2.1. The change of variable $t = -\log x$ can be used to transform the ordinary differential equation (ODE) $\mathcal{L}f(x) - rf(x) = 0$ on $(0, \infty)$ into the ODE

$$\frac{1}{2}\sigma^2(e^{-t})W''(t) + \left[ \frac{1}{2}\sigma^2(e^{-t}) - b(e^{-t}) \right]W'(t) - rW(t) = 0$$

on $\mathbb{R}$. Since $\sigma(x) > 0$ for $x > 0$, the classical theory for solutions of ODEs [3] can be used to show that the general solution to $\mathcal{L}V(x) - rV(x) = 0$ can be written in the form $c_1 f_1(x) + c_2 f_2(x)$, where $f_1(x)$ is positive and bounded as $x \downarrow 0$ and $f_2(x)$ is unbounded as $x \downarrow 0$. Under Condition 2.1, the function $f_1$ is twice continuously differentiable on $(0, \infty)$. $V(x)$ is then equal to $c_1 f_1(x)$ for $x \in (0, x_*]$ and equal to $\phi(x)$ for $x \in [x_*, \infty)$, where $c_1$ and $x_*$ are determined by the principle of smooth fit, i.e.,

$$c_1 f_1(x_*) = \phi(x_*) \quad \text{and} \quad c_1 f_1'(x_*) = \phi'(x_*).$$

Remark 2.2. In the case that $S$ is a geometric Brownian motion with $b(x) \equiv b$ and $\sigma(x) \equiv \sigma$, and $\phi(x) = (x - k)^+$ for some constant $k$, Condition 2.1 holds when $r > b$. For $r \le b$, the value function for the optimal stopping problem is $+\infty$, and there is no optimal stopping time; see [6].

Remark 2.3. It is usually not a priori clear if parts 3 and 4 of Condition 2.1 hold for a general state process. Here we give a sufficient condition that is very easy to verify in the case $\phi(x) = (x - k)^+$. Suppose parts 1 and 2 of Condition 2.1 hold, and in addition that

$$r \ge \sup_{x \in (0, \infty)} \left\{ b(x) + xb'(x) \right\}.$$

We claim that parts 3 and 4 of Condition 2.1 hold. We will show that part 3 holds and omit the analogous proof for part 4. Define $V_T(x) \doteq \sup_{\tau \le T} \mathbb{E}_x\left[ e^{-r\tau}\phi(S_\tau) \right]$. Let $S^x$ stand for the state process starting from $S_0 \equiv x$. A small modification of the proof of Theorem 5.2 of [7] shows that the collection of random variables

$$\left\{ \sup_{0 \le t \le T} \left| \frac{S^x(t) - S^y(t)}{x - y} \right| : y \in (x - \delta, x + \delta) \right\}$$

is uniformly integrable for fixed $x$, small $\delta > 0$, and terminal time $T$. It follows immediately that $V_T(x)$ is a continuous function, since for any stopping time $\tau \le T$ we have

$$\mathbb{E}\left| e^{-r\tau}\phi(S^x(\tau)) - e^{-r\tau}\phi(S^y(\tau)) \right| \le \mathbb{E} \sup_{0 \le t \le T} |S^x(t) - S^y(t)| \le c\,|x - y|$$

for some constant $c$. Furthermore, the upper left Dini derivate of $V_T$ is always bounded by one, i.e.,

$$\limsup_{y \uparrow x} \frac{V_T(x) - V_T(y)}{x - y} \le 1.$$

To see this, let $\tau^*$ be the optimal stopping time when $S_0 \equiv x$. The existence of $\tau^*$ is guaranteed by the classical theory of the Snell envelope; see [11]. We have

$$V_T(y) \ge \mathbb{E}\left[ e^{-r\tau^*}\phi(S^y_{\tau^*}) \right],$$

which implies that

$$\frac{V_T(x) - V_T(y)}{x - y} \le \mathbb{E}\left[ \frac{e^{-r\tau^*}\phi(S^x_{\tau^*}) - e^{-r\tau^*}\phi(S^y_{\tau^*})}{x - y} \right].$$

However, if $x \ge y$, then $S^x(t) \ge S^y(t)$ for all $t$ by strong uniqueness. Since $\phi$ is non-decreasing and

$$\phi(x) - \phi(y) = (x - k)^+ - (y - k)^+ \le x - y \quad \forall\, x \ge y,$$

we have

$$\frac{V_T(x) - V_T(y)}{x - y} \le \mathbb{E}\left[ \frac{e^{-r\tau^*}\left( S^x_{\tau^*} - S^y_{\tau^*} \right)}{x - y} \right].$$

Using the uniform integrability, it follows that

$$\limsup_{y \uparrow x} \frac{V_T(x) - V_T(y)}{x - y} \le \mathbb{E}\left[ \lim_{y \uparrow x} \frac{e^{-r\tau^*}\left( S^x_{\tau^*} - S^y_{\tau^*} \right)}{x - y} \right] = \mathbb{E}\left[ e^{-r\tau^*} D^x(\tau^*) \right],$$

where $D^x(t) \doteq \frac{\partial}{\partial x} S^x(t)$ satisfies the SDE

$$\frac{dD^x(t)}{D^x(t)} = \left[ b(S^x_t) + S^x_t b'(S^x_t) \right] dt + \left[ \sigma(S^x_t) + S^x_t \sigma'(S^x_t) \right] dW_t, \qquad D^x(0) = 1.$$

See, e.g., [12, 19]. Since by assumption $r \ge \sup_{x \in (0, \infty)} [b(x) + xb'(x)]$, it is easy to check that

$$\limsup_{y \uparrow x} \frac{V_T(x) - V_T(y)}{x - y} \le \mathbb{E}\left[ e^{-r\tau^*} D^x(\tau^*) \right] \le 1.$$

It follows from a standard result in real analysis (see [20, Proposition 5.1.2]) that

$$V_T(x) - V_T(y) \le x - y \quad \forall\, x \ge y.$$

This implies that

$$\{x : V_T(x) = \phi(x)\} = [x^*_T, \infty)$$

for some real number $x^*_T$. Indeed, if $V_T(y) = \phi(y) = (y - k)^+$, then since $V_T > 0$ we must have $y > k$. It follows that for all $x \ge y$,

$$V_T(x) \le V_T(y) + (x - y) = (y - k) + (x - y) = x - k = \phi(x).$$

But $V_T \ge \phi$ trivially, whence $V_T(x) = \phi(x)$ for all $x \ge y$.

It remains to show that part 3 of Condition 2.1 holds. It suffices to observe that for all $x$,

$$V_T(x) \uparrow V(x)$$

as $T \to \infty$. This completes the proof.

Remark 2.4. If $S$ is a geometric Brownian motion with $b(x) \equiv b$ and $\sigma(x) \equiv \sigma$, and $\phi(x) = (\sum_i A_i x^{\alpha_i} - k)^+$ for some positive constants $(A_i, \alpha_i)$ and $k \ge 0$, then one can show that $V(x) - \phi(x)$ is decreasing, which in turn implies that parts 3 and 4 of Condition 2.1 hold. A similar argument can be found in [9].

2.2 Rates of convergence: value function and stopping region

Let $u \in [0, 1)$, let $W$ be a Brownian motion with $W_u = 0$, and define $N \doteq \inf\{n \in \mathbb{N} : W_n \ge 0\}$. Note that $N < \infty$ w.p.1. Define

$$H(u) \doteq \mathbb{E} W_N^2 \quad \text{and} \quad M(u) \doteq \mathbb{E} W_N. \tag{2.3}$$

In terms of these functions we define the constants

$$C = \int_0^1 H(u)\,du \quad \text{and} \quad K = \int_0^1 M(u)\,du. \tag{2.4}$$

These quantities are shown to be finite in Lemma 3.2. Note that $H(u) > M^2(u)$, and therefore

$$C = \int_0^1 H(u)\,du > \int_0^1 M^2(u)\,du \ge \left( \int_0^1 M(u)\,du \right)^2 = K^2.$$
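The constants $C$ and $K$ have no closed form, but they are straightforward to estimate by simulating the Brownian motion on a fine time grid. The sketch below is a naive Monte Carlo estimate under assumptions of my own choosing (midpoint rule over $u$, Euler grid for $W$, arbitrary sample sizes); the time discretization biases $W_N$ slightly.

```python
import math, random

def sample_WN(u, rng, dt=0.01):
    """Simulate W with W_u = 0 on a grid of step dt and return W_N,
    where N = inf{n in {1, 2, ...} : W_n >= 0}."""
    t, w, n = u, 0.0, 1
    while True:
        w += math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if t >= n - 1e-12:        # reached the next integer time
            if w >= 0.0:
                return w
            n += 1

def estimate_C_K(n_u=20, n_paths=500, seed=0):
    rng = random.Random(seed)
    h_vals, m_vals = [], []
    for i in range(n_u):
        u = (i + 0.5) / n_u       # midpoint rule on [0, 1)
        ws = [sample_WN(u, rng) for _ in range(n_paths)]
        h_vals.append(sum(w * w for w in ws) / n_paths)   # H(u) = E[W_N^2]
        m_vals.append(sum(ws) / n_paths)                   # M(u) = E[W_N]
    return sum(h_vals) / n_u, sum(m_vals) / n_u            # C and K

C_hat, K_hat = estimate_C_K()
print(C_hat, K_hat)
```

Note that the sample estimates automatically satisfy $\hat C \ge \hat K^2$, mirroring the inequality $C > K^2$ above, since the empirical second moment dominates the square of the empirical mean.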

Our main result is the following.

Theorem 2.1. Assume Condition 2.1, and define the constants $A$, $C$, and $K$ by (2.2) and (2.4). Assume that $A > 0$. The following conclusions hold for all $x \in (0, x_*)$.

1. $$\frac{V^h(x) - V(x)}{V(x)} = -\frac{1}{2}A x_*^2 \sigma^2(x_*)(C - K^2)\,h + o(h),$$

2. $$x_*^h = x_* - K x_* \sigma(x_*)\sqrt{h} + o(\sqrt{h}).$$

Remark 2.5. Using an elementary argument by contradiction, one can show that the asymptotic expansion in the preceding theorem holds uniformly in any compact subset of $(0, x_*)$.

Example: Consider the special case where $b(x) \equiv b$ and $\sigma(x) \equiv \sigma$. Assume $r > b$ and $\phi(x) = (x - k)^+$ for some constant $k > 0$. It follows that the value function for the continuous time optimal stopping problem is

$$V(x) = \begin{cases} B x^\alpha, & x < x_* \\ x - k, & x \ge x_* \end{cases}$$

where

$$\alpha = \left( \frac{1}{2} - \frac{b}{\sigma^2} \right) + \sqrt{ \left( \frac{1}{2} - \frac{b}{\sigma^2} \right)^2 + \frac{2r}{\sigma^2} }, \qquad B = \frac{x_* - k}{x_*^\alpha},$$

and

$$x_* = \frac{\alpha}{\alpha - 1}\,k.$$

It follows that

$$x_*^h = x_*\left( 1 - K\sigma\sqrt{h} \right) + o(\sqrt{h}) = \frac{\alpha k}{\alpha - 1}\left( 1 - K\sigma\sqrt{h} \right) + o(\sqrt{h})$$

and

$$A = \frac{V''(x_*-) - \phi''(x_*)}{x_* - k} = \frac{B\alpha(\alpha - 1)(x_*)^{\alpha - 2}}{x_* - k},$$

which implies that

$$\frac{V^h(x) - V(x)}{V(x)} = -\frac{1}{2}A x_*^2 \sigma^2 (C - K^2)\,h + o(h) = -\frac{1}{2}\alpha(\alpha - 1)(C - K^2)\sigma^2 h + o(h).$$
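Given estimates of $K$ and $C$ (for instance from a Monte Carlo simulation of the definitions in Section 2.2), the theorem's first-order corrections in this example are elementary to evaluate. In the sketch below the values $K \approx 0.58$ and $C \approx 1.1$ are illustrative placeholders of my own, not values computed in the paper.

```python
import math

def discrete_time_corrections(b, sigma, r, k, h, K, C):
    """First-order corrections from Theorem 2.1 for the constant-coefficient
    example: boundary shift and relative value gap. K and C must be supplied."""
    q = 0.5 - b / sigma**2
    alpha = q + math.sqrt(q**2 + 2.0 * r / sigma**2)
    x_star = alpha * k / (alpha - 1.0)
    x_star_h = x_star * (1.0 - K * sigma * math.sqrt(h))                 # conclusion 2
    rel_gap = -0.5 * alpha * (alpha - 1.0) * (C - K**2) * sigma**2 * h   # conclusion 1
    return x_star, x_star_h, rel_gap

# K and C below are illustrative placeholders, not values from the paper.
x_star, x_star_h, rel_gap = discrete_time_corrections(
    b=0.02, sigma=0.3, r=0.05, k=1.0, h=1.0 / 252.0, K=0.58, C=0.1.__class__(1.1))
print(x_star, x_star_h, rel_gap)
```

Since $C > K^2$, the predicted relative value gap is negative and the discrete boundary sits below $x_*$, consistent with $V^h \le V$ and $x_*^h \le x_*$.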

2.3 Overview of the proof

In this subsection we outline and prove the main steps in the proof of Theorem 2.1. The proofs of two key asymptotic expansions are deferred to the next section.

For simplicity of the subsequent analysis, we first introduce a bounded modification $\bar\phi$ of the payoff function $\phi$. This modification will not affect the asymptotics at all; see Proposition 2.1.

Let $\bar\phi \le \phi$ be an increasing function satisfying

$$\bar\phi(x) = \begin{cases} \phi(x), & \text{if } x \le x_* + a \\ k, & \text{if } x \ge x_* + 2a. \end{cases} \tag{2.5}$$

Here $a$ and $k$ are two positive constants whose specific values are not important. Without loss of generality, we assume that $\bar\phi$ is twice continuously differentiable in the region $[x_*, \infty)$. Suppose $h$ and $\delta$ are two positive constants, and let $x_\delta \doteq x_* - \delta$. We consider the quantities

$$\bar W_\delta(x) \doteq \mathbb{E}_x\left[ e^{-r\tau_\delta}\bar\phi(S_{\tau_\delta}) \right] \quad \text{and} \quad W_\delta(x) = \mathbb{E}_x\left[ e^{-r\tau_\delta}\phi(S_{\tau_\delta}) \right],$$

where

$$\tau_\delta \doteq \inf\{t \ge 0 : S_t \ge x_\delta\},$$

and $\mathbb{E}_x$ denotes expectation conditioned on $S_0 = x$. Note that $\bar W_\delta(x) = W_\delta(x)$ for all $x \le x_\delta$. We also define

$$\bar W^h_\delta(x) \doteq \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\bar\phi(S_{\tau^h_\delta}) \right] \quad \text{and} \quad W^h_\delta(x) \doteq \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\phi(S_{\tau^h_\delta}) \right],$$

where

$$\tau^h_\delta \doteq \inf\{nh \ge 0 : S_{nh} \ge x_\delta\}.$$

Main idea of the proof. The main idea for proving the rates of convergence is as follows. Write

$$W^h_\delta(x) - V(x) = \left[ W^h_\delta(x) - \bar W^h_\delta(x) \right] + \left[ \bar W^h_\delta(x) - \bar W_\delta(x) \right] + \left[ \bar W_\delta(x) - V(x) \right].$$

For each term, we will obtain approximations as $h$ and $\delta$ tend to zero. It turns out the leading term has the following form:

$$-\frac{1}{2}a_1\delta^2 + a_2\delta\sqrt{h} + a_3 h + \text{higher order terms}.$$

Here $a_1$, $a_2$, and $a_3$ are constants (some of which depend on $x$), with $a_1 > 0$. Since the function $W^h_\delta(x)$ attains its maximum at $\delta^* = x_* - x_*^h$, one would expect that $\delta^*$ approximately maximizes the leading term, or

$$\delta^* = \frac{a_2}{a_1}\sqrt{h} + o(\sqrt{h}). \tag{2.6}$$

Furthermore, substituting this back in, one would expect

$$V^h(x) = W^h_{\delta^*}(x) = V(x) + \left( \frac{a_2^2}{2a_1} + a_3 \right) h + o(h).$$

This is in fact how the argument will proceed. We begin with the estimation of the first term, which turns out to be negligible for small $h$ and $\delta$. Define the quantity

$$\Delta_{\delta,h} \doteq W^h_\delta(x) - \bar W^h_\delta(x) = \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\left( \phi(S_{\tau^h_\delta}) - \bar\phi(S_{\tau^h_\delta}) \right) \right]. \tag{2.7}$$

We have the following result.

Proposition 2.1. Define $\Delta_{\delta,h}$ by (2.7). There exist constants $L < \infty$ and $\varepsilon > 0$ such that

$$|\Delta_{\delta,h}| \le L e^{-\varepsilon/h}$$

for all sufficiently small $\delta$ and $h$.

Proof. The proof of the proposition is based on the following bound. Let $a$ be as in the characterization (2.5) of $\bar\phi$. Then for any $x \le x_*$ and $y \ge x_* + a$,

$$\mathbb{P}\left( S_h > y \mid S_0 = x \right) \le \exp\left\{ -\left[ \log\frac{y}{x_*} - c_1 h \right]^2 \Big/ c_2 h \right\}, \tag{2.8}$$

where the finite constants $c_1, c_2$ depend only on the coefficients $b, \sigma$. To prove this bound we use the expression

$$S_h = S_0 \exp\left\{ \int_0^h \left[ b(S_t) - \frac{1}{2}\sigma^2(S_t) \right] dt + \int_0^h \sigma(S_t)\,dW_t \right\}.$$

Define $c_1 \doteq \|b\|_\infty + \frac{1}{2}\|\sigma^2\|_\infty$. Since $x \le x_*$,

$$p \doteq \mathbb{P}_x(S_h > y) \le \mathbb{P}_x\left( e^{\int_0^h \sigma(S_t)\,dW_t} \ge e^{\log\frac{y}{x_*} - c_1 h} \right).$$

However, if $B \doteq \log\frac{y}{x_*} - c_1 h$ and $c_2 \doteq 2\|\sigma^2\|_\infty$, then for $\theta > 0$,

$$p \le \mathbb{P}_x\left( e^{\theta\int_0^h \sigma(S_t)\,dW_t} \ge e^{\theta B} \right) \le \mathbb{P}_x\left( e^{\theta\int_0^h \sigma(S_t)\,dW_t - \frac{1}{2}\theta^2\int_0^h \sigma^2(S_t)\,dt} \ge e^{\theta B - \frac{1}{4}\theta^2 c_2 h} \right) \le e^{-\theta B + \frac{1}{4}\theta^2 c_2 h}.$$

The last inequality follows from Chebyshev's inequality. Minimizing the right-hand side over $\theta$ (the minimum is attained at $\theta = 2B/(c_2 h)$, giving the value $e^{-B^2/(c_2 h)}$) completes the proof of (2.8).
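For constant coefficients, the bound (2.8) can be sanity-checked by simulation, since $S_h$ is then exactly lognormal. The parameters below are arbitrary choices of mine; the Chernoff-type bound is loose, so the empirical tail probability should sit well below it.

```python
import math, random

def tail_check(b=0.05, sigma=0.3, h=0.1, x=1.0, x_star=1.0, log_ratio=0.3,
               n=100000, seed=0):
    """Compare P(S_h > y | S_0 = x) against the bound
    exp(-[log(y/x*) - c1 h]^2 / (c2 h)) for geometric Brownian motion."""
    rng = random.Random(seed)
    y = x_star * math.exp(log_ratio)          # y with log(y/x*) = log_ratio
    c1 = abs(b) + 0.5 * sigma**2              # ||b|| + (1/2)||sigma^2||
    c2 = 2.0 * sigma**2                       # 2 ||sigma^2||
    bound = math.exp(-(log_ratio - c1 * h)**2 / (c2 * h))
    hits = sum(
        1 for _ in range(n)
        if x * math.exp((b - 0.5 * sigma**2) * h
                        + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)) > y)
    return hits / n, bound

emp, bound = tail_check()
print(emp, bound)
```

With these parameters the exact tail is roughly a decade below the bound, as expected for a Chernoff estimate.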

We now complete the proof of the proposition. To ease the exposition, we use $\tau$ in lieu of $\tau^h_\delta$ throughout the proof. We have

$$\Delta_{\delta,h} = \sum_{n=1}^{\infty} e^{-rnh}\,\mathbb{E}_x\left[ \left( \phi(S_{nh}) - \bar\phi(S_{nh}) \right) 1_{\{\tau = n\}} \right] \le \sum_{n=1}^{\infty} e^{-rnh} \int_{a + x_*}^{\infty} \left| \phi'(y) - \bar\phi'(y) \right| \mathbb{P}_x(S_{nh} > y,\ \tau = n)\,dy.$$

We also have, for any $y > x_* + a$, that

$$\mathbb{P}_x(S_{nh} > y,\ \tau = n) = \mathbb{P}_x(S_{nh} > y \mid \tau = n)\,\mathbb{P}_x(\tau = n) = \mathbb{P}_x\left( S_{nh} > y \mid S_{nh} \ge x_\delta, S_{(n-1)h} < x_\delta, \cdots, S_0 < x_\delta \right)\mathbb{P}_x(\tau = n).$$

Define the stopping time

$$\sigma \doteq \inf\{t \ge (n-1)h : S_t \ge x_\delta\}.$$

Then

$$\mathbb{P}_x(S_{nh} > y \mid \tau = n) = \mathbb{P}_x\left( S_{nh} > y \mid \sigma \le nh, S_{nh} \ge x_\delta, S_{(n-1)h} < x_\delta, \cdots, S_0 < x_\delta \right)$$
$$= \int_0^h \mathbb{P}_x\left( S_{nh} > y \mid \sigma = nh - t, S_{nh} \ge x_\delta, S_{(n-1)h} < x_\delta, \cdots, S_0 < x_\delta \right) \cdot \mathbb{P}_x\left( \sigma \in nh - dt \mid \sigma \le nh, S_{nh} \ge x_\delta, S_{(n-1)h} < x_\delta, \cdots, S_0 < x_\delta \right).$$

However, the strong Markov property implies for all $t \in [0, h]$ that

$$\mathbb{P}_x\left( S_{nh} > y \mid \sigma = nh - t, S_{nh} \ge x_\delta, S_{(n-1)h} < x_\delta, \cdots, S_0 < x_\delta \right) = \mathbb{P}(S_t > y \mid S_0 = x_\delta, S_t \ge x_\delta) = \frac{\mathbb{P}(S_t > y \mid S_0 = x_\delta)}{\mathbb{P}(S_t \ge x_\delta \mid S_0 = x_\delta)}.$$

The denominator in this display is uniformly bounded from below away from zero for $t \in [0, 1]$:

$$\mathbb{P}(S_t \ge x_\delta \mid S_0 = x_\delta) \ge \alpha > 0, \quad \forall\, t \in [0, 1];$$

see Lemma 3.4 for a proof. Using (2.8), for all small $h > 0$ and $t \in (0, h)$,

$$\mathbb{P}(S_t > y \mid S_0 = x_\delta) \le \exp\left\{ -\left[ \log\frac{y}{x_*} - c_1 t \right]^2 \Big/ c_2 t \right\} \le \exp\left\{ -\left[ \log\frac{y}{x_*} - c_1 h \right]^2 \Big/ c_2 h \right\}.$$

Now since $\phi'$ is of polynomial growth and $\bar\phi'(x)$ is zero for large $x$, it follows that there are finite constants $R$ and $m$ such that

$$\left| \phi'(y) - \bar\phi'(y) \right| \le R y^{m-1} \quad \text{for all } y > x_* + a.$$

Hence, for all small $\delta > 0$, the change of variable $x = \log\frac{y}{x_*} - c_1 h$ gives

$$\Delta_{\delta,h} \le \frac{R}{\alpha} \sum_{n=1}^{\infty} e^{-rnh}\,\mathbb{P}_x(\tau = n) \int_{a + x_*}^{\infty} y^{m-1} \exp\left\{ -\left[ \log\frac{y}{x_*} - c_1 h \right]^2 \Big/ c_2 h \right\} dy$$
$$= \frac{R}{\alpha}(x_*)^m e^{m c_1 h} \sum_{n=1}^{\infty} e^{-rnh}\,\mathbb{P}_x(\tau = n) \int_{\log\left(1 + \frac{a}{x_*}\right) - c_1 h}^{\infty} e^{mx - \frac{x^2}{c_2 h}}\,dx.$$

For $h$ small enough, there exist positive numbers $\bar a$, $\bar C$, $\bar c$ such that

$$\Delta_{\delta,h} \le \bar C \sum_{n=1}^{\infty} e^{-rnh}\,\mathbb{P}_x(\tau = n) \int_{\log\left(1 + \frac{\bar a}{x_*}\right)}^{\infty} e^{-\frac{x^2}{\bar c h}}\,dx$$
$$= \bar C\sqrt{2\pi\bar c}\; \Phi\!\left( -\log\left( 1 + \frac{\bar a}{x_*} \right) \Big/ \sqrt{\bar c h} \right) \cdot \sqrt{h} \sum_{n=1}^{\infty} e^{-rnh}\,\mathbb{P}_x(\tau = n) \le \bar C\sqrt{2\pi\bar c}\; \Phi\!\left( -\log\left( 1 + \frac{\bar a}{x_*} \right) \Big/ \sqrt{\bar c h} \right) \cdot \sqrt{h}.$$

Here $\Phi$ is the cumulative distribution function of the standard normal distribution. We complete the proof of the proposition by using the asymptotic relation

$$\Phi(-x) \sim \frac{1}{\sqrt{2\pi}\,x}\, e^{-\frac{x^2}{2}}$$

as $x \to \infty$. □

The bound just proved shows that $\left[ W^h_\delta(x) - \bar W^h_\delta(x) \right]$ is exponentially small as $h \to 0$, uniformly for all small $\delta > 0$. We now consider the terms $\left[ \bar W^h_\delta(x) - \bar W_\delta(x) \right]$ and $\left[ \bar W_\delta(x) - V(x) \right]$. When considering the asymptotic behavior of these terms, it is often convenient to scale $\delta$ with $h$ as $h \to 0$ in the manner suggested by (2.6). For the remainder of this proof, unless explicitly stated otherwise, we will assume that

$$\delta = c\sqrt{h} + o(\sqrt{h}) \quad \text{as } h \to 0 \tag{2.9}$$

for a non-negative parameter $c$. With an abuse of notation, the quantities $\bar W^h_\delta(x)$ and $\bar W_\delta(x)$ will be denoted by $\bar W^h_c(x)$ and $\bar W_c(x)$ when the relation (2.9) holds.

We next estimate $\left[ \bar W_c(x) - V(x) \right]$ as $h \to 0$.

Proposition 2.2. Assume Condition 2.1 and define $A$ by (2.2). Assume also that $A > 0$. Then

$$\bar W_c(x) - V(x) = \left[ -\frac{1}{2}Ac^2 h + o(h) \right] V(x).$$

Proof. Recall that $V(x)$ can be characterized, for $x \le x_*$, as a multiple of the solution $f_1$ to $\mathcal{L}f(x) - rf(x) = 0$ that is bounded in a neighborhood of zero; see Remark 2.1. $\bar W_c(x)$ can be likewise characterized, with the constant determined by the boundary condition $\bar W_c(x_\delta) = \phi(x_\delta)$. Thus

$$\bar W_c(x) = \frac{\phi(x_\delta)}{V(x_\delta)}\,V(x) \quad \text{for all } x \in (0, x_\delta].$$

We now expand for small $\delta \ge 0$, and use $x_\delta \doteq x_* - \delta$, $V(x_*) = \phi(x_*)$, $V'(x_*) = \phi'(x_*)$, and the definition of $A$ to obtain

$$\frac{\bar W_c(x) - V(x)}{V(x)} = \left( \frac{\phi}{V} \right)'\bigg|_{x_*} \cdot (-\delta) + \frac{1}{2}\left( \frac{\phi}{V} \right)''\bigg|_{x_*} \cdot (-\delta)^2 + o(\delta^2) = -\frac{1}{2}A\delta^2 + o(\delta^2).$$

The proof is completed by using (2.9). □

In the next proposition we state the expansion for $\left[ \bar W^h_\delta(x) - \bar W_\delta(x) \right]$. This estimate deals with the critical comparison between the discrete and continuous time problems. The proof of this expansion is detailed, and therefore deferred to the next section.

Proposition 2.3. Assume Condition 2.1 and define $A$, $C$, and $K$ by (2.2) and (2.4). Assume also that $A > 0$. Then

$$\bar W^h_c(x) - \bar W_c(x) = \left[ K x_* \sigma(x_*) A c\, h - \frac{1}{2}ACx_*^2\sigma^2(x_*)\,h + o(h) \right] V(x).$$

Proof of Theorem 2.1. Recall that $x_*^h$ is the optimal boundary for the stopping problem with value function $V^h$. On the stopping region, we always have $V^h(x) = \phi(x)$. Also, since $V^h(x)$ is defined by taking a supremum over a subset of the stopping times allowed in the definition of $V(x)$, it follows that $V^h(x) \le V(x)$. Since $V(x) \ge \phi(x)$ for all $x$, it follows that $x_*^h \le x_*$.

According to Propositions 2.1, 2.2 and 2.3, for each fixed $c \in [0, \infty)$,

$$\frac{W^h_c(x) - V(x)}{V(x)} = \left[ -\frac{1}{2}Ac^2 h + K x_* \sigma(x_*) A c\, h - \frac{1}{2}ACx_*^2\sigma^2(x_*)\,h + o(h) \right].$$

This suggests the choice $c^* \doteq K x_* \sigma(x_*)$. Inserting this into the last display gives

$$\frac{W^h_{c^*}(x) - V(x)}{V(x)} = \left[ \frac{1}{2}A(K^2 - C)x_*^2\sigma(x_*)^2 h + o(h) \right],$$

and since $V^h(x) \ge W^h_{c^*}(x)$ it follows that

$$\liminf_{h \downarrow 0} \frac{V^h(x) - V(x)}{V(x)\,h} \ge \frac{1}{2}A(K^2 - C)x_*^2\sigma(x_*)^2. \tag{2.10}$$

Now define $c^h$ by $x_*^h = x_* - c^h\sqrt{h}$. Since $x_*^h \le x_*$ we know that $c^h \in [0, \infty)$. By taking a convergent subsequence, we can assume that $c^h \to c \in [0, \infty]$. Using an elementary weak convergence argument, one can show that $x_*^h \to x_*$. First assume that $c \in (0, \infty)$. If $c \ne K x_* \sigma(x_*)$, then by Propositions 2.1, 2.2 and 2.3 we have

$$\limsup_{h \downarrow 0} \frac{V^h(x) - V(x)}{V(x)\,h} < \frac{1}{2}A(K^2 - C)x_*^2\sigma(x_*)^2,$$

which contradicts (2.10). If $c = \infty$, then Propositions 2.1 and 2.3 and an argument analogous to the one used in Proposition 2.2 show that

$$\frac{V^h(x) - V(x)}{V(x)} = -A(c^h)^2 h\,[1 + o(1)].$$

Since $(c^h)^2 \to +\infty$, this again contradicts (2.10), and thus $c = K x_* \sigma(x_*)$. We extend to the original sequence by the standard argument by contradiction, and Theorem 2.1 follows. □

3 Approximations and Expansions in Terms of Local Time and the Excursions of a Brownian Motion

In this section we prove Proposition 2.3, which is the expansion of $\bar W^h_\delta(x) - \bar W_\delta(x)$ for small $h > 0$. We will use the fact that $\bar W_\delta$ has the representation

$$\bar W_\delta(x) = \begin{cases} \bar\phi(x), & \text{for } x \ge x_\delta \\ -\varepsilon(x) + e^{-rh}\,\mathbb{E}\left[ \bar W_\delta(S_h) \mid S_0 = x \right], & \text{for } x < x_\delta. \end{cases}$$

Here $\varepsilon(x)$ is an error term that can be given explicitly in terms of the value function $\bar W_\delta$ and the transition probabilities of $S_{nh}$, and $x_\delta \doteq x_* - \delta$. From the last display, we have

$$\varepsilon(x) = \mathbb{E}_x\left[ e^{-rh}\bar W_\delta(S_h) - \bar W_\delta(x) \right], \quad \forall\, x < x_\delta.$$

It follows from the generalized Itô formula that

$$e^{-rh}\bar W_\delta(S_h) - \bar W_\delta(x) = \int_0^h e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt \tag{3.1}$$
$$+\; \Delta\bar W'_\delta(x_\delta) \int_0^h e^{-rt}\,dL^S_t(x_\delta) + \int_0^h e^{-rt}\bar W'_\delta(S_t)\,S_t\,\sigma(S_t)\,dW_t.$$

Here $L^S$ is the local time for the process $S$, and

$$\Delta\bar W'_\delta(x_\delta) \doteq \bar W'_\delta(x_\delta+) - \bar W'_\delta(x_\delta-).$$

Lemma 3.1. For every $x \in (0, x_\delta)$,

$$\varepsilon(x) = \mathbb{E}_x\left[ e^{-rh}\bar W_\delta(S_h) \right] - \bar W_\delta(x) = \mathbb{E}_x \int_0^h e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt + \Delta\bar W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^h e^{-rt}\,dL^S_t(x_\delta).$$

Proof. The result follows from (3.1) if the expectation of the stochastic integral is zero. We recall that $\bar W_\delta(x)$ is equal to $\bar\phi(x)$ for $x \ge x_\delta$ and to $V(x)\bar\phi(x_\delta)/V(x_\delta)$ for $x \le x_\delta$. Since $\bar W_\delta(x)$ is equal to $\bar\phi(x)$ for large $x$ and hence constant, $\bar W'_\delta(x) = 0$ for all large $x$. Also, $\bar W'_\delta(x)$ is clearly bounded in a neighborhood of $x_\delta$. For $\mu > 0$, let $\sigma_\mu \doteq \inf\{t : S_t \le \mu\} \wedge h$. Then the boundedness of $x\bar W'_\delta(x)$ for $x \ge \mu$ implies

$$\mathbb{E}_x\left[ e^{-r\sigma_\mu}\bar W_\delta(S_{\sigma_\mu}) \right] - \bar W_\delta(x) = \mathbb{E}_x \int_0^{\sigma_\mu} e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt + \Delta\bar W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^{\sigma_\mu} e^{-rt}\,dL^S_t(x_\delta).$$

The lemma follows by letting $\mu \downarrow 0$, and using dominated convergence for all terms but the last, which uses monotone convergence. □

Let $\tau^h_\delta \doteq \inf\{nh : S_{nh} \ge x_\delta\}$. Then the discounting and (3.1) imply the formula

$$\bar W_\delta(x) = -\mathbb{E}_x \sum_{n=0}^{\tau^h_\delta/h - 1} e^{-rnh}\varepsilon(S_{nh}) + \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\bar\phi(S_{\tau^h_\delta}) \right]$$
$$= -\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt - \Delta\bar W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\,dL^S_t(x_\delta) + \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\bar\phi(S_{\tau^h_\delta}) \right]$$

for all $x < x_\delta$. We recall the definition

$$\bar W^h_\delta(x) \doteq \mathbb{E}_x\left[ e^{-r\tau^h_\delta}\bar\phi(S_{\tau^h_\delta}) \right].$$

It follows that

$$\bar W^h_\delta(x) - \bar W_\delta(x) = \mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt + \Delta\bar W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\,dL^S_t(x_\delta).$$

The proof of Proposition 2.3 is thereby reduced to proving the following two results.

Proposition 3.1. Assume Condition 2.1 and define $A$ and $C$ by (2.2) and (2.4). Assume also that $A > 0$. Then

$$\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt = \left[ -\frac{1}{2}ACx_*^2\sigma^2(x_*)\,h + o(h) \right] V(x). \tag{3.2}$$

Proposition 3.2. Assume Condition 2.1 and define $A$ and $K$ by (2.2) and (2.4). Assume also that $A > 0$. Then

$$\Delta\bar W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\,dL^S_t(x_\delta) = \left[ K x_* \sigma(x_*) A c\, h + o(h) \right] V(x). \tag{3.3}$$

The proofs of Propositions 3.1 and 3.2 use estimates on the excursions and local time of Brownian motion, respectively, and are given in the next two subsections. We will need to relate the constants that appear in these approximations to the simple constants defined by (2.3) and (2.4). The following lemma gives this relationship.

Lemma 3.2. Let $W$ be a standard Brownian motion that satisfies $W_u = 0$ for some $u \in [0, 1)$. Define $H(u)$ and $M(u)$ by (2.3). Then

$$H(u) = \mathbb{E} \int_u^N 1_{\{W_t \ge 0\}}\,dt \quad \text{and} \quad M(u) = \mathbb{E}\,L^W_{u,N}(0),$$

where $N \doteq \inf\{n \in \mathbb{N} : W_n \ge 0\}$ and $L^W_{u,N}(0)$ is the local time of $W$ at zero on the interval $[u, N]$.

Proof. We first prove $H(u) = \mathbb{E}\int_u^N 1_{\{W_t \ge 0\}}\,dt$. Consider the continuously differentiable convex function

$$f(x) \doteq \frac{1}{2}(x^+)^2 = \begin{cases} 0, & \text{if } x \le 0 \\ \frac{1}{2}x^2, & \text{if } x \ge 0. \end{cases}$$

It follows from Itô's formula that

$$f(W_{N \wedge n}) = f(W_u) + \int_u^{N \wedge n} f'(W_t)\,dW_t + \frac{1}{2}\int_u^{N \wedge n} f''(W_t)\,dt = \int_u^{N \wedge n} W_t 1_{\{W_t \ge 0\}}\,dW_t + \frac{1}{2}\int_u^{N \wedge n} 1_{\{W_t \ge 0\}}\,dt$$

for all integers $n \in \mathbb{N}$. This yields

$$\mathbb{E}\left( W^+_{N \wedge n} \right)^2 = \mathbb{E} \int_u^{N \wedge n} 1_{\{W_t \ge 0\}}\,dt \quad \forall\, n \in \mathbb{N}.$$

Letting $n \to \infty$, the right-hand side converges to $\mathbb{E}\int_u^N 1_{\{W_t \ge 0\}}\,dt$ by the monotone convergence theorem. Since $W^+_{N \wedge n} \le W_N$, the result will follow by dominated convergence if $\mathbb{E}W_N^2$ is finite.

To show that $\mathbb{E}W_N^2$ is finite, we consider the conditional probability

$$P_{n+1}(x) \doteq \mathbb{P}\left( W_N \ge x \mid N = n+1 \right) = \mathbb{P}\left( W_{n+1} \ge x \mid W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right)$$

for all $n \in \mathbb{N}_0$ and $x \ge 0$. Define the stopping time

$$\sigma \doteq \inf\{t \ge n : W_t = 0\}.$$

Then

$$P_{n+1}(x) = \mathbb{P}\left( W_{n+1} \ge x \mid \sigma \le n+1, W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right)$$
$$= \int_0^1 \mathbb{P}\left( W_{n+1} \ge x \mid \sigma = n+t, W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right) \cdot \mathbb{P}\left( \sigma \in n + dt \mid \sigma \le n+1, W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right).$$

However, by the strong Markov property, for all $t \in [0, 1]$,

$$\mathbb{P}\left( W_{n+1} \ge x \mid \sigma = n+t, W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right) = \mathbb{P}\left( W_1 \ge x \mid W_t = 0, W_1 \ge 0 \right) = 2\Phi\left( -\frac{x}{\sqrt{1-t}} \right) \le 2\Phi(-x).$$

Here $\Phi$ is the cumulative distribution function of the standard normal. Hence,

$$P_{n+1}(x) \le 2\Phi(-x) \int_0^1 \mathbb{P}\left( \sigma \in n + dt \mid \sigma \le n+1, W_{n+1} \ge 0, W_n < 0, \cdots, W_1 < 0 \right) = 2\Phi(-x).$$

It follows that

$$\mathbb{E}W_N^2 = \sum_{n=1}^{\infty} \mathbb{E}\left( W_N^2 1_{\{N = n\}} \right) = \sum_{n=1}^{\infty} \int_0^{\infty} 2x\,\mathbb{P}(W_N \ge x, N = n)\,dx = \sum_{n=1}^{\infty} \int_0^{\infty} 2x\,\mathbb{P}(W_N \ge x \mid N = n)\,\mathbb{P}(N = n)\,dx$$
$$\le \sum_{n=1}^{\infty} \mathbb{P}(N = n) \int_0^{\infty} 4x\,\Phi(-x)\,dx = \int_0^{\infty} 4x\,\Phi(-x)\,dx < \infty.$$

As for the equality $M(u) = \mathbb{E}L^W_{u,N}(0)$, it follows from Tanaka's formula that

$$W_N = W^+_N = \int_u^N 1_{\{W_t \ge 0\}}\,dW_t + L^W_{u,N}(0).$$

However, since the preceding proof already implies that $\mathbb{E}\int_u^N 1_{\{W_t \ge 0\}}\,dt < \infty$,

$$\mathbb{E}W_N = \mathbb{E}L^W_{u,N}(0) = M(u).$$

This completes the proof. □
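The identity $H(u) = \mathbb{E}\int_u^N 1_{\{W_t \ge 0\}}\,dt$ can be sanity-checked by simulation. The sketch below compares the two sides at the arbitrary point $u = 0.5$ on a crude time grid; the step size, sample count, and tolerance are my own choices, and the discretization biases both estimates, so only rough agreement should be expected.

```python
import math, random

def one_path(u, dt, rng):
    """Return (W_N^2, time spent >= 0 on [u, N]) for one path with W_u = 0,
    where N is the first integer n >= 1 with W_n >= 0."""
    t, w, occ, n = u, 0.0, 0.0, 1
    while True:
        w += math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
        if w >= 0.0:
            occ += dt             # occupation time of [0, infinity)
        if t >= n - 1e-12:        # reached the next integer time
            if w >= 0.0:
                return w * w, occ
            n += 1

def compare_H(u=0.5, dt=0.005, n_paths=4000, seed=7):
    rng = random.Random(seed)
    sq = occ = 0.0
    for _ in range(n_paths):
        a, b = one_path(u, dt, rng)
        sq += a
        occ += b
    return sq / n_paths, occ / n_paths   # estimates of E[W_N^2] and E[int 1]

lhs, rhs = compare_H()
print(lhs, rhs)
```

The two printed numbers estimate the same quantity $H(0.5)$ and should agree to within Monte Carlo and discretization error.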

3.1 Proof of Proposition 3.1

In this subsection we prove

$$\mathbb{E}_x \int_0^{\tau^h_\delta} e^{-rt}\left[ -r\bar\phi(S_t) + \mathcal{L}\bar\phi(S_t) \right] 1_{\{S_t \ge x_\delta\}}\,dt = \left[ -\frac{1}{2}ACx_*^2\sigma^2(x_*)\,h + o(h) \right] V(x).$$

We recall the definition

$$H(u) \doteq \mathbb{E} \int_u^N 1_{\{W_t \ge 0\}}\,dt,$$

where $W$ is a standard Brownian motion with $W_u = 0$, and $N \doteq \inf\{n \in \mathbb{N} : W_n \ge 0\}$.

Lemma 3.3. $H(u)$ is continuous and bounded on the interval $[0, 1)$.

Proof. Define

$$Z_u \doteq \int_u^N 1_{\{W_t \ge 0\}}\,dt,$$

where $W$ is a Brownian motion with $W_u = 0$. We first show that the family $\{Z_u, u \in [0, 1)\}$ is uniformly integrable (and in particular, that $H(u)$ is bounded). Indeed, define

$$c_0 \doteq \int_u^1 1_{\{W_t \ge 0\}}\,dt, \qquad c_j \doteq \int_j^{j+1} 1_{\{W_t \ge 0\}}\,dt, \quad j \in \mathbb{N}.$$

The key observation is that if $c_j > 0$, then $W$ must spend some time during the interval $[j, j+1]$ to the right of zero, and therefore the probability that $W_{j+1} > 0$ is at least one half. Thus for all $j \in \mathbb{N}_0$,

$$\mathbb{P}(N = j+1 \mid N > j, c_j > 0) \ge \frac{1}{2}.$$

Let $X_u \doteq \sum_{j=0}^{N-1} 1_{\{c_j > 0\}}$. Clearly $X_u$ dominates $Z_u$. Furthermore, the strong Markov property implies that

$$\mathbb{P}(X_u \ge j+1 \mid X_u \ge j) \le \left( 1 - \frac{1}{2} \right) = \frac{1}{2}.$$

This, in turn, implies that

$$\mathbb{P}(X_u \ge n) \le \frac{1}{2^{n-1}},$$

and thus

$$\mathbb{E}(X_u^2) \le \sum_{n=1}^{\infty} 2n\,\mathbb{P}(X_u \ge n) \le \sum_{n=1}^{\infty} \frac{n}{2^{n-2}} < \infty.$$

Therefore $\{Z_u, u \in [0, 1)\}$ is uniformly integrable.

As for the continuity, we write

$$H(u) = \mathbb{E} \int_u^{N_u} 1_{\{B_t - B_u \ge 0\}}\,dt = \mathbb{E}Z_u,$$

where $B$ is some standard Brownian motion with $B_0 \equiv 0$ and

$$N_u \doteq \inf\{n \in \mathbb{N} : B_n - B_u \ge 0\}.$$

Let $u \in [0, 1)$ and let $\{u_n\}$ be an arbitrary sequence in $[0, 1)$ with $u_n \to u$. Since for any fixed $n$, $\mathbb{P}(B_n - B_u = 0) = 0$,

$$Z_{u_n} \to Z_u$$

with probability one. Since the $Z_{u_n}$ are uniformly integrable, we have

$$H(u_n) \to H(u),$$

which completes the proof. □

Now for any u \in [0,1) and h > 0, define the function
\[
G(h;u) := \mathbb{E} \int_{uh}^{N^h h} e^{-r(t-uh)}\,\big[{-r\phi(S_t)} + \mathcal{L}\phi(S_t)\big]\,1_{\{S_t \ge x_\delta\}}\,dt,
\]
where
\[
\frac{dS_t}{S_t} = b(S_t)\,dt + \sigma(S_t)\,dW_t, \qquad S_{uh} = x_\delta,
\]
and
\[
N^h := \inf\{ n \in \mathbb{N} : S_{nh} \ge x_\delta \}.
\]


Let \lfloor a \rfloor denote the integer part of a. It follows from the strong Markov property that
\[
\mathbb{E}_x \int_0^{\tau_\delta^h} e^{-rt}\,\big[{-r\phi(S_t)} + \mathcal{L}\phi(S_t)\big]\,1_{\{S_t \ge x_\delta\}}\,dt
= \mathbb{E}_x\Big[ e^{-r\tau_\delta}\, G\Big(h;\ \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor \Big) \Big].
\]

The change of variable t \mapsto th and the transformation
\[
Y^{(h)}_t := \frac{S_{th} - x_\delta}{\sqrt{h}}
\]
yield
\[
G(h;u) = h\,F(h;u) := h\,\mathbb{E} \int_u^{N^h} e^{-r(t-u)h}\,\big[{-r\phi} + \mathcal{L}\phi\big](\sqrt{h}\,Y^{(h)}_t + x_\delta)\,1_{\{Y^{(h)}_t \ge 0\}}\,dt,
\]
where Y^{(h)} follows the dynamics
\[
dY^{(h)}_t = (\sqrt{h}\,Y^{(h)}_t + x_\delta)\Big[ \sqrt{h}\,b(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt + \sigma(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dW_t \Big], \qquad Y^{(h)}_u = 0.
\]

Therefore
\[
\mathbb{E}_x \int_0^{\tau_\delta^h} e^{-rt}\,\big[{-r\phi(S_t)} + \mathcal{L}\phi(S_t)\big]\,1_{\{S_t \ge x_\delta\}}\,dt
= h\,\mathbb{E}_x\Big[ e^{-r\tau_\delta}\, F\Big(h;\ \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor \Big) \Big]. \tag{3.4}
\]

We have the following result regarding F(h;u). Although part of the proof is similar to that of Lemma 3.3, we provide the details for completeness.

Lemma 3.4.
1. F(h;u) is uniformly bounded for small h and all u \in [0,1).
2. \[
\lim_{h \to 0} F(h;u) = [-r\phi + \mathcal{L}\phi](x_*)\,H(u),
\]
and the convergence is uniform on any compact subset of [0,1).

Proof. Consider the family of random variables \{Z_{h,u} : u \in [0,1),\ h \in (0,1)\}, where
\[
Z_{h,u} := \int_u^{N^h} e^{-r(t-u)h}\,\big[{-r\phi} + \mathcal{L}\phi\big](\sqrt{h}\,Y^{(h)}_t + x_\delta)\,1_{\{Y^{(h)}_t \ge 0\}}\,dt
\]
and
\[
dY^{(h)}_t = (\sqrt{h}\,Y^{(h)}_t + x_\delta)\Big[ \sqrt{h}\,b(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt + \sigma(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dW_t \Big], \qquad Y^{(h)}_u = 0.
\]
We first show this family is uniformly integrable. Since (-r\phi + \mathcal{L}\phi) is bounded, it is sufficient to show that the random variables
\[
X_{h,u} := \int_u^{N^h} 1_{\{Y^{(h)}_t \ge 0\}}\,dt \tag{3.5}
\]
are uniformly integrable. Define
\[
c^{(h)}_0 := \int_u^1 1_{\{Y^{(h)}_t \ge 0\}}\,dt, \qquad c^{(h)}_j := \int_j^{j+1} 1_{\{Y^{(h)}_t \ge 0\}}\,dt, \quad j \in \mathbb{N}.
\]

As in the proof of Lemma 3.3, if c^{(h)}_j > 0, then Y^{(h)} spends some time to the right of zero in the interval [j, j+1]. Therefore the probability that Y^{(h)}_{j+1} \ge 0 is bounded from below by a positive constant:
\[
\mathbb{P}\big( N^h = j+1 \,\big|\, N^h > j,\ c^{(h)}_j > 0 \big) \ \ge\ \alpha > 0, \qquad \forall\, u \in [0,1),\ h \in (0,1).
\]

A proof is as follows. To show that a lower bound \alpha > 0 exists, it suffices to show that, for some \alpha > 0,
\[
p_{t,h} := \mathbb{P}\big( Y^{(h)}_t \ge 0 \,\big|\, Y^{(h)}_0 \ge 0 \big) \ \ge\ \alpha > 0, \qquad \forall\, t \in [0,1].
\]

However, it is easy to see that
\[
p_{t,h} = \mathbb{P}(S_{th} \ge x_\delta \mid S_0 \ge x_\delta)
\ \ge\ \mathbb{P}\Big( \exp\Big\{ \int_0^{th} \big[ b(S_u) - \tfrac{1}{2}\sigma^2(S_u) \big]\,du + \int_0^{th} \sigma(S_u)\,dW_u \Big\} \ge 1 \Big)
\ \ge\ \mathbb{P}\Big( \int_0^{th} \sigma(S_u)\,dW_u \ \ge\ c_1 th \Big),
\]
where c_1 := \|b\|_\infty + \tfrac{1}{2}\|\sigma^2\|_\infty. We can view the stochastic integral Q_t := \int_0^t \sigma(S_u)\,dW_u as a time-changed Brownian motion. Indeed, there exists a Brownian motion B such that
\[
Q_t = B_{\langle Q \rangle_t}.
\]

Let
\[
\underline{\sigma} := \inf_x \sigma(x), \qquad \overline{\sigma} := \sup_x \sigma(x).
\]
Then
\[
\underline{\sigma}^2 t \ \le\ \langle Q \rangle_t \ \le\ \overline{\sigma}^2 t.
\]

It follows that
\[
p_{t,h} \ \ge\ \mathbb{P}(Q_{th} \ge c_1 th)
\ \ge\ \mathbb{P}\Big( \min_{\underline{\sigma}^2 th \,\le\, s \,\le\, \overline{\sigma}^2 th} B_s \ \ge\ c_1 th \Big)
= \mathbb{P}\Big( \min_{\underline{\sigma}^2 \,\le\, s \,\le\, \overline{\sigma}^2} B_s \ \ge\ c_1 \sqrt{th} \Big),
\]
where the last equality follows since \{ \tfrac{1}{\sqrt{th}} B_{th s} : s \ge 0 \} is still a standard Brownian motion.

For h \in (0,1), we can choose
\[
\alpha = \mathbb{P}\Big( \min_{\underline{\sigma}^2 \,\le\, s \,\le\, \overline{\sigma}^2} B_s \ \ge\ c_1 \Big) > 0,
\]
which serves as a lower bound, because c_1\sqrt{th} \le c_1 when th \le 1.
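The strict positivity of such an α is also easy to check by simulation. The sketch below uses made-up values σ̲ = 1, σ̄ = 2, and c₁ = 0.5 (illustrative assumptions, not quantities computed from the model) and estimates the probability that a standard Brownian motion stays above c₁ on [σ̲², σ̄²].

```python
import numpy as np

# Illustrative, made-up parameters (not quantities computed from the model).
sig_lo, sig_hi, c1 = 1.0, 2.0, 0.5

rng = np.random.default_rng(1)
n_paths, steps = 5000, 800
T = sig_hi ** 2                       # simulate B on [0, sig_hi^2]
dt = T / steps
t = dt * np.arange(1, steps + 1)

# Brownian paths as cumulative sums of independent Gaussian increments.
B = np.sqrt(dt) * np.cumsum(rng.standard_normal((n_paths, steps)), axis=1)

window = t >= sig_lo ** 2             # restrict to s in [sig_lo^2, sig_hi^2]
alpha_hat = np.mean(B[:, window].min(axis=1) >= c1)
print(f"estimated alpha = {alpha_hat:.3f}")   # strictly positive
```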


Now define
\[
M_{h,u} := \sum_{j=0}^{N^h - 1} 1_{\{c^{(h)}_j > 0\}},
\]
which clearly dominates X_{h,u}. By the strong Markov property,
\[
\mathbb{P}(M_{h,u} > j+1 \mid M_{h,u} > j) \ \le\ 1 - \alpha,
\]
and thus
\[
\mathbb{P}(M_{h,u} \ge j) \ \le\ (1-\alpha)^{j-1}.
\]
This implies that
\[
\mathbb{E}(M_{h,u}^2) \ \le\ \sum_{j=1}^{\infty} 2j\,\mathbb{P}(M_{h,u} \ge j) \ \le\ \sum_{j=1}^{\infty} 2j\,(1-\alpha)^{j-1} < \infty,
\]
which implies the uniform integrability of \{Z_{h,u} : u \in [0,1),\ h \in (0,1)\}. In particular, F(h;u) is uniformly bounded for u \in [0,1) and h \in (0,1).

For the uniform convergence, it suffices to show that, for any u \in [0,1) and any sequence u_h \in [0,1) converging to u,
\[
F(h; u_h) = \mathbb{E} Z_{h,u_h} \ \to\ [-r\phi + \mathcal{L}\phi](x_*)\,H(u).
\]
Let Y^{(h)} be the process with Y^{(h)}_{u_h} = 0. As h \to 0, Y^{(h)} converges weakly to Y, where Y is defined as
\[
Y_t = x_*\,\sigma(x_*)\,(W_t - W_u).
\]
By the Skorohod representation, we can assume Y^{(h)} \to Y with probability one. Using the uniform integrability, it suffices to show that
\[
Z_{h,u_h} \ \to\ Z := [-r\phi + \mathcal{L}\phi](x_*) \int_u^N 1_{\{Y_t \ge 0\}}\,dt
\]
with probability one. Note that N is almost surely finite, and that
\[
N^h \to N
\]
with probability one. The almost sure convergence of Z_{h,u_h} to Z then follows from the dominated convergence theorem, which completes the proof. □

Returning to the proof of Proposition 3.1, we claim that
\[
\lim_{h \to 0} \mathbb{E}_x\Big[ e^{-r\tau_\delta}\, F\Big(h;\ \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor \Big) \Big]
= C\,[-r\phi + \mathcal{L}\phi](x_*)\,\mathbb{E}_x\big[ e^{-r\tau_*} \big], \tag{3.6}
\]
where C := \int_0^1 H(u)\,du. To ease notation, let
\[
U_h := \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor.
\]


It suffices to show that
\[
\lim_{h \to 0} \mathbb{E}_x\big[ e^{-r\tau_\delta} F(h;U_h) \big] = C\,[-r\phi + \mathcal{L}\phi](x_*)\,\mathbb{E}_x\big[ e^{-r\tau_*} \big].
\]
We can write
\[
\mathbb{E}_x\big[ e^{-r\tau_\delta} F(h;U_h) \big]
= \Big\{ \mathbb{E}_x\big[ e^{-r\tau_\delta} F(h;U_h) \big] - [-r\phi + \mathcal{L}\phi](x_*)\,\mathbb{E}_x\big[ e^{-r\tau_\delta} H(U_h) \big] \Big\}
+ [-r\phi + \mathcal{L}\phi](x_*)\,\mathbb{E}_x\big[ e^{-r\tau_\delta} H(U_h) \big].
\]
In Proposition 4.1 in the Appendix we show, roughly speaking, that (\tau_\delta, U_h) converges in distribution to (\tau_*, U) as h and \delta tend to zero, where U is uniformly distributed on [0,1) and independent of \tau_*. This is not strictly true, in that we ignore what happens on the unimportant event \{\tau_* = \infty\}. It follows from Proposition 4.1 that
\[
\mathbb{E}_x\big[ e^{-r\tau_\delta} H(U_h) \big] \ \to\ \mathbb{E}_x\big[ e^{-r\tau_*} \big] \int_0^1 H(u)\,du = C\,\mathbb{E}_x\big[ e^{-r\tau_*} \big].
\]
Therefore, to prove (3.6) we must show that
\[
\Delta := \mathbb{E}_x\big[ e^{-r\tau_\delta} F(h;U_h) \big] - [-r\phi + \mathcal{L}\phi](x_*)\,\mathbb{E}_x\big[ e^{-r\tau_\delta} H(U_h) \big] \ \to\ 0.
\]
Due to the uniform boundedness of F and H, there exists R \in (0,\infty) such that
\[
\big| F(h;u) \big| + \big| [-r\phi + \mathcal{L}\phi](x_*)\,H(u) \big| \ \le\ R \qquad \forall\, u \in [0,1)
\]
when h is small enough. Since U_h \Rightarrow U, for h small enough,
\[
\mathbb{P}(U_h > 1 - \varepsilon) \ \le\ 2\varepsilon.
\]
Also, by Lemma 3.4, for h small enough,
\[
\sup_{u \in [0, 1-\varepsilon]} \big| F(h;u) - [-r\phi + \mathcal{L}\phi](x_*)\,H(u) \big| \ \le\ \varepsilon.
\]
It follows that, for h small enough,
\[
\Delta \ \le\ \varepsilon\,\mathbb{P}(U_h \le 1 - \varepsilon) + R\,\mathbb{P}(U_h > 1 - \varepsilon) \ \le\ (2R + 1)\varepsilon,
\]
which completes the proof of (3.6).

It follows directly from the definitions of V(x) and \tau_* that
\[
\mathbb{E}_x\big[ e^{-r\tau_*} \big] = V(x)/\phi(x_*). \tag{3.7}
\]
Also, the definition of A in (2.2) and the fact that (-rV + \mathcal{L}V)(x_*-) = 0 imply that
\[
(-r\phi + \mathcal{L}\phi)(x_*)
= (-rV + \mathcal{L}V)(x_*-) + \tfrac{1}{2}\,\sigma^2(x_*)\,x_*^2\,\big[ \phi''(x_*) - V''(x_*-) \big]
= \tfrac{1}{2}\,\sigma^2(x_*)\,x_*^2\,\big[ \phi''(x_*) - V''(x_*-) \big]
= \tfrac{1}{2}\,\sigma^2(x_*)\,x_*^2\,A\,\phi(x_*).
\]
The proposition follows by combining the last display with (3.4), (3.6), and (3.7).


3.2 Proof of Proposition 3.2

In this subsection we verify equation (3.3):
\[
\Delta W'_\delta(x_\delta)\,\mathbb{E}_x \int_0^{\tau_\delta^h} e^{-rt}\,dL^S_t(x_\delta) = \big[ K x_* \sigma(x_*)\,A\,c\,h + o(h) \big] V(x).
\]
We recall the notation x_\delta = x_* - \delta, where \delta = c\sqrt{h} + o(\sqrt{h}). It follows from the definition (2.2) of A that
\[
\Delta W'_\delta(x_\delta) = \phi'(x_\delta) - \frac{\phi(x_\delta)}{V(x_\delta)}\,V'(x_\delta)
= \Big( \phi' - \frac{\phi}{V}\,V' \Big)'\Big|_{x_*} \cdot (-\delta) + o(\delta)
= \big[ V''(x_*-) - \phi''(x_*) \big]\,\delta + o(\delta)
= c\,A\,\phi(x_*)\sqrt{h} + o(\sqrt{h}).
\]
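The middle step of this expansion uses only that φ and V agree to first order at x_*, so that the derivative of φ' − (φ/V)V' there reduces to φ''(x_*) − V''(x_*). This can be checked by finite differences with arbitrary stand-in functions (φ(x) = x² and a quadratic V matched to φ's value and slope at x_* = 1; neither function comes from the paper):

```python
# Finite-difference check that (phi' - (phi/V) V')'(x_*) = phi''(x_*) - V''(x_*)
# whenever phi(x_*) = V(x_*) and phi'(x_*) = V'(x_*).  Stand-in functions:
# phi(x) = x^2, and a quadratic V matching phi's value and slope at x_* = 1
# but with V''(1) = 5, so the expected answer is 2 - 5 = -3.

def phi(x):
    return x * x                                   # phi''(1) = 2

def phip(x):
    return 2.0 * x                                 # phi'

def V(x):
    return 1.0 + 2.0 * (x - 1.0) + 2.5 * (x - 1.0) ** 2

def Vp(x):
    return 2.0 + 5.0 * (x - 1.0)                   # V'

def wdiff(x):
    return phip(x) - phi(x) / V(x) * Vp(x)         # phi' - (phi/V) V'

eps = 1e-5
Wprime = (wdiff(1.0 + eps) - wdiff(1.0 - eps)) / (2.0 * eps)  # central difference
print(Wprime)   # close to phi''(1) - V''(1) = -3
```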

As a consequence, the main difficulty in proving (3.3) lies with the term
\[
\mathbb{E}_x \int_0^{\tau_\delta^h} e^{-rt}\,dL^S_t(x_\delta).
\]

As in the previous subsection, we consider the transformation
\[
Y^{(h)}_t := \frac{S_{th} - x_\delta}{\sqrt{h}}.
\]
Then Y^{(h)} satisfies the SDE
\[
dY^{(h)}_t = (\sqrt{h}\,Y^{(h)}_t + x_\delta)\Big[ \sqrt{h}\,b(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt + \sigma(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dW_t \Big].
\]

We have the following lemma, whose proof is immediate from the definition of local time and is thus omitted.

Lemma 3.5. Suppose X is a semimartingale, and Y_t := a X_{bt} + v, where a > 0, b > 0, and v are arbitrary constants. Let L^Y and L^X denote the local times of Y and X, respectively. Then for all t \ge 0,
\[
L^Y_t(ax + v) = a\,L^X_{bt}(x).
\]

It follows from the lemma that
\[
\mathbb{E}_x \int_0^{\tau_\delta^h} e^{-rt}\,dL^S_t(x_\delta) = \sqrt{h}\,\mathbb{E}_x \int_0^{N^h} e^{-rth}\,dL^{Y^{(h)}}_t(0).
\]

For any u \in [0,1), define the process
\[
Y^*_t = x_*\,\sigma(x_*)\,W_t, \qquad Y^*_u = 0.
\]
Also define Q(u) := \mathbb{E} L^{Y^*}_{u,N}(0), where
\[
N := \inf\{ n \in \mathbb{N} : Y^*_n \ge 0 \}.
\]
We have the following result.


Proposition 3.3.
\[
\lim_{h \to 0} \mathbb{E}_x \int_0^{N^h} e^{-rth}\,dL^{Y^{(h)}}_t(0) = \mathbb{E}_x\big[ e^{-r\tau_*} \big] \int_0^1 Q(u)\,du.
\]

Before giving the proof, we show how Proposition 3.2 follows from Proposition 3.3. We have \mathbb{E}_x e^{-r\tau_*} = V(x)/\phi(x_*), and the definitions of Q and M, together with Lemma 3.5, imply that
\[
\int_0^1 Q(u)\,du = x_*\,\sigma(x_*) \int_0^1 M(u)\,du.
\]
When combined with the expansion given above for \Delta W'_\delta(x_\delta), the left hand side of (3.3) is equal to
\[
c\,A\,\phi(x_*)\,\frac{V(x)}{\phi(x_*)}\;x_*\,\sigma(x_*) \int_0^1 M(u)\,du\;h + o(h),
\]
and thus (3.3) follows from Lemma 3.2.

Proof of Proposition 3.3. We consider the test function
\[
f(x) :=
\begin{cases}
0, & x \le 0, \\
x, & 0 \le x \le 1, \\
k, & x \ge 2.
\end{cases}
\]

We require f to be increasing, and smooth except at the point x = 0 (the specific choice of k is not important). It follows from the generalized Itô formula and the integration by parts formula that
\[
d\big[ e^{-rth} f(Y^{(h)}_t) \big]
= -rh\,e^{-rth} f(Y^{(h)}_t)\,dt + e^{-rth}\,D^- f(Y^{(h)}_t)\,dY^{(h)}_t
+ \tfrac{1}{2}\,e^{-rth} f''(Y^{(h)}_t)\,dY^{(h)}_t \cdot dY^{(h)}_t + e^{-rth}\,dL^{Y^{(h)}}_t(0).
\]

Without loss of generality, we let f''(0) = 0. Now we integrate both sides from 0 to N^h and take expected values. The stochastic integral on the right hand side,
\[
\int_0^{N^h} e^{-rth}\,D^- f(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,\sigma(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dW_t,
\]
has expectation 0, since
\[
D^- f(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)
\]
is bounded (note that D^- f(x) = 0 for x \ge 2) and \sigma is bounded by assumption. The first term on the right hand side will contribute

\[
-rh\,\mathbb{E}_x \int_0^{N^h} e^{-rth} f(Y^{(h)}_t)\,dt = -rh\,\mathbb{E}_x \int_{\tau_\delta/h}^{N^h} e^{-rth} f(Y^{(h)}_t)\,dt,
\]
since f(x) = 0 for x \le 0. We recall the definition (3.5) of X_{h,u}. It follows from the strong Markov property that
\[
\mathbb{E}_x \int_{\tau_\delta/h}^{N^h} e^{-rth} f(Y^{(h)}_t)\,dt \ \le\ k\,\mathbb{E}_x \int_{\tau_\delta/h}^{N^h} 1_{\{Y^{(h)}_t \ge 0\}}\,dt = k\,\mathbb{E}_x G(h, U_h),
\]
where
\[
U_h := \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor
\qquad\text{and}\qquad
G(h,u) := \mathbb{E} X_{h,u}.
\]

By the uniform integrability of X_{h,u} for small h and u \in [0,1), \mathbb{E}_x G(h, U_h) is uniformly bounded for small h. Therefore the expectation of the first term on the right hand side goes to zero as h \to 0. The second term on the right hand side contributes
\[
\sqrt{h}\,\mathbb{E}_x \int_0^{N^h} e^{-rth}\,D^- f(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,b(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt.
\]

Note that the integrand is bounded by 1_{\{Y^{(h)}_t \ge 0\}} up to a multiplicative constant. It follows exactly as in the case of the first term that the contribution of the second term goes to zero. The third term on the right hand side contributes
\[
\mathbb{E}_x \int_0^{N^h} \tfrac{1}{2}\,e^{-rth} f''(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)^2\,\sigma^2(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt.
\]

Since f''(x) = 0 for x < 0, this expected value equals
\[
\mathbb{E}_x \int_{\tau_\delta/h}^{N^h} \tfrac{1}{2}\,e^{-rth} f''(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)^2\,\sigma^2(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,dt.
\]

It follows from the strong Markov property that the expectation can also be written as
\[
\mathbb{E}_x\big[ e^{-r\tau_\delta} F(h;U_h) \big],
\]
where
\[
F(h;u) := \mathbb{E} \int_u^{N^h} \tfrac{1}{2}\,e^{-r(t-u)h} f''(Y^{(h)}_t)\,(\sqrt{h}\,Y^{(h)}_t + x_\delta)^2\,\sigma^2(\sqrt{h}\,Y^{(h)}_t + x_\delta)\,1_{\{Y^{(h)}_t \ge 0\}}\,dt
\]
and where Y^{(h)} satisfies the same dynamics with Y^{(h)}_u = 0. Since the integrand is bounded, owing to the fact that f''(x) = 0 for all x \ge 2, it follows from an argument analogous to the one given in the proof of Lemma 3.4 that:

1. F(h;u) is uniformly bounded for small h and all u \in [0,1);
2. \[
J(u) := \lim_{h \to 0} F(h;u) = \frac{1}{2}\,\mathbb{E} \int_u^N f''(Y^*_t)\,x_*^2\,\sigma^2(x_*)\,dt,
\]
and the convergence is uniform on any compact subset of [0,1).

The uniform convergence (on compact sets) of F and Proposition 4.1 in the Appendix imply that the expectation of the third term converges to
\[
\mathbb{E}_x\big[ e^{-r\tau_*} \big] \int_0^1 J(u)\,du.
\]


We omit the details here, since an analogous argument is used in the proof of Proposition 3.1.

It remains to calculate the contribution from the term
\[
\mathbb{E}_x\big[ e^{-r\tau_\delta^h} f(Y^{(h)}_{N^h}) \big] = \mathbb{E}_x\big[ e^{-r\tau_\delta} K(h;U_h) \big],
\]
where
\[
K(h;u) := \mathbb{E}\big[ e^{-r(N^h - u)h}\, f(Y^{(h)}_{N^h}) \big]
\]
with Y^{(h)}_u = 0. However, the boundedness and continuity of f ensure that:
1. K(h;u) is uniformly bounded for all h and all u \in [0,1).
2. \[
I(u) := \lim_{h \to 0} K(h;u) = \mathbb{E}\big[ f(Y^*_N) \mid Y^*_u = 0 \big],
\]
and the convergence is uniform on any compact subset of [0,1).

Indeed, the first claim is trivial. As for the second claim, let u_h \to u. Then as h \to 0, Y^{(h)} \Rightarrow Y^*. By the Skorohod representation theorem we can assume Y^{(h)} \to Y^* with probability one, which also implies that N^h \to N with probability one. Therefore,
\[
Y^{(h)}_{N^h} \to Y^*_N
\]
with probability one. The claim now follows from the dominated convergence theorem. Hence, similarly, we have
\[
\mathbb{E}_x\big[ e^{-r\tau_\delta^h} f(Y^{(h)}_{N^h}) \big] \ \to\ \mathbb{E}_x\big[ e^{-r\tau_*} \big] \int_0^1 I(u)\,du
\]
as h \to 0. It is now sufficient to prove that
\[
I(u) - J(u) = Q(u) \qquad \forall\, u \in [0,1).
\]

This is the same as showing that
\[
\mathbb{E}\Big[ f(Y^*_N) - \frac{1}{2} \int_u^N f''(Y^*_t)\,x_*^2\,\sigma^2(x_*)\,dt - L^{Y^*}_{u,N}(0) \Big] = 0,
\]
where
\[
Y^*_t = x_*\,\sigma(x_*)\,W_t, \qquad Y^*_u = 0.
\]
To prove this, we apply Itô's formula to f(Y^*) to obtain
\[
f(Y^*_N) - f(Y^*_u) = \int_u^N D^- f(Y^*_t)\,x_*\,\sigma(x_*)\,dW_t + \frac{1}{2} \int_u^N f''(Y^*_t)\,x_*^2\,\sigma^2(x_*)\,dt + L^{Y^*}_{u,N}(0).
\]
But f(Y^*_u) = f(0) = 0. Furthermore, the integrand of the stochastic integral is dominated by 1_{\{Y^*_t \ge 0\}} up to a multiplicative constant, which implies that the stochastic integral has expectation 0. This completes the proof. □


4 Appendix: Weak convergence of (τδ, Uh)

For an arbitrary y > 0, define the function
\[
P^y(x,t) := \mathbb{P}\Big( \max_{0 \le u \le t} S_u \ge y \,\Big|\, S_0 = x \Big).
\]

We have the following lemmata.

Lemma 4.1. For every fixed y > 0, the function P^y belongs to C^{1,2}\big((0,y)\times(0,\infty)\big) \cap C\big((0,y)\times[0,\infty)\big) and satisfies the parabolic equation
\[
-\frac{\partial P^y}{\partial t}(x,t) + \mathcal{L} P^y(x,t) = 0, \qquad (x,t) \in (0,y)\times(0,\infty).
\]

Proof: It follows from a standard weak convergence argument that P^y is a continuous function; see, e.g., [13]. Let (x_0, t_0) \in (0,y)\times(0,\infty) and define the region
\[
D := (x_0 - \varepsilon, x_0 + \varepsilon) \times (t_0 - \varepsilon, t_0).
\]
Consider the parabolic equation
\[
-\frac{\partial u}{\partial t}(x,t) + \mathcal{L} u(x,t) = 0, \qquad (x,t) \in D,
\]
with boundary condition u = P^y on its parabolic boundary. It follows from standard PDE theory that there exists a classical solution u [8]. It remains to show that u = P^y in the domain D. Define the stopping time
\[
\tau := \inf\{ t \ge 0 : (t_0 - t, S_t) \notin D \}.
\]
It follows that the process u(S_t, t_0 - t) is a (bounded) martingale. In particular,
\[
u(x_0, t_0) = \mathbb{E}_{x_0} u(S_\tau, t_0 - \tau) = \mathbb{E}_{x_0} P^y(S_\tau, t_0 - \tau) = P^y(x_0, t_0).
\]
Here the last equality follows from the strong Markov property. □

For fixed 0 < x < y, the density of the hitting time \tau_y is defined as
\[
p^y(x,t) := \frac{\partial P^y}{\partial t}(x,t).
\]
According to the preceding lemma, p^y is continuous on the domain (0,y)\times(0,\infty).

Lemma 4.2. Suppose y_n \to y_*. Then P^{y_n}(x,t) \to P^{y_*}(x,t) and p^{y_n}(x,t) \to p^{y_*}(x,t) uniformly on any compact subset of (0, y_*)\times(0,\infty).

Proof: It suffices to show that P^{y_n}(x,t) \to P^{y_*}(x,t) uniformly on any compact subset. The uniform convergence of p^{y_n} then follows from Friedman [8, Section 3.6]. Suppose D := [x_0, x_1]\times[t_0, t_1] \subseteq (0, y_*)\times(0,\infty) is a compact subset. In the following, we will denote


P^{y_n} and P^{y_*} by P_n and P, respectively. Also, without loss of generality, we assume y_n \le y_* for all n, which implies that
\[
P_n(x,t) \ \ge\ P(x,t).
\]
For any \varepsilon > 0, we want to show that, for large enough n,
\[
0 \ \le\ P_n(x,t) - P(x,t) \ \le\ \varepsilon \qquad \forall\,(x,t) \in D.
\]
Define
\[
\tau := \inf\{ t \ge 0 : S_t \ge y_* \}, \qquad \tau_n := \inf\{ t \ge 0 : S_t \ge y_n \}.
\]
Since P is continuous, it is uniformly continuous on the compact set D. It follows that there exists a number h such that
\[
\mathbb{P}\big( t < \tau \le t + h \mid S_0 = x \big) = P(x, t+h) - P(x,t) \ \le\ \frac{\varepsilon}{2} \qquad \forall\,(x,t) \in D,
\]
and thus, for all (x,t) \in D,
\[
P_n(x,t) - P(x,t) = \mathbb{P}_x(\tau_n \le t,\ \tau > t)
= \mathbb{P}_x(\tau_n \le t,\ \tau > t+h) + \mathbb{P}_x(\tau_n \le t,\ t < \tau \le t+h)
\le \mathbb{P}_x(\tau_n \le t,\ \tau > t+h) + \frac{\varepsilon}{2}.
\]

However, it follows from the strong Markov property that
\[
\mathbb{P}(\tau_n \le t,\ \tau > t+h \mid S_0 = x) \ \le\ \mathbb{P}\Big( \max_{0 \le u \le h} S_u \le y_* \,\Big|\, S_0 = y_n \Big) \qquad \forall\,(x,t) \in D.
\]

Note that the right hand side is independent of (x,t) \in D. Define
\[
c_1 := \|b\|_\infty + \tfrac{1}{2}\|\sigma^2\|_\infty.
\]
Then the right hand side is dominated by
\[
\mathbb{P}\Big( \max_{0 \le u \le h} \big[ -c_1 u + Q_u \big] \ \le\ \log\frac{y_*}{y_n} \Big),
\]
where Q_u := \int_0^u \sigma(S_t)\,dW_t. Since \underline{\sigma}^2 u \le \langle Q \rangle_u \le \overline{\sigma}^2 u and Q_u = B_{\langle Q \rangle_u} for some standard Brownian motion B, the probability is in turn dominated by

\[
\mathbb{P}\Big( \max_{0 \le t \le \underline{\sigma}^2 h} \Big[ -\frac{c_1}{\underline{\sigma}^2}\,t + B_t \Big] \ \le\ \log\frac{y_*}{y_n} \Big).
\]
For n big enough, this probability is at most \varepsilon/2, since y_n \to y_*. This completes the proof. □

Proposition 4.1. Suppose f : [0,\infty) \to \mathbb{R} is a bounded continuous function with
\[
\lim_{x \to \infty} f(x) = 0,
\]
and g : [0,1) \to \mathbb{R} is a continuous, bounded function. Then
\[
\lim_{h,\delta \to 0} \mathbb{E}_x\Big[ f(\tau_\delta)\, g\Big( \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor \Big) \Big] = \mathbb{E}_x f(\tau_*) \cdot \int_0^1 g(u)\,du
\]
for all x \in (0, x_*).


Proof: Fix x \in (0, x_*). Let p_\delta and p denote the densities of \tau_\delta and \tau_*, respectively. We can assume that all x_\delta are close to x_* in the sense that x_* - x_\delta \le \delta_0 for some \delta_0, and that x < x_\delta. Since
\[
\lim_{x \to \infty} f(x) = 0,
\]
we have
\[
\mathbb{E}_x\Big[ f(\tau_\delta)\, g\Big( \frac{\tau_\delta}{h} - \Big\lfloor \frac{\tau_\delta}{h} \Big\rfloor \Big) \Big]
= \int_0^\infty f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, p_\delta(s)\,ds.
\]

For any \varepsilon > 0, there exist 0 < a < M < \infty such that
\[
\int_0^a f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, p_\delta(s)\,ds \ \le\ \|f\|_\infty \|g\|_\infty\, \mathbb{P}(\tau_\delta \le a) \ \le\ \|f\|_\infty \|g\|_\infty\, \mathbb{P}(\tau_{\delta_0} \le a) \ \le\ \varepsilon
\]
and
\[
\int_M^\infty f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, p_\delta(s)\,ds \ \le\ \max_{x \ge M} |f(x)| \cdot \|g\|_\infty \ \le\ \varepsilon.
\]

Note that such choices of (a, M) also make the above inequalities hold when p_\delta is replaced by p. Also, since p_\delta \to p uniformly on the compact interval [a, M], we have
\[
\int_a^M f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, |p_\delta(s) - p(s)|\,ds \ \le\ \varepsilon
\]
for \delta small enough. It remains to show that, for h small enough,
\[
\Big| \int_a^M f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, p(s)\,ds - \int_a^M f(s)\, p(s)\,ds \cdot \int_0^1 g(u)\,du \Big| \ \le\ \varepsilon.
\]

However, since f, p, and f \cdot p are all uniformly continuous on compact intervals, we have
\[
\int_a^M f(s)\, g\Big( \frac{s}{h} - \Big\lfloor \frac{s}{h} \Big\rfloor \Big)\, p(s)\,ds
= \sum_{n=\lfloor a/h \rfloor}^{\lfloor M/h \rfloor} \int_{nh}^{(n+1)h} f(nh)\, g\Big( \frac{s}{h} - n \Big)\, p(nh)\,ds + o(1)
\]
\[
= \int_0^1 g(u)\,du \sum_{n=\lfloor a/h \rfloor}^{\lfloor M/h \rfloor} f(nh)\, p(nh)\, h + o(1)
= \int_0^1 g(u)\,du \int_a^M f(s)\, p(s)\,ds + o(1).
\]
This completes the proof. □
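The averaging effect behind this Riemann-sum computation is easy to see numerically. The sketch below is purely illustrative; f, p, g, and the interval [a, M] are arbitrary stand-ins (not objects from the paper). It compares the oscillatory integral with the product of the two limiting integrals as h shrinks.

```python
import numpy as np

# Illustrative stand-ins (not objects from the paper):
f = lambda s: np.exp(-s)                     # bounded, vanishing at infinity
p = lambda s: np.exp(-s)                     # stand-in for the density factor
g = lambda u: np.sin(2.0 * np.pi * u) ** 2   # mean over [0, 1) equals 1/2

a, M = 0.5, 5.0
n_quad = 500_000
dx = (M - a) / n_quad
s = a + (np.arange(n_quad) + 0.5) * dx       # midpoint quadrature nodes

def osc_integral(h):
    """Midpoint-rule value of int_a^M f(s) g(s/h - floor(s/h)) p(s) ds."""
    frac = s / h - np.floor(s / h)
    return np.sum(f(s) * g(frac) * p(s)) * dx

# Limit predicted by the proof: int_a^M f p ds times int_0^1 g(u) du = 1/2.
target = np.sum(f(s) * p(s)) * dx * 0.5

for h in [0.5, 0.1, 0.02]:
    print(f"h = {h:5.2f}   |error| = {abs(osc_integral(h) - target):.2e}")
```

As h decreases the fast oscillation of g(s/h − ⌊s/h⌋) averages against the slowly varying weight f·p, and the error shrinks, exactly as in the display above.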

References

[1] BATHER, J.A. & CHERNOFF, H. (1966) Sequential decisions in the control of a spaceship. Proc. Fifth Berkeley Symposium on Mathematical Statistics and Probability 3, 181-207.

[2] BENSOUSSAN, A. (1984) On the theory of option pricing. Acta Appl. Math. 2, 139-158.

[3] BRAUN, M. (1993) Differential Equations and Their Applications. Springer-Verlag, New York.

[4] BRENNAN, M.J. & SCHWARTZ, E.S. (1985) Evaluating natural resource investments. Journal of Business 58, 135-157.

[5] BROADIE, M. & GLASSERMAN, P. (1997) Pricing American-style securities using simulation. J. Econ. Dynam. Control 21, 1323-1352.

[6] DIXIT, A.K. & PINDYCK, R.S. (1994) Investment Under Uncertainty. Princeton University Press.

[7] EL KAROUI, N., JEANBLANC-PICQUE, M. & SHREVE, S.E. (1998) Robustness of the Black and Scholes formula. Math. Finance 8, 93-126.

[8] FRIEDMAN, A. (1964) Partial Differential Equations of Parabolic Type. Prentice-Hall, Inc., Englewood Cliffs, NJ.

[9] JACKA, S.D. (1991) Optimal stopping and the American put. Math. Finance 1, 1-14.

[10] KARATZAS, I. (1988) On the pricing of American options. Appl. Math. Optim. 17, 37-60.

[11] KARATZAS, I. & SHREVE, S.E. (1998) Methods of Mathematical Finance. Springer, Berlin Heidelberg New York.

[12] KUNITA, H. (1990) Stochastic Flows and Stochastic Differential Equations. Cambridge University Press, Cambridge, UK.

[13] KUSHNER, H. (1984) Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic System Theory. MIT Press, Cambridge, MA.

[14] LAMBERTON, D. (1998) Error estimates for the binomial approximation of American put options. Ann. Appl. Probab. 8, 206-233.

[15] MCDONALD, R. & SIEGEL, D. (1986) The value of waiting to invest. Quarterly J. Econ. 101, 707-728.

[16] MERTON, R.C. (1990) Continuous-Time Finance. Blackwell, Cambridge, MA and Oxford.

[17] MUSIELA, M. & RUTKOWSKI, M. (1997) Martingale Methods in Financial Modelling. Springer, Berlin Heidelberg New York.

[18] MYNENI, R. (1992) The pricing of the American option. Ann. Appl. Probab. 2, 1-23.

[19] PROTTER, P. (1995) Stochastic Integration and Differential Equations. Springer-Verlag, Berlin.

[20] ROYDEN, H.L. (1968) Real Analysis. Collier MacMillan, London.

[21] SHIRYAYEV, A.N. (1978) Optimal Stopping Rules. Springer-Verlag, New York.