Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
On the Convergence from Discrete to Continuous Time in an
Optimal Stopping Problem∗
Paul Dupuis and Hui Wang
Lefschetz Center for Dynamical SystemsBrown University
Providence, R.I. 02912USA
Abstract
We consider the problem of optimal stopping for a one dimensional diffusion process.Two classes of admissible stopping times are considered. The Þrst class consists of allnon-anticipating stopping times that take values in [0,∞], while the second class furtherrestricts the set of allowed values to the discrete grid nh : n = 0, 1, 2, · · · ,∞ for someparameter h > 0. The value functions for the two problems are denoted by V (x) andV h(x), respectively. We identify the rate of convergence of V h(x) to V (x) and therate of convergence of the stopping regions, and provide simple formulas for the ratecoefficients.
Keywords. Optimal stopping, continuous time, discrete time, diffusion process, rate ofconvergence, local time.
1 Introduction
One of the classical formulations of stochastic optimal control is that of optimal stopping.In optimal stopping, the only decision to be made is when to stop the process. When theprocess is stopped, a beneÞt is received (or a cost is paid), and the objective is to maximizethe expected beneÞt (or minimize the expected cost). Although the problem formulationis very simple, this optimization problem has many practical applications. Examples in-clude the pricing problems in investment theory, the valuation of American options, thedevelopment of natural resources, etc.; see, e.g., [1, 2, 4, 5, 6, 9, 10, 15, 16, 17, 18].
The formulation of the optimal stopping problem requires the speciÞcation of the classof allowed stopping times. Typically, one assumes these to be non-anticipative in an appro-priate sense, so that the control does not have knowledge of the future. Another importantrestriction is with regard to the actual time values at which one can stop, and here thereare two important cases: continuous time and discrete time. In the Þrst case, the stoppingtime is allowed to take values in the interval [0,∞], with ∞ corresponding to the decisionto never stop. In the second case there is a Þxed discrete set of times D ⊂ [0,∞], and the
∗Research supported in part by the National Science Foundation (NSF-DMS-0072004) and the ArmyResearch Office (ARO-DAAD19-99-1-0223).
stopping time must be selected from this set. Typically, this discrete set is a regular grid,e.g., Dh
.= nh : n ∈ IN0, where h > 0 is the grid spacing.
In the present paper we focus exclusively on the one dimensional case. Although thestatement of precise assumptions is deferred to Section 2.1, a rough description of thecontinuous and discrete time problems we consider is as follows.
Continuous time optimal stopping. We use the stochastic process model
dStSt
= b(St) dt+ σ(St) dWt
where b and σ are bounded continuous functions from IR to IR. Although the results canbe extended to cover other diffusion models as well, we focus on this model because of itswide use in optimal stopping problems that occur in economics and Þnance. We consider apayoff deÞned in terms of a convex nondecreasing function φ : IR→ [0,∞). The payoff fromstopping at time t is φ(St), and the decision maker wants to maximize the expected presentvalue by judiciously choosing a stopping time. This is modeled by the optimal stoppingproblem with value function
V (x) = supτ∈S
IE£e−rτφ(Sτ )
¯S0 = x
¤,
where r is the discount rate and S is the set of all admissible stopping times, which areallowed to take values in [0,∞]. The dynamic programming equation for this problem is asfollows. Let
LV (x) = 1
2σ2(x)x2V 00(x) + b(x)xV 0(x).
Then
max [φ(x)− V (x),LV (x)− rV (x)] = 0. (1.1)
In the case where φ is convex and nondecreasing, it is often optimal to stop when the processSt Þrst exceeds some Þxed threshold x∗. In this case, the value function V (x) equals φ(x) forx ≥ x∗, and it satisÞes the ordinary differential equation −rV (x) + LV (x) = 0 for x < x∗.For the case where σ and b are constants, V (x) takes the form Axβ for x < x∗. Here β is thepositive root of some quadratic equation, and (A, x∗) are constants that can be computedexplicitly using the principle of smooth Þt, i.e., the value function is C1 across the optimalexercise boundary x∗.
Discrete time optimal stopping. In this case the process model is the same as before,but the set of possible stopping times is restricted to those that take values in the time gridDh
.= nh, n ∈ IN0. The optimal strategy is often similar to the continuous time case: stop
the Þrst time Snh exceeds some Þxed threshold xh∗ . Let V h(x) denote the value function.
The pair¡V h(x), xh∗
¢satisfy the dynamic programming equation [21]
V h(x) =
½φ(x) , x ∈ [xh∗ ,∞)
e−rhIE£V h(Sh)|S0 = x
¤, x ∈ (0, xh∗).
2
Closed-form solutions to this dynamic programming equation are not usually available.
The aim of the present paper is to examine the connection between these two optimalstopping problems as h→ 0. There are two questions of main interest:
What is the convergence rate of the optimal exercise boundary xh∗ to x∗, and what isthe rate coefficient?
What is the convergence rate of the value function V h(x) to V (x), and what is therate coefficient?
As we will see in Section 2, the optimal exercise boundaries converge with rate√h,
while the value functions converge with rate h. In both cases there is a well deÞned ratecoefficient. The coefficient in the case of the exercise boundary is deÞned in terms ofthe expected value of a functional of local time of Brownian motion, while the coefficientfor the value function involves both local time and excursions of Brownian motion. Forproblems where the continuous time problem can be more or less solved explicitly (e.g.,the one dimensional problems considered in the present work), these results allow one toexplicitly compute accurate approximations for the discrete time problem. For problemswhere the continuous time problem does not have an explicit solution (e.g., multidimensionalproblems), the analogous information could possibly be used to improve the quality ofapproximation obtained using numerical approximations.
Few existing results are concerned with the rate of convergence of approximations forthis class of problems. Lamberton [14] considers the binomial tree approximation for pricingAmerican options and obtains upper and lower bounds (though not a rate of convergence) forthe value function. In his approximation both the time and state variables are discretized.References to a few papers giving qualitatively similar results also appear in [14].
The outline of the paper is as follows. In Section 2 we introduce notation and deÞne thebasic optimization problems. We state the main result, give an illustrative example, andthen lay out the main steps in the proof of the approximation theorem. The proofs of twokey approximations which are intimately connected with the local time and excursions ofBrownian motion are given in Section 3. The paper concludes with an Appendix in whicha result on a conditional distribution of the exit time is proved.
2 The Approximation Theorem
2.1 Notation, assumptions and background
Consider a probability space (Ω,F , IP; IF) with Þltration IF = (Ft) satisfying the usualconditions: right-continuity and completion by IP-negligible sets. The state process S =(St,Ft) is modeled by
dStSt
= b(St) dt+ σ(St) dWt, S0 ≡ x. (2.1)
Here W = (Wt,Ft) is a standard IF-Brownian motion.
3
DeÞne value function
V (x) = supτ∈S
IE£e−rτφ(Sτ )
¯S0 = x
¤,
where the supremum is over all stopping times with respect to the Þltration IF. DeÞne
V h(x) = supτ∈Sh
IE£e−rτφ(Sτ )
¯S0 = x
¤,
where Sh is the set of all stopping times that take values in Dh.The following assumptions will be used throughout the paper.
Condition 2.1. 1. The coefficients b : IR→ IR and σ : IR→ IR are bounded and contin-uous, with infx∈IR σ(x) > 0. Furthermore xb(x) and xσ(x) are Lipschitz continuous.
2. φ : IR → [0,∞) is non-decreasing, and both φ and its derivative φ0 are of polynomialgrowth. Furthermore
supt≥0
e−rtφ(St) ∈ IL1, limt→∞ e
−rtφ(St) = 0, a.s.
3. The continuation region for the continuous-time optimal stopping problem takes theform x : V (x) > φ(x) = (0, x∗).
4. The continuation region for the discrete-time optimal stopping problem takes theform x : V h(x) > φ(x) = (0, xh∗).
5. The payoff function φ is twice continuously differentiable in a neighborhood of x∗.
6. The smooth-Þt-principle holds, that is, the value function V is C1 across the optimalexercise boundary x∗.
As noted in the Introduction, V satisÞes the dynamic programming equation
max [φ(x)− V (x),LV (x)− rV (x)] = 0.Note that usually V is only once continuously differentiable across the optimal exerciseboundary x = x∗. Since φ(x) = V (x) if x ∈ [x∗,∞) and φ(x) < V (x) if x ∈ (0, x∗), itfollows that V 00(x∗−) ≥ φ00(x∗), where the − denotes limit from the left. DeÞne
A.=V 00(x∗−)− φ00(x∗)
φ(x∗)≥ 0. (2.2)
Although one can construct examples where A = 0, it is typically the case that A > 0.We will assume this condition below, and merely note that the rate of convergence of theoptimal threshold does not depend on A at all.
Remark 2.1. The change of variable t = − log x can be used to transform the ordinarydifferential equation (ODE) Lf(x)− rf(x) = 0 on (0,∞) into the ODE
1
2σ(e−t)W 00(t) +
∙1
2σ(e−t)− b(e−t)
¸W 0(t)− rW (t) = 0
4
on IR. Since σ(x) > 0 for x > 0, the classical theory for solutions of ODEs [3] can beused to show that the general solution to LV (x) − rV (x) = 0 can be written in the formc1f1(x) + c2f2(x), where f1(x) is positive and bounded as x ↓ 0 and f2(x) is unbounded asx ↓ 0. Under Condition 2.1, the function f1 is twice continuously differentiable on (0,∞).V (x) is then equal to c1f1(x) for x ∈ (0, x∗] and equal to φ(x) for x ∈ [x∗,∞), where c1 andx∗ are determined by the principle of smooth Þt, i.e.,
c1f1(x∗) = φ(x∗) and c1f 01(x∗) = φ0(x∗).
Remark 2.2. In the case that S is a geometric Brownian motion with b(x) ≡ b and σ(x) ≡σ, and φ(x) = (x − k)+ for some constant k, then Condition 2.1 holds when r > b. Forr ≤ b, the value function for the optimal stopping problem is +∞, and there is no optimalstopping time; see [6].
Remark 2.3. It is usually not a prior clear if parts 3 and 4 of Condition 2.1 hold for ageneral state process. Here we give a sufficient condition that is very easy to verify in thecase φ(x) = (x− k)+. Suppose parts 1 and 2 of Condition 2.1 hold, and in addition that
r ≥ supx∈(0,∞)
©b(x) + xb0(x)
ª.
We claim that parts 3 and 4 of Condition 2.1 hold. We will show that part 3 holds andomit the analogous proof for 4. DeÞne VT (x)
.= supτ≤T IEx [e−rτφ(Sτ )]. Let Sx stand for
the state process starting from S0 ≡ x. A small modiÞcation of the proof of Theorem 5.2[7] shows that the collection of random variables
sup0≤t≤T
¯Sx(t)− Sy(t)
x− y¯, y ∈ (x− δ, x+ δ)
is uniformly integrable for Þxed x, small δ > 0 and terminal time T . It follows immediatelythat VT (x) is a continuous function, since for any stopping τ ≤ T , we have
IE¯e−rτφ
¡Sx(τ)
¢− e−rτφ¡Sy(τ)¢¯ ≤ IE sup0≤t≤T
|Sx(t)− Sy(t)| ≤ c|x− y|
for some constant c. Furthermore, the upper left Dini derivate of VT is always bounded byone, i.e.,
lim supy↑x
VT (x)− VT (y)x− y ≤ 1.
To see this, let τ∗ be the optimal stopping time when S0 ≡ x. The existence of τ∗ isguaranteed by the classical theory of Snell envelop; see [11]. We have
VT (y) ≥ IE£e−rτ∗φ(Syτ∗)
¤,
which implies that
VT (x)− VT (y)x− y ≤ IE
"e−rτ∗φ(Sxτ∗)− e−rτ
∗φ(Syτ∗)
x− y
#.
5
However, if x ≥ y, then Sx(t) ≥ Sy(t) for all t by strong uniqueness. Since φ is non-decreasing and
φ(x)− φ(y) = (x− k)+ − (y − k)+ ≤ x− y ∀ x ≥ y,
we haveVT (x)− VT (y)
x− y ≤ IE"e−rτ∗
¡Sxτ∗ − Syτ∗
¢x− y
#.
Using the uniform integrability, it follows that
lim supy↑x
VT (x)− VT (y)x− y ≤ IE
"limy↑x
e−rτ∗¡Sxτ∗ − Syτ∗
¢x− y
#= IE
he−rτ
∗Dx(τ∗)
i,
where Dx(t).= ∂
∂xSx(t) satisÞes the SDE
dDx(t)
Dx(t)=£b(Sxt ) + S
xt b0(Sxt )
¤dt+
£σ(Sxt ) + S
xt σ
0(Sxt )¤dWt, Dx(0) = 1.
See, e.g., [12, 19]. Since by assumption r ≥ supx∈(0,∞) [b(x) + xb0(x)], it is easy to checkthat
lim supy↑x
VT (x)− VT (y)x− y ≤ IE
he−rτ
∗Dx(τ∗)
i≤ 1.
It follows from a standard result in real analysis (see [20, Proposition 5.1.2]) that
VT (x)− VT (y) ≤ x− y ∀ x ≥ y.
This implies thatx : VT (x) = φ(x) = [x∗T ,∞)
for some real number x∗T . Indeed, if VT (y) = φ(y) = (y − k)+, then since VT > 0 we musthave y > k. It follows that for all x ≥ y,
VT (x) ≤ VT (y) + (x− y) = (y − k) + (x− y) = x− k = φ(x).
But VT ≥ φ trivially, whence VT (x) = φ(x) for all x ≥ y.It remains to show that part 3 of Condition 2.1 holds. It suffices to observe that for all
xVT (x) ↑ V (x)
as T →∞. This completes the proof.Remark 2.4. If S is a geometric Brownian motion with b(x) ≡ b, and σ(x) ≡ σ, andφ(x) = (
PiAix
αi − k)+ for some positive constants (Ai,αi) and k ≥ 0, then one can showthat V (x) − φ(x) is decreasing, which in turn implies that parts 3 and 4 of Condition 2.1hold. A similar argument can be found in [9].
6
2.2 Rates of convergence: value function and stopping region
Let u ∈ [0, 1), letW be a Brownian motion withWu = 0, and deÞneN.= inf n ∈ IN :Wn ≥ 0.
Note that N <∞ w.p.1. DeÞne
H(u).= IEW 2
N and M(u).= IEWN . (2.3)
In terms of these functions we deÞne the constants
C =
Z 1
0H(u) du and K =
Z 1
0M (u) du. (2.4)
These quantities are shown to be Þnite in Lemma 3.2. Note that H(u) > M2(u), andtherefore
C =
Z 1
0H(u) du >
Z 1
0M2(u) du ≥
µZ 1
0M(u) du
¶2= K2.
Our main result is the following.
Theorem 2.1. Assume Condition 2.1, and deÞne the constants A,C, and K by (2.2) and(2.4). Assume that A > 0. The following conclusions hold for all x ∈ (0, x∗).1.
V h(x)− V (x)V (x)
= −12Ax2∗σ
2(x∗)(C −K2)h+ o(h)
2.
xh∗ = x∗ −Kx∗σ(x∗)√h+ o(
√h).
Remark 2.5. Using an elementary argument by contradiction, one can show that theasymptotic expansion in the preceding theorem holds uniformly in any compact subset of(0, x∗).
Example: Consider the special case where b(x) ≡ b and σ(x) ≡ σ. Assume r > b andφ(x) = (x−k)+ for some constant k > 0. It follows that the value function for the continuoustime optimal stopping problem is
V (x) =
½Bxα ; x < x∗x− k ; x ≥ x∗
where
α =
µ1
2− b
σ2
¶+
sµ1
2− b
σ2
¶2+2r
σ2B =
x∗ − kxα∗
,
andx∗ =
α
α− 1k.It follows that
xh∗ = x∗(1−Kσ√h) + o(
√h) =
αk
α− 1(1−Kσ√h) + o(
√h)
7
and
A =V 00(x∗−)− φ00(x∗)
x∗ − k =Bα(α− 1)(x∗)α−2
x∗ − k ,
which implies that
V h(x)− V (x)V (x)
= −12Ax2∗σ
2(C −K2)h+ o(h) = −12α(α− 1)(C −K2)σ2h+ o(h).
2.3 Overview of the proof
In this subsection we outline and prove the main steps in the proof of Theorem 2.1. Theproofs of two key asymptotic expansions are deferred to the next section.
For the simplicity of future analysis, we Þrst introduce a bounded modiÞcation of thepayoff function φ. This modiÞcation will not affect the asymptotics at all; see Proposition2.1.
Let φ ≤ φ be an increasing function satisfying
φ(x) =
½φ(x), if x ≤ x∗ + ak , if x ≥ x∗ + 2a. (2.5)
Here a and k are two positive constants, whose speciÞc values are not important. Withoutloss of generality, we assume that φ is twice continuously differentiable in the region [x∗,∞).Suppose h and δ are two positive constants, and let xδ
.= x∗− δ. We consider the quantities
Wδ(x).= IEx
£e−rτδ φ
¡Sτδ¢¤
and Wδ(x) = IEx£e−rτδφ
¡Sτδ¢¤,
whereτδ.= inf t ≥ 0 : St ≥ xδ ,
and IEx denotes expectation conditioned on S0 = x. Note that Wδ(x) = Wδ(x) for allx ≤ xδ. We also deÞne
Whδ (x)
.= IEx
he−rτ
hδ φ¡Sτhδ
¢iand W h
δ (x).= IEx
he−rτ
hδ φ¡Sτhδ
¢i,
whereτhδ
.= inf nh ≥ 0 : Snh ≥ xδ .
Main idea of the proof. The main idea for proving the rates of convergence is as follows.Write
Whδ (x)− V (x) =
hW hδ (x)− W h
δ (x)i+hW hδ (x)− Wδ(x)
i+£Wδ(x)− V (x)
¤.
For each term, we will obtain approximations as h and δ tend to zero. It turns out theleading term has the following form:
−12a1δ
2 + a2δ√h+ a3h+ higher order term.
8
Here a1, a2, and a3 are constants (some of which depend on x), with a1 > 0. Since the func-tion Wh
δ (x) attains its maximum at δ∗ = x∗ − xh∗ , one would expect that δ∗ approximatelymaximizes the leading term, or
δ∗ =a2a1
√h+ o(
√h). (2.6)
Furthermore, substituting this back in one would expect
V h(x) =Whδ∗(x) =
µ−a
22
a1+ a3
¶h+ o(h).
This is in fact how the argument will proceed. We begin with the estimation of the Þrstterm, which turns out to be negligible for small h and δ. DeÞne the quantity
4δ,h.=W h
δ (x)− W hδ (x) = IE
xhe−rτ
hδ
³φ(Sτhδ
)− φ(Sτhδ )´i. (2.7)
We have the following result.
Proposition 2.1. DeÞne 4δ,h by (2.7). There exist constants L <∞ and ε > 0 such that
|4δ,h| ≤ Le− εh
for all sufficiently small δ and h.
Proof. The proof of the proposition is based on the following bound. Let a be as in thecharacterization (2.5) of φ. Then for any x ≤ x∗ and y ≥ x∗ + a,
IP¡Sh > y
¯S0 = x
¢ ≤ exp(− ∙log yx∗− c1h
¸2,c2h
), (2.8)
where the Þnite constants c1, c2 depend only on the coefficients b,σ. To prove this boundwe use the expression
Sh = S0 exp
½Z h
0
∙b(St)− 1
2σ2(St)
¸dt+
Z h
0σ(St) dWt
¾.
DeÞne c1.= kbk∞ + 1
2kσ2k∞. Since x ≤ x∗
p.= IPx (Sh > y) ≤ IPx
³eR h0 σ(St) dWt ≥ elog y
x∗−c1h´.
However, if B.= log y
x∗ − c1h and c2.= 2kσ2k∞, then for θ > 0
p ≤ IPx³eθR h0 σ(St) dWt ≥ eθB
´≤ IPx
³eθR h0 σ(St) dWt− 1
2θ2R h0 σ(St)
2 dt ≥ eθB− 14θ2c2h
´≤ e−θB+
14θ2c2h.
9
The last inequality follows from Chebychevs inequality. Minimizing the right hand sideover θ completes the proof of (2.8).
We now complete the proof of the proposition. To ease the exposition, we use τ in lieuof τhδ throughout the proof. We have
4δ,h =
∞Xn=1
e−rnhIEx£¡φ(Snh)− φ(Snh)
¢1τ=n
¤≤
∞Xn=1
e−rnhZ ∞
a+x∗
¯φ0(y)− φ0(y)¯ IPx(Snh > y, τ = n) dy.
We also have, for any y > x∗ + a, that
IPx(Snh > y, τ = n)
= IPx(Snh > y | τ = n)IPx(τ = n)= IPx(Snh > y |Snh ≥ xδ, S(n−1)h < xδ, · · · , S0 < xδ)IPx(τ = n).
DeÞne the stopping timeσ.= inft ≥ (n− 1)h : St ≥ xδ.
Then
IPx(Snh > y | τ = n)= IPx(Snh > y |σ ≤ nh, Snh ≥ xδ, S(n−1)h < xδ, · · · , S0 < xδ)
=
Z h
0IPx(Snh > y | σ = nh− t, Snh ≥ xδ, S(n−1)h < xδ, · · · , S0 < xδ)
· IPx(σ ∈ nh− dt |σ ≤ nh, Snh ≥ xδ, S(n−1)h < xδ, · · · , S0 < xδ).However, the strong Markov property implies for all t ∈ [0, h] that
IPx(Snh > y |σ = nh− t, Snh ≥ xδ, S(n−1)h < xδ, · · · , S0 < xδ)= IP(St > y |S0 = xδ, St ≥ xδ)=IP(St > y |S0 = xδ)IP(St ≥ xδ |S0 = xδ) .
The denominator in this display is uniformly bounded from below away from zero for t ∈[0, 1]:
IP(St ≥ xδ |S0 = xδ) ≥ α > 0, ∀ t ∈ [0, 1];
see Lemma 3.4 for a proof. Using (2.8), for all small h > 0 and t ∈ (0, h)
IP(St > y |S0 = xδ) ≤ exp(−∙log
y
x∗− c1t
¸2,c2t
)≤ exp
(−∙log
y
x∗− c1h
¸2,c2h
).
Now since φ0 is of polynomial growth and φ0(x) is zero for large x, it follows that thereare Þnite constants R and m such that¯
φ0(y)− φ0(y)¯ ≤ Rym−1 for all y > x∗ + a.10
Hence, for all small δ > 0, the change of variable x = log yx∗ − c1h gives
4δ,h ≤ R
α
∞Xn=1
e−rnhIPx(τ = n)Z ∞
a+x∗ym−1 exp
(−∙log
y
x∗− c1h
¸2,c2h
)dy
=R
α(x∗)memc1h
∞Xn=1
e−rnhIPx(τ = n)Z ∞
log³1+ a
x∗´−c1h
emx− x2
c2h dx.
For h small enough, there exists positive numbers a, C, c such that
4δ,h ≤ C
∞Xn=1
e−rnhIPx(τ = n)Z ∞
log³1+ a
x∗´ e−x2
ch dx
= C√2πc · Φ
µ− log
µ1 +
a
x∗
¶Á√ch
¶·√h
∞Xn=1
e−rnhIPx(τ = n)
≤ C√2πc · Φ
µ− log
µ1 +
a
x∗
¶Á√ch
¶·√h.
Here Φ is the cumulative distribution function for the standard normal distribution. Wecomplete the proof of the proposition by using the asymptotic relation
Φ(−x) ∼ 1√2πx
e−x2
2
as x→∞. 2
The bound just proved shows that£Whδ (x)− Wh
δ (x)¤is exponentially small as h →
0, uniformly for all small δ > 0. We now consider the terms£Whδ (x)− Wδ(x)
¤and£
Wδ(x)− V (x)¤. When considering the asymptotic behavior of these terms, it is often
convenient to scale δ with h as h→ 0 in the manner suggested by (2.6). For the remainderof this proof, unless explicitly stated otherwise, we will assume that
δ = c√h+ o(
√h) as h→ 0 (2.9)
for a non-negative parameter c. With an abuse of notation, the quantities Whδ (x) and Wδ(x)
will be denoted by W hc (x) and Wc(x) when the relation (2.9) holds.
We next estimate£Wc(x)− V (x)
¤as h→ 0.
Proposition 2.2. Assume Condition 2.1 and deÞne A by (2.2). Assume also that A > 0.Then
Wc(x)− V (x) =∙−12Ac2h+ o(h)
¸V (x).
Proof. Recall that V (x) can be characterized, for x ≤ x∗, as a multiple of the bounded(in a neighborhood of zero) solution f1 to Lf(x) − rf(x) = 0; see Remark 2.1. Wc(x)can be likewise characterized, with the constant determined by the boundary conditionWc(xδ) = φ(xδ). Thus
Wc(x) =φ(xδ)
V (xδ)V (x) for all x ∈ (0, xδ].
11
We now expand for small δ ≥ 0, and use xδ .= x∗ − δ, V (x∗) = φ(x∗), V 0(x∗) = φ0(x∗), andthe deÞnition of A to obtain
Wc(x)− V (x)V (x)
=
µφ
V
¶0 ¯¯x∗
· (−δ) + 1
2
µφ
V
¶00 ¯¯x∗
· (−δ)2 + o(δ2) = −12Aδ2 + o(δ2).
The proof is completed by using (2.9). 2
In the next proposition we state the expansion for£W hδ (x)− Wδ(x)
¤. This estimate
deals with the critical comparison between the discrete and continuous time problems. Theproof of this expansion is detailed, and therefore deferred to the next section.
Proposition 2.3. Assume Condition 2.1 and deÞne A, C, and K by (2.2) and (2.4).Assume also that A > 0. Then
W hc (x)− Wc(x) =
∙Kx∗σ(x∗)Ach− 1
2ACx2∗σ
2(x∗)h+ o(h)¸V (x).
Proof of Theorem 2.1. Recall that xh∗ is the optimal boundary for the stopping problemwith value function V h. On the stopping region, we always have V h(x) = φ(x). Also, sinceV h(x) is deÞned by supremizing over a subset of the stopping times allowed in the deÞnitionof V (x), it follows that V h(x) ≤ V (x). Since V (x) ≥ φ(x) for all x, it follows that xh∗ ≤ x∗.
According to Propositions 2.1, 2.2 and 2.3, for each Þxed c ∈ [0,∞)Whc (x)− V (x)V (x)
=
∙−12Ac2h+Kx∗σ(x∗)Ach− 1
2ACx2∗σ
2(x∗)h+ o(h)¸.
This suggests the choice c∗.= Kx∗σ(x∗). Inserting this into the last display gives
Whc∗(x)− V (x)V (x)
=
∙1
2A(K2 − C)x2∗σ(x∗)2h+ o(h)
¸,
and since V h(x) ≥W hc∗(x) it follows that
lim infh↓0
V h(x)− V (x)V (x)h
≥ 1
2A(K2 −C)x2∗σ(x∗)2. (2.10)
Now deÞne ch by xh∗ = x∗ − ch√h. Since xh∗ ≤ x∗ we know that ch ∈ [0,∞). By
taking a convergent subsequence, we can assume that ch → c ∈ [0,∞]. Using an elementaryweak convergence argument, one can show that xh∗ → x∗. First assume that c ∈ (0,∞). Ifc 6= Kx∗σ(x∗), then by Propositions 2.1, 2.2 and 2.3 we have
lim suph↓0
V h(x)− V (x)V (x)h
<1
2A(K2 − C)x2∗σ(x∗)2,
which contradicts (2.10). If c =∞, then Propositions 2.1 and 2.3 and an argument analo-gous to the one used in Proposition 2.2 shows that
V h(x)− V (x)V (x)
= −A(ch)2h[1 + o(1)].
Since (ch)2 → +∞, this again contradicts (2.10), and thus c = Kx∗σ(x∗). We extend to theoriginal sequence by the standard argument by contradiction, and Theorem 2.1 follows. 2
12
3 Approximations and Expansions in Terms of Local Timeand the Excursions of a Brownian Motion
In this section we prove Proposition 2.3, which is the expansion Whδ (x) − Wδ(x) for small
h > 0. We will use the fact that Wδ has the representation
Wδ(x) =
½φ(x) , for x ≥ xδ
−ε(x) + e−rhIE £Wδ(Sh)¯S0 = x
¤, for x < xδ.
Here ε(x) is an error term that can be given explicitly in terms of the value function Wδ
and the transition probabilities of Snh, and xδ.= x∗ − δ. From the last display, we have
ε(x) = IExhe−rhWδ(Sh)− Wδ(x)
i, ∀ x < xδ.
It follows from the generalized Ito formula that
e−rhWδ(Sh)− Wδ(x) =
Z h
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt (3.1)
+4W 0δ(xδ)
Z h
0e−rtdLSt (xδ) +
Z h
0e−rtW 0
δ(St)Stσ(St) dWt.
Here LS is the local time for process S, and
4W 0δ(xδ)
.= W 0
δ(xδ+)− W 0δ(xδ−).
Lemma 3.1. For every x ∈ (0, xδ),
ε(x) = IExe−rhWδ(Sh)− Wδ(x)
= IExZ h
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt+ IEx4W 0
δ(xδ)
Z h
0e−rtdLSt (xδ).
Proof. The result follows from (3.1) if the stochastic integral is zero. We recall that Wδ(x)is equal to φ(x) for x ≥ xδ and V (x)φ(xδ)/V (xδ) for x ≤ xδ. Since Wδ(x) is equal to φ(x)for large x and hence constant, W 0
δ(x) = 0 for all large x. Also, W0δ(x) is clearly bounded
in a neighborhood of xδ. For µ > 0, let σµ.= inft : St ≤ µ ∧ h. Then the boundedness of
xW 0δ(x) for x ≥ µ implies
IExe−rσµWδ(Sσµ)− Wδ(x) = IExZ σµ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt
+ IEx4W 0δ(xδ)
Z σµ
0e−rtdLSt (xδ).
The lemma follows by letting µ ↓ 0, and using dominated convergence for all terms but thelast, which uses monotone convergence. 2
13
Let τhδ.= infnh : Snh ≥ xδ. Then the discounting and (3.1) imply the formula
Wδ(x) = −IExτhδ /h−1Xn=0
e−rnhε(Snh) + IExe−rτhδ φ(Sτhδ
)
= − IExZ τhδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt
−4W 0δ(xδ)IE
x
Z τhδ
0e−rtdLSt (xδ) + IE
xe−rτhδ φ(Sτhδ
)
for all x < xδ. We recall the deÞnition
W hδ (x)
.= IExe−rτ
hδ φ(Sτhδ
).
It follows that
Whδ (x)− Wδ(x) = IEx
Z τδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt
+4W 0δ(xδ)IE
x
Z τhδ
0e−rtdLSt (xδ).
The proof of Proposition 2.3 is thereby reduced to proving the following two results.
Proposition 3.1. Assume Condition 2.1 and deÞne A and C by (2.2) and (2.4). Assumealso that A > 0. Then
IExZ τhδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt =
∙−12ACx2∗σ
2(x∗)h+ o(h)¸V (x). (3.2)
Proposition 3.2. Assume Condition 2.1 and deÞne A and K by (2.2) and (2.4). Assumealso that A > 0. Then
4W 0δ(xδ)IE
x
Z τhδ
0e−rtdLSt (xδ) = [Kx∗σ(x∗)Ach+ o(h)]V (x). (3.3)
The proofs of Propositions 3.1 and 3.2 use estimates on the excursions and local timeof Brownian motion, respectively, and are given in the next two subsections. We will needto relate the constants that appear in these approximations to the simple constants deÞnedby (2.3) and (2.4). The following lemma gives this relationship.
Lemma 3.2. Let W be a standard Brownian motion that satisÞes Wu = 0 for some u ∈[0, 1). DeÞne H(u) and M (u) by (2.3). Then
H(u) = IE
Z N
u1Wt≥0 dt and M(u) = IELWu,N(0),
where N.= inf n ∈ IN : Wn ≥ 0 and LWu,N(0) is the local time of W on the interval [u,N ].
14
Proof. We Þrst prove H(u) = IER Nu 1Wt≥0 dt. Consider the continuously differentiable
convex function
f(x).=1
2(x+)2 =
½0 , if x ≤ 012x2, if x ≥ 0.
It follows from Itos formula that
f(WN∧n) = f(Wu) +
Z N∧n
uf 0(Wt) dWt +
1
2
Z N∧n
uf 00(Wt) dt
=
Z N∧n
uWt1Wt≥0 dWt +
1
2
Z N∧n
u1Wt≥0 dt
for all integers n ∈ IN. This yields
IE¡W+N∧n
¢2= IE
Z N∧n
u1Wt≥0 dt ∀ n ∈ IN.
Letting n → ∞, the right hand side converges to IE R Nu 1Wt≥0 dt by the monotone con-vergence theorem. Since W+
N∧n ≤ WN , the result will follow by dominated convergence ifIEW 2
N is Þnite.To show IEW 2
N is Þnite, we consider the conditional probability
Pn+1(x).= IP
¡WN ≥ x
¯N = n+ 1
¢= IP
¡Wn+1 ≥ x
¯Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢for all n ∈ IN0 and x ≥ 0. DeÞne the following stopping time
σ.= inf t ≥ n : Wt = 0 .
Then
Pn+1(x) = IP¡Wn+1 ≥ x
¯σ ≤ n+ 1,Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢=
Z 1
0IP¡Wn+1 ≥ x
¯σ = n+ t,Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢· IP ¡σ ∈ n+ dt ¯ σ ≤ n+ 1,Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢.
However, by strong Markov Property, for all t ∈ [0, 1]IP¡Wn+1 ≥ x
¯σ = n+ t,Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢= IP
¡W1 ≥ x
¯Wt = 0,W1 ≥ 0
¢= 2Φ
µ− x√
1− t¶
≤ 2Φ(−x).Here Φ is the cumulative distribution function for the standard normal. Hence,
Pn+1(x) ≤ 2Φ(−x)Z 1
0IP¡σ ∈ n+ dt ¯σ ≤ n+ 1,Wn+1 ≥ 0,Wn < 0, · · · ,W1 < 0
¢= 2Φ(−x).
15
It follows that
IEW 2N =
∞Xn=1
IE(W 2N1N=n)
=
∞Xn=1
Z ∞
02xIP(WN ≥ x,N = n) dx
=∞Xn=1
Z ∞
02xIP(WN ≥ x
¯N = n)IP(N = n) dx
≤∞Xn=1
IP(N = n)
Z ∞
04xΦ(−x) dx
=
Z ∞
04xΦ(−x) dx
< ∞.As for the equality M(u) = IELWu,N(0), it follows from Tanakas formula that
WN =W+N =
Z N
u1Wt≥0 dWt + L
Wu,N(0).
However, since the preceding proof already implies that IER Nu 1Wt≥0 dt <∞,
IEWN = IELWu,N (0) =M(u).
This completes the proof. 2
3.1 Proof of Proposition 3.1
In this subsection we prove
IExZ τhδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt =
∙−12ACx2∗σ
2(x∗)h+ o(h)¸V (x).
We recall the deÞnition
H(u).= IE
Z N
u1Wt≥0 dt,
where W is a standard Brownian motion with Wu = 0, and N.= inf n ∈ IN :Wn ≥ 0.
Lemma 3.3. H(u) is continuous and bounded on the interval [0, 1).
Proof. DeÞne
Zu.=
Z N
u1Wt≥0 dt
where W is a Brownian motion with Wu = 0. We Þrst show that the family Zu, u ∈ [0, 1)is uniformly integrable (and in particular, that H(u) is bounded). Indeed, deÞne
c0.=
Z 1
u1Wt≥0 dt, cj
.=
Z j+1
j1Wt≥0 dt, j ∈ IN.
16
The key observation is that if cj > 0, then W must spend some time during the interval[j, j + 1] to the right of zero, therefore the probability that Wj+1 > 0 is at least half. Thusfor all j ∈ IN0
IP(N = j + 1¯N > j, cj > 0) ≥ 1
2.
Let Xu.=PN−1j=0 1cj>0. Clearly Xu dominates Zu. Furthermore, the strong Markov
property implies that
IP(Xu ≥ j + 1¯Xu ≥ j) ≤
µ1− 1
2
¶=1
2.
This, in turn, implies that
IP(Xu ≥ n) ≤ 1
2n−1,
and thus
IE(X2u) =
∞Xn=1
2nIP(Xu ≥ n) ≤∞Xn=1
n
2n−2<∞.
Therefore Zu, u ∈ [0, 1) is uniformly integrable.As for the continuity, we write
H(u) = IE
Z Nu
u1Bt−Bu≥0 dt = IEZu.
where B is some standard Brownian motion with B0 ≡ 0 and
Nu.= inf n ∈ IN0 : Bn −Bu ≥ 0 .
Let u ∈ [0, 1) and let un be an arbitrary sequence in [0, 1) with un → u. Since for anyÞxed n IP(Bn −Bu = 0) = 0,
Zun → Zu
with probability one. Since the Zun are uniformly integrable, we have
H(un)→ H(u),
which completes the proof. 2
Now for any u ∈ [0, 1) and h > 0, deÞne the function
G(h;u).= IE
Z Nhh
uhe−r(t−uh)
£−rφ(St) + Lφ(St)¤1St≥xδ dt,where
dStSt
= b(St) dt+ σ(St) dWt, Suh = 0
andNh .= inf n ∈ IN : Snh ≥ xδ .
17
Let bac denote the integer part of a. It follows from strong Markov property that
IExZ τhδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt = IEx
he−rτδG
³h;τδh−jτδh
k´i.
The change of variable t 7→ th and the transformation
Y(h)t
.=Sth − xδ√
h
yield
G(h;u) = hF (h; u).= hIE
Z Nh
ue−r(t−u)h
£−rφ+ Lφ¤(√hY (h)t + xδ)1Y (h)t ≥0 dt,
where Y (h) follows the dynamics
dY(h)t = (
√hY
(h)t + xδ)
h√hb(√hY
(h)t + xδ) + σ(
√hY
(h)t + xδ) dWt
i, Y (h)u = 0.
Therefore
IExZ τhδ
0e−rt[−rφ(St) + Lφ(St)]1St≥xδ dt = hIEx
he−rτδF
³h;τδh−jτδh
k´i. (3.4)
We have the following result regarding F (h;u). Although part of the proof is similar tothat of Lemma 3.3, we provide the details for completeness.
Lemma 3.4. 1. F (h;u) is uniformly bounded for small h and all u ∈ [0, 1).2.
limh→0
F (h; u) = [−rφ+ Lφ](x∗)H(u),
and the convergence is uniform on any compact subset of [0, 1).
Proof. Consider the family of random variables Zh,u : u ∈ [0, 1), h ∈ (0, 1) where
Zh,u.=
Z Nh
ue−r(t−u)h
£−rφ+ Lφ¤(√hY (h)t + xδ)1Y (h)t ≥0 dt
and
dY(h)t = (
√hY
(h)t + xδ)
h√hb(√hY
(h)t + xδ) + σ(
√hY
(h)t + xδ) dWt
i, Y (h)u = 0.
We Þrst show this family is uniformly integrable. Since (−rφ+Lφ) is bounded, it is sufficientto show that
Xh,u.=
Z Nh
u1Y (h)t ≥0 dt (3.5)
18
are uniformly integrable. DeÞne
c(h)0
.=
Z 1
u1Y (h)t ≥0 dt, c
(h)j
.=
Z j+1
j1Y (h)t ≥0 dt, j ∈ IN
As in the proof of Lemma 3.3, if c(h)j > 0, then Y
(h)t spends some time to the right of zero in
interval [j, j + 1]. Therefore the probability Y(h)j+1 ≥ 0 is bounded from below by a positive
constant:
IP³Nh = j + 1
¯Nh > j, c
(h)j > 0
´≥ α > 0, ∀ u ∈ [0, 1), h ∈ (0, 1).
A proof is as follows. To show a lower bound α > 0 exists, it suffices to show that for someα > 0
pt,h.= IP(Y
(h)t ≥ 0 ¯Y (h)0 ≥ 0) ≥ α > 0, ∀ t ∈ [0, 1].
However, it is easy to see that
pt,h = IP(Sth ≥ xδ |S0 ≥ xδ)
≥ IP
µexp
½Z th
0
£b(Su)− 1
2σ2(Su)
¤du+
Z th
0σ(Su) dWu
¾≥ 1
¶≥ IP
µZ th
0σ(Su) dWu ≥ c1th
¶,
where c1.= kbk∞ + 1
2kσ2k∞. We can view the stochastic integral Qt.=R t0 σ(Su) dWu a
time-changed Brownian motion. Indeed, there exists a Brownian motion B such that
Qt = BhQit .
Letσ.= inf
xσ(x) σ
.= sup
xσ(x).
Thenσ2h ≤ hQit ≤ σ2h.
It follows that
pt,h ≥ IP (Qth ≥ c1th) ≥ IPµ
minσ2th≤s≤σ2th
Bs ≥ c1th¶= IP
µmin
σ2≤s≤σ2Bs ≥ c1
√th
¶,
where the last equality follows sincen
1√thBths, s ≥ 0
ois still a standard Brownian motion.
For h ∈ (0, 1), we can choose
α = IP
µmin
σ2≤s≤σ2Bs ≥ c1
¶> 0,
which will serve as a lower bound.
19
Now deÞne
Mh,u.=
Nh−1Xj=0
1c(h)j >0,
which clearly dominates Xh,u. By the strong Markov property,
IP(Mh,u > j + 1¯Mh,u > j) ≤ 1− α,
and thusIP(Mh,u ≥ j) ≤ (1− α)j−1.
This implies that
IE(M2h,u) =
∞Xj=0
2jIP(Mh,u ≥ j) ≤∞Xj=0
2j(1− α)j−1 <∞,
which implies the uniform integrability of Zh,u, u ∈ [0, 1), h ∈ (0, 1). In particular,F (h;u) is uniformly bounded for u ∈ [0, 1) and h ∈ (0, 1).
For the uniform convergence, it suffices to show that for any u ∈ [0, 1) and any sequenceuh ∈ [0, 1) converging to u,
F (h; uh) = IEZh,uh → [−rφ+ Lφ](x∗)H(u).
Let Y (h) be the process with Y(h)
uh= 0. As h → 0, we have that Y (h) converges weakly to
Y , where Y is deÞned asYt = x∗σ(x∗)(Wt −Wu).
By the Skorohod representation, we can assume Y (h) → Y with probability one. Using theuniform integrability, it suffices to show that
Zh,uh → Z.=
Z N
u1Yt≥0 dt
with probability one. Note N is almost surely Þnite, and that
Nh → N
with probability one. The almost sure convergence of Zh,uh to Z then follows from thedominated convergence theorem, which completes the proof. 2
Returning to the proof of Proposition 3.1, we claim that
limh→0
IExhe−rτδF
³h;τδh−jτδh
k´i= C[−rφ+ Lφ](x∗)IEx
£e−rτ∗
¤(3.6)
where C.=R 10 H(u) du. To ease notation, let
Uh.=τδh−jτδh
k.
20
It suffices to show that
limh→0
IEx£e−rτδF (h;Uh)
¤= C[−rφ+ Lφ](x∗)IEx
£e−rτ∗
¤.
We can write
IEx£e−rτδF (h;Uh)
¤= IEx
£e−rτδF (h;Uh)
¤− [−rφ+ Lφ](x∗)IEx £e−rτδH(Uh)¤+ [−rφ+ Lφ](x∗)IEx
£e−rτδH(Uh)
¤.
In Proposition 4.1 in the Appendix we show, roughly speaking, that (τδ, Uh) converges indistribution to (τ∗, U) as h and δ tend to zero, where U is uniformly distributed and inde-pendent of τ∗. This is not strictly true, in that we ignore what happens on the unimportantevent τ∗ =∞. It follows from Proposition 4.1 that
IEx£e−rτδH(Uh)
¤ → IEx£e−rτ∗
¤ Z 1
0H(u) du = CIEx
£e−rτ∗
¤.
Therefore, to prove (3.6) we must show that
4 .= IEx
£e−rτδF (h;Uh)
¤− [−rφ+ Lφ](x∗)IEx £e−rτδH(Uh)¤ → 0.
Due to the uniform boundedness of F and H, there exists R ∈ (0,∞) such that¯F (h, u)
¯+¯[−rφ+ Lφ](x∗)H(u)
¯ ≤ R ∀ u ∈ [0, 1)when h is small enough. Since Uh ⇒ U , for h small enough,
IP (Uh > 1− ε) ≤ 2ε.Also, by Lemma 3.4 for h small enough
supu∈[0,1−ε]
¯F (h, u)− [−rφ+ Lφ](x∗)H(u)
¯ ≤ ε.It follows that, for h small enough,
4 ≤ εIP(Uh ≤ 1− ε) +RIP(Uh > 1− ε) ≤ (2R+ 1)ε,which completes the proof of (3.6).
It follows directly from the deÞnitions of V (x) and τ∗ that
IEx£e−rτ∗
¤= V (x)/φ(x∗). (3.7)
Also, the deÞnition of A in (2.2) and the fact that (−rV + LV )(x∗−) = 0 imply that
(−rφ+ Lφ)(x∗) = (−rV + LV )(x∗−) + 12σ2(x∗)x2∗
£φ00(x∗)− V 00(x∗−)
¤=
1
2σ2(x∗)x2∗
£φ00(x∗)− V 00(x∗−)
¤=
1
2σ2(x∗)x2∗Aφ(x∗).
The proposition follows by combining the last display with (3.4), (3.6), and (3.7).
21
3.2 Proof of Proposition 3.2
In this subsection we verify equation (3.3):
4W 0δ(xδ)IE
x
Z τhδ
0e−rtdLSt (xδ) = [Kx∗σ(x∗)Ach+ o(h)]V (x).
We recall the notation xδ = x∗ − δ, where δ = c√h+ o(
√h). It follows from the deÞnition
(2.2) of A that
4W 0δ(xδ) = φ0(xδ)− φ(xδ)
V (xδ)V 0(xδ)
=
µφ0 − φ
VV 0¶0 ¯¯x∗
· (−δ) + o(δ)
= [V 00(x∗−)− φ00(x∗)]δ + o(δ)= cAφ(x∗)
√h+ o(
√h).
As a consequence, the main difficulty in proving (3.3) lies with the term
IExZ τhδ
0e−rtdLSt (xδ).
As in the previous subsection, we consider the transformation
Y(h)t
.=Sth − xδ√
h.
Then Y (h) satisÞes the SDE
dY(h)t = (
√hY
(h)t + xδ)
h√hb(√hY
(h)t + xδ) + σ(
√hY
(h)t + xδ) dWt
i.
We have the following lemma, whose proof is trivial from the deÞnition of the local timeand thus omitted.
Lemma 3.5. Suppose X is a semimartingale, and Yt.= aXbt + v where a > 0, b > 0, v are
arbitrary constants. Let LY and LX denote the local times for Y and X, respectively. Thenfor all t ≥ 0
LYt (ax+ v) = aLXbt (x).
It follows from the lemma that
IExZ τhδ
0e−rtdLSt (xδ) =
√hIEx
Z Nh
0e−rthdLY
(h)
t (0).
For any u ∈ [0, 1), deÞne the processY ∗t = x∗σ(x∗)Wt, Y ∗u = 0.
Also deÞneQ(u)
.= IELY
∗u,N(0), where N
.= inf n ∈ IN : Y ∗n ≥ 0
We have the following result.
22
Proposition 3.3.
limh→0
IExZ Nh
0e−rthdLY
(h)
t (0) = IEx£e−rτ∗
¤ Z 1
0Q(u) du
Before giving the proof, we show how the proposition will follow from Proposition 3.3.We have IExe−rτ∗ = V (x)/φ(x∗), and the deÞnitions of Q and M imply
R 10 Q(u)du =
(x∗)2σ(x∗)2R 10 M(u)du. When combined with the expansion given above for ∆W
0δ(xδ), the
left hand side of (3.3) is equal to
cAφ(x∗)V (x)
φ(x∗)(x∗)2σ(x∗)2
Z 1
0M(u)du+ o(h),
and thus (3.3) follows from Lemma 3.2.
Proof of Proposition 3.3. We consider the test function
f(x).=
0, if x ≤ 0x, if 0 ≤ x ≤ 1k, if x ≥ 2.
We require f(x) to be increasing, and smooth except at the point x = 0 (the speciÞc choiceof k is not important). It follows from the generalized Ito formula and the integration byparts formula that
dhe−rthf(Y (h)t )
i= −rhe−rthf(Y (h)t ) dt+ e−rthD−f(Y (h)t ) dY
(h)t
+1
2e−rthf 00(Y (h)t )dY
(h)t · dY (h)t + e−rthdLX
(h)
t (0).
Without loss of generality, we let f 00(0) = 0. Now we integrate both sides from 0 to Nh andtake expected value. The stochastic integral on the right hand sideZ Nh
0e−rthD−f(Y (h)t )(
√hY
(h)t + xδ)σ(
√hY
(h)t + xδ) dWt
has expectation 0 since
D−f(Y (h)t )(√hY
(h)t + xδ)
is bounded (note that D−f(x) = 0 for x ≥ 2) and σ is bounded by assumption.The Þrst term on the right hand side will contribute
−rhIExZ Nh
0e−rthf(Y (h)t ) dt = −rhIEx
Z Nh
τδh
e−rthf(Y (h)t ) dt
since f(x) = 0 for x ≤ 0. We recall the deÞnition (3.5) of Xh,u. It follows from the strongMarkov property that
IExZ Nh
τδh
e−rthf(Y (h)t ) dt ≤ kIExZ Nh
τδh
1Y (h)t ≥0 dt = kIExG(h, Uh)
23
whereUh
.=τδh−jτδh
kand
G(h, u).= IEXh,u.
By the uniform integrability of Xh,u for small h and u ∈ [0, 1), IExG(h, Uh) is uniformlybounded for small h. Therefore the expectation of the Þrst term in the right hand side goesto zero as h→ 0. The second term in the right hand side contributes
√hIEx
Z Nh
0e−rthD−f(Y (h)t )(
√hY
(h)t + xδ)b(
√hY
(h)t + xδ) dt.
Note that the integrand is bounded by 1Y (h)t ≥0 up to a proportional constant. It followsexactly as in the case of the Þrst term that the contribution of the second term goes to zero.The third term in the right hand side contributes
IExZ Nh
0
1
2e−rthf 00(Y (h)t )(
√hY
(h)t + xδ)
2σ2(√hY
(h)t + xδ) dt.
Since f 00(x) = 0 for x < 0, expected value equals
IExZ Nh
τδh
1
2e−rthf 00(Y (h)t )(
√hY
(h)t + xδ)
2σ2(√hY
(h)t + xδ) dt.
It follows from strong Markov property that the expectation can also be written
IEx£e−rτδF (h;Uh)
¤,
where
F (h;u).= IE
Z Nh
u
1
2e−r(t−u)hf 00(Y (h)t )(
√hY
(h)t + xδ)
2σ2(√hY
(h)t + xδ)1Y (h)t ≥0 dt
and where Y (h) satisÞes the same dynamics with Y(h)u = 0. Since the integrand is bounded
due to the fact that f 00(x) = 0 for all x ≥ 2, it follows from an analogous argument to theone given in the proof of Lemma 3.4 that:
1. F (h;u) is uniformly bounded for small h and all u ∈ [0, 1);2.
J(u).= limh→0
F (h; u) =1
2IE
Z N
uf 00(Y ∗t )x
2∗σ2(x∗) dt
and the convergence is uniform on any compact subset of [0, 1).
The uniform convergence (on compact sets) of F and Proposition 4.1 in the Appendix implythat the expectation of the third term converges to
IEx£e−rτ∗
¤ Z 1
0J(u) du.
24
We omit the details here since an analogous argument is used in the proof of Proposition3.1.
It remains to calculate the contribution from the term
IExhe−rτ
hδ f(Y
(h)
Nh )i= IEx
£e−rτδK(h,Uh)
¤where
K(h;u).= IE
he−r(N
h−u)hf(Y (h)Nh )
iwith Y
(h)u = 0. However, the boundedness and continuity of f ensure that
1. K(h;u) is uniformly bounded for all h and all u ∈ [0, 1).2.
I(u).= limh→0
K(h;u) = IE[f(Y ∗N)|Y ∗u = 0]
and the convergence is uniform on any compact subset of [0, 1).
Indeed, the Þrst claim is trivial. As for the second claim, let uh → u. Then as h → 0,Y (h) ⇒ Y ∗. By the Skorohod representation theorem we can assume Y (h) → Y ∗ withprobability one, which also implies that Nh → N with probability one. Therefore,
Y(h)
Nh → Y ∗N
with probability one. The claim now follows from the dominated convergence theorem.Hence similarly we have
IExhe−rτ
hδ f(Y
(h)
Nh )i
→ IEx£e−rτ∗
¤ Z 1
0I(u) du,
as h→ 0. It is now sufficient to prove
I(u)− J(u) = Q(u) ∀ u ∈ [0, 1).
This is the same showing
IE
∙f(Y ∗N)−
1
2
Z N
uf 00(Y ∗t )x
2∗σ2(x∗) dt− LY ∗u,N(0)
¸= 0,
whereY ∗t = x∗σ(x∗)Wt, Y ∗u = 0.
To prove this, we apply Itos formula to f(Y ∗) to obtain
f(Y ∗N)− f(Y ∗u ) =Z N
uD−f(Y ∗t )x∗σ(x∗) dWt +
1
2
Z N
uf 00(Y ∗t )x
2∗σ2(x∗) dt+ LY
∗u,N(0).
But f(Yu) = f(0) = 0. Furthermore the integrand of the stochastic integral is dominatedby 1Y ∗t ≥0 up to a proportional constant, which implies that the stochastic integral hasexpectation 0. This completes the proof. 2
25
4 Appendix: Weak convergence of (τδ, Uh)
For an arbitrary y > 0, deÞne the function
P y(x, t).= IP
µmax0≤u≤t
Su ≥ y¯S0 = x
¶.
We have the following lemmata.
Lemma 4.1. For every Þxed y > 0, function P y ∈ C1,2¡(0, y)× (0,∞)¢∩ C¡(0, y)× [0,∞)¢and satisÞes the parabolic equation
−∂Py
∂t(x, t) + LP y(x, t) = 0 (x, t) ∈ (0, y)× (0,∞).
Proof: It follows from a standard weak convergence argument that P y is a continuousfunction; see, e.g., [13]. Let (x0, t0) ∈ (0, y)× (0,∞) and deÞne the region
D.= (x0 − ε, x0 + ε)× (t0 − ε, t0).
Consider the parabolic equation
−∂u∂t(x, t) + Lu(x, t) = 0 (x, t) ∈ D,
with boundary condition u = P y on its parabolic boundary. It follows from standard PDEtheory that there exists a classical solution u [8]. It remains to show that u = P y in thedomain D. DeÞne the stopping time
τ.= inf t ≥ 0 : (t0 − t, St) 6∈ D .
It follows that the process u(St, t0 − t) is a (bounded) martingale. In particular,
u(x0, t0) = IEx0u(Sτ , t0 − τ) = IEx0P y(Sτ , t0 − τ) = P y(x0, t0).
Here the last equality follows from strong Markov property. 2
For Þxed 0 < x < y, the density of the hitting time τy is deÞned as
py(x, t).=∂P y
∂t(x, t).
According to the preceding lemma, py is continuous in the domain (0, y)× (0,∞).Lemma 4.2. Suppose yn → y∗, then P yn(x, t) → P y∗(x, t) and pyn(x, t) → py
∗(x, t) uni-
formly on any compact subset of (0, y∗)× (0,∞).Proof: It suffices to show that P yn(x, t) → P y
∗(x, t) uniformly on any compact subset.
The uniform convergence of pyn then follows from Friedman [8, Section 3.6]. SupposeD.= [x0, x1]× [t0, t1] ⊆ (0, y∗)× (0,∞) is a compact subset. In the following, we will denote
26
P yn and P y∗ by Pn and P respectively. Also, without loss of generality, we assume yn ≤ y∗for all n, which implies that
Pn(x, t) ≥ P (x, t).For any ε > 0, we want to show that for large enough n,
0 ≤ Pn(x, t)− P (x, t) ≤ ε ∀ (x, t) ∈ D.DeÞne
τ.= inf t ≥ 0 : St ≥ y∗ ; τn
.= inf t ≥ 0 : St ≥ yn
Since P is continuous, it is uniformly continuous on the compact subset D. It follows thatthere exists a number h such that
IP¡t < τ ≤ t+ h ¯S0 = x¢ = P (x, t+ h)− P (x, t) ≤ ε
2∀ (x, t) ∈ D,
and thus for all (x, t) ∈ DPn(x, t)− P (x, t)= IPx(τn ≤ t, τ > t)= IPx(τn ≤ t, τ > t+ h) + IPx(τn ≤ t, t < τ ≤ t+ h)≤ IP(τn ≤ t, τ > t+ h) + ε
2.
However, it follows from strong Markov property that
IP(τn ≤ t, τ > t+ h¯S0 = x) ≤ IP
µmax0≤u≤h
St ≤ y∗¯S0 = yn
¶∀ (x, t) ∈ D.
Note that the right hand side is independent of (x, t) ∈ D. DeÞne
c1.= kbk∞ + 1
2kσ2k.
Then the right hand side is dominated by
IP
µmax0≤u≤h
[−c1u+Qu] ≤ log y∗yn
¶,
where Qu.=R u0 σ(St) dWt. Since σ
2u ≤ hQiu ≤ σ2u and Qu = BhQiu for some standardBrownian motion B, the probability is in turn dominated by
IP
µmax
0≤t≤σ2h
∙−c1σt+Bt
¸≤ log y∗
yn
¶.
For n big enough, this probability is at most ε2 since yn → y. This completes the proof. 2
Proposition 4.1. Suppose f : [0,∞)→ IR is a bounded continuous function with
limx→∞ f (x) = 0,
and g : [0, 1)→ IR a continuous, bounded function. Then
limh,δ→0
IExhf (τδ) g
³τδh−jτδh
k´i= IExf (τ∗) ·
Z 1
0g (u) du.
for all x ∈ (0, x∗).
27
Proof: Fix x ∈ (0, x∗). Let pδ and p denote the density of τδ and τ∗ respectively. We canassume that all xδ are close to x∗ in the sense that x∗ − xδ ≤ δ0 for some δ0, and x < xδ.Since
limx→∞ f (x) = 0,
we have
IExhf (τδ) g
³τδh−jτδh
k´i=
Z ∞
0f (s) g
³ sh−j sh
k´pδ (s) ds.
For any ε > 0, there exists 0 < a < M <∞ such thatZ[0,a]
f (s) g³ sh−j sh
k´pδ (s) ds ≤ kfk∞ · kgk∞IP (τδ ≤ a) ≤ kfk∞ · kgk∞IP (τδ0 ≤ a) ≤ ε.
and Z[M,∞]
f (s) g³ sh−j sh
k´pδ (s) ds ≤ max
M≤x|f (x) | · kgk∞ ≤ ε.
Note that such choices of (a,M) also make the above inequalities hold when pδ is replacedby p. Also since pδ → p uniformly on the compact interval [ε,M ], we haveZ M
af (s) g
³ sh−j sh
k´|pδ (s)− p (s)| ds ≤ ε
for δ small enough. It remains to show that, for h small enough,¯Z M
af (s) g
³ sh−j sh
k´p (s) ds−
Z M
af (s) p (s) ds ·
Z 1
0g (u) du
¯≤ ε.
However, since f, p and f · p are all uniformly continuous on compact intervals, we have
Z M
af (s) g
³ sh−j sh
k´p (s) ds =
bMh cXn=b ahc
Z (n+1)h
nhf (nh) g
³ sh− n
´p (nh) ds+ o (1)
=
Z 1
0g (u) du
bMh cXn=b ahc
f (nh) p (nh) · h+ o (1)
=
Z 1
0g (u) du
Z M
af (s) p (s) ds+ o (1)
This completes the proof. 2
References
[1] BATHER, J.A. & CHERNOFF, H. (1966) Sequential decisions in the control of aspaceship. Proc. Fifth Berkeley Symposium on Mathematical Statistics and Probability3, 181-207.
28
[2] BENSOUSSAN, A. (1984) On the theory of option pricing. Acta Appl. Math. 2,139-158.
[3] BRAUN, M. (1993) Differential Equations and Their Applications. Springer-Verlag,New York.
[4] BRENNAN, M.J. & SCHWARTZ, E.S. (1985) Evaluating natural rescource invest-ments. Journal of Business 58, 135-157.
[5] BROADIE, M. & GLASSERMAN, P. (1997) Pricing American-style securities usingsimulation. J. Econ. Dynam. Control 21, 1323-1352.
[6] DIXIT, A.K. & PINDYCK, R.S. (1994) Investment Under Uncertainty. PrincetonUniversity Press.
[7] EL KAROUI, N., JEANBLANC-PICQUE, M. & SHREVE, S.E. (1998) Robustnessof the Black and Scholes formula. Math. Finance 8, 93-126.
[8] FRIEDMAN, A. (1964) Partial Differential Equations of Parabolic Type. Prentice-hall,Inc. Englewood Cliffs, NJ.
[9] JACKA, S.D. (1991) Optimal stopping and the American put. Math. Finance 1, 1-14.
[10] KARATZAS, I. (1988) On the pricing of American options. Appl. Math. Optim. 17,37-60.
[11] KARATZAS, I. & SHREVE, S.E. (1998) Methods of Mathematical Finance. Springer,Berlin Heidelberg New York.
[12] KUNITA, H. (1990) Stochastic Flows and Stochastic Differential Equations. Cam-bridge, UK; Cambridge University Press.
[13] KUSHNER, H. (1984) Approximation and Weak Convergence Methods for RandomProcesses with Applications to Stochastic System Theory. MIT Press, Cambridge, MA.
[14] LAMBERTON, D. (1998) Error estimates for the binomial approximation of Americanput options. Ann. of Appl. Probab. 8, 206-233.
[15] MCDONALD, R. & SIEGEL, D. (1986) The value of waiting to invest. Quarterly J.Econ. 101, 707-728.
[16] MERTON, R.C. (1990) Continuous-Time Finance. Cambridge-Oxford: Blackwell.
[17] MUSIELA, M. & RUTKOWSKI, M. (1997) Martingale Methods in Financial Mod-elling. Springer, Berlin Heidelberg New York.
[18] MYNENI, R. (1992) The pricing of the American option. Ann. Appl. Probab. 2, 1-23.
[19] PROTTER, P. (1995) Stochastic Integration and Differential Equations. Springer-Verlag, Berlin.
[20] ROYDEN, H.L. (1968) Real Analysis. Collier MacMillan, London.
29