STOCHASTIC DIFFERENTIAL EQUATIONS AND ...n00008500/ND.pdfSTOCHASTIC DIFFERENTIAL EQUATIONS AND HYPOELLIPTIC OPERATORS Denis R. Bell Department of Mathematics University of North Florida

.

STOCHASTIC DIFFERENTIAL EQUATIONS ANDHYPOELLIPTIC OPERATORS

Denis R. BellDepartment of MathematicsUniversity of North Florida

Jacksonville, FL 32224U. S. A.

In Real and Stochastic Analysis, 9-42, Trends Math. ed. M. M. Rao, Birkhauser,Boston. MA, 2004.

1

1. IntroductionThe first half of the twentieth century saw some remarkable developments in analyticprobability theory. Wiener constructed a rigorous mathematical model of Brownianmotion. Kolmogorov discovered that the transition probabilities of a diffusion processdefine a fundamental solution to the associated heat equation. Ito developed a stochas-tic calculus which made it possible to represent a diffusion with a given (infinitesimal)generator as the solution of a stochastic differential equation. These developments cre-ated a link between the fields of partial differential equations and stochastic analysiswhereby results in the former area could be used to prove results in the latter.

More specifically, let X0, . . . , Xn denote a collection of smooth vector fields on Rd,regarded also as first-order differential operators, and define the second-order differen-tial operator

L ≡ 12

n∑i=1

X2i + X0. (1.1)

Consider also the Stratonovich stochastic differential equation (sde)

dξt =n∑

i=1

Xi(ξt) dwi + X0(ξt)dt (1.2)

where w = (w1, . . . , wn) is an n−dimensional standard Wiener process. Then thesolution ξ to (1.2) is a time-homogeneous Markov process with generator L, whosetransition probabilities p(t, x, dy) satisfy the following PDE (known as the Kolmogorovforward equation) in the weak sense

∂p

∂t= L∗

yp.

A differential operator G is said to be hypoelliptic if, whenever Gu is smooth fora distribution u defined on some open subset of the domain of G, then u is smooth.In 1967, Hormander proved that an operator of the form L in (1.1) is hypoelliptic ifthe Lie algebra generated by X0, . . . , Xn has dimension d at each point*. It followsimmediately that the transition probabilities p of ξ in (1.2) have smooth densitiesat all positive times provided the parabolic operator ∂/∂t − L∗

y satisfies Hormander’scondition at ξ0. Of course, this is a rather circuitous route for proving a result ofa decidedly probabilistic nature, a fact that had hardly escaped the notice of PDEtheorists and stochastic analysts alike.

In the mid-seventies Malliavin ([Ma1], [Ma2]) outlined a new and innovativemethod for directly proving the smoothness of the transition probabilities p. The ideais as follows: the measure p(t, x, dy) is the image of Wiener measure under the Ito

* This hypothesis will be referred to as Hormander’s (Lie bracket) condition.

2

map g : w → ξt. Now the Wiener measure, which can be considered as an infinite-dimensional analogue of standard Gaussian measure on Euclidean space, has a well-understood analytic structure. If the map g were smooth, then regularity propertiesof p could be studied by a process of integration by parts (cf. Section 2). In fact,this is not the case; g is only defined up to a set of full Wiener measure and is mostpathological from the standpoint of classical calculus. Malliavin solved this problemby constructing an extended calculus applicable to solution maps of sde’s. His method(which has since been termed Malliavin calculus) was based on the infinite-dimensionalOrnstein-Uhlenbeck semigroup and was rather elaborate. It has since been simplifiedand extended by many authors and has become a powerful tool in stochastic analysis.Following further pioneering work by Kusuoka and Stroock, it has led to a completeprobabilistic proof of Hormander’s theorem, as Malliavin originally intended, and hasinspired a host of other results. Examples include filtering theorems by Michel [Mi], adeeper understanding of the Skorohod integral and the development of an anticipatingstochastic calculus by Nualart and Pardoux [NP], an extension of Clark’s formula byOcone [O], and Bismut’s probabilistic analysis of the small-time asymptotics of theheat kernel of the Dirac operator on a Riemannian manifold [Bi2]. In this article, Idescribe my own contributions to the subject. The interested reader is encouraged toalso study the aforementioned works.

The contents of the article are as follows. In Section 2, we present an elemen-tary derivation of Malliavin’s integration by parts formula by which one establishessmoothness of the transition probabilities discussed above. We also include a result ofBismut that illustrates how this formula is related to Hormander’s condition. Much ofthe notation, together with the basic tools that will be used later, are introduced inthis section of the article.

The material presented in Sections 3 and 4 is joint work with S. Mohammed. InSection 3, we prove a sharp generalization of Hormander’s theorem. This result assertsthat the operator L in (1.1) is hypoelliptic under hypotheses that allow Hormander’s Liebracket condition to fail on hypersurfaces in the domain of L. The theorem establishesthe hupoellipticity of a large class of operators with points of infinite-type, and is sharpin the category of Hormander operators with smooth coefficients.

In Section 4, we establish sufficient conditions for the existence of smooth densitiesfor a class of stochastic functional equations of the form

dxt = g(xt−r)dw + B(t, x)dt (1.3)

where g denotes a matrix-valued function, r is a positive time-delay, and B is a non-anticipating functional defined on the space of paths. Our hypotheses allow degeneracyof g on a hypersurface in the ambient space of the equation. Probabilistically, thereis an important difference between the solution ξ to equation (1.2) and the solution x

3

to (1.3). Namely, while ξ is a Markov process, x is non-Markov. As a consequence,the aforementioned result cannot be proved by appealing to existing results in PDEtheory.

Finally, in Section 5 we describe some open problems, which we hope will stimulatefurther study into this interesting subject.

2. Integration by parts and the regularity of induced measures

As in Section 1, let ξ denote the solution to the sde (1.2). In this section wewill give an elementary proof of a key result in Malliavin’s paper [Ma1], which gives acriterion under which the random variables ξt have absolutely continuous distributionsat all positive times t. The material in this section is taken from the author’s doctoraldissertation [Be1]. Following Malliavin, we will make use the following result fromharmonic analysis (see [Ma1] for a proof).

Lemma 2.1. Let µ denote a finite Borel measure on Rd. Suppose that for k ∈ N and

each set of non-negative integers d1, . . . , dk, there exists a constant C such that for all

test (C∞ compact support) functions φ on Rd

∣∣∣ ∫Rd

∂d1+...dk

∂xd11 . . . ∂xdk

d

φ dν∣∣∣ ≤ C||φ||∞. (2.1)

If this condition holds for k = 1, then the measure ν is absolutely continuous with

respect to the Lebesgue measure on Rd. If the condition holds for every k ≥ 1, then ν

has a smooth density.

The following result is included in order to motivate the approach that follows.It treats a finite-dimensional analogue of the infinite-dimensional problem that will bestudied later.

Theorem 2.2. Suppose T : Rm → Rd is a C2 map. Let dγ(x) = (2π)−m/2 ×exp(−|x|2/2)dx denote the standard Gaussian measure on Rm. Let ν ≡ T (γ) denote

the induced measure on Rd, ν(B) ≡ γ(T−1(B) for Borel subsets B ⊆ Rd. Define

N ≡ x ∈ Rd/DT (x) ∈ L(Rm,Rd) is non surjective.

Then the condition

γ(N) = 0 (2.2)

is necessary and sufficient for absolute continuity of the measure ν.

4

Proof. The necessity of condition (2.2) follows immediately from Sard’s theorem whichasserts that the set T (N) has zero Lebesgue measure.

To prove sufficiency, we argue as follows. Define a d × d matrix

σ(x) = DT (x)DT (x)∗. (2.3)

Let ψk denote a sequence of smooth bump functions from [0,∞) → [0, 1] such that(i) ψk(t) = 1 if 0 ≤ t ≤ k

(ii) ψk(t) = 0 if t ≥ k + 1(iii) For each p ≥ 1, supk |Dpψk(t)| < ∞.

For each k ∈ N, define Rk : Rd ⊗ Rd → [0, 1] by

Rk(α) =

ψk(||α−1||), α ∈ GL(d)0, α /∈ GL(d).

Note that Rk is C∞, since GL(d) is an open subset of Rd⊗Rd. We also define sequencesof measures γk on C0 and νk on Rd by

dγk

dγ(x) = Rk(σ(x)), νk = T (γk).

Assume now that (2.2) holds. Since N is precisely the set on which σ is degenerate,this is equivalent to the condition σ(x) ∈ GL(d) a.s., which implies νk → ν in variationnorm. Hence it suffices to prove absolute continuity of νk for each k. We do this asfollows: let e1, . . . , ed denote the standard basis of Rd and define, for any 1 ≤ i ≤ d, afunction hi : Rm → Rm by

hi(x) =

DT (x)∗σ−1(x)ei, σ ∈ GL(d)0, σ /∈ GL(d).

Let φ be a test function on Rd. By definition of hi, we have∫Rd

∂φ

∂xidνk =

∫Rm

Dφ(T (x)eiψk(||σ−1(x)||)dγ(x)

=∫Rm

D(φ T )(x)hi(x)ψk(||σ−1(x)||)dγ(x). (2.4)

Define the (Gaussian) divergence operator Div acting on smooth functions G : Rn →Rn by

Div(G)(x) =< G(x), x > −Trace DG(x).

Integrating by parts in (2.4) yields∫Rd

∂φ

∂xidνk =

∫Rm

φ T (x)Xi(x)dγ(x)

5

whereXi = Div[ψk(||σ−1||)hi] (2.5)

Since Xi is a continuous function with compact support, it follows that Xi ∈ L1(γ).Thus (2.5) yields ∫

Rd

∂φ

∂xidνk ≤ ||φ||∞||Xi||L1(γ)

and the absolute continuity of νk follows from Lemma 2.1.With the above finite-dimensional case as motivation, we now move to the heart

of the matter. Let γ denote the Wiener measure, defined on the space C0 of continuouspaths w : [0, 1] → Rn/ w(0) = 0. Recall that we wish to study the law ν of a randomvariable ξt, for t > 0, where ξ is the solution to a stochastic differential equation. LetA and B denote bounded smooth maps from Rd into Rd ⊗ Rn and Rd respectivelywith bounded derivatives of all orders, and let x ∈ Rd. As before, let w = (w1, . . . , wn)denote a standard Wiener process and consider the Ito sde

ξt = x +∫ t

0

A(ξs)dws +∫ t

0

B(ξs)ds, t ≥ 0. (2.6)

Denoting by g the map w → ξ and by gt the composition of g with evaluation at timet > 0, we have ν = gt(γ). We seek conditions under which the measure ν is absolutelycontinuous with respect to the Lebesgue measure on Rd. The setting looks similar tothat of Theorem 2.2. There are, however, two important differences:

(i) the measure γ is defined on an infinite-dimensional vector space(ii) the map g (and hence gt) is non-differentiable in the classical sense.

Point (i) can be handled without too much difficulty. Although there is nothing like aLebesgue measure on C0, the usual backdrop for an integration by parts calculation,the Wiener measure has a well-developed analytic structure. In particular, there areformulae for integration by parts on Wiener space*. The second point is more serious.Malliavin developed his stochastic calculus of variations in [Ma1] in order to addressthis.

We shall adopt here a more elementary approach, based on the follwing observa-tion. There is a Hilbert subspace H ⊂ C0 of paths, known as the Cameron-Martinspace, canonically associated with the Wiener measure. The space H is the set ofLebesgue a.e. differentiable paths h such that∫ 1

0

|h′(t)|2dt < ∞.

equipped with the inner product

< h, k >H≡∫ 1

0

< h′(t), k′(t) > dt, h, k ∈ H.

* See, for example, Gross’ divergence theorem for abstract Wiener spaces [G].

6

The point is that, when restricted to H, the map g becomes entirely regular. In fact, itis intuitively clear that the restriction* of g to H, which we denote by g, is the maph ∈ H → k defined by the ordinary integral equation

kt = x +∫ t

0

A(ks)h′sds +

∫ t

0

B(ks)ds, t ≥ 0 (2.7)

and it is easily seen that the map g possesses the same degree of smoothness as thecoefficient functions A and B. For example, an equation for the derivative η ≡ Dr g(h),r ∈ H, is obtained by formally differentiating in (2.7) with respect to h. Thus η satisfies

ηt =∫ t

0

DA(ks)(ηs, h

′s) + A(ks)r′s + DB(ks)ηs

ds.

Integral equations for higher order derivatives of g can be similarly obtained.The foregoing remarks provide an alternative elementary approach to Malliavin’s

work. As before, let φ denote a test function on Rd and let gt denote the compositionof g with evaluation at time t. Suppose that for each standard basis vector ei, one canconstruct a sequence of paths hm

i such that

Dgt(Pmw)hmi = ei.

Applying the dominated convergence theorem and integrating by parts with respect tothe measure γ, we will then have∫

Rd

∂φ

∂xidν = lim

m

∫C0

Dφ(gt(Pmw))eidγ

= limm

∫C0

D(φ gmt )(Pmw)hm

i dγ

= limm

∫C0

φ(gt(Pmw))Div[hmi ]dγ

If we can show that the sequence Div[hmi ]m is bounded in L1(γ), then we obtain∣∣∣ ∫

Rd

∂φ

∂xidν

∣∣∣ ≤ ||φ||∞ supm

||Div[hmi ]||L1(γ)

* The term “restriction” requires clarification because g is only defined up to a setof γ-measure 1, and γ(H) = 0. It is more correct to say that g stochastically extends g

in the following sense. For each m ∈ N, define Pm : C0 → H to be the operator thatpiecewise linearizes on the uniform partition [0, 1/m, 2/m . . . , 1]. It can be shown that,as m → ∞, Pm converges strongly to the identity map on C0, and g(Pmw) → g(w) a.s.

7

and the absolute continuity of ν follows from Lemma 2.1 as before.We will actually carry out a modified version of the above procedure, using a

sequence of piecewise linear approximations to g. Being finite-rank operators, theseapproximations finite-dimensionalize the diffrerential analysis in the problem. Thus,at each level of approximation, we need only perform an elementary integration byparts in Euclidean space, as in the proof of Theorem 2.2.

Our approximation scheme is as follows. For each m ∈ N and w ∈ C0, let ∆jw (=∆m

j w) denote w(j + 1)t/m) − w(tj/m). Define v0, vt/m, . . . , vt ∈ Rd inductively byv0 = x and

vkt/m = x +k−1∑j=0

A(vjt/m)∆jw +t

m

k−1∑j=0

B(vjt/m), k = 1, . . . , m. (2.8)

Let vm : [0, 1] → Rd denote the path piecewise linear between the points (kt/m, vkt/m),k = 0, . . . , m and constant on [t, 1].

It is easy to see that for each m, the map gm : C0 → H is C∞. We prove in [Be2,pp 35 - 37]

Theorem 2.3. For every p ∈ N,

limm→∞

sups∈[0,t]

E[|ξs − vm

s |p]

= 0

where ξ is the solution to the sde (2.6). We now define an analogue of the matrix σ

appearing in the proof of Theorem 2.2. Let σm(w) ≡ Dgmt (w)Dgm

t (w)∗ ∈ Rd ⊗ Rd.

We prove in [Be2, pp 37 - 39]

Theorem 2.4. As m → ∞, the matrix sequence σm converges in probability to a

limit σ in Rd ⊗ Rd. Let I denote the d × d identity matrix and consider the d × d

matrix-valued equations

Ys = I +∫ s

0

DA(ξu)(Yu, dwu) +∫ s

0

DB(ξu)Yudu

and

Zs = I −∫ s

0

ZuDA(ξu)(., dwu) −∫ s

0

ZuDB(ξu)du.

Then

σ = Yt

[ ∫ t

0

ZsA(ξs)A(ξs)∗Z∗s ds

]Y ∗

t . (2.9)

where ∗ denotes matrix transpose.

8

RemarksThe matrix processes Yt and Zt have a natural interpretation in terms of the

stochastic flow of the sde (2.6), i.e. the random map on Rd, φt : x → ξt. It canbe shown that φt is a.s. a C∞ map and an easy computation shows that Yt = Dφt.Furthermore, Yt ∈ GL(d) for all t ≥ 0 and Zt = Y −1

t .The matrix σ defined in (2.9) is called the Malliavin covariance matrix associated

to the random variable ξt. The next theorem shows that establishing non-degeneracyof this matrix is the key to proving regularity of the distribution of ξt, as might besuspected from Theorem 2.2 and its proof.

Theorem 2.5 (Malliavin). Suppose σ ∈ GL(d) a.s. Then the random variable ξt is

absolutely continuous with respect to Lebesgue measure on Rd.

Our proof of Theorem 2.5 will make use of the following technical result, whichis tailor-made for the estimations we will need to do later (see [Be2, pp34-35] for itsproof).

Lemma 2.6. Suppose that X, U0, . . . , Um−1 are Rd-valued random variables,

V0, . . . , Vm−1 and Y0, . . . , Ym−1 are random linear maps from Rn to Rd, and Z0, . . . ,

Zm−1 are random bilinear maps from Rd×Rn to Rd, satisfying the following conditions

(i) For all 0 ≤ i ≤ m−1, Ui, Vi, Yi, and Zi are measurable with respect to Fit/m, where

Ft is the fitration generated by wt.(ii) max

||X||p, ||Ui||p, ||Vi||p, ||Yi||p, ||Zi||p, 0 ≤ i ≤ m − 1

≤ M , where ||.||p denotes

the Lp norms of the (norms of) the various quantities in their respective spaces.

Let ηkt/m, 0 ≤ k ≤ m, be random variables satisfying the equations

ηkt/m = X +1m

k−1∑j=0

Uj +k−1∑j=0

Vj(∆jw) +1m

k−1∑j=0

Yj(ηjt/m) +k−1∑j=0

Zj(ηjt/m,∆jw).

Then there exists a constant N , depending only on M and p, such that

||ηkt/m||p ≤ N, ∀ 0 ≤ k ≤ m.

Proof of Theorem 2.5 Let Vm denote the finite-dimensional subspace of C0 consisting ofpaths that are piecewise linear between the times 0, t/m, . . . , t, and constant on [t, 1].Following the method used to prove Theorem 2.2, we define

dγk

dγ(w) = Rk(σ(w)), νk = gt(γk)

where Rk are as defined previously. As before, the assumption σ ∈ GL(d), a.s. impliesνk → ν in variation, so it suffices to prove that each νk is absolutely continuous.

9

Let e denote any unit vector in Rd and define hm : C0 → H by

hi(w) =

Dgmt (w)∗σm(w)−1e, σm ∈ GL(d)

0, σm /∈ GL(d).

Arguing as before and integrating by parts with respect to the measures Pm(γ) (notethese are Gaussian measures on the finite-dimensional vector spaces Vm) yields∫

Rd

Dyφ dνk = limm→∞

∫C0

φ gmt Div [hmRk σm] dγ

where Div is now defined with respect to the Cameron-Martin space H

Div G(w) =< G(w), w >H −Trace HDG(w).

To complete the proof, we must thus show that

supm

E|Div [hmRk σm]| < ∞ (2.10)

First consider the inner product term in the divergence. This is non-zero only if σm ∈GL(d) and ||σm||−1 ≤ k + 1. In this case

E| < hmRk σm, w >H | = E| < (σm)−1e, Dgmt (w)w > |

≤ (k + 1)E|ηmt | (2.11)

where ηm = Dgm(w)w satisfies the equation

ηmkt/m =

k−1∑i=0

A(vit/m)∆iw + DA(vit/m)(ηm

it/m,i w) + t/mDB(vit/m)ηmit/m

,

k = 1, . . . , m.

Since A, DA, and DB are bounded, Lemma 2.6 implies that E|ηmt | is bounded and it

follows from (2.11) that E| < hmRk σm, w >H | is bounded.It remains to show that the same holds for the second term in the divergence, i.e.

supm

E|TraceH D[hmRk σm]| < ∞. (2.12)

Let f1, . . . , fn be an orthonormal basis of Rn. For 1 ≤ r ≤ n and 0 ≤ l ≤ m− 1, definefrl ∈ H by

frls =

0, 0 ≤ s < lt/m√m/t(s − lt/m)fr, lt/m ≤ s < (l + 1)t/m

√t/mfr,

(l + 1)t/m < s ≤ 1

10

The set Sm ≡ frl, 1 ≤ r ≤ n, 0 ≤ l ≤ m−1 is orthonormal in H, thus can be extendedto an orthonormal basis Bm of H. Note that for any f ∈ Bm ∩ Sc

m, Dgm(w)f = 0.

Evaluating the Trace in (2.12) on the basis Bm gives

TraceH D[hmRk σm] =n,m−1∑r=1,l=0

< D[hmRk σm](w)frl, frl >

=n,m−1∑r=1,l=0

Rk σm(w)

[< (σm)−1e, D2gm

t (w)(frl, frl) >

− < (σm)−1Dσm(w)frl(σm)−1e, Dgmt (w)frl

]+DRk(σm)Dσm(w)frl < (σm)−1e, Dgm

t (w)frl >

.

In view of the definition of Rk, it suffices to show that

supm

E∣∣∣ m−1∑

l=0

D2gmt (w)(frl, frl)

]< ∞ (2.13)

and

supm

E[ m−1∑

l=0

|Dσm(w)frl| × |Dgmt (w)frl|

]< ∞. (2.14)

Let ηrl denote the path√

mDgm(w)frl. Differentiation in (2.8) yields ηrljt/m = 0 if

j ≤ l and

ηrljt/m =

√tA(vlt/m)fr+

j−1∑p=l

DA(vpt/m)(ηrlpt/m,∆pw)+

t

m

j−1∑p=l

DB(vpt/m)ηrlpt/m, j ≥ l+1

It follows from Lemma 2.6 that

supj,k,m

||ηrkjt/m||4 < ∞ (2.15)

Now let ρrl denote the path mD2gm(w)(frl, frl). This satisfies the equation

ρrljt/m = tDA(vlt/m)(ηrl

lt/m, fr)

+j−1∑p=l

tD2A(vpt/m)(ηrl

pt/m, ηrlpt/m,∆pw) + DA(vpt/m)(ρrl

pt/m,∆pw)

+1m

j−1∑p=l

tD2B(vpt/m)(ηrl

pt/m, ηrlpt/m) + DB(vpt/m)ρrl

pt/m

, j ≥ l + 1

11

(ρrljt/m = 0 for j ≤ l). Lemma 2.6 together with (2.15) now give

supr,j,k,m

||ρrkjt/m||4 < ∞

and this implies (2.13). Condition (2.14) can be established by a similar argument.With this, the proof of the theorem is complete.

We now return to the sde (1.2) defined in terms of vector fields. It follows from(2.9) that the Malliavin covariance matrix σ for ξt now has the form

σ = Yt

n∑i=1

∫ t

0

[ZsXi(ξs)] ⊗ [ZsXi(ξs)]∗dsY ∗t (2.16)

where Z satisfies is the d × d matrix-valued sde

Zs = I −n∑

i=1

∫ s

0

ZuDXi(ξu) dwi −∫ s

0

ZuDX0(ξu)du (2.17)

As before, Yt is the inverse matrix of Zt. Note that the stochastic integral in (2.17) isof Stratonovich type.

The following result and its proof (which I first learned from Bismut’s paper [Bi1])makes transparent the relationship between Hormander’s Lie bracket condition and thenon-degeneracy of σ.

Theorem 2.7. Suppose the Lie algebra generated by the vector fields X1, . . . , Xn span

Rd at ξ0. Then σ ∈ GL(d), a.s. (hence, by Theorem 2.5 ξt is absolutely continuous

with respect to Lebesgue measure on Rd). The proof will require the following

Lemma 2.8. Suppose y ∈ Rd, B is a smooth vector field on Rd and τ is a stopping

time such that

< ZsB(ξs), y >= 0 , ∀s ∈ [0, τ ]. (2.18)

Then for i = 1, . . . , n,

< Zs[Xi, B](ξs), y >= 0,∀s ∈ [0, τ ].

Proof. Applying Ito’s formula and using (2.17) , we obtain, for s ∈ [0, τ ],

0 = d < (ZsB(ξs)), y >

=⟨Zs[B, X0]ds +

n∑i=1

Zs[B, Xi](ξs) dwi, y⟩. (2.19)

12

Converting the Stratonovich integrals to Ito form, we see that the infinitesimal termin (2.19) has the form

G(s)ds +n∑

i=1

Zs[B, Xi](ξs)dwi

for some continuous adapted process G. Thus

< G(s)ds +n∑

i=1

Zs[B, Xi](ξs)dwi, y >= 0,∀s ∈ [0, τ ].

The conclusion now follows from the computational rules of Ito calculus: dwidt = 0,while dwidwj = δijdt.Proof of Theorem 2.7. Define V to be the set of vectors

Xi(ξ0), [Xi, Xj ](ξ0), [Xi, [Xj , Xk]](ξ0), . . . , 1 ≤ i, j, k, . . . ,≤ n.

We will show that V ⊆ σ, which clearly implies the result. For each 0 < s < t, define

Rs = spanZuXi(ξu/ 0 ≤ u ≤ s, 0 ≤ i ≤ n

andR = ∩0<s<tRs.

Then Rt = Range σ. Furthermore by the Blumenthal zero-one law, there exists adeterministic set R such that R = R a.s. Suppose y ∈ R⊥. Then with probability 1,there exists τ > 0 such that Rs = R for all s ∈ [0, τ ]. Then for i = 1, . . . , n

< ZsXi(ξs), y >= 0,∀s ∈ [0, τ ]. (2.20)

Iterating Lemma 2.8 on (2.20) gives

< ZsXi(ξs), y >, < Zs[Xi, Xj ](ξs), y >, < Zs[[Xi, Xj ], Xk](ξs), y >, . . . = 0

for s ∈ [0, τ ], and setting s = 0 shows that y ∈ (span V )⊥. Thus span V ⊆ R ⊆ Rt =Range V, as required.

In [Ma1], Malliavin proves further

Theorem 2.9. If

(detσ) ∈ ∩p≥1Lp(γ) (2.21)

then the density of ξt is C∞.

This is also proved in [Be1], by iterating the argument used to prove Theorem 2.5.Kusuoka and Stroock prove in [KS2] that (2.21) holds under Hormander’s generalcondition and they use this to give a complete probabilistic proof of Hormander’stheorem. We do not include this work here since we will prove a more general resultin the next section.

13

3. A Hormander theorem for infinitely degenerate operators

The material in this section comes from [BM3]. Let X0, . . . , Xn denote smoothvector fields defined on an open subset D of Rd. As before, we consider the second-orderdifferential operator

L ≡ 12

n∑i=1

X2i + X0. (3.1

Let Lie(X0, . . . , Xn) be the Lie algebra generated by the vector fields X0, . . . , Xn.According to the theorem of Hormander ([H], Theorem 1.1), L is hypoelliptic on D ifthe vector space Lie(X0, . . . , Xn)(x) has dimension d at every x ∈ D. It can be shownthat this is a necessary condition for hypoellipticity for operators of the form (3.1) withanalytic coefficients.

This is not the case if the vector fields X0, . . . , Xn defining L are allowed to be(smooth) non-analytic. This fact was strikingly illustrated by Kusuoka and Stroock,who studied differential operators on R3 of the form

La ≡ ∂2

∂x21

+ a2(x1)∂2

∂x22

+∂2

∂x23

.

They assume a to be a C∞ real-valued even function, non-decreasing on [0,∞), whichvanishes (only) at zero. It is shown in ([KS2], Theorem 8.41) that La is hypoelliptic onR3 if and only if a satisfies the condition lim

a→0+s log a(s) = 0. In particular, in the case

a(s) = exp(−|s|p), La is hypoelliptic provided p ∈ (−1, 0). However, it is clear thatany such operator fails to satisfy Hormander’s condition on the hyperplane x1 = 0.

In this section we present a sharp criterion for hypoellipticity that impliesHormander’s theorem and encompasses the class of superdegenerate hypoelliptic oper-ators of Kusuoka and Stroock.

We introduce the following notation. For a positive integer m, let E(m) denote anymatrix whose columns consist of X0, · · · , Xn together with all (iterated) Lie bracketsof the form

[Xi1 , Xi2 ]ni1,i2=0 ; · · · ; [Xi1 , [Xi2 , [Xi3 , · · · , [Xim1 , Xim ]] · · ·]]ni1,i2,···,im=0 ,

For x ∈ D and m ≥ 1, define

µ(m) ≡ smallest eigenvalue of [E(m)E(m)∗].

Observe that µ(m)(x) > 0 for some m ≥ 1 if and only if Hormander’s (general) conditionholds for the operator L at x ∈ D. In this case we will say that x is a Hormander pointfor L. We denote the set of all such points by H (note that H is an open subset ofD). The set D ∩ Hc of non-Hormander points of L will be denoted simply by Hc inthe sequel.

The main result of this section is the following

14

Theorem 3.1. Suppose the non-Hormander set Hc of L is contained in a C2 hyper-

surface S. Assume that at every point x in Hc

(i) at least one of the vector fields X1, · · · , Xn is transversal to S.

(ii) There exists an integer m ≥ 1, an open neighborhood U of x, and an exponent

p ∈ (−1, 0) such that

µ(m)(y) ≥ exp−[ρ(y, S)]p, ∀y ∈ U (3.3)

where ρ(y, S) denotes the Euclidean distance between y and S.

Then L is hypoelliptic on D.

We note that hypotheses (ii) in Theorem 3.1 controls the rate at which theHormander condition fails in a neighborhood of non-Hormander points. It is clearthat some such condition is necessary, since the Kusuoka-Stroock result cited aboveshows that the operator

Lp =∂2

∂x21

+ exp−|x1|p∂2

∂x22

+∂2

∂x23

is non-hypoelliptic if p ≤ −1. Furthermore, the non-hypoellipticity of the operator L−1

shows that the allowed range (-1, 0) for p in (3.3) is optimal. Hypothesis (ii) has thefollowing probabilistic interpretation in terms of the diffusion process ξt correspondingto L, defined in (1.2). It implies that if ξ starts at a non-Hormander point x on S, thenit will escape from x at a fast enough rate to acquire a non-singular distribution.

One can see that a hypothesis such as (i) is also necessary for the hypoellipticityof L (at least in the case where X0 = 0) by looking at the probabilistic picture. Forif Xi is tangential to S for each i = 1, . . . , n then, started from a point x ∈ S, ξ willstay on S. Hence ξt will not have a density in Rd at positive times t, which impliesthat L cannot be hypoelliptic (as we remarked earlier hypoellipticity of L implies theexistence of a density for ξt, t > 0).

The ideas underlying the proof of Theorem 3.1 are as follows. In Section 2 weintroduced the Malliavin covariance matrix σ corresponding to the random variable ξt

and showed that non-degeneracy of σ implies the existence of a density for ξt. Kusuokaand Stroock have proved that if the inverse moments of σ do not explode too quicklyas t ↓ 0, then the parabolic operator L + ∂/∂t is hypoelliptic, they also showed how todeduce hypoellipticity of L from this. More specifically, let σ denote the matrix definedin (2.16) and (2.17) (recall that the process ξ in these formulas is defined by equation(1.2)) and let ∆(t, ξ0) denote the determinant of σ. The following result is proved in[KS2].

Let D be an open set in Rd. Suppose that for every q ≥ 1 and every x in D, thereexists a neighborhood V ⊆ D of x such that

limt→0+

t log

supy∈V

‖∆(t, y)−1‖q

= 0. (3.4)

15

Then the differential operator L +∂

∂tis hypoelliptic on R × D. We will prove that

(3.4) is satisfied under a (parabolic version of) the hypotheses of Theorem 3.1. Thereare two stages to this argument:

(a) a local parameterization φ of the hypersurface S is introduced. Throughoutthis section, let ξ denote the diffusion process defined in (1.2). The quantity φ(ξt)measures the distance between ξt and S. We obtain probabilistic lower bounds on theLq-norms of the paths φ(ξ). These lower bounds are asymptotically sharp as q → ∞.

(b) We study the way the lower bounds in (a) are degraded under hypothesis (ii)of Theorem 3.1. This allows us to obtain sharp lower bounds on ∆ from which we areable to verify condition (3.4).

The proof of Theorem 3.1 uses basic stochastic analytic tools, e.g. Ito’s formula,Girsanov’s theorem, and the time-change theorem for stochastic integrals. In particu-lar, this work establishes a precise connection between the maximal class of hypoellipticoperators of the form (1.1) and the space-time scaling property of the Wiener process.This provides new insight into hypoellipticity that is not available through a classicalanalysis of the problem.

Before proving Theorem 3.1, we state and prove a parabolic version of the theorem.To this end, denote by F (m) the matrix obtained by deleting the column X0 from thematrix E(m) defined on Page 14 and define λ(m) to be the smallest eigenvalue of thematrix [F (m)F (m)∗]. Define K ≡ x ∈ D/ λ(m)(x) > 0

Suppose the set Kc is contained in a C2 hypersurface N of D. Assume that at everypoint in Kc, at least one of the vector fields X1, · · · , Xn is transversal to N . Assumefurther that for every x ∈ Kc, there exists an integer m ≥ 1, an open neighborhood U

of x, and p ∈ (−1, 0) such that λ(m)(y) ≥ exp−[ρ(y, N)]p for all y ∈ U . Then theoperator L + ∂

∂t is hypoelliptic on R × D.The proof of Theorem 3.3 will require a definition and several preliminary results

which we now state.

Definition. A non-negative random variable T is exponentially positive if there exist

positive constants c1 and c2 (which we will refer to as the characteristics of T ) such

that

P (T < ε) < e−c1/ε

for all ε ∈ (0, c2) .

We will make frequent use of the following well-known result ([IW], Lemma 10.5).

Lemma 3.4. Let y : [0, 1] × Ω → Rd be an Ito process of the form

dy(t) =n∑

i=1

ai(t) dWi(t) + b(t) dt, 0 ≤ t ≤ 1, (3.5)

16

where a1, . . . , an, b : [0, 1] × Ω → Rd are measurable adapted processes, all bounded

a.s. by a deterministic constant c3. Let r > 0 and define

τ ≡ infs > 0 : |ys − y0| = r ∧ 1. (3.6)

Then τ is an exponentially positive stopping time, and the characteristics of τ depend

only on r, c3, n and d.

The next two lemmas are central to our argument (in order to facilitate the exposi-tion, we delay their proofs to the end of the section). The first yields sharp probabilisticlower bounds when applied to diffusion processes with at least one non-zero initial timediffusion coefficient.

Lemma 3.5. Let y : [0, 1] × Ω → Rd be the Ito process in (3.5). Suppose that τ ≤ T

is an exponentially positive stopping time such that at least one diffusion coefficient ai

satisfies the condition: a.s., |ai(s)| ≥ δ, for all 0 ≤ s ≤ τ , for some deterministic δ > 0.

Then for every m ≥ 2, there exist positive constants c4, c5 and T0 such that for all

t ∈ (0, T0) and ε ∈ (0, c4 tm+1), the following holds

P

(∫ t∧τ

0

|y(u)|m du < ε

)< exp

−c5ε

−1m+1

. (3.7)

The constants c4 and c5 can be chosen to depend only on m, c3, δ, and the character-

istics of τ . The constant T0 depends only on the characteristics of τ .

The following result describes how the estimate (3.7) transforms under compositionof the integrand with a function that vanishes at zero, at an appropriate exponentialrate.

Lemma 3.6. Let τ be an exponentially positive stopping time. Suppose y is an Ito

process as in Lemma 3.4. Suppose further that y and τ satisfy an estimate of the form

(3.7) for some m > − pp+1 , where p ∈ (−1, 0). Then there exist positive constants T1,

c6, c7 and q > 1 such that for all t ∈ (0, T1) and all ε < exp−c6t− 1

q , the following

holds

P

(∫ t∧τ

0

exp(−|yu|p) du < ε

)< exp−c7| log ε|q. (3.8)

Furthermore, the constants T1, c6, c7 and q are completely determined by c3 in Lemma

3.4, c4, c5, and m in (3.7), p, and the characteristics of τ .

Finally, we will need the following two technical Lemmas (since the proofs arestraightforward we omit them)

17

Lemma 3.7. For every q ≥ 1 and every bounded set V ⊂ Rd there exists a positive

constant c8 such that for all t ∈ (0, 1) and x ∈ V

‖∆(t, x)−1‖2q2q ≤ c8

1 +

∞∑j=1

P

(Q(t, x) < j−

12dq

), (3.9)

where

Q(t, x) ≡ inf n∑

i=1

∫ t

0

< Zx(u)Xi(ξx(u)), h >2 du : h ∈ Rd, |h| = 1

. (3.10)

Lemma 3.8. Suppose that the hypotheses of Theorem 3.3 are satisfied. Then for

every x ∈ D there exists an integer m ≥ 1 such that exactly one of the following two

conditions holds:

(a) < λ(m)(x) > 0.

(b) There exists an open neighborhood U ⊆ D of x, a C2 function φ : U → R, and an

exponent p ∈ (−1, 0) such that

(i) φ(x) = 0 and ∇φ(x) · Xi(x) = 0, for at least one i = 1, . . . , n.

(ii) < λ(m)(y) ≥ exp(−|φ(y)|p), for all y ∈ U .

We now assume the hypotheses and notations of Theorem 3.3. Without loss ofgenerality, the vector fields X0, · · · , Xn may be supposed to be defined on the whole ofRd and to have compact support (this follows from a simple argument using a partitionof unity and the fact that hypoellipticity is a local property). We will be assume thisfrom now on).

Let x0 ∈ D and choose m so that either (a) or (b) of Lemma 3.8 hold for x0. Lett ∈ (0, 1) and suppose x lies in a fixed bounded neighborhood W of x0. Define

τ1 ≡ inf

s > 0 : |ξx

s − x| ∨ ‖Zxs − I‖ = 1/2

∧ 1 . (3.11)

By Lemma 3.4, τ1 is an exponentially positive stopping time with characteristics inde-pendent of x ∈ W .

Let Sd ≡ h ∈ Rd : |h| = 1 denote the unit sphere in Rd. Suppose h ∈ Sd andα = 1/18. Then

P

( n∑i=1

∫ t

0

< ZxuXi(ξx

u), h >2 du < ε

)≤ P (A ∩ E) + P (A ∩ Ec),

where

A ≡( n∑

i=1

∫ t∧τ1

0

< ZxuXi(ξx

u), h >2 du < ε))

18

and

E ≡( n∑

i=1

∫ t∧τ1

0

[ n∑j=1

< Zxu [Xi, Xj ](ξx

u), h >2 +

< Zxu

[Xi, X0] +

12

n∑j,k=1

[Xi, [Xj , Xk]

](ξx

u), h >2

]du < εα

),

By a lemma of Kusuoka-Stroock and Norris (cf. e.g. [B], Lemma 6.5; cf[KS2], TheoremA.24), there exist positive constants c9 and c10 such that

P (A ∩ Ec) ≤ c9 exp(−c10ε−α).

The constants c9 and c10 are independent of h ∈ Sd. Note that E ⊆ F ∩ G, where

F ≡( n∑

i,j=1

∫ t∧τ1

0

< Zxu [Xi, Xj ](ξx

u), h >2 du < εα

)

G ≡( n∑

i=1

∫ t∧τ1

0

< Zxu

[Xi, X0] +

12

n∑j,k=1

[Xi, [Xj , Xk]]

(ξx

u), h >2 du < εα

).

Thus

P

( n∑i=1

∫ t

0

< ZxuXi(ξx

u), h >2 du < ε

)≤ c9 exp(−c10ε

−α) + P (A ∩ F ∩ G). (3.12)

Applying a similar argument to P (A ∩ F ∩ G) gives

P (A ∩ F ∩ G) ≤ c11 exp(−c12ε−α2

) + P (A ∩ F ∩ G ∩ H) (3.13)

where

H :=( n∑

i,j,k=1

∫ t∧τ1

0

< Zxu

[Xi, [Xj , Xk]

](ξx

u), h >2 du < εα2)

.

It is easy to check that

G ∩ H ⊆( n∑

i=1

∫ t∧τ1

0

< Zxu [Xi, X0](ξx

u), h >2 du < εr1

)for some r1 ∈ (0, 1) and sufficiently small ε > 0.Thus

A ∩ F ∩ G ∩ H ⊆

(∫ t∧τ1

0

<

n∑i=1

(Zx

uXi(ξxu), h >2 +

n∑i,j=0

< Zxu [Xi, Xj ](ξx

u), h >2

du < εr2

)

19

for some r2 ∈ (0, 1) and sufficiently small ε > 0. Combining this with (3.12) and (3.13),one obtains

P

( n∑i=1

∫ t

0

< ZxuXi(ξx

u), h >2 du < ε

)≤ c9 exp(−c10ε

−α) + c11 exp(−c12ε−α2

)

+P

(∫ t∧τ1

0

n∑i=1

< ZxuXi(ξx

u), h >2 +n∑

i,j=0

< Zxu [Xi, Xj ](ξx

u), h >2

du < εr2

).

(3.14)Iterating the argument used to derive (3.14) yields the following:

For each m ≥ 1, there exist positive constants c13 and c14 and exponents r3 andr4 ∈ (0, 1), all independent of h ∈ Sd, such that for all t ∈ (0, T ), x ∈ W , andε ∈ (0, c14), one has

P

( n∑i=1

∫ t

0

< ZxuXi(ξx

u), h >2 du < ε

)

≤ exp(−c14ε−r3) + P

( N∑j=1

∫ t∧τ1

0

< ZxuKj(ξx

u), h >2 du < εr4

). (3.15)

Here the vector fields K1, . . . , KN are the columns of the matrix function X(m). Apply-ing a straightforward compactness argument (cf[B], Lemma 6.8) to (3.15), one obtains

P (Q(t, x) < ε) ≤ exp(−c15ε−r3)

+c16ε−d sup

P

( N∑j=1

∫ t∧τ1

0

< ZxuKj(ξx

u), h >2 du < c17εr4

): |h| = 1

(3.16)

for ε ∈ (0, c18) and positive constants c15, c16, c17, c18.Since (3.8) implies ‖Zx(u) − I‖ ≤ 1/2, for all 0 ≤ u ≤ τ1, it is easy to deduce

from (3.16) that

P (Q(t, x) < ε) ≤ exp(−c15ε−r3) + c16 ε−dP

(∫ t∧τ1

0

λ(m)(ξxu) du < c′18ε

r4

). (3.17)

We now consider each of the two cases (a) and (b) delineated in the conclusion ofLemma 3.8. Suppose first that (a) holds at x0 for some m ≥ 1. Then by continuity ofλ(m) there exist ρ > 0 and δ > 0 such that

λ(m)(y) ≥ δ (3.18)

20

for all y ∈ Bρ(x0), where Bρ(x0) denotes the open ball in Rd with center x0 and radiusρ. Let V ≡ Bρ/2(x0), assume x ∈ V , and let τ2 denote the first exit time of ξx fromV . Then (3.17) and (3.18) imply

P (Q(t, x) < ε)

≤ exp(−c15ε−r3) + c16ε

−dP

(τ1 ∧ τ2 ∧ t <

c18er4

δ

)(3.19)

≤ c19 exp(−c20ε−c21r5) (3.20)

provided t > c18εr4

δ , where r5 ≡ r3 ∧ r4 and c19, c20 and c21 are positive constants,independent of (t, x) ∈ (0, T )×V . Substituting (3.20) into (3.9) yields, for every q ≥ 1,the following inequality

‖∆(t, x)−1‖2q2q ≤ c8

(δt

c18

)−2dqr4

+ A(t)

,

where

A(t) ≡ 1 +∞∑

j=k

c19 exp(−c20jr6),≤ 1 +

∞∑j=1

c19 exp(−c20jr6) < ∞,

where r6 ≡ c21r5/2dq and k is the integer part of (δt/c18)−2dq/r4 . We conclude that‖∆(t, x)−1‖2q grows no faster than a power of t as t ↓ 0, uniformly with respect tox ∈ V . Hence (3.4) is satisfied.

We now turn to the case where (b) of Lemma 3.8 holds at the point x0. By thetransversality condition b(i) we may choose ρ > 0 small enough to ensure that Bρ(x0)Uand such that

|∇φ(x).Xi(x)| ≥ 12|∇φ(x0).Xi(x0)| > 0

for some 1 ≤ i ≤ n and every x ∈ Bρ(x0). Let V ≡ Bρ/2(x0). Assume x ∈ V and letτ3 denote the first exit time of ξx from Bρ/2(x). In view of Lemma 3.4 (b)(ii), (3.17)implies

P (Q(t, x) < ε) ≤ exp(−c15ε−r3) + c16ε

−d P

(∫ t∧τ1∧τ3

0

exp(−|ηxu|p) du < c18ε

r4

)(3.21)

where ηx(t) denotes the process φ(ξx(t)), t ≤ τ3. Applying Ito’s lemma to computeηx(t) gives

dηx(t) =n∑

i=1

∇φ(ξx(t)).Xi(ξx(t))dWi(t) + (L − c)φ(ξx(t)) dt. (3.22)

Lemma 3.8 (b)(i), Lemma 3.1, and (3.22) imply that the process y ≡ ηx and thestopping time τ ≡ τ1 ∧ τ2 satisfy the hypotheses of Lemma 3.5. Hence (3.7) is satisfied

21

for every m > 1 with τ = τ1 ∧ τ3 and y = ηx. Thus, by Lemma 3.6 there exist positiveconstants c6, c7, T1 and q′ > 1, all independent of x ∈ V , such that for all t ∈ (0, T1)and ε < exp(−c6t

− 1q′ )

P

(∫ t∧τ1∧τ3

0

exp(−|ηxu|p) du < ε

)< exp−c7| log ε|q′. (3.23)

Substituting this into (3.17) gives

P (Q(t, x) < ε) ≤ exp(−c15ε−r3) + c16ε

−d exp(−c7| log εr4 |q′) (3.24)

for t ∈ (0, T1) and ε < exp(−c6t− 1

q′ ). Combining (3.21) with (3.5), we arrive at

‖∆(t, x)−1‖2q2q ≤ c8

exp(2dqc6t

− 1q′ ) + c22

, 0 < t < T1 (3.25)

where

c22 ≡ 1 +∞∑

j=1

exp

(−c15j

r32dq

)+ c16j

1/2q exp(−c7| log jr42dq |q′

)

< ∞.

Note that the constants c6, c8 and c22 can all be chosen to be independent of x ∈ V .The right hand side of (3.25) explodes exponentially fast as t ↓ 0. However, sinceq′ > 1 we conclude that (3.4) holds also for this case, and the proof of Theorem 3.3 iscomplete.

Proof of Theorem 3.1.Assume L satisfies the hypotheses of Theorem 3.1. We borrow yet another tech-

nique from Kusuoka and Stroock [KS2]. The idea is to imbed the operator L in ananother operator L defined on a (d + 1)-dimensional domain and satisfying the hy-potheses of Theorem 3.3. This will prove the hypoellipticity of the parabolic operatorL + ∂

∂t , which in turn will imply hypoellipticity of L.Choose a smooth non-negative real-valued function ρ on (0, 1) such that both ρ(s)

and its derivative ρ′(s) are bounded away from zero for all s ∈ (0, 1). Define theoperator L on the domain D × (0, 1) by

L ≡ ρ(s)L +12

∂2

∂s2.

Then L has the form

L ≡ 12

n+1∑i=1

X2i + X0.

where X0(x, s) ≡ ρ(s)X0(x), Xi(x, s) ≡ ρ(s)1/2 Xi(x), 1 ≤ i ≤ n, and Xn+1 (x, s) ≡∂∂s , (x, s) ∈ D × (0, 1). Define F (m), λ(m), H, and K similarly to before, using the

22

vector fields Xi in place of the Xi’s. Since ρ and ρ′ are bounded away from zero on(0, 1), it is easy to see that there are positive constants δm such that

λ(m)(x, s) ≥ δm µ(m) (x), ∀(x, s) ∈ D × (0, 1). (3.26)

This implies Kc ⊆ Hc × (0, 1). Since Hc is contained in a C2 hypersurface S in D,then Hc is contained in the C2 hypersurface S× (0, 1) in D× (0, 1). By assumption, atleast one the vector fields X1, · · · , Xn is transversal to S at every point of Hc. Henceone of the vectors fields X1, · · · , Xn+1 is transversal to S × (0, 1) at every point of Hc.Hypothesis (ii) of Theorem 3.1 together with (3.26) imply that λ(m) satisfies the corre-sponding hypothesis in Theorem 3.3 ( with respect to the set Kc). We conclude fromTheorem 3.3 that the operator L + ∂

∂t is hypoelliptic on R×D × (0, 1). ConsequentlyL is hypoelliptic on D × (0, 1), and this implies L is hypoelliptic on D.

We conclude this section by proving Lemmas 3.5 and 3.6, which provided the keysteps in the preceding argument. The proof of Lemma 3.5 requires two preliminaryresults which we first state and prove.

Proposition 3.9. Suppose m ≥ 2 and a > 0. Let B : [0,∞) × Ω → R be a one-

dimensional Brownian motion. Then there exists a positive constant c23 such that

P

(∫ a

0

|B(u)|mdu < ε

)≤

√2 exp

(−c23a

1+ 2m ε−

2m

)for every ε > 0. The constant c23 may be chosen to be 2−7.

Proof. The result is known to hold for m = 2, with c23 = 2−7 (cf[IW], Lemma V.10.6,p. 399).

For m > 2 we apply Holder’s inequality and use the result for m = 2 to obtain

P

(∫ a

0

|B(u)|mdu < ε

)≤ P

(∫ a

0

|B(u)|2du < a1− 2m ε

2m

)≤

√2 exp(−c23a

1+2 ε−2)

for every ε > 0. This proves the proposition.

Proposition 3.10. Assume the notation and hypotheses of Lemma 3.5. Then for

every m ≥ 2 and q > 0, there exist positive constants c24 and c25 such that for all

ε > 0,

P

(∫ τ

0

|yu|mdu < ε, τ ≥ εq

)< c24 exp

(−c25ε

q+2(q−1)

m

).

The constants c24 and c25 depend only on c3 (the bound for the drift and diffusion

coefficients of y), δ, m, and q. In particular, they are independent of y0 and the

characteristics of τ .

23

Proof. Expressing (3.5) in components, it is sufficient to treat the case d = 1, n ≥ 1.In this case, the process y may be written in the form

yt = B(τ4(t)) +∫ t

0

b(u) du, 0 ≤ t ≤ T,

where

τ4(t) ≡∫ t

0

|a(u)|2 du, 0 ≤ t ≤ T,

and B : [0,∞) × Ω → R is a one-dimensional adapted Brownian motion started atB(0) = y0. By assumption, we may take |a(u)| ≥ δ > 0 for 0 ≤ u ≤ τ . Hence τ ≥ εq

implies τ4(τ) ≥ δ2εq. Furthermore, the function τ4(t) is strictly increasing on (0, τ)and changes of the time variable yield∫ τ

0

|yu|mdu ≥ c−126

∫ τ4(τ)

0

|y(τ−14 (s))|m ds

= c−126

∫ τ4(τ)

0

∣∣∣B(s) +∫ s

0

b(τ−14 (u))

|a(τ−14 (u))|2

du∣∣∣m ds,

where c26 ≡ nc23. Thus

P

(∫ τ

0


)

≤ P

(∫ δ2εq

0

∣∣∣B(s) +∫ s

0

b(τ−14 (u))

a(τ−14 (u))2

du∣∣∣m ds < c26ε, τ4(τ) ≥ δ2εq

). (3.27)

We now define a (bounded) process h : [0,∞) × Ω → R by

h(u) ≡

b(τ−1

4 (u))

|a(τ−14 (u))|2 , u ∈ (0, τ4(τ))

b(τ)|a(τ)|2 , u ≥ τ4(τ).

and we denote by B′ the process

B′(s) ≡ B(s) +∫ s

0

h(u)du, 0 ≤ s ≤ τ4(T ).

By the Girsanov theorem, B′ is a Brownian motion on Ω with respect to the measure

dP ′ ≡

exp(−

∫ τ4(T )

0

h(u) dB(u) − 12

∫ τ4(T )

0

h2(u) du

)dP.

Denote by Ωε the event

Ωε ≡(∫ δ2εq

0

|B′(s)|m ds < c26ε

),

24

and by G the Girsanov density

G ≡ exp(−

∫ τ4(T )

0

h(u)dB(u) − 12

∫ τ4(T )

0

h2(u) du

).

We now apply Holder’s inequality to (3.27) to obtain

P

(∫ τ

0


)≤ P (Ωε) ≤

√E(G−2)P ′(Ωε).

By Proposition 3.9, we have

P ′(Ωε) ≤√

2 exp(−2c25ε

q+2(q−1)

m

), (3.28)

where c25 ≡ 12c23c

− 2m

26 δ2(1+ 2m ) . The boundedness of h and τ4(T ) imply the existence of

a constant c27, depending only on the bounds of the foregoing quantities, such that

G−2 ≤ c27 exp(

2∫ τ4(T )

0

h(u) dB(u) − 2∫ τ4(T )

0

h2(u) du

). (3.29)

The desired conclusion follows from (3.28) and (3.29), together with the fact thatthe exponential on the right hand side of (3.29) is a Girsanov density (note that itis obtained by replacing h by (−2h) in the relation defining G), and therefore hasexpectation equal to 1.

Proof of Lemma 3.2.Note that for every q > 0, we may write

P

(∫ t∧τ

0

|yu|mdu < ε

)≤ P1 + P2, (3.30)

where

P1 ≡ P

(∫ t∧τ

0

|yu|mdu < ε, t ∧ τ ≥ εq

)and

P2 ≡ P (t ∧ τ < εq).

By Proposition 3.10,

P1 < c24 exp(−c25ε

q+2(q−1)

m

), (3.31)

where c24 and c25 are independent of t. Now τ is exponentially positive; so if T0 > t >

εq, thenP2 < exp(−c27ε

−q), (3.32)

25

where c27 and T0 denote the characteristics of τ . Combining (3.30) - (3.32), we obtain

P

(∫ t∧τ

0

|yu|m du < ε

)≤ c24 exp

(−c25ε

q+2(q−1)

m

)+ exp(c27ε

−q)

for t ∈ (0, T0) and 0 < ε < t1q . The lemma now follows by choosing as q the value for

which the two exponents

q + 2(q−1)m

and −q coincide, namely 1

m+1 .

Proof of Lemma 3.6.

Choose and fix m ≥ max− p

1+p , 2

and set q ≡ −mp(m+1) ; so q > 1. Define a

function ψ : [0,∞) → [0, 1) by

ψ(z) ≡

exp(−zpm ) z > 0

0 z = 0.

Note that(i) ψ is strictly increasing;(ii) ψ is convex in an interval (0, c28), for some positive constant c28.

Furthermore

P

(∫ t∧τ

0

exp(−|yu|p)du < ε

)= P

(∫ t∧τ

0

ψ(|yu|m) du < ε

).

We break the proof of the lemma into two cases. Firstly, suppose that |y0| ≥ c29

(≡ ( 12c28)

1m ). Let τ5 ≡ infs > 0 : |ys − y0| = 1

2c29 ∧ τ . Then there exists a positiveconstant c30, determined by m, p, and δ, such that

P

(∫ t∧τ

0


)≤ P (t ∧ τ5 ≤ c30ε). (3.33)

Applying Lemma 3.5 to the right hand side of (3.33), we deduce the existence of positiveconstants c31 and c32, such that if t ∈ (0, c31) and 0 < ε < t

c30, then

P

(∫ t∧τ

0


)≤ exp

(−c32

ε

).

The constants c30, c31, c32 depend only on m, p, c3, and the characteristics of τ . Thusthe conclusion of the lemma holds in this case.On the other hand, suppose |y0| < c29. We now set

τ6 ≡ infs > 0 : |ys|m = c28 ∧ τ.

26

Jensen’s inequality yields

P

(∫ t∧τ

0

ψ(|yu|m)du < ε

)≤ P

(∫ t∧τ6

0

|yu|m du ≤ (t ∧ τ6)ψ−1

(ε

t ∧ τ6

))≤ P1 + P2

where

P1 ≡ P

(t ∧ τ6 ≤ ε

c28

)and

P2

(∫ t∧τ

0

|yu|mdu ≤ (t ∧ τ6)ψ−1

(ε

t ∧ τ6

), t ∧ τ6 >

ε

c28

).

Note that P1 is of the same form as the probability on the right hand side of (4.7), andhence satisfies a similar estimate.

We now consider P2. An elementary argument shows that the convexity of ψ in the

interval (0, c28) implies that the function θ(u) ≡ uψ−1

(εu

)is increasing for u > ε

c28.

In particular, if t ∧ τ6 > εc28

then

(t ∧ τ6)ψ−1

(ε

t ∧ τ6

)≤ Tψ−1

(ε

T

)where T is any upper bound for t. This implies

P2 ≤ P

(∫ t∧τ6

0

|yu|mdu

)≤ Tψ−1

(ε

T

)). (3.34)

We now apply Lemma 3.5 to estimate the right hand side of (3.34). Thus

P2 < exp(−c5

Tψ−1

(ε

T

)− 1m+1

)≤ exp(−c7| log ε|q), (3.35)

for all 0 < t < c34 and ε < exp(−c33t

− 1q

), where c7, c33, and c34 are positive constants

exhibiting the appropriate dependence. Then (3.35) gives an estimate for P2 of therequired form, and the proof of the lemma is complete.

4. A study of a class of degenerate functional stochastic differential equa-tions

In the previous section we derived very sharp sufficient conditions for the existenceof smooth densities for the class of Ito processes

dξt =n∑

i=1

Xi(ξt)dwi + X0(ξt)dt. (4.1)

27

The arguments relied heavily on the existence of an invertible stochastic flow on Rd

determined by the equation. Furthermore, as is apparent from Theorem 2.7 and itsproof, the Lie bracket condition of Hormander appears as a result of the interactionbetween Z, the derivative of the stochastic flow, and the vector fields Xi. The existenceof a stochastic flow is the analytic counterpart of the probabilistic fact that the solutionto the classical Ito equation (4.1) is a Markov process. In this section (which comesfrom [BM2]) we study a class of stochastic differential equations with far less analyticand probabilistic structure.

In order to describe this class of equations, we introduce the following notationand assumptions. Denote by:

η a continuous (deterministic) Rd-valued path defined on a time interval [−r, 0],for a fixed r > 0.

g, a bounded smooth map from Rd to Rd ⊗ Rn with bounded derivatives of allorders.

C the space of continuous paths x : [−r,∞) → Rd, equipped with the uniformnorm on compact time sets.

B : C → C a smooth bounded map, with bounded derivatives of all orders.Assume also that B is non-anticipating, i.e. for all t ≥ −r and x ∈ C, B(t, x) ≡ B(x)t

depends only on x(u) : u ≤ t.Hd[0, s], the d-dimensional Cameron-Martin space (cf. Section 2) with paths

and norm defined on the time interval [0, s]. Suppose that for all x ∈ C, the mapDB(x)/Hd : Hd → Hd and that

α ≡ sups∈[0,t],x∈C

||DB(x)||Hd[0,s] < ∞.

We consider the following stochastic functional differential equation

dxt = g(xt−r)dwt + B(t, x)dt, t ≥ 0 (4.2)

xt = η(t), −r ≤ t ≤ 0

where, as before, w is a standard Wiener process in Rn. Owing to the past-statedependence of the coefficients in the equation, the solution x will not generally be aMarkov process.

In addition to their theoretical interest, equations of the form (4.2) have importantapplications. For example, physical systems influenced by noise are often modeled asdiffusion processes (4.1). However, this choice of model assumes that the evolutionof the system at any instant in time depends only on the position of the system atthat instant, together with the noise input. Since physical systems are always subjectto some degree of inertia, this assumption is somewhat unrealistic. It would seem

28

that equations of the form (4.2), where the coefficients are allowed to depend on pastbehaviour, more accurately model such physical systems.

We establish the existence of smooth densities for x under hypotheses that allowdegeneracy of the matrix-valued function g. The general scheme of Chapter 2 is used.As before, we construct a Malliavin covariance σ for each xt, (t > 0) and deducethe existence of a smooth density for xt by demonstrating the strong invertibility ofσ. However the argument for doing this is, by necessity, quite different from that inChapter 3. Because of its non-Markovian nature*, x does not generate a stochasticflow on Rd. Thus the analysis used in Chapter 3 to study the non-degeneracy of σ

breaks down. We salvage the situation by proving and exploiting a property that seemspeculiar to equations of the form (4.2); namely we show (cf. Lemma 4.5) that for alltimes a ≥ 0 and k ≥ 1, x satisfies estimates of the form

P

(∫ a+r

a

|xt|2dt < ε

)= o(εk) as ε → 0.

This probabilistic lower bound on x is propagated from the initial condition η by thetime delay r in equation 4.2 **. We use this property to deduce non-degeneracy of theMalliavin covariance matrices, under appropriate hypotheses on g and B.

The main result of this section is

Theorem 4.1. Suppose there exist positive constants ρ, δ, an integer p ≥ 2 and a

function φ : Rd → R such that

(i) the Lebesgue measure of the set s ∈ [−r, 0] : φ(η(s)) = 0 is zero.

(ii) g satisfies

gg∗(x) ≥|φ(x)|pI, |φ(x)| < ρ

δI, |φ(x)| ≥ ρ.(4.3)

(iii) φ is C2 with bounded first and second derivatives and there is a positive

constant c such that

‖∇φ(x)‖ ≥ c, whenever |φ(x)| ≤ ρ. (4.4)

Suppose 0 < α√

2teαt < 1. Then the random variable xt defined by (4.2) is

absolutely continuous and has a C∞ density.

* In fact, the process x can be embedded as a Markov process with an infinite-dimensional state space. However, the generator of this process is a highly degenerateoperator whose analysis is way beyond the reach of existing PDE techniques.** There is no reason to believe that similar estimates hold for classical diffusion

processes.

29

We note that condition (ii) above allows degeneracy of g on the hypersurfaceD ⊆ Rd where φ vanishes, while condition (ii)) implies that D does not contain anysingular points.

Denote by F the map w → xt. In order to prove the theorem we must computethe Malliavin covariance matrix, defined formally by σ ≡ DF (w)DF (w)∗. This canbe done as follows. Let H denote the Cameron-Martin space and Pm the sequence ofpiecewise linear projections defined in Chapter 2, we consider the natural restriction F

of the map F to H, i.e. the map h ∈ H → kt where k ≡ η on [−r, 0] and

k′s = g(ks−r)h′

s + B(s, k), s > 0.

The matrix σ is then obtained as the limit in probability of the matrix sequence

DF (Pm)DF (Pm)∗.

The result of this computation is

Lemma 4.2. The Malliavin covariance matrix σ for the random variable xt is

σ =∫ t

0

Zug(xu−r)g(xu−r)∗Z∗udu (4.5)

where the process Z satisfies

Zs = I +∫ t

(s+r)∧t

Dg(xs−r)(., dws)∗Zs + DHdB(x)∗

( ∫ .

0

Z

)s

. (4.6)

It is at this point that the non-Markovian nature of the problem manifests itselfanalytically; in contrast to the situation in Chapter 3, there is now no reason to supposethat the matrix process Zs is non-degenerate for small values of s. We show in thenext Lemma that it is, however, non-degenerate for values of s close enough to t.

Lemma 4.3. Under the hypotheses of Theorem 4.1, Zs ∈ GL(d) for s ∈ [t − r, t] and

there exists a deterministic constant c < ∞ such that ||Z−1s || ≤ c for s ∈ [t − r, t].

Proof. For notational convenience, write the operator DHdB(x)∗ as K(x). For s ∈

[t − r, t] and any unit vector e ∈ Rd, (4.6) gives

Zse = e + K(x)s(∫ .

0

Zue du). (4.7)

Now ∣∣K(x)s(∫ .

0

Zue du)∣∣ ≤ ∣∣∣∣K(x)(

∫ .

0

Zue du)∣∣∣∣

Hd[0,s]

≤ α||∫ .

0

Zue du||Hd[0,s] = α( ∫ s

0

|Zue|2)1/2

. (4.8)

30

Substituting this back into (4.7), we deduce

||Zs|| ≤ 1 + α( ∫ s

0

||Zu||2du)1/2

which implies

||Zs||2 ≤ 2 + 2α

∫ s

0

||Zu||2du.

Applying Gronwall’s inequality to this yields

||Zs|| ≤√

2eαs ≤√

2eαt, ∀s ≤ t.

Combining this with (4.8) gives

∣∣∣∣K(x)s(∫ .

0

Zu du)∣∣∣∣ ≤ √

2teαtα < 1.

It follows from (4.7) that Zs ∈ GL(d),

Z−1s =

∞∑j=0

(−1)j

[K(x)s(

∫ .

0

Zu du)]j

and||Z−1

s || ≤ (1 − α√

2teαt)−1.

Let λ denote the smallest eigenvalue of σ and set ξs ≡ φ(xs), s ≥ −r. It followsfrom (4.5) and Lemma 4.3 that

λ ≥ 1c

∫ t

(t−r)∧0

(|ξu−r|p ∧ δ) du. (4.9)

The following two results are used to produce lower bounds on the Malliavincovariance matrices corresponding to equation 4.2. These lower bounds comprise theessential part of the proof of Theorem 4.1.

Lemma 4.4. Fix b > a > 0. Let y : [0,∞) × Ω → R be a measurable stochastic

process such that E supa≤t≤b

|y(t)|p is finite for every positive integer p. Suppose that y

satisfies

P

(∫ b

a

y(t)2dt < ε

)= o(εk).

Then for every positive constant α,

P

(∫ b

a

y(t)2 ∧ α dt < ε

)= o(εk).

31

Proof This result was originally proved as Lemma 3 in ([BM1], pp. 91–94). We givehere a simplified proof of the result.

Let A and B denote, respectively, the sets s ∈ [a, b] : y(s)2 ≤ α and s ∈ [a, b] :y(s)2 > α. Then

P

(∫ b

a

y(t)2 ∧ α dt < ε

)= P

(∫A

y(t)2 dt + αλ(B) < ε

)

≤ P

(∫A

y(t)2 dt < ε, αλ(B) < ε

)

= P

(∫ b

a

y(t)2 dt < ε +∫

B

y(t)2 dt, λ(B) < ε/α

)≤ P1 + P2.

Here

P1 := P

(∫ b

a

y(t)2 dt < ε +∫

B

y(t)2 dt, λ(B) < ε/α, supa≤t≤b

y(t)2 ≤ 1/√

ε

)and

P2 := P ( supa≤t≤b

y(t)2 > 1/√

ε).

Note that

P1 ≤ P

(∫ b

a

y(t)2 dt < ε +√

ε/α

)= o(εk)

by hypothesis.Using the finite-moment hypothesis on y and applying Markov Chebyshev’s in-

equality to the probability P2, we obtain P2 = o(εk). This completes the proof of thelemma.

Lemma 4.5 (Propagation lemma). Suppose that, for some −r < a < b, ξ satisfies

P

(∫ b

a

ξ2s ds < ε

)= o(εk). (4.10)

Then

P

(∫ b+r

a+r

ξ2s ds < ε

)= o(εk). (4.11)

Proof. Write g = (g1 . . . gn), where gi, 1 ≤ i ≤ n, are column d-vectors. Computingξs = φ(xs), s > 0, by Ito’s formula gives

dξs =n∑

i=1

∇φ(xs) · gi(xs−r) dwi(s) + G(s) ds, s > 0 (4.12)

32

where G is a bounded adapted real-valued process. We write

P

(∫ b+r

a+r

ξ2s ds < ε

)= P1 + P2

where

P1 ≡ P

(∫ b+r

a+r

ξ2s ds < ε,

n∑i=1

∫ b+r

a+r

[∇φ(xs) · gi(xs−r)]2 ds ≥ ε1/18

)and

P2 ≡ P

(∫ b+r

a+r

ξ2s ds < ε,

n∑i=1

∫ b+r

a+r

[∇φ(, xs) · gi(xs−r)]2 ds < ε1/18

).

In view of 4.12, an inequality of Kusuoka and Stroock (cf[KS2], Lemma 6.5) impliesthat P1 = o(εk) . Thus it is sufficient to show that P2 also has this property.

Write

a(s) =n∑

i=1

[∇φ(xs) · gi(xs−r)]2.

Then by (4.3) and (4.4), it follows that a(s) ≥ c2(|ξs−r|p ∧ δ) if |ξs| ≤ ρ.Define

A ≡ s ∈ [a + r, b + r] : |ξs| ≤ ρ and ≡ s ∈ [a + r, b + r] : |ξs| > ρ.

Then

P2 = P

(∫ b+r

a+r

ξ2s ds < ε,

∫ b+r

a+r

a(s) ds < ε1/18

)

≤ P

(∫ b+r

a+r

ξ2s ds < ε,

∫A

c2(|ξs−r|p ∧ δ) ds < ε1/18

)

= P

(∫ b+r

a+r

ξ2s ds < ε,

∫ b+r

a+r

c2(|ξs−r|p ∧ δ) ds < ε1/18 +∫

B

c2(|ξs−r|p ∧ δ) ds

)

≤ P

(∫ b+r

a+r

ξ2s ds < ε,

∫ b+r

a+r

c2(|ξs−r|p ∧ δ) ds < ε1/18 + c2δλ(B))

.

However∫ b+r

a+r

ξ2s ds < ε implies λ(B) < ε/ρ2. Thus the preceding probability is

≤ P

(∫ b+r

a+r

c2(|ξs−r|p ∧ δ) ds < ε1/18 + c2δε/ρ2

)

≤ P

(∫ b

a

(|ξs|p ∧ δ) ds < c′ε1/18

)33

for some positive constant c′, and for small enough ε. Assumption 4.10, Lemma 4.4,and Jensen’s inequality allow us to conclude that the probability on the right hand sideof the above inequality is o(εk). This implies that P2 = o(εk), and the proof of Lemma(4.2) is complete.

We are now in a position to complete the proof of Theorem 4.1. Recall that weneed to verify the condition

(detσ)−1 ∈ ∩p≥1Lp(γ). (4.13)

As before let λ denote the smallest eigenvalue of σ. Let n denote the integer such thatt ∈ ((n− 1)r, nr]. First, suppose n = 1. Hypothesis (i) of Theorem 4.1 and (4.9) imply

λ ≥ 1c

∫ t−r

−r

(|ξu|p ∧ δ) du =1c

∫ t−r

−r

(|ηu|p ∧ δ) du > 0.

Since the second integral is deterministic, 4.13 trivially holds in this case.On the other hand, suppose n > 1. Since [t− (n+1)r, t−nr] ⊂ [−r, 0], hypotheses

(i) implies

P

(∫ t−nr

t−(n+1)r

ξ2s ds < ε

)= o(εk).

We now iterate Lemma 4.5 n times on this estimate to obtain

P

(∫ t

t−r

ξ2s ds < ε

)= o(εk).

Applying Lemma 3.4 yields

P

(∫ t

t−r

ξ2s ∧ δ2/pds < ε

)= o(εk) (4.14)

Using Jensen’s inequality in (4.14) gives

P

(∫ t

t−r

|ξs|p ∧ δ ds < ε

)= o(εk).

Combining this with (4.9), we finally have

P (λ < ε) = o(εk).

This implies (4.13) and completes the proof of the theorem.

34

5. Some open problemsIn this section we describe a few open problems that we hope will serve as a

stimulus to further research.1. It would be interesting to generalize Theorem 4.1 to the class of fully hereditary

functional differential equations

dxt = A(t, x)dw + B(t, x)dt (5.1)

where both A(t, x) and B(t.x) are allowed to depend on the whole history of the pathxs : 0 ≤ s ≤ t. Kusuoka and Stroock [KS1] addressed this problem under a strongellipticity assumption, i.e. they have shown that xt has a smooth density for all positivet if there exists δ > 0 such that A(t, x)A(t, x)∗ ≥ δI,∀(t, x) ∈ [0,∞) × C. As far as Iam aware, Theorem 4.1 is the only result establishing the existence of densities for ageneral class of non-Markov Ito processes under hypotheses that allow degeneracy ofthe diffusion coefficient. An analogous result in the fully general setting (5.1) wouldtherefore be of considerable interest and importance.

2. The non-hypoellipticity of the operators

∂2

∂x21

+ exp−|x1|p∂2

∂x22

+∂2

∂x23

, p ≤ −1

described in Section 2, contrasts strikingly with a result proved by Fedii in 1971. Fediishowed that the operator on R2

∂2

∂x2+ exp(−|x|p) ∂2

∂y2, p < 0

is hypoelliptic for all negative values of p. It would be interesting to gain a deeperunderstanding, either by classical or probabilistic means, of the role that dimension isplaying in these results.

3. The probabilistic methods employed above can also be used to study quasilinearpartial differential equations. For example, let ψ denote a smooth non-linear and non-negative function defined on [0,∞) × Rd × R and consider the initial-value problem

∂u

∂t= Lu + ψ(t, x, u), (t, x) ∈ (0,∞) × Rd

u(0, x) = 0, x ∈ Rd

. ((5.2)

where L is defined in (1.1). In particular, A continuous weak solution to (5.2) is givenby the probabilistic representation

u(t, x) = E[ ∫ t

0

ψ(s, ξxt−s, u(s, ξx

t−s)) ds]

(5.3)

35

where ξx denotes process ξ in (1.2) with initial point x ∈ Rd. It should be possibleto use this probabilistic representation together with the ideas in of Section 3 to showthat, under suitable conditions, the solution u to (5.2) is smooth for t > 0. We note,however, that the problems is considerably more difficult than for the linear case treatedin Section 3, owing to the presence of u in the right hand side of (5.3).

Quasilinear problems of this type have received a great deal of attention in recentyears. Dynkin [D] and others have discovered a remarkable link between operators ofthe form L + ψ and a collection of stochastic processes called superprocesses.

4. A related issue is the study of the quasilinear Dirichlet problem. Suppose D isa bounded regular open subset of Rd with a C2 boundary, f is a smooth non-negativefunction defined on D × R, and g is a smooth non-negative function on ∂D

Lu = f(x, u), x ∈ D

u(x) = g(x), x ∈ ∂D

(5.4)

Under mild further conditions, a weak continuous solution of equation (5.4) exists,given implicitly by

u(x) + E[ ∫ τ

0

f(ξx(s), u(ξx(s))) ds]

= E[g(ξx(τ)] (5.5)

where τ = τ(x) is the first exit time of the diffusion ξx from D. Again, one woldhope to be able to study the regularity of the solution u to (5.4) via the representation(5.5) by using the methods of Section 3. The goal would be to establish smoothness ofu on D under conditions that allow degeneracy of the operator L (e.g. Hormander’scondition or even the superdegeneracy condition introduced in Theorem 3.1.).

References

[Be1] Bell, D. R., Some properties of measures induced by solutions of stochastic differ-ential equations, Ph. D. Thesis, Univ. Warwick, 1982.

[Be2] Bell, D. R., Degenerate Stochastic Differential Equations and Hypoellipticity, Pit-man Monographs and Surveys in Pure and Applied Mathematics, Vol. 79. Long-man, Essex, 1995.

[BM1] Bell, D. R., and Mohammed, S.-E. A., The Malliavin calculus and stochastic delayequations, J. Funct. Anal. 99, No1 (1991) 75–99.

[BM2] Bell, D. R., and Mohammed, S.-E. A., Smooth densities for degenerate stochasticdelay equations with hereditary drift, Ann. Prob. 23, no. 4, 1875-1894, 1995.

36

[BM3] Bell, D. R. and Mohammed, S.-E. A., An extension of Hormander’s theorem forinfinitely degenerate second-order operators, Duke Math J., 78, no. 3 (1995), 453-475.

[Bi1] Bismut, J. M., Martingales, the Malliavin calculus and hypoellipticity under gen-eral Hormander’s conditions, Z. Wahrsch. Verw. Gebiete 56 (1981) 529-548.

[Bi2] Bismut, J. M., Large Deviations and the Malliavin calculus. Progress in Mathe-matics, vol. 45, Birkhauser, Boston, 1984.

[D] Dynkin, E. B., Superdiffusions and parabolic nonlinear differential equations, Ann.Prob., 20, no. 2 (1992) 942-962.

[F] Fedii, V. S., On a criterion for hypoellipticity, Math. USSR sb., 14 (1971) 15-45.

[G] L. Gross, Potential theory on Hilbert space, J. Funct. Anal. No. 1 (1967) 123-181.

[H] Hormander, L, Hypoelliptic second order differential equations, Acta Math. 119:3–4 (1967) 147-171.

[IW] Ikeda, N, and Watanabe, S, Stochastic Differential Equations and Diffusion Pro-cesses, Second Edition, North-Holland-Kodansha, 1989.

[KS1] Kusuoka, S, and Stroock, D, Applications of the Malliavin calculus, I, TaniguchiSymposSA Katata (1982), 271–306.

[KS2] Kusuoka, S, and Stroock, D, Applications of the Malliavin calculus, Part II, Jour-nal of Faculty of Science, University of Tokyo, Sec1A, Vol32, No1 (1985) 1-76.

[Ma1] Malliavin, P, Stochastic calculus of variations and hypoelliptic operators, Proceed-ings of the International Conference on Stochastic Differential Equations, Kyoto,Kinokuniya, 1976, 195-263.

[Ma2] Malliavin, P, Ck-hypoellipticity with degeneracy, part II, Stochastic Analysis, A.Friedman and M. Pinsky (eds), 1978, 327-340.

[Mi] Michel, D., Regularite des lois conditionnelles en theorie du filtrage non-lineaireet calcul des variations stochastique. J. Funct. Anal. 41 No. 1 (1981) 1-36.

[NP] Nualart, D. and Pardoux, E., Stochstic calculus with anticipating integrands.Probab. Theory Rel. Fields 78 (1988) 535-581.

[O] D. Ocone, Malliavin’s calculus and stochastic integral representation of functionalsof diffusion processes. Stochastics 12 (1984) no. 3-4, 161-185.

37

Documents

STOCHASTIC DIFFERENTIAL EQUATIONS AND ...n00008500/ND.pdfSTOCHASTIC DIFFERENTIAL EQUATIONS AND HYPOELLIPTIC OPERATORS Denis R. Bell Department of Mathematics University of North Florida