
ON WEAK SOLUTION OF SDE DRIVEN BY INHOMOGENEOUS SINGULAR LÉVY NOISE

TADEUSZ KULCZYCKI, ALEXEI KULIK, AND MICHAŁ RYZNAR

T. Kulczycki and M. Ryznar were supported in part by the National Science Centre, Poland, grant no. 2019/35/B/ST1/01633. A. Kulik has been supported through the DFG-NCN Beethoven Classic 3 programme, contract no. 2018/31/G/ST1/02252 (National Science Center, Poland) and SCHI-419/11–1 (DFG, Germany).

Abstract. We study a time-inhomogeneous SDE in $\mathbb{R}^d$ driven by a cylindrical Lévy process with independent coordinates which may have different scaling properties. Such a structure of the driving noise makes it strongly spatially inhomogeneous and complicates the analysis of the model significantly. We prove that the weak solution to the SDE is uniquely defined, is Markov, and has the strong Feller property. The heat kernel of the process is presented as a combination of an explicit ‘principal part’ and a ‘residual part’, subject to certain $L^\infty(dx)\otimes L^1(dy)$ and $L^\infty(dx)\otimes L^\infty(dy)$-estimates showing that the residual part is negligible in short time, in a sense. The main tool of the construction is the analytic parametrix method, specially adapted to Lévy-type generators with strong spatial inhomogeneities.

1. Introduction

In this paper we study an SDE of the form

\[ dX_t = \int V_t(X_{t-}, z)\, N(dt, dz), \quad X_0 = x \in \mathbb{R}^d, \quad t \ge 0, \tag{1} \]

where $N(dt, dz)$ is a Poisson random measure, which corresponds to a symmetric Lévy process $Z = (Z_t, t \ge 0)$ in the usual sense that

\[ dZ_t = \int z\, N(dt, dz). \]

Heuristically, the dynamics of the process $X$ can be described as follows: whenever the driving process has a jump of amplitude $\triangle_t Z = z$, the process $X$ makes a jump of amplitude $\triangle_t X = V_t(X_{t-}, z)$. Such a description can be made rigorous if either the total intensity of jumps of $Z$ is finite (and then the jumps can be processed one by one), or the jump coefficient $V_t(x, z)$ satisfies a proper version of the Lipschitz condition w.r.t. $x$ (and then the solution to (1) can be obtained by the tools of the Itô-Lévy stochastic calculus, e.g. [19, Section IV.9]). In both these cases, $X$ is a strong solution to (1), i.e. a process adapted to the natural filtration generated by the Lévy noise.

In the current paper, we deal with a more sophisticated setting where the coefficient is only assumed to be Hölder continuous. In this case, one can still expect $X$ to be uniquely defined in law as a weak solution to (1). The guideline here is provided by the classical diffusion theory [34], based on an analytic study of the backward Kolmogorov equation for the (formal) generator associated with the SDE. The extension of this analytic theory to Lévy-driven SDEs has been a subject of intensive study; see the literature overview in Section 2.4 below. Such an extension is far from straightforward: because of the high diversity of the possible structures of the Lévy noise, numerous new effects appear, often requiring specific methods to be treated. In the current paper we approach a quite challenging case, where the driving process $Z$ has the form

\[ Z = (Z^1, \dots, Z^d), \tag{2} \]

with $Z^i$, $i = 1, \dots, d$, being independent scalar Lévy processes which have the weak scaling property (WSP), see (6) below. The jump coefficient will be assumed to have the natural form

\[ V_t(x, z) = A_t(x) z + U_t(x, z) \tag{3} \]

with the linear part $A_t(x) z$ being principal, in a sense, for small $|z|$. Clearly, when $U \equiv 0$, equation (1) is equivalent to

\[ dX_t = A_t(X_{t-})\, dZ_t, \quad X_0 = x \in \mathbb{R}^d. \]
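The jump-by-jump description given above for a noise of finite total intensity can be sketched in a few lines. This is only an illustration under stated assumptions: the scalar case $d = 1$, a compound Poisson noise with uniformly distributed jumps, and a hypothetical coefficient `V_example` of the form (3) with a Hölder-1/2 linear part. None of these choices come from the paper, and the singular infinite-intensity setting studied below is not covered by such a simulation.

```python
import random

def simulate_jump_sde(x0, horizon, rate, sample_jump, V, rng):
    """Process the jumps of a finite-intensity noise one by one.

    x0          -- starting point (a float; scalar case d = 1)
    horizon     -- final time T
    rate        -- total jump intensity of the driving compound Poisson noise
    sample_jump -- callable rng -> z, draws one jump of the noise
    V           -- jump coefficient V(t, x, z)
    Returns the list of (jump time, noise jump z, state after the jump).
    """
    t, x, path = 0.0, x0, []
    while True:
        t += rng.expovariate(rate)      # waiting time until the next jump of Z
        if t > horizon:
            return path
        z = sample_jump(rng)
        x = x + V(t, x, z)              # X_t = X_{t-} + V_t(X_{t-}, z)
        path.append((t, z, x))

def V_example(t, x, z):
    """Hypothetical coefficient A(x) z + U(z) with |U(z)| <= 0.1 |z|^2."""
    root = abs(x) ** 0.5
    a = 1.0 + 0.5 * root / (1.0 + root)  # Hölder-1/2 in x, values in [1, 1.5)
    return a * z + 0.1 * z * abs(z)

rng = random.Random(0)
path = simulate_jump_sde(0.0, 1.0, 20.0, lambda r: r.uniform(-1.0, 1.0), V_example, rng)
```

With `V = lambda t, x, z: z` the scheme simply reproduces the noise itself, which gives a quick sanity check of the bookkeeping.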

We stress that even the case of $A_t(x) = A(x)$, $U_t(x, z) \equiv 0$ and all $Z^i$, $i = 1, \dots, d$, having the same $\alpha$-stable distribution is quite complicated; for instance, the corresponding transition probability densities may fail to be locally bounded. Such an effect appears if the distributions of a jump for various starting points are mutually singular; for a detailed discussion we refer to [22, Section 4], where such models are called essentially singular. The essential singularity in the above setting is caused by a combination of two features: the fact that the Lévy measure of the process (2) is supported by the collection of the coordinate axes in $\mathbb{R}^d$ and thus is singular w.r.t. the Lebesgue measure, and a non-trivial rotation provided by the matrix $A(x)$. In this paper we make one more substantial step and allow the one-dimensional components of the noise to have different laws. To outline the new difficulties which appear in this setting, let us consider for a while $Z$ with $\alpha_i$-stable components, $i = 1, \dots, d$. For small $t$, the law of $Z_t$ is mainly concentrated around the axis with the number $j = \operatorname{argmin}_i \alpha_i$, which combined with a non-trivial rotation makes the model quite difficult to treat analytically.

The first steps in the study of essentially singular models have been made in [30], [28], [22] and [3]. In [30], the components of the noise were the same and $\alpha$-stable. The results of [30] were significantly extended in [3], where a time-inhomogeneous model with a drift was studied. In [22] general stable-like models have been treated, where the stability index and the spherical kernel (i.e. the distribution of the jump direction) are $x$-dependent. In [28], instead of stable noise, a more general class of noises has been treated, satisfying the weak scaling condition; see the definition in Section 2 below. In this paper we extend these previous results in several directions. First, in the setting of [28], where the cylindrical noise has the same laws of the coordinates, we remove several hidden limitations. Namely,

  • instead of the linear-in-$z$ coefficient $V(x, z) = A(x) z$, we consider coefficients of the form (3) with a principal linear part and a residual non-linearity;
  • time-inhomogeneous models are included in the study;
  • instead of Lipschitz continuity of the matrix coefficient $A(x)$, Hölder continuity is assumed.

Second, we make a further substantial step, treating a cylindrical noise which has different laws of the coordinates. As we have explained before, such an extension leads to substantial analytical difficulties; in addition, quite new effects may appear because of different scaling for various coordinates. Namely, we will see in Example 2.7 that, in this setting, non-trivial assumptions on the Hölder indices of the coefficients should be made, in striking contrast to the case of identical coordinates, or the stable-like case studied in [22].

To provide a comprehensive analysis of the new effects which appear due to a strongly inhomogeneous and singular Lévy noise, we restrict ourselves to models which do not contain a drift term, i.e. without a gradient term in the generator. Adding a drift term can lead to further complications because of a possible lack of the domination property in the case of the lower scaling index $\alpha < 1$. It appears that these problems can be resolved by the ‘flow corrector’ method introduced in [21], [32], see also the discussion in [22, Sections 6.1, 6.2]; such an extension is a topic of our ongoing research.

We will prove existence and uniqueness of the weak solution to (1), which will be shown to be a time-inhomogeneous Markov process. We will also provide a representation of the transition probability density of this process as a sum of an explicitly given ‘principal part’ and a ‘residual part’ subject to a set of estimates showing that this part is negligible in short time, in a sense. The ‘principal part’ will be given in the form

\[ \tilde p_{t,s}(x, y) = \frac{1}{|\det A_t(x)|}\, G_{s-t}\big((y - x)(A_t(x)^{-1})^T\big), \quad 0 \le t < s, \ x, y \in \mathbb{R}^d, \tag{4} \]

where $G_t(\cdot)$ is the distribution density of $Z_t$. Clearly, as a function of $y$, this is the distribution density of the variable

\[ X^{t,x}_s = x + A_t(x)(Z_s - Z_t), \tag{5} \]

which can be seen as a natural approximation to the value at the time instant $s$ of the solution to (1) which starts from the point $x$ at the time instant $t$.
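To make the formula (4) concrete, the following sketch evaluates the principal part for $d = 2$ under the assumption that both coordinates of $Z$ are standard Cauchy ($1$-stable) processes, for which the one-dimensional densities are explicit. The function names and the sample matrix are illustrative choices, not objects from the paper.

```python
import math

def cauchy_density(t, w):
    """Density at w of a standard symmetric 1-stable (Cauchy) process at time t > 0."""
    return t / (math.pi * (t * t + w * w))

def principal_part(A, t, s, x, y):
    """Evaluate (4) for d = 2 with G_r(w) = g_r(w_1) g_r(w_2), g_r the Cauchy density.

    A is the matrix A_t(x), frozen at the starting point, given as
    ((a11, a12), (a21, a22)); x and y are pairs of floats.
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    dy1, dy2 = y[0] - x[0], y[1] - x[1]
    # w = A^{-1}(y - x); written as a row vector this is (y - x)(A^{-1})^T.
    w1 = (a22 * dy1 - a12 * dy2) / det
    w2 = (-a21 * dy1 + a11 * dy2) / det
    r = s - t
    return cauchy_density(r, w1) * cauchy_density(r, w2) / abs(det)

A = ((1.0, 0.5), (0.0, 2.0))
on_diagonal = principal_part(A, 0.0, 0.5, (1.0, -1.0), (1.0, -1.0))
```

On the diagonal $y = x$ the value reduces to $G_{s-t}(0)/|\det A_t(x)|$, which is the quantity appearing in the on-diagonal estimates of Section 2.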

It is worth mentioning that recently the existence of densities for SDEs driven by singular Lévy processes has been studied in [12] (cf. also [8]).

The structure of the rest of the paper is the following. In Section 2 we introduce the assumptions, formulate the main results, and provide a comprehensive discussion of them, based on examples and an overview of related results available in the literature. Sections 3-5 contain the proofs. The proofs are rather technical, hence in order to improve readability we first explain the key steps of the proofs in Section 3. Numerous estimates required in the main proof are derived in Sections 4-5.

2. Main results

2.1. Assumptions. In this section, we collect all the assumptions we impose on our model. Let us begin with the description of the scalar Lévy processes involved, as the coordinates, in the representation (2). Let the characteristic exponent $\psi$ of a one-dimensional, symmetric Lévy process be given by

\[ \psi(\xi) = \int_{\mathbb{R}} (1 - \cos(\xi x))\, \nu(dx), \]

where $\nu$ is a symmetric, infinite Lévy measure. The corresponding Pruitt function $h(r)$ is given by

\[ h(r) = \int_{\mathbb{R}} \big(1 \wedge (|x|^2 r^{-2})\big)\, \nu(dx), \quad r > 0. \]

We will assume the following scaling conditions for the function $h$: for some $0 < \alpha \le \beta \le 2$ and $0 < C_1 \le 1 \le C_2 < \infty$,

\[ C_1 \lambda^{-\alpha} h(r) \le h(\lambda r) \le C_2 \lambda^{-\beta} h(r), \quad 0 < r \le 1, \ 0 < \lambda \le 1. \tag{6} \]

We claim that the above assumption is equivalent to the following weak scaling property for $\psi$: there are constants $0 < C_1^* \le 1 \le C_2^* < \infty$ such that

\[ C_1^* \lambda^{\alpha} \psi(\xi) \le \psi(\lambda \xi) \le C_2^* \lambda^{\beta} \psi(\xi), \quad |\xi| \ge 1, \ \lambda \ge 1. \tag{7} \]

The proof of the equivalence is postponed to Section 4.

Once the condition (6) (or equivalently (7)) is satisfied, we say that the characteristic exponent $\psi$ (or the Lévy measure $\nu$) has the weak scaling property with indices $\alpha, \beta$, and write $\psi \in \mathrm{WSC}(\alpha, \beta)$ (resp. $\nu \in \mathrm{WSC}(\alpha, \beta)$).

By $\psi_i$, $\nu_i$ and $h_i$ we denote the corresponding characteristic exponents, Lévy measures and Pruitt functions of the coordinates $Z^i$ of the process $Z = (Z^1, \dots, Z^d)$.

We will consider two cases:

(A) All characteristic exponents $\psi_i$, $i = 1, \dots, d$, are equal and $\psi_1 \in \mathrm{WSC}(\alpha, \beta)$.

(B) Characteristic exponents $\psi_i$, $i = 1, \dots, d$, are not the same and $\psi_i \in \mathrm{WSC}(\alpha, \beta)$, $i = 1, \dots, d$.

In both of these cases, the process $Z$ has the transition density $G_t(x, y) = G_t(y - x)$, where

\[ G_t(w) = \prod_{i=1}^{d} g_t^i(w_i), \quad w = (w_1, \dots, w_d) \in \mathbb{R}^d, \]

and $g_t^i$, $i = 1, \dots, d$, are the distribution densities of the coordinates (all $g_t^i$ are the same in the case (A)).

Next, we assume the following conditions on the coefficients.

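(C) For any $t \ge 0$, $x \in \mathbb{R}^d$, $A_t(x) = (a_{t,i,j}(x))$ is a $d \times d$ matrix, and there are constants $C_3, \dots, C_6 > 0$, $\gamma_1, \gamma_2 \in (0, 1]$ such that for any $s, t \ge 0$, $x, y \in \mathbb{R}^d$, $i, j \in \{1, \dots, d\}$,

\[ |a_{t,i,j}(x)| \le C_3, \tag{8} \]

\[ |\det(A_t(x))| \ge C_4, \tag{9} \]

\[ |a_{t,i,j}(x) - a_{t,i,j}(y)| \le C_5 |x - y|^{\gamma_1}, \tag{10} \]

\[ |a_{s,i,j}(x) - a_{t,i,j}(x)| \le C_6 |s - t|^{\gamma_2}. \tag{11} \]

The function $((0, \infty) \times \mathbb{R}^d \times \mathbb{R}^d) \ni (t, x, z) \mapsto U_t(x, z) \in \mathbb{R}^d$ is continuous and there are constants $C_7 > 0$ and

\[ \gamma_3 > \max(1, \beta) \tag{12} \]

such that for any $t \ge 0$, $x, z \in \mathbb{R}^d$,

\[ |U_t(x, z)| \le C_7 |z|^{\gamma_3}. \tag{13} \]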

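In the case (A), the Hölder indices $\gamma_1, \gamma_2$ can be arbitrarily small. In the case (B), these indices and the $U$-smallness index $\gamma_3$ should satisfy certain additional assumptions. Namely, we assume the following.

(D)
\[ \frac{\beta}{\alpha} < 1 + \gamma_1, \quad \frac{1}{\alpha} - \frac{1}{\beta} < \gamma_2, \quad \frac{\beta}{\alpha} < \gamma_3. \tag{14} \]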
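The three inequalities in (14) are easy to check for concrete indices. The helper below is a small illustrative sketch (the function name is not from the paper); it returns the truth value of each inequality separately, so one can see which of them fails.

```python
def condition_D(alpha, beta, gamma1, gamma2, gamma3):
    """Return the truth values of the three inequalities in (14), in order."""
    if not (0 < alpha <= beta <= 2):
        raise ValueError("expected 0 < alpha <= beta <= 2")
    return (
        beta / alpha < 1 + gamma1,          # spatial Hölder index of A
        1 / alpha - 1 / beta < gamma2,      # temporal Hölder index of A
        beta / alpha < gamma3,              # smallness index of U
    )

print(condition_D(1.0, 1.5, 0.6, 0.5, 1.6))
print(condition_D(0.5, 2.0, 1.0, 1.0, 3.0))
```

When $\alpha = \beta$ the inequalities reduce to $\gamma_1 > 0$, $\gamma_2 > 0$ and $\gamma_3 > 1$, which already follow from (C); the condition becomes restrictive only when the scaling indices differ.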

For brevity, for any $u > 0$ we will use the notation

\[ \kappa(u) = \big(u, h_1(1), \dots, h_d(1), h_1^{-1}(1), \dots, h_d^{-1}(1), h_1^{-1}(1/u), \dots, h_d^{-1}(1/u)\big). \]

2.2. Main statements. In this section, we formulate the main statements of the paper.

Theorem 2.1. Assume either (A),(C), or (B),(C),(D). Then for any $x \in \mathbb{R}^d$ the SDE (1) has a unique weak solution $X$. The process $X$ is a time-inhomogeneous Markov process which has a transition density $p_{t,s}(x, y)$. The transition density admits a representation

\[ p_{t,s}(x, y) = \tilde p_{t,s}(x, y) + r_{t,s}(x, y), \quad x, y \in \mathbb{R}^d, \ 0 \le t < s, \tag{15} \]

where the principal part $\tilde p_{t,s}(x, y)$ is given by (4) and the residual part $r_{t,s}(x, y)$ satisfies

\[ \int_{\mathbb{R}^d} |r_{t,s}(x, y)|\, dy \le c (s - t)^{\varepsilon_0}, \quad x \in \mathbb{R}^d, \]

where $\varepsilon_0$ is defined in Remark 5.1 and the constant $c$ depends only on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1, \dots, C_7, h_1(1), \dots, h_d(1)$.

Theorem 2.1 actually states that the distribution density of $X_s$ conditioned on $X_t = x$ can be approximated by the density of the variable (5), with the error of approximation controlled in the integral form. A natural question is whether other types of bounds for the residue, e.g. uniform in $x, y$, can be obtained. It is known that, in the essentially singular setting, the residue can be locally unbounded, see Example 2.5 below. Hence, in order to get a uniform bound for the residue, one has to impose some new intrinsic assumptions. Here we give one such assumption, formulated in a form inspired by the change of measure argument used in [30]. An alternative possibility would be to use a certain integral-in-$x$ condition, similar to (3.17) in [31] or (3.17)-(3.19) in [22].

Denote by $\mu$ the Lévy measure of the process $Z$, and define

\[ T^{t,z} f(x) = f(x + V_t(x, z)). \]

Assume the following.

(I) For all $t$ and $\mu$-a.a. $z$, $T^{t,z}$ is a bounded linear operator in $L^1(\mathbb{R}^d)$, and there exists $C_8 < \infty$ such that

\[ \|T^{t,z}\|_{L^1 \to L^1} \le C_8, \quad t \ge 0, \ z \in \operatorname{supp} \mu. \]

We have the following representation of the transition density.

Theorem 2.2. Let the conditions of Theorem 2.1 and the additional assumption (I) hold. Then the density $p_{t,s}(x, y)$ is bounded, that is,

\[ \sup_{x, y \in \mathbb{R}^d} p_{t,s}(x, y) < \infty, \quad 0 \le t < s < \infty. \]

Moreover, for any $\tau > 0$ there exists $c > 0$, depending only on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1, \dots, C_8, \kappa(\tau)$, such that the residual term in the representation (15) satisfies

\[ |r_{t,s}(x, y)| \le c\, G_{s-t}(0) (s - t)^{\varepsilon_0}, \quad 0 < s - t \le \tau, \ x, y \in \mathbb{R}^d. \]

In particular, the following two-sided on-diagonal estimate for $p_{t,s}(x, y)$ holds: for $0 < s - t \le \tau$, $x \in \mathbb{R}^d$,

\[ \frac{1}{|\det A_t(x)|} \big(1 - c (s - t)^{\varepsilon_0}\big) \le \frac{p_{t,s}(x, x)}{G_{s-t}(0)} \le \frac{1}{|\det A_t(x)|} \big(1 + c (s - t)^{\varepsilon_0}\big). \tag{16} \]

Denote by $\{P_{t,s}\}$ the evolutionary family corresponding to the process $X$ in the usual way: for any $0 \le t < s$, $x \in \mathbb{R}^d$ and a bounded Borel function $f : \mathbb{R}^d \to \mathbb{R}$,

\[ P_{t,s} f(x) = \int_{\mathbb{R}^d} p_{t,s}(x, y) f(y)\, dy. \]

Under just the basic conditions of Theorem 2.1, we prove Hölder continuity of this evolutionary family.

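Theorem 2.3. Assume either (A),(C), or (B),(C),(D). For any $0 < \gamma < \gamma' < \alpha$, $\gamma \le 1$, $0 < s - t \le \tau$, $x, y \in \mathbb{R}^d$ and a bounded Borel function $f : \mathbb{R}^d \to \mathbb{R}$ we have

\[ |P_{t,s} f(x) - P_{t,s} f(y)| \le c |x - y|^{\gamma} (s - t)^{-\gamma'/\alpha} \|f\|_\infty, \quad 0 \le t < s < \infty, \]

where $c$ depends only on $\gamma, \gamma', d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1, \dots, C_7, \kappa(\tau)$.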

2.3. Examples. Let us give several examples illustrating various specific issues of the model. Our first example shows that the distributions of the components of the Lévy noise $Z$ can be quite singular. Note that the simplest example of a Lévy measure $\nu \in \mathrm{WSC}(\alpha, \beta)$ is the symmetric $\alpha$-stable Lévy measure

\[ \nu(dx) = c\, \frac{dx}{|x|^{\alpha + 1}}, \tag{17} \]

for which

\[ h(r) = \frac{4c}{\alpha (2 - \alpha)}\, r^{-\alpha} \]

and thus (6) holds true with $\beta = \alpha$ and $C_1 = C_2 = 1$. The weak scaling property is in the same spirit as the (true) scaling property of the $\alpha$-stable Lévy measure, but is much more flexible.
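The closed form of $h(r)$ for the measure (17) can be confirmed numerically. The sketch below integrates the definition of the Pruitt function with a midpoint rule after the substitution $x = e^v$; the cutoff and the number of nodes are ad hoc choices made for illustration, and the agreement is only up to the resulting truncation error.

```python
import math

def pruitt_stable_numeric(r, alpha, c=1.0, n=20_000, cutoff=1e8):
    """Midpoint-rule value of h(r) for nu(dx) = c |x|^{-1-alpha} dx."""
    # With x = exp(v), nu(dx) on the half-line x > 0 becomes c exp(-alpha v) dv.
    lo, hi = math.log(r / cutoff), math.log(r * cutoff)
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        v = lo + (k + 0.5) * step
        x = math.exp(v)
        total += min(1.0, (x / r) ** 2) * c * math.exp(-alpha * v) * step
    return 2.0 * total  # the measure is symmetric: both half-lines contribute equally

def pruitt_stable_exact(r, alpha, c=1.0):
    """Closed form 4c / (alpha (2 - alpha)) * r^(-alpha)."""
    return 4.0 * c / (alpha * (2.0 - alpha)) * r ** (-alpha)

for alpha in (0.5, 1.0, 1.5):
    print(alpha, pruitt_stable_numeric(0.1, alpha), pruitt_stable_exact(0.1, alpha))
```

The exact scaling $h(\lambda r) = \lambda^{-\alpha} h(r)$ is visible directly in the closed form, which is why (6) holds with equal constants in this case.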

Example 2.4. (Discretized $\alpha$-stable measure) Let $\mu(dx)$ be obtained from the symmetric $\alpha$-stable measure (17) by discretization in the following way:

\[ \mu(dx) = \sum_{k=1}^{\infty} \frac{\nu(\{y : \rho_{k+1} < |y| \le \rho_k\})}{2} \big(\delta_{-\rho_k}(dx) + \delta_{\rho_k}(dx)\big), \]

where $\rho_k \searrow 0$ is a given sequence. Assume that $\{\rho_k\}$ decays not faster than geometrically; that is, for some $c > 0$,

\[ \rho_{k+1} \ge c \rho_k, \quad k \ge 1. \]

Then it is easy to show that the Pruitt function for $\mu$ satisfies, for some positive constants $B_1, B_2$,

\[ B_1 r^{-\alpha} \le h(r) \le B_2 r^{-\alpha}, \quad r \in (0, 1]; \tag{18} \]

for the reader’s convenience we prove this inequality in Appendix A below. This inequality yields immediately that the discretized measure $\mu$ belongs to the same class $\mathrm{WSC}(\alpha, \alpha)$ as the original $\alpha$-stable measure.
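The two-sided bound (18) can be observed numerically for a particular sequence. The sketch below assumes the dyadic choice $\rho_k = 2^{-k}$ (so that $\rho_{k+1} = \rho_k / 2$), which is only one admissible example, and truncates the series after finitely many atoms; the function name is illustrative.

```python
def discretized_pruitt(r, alpha, c=1.0, terms=200):
    """h(r) for the discretized measure mu with atoms at +-rho_k, rho_k = 2^{-k}."""
    total = 0.0
    for k in range(1, terms + 1):
        rho_k, rho_next = 2.0 ** (-k), 2.0 ** (-(k + 1))
        # nu({rho_{k+1} < |y| <= rho_k}) for nu(dx) = c |x|^{-1-alpha} dx
        mass = 2.0 * c / alpha * (rho_next ** (-alpha) - rho_k ** (-alpha))
        # half of this mass sits at each atom; the integrand is even in x
        total += mass * min(1.0, (rho_k / r) ** 2)
    return total

alpha = 1.0
ratios = [discretized_pruitt(10.0 ** (-j / 4), alpha) * (10.0 ** (-j / 4)) ** alpha
          for j in range(21)]
print(min(ratios), max(ratios))
```

For $\alpha = 1$, $c = 1$ and dyadic atoms the product $h(r) r^{\alpha}$ stays between $2$ and $6$ on $(0, 1]$, whereas the undiscretized measure gives the constant value $4$.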

The following two examples illustrate the difference between the integral-in-$y$ estimate for the residual term $r_{t,s}(x, y)$ from Theorem 2.1 and the uniform estimate for this term from Theorem 2.2. First, we note that, under just the basic assumptions of Theorem 2.1, the transition probability density may be locally unbounded.

Example 2.5. (See [30, Remark 4.23], [22, Example 4.2]). Let $d > 1$, let all the coordinates $Z^i$, $i = 1, \dots, d$, have the same $\alpha$-stable distribution, and let $V_t(x, z) = A(x) z$, where the matrices $A(x)$ are Hölder continuous in $x$, for each $x \in \mathbb{R}^d$ the matrix $A(x)$ is a rotation (hence, an isometry), and for any $x$ in some open cone with vertex at $0$ which satisfies $|x| \ge 1$ we have $A(x) e_1 = x / |x|$. Then for $\alpha + 1 \le d$, for any $x \in \mathbb{R}^d$ the transition probability density $p_t(x, y)$ is unbounded in any neighbourhood of the point $y = 0$.

In the above example, an ‘accumulation of mass’ effect appears due to the singularity of the noise combined with a non-trivial rotation provided by the matrix $A(x)$. The next example shows a typical situation where the additional condition (I) holds true, and thus the ‘accumulation of mass’ effect does not appear.

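Example 2.6. Let the function $V_t(x, z)$ be Lipschitz continuous in $x$ with the Lipschitz constant satisfying

\[ \mathrm{Lip}\big(V_t(\cdot, z)\big) \le \rho, \quad t \ge 0, \ z \in \mathbb{R}^d, \]

where $\rho < 1$. Then the mapping $I_{\mathbb{R}^d} + V_t(\cdot, z)$ has an inverse and

\[ \mathrm{Lip}\big([I_{\mathbb{R}^d} + V_t(\cdot, z)]^{-1}\big) \le \frac{1}{1 - \rho}, \quad t \ge 0, \ z \in \mathbb{R}^d. \]

Moreover, $[I_{\mathbb{R}^d} + V_t(\cdot, z)]^{-1}$ has a gradient, which is defined a.e. with respect to the Lebesgue measure and is bounded, see [7]. In addition, the following change of variables formula holds [18]:

\[ \int_{\mathbb{R}^d} f(x + V_t(x, z))\, dx = \int_{\mathbb{R}^d} f(v) \det\big(\nabla_v [I_{\mathbb{R}^d} + V_t(v, z)]^{-1}\big)\, dv. \]

This yields (I) with

\[ C_8 = \sup_{t, z} \operatorname*{ess\,sup}_{v} \big|\det\big(\nabla_v [I_{\mathbb{R}^d} + V_t(v, z)]^{-1}\big)\big| \le \frac{d!}{(1 - \rho)^d}. \]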
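The two constants in Example 2.6 follow from elementary estimates, which can be spelled out as follows. This is a sketch: the symbol $\Phi$ is shorthand introduced only for this display, and $m_{ij}$ denotes the entries of the a.e. defined gradient of $\Phi^{-1}$.

```latex
\[
\begin{aligned}
\Phi(x) &:= x + V_t(x, z), \\
|\Phi(x) - \Phi(y)| &\ge |x - y| - |V_t(x, z) - V_t(y, z)| \ge (1 - \rho)\,|x - y|, \\
\mathrm{Lip}\big(\Phi^{-1}\big) &\le \frac{1}{1 - \rho}, \\
\big|\det\big(\nabla \Phi^{-1}\big)\big|
  &\le \sum_{\sigma \in S_d} \prod_{i=1}^{d} |m_{i\sigma(i)}|
  \le \frac{d!}{(1 - \rho)^{d}} .
\end{aligned}
\]
```

The second line gives injectivity of $\Phi$ together with the Lipschitz bound for the inverse (surjectivity follows from the Banach fixed point theorem applied to $x \mapsto v - V_t(x, z)$), and the last line is the Leibniz expansion of the determinant combined with $|m_{ij}| \le \mathrm{Lip}(\Phi^{-1})$.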

Our last example explains why in the case (B), i.e. for a cylindrical noise which has different scaling indices of the coordinates, non-trivial assumptions on the Hölder indices of the coefficients should be made, in contrast to the case (A), where the Hölder indices can be arbitrarily small.

Example 2.7. Let $Z^i$, $i = 1, \dots, d$, be symmetric $\alpha_i$-stable with different $\alpha_i$, $i = 1, \dots, d$. The process $Z = (Z^1, \dots, Z^d)$ fits our case (B) with $\alpha = \min_i \alpha_i$, $\beta = \max_i \alpha_i$. In this example, we show that in such an extremely spatially non-homogeneous setting the additional assumption (D) is crucial in the sense that, without this condition, the structure of the transition density can be quite different.

Take $d = 2$ and $\alpha_1 = \alpha < \alpha_2 = \beta$. Take also $V_t(x, z) = A_t z$, where

\[ A_t = \begin{pmatrix} 1 & t^{\gamma} \\ t^{\gamma} & 1 \end{pmatrix} \]

is a matrix-valued function which depends on $t$ only. Then the additional assumption (I) holds true since each operator $T^{t,z}$ is just an isometry which corresponds to the shift of the variable $x \mapsto x + A_t z$ (we can also refer to Example 2.6 here). Denote, as usual, $f \asymp g$ if the ratio $f / g$ is bounded and separated from $0$. Then, by Theorem 2.2, one has

\[ p_{t,s}(x, x) \asymp G_{s-t}(0) \asymp (s - t)^{-\frac{1}{\alpha} - \frac{1}{\beta}} \]

provided that $\frac{1}{\alpha} - \frac{1}{\beta} < \gamma$, which is actually the second inequality in (14) (the first and the third one hold true automatically). For $\frac{1}{\alpha} - \frac{1}{\beta} > \gamma$ the situation changes drastically; namely, we have

\[ p_{0,s}(x, x) \le C s^{-\gamma - \frac{2}{\beta}} \quad \text{and} \quad \frac{s^{-\gamma - \frac{2}{\beta}}}{s^{-\frac{1}{\alpha} - \frac{1}{\beta}}} \to 0, \quad s \to 0+. \tag{19} \]

We prove this relation in Appendix A; here we give an informal explanation of the effect. The original noise has two components, a ‘weaker’ one and a ‘stronger’ one, which act along the 1st and the 2nd coordinate vectors $e_1, e_2$, respectively. The law of the solution to SDE (1) with $x = 0$, $t = 0$ is a convolution of the laws of the solutions $X^{(1)}, X^{(2)}$ to SDE (1), where instead of $Z$ we substitute these two components of the noise separately. Consider the projections of these laws on the direction $e_1$, the one where the ‘weaker’ noise acts. It is easy to show that the projection of the law of $X^{(1)}_s$ on $e_1$ has a distribution density $p^{(1,1)}_{0,s}(0, \cdot)$ with $p^{(1,1)}_{0,s}(0, 0) \asymp s^{-\frac{1}{\alpha}}$. On the other hand, any jump of the noise at the time $t$, having the amplitude $z$ and the direction $e_2$, produces a jump of the 1st coordinate with the amplitude $t^{\gamma} z$. Then it is not difficult to prove (and it is easy to believe) that the projection of the law of $X^{(2)}_s$ on $e_1$ has a distribution density $p^{(2,1)}_{0,s}(0, \cdot)$ with $p^{(2,1)}_{0,s}(0, 0) \asymp s^{-\gamma - \frac{1}{\beta}}$. Since $-\gamma - \frac{1}{\beta} > -\frac{1}{\alpha}$, this means that the projection of the law of $X^{(1)}_s$ in the direction $e_1$ is ‘more concentrated’ around $0$ than the same projection for the law of $X^{(2)}_s$. The direction $e_1$ is the ‘worst possible’ here in the sense that $A_0 e_1$ is equal to the first basis vector $e_1$ and is orthogonal to the second one $e_2$. One can actually show that the same ‘concentration comparison’ holds true for the projections on an arbitrary direction $l$. This means that, in the convolution of the laws of $X^{(1)}, X^{(2)}$, the first component is negligible when compared to the second one. It should be noted that, in this example, the one-dimensional noise $Z^2 e_2$ generates a two-dimensional distribution density, which is actually a hypoellipticity-type effect. This density appears to be principal for the entire solution, which indicates that, without a condition of the type (D), analysis of the SDE with different components of the cylindrical noise should involve a study of hypoellipticity features. A systematic study of that type does not seem realistic for SDEs with low regularity of the coefficients, thus we restrict ourselves to the case where the condition (D) holds and thus hypoellipticity-type effects do not come into play.
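The exponent comparison behind (19) is a one-line computation, written out here together with a numerical instance. The particular values $\alpha = 1$, $\beta = 3/2$, $\gamma = 1/4$ are chosen purely for illustration.

```latex
\[
\frac{s^{-\gamma - \frac{2}{\beta}}}{s^{-\frac{1}{\alpha} - \frac{1}{\beta}}}
  = s^{\left(\frac{1}{\alpha} - \frac{1}{\beta}\right) - \gamma}
  \xrightarrow[s \to 0+]{} 0
  \quad \Longleftrightarrow \quad
  \frac{1}{\alpha} - \frac{1}{\beta} > \gamma ,
\]
\[
\alpha = 1, \ \beta = \tfrac{3}{2}, \ \gamma = \tfrac{1}{4}: \qquad
\tfrac{1}{\alpha} + \tfrac{1}{\beta} = \tfrac{5}{3}, \qquad
\gamma + \tfrac{2}{\beta} = \tfrac{19}{12}, \qquad
\frac{s^{-19/12}}{s^{-5/3}} = s^{1/12} \to 0 .
\]
```

The same inequality $\frac{1}{\alpha} - \frac{1}{\beta} > \gamma$ is equivalent to $-\gamma - \frac{1}{\beta} > -\frac{1}{\alpha}$, the comparison of the two projected densities used in the informal explanation.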

2.4. Literature overview. Our main tool in the construction of the heat kernel $p_{t,s}(x, y)$ of the solution to SDE (1) is the parametrix method, properly adapted to the sophisticated model we have. The parametrix method was first proposed by Levi [33], Hadamard [17] and Gevrey [13] for differential operators and later extended by Feller [11] to a simple non-local setting. The first version of the parametrix method for non-local operators was developed by Kochubei [24], see also the monograph by Eidelman, Ivasyshen & Kochubei [9]. This method required the Lévy measure of the noise to be comparable with the rotationally invariant $\alpha$-stable Lévy measure, and $\alpha > 1$, i.e. the non-local part of the generator should dominate, in the order sense, the gradient part. These results have been extended in numerous directions, e.g. by Kolokoltsov [25], where the limitation $\alpha > 1$ was removed for operators without a gradient part; see also Chen & Zhang [5]. The parametrix method for the stable-like case, where the stability index is $x$-dependent, was first developed by Kolokoltsov [25]; in the papers of Kühn [27, 26] this problem was treated for a wider class of Lévy kernels assuming a kind of sector condition for the symbol of the operators. In Knopova & Kulik [21, 32] the parametrix method was extended to the super-critical case, where the (non-trivial) gradient part is not dominated by an $\alpha$-stable noise with $\alpha < 1$. In all these results the Lévy noise, principally, was comparable with the rotationally invariant $\alpha$-stable one. Lévy-type models with other types of the reference measure have been studied as well; see Bogdan, Knopova & Sztonyk [2], Kulczycki & Ryznar [29], where $\alpha$-stable reference measures with various types of spherical measure (i.e. the distribution of the jump directions) have been treated, and Grzywny & Szczypkowski [15], where the reference measure is rotationally invariant and satisfies the weak scaling condition. The symmetry assumption, typically imposed on the Lévy noise in order to simplify the technicalities, is not substantial; see the recent publications by Chen, Hu, Xie & Zhang [6, 4], Grzywny & Szczypkowski [15], Kulik [31] for the parametrix method for various non-symmetric Lévy-type models.

Essentially singular models, where the distributions of a jump for various starting points are mutually singular, lack a fixed reference measure, in striking contrast with the results listed above. This leads to considerable new technical difficulties; essentially singular models also exhibit new effects such as the one discussed in Example 2.5. For the first advances in the study of such models see Kulczycki, Ryznar & Sztonyk [30, 28] and Knopova, Kulik & Schilling [22], which we have already mentioned and discussed in the Introduction.

3. Road map to the proofs

3.1. The parametrix method. We will construct the transition density $p_{t,s}(x, y)$ of the unknown process using a proper modification of the parametrix method, which is a classical analytical method for the construction of fundamental solutions to elliptic and parabolic PDEs of second order; for a detailed overview of the history and the ideas the method is based on, we refer to [20] or [22]. Here we briefly outline the construction, taking into account the fact that the actual model is non-homogeneous in time.

Consider a (time-dependent) operator $L_t$ with the domain $C^2_\infty(\mathbb{R}^d)$, given by

\[
\begin{aligned}
L_t f(x) &= \mathrm{P.V.} \int_{\mathbb{R}^d} \big(f(x + V_t(x, z)) - f(x)\big)\, \mu(dz) \\
&= \sum_{k=1}^{d} \int_{\mathbb{R}} \big(f(x + u A_t(x) e_k + U_t(x, e_k u)) - f(x) - u \mathbf{1}_{|u| \le 1} \nabla f(x) \cdot A_t(x) e_k\big)\, \nu_k(du),
\end{aligned}
\]

where $\mu$ is the Lévy measure of the process $Z$, $\nu_k$ is the Lévy measure of the $k$-th component $Z^k$, $k = 1, \dots, d$, and P.V. means that the first integral is taken in the principal value sense. By virtue of the Itô formula, one can naturally expect that, once the solution $X$ to (1) is well defined and is a (time-inhomogeneous) Markov process, the operator $L_t$ should be its generator. The corresponding Kolmogorov backward differential equation for the transition probability density of $X$ has the form

\[ (\partial_t + L_{t;x})\, p_{t,s}(x, y) = 0, \quad 0 \le t < s, \ x, y \in \mathbb{R}^d; \tag{20} \]

here and below the subscript $x$ in $L_{t;x}$ indicates that the operator $L_t$ is applied with respect to the variable $x$. Together with the initial condition

\[ p_{t,s}(x, y) \to \delta_x(y), \quad s \to t, \tag{21} \]

this actually means that $p_{t,s}(x, y)$ is a fundamental solution to the parabolic Cauchy problem for the operator $L_t$. The strategy of the method is to construct a (candidate for) the required fundamental solution, and then to show that this kernel $p_{t,s}(x, y)$ indeed corresponds to the unique weak solution to (1).

To construct a candidate for the fundamental solution, we use the parametrix method, which, in wide generality, can be outlined as follows. Fix a function $p^{(0)}_{t,s}(x, y)$, which is $C^1$ in $t$ and $C^2_\infty(\mathbb{R}^d)$ in $x$ for fixed $s, y$, and define

\[ q^{(0)}_{t,s}(x, y) = -(\partial_t + L_{t;x})\, p^{(0)}_{t,s}(x, y). \]

Then the differential equation (20) can be written as

\[ (\partial_t + L_{t;x}) \big(p_{t,s}(x, y) - p^{(0)}_{t,s}(x, y)\big) = q^{(0)}_{t,s}(x, y). \]

Since we expect $p_{t,s}(x, y)$ to be a (true) fundamental solution, we can formally resolve the above equation as

\[ p_{t,s}(x, y) = p^{(0)}_{t,s}(x, y) + \int_t^s \int_{\mathbb{R}^d} p_{t,r}(x, v)\, q^{(0)}_{r,s}(v, y)\, dv\, dr, \quad 0 \le t < s, \ x, y \in \mathbb{R}^d. \tag{22} \]

The identity (22) can be seen as an integral equation for the unknown kernel $p_{t,s}(x, y)$, which is easier to deal with than the original differential equation (20). This is the essence of the method: we first construct a candidate for the transition probability density $p_{t,s}(x, y)$ as the solution to the integral equation (22) and then study its properties in order to show that this kernel indeed corresponds to the unique weak solution to (1).
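Formally, the integral equation (22) is resolved by iteration. The display below is a sketch of this standard step; the symbol $\circledast$ (from the amssymb package) for the space-time convolution of kernels is notation introduced here for illustration, and the convergence of the series is not addressed at this point, since it requires bounds on $q^{(0)}$.

```latex
\[
(f \circledast g)_{t,s}(x, y)
  := \int_t^s \int_{\mathbb{R}^d} f_{t,r}(x, v)\, g_{r,s}(v, y)\, dv\, dr ,
\]
\[
p = p^{(0)} + p \circledast q^{(0)}
\quad \Longrightarrow \quad
p = p^{(0)} + \sum_{k=1}^{\infty} p^{(0)} \circledast \bigl(q^{(0)}\bigr)^{\circledast k} .
\]
```

Substituting the right-hand side of (22) into itself repeatedly produces the partial sums of this series, so the series is the natural candidate for the solution whenever it converges.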

3.2. Choice of the zero-order approximation. One of the crucial points in the strat-

egy outlined above is the choice of the kernel p(0)t,s (x, y), which has a natural meaning of

the ‘zero-order approximation’ term for the unknown pt,s(x, y). This choice determines

the ‘differential error of approximation’ q(0)t,s (x, y), and should be precise enough to guar-

antee integrability of q(0)t,s (x, y); note that we require this integrability in order to treat the

integral equation (22) properly. We will choose p(0)t,s (x, y) in the form

p(0)t,s (x, y) =

1

|detAs(y)|Gs−t

((y − x)(As(y)−1)T

), (23)

where Gr(w) is the distribution density of a dynamically truncated Levy noise; see Section5 for its definition and properties. The density Gs−t is also dependent on ε > 0, howeverwe do not reflect this in our notation. Such a choice combines two ideas. The first one isthe classical parametrix idea that a good ‘zero-order approximation’ to the fundamentalsolution can be obtained by taking the heat kernel for an equation with constant coefficients(e.g. the Gaussian kernel in the diffusion setting) and substituting there the coefficientsfrozen at the endpoint y, s. This classical construction also suggests that negligible (ina sense) parts should be removed from the generator: in the diffusion setting this is thedrift (gradient) term, in our case this is the non-linear jump term Ut(x, z). Though, suchclassical parametrix construction appears to be not precise enough in the singular Levynoise setting. Namely, such a construction would suggest, instead of (23), the choice

p(0)t,s (x, y) =

1

|detAs(y)|Gs−t

((y − x)(As(y)−1)T

),

10 T. KULCZYCKI, A. KULIK, AND M. RYZNAR

recall that Gt(·) is the distribution density of Zt. However, in general, p(0)t,s (x, y) may

provide quite a poor approximation to pt,s(x, y): e.g. in the model from Example 2.7 itcan be verified easily that ∫

Rdp

(0)t,s (x, y) dy =∞,

in the striking contrast to the fact that pt,s(x, ·) should be a probability density. This is anessentially non-local effect; in order to avoid it we use the second idea to ‘cut off’ big jumps.In [30], [28] the cut off level was chosen small but fixed, which required Lipschitz continuityof the coefficients. In [22], a time dependent cut off level was proposed, which allows oneto treat the models where the coefficients are only assumed to be Holder continuous. Herewe use the same dynamic truncation idea, properly adapted to the current model. Namely,

$\tilde G_r(w)$ in (23) will be the distribution density of $\tilde Z_r = (\tilde Z^1_r,\dots,\tilde Z^d_r)$, where the components are independent and

$$\tilde Z^i_r = \int_0^r\int_{|u|\le R^{(i)}_\rho} u\,N^i(d\rho,du),$$

where $N^i(d\rho,du)$ is the Poisson point measure corresponding to $Z^i$, and the time-dependent truncation function $R^{(i)}_\rho = R^{(i)}_\rho(\varepsilon)$ is determined by means of the corresponding Pruitt function $h_i(r)$ and $\varepsilon>0$ of our choice. Note that in the case (A) the cut-off level is the same for all coordinates, while in the case (B) these levels can be quite different. This is the actual reason for the condition (D) to appear in the case (B): we will need this condition in order to balance, in a sense, the ‘cut off effects’ for various coordinates.

3.3. Functional analytical framework. It is convenient to treat (22) within the functional analytic framework introduced in [22, Section 5.2], properly adapted to the time non-homogeneous setting. Consider the Banach space $L_\infty(dx)\otimes L_1(dy)$ of kernels $k(x,y)$ satisfying

$$\|k\|_{\infty,1} := \operatorname*{ess\,sup}_{x\in\mathbb{R}^d}\int_{\mathbb{R}^d}|k(x,y)|\,dy < \infty.$$

Each kernel $k\in L_\infty(dx)\otimes L_1(dy)$ generates a bounded linear operator $K$ in the space $B_b = B_b(\mathbb{R}^d)$ of bounded measurable functions,

$$Kf(x) = \int_{\mathbb{R}^d}k(x,y)f(y)\,dy,\quad f\in B_b(\mathbb{R}^d),$$

with the operator norm $\|K\|_{B_b\to B_b}$ equal to the norm $\|k\|_{\infty,1}$. Denote by $P_{t,s}$, $P^{(0)}_{t,s}$, $Q^{(0)}_{t,s}$, $0<t<s$, the families of operators corresponding to the unknown transition probability density $p_{t,s}(x,y)$ and the kernels $p^{(0)}_{t,s}(x,y)$, $q^{(0)}_{t,s}(x,y)$ introduced above. Then (22) can be equivalently written as

$$P_{t,s} = P^{(0)}_{t,s} + \int_t^s P_{t,r}Q^{(0)}_{r,s}\,dr,\quad 0\le t<s. \tag{24}$$
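The identification of the operator norm on $B_b$ with the kernel norm $\|k\|_{\infty,1}$ has a simple finite-dimensional analogue, which the following minimal sketch illustrates. It assumes that the kernel is replaced by a small matrix (rows indexed by $x$, columns by $y$) and the integral by a sum; the function names and the sample matrix are hypothetical and serve only as an illustration.

```python
def kernel_norm(k):
    """Discrete analogue of ||k||_{inf,1}: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in k)


def apply_kernel(k, f):
    """Discrete analogue of Kf(x) = int k(x, y) f(y) dy."""
    return [sum(kxy * fy for kxy, fy in zip(row, f)) for row in k]


def sup_norm(f):
    """Supremum norm of a bounded 'function' given as a list of values."""
    return max(abs(v) for v in f)


def extremal_function(k):
    """A function with sup-norm 1 at which the operator norm is attained:
    the sign pattern of the row with the largest absolute sum."""
    worst_row = max(k, key=lambda row: sum(abs(v) for v in row))
    return [1.0 if v >= 0 else -1.0 for v in worst_row]


# Hypothetical 3x3 kernel, chosen only for illustration.
k = [[0.2, -0.5, 0.1],
     [0.4, 0.3, -0.6],
     [-0.1, 0.0, 0.2]]

f = extremal_function(k)
attained = sup_norm(apply_kernel(k, f))
print(kernel_norm(k), attained)
```

In this discrete setting the supremum of $\|Kf\|$ over $\|f\|\le 1$ is attained at the sign pattern of the worst row, which is the same mechanism that gives $\|K\|_{B_b\to B_b}=\|k\|_{\infty,1}$ for integral kernels.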

Let $0<\varepsilon\le\varepsilon_0$, with $\varepsilon_0$ defined in Remark 5.1. In the whole subsection $c$ denotes a constant dependent on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1,\dots,C_8, \kappa(\tau)$ and $\varepsilon$. In Lemma 5.10 below we prove an estimate for $q^{(0)}_{t,s}(x,y)$ in the $\|\cdot\|_{\infty,1}$-norm which actually can be written as a bound for the operator norm

$$\|Q^{(0)}_{t,s}\|_{B_b\to B_b} \le c(s-t)^{-1+\varepsilon},\quad 0<s-t\le\tau. \tag{25}$$

This allows us to treat (24), in a standard way, as a Volterra equation with a mild (integrable) singularity. Recall that each kernel $p_{t,s}(x,y)$ is supposed to be a probability density, hence it is necessary that

$$\|P_{t,s}\|_{B_b\to B_b} < \infty. \tag{26}$$


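The unique solution to (24) which satisfies (26) can be interpreted as a classical Neumann series

$$\begin{aligned}
P_{t,s} &= P^{(0)}_{t,s} + \sum_{k=1}^\infty\ \int\!\cdots\!\int_{t<r_1<\dots<r_k<s} P^{(0)}_{t,r_1}Q^{(0)}_{r_1,r_2}\cdots Q^{(0)}_{r_k,s}\,dr_1\dots dr_k\\
&= P^{(0)}_{t,s} + \int_t^s P^{(0)}_{t,r}Q_{r,s}\,dr,
\end{aligned} \tag{27}$$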

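where the operator

$$Q_{t,s} := Q^{(0)}_{t,s} + \sum_{k=1}^\infty\ \int\!\cdots\!\int_{t<r_1<\dots<r_k<s} Q^{(0)}_{t,r_1}\cdots Q^{(0)}_{r_k,s}\,dr_1\dots dr_k \tag{28}$$

corresponds to the kernel

$$q_{t,s}(x,y) := \sum_{k=0}^\infty q^{(k)}_{t,s}(x,y),\qquad q^{(k+1)}_{t,s}(x,y) := \int_t^s\int_{\mathbb{R}^d} q^{(k)}_{t,r}(x,v)\,q^{(0)}_{r,s}(v,y)\,dv\,dr,\quad k\ge 0. \tag{29}$$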

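The series (28), (29) converge uniformly in $0<s-t\le\tau$ in the operator norm $\|\cdot\|_{B_b\to B_b}$ and the norm $\|\cdot\|_{\infty,1}$, respectively. This follows easily from (25), since for $k\ge 1$

$$\begin{aligned}
\|q^{(k)}_{t,s}\|_{\infty,1} &= \Big\|\int\!\cdots\!\int_{t<r_1<\dots<r_k<s} Q^{(0)}_{t,r_1}\cdots Q^{(0)}_{r_k,s}\,dr_1\dots dr_k\Big\|_{B_b\to B_b}\\
&\le \int\!\cdots\!\int_{t<r_1<\dots<r_k<s} \|Q^{(0)}_{t,r_1}\cdots Q^{(0)}_{r_k,s}\|_{B_b\to B_b}\,dr_1\dots dr_k\\
&\le c^k\int\!\cdots\!\int_{t<r_1<\dots<r_k<s} (r_1-t)^{-1+\varepsilon}\cdots(s-r_k)^{-1+\varepsilon}\,dr_1\dots dr_k\\
&= (s-t)^{-1+k\varepsilon}\,\frac{(c\Gamma(\varepsilon))^k}{\Gamma(k\varepsilon)},
\end{aligned} \tag{30}$$

and the Gamma function $\Gamma(z)$ behaves asymptotically like $\sqrt{2\pi}\,z^{z-\frac12}e^{-z}\gg c^z$ as $z\to\infty$. This estimate yields

$$\|q_{t,s}\|_{\infty,1}\le c(s-t)^{-1+\varepsilon},\quad 0<s-t\le\tau. \tag{31}$$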
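The reason why the bounds of the form appearing on the right-hand side of (30) are summable in $k$ is that $\Gamma(k\varepsilon)$ eventually grows faster than any geometric sequence. The following sketch evaluates these bounds numerically on the logarithmic scale. The values $c=1$, $\varepsilon=0.5$ and $s-t=1$ are arbitrary illustrative assumptions, not constants from the paper, and the function names are hypothetical.

```python
import math


def term_bound(k, c, eps, T):
    """Value of T^{-1+k*eps} * (c*Gamma(eps))^k / Gamma(k*eps),
    computed through logarithms to avoid overflow of the Gamma function."""
    log_value = ((k * eps - 1.0) * math.log(T)
                 + k * (math.log(c) + math.lgamma(eps))
                 - math.lgamma(k * eps))
    return math.exp(log_value)


def partial_sum(c, eps, T, kmax):
    """Partial sum over k = 1..kmax of the bounds above."""
    return sum(term_bound(k, c, eps, T) for k in range(1, kmax + 1))


# Illustrative (assumed) parameters: c = 1, eps = 0.5, s - t = 1.
s200 = partial_sum(1.0, 0.5, 1.0, 200)
s400 = partial_sum(1.0, 0.5, 1.0, 400)
print(s200, s400, term_bound(200, 1.0, 0.5, 1.0))
```

The terms first grow and then decay super-exponentially, so doubling the number of terms does not change the partial sum at double precision. For larger $c$ or smaller $\varepsilon$ the decay sets in much later and intermediate terms can be very large, which is why the computation is arranged through `math.lgamma`.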

In Lemma 5.8 below we prove that $p^{(0)}_{t,s}(x,y)$ is bounded in the $\|\cdot\|_{\infty,1}$-norm, which similarly to (30) yields that the residue $r_{t,s}(x,y) = p_{t,s}(x,y)-p^{(0)}_{t,s}(x,y)$ satisfies

$$\|r_{t,s}\|_{\infty,1}\le c(s-t)^{\varepsilon},\quad 0<s-t\le\tau. \tag{32}$$

These representations and estimates form an essential part of the proof of Theorem 2.1.

3.4. Approximate fundamental solution and weak uniqueness of the solution. Let $C_\infty(\mathbb{R}^d)$ denote the space of continuous functions vanishing at infinity. Define $P_{t,t} = P^{(0)}_{t,t} = \mathrm{id}$, the identity operator. The operator families $\{P^{(0)}_{t,s}\}$, $\{Q^{(0)}_{t,s}\}$ have the following properties, see Lemmas 5.8, 5.10, 5.13, 5.15 and 5.16.

Lemma 3.1. Each of the operators $P^{(0)}_{t,s}$, $0\le t\le s$, $Q^{(0)}_{t,s}$, $0\le t<s$, maps $C_\infty(\mathbb{R}^d)$ to $C_\infty(\mathbb{R}^d)$. The corresponding families of operators are strongly continuous w.r.t. $t, s$.

Note that the operator norm $\|\cdot\|_{C_\infty\to C_\infty}$ is dominated by the norm $\|\cdot\|_{B_b\to B_b}$, thus the norm estimates from the previous section yield that the series (27) converges in the norm $\|\cdot\|_{C_\infty\to C_\infty}$ uniformly in $0\le s-t\le\tau$, and the series (28) converges uniformly for $\tau_1\le s-t\le\tau$, for any $0<\tau_1<\tau$. Moreover, a standard argument based on the strong continuity of $Q^{(0)}_{t,s}$, $t<s$, and the estimate (25) shows that for every $n\ge 1$ the operators $Q^{(n)}_{t,s}$ (defined by the kernels $q^{(n)}_{t,s}(x,y)$) are strongly continuous w.r.t. $t<s$. This yields

Corollary 3.2. Each of the operators $P_{t,s}$, $0\le t\le s$, $Q_{t,s}$, $0\le t<s$, maps $C_\infty(\mathbb{R}^d)$ to $C_\infty(\mathbb{R}^d)$. The corresponding families of operators are strongly continuous w.r.t. $t, s$.

In general, it might be quite difficult to prove that the kernel $p_{t,s}(x,y)$, constructed as a solution to the integral equation (22), solves the differential equation (20). We avoid this complicated step, using the following approximate procedure. Define for $\eta>0$

$$P_{t,s,\eta} = P^{(0)}_{t,s+\eta} + \int_t^s P^{(0)}_{t,r+\eta}Q_{r+\eta,s+\eta}\,dr,\quad 0\le t\le s.$$

The following lemma shows that $p_{t,s}(x,y)$ solves the backward Kolmogorov equation (20) in a certain approximate sense.

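Lemma 3.3. Let $\tau>0$ and a compact subset $F\subset C_\infty(\mathbb{R}^d)$ be fixed.

a) $$\|P_{t,s,\eta}f - P_{t,s}f\|\to 0,\quad \eta\to 0$$
uniformly in $0\le t\le s\le\tau$, $f\in F$.

b) For any $f\in C_0(\mathbb{R}^d)$ and $\eta>0$, the function $P_{t,s,\eta}f(x)$ is $C^1$ in $t$ and $C^2_0(\mathbb{R}^d)$ in $x$ on $[0,s]\times\mathbb{R}^d$, and thus the operators

$$\Delta_{t,s,\eta} = (\partial_t+L_t)P_{t,s,\eta},\quad t\le s,\ \eta>0$$

are well defined. These operators satisfy

$$\|\Delta_{t,s,\eta}f\|\to 0,\quad \eta\to 0 \tag{33}$$

uniformly in $s-t>\tau_1$, $0\le s\le\tau$, $f\in F$ for any $\tau_1>0$, and

$$\int_t^\tau\|\Delta_{t,r,\eta}f\|\,dr\to 0,\quad \eta\to 0 \tag{34}$$

uniformly in $0\le t\le\tau$, $f\in F$.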

The proof of this statement is remarkably simple, and requires only representation (24) and the continuity properties stated in Lemma 3.1 and Corollary 3.2. Thus we give it here.

Proof. Statement a) follows directly from the representation (24) and the continuity properties (Lemma 3.1 and Corollary 3.2).

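To prove (33), note that

$$\begin{aligned}
\Delta_{t,s,\eta}f &= (\partial_t+L_t)P^{(0)}_{t,s+\eta}f + \int_t^s(\partial_t+L_t)P^{(0)}_{t,r+\eta}Q_{r+\eta,s+\eta}f\,dr - P_{t,t+\eta}Q_{t+\eta,s+\eta}f\\
&= Q^{(0)}_{t,s+\eta}f + \int_t^s Q^{(0)}_{t,r+\eta}Q_{r+\eta,s+\eta}f\,dr - P_{t,t+\eta}Q_{t+\eta,s+\eta}f\\
&= Q_{t,s+\eta}f - \int_t^{t+\eta}Q^{(0)}_{t,r}Q_{r,s+\eta}f\,dr - P_{t,t+\eta}Q_{t+\eta,s+\eta}f;
\end{aligned}$$

in the last identity we have used that $Q_{t,s}$ is given by (28) and thus satisfies

$$Q_{t,s} = Q^{(0)}_{t,s} + \int_t^s Q^{(0)}_{t,r}Q_{r,s}\,dr.$$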

By the continuity properties Lemma 3.1 a),

$$\|Q_{t,s+\eta}f - P_{t,t+\eta}Q_{t+\eta,s+\eta}f\|\to 0,\quad \eta\to 0$$

uniformly in $s-t>\tau$, $f\in F$. Convergence

$$\Big\|\int_t^{t+\eta}Q^{(0)}_{t,r}Q_{r,s+\eta}f\,dr\Big\|\to 0,\quad \eta\to 0$$

follows by the bounds (25), (31). The same bounds combined with (33) yield (34). □

Lemma 3.3 provides an efficient tool for identifying weak solutions to the SDE (1). Note that it is easy to prove existence of a weak solution to (1) by smooth approximation of the coefficients and a compactness argument; see [21, Section 5] for such an argument explained in detail. To identify a weak solution to (1) with a given initial condition, we will consider the operator $\mathcal{L}$ defined by

$$\mathcal{L}\varphi(t,x) = \partial_t\varphi(t,x) + L_{t;x}\varphi(t,x),\quad \varphi\in D$$

on the set $D = C^{1,2}_\infty([0,\tau]\times\mathbb{R}^d)$ of functions $\varphi(t,x)$ which are $C^1$ in $t$, $C^2$ in $x$, and have their derivatives continuous and belonging to the class $C_\infty(\mathbb{R}^d)$ for any fixed $t$; $\tau>0$ here is a fixed number. The Itô formula yields that, for a weak solution $X$ to the SDE (1) with $X_0=x$ and $\varphi\in D$, the process

$$\varphi(s,X_s) - \int_0^s\mathcal{L}\varphi(r,X_r)\,dr,\quad s\in[0,\tau]$$

is a martingale. We can use, with minor changes, the argument from [31, Section 5.3] to derive from this the finite-dimensional distributions of $X$. Namely, let $f\in C_\infty(\mathbb{R}^d)$, $\tau>0$ be fixed. Taking $\varphi(t,x) = P_{t,\tau,\eta}f(x)$, we get for any $0\le s\le\tau$

$$\begin{aligned}
E\big[P_{\tau,\tau,\eta}f(X_\tau)\,\big|\,\mathcal{F}_s\big] - P_{s,\tau,\eta}f(X_s) &= E\Big[\int_s^\tau(\partial_r+L_{r;x})P_{r,\tau,\eta}f(X_r)\,dr\,\Big|\,\mathcal{F}_s\Big]\\
&= E\Big[\int_s^\tau\Delta_{r,\tau,\eta}f(X_r)\,dr\,\Big|\,\mathcal{F}_s\Big].
\end{aligned}$$

Then by Lemma 3.3, passing to the limit $\eta\to 0$, we get

$$E\big[P_{\tau,\tau}f(X_\tau)\,\big|\,\mathcal{F}_s\big] - P_{s,\tau}f(X_s) = 0,\quad s\in[0,\tau],$$

or, equivalently,

$$E\big[f(X_\tau)\,\big|\,\mathcal{F}_s\big] = P_{s,\tau}f(X_s)$$

for any $f\in C_\infty(\mathbb{R}^d)$ and any pair of time moments $\tau\ge s\ge 0$. Since

$$P_{t,s}f(x) = \int_{\mathbb{R}^d}f(y)\,p_{t,s}(x,y)\,dy,$$

this yields the identity

$$P(X_{s_1}\in A_1,\dots,X_{s_k}\in A_k) = \int\!\cdots\!\int_{A_1\times\dots\times A_k} p_{0,s_1}(x,v_1)\cdots p_{s_{k-1},s_k}(v_{k-1},v_k)\,dv_1\dots dv_k$$

valid for any $k\ge 1$, $0<s_1<\dots<s_k$ and Borel measurable $A_1,\dots,A_k$. This identifies uniquely the finite-dimensional distributions of $X$ and proves that $X$ is a (time non-homogeneous) Markov process with the transition density $p_{t,s}(x,y)$.

3.5. Outline of the rest of the proofs. Recall that our choice of $p^{(0)}_{t,s}(x,y)$ and $q^{(0)}_{t,s}(x,y)$ was dependent on $\varepsilon\in(0,\varepsilon_0]$. In this subsection we choose $\varepsilon=\varepsilon_0$.

To complete the proof of Theorem 2.1, we have to prove the bound on the residual term $\tilde r_{t,s}(x,y)$ in the decomposition (15). To do this, we use the decomposition

$$p_{t,s}(x,y) = p^{(0)}_{t,s}(x,y) + r_{t,s}(x,y)$$

and the bound (32) for the residual term $r_{t,s}(x,y)$, obtained by the parametrix method. Then, with $\tilde p_{t,s}(x,y)$ denoting the principal part in (15),

$$\tilde r_{t,s}(x,y) = p^{(0)}_{t,s}(x,y) - \tilde p_{t,s}(x,y) + r_{t,s}(x,y),$$

and the required bound follows from the estimate

$$\|p^{(0)}_{t,s}(x,y) - \tilde p_{t,s}(x,y)\|_{\infty,1}\le c(s-t)^{\varepsilon_0},\quad 0<s-t\le\tau, \tag{35}$$

which we prove in Lemma 5.18 below. Next, combining (32) and (35) we obtain

$$\|\tilde r_{t,s}(x,y)\|_{\infty,1}\le c(s-t)^{\varepsilon_0},\quad 0<s-t\le\tau. \tag{36}$$

Here $c$ is a constant dependent on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1,\dots,C_8, \kappa(\tau)$, but for $\tau\le\tau_0$, where $\tau_0$ is defined at the beginning of Section 5, the constant depends on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1,\dots,C_8, h_1(1),\dots,h_d(1)$. In fact (36) holds for all $0\le t<s$ with a constant dependent on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1,\dots,C_8, h_1(1),\dots,h_d(1)$. Indeed, if $s-t>\tau_0$ then

$$\|\tilde r_{t,s}(x,y)\|_{\infty,1} = \|p_{t,s}(x,y)-\tilde p_{t,s}(x,y)\|_{\infty,1}\le 2\le\frac{2}{\tau_0^{\varepsilon_0}}(s-t)^{\varepsilon_0}.$$

The functional analytic framework from Section 3.3 is also quite convenient for proving the uniform estimates for the residual term stated in Theorem 2.2; see a detailed discussion in [22, Section 5.2]. Namely, the uniform-in-$x,y$ bound for a (continuous) kernel is equivalent to the $\|\cdot\|_{L_1\to B_b}$ operator norm for the corresponding integral operator. In addition to the bound (25), we have

$$\|Q^{(0)}_{t,s}\|_{L_1\to B_b}\le c\,\tilde G_{s-t}(0)(s-t)^{-1+\varepsilon_0}$$

(Lemma 5.10, estimate (89)). Under the additional assumption (I), we also have

$$\|Q^{(0)}_{t,s}\|_{L_1\to L_1}\le c(s-t)^{-1+\varepsilon_0}$$

(Lemma 5.11). Then for any $k\in\mathbb{N}$, $j=0,\dots,k$ and $r_0<r_1<\dots<r_{k+1}$

$$\begin{aligned}
\|Q^{(0)}_{r_0,r_1}\cdots Q^{(0)}_{r_k,r_{k+1}}\|_{L_1\to B_b} &\le \|Q^{(0)}_{r_0,r_1}\|_{B_b\to B_b}\cdots\|Q^{(0)}_{r_{j-1},r_j}\|_{B_b\to B_b}\\
&\quad\times\|Q^{(0)}_{r_j,r_{j+1}}\|_{L_1\to B_b}\,\|Q^{(0)}_{r_{j+1},r_{j+2}}\|_{L_1\to L_1}\cdots\|Q^{(0)}_{r_k,r_{k+1}}\|_{L_1\to L_1}\\
&\le c^{k+1}\,\tilde G_{r_{j+1}-r_j}(0)\prod_{i=0}^k(r_{i+1}-r_i)^{-1+\varepsilon_0}.
\end{aligned}$$

If we take $r_0=t$, $r_{k+1}=s$, and $j$ such that $r_{j+1}-r_j=\max_i(r_{i+1}-r_i)$, then $\tilde G_{r_{j+1}-r_j}(0)\le\tilde G_{(s-t)k^{-1}}(0)$, since $\tilde G_u(0)$ is nonincreasing as a function of $u>0$. By Corollary 5.3, we have

$$\tilde G_{(s-t)k^{-1}}(0)\le c\,k^{d/\alpha}\,\tilde G_{s-t}(0),\quad k\in\mathbb{N}.$$

Also, by Lemma 4.8,

$$\tilde G_{s-t}(0)\le c\,G_{s-t}(0),\quad 0<s-t\le\tau.$$

Then, similarly to (30), (31) we get

$$\|Q_{t,s}\|_{L_1\to B_b}\le c\,G_{s-t}(0)(s-t)^{-1+\varepsilon_0},\quad 0<s-t\le\tau. \tag{37}$$

We have that $\|P^{(0)}_{t,s}\|_{B_b\to B_b}$ is bounded and, in addition, by (23)

$$\|P^{(0)}_{t,s}\|_{L_1\to L_\infty}\le c\,G_{s-t}(0).$$

Then using the parametrix representation (27) and repeating the argument above we get

$$\|P_{t,s}-P^{(0)}_{t,s}\|_{L_1\to L_\infty}\le c\,G_{s-t}(0)(s-t)^{\varepsilon_0},\quad 0<s-t\le\tau.$$


This is actually a uniform-in-$x,y$ bound for the residual term $r_{t,s}(x,y)$ in the decomposition obtained by the parametrix method. To complete the proof of Theorem 2.2, we prove the corresponding analogue of (37), see (116):

$$\|P_{t,s}-\tilde P_{t,s}\|_{L_1\to L_\infty}\le c\,G_{s-t}(0)(s-t)^{\varepsilon_0},\quad 0<s-t\le\tau,$$

where $\tilde P_{t,s}$ denotes the operator with the kernel $\tilde p_{t,s}(x,y)$ from the decomposition (15).

The proof of Theorem 2.3 (postponed to Section 5), stating the Hölder continuity for the evolutionary family $\{P_{t,s}\}$, is based on the parametrix representation (24) for this family and the Hölder continuity of the family $\{P^{(0)}_{t,s}\}$ involved in this representation.

4. One-dimensional density

This section is devoted to the study of one-dimensional components of the process $Z=(Z^1,\dots,Z^d)$. Recall that the characteristic exponent of $Z^i$ is $\psi_i$. In this section we fix $i\in\{1,\dots,d\}$ and $\psi$ denotes the fixed $\psi_i$. By $\nu$, $h$ and $g$ we denote the corresponding Lévy measure, the Pruitt function and the transition density, respectively. We will construct a truncated version $\tilde g$ of the transition density $g$. We will show various estimates of $g$, $\tilde g$ and their derivatives. These constructions and estimates will play a crucial role in making the parametrix construction in Section 5 work.

For $r>0$ we put

$$K(r) = \int_{\{x\in\mathbb{R}:\,|x|\le r\}}|x|^2r^{-2}\,\nu(dx).$$

We have the following relationship between $K(r)$ and $h(r)$ [16, Lemma 2.2],

$$h(r) = 2\int_r^\infty K(w)w^{-1}\,dw,\quad r>0. \tag{38}$$

We observe that, due to the infiniteness of the Lévy measure, $K(w)>0$ for $w>0$, hence $h(r)$ is strictly decreasing on $(0,\infty)$.

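Clearly, $r^2h(r)$, $r^2K(r)$ are increasing on $(0,\infty)$. Using the monotonicity of the function $r^2h(r)$ we can easily extend (6) to all $\theta>0$,

$$C_1\lambda^{-\alpha}h(\theta)(1\vee\theta^2)^{-1}\le h(\lambda\theta)\le C_2\lambda^{-\beta}h(\theta)(1\vee\theta^2),\quad 0<\lambda\le 1. \tag{39}$$

Let us observe that the scaling property (6) is equivalent to

$$C_1^{1/\alpha}h^{-1}(\theta)\lambda^{-1/\alpha}\le h^{-1}(\lambda\theta)\le C_2^{1/\beta}h^{-1}(\theta)\lambda^{-1/\beta},\quad \theta>h(1),\ \lambda\ge 1. \tag{40}$$

Moreover, this can be extended to all $\theta>0$ (via (39)),

$$C_1^{1/\alpha}h^{-1}(\theta)\lambda^{-1/\alpha}(1\vee h^{-1}(\theta)^2)^{-1}\le h^{-1}(\lambda\theta)\le C_2^{1/\beta}h^{-1}(\theta)\lambda^{-1/\beta}(1\vee h^{-1}(\theta)^2),\quad \lambda\ge 1. \tag{41}$$

The following important result was essentially proved in [16, Lemma 2.3]. For the reader's convenience we provide its proof.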

Lemma 4.1. Let $c = \big(\tfrac{2}{C_1}\big)^{2/\alpha}-1$. For $0<r\le r_0$ we have

$$h(r)\le c(1\vee r_0^2)K(r). \tag{42}$$

Moreover, for $|\xi|\ge\xi_0>0$,

$$\frac{1}{4c(1\vee\xi_0^{-2})}\,h(1/|\xi|)\le\psi(\xi)\le 2h(1/|\xi|). \tag{43}$$

Proof. Let $r\le 1$ and $\lambda_0 = \big(\tfrac{C_1}{2}\big)^{1/\alpha}<1$. Then, by (6), we have

$$2h(r)\le h(\lambda_0r).$$

Next, by (38) and monotonicity of $w^2K(w)$ we obtain

$$h(r)\le h(\lambda_0r)-h(r) = 2\int_{\lambda_0r}^r w^2K(w)w^{-3}\,dw\le 2r^2K(r)\int_{\lambda_0r}^r w^{-3}\,dw.$$

Hence

$$h(r)\le\frac{1-\lambda_0^2}{\lambda_0^2}K(r) = \Big(\Big(\frac{2}{C_1}\Big)^{2/\alpha}-1\Big)K(r),\quad r\le 1.$$

If $r_0>1$, then for $1\le r\le r_0$, by monotonicity of $w^2K(w)$ and $w^2h(w)$, we have

$$r^2h(r)\le r_0^2h(r_0)\,\frac{r^2K(r)}{K(1)}.$$

Hence,

$$h(r)\le r_0^2\,\frac{h(1)}{K(1)}\,K(r)\le\Big(\Big(\frac{2}{C_1}\Big)^{2/\alpha}-1\Big)r_0^2K(r).$$

The proof of (42) is completed. Next, by the inequality $1-\cos x\ge x^2/4$ for $|x|\le 1$, we obtain

$$\tfrac14K(1/|\xi|)\le\psi(\xi) = \int_{\mathbb{R}}(1-\cos(\xi x))\,\nu(dx)\le 2h(1/|\xi|).$$

Applying (42) we have

$$\psi(\xi)\ge\frac{1}{4c(1\vee\xi_0^{-2})}\,h(1/|\xi|),\quad |\xi|\ge\xi_0,$$

which completes the proof. □
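As a sanity check of (38), (42) and (43), one can look at the symmetric $\alpha$-stable case, where everything is explicit. The sketch below assumes $\nu(dx)=|x|^{-1-\alpha}dx$ and the standard definition of the Pruitt function $h(r)=\int(1\wedge|x|^2r^{-2})\,\nu(dx)$; in this case (6) holds with $C_1=C_2=1$ and $\alpha=\beta$. The function names and the numerical parameters are hypothetical choices made for illustration.

```python
import math


def K(r, alpha):
    """K(r) = r^{-2} int_{|x|<=r} x^2 nu(dx) for nu(dx) = |x|^{-1-alpha} dx."""
    return 2.0 * r ** (-alpha) / (2.0 - alpha)


def h(r, alpha):
    """Pruitt function int min(1, x^2/r^2) nu(dx) for the same nu (closed form)."""
    return 4.0 * r ** (-alpha) / (alpha * (2.0 - alpha))


def h_from_K(r, alpha, n=20000):
    """Right-hand side of (38), 2 int_r^inf K(w)/w dw, by the midpoint rule
    after the substitution w = r/t with t in (0, 1)."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        w = r / t
        # dw = r/t^2 dt, hence K(w)/w dw = K(w)/t dt
        total += K(w, alpha) / t
    return 2.0 * total / n


def psi(xi, alpha):
    """Characteristic exponent int (1 - cos(xi x)) nu(dx) for the same nu."""
    const = math.pi / (math.gamma(1.0 + alpha) * math.sin(math.pi * alpha / 2.0))
    return const * abs(xi) ** alpha


alpha, r, xi = 1.5, 0.3, 2.0
c = 2.0 ** (2.0 / alpha) - 1.0          # constant of Lemma 4.1 with C1 = 1
print(h(r, alpha), h_from_K(r, alpha))   # two sides of (38)
print(h(r, alpha) <= c * K(r, alpha))    # inequality (42) for r <= 1
print(K(1 / xi, alpha) / 4, psi(xi, alpha), 2 * h(1 / xi, alpha))
```

In this example the ratio $h(r)/K(r)$ equals $2/\alpha$ for every $r$, which is indeed dominated by $2^{2/\alpha}-1$, and $\psi(\xi)$ is a constant multiple of $|\xi|^\alpha$ lying between $\tfrac14K(1/|\xi|)$ and $2h(1/|\xi|)$.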

Now, we can give the arguments that (6) is equivalent to (7). If (6) holds, then (43) with $\xi_0=1$ implies (7) with $C_1^* = \frac{C_1}{8c}$, $C_2^* = 8cC_2$, where $c = (2/C_1)^{2/\alpha}-1$.

On the other hand, (7) implies the same scaling conditions for the maximal function $\psi^*(\xi) = \sup_{|x|\le|\xi|}\psi(x)$. Then (6) holds, since due to [14, Lemma 4] we have $\psi^*(\xi)\asymp h(1/|\xi|)$, $\xi\in\mathbb{R}$.

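Lemma 4.2. Let $\tau>0$. For $0<u\le\tau$ we have

$$c_1u^{1/\alpha}\le h^{-1}(1/u)\le c_2u^{1/\beta}, \tag{44}$$

where $c_1 = C_1^{1/\alpha}\big(h(1)\wedge\tfrac1\tau\big)^{1/\alpha}$ and $c_2 = C_2^{1/\beta}\big(h^{-1}\big(\tfrac1\tau\big)\vee 1\big)h(1)^{1/\beta}$.

Proof. Taking $\theta=1$ we can rewrite (6) as

$$C_1^{1/\alpha}h(1)^{1/\alpha}h(\lambda)^{-1/\alpha}\le\lambda\le C_2^{1/\beta}h(1)^{1/\beta}h(\lambda)^{-1/\beta},\quad 0<\lambda\le 1.$$

Putting $\lambda = h^{-1}(s)$, for $s\ge h(1)$, we have $(C_1h(1))^{1/\alpha}s^{-1/\alpha}\le h^{-1}(s)\le(C_2h(1))^{1/\beta}s^{-1/\beta}$. If $0<s_0\le s\le h(1)$ we have $s_0^{1/\alpha}s^{-1/\alpha}\le h^{-1}(s)\le h^{-1}(s_0)h(1)^{1/\beta}s^{-1/\beta}$. Choosing $\tfrac1s = u\le\tau$ we show that

$$C_1^{1/\alpha}\Big(\frac1\tau\wedge h(1)\Big)^{1/\alpha}u^{1/\alpha}\le h^{-1}(1/u)\le C_2^{1/\beta}\,h^{-1}\Big(\frac1\tau\wedge h(1)\Big)h(1)^{1/\beta}u^{1/\beta}.$$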

Now we state an easy corollary to (40) and Lemma 4.2.


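Corollary 4.3. Let $0<\varepsilon<1$ and $0<u\le\tau<\infty$. We have

$$\frac{h^{-1}(u^{\varepsilon-1})}{h^{-1}(u^{-1})}\le c\,u^{-\varepsilon/\alpha}, \tag{45}$$

where $c = c(h(1), h^{-1}(1/\tau), h^{-1}(1), \tau, \alpha, \varepsilon, C_1)$.

If $\tau_0 = (h(1)\vee 1)^{-\frac{1}{1-\varepsilon}}$, then for $u\le\tau_0$,

$$\frac{h^{-1}(u^{\varepsilon-1})}{h^{-1}(u^{-1})}\le C_1^{-1/\alpha}u^{-\varepsilon/\alpha}. \tag{46}$$

Moreover, for $0<\lambda\le 1$,

$$\frac{h^{-1}(1/u)}{h^{-1}(1/(\lambda u))}\le C_1^{-1/\alpha}\lambda^{-1/\alpha}\big(h^{-1}(\tau^{-1})\vee 1\big)^2. \tag{47}$$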

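Proof. Let $u\le\tau_0 = (h(1)\vee 1)^{-1/(1-\varepsilon)}$. We apply (40) with $\lambda = u^{-\varepsilon}$ and $\theta = u^{-1+\varepsilon}$ to get

$$\frac{h^{-1}(u^{\varepsilon-1})}{h^{-1}(u^{-1})} = \frac{h^{-1}(u^{\varepsilon-1})}{h^{-1}(u^{-\varepsilon}u^{\varepsilon-1})}\le C_1^{-1/\alpha}u^{-\varepsilon/\alpha}.$$

If $\tau_0\le u\le\tau$ then by monotonicity of $h^{-1}$,

$$h^{-1}(u^{\varepsilon-1})\le h^{-1}(1/\tau)\vee h^{-1}(1)\quad\text{and}\quad h^{-1}(u^{-1})\ge h^{-1}(\tau_0^{-1}).$$

Moreover, by Lemma 4.2, we obtain

$$h^{-1}(\tau_0^{-1})\ge C_1^{1/\alpha}\Big(\frac1\tau\wedge h(1)\Big)^{1/\alpha}\tau_0^{1/\alpha}.$$

It follows that for $\tau_0\le u\le\tau$,

$$\frac{h^{-1}(u^{\varepsilon-1})}{h^{-1}(u^{-1})}\le\frac{h^{-1}(1/\tau)\vee h^{-1}(1)}{C_1^{1/\alpha}\big(\frac1\tau\wedge h(1)\big)^{1/\alpha}\tau_0^{1/\alpha}}\,\tau^{\varepsilon/\alpha}u^{-\varepsilon/\alpha}.$$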

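Now we start to construct a truncated version $\tilde g$ of the transition density $g$. Fix $\varepsilon\in(0,\varepsilon_0]$, where $\varepsilon_0<1$ is defined in Remark 5.1. Let for $u>0$

$$R_u = R_u(\varepsilon) = h^{-1}\Big(\frac{1}{u^{1-\varepsilon}}\Big)$$

and

$$\tilde\psi_u(\xi) = \int_{|v|\le R_u}(1-\cos(v\xi))\,\nu(dv).$$

We have

$$\tilde\psi_u(\xi)\ge\psi(\xi)-2u^{\varepsilon-1}. \tag{48}$$

To prove (48) we note that

$$\int_{|v|>R_u}(1-\cos(v\xi))\,\nu(dv)\le 2\int_{|v|>R_u}\nu(dv)\le 2h(R_u) = 2u^{\varepsilon-1},$$

hence

$$\tilde\psi_u(\xi) = \psi(\xi)-\int_{|v|>R_u}(1-\cos(v\xi))\,\nu(dv)\ge\psi(\xi)-2u^{\varepsilon-1}.$$

Let $0<u<\infty$ and $w\in\mathbb{R}$. Put

$$\tilde g_u(w) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{iwz}e^{-\int_0^u\tilde\psi_r(z)\,dr}\,dz.$$

Since, by (48) and then by (7), $\int_0^u\tilde\psi_r(z)\,dr\ge u\psi(z)-(2/\varepsilon)u^\varepsilon\ge c|z|^\alpha$ for $|z|$ large enough, the function $\tilde g_u$ is a well defined density function such that $\tilde g_u\in C^\infty(\mathbb{R})$.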


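For $0<u<\infty$ and a measurable set $D\subset\mathbb{R}$ put

$$\tilde\nu_u(D) = \int_0^u\int_{|x|\le R_r}\mathbf{1}_D(x)\,\nu(dx)\,dr.$$

It is clear that $\tilde\nu_u$ is the Lévy measure of the infinitely divisible density $\tilde g_u$. For any $u>0$ put

$$m_u = \int_{\mathbb{R}}x^2\,\tilde\nu_u(dx).$$

Lemma 4.4. For any $\tau>0$ there is a constant $c = c(\alpha, C_1, h^{-1}(1/\tau), h^{-1}(1))$ such that for $u\le\tau$ we have

$$c\,R_u^2u^\varepsilon\le m_u\le R_u^2u^\varepsilon.$$

The upper bound holds for any $u>0$.

If $\tau_0 = h(1)^{-\frac{1}{1-\varepsilon}}$, then for $u\le\tau_0$

$$m_u\ge c\,R_u^2u^\varepsilon$$

with $c = c(C_1,\alpha)$.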

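Proof. The upper bound is clear:

$$m_u = \int_{\mathbb{R}}x^2\,\tilde\nu_u(dx)\le u\int_{|x|\le R_u}x^2\,\nu(dx)\le uR_u^2h(R_u) = R_u^2u^\varepsilon.$$

The lower bound follows from the scaling property. By (47),

$$\frac{R_u}{R_{u/2}}\le\Big(\frac{2^{1-\varepsilon}}{C_1}\Big)^{1/\alpha}\Big(1\vee h^{-1}\Big(\frac{1}{\tau^{1-\varepsilon}}\Big)\Big)^2.$$

By Lemma 4.1, we have for $r\le R_\tau$,

$$\frac{h(r)}{K(r)}\le\Big(\frac{2}{C_1}\Big)^{2/\alpha}\Big(1\vee h^{-1}\Big(\frac{1}{\tau^{1-\varepsilon}}\Big)\Big)^2.$$

Observing that $h^{-1}\big(\frac{1}{\tau^{1-\varepsilon}}\big)\le h^{-1}(1)\vee h^{-1}(1/\tau)$ and taking $c = \big(\frac{2}{C_1}\big)^{-2/\alpha}\big(1\vee h^{-1}(1)\vee h^{-1}(1/\tau)\big)^{-2}$ we have $c\,h(r)\le K(r)$ for $r\le R_\tau$. Moreover $c\,R_u\le R_{u/2}$ for $u\le\tau$. Then using monotonicity of $h$ we have

$$\begin{aligned}
m_u &= \int_0^u\int_{|x|\le R_r}x^2\,\nu(dx)\,dr = \int_0^uR_r^2K(R_r)\,dr\\
&\ge c\int_{u/2}^uR_r^2h(R_r)\,dr\ge\frac{c}{2}\,uR_{u/2}^2h(R_u)\ge\frac{c^2}{2}\,uR_u^2h(R_u) = \frac{c^2}{2}\,R_u^2u^\varepsilon.
\end{aligned}$$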
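The two-sided bound of Lemma 4.4 can be made concrete in the symmetric $\alpha$-stable case. The sketch below again assumes $\nu(dx)=|x|^{-1-\alpha}dx$ with the standard Pruitt function $h(r)=\int(1\wedge|x|^2r^{-2})\,\nu(dx)=\frac{4}{\alpha(2-\alpha)}r^{-\alpha}$; the function names and the values $\alpha=1.5$, $\varepsilon=0.1$, $u=0.2$ are hypothetical. It computes $m_u$ by a midpoint rule and compares it with $R_u^2u^\varepsilon$.

```python
def h_inv(s, alpha):
    """Inverse of h(r) = 4 r^{-alpha} / (alpha (2 - alpha))."""
    return (4.0 / (alpha * (2.0 - alpha) * s)) ** (1.0 / alpha)


def R(u, alpha, eps):
    """Truncation level R_u = h^{-1}(u^{eps - 1})."""
    return h_inv(u ** (eps - 1.0), alpha)


def second_moment(u, alpha, eps, n=20000):
    """m_u = int_0^u int_{|x| <= R_r} x^2 nu(dx) dr, midpoint rule in r."""
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * u / n
        Rr = R(r, alpha, eps)
        # int_{|x| <= Rr} x^2 |x|^{-1-alpha} dx = 2 Rr^{2-alpha} / (2 - alpha)
        total += 2.0 * Rr ** (2.0 - alpha) / (2.0 - alpha)
    return total * u / n


alpha, eps, u = 1.5, 0.1, 0.2
ratio = second_moment(u, alpha, eps) / (R(u, alpha, eps) ** 2 * u ** eps)
# Closed form of the same ratio in the stable case:
p = (1.0 - eps) * (2.0 - alpha) / alpha
closed_form = (alpha / 2.0) / (1.0 + p)
print(ratio, closed_form)
```

In this example the ratio $m_u/(R_u^2u^\varepsilon)$ does not depend on $u$ and is strictly between $0$ and $1$, in agreement with the lemma.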

Lemma 4.5. Let $\gamma>\beta$. For any $\tau>0$ there is a constant $c = c(C_2, \beta, \gamma, h^{-1}(1/\tau), h^{-1}(1))$ such that for $u\le\tau$ we have

$$\int_{|x|\le R_u}|x|^\gamma\,\nu(dx)\le c\,R_u^\gamma h(R_u) = c\,R_u^\gamma u^{-1+\varepsilon}.$$

Moreover, for $u\le\tau_0 = h(1)^{-\frac{1}{1-\varepsilon}}$ the above constant $c = c(C_2,\beta,\gamma)$. If $\gamma=2$ the above inequality holds for any $u>0$ and with $c=1$.

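Proof. Let $L(r) = \nu([r,\infty))$, $r>0$. By integration by parts

$$\int_{0<x\le R_u}x^\gamma\,\nu(dx)\le\limsup_{r\to0^+}r^\gamma L(r)+\gamma\int_{0<x\le R_u}x^{\gamma-1}L(x)\,dx.$$

Next, by (6),

$$\limsup_{r\to0^+}r^\gamma L(r)\le\limsup_{r\to0^+}r^\gamma h(r) = 0,$$

which implies

$$\int_{0<x\le R_u}x^\gamma\,\nu(dx)\le\gamma\int_{0<x\le R_u}x^{\gamma-1}L(x)\,dx\le\gamma\int_{0<x\le R_u}x^{\gamma-1}h(x)\,dx.$$

It follows from (6) that, if $R_u\le 1$, then

$$h(x) = h(R_u(x/R_u))\le C_2(x/R_u)^{-\beta}h(R_u),\quad x\le R_u.$$

If $1\le R_u\le R_\tau$, since $r^2h(r)$ is an increasing function and by (6), we have

$$h(x)\le h(x/R_u)\le C_2(x/R_u)^{-\beta}h(R_u)R_u^2,\quad x\le R_u.$$

The last two estimates yield

$$\int_{0<x\le R_u}x^{\gamma-1}h(x)\,dx\le C_2(R_\tau^2\vee1)R_u^\beta h(R_u)\int_{0<x\le R_u}x^{\gamma-\beta-1}\,dx = C_2(R_\tau^2\vee1)\frac{1}{\gamma-\beta}R_u^\gamma h(R_u).$$

This together with the estimate

$$R_\tau = h^{-1}\Big(\frac{1}{\tau^{1-\varepsilon}}\Big)\le h^{-1}(1/\tau)\vee h^{-1}(1)$$

ends the proof for arbitrary $\gamma>\beta$. Moreover, for $\tau=\tau_0$, we have $R_\tau=1$, which shows that the constant $c = C_2\frac{\gamma}{\gamma-\beta}$.

The assertion of the lemma for $\gamma=2$ is a consequence of the definition of the function $h$. □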

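Lemma 4.6. Let $\gamma\ge0$ and $\tau>0$. There are constants $c_1 = c_1(\alpha, h^{-1}(1/\tau), \tau, \gamma, \varepsilon, C_1)$ and $c_2 = c_2(\gamma)$ such that for any $0<u\le\tau$,

$$\frac{c_2}{h^{-1}(1/u)^{\gamma+1}}\le\int_{\mathbb{R}}|z|^\gamma e^{-\int_0^u\tilde\psi_r(z)\,dr}\,dz\le\frac{c_1}{h^{-1}(1/u)^{\gamma+1}}. \tag{49}$$

If $\tau = 1/h(1)$ then the constant $c_1 = c_1(h(1),\varepsilon,\alpha,\gamma,C_1)$.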

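Proof. By (48),

$$\int_0^u\tilde\psi_r(z)\,dr\ge u\psi(z)-(2/\varepsilon)u^\varepsilon.$$

Next,

$$\int_{\mathbb{R}}|z|^\gamma e^{-\int_0^u\tilde\psi_r(z)\,dr}\,dz\le e^{(2/\varepsilon)u^\varepsilon}\int_{\mathbb{R}}|z|^\gamma e^{-u\psi(z)}\,dz.$$

Moreover, by Lemma 4.1,

$$\int_{\mathbb{R}}|z|^\gamma e^{-u\psi(z)}\,dz\le 2+\int_{|z|>1}|z|^\gamma e^{-cuh(1/|z|)}\,dz,$$

where $c = \frac{1}{4((2/C_1)^{2/\alpha}-1)}$. By the same arguments as in the proof of [1, Lemma 16] we get

$$\int_{\mathbb{R}}|z|^\gamma e^{-uch(1/|z|)}\,dz\le\frac{c^*}{h^{-1}(1/u)^{\gamma+1}},\quad u\le1/h(1),$$

where $c^* = c^*(\gamma,\alpha,C_1)$. Since $h^{-1}(1/u)\le1$ for $u\le1/h(1)$, it follows that

$$\int_{\mathbb{R}}|z|^\gamma e^{-u\psi(z)}\,dz\le\frac{c_*}{h^{-1}(1/u)^{\gamma+1}},\quad u\le1/h(1),$$

where $c_* = c^*+2$. If $1/h(1)\le u\le\tau$, then from the above estimate and monotonicity of $h^{-1}$ we obtain

$$\int_{\mathbb{R}}|z|^\gamma e^{-u\psi(z)}\,dz\le c_*\,\frac{h^{-1}(1/\tau)^{\gamma+1}}{h^{-1}(1/u)^{\gamma+1}}.$$

Finally we conclude that

$$\int_{\mathbb{R}}|z|^\gamma e^{-u\psi(z)}\,dz\le c_*\big(h^{-1}(1/\tau)\vee1\big)^{\gamma+1}\frac{1}{h^{-1}(1/u)^{\gamma+1}}$$

for $u\le\tau$. The proof of the upper bound is completed.

To get the lower bound we observe that

$$\int_0^u\tilde\psi_r(z)\,dr\le u\psi(z)\le 2uh(1/|z|).$$

Hence, denoting $a = \frac{1}{h^{-1}(1/u)}$ we arrive at

$$\int_{\mathbb{R}}|z|^\gamma e^{-\int_0^u\tilde\psi_r(z)\,dr}\,dz\ge\int_{-a}^a|z|^\gamma e^{-2uh(1/|z|)}\,dz\ge 2e^{-2}\int_0^a|z|^\gamma\,dz = 2e^{-2}\frac{1}{\gamma+1}a^{\gamma+1},$$

which ends the proof of the lower bound. □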

Corollary 4.7. For $0<u\le\tau$

$$\frac{c_2}{2\pi h^{-1}(1/u)}\le\tilde g_u(0)\le\frac{c_1}{2\pi h^{-1}(1/u)}, \tag{50}$$

where $c_1, c_2$ are the constants from (49) corresponding to $\gamma=0$. The function $(0,\infty)\times\mathbb{R}\ni(u,x)\mapsto\tilde g_u(x)$ is continuous.

Proof. Since

$$\tilde g_u(x) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{ixz}e^{-\int_0^u\tilde\psi_r(z)\,dr}\,dz,$$

we get, by Lemma 4.6, the lower and upper estimates of $\tilde g_u(0)$. The continuity follows from the continuity of the map $(0,\infty)\times\mathbb{R}\ni(u,x)\mapsto e^{ixz}e^{-\int_0^u\tilde\psi_r(z)\,dr}$, the upper estimate in (49) and the bounded convergence theorem. □

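Lemma 4.8. For any $u>0$ we have

$$\sup_{x\in\mathbb{R}}|g_u(x)-\tilde g_u(x)|\le\frac{2u^\varepsilon}{\varepsilon}\,\tilde g_u(0), \tag{51}$$

$$\int_{\mathbb{R}}|g_u(x)-\tilde g_u(x)|\,dx\le\frac{2u^\varepsilon}{\varepsilon} \tag{52}$$

and

$$g_u(0)\le\tilde g_u(0)\le g_u(0)\,e^{\frac{u^\varepsilon}{\varepsilon}}. \tag{53}$$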

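Proof. The proof is similar to the proof of Proposition C.9 in [22]. Let $u\in(0,\tau]$ and $x\in\mathbb{R}$ be arbitrary. For any $z\in\mathbb{R}$ we have

$$u\psi(z) = \int_0^u\tilde\psi_r(z)\,dr+\int_0^u\int_{|v|>R_r}(1-\cos(vz))\,\nu(dv)\,dr.$$

It follows that

$$\begin{aligned}
g_u(x) &= \frac{1}{2\pi}\int_{\mathbb{R}}e^{ixz}e^{-u\psi(z)}\,dz\\
&= \frac{1}{2\pi}\int_{\mathbb{R}}e^{ixz}e^{-\int_0^u\tilde\psi_r(z)\,dr}\,e^{-\int_0^u\int_{|v|>R_r}(1-\cos(vz))\,\nu(dv)\,dr}\,dz\\
&= \int_{\mathbb{R}}\tilde g_u(x-z)\,P^{tail}_u(dz),
\end{aligned} \tag{54}$$

where $P^{tail}_u(dz)$ is the exponential (for the convolution) of the measure $\Lambda^{tail}_u$, i.e.

$$P^{tail}_u(A) = e^{-\Lambda^{tail}_u(\mathbb{R})}\sum_{k=0}^\infty\frac{1}{k!}\big(\Lambda^{tail}_u\big)^{*k}(A),\quad A\in\mathcal{B}(\mathbb{R}), \tag{55}$$

where $\Lambda^{tail}_u(A) = \int_0^u\nu(\{v\in A:|v|>R_r\})\,dr$. We have

$$\Lambda^{tail}_u(\mathbb{R}) = \int_0^u\nu(\{v\in\mathbb{R}:|v|>R_r\})\,dr\le\int_0^uh(R_r)\,dr = \frac{u^\varepsilon}{\varepsilon}. \tag{56}$$

It follows that

$$\Big|1-e^{-\Lambda^{tail}_u(\mathbb{R})}\Big|\le\frac{u^\varepsilon}{\varepsilon}. \tag{57}$$

Moreover, by (54) and (55), we get

$$|g_u(x)-\tilde g_u(x)|\le\tilde g_u(x)\Big|1-e^{-\Lambda^{tail}_u(\mathbb{R})}\Big|+e^{-\Lambda^{tail}_u(\mathbb{R})}\sum_{k=1}^\infty\int_{\mathbb{R}}\frac{1}{k!}\,\tilde g_u(x-z)\big(\Lambda^{tail}_u\big)^{*k}(dz). \tag{58}$$

Using this, (56) and (57) we get (51). Integrating (58) and using (56), (57) we get (52).

Applying (54) with $x=0$ we obtain $\tilde g_u(0)e^{-\Lambda^{tail}_u(\mathbb{R})}\le g_u(0)\le\tilde g_u(0)$, which combined with (56) proves (53). □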

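Let for $u>0$, $\xi,w\in\mathbb{R}$

$$v_u(\xi,w) = -\xi w+\int_{\mathbb{R}}(\cosh(\xi v)-1)\,\tilde\nu_u(dv).$$

Lemma 4.9. Fix $u>0$, $w\in\mathbb{R}$. Let $\xi_0\in\mathbb{R}$ be such that

$$v_u(\xi_0,w) = \inf_{\xi\in\mathbb{R}}v_u(\xi,w).$$

Then,

$$|\xi_0|\le\frac{2|w|}{m_u}. \tag{59}$$

Proof. We have

$$\int_{\mathbb{R}}(\cosh(\xi z)-1)\,\tilde\nu_u(dz)\ge\frac12\int_{\mathbb{R}}\xi^2z^2\,\tilde\nu_u(dz) = \frac12\xi^2m_u.$$

Hence $v_u(\xi,w)\ge-\xi w+\xi^2m_u/2$. Since $v_u(\xi_0,w)\le v_u(0,w) = 0$ we have $-|\xi_0||w|+\xi_0^2m_u/2\le0$, which gives (59). □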
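The bound (59) is easy to test numerically on a toy example. The following sketch assumes that the truncated Lévy measure is replaced by a finite symmetric measure with a few atoms (a hypothetical stand-in, not the measure from the paper), finds the minimizer $\xi_0$ of $\xi\mapsto v(\xi,w)$ by bisection on the derivative, and compares it with $2|w|/m$. All names and numerical values are illustrative assumptions.

```python
import math


def v(xi, w, atoms):
    """v(xi, w) = -xi*w + sum over atoms of mass * (cosh(xi*x) - 1)."""
    return -xi * w + sum(mass * (math.cosh(xi * x) - 1.0) for x, mass in atoms)


def dv(xi, w, atoms):
    """Derivative of v in xi; it is increasing in xi, since v is convex."""
    return -w + sum(mass * x * math.sinh(xi * x) for x, mass in atoms)


def moment2(atoms):
    """Second moment m = sum of mass * x^2."""
    return sum(mass * x * x for x, mass in atoms)


def minimizer(w, atoms, tol=1e-12):
    """Minimizer of xi -> v(xi, w) for a symmetric atomic measure.
    For w > 0 the derivative is negative at 0 and non-negative at 2w/m,
    so bisection on [0, 2|w|/m] applies; w < 0 follows by symmetry."""
    if w == 0:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    lo, hi = 0.0, 2.0 * abs(w) / moment2(atoms)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dv(mid, abs(w), atoms) < 0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)


# Hypothetical symmetric measure: atoms (location, mass).
atoms = [(-0.5, 1.0), (0.5, 1.0), (-0.1, 3.0), (0.1, 3.0)]
w = 0.7
xi0 = minimizer(w, atoms)
print(xi0, 2.0 * abs(w) / moment2(atoms), v(xi0, w, atoms))
```

The bisection interval itself is justified by the inequality used in the proof: since $x\sinh(\xi x)\ge\xi x^2$ for $\xi\ge0$, the derivative is already non-negative at $\xi=|w|/m$.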

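Lemma 4.10. Let $\tau>0$. For any $0<u\le\tau$, $w\in\mathbb{R}$, $k\in\mathbb{N}_0$ we have

$$\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big|\le c_k\Big(\frac{1}{h^{-1}(1/u)}\Big)^k\tilde g_u(0) \tag{60}$$

and

$$\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big|\le c_k\Big(\frac{1}{u^\varepsilon h^{-1}(1/u)}\Big)^ke^{-\frac{|w|}{8R_u}}\,\tilde g_u(0). \tag{61}$$

The constant $c_k$ depends on $C_1, \alpha, \varepsilon, \tau, h^{-1}(1/\tau), h^{-1}(1)$ and $k$. If $\tau = \tau_0 = (h(1)\vee1)^{-\frac{1}{1-\varepsilon}}$, then the constant $c_k$ depends on $k, C_1, \alpha, \varepsilon, h(1)$.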

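Proof. The proof of (60) follows immediately from Lemma 4.6 and Corollary 4.7. Let

$$Q_u(\xi,w) = i\xi w-\int_0^u\tilde\psi_r(\xi)\,dr,\quad u>0,\ \xi,w\in\mathbb{R}.$$

We have

$$\tilde g_u(w) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{Q_u(\xi,w)}\,d\xi.$$

For any $k\in\mathbb{N}_0$ we get

$$\frac{d^k}{dw^k}\tilde g_u(w) = \frac{1}{2\pi}\int_{\mathbb{R}}i^k\xi^ke^{Q_u(\xi,w)}\,d\xi.$$

Recall that $\xi_0 = \arg\min_\xi v_u(\xi,w)$. We proceed in a similar way as in [23], where a bound on the transition density was derived. Note that the functions $\tilde\psi_r$ and $Q_u(\cdot,w)$ can be extended analytically to $\mathbb{C}$. Applying the Cauchy-Poincaré theorem (the justification is exactly the same as in the proof of Theorem 6 of [23]) we claim that

$$\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big| = \frac{1}{2\pi}\Big|\int_{\mathbb{R}}(\xi+i\xi_0)^ke^{Q_u(\xi+i\xi_0,w)}\,d\xi\Big|.$$

Observe that

$$\operatorname{Re}Q_u(\xi+i\xi_0,w)\le v_u(\xi_0,w)-\int_0^u\tilde\psi_r(\xi)\,dr.$$

Hence

$$\begin{aligned}
\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big| &\le\int_{\mathbb{R}}(|\xi|+|\xi_0|)^ke^{v_u(\xi_0,w)-\int_0^u\tilde\psi_r(\xi)\,dr}\,d\xi\\
&\le 2^ke^{v_u(\xi_0,w)}\int_{\mathbb{R}}(|\xi|^k+|\xi_0|^k)e^{-\int_0^u\tilde\psi_r(\xi)\,dr}\,d\xi.
\end{aligned}$$

Now, we will show that for any $w\in\mathbb{R}$ we have

$$v_u(\xi_0,w)\le\frac{e\,m_u}{2R_u^2}-\frac{|w|}{4R_u} \tag{62}$$

$$\le\frac{e}{2}u^\varepsilon-\frac{|w|}{4R_u}. \tag{63}$$

If $|w|\le\frac{2e\,m_u}{R_u}$, then

$$\frac{e\,m_u}{2R_u^2}-\frac{|w|}{4R_u}\ge0 = v_u(0,w)\ge v_u(\xi_0,w),$$

which proves (62) in this case. If $|w|\ge\frac{2e\,m_u}{R_u}$, to prove (62) we use the arguments as in [35, proof of Lemma 4.2], so we omit the details. Next, observe that (63) follows from Lemma 4.4. We also have, by (49),

$$\int_{\mathbb{R}}|\xi|^ke^{-\int_0^u\tilde\psi_r(\xi)\,dr}\,d\xi\le\frac{c_k}{h^{-1}(1/u)^{k+1}}.$$

By Lemma 4.9,

$$\int_{\mathbb{R}}|\xi_0|^ke^{-\int_0^u\tilde\psi_r(\xi)\,dr}\,d\xi\le c_k|w|^k\frac{c}{m_u^k\,h^{-1}(1/u)}.$$

Hence

$$\begin{aligned}
\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big| &\le\frac{c_k}{h^{-1}(1/u)^{k+1}}\Big(1+\frac{R_uh^{-1}(1/u)}{m_u}\frac{|w|}{R_u}\Big)^ke^{-\frac{|w|}{4R_u}}\\
&\le\frac{c_k}{h^{-1}(1/u)^{k+1}}\Big(1+\frac{R_uh^{-1}(1/u)}{m_u}\Big)^ke^{-\frac{|w|}{8R_u}}.
\end{aligned}$$

Next, we observe that, by Lemma 4.4 and since $h^{-1}(1/u)\le\big(\frac{h^{-1}(1/\tau)}{h^{-1}(1)}\vee1\big)R_u$, we obtain

$$\frac{R_uh^{-1}(1/u)}{m_u}\le c\,\frac{h^{-1}(1/u)}{u^\varepsilon R_u}\le\frac{c}{u^\varepsilon},$$

which implies

$$\Big|\frac{d^k}{dw^k}\tilde g_u(w)\Big|\le\frac{c_k}{u^{k\varepsilon}h^{-1}(1/u)^{k+1}}\,e^{-\frac{|w|}{8R_u}}.$$

Finally, we note that all the constants in the case $\tau = \tau_0 = (h(1)\vee1)^{-\frac{1}{1-\varepsilon}}$ depend only on $k, C_1, \alpha, \varepsilon, h(1)$. This follows from the appropriate parts of Lemma 4.4 and Lemma 4.6. □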

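Let for $0<u<\infty$ and $f\in C^2(\mathbb{R})$

$$K_uf(w) = \mathrm{P.V.}\int_{|z|\le R_u}(f(w+z)-f(w))\,\nu(dz),\quad w\in\mathbb{R}.$$

For $f\in L^1(\mathbb{R})$ and $\xi\in\mathbb{R}$ denote $\hat f(\xi) = \int_{\mathbb{R}}e^{-i\xi x}f(x)\,dx$. If $f\in C^2(\mathbb{R})\cap L^1(\mathbb{R})$ is such that $\limsup_{|x|\to\infty}|x|^{1+\delta}|f''(x)|<\infty$ for some $\delta>0$, we observe that $K_uf\in L^1(\mathbb{R})$ and

$$\widehat{K_uf}(\xi) = -\tilde\psi_u(\xi)\hat f(\xi),\quad\xi\in\mathbb{R}. \tag{64}$$

Next, by (61), the above requirements are satisfied for $f = \tilde g_u$.

We have $\hat{\tilde g}_u(\xi) = e^{-\int_0^u\tilde\psi_r(\xi)\,dr}$, so for any $0<u<\infty$ and $\xi\in\mathbb{R}$

$$\partial_u\hat{\tilde g}_u(\xi)+\tilde\psi_u(\xi)\hat{\tilde g}_u(\xi) = 0.$$

By (64), it follows that for any $0<u<\infty$ and $w\in\mathbb{R}$ we have

$$\partial_u\hat{\tilde g}_u(w)-\widehat{K_u\tilde g_u}(w) = 0. \tag{65}$$

Next we claim that

$$\frac{\partial}{\partial u}\tilde g_u(w) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{iwz}\frac{\partial}{\partial u}\hat{\tilde g}_u(z)\,dz.$$

This follows from the estimate

$$\Big|\frac{\partial}{\partial u}\hat{\tilde g}_u(z)\Big| = \tilde\psi_u(z)\hat{\tilde g}_u(z)\le e^{(2/\varepsilon)u^\varepsilon}\psi(z)e^{-u\psi(z)},$$

which is implied by (48). Next we use (65) to get

$$\partial_u\tilde g_u(w) = \frac{1}{2\pi}\int_{\mathbb{R}}e^{iwz}\,\widehat{K_u\tilde g_u}(z)\,dz = K_u\tilde g_u(w), \tag{66}$$

where the equality holds almost everywhere. By continuity we have it everywhere.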

5. Parametrix construction

This highly technical section contains detailed proofs of a number of facts and estimates needed to provide the construction of the fundamental solution $p_{t,s}(x,y)$, which was explained earlier in Section 3. The construction demands many auxiliary results, in particular key estimates of the zero-order approximation term $p^{(0)}_{t,s}(x,y)$ and the kernel $q^{(0)}_{t,s}(x,y)$ contained in Lemma 5.8 and Lemma 5.10, respectively.

In this section we adopt the convention that constants denoted by $c$ (or $c_1, c_2, \dots$) may change their value from one use to the next. Unless explicitly stated otherwise, we understand that constants denoted by $c$ (or $c_1, c_2, \dots$) depend only on $d, \alpha, \beta, \gamma_1, \gamma_2, \gamma_3, C_1,\dots,C_7$. We also understand that they may depend on the choice of the constant $\varepsilon$. We write $c = c(a,b,\dots)$ when $c$ depends on the above constants and additionally on $a, b, \dots$. For a square matrix $A$ we denote by $|A|$ its standard operator norm. The standard inner product of $x,y\in\mathbb{R}^d$ is denoted by $xy$.


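Remark 5.1. Our choice of $p^{(0)}_{t,s}(x,y)$ and the kernel $q^{(0)}_{t,s}(x,y)$ will depend on the given value of $\varepsilon>0$. In order to have the required bounds involving these objects, we have to impose a restriction on $\varepsilon$. Namely, throughout the rest of the paper we assume that $\varepsilon\le\varepsilon_0$, with $\varepsilon_0$ defined below. In the case (A) we set

$$\varepsilon_0 = \min\Big\{\frac{\gamma_1\alpha}{2(d+3)\beta},\ \frac{\gamma_2\alpha}{2(d+3)},\ \frac{\gamma_1/(\beta(1+\gamma_1))}{2+2/\alpha+\gamma_1/(\beta(1+\gamma_1))},\ \frac{\gamma_3-1}{\gamma_3-1+\beta(1+(d+1)/\alpha)}\Big\},$$

while in the case (B) we pick

$$\varepsilon_0 = \min\Big\{\frac{(1+\gamma_1)/\beta-1/\alpha}{2(d+3)/\alpha},\ \frac{\gamma_2-(1/\alpha-1/\beta)}{2(d+3)/\alpha},\ \frac{1/\beta-1/((1+\gamma_1)\alpha)}{2+2/\alpha+1/\beta-1/((1+\gamma_1)\alpha)},\ \frac{\gamma_3-\beta/\alpha}{\gamma_3+\beta(1+d/\alpha)}\Big\}.$$

Due to our assumptions $\varepsilon_0$ is positive.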
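The two formulas for $\varepsilon_0$ in Remark 5.1 are straightforward to evaluate. The following sketch transcribes them; the function names and the sample parameter values ($d=2$, $\alpha=1.2$, $\beta=1.5$, $\gamma_1=\gamma_2=0.5$, $\gamma_3=1.5$) are hypothetical and are not claimed to satisfy all standing assumptions of the paper.

```python
def eps0_case_A(d, alpha, beta, g1, g2, g3):
    """epsilon_0 in the case (A); g1, g2, g3 stand for gamma_1, gamma_2, gamma_3."""
    a = g1 / (beta * (1.0 + g1))
    return min(
        g1 * alpha / (2.0 * (d + 3) * beta),
        g2 * alpha / (2.0 * (d + 3)),
        a / (2.0 + 2.0 / alpha + a),
        (g3 - 1.0) / (g3 - 1.0 + beta * (1.0 + (d + 1) / alpha)),
    )


def eps0_case_B(d, alpha, beta, g1, g2, g3):
    """epsilon_0 in the case (B)."""
    b = 1.0 / beta - 1.0 / ((1.0 + g1) * alpha)
    return min(
        ((1.0 + g1) / beta - 1.0 / alpha) / (2.0 * (d + 3) / alpha),
        (g2 - (1.0 / alpha - 1.0 / beta)) / (2.0 * (d + 3) / alpha),
        b / (2.0 + 2.0 / alpha + b),
        (g3 - beta / alpha) / (g3 + beta * (1.0 + d / alpha)),
    )


# Illustrative (assumed) parameter values.
params = dict(d=2, alpha=1.2, beta=1.5, g1=0.5, g2=0.5, g3=1.5)
print(eps0_case_A(**params), eps0_case_B(**params))
```

For these sample values the minimum is attained at the first term in both cases, which shows how small the admissible $\varepsilon$ typically is.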

For $i\in\{1,\dots,d\}$, $u>0$ put

$$R^{(i)}_u = R^{(i)}_u(\varepsilon) = h_i^{-1}\Big(\frac{1}{u^{1-\varepsilon}}\Big).$$

Let $\tilde g^{(i)}_u = \tilde g^{(i,\varepsilon)}_u$ be the truncated density corresponding to $\psi_i$ according to the truncation procedure described in Section 4. For any $u>0$, $x\in\mathbb{R}^d$ define

$$\tilde G_u(x) = \tilde g^{(1)}_u(x_1)\,\tilde g^{(2)}_u(x_2)\cdots\tilde g^{(d)}_u(x_d).$$

Let $h^{-1}_{\max}(r) = \max_jh_j^{-1}(r)$ and let $h^{-1}_{\min}(r) = \min_jh_j^{-1}(r)$. Let $M_u$ be a diagonal $d$-dimensional square matrix with the diagonal $8(R^{(1)}_u,\dots,R^{(d)}_u)$. The multiplier 8 is only for notational convenience. Note that $|M_u| = 8\max_iR^{(i)}_u = 8h^{-1}_{\max}(u^{-1+\varepsilon})$ and $|M_u^{-1}| = \frac18\max_i\frac{1}{R^{(i)}_u} = \frac18\frac{1}{h^{-1}_{\min}(u^{-1+\varepsilon})}$.

We recall that for $u>0$ we defined

$$\kappa(u) = \big(u, h_1(1),\dots,h_d(1), h_1^{-1}(1),\dots,h_d^{-1}(1), h_1^{-1}(1/u),\dots,h_d^{-1}(1/u)\big).$$

Observe that for every $\tau>0$

$$\frac1c\,u^{(\varepsilon-1)/\beta}\le|M_u^{-1}|\le c\,u^{(\varepsilon-1)/\alpha},\quad 0<u\le\tau, \tag{67}$$

and

$$\frac1c\,u^{(1-\varepsilon)/\alpha}\le|M_u|\le c\,u^{(1-\varepsilon)/\beta},\quad 0<u\le\tau, \tag{68}$$

where $c$ depends also on $\tau$ through the vector $\kappa(\tau)$, that is, in our notation, $c = c(\kappa(\tau))$. This follows from Lemma 4.2. Throughout the whole section we set

$$\tau_0 = \min_{k\le d}\big\{(h_k(1)\vee1)^{-\frac{1}{1-\varepsilon}}\big\}.$$

Then the constant $c$ in (67) and (68) depends on the vector $h = (h_1(1),\dots,h_d(1))$ if $\tau\le\tau_0$; in our convention we write $c = c(h)$. Again, this follows from Lemma 4.2.

The following lemma follows easily from Corollary 4.7 and Lemma 4.10.

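Lemma 5.2. Fix $\tau>0$. For any $u\in(0,\tau]$, $x\in\mathbb{R}^d$ we have

$$c^{-1}\prod_{i=1}^d\frac{1}{h_i^{-1}(1/u)}\le\tilde G_u(0)\le c\prod_{i=1}^d\frac{1}{h_i^{-1}(1/u)},$$

$$\tilde G_u(x)\le c\,\tilde G_u(0)\,e^{-|xM_u^{-1}|},$$

$$\Big|\frac{\partial}{\partial x_i}\tilde G_u(x)\Big|\le c\,\tilde G_u(0)\frac{1}{h_i^{-1}(1/u)}\,u^{-\varepsilon}e^{-|xM_u^{-1}|},$$

$$\Big|\frac{\partial^2}{\partial x_i\partial x_j}\tilde G_u(x)\Big|\le c\,\tilde G_u(0)\frac{1}{h_i^{-1}(1/u)h_j^{-1}(1/u)}\,u^{-2\varepsilon}e^{-|xM_u^{-1}|},$$

where the constant $c$ depends additionally on $\tau$, that is $c = c(\kappa(\tau))$. If $\tau\le\tau_0$, then $c = c(h)$.

Proof. The first estimate follows directly from Corollary 4.7. All the remaining estimates follow directly from Lemma 4.10 and the observation

$$\sum_{j=1}^d\frac{|x_j|}{8R^{(j)}_u}\ge\Big(\sum_{j=1}^d\Big(\frac{|x_j|}{8R^{(j)}_u}\Big)^2\Big)^{1/2} = |xM_u^{-1}|.$$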

By (47) and the above lemma we obtain the following corollary.

Corollary 5.3. Fix $\tau>0$. For any $u\in(0,\tau]$

$$\tilde G_{u\lambda}(0)\le c\,\lambda^{-d/\alpha}\,\tilde G_u(0),\quad\lambda\le1,$$

where $c = c(\kappa(\tau))$. If $\tau\le\tau_0$, then $c = c(h)$.

For $0\le t<s$ and $x,w,y\in\mathbb{R}^d$ put

$$p^x_{t,s}(w) = \frac{1}{|\det A_s(x)|}\,\tilde G_{s-t}\big(w(A_s^{-1}(x))^T\big) \tag{69}$$

and

$$L^y_{t,s}f(x) = \sum_{k=1}^d\mathrm{P.V.}\int_{|u|<R^{(k)}_{s-t}}\big[f(x+ue_kA_s^T(y))-f(x)\big]\,\nu_k(du).$$

By (66), for any $0\le t<s$ and $w,y\in\mathbb{R}^d$, we have

$$\Big(\frac{\partial}{\partial t}+L^y_{t,s}\Big)p^y_{t,s}(w) = 0.$$

We choose our zero-order approximation $p^{(0)}_{t,s}$ in the parametrix construction as

$$p^{(0)}_{t,s}(x,y) = p^y_{t,s}(x-y). \tag{70}$$

Before we come to the crucial estimates of $q^{(0)}_{t,s}$ (defined in Section 3.1) we need to show some auxiliary results on $p^w_{t,s}$. Let

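$$\begin{aligned}
\|A\| = \max\Big(&\sup_{t>0,\,x\in\mathbb{R}^d}|A_t^T(x)|,\ \sup_{t>0,\,x\in\mathbb{R}^d}|A_t^{-1}(x)|,\ \sup_{t>0,\,x,y\in\mathbb{R}^d,\,x\ne y}\frac{|A_t^T(x)-A_t^T(y)|}{|x-y|^{\gamma_1}},\\
&\sup_{t>0,\,x,y\in\mathbb{R}^d,\,x\ne y}\frac{|(A_t^T(x))^{-1}-(A_t^T(y))^{-1}|}{|x-y|^{\gamma_1}},\ \sup_{s>t>0,\,x\in\mathbb{R}^d}\frac{|(A_t^T(x))^{-1}-(A_s^T(x))^{-1}|}{(s-t)^{\gamma_2}}\Big).
\end{aligned}$$

It is clear that $\|A\|\ge1$. Note that $\|A\|$ may be bounded from above by a constant which depends only on $d, C_3,\dots,C_6$. Using standard calculations and the conditions (8), (9), (10), (11) we have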

||A|| ≤ (C3 + C5 + C6)d+Cd−1

3

C4d+ (C5 + C6)

Cd−13

C4d2.

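From Lemma 5.2, Lemma 4.2 and (69) we easily get the following corollary.

Corollary 5.4. Let $\tau>0$. Then there is a constant $c = c(\kappa(\tau))$ such that for $0<s-t\le\tau$ and $y,w\in\mathbb{R}^d$

$$\big|p^y_{t,s}(w)\big|\le c\,\tilde G_{s-t}(0)\,e^{-|w(A_s^{-1}(y))^TM_{s-t}^{-1}|}, \tag{71}$$

$$\big|p^y_{t,s}(w)\big|\le c\,\tilde G_{s-t}(0)\,e^{-\frac{1}{\|A\|}\frac{|w|}{|M_{s-t}|}}, \tag{72}$$

$$\big|\nabla p^y_{t,s}(w)\big|\le c\,|M_{s-t}^{-1}|(s-t)^{-\varepsilon(1+1/\alpha)}\,\tilde G_{s-t}(0)\,e^{-|w(A_s^{-1}(y))^TM_{s-t}^{-1}|}, \tag{73}$$

$$\big|\nabla^2p^y_{t,s}(w)\big|\le c\,|M_{s-t}^{-1}|^2(s-t)^{-\varepsilon(2+2/\alpha)}\,\tilde G_{s-t}(0)\,e^{-|w(A_s^{-1}(y))^TM_{s-t}^{-1}|}, \tag{74}$$

$$\big|\nabla^2p^y_{t,s}(w)\big|\le c\,|M_{s-t}^{-1}|^2(s-t)^{-\varepsilon(2+2/\alpha)}\,\tilde G_{s-t}(0)\,e^{-\frac{1}{\|A\|}\frac{|w|}{|M_{s-t}|}}. \tag{75}$$

If $\tau\le\tau_0$, then $c = c(h)$.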

Proof. We provide the proof only for (74) and (75) since the other estimates can be shown in a similar fashion. Applying Lemma 5.2 and (69) we obtain

$$\big|\nabla^2p^y_{t,s}(w)\big|\le c\,\tilde G_{s-t}(0)\frac{1}{\big(h^{-1}_{\min}((s-t)^{-1})\big)^2}(s-t)^{-2\varepsilon}e^{-|w(A_s^{-1}(y))^TM_{s-t}^{-1}|}.$$

Next, by (45),

$$\frac{1}{|M_{s-t}^{-1}|\,h^{-1}_{\min}((s-t)^{-1})} = \frac{8\,h^{-1}_{\min}((s-t)^{\varepsilon-1})}{h^{-1}_{\min}((s-t)^{-1})}\le c(s-t)^{-\varepsilon/\alpha},$$

which completes the proof of (74).

Since $|w(A_s^{-1}(y))^TM_{s-t}^{-1}|\ge\frac{1}{\|A\|}\frac{|w|}{|M_{s-t}|}$, the estimate (74) yields (75). □

The following lemma is a simple consequence of the change of variable formula, hence its proof is omitted.

Lemma 5.5. Let $\rho>0$, $x\in\mathbb{R}^d$, $0\le t<s<\infty$. There is a constant $c = c(\rho)$ such that

$$\int_{\mathbb{R}^d}|x-y|^\rho e^{-|(x-y)(A_s^{-1}(x))^TM_{s-t}^{-1}|}\,dy\le c\,|M_{s-t}|^\rho\det(M_{s-t}).$$

Lemma 5.6. Let $x,y\in\mathbb{R}^d$, $0\le t<s$ and $\delta>0$. Let

$$\xi = y-x+\theta ue_kA_t^T(x)+(1-\theta)ue_kA_s^T(y)+\lambda U_t(x,e_ku),\quad|u|\le R^{(k)}_{s-t},$$

where $0\le\theta,\lambda\le1$.

If $|x-y|^{1+\gamma_1}\le\frac{(s-t)^\delta}{|M_{s-t}^{-1}|}$ and $0<s-t\le\tau$, then

$$\big|\xi(A_s^T(y))^{-1}M_{s-t}^{-1}-(y-x)(A_s^T(x))^{-1}M_{s-t}^{-1}\big|\le c. \tag{76}$$

The constant $c = c(\delta,\kappa(\tau))$. If $\tau\le\tau_0$, then $c = c(\delta,h)$.

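Proof. We denote $z = y - x$, $U = U_t(x, e_k u)$, $R^{(k)} = R^{(k)}_{s-t}$ and $M = M_{s-t}$. It is clear that it is enough to consider the case when $\theta = 1$ or $\theta = 0$. We provide the argument for $\theta = 1$, since the case $\theta = 0$ is similar, if not easier. Let $\xi = z + u e_k A_t^T(x)$. Noting that $u e_k M^{-1} = (u/R^{(k)}) e_k$, we obtain

$$\begin{aligned}
&\left|\xi (A_s^T(y))^{-1} M^{-1} - z (A_s^T(x))^{-1} M^{-1}\right| \\
&\quad= \Big| z\left[(A_s^T(y))^{-1} - (A_s^T(x))^{-1}\right] M^{-1} + u e_k\left[A_t^T(x) - A_s^T(y)\right](A_s^T(y))^{-1} M^{-1} \\
&\qquad\quad + u e_k M^{-1} + U (A_s^T(y))^{-1} M^{-1} \Big| \\
&\quad\le \|A\|\,|z|^{1+\gamma_1}|M^{-1}| + \|A\|\,|u|\,|z|^{\gamma_1}|M^{-1}| + |u|/R^{(k)} + C_7 \|A\|\,|M^{-1}|\,|u|^{\gamma_3} \\
&\quad\le \|A\|\left(|z|^{1+\gamma_1}|M^{-1}| + R^{(k)}|z|^{\gamma_1}|M^{-1}| + 1 + C_7 |M^{-1}| (R^{(k)})^{\gamma_3}\right).
\end{aligned}$$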


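In the case (A) (when $h_1 = \dots = h_d$) we have $R^{(k)}|M^{-1}| = 1/8$, hence, by (68),

$$|z|^{1+\gamma_1}|M^{-1}| + R^{(k)}|z|^{\gamma_1}|M^{-1}| + 1 + C_7|M^{-1}|(R^{(k)})^{\gamma_3} \le (s-t)^{\delta} + |z|^{\gamma_1} + 1 + C_7|M^{-1}|^{1-\gamma_3} \le c,$$

with $c = c(\delta, \kappa(\tau))$.

In the case (B), by (67) and (68),

$$\begin{aligned}
R^{(k)}|z|^{\gamma_1}|M^{-1}| &\le c\,(s-t)^{(1-\varepsilon)/\beta}\left(\frac{(s-t)^{\delta}}{8|M^{-1}|}\right)^{\frac{\gamma_1}{\gamma_1+1}}|M^{-1}| \\
&= c\,(s-t)^{(1-\varepsilon)/\beta + \delta\frac{\gamma_1}{\gamma_1+1}}\,|M^{-1}|^{\frac{1}{\gamma_1+1}} \\
&\le c\,(s-t)^{(1-\varepsilon)/\beta + \delta\frac{\gamma_1}{\gamma_1+1}}\,(s-t)^{\frac{-1+\varepsilon}{(\gamma_1+1)\alpha}} = c\,(s-t)^{\delta_0},
\end{aligned}$$

where $\delta_0 = \delta\frac{\gamma_1}{\gamma_1+1} + (1-\varepsilon)\left(\frac{1}{\beta} - \frac{1}{(\gamma_1+1)\alpha}\right) > 0$. Moreover, again by (67) and (68),

$$|M^{-1}|(R^{(k)})^{\gamma_3} \le c\,(s-t)^{(1-\varepsilon)\left(\frac{\gamma_3}{\beta} - \frac{1}{\alpha}\right)}.$$

This implies that

$$|z|^{1+\gamma_1}|M^{-1}| + R^{(k)}|z|^{\gamma_1}|M^{-1}| + 1 + C_7|M^{-1}|(R^{(k)})^{\gamma_3} \le c\,(s-t)^{\delta} + c\,(s-t)^{\delta_0} + 1 + c\,(s-t)^{(1-\varepsilon)\left(\frac{\gamma_3}{\beta} - \frac{1}{\alpha}\right)} \le c,$$

with $c = c(\delta, \kappa(\tau))$. Hence, in both cases, the proof of (76) is completed. □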

For any $x \in \mathbb{R}^d$, $\delta > 0$ let

$$D(\delta, x) = \left\{ w \in \mathbb{R}^d : |x-w|^{1+\gamma_1} \le \frac{(s-t)^{\delta}}{|M_{s-t}^{-1}|} \right\}.$$

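Lemma 5.7. Fix $\tau > 0$ and $\delta > 0$. For any $x \in \mathbb{R}^d$, $y \in D(\delta, x)$, $0 < s-t \le \tau$ we have

$$\left|G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)\right| \le c\,(s-t)^{\delta - \varepsilon(1+1/\alpha)}\,G_{s-t}(0)\,e^{-|(x-y)(A_s^{-1}(x))^T M_{s-t}^{-1}|} \tag{77}$$

and

$$\int_{D(\delta,x)} \left|G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)\right| dy \le c\,(s-t)^{\delta - (d+3)\varepsilon/\alpha}. \tag{78}$$

The constant $c = c(\delta, \kappa(\tau))$. If $\tau \le \tau_0$, then $c = c(\delta, h)$.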

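Proof. We have

$$G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) = G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right) + \nabla G_{s-t}(\xi)\left[(x-y)\left((A_s^T(y))^{-1} - (A_s^T(x))^{-1}\right)\right],$$

where $\xi = \theta(x-y)(A_s^T(x))^{-1} + (1-\theta)(x-y)(A_s^T(y))^{-1}$, $0 \le \theta \le 1$. Next, we observe that

$$\left|(x-y)(A_s^T(y))^{-1} - (x-y)(A_s^T(x))^{-1}\right| \le \|A\|\,|x-y|^{1+\gamma_1} \le (s-t)^{\delta}\,\|A\|\,\frac{1}{|M_{s-t}^{-1}|}. \tag{79}$$

Hence,

$$\left|\xi M_{s-t}^{-1} - (x-y)(A_s^T(x))^{-1} M_{s-t}^{-1}\right| \le (s-t)^{\delta}\,|M_{s-t}^{-1}|\,\|A\|\,|M_{s-t}^{-1}|^{-1} = (s-t)^{\delta}\|A\|.$$

This implies, by Lemma 5.2, that

$$|\nabla G_{s-t}(\xi)| \le c\,\frac{1}{h_{\min}^{-1}(1/(s-t))}\,(s-t)^{-\varepsilon}\,G_{s-t}(0)\,e^{-|(x-y)(A_s^{-1}(x))^T M_{s-t}^{-1}|},$$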


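which together with (79) yields

$$\begin{aligned}
\left|\nabla G_{s-t}(\xi)\left[(x-y)\left((A_s^T(y))^{-1} - (A_s^T(x))^{-1}\right)\right]\right|
&\le c\,(s-t)^{\delta}\,\frac{1}{|M_{s-t}^{-1}|}\,\frac{1}{h_{\min}^{-1}(1/(s-t))}\,(s-t)^{-\varepsilon}\,G_{s-t}(0)\,e^{-|(x-y)(A_s^{-1}(x))^T M_{s-t}^{-1}|} \\
&\le c\,(s-t)^{\delta}(s-t)^{-\varepsilon(1+1/\alpha)}\,G_{s-t}(0)\,e^{-|(x-y)(A_s^{-1}(x))^T M_{s-t}^{-1}|},
\end{aligned}$$

since, by Corollary 4.3,

$$\frac{1}{|M_{s-t}^{-1}|}\,\frac{1}{h_{\min}^{-1}(1/(s-t))} = \frac{8\,h_{\min}^{-1}((s-t)^{\varepsilon-1})}{h_{\min}^{-1}((s-t)^{-1})} \le c\,(s-t)^{-\varepsilon/\alpha}.$$

The proof of (77) is completed.

Applying (77) and Lemma 5.5 we get

$$\int_{D(\delta,x)} \left|G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)\right| dy \le c\,\det(M_{s-t})\,G_{s-t}(0)\,(s-t)^{\delta - \varepsilon - \varepsilon/\alpha}.$$

Next, by Lemma 5.2 and Corollary 4.3,

$$\det(M_{s-t})\,G_{s-t}(0) \le c \prod_{k=1}^{d} \frac{h_k^{-1}((s-t)^{\varepsilon-1})}{h_k^{-1}((s-t)^{-1})} \le c\,(s-t)^{-d\varepsilon/\alpha}, \tag{80}$$

which implies (78). □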

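Lemma 5.8. Fix $\tau > 0$. There exists $c = c(\kappa(\tau)) > 0$ such that for any $0 < s-t \le \tau$ and $x \in \mathbb{R}^d$ we have

$$\int_{\mathbb{R}^d} p^y_{t,s}(x-y)\,dy \le c. \tag{81}$$

Moreover, for any $0 < s-t \le \tau$ and $x \in \mathbb{R}^d$ we have

$$\sup_{y \in \mathbb{R}^d} \left|p^y_{t,s}(x-y) - p^x_{t,s}(x-y)\right| \le c\,G_{s-t}(0)(s-t)^{\varepsilon} \tag{82}$$

and

$$\int_{\mathbb{R}^d} \left|p^y_{t,s}(x-y) - p^x_{t,s}(x-y)\right| dy \le c\,(s-t)^{\varepsilon}. \tag{83}$$

If $\tau \le \tau_0$, then $c = c(h)$.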

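Proof. Let $\delta > 0$ and let $z$ be $x$ or $y$. For $y \in D^c(\delta, x)$ we use (72) and Lemma 4.2 to have

$$p^z_{t,s}(x-y) \le c\,(s-t)^{-d/\alpha}\exp\left(-\frac{|x-y|}{\|A\|\,|M_{s-t}|}\right). \tag{84}$$

In the case (A) we have $|M_{s-t}^{-1}|^{-1} = |M_{s-t}|$. Hence, by (68),

$$\frac{|x-y|}{|M_{s-t}|} \ge \frac{\left[(s-t)^{\delta}|M_{s-t}|\right]^{\frac{1}{1+\gamma_1}}}{|M_{s-t}|} = \left[(s-t)^{\delta}|M_{s-t}|^{-\gamma_1}\right]^{\frac{1}{1+\gamma_1}} \ge c\left[(s-t)^{\delta - (1-\varepsilon)\frac{\gamma_1}{\beta}}\right]^{\frac{1}{1+\gamma_1}}.$$

In the case (A) we choose $\delta = \frac{(1-2\varepsilon)\gamma_1}{\beta} < \frac{(1-\varepsilon)\gamma_1}{\beta}$. Then clearly the exponent at $s-t$ is negative.

Next, we observe that in the case (B), by (67) and (68),

$$\frac{|x-y|}{|M_{s-t}|} \ge \frac{\left[(s-t)^{\delta}|M_{s-t}^{-1}|^{-1}\right]^{\frac{1}{1+\gamma_1}}}{|M_{s-t}|} \ge c\,(s-t)^{\frac{\delta + (1-\varepsilon)/\alpha - (1-\varepsilon)(1+\gamma_1)/\beta}{1+\gamma_1}}.$$

In the case (B) we choose $\delta = (1-2\varepsilon)\left[(1+\gamma_1)/\beta - 1/\alpha\right] < (1-\varepsilon)\left[(1+\gamma_1)/\beta - 1/\alpha\right]$. Then clearly the exponent at $s-t$ is negative.

Hence, in both cases, we find $c = c(\kappa(\tau))$ such that for $y \in D^c(\delta, x)$

$$p^z_{t,s}(y-x) \le c\,(s-t)^{-d/\alpha}\exp\left(-\frac{|x-y|}{\|A\|\,|M_{s-t}|}\right) \le c\,(s-t)^{\varepsilon}. \tag{85}$$

Similarly, we find $c = c(\kappa(\tau))$ such that

$$\int_{D^c(\delta,x)} p^z_{t,s}(y-x)\,dy \le c\,(s-t)^{-d/\alpha}\int_{D^c(\delta,x)}\exp\left(-\frac{|x-y|}{\|A\|\,|M_{s-t}|}\right)dy \le c\,(s-t)^{\varepsilon}. \tag{86}$$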

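For any $x \in \mathbb{R}^d$, $0 < s-t \le \tau$ we have, by (78),

$$\begin{aligned}
\int_{D(\delta,x)} p^y_{t,s}(x-y)\,dy &\le \frac{1}{C_4}\int_{D(\delta,x)} G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right)dy \\
&\le \frac{1}{C_4}\int_{D(\delta,x)} \left|G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)\right| dy \\
&\quad + \frac{1}{C_4}\int_{D(\delta,x)} G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)dy \\
&\le c + c\,(s-t)^{\delta - (d+3)\varepsilon/\alpha}.
\end{aligned} \tag{87}$$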

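We also have

$$\begin{aligned}
\left|p^y_{t,s}(x-y) - p^x_{t,s}(x-y)\right|
&= \bigg| \frac{1}{|\det(A_s(y))|}G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - \frac{1}{|\det(A_s(x))|}G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) \\
&\qquad + \frac{1}{|\det(A_s(x))|}G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - \frac{1}{|\det(A_s(x))|}G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right) \bigg| \\
&\le c\,|x-y|^{\gamma_1}\,G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) + c\left|G_{s-t}\left((x-y)(A_s^T(y))^{-1}\right) - G_{s-t}\left((x-y)(A_s^T(x))^{-1}\right)\right|.
\end{aligned}$$

Note that for any $x \in \mathbb{R}^d$, $y \in D(\delta, x)$ and $0 < s-t \le \tau$ we have $|x-y|^{\gamma_1} \le c\,(s-t)^{\delta + (1-\varepsilon)/\beta}$. It follows that for any $x \in \mathbb{R}^d$ and $0 < s-t \le \tau$

$$\int_{D(\delta,x)} \left|p^y_{t,s}(x-y) - p^x_{t,s}(x-y)\right| dy \le c\,(s-t)^{\delta + (1-\varepsilon)/\beta} + c\,(s-t)^{\delta - (d+3)\varepsilon/\alpha}. \tag{88}$$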

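Recall that in the case (A) we picked $\delta = \frac{(1-2\varepsilon)\gamma_1}{\beta}$. Since $\varepsilon \le \varepsilon_0 \le \frac{\gamma_1\alpha}{2(d+3)\beta}$ we have $\frac{(d+3)\varepsilon}{\alpha} \le \frac{\gamma_1}{2\beta} = \frac{\delta}{2(1-2\varepsilon)} < \delta$. Hence

$$\delta - \frac{(d+3)\varepsilon}{\alpha} \ge \delta\left(1 - \frac{1}{2(1-2\varepsilon)}\right) \ge (1-4\varepsilon)\frac{(d+3)\varepsilon}{\alpha} \ge \varepsilon,$$

since $\varepsilon \le 1/8$.

Recall that in the case (B) we picked $\delta = (1-2\varepsilon)\left((1+\gamma_1)/\beta - 1/\alpha\right)$. Since $\varepsilon \le \varepsilon_0 \le \frac{(1+\gamma_1)/\beta - 1/\alpha}{2(d+3)/\alpha}$ we obtain $\frac{(d+3)\varepsilon}{\alpha} \le \frac{(1+\gamma_1)/\beta - 1/\alpha}{2} = \frac{\delta}{2(1-2\varepsilon)} < \delta$, since $\varepsilon \le 1/8$. Hence, as in the case (A), we obtain

$$\delta - \frac{(d+3)\varepsilon}{\alpha} \ge \delta\left(1 - \frac{1}{2(1-2\varepsilon)}\right) \ge (1-4\varepsilon)\frac{(d+3)\varepsilon}{\alpha} \ge \varepsilon.$$

Now (86) and (87) imply (81). By (77) and (85) we get (82). Finally, (86) and (88) imply (83). □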
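The chain $\delta - (d+3)\varepsilon/\alpha \ge \dots \ge \varepsilon$ used in both cases of the proof above reduces to one elementary computation, written out here as a sketch. The shorthand $a = (d+3)\varepsilon/\alpha$ is introduced only for this illustration, and the last step assumes $d \ge 1$ and $\alpha \le 2$, which is not restated in this section.

```latex
% Inputs: a := (d+3)\varepsilon/\alpha \le \delta / (2(1-2\varepsilon)) and \varepsilon \le 1/8.
\begin{align*}
\delta - a
  &\ge \delta - \frac{\delta}{2(1-2\varepsilon)}
   = (1-4\varepsilon)\,\frac{\delta}{2(1-2\varepsilon)}
   \ge (1-4\varepsilon)\,a , \\
(1-4\varepsilon)\,a
  &\ge \frac{1}{2}\cdot\frac{(d+3)\varepsilon}{\alpha}
   \ge \varepsilon
  \qquad\text{whenever } \frac{d+3}{\alpha} \ge 2 .
\end{align*}
```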


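Lemma 5.9. There exists $c = c(h)$ such that for any $0 \le t < s$ and $x \in \mathbb{R}^d$ we have

$$\sup_{y \in \mathbb{R}^d} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| \le c\,G_{s-t}(0)(s-t)^{\varepsilon}$$

and

$$\int_{\mathbb{R}^d} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| dy \le c\,(s-t)^{\varepsilon}.$$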

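Proof. We first prove the lemma under the assumption $0 < s-t \le \tau_0$. Let $\delta > 0$. Put

$$D(\delta, x) = \left\{ w : |x-w| \le \frac{(s-t)^{\delta-\gamma_2}}{|M_{s-t}^{-1}|} \right\}.$$

First, we consider $y \in D(\delta, x)$. Then we have

$$\left|(x-y)(A_s^T(x))^{-1} - (x-y)(A_t^T(x))^{-1}\right| \le \|A\|\,|x-y|\,(s-t)^{\gamma_2} \le \frac{(s-t)^{\delta}\|A\|}{|M_{s-t}^{-1}|}.$$

By the same arguments as in the proof of Lemma 5.7, we get

$$\left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| \le c\,(s-t)^{\delta - \varepsilon(1+1/\alpha)}\,G_{s-t}(0)\,e^{-|(x-y)(A_s^{-1}(x))^T M_{s-t}^{-1}|}$$

and

$$\int_{D(\delta,x)} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| dy \le c\,(s-t)^{\delta - (d+3)\varepsilon/\alpha}.$$

Next, we estimate the expression $\left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right|$ for $y \in D^c(\delta, x)$. In the case (A) we have $|M_{s-t}^{-1}|^{-1} = |M_{s-t}|$. Hence for $y \in D^c(\delta, x)$

$$\frac{|x-y|}{|M_{s-t}|} \ge (s-t)^{\delta-\gamma_2}.$$

In the case (A) we will assume that $\delta < \gamma_2$.

In the case (B), by (67) and (68), for $y \in D^c(\delta, x)$

$$\frac{|x-y|}{|M_{s-t}|} \ge \frac{(s-t)^{\delta-\gamma_2}}{|M_{s-t}^{-1}|\,|M_{s-t}|} \ge c\,(s-t)^{\delta - \gamma_2 + (1-\varepsilon)\left(\frac{1}{\alpha} - \frac{1}{\beta}\right)}.$$

In the case (B) we will assume that $\delta < \gamma_2 - \left(\frac{1}{\alpha} - \frac{1}{\beta}\right)$.

By the same arguments as in the proof of Lemma 5.8 we find $c = c(h)$ such that for $y \in D^c(\delta, x)$

$$\left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| \le c\,(s-t)^{\varepsilon} \le c\,G_{s-t}(0)(s-t)^{\varepsilon}$$

and

$$\int_{D^c(\delta,x)} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| dy \le c\,(s-t)^{\varepsilon}.$$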

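In the case (A) we pick $\delta = (1-\varepsilon)\gamma_2$. Since $\varepsilon \le \varepsilon_0 \le \frac{\gamma_2\alpha}{2(d+3)} \le 1/4$ we have $\frac{(d+3)\varepsilon}{\alpha} \le \frac{\gamma_2}{2} = \frac{\delta}{2(1-\varepsilon)}$. Hence

$$\delta - \frac{(d+3)\varepsilon}{\alpha} \ge \delta\left(1 - \frac{1}{2(1-\varepsilon)}\right) \ge (1-2\varepsilon)\frac{(d+3)\varepsilon}{\alpha} \ge \varepsilon,$$

so we obtain the conclusion of the lemma in the case (A). In the case (B) we pick $\delta = (1-\varepsilon)\left(\gamma_2 - \left(\frac{1}{\alpha} - \frac{1}{\beta}\right)\right)$. Since $\varepsilon \le \varepsilon_0 \le \frac{\gamma_2 - \left(\frac{1}{\alpha} - \frac{1}{\beta}\right)}{2(d+3)/\alpha} \le 1/4$ we get $\frac{(d+3)\varepsilon}{\alpha} \le \frac{\gamma_2 - \left(\frac{1}{\alpha} - \frac{1}{\beta}\right)}{2} = \frac{\delta}{2(1-\varepsilon)}$. As in the case (A) we obtain

$$\delta - \frac{(d+3)\varepsilon}{\alpha} \ge \delta\left(1 - \frac{1}{2(1-\varepsilon)}\right) \ge (1-2\varepsilon)\frac{(d+3)\varepsilon}{\alpha} \ge \varepsilon.$$

This completes the proof in the case $0 < s-t \le \tau_0$.

In the case $s-t \ge \tau_0$ the conclusion is trivial since

$$\sup_{y \in \mathbb{R}^d} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| \le 2\,G_{s-t}(0)$$

and

$$\left|\det (A_s^T(x))^{-1}\right| \int_{\mathbb{R}^d} \left|G_{s-t}((x-y)(A_s^T(x))^{-1}) - G_{s-t}((x-y)(A_t^T(x))^{-1})\right| dy \le 2.$$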

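Lemma 5.10. Suppose that $0 < s-t \le \tau$. We have

$$\int_{\mathbb{R}^d} |q^{(0)}_{t,s}(x,y)|\,dy \le c\,(s-t)^{-1+\varepsilon}, \qquad x \in \mathbb{R}^d.$$

Moreover,

$$|q^{(0)}_{t,s}(x,y)| \le c\,(s-t)^{-1+\varepsilon}\,G_{s-t}(0), \qquad x, y \in \mathbb{R}^d. \tag{89}$$

The constant $c = c(\kappa(\tau))$. If $\tau \le \tau_0$, then $c = c(h)$.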

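Proof. Let

$$L^z_t f(x) = \sum_{k=1}^{d} \mathrm{P.V.}\int_{\mathbb{R}} \left[f(x + u e_k A_t^T(z) + U_t(z, u e_k)) - f(x)\right]\nu_k(du).$$

Recall that

$$\left(\frac{\partial}{\partial t} + L^y_{t,s}\right) p^y_{t,s}(w) = 0.$$

It follows that

$$\begin{aligned}
|q^{(0)}_{t,s}(x,y)|
&= \left|\left(\frac{\partial}{\partial t} + L_t\right) p^{(0)}_{t,s}(\cdot, y)(x)\right|
 = \left|\left(\frac{\partial}{\partial t} + L^x_t\right) p^y_{t,s}(\cdot)(x-y)\right|
 = \left|\left(-L^y_{t,s} + L^x_t\right) p^y_{t,s}(\cdot)(x-y)\right| \\
&\le \left|\sum_{k=1}^{d} \mathrm{P.V.}\int_{|u| < R^{(k)}_{s-t}} \left[p^y_{t,s}(x-y+u e_k A_t^T(x)) - p^y_{t,s}(x-y+u e_k A_s^T(y))\right]\nu_k(du)\right| \\
&\quad + \left|\sum_{k=1}^{d} \mathrm{P.V.}\int_{|u| < R^{(k)}_{s-t}} \left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y+u e_k A_t^T(x))\right]\nu_k(du)\right| \\
&\quad + \left|\sum_{k=1}^{d} \int_{|u| \ge R^{(k)}_{s-t}} \left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y)\right]\nu_k(du)\right| \\
&= \mathrm{I}(x,y) + \mathrm{II}(x,y) + \mathrm{III}(x,y).
\end{aligned}$$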


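For the sake of simplicity we will denote $p(w) = p^y_{t,s}(w)$ and $z = x - y$. To handle the term $\mathrm{I}(x,y)$ we have to estimate

$$\mathrm{P.V.}\int_{|u| < R^{(k)}_{s-t}} \left[p(z + u e_k A_t^T(x)) - p(z + u e_k A_s^T(y))\right]\nu_k(du).$$

Because of the symmetry, we can re-write this integral as

$$\int_{|u| < R^{(k)}_{s-t}} \left[p(z + u e_k A_t^T(x)) - p(z + u e_k A_s^T(y)) - \nabla p(z)\left(u e_k (A_t^T(x) - A_s^T(y))\right)\right]\nu_k(du).$$

We have

$$\begin{aligned}
&p(z + u e_k A_t^T(x)) - p(z + u e_k A_s^T(y)) - \nabla p(z)\left(u e_k (A_t^T(x) - A_s^T(y))\right) \\
&\quad= \left[\nabla p(z + \theta u e_k A_t^T(x) + (1-\theta) u e_k A_s^T(y)) - \nabla p(z)\right]\left(u e_k (A_t^T(x) - A_s^T(y))\right) \\
&\quad= \left[\nabla p(z + u e_k A_s^T(y)) - \nabla p(z + \theta u e_k A_t^T(x) + (1-\theta) u e_k A_s^T(y))\right]\left(u e_k (A_t^T(x) - A_s^T(y))\right) \\
&\qquad + \left[\nabla p(z + u e_k A_s^T(y)) - \nabla p(z)\right]\left(u e_k (A_t^T(x) - A_s^T(y))\right) \\
&\quad= \Delta_1 + \Delta_2,
\end{aligned} \tag{90}$$

where $\theta \in [0,1]$. Next,

$$\nabla p(z + u e_k A_s^T(y)) - \nabla p(z + \theta u e_k A_t^T(x) + (1-\theta) u e_k A_s^T(y)) = \left(\theta u e_k (A_s^T(y) - A_t^T(x))\right)\nabla^2 p(\xi),$$

where

$$\xi = z + \theta^* u e_k A_t^T(x) + (1-\theta^*) u e_k A_s^T(y), \qquad |u| \le R^{(k)}_{s-t}.$$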

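Throughout the whole proof we pick $\delta = \frac{(1-2\varepsilon)\gamma_1}{\beta}$ in the case (A) and $\delta = (1-2\varepsilon)\left((1+\gamma_1)/\beta - 1/\alpha\right)$ in the case (B). Note that this choice of $\delta$ is dictated by Lemma 5.8, since we are going to use some arguments contained therein.

Let $|z|^{1+\gamma_1} \le \dfrac{(s-t)^{\delta}}{|M_{s-t}^{-1}|}$, that is, $y \in D(\delta, x)$. By Lemma 5.6 we have

$$\left|\xi (A_s^T(y))^{-1} M_{s-t}^{-1} - z (A_s^T(x))^{-1} M_{s-t}^{-1}\right| \le c.$$

Applying this and (74), we arrive at

$$\left|\nabla^2 p(\xi)\right| \le c\,\frac{|M_{s-t}^{-1}|^2}{(s-t)^{2\varepsilon(1+1/\alpha)}}\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,G_{s-t}(0).$$

This implies

$$|\Delta_1| \le c\,|u|^2\,|A_s^T(y) - A_t^T(x)|^2\,\frac{|M_{s-t}^{-1}|^2}{(s-t)^{2\varepsilon(1+1/\alpha)}}\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,G_{s-t}(0).$$

Next, by Lemma 4.4,

$$\int_{|u| < R^{(k)}_{s-t}} u^2\,\nu_k(du) \le (R^{(k)}_{s-t})^2 (s-t)^{\varepsilon-1} \le |M_{s-t}|^2 (s-t)^{\varepsilon-1}.$$

Hence,

$$\mathrm{I}_1(x,y) := \int_{|u| < R^{(k)}_{s-t}} |\Delta_1|\,\nu_k(du) \le c\,G_{s-t}(0)\left(|M_{s-t}^{-1}|\,|M_{s-t}|\right)^2 |A_s^T(y) - A_t^T(x)|^2\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,(s-t)^{-\varepsilon(1+2/\alpha)-1}.$$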

Now, let us estimate the second summand $\Delta_2$ on the right-hand side of (90). Since $p(w) = |\det((A_s^T(y))^{-1})|\,G_{s-t}\left(w(A_s^T(y))^{-1}\right)$, $w \in \mathbb{R}^d$, we have

$$\nabla p(w) = |\det((A_s^T(y))^{-1})|\,\nabla G_{s-t}\left(w(A_s^T(y))^{-1}\right)(A_s^T(y))^{-1}.$$


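Hence,

$$\begin{aligned}
\left|\nabla p(z + u e_k A_s^T(y)) - \nabla p(z)\right|
&\le c\left|\nabla G_{s-t}\left(z(A_s^T(y))^{-1} + u e_k\right) - \nabla G_{s-t}\left(z(A_s^T(y))^{-1}\right)\right| \\
&\le c\left|\frac{\partial}{\partial w_k}\nabla G_{s-t}(\xi)\right| |u|,
\end{aligned}$$

where $\xi = z(A_s^T(y))^{-1} + \theta u e_k$, $0 \le \theta \le 1$. By Lemma 5.6, we obtain

$$\left|\xi M_{s-t}^{-1} - z(A_s^T(x))^{-1} M_{s-t}^{-1}\right| \le c.$$

Applying this and (74), we arrive at

$$\left|\frac{\partial}{\partial w_k}\nabla G_{s-t}(\xi)\right| \le c\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,(s-t)^{-(2+2/\alpha)\varepsilon}\,\frac{1}{R^{(k)}_{s-t}}\,|M_{s-t}^{-1}|\,G_{s-t}(0).$$

Then we have

$$\begin{aligned}
|\Delta_2| &= \left|\left[\nabla p(z + u e_k A_s^T(y)) - \nabla p(z)\right]\left(u e_k (A_t^T(x) - A_s^T(y))\right)\right| \\
&\le (s-t)^{-(2+2/\alpha)\varepsilon}\,\frac{1}{R^{(k)}_{s-t}}\,|M_{s-t}^{-1}|\,G_{s-t}(0)\,|u|^2 \left|A_t^T(x) - A_s^T(y)\right| e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}.
\end{aligned}$$

Hence, since $\int_{|u| < R^{(k)}_{s-t}} u^2\,\nu_k(du) \le (R^{(k)}_{s-t})^2 (s-t)^{\varepsilon-1}$, we have

$$\begin{aligned}
\mathrm{I}_2(x,y) &:= \int_{|u| < R^{(k)}_{s-t}} |\Delta_2|\,\nu_k(du) \\
&\le c\,G_{s-t}(0)\,|M_{s-t}^{-1}|\,R^{(k)}_{s-t}\left|A_t^T(x) - A_s^T(y)\right| e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,(s-t)^{-(2+2/\alpha)\varepsilon}(s-t)^{\varepsilon-1} \\
&\le c\,G_{s-t}(0)\,|M_{s-t}^{-1}|\,|M_{s-t}|\left|A_t^T(x) - A_s^T(y)\right| e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,(s-t)^{-(2+2/\alpha)\varepsilon}(s-t)^{\varepsilon-1}.
\end{aligned}$$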

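We observe that, by (68),

$$|x-y|^{\gamma_1} \le \left(\frac{(s-t)^{\delta}}{|M_{s-t}^{-1}|}\right)^{\frac{\gamma_1}{1+\gamma_1}} \le c\,(s-t)^{(\delta + (1-\varepsilon)/\beta)\frac{\gamma_1}{1+\gamma_1}} \le c\,(s-t)^{((1-\varepsilon)/\beta)\frac{\gamma_1}{1+\gamma_1}}.$$

In the case (A) $|M_{s-t}^{-1}|\,|M_{s-t}| = 1$, hence

$$\begin{aligned}
|M_{s-t}^{-1}|\,|M_{s-t}|\,|A_s^T(y) - A_t^T(x)| &\le |A_s^T(y) - A_s^T(x)| + |A_s^T(y) - A_t^T(y)| \\
&\le c\left(|x-y|^{\gamma_1} + (s-t)^{\gamma_2}\right) \\
&\le c\left((s-t)^{\frac{1-\varepsilon}{\beta}\frac{\gamma_1}{1+\gamma_1}} + (s-t)^{\gamma_2}\right) \\
&\le c\,(s-t)^{(1-\varepsilon)\rho},
\end{aligned}$$

where $\rho = \min\{\gamma_1/\beta(1+\gamma_1), \gamma_2\}$. In the case (B) we have, by (67) and (68),

$$\begin{aligned}
|M_{s-t}^{-1}|\,|M_{s-t}|\,|A_s^T(y) - A_t^T(x)| &\le \left(|M_{s-t}^{-1}|\,|M_{s-t}|\right)\left(|A_s^T(y) - A_s^T(x)| + |A_s^T(y) - A_t^T(y)|\right) \\
&\le c\left(|M_{s-t}^{-1}|\,|M_{s-t}|\right)\left(|x-y|^{\gamma_1} + (s-t)^{\gamma_2}\right) \\
&\le c\left(|M_{s-t}|\,|M_{s-t}^{-1}|^{\frac{1}{1+\gamma_1}}(s-t)^{\delta\frac{\gamma_1}{1+\gamma_1}} + |M_{s-t}^{-1}|\,|M_{s-t}|\,(s-t)^{\gamma_2}\right) \\
&\le (s-t)^{(1-\varepsilon)\left(\frac{1}{\beta} - \frac{1}{(1+\gamma_1)\alpha}\right) + \delta\frac{\gamma_1}{1+\gamma_1}} + (s-t)^{(1-\varepsilon)\left(\frac{1}{\beta} - \frac{1}{\alpha}\right) + \gamma_2} \\
&\le c\,(s-t)^{(1-\varepsilon)\rho},
\end{aligned}$$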


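where $\rho = \min\{1/\beta - 1/(1+\gamma_1)\alpha,\ \gamma_2 + 1/\beta - 1/\alpha\}$. This implies that for $y \in D(\delta, x)$,

$$\begin{aligned}
\mathrm{I}(x,y) &\le c\,\mathrm{I}_2(x,y) \\
&\le c\,G_{s-t}(0)\,|M_{s-t}^{-1}|\,|M_{s-t}|\left|A_t^T(x) - A_s^T(y)\right| e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,(s-t)^{-(1+2/\alpha)\varepsilon-1} && (91) \\
&\le c\,G_{s-t}(0)\,|M_{s-t}^{-1}|\,|M_{s-t}|\left|A_t^T(x) - A_s^T(y)\right| (s-t)^{-(1+2/\alpha)\varepsilon-1}\,e^{-\frac{|z|}{\|A\|\,|M_{s-t}|}} \\
&\le c\,G_{s-t}(0)\,(s-t)^{\rho(1-\varepsilon)-(2+2/\alpha)\varepsilon}(s-t)^{\varepsilon-1}\,e^{-\frac{|z|}{\|A\|\,|M_{s-t}|}} \\
&\le c\,G_{s-t}(0)\,(s-t)^{\varepsilon-1}\,e^{-\frac{|z|}{\|A\|\,|M_{s-t}|}}, && (92)
\end{aligned}$$

provided

$$\varepsilon \le \frac{\rho(1-\varepsilon)}{2 + 2/\alpha}. \tag{93}$$

That is, when

$$0 < \varepsilon \le \frac{\min\{\gamma_1/\beta(1+\gamma_1),\ \gamma_2\}}{2 + 2/\alpha + \min\{\gamma_1/\beta(1+\gamma_1),\ \gamma_2\}} \quad \text{in the case (A)}$$

and

$$0 < \varepsilon \le \frac{\min\{1/\beta - 1/(1+\gamma_1)\alpha,\ \gamma_2 + 1/\beta - 1/\alpha\}}{2 + 2/\alpha + \min\{1/\beta - 1/(1+\gamma_1)\alpha,\ \gamma_2 + 1/\beta - 1/\alpha\}} \quad \text{in the case (B)}.$$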
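The passage from (93) to the two explicit conditions on $\varepsilon$ is a one-line rearrangement, sketched here for convenience with $\rho$ standing for the relevant minimum in each case:

```latex
% Equivalence of (93) with the explicit bound on \varepsilon (\rho > 0, \alpha > 0).
\begin{align*}
\varepsilon \le \frac{\rho(1-\varepsilon)}{2+2/\alpha}
  &\iff \varepsilon\left(2 + \frac{2}{\alpha}\right) \le \rho - \rho\,\varepsilon \\
  &\iff \varepsilon\left(2 + \frac{2}{\alpha} + \rho\right) \le \rho \\
  &\iff \varepsilon \le \frac{\rho}{2 + 2/\alpha + \rho}.
\end{align*}
% Under (93) the exponent \rho(1-\varepsilon) - (2+2/\alpha)\varepsilon is nonnegative,
% so the corresponding power of (s-t) is bounded for 0 < s-t \le \tau.
```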

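Hence (93) holds for $\varepsilon \le \varepsilon_0$. Next,

$$|A_s^T(y) - A_t^T(x)| \le |A_s^T(y) - A_s^T(x)| + |A_s^T(y) - A_t^T(y)| \le c\left(|x-y|^{\gamma_1} + (s-t)^{\gamma_2}\right).$$

By (91), Lemma 5.5 used twice (with $\rho = \gamma_1$ or with $\rho = 0$), (80) and finally (68), we arrive at

$$\begin{aligned}
\int_{D(\delta,x)} |\mathrm{I}(x,y)|\,dy
&\le c\,G_{s-t}(0)\det(M_{s-t})\,|M_{s-t}^{-1}|\,|M_{s-t}|\left(|M_{s-t}|^{\gamma_1} + (s-t)^{\gamma_2}\right)(s-t)^{-\varepsilon(1+2/\alpha)-1} \\
&\le c\,(s-t)^{-\varepsilon(2+(d+2)/\alpha)}\,|M_{s-t}^{-1}|\,|M_{s-t}|\left(|M_{s-t}|^{\gamma_1} + (s-t)^{\gamma_2}\right)(s-t)^{\varepsilon-1} \\
&\le c\,(s-t)^{-\varepsilon(2+(d+2)/\alpha)}\,|M_{s-t}^{-1}|\,|M_{s-t}|\left((s-t)^{(1-\varepsilon)\gamma_1/\beta} + (s-t)^{\gamma_2}\right)(s-t)^{\varepsilon-1}.
\end{aligned}$$

In the case (A) we have $|M_{s-t}^{-1}|\,|M_{s-t}| = 1$, hence

$$\int_{D(\delta,x)} |\mathrm{I}(x,y)|\,dy \le c\,(s-t)^{-1+\varepsilon}, \tag{94}$$

if $\varepsilon \le \min\left\{\frac{\gamma_1}{\gamma_1 + \beta(2+(d+2)/\alpha)},\ \frac{\gamma_2}{2+(d+2)/\alpha}\right\}$, which is satisfied with our assumptions on $\varepsilon$. In the case (B) we have, by (67) and (68), $|M_{s-t}^{-1}|\,|M_{s-t}| \le c\,(s-t)^{(1-\varepsilon)(-1/\alpha+1/\beta)}$, so

$$\begin{aligned}
\int_{D(\delta,x)} |\mathrm{I}(x,y)|\,dy
&\le c\left[(s-t)^{-\varepsilon(2+(d+2)/\alpha) + (1-\varepsilon)((1+\gamma_1)/\beta - 1/\alpha)} + (s-t)^{-\varepsilon(2+(d+2)/\alpha) + (1-\varepsilon)(1/\beta - 1/\alpha) + \gamma_2}\right](s-t)^{-1+\varepsilon} \\
&\le c\,(s-t)^{-1+\varepsilon},
\end{aligned} \tag{95}$$

provided $-\varepsilon(2+(d+2)/\alpha) + (1-\varepsilon)((1+\gamma_1)/\beta - 1/\alpha) \ge 0$ and $-\varepsilon(2+(d+2)/\alpha) + (1-\varepsilon)(1/\beta - 1/\alpha) + \gamma_2 \ge 0$. That is,

$$\varepsilon \le \min\left\{\frac{(1+\gamma_1)/\beta - 1/\alpha}{(\gamma_1+1)/\beta + 2 + (d+1)/\alpha},\ \frac{1/\beta - 1/\alpha + \gamma_2}{1/\beta + 2 + (d+1)/\alpha}\right\}.$$

Again, this is true with our assumptions.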


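Now we deal with the estimates of $\mathrm{I}(x,y)$ over $D^c(\delta, x)$. We note that our assumptions yield that in the case (A) $\varepsilon \le \frac{\gamma_1\alpha}{2(d+3)\beta}$, while in the case (B) $\varepsilon \le \frac{(1+\gamma_1)/\beta - 1/\alpha}{2(d+3)/\alpha}$. We have

$$\left|p(z + u e_k A_t^T(x)) - p(z + u e_k A_s^T(y)) - \nabla p(z)\left(u e_k (A_t^T(x) - A_s^T(y))\right)\right| \le c\,|\nabla^2 p(\xi)|\,u^2 \tag{96}$$

for

$$\xi = z + \lambda\left(\theta u e_k A_t^T(x) + (1-\theta) u e_k A_s^T(y)\right), \qquad |u| \le R^{(k)}_{s-t},$$

with $\lambda, \theta \in [0,1]$. We note that

$$|\xi - z| \le \lambda\left(\theta |u|\,|e_k A_t^T(x)| + (1-\theta)|u|\,|e_k A_t^T(y)|\right) \le |u|\,\|A\| \le R^{(k)}_{s-t}\|A\| \le |M_{s-t}|\,\|A\|.$$

Hence, by (75) and then by (67), we get

$$\begin{aligned}
\left|\nabla^2 p(\xi)\right| &\le c\,|M_{s-t}^{-1}|^2 \exp\left(-\frac{|z|}{\|A\|\,|M_{s-t}|}\right)(s-t)^{-(2+\varepsilon)/\alpha}\,G_{s-t}(0) \\
&\le c\,\exp\left(-\frac{|z|}{\|A\|\,|M_{s-t}|}\right)(s-t)^{-4/\alpha}\,G_{s-t}(0).
\end{aligned}$$

Combined with (96) it yields

$$\begin{aligned}
\mathrm{I}(x,y) &\le c\,\exp\left(-\frac{|z|}{\|A\|\,|M_{s-t}|}\right)(s-t)^{-4/\alpha}\,G_{s-t}(0)\sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}} |u|^2\,\nu_k(du) \\
&\le c\,\exp\left(-\frac{|z|}{\|A\|\,|M_{s-t}|}\right)(s-t)^{-\frac{d+4}{\alpha}},
\end{aligned}$$

since $G_{s-t}(0) \le c\,(s-t)^{-\frac{d}{\alpha}}$. Next, observe that $\frac{|z|}{\|A\|\,|M_{s-t}|} \ge c\,(s-t)^{-\frac{\varepsilon\gamma_1}{\beta}}$ for $y \in D^c(\delta, x)$, which implies that

$$\mathrm{I}(x,y) \le c\,\exp\left(-\frac{|z|}{2\|A\|\,|M_{s-t}|}\right). \tag{97}$$

Using this we obtain

$$\int_{D^c(\delta,x)} \mathrm{I}(x,y)\,dy \le c. \tag{98}$$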

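Now we estimate $\mathrm{II}(x,y)$. We have

$$\left|p(z + u e_k A_t^T(x) + U_t(x, u e_k)) - p(z + u e_k A_t^T(x))\right| \le \left|\nabla p(\xi)\,U_t(x, u e_k)\right|, \tag{99}$$

where

$$\xi = z + u e_k A_t^T(x) + \lambda U_t(x, u e_k), \qquad |u| \le R^{(k)}_{s-t},$$

with $\lambda \in (0,1)$.

Let $|z|^{1+\gamma_1} \le \dfrac{(s-t)^{\delta}}{|M_{s-t}^{-1}|}$. By Lemma 5.6, we have

$$\left|\xi (A_s^T(y))^{-1} M_{s-t}^{-1} - z (A_s^T(x))^{-1} M_{s-t}^{-1}\right| \le c.$$

Applying this and (73), we arrive at

$$|\nabla p(\xi)| \le c\,\frac{|M_{s-t}^{-1}|}{(s-t)^{\varepsilon(1+1/\alpha)}}\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,G_{s-t}(0). \tag{100}$$

From Lemma 4.4 we infer that $\int_{|u| < R^{(k)}_{s-t}} |u|^{\gamma_3}\,\nu_k(du) \le c\,(R^{(k)}_{s-t})^{\gamma_3}(s-t)^{\varepsilon-1}$. This combined with (99) yields

$$\mathrm{II}(x,y) \le c\,\frac{|M_{s-t}^{-1}|\,|M_{s-t}|^{\gamma_3}}{(s-t)^{\varepsilon(1+1/\alpha)}}\,e^{-|z(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,G_{s-t}(0)\,(s-t)^{\varepsilon-1}. \tag{101}$$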


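We also note that

$$\mathrm{II}(x,y) \le c\,G_{s-t}(0)\,(s-t)^{\varepsilon-1}\,e^{-\frac{|z|}{\|A\|\,|M_{s-t}|}}. \tag{102}$$

To prove it in the case (A), using (67), we observe that

$$|M_{s-t}^{-1}|\,|M_{s-t}|^{\gamma_3} = |M_{s-t}|^{\gamma_3-1} \le c\,(s-t)^{(1-\varepsilon)(\gamma_3-1)/\beta}.$$

Hence, from (101) we obtain (102) provided

$$\varepsilon \le \frac{\gamma_3 - 1}{\gamma_3 + \beta(1+1/\alpha)},$$

which holds with our assumptions. To prove it in the case (B) we observe that, by (67) and (68),

$$|M_{s-t}^{-1}|\,|M_{s-t}|^{\gamma_3} \le c\,(s-t)^{(1-\varepsilon)(\gamma_3/\beta - 1/\alpha)}.$$

Hence, from (101) we obtain (102) provided

$$\varepsilon \le \frac{\gamma_3 - \beta/\alpha}{\gamma_3 + \beta},$$

which again holds in this case.

Applying Lemma 5.5 (with $\rho = 0$), the estimate (101), (80) and finally (68), we obtain

$$\begin{aligned}
\int_{D(\delta,x)} |\mathrm{II}(x,y)|\,dy
&\le c\,G_{s-t}(0)\det(M_{s-t})\,|M_{s-t}^{-1}|\,|M_{s-t}|^{\gamma_3}\,(s-t)^{-(1+1/\alpha)\varepsilon}(s-t)^{\varepsilon-1} \\
&\le c\,(s-t)^{-\varepsilon(1+(d+1)/\alpha)}\,|M_{s-t}^{-1}|\,|M_{s-t}|^{\gamma_3}\,(s-t)^{\varepsilon-1} \\
&\le c\,(s-t)^{-\varepsilon(1+(d+1)/\alpha)}\,|M_{s-t}^{-1}|\,|M_{s-t}|\,(s-t)^{(1-\varepsilon)(\gamma_3-1)/\beta}(s-t)^{\varepsilon-1}.
\end{aligned}$$

Then, by the same arguments as we applied to handle the term $\mathrm{I}(x,y)$, we obtain

$$\int_{D(\delta,x)} |\mathrm{II}(x,y)|\,dy \le c\,(s-t)^{-1+\varepsilon} \tag{103}$$

in both cases: (A) (since $\varepsilon \le \frac{\gamma_3-1}{\gamma_3-1+\beta(1+(d+1)/\alpha)}$) and (B) (since $\varepsilon \le \frac{\gamma_3-\beta/\alpha}{\gamma_3+\beta(1+d/\alpha)}$). Moreover, the same reasoning as for $\mathrm{I}(x,y)$ leads to

$$\int_{D^c(\delta,x)} |\mathrm{II}(x,y)|\,dy \le c \tag{104}$$

and

$$\mathrm{II}(x,y) \le c\,\exp\left(-\frac{|z|}{2\|A\|\,|M_{s-t}|}\right), \qquad y \in D^c(\delta, x). \tag{105}$$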

By Lemma 5.8,

$$\int_{\mathbb{R}^d} \mathrm{III}(x,y)\,dy \le c\sum_{k=1}^{d}\int_{|u| \ge R^{(k)}_{s-t}} \nu_k(du) \le c\sum_{k=1}^{d} h_k(R^{(k)}_{s-t}) = c\sum_{k=1}^{d} h_k\left(h_k^{-1}((s-t)^{-1+\varepsilon})\right) = c\,d\,(s-t)^{-1+\varepsilon}.$$

Using this, (94), (95), (98), (103) and (104) we get the first assertion of the lemma.

Finally it is clear that

$$\mathrm{III}(x,y) \le c\,G_{s-t}(0)\sum_{k=1}^{d}\int_{|u| \ge R^{(k)}_{s-t}} \nu_k(du) \le c\,G_{s-t}(0)\,(s-t)^{-1+\varepsilon}.$$

This together with (92), (102) and (105) proves the second assertion of the lemma. Finally, we remark that all the constants $c$ appearing in the above estimates depend on $\tau$ through $\kappa(\tau)$, and for $\tau \le \tau_0$ the constants are $c = c(h)$.


Lemma 5.11. Assume (I) and let $\tau > 0$. For $0 < t < s \le \tau$ and $y \in \mathbb{R}^d$ we have

$$\int_{\mathbb{R}^d} |q^{(0)}_{t,s}(x,y)|\,dx \le c\,(s-t)^{-1+\varepsilon},$$

where $c = c(C_8, \kappa(\tau))$. If $\tau \le \tau_0$, then $c = c(C_8, h)$.

Proof. The proof partially repeats the proof of Lemma 5.10. Namely, take the decomposition

$$q^{(0)}_{t,s}(x,y) = \mathrm{I}(x,y) + \mathrm{II}(x,y) + \mathrm{III}(x,y)$$

from that proof, and observe that literally the same estimates as in the above proof yield the required integral-in-$x$ bound for the first two terms:

$$\int_{\mathbb{R}^d} \left(|\mathrm{I}(x,y)| + |\mathrm{II}(x,y)|\right)dx \le c\,(s-t)^{-1+\varepsilon}.$$

For the third term, we have to use the additional assumption (I). Namely,

$$\begin{aligned}
\|\mathrm{III}(\cdot, y)\|_{L^1} &\le \sum_{k=1}^{d}\int_{|u| \ge R^{(k)}_{s-t}}\int_{\mathbb{R}^d}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) + p^y_{t,s}(x-y)\right]dx\,\nu_k(du) \\
&= \int_{\{z:\ z_k \ge R^{(k)}_{s-t},\ k=1,\dots,d\}}\left[\|T^{t,z}p^y_{t,s}(\cdot - y)\|_{L^1} + \|p^y_{t,s}(\cdot - y)\|_{L^1}\right]\mu(dz) \\
&\le c\int_{\{z:\ z_k \ge R^{(k)}_{s-t},\ k=1,\dots,d\}}\mu(dz) \\
&\le c\,(s-t)^{-1+\varepsilon},
\end{aligned}$$

where in the penultimate inequality we have used (I) and the identity

$$\int_{\mathbb{R}^d} p^y_{t,s}(x, y)\,dx = 1,$$

which is easy to derive from the definition of $p^y_{t,s}(x,y)$. □

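Lemma 5.12. Fix $\tau > 0$. For any $\xi \in (0,1]$, $\zeta > 0$, $0 < s-t \le \tau$, $x, y \in \mathbb{R}^d$, if $s-t \ge \xi$, then we have

$$\begin{aligned}
&\left|\sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}\wedge\zeta}\left[p^y_{t,s}(x-y+u e_k A_t^T(x)) - p^y_{t,s}(x-y+u e_k A_s^T(y))\right]\nu_k(du)\right| \\
&\quad + \left|\sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}\wedge\zeta}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y+u e_k A_t^T(x))\right]\nu_k(du)\right| \\
&\quad \le c\sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}\wedge\zeta}\left(|u|^2 + |u|^{\gamma_3}\right)\nu_k(du),
\end{aligned}$$

where $c = c(\xi, \kappa(\tau))$.

Proof. The lemma follows from the estimates of $\Delta_1$, $\Delta_2$ (in the proof of Lemma 5.10), (99) and (100). □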

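Lemma 5.13. Fix $\tau > 0$. We have

$$\lim_{r\to\infty}\ \sup_{x \in \mathbb{R}^d,\ 0 < s-t < \tau}\ \int_{B^c(x,r)} |q^{(0)}_{t,s}(x,y)|\,dy = 0 \tag{106}$$

and

$$\lim_{r\to\infty}\ \sup_{x \in \mathbb{R}^d,\ 0 < s-t < \tau}\ \int_{B^c(x,r)} |p^{(0)}_{t,s}(x,y)|\,dy = 0. \tag{107}$$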


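Proof. Keeping the notation from Lemma 5.10 we have

$$|q^{(0)}_{t,s}(x,y)| \le \mathrm{I}(x,y) + \mathrm{II}(x,y) + \mathrm{III}(x,y).$$

By (92), (97) and (102), (105) we get for any $x, y \in \mathbb{R}^d$,

$$\mathrm{I}(x,y) + \mathrm{II}(x,y) \le c\,\exp\left(-\frac{|x-y|}{2\|A\|\,|M_{s-t}|}\right)(s-t)^{-d/\alpha+\varepsilon-1}. \tag{108}$$

By (8) and (13), for any $x \in \mathbb{R}^d$, $u \in \mathbb{R}$, $t > 0$ we have

$$\max_{1\le k\le d}\left|u e_k A_t^T(x) + U_t(x, u e_k)\right| \le (dC_3 + C_7)\left(|u|^{\gamma_3} \vee |u|\right). \tag{109}$$

Put $r_0 = 2(dC_3 + C_7)$. We bound $\mathrm{III}(x,y)$ from above by

$$\begin{aligned}
&\left|\sum_{k=1}^{d}\int_{\frac{|x-y|^{1/\gamma_3}}{r_0^{1/\gamma_3}} \ge |u| \ge R^{(k)}_{s-t}}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y)\right]\nu_k(du)\right| \\
&\quad + \left|\sum_{k=1}^{d}\int_{|u| \ge \max\left(R^{(k)}_{s-t},\ \frac{|x-y|^{1/\gamma_3}}{r_0^{1/\gamma_3}}\right)}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y)\right]\nu_k(du)\right| \\
&\quad = \mathrm{IV}(x,y) + \mathrm{V}(x,y).
\end{aligned}$$

Assume now that $|x-y| \ge r_0$. When $u$ satisfies $|x-y|^{1/\gamma_3}/r_0^{1/\gamma_3} \ge |u| \ge R^{(k)}_{s-t}$ then, by (109), we have

$$\begin{aligned}
\left|x-y+u e_k A_t^T(x) + U_t(x, u e_k)\right| &\ge |x-y| - \left|u e_k A_t^T(x) + U_t(x, u e_k)\right| \\
&\ge |x-y| - (r_0/2)\left(|u|^{\gamma_3} \vee |u|\right) \\
&\ge |x-y| - (r_0/2)\frac{|x-y|}{r_0} = \frac{|x-y|}{2}.
\end{aligned}$$

Using this and Corollary 5.4 we get

$$\left|p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y)\right| \le c\,G_{s-t}(0)\exp\left(-\frac{|x-y|}{2\|A\|\,|M_{s-t}|}\right).$$

It follows that for $|x-y| \ge r_0$ we have

$$\mathrm{IV}(x,y) \le c\,G_{s-t}(0)\left(\sum_{k=1}^{d}\int_{|u| \ge R^{(k)}_{s-t}}\nu_k(du)\right)\exp\left(-\frac{|x-y|}{2\|A\|\,|M_{s-t}|}\right) \le c\,G_{s-t}(0)\exp\left(-\frac{|x-y|}{2\|A\|\,|M_{s-t}|}\right), \tag{110}$$

since for any $k \in \{1,\dots,d\}$ we have $\int_{|u| \ge R^{(k)}_{s-t}}\nu_k(du) \le h_k\left(R^{(k)}_{s-t}\right) = \frac{1}{(s-t)^{\varepsilon-1}} \le c$.

By elementary arguments, for any $a, r > 0$ we have

$$\int_{B^c(x,r)} e^{-a|x-y|}\,dy = \frac{c}{a^d}\int_{ar}^{\infty} e^{-v}v^{d-1}\,dv \le \frac{c}{a^d}\,e^{-ar/2},$$

where $c$ depends only on $d$. Using this, (108), (110) and (68) we get for $r \ge r_0$

$$\int_{B^c(x,r)}\left(\mathrm{I}(x,y) + \mathrm{II}(x,y) + \mathrm{IV}(x,y)\right)dy \le c\,(s-t)^{-d/\alpha+\varepsilon-1}\,|M_{s-t}|^d\exp\left(\frac{-r}{4|M_{s-t}|\,\|A\|}\right) \le c\,e^{-c_1 r}. \tag{111}$$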
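The elementary tail bound used on the way to (111) can be sketched as follows. The names $\sigma_d$ (surface area of the unit sphere in $\mathbb{R}^d$) and $K_d$ are introduced only for this illustration and do not appear in the text.

```latex
% Polar coordinates centred at x, then the substitution v = a\varrho.
\begin{align*}
\int_{B^c(x,r)} e^{-a|x-y|}\,dy
  &= \sigma_d \int_r^{\infty} e^{-a\varrho}\,\varrho^{\,d-1}\,d\varrho
   = \frac{\sigma_d}{a^d}\int_{ar}^{\infty} e^{-v}\,v^{d-1}\,dv , \\
\int_{ar}^{\infty} e^{-v}\,v^{d-1}\,dv
  &\le K_d \int_{ar}^{\infty} e^{-v/2}\,dv
   = 2K_d\,e^{-ar/2},
  \qquad K_d := \sup_{v \ge 0} v^{d-1}e^{-v/2} < \infty .
\end{align*}
% Hence the bound holds with c = 2 K_d \sigma_d, which depends only on d.
```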


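By Lemma 5.8, we have

$$\int_{B^c(x,r)} \mathrm{V}(x,y)\,dy \le c\sum_{k=1}^{d}\int_{|u| \ge \max\left(R^{(k)}_{s-t},\ r^{1/\gamma_3}/r_0^{1/\gamma_3}\right)}\nu_k(du) \le c\sum_{k=1}^{d} h_k\left(\frac{r^{1/\gamma_3}}{r_0^{1/\gamma_3}}\right). \tag{112}$$

Since $\lim_{r\to\infty} h_k(r) = 0$, the first assertion of the lemma follows from (111) and (112). The proof of the second follows easily from (72). □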

Put $W = \{(t,s) : t, s \in [0,\infty),\ t < s\}$.

Lemma 5.14. The function $W \times \mathbb{R}^d \times \mathbb{R}^d \ni (t,s,x,y) \to p^y_{t,s}(x)$ is continuous, as well as the function $W \times \mathbb{R}^d \times \mathbb{R}^d \ni (t,s,x,y) \to q^{(0)}_{t,s}(x,y)$.

Proof. The first assertion follows from Lemma 4.7 and continuity of the map $(s,x) \mapsto A_s(x)$. Recall that $q^{(0)}_{t,s}(x,y)$ is equal to

$$\begin{aligned}
&\sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}}\left[p^y_{t,s}(x-y+u e_k A_t^T(x)) - p^y_{t,s}(x-y+u e_k A_s^T(y))\right]\nu_k(du) \\
&\quad + \sum_{k=1}^{d}\int_{|u| < R^{(k)}_{s-t}}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y+u e_k A_t^T(x))\right]\nu_k(du) \\
&\quad + \sum_{k=1}^{d}\int_{|u| \ge R^{(k)}_{s-t}}\left[p^y_{t,s}(x-y+u e_k A_t^T(x) + U_t(x, u e_k)) - p^y_{t,s}(x-y)\right]\nu_k(du).
\end{aligned}$$

Hence, the second assertion of the lemma follows from the first, Lemma 5.12, (72) and the bounded convergence theorem. □

Lemma 5.15. For any $f \in C_\infty(\mathbb{R}^d)$, $t_0 \ge 0$ we have

$$\lim_{W \ni (t,s)\to(t_0,t_0)} \|P^{(0)}_{t,s}f - f\|_\infty = 0.$$

Proof. Note that for any $x \in \mathbb{R}^d$, $0 < t < s < \infty$ we have $\int_{\mathbb{R}^d} p^x_{t,s}(x-y)\,dy = 1$. Using this, (83) and (72) we easily obtain the assertion of the lemma. □

Lemma 5.16. For any $f \in C_\infty(\mathbb{R}^d)$, $0 \le t_0 < s_0$ we have

$$\lim_{W \ni (t,s)\to(t_0,s_0)} \|Q^{(0)}_{t,s}f - Q^{(0)}_{t_0,s_0}f\|_\infty = 0$$

and

$$\lim_{W \ni (t,s)\to(t_0,s_0)} \|P^{(0)}_{t,s}f - P^{(0)}_{t_0,s_0}f\|_\infty = 0.$$

Proof. We give a detailed proof of the first statement and only a sketch for the second. By Lemma 5.13, it is enough to prove the lemma for $f$ with compact support. We note that the function $W \times \mathbb{R}^d \ni (t,s,x) \to Q^{(0)}_{t,s}f(x)$ is continuous. This follows from (89), Lemma 5.14 and the bounded convergence theorem. Let $r > 0$. Hence it is uniformly continuous on

$$\{(t,s,x) : t \ge 0;\ |x| \le r;\ |t-t_0|, |s-s_0| \le |s_0-t_0|/3\}.$$

It follows that

$$\lim_{W \ni (t,s)\to(t_0,s_0)}\ \sup_{|x|\le r}\ |Q^{(0)}_{t,s}f(x) - Q^{(0)}_{t_0,s_0}f(x)| = 0.$$


Let $r$ be so large that the support of $f$ is contained in $B(0, r/2)$. Next, we have

$$\sup_{|x|\ge r,\ 0<s-t<2(s_0-t_0)} |Q^{(0)}_{t,s}f(x)| \le \|f\|_\infty \sup_{x\in\mathbb{R}^d,\ 0<s-t<2(s_0-t_0)}\int_{B^c(x,r/2)} |q^{(0)}_{t,s}(x,y)|\,dy.$$

Hence

$$\limsup_{W \ni (t,s)\to(t_0,s_0)} \|Q^{(0)}_{t,s}f - Q^{(0)}_{t_0,s_0}f\|_\infty \le 2\|f\|_\infty \sup_{x\in\mathbb{R}^d,\ 0<s-t<2(s_0-t_0)}\int_{B^c(x,r/2)} |q^{(0)}_{t,s}(x,y)|\,dy,$$

which converges to 0 as $r \to \infty$, by Lemma 5.13. This completes the proof of the first assertion.

Finally, we remark that the function $W \times \mathbb{R}^d \ni (t,s,x) \to P^{(0)}_{t,s}f(x)$ is continuous, due to Lemma 5.14. Next, similarly as above, we apply (107) to complete the proof of the second assertion. □

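Lemma 5.17. For any $0 < s-t \le \tau$ and $x, y \in \mathbb{R}^d$ such that $|x-y| \le (s-t)^{1/\alpha}$ we have

$$\int_{\mathbb{R}^d}\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right|dz \le c\,|x-y|\,(s-t)^{-1/\alpha-(d+2)\varepsilon/\alpha}, \tag{113}$$

where $c = c(\kappa(\tau))$.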

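Proof. We have

$$\begin{aligned}
p^z_{t,s}(x-z) - p^z_{t,s}(y-z) &= \frac{1}{|\det(A_s(z))|}\left[G_{s-t}((x-z)(A_s^{-1}(z))^T) - G_{s-t}((y-z)(A_s^{-1}(z))^T)\right] \\
&= \frac{1}{|\det(A_s(z))|}\nabla G_{s-t}(\xi)\left[(x-y)(A_s^{-1}(z))^T\right],
\end{aligned}$$

where $\xi = (\theta(x-z) + (1-\theta)(y-z))(A_s^{-1}(z))^T$, $0 \le \theta \le 1$. By Lemma 5.2 and then Lemma 4.2, we obtain

$$\begin{aligned}
\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right| &\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{h_{\min}^{-1}(1/(s-t))}\,\frac{1}{(s-t)^{\varepsilon}}\,e^{-|\xi M_{s-t}^{-1}|} \\
&\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\,e^{-|\xi M_{s-t}^{-1}|}.
\end{aligned}$$

We have

$$|\xi M_{s-t}^{-1}| \ge |(x-z)(A_s^{-1}(z))^T M_{s-t}^{-1}| - |\xi M_{s-t}^{-1} - (x-z)(A_s^{-1}(z))^T M_{s-t}^{-1}|.$$

Since $|x-y| \le (s-t)^{1/\alpha}$, using (68), we get

$$\begin{aligned}
|\xi M_{s-t}^{-1} - (x-z)(A_s^{-1}(z))^T M_{s-t}^{-1}| &\le |(-(1-\theta)x + (1-\theta)y)(A_s^{-1}(z))^T M_{s-t}^{-1}| \\
&\le |x-y|\,\|A\|\,|M_{s-t}^{-1}| \\
&\le c\,|x-y|\,(s-t)^{-1/\alpha+\varepsilon/\alpha} \\
&\le c.
\end{aligned}$$

We pick $\delta > 0$ in the same way as in Lemma 5.8. By the same arguments as in the proof of Lemma 5.7, for $z \in D(\delta, x)$ we have

$$|(x-z)(A_s^{-1}(z))^T - (x-z)(A_s^{-1}(x))^T| \le \|A\|\,|x-z|^{1+\gamma_1} \le (s-t)^{\delta}\|A\|\frac{1}{|M_{s-t}^{-1}|}.$$

Hence

$$|(x-z)(A_s^{-1}(z))^T M_{s-t}^{-1} - (x-z)(A_s^{-1}(x))^T M_{s-t}^{-1}| \le (s-t)^{\delta}|M_{s-t}^{-1}|\,\|A\|\,|M_{s-t}^{-1}|^{-1} = (s-t)^{\delta}\|A\|.$$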


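Therefore for $z \in D(\delta, x)$ we have

$$\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right| \le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\,e^{-|(x-z)(A_s^{-1}(x))^T M_{s-t}^{-1}|}.$$

Using the above estimates, Lemma 5.5 and Lemma 5.2 we get

$$\begin{aligned}
\int_{D(\delta,x)}\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right|dz
&\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\int_{D(\delta,x)} e^{-|(x-z)(A_s^{-1}(x))^T M_{s-t}^{-1}|}\,dz \\
&\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\det(M_{s-t}) \\
&\le c\,\frac{|x-y|}{(s-t)^{1/\alpha+\varepsilon}}\prod_{i=1}^{d}\frac{h_i^{-1}(1/(s-t)^{1-\varepsilon})}{h_i^{-1}(1/(s-t))}.
\end{aligned}$$

By Corollary 4.3 this is bounded from above by

$$c\,|x-y|\,(s-t)^{-1/\alpha-(d+2)\varepsilon/\alpha}.$$

For $z \in D^c(\delta, x)$ we have

$$\begin{aligned}
\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right| &\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\,e^{-|(x-z)(A_s^{-1}(z))^T M_{s-t}^{-1}|} \\
&\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\,e^{-\frac{|x-z|}{\|A\|\,|M_{s-t}|}}.
\end{aligned}$$

Using (86) we infer that there exists $c$ such that

$$\begin{aligned}
\int_{D^c(\delta,x)}\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right|dz
&\le c\,|x-y|\,G_{s-t}(0)\,\frac{1}{(s-t)^{\varepsilon+1/\alpha}}\int_{D^c(\delta,x)} e^{-\frac{|x-z|}{\|A\|\,|M_{s-t}|}}\,dz \\
&\le c\,|x-y|,
\end{aligned}$$

which finishes the proof of (113). □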

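Proof of Theorem 2.3. Let $0 < \gamma < \gamma' < \alpha$, $\gamma \le 1$. We pick $\varepsilon = \min\left\{\varepsilon_0, \frac{\gamma'-\gamma}{\gamma(d+2)}\right\}$. Then for any $0 < s-t \le \tau$ and $x, y \in \mathbb{R}^d$ such that $|x-y| \le (s-t)^{1/\alpha}$ we have

$$\int_{\mathbb{R}^d}\left|p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right|dz \le c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}, \tag{114}$$

where $c = c(\gamma, \gamma', \kappa(\tau))$. To prove (114) we observe that our choice of $\varepsilon \in (0, \varepsilon_0]$ yields $1 + (d+2)\varepsilon \le \gamma'/\gamma$. Hence, by (113), we get

$$\left(\int_{\mathbb{R}^d}\left|p^w_{t,s}(x-w) - p^w_{t,s}(y-w)\right|dw\right)^{\gamma} \le c\,|x-y|^{\gamma}(s-t)^{-\frac{\gamma}{\alpha}(1+(d+2)\varepsilon)} \le c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}.$$

On the other hand, by (81), we obtain

$$\left(\int_{\mathbb{R}^d}\left|p^w_{t,s}(x-w) - p^w_{t,s}(y-w)\right|dw\right)^{1-\gamma} \le c.$$

The last two estimates imply (114).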
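The way the two estimates combine into (114) is a standard interpolation step, sketched below. The symbol $J$ is introduced only for this illustration.

```latex
% J denotes the L^1 distance appearing in (113) and (114); 0 < \gamma \le 1.
\begin{align*}
J &:= \int_{\mathbb{R}^d}\left|p^w_{t,s}(x-w) - p^w_{t,s}(y-w)\right|dw , \\
J &= J^{\gamma}\cdot J^{1-\gamma}
   \le \Bigl(c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}\Bigr)\cdot c
   = c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}.
\end{align*}
% The factor J^{1-\gamma} is bounded because, by the triangle inequality and (81),
% J \le \int p^w_{t,s}(x-w)\,dw + \int p^w_{t,s}(y-w)\,dw \le 2c.
```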


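If $|x-y| \ge ((s-t)/2)^{1/\alpha}$ then the assertion of the theorem is trivial, so we may assume that $|x-y| < ((s-t)/2)^{1/\alpha}$. We have

$$\begin{aligned}
|P_{t,s}f(x) - P_{t,s}f(y)| = \bigg| &\int_{\mathbb{R}^d}\left(p^z_{t,s}(x-z) - p^z_{t,s}(y-z)\right)f(z)\,dz \\
&+ \int_t^s\int_{\mathbb{R}^d}\left(p^w_{t,r}(x-w) - p^w_{t,r}(y-w)\right)\int_{\mathbb{R}^d} q_{r,s}(w,z)f(z)\,dz\,dw\,dr \bigg|.
\end{aligned}$$

By (114) and (31), this is bounded from above by

$$c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}\|f\|_\infty + c\,\|f\|_\infty\int_t^s\int_{\mathbb{R}^d}\left|p^w_{t,r}(x-w) - p^w_{t,r}(y-w)\right|(s-r)^{\varepsilon\alpha-1}\,dw\,dr. \tag{115}$$

Recall that we assumed $|x-y| < ((s-t)/2)^{1/\alpha}$. Let us denote

$$\begin{aligned}
\int_t^s\int_{\mathbb{R}^d}\left|p^w_{t,r}(x-w) - p^w_{t,r}(y-w)\right|(s-r)^{\varepsilon\alpha-1}\,dw\,dr
&= \int_t^{t+|x-y|^{\alpha}}\dots + \int_{t+|x-y|^{\alpha}}^{t+(s-t)/2}\dots + \int_{t+(s-t)/2}^{s}\dots \\
&= \mathrm{I} + \mathrm{II} + \mathrm{III}.
\end{aligned}$$

By (81) and our assumption $|x-y|^{\alpha} < (s-t)/2$ we get

$$\mathrm{I} \le c\int_t^{t+|x-y|^{\alpha}}\left(\frac{s-t}{2}\right)^{\varepsilon\alpha-1}dr \le c\,\frac{|x-y|^{\alpha}}{s-t} \le c\left(\frac{|x-y|^{\alpha}}{s-t}\right)^{\gamma/\alpha} \le c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}.$$

By (114) we get

$$\mathrm{II} \le c\int_{t+|x-y|^{\alpha}}^{t+(s-t)/2}|x-y|^{\gamma}(r-t)^{-\gamma'/\alpha}\,dr \le c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}.$$

Again by (114) we obtain

$$\mathrm{III} \le c\int_{t+(s-t)/2}^{s}|x-y|^{\gamma}(r-t)^{-\gamma'/\alpha}(s-r)^{\varepsilon\alpha-1}\,dr \le c\,|x-y|^{\gamma}(s-t)^{-\gamma'/\alpha}.$$

By (115) and the estimates of I, II, III we obtain the assertion of the theorem. □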

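Lemma 5.18. For any $0 < s-t \le \tau$ and $x \in \mathbb{R}^d$ we have

$$\sup_{y\in\mathbb{R}^d}\left|p^{(0)}_{t,s}(x,y) - p_{t,s}(x,y)\right| \le c\,G_{s-t}(0)(s-t)^{\varepsilon} \tag{116}$$

and

$$\int_{\mathbb{R}^d}\left|p^{(0)}_{t,s}(x,y) - p_{t,s}(x,y)\right|dy \le c\,(s-t)^{\varepsilon}. \tag{117}$$

The constant $c = c(\kappa(\tau))$. If $\tau \le \tau_0$, then $c = c(h)$.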

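Proof. Let $0 < s-t \le \tau$, $x \in \mathbb{R}^d$ be arbitrary. We have

$$\begin{aligned}
\left|p^{(0)}_{t,s}(x,y) - p_{t,s}(x,y)\right| &= \left|p^y_{t,s}(y-x) - p_{t,s}(x,y)\right| \\
&\le \left|p^y_{t,s}(y-x) - p^x_{t,s}(y-x)\right| + \left|p^x_{t,s}(y-x) - p_{t,s}(x,y)\right| \\
&= \mathrm{I}_1 + \mathrm{I}_2.
\end{aligned}$$

We also have

$$\begin{aligned}
\mathrm{I}_2 &\le \left|\frac{1}{|\det A_s(x)|}G_{s-t}((A_s(x))^{-1}(y-x)) - \frac{1}{|\det A_t(x)|}G_{s-t}((A_s(x))^{-1}(y-x))\right| \\
&\quad + \frac{1}{|\det A_t(x)|}\left|G_{s-t}((A_s(x))^{-1}(y-x)) - G_{s-t}((A_t(x))^{-1}(y-x))\right| \\
&\quad + \frac{1}{|\det A_t(x)|}\left|G_{s-t}((A_t(x))^{-1}(y-x)) - G_{s-t}((A_t(x))^{-1}(y-x))\right| \\
&= \mathrm{I}_3 + \mathrm{I}_4 + \mathrm{I}_5.
\end{aligned}$$

It remains to justify $\sup_{y\in\mathbb{R}^d}\mathrm{I}_i \le c\,G_{s-t}(0)(s-t)^{\varepsilon}$ and $\int_{\mathbb{R}^d}\mathrm{I}_i\,dy \le c\,(s-t)^{\varepsilon}$ for $i \in \{1,2,3,4,5\}$, with some $c = c(\kappa(\tau))$ in the general case or $c = c(h)$ if $t < s \le \tau_0$.

By Lemma 5.8 we get such estimates for $\mathrm{I}_1$. By (8), (9) and (11) we get the estimates for $\mathrm{I}_3$. By Lemma 5.9 and (9) we obtain such estimates for $\mathrm{I}_4$. Analogous estimates of $\mathrm{I}_5$ follow from Lemma 4.8 and the definitions of $G_{s-t}$, $G_{s-t}$. □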

Appendix A. Estimates for Example 2.4 and Example 2.7

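In this section we prove the two inequalities which were stated and used in Example 2.4 and Example 2.7.

Proof of (18). For $\rho_{k+1} < |y| \le \rho_k$ we have $\rho_k \le c^{-1}|y|$, hence

$$\begin{aligned}
h(r) &= r^{-2}\sum_{k:\ \rho_k\le r}\rho_k^2\,\nu(\rho_{k+1} < |y| \le \rho_k) + \sum_{k:\ \rho_k > r}\nu(\rho_{k+1} < |y| \le \rho_k) \\
&\le c^{-2}r^{-2}\int\left(1\wedge(|y|^2 r^{-2})\right)\nu(dy) = c^{-2}\,\frac{4c_\alpha}{\alpha(2-\alpha)}\,r^{-\alpha},
\end{aligned}$$

which proves the upper bound in (18). Similarly, we have

$$h(r) \ge \int_{|y|\le\rho_1}\left(1\wedge(|y|^2 r^{-2})\right)\nu(dy) \ge B\,r^{-\alpha}, \qquad r \in (0,1],$$

which proves the lower bound. □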

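Proof of (19). Without loss of generality we can take $t = 0$, $x = 0$; then

$$X_s = \int_0^s A(r)e_1\,dZ^1_r + \int_0^s A(r)e_2\,dZ^2_r.$$

The characteristic function of $X_s$ has the form

$$\varphi_{X_s}(z) = \exp\left\{-\int_0^s\left(|(A(r)z)_1|^{\alpha_1} + |(A(r)z)_2|^{\alpha_2}\right)dr\right\},$$

and thus the distribution density equals

$$p_{X_s}(x) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\exp\left\{-ix\cdot z - \int_0^s\left(|(A(r)z)_1|^{\alpha_1} + |(A(r)z)_2|^{\alpha_2}\right)dr\right\}dz.$$

In particular,

$$p_{X_s}(0) = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\exp\left\{-\int_0^s\left(|(A(r)z)_1|^{\alpha_1} + |(A(r)z)_2|^{\alpha_2}\right)dr\right\}dz;$$

below we will show that the latter integral exists.

Let us estimate from below

$$\int_0^s\left(|(A(r)z)_1|^{\alpha_1} + |(A(r)z)_2|^{\alpha_2}\right)dr \ge \int_0^s|(A(r)z)_2|^{\alpha_2}\,dr.$$

We recall that

$$(A(r)z)_2 = r^{\gamma}z_1 + z_2, \qquad z = (z_1, z_2),$$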


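and perform a case study.

Case 1: $|z_2| > \frac{s^{\gamma}}{2}|z_1|$. Then

$$|r^{\gamma}z_1 + z_2| \ge \frac{1}{4}|z_2| \ \text{ if } r \in \left[0, \frac{s}{4^{1/\gamma}}\right] \quad\text{and}\quad |r^{\gamma}z_1 + z_2| \ge 0 \ \text{ otherwise},$$

which gives

$$\int_0^s|(A(r)z)_2|^{\alpha_2}\,dr \ge c\,s\,|z_2|^{\alpha_2}.$$

Case 2: $|z_2| \le \frac{s^{\gamma}}{2}|z_1|$. Consider two intervals $I(s) = [\frac{s}{2}, \frac{3s}{4})$, $J(s) = [\frac{3s}{4}, s]$. At least one of these intervals is free from the roots of the function $r \mapsto |r^{\gamma}z_1 + z_2|$, and this function depends on $v = r^{\gamma}$ linearly with slope $\pm|z_1|$. Since the values of this function at the endpoints are positive, this yields that, at least on half of the interval,

$$|r^{\gamma}z_1 + z_2| \ge c\,s^{\gamma}|z_1|,$$

which gives

$$\int_0^s|(A(r)z)_2|^{\alpha_2}\,dr \ge c\,s^{1+\alpha_2\gamma}|z_1|^{\alpha_2}.$$

Now we can complete the estimate of $p_{X_s}(0)$. We have

$$\begin{aligned}
p_{X_s}(0) &\le \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\exp\left\{-\int_0^s|(A(r)z)_2|^{\alpha_2}\,dr\right\}dz \\
&\le \frac{1}{(2\pi)^2}\int_{|z_2| > \frac{s^{\gamma}}{2}|z_1|}\exp\left\{-c\,s\,|z_2|^{\alpha_2}\right\}dz + \frac{1}{(2\pi)^2}\int_{|z_2| \le \frac{s^{\gamma}}{2}|z_1|}\exp\left\{-c\,s^{1+\alpha_2\gamma}|z_1|^{\alpha_2}\right\}dz =: I_1 + I_2.
\end{aligned}$$

Since

$$I_1 = \frac{4}{(2\pi)^2 s^{\gamma}}\int_{\mathbb{R}}|z_2|\exp\left\{-c\,s\,|z_2|^{\alpha_2}\right\}dz_2 = \Big|\,s^{1/\alpha_2}z_2 = v\,\Big| = C\,s^{-\gamma-2/\alpha_2},$$

$$I_2 = \frac{s^{\gamma}}{(2\pi)^2}\int_{\mathbb{R}}|z_1|\exp\left\{-c\,s^{1+\alpha_2\gamma}|z_1|^{\alpha_2}\right\}dz_1 = \Big|\,s^{(1+\alpha_2\gamma)/\alpha_2}z_1 = v\,\Big| = C\,s^{\gamma}\cdot s^{-2(1+\alpha_2\gamma)/\alpha_2} = C\,s^{-\gamma-2/\alpha_2},$$

this completes the proof of (19). □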
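The two substitutions indicated in the evaluation of $I_1$ and $I_2$ follow the same scaling pattern, which can be sketched as below. The parameter $\lambda$ and the constant $K$ are introduced only for this illustration.

```latex
% Scaling identity with v = \lambda^{1/\alpha_2}\zeta, for \lambda > 0 and c > 0.
\begin{align*}
\int_{\mathbb{R}} |\zeta|\,e^{-c\lambda|\zeta|^{\alpha_2}}\,d\zeta
  &= \lambda^{-2/\alpha_2}\int_{\mathbb{R}} |v|\,e^{-c|v|^{\alpha_2}}\,dv
   =: K\,\lambda^{-2/\alpha_2}. \\
\lambda = s:\qquad
  & s^{-\gamma}\cdot s^{-2/\alpha_2} = s^{-\gamma-2/\alpha_2}, \\
\lambda = s^{1+\alpha_2\gamma}:\qquad
  & s^{\gamma}\cdot s^{-2(1+\alpha_2\gamma)/\alpha_2}
    = s^{\gamma-2/\alpha_2-2\gamma} = s^{-\gamma-2/\alpha_2}.
\end{align*}
% Both cases give the same power of s, as stated for I_1 and I_2.
```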

References

[1] K. Bogdan, T. Grzywny, M. Ryznar, Density and tails of unimodal convolution semigroups, J. Funct. Anal. 266 (2014) 3543–3571.

[2] K. Bogdan, V. Knopova, P. Sztonyk, Heat kernel of anisotropic nonlocal operators, Documenta Mathematicae 25 (2020) 1–54.

[3] Z.-Q. Chen, Z. Hao, X. Zhang, Hölder regularity and gradient estimates for SDEs driven by cylindrical α-stable processes, Electron. J. Probab. 25 (2020), article no. 137, 1–23.

[4] Z.-Q. Chen, E. Hu, L. Xie, X. Zhang, Heat kernels for non-symmetric diffusions operators with jumps, J. Differ. Equ. 263 (2017) 6576–6634.

[5] Z.-Q. Chen, X. Zhang, Heat kernels and analyticity of non-symmetric jump diffusion semigroups, Probab. Theory Relat. Fields 16 (2016) 267–312.

[6] Z.-Q. Chen, X. Zhang, Heat kernels for time-dependent non-symmetric stable-like operators, J. Math. Anal. Appl. 465 (2018) 1–21.

[7] F. H. Clarke, On the inverse function theorem, Pacific Journal of Mathematics Vol. 64, No 1 (1976), 97–102.

[8] A. Debussche, N. Fournier, Existence of densities for stable-like driven SDE's with Hölder continuous coefficients, J. Funct. Anal. 264(8) (2013) 1757–1778.

[9] S.D. Eidelman, S.D. Ivasyshen, A.N. Kochubei, Analytic Methods in the Theory of Differential and Pseudo-Differential Equations of Parabolic Type, Birkhäuser, Basel 2004.

[10] S. N. Ethier, T. G. Kurtz, Markov Processes: Characterization and Convergence, Wiley, New York 1986.


[11] W. Feller, Zur Theorie der stochastischen Prozesse. (Existenz- und Eindeutigkeitssatze), Mathema-tische Annalen 113 (1936) 113–160. Reprinted and translated in R.L. Schilling, Z. Vondracek, W.Wojczynski, William Feller. Selected Papers I, Springer, Cham (2015).

[12] M. Friesen, P. Jin, B. Rüdiger, Existence of densities for stochastic differential equations driven by Lévy processes with anisotropic jumps, arXiv:1810.07504

[13] M. Gevrey, Sur les équations aux dérivées partielles du type parabolique, Journal des Mathématiques Pures et Appliquées 9 (1913) 305–471 and 10 (1914) 105–148.

[14] T. Grzywny, On Harnack inequality and Hölder regularity for isotropic unimodal Lévy processes, Potential Anal. 41 (2014) 1–29.

[15] T. Grzywny, K. Szczypkowski, Heat kernels of non-symmetric Lévy-type operators, J. Diff. Equ. 267 (2019) 6004–6064.

[16] T. Grzywny, K. Szczypkowski, Lévy processes: Concentration function and heat kernel bounds, Bernoulli 26(4) (2020) 3191–3223.

[17] J. Hadamard, Sur la solution fondamentale des équations aux dérivées partielles du type parabolique, Comptes Rendus de l’Académie des Sciences, Paris 152 (1911) 1148–1149.

[18] P. Hajłasz, Change of variables formula under minimal assumptions, Colloquium Mathematicum 64(1) (1993), 93–101.

[19] N. Ikeda, S. Watanabe, Stochastic differential equations and diffusion processes, North-Holland, Amsterdam, 1981.

[20] V. Knopova, A. Kochubei, A. Kulik, Parametrix Methods for Equations with Fractional Laplacians, In: A.N. Kochubei, Y. Luchko (eds.), Handbook of Fractional Calculus with Applications, Vol. 2. De Gruyter, Berlin 2019.

[21] V. Knopova, A. Kulik, Parametrix construction of the transition probability density of the solution to an SDE driven by α-stable noise, Annales de l’Institut Henri Poincaré 54(1) (2018) 100–140.

[22] V. Knopova, A. Kulik, R. Schilling, Construction and heat kernel estimates of general stable-like Markov processes, arXiv:2005.08491

[23] V. Knopova, R. Schilling, Transition density estimates for a class of Lévy and Lévy-type processes, J. Theoret. Probab. 25(1) (2012) 144–170.

[24] A.N. Kochubei, Parabolic pseudodifferential equations, hypersingular integrals, and Markov processes, Mathematics of the USSR – Izvestiya 33 (1989) 233–259.

[25] V. Kolokoltsov, Symmetric stable laws and stable-like jump-diffusions, Proc. London Math. Soc. 80 (2000) 725–768.

[26] F. Kühn, Lévy-Type Processes: Moments, Construction and Heat Kernel Estimates, Springer, Lecture Notes in Mathematics 2187 (Lévy Matters VI), Berlin 2017.

[27] F. Kühn, Transition probabilities of Lévy-type processes: Parametrix construction, Math. Nachr. 292 (2019) 358–376.

[28] T. Kulczycki, M. Ryznar, Semigroup properties of solutions of SDEs driven by Lévy processes with independent coordinates, Stochastic Process. Appl. 130 (2020) 7185–7217.

[29] T. Kulczycki, M. Ryznar, Transition density estimates for diagonal systems of SDEs driven by cylindrical α-stable process, ALEA Lat. Am. J. Probab. Math. Stat. 15 (2018) 1335–1375.

[30] T. Kulczycki, M. Ryznar, P. Sztonyk, Strong Feller property for SDEs driven by multiplicative cylindrical stable noise, Potential Anal. (2020), published online https://doi.org/10.1007/s11118-020-09850-8

[31] A. Kulik, Approximation in law of locally α–stable Lévy-type processes by non-linear regressions, Electron. J. Probab. 24 (2019), paper no. 83, 45 pp.

[32] A. Kulik, On weak uniqueness and distributional properties of a solution to an SDE with α-stable noise, Stochastic Process. Appl. 129 (2019) 473–506.

[33] E.E. Levi, Sulle equazioni lineari totalmente ellittiche alle derivate parziali, Rendiconti del Circolo Matematico di Palermo 24 (1907) 275–317.

[34] D. W. Stroock, S. R. S. Varadhan, Multidimensional Diffusion Processes, Springer, Berlin 1979.

[35] P. Sztonyk, Estimates of densities for Lévy processes with lower intensity of large jumps, Math. Nachr. 290(1) (2017) 120–141.

Faculty of Pure and Applied Mathematics, Wrocław University of Science and Technology, Wyb. Wyspiańskiego 27, 50-370 Wrocław, Poland.
