
To the memory of my parents


Preface

This book presents an analytic theory of random fields estimation, optimal by the criterion of minimum variance of the error of the estimate. This theory is a generalization of the classical Wiener theory. Wiener's theory was developed for optimal estimation of stationary random processes, that is, random functions of one variable. Random fields are random functions of several variables. Wiener's theory was based on the analytical solution of the basic integral equation of estimation theory. For estimation of stationary random processes this equation was of Wiener-Hopf type, originally on a positive semiaxis. About 25 years later the theory of such equations was developed for the case of finite intervals. The assumption of stationarity of the processes was vital for the theory. Analytical formulas for optimal estimates (filters) were obtained under the assumption that the spectral density of the stationary process is a positive rational function. We generalize Wiener's theory in several directions. First, an estimation theory of random fields, and not only of random processes, is developed. Secondly, the stationarity assumption is dropped. Thirdly, the assumption of a rational spectral density is generalized in this book: we consider kernels of positive rational functions of arbitrary elliptic selfadjoint operators on the whole space. The domain of observation of the signal does not enter into the definition of the kernel. These kernels are correlation functions of random fields, and therefore the class of such kernels defines the class of random fields for which the analytical estimation theory is developed. In the Appendix we consider an even more general class of kernels, namely kernels R(x, y) which solve the equation QR = Pδ(x − y), where P and Q are elliptic operators and δ(x − y) is the delta-function. We study the singular perturbation problem for the basic integral equation of estimation theory, Rh = f. The solution to this equation which is of interest in estimation theory is, in general, a distribution. The perturbed equation εh_ε + Rh_ε = f has a unique solution in L^2(D). The singular perturbation problem consists of the study of the asymptotics of h_ε as ε → 0. This theory is not only of mathematical interest, but also a basis for the numerical solution of the basic integral equation in distributions. We discuss the relation between estimation theory and quantum-mechanical non-relativistic scattering theory. Applications of the estimation theory are also discussed. The presentation in this book is based partly on the author's earlier monographs [Ramm (1990)] and [Ramm (1996)], but it also contains recent results [Ramm (2002)], [Ramm (2003)], [Kozhevnikov and Ramm (2005)], and [Ramm and Shifrin (2005)].

The book is intended for researchers in probability and statistics, analysis, numerical analysis, signal estimation and image processing, theoretically inclined electrical engineers, geophysicists, and graduate students in these areas. Parts of the book can be used in graduate courses in probability and statistics. The analytical tools that the author uses are not usual for statistics and probability. These tools include spectral theory of elliptic operators, pseudodifferential operators, and operator theory. The presentation in this book is essentially self-contained. Auxiliary material which we use is collected in Chapter 8.


Contents

Preface vii

1. Introduction 1

2. Formulation of Basic Results 9
2.1 Statement of the problem 9
2.2 Formulation of the results (multidimensional case) 14
2.2.1 Basic results 14
2.2.2 Generalizations 17
2.3 Formulation of the results (one-dimensional case) 18
2.3.1 Basic results for the scalar equation 19
2.3.2 Vector equations 22
2.4 Examples of kernels of class R and solutions to the basic equation 25
2.5 Formula for the error of the optimal estimate 29

3. Numerical Solution of the Basic Integral Equation in Distributions 33
3.1 Basic ideas 33
3.2 Theoretical approaches 37
3.3 Multidimensional equation 43
3.4 Numerical solution based on the approximation of the kernel 46
3.5 Asymptotic behavior of the optimal filter as the white noise component goes to zero 54
3.6 A general approach 57

4. Proofs 65
4.1 Proof of Theorem 2.1 65
4.2 Proof of Theorem 2.2 73
4.3 Proof of Theorems 2.4 and 2.5 79
4.4 Another approach 84

5. Singular Perturbation Theory for a Class of Fredholm Integral Equations Arising in Random Fields Estimation Theory 87
5.1 Introduction 87
5.2 Auxiliary results 90
5.3 Asymptotics in the case n = 1 93
5.4 Examples of asymptotical solutions: case n = 1 98
5.5 Asymptotics in the case n > 1 103
5.6 Examples of asymptotical solutions: case n > 1 105

6. Estimation and Scattering Theory 111
6.1 The direct scattering problem 111
6.1.1 The direct scattering problem 111
6.1.2 Properties of the scattering solution 114
6.1.3 Properties of the scattering amplitude 120
6.1.4 Analyticity in k of the scattering solution 121
6.1.5 High-frequency behavior of the scattering solutions 123
6.1.6 Fundamental relation between u+ and u− 127
6.1.7 Formula for det S(k) and the Levinson Theorem 128
6.1.8 Completeness properties of the scattering solutions 131
6.2 Inverse scattering problems 134
6.2.1 Inverse scattering problems 134
6.2.2 Uniqueness theorem for the inverse scattering problem 134
6.2.3 Necessary conditions for a function to be a scattering amplitude 135
6.2.4 A Marchenko equation (M equation) 136
6.2.5 Characterization of the scattering data in the 3D inverse scattering problem 138
6.2.6 The Born inversion 141
6.3 Estimation theory and inverse scattering in R3 150

7. Applications 159
7.1 What is the optimal size of the domain on which the data are to be collected? 159
7.2 Discrimination of random fields against noisy background 161
7.3 Quasioptimal estimates of derivatives of random functions 169
7.3.1 Introduction 169
7.3.2 Estimates of the derivatives 170
7.3.3 Derivatives of random functions 172
7.3.4 Finding critical points 180
7.3.5 Derivatives of random fields 181
7.4 Stable summation of orthogonal series and integrals with randomly perturbed coefficients 182
7.4.1 Introduction 182
7.4.2 Stable summation of series 184
7.4.3 Method of multipliers 185
7.5 Resolution ability of linear systems 185
7.5.1 Introduction 185
7.5.2 Resolution ability of linear systems 187
7.5.3 Optimization of resolution ability 191
7.5.4 A general definition of resolution ability 196
7.6 Ill-posed problems and estimation theory 198
7.6.1 Introduction 198
7.6.2 Stable solution of ill-posed problems 205
7.6.3 Equations with random noise 216
7.7 A remark on nonlinear (polynomial) estimates 230

8. Auxiliary Results 233
8.1 Sobolev spaces and distributions 233
8.1.1 A general imbedding theorem 233
8.1.2 Sobolev spaces with negative indices 236
8.2 Eigenfunction expansions for elliptic selfadjoint operators 241
8.2.1 Resolution of the identity and integral representation of selfadjoint operators 241
8.2.2 Differentiation of operator measures 242
8.2.3 Carleman operators 246
8.2.4 Elements of the spectral theory of elliptic operators in L2(Rr) 249
8.3 Asymptotics of the spectrum of linear operators 260
8.3.1 Compact operators 260
8.3.1.1 Basic definitions 260
8.3.1.2 Minimax principles and estimates of eigenvalues and singular values 262
8.3.2 Perturbations preserving asymptotics of the spectrum of compact operators 265
8.3.2.1 Statement of the problem 265
8.3.2.2 A characterization of the class of linear compact operators 266
8.3.2.3 Asymptotic equivalence of s-values of two operators 268
8.3.2.4 Estimate of the remainder 270
8.3.2.5 Unbounded operators 274
8.3.2.6 Asymptotics of eigenvalues 275
8.3.2.7 Asymptotics of eigenvalues (continuation) 283
8.3.2.8 Asymptotics of s-values 284
8.3.2.9 Asymptotics of the spectrum for quadratic forms 287
8.3.2.10 Proof of Theorem 2.3 293
8.3.3 Trace class and Hilbert-Schmidt operators 297
8.3.3.1 Trace class operators 297
8.3.3.2 Hilbert-Schmidt operators 298
8.3.3.3 Determinants of operators 299
8.4 Elements of probability theory 300
8.4.1 The probability space and basic definitions 300
8.4.2 Hilbert space theory 306
8.4.3 Estimation in Hilbert space L2(Ω, U, P) 310
8.4.4 Homogeneous and isotropic random fields 312
8.4.5 Estimation of parameters 315
8.4.6 Discrimination between hypotheses 317
8.4.7 Generalized random fields 319
8.4.8 Kalman filters 320

Appendix A Analytical Solution of the Basic Integral Equation for a Class of One-Dimensional Problems 325
A.1 Introduction 326
A.2 Proofs 329

Appendix B Integral Operators Basic in Random Fields Estimation Theory 337
B.1 Introduction 337
B.2 Reduction of the basic integral equation to a boundary-value problem 341
B.3 Isomorphism property 349
B.4 Auxiliary material 354

Bibliographical Notes 359

Bibliography 363

Symbols 371

Index 373


Chapter 1

Introduction

This work deals with just one topic: the analytic theory of random fields estimation within the framework of covariance theory. No assumptions about distribution laws are made: the fields are not necessarily Gaussian or Markovian. The only information used is the covariance functions. Specifically, we assume that the random field is of the form

U(x) = s(x) + n(x),  x ∈ R^r,  (1.1)

where s(x) is the useful signal and n(x) is noise. Without loss of generality assume that

\overline{s(x)} = \overline{n(x)} = 0,  (1.2)

where the bar denotes the mean value. If these mean values are not zero, then one either assumes that they are known and considers the fields s(x) − \overline{s(x)} and n(x) − \overline{n(x)} with zero mean values, or one estimates the mean values and then subtracts them from the corresponding fields. We also assume that the covariance functions

\overline{U^*(x)U(y)} := R(x, y),  \overline{U^*(x)s(y)} := f(x, y)  (1.3)

are known. The star stands for complex conjugation. This information is necessary for any development within the framework of covariance theory. We will show that, under some assumptions about the functions (1.3), one can develop an analytic theory of random fields estimation. If the functions (1.3) are not known, then one has to estimate them from statistical data or from some theory. In many applications the exact analytical expression for the covariance functions is not very important; rather, some general features of R or f are of practical interest. These features include, for example, the correlation radius.


The estimation problem of interest is the following one. The signal U(x) of the form (1.1) is observed in a domain D ⊂ R^r with boundary Γ. Assuming (1.2) and (1.3), one needs to estimate linearly As(x_0), where A is a given operator and x_0 ∈ R^r is a given point. The linear estimate is to be best possible by the criterion of minimum variance, i.e., the estimate is by the least squares method. The most general form of a linear estimate of U observed in the domain D is

LU := ∫_D h(x, y)U(y) dy,  (1.4)

where h(x, y) is a distribution. Therefore the optimal linear estimate solves the variational problem

ε := \overline{(LU − As)^2} = min,  (1.5)

where LU and As are computed at the point x_0. A necessary condition on h(x, y) for (1.5) (with A = I) to hold is (see equations (2.11) and (8.423))

Rh := ∫_D R(x, y)h(z, y) dy = f(x, z),  x, z ∈ \bar{D} := D ∪ Γ.  (1.6)

The basic topic of this work is the study of a class of equations (1.6) for which the analytical properties of the solution h can be obtained, a numerical procedure for computing h can be given, and the properties of the operator R in (1.6) can be studied. Since z enters (1.6) as a parameter, one can study the basic equation of estimation theory

Rh := ∫_D R(x, y)h(y) dy = f(x),  x ∈ D.  (1.7)

A typical one-dimensional example of equation (1.7) in estimation theory is

∫_{−1}^{1} exp(−|x − y|)h(y) dy = f(x),  −1 ≤ x ≤ 1.  (1.8)

Its solution of minimal order of singularity is

h(x) = (−f′′ + f)/2 + δ(x + 1)[−f′(−1) + f(−1)]/2 + δ(x − 1)[f′(1) + f(1)]/2.  (1.9)

One can see that the solution is a distribution with singular support at the boundary of the domain D. By sing supp h we mean the set having no open neighborhood to which the restriction of h can be identified with a locally integrable function.
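For a concrete smooth f, formula (1.9) can be verified directly; the following minimal SymPy sketch (the test function f(x) = exp(2x) is an arbitrary choice made here for illustration) checks that (1.9) satisfies (1.8):

    import sympy as sp

    x, y = sp.symbols('x y', real=True)
    f = sp.exp(2*x)                      # arbitrary smooth test function
    fy = f.subs(x, y)

    # smooth part of h in (1.9), as a function of the integration variable y
    h_smooth = (-sp.diff(fy, y, 2) + fy) / 2

    # integral of exp(-|x-y|) * h_smooth(y) over (-1, 1), split at y = x
    I1 = sp.integrate(sp.exp(-(x - y)) * h_smooth, (y, -1, x))   # region y < x
    I2 = sp.integrate(sp.exp(-(y - x)) * h_smooth, (y, x, 1))    # region y > x

    # contributions of the delta functions at y = -1 and y = 1 in (1.9)
    fp = sp.diff(f, x)
    b_left  = sp.exp(-(x + 1)) * (-fp.subs(x, -1) + f.subs(x, -1)) / 2
    b_right = sp.exp(-(1 - x)) * ( fp.subs(x,  1) + f.subs(x,  1)) / 2

    print(sp.simplify(I1 + I2 + b_left + b_right - f))           # expected output: 0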


In the case of equation (1.8) this domain is D = (−1, 1). Even if f ∈ C^∞(D), the solutions to equations (1.7), (1.8) are, in general, not in L^2(D).

The problem is: in what functional space should one look for the solution? Is the solution unique? Does the solution to (1.7) provide the solution to the estimation problem (1.5)? Does the solution depend continuously on the data, e.g., on f and on R(x, y)? How does one compute the solution analytically and numerically? What are the properties of the solution; for example, what is the order of singularity of the solution? What is the singular support of the solution? What are the properties of the operator R as an operator in L^2(D)?

These questions are answered in Chapters 2-4.

The answers are given for the class of random fields whose covariance functions R(x, y) are kernels of positive rational functions of selfadjoint elliptic operators in L^2(R^r). The class R of such kernels consists of the kernels

R(x, y) = ∫_Λ P(λ)Q^{−1}(λ)Φ(x, y, λ) dρ(λ),  (1.10)

where Λ, dρ, and Φ(x, y, λ) are, respectively, the spectrum, spectral measure and spectral kernel of an elliptic selfadjoint operator L in L^2(R^r) of order s, and P(λ) and Q(λ) are positive polynomials of degrees p and q, respectively. The notions of spectral measure and spectral kernel are discussed in Section 8.2. If p > q, then the operator in L^2(D) with kernel (1.10) is an elliptic integro-differential operator R; if p = q, then R = cI + K, where c = const > 0, I is the identity operator, and K is a compact selfadjoint operator in L^2(D); if p < q, which is the most interesting case, then R is a compact selfadjoint operator in L^2(D). In this case the noise n(x) is called colored. If φ(λ) is a measurable function, then the kernel of the operator φ(L) is defined by the formula

φ(L)(x, y) = ∫_Λ φ(λ)Φ(x, y, λ) dρ(λ).  (1.11)

The domain of definition of the operator φ(L) consists of all functions f ∈ L^2(R^r) such that

∫_Λ |φ(λ)|^2 d(E_λ f, f) < ∞,  (1.12)

where E_λ is the resolution of the identity for L. It is a projection operator


with the kernel

E_λ(x, y) = ∫_{−∞}^{λ} Φ(x, y, µ) dρ(µ).  (1.13)

In particular, since E_{+∞} = I, one has

δ(x − y) = ∫_{−∞}^{∞} Φ(x, y, λ) dρ(λ).  (1.14)

In (1.13) and (1.14) the integration is actually taken over (−∞, λ) ∩ Λ and (−∞, ∞) ∩ Λ, respectively, since dρ = 0 outside Λ. The kernel in (1.8) corresponds to the simple case r = 1, L = −i d/dx, Λ = (−∞, ∞), dρ = dλ, Φ(x, y, λ) = (2π)^{−1} exp{iλ(x − y)}, P(λ) = 1, Q(λ) = (λ^2 + 1)/2, so that exp(−|x|) = (2π)^{−1} ∫_{−∞}^{∞} ((λ^2 + 1)/2)^{−1} exp(iλx) dλ, and formula (1.9) is a very particular case of the general formulas given in Chapter 2.

Let R(x, y) ∈ R, α := s(q − p)/2, let H^ℓ(D) be the Sobolev spaces and H^{−ℓ}(D) their duals with respect to H^0(D) = L^2(D). Then the answers to the questions formulated above are as follows.

The solution to equation (1.7) solves the estimation problem if and only if h ∈ H^{−α}(D). The operator R : H^{−α}(D) → H^α(D) is an isomorphism. The singular support of the solution h ∈ H^{−α}(D) of equation (1.7) is Γ = ∂D. The analytic formula for h is of the form h = Q(L)G, where G is a solution to some interface elliptic boundary value problem and the differentiation is taken in the sense of distributions.
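The representation of the kernel exp(−|x − y|) used in (1.8) as a member of the class R, namely exp(−|x|) = (2π)^{−1} ∫_{−∞}^{∞} ((λ^2 + 1)/2)^{−1} exp(iλx) dλ, can also be checked numerically; a minimal sketch, assuming NumPy and SciPy are available (the integration cutoff is an arbitrary choice):

    import numpy as np
    from scipy.integrate import quad

    def kernel_from_spectrum(x, cutoff=200.0):
        # (2*pi)^(-1) * integral of ((lam^2+1)/2)^(-1) * cos(lam*x) over the real axis;
        # the sine part vanishes by symmetry, and the tail beyond the cutoff is O(1/cutoff)
        val, _ = quad(lambda lam: (2.0 / (lam**2 + 1.0)) * np.cos(lam * x),
                      -cutoff, cutoff, limit=400)
        return val / (2.0 * np.pi)

    for x in [0.0, 0.5, 1.3, 3.0]:
        # the two printed values agree up to the tail-truncation error
        print(x, kernel_from_spectrum(x), np.exp(-abs(x)))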

An exact description of this analytic formula is given in Chapter 2. The spectral properties of the operator R : L^2(D) → L^2(D) with kernel R(x, y) ∈ R are also given in Chapter 2. These properties include the asymptotics as n → ∞ of the eigenvalues λ_n of R, the dependence of λ_n on D, and the asymptotics of λ_1(D) as D → R^r, that is, as D grows uniformly in all directions. Numerical methods for solving equation (1.7) in the space H^{−α}(D) of distributions are given in Chapter 3. These methods depend heavily on the analytical results given in Chapter 2.

The necessary background material on Sobolev spaces and spectral theory is given in Chapter 8, so that the reader does not have to consult the literature in order to understand the contents of this work.

No attempt was made by the author to present all aspects of the theory of random fields. There are several books [Adler (1981)], [Yadrenko (1983)], [Vanmarcke (1983)], [Rosanov (1982)] and [Preston (1967)], and many papers on various aspects of the theory of random fields. They have practically no intersection with this work, which can be viewed as an extension of Wiener's filtering theory. The statement of the problem is the same as in Wiener's theory, but we study random functions of several variables, that is, random fields, while Wiener (and many researchers after him) studied filtering and extrapolation of stationary random processes, that is, random functions of one variable. Wiener's basic assumptions were:

1) the random process u(t) = s(t) + n(t) is stationary,

2) it is observed on the interval (−∞, T),

3) it has a rational spectral density (this assumption can be relaxed, but for an effective solution of the estimation problems it is quite useful).

The first assumption means that R(t, τ) = R(t − τ), where R is the covariance function (1.3). The second one means that D = (−∞, T). The third one means that R(λ) = P(λ)Q^{−1}(λ), where P(λ) and Q(λ) are polynomials, R(λ) ≥ 0 for −∞ < λ < ∞, and R(λ) := ∫_{−∞}^{∞} R(t) exp(−iλt) dt. The analytical theory used by Wiener is the theory of Wiener-Hopf equations. Later the Wiener theory was extended to the case D = [T_1, T] of a finite interval of observation, while assumptions 1) and 3) remained valid. A review of this theory with many references is [Kailath (1974)].

Although the literature on filtering and estimation theory is large (dozens of books and hundreds of papers are mentioned in [Kailath (1974)]), the analytic theory presented in this work and developed in the works of the author cited in the references has not been available in book form in its present form, although a good part of it appeared in [?, Ch. 1]. Most of the previously known analytical results on Wiener-Hopf equations with rational R(λ) are immediate and simple consequences of our general theory. Engineers can use the theory presented here in many applications. These include signal and image processing in TV, underwater acoustics, geophysics, optics, etc. In particular, the following long-standing question is answered by the theory given here. Suppose a random field (1.1) is observed in a ball B and one wants to estimate s(x_0), where x_0 is the center of B. What is the optimal size of the radius of B? If the radius is too small, then the estimate is not accurate. If it is too large, then the estimate is not better than the one obtained from the observations in a ball of smaller radius, so that the efforts are wasted. This problem is of practical importance in many applications.

We will briefly discuss some other applications of the estimation theory, for example, discrimination of hypotheses, resolution ability of linear systems, estimation of derivatives of random functions, etc. However, the emphasis is on the theory, and the author hopes that other scientists will pursue further possible applications.

Numerical solution of the basic integral equation of estimation theory was widely discussed in the literature [Kailath (1974)] in the case of random processes (r = 1), mostly stationary, that is, when R(x, y) = R(x − y), and mostly in the case when the noise is white, so that the integral equation for the optimal filter is

(I + R)h := h(t) + ∫_0^T R_s(t − τ)h(τ) dτ = f(t),  0 ≤ t ≤ T,  (1.15)

where R_s is the covariance function of the useful signal s(t). Note that the integral operator R in (1.15) is selfadjoint and nonnegative in L^2[0, T]. Therefore (I + R)^{−1} exists and is bounded, and the numerical solution of (1.15) is not difficult. Many methods are available for solving the one-dimensional Fredholm integral equation of the second kind (1.15) with the positive-definite operator I + R. Iterative methods, projection methods, collocation and many other methods are available for solving (1.15); convergence of these methods has been proved and effective error estimates for the numerical methods are known [Kantorovich and Akilov (1980)]. Much effort was spent on effective numerical inversion of the Toeplitz matrices which one obtains if one discretizes (1.15) using equidistant collocation points [Kailath (1974)]. However, if the noise is colored, the basic equation becomes

∫_0^T R(t − τ)h(τ) dτ = f(t),  0 ≤ t ≤ T.  (1.16)

This is a Fredholm equation of the first kind. A typical example is equation (1.8). As we have seen in (1.9), the solution to (1.8) is, in general, a distribution. The theory for the numerical treatment of such equations was given by the author [Ramm (1985)] and is presented in Chapter 3 of this book.

In particular, the following question of singular perturbation theory is of interest. Suppose that the equation

εh_ε + Rh_ε = f,  ε > 0,  (1.17)

is given. This equation corresponds to the case when the intensity of the white-noise component of the noise is ε. What is the behavior of h_ε as ε → +0? We will answer this question in Chapter 5.
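A minimal numerical sketch of this question (assuming NumPy; the grid size, the values of ε, and the test function f(x) = exp(2x) are arbitrary choices) solves a trapezoidal-rule discretization of (1.17) with the kernel exp(−|x − y|) of (1.8) on (−1, 1) for decreasing ε. The interior values of h_ε approach the smooth part (−f′′ + f)/2 of (1.9), while the endpoint values grow as ε decreases, reflecting the delta functions in (1.9):

    import numpy as np

    n = 401
    x = np.linspace(-1.0, 1.0, n)
    w = np.full(n, 2.0 / (n - 1)); w[0] = w[-1] = 1.0 / (n - 1)    # trapezoidal weights
    K = np.exp(-np.abs(x[:, None] - x[None, :])) * w[None, :]      # Nystrom matrix for R
    f = np.exp(2.0 * x)

    for eps in [1e-1, 1e-2, 1e-3]:
        h = np.linalg.solve(eps * np.eye(n) + K, f)
        # midpoint value vs. the smooth part (-f'' + f)/2 = -1.5*exp(2x) at x = 0,
        # followed by the (growing) value at the right endpoint x = 1
        print(eps, h[n // 2], -1.5 * np.exp(0.0), h[-1])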

This book is intended for a broad audience: for mathematicians, for engineers interested in signal and image processing, for geophysicists, etc. Therefore the author has separated the formulation of the results, their discussion and examples from the proofs. In order to understand the proofs, one should be familiar with some facts and ideas of functional analysis. Since the author wants to give a relatively self-contained presentation, the necessary facts from functional analysis are presented in Chapter 8.

The book presents the theory developed by the author. Many aspects of

estimation theory are not discussed in this book. The book has practically

no intersection with works of other authors on random fields estimation

theory.


Chapter 2

Formulation of Basic Results

2.1 Statement of the problem

Let D ⊂ R^r be a bounded domain with a sufficiently smooth boundary Γ. The requirement that D be bounded could be omitted; it is imposed for simplicity. The reader will see that if D is not bounded, then the general line of the arguments remains the same. The additional difficulties which appear in the case when D is unbounded are of a technical nature: one needs to establish existence and uniqueness of the solution to a certain transmission problem with transmission conditions on Γ. The requirement of smoothness of Γ is also of a technical nature: the needed smoothness should guarantee existence and uniqueness of the solution to the above transmission problem.

Let L be an elliptic selfadjoint operator of order s in H = L^2(R^r). Let Λ, Φ(x, y, λ), dρ(λ) be the spectrum, spectral kernel and spectral measure of L, respectively. A function F(L) is defined as an operator on H with the kernel

F(L)(x, y) = ∫_Λ F(λ)Φ(x, y, λ) dρ(λ)  (2.1)

and dom F(L) = {f : f ∈ H, ∫_{−∞}^{∞} |F(λ)|^2 d(E_λ f, f) < ∞}, where

(E_λ f, f) = ∫_{−∞}^{λ} ( ∫∫ Φ(x, y, µ)f(y)f^*(x) dx dy ) dρ(µ),  ∫ := ∫_{R^r}.  (2.2)

Definition 2.1 Let R denote the class of kernels of positive rational functions of L, where L runs through the set of all selfadjoint elliptic operators in H = L^2(R^r). In other words, R(x, y) ∈ R if and only if

R(x, y) = ∫_Λ P(λ)Q^{−1}(λ)Φ(x, y, λ) dρ(λ),  (2.3)

where P(λ) > 0 and Q(λ) > 0 for all λ ∈ Λ, and Λ, Φ, dρ correspond to an elliptic selfadjoint operator L in H = L^2(R^r).

Let

p = deg P(λ),  q = deg Q(λ),  s = ord L,  (2.4)

where deg P(λ) stands for the degree of the polynomial P(λ), and ord L stands for the order of the differential operator L.

Consider an operator given by the differential expression

Lu := ∑_{|j|≤s} a_j(x)∂^j u,  (2.5)

where j = (j_1, j_2, ..., j_r) is a multiindex, ∂^j u = ∂^{j_1}_{x_1}∂^{j_2}_{x_2} ··· ∂^{j_r}_{x_r} u, |j| = j_1 + j_2 + ··· + j_r, and the j_m ≥ 0 are integers. The expression (2.5) is called elliptic if, for any real vector t ∈ R^r, the equation

∑_{|j|=s} a_j(x)t^j = 0

implies that t = 0. The expression

L^+ u := ∑_{|j|≤s} (−1)^{|j|} ∂^j(a_j^*(x)u)  (2.6)

is called the formal adjoint of L. The star in (2.6) stands for complex conjugation. One says that L is formally selfadjoint if L = L^+. If L is formally selfadjoint, then L is symmetric on C_0^∞(R^r), that is, (Lφ, ψ) = (φ, Lψ) for all φ, ψ ∈ C_0^∞(R^r), where (φ, ψ) is the inner product in H = L^2(R^r). Sufficient conditions on the a_j(x) can be given for a formally selfadjoint differential expression to define a selfadjoint operator in H in the following way. Define a symmetric operator L_0 with the domain C_0^∞(R^r) by the formula L_0 u = Lu for u ∈ C_0^∞(R^r). Under suitable conditions on the a_j(x) one can prove that L_0 is essentially selfadjoint, that is, its closure is selfadjoint (see Chapter 8). In particular, this is the case if a_j = a_j^* = const.
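For example, for L = −Δ in R^2 (so s = 2), the sum in the ellipticity condition is ∑_{|j|=2} a_j t^j = −(t_1^2 + t_2^2), which vanishes for real t only if t = 0, so −Δ is elliptic; by contrast, the expression ∂^2 u/∂x_1∂x_2 is not elliptic, since its principal symbol t_1 t_2 vanishes at t = (1, 0) ≠ 0.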

In what follows we assume that R(x, y) ∈ R. Some generalizations will be considered later. The kernel R(x, y) is the covariance function (1.3) of the random field U(x) = s(x) + n(x) observed in a bounded domain D ⊂ R^r.


A linear estimation problem can be formulated as follows: find a linear estimate

\hat{U} := LU := ∫_D h(x, y)U(y) dy  (2.7)

such that

ε := \overline{|\hat{U} − As|^2} = min.  (2.8)

The kernel h(x, y) in (2.7) is a distribution, so that, by L. Schwartz's theorem about kernels, the estimate (2.7) is the most general linear estimate. The operator A in (2.8) is assumed to be known. It is an arbitrary operator, not necessarily a linear one. In the case when AU = U, that is, A = I, where I is the identity operator, the estimation problem (2.8) is called the filtering problem. From (2.8) and (2.7) one obtains

ε = \overline{ ∫_D h(x, y)U(y) dy · ∫_D h^*(x, z)U^*(z) dz } − 2Re \overline{ ∫_D h(x, z)U(z) dz · (As)^*(x) } + \overline{|As(x)|^2}

  = ∫_D ∫_D h(x, y)h^*(x, z)R(z, y) dz dy − 2Re ∫_D h^*(x, z)f(z, x) dz + \overline{|As(x)|^2} = min.  (2.9)

Here

f(y, x) := \overline{U^*(y)(As)(x)} = f^*(x, y),  (2.10)

where the bar stands for the mean value and the star stands for complex conjugation.

By the standard procedure one finds that a necessary condition for the minimum in (2.9) is:

∫_D R(z, y)h(x, y) dy = f(z, x),  x, z ∈ \bar{D} := D ∪ Γ.  (2.11)

In order to derive (2.11) from (2.9), one takes h + αη in place of h in (2.9). Here α is a small number and η ∈ C_0^∞(D). The condition ε(h) ≤ ε(h + αη) implies ∂ε/∂α|_{α=0} = 0. This implies (2.11). Since h is a distribution, the left-hand side of (2.11) makes sense only if the kernel R(z, y) belongs to the space of test functions on which the distribution h is defined. We will discuss this point later in detail. In (2.11) the variable x enters as a parameter. Therefore, the equation

Rh := ∫_D R(x, y)h(y) dy = f(x),  x ∈ D,  (2.12)

is basic for estimation theory. We have suppressed the dependence on x in (2.11) and have written x in place of z in (2.12).

From this derivation it is clear that the operator A does not influence the theory in an essential way: if one changes A, then f is changed, but the kernel of the basic equation (2.12) remains the same. If Au = ∂u/∂x_j, one has the problem of estimating the derivative of u. If Au = u(x + x_0), where x_0 is a given point such that x + x_0 ∉ D, then one has the extrapolation problem. Analytically these problems reduce to solving equation (2.12). If no assumptions are made about R(x, y) except that R(x, y) is a covariance function, then one cannot develop an analytical theory for equation (2.12). Such a theory will be developed below under the basic assumption R(x, y) ∈ R.

Let us show that the class R of kernels, that is, the class of random fields that we have introduced, is a natural one. To see this, recall that in the one-dimensional case, studied analytically in the literature, the covariance functions are of the form

R(x, y) = R(x − y),  x, y ∈ R^1,  R(λ) := ∫_{−∞}^{∞} R(x) exp(−iλx) dx = P(λ)Q^{−1}(λ),

where P(λ) and Q(λ) are positive polynomials [Kai]. This case is a very particular case of the kernels in the class R. Indeed, take

r = 1,  L = −i d/dx,  Λ = (−∞, ∞),  dρ(λ) = dλ,  Φ(x, y, λ) = (2π)^{−1} exp{iλ(x − y)}.

Then formula (2.3) gives the above class of convolution covariance functions with rational Fourier transforms. If p = q, where p and q are defined in (2.4), then the basic equation (2.12) can be written as

Rh := σ^2 h(x) + ∫_D R_1(x, y)h(y) dy = f(x),  x ∈ D,  σ^2 > 0,  (2.13)


where

P(λ)Q^{−1}(λ) = σ^2 + P_1(λ)Q^{−1}(λ),  p_1 := deg P_1 < q,  (2.14)

and σ^2 > 0 is interpreted as the variance of the white-noise component of the observed signal U(x). If p < q, then the noise in U(x) is colored: it does not contain a white-noise component.

Mathematically, equation (2.13) is very simple. The operator R in (2.13) is of Fredholm type, selfadjoint and positive definite in H, R ≥ σ^2 I, where I is the identity operator, and A ≥ B means (Au, u) ≥ (Bu, u) for all u ∈ H. Therefore, if p = q, then equation (2.12) reduces to (2.13) and has a unique solution in H. This solution can be computed numerically without difficulties. There are many numerical methods which are applicable to equation (2.13). In particular (see Section 2.3.2), an iterative process can be constructed for solving (2.13) which converges as a geometric series; a projection method can be constructed for solving (2.13) which converges and is computationally stable; and one can solve (2.13) by collocation methods. However, the important and practically interesting question is the following one: what happens to the solution h_σ of (2.13) as σ → 0? This is a singular perturbation question: for σ > 0 the unique solution to equation (2.13) belongs to L^2(D), while for σ = 0 the unique solution to (2.13) of minimal order of singularity is a distribution. What is the asymptotics of h_σ as σ → 0? As we will show, the answer to this question is based on analytical results concerning the solution to (2.13).
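For illustration, here is a minimal sketch of one geometrically convergent iteration for (2.13): a plain Richardson iteration applied to a trapezoidal-rule discretization. The kernel exp(−|x − y|), the value σ^2 = 0.5 and f(x) = x^2 are arbitrary test choices, and this is not the specific scheme of Section 2.3.2.

    import numpy as np

    n, sigma2 = 400, 0.5
    x = np.linspace(-1.0, 1.0, n)
    w = np.full(n, 2.0 / (n - 1)); w[0] = w[-1] = 1.0 / (n - 1)    # trapezoidal weights
    K = np.exp(-np.abs(x[:, None] - x[None, :])) * w[None, :]      # discretization of R1
    f = x**2

    # The spectrum of sigma2*I + K lies in [sigma2, sigma2 + ||K||] (K is similar to a
    # symmetric nonnegative matrix), so with tau = 2/(2*sigma2 + ||K||) the iteration
    # error contracts geometrically.
    M = sigma2 + np.linalg.norm(K, 2)
    tau = 2.0 / (sigma2 + M)
    h = np.zeros(n)
    for k in range(1, 61):
        h += tau * (f - sigma2 * h - K @ h)
        if k % 10 == 0:
            print(k, np.max(np.abs(sigma2 * h + K @ h - f)))   # residual decreases geometrically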

The basic questions we would like to answer are:

1) In what space of functions or distributions should one look for the solution to (2.12)?

2) When does a solution to (2.12) solve the estimation problem (2.8)?

Note that (2.12) is only a necessary condition for h(x, y) to solve (2.8). We will show that there is a solution to (2.12) which solves (2.8), and this solution to (2.12) is unique. The fact that the estimation problem (2.8) has a unique solution follows from a Hilbert space interpretation of problem (2.8) as the problem of finding the distance from the element (As)(x) to the subspace spanned by the values of the random field U(y), y ∈ D. Since there exists a unique element of this subspace at which the distance is attained, problem (2.8) has a solution and the solution is unique. It was mentioned in the Introduction (see (1.9)) that equation (2.12) may have no solutions in L^1(D); rather, its solution is a distribution. There can be


several solutions to (2.12) in spaces of distributions, but only one of them

solves the estimation problem (2.8). This solution is characterized as the

solution to (2.12) of minimal order of singularity.

3) What is the order of singularity and the singular support of the solution to (2.12) which solves (2.8)?

4) Is this solution stable under small perturbations of the data, that is, under small perturbations of f(x) and R(x, y)? What is the appropriate notion of smallness in this case? What are the stability estimates for h?

5) How does one compute the solution analytically?

6) How does one compute the solution numerically?

7) What are the properties of the operator R : L^2(D) → L^2(D) in (2.12)? In particular, what is the asymptotics of its eigenvalues λ_j(D) as j → +∞? What is the asymptotics of λ_1(D) as D → R^r, that is, as D expands uniformly in all directions?

These questions are of interest in applications. Note that if D is finite, then the operator R is a selfadjoint positive compact operator in L^2(D), its spectrum is discrete, λ_1 > λ_2 ≥ ··· > 0, and the first eigenvalue is nondegenerate by the Krein-Rutman theorem. However, if D = R^r, then the spectrum of R may be continuous; e.g., this is the case when R(x, y) = R(x − y), R(λ) = P(λ)Q^{−1}(λ). Therefore it is of interest to find λ_{1∞} := lim λ_1(D) as D → R^r. The quantity λ_{1∞} is used in some statistical problems.

8) What is the asymptotics of the solution to (2.13) as σ → 0?

2.2 Formulation of the results (multidimensional case)

2.2.1 Basic results

We assume throughout that R(x, y) ∈ R and that f(x) is smooth; more precisely, that f ∈ H^α, where H^α = H^α(D) is the Sobolev space, α := (q − p)s/2, and s, q, p are the same as in (2.4). We also assume that the coefficients a_j(x) of L (see (2.5)) are sufficiently smooth, say a_s(x) ∈ C(R^r), a_j(x) ∈ L^{r/(s−|j|)}_{loc} if s − |j| < r/2, a_j ∈ L^2_{loc} if s − |j| > r/2, and a_j ∈ L^{2+ε}_{loc}, ε > 0, if s − |j| = r/2 (see [Hormander (1983-85), Ch. 17]).

If q ≤ p, then the problem of finding the solution of minimal order of singularity (the mos solution) is simple: such a solution does exist, is unique, and belongs to H^{m+2|α|} if f ∈ H^m. This follows from the usual theory of elliptic boundary value problems [Berezanskij (1968)], since the operator P(L)Q^{−1}(L) is an elliptic integro-differential operator of order 2|α| if q ≤ p. The solution satisfies the elliptic estimate:

‖h‖_{H^{m+2|α|}} ≤ c ‖f‖_{H^m},  q ≤ p,

where c depends on L but does not depend on f.

If q > p, then the problem of finding the mos solution of (2.12) is more interesting and difficult, because the order of singularity of h is, in general, positive: ord h = α. The basic result we obtain is:

The mapping R : H^{−α} → H^α is a linear isomorphism between the spaces H^{−α} and H^α. The singular support of h is ∂D = Γ, provided that f is smooth.

If h_1 is a solution to equation (2.12) and ord h_1 > α, then ε = ∞, where ε is defined in (2.8). Therefore, if h_1 solves (2.12) and ord h_1 > α, then h_1 does not solve the estimation problem (2.8).

The unique solution to (2.8) is the unique mos solution to (2.12). We give analytical formulas for the mos solution to (2.12). This solution is stable with respect to small perturbations of the data. We also give a stable numerical procedure for computing this solution.

In this section we formulate the basic results.

Theorem 2.1 If R(x, y) ∈ R, then the operator R in (2.12) is an isomorphism between the spaces H^{−α} and H^α. The solution to (2.12) of minimal order of singularity, ord h ≤ α, can be calculated by the formula:

h(x) = Q(L)G,  (2.15)

where

G(x) = g(x) + v(x) in D,   G(x) = u(x) in Ω := R^r \ D,  (2.16)

g(x) ∈ H^{s(p+q)/2} is an arbitrary fixed solution to the equation

P(L)g = f in D,  (2.17)

and the functions u(x) and v(x) are the unique solution to the following problem (2.18)-(2.20):

Q(L)u = 0 in Ω,  u(∞) = 0,  (2.18)

P(L)v = 0 in D,  (2.19)

∂_N^j u = ∂_N^j (v + g) on Γ,  0 ≤ j ≤ s(p + q)/2 − 1.  (2.20)

By u(∞) = 0 we mean lim u(x) = 0 as |x| → ∞.

Corollary 2.1 If f ∈ H^{2β}, β ≥ α, then

sing supp h = Γ.  (2.21)

Corollary 2.2 If P(λ) = 1, then the transmission problem (2.18)-(2.20) reduces to the Dirichlet problem in Ω:

Q(L)u = 0 in Ω,  u(∞) = 0,  (2.22)

∂_N^j u = ∂_N^j f on Γ,  0 ≤ j ≤ sq/2 − 1,  (2.23)

and (2.15) takes the form

h = Q(L)F,   F = f in D,   F = u in Ω.  (2.24)

Corollary 2.1 follows immediately from formulas (2.15) and (2.16) since

g(x)+v(x) and u(x) are smooth inside D and Ω respectively. Corollary 2.2

follows immediately from Theorem 2.1: if P (λ) = 1 then g = f , v = 0, and

p = 0.

Let ω(λ) ≥ 0, ω(λ) ∈ C(R^1), ω(∞) = 0,

\bar{ω} := max_{λ∈Λ} ω(λ),  (2.25)

R(x, y) = ∫_Λ ω(λ)Φ(x, y, λ) dρ(λ),  (2.26)

and let λ_j = λ_j(D) be the eigenvalues of the operator R : L^2(D) → L^2(D) with kernel (2.26), arranged so that

λ_1 ≥ λ_2 ≥ λ_3 ≥ ··· > 0.  (2.27)

Theorem 2.2 If D ⊂ D′, then λ_j ≤ λ′_j, where λ′_j = λ_j(D′). If

sup_{x∈R^r} ∫ |R(x, y)| dy := A < ∞,  (2.28)

then

λ_{1∞} = \bar{ω},  (2.29)

where

lim_{D→R^r} λ_1(D) := λ_{1∞},  (2.30)

and \bar{ω} is defined in (2.25).

Theorem 2.3 If ω(λ) = |λ|^{−a}(1 + o(1)) as |λ| → ∞, and a > 0, then the asymptotics of the eigenvalues of the operator R with kernel (2.26) is given by the formula:

λ_j ∼ c j^{−as/r} as j → ∞,  c = const > 0,  (2.31)

where c = γ^{as/r} and

γ := (2π)^{−r} ∫_D η(x) dx,  (2.32)

with

η(x) := meas{t : t ∈ R^r, ∑_{|α|=|β|=s/2} a_{αβ}(x) t^{α+β} ≤ 1}.  (2.33)

Here the form a_{αβ}(x) generates the principal part of the selfadjoint elliptic operator L:

Lu = ∑_{|α|=|β|=s/2} ∂^α(a_{αβ}(x) ∂^β u) + L_1 u,  ord L_1 < s.

Corollary 2.3 If ω(λ) = P(λ)Q^{−1}(λ), then a = q − p, where q = deg Q, p = deg P, and λ_n ∼ c n^{−(q−p)s/r}, where the λ_n are the eigenvalues of the operator in equation (2.12).

This Corollary follows immediately from Theorem 2.3. Theorems 2.1-2.3 answer questions 1)-5) and 7) in Section 2.1. Answers to questions 6) and 8) will be given in Chapter 3. The proof of Theorem 2.3 is given in Section 8.3.2.10.
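Corollary 2.3 can be illustrated numerically. A minimal sketch (assuming NumPy; for the kernel exp(−|x − y|) on D = (−1, 1) one has r = 1, s = 1, p = 0, q = 2, so the predicted decay is λ_n ∼ c n^{−2}):

    import numpy as np

    n = 1000
    x = np.linspace(-1.0, 1.0, n)
    w = np.full(n, 2.0 / (n - 1)); w[0] = w[-1] = 1.0 / (n - 1)     # trapezoidal weights
    sw = np.sqrt(w)
    # symmetrized Nystrom matrix: its eigenvalues approximate those of R on L^2(-1, 1)
    A = sw[:, None] * np.exp(-np.abs(x[:, None] - x[None, :])) * sw[None, :]
    lam = np.sort(np.linalg.eigvalsh(A))[::-1]
    for j in [5, 10, 20, 40]:
        print(j, lam[j - 1], j**2 * lam[j - 1])    # j^2 * lambda_j is roughly constant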

2.2.2 Generalizations

First, let us consider a generalization of the class R of kernels to the case when there are several commuting differential operators. Let L_1, ..., L_m be a system of commuting selfadjoint differential operators in L^2(R^r). There exist a spectral measure dµ(ξ) and a spectral kernel Φ(x, y, ξ), ξ = (ξ_1, ..., ξ_m), such that a function F(L_1, ..., L_m) is given by the formula

F(L_1, ..., L_m) = ∫_M F(ξ)Φ(ξ) dµ(ξ),  (2.34)

where Φ(ξ) is the operator with kernel Φ(x, y, ξ). The domain of definition of the operator F(L_1, ..., L_m) is the set of all functions u ∈ L^2(R^r) for which ∫_M |F(ξ)|^2 (Φ(ξ)u, u) dµ < ∞; here M is the support of the spectral measure dµ, and the parentheses denote the inner product in L^2(R^r).

For example, let m = r, L_j = −i ∂/∂x_j. Then ξ = (ξ_1, ..., ξ_r), dµ = dξ_1 ··· dξ_r, Φ(x, y, ξ) = (2π)^{−r} exp{iξ · (x − y)}, where the dot denotes the inner product in R^r.

If F(ξ) = P(ξ)Q^{−1}(ξ), where P(ξ) and Q(ξ) are positive polynomials and the operators P(L) := P(L_1, ..., L_r) and Q(L) := Q(L_1, ..., L_r) are elliptic of orders m and n respectively, m < n, then theorems analogous to Theorems 2.1-2.2 hold with sp = m and sq = n. Theorem 2.3 also has an analogue, in which as = n − m in formula (2.31).

Another generalization of the class R of kernels is the following one. Let Q(x, ∂) and P(x, ∂) be elliptic differential operators and

QR = Pδ(x − y) in R^r.  (2.35)

Note that the kernels R ∈ R satisfy equation (2.35) with Q = Q(L), P = P(L). Let ord Q = n, ord P = m, n > m. Assume that the transmission problem (2.18)-(2.20), with Q(x, ∂) and P(x, ∂) in place of Q(L) and P(L) respectively, and with ps = m, qs = n, has a unique solution in H^{(n+m)/2}. Then Theorem 2.1 holds with α = (n − m)/2.

The transmission problem (2.18)-(2.20) with Q(x, ∂) and P(x, ∂) in place of Q(L) and P(L) is uniquely solvable provided that, for example, Q(x, ∂) and P(x, ∂) are elliptic positive definite operators. For more details see Chapter 4 and Appendices A and B.

2.3 Formulation of the results (one-dimensional case)

In this section we formulate the results in the one-dimensional case, i.e., r = 1. Although the corresponding estimation problem is a problem for random processes (and not random fields), the method and the results are the same as in the multidimensional case, and because of the interest of the results in applications we formulate the results in the one-dimensional case separately.

2.3.1 Basic results for the scalar equation

Let r = 1, D = (t − T, t), R(x, y) ∈ R. The basic equation (2.12) takes the form

∫_{t−T}^{t} R(x, y)h(y) dy = f(x),  t − T ≤ x ≤ t.  (2.36)

Assume that f ∈ H^α, α = s(q − p)/2.

Theorem 2.4 The solution to equation (2.36) in H^{−α} exists, is unique, and can be found by the formula

h = Q(L)G,  (2.37)

where

G(x) = ∑_{j=1}^{sq/2} b_j^− ψ_j^−(x) for x ≤ t − T;   G(x) = g(x) for t − T ≤ x ≤ t;   G(x) = ∑_{j=1}^{sq/2} b_j^+ ψ_j^+(x) for x ≥ t.  (2.38)

Here the b_j^± are constants, and the functions ψ_j^±(x), 1 ≤ j ≤ sq/2, form a fundamental system of solutions to the equation

Q(L)ψ = 0,  ψ_j^−(−∞) = 0,  ψ_j^+(+∞) = 0.  (2.39)

The function g(x) is defined by the formula

g(x) = g_0(x) + ∑_{j=1}^{sp} c_j φ_j(x),  (2.40)

where g_0(x) is an arbitrary fixed solution to the equation

P(L)g = f,  t − T ≤ x ≤ t,  (2.41)

the functions φ_j, 1 ≤ j ≤ sp, form a fundamental system of solutions to the equation

P(L)φ = 0,  (2.42)

and the c_j, 1 ≤ j ≤ sp, are constants. The constants b_j^±, 1 ≤ j ≤ sq/2, and c_j, 1 ≤ j ≤ sp, are uniquely determined from the linear system:

D^k ( ∑_{j=1}^{sq/2} b_j^− ψ_j^− ) |_{x=t−T} = D^k ( g_0 + ∑_{j=1}^{sp} c_j φ_j ) |_{x=t−T},  (2.43)

D^k ( ∑_{j=1}^{sq/2} b_j^+ ψ_j^+ ) |_{x=t} = D^k ( g_0 + ∑_{j=1}^{sp} c_j φ_j ) |_{x=t},  (2.44)

where D = d/dx, 0 ≤ k ≤ s(p + q)/2 − 1. The map R^{−1} : f → h, where h is given by formula (2.37), is an isomorphism of the space H^α onto the space H^{−α}.

Remark 2.1 This theorem is a complete analogue of Theorem 2.1. The role of L is now played by an ordinary differential selfadjoint operator L in L^2(R^1). An ordinary differential operator is elliptic if and only if the coefficient in front of its senior (that is, highest order) derivative does not vanish:

Lu = ∑_{j=0}^{s} a_j(x)D^j u,  a_s(x) ≠ 0.  (2.45)

One can assume that a_s(x) > 0, x ∈ R^1, and the condition of uniform ellipticity is assumed, that is,

0 < c_1 ≤ a_s(x) ≤ c_2,  (2.46)

where c_1 and c_2 are positive constants which do not depend on x.

Corollary 2.4 If f ∈ H^α, then sing supp h = ∂D, where ∂D consists of the two points t and t − T.

Corollary 2.4 is a complete analogue of Corollary 2.1.

Corollary 2.5 Let Q(λ) = a_+(λ)a_−(λ), where a_±(λ) are polynomials of degree q/2, the zeros of the polynomial a_+(λ) lie in the upper half-plane Im λ > 0, and the zeros of a_−(λ) lie in the lower half-plane Im λ < 0. Since Q(λ) > 0 for −∞ < λ < ∞, the zeros of a_+(λ) are complex conjugates of the corresponding zeros of a_−(λ). Assume that P(λ) = 1. Then formula (2.37) can be written as

h(x) = a_+(L)[θ(x − t + T)a_−(L)f(x)] − a_−(L)[θ(x − t)a_+(L)f(x)],  (2.47)

where θ(x) = 1 for x ≥ 0 and θ(x) = 0 for x < 0, and the differentiation in (2.47) is understood in the sense of distributions.

This Corollary is an analogue of Corollary 2.2.

Remark 2.2 Formula (2.47) is convenient for practical calculations. Let us give a simple example of its application. Let L = −i∂, r = 1, P(λ) = 1, Q(λ) = (λ^2 + 1)/2, R(x, y) = exp(−|x − y|), Φ(x, y, λ) = (2π)^{−1} exp{iλ(x − y)}, dρ(λ) = dλ, t = 1, t − T = −1. Equation (2.36) becomes

∫_{−1}^{1} exp(−|x − y|)h(y) dy = f(x),  −1 ≤ x ≤ 1,  (2.48)

and a_+(λ) = (λ − i)/√2, a_−(λ) = (λ + i)/√2. Formula (2.47) yields:

h(x) = (1/2)(−i∂ − i)[θ(x + 1)(−i∂ + i)f(x)] − (1/2)(−i∂ + i)[θ(x − 1)(−i∂ − i)f(x)]
     = −(1/2)(∂ + 1)[θ(x + 1)(∂ − 1)f] + (1/2)(∂ − 1)[θ(x − 1)(∂ + 1)f]
     = −(1/2)θ(x + 1)(∂^2 − 1)f − (1/2)δ(x + 1)(∂ − 1)f + (1/2)θ(x − 1)(∂^2 − 1)f + (1/2)δ(x − 1)(∂ + 1)f
     = (−f′′ + f)/2 + [f′(1) + f(1)]/2 · δ(x − 1) + [−f′(−1) + f(−1)]/2 · δ(x + 1).  (2.49)

Here we have used the well-known formula θ′(x − a) = δ(x − a), where δ(x) is the delta-function. Formula (2.49) is the same as formula (1.9). The term (−f′′ + f)/2 in (2.49) vanishes outside the interval [−1, 1] by definition.

Remark 2.3 If t = +∞ and t − T = 0, so that equation (2.36) takes the form of a Wiener-Hopf equation of the first kind,

∫_0^∞ R(x, y)h(y) dy = f(x),  x ≥ 0,  (2.50)

then formula (2.47) reduces to

h(x) = a_+(L)[θ(x)a_−(L)f(x)].  (2.51)

If L = −i∂, formula (2.51) can be obtained by the well-known factorization method.

Example 2.1 Consider the equation

∫_0^∞ exp(−|x − y|)h(y) dy = f(x),  x ≥ 0.  (2.52)

By formula (2.51) one obtains

h(x) = (−f′′ + f)/2 + [−f′(0) + f(0)]/2 · δ(x),  (2.53)

if one uses calculations similar to those given in formula (2.49).
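For a concrete decaying f, formula (2.53) can be checked symbolically; a minimal SymPy sketch with the arbitrary test choice f(x) = exp(−2x):

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)
    f = sp.exp(-2*x)                                        # arbitrary decaying test function
    h_smooth = (-sp.diff(f, x, 2) + f) / 2                  # smooth part of (2.53)
    hy = h_smooth.subs(x, y)

    I1 = sp.integrate(sp.exp(y - x) * hy, (y, 0, x))        # region y < x
    I2 = sp.integrate(sp.exp(x - y) * hy, (y, x, sp.oo))    # region y > x
    delta_term = sp.exp(-x) * (-sp.diff(f, x).subs(x, 0) + f.subs(x, 0)) / 2

    print(sp.simplify(I1 + I2 + delta_term - f))            # expected output: 0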

2.3.2 Vector equations

In both cases, r = 1 and r > 1, it is of interest to consider estimation problems for vector random processes and vector random fields. For vector random processes the basic equation is (2.36) with a positive kernel R(x, y), in the sense that (Rh, h) > 0 for h ≢ 0, given by formula (2.3) in which R(λ) = P(λ)Q^{−1}(λ) is a matrix:

R(λ) = (R_{ij}(λ)),  R_{ij}(λ) := P_{ij}(λ)Q_{ij}^{−1}(λ),  1 ≤ i, j ≤ d,  (2.54)

where P_{ij}(λ) and Q_{ij}(λ) are relatively prime positive polynomials for each fixed pair of indices (i, j), 1 ≤ i, j ≤ d, and d is the number of components of the random processes U, s and n. Let Q(λ) be the polynomial of minimal degree, deg Q(λ) = q, for which every Q_{ij}(λ), 1 ≤ i, j ≤ d, is a divisor, and let A_{ij}(λ) := R_{ij}(λ)Q(λ). Denote by E the unit d × d matrix and by A(L) the matrix differential operator with entries A_{ij}(L). Assume that

det(A_{ij}(λ)) > 0,  ∀λ ∈ R^1,  (2.55)

det B_m(x) ≠ 0,  ∀x ∈ R^1,  (2.56)

where m := s max_{1≤i,j≤d} deg A_{ij}(λ), s = ord L, and

A(L) := ∑_{j=0}^{m} B_j(x)∂^j,  ∂ = d/dx.  (2.57)


Let S(x, y) denote the matrix kernel

S(x, y) := \delta_{ij} \int_\Lambda Q^{-1}(\lambda)\, \Phi(x, y, \lambda)\, d\rho(\lambda),  \delta_{ij} = 1 if i = j, 0 if i ≠ j,   (2.58)

of the diagonal operator Q^{-1}(L)E. The operator Q(L)E is a diagonal matrix differential operator of order n = sq.

Let us write the basic equation

\int_{t-T}^{t} R(x, y)\, h(y)\, dy = f(x),  t − T ≤ x ≤ t,   (2.59)

where R(x, y) is the d × d matrix with the spectral density (2.54), h and f are vector functions with d components, f ∈ H^α, α = (n − m)/2, and H^α denotes the space of vector functions (f_1, . . . , f_d) such that ‖f‖_{H^α} := (\sum_{j=1}^{d} ‖f_j‖²_{H^α})^{1/2}.

Remark 2.4  In the vector estimation problem h and f are d × d matrices, but for simplicity and without loss of generality we discuss the case when h is a vector. The matrix equation (2.59) is equivalent to d vector equations.

Equation (2.59) can be written as

A(L)v = f,   (2.60)

v := Q^{-1}(L) E h = \int_D S(x, y)\, h(y)\, dy.   (2.61)

Let Φ_j, 1 ≤ j ≤ m, be a fundamental system of matrix solutions to the equation

A(L)\phi = 0,   (2.62)

and let Ψ_j^{±}, 1 ≤ j ≤ n/2, be the fundamental system of matrix solutions to the equation

Q(L)E\Psi = 0,   (2.63)

such that

\Psi_j^{+}(+\infty) = 0,  \Psi_j^{-}(-\infty) = 0.   (2.64)

The choice of the fundamental system of matrix solutions to (2.63) with properties (2.64) is possible if L is an elliptic ordinary differential operator,


that is, a_s(x) ≠ 0, x ∈ R¹ (see Remark 2.1 and [N], p. 118). Let us write equations (2.60) and (2.61) as

\int_D S(x, y)\, h(y)\, dy = g_0(x) + \sum_{j=1}^{m} \Phi_j(x)\, c_j,   (2.65)

where g_0(x) is an arbitrary fixed solution to the equation (2.60), and c_j, 1 ≤ j ≤ m, are arbitrary linearly independent constant vectors.

Theorem 2.5  If R ∈ R with R given by (2.54), the assumptions (2.46), (2.55), (2.56) hold, and f ∈ H^α, α = (n − m)/2, then the matrix equation (2.59) has a solution in H^{-α}, this solution is unique, and it can be found by the formula

h = Q(L) E G,   (2.66)

where the vector function G is given by

G(x) = \begin{cases} \sum_{j=1}^{n/2} \Psi_j^{-} b_j^{-}, & x ≤ t − T, \\ g_0(x) + \sum_{j=1}^{m} \Phi_j c_j, & t − T ≤ x ≤ t, \\ \sum_{j=1}^{n/2} \Psi_j^{+} b_j^{+}, & x ≥ t. \end{cases}   (2.67)

Here the functions Ψ_j^{±}, Φ_j and g_0(x) were defined above, and the constant vectors b_j^{±}, 1 ≤ j ≤ n/2, and c_j, 1 ≤ j ≤ m, can be uniquely determined from the linear system

D^k \left[ \sum_{j=1}^{n/2} \Psi_j^{-}(x)\, b_j^{-} \right]_{x = t-T} = D^k \left[ g_0(x) + \sum_{j=1}^{m} \Phi_j(x)\, c_j \right]_{x = t-T},   (2.68)

D^k \left[ \sum_{j=1}^{n/2} \Psi_j^{+}(x)\, b_j^{+} \right]_{x = t} = D^k \left[ g_0(x) + \sum_{j=1}^{m} \Phi_j(x)\, c_j \right]_{x = t},   (2.69)

where 0 ≤ k ≤ (n + m)/2 − 1.

The map R^{-1} : f → h, given by formulas (2.66)-(2.69), is an isomorphism between the spaces H^α and H^{-α}, α = (n − m)/2.

Remark 2.5  The conditions (2.68), (2.69) guarantee that the function G(x), defined by formula (2.67), is maximally smooth, so that the order of singularity of G and, therefore, of h (see formula (2.66)) is minimal.


2.4 Examples of kernels of class R and solutions to the basic equation

1. If r = 1, L = −i∂, ∂ = d/dx, Φ(x, y, λ) = (2π)^{-1} exp{iλ(x − y)}, dρ = dλ, then R(x, y) ∈ R if

R(x, y) = (2\pi)^{-1} \int_{-\infty}^{\infty} R(\lambda) \exp\{i\lambda(x - y)\}\, d\lambda,   (2.70)

where

R(\lambda) = P(\lambda) Q^{-1}(\lambda)   (2.71)

and P(λ), Q(λ) are positive polynomials.

2. If r > 1, L = (L_1, . . . , L_r), L_r = −i∂_r, ∂_r = ∂/∂x_r, Φ(x, y, λ) = (2π)^{-r} exp{iλ · (x − y)}, λ = (λ_1, . . . , λ_r), dρ(λ) = dλ = dλ_1 . . . dλ_r, then

R(x, y) = (2\pi)^{-r} \int_{R^r} R(\lambda) \exp\{i\lambda \cdot (x - y)\}\, d\lambda,   (2.72)

where R(λ) is given by (2.71) and

P(\lambda) = P(\lambda_1, . . . , \lambda_r) > 0,  Q(\lambda) = Q(\lambda_1, . . . , \lambda_r) > 0   (2.73)

are polynomials. For the operators P(L) and Q(L) to be elliptic of orders p and q respectively, one has to assume that

0 < c_1 ≤ P(\lambda)|\lambda|^{-p} ≤ c_2,  0 < c_3 ≤ Q(\lambda)|\lambda|^{-q} ≤ c_4,  ∀λ ∈ R^r,   (2.74)

where |λ| = (λ_1² + · · · + λ_r²)^{1/2} and c_j, 1 ≤ j ≤ 4, are positive constants.

3. If r = 1, L = −d²/dx², D(L) = {u : u ∈ H²(0, ∞), u'(0) = 0}, where D(L) is the domain of L, then

R(x, y) = \frac{1}{2}\,[A(|x + y|) + A(|x − y|)],  x, y ≥ 0,   (2.75)

where

A(x) = \pi^{-1} \int_{0}^{\infty} P(\lambda) Q^{-1}(\lambda) \cos(\sqrt{\lambda}\, x)\, \lambda^{-1/2}\, d\lambda   (2.76)

and P(λ) > 0, Q(λ) > 0 are polynomials.

Indeed, one has for L

\Phi(x, y, \lambda)\, d\rho(\lambda) = \begin{cases} \pi^{-1} \cos(\sqrt{\lambda}\, x) \cos(\sqrt{\lambda}\, y)\, \lambda^{-1/2}\, d\lambda, & \lambda ≥ 0, \\ 0, & \lambda < 0, \end{cases}


0 ≤ x, y < ∞. Since

\cos(kx)\cos(ky) = \frac{1}{2}\,[\cos(kx − ky) + \cos(kx + ky)],  k = \sqrt{\lambda},

one obtains (2.75) and (2.76).

If one puts √λ = k in (2.76) one gets

A(x) = \frac{2}{\pi} \int_{0}^{\infty} P(k^2) Q^{-1}(k^2) \cos(kx)\, dk,   (2.77)

which is a cosine transform of a positive rational function of k. The eigenfunctions of L, normalized in L²(0, ∞), are (2/π)^{1/2} cos(kx), and dρ = dk in the variable k.

If L = −d²/dx² is determined in L²(0, ∞) by the boundary condition u(0) = 0, then

R(x, y) = \frac{1}{2}\,[A(|x − y|) − A(x + y)],  x, y ≥ 0,   (2.78)

where A(x) is given by (2.77). The eigenfunctions of L with the Dirichlet boundary condition u(0) = 0 are (2/π)^{1/2} sin(kx), dρ = dk in the variable k, and Φ(x, y, k) dρ(k) = (2/π) sin(kx) sin(ky) dk; one can compare this with the formula Φ(x, y, k) dρ(k) = (2/π) cos(kx) cos(ky) dk, which holds for L determined by the Neumann boundary condition u'(0) = 0.
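As a quick illustration of (2.75)-(2.77), the sketch below (an assumption-laden numerical check, not taken from the text) uses P(λ) = 1, Q(λ) = λ + 1, so that P(k²)Q^{-1}(k²) = 1/(1 + k²) and A(x) should coincide with exp(−x) for x ≥ 0.

    import numpy as np
    from scipy.integrate import quad

    def A(x):
        # the cosine transform (2.77) with P(k^2)/Q(k^2) = 1/(1 + k^2)
        val, _ = quad(lambda k: 1.0 / (1.0 + k * k), 0.0, np.inf, weight='cos', wvar=x)
        return 2.0 / np.pi * val

    def R_neumann(x, y):
        # the kernel (2.75) for the Neumann operator L = -d^2/dx^2 on (0, infinity)
        return 0.5 * (A(abs(x + y)) + A(abs(x - y)))

    for x in (0.3, 1.0, 2.5):
        print(x, A(x), np.exp(-x))                 # A(x) should agree with exp(-x)
    print(R_neumann(0.5, 1.0), 0.5 * (np.exp(-1.5) + np.exp(-0.5)))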

4. If L = −d²/dx² + (ν² − 1/4) x^{-2}, ν ≥ 0, x ≥ 0, then

\Phi(x, y, \lambda)\, d\rho(\lambda) = \begin{cases} \sqrt{x\lambda}\, J_\nu(x\lambda)\, \sqrt{y\lambda}\, J_\nu(y\lambda)\, d\lambda, & \lambda ≥ 0, \\ 0, & \lambda < 0, \end{cases}   (2.79)

so that

R(x, y) = \sqrt{xy} \int_{0}^{\infty} P(\lambda) Q^{-1}(\lambda)\, J_\nu(\lambda x) J_\nu(\lambda y)\, \lambda\, d\lambda,   (2.80)

where P(λ) and Q(λ) are positive polynomials on the semiaxis λ ≥ 0.

5. Let R(x, y) = exp(−a|x − y|)(4π|x − y|)^{-1}, x, y ∈ R³, a = const > 0. Note that (−Δ + a²)R = δ(x − y) in R³. The kernel R(x, y) ∈ R. One has L = (L_1, L_2, L_3), L_j = −i∂_j, P(λ) = 1, Q(λ) = λ² + a², λ² = λ_1² + λ_2² + λ_3², Φ dρ = (2π)^{-3} exp{iλ · (x − y)} dλ, and

R(x, y) = (2\pi)^{-3} \int_{R^3} \frac{\exp\{i\lambda \cdot (x - y)\}}{\lambda^2 + a^2}\, d\lambda.   (2.81)

6. Let R(x, y) = R(xy). Put x = exp(ξ), y = exp(−η). Then R(xy) = R(exp(ξ − η)) := R_1(ξ − η). If R_1 ∈ R with L = −i∂, then one can solve


the equation

\int_{a}^{b} R(xy)\, h(y)\, dy = f(x),  a ≤ x ≤ b,   (2.82)

analytically.

7. Let K_0(a|x|) be the modified Bessel function which can be defined by the formula

K_0(a|x|) = (2\pi)^{-1} \int_{R^2} \frac{\exp(i\lambda \cdot x)}{\lambda^2 + a^2}\, d\lambda,  a > 0,   (2.83)

where λ · x = λ_1 x_1 + λ_2 x_2. Then the kernel R(x, y) := K_0(a|x − y|) ∈ R, L = (−i∂_1, −i∂_2), r = 2, P(λ) = 1, Q(λ) = λ² + a², Φ(x, y, λ) dρ(λ) = (2π)^{-1} exp{iλ · (x − y)} dλ.

8. Consider the equation

\int_D \frac{\exp(-a|x - y|)}{4\pi|x - y|}\, h(y)\, dy = f(x),  x ∈ D ⊂ R^3,  a > 0,   (2.84)

with kernel (2.81). By formula (2.15), Theorem 2.1, one obtains the unique solution to equation (2.84) in H^{-1}(D):

h(x) = (-\Delta + a^2) f + \left( \frac{\partial f}{\partial N} - \frac{\partial u}{\partial N} \right) \delta_\Gamma,   (2.85)

where u is the unique solution to the Dirichlet problem in the exterior domain Ω := R³ \ D:

(-\Delta + a^2) u = 0 \ \text{in } \Omega,  u|_\Gamma = f|_\Gamma,   (2.86)

Γ = ∂D = ∂Ω is the boundary of D, and δ_Γ is the delta function with support Γ.

Let us derive formula (2.85). For the kernel (2.81) one has r = 3, p = 0, P(λ) = 1, Q(λ) = λ² + a², s = 1, q = 2, α = sq/2 = 1. Formula (2.15) reduces to

h(x) = (-\Delta + a^2) G,   (2.87)

with

G = \begin{cases} f & \text{in } D, \\ u & \text{in } \Omega, \end{cases}   (2.88)


and u is the solution to (2.86). Indeed, since P(λ) = 1, one has v = 0 and g = f. In order to compute h by formula (2.87) one uses the definition of the derivative in the sense of distributions. For any φ ∈ C_0^∞(R^r) one has:

\begin{aligned}
\big( (-\Delta + a^2) G, \phi \big) &= \big( G, (-\Delta + a^2)\phi \big) \\
&= \int_D f\, (-\Delta + a^2)\phi\, dx + \int_\Omega u\, (-\Delta + a^2)\phi\, dx \\
&= \int_D (-\Delta + a^2) f\, \phi\, dx + \int_\Omega (-\Delta + a^2) u\, \phi\, dx \\
&\quad - \int_\Gamma \left( f \frac{\partial \phi}{\partial N} - \frac{\partial f}{\partial N}\phi \right) ds + \int_\Gamma \left( u \frac{\partial \phi}{\partial N} - \phi \frac{\partial u}{\partial N} \right) ds \\
&= \int_D (-\Delta + a^2) f\, \phi\, dx + \int_\Gamma \left( \frac{\partial f}{\partial N} - \frac{\partial u}{\partial N} \right) \phi\, ds,
\end{aligned}   (2.89)

where the condition u = f on Γ was used. Formula (2.89) is equivalent to (2.85).

9. Consider the equation

\int_D K_0(a|x - y|)\, h(y)\, dy = f(x),  x ∈ D ⊂ R^2,  a > 0,   (2.90)

where D = {x : x ∈ R², |x| ≤ b}, and K_0(x) is given by formula (2.83). The solution to (2.90) in H^{-1}(D) can be calculated by formula (2.85), in which u(x) can be calculated explicitly:

u(x) = \sum_{n=-\infty}^{\infty} f_n\, \frac{\exp(in\phi)\, K_n(ar)}{K_n(ab)},   (2.91)

where x = (r, φ), (r, φ) are polar coordinates in R²,

f_n := (2\pi)^{-1} \int_{0}^{2\pi} f(b, \phi) \exp(-in\phi)\, d\phi,   (2.92)

and K_n(r) is the modified Bessel function of order n, which decays as r → +∞. One can easily calculate ∂u/∂N|_Γ in formula (2.85):

\frac{\partial u}{\partial N}\Big|_\Gamma = \frac{\partial u}{\partial r}\Big|_{r=b} = \sum_{n=-\infty}^{\infty} a f_n\, \frac{\exp(in\phi)\, K_n'(ab)}{K_n(ab)}.   (2.93)

Formulas (2.85) and (2.93) give an explicit analytical formula for the solution to equation (2.90) in H^{-1}(D).
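A minimal numerical sketch of (2.91)-(2.93) is given below; the boundary data f(b, φ) = exp(cos φ), the truncation order N and the grid size are illustrative choices, not taken from the text. The Fourier coefficients (2.92) are computed by the trapezoidal rule (here an FFT), and scipy's modified Bessel functions supply K_n and K_n'.

    import numpy as np
    from scipy.special import kv, kvp

    a, b, N, M = 1.0, 1.0, 20, 256
    phi = 2.0 * np.pi * np.arange(M) / M
    f_boundary = np.exp(np.cos(phi))               # illustrative data f(b, phi)

    # Fourier coefficients (2.92); the FFT implements the trapezoidal rule here
    f_hat = np.fft.fft(f_boundary) / M             # f_hat[n % M] = f_n

    def u(r, phi0):
        # exterior solution (2.91), truncated at |n| <= N
        s = sum(f_hat[n % M] * np.exp(1j * n * phi0) * kv(n, a * r) / kv(n, a * b)
                for n in range(-N, N + 1))
        return s.real

    def du_dN(phi0):
        # normal derivative (2.93) on the boundary r = b; it enters formula (2.85)
        s = sum(a * f_hat[n % M] * np.exp(1j * n * phi0) * kvp(n, a * b) / kv(n, a * b)
                for n in range(-N, N + 1))
        return s.real

    print(u(b, 0.0), np.exp(1.0))                  # reproduces the boundary data at phi = 0
    print(du_dN(0.0))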


2.5 Formula for the error of the optimal estimate

In this section we give an explicit formula for the error of the optimal estimator. This error is given by formula (1.5). We assume for simplicity that A = I in what follows. This means that we are discussing the filtering problem. The general estimation problem can be treated in the same way.

1. The error of the estimate can be computed by formula (2.9) with A = I. This yields

\epsilon = (Rh, h) - 2\operatorname{Re}(h, f) + \epsilon_0(x),  \epsilon_0(x) := |s(x)|^2,   (2.94)

where (u, v) := \int_D u v\, dx, (Rh, h) = \int_D \int_D R(z, y)\, h(y)\, h^*(z)\, dy\, dz, and h(y) := h(x, y). The optimal estimate

\hat{U}(x) = \int_D h(x, y)\, U(y)\, dy   (2.95)

is given by the solution to the equation (2.11):

Rh = f,   (2.96)

and we assume that R ∈ R. Since (Rh, h) > 0, it follows from (2.94) and (2.96) that

\epsilon(x) = \epsilon_0(x) - (Rh, h).   (2.97)

It is clear that the right side of (2.97) is finite if and only if the quadratic form (Rh, h) is finite. Our goal is to show that one obtains a finite value of (Rh, h) if and only if one takes the solution to (2.96) of minimal order of singularity, the mos solution to (2.96), that is, the solution h ∈ H^{-α}. Therefore only the mos solution to (2.96) solves the estimation problem, and the error of the optimal estimate is given by formula (2.97), in which h ∈ H^{-α} is the unique solution to (2.96) of minimal order of singularity.

2. In order to achieve our goal, let us write the form (Rh, h) using the Parseval equality and the basic assumption R ∈ R:

(Rh, h) = \int_\Lambda P(\lambda) Q^{-1}(\lambda)\, |\tilde h(\lambda)|^2\, d\rho(\lambda).   (2.98)

Here

|\tilde h(\lambda)|^2 = \sum_{j=1}^{N(\lambda)} \left| \int_D h(x)\, \phi_j^*(x, \lambda)\, dx \right|^2,   (2.99)


where φ_j(x, λ) are the eigenfunctions of L which are used in the expansion of the spectral kernel:

\Phi(x, y, \lambda) = \sum_{j=1}^{N_\lambda} \phi_j(x, \lambda)\, \phi_j^*(y, \lambda),   (2.100)

and N_λ ≤ ∞ (see Section 8.2). One has

h ∈ H^{-bs} \Leftrightarrow \int_\Lambda |\tilde h(\lambda)|^2 (1 + \lambda^2)^{-b}\, d\rho(\lambda) < \infty,   (2.101)

where b > 0 is an arbitrary number. By the assumption (see (2.74)),

0 < c_1 ≤ P(\lambda)(1 + \lambda^2)^{-p/2} ≤ c_2,   (2.102)

0 < c_3 ≤ Q(\lambda)(1 + \lambda^2)^{-q/2} ≤ c_4,   (2.103)

where c_j, 1 ≤ j ≤ 4, are positive constants. Thus

0 < c_5 ≤ P Q^{-1} (1 + \lambda^2)^{(q-p)/2} ≤ c_6.   (2.104)

From (2.98), (2.101) and (2.104) it follows that

(Rh, h) < \infty \Leftrightarrow h ∈ H^{-(q-p)s/2} = H^{-\alpha}.   (2.105)

In particular, if m(h) := ord h > α, then (Rh, h) = ∞.

Let L be an operator with constant coefficients in L²(R^r), so that R(x, y) = R(x − y), L = (L_1, . . . , L_r), L_r := −i∂/∂x_r, and formula (2.98) takes the form

(Rh, h) = \int_{R^r} P(\lambda) Q^{-1}(\lambda)\, |\tilde h(\lambda)|^2\, d\lambda,   (2.106)

with λ = (λ_1, . . . , λ_r) and

\tilde h(\lambda) := (2\pi)^{-r/2} \int_{R^r} h(x) \exp(i\lambda \cdot x)\, dx.   (2.107)

Then P(L) is elliptic of order p if and only if P(λ) satisfies (2.102), and Q(L) is elliptic of order q if and only if Q(λ) satisfies (2.103). The integral (2.106) is finite if and only if \tilde h(λ)(1 + λ²)^{(p-q)/4} ∈ L²(R^r), where \tilde h(λ) is the usual Fourier transform. This is equivalent to h ∈ H^{-α}(R^r), α = (q − p)/2. Since we assume that supp h ⊂ D, we conclude in the case described that h ∈ H^{-α}(D).


3. Formula (2.97) can be written as

\epsilon(x) = \epsilon_0(x) - (f, h) = \epsilon_0(x) - (f, R^{-1} f).   (2.108)

If U is a vector random field, then h(x, y) is a d × d matrix and formulas (2.97) and (2.108) take the form

\epsilon(x) = \epsilon_0(x) - \operatorname{tr}(f, h) = \epsilon_0(x) - \int_D \operatorname{tr} f(x, y)\, h^*(y, x)\, dy,   (2.109)

where tr A is the trace of the matrix A.


Chapter 3

Numerical Solution of the Basic Integral Equation in Distributions

3.1 Basic ideas

It is convenient to explain the basic ideas using the equation

Rh = \int_{-1}^{1} \exp(-|x - y|)\, h(y)\, dy = f(x),  −1 ≤ x ≤ 1,   (3.1)

as an example, which contains all of the essential features of the general equation

Rh = \int_D R(x, y)\, h(y)\, dy = f(x),  x ∈ D ⊂ R^r,   (3.2)

with the kernel R ∈ R.

The first idea that might come to mind is that equation (3.1) is Fredholm's equation of the first kind, so that the regularization method can yield a numerical solution to (3.1) (see [R28], for example). On second thought one realizes that, according to Theorem 3.1, the solution to (3.1) does not belong to L²(−1, 1) in general, and that the mapping R^{-1} : H^α → H^{-α}, α = 1 for equation (3.1), is an isomorphism. Therefore the problem of numerical solution of equation (3.1) is not an ill-posed but a well-posed problem. The solution to (3.1) is a distribution in general, which is clear from formula (2.49). For the solution of (3.1) to be an integrable function it is necessary and sufficient that the following boundary conditions hold:

f'(1) + f(1) = 0,  f'(-1) = f(-1).   (3.3)

This follows immediately from formula (2.49). The problem is to develop a numerical method for solving equations (3.1) and (3.2) in the space of distributions H^{-α}.


Much work has been done on the effective numerical inversion of the Toeplitz matrices which one obtains by discretizing the integral equation

\epsilon h_\epsilon + R h_\epsilon = f,  \epsilon > 0,   (3.4)

with R, for example, given by (3.1). If the nodes of discretization are equidistant, equation (3.4), after discretizing, reduces to a linear algebraic system with a Toeplitz matrix t_{ij} = t_{i-j}. Discussion of this, however, is not in the scope of our work. The question of principal interest is the question about the asymptotic behavior of the solution to (3.4) as ε → +0. Note that, for any ε > 0, equation (3.4) is an equation with a selfadjoint positive definite operator εI + R in L²(−1, 1). Therefore, for any ε > 0, equation (3.4) has a solution in L²(−1, 1) for any f ∈ L²(−1, 1), and this solution is unique. Numerical solution of equation (3.4) by the above mentioned discretization (or collocation) method becomes impossible as ε → +0 because the condition number of the matrix of the discretized problem grows quickly as ε → +0. The nature of the singularity of the solution to the limiting equation (ε = 0), that is, to (3.1), is not clear from the discretization method described above. Numerical solution of equation (3.1) therefore requires a new approach, which we wish to describe.

The basic idea is to take into account the theoretical results obtained in Theorems 2.1 and 2.4. According to Theorem 2.4, the solution to equation (3.1) with a smooth right hand side f(x) has the following structure:

h = A\delta(x - 1) + B\delta(x + 1) + h_{sm},   (3.5)

where A and B are constants and h_{sm} is a smooth function. The order of singularity of the solution to equation (3.1) is 1, since α = 1 for this equation.

Let us assume that f is smooth, f ∈ H², for example, so that

\frac{-f'' + f}{2} ∈ H^0 = L^2(D).

Let us look for an approximate solution to equation (3.1) of the form

h_n(x) = \sum_{j=1}^{n} c_j \phi_j(x) + c_0 \delta(x - 1) + c_{-1}\delta(x + 1),   (3.6)

where c_j, j = −1, 0, 1, . . . , are constants, and {φ_j}, 1 ≤ j < ∞, is a basis in H = L²(−1, 1). The constants can be found, for example, from the least squares method:


\| Rh_n - f \|_1 = \min,   (3.7)

where ‖f‖_α is the norm of f in the space H^α.

The variational problem (3.7) can be written as

\epsilon := \int_{-1}^{1} \Big\{ \big| c_0' \exp(x) + c_{-1}' \exp(-x) + \sum_{j=1}^{n} c_j \psi_j(x) - f(x) \big|^2 + \big| c_0' \exp(x) - c_{-1}' \exp(-x) + \sum_{j=1}^{n} c_j \psi_j'(x) - f'(x) \big|^2 \Big\}\, dx = \min,   (3.8)

where

c_0' := c_0 e^{-1},  c_{-1}' = c_{-1} e^{-1},  \psi_j(x) = R\phi_j,  1 ≤ j ≤ n.   (3.9)

The linear system for finding the c_j, 1 ≤ j ≤ n, and c_0', c_{-1}' is

\frac{\partial \epsilon}{\partial c_j} = 0,  1 ≤ j ≤ n,  \frac{\partial \epsilon}{\partial c_0'} = 0,  \frac{\partial \epsilon}{\partial c_{-1}'} = 0.   (3.10)

The matrix of this system is

a_{ij} := (\psi_j, \psi_i)_1,  −1 ≤ i, j ≤ n,   (3.11)

where ψ_0 = exp(x), ψ_{-1} = exp(−x), ψ_j for 1 ≤ j < ∞ is defined in (3.9), the system {ψ_j}, −1 ≤ j ≤ n, is assumed to be linearly independent for any n, and the inner product is taken in the space H¹: (u, v)_1 := \int_{-1}^{1} (u v + u' v')\, dx. The matrix a_{ij} is positive definite for any n, so that the system

\sum_{j=-1}^{n} a_{ij} c_j = b_i,  −1 ≤ i ≤ n,  b_i := (f, \psi_i)_1,   (3.12)

is uniquely solvable for any n.
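The following Python sketch assembles and solves the system (3.6)-(3.12) for equation (3.1) with the illustrative right hand side f(x) = cos x. It uses a Legendre basis for the ordinary part and takes the two singular trial functions directly as Rδ(· − 1) = exp(−|x − 1|) and Rδ(· + 1) = exp(−|x + 1|), which absorbs the rescaling (3.9); all grid sizes are illustrative, and the recovered delta coefficients are compared with the exact values from (2.49).

    import numpy as np
    from numpy.polynomial import legendre

    n_basis, n_grid = 6, 2001
    x = np.linspace(-1.0, 1.0, n_grid)
    dx = x[1] - x[0]
    w = np.full(n_grid, dx); w[0] = w[-1] = dx / 2.0       # trapezoidal weights
    K = np.exp(-np.abs(x[:, None] - x[None, :]))           # kernel exp(-|x-y|)

    f, fp = np.cos(x), -np.sin(x)

    # trial images: R(phi_j) for Legendre phi_j, then R(delta(.-1)), R(delta(.+1))
    cols = [K @ (w * legendre.legval(x, np.eye(n_basis)[j])) for j in range(n_basis)]
    cols += [np.exp(-np.abs(x - 1.0)), np.exp(-np.abs(x + 1.0))]
    V = np.stack(cols, axis=1)

    def h1(u, v):
        # (u, v)_1 = int (u v + u' v') dx, with numerical derivatives
        return np.trapz(u * v + np.gradient(u, x) * np.gradient(v, x), x)

    m = V.shape[1]
    A = np.array([[h1(V[:, i], V[:, j]) for j in range(m)] for i in range(m)])
    b = np.array([h1(f, V[:, i]) for i in range(m)])
    c = np.linalg.solve(A, b)

    print("computed delta coefficients:", c[-2], c[-1])
    print("exact values from (2.49)   :", (fp[-1] + f[-1]) / 2, (-fp[0] + f[0]) / 2)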

Convergence of the suggested numerical method is easy to prove. One wishes to prove that

\| h_n - h \|_{-1} \to 0 \ \text{as } n \to \infty.   (3.13)

Since R : H^{-1} → H¹ is an isomorphism, it is sufficient to prove that

\| Rh_n - Rh \|_1 \to 0 \ \text{as } n \to \infty.   (3.14)

Since Rh = f, equation (3.14) reduces to

\| Rh_n - f \|_1 \to 0 \ \text{as } n \to \infty.   (3.15)


Equation (3.15) holds if the set of functions {ψ_j}, −1 ≤ j < ∞, is complete in H¹. Here

\psi_j = R\phi_j,  1 ≤ j < \infty,  \psi_{-1} = \exp(-x),  \psi_0 = \exp(x).   (3.16)

Therefore, if one chooses a system ψ_j ∈ H¹, 1 ≤ j < ∞, such that the system {ψ_j}, −1 ≤ j < ∞, forms a basis of H¹ (or just a complete system in H¹), then (3.15) holds. Since for practical calculations one need only know the matrix a_{ij} and the vector b_i (see (3.12)), and both these quantities can be computed if the system {ψ_j}, −1 ≤ j < ∞, and f are known, it is not necessary to deal with the system {φ_j}, 1 ≤ j < ∞.

We have proved the following

Proposition 3.1  If {ψ_j}, −1 ≤ j < ∞, ψ_0 = exp(x), ψ_{-1} = exp(−x), is a complete system in H¹ then, for any n, the system (3.12) is uniquely solvable. Let c_j^{(n)} be its solution. Then the function

h_n = \sum_{j=-1}^{n} c_j^{(n)} \phi_j(x),  where φ_0 = δ(x − 1), φ_{-1} = δ(x + 1), φ_j = R^{-1}ψ_j, 1 ≤ j ≤ n,

converges in H^{-1} to the solution h of equation (3.1):

\| h - h_n \|_{-1} \to 0,  n \to \infty.

There are some questions of a practical nature:

1) how does one choose the system {ψ_j}, 1 ≤ j < ∞, so that the matrix a_{ij} in equation (3.12) is easily invertible?

2) how does one choose {ψ_j}, 1 ≤ j < ∞, so that the functions φ_j are easily computable?

The first question is easy to answer: it is sufficient that the condition number of the matrix a_{ij}, −1 ≤ i, j ≤ n, is bounded for any n. This will be the case if the system {ψ_j}, −1 ≤ j < ∞, forms a Riesz basis. Let us recall that the system {ψ_j} is called a Riesz basis of a Hilbert space H if and only if there exists an orthonormal basis {f_j} of H and a linear isomorphism B of H onto H such that Bf_j = ψ_j, ∀j. The system {ψ_j} forms a Riesz basis of the Hilbert space H if and only if the Gram matrix (ψ_i, ψ_j) := Γ_{ij} defines a linear isomorphism of ℓ² onto itself.

If {φ_j}, 1 ≤ j < ∞, in (3.6) is a basis of H, then the system {ψ_j}, 1 ≤ j < ∞, ψ_j = Rφ_j, is a complete set in H¹. Indeed, suppose that f ∈ H¹ and (f, Rφ_j)_1 = 0 ∀j. Then 0 = (f, Rφ_j)_+ = (I^{-1}f, Rφ_j)_0 = (RI^{-1}f, φ_j)_0 ∀j, where the operator I has been introduced in §VI.1, and H_+ = H¹.


Since the system {φ_j} is complete in H⁰ (H⁰ = H in our case) by assumption, one concludes that RI^{-1}f = 0. Since I^{-1}f ∈ H_− and R(x, y) is positive, so that (Rg, g)_0 > 0 for g ≠ 0, g ∈ H_−, one concludes that I^{-1}f = 0. Since I^{-1} is an isometry between H_+ and H_−, one concludes that f = 0. Therefore, by Proposition 3.1, if the system {φ_j} forms a basis of H, then ‖h_n − h‖_{-1} → 0 as n → ∞, where h_n is the solution to (3.7) of the form (3.6).

If {φ_j} is a basis of H, then the system (3.12) is uniquely solvable for all n if and only if the system {ψ_j}, −1 ≤ j ≤ n, is linearly independent in H¹ for all n. Here the system {ψ_j} is defined by formula (3.16).

3.2 Theoretical approaches

1. Let us consider equation (3.2) as an equation with the operator R : H_− → H_+, which is a linear isomorphism between the spaces H_− and H_+. The general theory of the triples of spaces H_+ ⊂ H_0 ⊂ H_− is given in Section 8.1, and we will use the results proved in Section 8.1.

In our case H_+ = H^α, H_0 = H⁰, H_− = H^{-α}. In general, H_+ ⊂ H_0, both H_+ and H_0 are Hilbert spaces, H_+ is dense in H_0, ‖u‖_0 ≤ ‖u‖_+, and H_− is the dual space to H_+ with respect to H_0. It is proved in Section 8.1 that there exist linear isometries p_+ : H_0 → H_+ and p_− : H_− → H_0, and (u, v)_+ = (qu, qv)_0, where q = p_+^{-1}. The operator q^*, the adjoint of q in H_0, is an isometry of H_0 onto H_−.

Let us rewrite equation (3.2) in the equivalent form

Ah_0 := qRq^* h_0 = f_0,   (3.17)

where

f_0 := qf,  h_0 := (q^*)^{-1}h,  f_0 ∈ H_0,  h_0 ∈ H_0.   (3.18)

The linear operator qRq^* is bounded, selfadjoint, and positive definite:

(qRq^*\phi, \phi)_0 = (Rq^*\phi, q^*\phi)_0 ≥ c_1 \| q^*\phi \|_-^2 = c_1 \| \phi \|_0^2,  c_1 > 0.   (3.19)

Moreover,

(Rq^*\phi, q^*\phi)_0 ≤ c_2 \| q^*\phi \|_-^2 = c_2 \| \phi \|_0^2,  c_2 > 0.   (3.20)

Here we used the isometry of q^*: ‖q^*φ‖_− = ‖φ‖_0, and the inequality

c_1 \| h \|_-^2 ≤ (Rh, h)_0 ≤ c_2 \| h \|_-^2,  c_2 ≥ c_1 > 0.   (3.21)


This inequality is proved in Lemma 3.5 of Section 3.4 below.

Equation (3.17), with a linear positive definite operator A on a Hilbert space H_0, is uniquely solvable in H_0, and its solution can be obtained by iterative or projection methods. If the solution h_0 to equation (3.17) is found, then the desired function is h = q^*h_0. Let us describe these methods.

Let us start with an iterative method. Assume that A is a bounded positive definite operator on a Hilbert space:

0 < m ≤ A ≤ M.   (3.22)

This means that m‖φ‖² ≤ (Aφ, φ) ≤ M‖φ‖², ∀φ ∈ H. Let

Au = f.   (3.23)

Consider the iterative process

u_{n+1} = (I - aA)u_n + af,  a := \frac{2}{M + m},   (3.24)

where u_0 ∈ H is arbitrary.

Lemma 3.1  There exists lim_{n→∞} u_n = u. This limit solves equation (3.23). One has

\| u_n - u \| ≤ c q^n,  q := \frac{M - m}{M + m},  0 < q < 1,  c = const > 0.   (3.25)

This is a well known result (see e.g. [Kantorovich and Akilov (1980)]). We give a proof for the convenience of the reader.

Proof. If u_n → u in H then, passing to the limit in (3.24), one concludes that the limit u solves equation (3.23). In order to prove convergence and the estimate (3.25), it is sufficient to check that

\| I - aA \| ≤ q,   (3.26)

where q is defined in (3.25). This follows from the spectral representation:

\| I - aA \| = \sup_{m ≤ \lambda ≤ M} \left| 1 - \frac{2\lambda}{m + M} \right| = \frac{M - m}{M + m} = q.   (3.27)

Lemma 3.1 is proved.
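A minimal sketch of the iteration (3.24), assuming a small positive definite test matrix (the test problem is illustrative, not from the text):

    import numpy as np

    rng = np.random.default_rng(0)
    C = rng.standard_normal((50, 50))
    A = C @ C.T + 10.0 * np.eye(50)                # selfadjoint, positive definite
    f = rng.standard_normal(50)

    eigs = np.linalg.eigvalsh(A)
    m, M = eigs[0], eigs[-1]
    a = 2.0 / (M + m)                              # the step size of (3.24)
    q = (M - m) / (M + m)                          # the contraction factor of (3.25)

    u = np.zeros(50)
    for _ in range(200):
        u = (np.eye(50) - a * A) @ u + a * f       # u_{n+1} = (I - aA) u_n + a f

    print(np.linalg.norm(A @ u - f), q)            # the residual decays at least like q**n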

If A is not positive definite but only nonnegative, and f ∈ R(A), where R(A) is the range of A, then consider the following iterative process:

u_{n+1} + Au_{n+1} = u_n + f,   (3.28)


where u_0 ∈ H is arbitrary.

Lemma 3.2  If A ≥ 0 and f ∈ R(A), then there exists

\lim_{n \to \infty} u_n = u,   (3.29)

where u_n is defined by (3.28) and u solves equation (3.23).

Proof. If u_n → u in H then passing to the limit in (3.28) yields equation (3.23) for u. In order to prove that u_n → u, one writes equation (3.28) as

u_{n+1} = Bu_n + h,   (3.30)

where

B := (I + A)^{-1},  h := Bf.   (3.31)

Since A ≥ 0, one has 0 ≤ B ≤ I, where I is the identity operator in H. Under this condition (0 ≤ B ≤ I) one can prove [Krasnoselskii et al. (1972), p. 71] that u_n → u. Lemma 3.2 is proved.

Remark 3.1  If A satisfies assumptions (3.22), then 0 < (M + 1)^{-1} ≤ B ≤ (m + 1)^{-1}, and the iterative process (3.30) converges as a geometric series with q = (m + 1)^{-1}.
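A sketch of the implicit scheme (3.28) under the hypotheses of Lemma 3.2 (A ≥ 0, f ∈ R(A)); the rank-deficient test matrix is an illustrative choice:

    import numpy as np

    rng = np.random.default_rng(1)
    C = rng.standard_normal((40, 20))
    A = C @ C.T                                    # A >= 0, rank deficient
    f = A @ rng.standard_normal(40)                # guarantees f in R(A)

    u = np.zeros(40)
    for _ in range(100):
        # u_{n+1} + A u_{n+1} = u_n + f, i.e. u_{n+1} = (I + A)^{-1}(u_n + f)
        u = np.linalg.solve(np.eye(40) + A, u + f)

    print(np.linalg.norm(A @ u - f))               # tends to zero, as Lemma 3.2 asserts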

2. Let us consider the projection methods for solving equation (3.23) under the assumption (3.22).

First, consider the least squares method, which is a variant of the projection method. The least squares method can be described as follows. Take a complete linearly independent system {φ_j} in H. Look for a solution

u_n = \sum_{j=1}^{n} c_j \phi_j.   (3.32)

Find the constants c_j from the condition

\| Au_n - f \| = \min.   (3.33)

This leads to the linear system for the c_j:

\sum_{j=1}^{n} a_{ij} c_j = f_i,  1 ≤ i ≤ n,   (3.34)

where

a_{ij} := (A\phi_j, A\phi_i),  f_i = (f, A\phi_i).   (3.35)


Since the system {φ_j}, 1 ≤ j ≤ n, is linearly independent for any n, and A is an isomorphism of H onto H, the system {Aφ_j}, 1 ≤ j ≤ n, is linearly independent for any n. Therefore det a_{ij} ≠ 0, 1 ≤ i, j ≤ n, and the system (3.34) is uniquely solvable for any right hand side and any n. Let c_j^{(n)}, 1 ≤ j ≤ n, be the unique solution to system (3.34) and

u_n := \sum_{j=1}^{n} c_j^{(n)} \phi_j.   (3.36)

Let us prove that u_n → u as n → ∞. It is sufficient to prove that the system {Aφ_j}, 1 ≤ j < ∞, is complete in H. Indeed, if this is so, then ‖Au_n − f‖ → 0 as n → ∞, where u_n is given by (3.36). Therefore

\| u_n - u \| = \| A^{-1}(Au_n - f) \| ≤ m^{-1} \| Au_n - f \| \to 0.   (3.37)

Here we used the estimate ‖A^{-1}‖ ≤ m^{-1}. It is easy to check that the system {Aφ_j}, 1 ≤ j < ∞, is complete in H. Indeed, suppose (h, Aφ_j) = 0, 1 ≤ j < ∞, for some h ∈ H. Then (Ah, φ_j) = 0, 1 ≤ j < ∞. Thus Ah = 0, since by the assumption the system {φ_j}, 1 ≤ j < ∞, is complete in H. Since A^{-1} exists, the equation Ah = 0 implies h = 0. We have proved the following lemma.

Lemma 3.3  If A satisfies condition (3.22) and {φ_j}, 1 ≤ j < ∞, is a complete linearly independent system in H, then the least squares method of solving equation (3.23) converges. Namely: a) for any n the system (3.34) is uniquely solvable and the approximate solution u_n is uniquely determined by formula (3.36), and b) ‖u_n − u‖ → 0 as n → ∞, where u is the unique solution to equation (3.23).

The general projection method can be described as follows. Pick two complete linearly independent systems {φ_j} and {ψ_j}, 1 ≤ j < ∞, in H. Look for an approximate solution to equation (3.23) of the form (3.32). Find the coefficients c_j, 1 ≤ j ≤ n, from the condition

(Au_n - f, \psi_i) = 0,  1 ≤ i ≤ n.   (3.38)

Geometrically this means that the vector Au_n − f is orthogonal to the linear span of the vectors ψ_i, 1 ≤ i ≤ n. Equations (3.38) can be written as

\sum_{j=1}^{n} b_{ij} c_j = f_i,  1 ≤ i ≤ n,   (3.39)


where

b_{ij} = (A\phi_j, \psi_i),  f_i = (f, \psi_i).   (3.40)

The least squares method is the projection method with ψ_i = Aφ_i. In [Krasnoselskii, M. (1972)] one can find a detailed study of the general projection method.

Let us give a brief argument which demonstrates convergence of the projection method. Let {φ_j} be a complete linearly independent system in H, L_n := span{φ_1, . . . , φ_n}, and let P_n be the orthogonal projection onto L_n in H. An infinite system {φ_j} is called linearly independent in H if, for any n, the system {φ_j}, 1 ≤ j ≤ n, is linearly independent in H. Take ψ_j = φ_j and write equation (3.38) as

P_n A u_n = P_n f,  u_n ∈ L_n.   (3.41)

Since u_n = P_n u_n and the operator P_n A P_n is selfadjoint and positive definite on the subspace L_n ⊂ H, equation (3.41) is uniquely solvable for any f ∈ H and any n. Note that P_n A u_n = P_n A P_n u_n and that A satisfies assumptions (3.22). To prove that ‖u_n − u‖ → 0 as n → ∞, let us subtract from (3.41) the equation

P_n A u = P_n f.   (3.42)

The result is

P_n A P_n (u_n - u) = P_n A (u - P_n u).   (3.43)

Since {φ_j} is complete, P_n → I strongly as n → ∞, where I is the identity operator. This and the boundedness of A imply

\| P_n A (u - P_n u) \| ≤ c \| u - P_n u \| \to 0,  n \to \infty.   (3.44)

Multiply (3.43) by P_n(u - u_n) and use the positive definiteness of A to obtain

c \| P_n(u - u_n) \|^2 ≤ \| u - P_n u \| \| P_n(u - u_n) \|,

or

c \| P_n(u - u_n) \| ≤ \| u - P_n u \| \to 0,  n \to \infty.   (3.45)

Since P_n u_n = u_n, equations (3.44) and (3.45) imply

\| u - u_n \| ≤ \| u - P_n u \| + \| P_n u - u_n \| \to 0,  n \to \infty.   (3.46)


We have proved

Lemma 3.4  If (3.22) holds and {φ_j} is a complete linearly independent system in H, then the projection method (3.41) for solving equation (3.23) converges, and equation (3.41) is uniquely solvable for any n.
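The following sketch illustrates the projection (Galerkin) method (3.41) with ψ_j = φ_j on a discretized positive definite operator satisfying (3.22); the operator, the sine basis and the exact solution are illustrative choices, not from the text:

    import numpy as np

    N = 200
    x = np.linspace(0.0, 1.0, N)
    # discretized operator satisfying (3.22): identity plus a positive definite kernel part
    A = np.eye(N) + 0.5 * np.exp(-np.abs(x[:, None] - x[None, :])) / N
    u_exact = x * (1.0 - x) * np.exp(x)            # illustrative exact solution
    f = A @ u_exact

    def galerkin(n):
        # basis phi_j(x) = sin(j pi x), j = 1..n; equations (3.39)-(3.41) with psi_j = phi_j
        Phi = np.stack([np.sin(j * np.pi * x) for j in range(1, n + 1)], axis=1)
        B = Phi.T @ A @ Phi                        # b_ij = (A phi_j, phi_i)
        g = Phi.T @ f                              # f_i = (f, phi_i)
        return Phi @ np.linalg.solve(B, g)         # u_n in span{phi_1, ..., phi_n}

    for n in (2, 5, 10, 20):
        print(n, np.max(np.abs(galerkin(n) - u_exact)))   # the error decreases with n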

3. If one applies the projection method with ψ_j = φ_j to equation (3.17) and chooses a Schauder basis {φ_j} of H, then the matrix

a_{ij} := (qRq^*\phi_j, \phi_i) = (Rq^*\phi_j, q^*\phi_i)   (3.47)

and the numbers

f_{0i} = (f_0, \phi_i) = (f, q^*\phi_i)   (3.48)

can be computed. The parentheses here denote the inner product in H_0 = H⁰, (u, v) = (u, v)_0. As q^*φ_i := w_i one can take a basis of H_−: since q^* is an isometry between H_0 and H_−, it sends a basis {φ_i} of H_0 onto a basis {w_i} of H_−. Let us suggest a system {w_i} for computational purposes.

Let B be a ball which contains the domain D, and let {v_j} be the orthonormal in L²(B) system of eigenfunctions of the Dirichlet Laplacian in B:

-\Delta v_j = \lambda_j v_j,  v_j = 0 \ \text{on } \partial B.   (3.49)

For any −∞ < β < ∞, the system {v_j}, 1 ≤ j < ∞, forms a basis of H^β(B). Indeed, the norm in H^β(B) is equivalent to the norm

\| (-\Delta)^{\beta/2} u \|_{L^2(B)} := \| u \|_\beta,

and

\| u \|_\beta = \left( \sum_{j=1}^{\infty} |u_j|^2 \lambda_j^\beta \right)^{1/2},  u_j := (u, v_j).   (3.50)

Therefore, u ∈ H^β(B) is a necessary and sufficient condition for the Fourier series

u = \sum_{j=1}^{\infty} u_j v_j   (3.51)

to converge in H^β(B), so that the system {v_j} is a basis of H^β(B) for any β, −∞ < β < ∞. Moreover, this basis is orthogonal in H^β(B) for any β, although it is not normalized in H^β(B) for β ≠ 0. In order to check the orthogonality property, note that


(v_j, v_i)_\beta = ((-\Delta)^\beta v_j, v_i)_0 = \lambda_j^\beta (v_j, v_i)_0 = \lambda_j^\beta \delta_{ji},

where δ_{ji} is the Kronecker delta.

The basis {v_j} can therefore be used as a basis of H_− = H^{-\beta} for any β ≥ 0. In the case of equation (3.17) with R ∈ R, one has H_− = H^{-α}, α = s(q − p)/2. Although the system {v_j}, 1 ≤ j < ∞, is a basis of H^{-α}, it is not very convenient for the representation of singular functions, such as δ_Γ, for example. The situation is similar to the one arising when δ(x − y) is represented as

\delta(x - y) = \sum_{j=1}^{\infty} \phi_j(x)\, \phi_j^*(y),   (3.52)

where {φ_j} is an orthonormal basis in L²(D). Formula (3.52) is valid in the sense that for any f ∈ L²(D) one has

f(x) = \sum_{j=1}^{\infty} (f, \phi_j)\, \phi_j.   (3.53)

The sequence δ_n(x − y) := \sum_{j=1}^{n} \phi_j(x)\phi_j^*(y) is a delta-sequence, that is,

\Big\| \int_D f(y)\, \delta_n(x - y)\, dy - f(x) \Big\|_{L^2(D)} \to 0 \ \text{as } n \to \infty,  ∀f ∈ L^2(D).   (3.54)

The series (3.52) does not converge in the classical sense.
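A small illustration of the delta-sequence property (3.54), assuming D = (0, 1) and the orthonormal basis φ_j(x) = √2 sin(jπx) of L²(D); the test function f is an arbitrary illustrative choice:

    import numpy as np
    from scipy.integrate import quad

    f = lambda y: np.exp(y) * np.cos(3.0 * y)      # illustrative test function

    def delta_n_applied(x, n):
        # int_D f(y) delta_n(x - y) dy with delta_n(x - y) = sum_{j<=n} phi_j(x) phi_j(y)
        s = 0.0
        for j in range(1, n + 1):
            phi_j = lambda y, j=j: np.sqrt(2.0) * np.sin(j * np.pi * y)
            c_j, _ = quad(lambda y: f(y) * phi_j(y), 0.0, 1.0)
            s += c_j * phi_j(x)
        return s

    for n in (5, 20, 80):
        print(n, delta_n_applied(0.4, n), f(0.4))  # approaches f(0.4) as n grows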

3.3 Multidimensional equation

In this section we describe the application of the basic idea presented in Section 3.1 to the multidimensional equation of random fields estimation theory

\int_D R(x, y)\, h(y)\, dy = f(x),  x ∈ D ⊂ R^r.   (3.55)

We assume that R ∈ R and Γ is smooth. Let j = (j_1, . . . , j_r) be a multiindex, |j| = j_1 + j_2 + · · · + j_r. By b(s)δ_Γ^{(j)} we mean the distribution with support on Γ = ∂D which acts on a test function φ ∈ C_0^∞(R^r) by the formula

\big( b(x)\delta_\Gamma^{(j)}, \phi \big) = (-1)^{|j|} \int_\Gamma b(s)\, \phi^{(j)}(s)\, ds.   (3.56)


Here b(s) is a smooth function on Γ. Let us look for an approximate solution to equation (3.55) in H^{-α} of the form

h_n = \sum_{j=1}^{n} c_j \phi_j(x) + \sum_{i=0}^{\alpha - 1} \sum_{m=1}^{n} (-1)^{|i|} a_{mi}\, b_{mi}(s)\, \delta_\Gamma^{(i)} := h_n^o + h_n^s,   (3.57)

where a_{mi} and c_j are constants, the system {φ_j}, 1 ≤ j < ∞, forms a basis of H⁰ = L²(D), and the systems {b_{mi}(s)}, 1 ≤ m < ∞, form a basis of L²(Γ), 0 ≤ |i| ≤ α − 1. Here h_n^o stands for an approximation of h^o, the ordinary part of the solution h = h^o + h^s to equation (3.55), and h_n^s stands for an approximation of the singular part, h^s, of this solution. If f ∈ H^{2qs - ps}, then G(x), defined by formula (2.16), belongs to H^{2qs}, and h, the solution to (3.55) given by formula (2.15), belongs to H⁰ = L²(D) in the interior of D, h^o = Q(L)G|_D, where the symbol h|_D denotes the restriction of the distribution h to the interior of D. For example, if D = (t − T, t) and h is given by formula (2.49), then h|_D = (−f'' + f)/2. The term

h_n^o := \sum_{j=1}^{n} c_j \phi_j(x)   (3.58)

in (3.57) can approximate h^o with arbitrary accuracy in H⁰ if n is large enough, because the system {φ_j}, 1 ≤ j < ∞, forms a basis of H⁰. (It would be sufficient to assume that {φ_j} is a complete linearly independent system in H⁰.) The term

h_n^s := \sum_{i=0}^{\alpha - 1} \sum_{m=1}^{n} (-1)^{|i|} a_{mi}\, b_{mi}(s)\, \delta_\Gamma^{(i)}   (3.59)

can approximate h^s in H^{-α} with arbitrary accuracy if n is large enough, because the systems {b_{mi}(s)}, 1 ≤ m < ∞, are complete in L²(Γ) and h^s is of the form (see formula (2.16))

h^s = \sum_{i=0}^{\alpha - 1} (-1)^{|i|}\, b_i(s)\, \delta_\Gamma^{(i)},   (3.60)

where the coefficients b_i(s) are the traces on Γ of certain derivatives of f(x) (see formula (2.85), for example). If f(x) is sufficiently smooth, then the functions b_i(s) are in L²(Γ) and can be approximated in L²(Γ) with arbitrary accuracy, if m is large enough, by linear combinations of the functions b_{mi}(s), because the systems {b_{mi}(s)}, 1 ≤ m < ∞, are assumed to be complete in L²(Γ) for any 0 ≤ |i| ≤ α − 1.


Choose the coefficients c_j and a_{mi} in (3.57) so that

\| Rh_n - f \|_1 = \min.   (3.61)

The variational problem (3.61) leads to a linear algebraic system for the coefficients c_j and a_{mi}. The arguments given in Section 3.1 below formula (3.7) remain valid without essential changes. Rather than formulate some general statements, consider an example. Let

\int_D \frac{\exp(-|x - y|)}{4\pi|x - y|}\, h\, dy = f(x),  D ⊂ R^3.   (3.62)

This is equation (2.84) with a = 1. Look for its approximate solution in H^{-1} of the form

h_n = \sum_{j=1}^{n} c_j \phi_j + \sum_{m=1}^{n} a_m b_m(s)\, \delta_\Gamma.   (3.63)

Here α = 1, so that the double sum in (3.59) reduces to the second sum in (3.63); c_j and a_m are the coefficients to be determined by the least squares method (3.61). One has

Rh_n = \sum_{j=1}^{n} c_j \eta_j(x) + \sum_{m=1}^{n} a_m w_m(x),   (3.64)

where

g(x, s) = \frac{\exp(-|x - s|)}{4\pi|x - s|},  \eta_j := R\phi_j(x),   (3.65)

w_m(x) := \int_\Gamma b_m(s)\, g(x, s)\, ds.   (3.66)

Therefore (3.61) yields:

\Big\| \sum_{j=1}^{n} c_j \eta_j(x) + \sum_{m=1}^{n} a_m w_m(x) - f \Big\|_1 = \min.   (3.67)

This leads to the linear system for the 2n coefficients c_j and a_m:

\sum_{j=1}^{2n} a_{ij} \beta_j = \gamma_i,  1 ≤ i ≤ 2n,   (3.68)


where

\beta_j = c_j,  1 ≤ j ≤ n;  \beta_j = a_m,  j = n + m,  1 ≤ m ≤ n,   (3.69)

\gamma_i = (f, \eta_i)_1,  1 ≤ i ≤ n;  \gamma_i = (f, w_m)_1,  i = n + m,  1 ≤ m ≤ n,   (3.70)

a_{ij} := (v_j, v_i)_1,   (3.71)

v_i = \eta_i,  1 ≤ i ≤ n;  v_i = w_m,  i = n + m,  1 ≤ m ≤ n.   (3.72)

Exercise: Under what assumptions can one prove that ‖Rh_n − f‖_1 → 0 as n → ∞ implies ‖h_n^o − h^o‖_0 → 0 and ‖h_n^s − h^s‖_{-1} → 0 as n → ∞?

3.4 Numerical solution based on the approximation of the kernel

Consider the basic equation

Rh := \int_D R(x, y)\, h(y)\, dy = f(x),  x ∈ D ⊂ R^r,   (3.73)

with kernel

R(x, y) = \int_\Lambda R(\lambda)\, \Phi(x, y, \lambda)\, d\rho(\lambda),  R(\lambda) > 0,   (3.74)

where R(λ) is a positive continuous function vanishing at infinity. Let us call this function the spectral density corresponding to the kernel R(x, y). Assume that, for any ε > 0, one can find polynomials P_ε(λ) > 0 and Q_ε(λ) > 0 such that the function R_ε(λ) := P_ε(λ) Q_ε^{-1}(λ) approximates R(λ) in the following sense:

\sup_{\lambda \in \Lambda} |R_\epsilon - R|(1 + \lambda^2)^\beta := \| R_\epsilon - R \|_\beta < \epsilon.   (3.75)

We assume that, for all sufficiently small ε, 0 < ε < ε_0, where ε_0 is a small number, one has

\deg Q_\epsilon(\lambda) - \deg P_\epsilon(\lambda) = 2\beta > 0,   (3.76)

where β does not depend on ε. For example, if

R(\lambda) = \left( \frac{\lambda^2 + 3}{\lambda^2 + 2} \right)^{1/2} \frac{1}{\lambda^2 + 1},  \Lambda = (-\infty, \infty),


then β = 1. We also assume that

\inf_{0 < \epsilon < \epsilon_0}\, \inf_{\lambda \in \Lambda} |R_\epsilon(\lambda)|(1 + \lambda^2)^\beta := \gamma_0 > 0,   (3.77)

and

\inf_{\lambda \in \Lambda} R(\lambda)(1 + \lambda^2)^\beta := \gamma_1 > 0,  \sup_{\lambda \in \Lambda} R(\lambda)(1 + \lambda^2)^\beta := \gamma_2 < \infty.   (3.78)

The basic idea of this section is this: if the operator R_ε : H^{-\beta s} → H^{\beta s}, with the spectral density R_ε(λ), is an isomorphism for all ε ∈ (0, ε_0), and the assumptions (3.75)-(3.78) hold with constants β, c which do not depend on ε, then R, the operator with the kernel R(x, y), is also an isomorphism between H^{-\beta s} and H^{\beta s}. Therefore the properties of the operator R will be expressed in terms of the properties of the rational approximants of its spectral density.

We will need some preparations. First, let us prove a general lemma.

Lemma 3.5  Let H_+ ⊂ H_0 ⊂ H_− be a rigged triple of Hilbert spaces, where H_− is the dual space to H_+ with respect to H_0. Assume that R : H_− → H_+ is a linear map such that

c_1 \| h \|_-^2 ≤ (Rh, h) ≤ c_2 \| h \|_-^2,  ∀h ∈ H_-,   (3.79)

where 0 < c_1 < c_2 are constants, and (f, h) is the value of the functional h ∈ H_− on the element f ∈ H_+. Then

\| R \| ≤ c_2,  \| R^{-1} \| ≤ c_1^{-1},   (3.80)

so that R is an isomorphism of H_− onto H_+. Here ‖R‖ is the norm of the mapping R : H_− → H_+.

Proof. One has

c_1 \| h \|_-^2 ≤ (Rh, h) ≤ \| Rh \|_+ \| h \|_-,

so that

\| Rh \|_+ ≥ c_1 \| h \|_-,  ∀h ∈ H_-.   (3.81)

Therefore R^{-1} is defined on the range of R and

\| R^{-1} \| ≤ c_1^{-1}.   (3.82)


Let us prove that the map R is surjective, that is, the range of R is all of H_+. If it is not, then there exists a φ ∈ H_−, φ ≠ 0, such that

(Rh, \phi) = 0,  ∀h ∈ H_-.   (3.83)

It follows from (3.83) that

(h, R\phi) = 0,  ∀h ∈ H_-.   (3.84)

Therefore Rφ = 0 and

0 = (R\phi, \phi) ≥ c_1 \| \phi \|_-^2.   (3.85)

Thus φ = 0, contrary to the assumption. Therefore the map R is surjective. Let us now prove the first inequality (3.80). One has

\begin{aligned}
\| R \| &= \sup |(Rg, h)| = \sup |\operatorname{Re}(Rg, h)| \\
&= \sup \frac{|(R(h + g), h + g) - (R(h - g), h - g)|}{4} \\
&≤ \frac{c_2}{4} \sup \big\{ \| h + g \|_-^2 + \| h - g \|_-^2 \big\} ≤ c_2,
\end{aligned}   (3.86)

where the supremum is taken over all h, g ∈ H_− such that

\| h \|_- ≤ 1,  \| g \|_- ≤ 1.   (3.87)

Remark 3.2  The surjectivity of the map R : H_− → H_+ follows also from the fact that R is a coercive, monotone and continuous mapping (see, e.g., [Deimling (1985), p. 100]).

Let us now prove

Lemma 3.6  Let R_ε : H_− → H_+ be an isomorphism for all ε ∈ (0, ε_0), where ε_0 > 0 is a small fixed number. Let R : H_− → H_+ be a linear map defined on all of H_−. Assume that

\| R_\epsilon^{-1} \| ≤ M   (3.88)

and

\| R - R_\epsilon \| < \epsilon,   (3.89)


where M = const > 0 does not depend on ε ∈ (0, ε_0). Then R : H_− → H_+ is an isomorphism and

\| R^{-1} \| ≤ M(1 - \epsilon M)^{-1} \ \text{for } \epsilon M < 1.   (3.90)

Proof. One has

R = R_\epsilon + R - R_\epsilon = R_\epsilon [I + R_\epsilon^{-1}(R - R_\epsilon)],   (3.91)

where I is the identity operator on H_−. The operator R_\epsilon^{-1}(R - R_\epsilon) is an operator from H_− into H_− and

\| R_\epsilon^{-1}(R - R_\epsilon) \| ≤ \epsilon M   (3.92)

because of (3.88) and (3.89). If εM < 1, then the operator I + R_\epsilon^{-1}(R - R_\epsilon) is an isomorphism of H_− onto H_−, and

\| [I + R_\epsilon^{-1}(R - R_\epsilon)]^{-1} \| ≤ (1 - \epsilon M)^{-1}.   (3.93)

Therefore, the operator R is an isomorphism of H_− onto H_+,

R^{-1} = [I + R_\epsilon^{-1}(R - R_\epsilon)]^{-1} R_\epsilon^{-1},   (3.94)

and

\| R^{-1} \| ≤ M(1 - \epsilon M)^{-1},  \epsilon M < 1.   (3.95)

Lemma 3.6 is proved.

Let us choose H_− = H^{-\beta s}, H_0 = H⁰ = L²(D), and H_+ = H^{\beta s}, where s = ord L and L is the elliptic operator which defines the kernel R ∈ R of the equation (3.73). Note that, by Parseval's equality, one has:

(Rh, h) = \int_\Lambda R(\lambda)\, |\tilde h(\lambda)|^2\, d\rho(\lambda) ≤ \gamma_2 \int_\Lambda (1 + \lambda^2)^{-\beta} |\tilde h(\lambda)|^2\, d\rho(\lambda) = \gamma_2 \| h \|_{-\beta s}^2,   (3.96)

and, similarly,

\gamma_1 \| h \|_{-\beta s}^2 ≤ (Rh, h),   (3.97)

where γ_1 and γ_2 are the constants from condition (3.78). From (3.96), (3.97) and Lemma 3.5, one obtains

Lemma 3.7  If the spectral density R(λ) of the kernel (3.74) satisfies conditions (3.78) with some β > 0, then the operator R with kernel R(x, y),


defined by formula (3.74), is an isomorphism between the spaces H^{-\beta s}(D) and H^{\beta s}(D), and

\| R \| ≤ \gamma_2,  \| R^{-1} \| ≤ \gamma_1^{-1},   (3.98)

where γ_1 and γ_2 are the constants from condition (3.78).

Let us discuss briefly the approximation problem. Let R(λ) be a continuous positive function such that conditions (3.78) hold and

\lim_{\lambda \to \pm\infty} R(\lambda)\lambda^{2\beta} = \gamma_3,   (3.99)

where β is a positive integer, and let λ = tg(φ/2). Then λ runs through the real axis, −∞ < λ < ∞, if φ runs through the unit circle, −π ≤ φ ≤ π. Because of the assumption (3.99), one can identify +∞ and −∞, and consider the function R(λ) as a function

R(\phi) := R\left( \text{tg}\,\frac{\phi}{2} \right),  -\pi ≤ \phi ≤ \pi,   (3.100)

which is a 2π-periodic function defined on the unit circle. Since

\sin\phi = \frac{2\lambda}{1 + \lambda^2},  \cos\phi = \frac{1 - \lambda^2}{1 + \lambda^2},   (3.101)

and cos(mφ) is a polynomial of degree m in cos φ, while sin((m + 1)φ)/sin φ is also a polynomial of degree m in cos φ, a trigonometric polynomial

S_n(\phi) := a_0 + \sum_{j=1}^{n} \{ a_j \cos(j\phi) + b_j \sin(j\phi) \},   (3.102)

where a_j and b_j are constants, can be written as

S_n(\phi) := R_n(\lambda) = \frac{\sum_{m=0}^{2n} c_m \lambda^m}{(1 + \lambda^2)^n},   (3.103)

where c_m are some constants. Therefore, if one wishes to approximate a function R(λ) on the whole axis (−∞, ∞) by a rational function, one can approximate the function R(φ), defined by formula (3.100) on the unit circle, by a trigonometric polynomial S_n(φ), and then write this polynomial as a rational function of λ as in formula (3.103). The function R_n(λ), defined by formula (3.103), satisfies the condition

R_n(\lambda) \sim \gamma_3 \lambda^{-2\beta} \ \text{as } \lambda \to \pm\infty   (3.104)


if and only if β, 0 < β < n, is an integer and

c_{2n} = c_{2n-1} = \cdots = c_{2n-2\beta+1} = 0.   (3.105)

If (3.105) holds, then the constant γ_3 in formula (3.104) equals c_{2n-2\beta}. The theory of approximation by trigonometric polynomials is well developed (see [Akhieser (1965)]). In particular, if condition (3.99) holds, then approximation in the norm

\sup_{\lambda \in R^1} (1 + \lambda^2)^\beta |R(\lambda) - R_\epsilon(\lambda)| := \| R - R_\epsilon \|_\beta   (3.106)

is possible. The norm (3.106) is the norm (3.75) with Λ = R¹. We keep the same notation for these two norms since there is no danger of confusing them: in all our arguments they are interchangeable.

Lemma 3.8  If R(λ) is a continuous function defined on R¹ = (−∞, ∞) which satisfies condition (3.99), then, for any ε > 0, there exists a rational function R_ε = P_ε(λ)Q_ε^{-1}(λ) such that

\| R(\lambda) - R_\epsilon(\lambda) \|_\beta < \epsilon,   (3.107)

and condition (3.76) holds. If R(λ) > 0, then the polynomials P_ε(λ) and Q_ε(λ) can be chosen positive.

Proof. The function ψ(λ) := (1 + λ²)^β R(λ) is continuous on R¹ and

\lim_{\lambda \to \pm\infty} (1 + \lambda^2)^\beta R(\lambda) = \gamma_3   (3.108)

because of (3.99). The function ψ(tg(φ/2)), tg(φ/2) = λ, is a continuous function of φ on the interval [−π, π]. Therefore, for any given ε > 0, there exists a trigonometric polynomial S_n(φ) such that

\max_{-\pi ≤ \phi ≤ \pi} \left| \psi\!\left( \text{tg}\,\frac{\phi}{2} \right) - S_n(\phi) \right| < \epsilon,  \lambda = \text{tg}\,\frac{\phi}{2},   (3.109)

where n = n(ε) depends on ψ. From (3.109) and (3.103) it follows that

\max_{-\infty < \lambda < \infty} \left| (1 + \lambda^2)^\beta R(\lambda) - \sum_{m=0}^{2n} c_m \lambda^m (1 + \lambda^2)^{-n} \right| < \epsilon.   (3.110)


Therefore (3.107) holds with

P_\epsilon(\lambda) := \sum_{m=0}^{2n} c_m \lambda^m,  Q_\epsilon(\lambda) := (1 + \lambda^2)^{n+\beta}.   (3.111)

In order to prove the last statement of Lemma 3.8, one first approximates the continuous function (1 + λ²)^{β/2} R^{1/2}(λ) by a rational function T(λ) in the uniform norm on R¹ with accuracy ≤ ε, where ε > 0 is a given number. Then the square of this rational function approximates the function (1 + λ²)^β R(λ) with accuracy const · ε, where the constant does not depend on ε. Indeed, if |f − T| < ε, then

|f^2 - T^2| ≤ |f - T|(\max|f| + \max|T|) ≤ \epsilon \cdot \text{const}.   (3.112)

If T is a rational function, then T²(λ) = P(λ)Q^{-1}(λ), where P(λ) and Q(λ) are positive. Lemma 3.8 is proved.
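The construction behind Lemma 3.8 can be sketched numerically. In the sketch below the spectral density is the example given after (3.76), β = 1, and the trigonometric polynomial S_n(φ) of (3.102)-(3.103) is produced by trigonometric interpolation (an FFT) rather than by best uniform approximation; the grid size M is an illustrative choice.

    import numpy as np

    beta, M = 1, 64
    R = lambda lam: np.sqrt((lam**2 + 3.0) / (lam**2 + 2.0)) / (lam**2 + 1.0)
    psi = lambda lam: (1.0 + lam**2) ** beta * R(lam)      # bounded on the whole axis

    phi = -np.pi + 2.0 * np.pi * np.arange(M) / M          # lambda = tg(phi/2)
    g = psi(np.tan(phi / 2.0))                             # periodic samples of psi
    g_hat = np.fft.fft(g) / M
    freqs = np.fft.fftfreq(M, d=1.0 / M)                   # integer frequencies

    def R_eps(lam):
        # S_n(phi) evaluated at phi = 2*arctan(lambda), divided by (1 + lambda^2)^beta
        ph = 2.0 * np.arctan(lam)
        S = np.real(np.sum(g_hat * np.exp(1j * freqs * (ph + np.pi))))
        return S / (1.0 + lam**2) ** beta

    lam_test = np.linspace(-50.0, 50.0, 2001)
    err = max((1.0 + l**2) ** beta * abs(R(l) - R_eps(l)) for l in lam_test)
    print(err)   # the weighted sup error of (3.75); it decreases rapidly as M grows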

Let us summarize the results in the following theorem.

Theorem 3.1  Let R(λ) be a continuous positive function on R¹ and suppose condition (3.99) holds with a positive integer β. Then:

a) for any ε > 0, there exists a positive rational function R_ε(λ) such that conditions (3.75) and (3.76) hold;

b) for all sufficiently small ε, 0 ≤ ε ≤ ε_0, the operator R_ε, with the kernel defined by the spectral density R_ε(λ), is an isomorphism of the space H^{-\beta s}(D) := H^{-\beta s} onto H^{\beta s}(D) := H^{\beta s};

c) the operator R : H^{-\beta s} → H^{\beta s} is an isomorphism;

d) there exist positive constants γ_0, γ_1, and γ_2 such that conditions (3.77) and (3.78) hold;

e) the following estimates hold:

\| R \| ≤ \gamma_2,  \| R^{-1} \| ≤ \gamma_1^{-1},   (3.113)

where γ_1 and γ_2 are the constants in formula (3.78);

f) if γ_1^{-1}ε < 1, then

\| R_\epsilon^{-1} \| ≤ \gamma_1^{-1}(1 - \gamma_1^{-1}\epsilon)^{-1}   (3.114)

and

\| R^{-1} - R_\epsilon^{-1} \| ≤ \epsilon\gamma_1^{-2}(1 - \gamma_1^{-1}\epsilon)^{-1}.   (3.115)


Proof. The statements (a) to (e) of Theorem 3.1 follow from Lemmas 3.5-3.8. The statement (3.114) is analogous to (3.95) and can be proved similarly. The last statement (3.115) follows immediately from the identity

R^{-1} - R_\epsilon^{-1} = R_\epsilon^{-1}(R_\epsilon - R)R^{-1},   (3.116)

estimate (3.114), the second estimate in (3.113), and the estimate

\| R_\epsilon - R \| ≤ \epsilon,   (3.117)

which is a consequence of (3.75). Let us explain why (3.75) implies (3.117). One has

\begin{aligned}
\| R \| &= \sup_{\|h\|_- ≤ 1} |(Rh, h)| = \sup_{\|h\|_- ≤ 1} \int R(\lambda)\, |\tilde h(\lambda)|^2\, d\rho(\lambda) \\
&≤ \sup_{\lambda \in R^1} (1 + \lambda^2)^\beta |R(\lambda)| \cdot \sup_{\|h\|_- ≤ 1} \int (1 + \lambda^2)^{-\beta} |\tilde h|^2\, d\rho(\lambda) \\
&= \sup_{\lambda \in R^1} (1 + \lambda^2)^\beta |R(\lambda)| \cdot \sup_{\|h\|_- ≤ 1} \| h \|_-^2 \\
&= \sup_{\lambda \in R^1} (1 + \lambda^2)^\beta |R(\lambda)|.
\end{aligned}   (3.118)

Here H_− = H^{-\beta s}. Thus

\| R \| ≤ \sup_{\lambda \in R^1} (1 + \lambda^2)^\beta |R(\lambda)|.   (3.119)

From (3.75) and (3.119) one obtains (3.117). Theorem 3.1 is proved.

It is now easy to study the stability of the numerical solution of equation (3.73) based on the approximation of R(x, y). Consider the equation

Rh_\delta = f_\delta,  f_\delta ∈ H_+,   (3.120)

where f_δ is the noisy data:

\| f_\delta - f \|_+ ≤ \delta.   (3.121)

This means that, in place of the exact data f ∈ H_+, approximate data f_δ are given, where δ > 0 is the accuracy with which the given data approximate in H_+ the exact data. Suppose that R(λ), the spectral density of the given kernel, satisfies condition (3.99) with an integer β > 0. Take a kernel R_ε ∈ R such that the estimate (3.117) holds with ε > 0 sufficiently small.


This is possible by Theorem 3.1. Then estimate (3.115) holds. Consider the equation

R_\epsilon h_{\epsilon\delta} = f_\delta.   (3.122)

By Theorem 3.1 the operator R_ε : H_− → H_+ is an isomorphism if ε > 0 is sufficiently small, 0 < ε < ε_0. Therefore, for such ε equation (3.122) is uniquely solvable in H_−. We wish to estimate the error of the approximate solution:

\begin{aligned}
\| h - h_{\epsilon\delta} \|_- &= \| R^{-1}f - R_\epsilon^{-1}f_\delta \|_- \\
&≤ \| R^{-1}(f - f_\delta) \|_- + \| (R^{-1} - R_\epsilon^{-1}) f_\delta \|_- \\
&≤ \| R^{-1} \| \| f - f_\delta \|_+ + \| R^{-1} - R_\epsilon^{-1} \| \| f_\delta \|_+ \\
&≤ \gamma_1^{-1}\delta + \epsilon\gamma_1^{-2}(1 - \epsilon\gamma_1^{-1})^{-1} \| f_\delta \|_+.
\end{aligned}   (3.123)

In our case H_− = H^{-\beta s}, H_+ = H^{\beta s}. Estimate (3.123) proves that the error of the approximate solution goes to zero as the accuracy of the data increases, that is, as δ → 0. Indeed, one can choose ε > 0 so small that the second term on the right side of the inequality (3.123) is arbitrarily small, say less than δ. Then the right side of (3.123) is not more than (γ_1^{-1} + 1)δ. We have proved

Lemma 3.9  The error estimate for the approximate solution h_{εδ} is given by the inequality

\| h - h_{\epsilon\delta} \|_- ≤ \gamma_1^{-1}\delta + \epsilon\gamma_1^{-2}(1 - \epsilon\gamma_1^{-1})^{-1} \| f_\delta \|_+,   (3.124)

where γ_1 is the constant in condition (3.78).

3.5 Asymptotic behavior of the optimal filter as the white noise component goes to zero

Consider the equation

\epsilon h_\epsilon + R h_\epsilon = f,  \epsilon > 0,   (3.125)

where R ∈ R, or, more generally, R is an isomorphism between H_− and H_+, where H_+ ⊂ H_0 ⊂ H_− is a rigged triple of Hilbert spaces. We wish to study the behavior as ε → 0 of h_ε, the optimal filter. This question is of theoretical and practical interest, as was explained in the Introduction. It


will be discussed in depth in Chapter 5. We assume that the estimate

c_1 \| h \|_-^2 ≤ (Rh, h) ≤ c_2 \| h \|_-^2,  ∀h ∈ H_-,   (3.126)

holds, where c_1 > 0 and c_2 > 0 are constants and the parentheses denote the pairing between H_− and H_+.

From (3.125) it follows that

\epsilon \| h_\epsilon \|_0^2 + (Rh_\epsilon, h_\epsilon) = (f, h_\epsilon),   (3.127)

where the parentheses denote the inner product in H_0, which is the pairing between H_− and H_+ (see Section 8.1). It follows from (3.126) and (3.127) that

c_1 \| h_\epsilon \|_-^2 ≤ (Rh_\epsilon, h_\epsilon) ≤ (f, h_\epsilon) ≤ \| f \|_+ \| h_\epsilon \|_-.   (3.128)

Thus

\| h_\epsilon \|_- ≤ c \| f \|_+,  c = c_1^{-1},   (3.129)

where the constant c > 0 does not depend on ε. Since H_− is a Hilbert space, and bounded sets are weakly compact in Hilbert spaces, inequality (3.129) implies that there is a weakly convergent subsequence of h_ε, which we denote again h_ε, so that

h_\epsilon \rightharpoonup h \ \text{in } H_- \ \text{as } \epsilon \to 0.   (3.130)

Here ⇀ denotes weak convergence in H_−, which means that for any f ∈ H_+ one has

(f, h_\epsilon) \to (f, h) \ \text{as } \epsilon \to 0,  ∀f ∈ H_+.   (3.131)

Let φ ∈ H_+ be arbitrary. It follows from (3.125) that

\epsilon(h_\epsilon, \phi) + (Rh_\epsilon, \phi) = (f, \phi),

or, since (Rh, φ) = (h, Rφ) and Rφ ∈ H_+,

\epsilon(h_\epsilon, \phi) + (h_\epsilon, R\phi) = (f, \phi),  ∀\phi ∈ H_+.   (3.132)

One has

|\epsilon(h_\epsilon, \phi)| ≤ \epsilon \| h_\epsilon \|_- \| \phi \|_+ \to 0 \ \text{as } \epsilon \to 0,   (3.133)


where we used estimate (3.129). Therefore one can pass to the limit ε → 0 in equation (3.132) and obtain

(h, R\phi) = (f, \phi),  ∀\phi ∈ H_+,   (3.134)

or

(Rh - f, \phi) = 0,  ∀\phi ∈ H_+,   (3.135)

where h ∈ H_− is the weak limit (3.130). Since H_+ ⊂ H_− is dense in H_− in the norm of H_−, one concludes from (3.135) that

Rh = f.   (3.136)

We have proved the following theorem.

Theorem 3.2  Let H_+ ⊂ H_0 ⊂ H_− be a triple of rigged Hilbert spaces. If R : H_− → H_+ is an isomorphism and (3.126) holds, then the unique solution to equation (3.125) converges weakly in H_− to the unique solution to the limit equation (3.136).

Remark 3.3 The weak convergence in $H_-$ is exactly what is natural in estimation theory. Indeed, the estimate
$$u = \int_D h(x, y) f(y)\, dy = (h_x, f) \qquad (3.137)$$
is the value of the functional (3.137) at the element $f$, where $h_x = h(x, y) \in H_-$, $f \in H_+$, and $x \in \mathbb{R}^r$ is a parameter. The errors of the optimal estimate (see, e.g., formulas (2.96) and (2.108)) are also expressed as values of functionals of the form (3.137).

One can prove that actually $h_\varepsilon$ converges strongly in $H_-$. Indeed, equation (3.125) implies $(h_\varepsilon, h_\varepsilon)_- = (Rh_\varepsilon, h_\varepsilon) \le (f, h_\varepsilon) = (Rh, h_\varepsilon) = (h, h_\varepsilon)_-$. Thus $\|h_\varepsilon\|_- \le \|h\|_-$. Choose a sequence $h_n := h_{\varepsilon_n}$, $\lim_{n\to\infty}\varepsilon_n = 0$, which converges weakly in $H_-$. Then $h_n \rightharpoonup h$ in $H_-$, so $\|h\|_- \le \liminf_{n\to\infty}\|h_n\|_-$, while $\limsup_{n\to\infty}\|h_n\|_- \le \|h\|_-$. Consequently, $\lim_{n\to\infty}\|h_n\|_- = \|h\|_-$. This and the weak convergence $h_n \rightharpoonup h$ in $H_-$ imply strong convergence in $H_-$, so $\lim_{n\to\infty}\|h - h_n\|_- = 0$.
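The last implication is the standard Hilbert-space argument; for the reader's convenience we write it out (an added computation):
$$\|h - h_n\|_-^2 = \|h\|_-^2 - 2\,\mathrm{Re}\,(h, h_n)_- + \|h_n\|_-^2 \longrightarrow \|h\|_-^2 - 2\|h\|_-^2 + \|h\|_-^2 = 0,$$
since $(h, h_n)_- \to (h, h)_- = \|h\|_-^2$ by the weak convergence and $\|h_n\|_- \to \|h\|_-$ as shown above.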


3.6 A general approach

In this section we outline an approach to solving the equation
$$Rh = f \qquad (3.138)$$
which is based on the theory developed in Section 8.1.

Assume that $R : H \to H$ is a compact positive operator in the Hilbert space $H$, that is,
$$(Rh, h) > 0 \qquad \forall h \in H, \ h \ne 0. \qquad (3.139)$$
The parentheses denote the inner product in $H$. The inner product
$$(h, g)_- := (Rh, g) \qquad (3.140)$$
induces on $H$ a new norm
$$\| h \|_- = (Rh, h)^{1/2} = \| R^{1/2} h \|. \qquad (3.141)$$
Let $H_-$ be the Hilbert space with the inner product (3.140) which is the completion of $H$ in the norm (3.141). By $H_+$ we denote the dual of $H_-$ with respect to $H = H_0$ (see Section 8.1). One has
$$H_+ \subset H \subset H_-, \qquad (3.142)$$
where $H_+$ is dense in $H$ and $H$ is dense in $H_-$. The inner product in $H_+$ is
$$(u, v)_+ = (R^{-1} u, v) = (R^{-1/2} u, R^{-1/2} v), \qquad u, v \in \mathrm{Dom}(R^{-1/2}). \qquad (3.143)$$
Therefore
$$\| u \|_+ = \| R^{-1/2} u \|. \qquad (3.144)$$
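For orientation we add the spectral form of these norms (a direct consequence of (3.141) and (3.144), not needed in what follows): if $R$ is compact with eigenpairs $R\phi_j = \lambda_j\phi_j$, $\lambda_j > 0$, $(\phi_j, \phi_i) = \delta_{ij}$, as in Lemma 3.10 below, then
$$\|h\|_-^2 = \sum_{j \ge 1} \lambda_j |(h, \phi_j)|^2, \qquad \|u\|_+^2 = \sum_{j \ge 1} \lambda_j^{-1} |(u, \phi_j)|^2,$$
so that $H_-$ is larger than $H$, while $H_+ = \mathrm{Ran}(R^{1/2})$ is smaller.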

One can see that $H_+$ is the range of $R^{1/2}$. Indeed, $\mathrm{Ran}(R^{1/2}) \subset H_+$ by definition, and it is closed in the $H_+$ norm: let $f_n = R^{1/2} u_n$ and assume that $\| f_n - f_m \|_+ \to 0$ as $n, m \to \infty$. Then, by (3.144), $\| u_n - u_m \| \to 0$ as $n, m \to \infty$. Therefore there exists a $u \in H$ such that $\| u_n - u \| \to 0$ as $n \to \infty$. Let $f := R^{1/2} u$. Then $f \in H_+$ and $\| f - f_n \|_+ \to 0$ as $n \to \infty$. Thus $\mathrm{Ran}(R^{1/2})$ is closed in $H_+$, where $\mathrm{Ran}(A)$ denotes the range of an operator $A$. Since $H_+$ is the completion


of $\mathrm{Ran}(R^{1/2})$ in the $H_+$ norm, it follows that $H_+ = \mathrm{Ran}(R^{1/2})$. One can also define the norm in $H_+$ as
$$\| u \|_+ = \sup_{h \in H,\ h \ne 0} \frac{|(u, R^{-1/2} h)|}{\| h \|}. \qquad (3.145)$$
The right side of (3.145) is finite if and only if $u \in \mathrm{Dom}(R^{-1/2})$, and in this case (3.145) reduces to (3.144). The triple (3.142)

is a triple of rigged Hilbert spaces (see Section 8.1), and $R : H_- \to H_+$ is an isomorphism. Therefore, a general approach to stable solution of equation (3.138) can be described as follows. Suppose an operator $A : H \to H$ is found such that $A > 0$ and the norm $(Au, u)^{1/2}$ is equivalent to the norm (3.141). In this case the spaces $H_+$ and $H_-$ constructed with the help of $A$ consist of the same elements as the spaces $H_+$ and $H_-$ constructed above, and the norms of these spaces are equivalent, so that one can identify these spaces. Suppose that one can construct the mapping $A^{-1} : H_+ \to H_-$. This mapping is an isomorphism between $H_+$ and $H_-$. Then equation (3.138) can be written as
$$Bh := A^{-1} R h = A^{-1} f := g, \qquad (3.146)$$
where $B : H_- \to H_-$ is an isomorphism. Therefore equation (3.138) is reduced to the equation
$$Bh = g, \qquad (3.147)$$
which is an equation in the Hilbert space $H_-$ with a linear bounded operator $B$ which is an isomorphism of $H_-$ onto $H_-$. The operator $B$ in equation (3.147) is selfadjoint and positive in $H_-$. Indeed,
$$(Bh, v)_- = (RBh, v) = (RA^{-1}Rh, v) = (h, RA^{-1}Rv) = (h, Bv)_-. \qquad (3.148)$$
Moreover,
$$(Bh, h)_- = (A^{-1}Rh, Rh) > 0 \quad \text{for } h \ne 0, \qquad (3.149)$$
since $A$ and $R$ are positive.

Equation (3.147), with an isomorphism $B$ from $H_-$ onto $H_-$ which is positive in the sense (3.149), can be easily solved numerically by iterative or projection methods described in Section 3.2.
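The following minimal numerical sketch (an added illustration, not the book's algorithm) shows one such iterative solution. It uses the illustrative kernel $R(x,y) = e^{-|x-y|}$ on $D = (0,1)$ and the conjugate gradient method, a standard iterative scheme for equations with a positive selfadjoint operator such as $B$ in (3.147). In a finite-dimensional Nystrom discretization all norms are equivalent, so $A = I$ is an admissible choice and then $B = R$, $g = f$.

```python
import numpy as np

# Toy stand-in for the iterative/projection methods of Section 3.2:
# solve the discretized equation R h = f by conjugate gradients.

n = 200
x = np.linspace(0.0, 1.0, n)
w = 1.0 / n                                        # quadrature weight
R = np.exp(-np.abs(x[:, None] - x[None, :])) * w   # Nystrom matrix (symmetric positive definite)
f = np.cos(np.pi * x)

def cg(A, b, tol=1e-10, maxit=2000):
    """Conjugate gradient iteration for a symmetric positive definite matrix A."""
    h = np.zeros_like(b)
    r = b - A @ h
    p = r.copy()
    rs = r @ r
    for _ in range(maxit):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        h += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return h

h = cg(R, f)
print("residual:", np.linalg.norm(R @ h - f))
```

The discrete solution, of course, only mimics the distributional solution of the continuous problem; the sketch is meant solely to illustrate the iterative solution of $Bh = g$ with a positive selfadjoint $B$.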

Let us now describe some connections between the concepts of this section and the well-known concept of a reproducing kernel.


Definition 3.1 A kernel $K(x, y)$, $x, y \in D \subset \mathbb{R}^r$, is called a reproducing kernel for a Hilbert space $H_+$ of functions defined on $D$, where $D$ is a (not necessarily bounded) domain in $\mathbb{R}^r$, if for any $u \in H_+$ one has
$$(K(x, y), u(y))_+ = u(x). \qquad (3.150)$$
It is assumed that, for every $x \in D$, $K(x, y) \in H_+$, and that $H_+$ consists of functions whose values at a point are well defined.

From (3.150) it follows that the kernel $K(x, y)$ is nonnegative definite, and
$$K(x, x) \ge 0, \qquad K(x, y) = K^*(y, x), \qquad |K(x, y)|^2 \le K(x, x) K(y, y), \qquad (3.151)$$
as we will prove shortly. The reproducing kernel, if it exists for a Hilbert space $H_+$, is unique. Indeed, if $K_1$ is another reproducing kernel, then
$$(K(x, y) - K_1(x, y), u(y))_+ = 0 \qquad \forall u \in H_+. \qquad (3.152)$$
Therefore $K(x, y) = K_1(x, y)$.

The reproducing kernel exists if and only if the estimate
$$|u(x)| \le c \| u \|_+ \qquad \forall u \in H_+ \qquad (3.153)$$
holds with a positive constant $c$ which does not depend on $u$. Indeed,
$$|u(x)| = \big| (K(x, y), u(y))_+ \big| \le \| K(x, y) \|_+ \| u \|_+ . \qquad (3.154)$$
Thus (3.153) holds with $c = \| K(x, y) \|_+$. Note that
$$\| K(x, y) \|_+^2 = (K(x, y), K(x, y))_+ = K(x, x) \qquad (3.155)$$
because of (3.150). Conversely, if (3.153) holds then, by Riesz's theorem on linear functionals on a Hilbert space, there exists a $K(x, y)$ such that (3.150) holds.
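A classical example, added here for illustration (it is not taken from the text, but it is consistent with the kernel $e^{-|x-y|}$ appearing in Example 4.1 below): for the Sobolev space $H_+ = H^1(\mathbb{R}^1)$ with inner product $(u, v)_+ = \int_{\mathbb{R}} (u'v' + uv)\, dx$, estimate (3.153) holds and the reproducing kernel is
$$K(x, y) = \tfrac{1}{2} e^{-|x-y|}, \qquad -K_{yy} + K = \delta(x - y),$$
since, for real-valued $u$, one integration by parts gives $(K(x, \cdot), u)_+ = \int_{\mathbb{R}} \big( K_y(x, y) u'(y) + K(x, y) u(y) \big)\, dy = \int_{\mathbb{R}} \big( -K_{yy} + K \big) u\, dy = u(x)$.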

Since (3.150) implies that, for any points $x_1, \dots, x_n \in D$ and any numbers $t_j$, $1 \le j \le n$, one has
$$\sum_{i,j=1}^{n} K(x_i, x_j) t_i t_j^* = \Big( \sum_{j=1}^{n} K(x_j, y) t_j,\ \sum_{i=1}^{n} K(x_i, y) t_i \Big)_+ \ge 0, \qquad (3.156)$$
one sees that the matrix $K(x_i, x_j)$ is nonnegative definite, and therefore (3.151) holds.

Lemma 3.10 Assume that $D \subset \mathbb{R}^r$ is a bounded domain and that the kernel $R(x, y)$ of the operator $R : H \to H$, $H = L^2(D)$, is nonnegative definite and continuous in $x, y \in \overline{D}$. Then the Hilbert space $H_+$ generated by $R$ (see formula (3.143)) is a Hilbert space with reproducing kernel $R(x, y)$.

Proof. If $u \in H_+$ then $u \in \mathrm{Ran}(R^{1/2})$, so that there is a $v$ such that
$$u = R^{1/2} v, \qquad v \in H. \qquad (3.157)$$
If we prove that the operator $R^{1/2}$ is an integral operator,
$$u(x) = R^{1/2} v = \int_D T(x, y) v(y)\, dy, \qquad (3.158)$$
such that the function
$$t(x) := \Big( \int_D |T(x, y)|^2\, dy \Big)^{1/2} \qquad (3.159)$$
is continuous in $D$, then (3.158) and (3.144) imply
$$|u(x)| \le t(x) \| v \| = t(x) \| R^{-1/2} u \| = t(x) \| u \|_+ . \qquad (3.160)$$
This is an estimate of the form (3.153), and we have proved that such an estimate implies that $H_+$ has a reproducing kernel $K(x, y)$.

To finish the proof one has to prove (3.158) and (3.159). Since $D$ is bounded, the operator $R : H \to H$ with a continuous kernel is in the trace class. This means that
$$R(x, y) = \sum_{j=1}^{\infty} \lambda_j \phi_j(x) \phi_j^*(y), \qquad (3.161)$$
where $\lambda_1 \ge \lambda_2 \ge \dots > 0$ are the eigenvalues of $R$ counted according to their multiplicities, $\phi_j$ are the normalized eigenfunctions,
$$\int_D R(x, y) \phi_j(y)\, dy = \lambda_j \phi_j(x), \qquad (\phi_j, \phi_i) = \delta_{ij}, \qquad (3.162)$$
and
$$\mathrm{Tr}\, R = \sum_{j=1}^{\infty} \lambda_j = \int_D R(x, x)\, dx < \infty. \qquad (3.163)$$

We will explain the second equality in (3.163) later. The operator $R^{1/2}$ has the kernel
$$T(x, y) = \sum_{j=1}^{\infty} \lambda_j^{1/2} \phi_j(x) \phi_j^*(y), \qquad (3.164)$$
which can be easily checked: the kernel of $R$ is the composition
$$T \circ T := \int_D T(x, z) T(z, y)\, dz = R(x, y). \qquad (3.165)$$
Therefore
$$\int_D |T(x, y)|^2\, dy = \sum_{j=1}^{\infty} \lambda_j |\phi_j(x)|^2 = R(x, x). \qquad (3.166)$$
Therefore (3.158) and (3.159) are proved, and
$$t(x) = [R(x, x)]^{1/2}. \qquad (3.167)$$

Let us finally sketch the proof of the second equality in (3.163). This equality is well known, and the proof is sketched for the convenience of the reader. It is sufficient to use

Mercer's theorem: If $R(x, y)$ is continuous and nonnegative definite, that is,
$$\int_D \int_D R(x, y) h(y) h^*(x)\, dy\, dx \ge 0 \qquad \forall h \in L^2(D), \qquad (3.168)$$
then the series (3.161) converges absolutely and uniformly in $D \times D$.

If Mercer's theorem is applied to the series (3.161) with $x = y$, then
$$\int_D R(x, x)\, dx = \sum_{j=1}^{\infty} \lambda_j \int_D |\phi_j|^2\, dx = \sum_{j=1}^{\infty} \lambda_j . \qquad (3.169)$$
Thus, (3.163) holds.

To prove Mercer’s theorem, note that the kernel Rn(x, y) := R(x, y) −∑nj=1 λjφj(x)φ

∗j (y) is nonnegative definite for every n. Therefore

Rn(x, x) ≥ 0 so that

n∑

j=1

λj |φj(x)|2 ≤ R(x, x) ∀n. (3.170)

Therefore the series∑∞

j=1 λj |φj(x)|2 ≤ R(x, x) ≤ c converges and c does

not depend on x because R(x, x) is a continuous function in D. Thus, the

series

R(x, y) =

∞∑

j=1

λjφj(x)φ∗j (y) (3.171)

Page 75: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

62 Random Fields Estimation Theory

converges uniformly in x for each y ∈ D. Indeed:

∣∣∣∣∣

n∑

m

λjφj(x)φ∗j (y)

∣∣∣∣∣

2

≤n∑

m

λj |φj(x)|2 ·n∑

m

λj |φj(y)|2

≤ c

n∑

m

λj|φj(y)|2 → 0 as m,n→ ∞. (3.172)

Take y = x in (3.171) and get

R(x, x) =

∞∑

j=1

λj|φj(x)|2. (3.173)

Since R(x, y) is continuous in D × D, the functions φj(x) are continuous

in D. By Dini’s lemma the series (3.173) converges uniformly in x ∈ D.

Therefore the series (3.173) can be termwise integrated which gives (3.169).

Lemma 3.10 is proved.
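As an illustration of (3.161)-(3.163) we add a classical example (not from the text): for the covariance kernel $R(x, y) = \min(x, y)$ on $D = (0, 1)$, which is continuous and nonnegative definite, the eigenpairs are
$$\lambda_j = \frac{1}{(j - \tfrac12)^2 \pi^2}, \qquad \phi_j(x) = \sqrt{2}\,\sin\big((j - \tfrac12)\pi x\big), \qquad j = 1, 2, \dots,$$
and the trace formula (3.163) is confirmed directly:
$$\sum_{j=1}^{\infty} \lambda_j = \frac{1}{\pi^2} \sum_{j=1}^{\infty} \frac{1}{(j - \tfrac12)^2} = \frac{1}{\pi^2} \cdot \frac{\pi^2}{2} = \frac12 = \int_0^1 \min(x, x)\, dx = \int_0^1 x\, dx .$$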

Exercise Prove Dini’s lemma: if a monotone sequence of continuous

functions on a compactum D ⊂ Rr converges to a continuous function,

then it converges uniformly.

In Definition 3.1 we assume that the space $H_+$ with reproducing kernel consists of functions $u(x)$ whose values at a point are well defined. This excludes spaces $L^2(D)$, for example. If the definition (3.150) is understood in the sense that both sides of (3.150) are equal as elements of $H_+$ (and not pointwise), then spaces of $L^2$ type can be included. However, in general, for such spaces the reproducing kernel is not necessarily an element of the space. For example, if $H_+ = L^2(D)$ then (3.150) implies that $K(x, y)$ is the kernel of the identity operator in $L^2(D)$. But the identity operator in $L^2(D)$ does not have a kernel in the set of locally integrable functions. In the set of distributions, however, it has the kernel $\delta(x - y)$, where $\delta(x)$ is the delta-function. As the kernel of the identity operator in $L^2(D)$ the delta-function is understood in the weak sense:
$$\int_D \int_D \delta(x - y) f(y) g(x)\, dx\, dy = \int_D f(x) g(x)\, dx, \qquad \forall f, g \in L^2(D). \qquad (3.174)$$

Remark 3.4 We have seen that the operator $R$ in $L^2(D)$, where $D \subset \mathbb{R}^r$ is a bounded domain, with a continuous nonnegative definite kernel $R(x, y)$, belongs to the trace class. Therefore $R^{1/2}$ is a Hilbert-Schmidt operator. One can prove that such an operator is an integral operator without assuming that $R^{1/2} \ge 0$. A linear operator $A : H \to H$ on a Hilbert space is called a Hilbert-Schmidt operator if $\sum_{j=1}^{\infty} \| A\phi_j \|^2 < \infty$, where $\phi_j$, $1 \le j < \infty$, is an orthonormal basis of $H$. Pick an arbitrary $f \in H$, $f = \sum_{j=1}^{\infty} (f, \phi_j)\phi_j$. Consider
$$Af = \sum_{i=1}^{\infty} (Af, \phi_i)\phi_i = \sum_{i,j=1}^{\infty} (f, \phi_j)(\phi_j, A^*\phi_i)\phi_i . \qquad (3.175)$$

Let $H = L^2(D)$. Then (3.175) can be written as
$$Af = \int_D A(x, y) f(y)\, dy \qquad (3.176)$$
with
$$A(x, y) := \sum_{i,j=1}^{\infty} a_{ji} \phi_j^*(y) \phi_i(x), \qquad a_{ji} := (\phi_j, A^*\phi_i). \qquad (3.177)$$
One can check that the series (3.177) converges in $L^2(D) \times L^2(D)$ and
$$\int_D \int_D |A(x, y)|^2\, dx\, dy = \sum_{i,j=1}^{\infty} |a_{ji}|^2 < \infty. \qquad (3.178)$$
Indeed, by Parseval's equality one has
$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |a_{ji}|^2 = \sum_{j=1}^{\infty} \| A\phi_j \|^2 < \infty. \qquad (3.179)$$
One can prove that the sum (3.179) does not depend on the choice of the orthonormal basis of $H$.
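For completeness, here is the short computation behind the last claim (a standard argument, added here). If $\{\phi_j\}$ and $\{\psi_i\}$ are orthonormal bases of $H$, then Parseval's equality applied twice gives
$$\sum_j \| A\phi_j \|^2 = \sum_j \sum_i |(A\phi_j, \psi_i)|^2 = \sum_i \sum_j |(\phi_j, A^*\psi_i)|^2 = \sum_i \| A^*\psi_i \|^2 .$$
The right side does not involve $\{\phi_j\}$; applying the identity once more with another basis $\{\phi_j'\}$ in place of $\{\phi_j\}$ shows that $\sum_j \| A\phi_j' \|^2 = \sum_i \| A^*\psi_i \|^2 = \sum_j \| A\phi_j \|^2$, so the sum is basis independent.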


Chapter 4

Proofs

In this chapter we prove all of the theorems formulated in Chapter 2 except

Theorem 2.3, the proof of which is given in Section 8.3.2.10 as a consequence

of an abstract theory we develop.

4.1 Proof of Theorem 2.1

In order to make it easier for the reader to understand the basic ideas, we first give a proof of Corollary 2.2, which is a particular case of Theorem 2.1. This case corresponds to the assumption $P(\lambda) = 1$, and in this case the transmission problem (2.18)-(2.20) reduces to the exterior Dirichlet boundary problem (2.22)-(2.23).

Proof of Corollary 2.2 Equation (2.12),
$$\int_D R(x, y) h(y)\, dy = f(x), \qquad x \in D, \qquad (4.1)$$
holds if and only if
$$(Rh, \varphi) = (f, \varphi) \qquad \forall \varphi \in H^{-\alpha}. \qquad (4.2)$$
Since smooth functions with compact support in $D$ are dense in $H^{-\alpha}$, equation (4.2) holds if and only if
$$(Rh, \varphi) = (f, \varphi) \qquad \forall \varphi \in H^m_0(D), \quad m \ge -\alpha, \qquad (4.3)$$
where $H^m_0(D)$ is the Sobolev space of functions defined in the domain $D$ with compact support in $D$. Let us take $\varphi = Q(L)\psi$, $\psi \in C_0^\infty(D)$. The function $\varphi \in H^m_0(D)$ if the coefficients of the operator $L$ belong to $H^m(D)$.


We will assume that these coefficients are sufficiently smooth. Sharp conditions of smoothness on the coefficients of the operator $L$ are formulated in the beginning of Section 2.2.

Let us write (4.3) as
$$(Q(L)Rh, \psi) = (Q(L)f, \psi) \qquad \forall \psi \in C_0^\infty(D). \qquad (4.4)$$
By the assumption
$$Q(L)R(x, y) = \delta(x - y), \qquad (4.5)$$
equation (4.4) reduces to
$$(h, \psi) = (Q(L)f, \psi) \qquad \forall \psi \in C_0^\infty(D). \qquad (4.6)$$
This means that the distribution $h$ equals $Q(L)f$ in the domain $D$ (which is an open set). If $f$ is smooth enough, say $f \in H^{qs}$, then the result obtained says that
$$\mathrm{sing\,supp}\, h = \partial D = \Gamma, \qquad (4.7)$$
since in $D$ the distribution $h$ is equal to the regular function $Q(L)f$. In order to find $h$ in $D$, we extend $f$ from $D$ to $\mathbb{R}^r$ so that the extension $F$ has two properties:
$$F \ \text{is maximally smooth} \qquad (4.8)$$
and
$$Q(L)F = 0 \ \text{ in } \Omega. \qquad (4.9)$$
Requirement (4.9) is necessary because the function $h = Q(L)F$ has to have support in $D$. Requirement (4.8) is natural from two points of view. The first, purely mathematical, is: requirement (4.8) selects the unique solution to equation (4.1) of minimal order of singularity, the mos solution to (4.1). The second point of view is of statistical nature: only the mos solution to equation (4.1) gives the (unique) solution to the estimation problem we are interested in (see formula (2.105)).

Let $F = u$ in $\Omega$. Then (4.9) says that
$$Q(L)u = 0 \ \text{ in } \Omega. \qquad (4.10)$$
Since $F = f$ in $D$, condition (4.8) requires that
$$\partial_N^j u = \partial_N^j f \ \text{ on } \Gamma = \partial D, \qquad 0 \le j \le \frac{qs}{2} - 1, \qquad (4.11)$$


where $N$ is the outer normal to $\Gamma$, and one cannot impose more than $\frac{qs}{2}$ boundary conditions on $u$, since the Dirichlet problem in $\Omega$ allows one to impose not more than $\frac{qs}{2}$ conditions on $\Gamma$.

Finally one has to impose the condition
$$u(\infty) = 0. \qquad (4.12)$$
Indeed, one can consider $F$ as the left-hand side of equation (4.1) with $h \in H^{-\alpha}$. In this case it is clear that condition (4.12) holds, since $R(x, y) \to 0$ as $|x| \to \infty$.

The Dirichlet problem (4.10)-(4.12) is uniquely solvable in $H^\infty(\Omega)$ if $f \in C^\infty(D)$, $\Gamma \in C^\infty$, and the coefficients of $L$ are $C^\infty$. If $\Gamma$ and the coefficients $a_j(x)$ of $L$ are $C^\infty$, but $f \in H^m(D)$, $m \ge \alpha = \frac{qs}{2}$, then the solution $u$ to the Dirichlet problem (4.10)-(4.12) belongs to $H^m(\Omega) \cap H^\infty(\Omega)$. We assume that $a_j(x) \in C^\infty$ and $\Gamma \in C^\infty$. This is done for simplicity and in order to avoid lengthy explanations of the results on elliptic regularity of solutions and of the connection between the smoothness of $\Gamma$ and of the coefficients of $L$ and the smoothness of the solution $u$ to the problem (4.10)-(4.12).

The uniqueness of the solution to the Dirichlet problem (4.10)-(4.12) follows from the positivity of $Q(\lambda)$: the quadratic form $(Q(L)u, u)_{L^2(\Omega)} = 0$ if and only if $u = 0$, provided that $u$ satisfies conditions (4.11) with $f = 0$.

If $u$ is the unique solution to the problem (4.10)-(4.12), then
$$F(x) = \begin{cases} f(x) & \text{in } D, \\ u(x) & \text{in } \Omega, \end{cases} \qquad F \in H^\alpha(\mathbb{R}^r). \qquad (4.13)$$
Indeed, $F \in H^\alpha_{loc}(\mathbb{R}^r)$, and $F \in H^\alpha(\mathbb{R}^r)$ since $u$ decays at infinity. Since $F \in H^\alpha(\mathbb{R}^r)$ and $\mathrm{ord}\, Q(L) = qs = 2\alpha$, one has
$$h(x) = Q(L)F \in H^{-\alpha}(D). \qquad (4.14)$$
Corollary 2.2 is proved.

Remark 4.1 Consider the operator $L = -\Delta + a^2$, $a > 0$, in $L^2(\mathbb{R}^3)$ with the domain of definition $H^2(\mathbb{R}^3)$. The operator $L$ is elliptic, selfadjoint, and positive definite in $H_0 = L^2(\mathbb{R}^3)$:
$$(Lu, u) \ge a^2 (u, u), \qquad a > 0, \quad \forall u \in H_0. \qquad (4.15)$$
The Green function of $L$, which is the kernel of the operator $L^{-1}$, is
$$G(x, y) = \frac{\exp(-a|x - y|)}{4\pi |x - y|}. \qquad (4.16)$$
It decays exponentially as $|x| \to \infty$.

Exponential decay as $|x| \to \infty$ holds for the Green function of $L = -\Delta + q$ in $H_0$ if $L$ has property (4.15), in particular if $q(x) \ge q_0 > 0$.

Exercise. Is it true that if $L$ is an elliptic selfadjoint operator in $H_0 = L^2(\mathbb{R}^r)$ and $Q(\lambda) > 0$ for all $\lambda \in \mathbb{R}^1$ is a polynomial, then the kernel of the operator $[Q(L)]^{-1}$ decays exponentially as $|x| \to \infty$?

Proof of Theorem 2.1 Let us start by rewriting equation (4.1) with $R \in \mathcal{R}$ in the form
$$P(L) \int_D S(x, y) h(y)\, dy = f(x), \qquad x \in D, \qquad (4.17)$$
where
$$S(x, y) := \int_\Lambda Q^{-1}(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda). \qquad (4.18)$$
Equation (4.17) can be written as
$$\int_D S(x, y) h(y)\, dy = g(x) + v(x), \qquad (4.19)$$
where $g$ is a fixed solution to the equation
$$P(L) g = f \ \text{ in } D \qquad (4.20)$$
and $v$ is an arbitrary solution to the equation
$$P(L) v = 0 \ \text{ in } D. \qquad (4.21)$$
Equation (4.19) is of the form considered in the proof of Corollary 2.2, with $g + v$ in place of $f$. Applying the result proved in this corollary, one obtains the following formula:
$$h = Q(L) G, \qquad (4.22)$$
where
$$G = \begin{cases} g + v & \text{in } D, \\ u & \text{in } \Omega, \end{cases} \qquad (4.23)$$


and
$$Q(L) u = 0 \ \text{ in } \Omega, \qquad u(\infty) = 0. \qquad (4.24)$$
Here $g$ is a particular solution of (4.20) and $v$ is an arbitrary solution to (4.21). Formula (4.22) gives the unique solution of minimal order of singularity (the mos solution) to equation (4.1) if and only if $G$ is maximally smooth. If $f$ and $\Gamma$ are sufficiently smooth, the maximal smoothness of $G$ is guaranteed if and only if the following transmission boundary conditions hold on $\Gamma$:
$$\partial_N^j u = \partial_N^j (v + g) \ \text{ on } \Gamma, \qquad 0 \le j \le \frac{s(p+q)}{2} - 1. \qquad (4.25)$$
Given the orders of the elliptic operators,
$$\mathrm{ord}\, P(L) = ps, \qquad \mathrm{ord}\, Q(L) = qs, \qquad (4.26)$$
one cannot impose, in general, more than $\frac{s(p+q)}{2}$ conditions of the form (4.25). We will prove that if one imposes $\frac{s(p+q)}{2}$ conditions then the transmission problem (4.20), (4.21), (4.24), (4.25) is uniquely solvable and $G \in H^{s(p+q)/2}(\mathbb{R}^r)$. Therefore the mos solution $h$ to equation (4.1), given by formula (4.22), in which $G$ has maximal smoothness, $G \in H^{s(p+q)/2}(\mathbb{R}^r)$, has the minimal order of singularity:
$$h \in H^{-\alpha}(D), \qquad \alpha = -\Big( \frac{s(p+q)}{2} - qs \Big) = \frac{(q-p)s}{2}.$$
In order to complete the proof one has to prove that the transmission problem (4.20), (4.21), (4.24), (4.25) has a solution and that its solution is unique. This problem can be written as
$$P(L) G = f \ \text{ in } D, \qquad (4.27)$$
$$Q(L) G = 0 \ \text{ in } \Omega, \qquad (4.28)$$
$$G(\infty) = 0, \qquad (4.29)$$
$$\big( \partial_N^j G \big)_+ = \big( \partial_N^j G \big)_- , \qquad 0 \le j \le \frac{s(p+q)}{2} - 1, \qquad (4.30)$$
where $+$ and $-$ in (4.30) denote the limiting values on $\Gamma$ from $D$ and from $\Omega$, respectively.


First let us prove uniqueness of the solution to the problem (4.27)-(4.30). Suppose $f = 0$. The problem (4.27)-(4.30) is equivalent to finding the mos solution of equation (4.1).

Indeed, we have proved that if $h$ is the mos solution to (4.1), then $h$ is given by formula (4.22), where $G$ solves (4.27)-(4.30). Conversely, if $G$ solves (4.27)-(4.30), then $h$ given by formula (4.22) solves equation (4.1) and has minimal order of singularity. This is checked by a straightforward calculation: for any $\varphi \in C_0^\infty(D)$ one has
$$(RQ(L)G, \varphi) = (G, Q(L)R\varphi) = (G, P(L)\varphi) = (P(L)G, \varphi) = (f, \varphi), \qquad \forall \varphi \in C_0^\infty(D), \qquad (4.31)$$
where we have used the formula
$$Q(L) R(x, y) = P(L)\delta(x - y) \qquad (4.32)$$
and the selfadjointness of $P(L)$. Formula (4.31) implies that
$$Rh = RQ(L)G = f \ \text{ in } D. \qquad (4.33)$$
Equations (4.22)-(4.24) imply that $\mathrm{supp}\, h \subset D$. It follows from (4.22) and the inclusion $G \in H^{s(p+q)/2}(\mathbb{R}^r)$ that $h \in H^{-\alpha}(D)$, $\alpha = (q-p)s/2$. Thus, we have checked that $h$ given by (4.22), with $G$ given by (4.27)-(4.30), solves equation (4.1) and belongs to $H^{-\alpha}(D)$, that is, $h$ is the mos solution to (4.1).

Exercise. Prove uniqueness of the solution to problem (4.27)-(4.30) in $H^{s(p+q)/2}(\mathbb{R}^r)$ by establishing the equivalence of this problem and the problem of solving equation (4.1) in $H^{-\alpha}(D)$, and then proving that equation (4.1) has at most one solution in $H^{-\alpha}(D)$.

Hint: If $h \in H^{-\alpha}$ and $Rh = 0$ in $D$, then
$$0 = (Rh, h) \ge c_1 \| h \|_{-\alpha}^2 , \qquad c_1 > 0, \qquad (4.34)$$
so that $h = 0$.

Let us prove the existence of the solution to the problem (4.27)-(4.30) in $H^{s(p+q)/2}(\mathbb{R}^r)$. Consider the bilinear form
$$[\varphi, \psi] := \int_\Lambda P(\lambda) Q(\lambda) \varphi(\lambda) \psi^*(\lambda)\, d\rho(\lambda) \qquad (4.35)$$
defined on the set $V := H^{s(p+q)/2}(\mathbb{R}^r) \cap H^{sq}(\Omega)$ of functions which satisfy the equation
$$Q(L)\varphi = 0 \ \text{ in } \Omega. \qquad (4.36)$$

Since $P(\lambda)Q(\lambda) \ge c > 0$ for all $\lambda \in \mathbb{R}^1$, one has the norm
$$[\varphi, \varphi]^{1/2} = \Big( \int_\Lambda P(\lambda) Q(\lambda) |\varphi|^2\, d\rho(\lambda) \Big)^{1/2}, \qquad (4.37)$$
which is equivalent to the norm of $H^{s(p+q)/2}(\mathbb{R}^r)$. Indeed,
$$0 < d_1 \le P(\lambda) Q(\lambda) (1 + \lambda^2)^{-(p+q)/2} \le d_2. \qquad (4.38)$$
Therefore
$$d_1 \int_\Lambda (1 + \lambda^2)^{(p+q)/2} |\varphi|^2\, d\rho \le \int_\Lambda P(\lambda) Q(\lambda) |\varphi|^2\, d\rho \le d_2 \int_\Lambda (1 + \lambda^2)^{(p+q)/2} |\varphi|^2\, d\rho(\lambda). \qquad (4.39)$$
On the other hand,
$$\int_\Lambda (1 + \lambda^2)^\beta |\varphi|^2\, d\rho(\lambda) = \| \varphi \|^2_{H^{\beta s}(\mathbb{R}^r)}. \qquad (4.40)$$
This proves that the norm (4.37) is equivalent to the norm of the space $H^{s(p+q)/2}(\mathbb{R}^r)$. Consider the form

$$[G, \psi] = \int_\Lambda P(\lambda) Q(\lambda) G(\lambda) \psi^*(\lambda)\, d\rho(\lambda) = \int_{\mathbb{R}^r} P(L)G(y)\, Q(L)\psi^*(y)\, dy = \int_D P(L)G\, Q(L)\psi^*\, dy = \int_D f\, Q(L)\psi^*\, dy \qquad \forall \psi \in V, \qquad (4.41)$$
where Parseval's equality was used. Let $W$ be the Hilbert space which is the completion of $V$ in the norm (4.37). For any $f \in H^\alpha(D)$, $\alpha = s(q-p)/2$, the right-hand side of (4.41) is a bounded linear functional on $W$. Indeed, extend $f$ to all of $\mathbb{R}^r$ so that $f \in H^\alpha(\mathbb{R}^r)$, and use Parseval's equality and


the equation $Q(L)\psi = 0$ to obtain
$$\Big| \int_D f\, Q(L)\psi^*\, dy \Big| = \Big| \int_{\mathbb{R}^r} f\, Q(L)\psi^*\, dy \Big| = \Big| \int_\Lambda Q(\lambda) f(\lambda) \psi^*(\lambda)\, d\rho(\lambda) \Big| \le \Big( \int_\Lambda |f|^2 \frac{Q(\lambda)}{P(\lambda)}\, d\rho(\lambda) \Big)^{1/2} \Big( \int_\Lambda Q(\lambda) P(\lambda) |\psi|^2\, d\rho(\lambda) \Big)^{1/2} \le \| f \|_\alpha \| \psi \|_W \le c \| \psi \|_W , \qquad (4.42)$$
where $c = \| f \|_\alpha$ is a positive constant which does not depend on $\psi \in W$. According to Riesz's theorem on linear functionals, one concludes from (4.41) and (4.42) that
$$[G, \psi] = [Tf, \psi] \qquad \forall \psi \in W, \qquad (4.43)$$
where $T : H^\alpha \to W$ is a bounded linear mapping. The function
$$G = Tf, \qquad G \in W, \qquad (4.44)$$

is the solution to problem (4.27)-(4.30) in $H^{s(p+q)/2}(\mathbb{R}^r)$. The last statement can be proved as follows. Suppose that $G \in W$ satisfies equation (4.41) for all $\psi \in V$. Then $G \in H^{s(p+q)/2}(\mathbb{R}^r)$, so that equations (4.30) and (4.29) hold, and $G$ solves equation (4.28). In order to check that equation (4.27) holds, let $\psi \in C_0^\infty(\mathbb{R}^r)$ in (4.41). This is possible since $C_0^\infty(\mathbb{R}^r) \subset V$. Then
$$\int_D \big( P(L)G - f \big) \eta^*\, dy = 0 \qquad \forall \eta = Q(L)\psi, \quad \psi \in C_0^\infty(\mathbb{R}^r). \qquad (4.45)$$
If the set of such $\eta$ is complete in $L^2(D)$, one can conclude from equation (4.45) that equation (4.27) holds. To finish the proof of Theorem 2.1, let us prove that the set $\{ Q(L)\psi : \psi \in C_0^\infty(\mathbb{R}^r) \}$ is complete in $L^2(D)$. But this is clear, because even the smaller set of functions
$$Q(L)\psi, \qquad \psi \in C^\infty(D), \quad \partial_N^j \psi = 0 \ \text{ on } \Gamma, \quad 0 \le j \le \frac{qs}{2} - 1, \qquad (4.46)$$
is dense in $L^2(D)$. Indeed, the operator $Q(L)$ is essentially selfadjoint in $L^2(D)$ on the set
$$\Big\{ \psi : \psi \in C^\infty(D), \ \partial_N^j \psi = 0 \ \text{ on } \Gamma, \ 0 \le j \le \frac{qs}{2} - 1 \Big\}. \qquad (4.47)$$
That is, the closure of $Q(L)$ with the domain (4.47) is selfadjoint in $L^2(D)$: it is the Dirichlet operator $Q(L)$ in $L^2(D)$. Since $Q(L)$ is positive definite on the set (4.47), its closure is also a positive definite selfadjoint operator in $L^2(D)$. Therefore, the range of the closure of $Q(L)$ is the whole space $L^2(D)$, and the range of the operator $Q(L)$ with the domain of definition (4.47) is dense in $L^2(D)$. This completes the proof of Theorem 2.1.

4.2 Proof of Theorem 2.2

Let us first prove a lemma of a general nature.

Lemma 4.1 Let
$$R\varphi := \int_D R(x, y) \varphi(y)\, dy, \qquad R(x, y) = R^*(y, x), \qquad (4.48)$$
and assume that the kernel $R(x, y)$ defines a compact selfadjoint operator in $H = L^2(D)$ for any bounded domain $D \subset \mathbb{R}^r$. Let $\lambda_j(D)$ be the eigenvalues of $R : L^2(D) \to L^2(D)$,
$$R\varphi_j = \lambda_j(D)\varphi_j, \qquad (4.49)$$
and let $\lambda_j^+(D)$ be the positive eigenvalues, ordered so that
$$\lambda_1^+(D) \ge \lambda_2^+(D) \ge \cdots \qquad (4.50)$$
and counted according to their multiplicities. Then
$$\lambda_j^+(D_2) \ge \lambda_j^+(D_1) \quad \forall j, \ \text{ provided that } D_2 \supset D_1. \qquad (4.51)$$

Proof. By the well-known minimax principle one has
$$\lambda_j^+(D_2) = \min_{\psi_1, \dots, \psi_{j-1}} \ \max_{\substack{(\varphi, \psi_i)_2 = 0,\ 1 \le i \le j-1 \\ (\varphi, \varphi)_2 = 1}} (R\varphi, \varphi)_2 := \min_\psi \mu_j(\psi). \qquad (4.52)$$
Here $(u, v)_m := \int_{D_m} u v^*\, dx$, $m = 1, 2$, and
$$\mu_j(\psi) := \max_{\substack{(\varphi, \psi_i)_2 = 0,\ 1 \le i \le j-1 \\ (\varphi, \varphi)_2 = 1}} (R\varphi, \varphi)_2. \qquad (4.53)$$
Let $D_3 := D_2 \setminus D_1$. If an additional restriction is imposed on $\varphi$ in formula (4.52), namely
$$\varphi = 0 \ \text{ in } D_3, \qquad (4.54)$$
then $\mu_j(\psi)$ cannot increase:
$$\nu_j(\psi) := \max_{\substack{(\varphi, \psi_i)_2 = 0,\ 1 \le i \le j-1 \\ (\varphi, \varphi)_2 = 1,\ \varphi = 0 \text{ in } D_3}} (R\varphi, \varphi)_2 \le \mu_j(\psi). \qquad (4.55)$$
On the other hand,
$$\nu_j(\psi) = \max_{\substack{(\varphi, \psi_i)_1 = 0,\ 1 \le i \le j-1 \\ (\varphi, \varphi)_1 = 1}} (R\varphi, \varphi)_1. \qquad (4.56)$$
Since
$$\lambda_j^+(D_1) = \min_\psi \nu_j(\psi) \le \min_\psi \mu_j(\psi) = \lambda_j^+(D_2), \qquad (4.57)$$
one obtains (4.51). Lemma 4.1 is proved.

Assume now that $D$ expands uniformly in all directions, $D \to \mathbb{R}^r$. We wish to find out if the limit
$$\lim_{D \to \mathbb{R}^r} \lambda_1(D) := \lambda_{1\infty} \qquad (4.58)$$
exists and is finite. We assume that
$$R(x, y) = \int_\Lambda R(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda), \qquad (4.59)$$
where $R(\lambda) > 0$ is a continuous function which vanishes at infinity. This implies that the operator $R : L^2(D) \to L^2(D)$ with kernel (4.59) is compact, as follows from

Lemma 4.2 If $R(\lambda)$ is a continuous function such that
$$\lim_{\lambda \to \pm\infty} R(\lambda) = 0, \qquad (4.60)$$
then the operator $R : L^2(D) \to L^2(D)$, where $D \subset \mathbb{R}^r$ is a bounded domain, with the kernel (4.59) is compact in $H = L^2(D)$.

Proof. Given a number $\varepsilon > 0$, find a continuous function $r_\varepsilon(\lambda)$, equal to zero for $|\lambda| > N$, such that
$$\max_{|\lambda| \le N} |R(\lambda) - r_\varepsilon(\lambda)| < \varepsilon. \qquad (4.61)$$
Here the number $N$ is chosen so large that
$$|R(\lambda)| < \varepsilon \ \text{ for } |\lambda| > N. \qquad (4.62)$$


Denote by $R_\varepsilon$ the operator in $L^2(D)$ whose kernel has the spectral density $r_\varepsilon(\lambda)$. Let $P$ denote the orthoprojection in $L^2(\mathbb{R}^r)$ onto $L^2(D)$. If $R_\infty$ is the operator with kernel (4.59), considered as an operator in $L^2(\mathbb{R}^r)$, then
$$R = P R_\infty P \qquad (4.63)$$
and
$$R_\varepsilon = P R_{\varepsilon\infty} P, \qquad (4.64)$$
with the same notation for $R_{\varepsilon\infty}$. One has
$$\| R - R_\varepsilon \| \le \| R_\infty - R_{\varepsilon\infty} \| \le \varepsilon. \qquad (4.65)$$
Here one has used the fact that the norm of the operator $r : L^2(\mathbb{R}^r) \to L^2(\mathbb{R}^r)$ with the kernel
$$r(x, y) = \int_\Lambda r(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda) \qquad (4.66)$$
is given by the formula
$$\| r \| = \max_{\lambda \in \Lambda} |r(\lambda)|. \qquad (4.67)$$
Indeed,
$$\| r \| = \sup_{\|\varphi\|_{L^2(\mathbb{R}^r)} = 1} |(r\varphi, \varphi)| = \sup_{\|\varphi\|_{L^2(\mathbb{R}^r)} = 1} \Big| \int_\Lambda r(\lambda) |\varphi|^2\, d\rho(\lambda) \Big| \le \max_{\lambda \in \Lambda} |r(\lambda)| \sup_{\|\varphi\|_{L^2(\mathbb{R}^r)} = 1} \| \varphi \|^2_{L^2(\mathbb{R}^r)} = \max_{\lambda \in \Lambda} |r(\lambda)|. \qquad (4.68)$$
This proves the inequality
$$\| r \| \le \max_{\lambda \in \Lambda} |r(\lambda)|. \qquad (4.69)$$
In order to establish the equality (4.67), take the point $\lambda_0$ at which the function $|r(\lambda)|$ attains its maximum. Such a point does exist, since the function $|r(\lambda)|$ is continuous and vanishes at infinity. Then find a $\varphi(x)$, $\| \varphi \|_{L^2(\mathbb{R}^r)} = 1$, such that $\varphi(\lambda) = 0$ for $|\lambda - \lambda_0| > \delta$, where $\delta > 0$ is an arbitrarily small number. Then, using the continuity of $r(\lambda)$, one obtains
$$\| r \| = \sup_{\|\varphi\|_{L^2(\mathbb{R}^r)} = 1} \Big| \int_{|\lambda - \lambda_0| \le \delta} r(\lambda) |\varphi|^2\, d\rho(\lambda) \Big| \ge |r(\lambda_0)| - \eta(\delta) = \max_{\lambda \in \Lambda} |r(\lambda)| - \eta(\delta), \qquad (4.70)$$


where $\eta(\delta)$ is arbitrarily small if $\delta > 0$ is sufficiently small. From (4.69) and (4.70) formula (4.67) follows. From (4.67) and the obvious inequality $\| P \| \le 1$ one obtains (4.65). If one can prove that the operator $R_\varepsilon$ is compact in $L^2(D)$, then Lemma 4.2 is proved, because $R$ can be approximated in norm by the compact operators $R_\varepsilon$ with arbitrary accuracy, according to (4.65).

Let us prove that the operator $R_\varepsilon$ is compact in $L^2(D)$. One has
$$w := R_\varepsilon f = \int_D \Big( \int_{-N}^{N} r_\varepsilon(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda) \Big) f(y)\, dy. \qquad (4.71)$$
Let $\| f \|_{L^2(\mathbb{R}^r)} \le 1$. Taking into account that $L\Phi = \lambda\Phi$ and using Parseval's equality, one obtains
$$\| L R_\varepsilon f \|^2_{L^2(\mathbb{R}^r)} = \Big\| \int_D \Big( \int_{-N}^{N} \lambda r_\varepsilon(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda) \Big) f(y)\, dy \Big\|^2_{L^2(\mathbb{R}^r)} = \int_{-N}^{N} \lambda^2 |r_\varepsilon(\lambda)|^2 |f|^2\, d\rho(\lambda) \le N^2 \max_{-N \le \lambda \le N} |r_\varepsilon(\lambda)|^2 \| f \|^2_{L^2(\mathbb{R}^r)} \le c(N). \qquad (4.72)$$
Let us now recall the well-known elliptic estimate ([Hormander (1983-85)]):
$$\| w \|_{H^s(D_1)} \le c(D_1, D_2) \big( \| Lw \|_{L^2(D_2)} + \| w \|_{L^2(D_2)} \big), \qquad (4.73)$$
which holds for the elliptic operator $L$, $\mathrm{ord}\, L = s$, and for arbitrary bounded domains $D_1 \subset D_2 \subset \mathbb{R}^r$, where $D_1$ is a strictly inner subdomain of $D_2$. From (4.72) and (4.73) it follows that
$$\| w \|_{H^s(D)} \le c \qquad \forall f \in B_1 := \{ f : \| f \|_{L^2(\mathbb{R}^r)} \le 1 \}, \qquad (4.74)$$
where $c > 0$ is a constant which does not depend on $f \in B_1$. Indeed, the estimate for $\| Lw \|_{L^2(\mathbb{R}^r)}$ is given by formula (4.72), and the estimate for $\| w \|_{L^2(\mathbb{R}^r)}$ is obtained in the same way. Inequality (4.74) and the imbedding theorem (see Theorem 8.1) imply that the set $\{ R_\varepsilon f : f \in B_1 \}$ is relatively compact in $L^2(D)$. Therefore the operator $R_\varepsilon$ maps the unit ball into a relatively compact set in $L^2(D)$. This means that $R_\varepsilon$ is compact in $L^2(D)$. Lemma 4.2 is proved.

In order to proceed with the study of the behavior of $\lambda_1(D)$ as $D \to \mathbb{R}^r$, let us assume that
$$\sup_{x \in \mathbb{R}^r} \int_{\mathbb{R}^r} |R(x, y)|\, dy := A < \infty. \qquad (4.75)$$

Lemma 4.3 If the kernel $R(x, y)$ defines, for any bounded domain $D \subset \mathbb{R}^r$, a selfadjoint compact nonnegative operator in $L^2(D)$, and condition (4.75) holds, then the limit (4.58) exists and
$$\lambda_{1\infty} \le A. \qquad (4.76)$$
Proof. From Lemma 4.1 one knows that $\lambda_1(D)$ grows monotonically as $D$ increases in the sense (4.51). Therefore, the existence of the limit (4.58) and the estimate (4.76) will be established if one proves that
$$\lambda_1(D) \le A \qquad (4.77)$$
for all $D \subset \mathbb{R}^r$. Let $R\varphi_1 = \lambda_1(D)\varphi_1$. One has
$$\lambda_1(D) \sup_{x \in D} |\varphi_1(x)| \le \sup_{x \in D} \int_D |R(x, y)|\, dy\ \sup_{y \in D} |\varphi_1(y)| \le \sup_{x \in \mathbb{R}^r} \int_{\mathbb{R}^r} |R(x, y)|\, dy\ \sup_{y \in D} |\varphi_1(y)|. \qquad (4.78)$$
Therefore inequality (4.77) is obtained. Lemma 4.3 is proved.

Let us now prove Theorem 2.2.

Proof of Theorem 2.2 We need only prove formula (2.29). Take $\varphi_1$ as in Lemma 4.3 with $\| \varphi_1 \|_{L^2(D)} = 1$. Extend $\varphi_1$ to all of $\mathbb{R}^r$ by setting $\varphi_1 = 0$ in $\Omega$. Then, using Parseval's equality, one obtains
$$\lambda_1(D) = \int_D \Big( \int_D R(x, y)\varphi_1(y)\, dy \Big)\varphi_1^*(x)\, dx = \int_{\mathbb{R}^r} \int_{\mathbb{R}^r} R(x, y)\varphi_1(y)\varphi_1^*(x)\, dy\, dx = \int_\Lambda R(\lambda) \big| \varphi_1(\lambda) \big|^2\, d\rho(\lambda) \le \max_{\lambda \in \Lambda} R(\lambda)\, \| \varphi_1 \|^2_{L^2(\mathbb{R}^r)} = \max_{\lambda \in \Lambda} R(\lambda). \qquad (4.79)$$
Choose $\varphi_1$ with $\varphi_1(\lambda)$ supported in a small neighborhood of the point $\lambda_0$ at which $R(\lambda)$ attains its maximum. Then $\lambda_1(D) \ge \max_\lambda R(\lambda) - \varepsilon$, where $\varepsilon > 0$ is an arbitrarily small number.

This proves formula (2.29) and Theorem 2.2, in which $\omega(\lambda)$ stands for $R(\lambda)$.
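A quick numerical illustration of formula (2.29) (an added sketch with an assumed concrete kernel): for $R(x, y) = e^{-|x-y|}$ the spectral density is $2/(1 + \lambda^2)$, whose maximum is $2$, so $\lambda_1(D)$ on an interval of length $T$ should increase monotonically toward $2$ as $T$ grows (Lemmas 4.1 and 4.3, Theorem 2.2).

```python
import numpy as np

# Nystrom check that lambda_1(D) -> max R(lambda) = 2 for R(x,y) = exp(-|x-y|)
# as the interval D = (-T/2, T/2) expands; added numerical sketch only.

def lambda1(T, n=400):
    x = np.linspace(-T / 2.0, T / 2.0, n)
    w = T / n                                          # quadrature weight
    K = np.exp(-np.abs(x[:, None] - x[None, :])) * w   # Nystrom matrix of R
    return np.linalg.eigvalsh(K)[-1]                   # largest eigenvalue

for T in (1, 2, 5, 10, 20, 50):
    print(T, lambda1(T))   # values increase monotonically toward 2
```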

We now discuss some properties of the eigenvalues $\lambda_j(D)$.

Suppose that $R(x, y) = R(x - y)$, $R(-x) = R(x)$, and the domain $D$ is centrally symmetric with respect to the origin, that is, if $x \in D$ then $-x \in D$. Let us recall that an eigenvalue is called simple if the corresponding eigenspace is one-dimensional.

Lemma 4.4 If $\lambda$ is a simple eigenvalue of the operator $R : L^2(D) \to L^2(D)$ with the kernel $R(x - y)$, $R(-x) = R(x)$, and $D$ is centrally symmetric, then the corresponding eigenfunction $\varphi$,
$$R\varphi = \lambda\varphi, \qquad (4.80)$$
is either even or odd.

Proof. One has
$$\lambda\varphi(-x) = \int_D R(-x - y)\varphi(y)\, dy = \int_D R(x + y)\varphi(y)\, dy = \int_D R(x - z)\varphi(-z)\, dz. \qquad (4.81)$$
Here we set $y = -z$ in the second integral and used the assumption of central symmetry. Therefore $\varphi(-x)$ is an eigenfunction corresponding to the same eigenvalue $\lambda$. Since this eigenvalue is simple, one has $\varphi(-x) = c\varphi(x)$, $c = \mathrm{const}$. This implies $\varphi(x) = c\varphi(-x)$, so that $c^2 = 1$. Thus $c = \pm 1$. If $c = 1$ then $\varphi(x)$ is even. Otherwise it is odd. Lemma 4.4 is proved.

Remark 4.2 If $\lambda$ is not simple, the corresponding eigenfunction may be neither even nor odd. For example, the operator
$$R\varphi := \int_{-\pi}^{\pi} \varphi(y)\, dy \qquad (4.82)$$
has the eigenvalue $\lambda = 0$. The corresponding eigenspace is infinite-dimensional: it consists of all functions orthogonal to $1$ in $L^2(-\pi, \pi)$. In particular, the function $\cos y + \sin y$ is an eigenfunction which is neither even nor odd, and it corresponds to the eigenvalue $\lambda = 0$.

Suppose one has a family of domains $D_t$, $0 < t < \infty$, such that $D_1 = D$ and $D_t = \{ x : x = t\xi, \ \xi \in D_1 \}$. Then the eigenvalues $\lambda_j(D_t) := \lambda_j(t)$ depend on the parameter $t$, and one can study this dependence. If one writes
$$\int_{D_t} R(x, y)\varphi(y)\, dy = t^r \int_{D_1} R(t\xi, t\eta)\varphi(t\eta)\, d\eta = \lambda_j(t)\varphi(t\xi),$$
then one sees that the $\lambda_j(t)$ are the eigenvalues of the operator $R(t)$ in $L^2(D)$ with the kernel $R(\xi, \eta, t) := t^r R(t\xi, t\eta)$, where $D = D_1$ does not depend on $t$. This implies immediately that the $\lambda_j(t)$ depend on $t$ continuously, provided that
$$\| R(t') - R(t) \| \to 0 \ \text{ as } t \to t'. \qquad (4.83)$$
Indeed,
$$\max_j |\lambda_j(t') - \lambda_j(t)| \le \| R(t) - R(t') \|. \qquad (4.84)$$
Estimate (4.84) follows from the minimax principle and is derived in Section 8.3. Condition (4.83) holds, for example, if
$$\int_D \int_D |R(x, y)|^2\, dx\, dy \le c(D) < \infty \qquad (4.85)$$
for any bounded domain $D \subset \mathbb{R}^r$.

One can also study the differentiability of the eigenvalues $\lambda_j(t)$ with respect to the parameter $t$ using, for example, the methods given in [K]. This and the study of the eigenfunctions as functions of the parameter $t$ would lead us astray.

4.3 Proof of Theorems 2.4 and 2.5

Proof of Theorem 2.4

This proof can be given in complete analogy to the proof of Theorem 2.1. On the other hand, there is a special feature of the one-dimensional ($r = 1$) theory which is the subject of Theorem 2.4. Namely, the spaces of all solutions to the homogeneous equations
$$P(L)\varphi = 0 \qquad (4.86)$$
and
$$Q(L)\psi = 0 \qquad (4.87)$$
are finite-dimensional (in contrast to the case $r > 1$). The system (2.43)-(2.44), for example, is a linear algebraic system. Therefore, existence of the solution to this system follows from the uniqueness of this solution by Fredholm's alternative.

Let us briefly describe the basic steps of the proof.

Step 1. Consider first the case when $P(\lambda) = 1$.


Lemma 4.5 The set of all solutions of the equation
$$Rh := \int_{t-T}^{t} R(x, y) h(y)\, dy = f(x), \qquad t - T \le x \le t, \qquad (4.88)$$
with the kernel $R(x, y) \in \mathcal{R}$ and $P(\lambda) = 1$, in the space $H^{-sq} := H^{-sq}(D)$, $D = (t - T, t)$, is in one-to-one correspondence with the set of the solutions of the equation
$$\int_{-\infty}^{\infty} R(x, y) h(y)\, dy = F(x), \qquad x \in \mathbb{R}^1, \qquad (4.89)$$
where
$$h = Q(L) F, \qquad (4.90)$$
$h \in H^{-sq}(\mathbb{R}^1)$ and $\mathrm{supp}\, h \subset [t - T, t]$. Here
$$F(x) = \begin{cases} \sum_{j=1}^{qs/2} b_j^- \psi_j^-(x), & x \le t - T, \\[2pt] f(x), & t - T \le x \le t, \\[2pt] \sum_{j=1}^{qs/2} b_j^+ \psi_j^+(x), & t \le x, \end{cases} \qquad (4.91)$$
$b_j^\pm$ are arbitrary constants, and the functions $\psi_j^\pm$, $1 \le j \le qs/2$, form a fundamental system of solutions to equation (4.87) such that
$$\psi_j^+(+\infty) = 0, \qquad \psi_j^-(-\infty) = 0. \qquad (4.92)$$

Proof. Let $h \in H^{-qs}$ be a solution to (4.88). This means that for any $\varphi \in C_0^\infty(\mathbb{R}^1)$ one has
$$(Rh, \varphi) = (F, \varphi), \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^1), \qquad (4.93)$$
where the parentheses in (4.93) denote the $L^2(\mathbb{R}^1)$ inner product and $F$ is given by (4.91). One can say that $F$ is defined to be the integral $Rh$ for $x > t$ and $x < t - T$ (see formula (4.88)). This integral, for $h \in H^{-qs}$, can be considered as the value of the functional $h$ on the test function $R(x, y)$, since, for a fixed $x$, $R(x, y) \in H^{qs}_{loc}$ outside an arbitrarily small neighborhood of the point $x$. Since $Q(L)R = \delta(x - y)$, the kernel $R$ does not belong to $H^{qs}$; however, $\delta(x - y)$ can be interpreted as the kernel of the identity operator in $L^2$, so that $(\delta(x - y), \varphi) = \varphi$ in the sense that $(\delta(x - y)\varphi, \psi) = (\varphi, \psi)$ for all $\varphi, \psi \in L^2$.

Since $h \in H^{-qs}$, one can consider it as an element of $H^{-qs}(\mathbb{R}^1)$. Equation (4.93) is then equivalent to (4.89). The correspondence mentioned in Lemma 4.5 can be described as follows. If $h \in H^{-qs}(\mathbb{R}^1)$, $\mathrm{supp}\, h \subset [t - T, t]$, and $h$ solves equation (4.89), then its restriction to $D$ solves equation (4.88). Conversely, if $h \in H^{-qs}$ solves equation (4.88), then $h$ is an element of $H^{-qs}(\mathbb{R}^1)$ with $\mathrm{supp}\, h \subset [t - T, t]$, and $h$ solves equation (4.89). The constants $b_j^\pm$ in (4.91) are uniquely determined by the given $h$ from the formula
$$\int_{t-T}^{t} R(x, y) h(y)\, dy = F(x).$$
Lemma 4.5 is proved.

Example 4.1 Let
$$\int_{-1}^{1} e^{-|x-y|} h(y)\, dy = f(x), \qquad -1 \le x \le 1. \qquad (4.94)$$
Here $L = -i\frac{d}{dx}$, $r = 1$, $P(\lambda) = 1$, $Q(\lambda) = \frac{\lambda^2 + 1}{2}$, $Q(L) = (-\partial^2 + 1)/2$, $\partial = \frac{d}{dx}$ (see Section 2.4). Equation (4.87) has the solutions
$$\psi^- = \exp(x), \qquad \psi^+ = \exp(-x). \qquad (4.95)$$
Choose $F$ by formula (4.91) with some constants $b^\pm$, $qs = 2$ (so that $qs/2 = 1$), $t = 1$, $t - T = -1$. Then
$$h = \frac{1}{2}(-\partial^2 + 1)F = \frac{-f'' + f}{2} + \frac{1}{2}\delta'(x - 1)\big[ f(1) - b^+ e^{-1} \big] + \frac{1}{2}\delta(x - 1)\big[ f'(1) + b^+ e^{-1} \big] + \frac{1}{2}\delta'(x + 1)\big[ b^- e^{-1} - f(-1) \big] + \frac{1}{2}\delta(x + 1)\big[ b^- e^{-1} - f'(-1) \big]. \qquad (4.96)$$
For the convenience of the reader let us give the details of the calculation. By the definition of the distributional derivative one has
$$(Q(L)F, \varphi) = (F, Q(L)\varphi) = \frac{b^+}{2} \int_{1}^{\infty} e^{-x}(-\varphi'' + \varphi)\, dx + \frac{1}{2} \int_{-1}^{1} f(x)(-\varphi'' + \varphi)\, dx + \frac{b^-}{2} \int_{-\infty}^{-1} e^{x}(-\varphi'' + \varphi)\, dx.$$
After integrating by parts twice one obtains
$$(Q(L)F, \varphi) = \frac{1}{2} \int_{-1}^{1} (-f'' + f)\varphi\, dx + \frac{\varphi'(1)}{2}\big[ b^+ e^{-1} - f(1) \big] + \frac{\varphi(1)}{2}\big[ b^+ e^{-1} + f'(1) \big] + \frac{\varphi'(-1)}{2}\big[ f(-1) - b^- e^{-1} \big] + \frac{\varphi(-1)}{2}\big[ b^- e^{-1} - f'(-1) \big]. \qquad (4.97)$$


Formula (4.97) is equivalent to (4.96). Corollary 2.5 follows from Lemma 4.5 immediately. Indeed, one obtains the solution of minimal order of singularity from formula (4.90) if and only if the constants $b_j^\pm$ are chosen so that $F$ has minimal order of singularity, that is, if $F$ is maximally smooth. This happens if and only if the following conditions hold:
$$\big[ F^{(j)}(t) \big] = 0, \qquad \big[ F^{(j)}(t - T) \big] = 0, \qquad 0 \le j \le \frac{qs}{2} - 1. \qquad (4.98)$$
Here, for example,
$$\big[ F^{(j)}(t) \big] := F^{(j)}(t + 0) - F^{(j)}(t - 0) \qquad (4.99)$$
is the jump of $F^{(j)}(x)$ across the point $t$. If conditions (4.98) hold, one can rewrite formula (4.90) as formula (2.47).

Exercise. Check the last statement.

Hint: The calculations are similar to those given in Example 4.1.
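As a quick sanity check (an added computation, not from the text), take $f \equiv 1$ in Example 4.1. Conditions (4.98) give $b^\pm = e$, the $\delta'$ terms in (4.96) drop out, and the mos solution is
$$h(x) = \tfrac12 + \tfrac12\delta(x - 1) + \tfrac12\delta(x + 1).$$
A direct verification:
$$\int_{-1}^{1} e^{-|x-y|}\Big( \tfrac12 + \tfrac12\delta(y - 1) + \tfrac12\delta(y + 1) \Big)\, dy = \Big( 1 - \tfrac{e^{-(1-x)} + e^{-(1+x)}}{2} \Big) + \tfrac12 e^{-(1-x)} + \tfrac12 e^{-(1+x)} = 1, \qquad -1 \le x \le 1,$$
so (4.94) holds with $f \equiv 1$.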

Step 2. Assume now that $P(\lambda) \not\equiv 1$. Write equation (4.88) as
$$P(L) \int_{t-T}^{t} S(x, y) h(y)\, dy = f(x), \qquad t - T \le x \le t, \qquad (4.100)$$
where
$$S(x, y) = \int_\Lambda Q^{-1}(\lambda) \Phi(x, y, \lambda)\, d\rho(\lambda). \qquad (4.101)$$
Rewrite equation (4.100) as
$$\int_{t-T}^{t} S(x, y) h(y)\, dy = g_0(x) + \sum_{j=1}^{ps} c_j \varphi_j . \qquad (4.102)$$
Here $\varphi_j$, $1 \le j \le ps$, is a fundamental system of solutions to equation (4.86), $g_0$ is a particular solution to the equation
$$P(L) g = f, \qquad t - T \le x \le t, \qquad (4.103)$$
and $c_j$, $1 \le j \le ps$, are arbitrary constants.

Apply to equation (4.102) the result of Lemma 4.5; in particular, use formula (4.90) to get the conclusion of Theorem 2.4. Equations (2.43)-(2.44) are necessary and sufficient for $G$, given by formula (2.38), to have minimal order of singularity and, therefore, for $h$, given by formula (2.37), to have minimal order of singularity. As we have already noted, the solvability of the system (2.43)-(2.44) for the coefficients $c_j$, $1 \le j \le ps$, and $b_j^\pm$, $0 \le j \le \frac{qs}{2} - 1$, follows from the fact that the homogeneous system (2.43)-(2.44) has only the trivial solution, and from Fredholm's alternative.

The fact that for $f(x) = 0$ the system (2.43)-(2.44) has only the trivial solution can be established as in the proof of Theorem 2.1.

Corollary 2.1 follows from formulas (2.15)-(2.20) immediately.

Exercise. Give a detailed proof of the uniqueness of the solution of the homogeneous system (2.43)-(2.44).

Hint: A solution to this system generates a solution $h$ to equation (4.88) with $f = 0$, $h \in H^{-\alpha}$. Use Parseval's equality to derive from
$$Rh = 0, \qquad t - T \le x \le t, \qquad h \in H^{-\alpha}, \qquad (4.104)$$
that $h = 0$. If $h = 0$ then $c_j = b_j^\pm = 0$ for all $j$.

Proof of Theorem 2.5 The proof is similar to the proof of Theorem 2.4. Equation (2.65),
$$\int_{t-T}^{t} S(x, y) h(y)\, dy = g(x), \qquad t - T \le x \le t, \qquad (4.105)$$
where $g(x)$ is the right-hand side of equation (2.65), can be written as
$$\int_{-\infty}^{\infty} S(x, y) h(y)\, dy = G(x), \qquad -\infty < x < \infty, \qquad (4.106)$$
where $G(x)$ is given by (2.67) and
$$\mathrm{supp}\, h \subseteq [t - T, t]. \qquad (4.107)$$
There is a one-to-one correspondence between solutions to equation (4.105) in $H^{-\alpha}$ and solutions to equation (4.106) in $H^{-\alpha}(\mathbb{R}^1)$ with property (4.107). Equation (4.106) can be solved by the formula
$$h = Q(L) E G, \qquad (4.108)$$
and $h$ given by formula (4.108) has property (4.107). This $h$ has minimal order of singularity, $\le \alpha = \frac{n-m}{2}$. The constants $b_j^\pm$ and $c_j$ solve the linear algebraic system (2.68)-(2.69). That this system is solvable follows from Fredholm's alternative and the fact that it has at most one solution. The fact that it has at most one solution follows, as in the proof of Theorem 2.4, from Parseval's equality and the positivity of the kernel $R(x, y)$.

Theorem 2.5 is proved.

4.4 Another approach

The remark we wish to make in this section concerns the case of a positive definite kernel $R(x, y)$ which satisfies the equation
$$Q(x, \partial) R(x, y) = P(x, \partial)\delta(x - y), \qquad (4.109)$$
where $Q(x, \partial)$ and $P(x, \partial)$ are elliptic operators, positive definite in $L^2(\mathbb{R}^r)$, not necessarily commuting, $\mathrm{ord}\, Q(x, \partial) = n$, $\mathrm{ord}\, P(x, \partial) = m$, $m < n$, and $R(x, y) \to 0$ as $|x - y| \to \infty$. As in Section 4.1, we wish to study the equation
$$Rh = f, \qquad x \in D, \qquad (4.110)$$
with $f \in H^\alpha(D)$, $\alpha := \frac{n-m}{2}$, to prove that the operator $R : H^{-\alpha}(D) \to H^\alpha(D)$ is an isomorphism, and to give analytical formulas for the solution $h$ of minimal order of singularity. Consider equation (4.110) as an equation in $L^2(\mathbb{R}^r)$,
$$Rh = F, \qquad x \in \mathbb{R}^r, \qquad (4.111)$$
with
$$F = \begin{cases} f & \text{in } D, \\ u & \text{in } \Omega := \mathbb{R}^r \setminus D, \end{cases} \qquad (4.112)$$
where
$$Q(x, \partial) u = 0 \ \text{ in } \Omega. \qquad (4.113)$$
Equation (4.113) follows from equation (4.109) and the assumption that $\mathrm{supp}\, h \subset D$. The function $F$ has minimal order of singularity if and only if $u$ solves (4.113) and satisfies the boundary conditions
$$\partial_N^j u = \partial_N^j f \ \text{ on } \Gamma, \qquad 0 \le j \le \frac{n}{2} - 1, \qquad u(\infty) = 0. \qquad (4.114)$$


The problem (4.113)-(4.114) is the exterior Dirichlet problem, which is uniquely solvable if $Q$ is positive definite. If $u$ solves problem (4.113)-(4.114), then $F \in H^{(n-m)/2}(\mathbb{R}^r)$, $\mathrm{supp}\, Q(x, \partial)F \subset D$, and $QF \in H^{-(n+m)/2}(\mathbb{R}^r)$. Apply $Q(x, \partial)$ to both sides of equation (4.111) and use equation (4.109) to get
$$P(x, \partial) h = Q(x, \partial) F, \qquad QF \in H^{-(n+m)/2}(\mathbb{R}^r), \qquad \mathrm{supp}\, QF \subset D. \qquad (4.115)$$
The solution to (4.115) in the space $H^{-(n-m)/2}(D)$ exists, is unique, and is the solution to equation (4.110) of minimal order of singularity. More details are given in Appendix B.


Chapter 5

Singular Perturbation Theory for a Class of Fredholm Integral Equations Arising in Random Fields Estimation Theory

A basic integral equation of random fields estimation theory by the criterion of minimum of variance of the estimation error is of the form $Rh = f$, where $Rh = \int_D R(x, y) h(y)\, dy$ and $R(x, y)$ is a covariance function. The singular perturbation problem we study consists of finding the asymptotic behavior of the solution to the equation $\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = f(x)$ as $\varepsilon \to 0$, $\varepsilon > 0$. The domain $D$ can be an interval or a domain in $\mathbb{R}^n$, $n > 1$. The class of operators $R$ is defined by the class of their kernels $R(x, y)$, which solve the equation $Q(x, D_x) R(x, y) = P(x, D_x)\delta(x - y)$, where $Q(x, D_x)$ and $P(x, D_x)$ are elliptic differential operators. The presentation in this chapter is based on [Ramm and Shifrin (2005)].

5.1 Introduction

Consider the equation
$$\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = f(x), \qquad x \in D \subset \mathbb{R}^n, \qquad (5.1)$$
where $D$ is a bounded domain with a sufficiently smooth boundary $\partial D$, and
$$Rg(x) := \int_D R(x, y) g(y)\, dy.$$
In this chapter we study the class $\mathcal{R}$ of kernels $R(x, y)$ which satisfy the equation
$$Q(x, D_x) R(x, y) = P(x, D_x)\delta(x - y) \ \text{ in } \mathbb{R}^n$$
and tend to zero as $|x - y| \to \infty$, where $Q(x, D_x)$ and $P(x, D_x)$ are elliptic differential operators with smooth coefficients, and $\delta(x - y)$ is the delta-function.

For technical reasons, below we use kernels $R(x, y)$ of the same class, but written in a slightly different form (see (5.5)). Specifically, we write
$$R(x, y) = P(y, D_y) G(x, y), \qquad (5.2)$$
where
$$P(y, D_y) = \sum_{|\alpha| \le p} a_\alpha(y) D_y^\alpha, \qquad Q(x, D_x) = \sum_{|\beta| \le q} b_\beta(x) D_x^\beta, \qquad p < q, \qquad (5.3)$$
and
$$Q(x, D_x) G(x, y) = \delta(x - y). \qquad (5.4)$$
Note that
$$Q(x, D_x) R(x, y) = P(y, D_y)\delta(x - y). \qquad (5.5)$$
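The simplest one-dimensional example of this class (added here for orientation; up to a constant factor it is the kernel of Example 4.1) is obtained with constant coefficients:
$$Q(x, D_x) = 1 - \frac{d^2}{dx^2}, \qquad P(y, D_y) = 1, \qquad G(x, y) = \tfrac12 e^{-|x-y|}, \qquad R(x, y) = G(x, y),$$
since $\big(1 - \frac{d^2}{dx^2}\big)\tfrac12 e^{-|x-y|} = \delta(x - y)$; thus (5.2)-(5.5) hold with $p = 0$, $q = 2$, and $a = (q - p)/2 = 1$.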

In this chapter all the functions are assumed to be real-valued. We assume that the coefficients $a_\alpha(x)$, $b_\beta(x)$ and $f(x)$ are sufficiently smooth functions in $\mathbb{R}^n$, $\alpha = (\alpha_1, \dots, \alpha_n)$ and $\beta = (\beta_1, \dots, \beta_n)$ are multiindices,
$$|\alpha| = \sum_{i=1}^{n} \alpha_i, \qquad |\beta| = \sum_{j=1}^{n} \beta_j, \qquad D_y^\alpha = \frac{\partial^{|\alpha|}}{\partial y_1^{\alpha_1} \cdots \partial y_n^{\alpha_n}}, \qquad D_x^\beta = \frac{\partial^{|\beta|}}{\partial x_1^{\beta_1} \cdots \partial x_n^{\beta_n}}.$$
Sufficient smoothness of the coefficients means that the integrations by parts we use are justified.

The following assumptions hold throughout the chapter:

A1)
$$(Q(x, D_x)\varphi, \varphi) \ge c_1(\varphi, \varphi), \qquad c_1 = \mathrm{const} > 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n), \qquad (5.6)$$
$$(P(x, D_x)\varphi, \varphi) \ge c_2(\varphi, \varphi), \qquad c_2 = \mathrm{const} > 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n), \qquad (5.7)$$
where $(\cdot, \cdot)$ is the $L^2(\mathbb{R}^n)$ inner product, and $L^2$ is the real Hilbert space. By $Q^*(x, D_x)$ and $P^*(x, D_x)$ we denote the operators formally adjoint to $Q(x, D_x)$ and $P(x, D_x)$.

If (5.6) holds, then $q > 0$ is an even integer, and (5.7) implies that $p$ is an even integer, $0 \le p < q$. Define $a := (q - p)/2$. Let $H^\lambda(D)$ be the usual Sobolev space and $H^{-\lambda}(D)$ be its dual with respect to $L^2(D) = H^0(D)$. Denote $\|\varphi\|_\lambda = \|\varphi\|_{H^\lambda(D)}$ for $\lambda > 0$ and for $\lambda < 0$. For the special value $\lambda = a$, denote $H^a(D) = H_+$, $H^{-a}(D) = H_-$. Denote


by $(h_1, h_2)_-$ and by $(\cdot, \cdot)$ the inner products in $H_-$ and, respectively, in $L^2(D)$. As in Chapter 8, let us assume that

A2)
$$c_3\|\varphi\|_-^2 \le (R\varphi, \varphi) \le c_4\|\varphi\|_-^2, \qquad c_3 = \mathrm{const} > 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n). \qquad (5.8)$$
This assumption holds, for example (see Chapter 8), if
$$c_5\|\varphi\|_{(p+q)/2} \le \|Q^*\varphi\|_{-a} \le c_6\|\varphi\|_{(p+q)/2}, \qquad c_5 = \mathrm{const} > 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n), \qquad (5.9)$$
and
$$c_7\|\varphi\|_{(p+q)/2}^2 \le (PQ^*\varphi, \varphi) \le c_8\|\varphi\|_{(p+q)/2}^2, \qquad c_7 = \mathrm{const} > 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n). \qquad (5.10)$$
The following result is proved in Chapter 8.

Theorem 5.1 If (5.8) holds, then the operator $R : H_- \to H_+$ is an isomorphism. If $QR = P\delta(x - y)$ and (5.9) and (5.10) hold, then (5.8) holds.

Equation (5.1) and the limiting equation $Rh = f$ are basic in random fields estimation theory, and the kernel $R(x, y)$ in this theory is a covariance function, so $R(x, y)$ is a non-negative definite kernel:
$$(R\varphi, \varphi) \ge 0, \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n).$$
If $p < q$, then the inequality $(R\varphi, \varphi) \ge C(\varphi, \varphi)$, $C = \mathrm{const} > 0$, $\forall \varphi \in C_0^\infty(\mathbb{R}^n)$, does not hold.

In [Ramm and Shifrin (1991); Ramm and Shifrin (1993); Ramm and Shifrin (1995)] a method was developed for finding the asymptotics of the solution to equation (5.1) with kernel $R(x, y)$ satisfying equation (5.5), with $Q(x, D_x)$ and $P(x, D_x)$ being differential operators with constant coefficients. Our purpose is to generalize this theory to the case of operators with variable coefficients.

In Chapter 8 the limiting equation $Rh = f$ is studied for the above class of kernels. In Chapter 2 the class of kernels $R(x, y)$ which are kernels of positive rational functions of an arbitrary selfadjoint elliptic operator in $L^2(\mathbb{R}^n)$ was studied.

In Section 5.2 we prove some auxiliary results. In Section 5.3 the asymptotics of the solution to equation (5.1) is constructed in the case $n = 1$, that is, for one-dimensional integral equations of the class $\mathcal{R}$ defined below formula (5.1). In Section 5.4 examples of applications of the proposed asymptotic solutions are given. In Section 5.5 the asymptotics of the solution to equation (5.1) is constructed in the case $n > 1$, and in Section 5.6 examples of applications are given.

5.2 Auxiliary results

Lemma 5.1 Assume (5.4) and suppose that $G(\infty, y) = 0$. Then
$$Q^*(y, D_y) G(x, y) = \delta(x - y). \qquad (5.11)$$
Proof. Let $\varphi \in C_0^\infty(\mathbb{R}^n)$. Then
$$(Q^*(y, D_y) Q(x, D_x) G(x, y), \varphi(y)) = (\delta(x - y), Q(y, D_y)\varphi(y)) = Q(x, D_x)\varphi(x). \qquad (5.12)$$
Also one has
$$(Q^*(y, D_y) Q(x, D_x) G(x, y), \varphi(y)) = Q(x, D_x)(Q^*(y, D_y) G(x, y), \varphi(y)). \qquad (5.13)$$
Therefore
$$Q(x, D_x)(Q^*(y, D_y) G(x, y), \varphi(y)) = Q(x, D_x)\varphi(x). \qquad (5.14)$$
Because of (5.6) and of the condition $G(\infty, y) = 0$ this implies
$$(Q^*(y, D_y) G(x, y), \varphi(y)) = \varphi(x), \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^n), \qquad (5.15)$$
so (5.11) follows.

Consider now the case $n = 1$:
$$P(y, D_y) = \sum_{i=0}^{p} a_i(y)\frac{d^i}{dy^i}, \qquad Q(x, D_x) = \sum_{j=0}^{q} b_j(x)\frac{d^j}{dx^j}, \qquad x \in \mathbb{R}^1, \ y \in \mathbb{R}^1. \qquad (5.16)$$
In this case $D = (c, d)$, $\overline{D} = [c, d]$.

Lemma 5.2 If $g(y)$ is a smooth function in $\overline{D}$, then
$$\int_c^d [P(y, D_y) G(x, y)]\, g(y)\, dy = \int_c^d G(x, y)\, [P^*(y, D_y) g(y)]\, dy + K_2 g(x), \qquad (5.17)$$
where
$$K_2 g(x) := \sum_{k=1}^{p} \sum_{j=1}^{k} (-1)^{j-1} \left[ \frac{\partial^{k-j} G(x, y)}{\partial y^{k-j}} \cdot \frac{d^{j-1}(a_k(y) g(y))}{dy^{j-1}} \right]_c^d, \qquad (5.18)$$
and
$$K_2 = 0 \ \text{ if } p = 0. \qquad (5.19)$$
Proof. Use definition (5.16) of $P(y, D_y)$ in (5.17), integrate by parts, and get formulas (5.17)-(5.19).

Lemma 5.3 If $g(y)$ is a smooth function in $\overline{D}$, then
$$\int_c^d G(x, y)\, [Q(y, D_y) g(y)]\, dy = \int_c^d [Q^*(y, D_y) G(x, y)]\, g(y)\, dy + K_1 g(x), \qquad (5.20)$$
where
$$K_1 g(x) := \sum_{m=1}^{q} \sum_{i=1}^{m} (-1)^{i-1} \left[ \frac{d^{m-i} g(y)}{dy^{m-i}} \cdot \frac{\partial^{i-1}(b_m(y) G(x, y))}{\partial y^{i-1}} \right]_c^d. \qquad (5.21)$$
Proof. Similarly to Lemma 5.2, integrations by parts yield the desired formulas.
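For orientation, here is an added special case of (5.21), using the constant-coefficient operator from the example after (5.5): for $Q(y, D_y) = 1 - \frac{d^2}{dy^2}$, i.e. $b_0 = 1$, $b_1 = 0$, $b_2 = -1$, formula (5.21) reduces to the familiar boundary terms of Green's second identity:
$$K_1 g(x) = \Big[ -g'(y)\, G(x, y) + g(y)\, \frac{\partial G(x, y)}{\partial y} \Big]_{y=c}^{y=d}.$$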

Consider the case $n > 1$.

Lemma 5.4 If $P(y, D_y)$ is defined in (5.3) and $g(y)$ is a smooth function in $\overline{D}$, then
$$\int_D [P(y, D_y) G(x, y)]\, g(y)\, dy = \int_D G(x, y)\, [P^*(y, D_y) g(y)]\, dy + M_2 g(x), \qquad (5.22)$$
where
$$M_2 g(x) := \sum_{1 \le |\alpha| \le p} \sum_{k=1}^{n} \sum_{\gamma_k=1}^{\alpha_k} (-1)^{\gamma_k + \alpha_{k+1} + \alpha_{k+2} + \cdots + \alpha_n - 1} \int_{\partial D} \frac{\partial^{|\alpha| - \alpha_n - \alpha_{n-1} - \cdots - \alpha_{k+1} - \gamma_k} G(x, y)}{\partial y_1^{\alpha_1} \partial y_2^{\alpha_2} \cdots \partial y_{k-1}^{\alpha_{k-1}} \partial y_k^{\alpha_k - \gamma_k}} \cdot \frac{\partial^{\alpha_{k+1} + \alpha_{k+2} + \cdots + \alpha_n + \gamma_k - 1} (a_\alpha(y) g(y))}{\partial y_k^{\gamma_k - 1} \partial y_{k+1}^{\alpha_{k+1}} \partial y_{k+2}^{\alpha_{k+2}} \cdots \partial y_n^{\alpha_n}}\, N_k(y)\, dS_y . \qquad (5.23)$$
Here $\partial D$ is the boundary of $D$, $y \in \partial D$, $N_k(y)$ is the $k$-th component of the unit normal $N$ to $\partial D$ at the point $y$, pointing into $D' := \mathbb{R}^n \setminus D$, and if $\alpha_k = 0$ then the summation over $\gamma_k$ should be dropped.

Proof. Apply Gauss' formula (i.e., integrate by parts).

Lemma 5.5 If $Q(x, D_x)$ is defined in (5.3) and $g(y)$ is a smooth function in $\overline{D}$, then
$$\int_D G(x, y)\, [Q(y, D_y) g(y)]\, dy = \int_D [Q^*(y, D_y) G(x, y)]\, g(y)\, dy + M_1 g(x), \qquad (5.24)$$
where
$$M_1 g(x) := \sum_{1 \le |\beta| \le q} \sum_{k=1}^{n} \sum_{\gamma_k=1}^{\beta_k} (-1)^{\gamma_k + \beta_{k+1} + \beta_{k+2} + \cdots + \beta_n - 1} \int_{\partial D} \frac{\partial^{\beta_{k+1} + \beta_{k+2} + \cdots + \beta_n + \gamma_k - 1} (b_\beta(y) G(x, y))}{\partial y_k^{\gamma_k - 1} \partial y_{k+1}^{\beta_{k+1}} \partial y_{k+2}^{\beta_{k+2}} \cdots \partial y_n^{\beta_n}} \cdot \frac{\partial^{|\beta| - \beta_n - \beta_{n-1} - \cdots - \beta_{k+1} - \gamma_k} g(y)}{\partial y_1^{\beta_1} \partial y_2^{\beta_2} \cdots \partial y_{k-1}^{\beta_{k-1}} \partial y_k^{\beta_k - \gamma_k}}\, N_k(y)\, dS_y . \qquad (5.25)$$
Here $y \in \partial D$, and if $\beta_k = 0$ then the summation over $\gamma_k$ should be dropped.

Remark 5.1 For any function $g(x)$ smooth in $\overline{D}$ one has
$$Q(x, D_x) K_j g(x) = 0, \qquad x \in (c, d), \qquad j = 1, 2, \qquad (5.26)$$
and
$$Q(x, D_x) M_j g(x) = 0, \qquad x \in D, \qquad j = 1, 2. \qquad (5.27)$$
Formulas (5.26) and (5.27) follow from the definitions of $K_j$ and $M_j$ and from equation (5.4).


5.3 Asymptotics in the case n = 1

To construct asymptotic solutions to equation (5.1) with R(x, y) ∈ R we

reduce this equation to a differential equation with special, non-standard,

boundary conditions.

Theorem 5.2 Equation (5.1) is equivalent to the problem:

εQ(x,Dx)h(x, ε) + P ∗(x,Dx)h(x, ε) = Q(x,Dx)f(x) , x ∈ (c, d) (5.28)

with the following conditions

εK1h(x, ε) −K2h(x, ε) = K1f(x) . (5.29)

Proof. If h(x, ε) solves (5.1) and R(x, y) satisfies (5.2), one gets

εh(x, ε) + ∫_c^d [P(y, D_y) G(x, y)] h(y, ε) dy = f(x) .   (5.30)

From (5.30) and (5.17) one gets:

εh(x, ε) + ∫_c^d G(x, y) [P^*(y, D_y) h(y, ε)] dy + K_2 h(x, ε) = f(x) .   (5.31)

Applying Q(x,Dx) to (5.31) and using (5.4) and (5.26), yields (5.28).

Let us check (5.29). From (5.28) and (5.31) one gets:

εh(x, ε) + ∫_c^d G(x, y) Q(y, D_y) [f(y) − εh(y, ε)] dy + K_2 h(x, ε) = f(x) .   (5.32)

From (5.32) and (5.20) one obtains

εh(x, ε) + ∫_c^d [Q^*(y, D_y) G(x, y)] (f(y) − εh(y, ε)) dy + K_1(f − εh)(x, ε) + K_2 h(x, ε) = f(x) .   (5.33)

From (5.33) and (5.11) one concludes:

εh(x, ε) + f(x) − εh(x, ε) + K_1 f(x) − εK_1 h(x, ε) + K_2 h(x, ε) = f(x) .

This relation yields (5.29).


Let us now assume (5.28) and (5.29) and prove that h(x, ε) solves (5.1).

Indeed, (5.2) and (5.17) imply

εh(x, ε) + ∫_c^d R(x, y) h(y, ε) dy = εh(x, ε) + ∫_c^d [P(y, D_y) G(x, y)] h(y, ε) dy
= εh(x, ε) + ∫_c^d G(x, y) [P^*(y, D_y) h(y, ε)] dy + K_2 h(x, ε) .   (5.34)

From (5.34) and (5.28) one gets

εh(x, ε) + Rh(x, ε) = εh(x, ε) + ∫_c^d G(x, y) Q(y, D_y)(f(y) − εh(y, ε)) dy + K_2 h(x, ε) .   (5.35)

From (5.35) and (5.20) one obtains:

εh(x, ε) + Rh(x, ε) = εh(x, ε) + ∫_c^d [Q^*(y, D_y) G(x, y)] (f(y) − εh(y, ε)) dy + K_1(f − εh)(x, ε) + K_2 h(x, ε) .

This relation and equation (5.11) yield:

εh(x, ε) + Rh(x, ε) = εh(x, ε) + f(x) − εh(x, ε) + K_1 f(x) − εK_1 h(x, ε) + K_2 h(x, ε) ,

and, using (5.29), one gets (5.1). Theorem 5.2 is proved.

This theorem is used in our construction of the asymptotic solution to

(5.1). Let us look for this asymptotics of the form:

h(x, ε) = Σ_{l=0}^{∞} ε^l (u_l(x) + w_l(x, ε)) = Σ_{l=0}^{∞} ε^l h_l(x, ε) ,   (5.36)

where the series in (5.36) is understood in the asymptotical sense as follows:

h(x, ε) = Σ_{l=0}^{L} ε^l (u_l(x) + w_l(x, ε)) + O(ε^{L+1}) as ε → 0 ,

where O(ε^{L+1}) is independent of x, and u_l(x) and w_l(x, ε) are some functions.

Here u0(x) is an arbitrary solution to the equation

P ∗(x,Dx)u0(x) = Q(x,Dx)f(x) . (5.37)

If u0(x) is chosen, the function w0(x, ε) is constructed as a unique so-

lution to the equation:

εQ(x,Dx)w0(x, ε) + P ∗(x,Dx)w0(x, ε) = 0 , (5.38)

which satisfies the conditions

εK1w0(x, ε) −K2w0(x, ε) = K1f(x) +K2u0(x) . (5.39)

Theorem 5.3 The function h_0(x, ε) = u_0(x) + w_0(x, ε) solves the equation

εh_0(x, ε) + Rh_0(x, ε) = f(x) + εu_0(x) .   (5.40)

Proof. From (5.2) and (5.17) one gets:

εh_0(x, ε) + Rh_0(x, ε) = εh_0(x, ε) + ∫_c^d [P(y, D_y) G(x, y)] h_0(y, ε) dy
= εh_0(x, ε) + ∫_c^d G(x, y) [P^*(y, D_y) h_0(y, ε)] dy + K_2 h_0(x, ε) .   (5.41)

From (5.37) and (5.38) it follows that

P^*(y, D_y) h_0(y, ε) = P^*(y, D_y)(u_0(y) + w_0(y, ε)) = P^*(y, D_y) u_0(y) + P^*(y, D_y) w_0(y, ε) = Q(y, D_y) f(y) − εQ(y, D_y) w_0(y, ε) = Q(y, D_y) [f(y) − εw_0(y, ε)] .   (5.42)

From (5.42) and from the definition of h0(x, ε) one derives:

P ∗(y,Dy)h0(y, ε) = Q(y,Dy) [f(y) − εh0(y, ε) + εu0(y)] . (5.43)


From (5.43) and (5.41) one gets:

εh_0(x, ε) + Rh_0(x, ε) = εh_0(x, ε) + ∫_c^d G(x, y) Q(y, D_y)[f(y) − εh_0(y, ε) + εu_0(y)] dy + K_2 h_0(x, ε) .   (5.44)

Equations (5.44) and (5.20) yield:

εh_0(x, ε) + Rh_0(x, ε) = εh_0(x, ε) + ∫_c^d [Q^*(y, D_y) G(x, y)] (f(y) − εh_0(y, ε) + εu_0(y)) dy + K_1(f(x) − εh_0(x, ε) + εu_0(x)) + K_2 h_0(x, ε) .   (5.45)

From (5.45) and (5.11) one derives:

εh_0(x, ε) + Rh_0(x, ε) = εh_0(x, ε) + f(x) − εh_0(x, ε) + εu_0(x) + K_1 f(x) − εK_1 h_0(x, ε) + εK_1 u_0(x) + K_2 h_0(x, ε) .   (5.46)

This implies:

εh_0(x, ε) + Rh_0(x, ε) = f(x) + εu_0(x) + K_1 f(x) − εK_1 w_0(x, ε) + K_2 u_0(x) + K_2 w_0(x, ε) .   (5.47)

Equations (5.47) and (5.39) yield (5.40). Theorem 5.3 is proved.

Let us construct higher order approximations. If l ≥ 1 then ul(x) is

chosen to be an arbitrary particular solution to the equation

P ∗(x,Dx)ul(x) = −Q(x,Dx)ul−1(x) . (5.48)

After ul(x) is fixed, the function wl(x, ε) is constructed as the unique

solution to the equation

εQ(x,Dx)wl(x, ε) + P ∗(x,Dx)wl(x, ε) = 0 , (5.49)

satisfying the conditions

εK1wl(x, ε) −K2wl(x, ε) = −K1ul−1(x) +K2ul(x) . (5.50)


Theorem 5.4 The function hl(x, ε) = ul(x)+wl(x, ε) solves the equation

εhl(x, ε) +Rhl(x, ε) = −ul−1(x) + εul(x) . (5.51)

Proof. The proof is similar to that of theorem 5.3 and is omitted.

Define

H_L(x, ε) = Σ_{l=0}^{L} ε^l h_l(x, ε) .   (5.52)

Theorem 5.5 The function HL(x, ε) solves the equation

εHL(x, ε) +RHL(x, ε) = f(x) + εL+1uL(x) . (5.53)

Proof. From (5.52) one gets

εH_L(x, ε) + RH_L(x, ε) = ε Σ_{l=0}^{L} ε^l h_l(x, ε) + Σ_{l=0}^{L} ε^l Rh_l(x, ε) = Σ_{l=0}^{L} ε^l [εh_l(x, ε) + Rh_l(x, ε)] .   (5.54)

Using (5.40), (5.51) and (5.54) yields (5.53). Theorem 5.5 is proved.

Theorem 5.6 If the function f(x) is sufficiently smooth in D, then it is

possible to choose a solution u0(x) to (5.37) and a solution ul(x) to (5.48)

so that the following inequality holds

‖HL(x, ε) − h(x, ε)‖− ≤ CεL+1 , (5.55)

where C = const > 0 does not depend on ε, but it depends on f(x).

Proof. From (5.1) and (5.53) one obtains

ε(H_L(x, ε) − h(x, ε)) + R(H_L(x, ε) − h(x, ε)) = ε^{L+1} u_L(x) .   (5.56)

From (5.56) it follows that

ε(H_L(x, ε) − h(x, ε), H_L(x, ε) − h(x, ε)) + (R(H_L(x, ε) − h(x, ε)), H_L(x, ε) − h(x, ε)) = ε^{L+1} (u_L(x), H_L(x, ε) − h(x, ε)) .   (5.57)

Using (5.8) one obtains

c_3 ‖H_L(x, ε) − h(x, ε)‖_−^2 ≤ ε^{L+1} ‖u_L(x)‖_+ ‖H_L(x, ε) − h(x, ε)‖_− .   (5.58)


Inequality (5.55) follows from (5.58) if the norm ‖u_L(x)‖_+ is finite.

Consider L = 0. If f(x) ∈ H^{3(q−p)/2}(D) then it is possible to find a solution u_0(x) ∈ H^{(q−p)/2}(D) of (5.37). Thus the norm ‖u_0(x)‖_+ is finite.

For L = 1 suppose that f(x) ∈ H^{5(q−p)/2}(D). Then there exist a solution u_0(x) ∈ H^{3(q−p)/2}(D) to (5.37) and a solution u_1(x) ∈ H^{(q−p)/2}(D) = H_+ to (5.48), so that the norm ‖u_1(x)‖_+ is finite.

If f(x) ∈ C^∞(D) then the approximation H_L(x, ε) satisfying (5.55) can be constructed for an arbitrarily large L.

5.4 Examples of asymptotical solutions: case n = 1

Example 5.1 Let

εh(x, ε) + ∫_{−1}^{1} e^{−a|x−y|} r(y) h(y, ε) dy = f(x) ,   (5.59)

where r(y) ≥ C_2 > 0 is a given function.

In this example the operators P(y, D_y) and Q(x, D_x) act on an arbitrary, sufficiently smooth, function g(x) according to the formulas:

P(y, D_y) g(y) = r(y) g(y) , i.e., p = 0 ,

and

Q(x, D_x) g(x) = −(1/2a) d^2 g(x)/dx^2 + (a/2) g(x) .

One has

Q(x, D_x) e^{−a|x−y|} = δ(x − y) , so G(x, y) = e^{−a|x−y|} .

Equation (5.37) yields

u_0(x) = (−f″(x) + a^2 f(x)) / (2a r(x)) ,   (5.60)

and (5.38) takes the form

(ε/2a)(−w_0″(x, ε) + a^2 w_0(x, ε)) + r(x) w_0(x, ε) = 0 .   (5.61)


If one looks for the main term of the asymptotics of h(x, ε), then one can solve in place of (5.61) the following equation

−(ε/2a) w_{0a}″(x, ε) + r(x) w_{0a}(x, ε) = 0 ,   (5.62)

where w_{0a}(x, ε) is the main term of the asymptotics of w_0(x, ε).

We seek asymptotics of the bounded, as ε → 0, solutions to (5.61) and (5.62). To construct the asymptotics, one may use the method developed in [Vishik and Lusternik (1962)]. Namely, near the point x = −1 one sets x = y − 1, y ≥ 0, and writes (5.62) as:

−(ε/2a) v_a″(y, ε) + r(y − 1) v_a(y, ε) = 0 ,   (5.63)

where v_a(y, ε) := w_{0a}(y − 1, ε).

Put y = t√ε and denote ϕ_a(t, ε) := v_a(t√ε, ε). Then

−(1/2a) d^2 ϕ_a(t, ε)/dt^2 + r(t√ε − 1) ϕ_a(t, ε) = 0 .   (5.64)

Neglecting the term t√ε in the argument of r is possible if we are looking for the main term of the asymptotics of ϕ_a. Thus, consider the equation:

−(1/2a) d^2 ϕ_a(t, ε)/dt^2 + r(−1) ϕ_a(t, ε) = 0 .   (5.65)

Its solution is

ϕ_a(t, ε) = C_1 e^{−√(2a r(−1)) t} + C_2 e^{√(2a r(−1)) t} .

Discarding the unbounded, as t → +∞, part of the solution, one gets

ϕ_a(t, ε) = C_1 e^{−√(2a r(−1)) t} .

Therefore, the main term of the asymptotics of w_{0a}(x, ε) near the point x = −1 is:

w_{0a}(x, ε) = C_1 e^{−√(2a r(−1)/ε) (1+x)} ,   C_1 = const .   (5.66)

Similarly one gets near the point x = 1

w_{0a}(x, ε) = D_1 e^{−√(2a r(1)/ε) (1−x)} ,   D_1 = const .   (5.67)

From (5.66) and (5.67) one derives the main term of the asymptotics of the bounded, as ε → 0, solution to equation (5.62):

w_{0a}(x, ε) = C_1 e^{−√(2a r(−1)/ε) (1+x)} + D_1 e^{−√(2a r(1)/ε) (1−x)} .   (5.68)


Now the problem is to find the constants C1 and D1 from condition

(5.39). Since p = 0, formula (5.19) yields K2 = 0, and (5.39) is:

εK1w0(x, ε) = K1f(x) . (5.69)

From (5.69) and (5.21) one gets

ε { w_{0a}(y, ε) (∂G(x, y)/∂y) |_{y=−1}^{y=1} − (dw_{0a}(y, ε)/dy) G(x, y) |_{y=−1}^{y=1} } = f(y) (∂G(x, y)/∂y) |_{y=−1}^{y=1} − f′(y) G(x, y) |_{y=−1}^{y=1} .   (5.70)

Note that ∂G(x, y)/∂y = −a e^{−a|x−y|} sgn(y − x), where sgn(t) = t/|t|, so

∂G(x, 1)/∂y = −a e^{−a(1−x)} ,   ∂G(x, −1)/∂y = a e^{−a(1+x)} .   (5.71)

From (5.71) and (5.70) one obtains

ε { w_{0a}(1, ε)(−a) e^{−a(1−x)} − w_{0a}(−1, ε) a e^{−a(1+x)} − w_{0a}′(1, ε) e^{−a(1−x)} + w_{0a}′(−1, ε) e^{−a(1+x)} } = −f(1) a e^{−a(1−x)} − f(−1) a e^{−a(1+x)} − f′(1) e^{−a(1−x)} + f′(−1) e^{−a(1+x)} .   (5.72)

This implies:

ε { a w_{0a}(1, ε) + w_{0a}′(1, ε) } = a f(1) + f′(1) ,

and

ε { −a w_{0a}(−1, ε) + w_{0a}′(−1, ε) } = −a f(−1) + f′(−1) .   (5.73)

Keeping the main terms in the braces, one gets:

√(2a ε r(1)) D_1 = f′(1) + a f(1) ,

and

−√(2a ε r(−1)) C_1 = f′(−1) − a f(−1) .


Therefore

C_1 = (−f′(−1) + a f(−1)) / √(2a ε r(−1)) ,   D_1 = (f′(1) + a f(1)) / √(2a ε r(1)) .   (5.74)

From (5.60), (5.68) and (5.74) one finds the main term of the asymptotics of the solution to (5.59):

h(x, ε) ≈ (−f″(x) + a^2 f(x)) / (2a r(x)) + ((−f′(−1) + a f(−1)) / √(2a ε r(−1))) e^{−√(2a r(−1)/ε) (1+x)} + ((f′(1) + a f(1)) / √(2a ε r(1))) e^{−√(2a r(1)/ε) (1−x)} .   (5.75)

If r(x) = const, then (5.75) yields the asymptotic formula obtained in [Ramm and Shifrin (1991)].
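The leading-order formula (5.75) is easy to test numerically. The sketch below (added here; it is not part of the original argument) discretizes (5.59) by a Nyström method with trapezoidal weights and compares the discrete solution with (5.75); the data a = 1, r(y) = 2 + cos y, f(x) = cos x, the value of ε, and the grid size are illustrative assumptions.

```python
# Numerical check of the asymptotic formula (5.75) for Example 5.1.
# Illustrative data: a = 1, r(y) = 2 + cos(y), f(x) = cos(x), eps = 1e-3.
import numpy as np

a, eps, N = 1.0, 1e-3, 2000
x = np.linspace(-1.0, 1.0, N)
w = np.full(N, x[1] - x[0]); w[0] = w[-1] = 0.5 * (x[1] - x[0])  # trapezoid weights

r  = lambda t: 2.0 + np.cos(t)
f  = lambda t: np.cos(t)
f1 = lambda t: -np.sin(t)                 # f'
f2 = lambda t: -np.cos(t)                 # f''

# Nyström discretization of  eps*h + \int_{-1}^1 e^{-a|x-y|} r(y) h(y) dy = f
K = np.exp(-a * np.abs(x[:, None] - x[None, :])) * r(x)[None, :]
h_num = np.linalg.solve(eps * np.eye(N) + K * w[None, :], f(x))

# main term (5.75): interior term plus two boundary-layer corrections
interior = (-f2(x) + a**2 * f(x)) / (2 * a * r(x))
C1 = (-f1(-1) + a * f(-1)) / np.sqrt(2 * a * eps * r(-1))
D1 = ( f1( 1) + a * f( 1)) / np.sqrt(2 * a * eps * r( 1))
h_asym = (interior
          + C1 * np.exp(-np.sqrt(2 * a * r(-1) / eps) * (1 + x))
          + D1 * np.exp(-np.sqrt(2 * a * r( 1) / eps) * (1 - x)))

print(np.max(np.abs(h_num - h_asym)) / np.max(np.abs(h_num)))
```

Since (5.75) is only the main term of the asymptotics, the printed relative discrepancy is not at round-off level; it should be small and should decrease as ε is decreased, provided the grid keeps resolving the boundary layers of width of order √ε.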

Example 5.2 Consider the equation

εh(x, ε) + ∫_c^d G(x, y) h(y, ε) dy = f(x) ,   (5.76)

where G(x, y) solves the problem

−∂^2 G(x, y)/∂x^2 + a^2(x) G(x, y) = δ(x − y) ,   G(∞, y) = 0 ,   (5.77)

and a^2(x) ≥ const > 0, ∀x ∈ R^1.

Here P(y, D_y) = I, p = 0, Q(x, D_x) = −d^2/dx^2 + a^2(x), q = 2.

One can write G(x, y) as

G(x, y) = ϕ_1(x) ϕ_2(y) for x < y ,   G(x, y) = ϕ_2(x) ϕ_1(y) for y < x ,   (5.78)

where the functions ϕ_1(x) and ϕ_2(x) are linearly independent solutions to the equation Q(x, D_x)ϕ(x) = 0, satisfying the conditions ϕ_1(−∞) = 0, ϕ_2(+∞) = 0 and

ϕ_1′(x) ϕ_2(x) − ϕ_1(x) ϕ_2′(x) = 1 .   (5.79)

By (5.37) one gets

u0(x) = −f ′′(x) + a2(x)f(x) . (5.80)


By (5.38) one obtains

ε(−w_0″(x, ε) + a^2(x) w_0(x, ε)) + w_0(x, ε) = 0 .

The main term w_{0a}(x, ε) of the asymptotics of w_0(x, ε) solves the equation:

−ε w_{0a}″(x, ε) + w_{0a}(x, ε) = 0 .

Thus

w_{0a}(x, ε) = C e^{−(x−c)/√ε} + D e^{−(d−x)/√ε} .   (5.81)

Condition (5.39) takes the form (5.69). Using w_{0a}(x, ε) in place of w_0(x, ε) in (5.69), one gets, similarly to (5.70), the relation

ε { w_{0a}(y, ε) (∂G(x, y)/∂y) |_c^d − w_{0a}′(y, ε) G(x, y) |_c^d } = f(y) (∂G(x, y)/∂y) |_c^d − f′(y) G(x, y) |_c^d .

Keeping the main terms, one gets

−ε w_{0a}′(y, ε) G(x, y) |_c^d = f(y) (∂G(x, y)/∂y) |_c^d − f′(y) G(x, y) |_c^d .   (5.82)

From (5.82) and (5.78) one obtains

ε { −w_{0a}′(d, ε) ϕ_1(x) ϕ_2(d) + w_{0a}′(c, ε) ϕ_2(x) ϕ_1(c) } = f(d) ϕ_1(x) ϕ_2′(d) − f(c) ϕ_2(x) ϕ_1′(c) − f′(d) ϕ_1(x) ϕ_2(d) + f′(c) ϕ_2(x) ϕ_1(c) .   (5.83)

Because ϕ_1(x) and ϕ_2(x) are linearly independent, it follows from (5.83) that

−ε w_{0a}′(d, ε) ϕ_2(d) = f(d) ϕ_2′(d) − f′(d) ϕ_2(d) ,
ε w_{0a}′(c, ε) ϕ_1(c) = −f(c) ϕ_1′(c) + f′(c) ϕ_1(c) .   (5.84)

Substitute (5.81) in (5.84) and keep the main terms, to get

−ε (D/√ε) ϕ_2(d) = f(d) ϕ_2′(d) − f′(d) ϕ_2(d) ,
−ε (C/√ε) ϕ_1(c) = −f(c) ϕ_1′(c) + f′(c) ϕ_1(c) .

This yields the final formulas for the coefficients:

C = (−f′(c) ϕ_1(c) + f(c) ϕ_1′(c)) / (√ε ϕ_1(c)) ,   D = (f′(d) ϕ_2(d) − f(d) ϕ_2′(d)) / (√ε ϕ_2(d)) .   (5.85)

From (5.80), (5.81) and (5.85) one gets the main term of the asymptotics of the solution to (5.76):

h(x, ε) ≈ −f″(x) + a^2(x) f(x) + ((−f′(c) ϕ_1(c) + f(c) ϕ_1′(c)) / (√ε ϕ_1(c))) e^{−(x−c)/√ε} + ((f′(d) ϕ_2(d) − f(d) ϕ_2′(d)) / (√ε ϕ_2(d))) e^{−(d−x)/√ε} .   (5.86)
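A similar check can be made for (5.86). If a(x) ≡ a = const, then ϕ_1(x) = e^{ax}/√(2a) and ϕ_2(x) = e^{−ax}/√(2a) satisfy (5.79), and G(x, y) = e^{−a|x−y|}/(2a). The sketch below (added here; the data a, (c, d), f and ε are illustrative assumptions) compares a Nyström solution of (5.76) with (5.86) for this constant-coefficient case.

```python
# Nyström check of (5.86) for a(x) = a = const, so that phi1 = e^{ax}/sqrt(2a),
# phi2 = e^{-ax}/sqrt(2a) satisfy (5.79) and G(x,y) = e^{-a|x-y|}/(2a).
# Data (a, c, d, f, eps) are illustrative assumptions.
import numpy as np

a, c, d, eps = 1.0, 0.0, 2.0, 1e-4
N = 3000
x = np.linspace(c, d, N)
step = x[1] - x[0]
w = np.full(N, step); w[0] = w[-1] = 0.5 * step

f  = lambda t: np.sin(t) + 2.0
f1 = lambda t: np.cos(t)
f2 = lambda t: -np.sin(t)
phi1  = lambda t: np.exp(a * t) / np.sqrt(2 * a)
phi1p = lambda t: a * np.exp(a * t) / np.sqrt(2 * a)
phi2  = lambda t: np.exp(-a * t) / np.sqrt(2 * a)
phi2p = lambda t: -a * np.exp(-a * t) / np.sqrt(2 * a)

G = np.exp(-a * np.abs(x[:, None] - x[None, :])) / (2 * a)
h_num = np.linalg.solve(eps * np.eye(N) + G * w[None, :], f(x))

C = (-f1(c) * phi1(c) + f(c) * phi1p(c)) / (np.sqrt(eps) * phi1(c))
D = ( f1(d) * phi2(d) - f(d) * phi2p(d)) / (np.sqrt(eps) * phi2(d))
h_asym = (-f2(x) + a**2 * f(x)
          + C * np.exp(-(x - c) / np.sqrt(eps))
          + D * np.exp(-(d - x) / np.sqrt(eps)))

print(np.max(np.abs(h_num - h_asym)) / np.max(np.abs(h_num)))
```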

5.5 Asymptotics in the case n > 1

Consider equation (5.1) with R(x, y) ∈ R. The method for construction

of the asymptotics of the solution to (5.1) in the multidimensional case is

parallel to the one developed in the case n = 1. The proofs are also parallel

to the ones given for the case n = 1, and are omitted for this reason.

Let us state the basic results.

Theorem 5.7 Equation (5.1) is equivalent to the problem

εQ(x,Dx)h(x, ε) + P ∗(x,Dx)h(x, ε) = Q(x,Dx)f(x) , (5.87)

εM1h(x, ε) −M2h(x, ε) = M1f(x) . (5.88)

Proof. One uses lemmas 5.1, 5.4 and 5.5 and formula (5.27) to prove

theorem 5.7.

To construct the asymptotics of the solution to equation (5.1), let us

look for the asymptotics of the form:

h(x, ε) = Σ_{l=0}^{∞} ε^l (u_l(x) + w_l(x, ε)) = Σ_{l=0}^{∞} ε^l h_l(x, ε) ,   (5.89)

where u0(x) is an arbitrary solution to the equation

P ∗(x,Dx)u0(x) = Q(x,Dx)f(x) , (5.90)


and if some u0(x) is found, then w0(x, ε) is uniquely determined as the

solution to the problem

εQ(x,Dx)w0(x, ε) + P ∗(x,Dx)w0(x, ε) = 0 , (5.91)

εM1w0(x, ε) −M2w0(x, ε) = M1f(x) +M2u0(x) . (5.92)

Theorem 5.8 The function h_0(x, ε) = u_0(x) + w_0(x, ε) solves the equation

εh_0(x, ε) + Rh_0(x, ε) = f(x) + εu_0(x) .   (5.93)

Let us construct higher order terms of the asymptotics. Define ul(x)

(l ≥ 1) as an arbitrary solution to the equation

P ∗(x,Dx)ul(x) = −Q(x,Dx)ul−1(x) . (5.94)

After finding ul(x), one finds wl(x, ε) as the unique solution to the

problem

εQ(x,Dx)wl(x, ε) + P ∗(x,Dx)wl(x, ε) = 0 , (5.95)

εM1wl(x, ε) −M2wl(x, ε) = −M1ul−1(x) +M2ul(x) . (5.96)

Theorem 5.9 The function hl(x, ε) = ul(x)+wl(x, ε) solves the equation

εhl(x, ε) +Rhl(x, ε) = −ul−1(x) + εul(x) . (5.97)

Define

H_L(x, ε) = Σ_{l=0}^{L} ε^l h_l(x, ε) .   (5.98)

From Theorems 5.8 and 5.9 one derives

Theorem 5.10 The function HL(x, ε) solves the equation

εHL(x, ε) +RHL(x, ε) = f(x) + εL+1uL(x) . (5.99)

Theorem 5.11 If the function f(x) is sufficiently smooth in D, then it is

possible to choose a solution u0(x) to (5.90) and a solution ul(x) to (5.94),

so that the following inequality holds

‖HL(x, ε) − h(x, ε)‖− ≤ CεL+1,

where C = const > 0 does not depend on ε, but it depends on f(x).


5.6 Examples of asymptotical solutions: case n > 1

Example 5.3

Consider the equation

εh(x, ε) + ∫_{S_1} G(x, y) s(|y|) h(y, ε) dy = 1 ,   (5.100)

where x = (x_1, x_2), y = (y_1, y_2), |y| = √(y_1^2 + y_2^2), s(|y|) is a known smooth positive function, s(|y|) ≥ C_2 > 0, G(x, y) = (1/2π) K_0(a|x − y|), K_0(r) is the Macdonald function, (−Δ_x + a^2) G(x, y) = δ(x − y), and S_1 is the unit disk centered at the origin.

In this example P (y,Dy)g(y) = s(|y|)g(y), p = 0, Q(x,Dx) = −∆x+a2,

q = 2.

Let us construct the main term of the asymptotics of the solution to

(5.100). By (5.90) one gets

s(|x|)u0(x) = (−∆x + a2)1 = a2 .

Thus

u_0(x) = a^2 / s(|x|) .   (5.101)

Equation (5.91) yields:

ε(−∆x + a2)w0(x, ε) + s(|x|)w0(x, ε) = 0 . (5.102)

The main term w_{0a}(x, ε) of the asymptotics of w_0(x, ε) solves the equation

−εΔ_x w_{0a}(x, ε) + s(|x|) w_{0a}(x, ε) = 0 .   (5.103)

In polar coordinates one gets

−ε( ∂^2 w_{0a}(r, ϕ, ε)/∂r^2 + (1/r) ∂w_{0a}(r, ϕ, ε)/∂r + (1/r^2) ∂^2 w_{0a}(r, ϕ, ε)/∂ϕ^2 ) + s(r) w_{0a}(r, ϕ, ε) = 0 .   (5.104)

By radial symmetry w_{0a}(r, ϕ, ε) = w_{0a}(r, ε), so

−ε( d^2 w_{0a}(r, ε)/dr^2 + (1/r) dw_{0a}(r, ε)/dr ) + s(r) w_{0a}(r, ε) = 0 .   (5.105)


The asymptotics of the solution to (5.105) we construct using the method of [Vishik and Lusternik (1962)]. Let r = 1 − ϱ. Then

−ε( d^2 w_{0a}(ϱ, ε)/dϱ^2 − (1/(1 − ϱ)) dw_{0a}(ϱ, ε)/dϱ ) + s(1 − ϱ) w_{0a}(ϱ, ε) = 0 .

Put ϱ = t√ε and keep the main terms, to get

−d^2 w_{0a}(t)/dt^2 + s(1) w_{0a}(t) = 0 ,   (5.106)

so

w_{0a}(t) = C e^{−√s(1) t} + D e^{√s(1) t} .

Keeping the exponentially decaying, as t → +∞, solution one obtains:

w_{0a}(t) = C e^{−√s(1) t} .

Therefore

w_{0a}(r, ε) = C e^{−√(s(1)/ε) (1−r)} .   (5.107)

To find the constant C in (5.107) we use condition (5.92). Since p = 0,

one concludes M2 = 0, and (5.92) takes the form

εM1w0a(x, ε) = M1f(x) = M11 . (5.108)

From (5.25) and (5.108) one gets:

ε ∫_{∂S_1} [ (∂G(x, y)/∂N_y) w_0(y, ε) − G(x, y) (∂w_0(y, ε)/∂N_y) ] dl_y = ∫_{∂S_1} [ (∂G(x, y)/∂N_y) · 1 − G(x, y) (∂1/∂N_y) ] dl_y ,

where dl_y is the element of the arclength of ∂S_1.

If one replaces w_0(y, ε) by w_{0a}(y, ε) in the above formula then one gets

ε ∫_{∂S_1} [ (∂G(x, y)/∂N_y) w_{0a}(y, ε) − G(x, y) (∂w_{0a}(y, ε)/∂N_y) ] dl_y = ∫_{∂S_1} (∂G(x, y)/∂N_y) dl_y .   (5.109)


The main term in (5.109) can be written as:

−ε ∫_{∂S_1} G(x, y) (∂w_{0a}(y, ε)/∂N_y) dl_y = ∫_{∂S_1} (∂G(x, y)/∂N_y) dl_y .   (5.110)

By (5.107) for y ∈ ∂S_1 one gets

∂w_{0a}(y, ε)/∂N_y = √(s(1)/ε) C .   (5.111)

From (5.110) and (5.111) one obtains

−√(ε s(1)) C ∫_{∂S_1} G(x, y) dl_y = ∫_{∂S_1} (∂G(x, y)/∂N_y) dl_y ,   ∀x ∈ S_1 .   (5.112)

For x = 0 and y ∈ ∂S_1 one gets

G(0, y) = (1/2π) K_0(a) ,   ∂G(0, y)/∂N_y = (1/2π) dK_0(ar)/dr |_{r=1} = (a/2π) K_0′(ar) |_{r=1} = −(a/2π) K_1(a) .

These relations and (5.112) imply:

−√(ε s(1)) C K_0(a) = −a K_1(a) .

Therefore

C = a K_1(a) / (√(ε s(1)) K_0(a)) .   (5.113)

From (5.101), (5.107) and (5.113) one finds the main term of the asymptotics of the solution to (5.100):

h(x, ε) ≈ a^2 / s(|x|) + (a K_1(a) / (√(ε s(1)) K_0(a))) e^{−√(s(1)/ε) (1−|x|)} .   (5.114)

If s(|x|) = 1, then (5.114) agrees with the earlier result, obtained in [Ramm and Shifrin (1995)].
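The only special-function facts used in passing from (5.112) to (5.113) are the boundary value G(0, y) = K_0(a)/(2π) and the identity K_0′(z) = −K_1(z). The few lines below (added here; the values of a, ε and s(1) are illustrative assumptions) confirm the identity numerically and evaluate the constant C of (5.113) with scipy.

```python
# Check of the special-function step between (5.112) and (5.113):
# d/dr K0(a r) at r = 1 equals -a*K1(a), i.e. K0'(z) = -K1(z).
# a = 1.3 is an arbitrary illustrative value; scipy provides K0, K1 as k0, k1.
from scipy.special import k0, k1

a, h = 1.3, 1e-6
dK0_dr = (k0(a * (1 + h)) - k0(a * (1 - h))) / (2 * h)   # central difference
print(dK0_dr, -a * k1(a))          # the two numbers should agree to roughly 1e-8

# hence the two boundary integrals in (5.112) at x = 0 equal K0(a) and -a*K1(a),
# and (5.113) gives the boundary-layer constant C for given eps and s(1):
eps, s1 = 1e-4, 1.0
C = a * k1(a) / ((eps * s1) ** 0.5 * k0(a))
print(C)
```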

Example 5.4 Consider the equation

εh(x, ε) + ∫_{B_1} G(x, y) s(|y|) h(y, ε) dy = 1 ,   (5.115)

where x = (x_1, x_2, x_3), y = (y_1, y_2, y_3), s(|y|) is a smooth positive function, s(|y|) ≥ C_2 > 0, G(x, y) = e^{−a|x−y|} / (4π|x − y|), P(y, D_y) g(y) = s(|y|) g(y), so p = 0, (−Δ_x + a^2) G(x, y) = δ(x − y), so Q(x, D_x) = −Δ_x + a^2, q = 2, and B_1 is the unit ball centered at the origin.

The main term of the asymptotics is constructed by the method of Section 5.5. By (5.90) one gets

s(|x|) u_0(x) = (−Δ_x + a^2) 1 = a^2 .

Thus

u_0(x) = a^2 / s(|x|) .   (5.116)

By (5.91)

ε(−∆x + a2)w0(x, ε) + s(|x|)w0(x, ε) = 0 .

Keeping the main terms w0a(x, ε) of the asymptotics of w0(x, ε), one

gets

−ε∆xw0a(x, ε) + s(|x|)w0a(x, ε) = 0 .

In spherical coordinates this equation for the spherically symmetric solution becomes:

−ε( d^2 w_{0a}(r, ε)/dr^2 + (2/r) dw_{0a}(r, ε)/dr ) + s(r) w_{0a}(r, ε) = 0 .   (5.117)

Let r = 1 − ϱ. Then (5.117) can be written as:

−ε( d^2 w_{0a}(ϱ, ε)/dϱ^2 − (2/(1 − ϱ)) dw_{0a}(ϱ, ε)/dϱ ) + s(1 − ϱ) w_{0a}(ϱ, ε) = 0 .

Put ϱ = t√ε and keep the main terms in the above equation to get

−d^2 w_{0a}(t)/dt^2 + s(1) w_{0a}(t) = 0 .   (5.118)

The exponentially decaying, as t → +∞, solution to (5.118) is:

w_{0a}(t) = C e^{−√s(1) t} .

Therefore

w_{0a}(x, ε) = C e^{−√(s(1)/ε) (1−|x|)} .   (5.119)


The constant C in (5.119) is determined from conditions (5.92), which

in this example can be written as

εM1w0(x, ε) = M1f(x) = M11 . (5.120)

Using formulas (5.25) and (5.120) one gets

ε ∫_{∂B_1} [ (∂G(x, y)/∂N_y) w_0(y, ε) − G(x, y) (∂w_0(y, ε)/∂N_y) ] dS_y = ∫_{∂B_1} (∂G(x, y)/∂N_y) dS_y .

Replacing w_0(y, ε) by w_{0a}(y, ε) and keeping the main terms, one obtains

−ε ∫_{∂B_1} G(x, y) (∂w_{0a}(y, ε)/∂N_y) dS_y = ∫_{∂B_1} (∂G(x, y)/∂N_y) dS_y .   (5.121)

From (5.119) for y ∈ ∂B_1 one derives

∂w_{0a}(y, ε)/∂N_y = √(s(1)/ε) C .   (5.122)

From (5.122) and (5.121) it follows that

−√(ε s(1)) C ∫_{∂B_1} G(x, y) dS_y = ∫_{∂B_1} (∂G(x, y)/∂N_y) dS_y .   (5.123)

Put x = 0 in (5.123). Let us compute the corresponding integrals:

∫_{∂B_1} G(0, y) dS_y = (1/4π) ∫_{∂B_1} (e^{−a|y|}/|y|) dS_y = (e^{−a}/4π) ∫_{∂B_1} dS_y = e^{−a} .   (5.124)

Note that:

∂G(0, y)/∂N_y = (1/4π) ∂/∂r (e^{−ar}/r) |_{r=1} = −(1/4π)(a e^{−a} + e^{−a}) .

Thus

∫_{∂B_1} (∂G(0, y)/∂N_y) dS_y = −(1/4π) e^{−a}(a + 1) ∫_{∂B_1} dS_y = −e^{−a}(a + 1) .   (5.125)

From (5.123), (5.124) and (5.125) one gets, setting x = 0, the relation

−√(ε s(1)) C e^{−a} = −e^{−a}(a + 1) .


This yields

C = (a + 1) / √(ε s(1)) .   (5.126)

From (5.116), (5.119) and (5.126) the main term of the asymptotics of the solution to equation (5.115) follows:

h(x, ε) ≈ a^2 / s(|x|) + ((a + 1) / √(ε s(1))) e^{−√(s(1)/ε) (1−|x|)} .   (5.127)

If s(|x|) = 1, formula (5.127) yields a result obtained in [Ramm and Shifrin (1995)].
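Formula (5.127) can also be tested numerically. For radial data, equation (5.115) reduces to a one-dimensional equation in r = |x|; the angular average of the kernel used below, ∫_{S^2} e^{−a|x−y|}/(4π|x−y|) dΩ_y = (e^{−a|r−ρ|} − e^{−a(r+ρ)})/(2arρ) with ρ = |y|, is a reduction added here for the check and should be treated as an assumption, as are the choices s(r) = 2 + r^2, a, ε and the grid.

```python
# Radial Nyström check of (5.127) for the ball B_1 with radial data f = 1.
# The angular-average kernel and the data below are illustrative assumptions.
import numpy as np

a, eps, N = 1.0, 1e-4, 3000
r = np.linspace(1e-6, 1.0, N)          # avoid the removable point r = 0
step = r[1] - r[0]
w = np.full(N, step); w[0] = w[-1] = 0.5 * step
s = 2.0 + r**2

Rr, Rho = r[:, None], r[None, :]
ker = (np.exp(-a * np.abs(Rr - Rho)) - np.exp(-a * (Rr + Rho))) / (2 * a * Rr * Rho)
A = eps * np.eye(N) + ker * (Rho**2 * s[None, :] * w[None, :])
h_num = np.linalg.solve(A, np.ones(N))

# main term (5.127): interior term a^2/s(r) plus the boundary-layer correction
h_asym = a**2 / s + ((a + 1) / np.sqrt(eps * s[-1])) * np.exp(-np.sqrt(s[-1] / eps) * (1 - r))
print(np.max(np.abs(h_num - h_asym)) / np.max(np.abs(h_num)))
```

As in the earlier examples, the discrepancy reflects the neglected higher-order terms and should decrease as ε decreases.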

Let us summarize briefly our results. In this chapter we constructed the asymptotics of the solution to (5.1) as ε → +0, and demonstrated how the L^2-solution to (5.1) tends to a distributional solution of the limiting equation Rh(x) = f(x).


Chapter 6

Estimation and Scattering Theory

In recent years a number of papers have appeared in which the three-dimensional (3D) inverse scattering problem is associated with the random fields estimation problem. In this Chapter we give a brief presentation of the direct and inverse scattering theory in the three-dimensional case and outline the connection between this theory and the estimation theory. This connection, however, is less natural and significant than in the one-dimensional case, due to the lack of causality in the spatial variables. In Section 6.1 the direct scattering problem is studied, in Section 6.2 the inverse scattering problem is studied, and in Section 6.3 the connection between the estimation theory and inverse scattering is discussed.

6.1 The direct scattering problem

6.1.1 The direct scattering problem

Consider the problem

ℓ_q u − k^2 u := [−∇^2 + q(x) − k^2] u = 0 in R^3 , k > 0 ,   (6.1)

u = exp(ikθ·x) + A(θ′, θ, k) r^{−1} exp(ikr) + o(r^{−1}) , r = |x| → ∞ , θ′ = x/r ,   (6.2)

where θ, θ′ ∈ S^2, S^2 is the unit sphere in R^3, and o(r^{−1}) in (6.2) is uniform in θ, θ′ ∈ S^2.

The function u is called the scattering solution, the function A(θ′, θ, k) is called the scattering amplitude, and the function q(x) is the potential.


Let us assume that

q ∈ Q :=q : q = q, |q| ≤ c(1 + |x|)−a, a > 3

. (6.3)

The bar in this chapter stands for complex conjugate (and not for mean

value). By c we denote various positive constants. By QM we denote the

following class of q

Qm :=q : q(j) ∈ Q, 0 ≤ |j| ≤ m

, (6.4)

so that Q0 = Q.

The scattering theory is developed for q which may have local singular-

ities and are described by some integral norms, but this is not important

for our presentation here. Our purpose is to give a brief outline of the the-

ory for the problem (6.1)-(6.3) with minimum technicalities. The following

questions are discussed:

1) selfadjointness of `q,

2) the nature of the spectrum of `q ,

3) existence and uniqueness of the solution to (6.1)-(6.3),

4) eigenfunction expansion in scattering solutions,

and

5) properties of the scattering amplitude.

The operator `q defined by the differential expression (6.1) on C∞0 (R3)

is symmetric and bounded from below. Let us denote by `q its closure in

H = L2(R3).

Lemma 6.1 The operator `q is selfadjoint.

Proof. This lemma is a particular case of Lemma 8.5 in Section 8.2.4.

Lemma 6.2 1) The negative spectrum of `q is discrete and finite. 2) The

positive spectrum is absolutely continuous. 3) The point λ = 0 belongs to

the continuous spectrum but may not belong to the absolutely continuous

spectrum.

Proof. Let us recall Glazman’s lemma:

Lemma 6.3 Negative spectrum of a selfadjoint operator A is discrete and

finite if and only if

sup dimM < ∞, (6.5)


where the supremum is taken over the set of subspaces M such that

(Au, u) ≤ 0 for u ∈ M.

A proof is, e.g., in [Ramm (1986), p. 330] or [Glazman (1965), §3].

Therefore the first statement of Lemma 6.2 is proved if one proves that

N_− := sup dim M < ∞ ,   M := { u : ∫_{R^3} |∇u|^2 dx < ∫_{R^3} q_−(x) |u|^2 dx } ,   (6.6)

where N_− is the number of negative eigenvalues of ℓ_q counting their multiplicities and q_− = max{0, −q(x)}. One has

q = q_+(x) − q_−(x) ,   q_+ = max{q, 0} .   (6.7)

Let us write

M = { u : 1 < (∫ q_− |u|^2 dx)(∫ |∇u|^2 dx)^{−1} } .   (6.8)

The ratio of the quadratic forms in (6.8) has a discrete spectrum and the corresponding eigenfunctions solve the problem

q_− u = −λ∇^2 u in R^3 ,

which can be written as

g_0 q_− u = λu ,   g_0 f := ∫ (4π|x − y|)^{−1} f(y) dy .   (6.9)

Let q_−^{1/2} := p. Then (6.9) can be written as

Aφ := p g_0 p φ = λφ ,   φ := pu .   (6.10)

The operator A, defined in (6.10), is compact, selfadjoint, and nonnegative-definite in L^2(R^3) if q ∈ Q. Therefore the number of the eigenvalues λ_n of the problem (6.10) which satisfy the inequality λ_n > 1 is finite. This number is the dimension of M defined in (6.8). Thus N_− < ∞, where N_− is defined in (6.6). Statement 1) of Lemma 6.2 is proved.

Only a little extra work is needed to give an estimate of N_− from above. Namely,

A^2 φ_j = λ_j^2 φ_j ,   Tr A^2 = Σ_{j=1}^{∞} λ_j^2 ≥ Σ_{λ_j>1} λ_j^2 ≥ Σ_{λ_j>1} 1 = N_− .   (6.11)


Thus

N_− ≤ Tr(p g_0 q_− g_0 p) = ∫∫ q_−(x) q_−(y) dx dy / ((4π)^2 |x − y|^2) .   (6.12)

The right-hand side of (6.12) is finite if q_− ∈ Q. Note that if q ∈ Q, then q_− ∈ Q and q_+ ∈ Q.
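To see what the bound (6.12) gives in a concrete case, one can evaluate the double integral numerically for a radial q_−. The sketch below (added here; it is not part of the text) uses the reduction, obtained by integrating over the angles first, N_− ≤ (1/2) ∫_0^∞ ∫_0^∞ q_−(r) q_−(ρ) r ρ ln((r+ρ)/|r−ρ|) dr dρ; this reduction and the Gaussian well q_−(r) = V_0 e^{−r^2} are assumptions made only for the illustration.

```python
# Numerical evaluation of the right-hand side of (6.12) for a radial q_-.
# The angular reduction used here and the Gaussian well are illustrative assumptions.
import numpy as np
from scipy.integrate import quad

V0 = 1.0
qm = lambda r: V0 * np.exp(-r**2)       # q_-(r)

def inner(r):
    f = lambda rho: qm(rho) * rho * np.log((r + rho) / abs(r - rho))
    # split at the integrable logarithmic singularity rho = r
    return quad(f, 0.0, 8.0, points=[r], limit=200)[0]

bound = 0.5 * quad(lambda r: qm(r) * r * inner(r), 0.0, 8.0, limit=200)[0]
print(bound)   # upper bound for N_-; for this Gaussian it should come out near pi/16 ~ 0.196
```

In particular, for such a shallow well the bound is less than 1, so ℓ_q has no negative eigenvalues, which is consistent with the statement following (6.12).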

6.1.2 Properties of the scattering solution

Lemma 6.4 The scattering solution exists and is unique.

Proof. The scattering solution solves the integral equation

u = u0 − gqu, gf :=

∫exp(ik|x− y|)

4π|x− y| f(y)dy, (6.13)

where

u0 := exp(ikθ · x). (6.14)

Conversely, the solution to (6.13) is the scattering solution with

A(θ′, θ, k) = − 1

∫exp(−ikθ′ · y)g(y)u(y, θ, k)dy. (6.15)

It is not difficult to check that if q ∈ Q then the operator

T (k)u := gqu (6.16)

is compact in C(R3). Therefore, the existence of the solution to (6.13)

follows from the uniqueness of the solution to the homogeneous equation

u = −Tu, u ∈ C(R3) (6.17)

by Fredholm’s alternative.

If u solves (6.17) then u solves equation (6.1) and satisfies the radiation

condition

limr→∞

|s|=r

∣∣∣∣∂u

∂r− iku

∣∣∣∣2

ds = 0. (6.18)

Since q = q, the function u solves equation (6.1) and Green’s formula yields

limr→∞

|s|=r

(u∂u

∂r− u

∂u

∂r

)ds = 0. (6.19)


From (6.18) and (6.19) it follows that

limr→∞

|s|=r

[∣∣∣∣∂u

∂r

∣∣∣∣2

+ k2|u|2]ds = 0. (6.20)

Any solution to (6.1) which satisfies condition (6.20) has to vanish iden-

tically according to a theorem of Kato [Kato (1959)]. Thus u = 0 and

Lemma 6.4 is proved.

Lemma 6.5 Let f ∈ L2(R3) be arbitrary. Define

f(ξ) := (2π)−3/2

∫f(x)u(x, ξ)dx, fj := (f, uj), 1 ≤ j ≤ N−, (6.21)

where uj are the orthonormalized eigenfunctions corresponding to the dis-

crete spectrum of `q:

`quj = λjuj, 1 ≤ j ≤ N−, λj < 0, (6.22)

(uj, um) :=

∫uj(x)um(x)dx = δjm, (6.23)

ξ ∈ R3, |ξ| = k, ξ = kθ, θ ∈ S2. (6.24)

Then

f(x) = (2π)−3/2

∫f (ξ)u(x, ξ)dξ +

N−∑

j=1

fjuj(x). (6.25)

Formulas (6.21), (6.25) are analogous to the usual Fourier inversion

formulas. They reduce to the latter if q(x) = 0.

The proof of Lemma 6.5 requires some preparations. We follow the

scheme used in [Ramm (1963); Ramm (1963b); Ramm (1965); Ramm

(1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987);

Ramm (1988b)] and in [Ramm (1986), p. 47].

Let G(x, y, k) be the resolvent kernel of `q:

(`q − k2)G(x, y, k) = δ(x− y) in R3. (6.26)

This kernel solves the equation

G(x, y, k) = g(x, y, k) −∫g(x, z, k)q(z)G(z, y, k)dz. (6.27)


This equation is similar to (6.13) and can be written as

(I + T )G = g, (6.28)

where T is defined in (6.16). Therefore, as in the proof of Lemma 6.4, the

solution to the equation (6.27) exists and is unique in the space Cy(R3) of

functions of the form G = c|x− y|−1 + v(x, y) where v(x, y) is continuous

and c = const, ‖ G ‖:= |c| + maxx∈R3 |v(x, y)|. The operator T is compact

in Cy(R3) if q ∈ Q. The homogeneous equation (6.28) has only the trivial

solution.

The operator T = T (k) depends continuously on k ∈ C+ := k :

Imk ≥ 0. Therefore [I + T (k)]−1 is a continuous function of k in the

region C+ ∩ ∆(k), where ∆(k) is a neighborhood of a point k0, ∆(k) :=

k : |k − k0| < δ δ > 0, and the operator I + T (k0) is invertible. Since for

any k > 0 the operator I + T (k) is invertible, it follows that G(x, y, k) is

continuous in k in the region C+ ∩ ∆(0,∞), that is in a neighborhood of

the positive semiaxis in C+ . The continuity holds for any x, y fixed, x 6= y,

and also in the norm of Cy. This implies that the continuous spectrum of

`q in the interval (0,∞) is absolutely continuous.

From the equation (6.27) it follows that

G(x, y, k) = g(r)u(y,−θ, k) [1 + o(1)] , r = |x| → ∞,x

|x| = θ (6.29)

where g(r) := (4πr)−1 exp(ikr) and u(y,−θ, k) is the scattering solution.

In fact, o(1) = 0(

1|x|

)uniformly in y ∈ D, where D ∈ R3 is an arbitrary

fixed bounded domain. Indeed, it follows from (6.27) that

u(y,−θ, k) = exp(−ikθ · y) −∫

exp(−ikθ · z)q(z)G(z, y, k)dz. (6.30)

The function (6.30) solves equation (6.1):

(`q − k2)u = q(y) exp(−ikθ · y) −∫

exp(−ikθ · z)q(z) [δ(z − y)] dz = 0,

and satisfies the condition (6.2) since the integral term in (6.30) satisfies

the radiation condition. Therefore, the scattering solution can be defined

by formula (6.29). This definition was introduced and used systematically

in [R 28)].


The starting formula in the proof of the eigenfunction expansion theo-

rem is the Cauchy formula

0 =1

2πi

CN

Rλfdλ (6.31)

where

Rλ := (A − λI)−1, A = A∗ = `q. (6.32)

CN is a contour which consists of the circle γN := λ : |λ| = N |, of a finite

number N− of circles γj := λ : |λ + λj| = δ where λj < 0, 1 ≤ j ≤ N−,

are negative eigenvalues of `q, δ > 0 is a small number such that γj does

not intersec with γm for j 6= m, and of a loop LN which joins points N − i0and N + i0 and goes from N − i0 to 0 and from 0 to N + i0. The circles

γj , 1 ≤ j ≤ N− are run clockwise and γN is run counterclockwise. The

integral

1

2πi

γj

Rλfdλ = Pjf (6.33)

where Pj is the orthoprojection in H = L2(R3) onto the eigenspace of A

corresponding to the eigenvalue λj. Note that there is no minus sign in

front of the integral in (6.33) because γj is run clockwise and not counter-

clockwise.

One has:

1

2πi

LNRλfdλ =

1

π

∫ N

0

ImRλ+i0fdλ, (6.34)

where we have used the relation

Rλ−i0f = Rλ+i0f . (6.35)

Formula (6.35) follows from the selfadjointness of A:

Rλ−i0 = R∗λ+i0 (6.36)

and from the symmetry of the kernel of the operator Rλ(x, y).

Finally, for any selfadjoint A one has

limN→∞

− 1

2πi

γN

Rλfdλ = f. (6.37)


Indeed, if A is selfadjoint then

Rλ =

∫ ∞

−∞(t − λ)−1dEλ, (6.38)

where Eλ is the resolution of the identity for A. Substitute (6.38) into

(6.37) to get

limN→∞

− 1

2πi

γN

∫ ∞

−∞

dEλf

t − λdλ = lim

N→∞

∫ ∞

−∞dEλf

(− 1

2πi

γN

t − λ

)

= limN→∞

∫ N

−NdEλf = f. (6.39)

Here we have used the formula

− 1

2πi

γN

t− λ=

1, −N < t < N,

0, t > N or t < −N.(6.40)

Using (6.31), (6.33), (6.34) and (6.37) one obtains

f =

N−∑

j=1

fjuj(x) +1

π

∫ ∞

0

ImRλ+i0fdλ, (6.41)

where

fj = (f, uj), 1 ≤ j ≤ N− (6.42)

and the sum in (6.41) is the term

j

Pjf. (6.43)

Let λ = k2 in (6.41). Then

J :=1

π

∫ ∞

0

ImRλ+i0fdλ =2

π

∫ ∞

0

(∫ImG(x, y, k)f(y)dy

)kdk. (6.44)

We wish to show that the term (6.44) is equal to the integral in (6.25).

This can be done by expressing ImG(x, y, k) via the scattering solutions.

Green’s formula yields:

G(x, y, k) − G(x, y, k) =

|s|=r

[G(s, y)

∂G(x, s)

∂|s| − G(s, y)∂G(x, s)

∂|s|

]ds.

(6.45)


Take r → ∞ and use (6.29) to get

2iImG(x, y, k) = limr→∞

|s|=rg(r)u(y,−θ, k)ikgu(x,−θ, k)

+ g(r)u(y,−θ, k)ikgu(x,−θ, k)

=2ik

(4π)2

S2

u(x, θ, k)u(y, θ, k)dθ. (6.46)

Thus

2

πImG(x, y, k) =

k

(2π)3

S2

u(x, θ, k)u(y, θ, k)dθ. (6.47)

Substitute (6.47) into (6.44) to get

J =1

(2π)3/2

∫ ∞

0

S2

f(ξ)u(x, θ, k)k2dk =1

(2π)3/2

∫f (ξ)u(x, ξ)dξ.

(6.48)

Here ξ = kθ, dξ = k2dkdθ, f(ξ) is given by (6.21). From (6.41), (6.44), and

(6.48) formula (6.25) follows. Lemma 6.5 is proved.

Remark 6.1 Let us give a discussion of the passage from (6.44) and

(6.47) to (6.48). First note that our argument yields Parseval’s equality:

(f, h) = (f (ξ), h(ξ)) +∑

j

fjhj (6.49)

and the formula for the kernel of the operator dEλdλ , where the derivative is

understood in the weak sense

dEλ(x, y)

dλ=

1

πImG(x, y,

√λ)

=

√λ

16π3

S2

u(x, θ,√x)u(y, θ,

√λ)dθ, λ > 0. (6.50)

To check (6.49) one writes

(f, h) =∑

j

fjhj +

(∫ ∞

0

dEλf,

∫ ∞

0

dEµh

)=∑

j

fjhj +

∫ ∞

0

d(Eλf, h)

where we have used the orthogonality of the spectral family: E(∆)E(∆′) =

E(∆ ∩ ∆′). Furthermore, using (6.50) one obtains

∫ ∞

0

d(Eλf, g) =

∫f(ξ)h(ξ)dξ.


The last two formulas yield (6.49). The passage from (6.44) and (6.47)

to (6.48) is clear if f ∈ L2(R3) ∩ L1(R3): in this case the integral

f (ξ) :=∫f(y)u(y, ξ)dy converges absolutely. If f ∈ L2(R3) then one

can establish formula (6.25) by a limiting argument. Namely, let F be

the operator of the Fourier transform, Ff =[f (ξ), fj

]:= [Fcf,Fdf ]

where the brackets indicate that the Fourier transform is a set of the co-

efficients fj , corresponding to the discrete spectrum of `q and the function

f (ξ) corresponding to the continuous spectrum of `q . The operator F is

isometric by (6.49), and if it is defined originally on a dense in L2(R3) set

L2(R3)∩L1(R3) it can be uniquely extended by continuity on all of L2(R3).

Formula (6.21) is therefore well defined for f ∈ L2(R3). If formula (6.25)

is proved for f ∈ L2(R3) ∩ L1(R3), it remains valid for any f ∈ L2(R3)

because the inverse of F is also an isometry from RanF onto L2(R3). Let

us note finally that RanFC = L2(R3), where FCf := f (ξ), and FC is an

isometry from L2(R3) onto L2(R3), F∗CFCf = f−E0f , FCF∗

C f = f . Here

E0f is the projection of f onto the linear span of the eigenfunctions of `q,

E0f =∑j Pjf . This follows from the formula N (F∗

C) = 0, the proof of

which is the same as in [Ramm (1986), p. 51].

6.1.3 Properties of the scattering amplitude

Let us now formulate some properties of the scattering amplitudeA(θ′, θ, k).

Lemma 6.6 If q ∈ Q then the scattering amplitude has the properties

A(θ′, θ,−k) = A(θ′, θ, k), k > 0 (reality) (6.51)

A(θ′, θ, k) = A(−θ,−θ′, k), (reciprocity) (6.52)

A(θ′, θ, k)− A(θ, θ′, k)

2i=

k

S2

A(θ′, α, k)A(θ, α, k)dα (unitarity).

(6.53)

In particular, if θ′ = θ in (6.53) then one obtains the identity

ImA(θ, θ, k) =k

S2

|A(θ, α, k)|2 dα (optical theorem). (6.54)

Proof. 1) Equation (6.51) follows from the real-valuedness of q(x). In-

deed, u(x, θ,−k) and u(x, θ, k), k > 0, solve the same integral equation


(6.13). Since this integral equation has at most one solution, it follows that

u(x, θ, k) = u(x, θ,−k), k > 0. (6.55)

Equation (6.51) follows from (55) immediately.

2) The proof of (6.52)-(6.54)is somewhat longer and since it can be

found in [Ramm (1975), p. 54-56] we refer the reader to this book.

Let us define the S-matrix

S = I − k

2πiA (6.56)

where S : L2(S2) → L2(S2) is considered to be an operator on L2(S2) with

the kernel

S(θ′, θ, k) = δ(θ − θ′) − k

2πiA(θ′, θ, k). (6.57)

The unitarity of S

S∗S = I (6.58)

implies

A− A∗

2i=

k

4πA∗A (6.59)

which is (6.53) in the operator notation.

6.1.4 Analyticity in k of the scattering solution

Define

φ := exp(−ikθ · x)u(x, θ, k). (6.60)

Then φ solves the equation

[I + Tθ(k)]φ = 1 (6.61)

where

Tθ(k)φ :=

∫exp [ik|x− y| − ikθ · (x− y)]

4π|x− y| q(y)φ(y)dy. (6.62)

The operator Tθ(k) : C(R3) → C(R3) is compact and continuous (in the

norm of operators) in the parameter k ∈ C+ := k : Imk ≥ 0. If q ∈ Q,

the operator Tθ(k) is analytic in k in C+ since |x− y|− θ · (x− y) ≥ 0. The

operator I + Tθ(k) is invertible for some k ∈ C+, for example, for k ∈ C+


sufficiently close to the positive real semiaxis, or for k = a + ib, where a

and b are real numbers and b > 0 is sufficiently large. Indeed under the

last assumption the norm of the operator Tθ(k) is less than one. Therefore

by the well-known result, the analytic Fredholm’s theorem, one concludes

that [I+Tθ(k)]−1 is a meromorphic in C+ operator function on C(R3) (see

[Ramm (1975), p. 57]). The poles of this function occur at the values kjat which the operator I + Tθ(kj) is not invertible. These values are

kj = i√

|λj| (6.63)

where λj are the eigenvalues of the operator `q, and, possibly, the value

k = 0. Indeed, if

[I + Tθ(kj)] v = 0, v ∈ C(R3), kj ∈ C+ (6.64)

then the function

w := exp(ikθ · x)v, v ∈ C(R3) (6.65)

solves the equation

w = −Tw (6.66)

where T is defined in (6.16). It follows from (6.65) and (6.66) that w =

O(|x|−1). This and equation (6.66) imply that

(`q − k2)w = 0 (6.67)

and

w ∈ L2(R3). (6.68)

Equation (6.67) follows from (6.66) immediately. Equation (6.68) can be

easily checked if k ∈ C+, that is, if k = a+ ib, b > 0, a is real. Indeed, use

(6.66) , the assumption q ∈ Q, which implies that q ∈ L2(R3) ∩ L1(R3),

and boundedness of w to get:

‖ w ‖2L2(R3) ≤

∫dx

∣∣∣∣∫

exp(−b|x− y|)4π|x− y| |q||w|dy

∣∣∣∣2

≤ c

∫dx

∫exp(−2b|x− y|)

|x− y|2 |q(y)|dy ≤ c1. (6.69)

Since the operator `q = −∆+q(x) is selfadjoint equations (6.67) and (6.68)

imply w = 0 provided that k2 is not real. Since k ∈ C+, the number k2 is

real if and only if k = i√

|λ|, k2 = −|λ| < 0. Equations (6.67) and (6.68)


with k2 = −|λ| imply that λ = λj is an eigenvalue of `q . Therefore, the

only points at which the operator [I + Tθ(k)]−1 has poles in C+ are the

points (6.63) and, possibly, the point k = 0. One can prove [R, 28h,i)] that

if q ∈ Q and `q ≥ 0 the number λ = 0 is not an eigenvalue of `q . However,

even if `q ≥ 0 the point λ = 0 may be a resonance (half-bound state) for

`q . This means that the equation

∆u = qu, u ∈ C(R3) and u 6∈ L2(R3) (6.70)

may have a nontrivial solution which is not in L2(R3). In this case the

operator I+Tθ(0) is not invertible. Even if q(x) ∈ C∞0 the operator `q may

have a resonance at λ = 0. Even if `q ≥ 0 and q is compactly supported

and locally integrable the operator `q may have a resonance at λ = 0.

Example 6.1 Let B =x : |x| ≤ 1, x ∈ R3

. Let u = |x|−1 for |x| ≥ 1.

Extend u inside B as a C∞ real-valued function such that u(x) ≥ δ > 0 in

B. This is possible since u = 1 on ∂B. Define

q(x) :=∆u

u. (6.71)

Then q ∈ C∞0 , q = 0 for |x| ≥ 1, q is real-valued, u 6∈ L2(R3), and the

desired example is constructed. This argument does not necessarily lead

to a nonnegative `q. In order to get `q ≥ 0 one needs an extra argument

given in [Ramm (1987)]. Let us give a variant of this argument. The

inequality `q ≥ 0 holds if and only if (∗)∫ [

|∇φ|2 + q(x)|φ|2]dx ≥ 0 for

all φ ∈ C∞0 (R3). It is known that

∫|∇φ|2dx ≥

∫(4r2)−1|φ|2dx for all

φ ∈ C∞0 (R3), r := |x|. Therefore (∗) holds if (∗∗) (4r2)−1 + q ≥ 0. Choose

u = rγ−1(1 + γ − γr), where γ > 0 is a sufficiently small number. Then q,

defined by (71), satisfies (∗∗) as one can easily check. This q is integrable

and `q ≥ 0. The function u = r−1 for r ≥ 1 and u = rγ−1(1+γ−γr) solves

the equation `qu = 0 in R3, u 6∈ L2(R3), `q ≥ 0, q = 0 for r ≥ 1, and q is

locally integrable.
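The inequality (∗∗) for the choice u = r^{γ−1}(1 + γ − γr) is indeed easy to check; the short sketch below (added here) does it numerically for one admissible value of γ (the value γ = 0.1 and the grid are illustrative assumptions).

```python
# Check of (**) for Example 6.1: with u = r^(gamma-1) * (1 + gamma - gamma*r)
# on 0 < r < 1, q = (Delta u)/u, and one needs (4 r^2)^{-1} + q >= 0.
# For a radial function, Delta u = u'' + (2/r) u'.  gamma = 0.1 is illustrative.
import numpy as np

gamma = 0.1
r = np.linspace(1e-4, 1.0, 100000)

u   = r**(gamma - 1) * (1 + gamma - gamma * r)
up  = (gamma - 1) * (1 + gamma) * r**(gamma - 2) - gamma**2 * r**(gamma - 1)      # u'
upp = (gamma - 1) * (gamma - 2) * (1 + gamma) * r**(gamma - 3) \
      - gamma**2 * (gamma - 1) * r**(gamma - 2)                                   # u''
q = (upp + 2 * up / r) / u                                                        # q = Delta u / u

print(np.min(1.0 / (4 * r**2) + q))   # should be nonnegative for small gamma > 0
```

For γ = 0.1 the printed minimum is strictly positive, so (∗∗), and hence ℓ_q ≥ 0, holds for this choice.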

Exercise: Prove that the numbers (6.63) are simple poles of [I+Tθ(k)]−1.

6.1.5 High-frequency behavior of the scattering solutions

Assume now that q ∈ Q1. Then the function φ defined in (6.60) can be

written as

φ = 1 +1

2ik

∫ ∞

0

q(x− rθ)dr + o

(1

k

), k → +∞. (6.72)


If q ∈ Qm, m > 1, more terms in the asymptotic expansion of φ as k → ∞can be written [Skriganov (1978)]. Formula (6.72) is well known and can

be derived as follows.

Proof of Formula (6.72)

Step 1: Note that

maxθ∈S2

‖ T 2θ (k) ‖→ 0 as k → +∞. (6.73)

This can be proved, as in [Ramm (1986), p. 390], by writing the kernel

Bθ(x, y, k) of T 2θ :

Bθ(x, y, k) =

∫exp [ik(|x− z| + |z − y|)]

16π2|x− z||z − y| q(z)dzq(y) exp [ikθ · (x− y)] .

(6.74)

Introduce the coordinates s, t, ψ defined by the formulas

z1 = `st +x1 + y1

2, z2 = `

√(s2 − 1)(1 − t2) cosψ +

x2 + y22

z3 = `√

(s2 − 1)(1 − t2) sinψ +x3 + y3

2, (6.75)

where

` = |x−y|/2, |x−z|+|z−y| = 2`s, |x−y|−|z−y| = 2`t, J = `3(s2−t2),(6.76)

and J is the Jacobian of the transformation

(z1, z2, z3) → (s, t, ψ), 1 ≤ s < ∞, −1 ≤ t ≤ 1, 0 ≤ ψ < 2π.

In the new coordinates one obtains

|Bθ(x, y, k)| ≤1

16π2|q(y)|`

∣∣∣∣∫ ∞

1

exp(2ik`s)p(s)ds

∣∣∣∣ , (6.77)

where

p(s) :=

∫ 2π

0

∫ 1

−1

dtq1(s, t, ψ)

and q1(s, t, ψ) is q(z) in the new coordinates given by formula (6.75). One

can choose a sufficiently large number N > 0 such tht

|x| > N or |y| > N implies supθ∈S2,k>0

|Bθ(x, y, k)| < ε(N ), (6.78)


where ε(N ) → 0 as N → ∞.

If N is fixed, then for |x| ≤ N and |y| ≤ N it follows from (77) that

|Bθ| → 0 as k → +∞ (6.79)

since p(s) ∈ L1(1,∞).

This proves (6.73). Note that in this argument it is sufficient to assume

q ∈ Q.

Step 2: If (6.73) holds, one can write

φ = 1 +

∞∑

j=1

(−1)jT jθ (k)1, (6.80)

where the series in (6.80) converges in the norm of C(R3) if k is sufficiently

large so that ‖ T 2θ ‖< 1. Note that if ‖ T ‖< 1 then

(I + T )−1 =

∞∑

j=0

(−1)jT j (6.81)

and the series converges in the norm of operators. If ‖ T 2 ‖< 1, formula

(6.81) remains valid. Indeed

∞∑

j=0

(−1)jT j =

∞∑

j=0

(−1)2jT 2j + T

∞∑

j=0

(−1)2j+1T 2j

= (I − T 2)−1 − T (I − T 2)−1

= (I − T )(I − T 2)−1 = (I + T )−1. (6.82)

In fact it is known that the series (6.81) converges and formula (6.81) holds

if ‖ Tm ‖< 1 for some integer m ≥ 1.

As k → ∞, each term in (6.80) has a higher order of smallness than the

previous one. Therefore it is sufficient to consider the first term in the sum

(6.80) and to check that

−Tθ(k)1 =1

2ik

∫ ∞

0

q(x− rθ)dr + o

(1

k

), k → +∞ (6.83)


in order to prove (6.72).

Step 3: Let us check (6.83). One has

Tθ(k)1 =

∫exp ik [|x− y| − θ · (x− y)]

4π|x− y| q(y)dy

=

∫ ∞

0

drr2 exp(ikr)

4πr

S2

exp(ikrθ · α)q(x+ rα)dα, (6.84)

where we set y = x + z, z = rα, α ∈ S2. Use formula [Ramm (1986)], p.

55:

∫S2 exp(ikrθ · α)f(α)dα

= 2πi[

exp(−ikr)kr f(−θ) − exp(ikr)

kr f(θ)]

+ o(

1k

), as k → ∞ (6.85)

which holds if f ∈ C1(S2).

From (6.84) and (6.85) one obtains

Tθ(k)1 = − 1

2ik

∫ ∞

0

drq(x− rθ) + o

(1

k

), k → +∞ (6.86)

which is equivalent to (6.83). Formula (6.72) is proved.

It follows from (6.72) that

u(x, θ, k) = exp(ikθ · x)[1 +

1

2ik

∫ ∞

0

q(x− rθ)

]+ o

(1

k

), k → +∞

(6.87)

provided that q ∈ Q1.

Another formula, which follows from (6.72), is

θ · ∇x limk→+∞

2ik [φ(x, θ, k)− 1] = θ · ∇x

∫ ∞

0

q(x− rθ)dr

= −∫ ∞

0

∂q

∂rdr = q(x)

or

q(x) = θ · ∇x limk→+∞

2ik [φ(x, θ, k)− 1] . (6.88)

Note that the left side does not depend on θ, so that (6.88) is a compatibility

condition on the function φ(x, θ, k).


From (6.87) and (6.15) it follows that

A(θ′, θ, k) = − 1

∫exp [ik(θ − θ′) · x] q(x)dx+ 0

(1

k

), k → +∞

(6.89)

provided that q ∈ Q1. In particular,

A(θ, θ, k) = − 1

∫q(x)dx+ 0

(1

k

), k → +∞. (6.90)

6.1.6 Fundamental relation between u+ and u−

If one defines

u+ := u(x, theta, k), u− := u(x,−θ,−k) (6.91)

then one can prove that

u+ = Su− := u− +ik

S2

A(θ′, θ, k)u−(x, θ′, k)dθ′. (6.92)

Let us derive (6.92). We start with the equations

u+ = u0 −G+qu0, u0 := exp(ikθ · x), (6.93)

u− = u0 −G−qu0, (6.94)

where G+ = G, where G is defined by the equation (6.27), and

G− := G. (6.95)

Equations (6.93) and (6.94) one can easily check by applying the operator

`q − k2 to these equations. Subtract (6.94) from (6.93) and use (6.95) to

get

u+ − u− = −2iImG+qu0

=ik

S2

dθ′u−(x, θ′, k)

− 1

∫u−(y, θ′, k)q(y)u0(y, θ, k)

=ik

S2

A(θ′, θ, k)u−(x, θ′, k)dθ′. (6.96)

The last equality in (6.96) follows from the definition (91), properties (6.51)

and (6.52) of the scattering amplitude and the formula

ImG+(x, y, k) =k

16π2

S2

u−(x, θ′, k)u−(y, θ′, k)dθ′ (6.97)


which is similar to (6.46) and which follows from (6.46) and (6.91). Let us

derive (6.97). Note that, by formulas (6.46) and (6.91), one has

ImG+(x, y, k) = ImG+(x, y, k), u±(x, θ,−k) = u±(x, θ, k), (6.98)

ImG+(x, y, k) =k

16π2

S2

u(x, θ, k)u(y, θ, k)dθ

=k

16π2

S2

u−(x,−θ,−k)u−(y,−θ,−k)dθ

=k

16π2

S2

u−(x, θ′, k)u−(y, θ′, k)dθ′

=k

16π2

S2

u−(x, θ′, k)u−(y, θ′, k)dθ′. (6.99)

Here we used (6.98). Thus, formula (6.97) is obtained.

Note that

−4πA(θ′, θ, k) =

∫u0(y,−θ′, k)q(y)u(y, θ, k)dy (6.100)

and

−4πA(−θ,−θ′ , k) =

∫u0(y, θ, k)q(y)u(y,−θ′ , k)dy

=

∫u0(y, θ, k)q(y)u−(y, θ′, k). (6.101)

From formula (6.52) it follows that the right sides of (6.100) and (6.101)

are equal. This explains the last equation (6.96).

6.1.7 Formula for detS(k) and the Levinson Theorem

If q(x) = q(x) is decaying sufficiently fast, (for example, if (1 + |x|)q ∈ Q,

x ∈ R3) then the operator A : L2(S2) → L2(S2) with kernel A(θ′, θ, k),k > 0, is in the trace class and

det S(k) = det

(I +

ik

2πA

)= exp

[− ik

∫q(x)dx

]d(−k)d(k)

, k > 0

(6.102)

where

d(k) := det2

(I + T (k)) . (6.103)


The operator T (k) in (6.103) is defined in (6.16) and the symbol det2(I+T )

is defined in Definition 8.7 p. 300. If k > 0 and q = q, then d(−k) = d(k),

where the bar stands for complex conjugate. Therefore

det S(k) = exp [2iδ(k)] , (6.104)

where

δ(k) = − k

∫q(x)dx− β(k), β(k) := argd(k). (6.105)

The Levinson Theorem says that

δ(0) = π(m +

ν

2

), (6.106)

where m is the number of the bound states counting with their multiplic-

ities, in other words m is the dimension of the subspace spanned by the

eigenfunctions of `q corresponding to all of its negative eigenvalues, and

ν = 1 if k = 0 is a resonance and ν = 0 otherwise, that is, if I + T (0) is

invertible. It is assumed that δ(k) is normalized in such a way that

limk→∞

[δ(k) +

k

∫q(x)dx

]= 0 (6.107)

or, according to (6.105), that

limk→∞

β(k) = 0. (6.108)

Formula (6.106) follows from (6.105) and the argument principle applied

to d(k).

Formula (6.102) can be derived as follows:

d(−k) := det2

(I + T (−k))

= det2

[I + T (k)]

[I + (I + T (k))

−1(T (−k) − T (k))

]

= det2

[I + T (k)] det[I + (I + T (k))

−1(T (−k) − T (k))

exp −Tr [T (−k) − T (k)] . (6.109)

Here we have used formula (12) which precedes Definition 8.8 on p. 300.

The operator T (−k) − T (k) has the kernel − 2i sin(k|x−y|)4π|x−y| q(y), so that its

trace is

Tr [T (−k) − T (k)] =

∫−2ik

4πq(y)dy = − ik

∫q(y)dy. (6.110)


Therefore formula (109) can be written as

d(−k)d(k)

exp

− ik

∫q(x)dx

= det

[I + (I + T (k))

−1(T (−k) − T (k))

].

(6.111)

Finally one proves that

det[I + (I + T (k))

−1(T (−k) − T (k))

]= det

(I +

ik

2πA

). (6.112)

Formulas (6.111) and (6.112) imply (6.102). Let us prove (6.112). Let

(I + T (k))−1 := B. Then

τ := (I + T (k))−1

(T (−k) − T (k)) = B [g(−k) − g(k)] q. (6.113)

One has

g(−k) − g(k) = −2ik

sin(k|x− y|)k|x− y| = − ik

1

S2

exp ikθ · (x− y) dθ.(6.114)

Furthermore, if u0 := exp(ikθ · x) then

Bu0 = u(x, θ, k), (6.115)

where u(x, θ, k) is the scattering solution (6.1)-(6.2). Therefore the right-

hand side of (6.113) is the operator in L2(R3) with the kernel

τ (x, y) := − ik

8π2

S2

dθ exp(−ikθ · y)u(x, θ, k)q(y). (6.116)

Note that

Trτ =

∫τ (x, x)dx =

ik

S2

A(θ, θ, k)dθ = Tr

(ik

2πA

). (6.117)

From formula 8) of Section 8.3.3 it follows that (6.112) is valid provided

that

Trτ j = Tr

(ik

2πA

)j, j = 1, 2, 3 . . . . (6.118)

One can check (6.118) as we checked (6.117). Thus, formula (6.102) is

derived.

Page 144: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 131

6.1.8 Completeness properties of the scattering solutions

Theorem 6.1 Let h(θ) ∈ L2(S2) and assume that∫

S2

h(θ)u(x, θ, k)dθ = 0 ∀x ∈ ΩR := x : |x| > R , (6.119)

where k > 0 is fixed, x ∈ R3. It is assumed that q ∈ Q. Then h(θ) = 0. The

same conclusion holds if one replaces u(x, θ, k) by u(x,−θ,−k) in (119) and

if x ∈ Rr, r ≥ 2.

Proof. The proof consists of two steps.

Step 1. The conclusion of Theorem 6.1 holds if u(x, θ, k) is replaced by

u0(x, θ, k) := exp(ikθ · x) in (6.119). Indeed, if∫

S2

h(θ) exp(ikθ · x)dθ = 0 ∀x ∈ ΩR (6.120)

and a fixed k > 0, then the Fourier transform of the distribution h(θ)δS2

vanishes for all sufficiently large x. The distribution h(θ)δS2 is defined by

the formula∫φ(y)δS2dy =

S2

φ(θ)h(θ)dθ ∀φ ∈ C∞0 (R3). (6.121)

Since h(θ)δS2 has compact support, its Fourier transform is an entire func-

tion of x. If this entire function vanishes for all sufficiently large x ∈ 3, it

vanishes identically. Therefore h(θ) = 0.

Step 2. If (6.119) holds then (6.120) holds. Therefore, by Step 1, h(θ) =

0. In order to prove that (6.119) implies (6.120) let us note that

u0(x, θ, k) = (I + T (k))u, (6.122)

where T (k) is defined by (6.16). For every k > 0 the operator I + T (k)

is an isomorphism of C(R3) onto C(R3). Applying the operator I + T (k)

to (6.119) and using (6.122) one obtains (6.120). Note that the operator

I + T (k) acts on u(x, θ, k) which is considered as a function of x while θ

and k are parameters. Theorem 6.1 is proved.

Theorem 6.1 is used in [R 26)] for a characterization of the scattering

data which we give in Section sscattering2.5.

Another completeness property of the scattering solution can be formu-

lated. Let

ND(`q) :=w : w ∈ H2(D), `qw = 0 in D

, (6.123)

Page 145: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

132 Random Fields Estimation Theory

where D ⊂ R3 is a bounded domain with a sufficiently smooth boundary

Γ, for example Γ ∈ C1,α, α > 0, suffices.

Theorem 6.2 Let q ∈ Q, where Q is defined in (6.3). The closure

in L2(D) (and in H1(D)) of the linear span of the scattering solutions

u(x, θ, k) ∀θ ∈ S2 and any fixed k > 0 contains ND(`q − k2).

Proof. We first prove the statement concerning the L2(D) closure. Let

f ∈ ND(`q − k2) and assume that

D

fu(x, θ, k)dx = 0 ∀θ ∈ S2. (6.124)

Define

v(x) :=

D

G(x, y, k)f(y)dy, (6.125)

where G is uniquely defined by equation (6.27). Use (6.29) and (123) to

conclude that

v(x) = O(|x|−2

)as |x| → ∞. (6.126)

Since

(`q − k2)v = 0 in Ω := R3 \D (6.127)

and (6.126) holds, one concludes applying Kato’s theorem ( see [Kato

(1959)]) that v = 0 in Ω (see the end of the proof of Lemma 6.4 in subsection

6.1.2). In particular,

v = vN = 0 on Γ, (6.128)

where vN is the normal derivative of v on Γ. It follows from (124) that

(`q − k2)v = −f in D. (6.129)

Since

(`q − k2)f = 0 in D (6.130)

Page 146: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 133

by the assumption, one can multiply (128) by f , integrate over D and get

−∫

D

|f |2dy =

D

f (`q − k2)vdy

=

D

(`q − k2)fvdx +

Γ

(fvN − fNv)ds

=

D

(`q − k2f)vdx = 0. (6.131)

Here we have used (6.128) and the real-valuedness of the potential. It

follows from (6.131) that f = 0. The first statement of Theorem 6.2 is

proved.

In order to prove the second statement which deals with completeness

in H1(D), one assumes

D

(fu + ∇f · ∇u)dx = 0 ∀θ ∈ S2 (6.132)

and some f ∈ ND(`q − k2). Integrate (131) by parts to get

D

(−∆f + f)udx+

Γ

fNuds = 0, ∀θ ∈ S2. (6.133)

Define

v :=

D

(−∆f + f)G(x, y, k)dy +

Γ

G(x, s, k)fNds. (6.134)

Argue as above to conclude that v = 0 in Ω and

v = v−N = 0 on Γ, (6.135)

where v−N is the limit value of vN on Γ from Ω. By the jump formula for the

normal derivative of the single-layer potential (see, e.g., [Ramm (1986)], p.

14) one has

v+N − v−N = fN . (6.136)

Since v−N = 0 it follows that v+N = fN on Γ. Thus

v = 0, v+N = fN on Γ, (6.137)

and

(`q − k2)v = ∆f − f in D. (6.138)

Page 147: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

134 Random Fields Estimation Theory

Multiply (6.138) by f , integrate over D and then by parts to get

−∫ [

|∇f |2 + |f |2]dx+

Γ

ffNds =

D

(`q−k2)fvdx+

Γ

(fv+

N − fN v)ds.

(6.139)

From (6.130), (6.137) and (6.139) it follows that

D

(|∇f |2 + |f |2

)dx = 0. (6.140)

Thus, f = 0. Theorem 6.2 is proved.

6.2 Inverse scattering problems

6.2.1 Inverse scattering problems

The inverse scattering problem consists of finding q(x) given A(θ′, θ, k).One should specify for which values of θ′, θ and k the scattering amplitude

is given.

Problem 1 A(θ′, θ, k) is given for all θ′, θ ∈ S2 and all k > 0. Find q(x).

Problem 2 A(θ′, θ, k) is given for all θ′, θ ∈ S2 and a fixed k > 0.

Problem 3 A(θ′, θ, k) is given for a fixed θ ∈ S2 and all θ′ ∈ S2 and all

k > 0.

Problem 1 has been studied much. We will mention some of the results

relevant to estimation theory.

Problem 2 has been solved recently [R 29)] but we do not describe the

results since they are not connected with the estimation theory

Problem 3 is open, but a partial result is given in [R 29d)].

6.2.2 Uniqueness theorem for the inverse scattering prob-

lem

The uniqueness of the solution to Problem 1 follows immediately from for-

mula (1.89). Indeed, if A(θ′, θ, k) is known for all θ′, θ ∈ S2 and all k > 0,

then take an arbitrary ξ ∈ R3, an arbitrary sequence kn → +∞, and find

Page 148: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 135

a sequence θn, θ′n ∈ S2 such that

limn→∞

(θn − θ′n)kn = ξ, kn → +∞. (6.141)

This is clearly possible. Pass to the limit kn → ∞ in (6.89) to get

−4π limkn→∞

kn(θn − θ′n) = ξA(θ′n, θn, kn) =

∫exp(iξ · x)q(x)dx. (6.142)

Therefore the Fourier transform of q is uniquely determined. Thus q is

uniquely determined. We have proved

Lemma 6.7 If q ∈ Q1 then the knowledge of A(θ′, θ, k) on S2×S2 ×R+,

R+ := (0,∞), determines q(x) uniquely.

In fact, our proof shows that it suffices to have the knowledge of A for

an arbitrary sequence kn → ∞ and for some θ′ and θ such that for any

ξ ∈ R3 one can choose θ′n and θn such that (6.1) holds.

The reconstruction of q(x) from the scattering data via formula (6.142)

requires the knowledge of the high frequency data. These data are not

easy to collect in the quantum mechanics problems, and for very high en-

ergies the Schrodinger equation is no longer a good model for the physical

processes.

Therefore much effort was spent in order to find a solution to Prob-

lem 1 which uses all of the scattering data; to find necessary and sufficient

conditions for a function A(θ′, θ, k) to be the scattering amplitude for a

potential q from a certain class, e.g. for q ∈ Qm, this is called a char-

acterization problem; and to give a stable reconstruction of q given noisy

data.

6.2.3 Necessary conditions for a function to be a scatterng

amplitude

A number of necessary conditions for A(θ′, θ, k) to be the scattering ampli-

tude corresponding to q ∈ Q1 follow from the results of Section 6.1 of this

scatteringtheory. Let us list some of these necessary conditions:

1) reality, reciprocity and unitarity: that is, formulas (6.51)-(6.54)

2) high-frequency behavior: formulas (6.89), (6.90), (6.142).

Other necessary conditions will be mentioned later (see formulas (6.158)

and (6.159) below). Some necessary and sufficient conditions for A(θ′, θ, k)to be the scattering amplitude for a q ∈ Q1 are given first in [R 26), 27)].

Page 149: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

136 Random Fields Estimation Theory

These conditions can not be checked algorithmically: they are formulated in

terms of the properties of the solutions to certain integral equations whose

kernel is the given function A(θ′, θ, k).

6.2.4 A Marchenko equation (M equation)

Define

η(x, θ, α) :=1

∫ ∞

−∞[φ(x, θ, k) − 1] exp(−ikα)dk, (6.143)

where φ is defined by (6.60) and has property (6.72) as k → ∞, provided

that q ∈ Q1.

For simplicity we assume that `q has no bound states. (6.144)

Under this assumption φ is analytic in C+ and continuous in C+ \ 0. Let

us assume that k = 0 is not an exceptional point, that is, φ is continuous

in C+.

Start with equation (6.92) which we rewrite as

φ(x, θ, k) = φ(x,−θ,−k) +ik

2π×

S2

A(θ′, θ, k) exp [ik(θ′ − θ) · x]φ(x,−θ′,−k)dθ (6.145)

or

φ(x, θ, k)− 1 = φ(x,−θ,−k) − 1

+ik

S2

A(θ′, θ, k) exp [ik(θ′ − θ) · x] [φ(x,−θ′,−k) − 1]dθ′

+ik

S2

A(θ′, θ, k) exp [ik(θ′ − θ) · x]dθ′. (6.146)

Take the Fourier transform of (6.146) and use (6.143) to get

η(x, θ, α) = η(x,−θ,−α) +

∫ ∞

−∞

S2

B(α− β)η(x,−θ′,−β)dθ′dβ + η0.

(6.147)

Here

η0 :=

∫ ∞

−∞exp(ikα)

ik

S2

A(θ′, θ, k) exp [ik(θ′ − θ) · x]dθ′dk (6.148)

Page 150: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 137

and the integral term in (6.147) is

12π

∫∞−∞ dk exp(−ikα) ik2π

∫S2 A(θ′, θ, k) exp [ik(θ′ − θ) · x]×

[φ(x,−θ′,−k) − 1]dθ′

=∫∞−∞ dβ

∫S2 dθ

′B(α − β, θ′, θ, x)η(x,−θ′,−β)dθ′, (6.149)

where

B(α, θ′, θ, x) :=1

∫ ∞

−∞dk exp(−ikα)

ik

2πA(θ′, θ, k) exp [ik(θ′ − θ) · x] .

(6.150)

The Fourier transform in (6.150) is understood in the sense of distributions.

Under the assumption (4), the analyticity of φ(x, θ, k) in k in the region

C+ and the decay of φ as |k| → ∞, k ∈ C+, which follows from (6.72)

imply that

η(x, θ, α) = 0 for α < 0. (6.151)

Therefore the right hand side of (6.149) can be written as

∫ ∞

0

S2

dθ′B(α + β, θ′, θ, x)η(x,−θ′, β) :=

∫ ∞

0

B(α + β)η(β)dβ.

(6.152)

Equation (6.147) now takes the form of the Marchenko equation

η(x, θ, α) =

∫ ∞

0

B(α + β)η(β)dβ + η0, α > 0, (6.153)

where we took into account that

η(x,−θ,−α) = 0 for α > 0 (6.154)

according to (6.151). The function η0 in (6.153) is defined in (6.148), and

the integral operator in (6.153) is defined in (6.152). The kernel of the

operator in (6.153) is defined by (6.150) and is known if the scattering

amplitude is known. If A(θ′, θ, k) is the scattering amplitude corresponding

to a q ∈ Q01, where Q0

1 is the subset of Q1 which consists of the potentials

with no bound states, then equation (6.153) has a solution η with the

following properties: if one defines η for α < 0 by formula (6.151) then the

function

φ(x, θ, k) := 1 +

∫ ∞

0

dα exp(ikα)η(x, θ, α) (6.155)

Page 151: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

138 Random Fields Estimation Theory

solves the equation

∇2xφ+ 2ikθ · ∇xφ− q(x)φ = 0 (6.156)

and the function u := exp(ikθ · x)φ solves the Schrodinger equation

`qu = 0. (6.157)

In particular, the function

(∇2 + k2)u

u:= q(x) (6.158)

does not depend on θ (this is a compatibility condition).

Another compatibility condition gives formula (6.88). This formula can

be written as

q(x) = −2θ · ∇xη(x, θ,+0). (6.159)

Indeed, it follows from (6.155) that

limk→∞

2ik(φ− 1) = −2η(x, θ,+0). (6.160)

Formula (6.159) follows from (6.88) and (6.160).

The compatibility condition (6.159) and the Marchenko equation (6.153)

appeared in [Newton (1982)] where condition (6.159) was called the “mir-

acle” condition since the left side of (6.159) does not depend on θ). The

above derivation is from [Ramm (1992)].

6.2.5 Characterization of the scattering data in the 3D in-

verse scattering problem

Let us write A ∈ AQ if A := A(θ′, θ, k) is the scattering amplitude corre-

sponding to a potential q ∈ Q. Assuming that q ∈ Q, we have proved in

Section 6.1 that equation (6.92), which we rewrite as

v(x, θ, k) = v(x,−θ,−k) +ik

S2

A(θ′, θ, k)v(x,−θ′,−k)dθ′

+ik

S2

A(θ′, θ, k) exp(ikθ′ · x)dθ′ (6.161)

has a solution v for all x ∈ R3 and all k > 0, where

v := u(x, θ, k) − exp(ikθ · x) := u− u0. (6.162)

Page 152: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 139

This v has the following properties:

v = A(θ′, θ, k)g(r) + o(r−1), r = |x| → ∞, θ′ = xr−1 (6.163)

which is equation (6.2), and

(∆ + k2)(u0 + v)

u0 + v= q(x) ∈ Q (6.164)

which is equation (6.158).

These properties are necessary for A ∈ AQ. It turns out that they are

also sufficient for A ∈ AQ. Let us formulate the basic result (see [Ramm

(1992)]).

Theorem 6.3 For A ∈ AQ it is necessary and sufficient that equation

(6.161) has a solution v such that (6.164) holds and

v = Aq(θ′, θ, k)g(r) + o(r−1), r = |x| → ∞, xr−1 = θ′. (6.165)

The function Aq defined by (6.165) is equal to the function A(θ′, θ, k) which

is the given function, the kernel of equation (6.161), and it is equal to the

scattering amplitude corresponding to the function q(x) defined by (6.164).

There is at most one solution to equation (6.161) with properties (6.163),

(6.164) and (6.165).

Proof. We have already proved the necessity part. Let us prove the

sufficiency part. Let A(θ′θ, k) be a given function such that equation (6.161)

has a solution with properties (6.164) and (6.165). First, it follows that u

defined by the formula

u := exp(ikθ · x) + v (6.166)

is the scattering solution for the potential q(x) defined by formula (6.164).

Since the scattering solution is uniquely determined (see Lemma 6.4 in

section 6.1.2 2, §1) one concludes that the function Aq(θ′, θ, k) defined by

formula (6.165) is the scattering amplitude corresponding to the potential

q(x) defined by formula (6.164).

Secondly, let us prove that

Aq(θ′, θ, k) = A(θ′, θ, k) (6.167)

where A(θ′, θ, k) is the given function, the kernel of equation (6.161). Note

Page 153: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

140 Random Fields Estimation Theory

that we have proved in 6.1 that v satisfies the equation

v(x, θ, k) = v(x,−θ,−k) +ik

S2

A1(θ′, θ, k)v(x,−θ′,−k)dθ′

+ik

S2

Aq(θ′, θ, k) exp(ikθ′ · x)dθ′. (6.168)

This is equation (6.92) written in terms of v. Subtract (6.168) from (6.161)

to get

0 =

S2

[A(θ′, θ, k) − Aq(θ′, θ, k)]u(x,−θ′,−k)dθ′, ∀x ∈ R3. (6.169)

Equation (6.169) and Theorem 6.1 from Section 6.1.8, imply (6.167).

The last statement of Theorem 6.3 can be proved as follows. Suppose

there are two (or more) solutions vj , j = 1, 2, to equation (6.161) with

properties (6.164) and (6.165). Let qj(x) and Aj(θ′, θ, k), j = 1, 2 be the

corresponding potentials and scattering amplitudes. If q1 = q2 then v1 = v2by the uniqueness of the scattering solution (Lemma 6.4, p. 114). If q1 6≡ q2then w := v1 − v2 6≡ 0. The function w solves the equation

w(x, θ, k) = w(x,−θ,−k)+ ik

S2

A(θ′′, θ, k)w(x,−θ′′,−k)dθ′′, ∀x ∈ R3.

(6.170)

Note that

w(x, θ, k) = [A1(θ′, θ, k) − A2(θ

′, θ, k)] g(r)+ o(r−1), r → ∞, xr−1 = θ′

(6.171)

and

w(x,−θ,−k) = [A1(θ′,−θ,−k) − A2(θ

′,−θ,−k)] g(r) + o(r−1),

r → ∞, xr−1 = θ′ (6.172)

where g(r) := r−1 exp(ikr).

From (6.170), (6.171) and (6.172) it follows that

[A1(θ′, θ, k) − A2(θ

′, θ, k)] g(r) = B(θ′, θ, k)g(r) + o(r−1), r → ∞(6.173)

where the expression for B(θ′, θ, k) is not important for our argument. It

follows from (6.173) that

A1(θ′, θ, k) = A2(θ

′, θ, k) (6.174)

Page 154: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 141

so that B(θ′, θ, k) = 0). By Lemma 6.7, it follows that q1 = q2. Theorem

6.3 is proved.

Exercise. Prove that if ag(r) = bg(r) + o(r−1), r → ∞, k > 0, where a

and b do not depend on r then a = b = 0.

Hint: Write a exp(ikr) = b exp(−ikr) + o(1); choose rn = nπk

, n → ∞.

Derive that a = b. Then choose r′n =nπ+π

2

k , n → ∞. Derive −a = b. Thus

a = b = 0.

Another characterization of the class of scattering amplitudes is given

in [Ramm (1992)]. A characterization of the class of scattering amplitudes

at a fixed k > 0 is given in [Ramm (1988)].

6.2.6 The Born inversion

The scattering amplitude in the Born approximation is defined to be

AB(θ′, θ, k) := − 1

∫exp ik(θ − θ′) · x q(x)dx (6.175)

which is formula (6.15) with u(y, θ, k) substituted by u0(y, θ, k) := exp(ikθ ·y). The Born inversion is the inversion for q(x) of the equation

∫exp ik(θ − θ′) · x q(x)dx = −4πA(θ′, θ, k) (6.176)

which comes from setting

A(θ′, θ, k) = AB(θ′, θ, k). (6.177)

The first question is: does a q(x) ∈ Q exist such that (6.177) holds for

all θ′, θ ∈ S2 and all k > 0?

The answer is no, unless q(x) = 0 so that AB(θ′, θ, k) = A(θ′, θ, k) = 0.

Theorem 6.4 Assume that q ∈ Q. If (6.177) holds for all θ′, θ ∈ S2 and

all k > 0 then q(x) = 0.

Proof. Since q = q, it follows from (6.175) that

AB(θ, θ, k) − AB(θ, θ, k) = 0. (6.178)

Page 155: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

142 Random Fields Estimation Theory

From (6.178), (6.177) and (1.54) one concludes that

S2

|AB(θ, α, k)|2dα = 0 ∀θ ∈ S2 and all k > 0. (6.179)

Thus AB(θ, α, k) = 0 for all θ, α ∈ S2 and k > 0. This and (6.175) imply

that q(x) = 0. Theorem 6.4 is proved.

Remark 6.2 If q ∈ Q is compactly supported and (6.177) holds for all

θ′, θ ∈ S2 and a fixed k > 0 then q(x) = 0. This follows from the uniqueness

theorem proved in [Ramm (1992)].

It follows from Theorem 6.4 that the scattering amplitude A(θ′, θ, k)cannot be a function of p := k(θ − θ′) only. The Born inversion in practice

reduces to choosing a p ∈ R3, finding θ, θ′ ∈ S2 and k > 0 such that

p = k(θ − θ′), (6.180)

writing equation (6.176) as

q(p) :=

∫exp(ip · x)q(x)dx = −4π [A(p) + η] (6.181)

where

A(p) := − 1

∫q(x) exp(ip · x)dx = − 1

4πq(p) (6.182)

and η is defined as

η := A(θ′, θ, k)|k(θ−θ′)=p − A(p). (6.183)

One then wishes to neglect η and compute q(x) by the formula

q(x) =−4π

(2π)3

∫A(p) exp(−ip · x)dp. (6.184)

However, the data are the values A(p)+η or, if the measurements are noisy,

the values

A(p) + η + η1 := Bδ(p) (6.185)

where η1 is noise and δ > 0 is defined by formula (6.186) below.

The question is: assuming that δ > 0 is known such that

|η + η1| < δ (6.186)

Page 156: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 143

how does one compute qδ(x), such that

|qδ(x) − q(x)| ≤ ε(δ) → 0 as δ → 0. (6.187)

In other words, how does one compute a stable approximation of q(x)

given noisy values of A(p) as in (6.185)? This question is answered in[Ramm (1992)]. We present the answer here. Define

qδ(x) := −4π(4π)−3

|p|≤R(δ)

Bδ(p) exp(−ip · x)dp, R(δ) = c0δ− 1

2b

(6.188)

where the constants c0 > 0 and b > 32 will be specified below (see formulas

(6.191) and (6.190)).

Theorem 6.5 The following stability estimate holds:

|qδ(x) − q(x)| ≤ c1δ1− 3

2b (6.189)

provided that

|q(p)| ≤ c2(1 + |p|2)−b, b >3

2. (6.190)

The constants c0 and c1 are given by the formulas

c0 =( c2

) 12b

(6.191)

c1 =

[2

(1

) 32b

+1

2π2(2b− 3)(4π)3−2b2b

]c

32b

2 . (6.192)

Proof. Using (6.182) and (6.190), one obtains

|qδ(x) − q(x)| ≤ 4π(2π)−3 ×∣∣∣∣∣

|p|<RBδ(p) exp(−ip · x)dp−

∫exp(−ip · x)A(p)dp

∣∣∣∣∣

≤ (2π2)−1

4

3πR3δ + c2

∫ ∞

R

r2dr

(1 + r2)b

≤ 2

3πδR3 +

c22π2

R3−2b

2b− 3:= φ(δ,R). (6.193)

Page 157: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

144 Random Fields Estimation Theory

For a fixed δ > 0, minimize φ(δ,R) in R to get

φmin = c1δ1− 3

2b if R(δ) =( c2

4πδ

) 12b

(6.194)

where c1 is given by (52). Theorem 6.5 is proved.

The practical conclusions, which follow from Theorem 6.5, are:

1) The Born inversion needs a regularization. One way to use a regulariza-

tion is given by formula (6.188). If one would take the integral in (6.188)

over all of R3, or over too large a ball, the error of the Born inversion

might have been unlimited.

2) Even if the error η of the Born approximation for solving the direct scat-

tering problem is small, it does not imply that the error of the Born

inversion (that is the Born approximation for solving the inverse scatter-

ing problem) is small.

The second conclusion can be obtained in a different way, a more gen-

eral one. Let B(q) = A(∗), where B is a nonlinear map which sends a

potential q ∈ Q into a scattering amplitude A. The Born approximation is

a linearization of (∗). Let us write it as

B′(q0)(q − q0) = A− B(q0). (6.195)

The inverse of the operator B′(q0) is unbounded on the space of functions

with the sup norm. Therefore small in absolute value errors in the data

may lead to large errors in the solution q − q0. The Born approximation

is a linearization around q0 = 0. The distorted wave Born approximation

is a linearization around the reference potential q0. In both cases the basic

conclusion is the same: without regularization the Born inversion may lead

to large errors even if the perturbation q− q0 is small (in which case Born’s

approximation is accurate for solving the direct problem).

Let us discuss another way to recover q(x) fromA(θ′, θ, k) given for large

k. This way has a computational advantage of the following nature. One

does not need to find θ′, θ, and k such that (6.180) holds and one integrates

over S2 × S2 instead of R3 in order to recover q(x) stably from the given

noisy data Aδ(θ′, θ, k):

|Aδ(θ′, θ, k) −A(θ′, θ, k)| ≤ δ. (6.196)

Page 158: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 145

We start with the following known formula ([Saito (1982)])

limk→∞

k2

S2

S2

A(θ′, θ, k) exp ik(θ′ − θ) · xdθdθ′ = −2π

∫|x−y|−2q(y)dy.

(6.197)

To estimate the rate of convergence in (6.197) one substitutes for

A(θ′, θ, k) its expression (6.15) to get

J :=k2

−4π

∫dyq(y)

S2

exp ikθ′ · (x− y) dθ′∫

S2

exp −ikθ · (x− y) dθ +

S2

exp(−ikθ · x)(φ− 1)dθ

(6.198)

where φ is defined by (6.60). A simple calculation yields

S2

exp ikθ · (x− y) dθ = 4πsin(k|x− y|)k|x− y| . (6.199)

Thus

J = −4π

∫dyq(y)

sin2(k|x− y|)

|x− y|2 +sin(k|x− y|)

4π|x− y| ×∫

S2

exp(−ikθ · x) [φ(y, θ, k) − 1]dθ

:= J1 + J2. (6.200)

One has

J1 = −4π

∫dyq(y)

1 − cos(2k|x− y|)2|x− y|2

= −2π

∫q(y)|x − y|−2dy − 2π

∫ ∞

0

dr cos(2kr)Q(x, r) (6.201)

where

Q(x, r) :=

S2

q(x+ rα)dα, r = |y − x|. (6.202)

Let us assume that

q ∈ Q2 :=q : q = q, |q|+ |Dq| + |D2q| ∈ Q

. (6.203)

Page 159: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

146 Random Fields Estimation Theory

Then, integrating by parts, one obtains

∣∣∣∣∫ ∞

0

dr cos 2krQ(x, r)

∣∣∣∣ =∣∣∣∣−1

2k

∫ ∞

0

dr sin(2kr)∂Q

∂r

∣∣∣∣

=

∣∣∣∣1

4k2

[cos(2kr)

∂Q

∂r

∣∣∞0

−∫ ∞

0

dr cos(2kr)∂2Q

∂r2

]∣∣∣∣

≤ c

k2, k > 1. (6.204)

Here c = const > 0 does not depend on x:

c = maxx∈R3

r≥0

S2

|∇q(x+ rα)|dα+ maxx∈R3

∫ ∞

0

dr

∣∣∣∣∂2Q

∂r2

∣∣∣∣ , (6.205)

∣∣∣∣∂2Q

∂r2

∣∣∣∣ ≤ c1(1 + |x− r|)−a, a > 3 (6.206)

and∫ ∞

0

dr(1 + |x− r|)−a ≤∫ ∞

0

dr (1 + ||x| − r|)−a ≤∫ ∞

−∞dr (1 + ||x| − r|)−a

≤ 2

∫ ∞

0

dr(1 + r)−a ≤ c2 (6.207)

so that the right-hand side of (6.205) is bounded uniformly in x ∈ R3. Thus

J1 = −2π

∫|x− y|2q(y)dy + O(k−2), k → ∞ (6.208)

provided (6.203) holds.

Note that if q ∈ L2loc(R

3) and no a priori information about its smooth-

ness is known, then one obtains only o(1) in place of O(k−2) in (6.208).

From (1.72) it follows that

J2 ≤ ck−1, k > 1. (6.209)

Thus, assuming (6.203),

J = −2π

∫|x− y|−2q(y)dy +O(k−1), k → ∞ (6.210)

where O(k−1) is uniform in x.

Page 160: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 147

Therefore (6.197) can be written as

k2

S2

S2

A(θ′, θ, k) exp ik(θ′ − θ) · xdθdθ′

= −2π

∫|x− y|−2q(y)dy +O(k−1). (6.211)

The equation

−2π

∫|x− y|−2q(y)dy = f(x), x ∈ R3 (6.212)

is solvable analytically. Take the Fourier transform of (6.212) to get

q(p) = − 1

4π3|p|f(p). (6.213)

Here we have used the formula

|x|2 :=

∫|x|−2 exp(ip · x)dx = 2π

∫ ∞

0

dr

∫ π

0

exp(i|p|r cos θ) sin θdθ

= 4π

∫ ∞

0

drsin |p|r|p|r =

2π2

|p| . (6.214)

thus

q(x) =−1

32π6

∫exp(−ip · x)|p|f(p)dp. (6.215)

Assume for a moment that the term O(k−1) in (71) is absent. Then, ap-

plying formula (6.215), taking f(x) to be the left-hand side of (6.211), and

taking the Fourier transform of f (p), one would obtain

q(x) = − 1

32π6

∫dp|p| exp(−ip · x) ×

k2

S2

S2

A(θ′, θ, k)(2π)3δ [p + k(θ′ − θ)] dθdθ′

= − k3

4π3

S2

S2

dθdθ′A(θ′, θ, k)|θ′ − θ| exp ik(θ′ − θ) · x .(6.216)

This formula appeared in [Somersalo, E. et al. (1988)]. If Aδ is known in

place of A, and (6.196) holds, then formula (6.216), with Aδ in place of A,

gives

qδ(x) := − k3

4π3

S2

S2

dθdθ′Aδ(θ′, θ, k)|θ′−θ| exp ik(θ′ − θ) · x . (6.217)

Page 161: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

148 Random Fields Estimation Theory

Neglecting the term O(k−1) in (6.211), one obtains

|q − qδ| ≤ δk3

S2

S2

dθdθ′|θ − θ′|(4π3)−1 = δk3 16

3π, (6.218)

where we have used the formula:∫

S2

S2

|θ − θ′|dθdθ′ =

S2

S2

|θ − θ′|dθ′ = 4π

S2

|θ − θ′|dθ′

= 8π2

∫ π

0

√2 − 2 cos γ sin γdγ =

64π2

3. (6.219)

It is now possible to take into account the term O(k−1) in (6.211). Note

that, as follows from (6.72) and (6.200),

J2 ≤ 1

2k

∫dy|q(y)| 1

|x− y|

S2

∫ ∞

0

|q(x− rθ)|dr + o(k−1). (6.220)

One has

c

∫dy|q(y)||x− y|−1

∫ ∞

0

S2

dθ(1 + |x− rθ|2

)−a/2dr, a > 3 (6.221)

and∫

S2

(1 + |x− rθ|2

)−a/2dθ = 2π

∫ π

0

sinγdγ

(1 + |x|2 + |r|2 − 2|x|r cos γ)a/2

= 2π

∫ 1

−1

dt

(1 + |x|2 + |r|2 − 2|x|rt)a/2

= 2π

[1 + ||x| − r|2

]−a/2+1

−[1 + (|x|+ r)2

]−a/2+1

2|x||r|(a2 − 1

) . (6.222)

Moreover

∫ ∞

0

dr

[1 + ||x| − r|2

]−a/2+1

−[1 + (|x|+ r)2

]−a/2+1

r≤ c

|x| , |x| ≥ 1

(6.223)

where c > 0 is a constant. Therefore

J2 ≤ ck−1

1 + |x|2 . (6.224)

Page 162: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 149

This means that the L2(R3) norm of J2 as a function of x is O(k−1) as

k → ∞. Therefore, if one takes into account the O(k−1) term is (71), one

obtains in place of (6.218) the following estimate

‖ q − qδ ‖L2(R3)≤ c(δk3 + k−1), c = const > 0. (6.225)

Minimization of the right-hand side of (6.225) in k yields

‖ q − qδ ‖L2(R3)≤ c1δ−1/4 for km = (3δ)−1/4 (6.226)

where km = km(δ) is the minimizer of the right-hand side of (6.225) on the

interval k > 0.

Therefore, if the data Aδ(θ′, θ, k) are noisy, so that (6.196) holds, one

should not take k in formula (6.217) too large. The quasioptimal k is given

in (6.226), and formula (6.217) with k = km gives a stable approximation

of q(x).

Let us finally discuss (6.225). The term ck−1 has already been discussed.

The first term cδk3 has been discussed for the estimate in sup norm. In the

case of L2 norm one has to estimate the L2(R3) norm of the function

h(x) :=

S2

S2

a(θ′, θ) exp ik(θ′ − θ) · xdθ′dθ, a := A −Aδ (6.227)

given that

|a| ≤ δ, |∇θa| + |∇θ′a| ≤ m1 (6.228)

where ∇θ, ∇θ′ are the first derivatives in θ and θ′, and m1 = const > 0. In

order to estimate h we use the following formula [Ramm (1986), p.54]

S2

exp(ikrθ · α)f(θ)dθ =2πi

k

[exp(−ikr)

rf(−α) − exp(ikr)

rf(α)

]

+ o

(1

r

), r → +∞, k > 0,

α ∈ S2. (6.229)

This formula is proved under the assumption f ∈ C1(S2). From (6.227)-

(6.229) one obtains

|h(x)| ≤ cδ

1 + |x|2 . (6.230)

Thus

‖ h ‖L2(R3)≤ cδ. (6.231)

Page 163: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

150 Random Fields Estimation Theory

By c we denote various positive constants. From (6.231) one obtains the

first term in (6.225) for the case of the estimate in L2(R3) norm.

6.3 Estimation theory and inverse scattering in R3

Consider for simplicity the filtering problem which is formulated in Chapter

2. Let

U = s+ n(x), x ∈ R3 (6.232)

where the useful signal s(x) has the properties

s(x) = 0, s∗(x)s(y) = Rs(x, y) (6.233)

s(x)n(y) = 0 (6.234)

and the noise is white

n(x) = 0, n∗(x)n(y) = δ(x− y). (6.235)

In this section the star stands for complex conjugate and the bar stands for

the mean value.

The optimal linear estimate of s(x) is given by

s(x) =

D

h(x, y)U(y)dy. (6.236)

Here the optimality is understood in the sense of minimum of variance of

the error of the estimate, as in Chapter 2. Other notations are also the

same as in Chapter 2. In particular, D ⊂ R3 is a bounded domain in which

U is observed.

It is proved in Chapter 2 that the optimal h(x, y) solves equation 2.11

which in the present case is of the form:

Rh := h(x, z) +

D

Rs(z, y)h(x, y)dy = Rs(z, x), x, y ∈ D (6.237)

or, if one changes z → y and y → z, this equation takes the form

h(x, y) +

D

Rs(y, z)h(x, z)dz = Rs(y, x), x, y ∈ D. (6.238)

Note that under the assumptions (6.233)-(4) one has

R(x, y) = Rs(x, y) + δ(x− y), f(x, y) = Rs(x, y) (6.239)

Page 164: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 151

where R(x, y) and f(x, y) are defined in 2.3 (II.1.3). The basic equation

(6.238) is a Fredholm’s second kind equation with positive definite in L2(D)

operator R, R ≥ I. It is uniquely solvable in L2(D) (that is, it has a solution

in L2(D) and the solution is unique).

There are many methods to solve (6.238) numerically. In particular an

iterative method can easily be constructed for solving (6.238). This method

converges as a geometrical series (see Section 3.2, Lemma 3.1).

Projection methods can be easily constructed and applied to (6.238)

(see Section 3.2, Lemma 3.3). In [Levy and Tsitsiklis (1985)] and [Yagle

(1988)] attempts are made to use for the numerical solution of the equation

(6.238) some analogues of Levinson’s recursion which was used in the one-

dimensional problems when Rs(x, y) = Rs(x− y).

In the one-dimensional problems causality plays important role. In the

three-dimensional problems causality plays no role: the space is isotropic

in contrast with time.

Therefore, in order to use the ideas similar to Levinson’s recursion one

needs to assume that the domain D is parametrized by one parameter. In[Levy and Tsitsiklis (1985)] the authors assume D to be a disc (so that the

domain is determined by the radius of the disc; this radius is the parameter

mentioned above). Of course, one has to impose severe restrictions on the

correlation function Rs(x, y). In [Levy and Tsitsiklis (1985)] it is assumed

that

Rs(x, y) = Rs(|x− y|). (6.240)

This means that s(x) is an isotropic random field.

In [Yag] the case is considered when D ⊂ R3 is a ball and Rs(x, y) solves

the equation

∆xRs(x, y) = ∆yRs(x, y), (6.241)

where ∆x is the Laplacian. If Rs(x, y) = Rs(x− y) then (6.241) holds.

Let us derive a differential equation for the optimal filter h(x, y). The

derivation is similar to the one given in subsection 8.4.8 for Kalman filters.

Let us apply the operator ∆x−∆y to (6.238) assuming (6.241) and taking

Page 165: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

152 Random Fields Estimation Theory

D = z : |z| ≤ |x|:

0 = (∆x − ∆y)h(x, y) + ∆x

|z|≤|x|h(x, z)Rs(y, z)dz

−∫

|z|≤|x|h(x, z)∆zRs(y, z)dz

= (∆x − ∆y)h(x, y) +

|z|≤|x|(∆x − ∆z)h(x, z)Rs(y, z)dz − J ,

(6.242)

where, as we will prove,

J := r2∫

S2

Rs(y, rβ)q(x, rβ)dβ, r = |x|. (6.243)

Here S2 is the unit sphere in R3 and

q = q(r, α, β) := q(x, rβ) := −2d

drh(rα, rβ) − 4

rh(rα, rβ)

= − 2

r2d

dr2[r2h(rα, rβ)

], x = rα. (6.244)

Let us prove (6.243) and (6.244). Integrate by parts the second integral

in (6.242) to get

|z|≤|x|h(x, z)∆zRs(y, z)dz =

|z|≤|x|∆zh(x, z)Rs(y, z)dz

+

|z|=r

[h(x, z)

∂|z|Rs(y, z) −∂h(x, z)

∂|z| Rs(y, z)

]dz. (6.245)

The first integral in (6.242) can be written as

J1 :=

(d2

dr2+

2

r

d

dr+

∆∗

r2

)∫ r

0

dρρ2

S2

h(rα, ρβ)Rs(y, ρβ)dβ, (6.246)

where ∆∗ is the angular part of the Laplacian and z = ρβ, where ρ = |z|

Page 166: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 153

and β ∈ S2. One has

J1 =

∫ r

0

dρρ2

S2

1

r2∆∗h(rα, ρβ)Rs(y, ρβ)dβ + 2r

S2

h(rα, rβ)Rs(y, rβ)dβ

+

∫ r

0

dρρ2

S2

2

r

d

drh(rα, ρβ)Rs(y, ρβ)dβ

+d

dr

[r2∫

S2

h(rα, rβ)Rs(y, rβ)dβ

+

∫ r

0

dρρ2

S2

d

drh(rα, ρβ)Rs(y, ρβ)dβ

]

=

|z|≤r∆xh(x, z)Rs(y, z)dz + 4r

S2

h(rα, rβ)Rs(y, rβ)dβ

+ r2∫

S2

d

drh(rα, rβ)Rs(y, rβ)dβ + r2

S2

h(rβ, rβ)d

drRs(y, rβ)dβ

+ r2∫

S2

∂rh(rα, ρβ)

∣∣∣∣ρ=r

Rs(y, rβ)dβ (6.247)

From (6.245) and (6.247) one obtains

−J = r2∫

S2

[−h(rα, rβ)

∂rRs(y, rβ) +

∂h(rα, ρβ)

∂ρ

∣∣∣∣ρ=r

Rs(y, rβ)

]dβ

+ ∗4r∫

S2

h(rα, rβ)Rs(y, rβ)dβ

+ r2∫

S2

[d

drh(rα, rβ)Rs(y, rβ)dβ + h(rα, rβ)

d

drRs(y, rβ)

+∂

∂rh(rα, ρβ)

∣∣∣∣ρ=r

Rs(y, rβ)

]dβ

= 2r2∫

S2

d

drh(rα, rβ)Rs(y, rβ)dβ + 4r

S2

h(rα, rβ)Rs(y, rβ)dβ

= r2∫

S2

Rs(y, rβ)q(x, rβ)dβ, (6.248)

where q is given by (6.244).

In order to derive a differential equation for h let us assume that

Rs(x, y) = Rs(y, x), (6.249)

Page 167: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

154 Random Fields Estimation Theory

write equation (6.238) for D = z : |z| ≤ |x| as

H(x, y) +

|z|≤|x|Rs(y, z)H(x, z)dz = Rs(y, x), |y| ≤ |x|. (6.250)

Note the restiction |y| ≤ |x| in (6.250). Multiply (6.250) by q(rα, rβ), set

in (6.250) x = rβ, r = |x|, integrate over S2 in β and then multiply by r2

to get

r2∫

S2

H(rβ, y)q(rα, rβ)dβ +

|z|≤|x|Rs(y, z)r

2

S2

H(rβ, z)q(rα, rβ)dβ

= r2∫

S2

Rs(y, rβ)q(rα, rβ)dβ. (6.251)

Define

(∆x − ∆y)H(x, y) := φ(x, y) (6.252)

and set x = rα in (6.252). Write equation (6.242) as

φ(rα, y) +

|z|≤|x|φ(rα, z)Rs(y, z)dz = r2

S2

Rs(y, rβ)q(rα, rβ)dβ

(6.253)

or, in the operator form

(I +Rs)φ = ψ, (6.254)

where ψ is the right-hand side of (6.253).

Equation (6.251) is of the form

(I + Rs)γ = ψ (6.255)

where

γ := r2∫

S2

H(rβ, y)q(rα, rβ)dβ. (6.256)

Since the operator I+Rs ≥ I is injective, it follows from (6.254) and (6.255)

that φ = γ. Thus

(∆x − ∆y)H(x, y) = r2∫

S2

H(rβ, y)q(rα, rβ)dβ, |y| ≤ |x| = r, x = rα,

(6.257)

where α, β ∈ S2 and q is given by (6.244) with h(rα, rβ) = H(rα, rβ). Let

us formulate the result:

Page 168: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 155

Lemma 6.8 If H(x, y) solves equation (6.250) and the assumptions

(6.241) and (6.249) hold, then H solves equation (6.257) with q(rα, rβ)

given by (6.244) with h(rα, rβ) = H(rα, rβ), α, β ∈ S2, |x| = r, x = rα.

If one defines

H(x, y) = 0 for |y| > |x| (6.258)

and put

H(x, ξ) = (2π)−32

∫H(x, y) exp(iξ · y)dy,

∫:=

R3

, (6.259)

where the integral in (6.259) is taken actually over the ball |y| ≤ |x| because

of (6.258), then

H(x, y) = (2π)−32

∫H(x, ξ) exp(−iξ · y)dξ. (6.260)

Substitute (6.260) into (6.257) (or, which is the same, Fourier transform

(6.257) in the variable y) to get

(∆x + ξ2)H(x, ξ) = r2∫

S2

H(rβ, ξ)q(rα, rβ)dβ, x = rα. (6.261)

Equation (6.261) is a Schrodinger equation with a non-local potential

QH := r2∫

S2

H(rβ, ξ)q(rα, rβ)dβ. (6.262)

Suppose that H(x, y) is computed for |y| ≤ |x| ≤ a. Given this H(x, y),

how does one compute the solution h(x, y) to the equation (6.238) with

D = Ba = x : |x| ≤ a?Write equation (6.238) for D = Ba and z = ρβ as

h(x, y, a) +

∫ a

0

S2

Rs(y, ρβ)h(x, ρβ, a)dβρ2dρ = Rs(y, x). (6.263)

Differentiate (6.263) in a to get

∂h

∂a+

∫ a

0

S2

Rs(y, ρβ)∂h(x, ρβ, a)

∂adβρ2dρ = −a2

S2

R(y, aβ)h(x, aβ, a)dβ.

(6.264)

Page 169: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

156 Random Fields Estimation Theory

Let x = aα in (6.263). Multiply (6.263) by −a2h(z, aα, a) and integrate

over S2 to get:

−a2∫S2 h(aα, y, a)h(z, aα, a)dα

+∫ a0

∫S2 Rs(y, ρβ)

−a2

∫S2 h(aα, ρβ, a)h(z, aα, a)

= −a2∫S2 Rs(y, aα)h(z, aα, a)dα. (6.265)

The operator I + Rs is injective. Therefore equations (6.264) and (6.265)

have the same solution since their right-hand sides are the same. Set z = x

and α = β in (6.265), compare (6.264) and (6.265) and get

∂h(x, y, a)

∂a= −a2

S2

h(aβ, y, a)h(x, aβ, a)dβ. (6.266)

Note that

h(x, y) = H(x, y) for |y| ≤ |x| (6.267)

according to equation (6.250). Therefore (6.266) can be written as

∂h(x, y, a)

∂a= −a2

S2

H(aβ, y, a)h(x, aβ, a)dβ. (6.268)

Equation (6.268) can be used for computing the function h(x, y, a) for

all x, y ∈ Ba, given H(x, y) for |y| ≤ |x| ≤ a. The value h(x, aβ, a) can be

computed from equation (6.263):

h(x, aβ, a) +

∫ a

0

S2

Rs(aβ, ρθ)h(x, ρθ, a)dθρ2dρ = Rs(aβ, x). (6.269)

The function Rs(aβ, z) is known for all z, β and the function h(x, ρθ, a),

ρ < a is assumed to be computed recursively, as a grows, by equation

(6.268).

Namely, let us assume that H(x, y) is computed for all values |y| ≤ |x| ≤A, and one wants to compute h(x, y) for all x, y ∈ BA := x : |x| ≤ A.From (6.268) one has

h (x, y, (m+ 1)τ ) = h(x, y,mτ ) − τ (mτ )2 ×∫

S2

H(mτβ, y,mτ )h(x,mτβ,mτ )dβ. (6.270)

Here m = 0, 1, 2, . . . , τ > 0 is a small number, the step of the increment of

a. It follows from (6.263) that

h(x, y, 0) = Rs(y, x), (6.271)

Page 170: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 157

so that

h(x, y, τ ) = Rs(y, x), (6.272)

h(x, y, 2τ ) = Rs(y, x) − τ 3

S2

H(τβ, y, τ )h(x, τβ, τ )dβ, (6.273)

and so on. One can assume that |y| > |x| because for |y| ≤ |x| one can use

(6.267).

A formal connection of the estimation problem with scattering theory

can be outlined as follows.

Let us assume that there exists a function H(x, y), H(x, y) = 0 for

|y| > |x|, such that the function φ(x, θ, k) defined by the formula

φ(x, θ, k) := exp(ikθ · x) −∫

|y|≤|x|exp(ikθ · y)H(x, y)dy (6.274)

is a solution to the Schrodinger equation

[∆ + k2 − q(x)

]φ = 0, (6.275)

where ∆ = ∇2 is the Laplacian. This assumption is not justified presently,

so that our argument is formal.

Taking the inverse Fourier transform of (6.274) in the variable kθ, one

obtains

− 1

(2π)3

∫ ∞

0

dkk2

S2

[φ(x, θ, k)− exp(ikθ · x)] exp(−ikθ · y)dθ = H(x, y).

(6.276)

Compute (∆x − ∆y)H formally taking the derivatives under the integral

signs in the left-hand side of (6.276) and using (6.275). The result is

(∆x − ∆y)H(x, y) = q(x)H(x, y). (6.277)

One is interested in the solution of (6.277) with the property H(x, y) = 0

for |y| > |x|. Define H(x, ξ) by formula (6.259). Substitute (6.260) into

(6.277) and differentiate in y formally under the sign of the integral to get

[∆x + ξ2 − q(x)

]H(x, ξ) = 0. (6.278)

Therefore, if one compares (6.278) and (6.261) one can conclude that the

right-hand side of (6.261) reduces to q(x)H(x, ξ). This means that

q(rα, rβ) =1

r2δ(α− β)q(x), x = rα (6.279)

Page 171: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

158 Random Fields Estimation Theory

where δ(α − β) is the delta-function. If (6.279) holds then the non-local

potential Q defined by (6.262) reduces to the local potential q(x).

Equations (6.244) and (6.279) imply

−2d

dr

[r2H(rα, rβ)

]= δ(α− β)q(rα). (6.280)

Note that h(rα, rβ) = H(rα, rβ). From (6.280) one obtains

H(rα, rβ) = −δ(α− β)

2r2

∫ r

0

q(ρα)dρ + Rs(0, 0), (6.281)

where we have used the equation

H(0, 0) = Rs(0, 0) (6.282)

which follows from (6.250).

Let us summarize the basic points of this section:

1) the solution to equation (6.250) solves equation (6.257) provided that the

assumptions (6.241) and (6.249) hold;

2) the solution to equation (6.238) with D = Ba is related to the solution

to equation (6.250) for |x| ≤ a by the formulas (6.267) and (6.268);

3) if the solution to equation (6.250) is found then one can compute the

solution to equation (6.238) with D = Ba recursively using (6.270);

4) the solution to equation (6.250) solves the differential equation (6.257),

and its Fourier transform solves the Schrodinger equation (6.261) with a

non-local potential.

Page 172: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Chapter 7

Applications

In this Chapter a number of questions arising in applications are discussed.

All sections of this Chapter are independent and can be read separately.

7.1 What is the optimal size of the domain on which the

data are to be collected?

Suppose that one observes the signal

U(x) = s(x) + n(x), x ∈ D ⊂ Rr (7.1)

in a bounded domain D which contains a point x0. Assume for simplicity

that D is a ball B` with radius ` centered at x0. The problem is to estimate

s(x0). As always in this book, s(x) is a useful signal and n(x) is noise. It

is clear that if the radius `, which characterizes the size of the domain of

observation, is too large then the time and effort will be wasted in collecting

data which do not improve the quality of the estimate significantly. On the

other hand, if ` is too small then one can improve the estimate using more

data. What is the optimal `? This question is of interest in geophysics and

many other applications.

Let us answer this question using the estimation theory developed in

Chapter 2. We assume that the optimal estimate is linear and the opti-

mization criterion is minimum of variance.

We also assume that the data are the covariance functions (1.3), that

condition (1.2) holds, and that R(x, y) ∈ R.

The optimal estimate is given by formula (2.15). Let us assume for

simplicity that P (λ) = 1, and

|R(x, y)| ≤ c exp(−a|x− y|) c > 0, |x− y| ≥ ε > 0 (7.2)

159

Page 173: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

160 Random Fields Estimation Theory

where the last inequality allows a growth of R(x, y) as x→ y, for example,

R(x, y) = (4π|x−y|)−1 exp(−a|x−y|). Here c and a are positive constants,

a−1 is the so-called correlation radius, and the function f(x, y) defined by

formula (1.3) is smooth. Concerning f(x, y) we assume the same estimate

as for R(x, y):

|f(x, y)| ≤ c exp(−a|x− y|), c > 0, |x− y| ≥ ε0 > 0. (7.3)

Under these assumptions the optimal filter is of the form

h = Q(L)f + hs = h0 + hs (7.4)

where hs is the singular part of h which contains terms of the type

b(s)δΓ(j) (see Section 3.3), and h0 is the regular part of h.

The optimal estimate is of the form

s(x0) =

B`

h0(x0, y)U(y)dy +

B`

hs(x0, y)U(y)dy. (7.5)

The optimal size ` of the domain B` of observation is the size for which the

second term in (7.5) is negligible compared with the first.

Example 7.1 Suppose that r = 3, x0 = 0, R(x, y) = (4π|x −y|)−1 exp(−a|x − y|), and |∂jf | ≤ M , 0 ≤ |j| ≤ 2, where j is a multi-

index. Then by formula (2.85) one has

h(y) = (−∆ + a2)f(y) +

(∂f

∂|y| −∂u

∂|y|

)δΓ, (7.6)

where h(y) = h(0, y), f(y) = f(0, y), Γ = x : |x| = `. Therefore the

optimal estimate is

s(0) =

B`

h(y)U(y)dy =

B`

(−∆ + a2)fUdy +

Γ

(∂f

∂|y| −∂u

∂|y|

)U(s)ds

(7.7)

and u is uniquely determined by f as the solution to the Dirichlet problem

(2.22-2.23) which in our case is

(−∆ + a2)u = 0 if |x| ≥ `, u = f if |x| = `, u(∞) = 0. (7.8)

The solution to problem (7.8) can be calculated analytically. One gets

u(r, θ) =∞∑

n=0

fnYn(θ)hn(iar)

hn(ia`), (7.9)

Page 174: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 161

where θ = (ϑ, φ) is a point on the unit sphere S2 in R3, a unit vector,

Yn(θ) is the system of spherical harmonics orthonormalized in L2(S2),

fn :=∫S2 f(`, θ)Y

∗n (θ)dθ, and hn(r) is the spherical Hankel function,

hn(r) :=(π2r

)1/2H

(1)n+(1/2)(r), where H

(1)n (r) is the Hankel function. The

solution (7.9) is a three dimensional analogue of the solution (2.91).

The second integral on the right-hand side of (7.7) is of order of mag-

nitude O (exp(−a`)), while the first integral is of order of magnitude O(1).

Therefore, if we wish to be able to neglect the effects of the boundary

term on the estimate with accuracy about 5 percent then we should choose

` = 3/a. A practical recipe for choosing ` so that the magnitude of the

boundary term in (7.7) is about γ percents of the magnitude of the volume

term is

` =1

aln

100

γ. (7.10)

7.2 Discrimination of random fields against noisy back-

ground

Suppose that one observes a random field U(x) which can be of one of the

forms

U(x) = sp(x) + n(x), p = 0, 1. (7.11)

Here sp(x), p = 0, 1, are deterministic signals and n(x) is Gaussian random

field with zero mean value

n = 0. (7.12)

In particular, if s0 = 0 then the problem is to decide if the observed signal

U(x) contains the signal s1(x) or is it just noise. In order to formulate

the discrimination problem analytically we take as the optimality criterion

the principle of maximum likelihood. Other optimal decision rules such as

Neyman-Pearson or Bayes rules could be considered similarly.

Note that we assume in this section that noise in Gaussian. This is

done because under such an assumption one can calculate the likelihood

ratio analytically.

Let us first develop the basic tools for solving the discrimination prob-

lem. Let

Rφj :=

D

R(x, y)φj(y)dy = λjφj in D. (7.13)

Page 175: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

162 Random Fields Estimation Theory

Here

R(x, y) := n∗(x)n(y), (7.14)

λ1 ≥ λ2 ≥ · · · > 0, (7.15)

λj are the eigenvalues of the operator R counted according their multiplici-

ties, and the φj are the corresponding normalized in L2(D) eigenfunctions.

Let us define random variables nj by the formula:

nj = λ−1/2j

D

n(x)φ∗j(x)dx. (7.16)

From (7.12) it follows that

nj = 0. (7.17)

Moreover, since R(x, y) = R∗(y, x), one has

nin∗j = (λiλj)

−1/2

D

D

R(y, x)φj(x)φ∗i (y)dxdy

= (λiλj)−1/2

D

dxφj(x)λiφ∗i (x) = δij =

1 i = j

0 i 6= j.(7.18)

The random variables nj are called noncorrelated coordinates of the random

field n(x). One has

n(x) =

∞∑

j=1

λ1/2j njφj(x). (7.19)

The series in (7.19) converges in the mean. The random variables nj are

Gaussian since the random field n(x) is Gaussian. Define

spj := λ−1/2j

D

sp(x)φ∗j (x)dx, p = 0, 1. (7.20)

Then

sp(x) + n(x) =

∞∑

j=1

λ1/2j [spj + nj]φj(x). (7.21)

Let

Upj := spj + nj . (7.22)

Page 176: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 163

Then

Upj = spj , |Upj − spj |2 = 1, (7.23)

and Upj are Gaussian.

Let Hp denote the hypothesis that the observed random field is sp(x)+

n(x), p = 0, 1, and f(u1, . . . , un∣∣ Hp) is the probability density for the

random variables Upj under the assumption that the hypothesis Hp occured:

f(u1, . . . , un∣∣Hp) = (2π)−n exp

−1

2

n∑

j=1

|uj − spj |2 . (7.24)

Here we used equations (7.23). Since Upj are complex valued we took

(2π)−n rather than (2π)−n/2 as the normalizing constant in (7.24).

The likelihood ratio is defined as

`(u1, . . . , un) =f(u1, . . . , un

∣∣ H1)

f(u1, . . . , un∣∣ H0)

. (7.25)

Therefore

ln `(u1, . . . , un) = −1

2

n∑

j=1

[|uj − s1j |2 − |uj − s0j |2

]

=1

2

n∑

j=1

(|s0j|2 − |s1j|2

)+ Re

n∑

j=1

uj(s∗1j − s∗0j).(7.26)

We wish to compute the limit of the function (7.26) as n → ∞. If this is

done one can formulate the decision rule based on the maximum likelihood

principle.

Note first that the system of eigenfunctions φj of the operator R is

complete in L2(D) since we have assumed that the selfadjoint operator

R : L2(D) → L2(D) is positive, that is (Rφ, φ) = 0 implies φ = 0. (See

(7.15). Indeed since

L2(D) = c`RanR ⊕ N (R) (7.27)

where c`RanR is the closure of the range of R, and N (R) is the null space

of R, and since N (R) = 0 by assumption (7.15), one concludes that the

closure of the range of R is the whole space L2(D). Thus the closure of the

linear span of the eigenfunctions φj is L2(D) as claimed.

Page 177: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

164 Random Fields Estimation Theory

Let us assume that

R(x, y) ∈ R. (7.28)

Define the function V (x) as the solution of minimal order of singularity of

the equation

RV :=

D

R(x, y)V (y)dy = s1(x) − s0(x), x ∈ D. (7.29)

Thus

V (x) = R−1(s1 − s0). (7.30)

Using Parseval’s equality one gets

∞∑

j=1

λ−1j cjb

∗j =

D

c(y)R−1b(y)

∗dy (7.31)

where c(y) and b(y) are some functions for which the integral (7.31) con-

verges,

cj =

D

c(y)φ∗jdy, bj =

D

b(y)φ∗j (y)dy, (7.32)

λ−1j bj =

D

R−1b(y)φ∗jdy. (7.33)

Therefore using formulas (7.16), (7.19), (7.20), (7.22), (7.30) and (7.31) one

obtains

∞∑

j=1

Uj(s∗1j − s∗0j) =

D

U(y)V ∗(y)dy. (7.34)

Let us assume that

D

s0(R−1s0)

∗dx < ∞,

D

s1(R−1s1)

∗dx < ∞. (7.35)

Page 178: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 165

Then, using selfadjointness of R−1, one gets

∞∑

j=1

(|s0j|2 − |s1j|2

)=

D

[s0(R

−1s0)∗ − s1(R

−1s1)∗] dx

= −∫

D

s0V∗dx+

D

s0(R−1s1)

−∫

D

R−1(s1 − s0)s∗1dx−

D

R−1s0s∗1dx

= −∫

D

s0V∗dx−

D

s∗1V dx. (7.36)

Combining (7.36), (7.34) and (7.26) one obtains

Lemma 7.1 There exists the limit

ln ` (U(x)) = limn→∞

ln `(u1, . . . , un)

= Re

D

U(x)V ∗(x)dx− 1

2

D

s0V∗dx− 1

2

D

s∗1(x)V dx.

(7.37)

If the signals sp, p = 0, 1, and the kernel R(x, y) are real valued then

(7.36) reduces to

ln ` (U(x)) =

D

[U(x) − s0(x) + s1(x)

2

]V (x)dx. (7.38)

Suppose that the quantity on the right hand side of equation (7.37) (or

(7.38)) has been calculated, so that the quantity ln ` (U(x)) is known. Then

we use

The maximum likelihood criterion: if ln ` (U(x)) ≥ 0 then the decision

is that hypothesis H1 occured, otherwise hypothesis H0 occured.

Therefore if

Re

D

U(x)V ∗(x)dx ≥ 1

2

D

s0(x)V∗(x)dx+

1

2

D

s∗1(x)V (x)dx (7.39)

then H1 occured. Here V is given by formula (7.30). If the opposite

inequality holds in (7.39) then H0 occured.

If sp(x), p = 0, 1, U(x) and R(x, y) are real valued then the inequality

(7.39) reduces to∫

D

U(x)V (x)dx ≥ 1

2

D

[s0(x) + s1(x)]V (x)dx. (7.40)

Page 179: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

166 Random Fields Estimation Theory

The decision rule is: if (7.40) holds then H1 occured, otherwise H0 occured.

If one uses some other threshold criterion (such as Bayes, Neyman-

Pearson, etc.) then one formulates the decision rule based on the inequality

Re

D

U(x)V ∗(x)dx ≥ lnκ+1

2

D

[s0(x)V∗(x) + s∗1(x)V (x)] dx, (7.41)

where κ > 0 is a constant which is determined by the threshold. (See

Section 8.4 for more details.)

The decision rule: Practically, the decision rule based on the inequal-

ity (7.39) (or (7.41)) can be formulated as follows:

1) given sp(x), p = 0, 1, solve equation (7.29) for V (x) by formulas given

in Theorem 2.1.

2) if V (x) is found and U(x) is measured, then compute the integrals in

formula (7.39) and check if the inequality (7.39) holds.

3) if yes, then the decision is that the observed signal is

U = s1(x) + n(x). (7.42)

Otherwise

U = s0(x) + n(x). (7.43)

Example 7.2 Consider the problem of detection of signals against the

background of white Gaussian noise. In this case s0(x) = 0, R(x, y) =

σ2δ(x− y), we assume that the variance of the noise is σ2. The solution to

equation (7.29) is therefore

V = σ−2s1(x). (7.44)

The inequality (7.39) reduces to

Re

D

U(x)s1(x)dx ≥ 1

2

D

|s∗1(x)|2dx. (7.45)

If (7.45) holds then the decision is that the observed signal U(x) is of the

form

U(x) = s1(x) + n(x).

Otherwise one decides that

U(x) = n(x).

Page 180: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 167

The problem of discrimination between two signals s1(x) and s0(x) against

the white Gaussian noise background is solved similarly. If

Re

D

U(x)(s1 − s0)∗dx ≥ 1

2

D

(|s1|2 − |s0|2

)dx (7.46)

then the decision is that equation (7.42) holds. Otherwise one decides that

(7.43) holds.

If all the signals are real-valued then inequality (36) can be written as∫

D

(U − s0)2dx ≥

D

(U − s1)2dx. (7.47)

The decision rule now has a geometrical meaning: if the observed signal

U is closer to S1 in L2(D) metric then the decision is that (7.42) holds.

Otherwise one decides that (7.43) holds.

We have chosen a very simple case in order to demonstrate the decision

rule for the problem for which all the calculations can be carried through

in an elementary way. But the technique is the same for the general kernels

R ∈ R.

Example 7.3 Consider the problem of detection of a signal with un-

known amplitude. Assume that the observed signal is either of the form

U(x) = γs(x) + n(x) (7.48)

or

U(x) = n(x). (7.49)

Parameter γ is unknown, function s(x) is known, n(x) is a Gaussian noise

with covariance function R(x, y) ∈ R. Given the observed signal U(x)

one wants to decide if the hypothesis H1 that (7.48) holds is true, or the

hypothesis H0 that (7.49) holds is true. Moreover, one wants to estimate

the value of γ. In formula (7.37) take

s1 = γs(x), s0 = 0. (7.50)

Then, using the equation ReU∗V = ReUV ∗, write

ln ` (U(x)) = Re

D

U∗(x)V dx− γ∗

2

D

s∗V dx (7.51)

where V (x) solves the equation∫

D

R(x, y)V (y)dy = γs(x). (7.52)

Page 181: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

168 Random Fields Estimation Theory

One should find the estimate of γ by the maximum likelihood principle

from the equations

∂ ln `

∂γ= 0,

∂ ln `

∂γ∗= 0. (7.53)

If, again for simplicity, one assumes that the noise is white with variance

σ2 = 1, so that R(x, y) = δ(x− y), then the solution to (7.52) is

V (x) = γs(x), (7.54)

and formula (7.51) reduces to

ln ` = Reγ

D

U∗sdx− |γ|22

D

|s|2dx. (7.55)

Therefore equations (7.53) yield

D

U∗sdx = γ∗∫

D

|s|2dx,∫

D

Us∗dx = γ

D

|s|2dx, (7.56)

so that the estimate γ of γ is

γ =

D

Us∗dx(∫

D

|s|2dx)−1

. (7.57)

Exercise. Check that the estimate γ is unbiased, that is

γ = γ. (7.58)

Hint: Use the equation U = γS(x).

Exercise. Calculate the variance:

(γ − γ)2 =σ2

E, E :=

D

|s|2dx. (7.59)

Estimate (7.59) shows that the variance of the estimate of γ decreases as

the energy E of the signal S(x) grows, which is intuitively obvious.

Assume that hypothesis H0 occured. Then the quantity γ defined by

formula (7.57) is Gaussian with zero mean value and its variance equals σ2

E

by formula (7.59). Therefore

Prob(|γ| > b) = 2erf(bE1/2/σ) (7.60)

Page 182: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 169

where

erf(x) := (2π)−1/2

∫ ∞

x

exp(−t2/2)dt. (7.61)

If one takes confidence level ε = 0, 95 and decides that hypothesis H0

occured if

2erf(γE1/2/σ) > ε, (7.62)

then the decision rule for detection of a known signal with an unknown

amplitude against the Gaussian white noise background is as follows:

1) given the observed signal U(x) calculate γ by formula (7.57),

2) calculate the left hand side in the formula (7.62); if the inequality (7.62)

holds then the decision is that (7.49) holds; otherwise the decision is that

(7.48) holds.

7.3 Quasioptimal estimates of derivatives of random func-

tions

7.3.1 Introduction

Suppose that the observed signal in a domain D ⊂ Rr is

U(x) = s(x) + n(x), x ∈ D ⊂ Rr, (7.63)

where s(x) is a useful signal and n(x) is noise, s = n = 0. If one wishes to

estimate ∂js(x0) optimally by the criterion of minimum of variance then

one has a particular case of the problem studied in Ch. II with As =

∂js (see formula (I.5)). This estimation problem can be solved by the

theory developed in Ch. II. However, the basic integral equation for the

optimal filter may be difficult to solve, the optimal filter may be difficult

to implement, and the calculation of the optimal filter depends on the

analytical details of the behavior of the spectral density of the covariance

kernel R(x, y) = u∗(x)u(y), R(x, y) ∈ R. That is, if one changes R(λ)

locally a little it ceases to be a rational function, for example.

One can avoid the above difficulties by constructing a quasioptimal esti-

mate of the derivative which is easy to calculate under a general assumption

about the spectral density, which is stable towards small local perturbations

of the spectral density and depends basically on the asymptotic behavior

of this density as |λ| → ∞, and which is easy to implement practically.

Page 183: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

170 Random Fields Estimation Theory

The notion of quasioptimality will be specified later and it will be shown

that the quasioptimal estimate is nearly as good as the optimal one. The

basic ideas are taken from [Ramm (1968); Ramm (1972); Ramm (1981);

Ramm (1984); Ramm (1985b)].

7.3.2 Estimates of the derivatives

Consider first the one-dimensional case: U(t) = s(t) + n(t). Assume that

|n(t)| ≤ δ, (7.64)

|s′′(t)| ≤ M. (7.65)

Let us assume for simplicity that n(t) and s(t) are defined on all of R1,

that s(t) is an unknown deterministic function which satisfies (7.65), and

the noise n(t) is an arbitrary random function which satisfies (7.64).

Let A denote the set of all operators T : C(R1) → C(R1), linear and

nonlinear, where C(R1) is the Banach space of continuous functions on R1

with the norm ‖ f ‖= maxt∈R1 |f(t)|. Let

∆hU := (2h)−1[U(t+ h) − U(t− h)], (7.66)

h(δ) := (2δ/M )1/2, ε(δ) := (2Mδ)1/2. (7.67)

First, let us consider the following problem: given U(t) and the numbers

δ > 0 and M > 0 such that (7.64) and (7.65) hold, find an estimate U of

s′(t) such that

‖ U − s′(t) ‖→ 0 as δ → 0 (7.68)

and such that this estimate is the best possible in the sense

‖ U − s′(t) ‖= infT∈A

sup|s′′|≤M

|n|≤δ

‖ TU − s′ ‖ . (7.69)

This means that among all estimates TU the estimate U is the best one for

the class of the data given by inequalities (7.64), (7.65). It turns out that

this optimal estimate is the estimate (7.70) in the following theorem.

Theorem 7.1 The estimate

U := ∆h(δ)U (7.70)

Page 184: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 171

has the properties

‖ U − s′ ‖≤ ε(δ) (7.71)

and

infT∈A

sup|s′′|≤M

|n|≤δ

‖ U − s′ ‖= ε(δ), (7.72)

where ε(δ) and h(δ) are defined by (7.67) and ∆h(δ)U is defined by (7.66).

Proof. One has

|∆hU − s′| ≤ |∆h(U − s)| + |∆hs− s′| ≤ δ

h+Mh

2. (7.73)

Indeed

|∆h(U − s)| ≤ |n(t+ h)| + |n(t− h)|2h

≤ δ

h,

and∣∣∣∣s(t + h) − s(t − h)

2h− s′

∣∣∣∣

=∣∣s(t) + s′(t)h + s′′(ξ+)h

2

2 − s(t) + s′(t)h− s′′(ξ−)h2

2

2h

∣∣ ≤ Mh2

2h

=Mh

2, (7.74)

where ξ± are the points in the remainder in the Taylor formula and esti-

mates (7.64), (7.65) were used.

For fixed δ > 0 and M > 0, minimize the right side of (7.73) in h > 0

to get

minh>0

h+Mh

2

)=

δ

h(δ)+Mh(δ)

2= ε(δ), (7.75)

where h(δ) are ε(δ) are defined in (7.67). This proves inequality (7.71). To

prove (7.72), take

s1 = −M2t[t− 2h(δ)] 0 ≤ t ≤ 2h(δ), (7.76)

and extend it on R1 so that

|s′′1 (t)| ≤ M, |s1(t)| ≤ δ, ∀t ∈ R1. (7.77)

Page 185: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

172 Random Fields Estimation Theory

Here h(δ) is given by (7.67). The extension of s1 with properties (7.77) is

possible since on the interval [0, 2h(δ)] conditions (7.77) hold. Let

s2(t) = −s1(t). (7.78)

One has

|s′′p | ≤ M, |sp| ≤ δ, p = 1, 2. (7.79)

Take U(t) = 0, t ∈ R1. Then

|U(t) − sp(t)| ≤ δ, p = 1, 2. (7.80)

Therefore one can consider U(t) as the observed value of both s1(t) and

s2(t). Let T ∈ A be an arbitrary operator on C(R1). Denote

TU(t)∣∣t=0

= a. (7.81)

One has

sup|s′′|≤M

|n|≤δ

‖ TU − s′ ‖ ≥ sup|s′′|≤M

|n|≤δ

|TU(0) − s′(0)| ≥ max|a− s′1(0)|, |a− s′2(0)|

≥ 1

2|s′1(0) − s′2(0)| = ε(δ), (7.82)

where ε(δ) is given by (7.67). Taking infimum in T ∈ A of both sides of

(7.82) one obtains

infT∈A

sup|s′′|≤M

|n|≤δ

‖ TU − s′ ‖≥ ε(δ). (7.83)

From (7.83) and (7.71) the desired inequality (7.72) follows. Theorem 7.1

is proved.

7.3.3 Derivatives of random functions

Assume now that s(t) is a random function, that s(t) and n(t) are uncor-

related, n = 0, and n = σv, where the variance of v, denoted by D[v], is 1,

so that

D[v] = 1,D[n] = σ2. (7.84)

The problem is to find a linear estimate Lu such that

D[LU − s′] = min (7.85)

Page 186: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 173

given the observed signal U(t) = s(t) + n(t). As was explained in section

7.3.1, we wish to find a quasioptimal linear estimate of s′ such that this

estimate is easy to compute, easy to implement, and is nearly as good as

the optimal estimate.

Let us assume that

D[s(m)(t)] ≤ M 2m, (7.86)

where s(m)(t) is the m-th derivative of s(t). Let us seek the quasioptimal

estimate among the estimates of the form

∆(Q)h s := h−1

Q∑

k=−QA

(Q)K s

(t+

kh

Q

). (7.87)

If m = 2q or m = 2q+1 let us take Q = q. If one expands the expression on

the right hand side of (7.87) in powers of h and requires that the order of

the smallness as h→ 0 of the function ∆(Q)h s− s′ be maximal, one obtains

the following system for the coefficients A(Q)k :

Q∑

k=−Q

(k

Q

)jA

(Q)k = δ1j, 0 ≤ j ≤ 2Q, (7.88)

where

δ1j =

0, j 6= 1

1, j = 1.

The system (7.88) is uniquely solvable since its determinant does not

vanish: it is a Vandermonde determinant. One can find by solving system

(7.88) that

A(1)0 = 0, A

(1)±1 = ±1

2(7.89)

A(2)0 = 0, A

(2)±1 = ±4

3, A

(2)±2 = +

1

6(7.90)

A(3)0 = 0, A

(3)±1 = ±9

4, A

(3)±2 = +

9

20, A

(3)±3 = ± 1

20(7.91)

A(4)0 = 0, A

(4)±1 = ±16

5, A

(4)±2 = +

4

5, A

(4)±3 = ± 16

105

Page 187: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

174 Random Fields Estimation Theory

A(4)±4 = +

1

70. (7.92)

We will need

Lemma 7.2 Let m = 2q+ 1. Assume that the coefficients A(q)k in (7.87)

satisfy (7.88) with 0 ≤ j ≤ 2q, and let

cm :=m

(m!)2q2m

q∑

k=−q

∣∣∣A(q)k

∣∣∣2

k2m, m = 2q + 1. (7.93)

Then

D[∆

(q)h s− s′

]≤ γmh

2m−2, γm := cmM2m, (7.94)

where D is the symbol of variance.

In order to prove this lemma, one needs a simple

Lemma 7.3 Let gj be random variables and aj be constants. If

D[gj] ≤ M, 1 ≤ j ≤ n (7.95)

then

D

n∑

j=1

ajgj

≤ nM

n∑

j=1

|aj|2. (7.96)

Proof of Lemma 7.3 Note that

(n∑

k=1

|bk|)2

≤ n

n∑

k=1

|bk|2 (7.97)

by Cauchy’s inequality.

Let f(x1, . . . , xn) be the probability density of the joint distribution of

the random variables g1, g2, . . . , gn. Let us assume without loss of generality

Page 188: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 175

that gk = 0. Denote dx = dx1 . . .dxn,∫Rn

=∫

, x = (x1, . . . , xn). Then

D

n∑

j=1

ajgj

=

∫ ∣∣∣∣∣∣

n∑

j=1

ajxj

∣∣∣∣∣∣

2

f(x)dx ≤∫ n∑

j=1

|aj|2n∑

j=1

|xj|2fdx

=

n∑

j=1

|aj|2n∑

j=1

∫x2jf(x)dx =

n∑

j=1

|aj|2n∑

j=1

D[gj]

≤ nM

n∑

j=1

|aj|2. (7.98)

Lemma 7.3 is proved.

Proof of Lemma 7.2 One has

∆(q)h s − s′ = h−1

q∑

k=−qA

(q)k

hmkm

m!qms(m)(tk), (7.99)

where tK are the points in the remainder of Taylor’s formula. Apply Lemma

7.3 to equation (7.99) and take into account the assumption (7.86) to get

D [∆qhs − s′] ≤ h2m−2M2

m

(m!)2q2m(2q + 1)

q∑

k=−q|A(q)k |2k2m (7.100)

which is equivalent to (31).

Lemma 7.2 is proved.

Lemma 7.4 One has

D[∆

(q)h U − s′

]≤ φ(h), (7.101)

where

φ(h) := γmh2m−2 + σ2h−2

q∑

k,j=−qA

(q)k A

(q)j R

(k − j

qh

). (7.102)

Here m = 2q + 1,

R(t− τ ) := v∗(t)v(τ ) (7.103)

is the covariance function of v(t), and conditions (7.84) hold.

Page 189: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

176 Random Fields Estimation Theory

Proof. By assumption s and v are uncorrelated. Therefore

D[∆

(q)h U − s′

]= D

[∆

(q)h s − s′

]+ σ2D

[∆

(q)h v]

(7.104)

≤ γmh2m−2 + σ2h−2

∑qk,j=−qA

(q)k A

(q)j R

(k−jqh),(7.105)

where we took into account that the coefficients A(q)j are real numbers.

Lemma 7.4 is proved.

Definition 7.1 The estimate

Lu := ∆(q)h U (7.106)

is called quasioptimal if h minimizes the function φ(h) defined by (7.102).

Thus, the quasioptimal estimate minimizes a natural majorant φ(h)

of the variance of the estimate ∆(q)h U among all estimates (7.106) with

different h. The majorant φ(h) is natural because the equality sign can be

attained in (7.101) (for example, if s = 0 then the equality sign is attained

in (7.101)).

The quasioptimal filter is easy to calculate: it is sufficient to find mini-

mizer of φ(h), h > 0. This filter is easy to implement: one needs only some

multiplications, additions and time shift elements. We will compare the er-

ror estimates for optimal and quasioptimal filter shortly, but first consider

an example.

Example 7.4 Let m = 2, q = 1,

∆(1)h U =

U(t + h) − U(t− h)

2h. (7.107)

By formulas (7.93) and (7.89) for m = 2 and q = 1 one calculates

c2 =1

4. (7.108)

Let us assume that the constant M 22 := M in the estimate (7.86) is

known, the variance σ2 of noise is known (see (7.84)) and the covariance

function of v(t) is

R(t) = exp(−|t|). (7.109)

Then formula (7.102) yields

φ(h) =M

4h2 +

σ2h−2

2[1 − exp(−2h)] =

Mh2

4+σ2h−2

2[R(0) − R(2h)] .

(7.110)

Page 190: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 177

If σ 1 then the minimizer of the function (7.110) should be small. As-

suming h 1 and using 1 − exp(−h) ≈ h for h 1, one obtains

φ(h) ≈ Mh2

4+σ2

h, h > 0. (7.111)

It is easy to check that the function (7.111) attains its minimum at

hmin =

(2σ2

M

)1/3

(7.112)

and

minφ ≈ σ4/3M1/3(22/3 + 2−1/3

)= 2.381σ4/3M1/3. (7.113)

Note that if the minimizer is small, so that hmin 1, then the behavior

of the covariance function is important only in a neighborhood of t = 0.

But this behavior is determined only by the asymptotic behavior of the

spectral density R(λ) as |λ| → ∞.

Let us now compare briefly optimal and quasioptimal estimates.

Let us define the spectral density Rs(λ) of s(t):

Rs(λ) :=

∫ ∞

−∞exp(−iλt)Rs(t)dt, (7.114)

where

Rs(t− τ ) := s∗(t)s(τ ). (7.115)

We assume that

0 < Rs(λ) ≤A

(1 + λ2)a, a ≥ 5

2(7.116)

and that the spectral density of noise

0 < R(λ) ≤ B

(1 + λ2)b, b > 1. (7.117)

One has for the quasioptimal estimate (7.107) formula (7.110), and

M = D[s′′(t)] = Rs′′(0) := [s′′(t)]∗s′′(τ )∣∣t=τ

=

∂4

∂t2∂τ2Rs(t− τ )

∣∣t=τ

=1

∫ ∞

−∞λ4Rs(λ)dλ ≤ A

∫ ∞

−∞λ4 dλ

(1 + λ2)a≤ constA. (7.118)

Page 191: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

178 Random Fields Estimation Theory

The term R(0) −R(2h) in (7.110) can be written as

R(0) − R(2h) =1

∫ ∞

−∞R(λ)[1 − exp(2iλh)]dλ. (7.119)

Therefore the function φ(h) in (7.110) can be expressed entirely in terms

of spectral densities. One has

|R(0) −R(2h)| ≤ B

∫ ∞

−∞

(1 + λ2)b|1− exp(2iλh)|

=B

π

∫ ∞

−∞

(1 + λ2)b| sinλh| ≤ Bh

π

∫ ∞

−∞

∣∣∣∣sinλh

λh

∣∣∣∣λdλ

(1 + λ2)b

≤ const ×Bh. (7.120)

It follows from (7.110), (7.118) and (7.120) that

φ(h) ≤ const

(Ah2 +

Bσ2

h

), (7.121)

where const does not depend on A, B, σ and h but depends on a and b. If

a ≥ 52

and b > 1 one can take an absolute constant in (7.121). This shows

that estimate (7.121) does not depend much on the details of the behavior

of the spectral densities. One can see from (7.121) that

φmin ≤ constσ4/3

(A

B

)1/3

, hmin = const

(Bσ2

A

)1/3

. (7.122)

This estimate shows how the variance of noise and the ratio AB

influence

the behavior of the error as σ → 0. Note that

1 = D[v] = R(0) =

∫ ∞

−∞R(λ)dλ ≤ B

∫ ∞

−∞

(1 + λ2)b= const ×B.

Therefore B is of order of magnitude of 1, and formula (7.122) can be

written as

φmin ≤ constσ4/3A1/3. (7.123)

This estimate holds if hmin 1, that is if

σ A1/2. (7.124)

All these estimates are asymptotic as σ → 0.

Page 192: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 179

The optimal h satisfies the equation∫ t

t−T

[Rs(y − z) + σ2R(y − z)

]h(x, z)dz = f(y, x), t− T ≤ y ≤ t,

(7.125)

where

f(y, x) := s∗(y)s′(x) =∂

∂xRs(y − x) = −R′

s(y − x). (7.126)

The error of the optimal estimate can be computed by formula (2.108):

ε(x) = |s′(x)|2 − (f, h0) = −R′′s (0) − (f, h0), (7.127)

where h0 is the solution to equation (7.125) of minimal order of singularity.

For simplicity let us assume that t − T = −∞ and t = ∞. The error

for this physically nonrealizable filter is not more than for the physically

realizable filter. The reasons for considering the nonrealizable filter are: 1)

this filter is easy to calculate, and 2) its error gives a lower bound for the

error of realizable filter.

Let x = 0 in (7.125). Since the random functions we consider are

assumed to be stationary there is no loss in the assumption that x = 0.

Take the Fourier transform of (7.125) with t− T = −∞, t = +∞, and use

the theorem about the Fourier transform of convolution to get

[Rs(λ) + σ2R(λ)]h0(λ) = −iλRs. (7.128)

Thus

h0 = −iλRs[Rs + σ2R]−1. (7.129)

Use (7.129) and apply Parseval’s equality to (61) to get

ε(0) =1

∫ ∞

−∞λ2Rs(λ)dλ −

∫ ∞

−∞λ2 R2

s

Rs + σ2Rdλ

=σ2

∫ ∞

−∞λ2Rs(λ)R(λ)[Rs(λ) + σ2R(λ)]−1dλ. (7.130)

It follows from (7.130), (7.116) and (7.117), if we assume that for large

λ the sign ≤ in (7.116) and (7.117) becomes asymptotic equality, that

ε(0) ≤ const × σ2AB

∫ ∞

−∞

λ2dλ

(1 + λ2)bA + σ2B(1 + λ2)a

≤ const × σ2 as σ → 0. (7.131)

Page 193: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

180 Random Fields Estimation Theory

The estimate (7.113) gives O(σ4/3) as σ → 0 but if one takes larger

m the estimate will be 0(σα(m)

), where α(m) → 2 as m grows. One

can estimate α(m) using formula (7.102). For small h one has φ ≤γmh

2m−2 + constσ2h−1, hmin ∼ σ2/(2m−1) and φmin ∼ σ2 (2m−2)/(2m−1),

so that α(m) = 2 2m−22m−1 . Therefore α(m) → 2 as m → ∞.

7.3.4 Finding critical points

Before we discuss the case r > 1 of random functions of several variables,

let us outline briefly an application of the results of section 7.3.2 to the

problem of finding the extremum of a random function. Assume that the

observed function U(t) = s(t)+n(t), where s(t) is a smooth function defined

on the interval [0, 1] which has exactly one maximum on this interval. Such

functions s(t) are called univalent. Suppose that this maximum is attained

at a point τ . Assume that

|s′′| ≤M, |n(t)| ≤ δ. (7.132)

The problem is to find τ given the signal U(t), 0 ≤ t ≤ 1.

The solution to this problem is:

1) divide the interval [0, 1] by the points tk = kh, where h = h(δ) is given

by (7.67), k = 0, 1, 2, . . .

2) calculate Uk := (2h)−1 [U(tk + h) − U(tk − h)] = ∆(1)h U(tk)

3) compute UkUk+1, k = 0, 1, 2, . . .

4) if

|Uk| > ε(δ) ∀k (7.133)

and

UjUj+1 < 0 for some j (7.134)

then

tj < τ < tj + h. (7.135)

Indeed, from (7.67) and (7.71) one concludes that

U(tk) − ε(δ) ≤ s′(tk) ≤ U(tk) + ε(δ). (7.136)

From (7.133), (7.134) and (7.136) it follows that s′(t) changes sign on the

interval (tj, tj+1). This implies (7.135) since s(t) is univalent.

Page 194: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 181

If (7.133) is not valid for some k = k0, then the maximum may be on the

interval (tk − h, tk + h). In this case it may happen that for no j condition

(7.134) holds. The above method may not work if the derivative of s(t) is

very small (smaller than ε(δ)) in a large neighborhood of τ .

Remark 7.1 Since we used formula (7.71) we assumed that U(t) is de-

fined on all of R1. If it is defined only on a bounded interval [a, b] then the

expression ∆(1)h U(t) is not defined for t < a+h. In this case one can define

∆(1)h U(t) = h−1 [U(t+ h) − U(t)] , a ≤ t < a+ h

= ∆(1)h U(t), a+ h ≤ t ≤ b− h

= h−1 [U(t) − U(t− h)] , b− h ≤ t ≤ b. (7.137)

In this case∣∣∣∆(1)

h U(t) − s′(t)∣∣∣ ≤ 2δ

h+Mh

2(7.138)

so that the minimizer hmin of the right hand side of (7.138) is

hmin = 2(δ/M )1/2 (7.139)

and the minimum of the right hand side of (7.138) is

ε(δ) = 2(Mδ)1/2. (7.140)

Note that ε(δ) = 21/2ε(δ), where ε(δ) is given by (7.67).

7.3.5 Derivatives of random fields

Let us consider the multidimensional case. There are no new ideas in this

case but we give a brief outline of the results for convenience of the reader.

Suppose that U(x) = s(x) + n(x), x ∈ Rr. Let ∇s denote the gradient

of s(x) and ‖ s ‖= maxx∈Rr |s(x)|. Assume that

‖ n(x) ‖≤ δ, (7.141)

and

maxx∈Rr

∣∣(d2s(x)θ, θ)∣∣ ≤M ∀θ ∈ S1 := θ : θ ∈ Rr, θ · θ = 1, (7.142)

where

(d2s(x)θ, θ) =

r∑

i,j=1

∂2s(x)

∂xi∂xjθiθj .

Page 195: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

182 Random Fields Estimation Theory

Define

∆hU(x) :=U(x+ hθ) − U(x− hθ)

2h, h > 0. (7.143)

Theorem 7.2 If h(δ) and ε(δ) are given by (7.67) then

|∆h(δ)U(x) −∇s(x) · θ| ≤ ε(δ), ∀θ ∈ S1. (7.144)

Moreover

infT∈A

sups,n

‖ TU(x) −∇s(x) · θ ‖= ε(δ) (7.145)

and the infimum is attained at T = ∆h(δ). Here the supremum is taken over

all s(x) ∈ C2(Rr), which satisfy (76), and all n(x), which satisfy (75), and

the infimum is taken over the set A of all operators T : C(Rr) → C(Rr)

linear or nonlinear.

The proof of this theorem is similar to the proof of Theorem 7.1. The

role of the function s1(t) in formula (7.76) is played by the function

s1(x) =M

2

(|x|2 − 2h(δ)x · θ

)in Bδ , (7.146)

where Bδ is the ball, centered at the point h(δ)θ, with radius h(δ), |x|2 =∑rj=1 |xj|2. It is clear that s1(x) vanishes at the boundary ∂Bδ of the ball,

that∣∣(d2s1(x)θ, θ)

∣∣ ≤ M ∀θ ∈ S1 (7.147)

and that

|s1(x)| ≤ δ. (7.148)

Let s2(x) = −s1(x) and argue as in the proof of Theorem 7.1 in order to

obtain (7.144) and (7.145).

7.4 Stable summation of orthogonal series and integrals

with randomly perturbed coefficients

7.4.1 Introduction

Consider an orthogonal series

f(x) =

∞∑

j=1

cjφj(x), x ∈ D ⊂ Rr, (7.149)

Page 196: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 183

where

(φj, φm) :=

D

φj(x)φ∗m(x)dx = δjm (7.150)

and

cj := (f, φj). (7.151)

Suppose that the data

bj := cj + εj, 1 ≤ j < ∞ (7.152)

are given, that is the Fourier coefficients of f are known with some errors

εj. Assume that

εj = 0, ε∗jεm = σ2δjm, σ = const > 0. (7.153)

The problem is: given the data (7.152), (7.153), estimate f(x).

From the point of view of systems theory, one can interpret this prob-

lem as follows. Suppose that the system’s response to the signal φj(x) is

Kjφj(x), where Kj is a generalized transmission coefficient of the system.

For example, if ω is a continuous analogue of j and φj(x) = exp(iωx)

then K(iω) is the usual transmission coefficient of the linear system. If

there is a noise at the output of the system then one actually receives∑∞j=1(Kjcj + εj)φj(x) at the output, where εj is the noise comlponent

corresponding to the j-th generalized harmonic φj(x).

Let us consider two methods for solving the problem. These methods

are easy to use in practice.

The first method is to define

fN :=

N∑

j=1

bjφj(x) (7.154)

and to choose N = N (σ) so that

‖ fN(σ)(x) − f(x) ‖2 = min, ‖ f ‖2:=

D

|f |2dx. (7.155)

The second method is to define

g(x) :=

∞∑

j=1

ρj(ν)bjφj(x), (7.156)

Page 197: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

184 Random Fields Estimation Theory

where ν > 0 is a parameter, and to choose the multipliers ρj(ν) so that

‖ g − f ‖2 = min . (7.157)

The same problem can be formulated for orthogonal integrals that is for

continuous analogues of orthogonal series:

f(x) =

∫ ∞

−∞c(λ)φ(x, λ)dλ (7.158)

and∫

Rrφ(x, λ)φ∗(x, λ′)dx = δ(λ− λ′). (7.159)

We assume that b(λ) are given, b(λ) = c(λ)+ ε(λ), ε∗(λ)ε(λ′) = σ2δ(λ−λ′)and the problem is to estimate f(x). This can be done in the same way as

for the problem for series.

7.4.2 Stable summation of series

Let us consider the first method. Assume that

|φj(x)| ≤ c, |cj| ≤ Aj−a, a >1

2, (7.160)

where c and A are positive constants which do not depend on x and j.

Then

‖ fN − f ‖2 =

N∑

j,j′=1

εjε∗j′(φj , φj′) +

∞∑

j=N+1

|cj|2

≤ σ2N + A2∞∑

j=N+1

j−2a

≤ σ2N + A2N−2a+1

2a− 1:= γ(N ), a >

1

2. (7.161)

Here we used (7.150), (7.152), (7.153) and (7.160). Let us find Nm for

which γ(N ) = min, σ and A assumed fixed. One has

N (σ) := Nm =

(2a

2a− 1

)1/(2a)(A

σ

)1/a

(7.162)

and

γm := γ(Nm) = constA1/aσ(2a−1)/a, (7.163)

Page 198: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 185

where const depends on a but not on A and σ. We have proved

Proposition 7.1 If N (σ) is given by (7.162) then

‖ fN(σ)(x) − f(x) ‖2 ≤ constA1/aσ(2a−1)/a. (7.164)

Therefore formula (7.154) with N = N (σ) gives an estimate of f(x)

such that the error of this estimate goes to zero according to (7.164) as

σ → 0.

7.4.3 Method of multipliers

Let us consider the second method. Take

ρj(ν) := exp(−νj). (7.165)

These are multipliers of convergence used in Abel’s summation of series.

Then

J :=

∥∥∥∥∥∥

∞∑

j=1

exp(−νj)bjφj(x) −∞∑

j=1

cjφj(x)

∥∥∥∥∥∥

2

=

∞∑

j=1

|1 − exp(−jν)|2|cj|2 + σ2∞∑

j=1

exp(−2jν)

≤ A2∞∑

j=1

|1 − exp(−jν)|2j2a

+ σ2 exp(−2ν)

1 − exp(−2ν). (7.166)

For fixed A and σ one can find νm which minimizes the right side of

(7.166). If γm is the minimum of the right side of (7.166), then the error

estimate of the method is

J ≤ γm. (7.167)

One can see that γm → 0 as σ → 0.

7.5 Resolution ability of linear systems

7.5.1 Introduction

Let us briefly discuss the notions of resolution ability of a linear system.

In optics Rayleigh gave an intuitive definition of resolution ability: if one

has two bright points [δ(x− a) + δ(x+ a)]/2 as the input signal and if the

Page 199: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

186 Random Fields Estimation Theory

optical system is described by the transmission function h(x, y), so that the

output signal is [h(x, a) + h(x,−a)]/2 then the two points can be resolved

according to the Rayleigh criterion if [h(0, a) + h(0,−a)]/2<∼0.8h(0).

Note that we took the signal [δ(x− a) + δ(x + a)]/2 rather than δ(x −a) + δ(x + a) in order to compare this signal with a bright point at the

origin δ(x). The sum of the coefficients in front of delta-functions should

be therefore equal to 1, the coefficient in front of δ(x). The factor 0.8 in the

Rayleigh criterion is an empirical one. The transmission function h(x, y) is

defined as follows: if sin(x) is the input signal then the output signal of the

linear system is given by

sout(x) =

D

h(x, y)sin(y)dy. (7.168)

The domainD in optics is usually the input pupil of the system, or its input

vision domain. The optical system is called isoplanatic if h(x, y) = h(x−y).In describing the Rayleigh criterion we assume that h(x, y) has absolute

maximum at x = y, that the distance 2a between two points is small, so

that both points lie in the region near origin in which h(x, y) is positive.

Suppose now that

s1(x) = δ(x−a)+δ(x+a)2 , (7.169)

s0(x) = δ(x), (7.170)

and one observes the signal

Uj(x) =

D

h(x, y)sj (y)dy + n(x), (7.171)

where n(x) is the output Gaussian noise, n = 0, |n|2 = σ2 < ∞, and j = 0

(hypothesis H0) or j = 1 (hypothesis H1).

The problem:

given the observed signal Uj(x) decide whether H0 or H1 occured.

(7.172)

If with the probability 1 one can make a correct decision no matter how

small a > 0 is, then one says that the resolution ability of the system in the

sense of Rayleigh is infinite. The traditional intuition (which says that, for

a fixed size of the input pupil, an optical system can resolve the distances of

order of magnitude of the wavelength) is based on the calculation of diffrac-

tion of a plane wave by a circular hole in a plane. In [R 6)] it was proved

that, in the absence of noise, the transmission function of a linear system

Page 200: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 187

can be made as close to the δ(x−y) as one wishes, by means of apodization.

This means that there is a sequence hm(x, y) of the transmission functions

which is a delta sequence in the sense that

D

hm(x, y)s(y)dy → s(x) as m → ∞

for any continuous s(y).

7.5.2 Resolution ability of linear systems

In this section we apply the theory developed in Section 7.2 in order to show

that there exists a linear system, or what is the same for our purposes, the

transmission function h(x, y) such that the resolution ability of this system

in the sense of Rayleigh is infinite. More precisely, we will formulate a

decision rule for discriminating between the hypoteses H0 and H1 such

that the error of this rule can be made as small as possible. The error of

the rule is defined to be

αm := P (γ1

∣∣H0) (7.173)

that is, the probability to decide that hypothesis H1 occured when in fact

H0 occured. The meaning of the parameter m, the subscript of α, will

be made clear shortly. This parameter is associated with the sequence of

linear systems whose resolution ability increases without limit.

First let us choose hm(x, y) so that the sequences

hm(x, z) =

D

hm(x, y)δ(y − z)dy := δm(x, z) (7.174)

are delta-sequences. Then, by formula (4), the observed signals became

U1(x) =δm(x, a) + δm(x,−a)

2+ n(x) := s1(x) + n(x) (7.175)

or

U0(x) = δm(x, 0) + n(x) := s0(x) + n(x). (7.176)

Let us apply to the problem (7.172) the decision rule based on formula

(7.41).

Page 201: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

188 Random Fields Estimation Theory

First one should solve the equation (7.29):

RV :=

D

R(x, y)V (y)dy =δm(x, a) + δm(x,−a)

2− δm(x, 0)

= s1(x) − s0(x) := f, (7.177)

where R(x, y) := n∗(x)n(y) is the covariance function of noise. We assume

that R(x, y) ∈ R, and, for simplicity, that P (λ) = 1. In this case R(x, y)

solves the equation

Q(L)R(x, y) = δ(x− y). (7.178)

We also assume that δm(x, y) is negligibly small for |x− y| > |a|2

, and that

points 0, a, and −a are inside of D and

ρ(0,Γ) > |a|, ρ(a,Γ) > |a|, ρ(−a,Γ) > |a|, (7.179)

where ρ(x,Γ) is the distance between point x and Γ = ∂D. In this case one

can neglect the singular boundary term of the solution to equation (7.177)

and write this solution as

V (x) = Q(L)f. (7.180)

Let us write inequality (7.39):

Re

D

U(x)Q(L)f∗(x)dx ≥ 1

2

D

δm(x, 0)Q(L)f∗(x)dx

+1

2

D

δ∗m(x, a) + δ∗m(x,−a)2

Q(L)f(x)dx,

(7.181)

where we assume that the coefficients of Q(L) are real. Otherwise one

would write [Q(L)f ]∗ in place of Q(L)f∗.Let us write the expression (7.173) for αm:

αm := P (γ1

∣∣ H0) = P

2Re

D

U0(x)V∗dx ≥

D

(s0V∗ + s∗1V )dx

,

(7.182)

where V , s1 and s0 are given by (7.180), (7.175) and (7.176) respectively.

Page 202: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 189

It follows from (7.182) that

αm = P

2Re

D

n(x)V ∗(x)dx ≥∫

D

(s∗1 − s∗0)Q(L)(s1 − s0)dx

= P

2Re

D

n(x)V ∗(x)dx ≥∫

D

(s∗1Q(L)s1 + s∗0Q(L)s0)dx

.

(7.183)

Here we took into account that∫

D

s∗jQ(L)sidx = 0 for i 6= j (7.184)

because of the assumption that for m sufficiently large the functions S1(x)

and S0(x) have practically nonintersecting supports: each of them with

derivatives of order sq is negligibly small in the region where the other is

not small. One can write (7.183) as

αm = P

Re

D

n(x)V ∗(x)dx ≥ 3

2Am

, (7.185)

where we denote by Am the positive quantity of the type

Am =

D

δm(x, 0)Q(L)δ∗m(x, 0)dx, (7.186)

and one can write in place of δm(x, 0) the functions δm(x, a) or δm(x,−a).The basic property of Am is

Am → +∞ as m → ∞. (7.187)

Indeed, the elliptic operator Q(L) is positive definite on Hsq/2(Rr), and

one can assume that δm(x, 0) ∈ Hsq/2(Ba), Ba := x : x ∈ Rr , |x| ≤ a.Therefore

D

δm(x, 0)Q(L)δ∗m(x, 0)dx ≥ c

D

|δm(x, 0)|2dx→ +∞. (7.188)

Here c is a positive constant which does not depend on δm(x, 0) (it depends

on Q(L) only) and the integral in (7.188) tends to infinity because δm(x, 0)

is a delta-sequence by construction (see formula (7.174) and the line below

it).

Let us apply the Chebyshev inequality to get:

P

(∣∣∣∣∫

D

h(x)V ∗(x)dx

∣∣∣∣ ≥ Am

)≤ D

[∫D n(x)V ∗(x)dx

]

A2m

. (7.189)

Page 203: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

190 Random Fields Estimation Theory

Here we took into account that n = 0, and D[n] stands for the variance of

random quantity n. One has

D[∫

D

n(x)V ∗(x)dx

]=

D

D

R(x, y)V (x)V ∗(y)dxdy

=

D

D

R(x, y)Q(L)f(x)Q(L)f∗(y)dxdy

=

D

(∫

D

Q(L)R(x, y)f(x)dx

)Q(L)f∗(y)dy

=

D

(∫

D

δ(x− y)f(x)dx

)Q(L)f∗(y)dy

=

D

f(y)Q(L)f∗ (y)dy =3

2Am. (7.190)

Here we used definition (7.177) of f , formula (7.186) and equalities of

the type (7.184) for the functions δm(x, 0), δm(x, a) and δm(x,−a). From

(7.189) and (7.190) it follows that

P

∣∣∣∣∫

D

n(x)V ∗(x)dx

∣∣∣∣ ≥ Am

≤ 3

2Am→ 0 as βm → +∞. (7.191)

If X is an arbitrary random variable then

P ReX ≥ Am ≤ P |X| ≥ Am . (7.192)

From (7.185), (7.187), (7.191) and (7.192) it follows that

αm ≤ 2

3Am→ 0 as m → ∞. (7.193)

Let us compute the probability to take the decision that the hypothesis H0

occurred while in fact H1 occurred. We have

βm := P (γ0

∣∣ H1) = P

2Re

D

U1V∗dx ≤

D

(s0V∗ + s∗1V )dx

= P

2Re

D

nV ∗dx ≤∫

D

(s0V∗ − s1V

∗)dx

= P

Re

D

nV ∗dx ≤ −1

2

D

fQ(L)f∗dx

= P

Re

D

nV ∗dx ≤ −3

4Am

, (7.194)

where we used formula (7.190).

Page 204: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 191

If ReX < −A then |X| > A. Therefore

P (|X| ≥ A) ≥ P ReX ≤ −A . (7.195)

From (7.191), (7.194) and (7.195) one concludes that

βm ≤ P

∣∣∣∣∫

D

hV ∗dx

∣∣∣∣ ≥3

4Am

≤ 3

2

16

9Am=

8

3Am→ 0 as m → ∞.

(7.196)

We have proved the following

Theorem 7.3 The problem (7.172) can be solved by the decision rule

from Section 7.2 and αm → 0, βm → 0, where αm is defined in (7.182) and

βm is defined in (7.194).

7.5.3 Optimization of resolution ability

In this section we give a theory of optimization of resolution ability of linear

optical instruments.

Let us consider the resolution ability for the problem of discriminat-

ing between two arbitrary signals s1(x) and s0(x) which are deterministic

functions of x ∈ Rr. The observed signal in a bounded domain D ⊂ Rr is

U(x) = sj(x) + n(x), j = 0 or j = 1, (7.197)

n(x) is Gaussian noise,

n = 0, D[n] = σ2, n∗(x)n(y) = R(x− y). (7.198)

Assume that the linear optical instrument is isoplanatic. This means that

its transmission function, h(x, y), is h(x− y). In optics r = 2 and D is the

input pupil, or entrance pupil, of the instrument.

Consider the case of incoherent signals. For incoherent signals the trans-

mission power function is |h(x−y)|2. This means that if Iin(x) := |s(x)|2 is

the intensity of the signal s(x) in the object plane then in the image plane

one has the distribution of the intensity Iout(x) given by

Iout(x) =

∫|h(x− y)|2Iin(y)dy,

∫=

R2

. (7.199)

The bar denotes averaging in phases. This will be explained soon. Let us

Page 205: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

192 Random Fields Estimation Theory

briefly derive (7.199). One has

Iout(x) =

∣∣∣∣∫h(x− y)sin (y)dy

∣∣∣∣2

=

∫∫h(x− y)h∗(x− y′)s∗in(y′)dydy′.

(7.200)

Let us now explain the meaning of the average. For incoherent signals by

definition we have

sin(y) =∑

j

Aj(y)eiφj (y), (7.201)

where Aj(y)eiφj are sources which form the signal sin(y) at the point y,

and φj are their phases which assumed random, uniformly distributed in

the interval [−π, π] and statistically independent, so that φj(y)φj (y′) =

δjj′δ(y − y′) φj = 0. Under these assumptions one has

sin(y)s∗in(y′) =∑

j,j

Aj(y)A∗j′ (y

′exp i[φj(y) − φj′(y′)]

=∑

j,j′

Aj(y)A∗j′ (y

′)δjj′δ(y − y′) =∑

j

|Aj(y)|2δ(y − y′)

= Iin(y)δ(y − y′). (7.202)

Here we took into account that

Iin(y) = |sin(y)|2 =∑

j,j′

Aj(y)A∗j′ (y)exp i[φj(y) − φj′(y)] =

j

|Aj(y)|2.

(7.203)

From (7.202) and (7.200) one obtains (7.199). Denote

H(x) := |h(x)|2 ≥ 0, (7.204)

where h(x) is the transmission function for an isoplanatic linear instrument.

Then

Iout(x) =

∫H(x− y)Iin(y)dy,

∫:=

R2

(7.205)

for incoherent signals. Let us assume that∫H2(x)dx := ε < ∞. (7.206)

One often assumes in applications that

H(λ) is negligible outside ∆, (7.207)

Page 206: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 193

where ∆ ⊂ R2 is a finite region. The assumption (7.178) means that the

instrument filters out the spatial frequencies which do not belong to ∆.

Let us finally assume that the noise is a real-valued function which is a

perturbation of the observed intensity in the image plane. One can think

of n(x), for example, as of the noise of the receiver of the intensity.

The problem is: given the observed signal

U(x) = Ij(x) + n(x), j = 1 or j = 0,

Ij(x) =

∫H(x− y)sj (y)dy, sj(x) := Iinj(x) (7.208)

decide whether it is of the form (7.179) with j = 1 (hypothesis H1) or with

j = 0 (hypothesis H0).

Applying the decision rule from Section 7.2 and taking into account that

the signals are real valued, one solves the equation (7.29)

D

R(x, y)V (y)dy = I1(x) − I0(x) := I(x) (7.209)

and then checks inequality (7.39) with real-valued signals:

D

U(x)V (x)dx ≥ 1

2

D

[I0(x) + I1(x)]V (x)dx. (7.210)

If (7.210) holds then the decision is that hypothesis H1 occurred. Oth-

erwise one concludes that hypothesis H0 occurred.

The error of the first kind of this rule is the probability to decide that

H1 occurred when in fact H0 occurred:

α := P (γ1

∣∣ H0)

= P

D

[I0(x) + n(x)]V (x)dx ≥ 1

2

D

[I0(x) + I1(x)]V (x)dx

= P

D

n(x)V (x)dx ≥ 1

2

D

I(x)V (x)dx

, (7.211)

where I(x) is given by (7.209). Note that

D

n(x)V dx = 0, (7.212)

Page 207: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

194 Random Fields Estimation Theory

d2 := D[∫

D

n(x)V (x)dx

]=

D

D

R(x, y)V (y)V (x)dydx. (7.213)

We assume here that R(x, y) and V (x) are real-valued. Since∫Dn(x)V (x)dx is Gaussian (because n(x) is) one concludes from (7.211),

(7.212) and (7.213) that

α =1

d√

∫ ∞

d2/2

exp

(− t2

2d2

)dt = erf(d/2), (7.214)

where erf(x) is defined in (7.61) and we used the equation

d2 =

D

I(x)V (x)dx (7.215)

which can be easily checked:

d2 :=

D

(∫

D

R(x, y)V (y)dy

)V (x)dx =

D

I(x)V (x)dx

because of (7.209).

Therefore

α = min for those H(x) for which d = max . (7.216)

For the error of the second kind which is the probability to decide that

H0 occurred while in fact H1 occurred one has

β := P (γ0

∣∣ H1) = P

D

(I1 + n)V dx <1

2

D

(I0 + I1)V dx

= P

D

nV dx < −1

2

D

I(x)V (x)dx

=1

d√

∫ −d2/2

−∞exp

(− t2

2d2

)dt

=1√2π

∫ −d/2

−∞exp

(− t

2

2

)dt = erf(d/2) = α. (7.217)

Therefore both α and β = α will be minimized if H(x) solves the following

optimization problem

d2 :=

D

D

R(x, y)V (y)V (x)dydx = max (7.218)

Page 208: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 195

subject to the conditions∫

D

R(x, y)V (y)dy = I1(x) − I0(x) := I(x) (7.219)

and

I(x) =

∫H(x− y)s(x)dx, s(x) := s1(x) − s0(x). (7.220)

Let us assume for simplicity that D is the whole plane:

D = R2. (7.221)

Then, taking the Fourier transform of (7.219) and (7.220) yields

R(λ)V (λ) = I(λ) = H(λ)s(λ). (7.222)

Here the last assumption (7.198) was used. Thus

V (λ) = H(λ)s(λ)R−1(λ). (7.223)

Write (51) as

d2 =

R2

I(x)V (x)dx =1

(2π)2

R2

R−1|H(λ)|2|s|2dλ = max (7.224)

and (7.206) as

ε = (2π)−2

R2

|H|2dλ. (7.225)

The instrument with the power transmission function H(x) which maxi-

mizes functional (7.224) under the restriction (7.225) will have the maxi-

mum resolution power for the problem of discriminating two given signals

s1(x) and s0(x).

The solution to (7.224) is easy to find: |H|2 should be parallel to the

vector R−1|s|2, so

|H|2 = R−1|s|2 · const, (7.226)

where the constant is uniquely determined by condition (7.225):

|H(λ)|2 = R−1(λ)|s(λ)|2 · ε1/2

12π

∫R2 R−1|s|2dλ

. (7.227)

Formula (60) determines uniquely |H| := A(λ), so that

H(λ) = A(λ) exp[iφ(λ)], (7.228)

Page 209: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

196 Random Fields Estimation Theory

where φ(λ) is the unknown phase of H(λ):∫

R2

H(x) exp(iλ · x)dx = A(λ) exp[iφ(λ)], H(x) ≥ 0. (7.229)

Formula (7.224) shows that the resolution power of the optimal instrument

depends on A(λ) = |H(λ)| only. The phase φ(λ) does not influence the

resolution ability but influences the size of the region D out of which H(x)

is negligibly small. If one writes the equation∫

D

H(x) exp(iλ · x)dx = A(λ) exp[iφ(λ)], H(x) ≥ 0 (7.230)

and consider it as an equation forH(x) and φ(λ), givenA(λ) and argH(x) =

0, then one has a phase retrieval problem. In (7.230) D ⊂ R2 is assumed to

be a finite region with a smooth boundary. This problem has been studied

in [Kl], where some uniqueness theorems are established. However, the

numerical solution to this problem has not been studied sufficiently.

The condition (7.206) does not seem to have physical meaning. One

could assume that∫H(x)dx = E. (7.231)

In this case the const in (7.226) cannot be found explicitly, in contrast to

the case when condition (7.206) is assumed. If (7.231) is assumed then it

follows from (7.176) that

Iout(x) ≤ maxy∈R2

Iin(y)E. (7.232)

7.5.4 A general definition of resolution ability

In this section a general definition of resolution ability is suggested. The

classical Rayleigh definition deals with very special signals: two bright

points versus one bright point. Suppose that the set M of signals, which one

wishes to resolve is rather large. For example, one can assume that M con-

sists of all functions belonging to L2(D) or to C10 (D), the space of functions

which have one continuous derivative in a bounded domain D ⊂ Rr and

are compactly supported in D. Suppose that the linear system is described

by its transmission function

Lf =

D

h(x, y)f(y)dy. (7.233)

Page 210: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 197

Assume that actually one observes the signal

U(x) = Lf + n(x), n = 0, D[n] = σ2, (7.234)

where n(x) is noise. Let

BσU := fσ (7.235)

denote a mapping which recovers f given U(x). Let us assume that the

operator L−1 exists but is unbounded. Then in the absence of noise one

can recover f exactly by the formula f = L−1U , but L−1 cannot serve as

Bσ in (7.235): first, because in the presence of noise U may not belong to

the domain of L−1, secondly, because if L−1U is well defined in the presence

of noise it may give a very poor estimate of f due to the fact that L−1 is

unbounded. Let us define the resolution ability of the procedure Bσ on the

class M of input signals as a maximal number r ∈ [0, 1] for which

limσ→0

supf∈M

D[BσU − f ]σr

< ∞. (7.236)

This definition takes into account the class M of the input signals, the

procedure B for estimating f , and the properties of the system (see the

definition (7.234) of U). Therefore all the essential data are incorporated in

the definition. The definition makes sense for nonlinear injective mappings

L as well.

Roughly speaking the idea of this definition is as follows. If the error

in the input signal is O(σ) and the identification procedure Bσ produces

fσ = BU such that, in some suitable norm, ‖ fσ − f ‖= O(σr) as σ → 0,

then the large r is, the better the resolution ability of the procedure Bσ is.

Example 7.5 Assume that the transmission function is

h(x− y) =

(2

π

)1/2sin(x− y)

x− y. (7.237)

Note that

h(λ) :=1√2π

∫ ∞

−∞h(x)e−iλxdx =

1 |λ| ≤ 1

0 |λ| > 1.(7.238)

Let

BσU =1√2π

∫ ∞

−∞

U (λ) exp(iλx)dλ

h(λ) + a(σ), (7.239)

Page 211: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

198 Random Fields Estimation Theory

where a(σ) > 0 for σ > 0 will be chosen later. Let M be the set of functions

f(x) ∈ L2(−∞,∞) such that

‖ f ‖L2[−1,1]≤ m, f(λ) = 0 for |λ| > 1, (7.240)

where m > 0 is a constant. Assume that n(x) is an arbitrary function such

that

n(x) ∈ L2(−∞,∞), ‖ n ‖:=(∫ ∞

−∞|h(x)|2dx

)1/2

≤ σ. (7.241)

By Parseval’s equality, one has

‖ BσU − f ‖2 =

∥∥∥∥∥hf + n

a(σ) + h− f

∥∥∥∥∥

2

=

∥∥∥∥∥n − a(σ)f

a(σ) + h

∥∥∥∥∥

2

≤ 2

‖ h ‖2

a2(σ)+ a2(σ)

∥∥∥∥∥f

h

∥∥∥∥∥

2 ≤ 2

(σ2

a2(σ)+ a2(σ)m

).

(7.242)

Choose

a(σ) = σ1/2. (7.243)

Then (7.242) and (7.243) yield

‖ BσU − f ‖≤ 2(1 +m)σ. (7.244)

Therefore r = 1 for the procedure Bσ defined by formula (7.239):

limσ→0

supf∈M

‖ BσU − f ‖σ

≤ 2(1 +m) <∞. (7.245)

7.6 Ill-posed problems and estimation theory

7.6.1 Introduction

In this section we define the notion of ill-posed problem and give examples

of ill-posed problems. Many problems of practical interest can be reduced

to solving an operator equation

Au = f, (7.246)

Page 212: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 199

where u ∈ U , f ∈ F , A : U → F , U and F are Banach spaces, A is an

injective mapping with discontinuous inverse and domain D(A) ⊂ U . Such

problems are called ill-posed because small perturbations of f may lead

to large perturbations of the solution u due to the fact that A−1 is not

continuous. In many cases R(A), the range of A, is not the whole F . In

this case small perturbations of f may lead to an equation which does not

have a solution due to the fact that the perturbed f does not belong to

R(A).

The problem (7.246) is said to be well-posed in the sense of Hadamard if

A : D(A) → F is injective, surjective, that is R(A) = F , and A−1 : F → U

is continuous.

The formulation of an ill-posed problem (7.246) can often be given as

follows. Assume that the data are δ, A, fδ, where δ > 0 is a given number,

fδ is a δ-approximation of f in the sense

‖ f − fδ ‖≤ δ. (7.247)

The problem is: given the data δ, A, fδ, find uδ ∈ U such that

‖ uδ − u ‖→ 0 as δ → 0. (7.248)

We use ‖ · ‖ for norms in U and F .

A more general formulation of the ill-posed problem is the one when

the data are δ, η, Aη, fδ where δ and fδ are as above, η > 0 is a positive

number and Aη is an approximation of A in a suitable sense, for example,

if A is a linear bounded operator one can assume ‖ Aη − A ‖≤ η. We will

not discuss this more general formulation here. (See [I], for example.)

Ill-posed problems can be formulated as estimation problems. For ex-

ample, suppose that A is a linear operator, u solves equation (7.246), and

one knows a randomly perturbed right hand side of (7.246), namely

f + n, (7.249)

where n is a random variable,

n = 0, n∗(x)n(y) = σ2δ(x− y). (7.250)

The problem is to estimate u given the data (7.249), (7.250).

Let us give a few examples of ill-posed problems of interest in applica-

tions.

Example 7.6 Numerical differentiation.

Page 213: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

200 Random Fields Estimation Theory

Let

Au :=

∫ x

a

u(t)dt = f(x). (7.251)

Assume that δ > 0 and fδ are given such that ‖ fδ − f ‖≤ δ. Here ‖ f ‖=maxa≤x≤b |f(x)|, F = U = C([a, b]). The problem (7.251) is ill-posed.

Indeed, the linear operator A is injective: if Au = 0 then u = 0. Its range

consists of the functions f ∈ C1[a, b] such that f(a) = 0. Therefore equation

(7.251) with fδ in place of f has no solutions in C[a, b] if fδ 6∈ C1[a, b] or

fδ(0) 6= 0. If one takes fδ = f+δ sin[ω(x−a)], then the solution to equation

(7.251) with fδ in place of f exists: uδ = f ′+δω cos[ω(x−a)]. Since u = f ′

one has ‖ uδ − u ‖= δω 1 if ω is sufficiently large. Therefore, a small

in the norm of U perturbation of f resulted in a large in the same norm

perturbation of u. Therefore the formula uδ = A−1fδ = f ′δ in this example

does not satisfy condition (7.248). In Section 7.3 a stable solution to the

problem (7.251) is given:

uδ(x) =fδ(x+ h(δ)) − fδ(x− h(δ))

2h(δ). (7.252)

It is proved that

‖ f ′ − uδ ‖=‖ u− uδ ‖≤ ε(δ) → 0 as δ → 0, (7.253)

where h(δ) and ε(δ) are given by formulas (7.67) (see Theorem 7.1). For-

mula (7.252) should be modified for x < a + h(δ) and x > b − h(δ) (see

(7.107)).

Remark 7.2 The notion of ill-posedness depends on the topology of F .

For example, if one considers as F the space C1[a, b] of functions, which

satisfy condition A(0) = 0, then problem (7.251) is well posed and A is an

isomorphism of C[a, b] onto F .

Example 7.7 Stable summation of orthogonal series with perturbed co-

efficients.

Let

u(x) =

∞∑

j=1

cjφj(x), x ∈ D ⊂ Rr. (7.254)

Suppose (φj, φm) = δjm is a basis of L2(D), the parentheses denote the

inner product in L2(D). Assume that u ∈ L2(D). This happens if and only

Page 214: Publishers’ pageramm/papers/486.pdf · that is, random functions of one variable. Random elds are random func-tions of several variables. Wiener’s theory was based on the analytical

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Applications 201

if∞∑

j=1

|cj|2 < ∞. (7.255)

Suppose the perturbed coefficients are given

cjδ = cj + εj , |εj| ≤ δ. (7.256)

The problem is: given the data cjδ and δ > 0, find uδ such that (7.248)

holds, the norm in (7.248) being L2(D) norm.

Consider the mapA : L2(D) → `∞ which sends a function u(x) ∈ L2(D)

into a sequence c = cj = (c1, . . . , cj, . . .) ∈ `∞ by formula (7.254). Since,

in general, the perturbed sequence cδ = cjδ 6∈ `2, the series∑∞

j=1 cjδφj(x)

diverges in L2(D), so that the perturbed sequence cjδ may not belong

to the range of A. It is easy to give examples when cjδ ∈ R(A) but the

function uδ :=∑∞

j=1 cjδφj(x) differs from u in L2(D) norm as much as one

wishes no matter how small δ > 0 is.

Exercise. Construct such an example:

Therefore A−1 is unbounded from `∞ into L2(D), and the problem is

ill-posed.

Note that if one changes the topology on the set of sequences cjfrom `∞ to `2 then the problem becomes well-posed and the operator A :

L2(D) → `2 is an isomorphsm.

A stable solution to the problem is given in Section 7.4.
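Section 7.4 is not reproduced here; the sketch below only illustrates one common stabilization (an assumption of this sketch, not necessarily the book's rule): truncate the series at N(δ) terms with N(δ) → ∞ and N(δ)δ² → 0, so that the error from the perturbed coefficients stays controlled while the tail of the true series vanishes.

```python
import numpy as np

def stable_sum(c_delta, phi, x, delta):
    """Stable summation of an orthonormal series with perturbed coefficients.
    c_delta : perturbed coefficients, |c_{j,delta} - c_j| <= delta
    phi     : callable phi(j, x) returning the j-th basis function (1-based index)."""
    N = max(1, int(delta ** (-0.5)))     # then N * delta**2 = delta**1.5 -> 0 as delta -> 0
    return sum(c_delta[j - 1] * phi(j, x) for j in range(1, N + 1))

# example on D = (0, pi) with the sine basis
phi = lambda j, x: np.sqrt(2 / np.pi) * np.sin(j * x)
c = np.array([1.0 / j**2 for j in range(1, 2001)])          # true coefficients, u in L^2
delta = 1e-3
c_delta = c + delta * np.random.default_rng(1).uniform(-1, 1, c.size)
print(stable_sum(c_delta, phi, np.array([1.0]), delta))
```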

Example 7.8 Integral equation of the first kind.

Let A be an integral operator A : H → H, H = L2(D),

Au = ∫_D A(x, y) u(y) dy = f(x), x ∈ D. (7.257)

If A is injective, that is, Au = 0 implies u = 0, and A is compact, then

A−1 is not continuous in H, R(A) is not closed in H, so that small in H

perturbations of f may result in large perturbations of u, or may lead to

an equation which has no solutions in H (this happens if the perturbed f

does not belong to R(A)). Therefore, the problem (7.257) is ill-posed.

Example 7.9 Computation of values of unbounded operators.

Suppose B : F → U is an unbounded linear operator densely defined in

F (that is, its domain of definitionD(B) is dense in F ). Suppose f ∈ D(B),


Bf = u. Assume that instead of f we are given a number δ > 0 and fδ such that ‖f − fδ‖ ≤ δ.

The problem is: given fδ and B, compute uδ such that ‖uδ − u‖ = ‖uδ − Bf‖ → 0 as δ → 0.

This problem is ill-posed. If A−1 is unbounded and B = A−1, the

problem (7.257) reduces to the above problem.

Example 7.10 Analytic continuation.

Suppose f(z) is analytic in a bounded domain D of the complex plane

and continuous in D. Assume that D1 ⊂ D is a strictly inner subdomain

of D.

The problem is: given f(z) in D1 find f(z) in D.

By Cauchy's formula one has

(1/(2πi)) ∫_{∂D} f(t) dt/(t − z) = f(z), z ∈ D1. (7.258)

This is an integral equation of the first kind for the unknown function f(t)

on ∂D. If f(t) is found then f(z) in D is determined by Cauchy’s formula.

Therefore the problem of analytic continuation from the subdomain D1 to

the domain D is ill-posed.

Example 7.11 Identification problems.

Let s_in(x) be the input signal and s_out(x) be the output signal of a linear

system with the transmission function h(x, y), that is

∫_D h(x, y) s_in(y) dy = s_out(x), x ∈ ∆, (7.259)

where D and ∆ are bounded domains in Rr, and the function h is contin-

uous in D × ∆.

The identification problem is: given s_out(x) and h(x, y), find s_in(y).

Equation (7.259) is an integral equation of the first kind. Therefore the

above problem is ill-posed.


Example 7.12 Many inverse problems arising in physics are ill-posed.

For example, the inverse scattering problem in three dimensions. The prob-

lem consists of finding the potential from the given scattering amplitude

(see [R 26] and Chapter 6). Inverse problems of geophysics are often ill-

posed [R 2].

Example 7.13 Ill-posed problems in linear algebra.

Consider equation (7.246) with U = Rn and F = Rm, Rm is the m-

dimensional Euclidean space. Let N(A) = {u : Au = 0} be the null-space of A, and R(A) be the range of A. If N(A) ≠ {0}, define the normal solution

u0 to (7.246) as the solution orthogonal to N (A):

Au0 = f, u0 ⊥ N (A). (7.260)

This solution is unique: if ũ0 is another solution to (7.260) then

A(u0 − ũ0) = 0, u0 − ũ0 ⊥ N(A).

This implies that ũ0 = u0. One can prove that the normal solution can

be defined as the solution to (7.246) with the property that its norm is

minimal:

min ‖ u ‖=‖ u0 ‖, (7.261)

where minimum is taken over the set of all solutions to equation (7.246),

and the minimizer is unique: u = u0. Indeed, any element u ∈ Rn can be

uniquely represented as:

u = u0 ⊕ u1, u0 ⊥ N (A), u1 ∈ N (A) (7.262)

‖ u ‖2=‖ u0 ‖2 + ‖ u1 ‖2 . (7.263)

If Au = f then Au0 = f , and (7.263) implies (7.261). Moreover, the

minimum is ‖ u0 ‖ and is attained if and only if u1 = 0.

The normal solution to the equation Au = f can be defined as the least

squares solution:

‖ Au− f ‖= min, u ⊥ N (A) (7.264)

in the case when f ∉ R(A). This solution exists and is unique. Existence

follows from the fact that minimum in (7.264) is attained at the element u

such that ‖ Au− f ‖= dist(f,R(A)) (note that R(A) is a closed subspace

of Rm). Uniqueness of the normal solution to (7.264) is proved as above.


Lemma 7.5 The problem of finding the normal solution to (7.246) is a well-posed problem in the sense of Hadamard: for any f ∈ Rm the normal solution u0 to equation (7.246) exists and is unique, and this solution depends continuously on f: if

Au0 = f, Au0δ = fδ, ‖ f − fδ ‖≤ δ (7.265)

then

‖u0 − u0δ‖ → 0 as δ → 0. (7.266)

Proof. Existence and uniqueness are proved above. Let us prove (7.266).

Let

f = f1 ⊕ f2, fδ = fδ1 ⊕ fδ2, (7.267)

where f1 ∈ R(A), f2 ⊥ R(A), fδ1 and fδ2 are defined similarly. One has

Au0 = f1, Au0δ = fδ1. (7.268)

The operator A : N (A)⊥ → R(A) is an isomorphism. Therefore A−1 :

R(A) → N (A)⊥ is continuous. Lemma 7.5 is proved.

Definition 7.2 The mapping A⁺ : f → u0 is called the pseudoinverse of A. The normal solution u0 is sometimes called a pseudosolution.
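Numerically, the normal solution can be obtained with standard linear-algebra routines; the small sketch below (purely illustrative) uses numpy's least-squares solver, which returns exactly the minimum-norm least-squares solution A⁺f.

```python
import numpy as np

# A : R^3 -> R^2 with a nontrivial null-space N(A)
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, so N(A) is 2-dimensional
f = np.array([1.0, 2.0])

# minimum-norm least-squares solution u0 = A^+ f  (normal solution / pseudosolution)
u0, *_ = np.linalg.lstsq(A, f, rcond=None)
u0_pinv = np.linalg.pinv(A) @ f          # the same element via the explicit pseudoinverse
print(u0, np.allclose(u0, u0_pinv))

# u0 is orthogonal to N(A): any n with A n = 0 satisfies (u0, n) = 0
n = np.array([2.0, -1.0, 0.0])           # A @ n = 0
print(np.allclose(A @ n, 0), abs(u0 @ n) < 1e-12)
```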

Although we have proved that the problem of finding normal solu-

tion to equation (7.246) is well-posed in the sense of Hadamard when

A : Rn → Rm, we wish to demonstrate that practically this problem should

be considered as ill-posed in many cases of interest.

As an example, take n = m, A : Rn → Rn, N (A) = 0, so that A is

injective and, by Fredholm’s alternative, A is an isomorphism of Rn onto

Rn. Consider the equations

Au = f, Auδ = fδ. (7.269)

Thus

A(u− uδ) = f − fδ , u− uδ = A−1(f − fδ).

Therefore

‖ u− uδ ‖≤‖ A−1 ‖‖ f − fδ ‖ . (7.270)

Since

‖ f ‖=‖ Au ‖≤‖ A ‖‖ u ‖ (7.271)


one obtains

‖u − uδ‖/‖u‖ ≤ ‖A⁻¹‖ ‖A‖ · ‖f − fδ‖/‖f‖. (7.272)

Define ν(A), the condition number of A, to be

ν(A) :=‖ A−1 ‖‖ A ‖ . (7.273)

Then (7.272) shows that the relative error of the solution can be large even if the relative error ‖fδ − f‖/‖f‖ of the data is small, provided that ν(A) is large.

Note that the inequality (7.272) is sharp in the sense that the equality

sign can be attained for some u, uδ, f and fδ .

The point is that if ν(A) is very large, then the problem of solving

equation (7.246) with A : Rn → Rn, with N (A) = 0, is practically ill-

posed in the sense that small relative perturbations of f may lead to large

relative perturbations of u.
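A quick numerical illustration of (7.272)-(7.273): the Hilbert matrix is a classical example with an enormous condition number, so a tiny relative perturbation of f can produce a large relative perturbation of u (the particular matrix and perturbation below are, of course, only an illustration).

```python
import numpy as np
from scipy.linalg import hilbert

n = 10
A = hilbert(n)                          # notoriously ill-conditioned
print("nu(A) =", np.linalg.cond(A))     # of order 1e13

u_true = np.ones(n)
f = A @ u_true

rng = np.random.default_rng(0)
df = 1e-10 * rng.standard_normal(n)     # tiny data perturbation
u_pert = np.linalg.solve(A, f + df)

rel_data = np.linalg.norm(df) / np.linalg.norm(f)
rel_sol = np.linalg.norm(u_pert - u_true) / np.linalg.norm(u_true)
print(rel_data, rel_sol)                # rel_sol exceeds rel_data by many orders of magnitude
```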

7.6.2 Stable solution of ill-posed problems

In this section we sketch some methods for stable solution of ill-posed prob-

lems, that is, for finding uδ which satisfies (7.248).

First let us prove the following known lemma. By a compactum M ⊂ U

we mean a closed set such that any infinite sequence of its elements contains

a convergent subsequence.

Lemma 7.6 Let M ⊂ U be a compactum. Assume that A : M → N is

closed and injective, N := AM . Then A−1 : N →M is continuous.

Remark 7.3 We assume throughout that U and F are Banach spaces

but often the results and proofs are valid for more general topological spaces.

These details are not of prime interest for the theory developed in this work,

and we do not give the results in their most general form for this reason.

Proof. Let Aun = fn, fn ∈ N . Assume that

‖ fn − f ‖→ 0 as n → ∞. (7.274)

We wish to prove that f ∈ N , that is, there exists a u ∈ M such that

Au = f , and

‖ un − u ‖→ 0. (7.275)


Since un ∈ M and M is a compactum, there exists a convergent subse-

quence, which is denoted again un, with limit u, ‖ un − u ‖→ 0. Since M

is a compactum, it is closed. Therefore u ∈ M . Since un → u, Aun → f ,

and A is closed, one concludes that Au = f . Lemma 7.6 is proved.

This lemma shows that if one assumes a priori that the set of solutions to equation (7.246) belongs to a compactum M, then the operator A⁻¹

(which exists on R(A) since we assume A to be injective) is continuous on

the set N := AM .

Therefore an ill-posed problem which is considered under the condi-

tion f ∈ N becomes conditionally well posed. This leads to the following

definition.

Definition 7.3 A quasisolution of equation (7.246) on a compactum M

is the solution to the problem

‖ Au− f ‖= min, u ∈M. (7.276)

Here A : U → F is a linear bounded operator.

The functional ε(u) :=‖ Au−f ‖ is continuous and therefore it attains its

minimum on the compactum M . Thus, a quasisolution exists. In order to

prove its uniqueness and continuous dependence on f , one needs additional

assumptions. For example, one can prove

Theorem 7.4 If A is linear, bounded, and injective, M is a convex com-

pactum and F is strictly convex then, for any f ∈ F , the quasisolution

exists, is unique, and depends on f continuously.

The proof of Theorem 7.4 requires some preparation. Recall that F is

called strictly convex if and only if ‖ u + v ‖=‖ u ‖ + ‖ v ‖ implies that

v = λu for some constant λ.

Exercise. Prove that λ has to be positive.

The spaces Lp(D) and ℓp with 1 < p < ∞, and Hilbert spaces, are strictly convex, while L1(D), C(D) and ℓ1 are not strictly convex.

Definition 7.4 If g ∈ U is a vector and M ⊂ U is a set, then an element

h ∈ M is called the metric projection of g onto M if and only if ‖g − h‖ = inf_{u∈M} ‖g − u‖. The mapping P : g → h is called the metric projection

mapping, Pg = h, or PMg = h.

In general, Pg is a set of elements. Therefore the following lemma is of

interest.


Lemma 7.7 If U is strictly convex and M is convex then the metric

projection mapping onto M is single valued.

Proof. Suppose h1 ≠ h2, hj ∈ Pg, j = 1, 2. Then

m := ‖h1 − g‖ = ‖h2 − g‖ ≤ ‖u − g‖ ∀u ∈ M.

Since M is convex, (h1 + h2)/2 ∈ M. Thus

m ≤ ‖g − (h1 + h2)/2‖ ≤ (1/2)‖g − h1‖ + (1/2)‖g − h2‖ = m. (7.277)

Therefore, since U is strictly convex, one concludes that g−h1 = λ(g−h2),

λ is a real constant. Since ‖ g− h1 ‖=‖ g− h2 ‖, it follows that λ = ±1. If

λ = 1 then h1 = h2, contrary to the assumption. If λ = −1, then

g = (h1 + h2)/2. (7.278)

Since M is convex, equation (7.278) implies that g ∈ M . This is a contra-

diction since g ∈M implies Pg = g. Lemma 7.7 is proved.

Lemma 7.8 If U is strictly convex and M is convex then PM : U → M

is continuous.

Proof. Suppose ‖ gn − g ‖→ 0 but ‖ hn − h ‖≥ ε > 0, where hn = Pgn,

h = Pg. Since M is a compactum, one can assume that hn → h∞, h∞ ∈ M.

Thus ‖ h∞ − h ‖≥ ε > 0. One has

‖ g − h ‖≤‖ g − h∞ ‖ (7.279)

and

‖ g − h∞ ‖ ≤ ‖ g − gn ‖ + ‖ gn − hn ‖ + ‖ hn − h∞ ‖→ ‖ g − h ‖, n → ∞. (7.280)

Indeed ‖ g − gn ‖→ 0, ‖ hn − h∞ ‖→ 0, and

‖ gn − hn ‖= dist(gn,M ) → dist(g,M ) =‖ g − h ‖ . (7.281)

From (7.279) and (7.280) one obtains ‖ g−h∞ ‖=‖ g−h ‖. This implies

h∞ = h as in the proof of Lemma 7.7. This contradiction proves Lemma

7.8.


Exercise. Prove that dist(g,M ) := infu∈M ‖ g − u ‖ is a continuous

function of g.

We are ready to prove Theorem 7.4.

Proof of Theorem 7.4 Existence of the solution to (7.276) is already

proved. Since M is convex and A is linear the set AM := N is convex.

Since N is convex and F is strictly convex, Lemma 7.7 says that PNf exists

and is unique, while Lemma 7.8 says that PNf depends on f continuously.

Let Au = PNf . Since A is injective u = A−1PNf is uniquely defined and,

by Lemma 7.6, depends continuously on f . Theorem 7.4 is proved.

It follows from Theorem 7.4 that if M ⊂ U is a convex compactum

which contains the solution u to equation (7.246), if A is an injective linear

bounded operator, and F is strictly convex, then the function

uδ = A−1PAMfδ (7.282)

satisfies (7.248). The function uδ can be found as the unique solution to

optimization problem (7.276) with fδ in place of f . One could assume

A closed, rather than bounded, in Theorem 7.4. Uniqueness of the quasisolution is not very important for practical purposes. If there is a set {uδ} of solutions to (7.276) with fδ in place of f, if A is injective and Au0 = f0, and if ‖fδ − f0‖ ≤ δ, then ‖uδ − u0‖ → 0 as δ → 0 for any of the elements of the set {uδ}. Indeed, ‖Auδ − fδ‖ ≤ ‖Au0 − fδ‖ = ‖f0 − fδ‖ ≤ δ. Therefore

‖ Auδ − Au0 ‖≤‖ Auδ − fδ ‖ + ‖ fδ − f0 ‖≤ 2δ. Since M is compact and

uδ, u0 ∈ M the inequality ‖ Auδ − Au0 ‖≤ 2δ implies ‖ uδ − u0 ‖→ 0 as

δ → 0 (see Lemma 7.6).

We have finished the description of the first method, the method of qua-

sisolutions, for finding a stable solution to problem (7.246).

How does one choose M? The choice of M is made in accordance with the

a priori knowledge about the solution to (7.246). For instance, in Example

7.6 one can take as M the set of functions u, which satisfy the condition

|u(a)| ≤ M1, |u′(x)| ≤M2, (7.283)

where Mj are constants, j = 1, 2. Then

Au = ∫_a^x u dt = f


and

|f ′′| ≤M2, |f ′(a)| ≤M1, f(a) = 0. (7.284)

Inequality (7.283) defines a convex compactum M in L2[a, b] and (7.284)

defines N = AM in L2[a, b]. Theorem 7.4 is applicable (since L2[a, b]

is strictly convex) and guarantees that ‖ u − uδ ‖L2[a,b]→ 0. A stable

approximation of u = f ′(x) in C[a, b] norm is given in Theorem 7.1.
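A sketch of the quasisolution method for this example, under the a priori bounds (7.283): parametrize u by its value at a and by its (bounded) derivative values, so that M becomes a box in parameter space, and minimize ‖Au − fδ‖ over that box with a bounded least-squares solver. The discretization and the particular solver are assumptions of this sketch, not the book's prescription.

```python
import numpy as np
from scipy.optimize import lsq_linear

a, b, N = 0.0, 1.0, 100
t = np.linspace(a, b, N + 1)
h = t[1] - t[0]

# u(t_i) = u(a) + h * sum_{k<=i} u'(t_k); parameters p = (u(a), u'(t_1), ..., u'(t_N))
C = np.zeros((N + 1, N + 1))
C[:, 0] = 1.0
for i in range(1, N + 1):
    C[i, 1:i + 1] = h

# (Au)(t_i) = int_a^{t_i} u dt, discretized by the cumulative trapezoid rule
def cumtrap(u):
    return np.concatenate(([0.0], np.cumsum(0.5 * h * (u[1:] + u[:-1]))))

T = np.zeros((N + 1, N + 1))
for j in range(N + 1):
    e = np.zeros(N + 1); e[j] = 1.0
    T[:, j] = cumtrap(e)

G = T @ C                                  # maps parameters p to (Au)(t_i)

# noisy data for u(t) = cos t, so f(t) = sin t, |f_delta - f| <= delta
delta = 1e-3
rng = np.random.default_rng(0)
f_delta = np.sin(t) + delta * rng.uniform(-1, 1, t.size)

# the compactum M of (7.283): |u(a)| <= M1, |u'| <= M2  -> a box for p
M1, M2 = 2.0, 2.0
lb = np.concatenate(([-M1], -M2 * np.ones(N)))
ub = np.concatenate(([ M1],  M2 * np.ones(N)))

p = lsq_linear(G, f_delta, bounds=(lb, ub)).x
u_quasi = C @ p
print(np.sqrt(h) * np.linalg.norm(u_quasi - np.cos(t)))   # L2[a,b] error; tends to 0 as delta -> 0
```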

Let us now turn to the second method for constructing uδ which satisfies

equation (7.248). This will be a variational method which is also known

as a regularization method. While in the first method one needs to solve

the variational problem (7.276) with the restriction u ∈ M , in the second

method one has to solve a variational problem without restrictions.

Consider the functional

F (u) :=‖ Au− fδ ‖2 +γφ2(u), ‖ fδ − f ‖≤ δ, (7.285)

where 0 < γ = const is a parameter, A is a linear bounded operator and

φ(u) is a positive strictly convex densely defined functional which defines a

norm:

φ(u) ≥ 0, φ(u) = 0 ⇒ u = 0, φ(λu) = |λ|φ(u) (7.286)

φ(u1 + u2)/2 < (φ(u1) + φ(u2))/2 if u1 ≠ λu2, λ = const. (7.287)

We also assume that the set of u ∈ U, which satisfy the inequality

φ(u) ≤ c, (7.288)

is compact in U . In other words the closure of the set Domφ(u) in the norm

φ(u) is a Banach space Uφ ⊂ U which is dense in U , and the imbedding

operator i : Uφ → U is compact.

One often takes

φ(u) =‖ Lu ‖, (7.289)

where L : U → U is a linear densely defined boundedly invertible operator

with compact inverse. An operator is called boundedly invertible if its

inverse is a bounded operator defined on all of U . Let us assume that U

is reflexive so that from any bounded set of U one can select a weakly

convergent in U subsequence.
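For a discretized problem, minimizing (7.285) with φ(u) = ‖Lu‖ reduces to a linear system: the minimizer of ‖Au − fδ‖² + γ‖Lu‖² solves (A*A + γL*L)u = A*fδ. The sketch below is a generic illustration (the Gaussian test kernel, first-difference matrix L, and the choice γ(δ) = δ² are assumptions of the sketch, chosen so that δ²/γ stays bounded as in (7.301)).

```python
import numpy as np

def tikhonov(A, f_delta, gamma, L=None):
    """Minimize ||A u - f_delta||^2 + gamma*||L u||^2, i.e. the functional (7.285)
    with phi(u) = ||L u||, via the normal equations."""
    n = A.shape[1]
    if L is None:
        L = np.eye(n)
    return np.linalg.solve(A.T @ A + gamma * L.T @ L, A.T @ f_delta)

# a smoothing first-kind integral operator on [0,1], discretized by the rectangle rule
n = 200
x = np.linspace(0, 1, n)
h = x[1] - x[0]
A = h * np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.01)   # Gaussian kernel, compact A

u_true = np.sin(2 * np.pi * x)
delta = 1e-3
f_delta = A @ u_true + delta * np.random.default_rng(0).standard_normal(n)

L1 = np.diff(np.eye(n), axis=0)                  # first differences as a discrete "L"
u_rec = tikhonov(A, f_delta, gamma=delta**2, L=L1)   # gamma(delta) with delta^2/gamma bounded, cf. (7.301)
print(np.linalg.norm(u_rec - u_true) / np.linalg.norm(u_true))
```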

We will need a few concepts of nonlinear analysis in order to study the

minimization problem F (u) = min.


Definition 7.5 A functional F : U → R1 is called convex if D(F ) :=

DomF is a linear set and for all u, v ∈ D(F ) one has

F (λu+ (1 − λ)v) ≤ λF (u) + (1 − λ)F (v), 0 ≤ λ ≤ 1. (7.290)

Definition 7.6 A functional F (u) is called weakly lower semicontinuous

from below if

un ⇀ u ⇒ lim inf_{n→∞} F(un) ≥ F(u), (7.291)

where ⇀ denotes weak convergence in U.

Lemma 7.9 A weakly lower semicontinuous from below functional F(u) in a reflexive Banach space U is bounded from below on any bounded weakly closed set M ⊂ DomF and attains its minimum on M at a point of M.

Note that a set M is weakly closed if un ∈ M and un ⇀ u imply u ∈ M.

Proof of Lemma 7.9 Let

−∞ ≤ d := inf_{u∈M} F(u), F(un) → d, un ∈ M. (7.292)

Since M is bounded and U is reflexive, there exists a weakly convergent subsequence of {un}, which we denote un again, un ⇀ u. Since M is weakly

closed one concludes that u ∈ M . Since F is weakly lower semicontinuous

from below, one concludes that

d ≤ F(u) ≤ lim inf_{n→∞} F(un) = d. (7.293)

Therefore d > −∞, and F (u) = d. Lemma 7.9 is proved.

Lemma 7.10 A weakly lower semicontinuous from below functional F (u)

in a reflexive Banach space attains its minimum on every bounded, closed,

and convex set M .

Proof. Any such set M in a reflexive Banach space is weakly closed.

Thus Lemma 7.10 follows from Lemma 7.9.

Exercise. Prove the following lemmas.

Lemma 7.11 If F (u) is weakly lower semicontinuous from below func-

tional in a reflexive Banach space such that

F (u) → +∞ as ‖ u ‖→ ∞ (7.294)

then F (u) attains its minimum on any closed convex set M ⊂ U .


Lemma 7.12 A weakly lower semicontinuous from below functional in a

reflexive Banach space attains its minimum on every compactum.

Lemma 7.13 A convex Gateaux differentiable functional F (u) is weakly

lower semicontinuous from below in a reflexive Banach space.

Definition 7.7 A functional F(u) is Gateaux differentiable in U if and only if

lim_{t→+0} t⁻¹[F(x + th) − F(x)] = Ah (7.295)

for any x, h ∈ U , where A : U → R1 is a linear bounded functional on U .

Proof of Lemma 7.13 If un ⇀ u then convexity of F(u) implies

F (u) ≤ F (un) + F ′(u)(u− un). (7.296)

Pass to the limit infimum in (7.296) to get

F(u) ≤ lim inf_{n→∞} F(un). (7.297)

Lemma 7.13 is proved.

Exercise. Prove that if F(u) is Gateaux differentiable and convex in the sense (7.290) then

F(u) − F(v) ≤ F′(u)(u − v) ∀u, v ∈ DomF. (7.298)

In fact, if F(u) is Gateaux differentiable then

(7.290) ⇔ (7.298) ⇔ (F ′(u) − F ′(v), u − v) ≥ 0. (7.299)

The last inequality means that F ′(u) is monotone. The parentheses in

(7.299) denote the value of the linear functional F ′(u)− F ′(v) ∈ U ∗ at the

element u− v ∈ U . By U ∗ the space of linear bounded functionals on U is

denoted.

We are now ready to prove the following theorem.

Theorem 7.5 Assume that A is a linear bounded injective operator de-

fined on the reflexive Banach space U , and φ(u) is a strictly convex weakly

lower semicontinuous from below functional such that the set (7.288) is

compact in U . Then the minimization problem

F (u) = min, (7.300)


where F (u) is defined in (7.285), has the unique solution uδ,γ for any γ > 0,

and if one chooses γ = γ(δ) so that

γ(δ) → 0, δ2γ−1(δ) ≤ m < ∞ as δ → 0, (7.301)

where m = const > 0, then uδ := uδ,γ(δ) satisfies (7.248).

Proof. The functional F(u) ≥ 0. Let 0 ≤ d := inf_{u∈U} F(u), and let {un} be the minimizing sequence

F (un) → d. (7.302)

Then

d ≤ F (un) ≤ d+ ε ≤ δ2 + γφ2(uf ), uf := A−1f, ∀n > n(ε), (7.303)

where ε > 0 is a sufficiently small number, see (7.317) below. From (7.301)

and (7.303) one concludes that

γφ²(un) ≤ d + ε ≤ γ[γ⁻¹δ² + φ²(uf)] ≤ cγ (7.304)

so that φ²(un) ≤ c, c := m + φ²(uf). Therefore one can choose a convergent subsequence from the sequence {un}. This subsequence is denoted also un:

un → u0 in U, u0 = u0δ := uδ. (7.305)

Since A is continuous one has

‖ Aun − f ‖→‖ Au0 − f ‖ . (7.306)

The lower weak semicontinuity of φ(u) and (7.305) imply

lim inf_{n→∞} φ(un) ≥ φ(u0). (7.307)

Thus

d ≤ F(u0) ≤ lim inf_{n→∞} F(un) = d. (7.308)

Therefore the solution to (7.300) exists and the limit (7.305) is a solution.

Suppose v is another solution:

d = F (v) = F (u0). (7.309)

Then, since F (u) is convex, one has

d ≤ F (λu0 + (1 − λ)v) ≤ λF (u0)+(1−λ)F (v) = d, ∀λ ∈ [0, 1]. (7.310)


Therefore

F((u0 + v)/2) = (F(u0) + F(v))/2. (7.311)

This implies that

(1/2)‖Au0 − fδ‖² + (1/2)‖Av − fδ‖² = ‖A(u0 + v)/2 − fδ‖² (7.312)

and

φ((u0 + v)/2) = (φ(u0) + φ(v))/2. (7.313)

From (7.287) and (7.313) one concludes that v = cu0, c = const. Since

φ(cu) = |c|φ(u), equation (7.313) implies that c ≥ 0. From (7.310) it

follows that

F (λcu0 + (1 − λ)u0) = d ∀λ ∈ [0, 1]. (7.314)

Let µ := λ(c − 1). If c ≠ 1, then for all real sufficiently small µ one obtains

from (7.314) that

‖ Au0 − fδ ‖2 +γφ2(u0) =‖ Au0 − fδ + µAu0 ‖2 +γ(1 + µ)2φ2(u0).

Thus

0 = µ2 ‖ Au0 ‖2 +2µRe(Au0 − fδ , Au0)+2γµφ2(u0)+γµ2φ2(u0). (7.315)

Since (7.315) is a quadratic equation it cannot be satisfied for all small µ,

since its coefficients are not all zeros. Therefore c = 1, and uniqueness of

the solution to (7.300) is established.

Let us prove the last statement of Theorem 7.5. Assume that (7.301)

holds. Let Au = f, u = A⁻¹f := uf. One has

F (u) ≥ F (u0). (7.316)

Therefore

‖ Au0 − fδ ‖2 +γφ2(u0) ≤ δ2 + γφ2(u), u = A−1f. (7.317)

Thus, by (7.301),

φ2(u0) ≤ φ2(u) +m := c (7.318)

and, using (7.301) again, one obtains

‖Au0 − fδ‖² ≤ γ(γ⁻¹δ² + φ²(u)) ≤ cγ → 0 as δ → 0. (7.319)


Similarly, for sufficiently small δ, one has

F (u) ≥ F (uδ), φ2(uδ) ≤ c, (7.320)

‖ Auδ − fδ ‖2≤ cγ(δ) → 0 as δ → 0. (7.321)

From (7.321) it follows that

‖ Auδ − Au ‖ ≤ ‖ Auδ − fδ ‖ + ‖ fδ − Au ‖≤ ‖ Auδ − fδ ‖ +δ ≤ c1/2γ1/2(δ) + δ → 0

as δ → 0. (7.322)

Let M := {v : v ∈ U, φ²(v) ≤ c}. Then M is a compactum, uδ ∈ M, u ∈ M

by (7.320) and (7.318). Therefore, (7.322) and Lemma 7.6 imply (7.248).

Theorem 7.5 is proved.

Remark 7.4 If one finds, for γ = γ(δ), not the minimizer uδ = uδ,γ(δ) itself but an approximation to it, say vδ, such that F(vδ) ≤ F(u), then, as

above,

φ2(vδ) ≤ c, ‖ Avδ − fδ ‖2≤ cγ(δ). (7.323)

Thus

‖ Avδ − Au ‖≤ c1/2γ1/2(δ) + δ → 0. (7.324)

As above, from (7.323) and (7.324) it follows that ‖ vδ − u ‖→ 0 as δ → 0.

Therefore one can use an approximate solution to the minimization problem

(7.300) as long as the inequality F (vδ) ≤ F (u) holds.

Remark 7.5 One can assume in Theorem 7.5 that A is not a bounded but a closed linear operator, D(A) ⊃ D(φ), and that the set (7.288) is compact in the space GA, which is D(A) equipped with the graph norm ‖u‖A := ‖u‖ + ‖Au‖. One can also assume that φ(u) is a convex lower weakly semicontinuous functional, the set {u : φ(u) ≤ c} being not necessarily compact. The change in the proof of Theorem 7.5 under this assumption is as follows. From (7.304) it follows that un ⇀ u0 in U. It follows [Ru, p. 65, Theorem 3.13] that there exists a convex combination ūn of the un such that ūn → u0. The sequence ūn is minimizing if un is. Therefore equations (7.306)-(7.308) hold with ūn in place of un. The proof of the uniqueness of the solution to (7.300) is the same as above. One can prove that uδ ⇀ u as δ → 0 and that there is a convex combination ūδ of the uδ such that ūδ → u as δ → 0. However, there is no algorithm to compute ūδ given uδ.


If Rα,δ : F → U is the mapping which sends fδ into uα,δ, the solution

to (7.300), then

Rδfδ := Rα(δ),δfδ = uδ (7.325)

satisfies (7.248), that is

‖ Rδfδ − A−1f ‖→ 0 as δ → 0. (7.326)

A construction of such a family Rα,δ of operators, for which there exists α(δ) such that (7.326) holds, is used for solving the ill-posed problem (7.246).

The family Rα,δ is called a regularizing family for problem (7.246).

The error estimate of the approximate solution uα,δ := Rαfδ can be

given as follows:

‖uα,δ − u‖ ≤ ‖Rα(fδ − f)‖ + ‖RαAu − u‖ ≤ ω(α)δ + η(α) := ε(α, δ).

Here we assumed that Rα is a linear operator and that ‖ Rα ‖≤ ω(α). One

assumes that Rα and u are such that η(α) → 0 and ω(α) → +∞ as α → 0.

Then there exists α = α(δ) such that α(δ) → 0 and ε (α(δ), δ) := ε(δ) → 0

as δ → 0.

Therefore Rα is a regularizing family for problem (7.246) provided that

ω(α) → +∞ and η(α) → 0 as α → 0. The stable approximation to the

solution u of equation (7.246) is uδ := Rα(δ)fδ, where α(δ) is chosen so that

ε(α, δ) ≥ ε (α(δ), δ) := ε(δ) for any δ > 0. One has the error estimate

‖ u− uδ ‖≤ ε(δ).
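A small illustration of this choice of α(δ): with the model bounds ω(α) = C/α and η(α) = c·α^b (assumptions made only for this sketch), one can minimize ε(α, δ) = ω(α)δ + η(α) numerically and observe that both α(δ) and ε(δ) tend to zero with δ.

```python
import numpy as np
from scipy.optimize import minimize_scalar

C, c, b = 1.0, 1.0, 1.0    # illustrative constants: ||R_alpha|| <= C/alpha, ||R_alpha A u - u|| <= c*alpha**b

def eps(alpha, delta):
    return (C / alpha) * delta + c * alpha ** b    # eps(alpha, delta) = omega(alpha)*delta + eta(alpha)

for delta in [1e-2, 1e-4, 1e-6]:
    res = minimize_scalar(eps, bounds=(1e-12, 1.0), args=(delta,), method="bounded")
    print(delta, res.x, res.fun)    # alpha(delta) -> 0 and eps(delta) -> 0 as delta -> 0
```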

We gave two general methods for constructing such families. The theory

can be generalized in several directions:

1) one can consider nonlinear A; unbounded A, for example, closed densely

defined A; A given with some error, say Aε is given such that ‖ A−Aε ‖<ε.

2) one can consider special types of A, for example convolution and other

special kernels; in this case one can often give a more precise error esti-

mate for approximate solution.

3) one can study the problem of optimal choice of γ and of the stabilizing

functional φ(u) in (7.285).

4) one can study finite dimensional approximations for solving ill-posed

problem (7.246).

5) one can study methods of solving problem (7.246) which are optimal in

a suitable sense.


These questions are studied in many books and papers, and we refer

the reader to [Ivanov et al. (1978); Lavrentiev and Romanov (1986);

Morozov (1984); Ramm (1968); Ramm (1973b); Ramm (1975); Ramm

(1980); Ramm (1981); Ramm (1984); Ramm (1985b); Ramm (1987b);

Ramm (1987c); Tanana (1981); Tikhonov (1977)].

In [Ramm (2003a)] and [Ramm (2005)] a new definition of the reg-

ularizing family is given. The new definition operates only with the

data δ, fδ, K, and does not use the unknown f . The compact K in

this definition is the compact to which the unknown solution belongs.

The knowledge of this compact is a priori information about the solutions of ill-posed problems. One calls Rα,δ a regularizing family if lim_{δ→0} sup_{u: u∈K, ‖Au−fδ‖≤δ} ‖Rδfδ − u‖ = 0, where u solves the equation Au = f, and Rδ = Rα(δ),δ for some 0 < α(δ) → 0 as δ → 0.

7.6.3 Equations with random noise

In this section we look at the problem (7.246) with linear bounded injective

operator from the point of view of estimation theory. Let us consider the

equation

Aw = f + n, (7.327)

where A is an injective linear operator on a Hilbert space H, and n is noise.

Let us assume for simplicity that noise takes values in H. In practice this

is not always the case. For example, if H = L2(Rr) then a sample function

n(x) may belong to L2(Rr) locally, but not globally if n(x) does not decay

sufficiently fast as |x| → ∞. Therefore the above assumption simplifies the

theory. Assume that

n = 0, n∗(x)n(y) = σ2R(x, y), (7.328)

where σ2 > 0 is a parameter which characterizes the power of noise. Let

us assume that the solution to (7.327) exists and f ∈ RanA. One may try

to suggest the following definition.

Definition 7.8 The solution to (7.327) is statistically stable if

D[w − u] → 0 as σ → 0, (7.329)

where u solves the equation

Au = f. (7.330)


We will also use this definition with (7.329) substituted by

‖ w − u ‖2→ 0 as σ → 0, (7.331)

where ‖ · ‖ denotes the norm in H.

This definition is very restrictive. First, the assumption that equation

(7.327) is solvable means that a severe constraint is imposed on the noise.

Secondly, the requirement (7.329) is rather restrictive.

Let us illustrate this by examples and then consider some less restrictive

definition of the stable solution to (7.327). Note that, under the above

assumption,

w = A−1f +A−1n = u+ A−1n. (7.332)

Thus

D[w − u] = D[A−1n]. (7.333)

Let us assume that H = L2(D), D ⊂ Rr is a finite region, and A is a

selfadjoint compact operator on H with kernel A(x, y),

Aφj = λjφj, λ1 ≥ λ2 ≥ · · · > 0, (7.334)

where

∫_D φj(x) φ*_i(x) dx := (φj, φi) = δji. (7.335)

Then

A⁻¹n = ∑_{j=1}^∞ λj⁻¹ (n, φj) φj. (7.336)

Therefore

D[A⁻¹n] = ∑_{i,j=1}^∞ λi⁻¹ λj⁻¹ ∫_D n(t) φ*_j(t) dt ∫_D n*(z) φi(z) dz φj(x) φ*_i(x)

= σ² ∑_{i,j=1}^∞ λi⁻¹ λj⁻¹ ∫_D ∫_D R(z, t) φi(z) φ*_j(t) dz dt φj(x) φ*_i(x)

= σ² ∫_D ∫_D A⁻¹(x, z) R(z, t) A⁻¹(t, x) dz dt, (7.337)


where A−1(x, y) is the kernel of the operator A−1 in the sense of distribu-

tions and is given by the formula

A⁻¹(x, y) = ∑_{j=1}^∞ λj⁻¹ φ*_j(x) φj(y). (7.338)

For the right-hand side of (7.337) to converge to zero as σ → 0 it is necessary and sufficient that the kernel B(x, y) of the operator A⁻¹RA⁻¹ be finite on the diagonal: B(x, x) < ∞ for all x.

If one requires in place of (7.329) that

‖ w − u ‖2 → 0 as σ → 0, (7.339)

where the bar denotes statistical average and

‖ w ‖:= (w,w)1/2, (7.340)

then the following condition (7.342) will imply (7.339).

One has

(A⁻¹n, A⁻¹n) = ∫_D D[A⁻¹n] dx = σ² Tr(A⁻¹RA⁻¹) → 0 as σ → 0 (7.341)

provided that A is selfadjoint, positive, and

Tr(A−1RA−1) < ∞. (7.342)

Condition (7.342) is a severe restriction on the correlation function R(x, y)

of the noise.

Example 7.14 Consider the case H = L2(R1),

Au := ∫_{−∞}^{∞} A(x − y) u(y) dy. (7.343)

Then

A⁻¹f := (1/(2π)) ∫_{−∞}^{∞} exp(iλx) A⁻¹(λ) f̃(λ) dλ, (7.344)

where

f(x) = (1/(2π)) ∫_{−∞}^{∞} f̃(λ) exp(iλx) dλ, (7.345)


one obtains (a derivation is given below) the formula

A⁻¹n (A⁻¹n)* = σ² ∫_{−∞}^{∞} R(λ) |A(λ)|⁻² dλ. (7.346)

Here we have assumed that

n(x) =

∫ ∞

−∞exp(iλx)dζ(λ), (7.347)

where ζ(λ) is a random process with orthogonal increments such that

ζ(λ) = 0, dζ∗(λ)dζ(λ) = σ2R(λ)dλ, (7.348)

dζ∗(λ)dζ(µ) = 0 for λ 6= µ. (7.349)

If B is a linear integral operator with convolution kernel:

Bn = ∫_{−∞}^{∞} B(x − y) n(y) dy,

and the spectral representation for n(x) is (7.347), then the spectral representation for Bn is

Bn = ∫_{−∞}^{∞} [∫_{−∞}^{∞} B(x − y) exp(iλy) dy] dζ(λ) = ∫_{−∞}^{∞} exp(iλx) B(λ) dζ(λ). (7.350)

If B1 and B2 are two linear integral operators with convolution kernels, then

B1n (B2n)* = ∫∫_{−∞}^{∞} exp(iλx − iµx) B1(λ) B*_2(µ) dζ(λ) dζ*(µ) = σ² ∫_{−∞}^{∞} B1(λ) B*_2(λ) R(λ) dλ. (7.351)

Equation (7.346) is a particular case of (7.351). Note that stationary random functions (7.347) have mean value zero and

D[n(x)] := |n(x)|² = ∫_{−∞}^{∞} ∫_{−∞}^{∞} e^{ix(λ−µ)} dζ*(λ) dζ(µ) = σ² ∫_{−∞}^{∞} R(λ) dλ. (7.352)


It follows from formula (7.346) that the variance of the random process

A−1n at any point x is finite if and only if the spectral density R(λ) of

the noise is such that R(λ)|A(λ)|−2 ∈ L1(−∞,∞). If A(λ) tends to zero

as |λ| → ∞, the above condition imposes a severe restriction on the noise.

For example, if the noise is white, that is R(λ) = 1, then the condition

|A(λ)|−2 ∈ L1(−∞,∞) is not satisfied for A(λ) → 0 as |λ| → ∞.
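The condition R(λ)|A(λ)|⁻² ∈ L¹(−∞, ∞) is easy to check numerically for concrete symbols. The sketch below simply evaluates the integral in (7.346) for an illustrative Gaussian symbol A(λ) = exp(−λ²) and two model noise spectra (both choices are assumptions made only for this example):

```python
import numpy as np
from scipy.integrate import quad

A = lambda lam: np.exp(-lam**2)        # symbol of a smoothing convolution kernel, A(lam) -> 0

# band-limited noise: R(lam) = 1 for |lam| <= 2, else 0  -> the integral (7.346) is finite
val_band, _ = quad(lambda lam: 1.0 / np.abs(A(lam))**2, -2, 2)
print("band-limited noise:", val_band)

# white noise: R(lam) = 1 on the whole line; the integrand grows like exp(2*lam^2),
# so R(lam)*|A(lam)|**-2 is not in L^1 and the variance of A^{-1} n is infinite
print("integrand at lam = 10:", 1.0 / np.abs(A(10.0))**2)
```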

Example 7.15 Consider the Hilbert space H of 2π-periodic functions

f = ∑_{m≠0} fm exp(imx) (7.353)

with the inner product

(f, g) := ∑_{m≠0} fm g*_m. (7.354)

Assume that f(x) + n(x) is given, where n ∈ H

n(x) = 0, n∗(x)n(y) = σ2R(x− y), (7.355)

R(x+ 2π) = R(x). (7.356)

Note that R(x) can be written as

R(x) = σ² ∑_{m≠0} rm exp(−imx). (7.357)

Suppose one wants to estimate f ′(x) given f(x) + n(x). If one uses the

function u := f ′(x) + n′(x) as an estimate of f ′, then the variance of the

error of this estimate can be calculated as follows. Let

n(x) = ∑_{m≠0} nm exp(imx), (7.358)

where nm, the Fourier coefficients of n(x), are random variables, such that

nm = 0, n∗mnj = σ2rmδmj . (7.359)

The numbers rm can be determined easily. Indeed

R(x − y) = n*(x)n(y) = ∑_{m,j} n*_m nj exp(ijy − imx) = σ² ∑_m rm exp[−im(x − y)]. (7.360)


From (7.356) and (7.360) one can see that the numbers rm in (7.359) can

be calculated by the formula

rm = (1/(2π)) ∫_{−π}^{π} R(x) exp(imx) dx. (7.361)

Since R(x) is a covariance function, the numbers rm are nonnegative, as they

should be. From (7.358) and (7.359) it follows that

D[n′] = σ² ∑_m m² rm. (7.362)

Therefore the numbers rm have to satisfy the condition that the right side of (7.362) is a convergent series, in order that the estimate u be statistically stable in the sense of Definition 7.8.
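A short numerical check of (7.362): for covariance coefficients decaying like rm ~ m⁻² the series σ²Σ m²rm diverges (differentiating the noisy data is not statistically stable), while for rm ~ m⁻⁴ it converges. The decay rates are illustrative choices, not taken from the text.

```python
import numpy as np

def variance_of_derivative(r, sigma2=1.0):
    """D[n'] = sigma^2 * sum_m m^2 * r_m, formula (7.362); r[m-1] holds r_m for m >= 1."""
    m = np.arange(1, len(r) + 1)
    return sigma2 * np.sum(m**2 * r)

for p in (2, 4):
    partial = [variance_of_derivative(1.0 / np.arange(1, M + 1.0)**p) for M in (10**2, 10**4, 10**6)]
    print(f"r_m ~ m^-{p}:", partial)   # grows without bound for p = 2, stabilizes for p = 4
```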

Example 7.16 Let L be a selfadjoint positive operator on a Hilbert space

H. Assume that the spectrum of L is discrete:

0 ≤ λ1 ≤ λ2 ≤ · · · , λm → ∞ as m → ∞. (7.363)

Consider the problem

ut = Lu, t > 0 (7.364)

u(0) = f. (7.365)

Let φm be the eigenvectors of L:

Lφm = λmφm (7.366)

and assume that the system φm, 1 ≤ m < ∞, forms an orthonormal

basis of H:

(φm, φj) = δmj . (7.367)

The formal solution to problem (7.364)-(7.365) is

u = ∑_{m=1}^∞ exp(λm t) fm φm, (7.368)

where

fm := (f, φm). (7.369)


Formula (7.368) gives a formal solution to (7.364)-(7.365) in the sense that formal differentiation in t yields

ut = ∑_{m=1}^∞ λm exp(λm t) fm φm, (7.370)

and formal application of the operator L to formula (7.368) yields

Lu = ∑_{m=1}^∞ λm exp(λm t) fm φm, (7.371)

so that (7.370) and (7.371) yield (7.364). Put t = 0 in (7.368) and get

(7.365). Formula (7.368) gives the strong solution to (7.364)-(7.365) in H

if and only if the series (7.370) converges in H, that is

∑_{m=1}^∞ λm² exp(2λm t) |fm|² < ∞, t > 0. (7.372)

This implies that the problem (7.364)-(7.365) is very ill-posed. This prob-

lem is an abstract heat equation with reversed time. If one takes −L in

place of L in (7.364) then the problem is analogous to the usual heat equa-

tion and is well posed. Suppose that H = L2(D), and that the data in (7.365) are noisy: the function f(x) + n(x) is given in place of f(x), where n(x) is noise. Let

n(x) = ∑_{m=1}^∞ nm φm(x), (7.373)

where

nm := (n(x), φm(x)), (7.374)

where the parentheses denote the inner product in L2(D). It is clear that

if n(x) = 0, then

nm = 0, ∀m. (7.375)

Let

n∗(x)n(y) = σ2R(x, y). (7.376)

Then

∑_{m,j=1}^∞ n*_m nj φ*_m(x) φj(y) = σ² R(x, y). (7.377)


The kernel R(x, y) is selfadjoint and nonnegative definite, being a covariance

function. Let us assume that the matrix

rmj := n∗mnj (7.378)

is such that the series (7.377) converges in L2(D) ×L2(D). If one uses the

formula

û := ∑_{m=1}^∞ exp(λm t)(fm + nm) φm (7.379)

for the solution of the problem (7.364)-(7.365) with the noisy initial data,

then for the variance of the error of this estimate one obtains

D[û − u] = ∑_{m,j=1}^∞ exp[(λm + λj)t] rmj φ*_m(x) φj(x). (7.380)

It is clear from (7.380) that the estimate û is not statistically stable, since

the series (7.380) may diverge although the series (7.377) converges.
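The instability is easy to see numerically. In the toy computation below (the eigenvalues λm = m², the smooth initial function, and the size of the random perturbation of its Fourier coefficients are all illustrative choices), the perturbed formal solution (7.379) departs from the true solution by an astronomically large amount even for a modest t:

```python
import numpy as np

M = 50
lam = np.arange(1, M + 1) ** 2                 # eigenvalues lambda_m = m^2 (illustrative)
f_m = np.exp(-lam)                             # Fourier coefficients of a very smooth f
rng = np.random.default_rng(0)
n_m = 1e-8 * rng.standard_normal(M)            # tiny perturbation of the data

t = 0.05
u_m = np.exp(lam * t) * f_m                    # coefficients of the solution (7.368)
u_hat_m = np.exp(lam * t) * (f_m + n_m)        # coefficients of the estimate (7.379)

print("||u||        =", np.linalg.norm(u_m))
print("||u_hat - u|| =", np.linalg.norm(u_hat_m - u_m))   # ~1e-8 * exp(lam_M * t): enormous
```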

Exercise. Let (7.363)-(7.368) hold, ‖u(0)‖ < ε and ‖u(T)‖ ≤ c. Prove that ‖u(t)‖ ≤ ε^{1−t/T} c^{t/T} for 0 ≤ t ≤ T.

Hint: Consider φ(t) := ‖u(t)‖² = ∑_{m=1}^∞ exp(−2λm t)|fm|². Check that φ″ > 0 and (ln φ)″ ≥ 0. Thus ln φ is convex. Therefore

ln φ[(1 − α)·0 + αT] ≤ (1 − α) ln φ(0) + α ln φ(T), 0 ≤ α ≤ 1.

Let α = t/T. Then the desired inequality follows.

The above examples lead to the following question: how does one find

a statistically stable estimate of the solution to equation (7.327)?

Let us outline a possible answer to this question. We consider linear

estimates, but the approach allows one to generalize the theory and to

consider nonlinear estimates as well.

The approach is similar to the one outlined in Section 2.1. Let us look

for a linear statistically stable estimate û of the solution to equation (7.327) of the form

û = L(f + n). (7.381)


Assume that the injective linear operator A in (7.327) is an integral operator

on H = L2(D), D ⊂ Rr, with the kernel A(x, y), and

Lf = ∫_D L(x, y) f(y) dy. (7.382)

Since Au = f, one has:

|û − u|² = |(LA − I)u + Ln|² = |(LA − I)u|² + |Ln|² (7.383)

where I is the identity operator and the term linear with respect to n vanishes because n = 0. Let us calculate the last term in (7.383):

|Ln|² = ∫_D L*(x, y) n*(y) dy ∫_D L(x, z) n(z) dz = σ² ∫_D ∫_D L*(x, y) R(y, z) L′(z, x) dy dz. (7.384)

Here we used the second formula in (7.328) and the standard notation

L′(z, x) := L(x, z). (7.385)

Remember that star denotes complex conjugate (and not the adjoint oper-

ator).

Integrating both sides of (7.383) in x over D yields

ε := ‖û − u‖² = ‖(LA − I)u‖² + σ²TrQ (7.386)

where Q is an integral operator with the kernel

Q(x, ξ) := ∫_D ∫_D L*(x, y) R(y, z) L′(z, ξ) dy dz (7.387)

and TrQ stands for the trace of the operator Q. This operator is clearly

nonnegative definite in H = L2(D):

(Qφ, φ) = (L∗RL′φ, φ) = (RL′φ, L′φ) ≥ 0. (7.388)

Here we used the fact that R is nonnegative definite; that (L∗)† = L′, where

A† denotes the adjoint operator in L2(D); and we have assumed that the

function Q(x, ξ) is continuous in x, ξ ∈ D. The last assumption and the

fact that the kernel Q(x, ξ) is nonnegative definite imply that

TrQ = ∫_D Q(x, x) dx. (7.389)


One wants to choose L such that

ε = min (7.390)

where ε is given in (7.386). If one puts L = A⁻¹ then the first term on the right side of (7.386) vanishes and the second is finite if

Tr[(A⁻¹)* R (A⁻¹)′] < ∞. (7.391)

We assume that (7.391) holds. We claim that if (7.391) holds and σ → 0

then one can choose L so that

εmin := ε(σ) → 0 as σ → 0. (7.392)

Any such choice of L yields a statistically stable estimate of the solution u.

Let us prove the claim that the choice of L which implies (7.392) is

possible. For simplicity we assume that A is a positive selfadjoint operator

on H.

Put

L = (A+ δI)−1 (7.393)

where δ > 0 is a small number. Then the spectral theory yields:

‖(LA − I)u‖² = ∫_0^{‖A‖} (λ/(λ + δ) − 1)² d(Eλu, u) = ∫_0^{‖A‖} δ²/(λ + δ)² d(Eλu, u) (7.394)

where Eλ is the resolution of the identity of the operator A. Since

∫_0^{‖A‖} δ²/(λ + δ)² d(Eλu, u) ≤ ∫_0^{‖A‖} d(Eλu, u) = ‖u‖² < ∞ (7.395)

and, as δ → 0, the integrand in (7.394) tends to zero, one can use the

Lebesgue dominated convergence theorem and conclude that

‖(LA − I)u‖2:= η(δ, u) → 0 as δ → 0 (7.396)

where L is given by (7.393). The claim is proved.

Lemma 7.14 If L is defined by (7.393) and (7.391) holds then

lim sup_{δ→0} Tr(L*RL′) < ∞. (7.397)


Proof. One has

[(A + δI)⁻¹]* R [(A + δI)⁻¹]′ = [(A + δI)⁻¹]* A* (A⁻¹)* R (A⁻¹)′ A′ [(A + δI)⁻¹]′. (7.398)

The operator (A−1)∗R(A−1)′ is in the trace class by (7.391). Moreover, if

A > 0 and δ > 0, then

‖[(A + δI)⁻¹]* A*‖ ≤ 1, ‖A′ [(A + δI)⁻¹]′‖ ≤ 1. (7.399)

Both inequalities (7.399) can be proved similarly or reduced one to the

other, because A∗ = A′. Note that A > 0 implies

A∗ > 0. (7.400)

Indeed, for any φ ∈ H, one has

(A∗φ, φ)∗ = (Aφ∗, φ∗) > 0 (7.401)

since A > 0 by the assumption and φ∗ ∈ H. Therefore

(A∗φ, φ)∗ = (A∗φ, φ) > 0 ∀φ ∈ H. (7.402)

The desired estimate (7.399) follows from the spectral theorem:

‖[(A + δI)⁻¹]* A*‖ = max_{0<λ≤‖A*‖} λ/(λ + δ) ≤ 1. (7.403)

Lemma 7.14 is proved.

It is now easy to prove the following theorem.

Theorem 7.6 Let A > 0 be a bounded operator on H = L2(D). Assume that condition (7.328) holds and TrR < ∞. Then the estimate

û = L(f + n), (7.404)

with L given by (7.393), is a statistically stable (in the sense of (7.331)) estimate of the solution to the equation Au = f, provided that the parameter δ = δ(σ) in formula (7.393) is chosen so that

ε := σ2Tr(L∗RL′) + η(δ, u) = min . (7.405)


Proof. Note that we do not assume in this theorem that condition (7.391)

holds. Therefore

Tr(L∗RL′) := ψ(δ) > 0 will, in general, satisfy the condition

ψ(δ) → +∞ as δ → 0. (7.406)

Thus

ε = σ2ψ(δ) + η(δ, u). (7.407)

From (7.406), (7.407) and (7.396) it follows that the function ε considered

as a function of δ for a fixed σ > 0 attains its minimum at δ = δ(σ) and

δ(σ) → 0 as σ → 0. (7.408)

Therefore

ε(σ) = εmin = ε (δ(σ)) → 0 as σ → 0. (7.409)

Theorem 7.6 is proved.

If some estimates for ψ(δ) and η(δ, u) are found then an estimate of

ε(σ) can be obtained. This requires some a priori assumptions about the

solution.

Example 7.17 A simple estimate for ψ(δ) is the following one.

Tr(L*RL′) ≤ TrR ‖L*‖² ≤ TrR/δ². (7.410)

Here we used the estimates

‖ L∗ ‖=‖ L′ ‖≤ δ−1, (7.411)

where L is given by (7.393), and the estimate

Tr(L∗RL′) ≤ TrR ‖ L ‖2 . (7.412)

We will prove inequality (7.412) later. Let us estimate η(δ, u). To do this,

assume that

‖ A−af ‖≤ c, a = 1 + b, b > 0, (7.413)

where c > 0 is a constant, and

A^{−a}f := ∫_0^{‖A‖} λ^{−a} dEλ f, (7.414)


‖A^{−a}f‖² = ∫_0^{‖A‖} λ^{−2a} d(Eλf, f) ≤ c². (7.415)

Since Au = f, u = A⁻¹f, it follows from (7.396) and (7.415) that

η(δ, u) := η(δ) = ∫_0^{‖A‖} [δ²/(λ + δ)²] λ⁻² d(Eλf, f) = ∫_0^{‖A‖} [δ² λ^{2b}/(λ + δ)²] λ^{−2−2b} d(Eλf, f) ≤ δ^{2b} ∫_0^{‖A‖} λ^{−2a} d(Eλf, f) ≤ c² δ^{2b}. (7.416)

Therefore, under the a priori assumption (7.413) about f , one has

ε ≤ σ² TrR/δ² + c² δ^{2b}. (7.417)

The right hand side in (7.417) attains its minimum in δ (σ > 0 being fixed)

at

δmin = δ(σ) = (TrR/(c²b))^{1/(2a)} σ^{1/a} (7.418)

and

εmin = ε(σ) ≤ const · σ^{2b/a} (7.419)

where const can be written explicitly.
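A numerical sketch of this estimate (the operator, noise covariance, and smoothness parameters below are illustrative choices): form L = (A + δI)⁻¹ with δ = δ(σ) taken from (7.418) and observe that the error of û = L(f + n) decays roughly like σ^{b/a} as the noise power decreases.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x = np.linspace(0, 1, n)
h = x[1] - x[0]

# A > 0: a selfadjoint positive smoothing operator (discretized Gaussian kernel)
A = h * np.exp(-(x[:, None] - x[None, :])**2 / 0.02)
A = 0.5 * (A + A.T) + 1e-9 * np.eye(n)

u_true = np.sin(2 * np.pi * x)
f = A @ u_true
R = np.eye(n)                      # noise covariance with Tr R < infinity (illustrative)

b = 1.0
a = 1.0 + b
c2b = 1.0                          # stands for c^2 * b in (7.418); an assumed a priori constant

for sigma in [1e-2, 1e-3, 1e-4]:
    noise = sigma * rng.multivariate_normal(np.zeros(n), R)
    delta = (np.trace(R) / c2b) ** (1 / (2 * a)) * sigma ** (1 / a)   # formula (7.418)
    L = np.linalg.inv(A + delta * np.eye(n))                          # formula (7.393)
    u_hat = L @ (f + noise)
    print(sigma, np.linalg.norm(u_hat - u_true) / np.linalg.norm(u_true))
```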

Let us finally prove inequality (7.412). This inequality follows from

Lemma 7.15 If B is a linear bounded operator on H and R ≥ 0 is a

trace class operator, then BR and RB are trace class operators and

|Tr(BR)| ≤‖ B ‖ TrR, Tr(RB) ≤‖ B ‖ TrR. (7.420)

Proof. Let us recall that a linear operator T : H → H is in the trace

class if and only if

‖T‖₁ := ∑_{j=1}^∞ sj(T) < ∞, (7.421)

where sj(T) are the s-numbers of T. These numbers are defined by the equality

sj(T) = λj((T†T)^{1/2}), (7.422)


where λ1 ≥ λ2 ≥ · · · ≥ 0 are the eigenvalues of the nonnegative definite selfadjoint operator (T†T)^{1/2}, and T† is the adjoint of T in H. The minimax principle for the s-values is

s_{j+1}(T) = min_{Lj} max_{φ⊥Lj, φ≠0} ‖Tφ‖/‖φ‖ (7.423)

where Lj runs through all j-dimensional subspaces of H, and φ ⊥ Lj means that φ is orthogonal to all elements of Lj. If B is a linear bounded operator on H then

s_{j+1}(BT) = min_{Lj} max_{φ⊥Lj, φ≠0} ‖BTφ‖/‖φ‖ ≤ ‖B‖ min_{Lj} max_{φ⊥Lj, φ≠0} ‖Tφ‖/‖φ‖ = ‖B‖ s_{j+1}(T). (7.424)

Therefore if (7.421) holds then

‖BT‖₁ = ∑_{j=1}^∞ sj(BT) ≤ ‖B‖ ∑_{j=1}^∞ sj(T) = ‖B‖ ‖T‖₁. (7.425)

The first part of Lemma 7.15 is proved, since for R ≥ 0 one has TrR = ‖R‖₁. The second part can be reduced to the first. Indeed, T and T* are

simultaneously in the trace class since

sj(T ) = sj(T∗), ∀j. (7.426)

One has

(TB)∗ = B∗T ∗. (7.427)

Since ‖B‖ = ‖B*‖ and (TrT)* = TrT*, one concludes from (7.427) and (7.425) that

|Tr(TB)| = |Tr(B*T*)| ≤ ‖B*‖ ‖T*‖₁ = ‖B‖ ‖T‖₁. (7.428)

Take T = R ≥ 0 then ‖ T ‖1= TrR and the second inequality (7.420) is

obtained. Lemma 7.15 is proved.
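A quick sanity check of (7.420) on random matrices (finite-dimensional, where every operator is trace class; purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
R = X @ X.T                                   # R >= 0

lhs = abs(np.trace(B @ R))
rhs = np.linalg.norm(B, 2) * np.trace(R)      # ||B|| * Tr R
print(lhs <= rhs + 1e-9, lhs, rhs)
```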

Additional information about s-values can be found in Section 6.3.


7.7 A remark on nonlinear (polynomial) estimates

Let

U(x) = s(x) + n(x), x ∈ D ⊂ Rr. (7.429)

Consider the polynomial estimate (filter):

AU := ∑_{j=1}^m Hj U^{[j]} (7.430)

where

Hj U^{[j]} = ∫_D · · · ∫_D hj(x, ξ1, . . . , ξj) U(ξ1) · · · U(ξj) dξ1 · · · dξj, (7.431)

U^{[j]} = U(ξ1) · · · U(ξj). (7.432)

The problem is to find A such that

ε := D[AU − s] = min . (7.433)

Here D is the symbol of variance, the assumptions about s(x) and n(x) are

the same as in Chapter 1, the optimal estimate is defined by m functions (h1, . . . , hm), and we could consider by the same method the problem of

estimating a known operator on s(x), for example ∂js(x). Let us substitute

(7.430) in (7.433):

ε :=

m∑

i,j=1

HjU [j]H∗i U [i]∗ − 2Re

m∑

i=0

H∗i U [i]∗s(x) + |s(x)|2

=

m∑

i,j=1

aijHjH∗i − 2Re

m∑

i=0

Hibi + |s(x)|2

= min (7.434)

Here

bi := U^{[i]*} s(x), bi = bi(x, ξ′1, . . . , ξ′i), (7.435)

aij := U^{[j]} U^{[i]*} = U(ξ1) · · · U(ξj) U*(ξ′1) · · · U*(ξ′i), (7.436)


∑_{j=1}^m aij Hj := ∑_{j=1}^m ∫_D · · · ∫_D aij(ξ′1, . . . , ξ′i, ξ1, . . . , ξj) hj(x, ξ1, . . . , ξj) dξ1 · · · dξj. (7.437)

Note that (7.436) implies that

a*_ij = a_ji, aij = aij(ξ′1, . . . , ξ′i, ξ1, . . . , ξj). (7.438)

Let

hj(ξ1, . . . , ξj) + εjηj(ξ1, . . . , ξj) (7.439)

be substituted for hj in (7.437), and we suppress exhibiting the dependence on x since x will be fixed. Here εj are numbers. The condition ε = min at εj = 0

implies that

∑_{j=1}^m aij Hj = bi, 1 ≤ i ≤ m. (7.440)

This is a system of integral equations for the functions hj(x, ξ1, . . . , ξj):

∑_{j=1}^m ∫_D · · · ∫_D (j times) aij(ξ′1, . . . , ξ′i, ξ1, . . . , ξj) hj(x, ξ1, . . . , ξj) dξ1 · · · dξj = bi(x, ξ′1, . . . , ξ′i). (7.441)

Consider as an example the case of polynomial estimates of degree 2. Then

∫_D a11(ξ′1, ξ1) h1(ξ1) dξ1 + ∫_D ∫_D a12(ξ′1, ξ1, ξ2) h2(ξ1, ξ2) dξ1 dξ2 = b1(ξ′1), (7.442)

∫_D a21(ξ′1, ξ′2, ξ1) h1(ξ1) dξ1 + ∫_D ∫_D a22(ξ′1, ξ′2, ξ1, ξ2) h2(ξ1, ξ2) dξ1 dξ2 = b2(ξ′1, ξ′2). (7.443)

If a22(ξ′1, ξ′2, ξ1, ξ2) belongs to R, or to some class of operators which can be inverted, one can find h2(ξ1, ξ2) from equation (7.443) in terms of h1(ξ1), and then (7.442) becomes an equation for the single function h1(ξ1).
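A discretized sketch of the degree-2 case: on a coarse grid the moments aij and bi in (7.440) can be estimated from samples of U and s, and the resulting finite linear system for (h1, h2) solved directly. The toy random-field model, the grid size, and the Monte Carlo moment estimation below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8                                    # coarse grid on D (illustrative)
xi = np.linspace(0.0, 1.0, N)
phi = np.cos(np.pi * xi)
k0 = 3                                   # estimate s at the grid point xi[k0]

def sample(n_samples):
    """Toy model: non-Gaussian signal plus Gaussian noise, U = s + n."""
    z = rng.standard_normal(n_samples)
    s = (z**2 - 1.0)[:, None] * phi[None, :]
    n = 0.1 * rng.standard_normal((n_samples, N))
    return s + n, s

U, S = sample(50000)
target = S[:, k0]

# features U^{[1]} and U^{[2]}; quadrature weights are absorbed into h1, h2
quad = (U[:, :, None] * U[:, None, :]).reshape(len(U), N * N)
Psi = np.hstack([U, quad])

# discretized normal equations (7.440): sample moments a_ij act on (h1, h2), right side b_i
G = Psi.T @ Psi / len(U)
b = Psi.T @ target / len(U)
theta = np.linalg.lstsq(G, b, rcond=None)[0]
h1, h2 = theta[:N], theta[N:].reshape(N, N)

# purely linear estimate for comparison
h1_lin = np.linalg.lstsq(U.T @ U / len(U), U.T @ target / len(U), rcond=None)[0]

U_t, S_t = sample(20000)
quad_t = (U_t[:, :, None] * U_t[:, None, :]).reshape(len(U_t), N * N)
mse_quad = np.mean((U_t @ h1 + quad_t @ h2.ravel() - S_t[:, k0])**2)
mse_lin = np.mean((U_t @ h1_lin - S_t[:, k0])**2)
print("linear MSE:", mse_lin, " degree-2 MSE:", mse_quad)   # the quadratic filter should not be worse
```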


In the framework of correlation theory it is customary to consider only

linear estimates because the data (covariance functions) consist of the mo-

ments of second order.


Chapter 8

Auxiliary Results

8.1 Sobolev spaces and distributions

8.1.1 A general imbedding theorem

Let D ⊂ Rr be a bounded domain with a smooth boundary Γ. The Lp(D), p ≥ 1, spaces consist of measurable functions on D such that ‖u‖_{Lp(D)} := (∫_D |u|^p dx)^{1/p} < ∞. For p = +∞ one has ‖u‖_{L∞(D)} := ess sup_{x∈D} |u(x)|.

If in place of the Lebesgue measure a measure µ is used then we use Lp(D,µ)

as the symbol for the corresponding space. These are Banach spaces. If

C∞0 (D) is the set of infinitely differentiable functions with compact support

in D, then C∞0 (D) is dense in Lp(D), 1 ≤ p < ∞. If one defines a mollifier,

i.e. a function 0 ≤ ρ(x) ∈ C∞0 (Rr), ρ(x) = 0 for |x| ≥ 1,

∫ρ(x)dx = 1,∫

:=∫Rr , e.g.

ρ(x) := c exp(|x|2 − 1)−1 for |x| < 1, ρ(x) = 0 for |x| ≥ 1 (8.1)

where c is the normalizing constant chosen so that∫ρdx = 1, then the

function

uε(x) := ε−n∫ρ(|x − y|ε−1)u(y)dy, ε > 0, (8.2)

belongs to C∞loc and ‖ uε − u ‖Lploc

→ 0 as ε → 0. By Lploc one means a

set of functions which belong to Lp on any compact subset of D or Rr.
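A one-dimensional numerical illustration of (8.1)-(8.2) (the grid, the test function, and the values of ε are arbitrary choices): the mollified function uε is smooth and approaches u in L^p on compact subsets as ε → 0.

```python
import numpy as np

grid = np.linspace(-3.0, 3.0, 6001)
h = grid[1] - grid[0]
u = np.sign(grid)                          # a discontinuous function in L^p_loc

def mollify(u_vals, eps):
    """Discrete version of (8.2): convolution of u with eps^{-1} rho(./eps), rho as in (8.1)."""
    y = np.arange(-eps, eps + h, h)
    z = y / eps
    kernel = np.where(np.abs(z) < 1, np.exp(1.0 / np.minimum(z**2 - 1.0, -1e-12)), 0.0)
    kernel = kernel / (np.sum(kernel) * h)  # normalize the discrete integral of the kernel to 1
    return np.convolve(u_vals, kernel, mode="same") * h

for eps in (0.5, 0.1, 0.02):
    err = mollify(u, eps) - u
    interior = np.abs(grid) < 3.0 - 2 * eps            # ignore edge effects of the finite grid
    print(eps, np.sqrt(h) * np.linalg.norm(err[interior]))   # L^2 error on the interior -> 0 as eps -> 0
```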

Convergence in Lp_loc(D) means convergence in Lp(D̃), where D̃ is an arbitrary compact subset of D. By W^{ℓ,p}(D), the Sobolev space, one means the Banach space of functions u(x) defined on D with the finite norm

‖u‖_{W^{ℓ,p}(D)} := ∑_{|j|=0}^{ℓ} ‖D^j u‖_{Lp(D)}. (8.3)


Here j is a multiindex, j = (j1, . . . , jr), D^j = D^{j1}_{x1} · · · D^{jr}_{xr}, |j| = j1 + · · · + jr. The space C∞(D) of functions infinitely differentiable in D is dense in W^{ℓ,p}(D). By W̊^{ℓ,p}(D) we denote the closure of C∞0(D) in the norm (8.3). By H^ℓ(D) we denote W^{ℓ,2}(D). This is a Hilbert space with the inner product

(u, v)_ℓ := ∫_D ∑_{|j|=0}^{ℓ} D^j u D^j v* dx, ‖u‖_ℓ := (u, u)_ℓ^{1/2}. (8.4)

Let C∞(D̄) denote the space of restrictions to D̄ of functions in C∞(Rr). If the boundary of D is not sufficiently smooth, then C∞(D̄) may not be dense in W^{ℓ,p}(D). A sufficient condition on Γ for C∞(D̄) to be dense in W^{ℓ,p}(D) is that D is bounded and star shaped with respect to a point. D is called star shaped with respect to a point 0 if any ray issued from 0 intersects Γ := ∂D at one and only one point. Another sufficient condition for C∞(D̄) to be dense in W^{ℓ,p}(D) is that every point of Γ has a neighborhood U in which D ∩ U is representable in a suitable Cartesian coordinate system as xr < f(x1, . . . , xr−1), where f is continuous.

Any function u ∈W `,p(D), p ≥ 1, ` ≥ 1, (possibly modified on a set of

Lebesgue Rr-measure zero) is absolutely continuous on almost all straight

lines parallel to coordinate axes, and its distributional first derivatives co-

incide with the usual derivatives almost everywhere. The spaces W `,p(D)

are complete.

We say that a bounded domain D ⊂ Rr satisfies a uniform interior cone

condition if there is a fixed cone CD such that each point of Γ is the vertex

of a cone CD(x) ⊂ D congruent to CD. A strict cone property holds if Γ has a locally finite covering by open sets {Uj} and a corresponding collection of cones {Cj} such that ∀x ∈ Uj ∩ Γ one has x + Cj ⊂ D.

According to Calderon's extension theorem, there exists a bounded linear operator E : W^{ℓ,p}(D) → W^{ℓ,p}(Rr) such that Eu = u on D for every u ∈ W^{ℓ,p}(D), provided that D ∈ C^{0,1}. The class C^{0,1} of domains consists

of bounded domains D such that each point x ∈ Γ has a neighborhood

U with the property that the set D ∩ U is represented by the inequality

xr < f(x1, . . . , xr−1) in a Cartesian coordinate system, and the function f

is Lipschitz-continuous. The domains in C0,1 have the cone property. The

extension theorem holds for a wider class of domains than C0,1, but we do

not go into details.


Let us formulate a general embedding theorem ([Mazja (1986)]).

Theorem 8.1 Let D ⊂ Rr be a bounded domain with the cone property, and let µ be a measure on D such that

sup ρ^{−s} µ(D ∩ B(x, ρ)) < ∞, B(x, ρ) = {y : |x − y| ≤ ρ, y ∈ Rr}, x ∈ Rr,

and s > 0. (If s ≤ n is an integer, then µ can be the s-dimensional Lebesgue measure on D ∩ Γs, where Γs is an s-dimensional smooth manifold.)

Then, for any u ∈ C∞(D) ∩ W^{ℓ,p}(D), one has

∑_{|j|=0}^{k} ‖D^j u‖_{Lq(D,µ)} ≤ c ‖u‖_{W^{ℓ,p}(D)} (8.5)

where c = const > 0 does not depend on u. Here the parameters q, s, ℓ, p, k satisfy one of the following sets of restrictions:

a) p > 1, 0 < r − p(ℓ − k) < s ≤ r, q ≤ sp[r − p(ℓ − k)]⁻¹;

b) p = 1, 0 < r − ℓ + k ≤ s ≤ r, q ≤ s(r − ℓ + k)⁻¹;

c) p > 1, r = p(ℓ − k), s ≤ r, q > 0 is arbitrary.

If

d) p > 1, r < p(ℓ − k), or

e) p = 1, r ≤ ℓ − k,

then

∑_{|j|=0}^{k} sup_{x∈D} |D^j u| ≤ c ‖u‖_{W^{ℓ,p}(D)}. (8.6)

If

f) p ≥ 1, (ℓ − k − 1)p < r < (ℓ − k)p, and λ := ℓ − k − rp⁻¹,

then

sup_{x,y∈D, x≠y} |x − y|^{−λ} |D^k u(x) − D^k u(y)| ≤ C ‖u‖_{W^{ℓ,p}(D)}. (8.7)

If

g) (ℓ − k − 1)p = r, then (8.7) holds for all 0 < λ < 1.

The imbedding operator i : W^{ℓ,p}(D) → W^{k,q}(D_s ∩ D) is compact if s > 0, r > (ℓ − k)p, r − (ℓ − k)p < s ≤ r, and q < sp[r − (ℓ − k)p]^{-1}.

If r = (ℓ − k)p, q ≥ 1, s ≤ r, then the above imbedding operator i is compact.

If r < (ℓ − k)p, then i : W^{ℓ,p}(D) → C^k(D) is compact.

The trace operator i : H^ℓ(D) → H^{ℓ−1/2}(Γ) is bounded if ℓ > 1/2.
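As a simple illustration of case a) (a standard special case, recorded here only as an example), take r = 3, p = 2, ℓ = 1, k = 0, s = r = 3, and let µ be Lebesgue measure on D. Then r − p(ℓ − k) = 1, and (8.5) with q ≤ sp[r − p(ℓ − k)]^{-1} = 6 gives the familiar embedding

H^1(D) ⊂ L^q(D),   q ≤ 6,

for a bounded domain D ⊂ R^3 with the cone property.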


8.1.2 Sobolev spaces with negative indices

We start with a brief exposition of the facts from distribution theory which will be used later. Let S(R^r) be the Schwartz space of C^∞(R^r) functions which decay, together with all their derivatives, faster than any negative power of |x| as |x| → ∞, so that

|φ|_m := max_{x∈R^r} (1 + |x|)^m Σ_{|j|=0}^{m} |D^j φ(x)| < ∞,   0 ≤ m < ∞.   (8.8)

The set of the norms | · |m defines the topology of S(Rr). A sequence

φn converges to φ in S if |φn − φ|m → 0 as n → ∞ for 0 ≤ m < ∞. It

is easy to see that C∞0 (Rr) is dense in S in the above sense. A space S′

of tempered distributions is the space of linear continuous functionals on

S. Continuity of a linear functional f on S means that (f, φn) → (f, φ)

if φn → φ in S. By (f, φ) the value of f at the element φ is denoted. A

wider class of distributions is used often, the class D′ of linear continuous

functionals over the space D = C∞0 (Rr) of test functions. Continuity of

f ∈ D′ means that for any compact set K ⊂ Rr there exist constants c

and m such that |(f, φ)| ≤ c Σ_{|j|≤m} sup_{x∈K} |D^j φ|, φ ∈ C^∞_0(K). By the

derivative of a distribution f ∈ S′ one means a distribution f ′ ∈ S′, such

that

(f ′, φ) = −(f, φ′) ∀φ ∈ S. (8.9)

Also

(Dmf, φ) = (−1)|m|(f,Dmφ), ∀φ ∈ S.

Let D be an open set in Rr, and S(D) be the completion of C∞0 (D) in the

topology of S(Rr). A continuous linear functional on S(D) is a distribution

in D. The space of these distributions is denoted S′(D). If (F, φ) = (f, φ)

for all φ ∈ S(D), some F ∈ S′(Rr) and some f ∈ S′(D), then f is called

the restriction of F to D and we write f = pF . Since S(D) is, by definition,

a closed linear subspace of S(R^r), the Hahn-Banach theorem says that a linear continuous functional f ∈ S′(D) can be extended to a linear continuous functional F ∈ S′(R^r). Let us denote this extension by Ef = F. The space S(R^r) is a Fréchet space, i.e. a complete locally convex metrizable space, so that the Hahn-Banach theorem holds. If F ∈ S′(R^r) then one says that F = 0 in

an open set D if (F, φ) = 0 for all φ ∈ S(D), or which is equivalent, for all

φ ∈ C∞0 (D). If D is the maximal open set on which F = 0, one says that

Ω := R^r \ D is the support of F. By Ω̄ we denote the closure of Ω. If F


is locally integrable, i.e. a usual function, then supp F is the complement

of the maximal open set on which F = 0 almost everywhere in the sense

of Lebesgue measure in Rr . Note that supp DmF ⊆ supp F . The Fourier

transform of f ∈ S′(R^r) is defined by

(\tilde f, \tilde φ^*) = (2π)^r (f, φ^*),   (8.10)

where the star stands for complex conjugation, the tilde stands for the Fourier transform, and

\tilde φ := ∫ φ(x) exp(iλ·x) dx,   ∫ := ∫_{R^r},   φ ∈ S(R^r).

This definition of \tilde f is based on the Parseval equality for functions, (\tilde φ_1, \tilde φ_2^*) = (2π)^r (φ_1, φ_2^*), and on the fact that the Fourier transform is a linear continuous bijection of S(R^r) onto itself. Note that

\widetilde{D^m f} = (−iλ)^m \tilde f,   (8.11)

F(f ∗ φ) = \tilde f \tilde φ,   Ff := \tilde f,   (8.12)

where f ∗ φ is the convolution of f ∈ S′(Rr) and φ ∈ S(Rr), defined by

f ∗ φ = (f, φ(x − y)). Here x ∈ Rr is fixed and (f, φ(x − y)) denotes the

value of f at the element φ(x− y) ∈ S(Rr). The function f ∗φ is infinitely

differentiable.

Any f ∈ S′(R^r) can be represented in the form

f = Σ_{|j|≤m} D^j f_j,   (8.13)

where the f_j are continuous functions which satisfy the inequality

|f_j(x)| ≤ c(1 + |x|)^N,   |j| ≤ m,   (8.14)

m and N are some integers, and c = const > 0. The space H^ℓ(R^r) one can define either as the closure of C^∞_0(R^r) in the norm ‖·‖_ℓ := ‖·‖_{H^ℓ}, or by noting that, for φ ∈ C^∞_0(R^r), one can define an equivalent norm by the formula

‖φ‖_ℓ^2 = (2π)^{-r} ∫ (1 + |λ|^2)^ℓ |\tilde φ|^2 dλ,   (8.15)

and

(u, v)_ℓ := (2π)^{-r} ∫ (1 + |λ|^2)^ℓ \tilde u \tilde v^* dλ,   (u, v) := (u, v)_0.   (8.16)


In (8.15) one can assume −∞ < ℓ < ∞. If f ∈ H^ℓ(R^r) and φ ∈ S(R^r) then

|(f, φ)| ≤ (2π)^{-r} ∫ |\tilde f| |\tilde φ| dλ ≤ ‖f‖_ℓ ‖φ‖_{−ℓ},   (8.17)

where

‖φ‖_{−ℓ} := (2π)^{-r/2} ( ∫ (1 + |λ|^2)^{−ℓ} |\tilde φ|^2 dλ )^{1/2}.   (8.18)

It is clear from (8.17) that one can define H^{−ℓ}(R^r) as the closure of S(R^r) in the norm

‖φ‖_{−ℓ} = sup_{f∈H^ℓ(R^r), f≠0} |(f, φ)| / ‖f‖_ℓ.   (8.19)

Thus H^{−ℓ}(R^r) is the dual space to H^ℓ(R^r) with respect to the pairing given by the inner product (·, ·)_0. Consider the spaces \dot H^ℓ(D) and \dot H^ℓ(Ω), −∞ < ℓ < ∞, of functions belonging to H^ℓ(R^r) with support in D̄ and Ω̄ respectively, Ω := R^r \ D. These spaces are subspaces of H^ℓ(R^r), closed in the H^ℓ(R^r) norm, which can be described as the completions of C^∞_0(D) and C^∞_0(Ω) in the norm of H^ℓ(R^r). If f ∈ \dot H^ℓ(D), then (f, φ) = 0 ∀φ ∈ C^∞_0(Ω).

Consider the space H^ℓ(D), where D ⊂ EH^ℓ, ℓ ≥ 0. This means that D has the property that there exists an extension operator E : H^ℓ(D) → H^ℓ(R^r) with Ef = f in D and ‖Ef‖_{H^ℓ(R^r)} ≤ c ‖f‖_{H^ℓ(D)}. A similar definition can be given for D ⊂ EW^{ℓ,p}. Bounded domains D ⊂ C^{0,1} are in the class EH^ℓ (and in EW^{ℓ,p}). The property Ef = f in D for ℓ < 0 means that pEf = f, where Ef ∈ H^ℓ(R^r) ⊂ S′(R^r), ℓ < 0, and p is the restriction operator p : S′(R^r) → S′(D). Thus, for any ℓ, −∞ < ℓ < ∞, we consider the linear space of restrictions of elements f ∈ H^ℓ(R^r) to D ⊂ C^{0,1}.

Define a norm on this space by the formula

‖f‖_ℓ := ‖f‖_{H^ℓ(D)} = inf_E ‖Ef‖_{H^ℓ(R^r)},   −∞ < ℓ < ∞,   (8.20)

where the infimum is taken over all extensions of f belonging to H^ℓ(R^r). If Ef is such an extension, then Ef + f_− is also such an extension for any f_− ∈ \dot H^ℓ(Ω).

If ℓ ≥ 0 and D ⊂ EH^ℓ, then the norm (8.20) is equivalent to the usual norm (1.4).

Let (H`(D))′ denote the dual space to H`(D) with respect to L2(D) =

H0(D). This notion we introduce in an abstract form. Let H+ and H0

be a pair of Hilbert spaces, H+ ⊂ H0, H+ is dense in H0 in H0 norm and


‖f‖_+ ≥ ‖f‖_0. Define the dual space H′_+ := H_− as follows. Note that if f ∈ H_0 and φ ∈ H_+ then

|(f, φ)| ≤ ‖f‖_0 ‖φ‖_0 ≤ ‖f‖_0 ‖φ‖_+,   (8.21)

where (f, φ) = (f, φ)_0 is the inner product in H_0. Define

‖f‖_− := sup_{φ∈H_+, φ≠0} |(f, φ)| ‖φ‖_+^{-1}.   (8.22)

It follows from (8.21) that ‖f‖_− ≤ ‖f‖_0, and that (f, φ) is a bounded linear functional on H_+. By Riesz's theorem, (f, φ)_0 = (If, φ)_+, where I : H_0 → H_+ is a linear bounded operator, ‖If‖_+ ≤ ‖f‖_0. Define H_− as the completion of H_0 in the norm (8.22). Clearly H_+ ⊂ H_0 ⊂ H_−. The triple H_+ ⊂ H_0 ⊂ H_− is called a rigged triple of Hilbert spaces. The space H_− is called the dual space to H_+ with respect to H_0. The inner product in H_− can be defined by the formula

(f, g)_− := (If, Ig)_+.   (8.23)

Indeed, the operator I was defined as an operator from H0 into H+. There-

fore, for f, g ∈ H0 the right side of (8.23) is well defined. Consider the

completion of H0 in the norm ‖ f ‖−=‖ If ‖+. Then we obtain H−. Note

that the right side of (8.22) can be written as ‖ If ‖+. Therefore, the

operator I is now defined as a linear isometry from H− into H+. In fact,

this isometry is onto H+. Indeed, suppose (If, φ)+ = 0 ∀f ∈ H−. We

wish to prove that φ = 0. If f ∈ H0, then 0 = (If, φ)+ = (f, φ)0. Thus,

φ = 0. If one considers H+ as the space of test functions, then H− is the

corresponding space of distributions.
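As a simple illustration of this construction (a standard special case, recorded only as an example), take H_+ = H^1(R^r), H_0 = L^2(R^r), with (u, v)_+ := ((1 − Δ)^{1/2}u, (1 − Δ)^{1/2}v)_0. Then the defining relation (f, φ)_0 = (If, φ)_+ gives If = (1 − Δ)^{-1}f, the norm ‖f‖_− = ‖If‖_+ = ‖(1 − Δ)^{-1/2}f‖_0 is the H^{-1}(R^r) norm, and hence H_− = H^{-1}(R^r).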

In this sense, we treat (H^ℓ(D))′ as the space of distributions corresponding to H_+ = H^ℓ(D). Since \dot H^ℓ(D) ⊂ H^ℓ(D), one has (H^ℓ(D))′ ⊂ (\dot H^ℓ(D))′. One of the results we will need is the following:

(H^ℓ(D))′ = \dot H^{−ℓ}(D),   −∞ < ℓ < ∞,
(\dot H^ℓ(D))′ = H^{−ℓ}(D),   −∞ < ℓ < ∞.   (8.24)

Here we assume that D ⊂ C^{0,1}. Let us recall that \dot H^{−ℓ}(D) can be defined as the completion of the set C^∞_0(D) in the norm of H^{−ℓ}(R^r), while H^{−ℓ}(D) can be defined as the space of restrictions of elements of H^{−ℓ}(R^r) to the domain D ⊂ C^{0,1}, with the norm (8.20).


Let us now describe a canonical factorization of the isometric operator

I : H− → H+ defined above. This factorization is

I = p+p−, (8.25)

where p− : H− → H0 and p+ : H0 → H+ are linear isometries onto H0 and

H+ respectively. In order to prove (8.25) let us note that (If, g)+ = (f, g)0for all f, g ∈ H+. Thus I is a selfadjoint positive linear operator on H+,

(If, f)+ = (f, f)0 ≥ 0 (= 0 if and only if f = 0). Let J+ be the closure of

I1/2 : H+ → H+ considered as an operator in H0. Since ‖ I1/2f ‖+=‖ f ‖0

the operator I^{1/2} is closable. If f_n ∈ H_+ and f_n converges in H_0 to f ∈ H_0, then I^{1/2}f_n converges in H_+ to some g. Let us define J_+f := g. Then J_+ is defined on all of H_0, it is an isometry: ‖J_+f‖_+ = ‖f‖_0, and its range is all of

H+. Indeed, suppose (J+f, φ)+ = 0 for all f ∈ H0 and some φ ∈ H+.

Then (f, φ)0 = 0 ∀f ∈ H0. Thus φ = 0. Since J+ is an isometry, its range

is a closed subspace in H+. We have proved above that the orthogonal

complement of the range of J+ is trivial. Therefore the range of J+ is

H_+. So one can take p_+ = J_+. Define p_− := J_+^{-1} I. Then (8.25) holds

and p− : H− → H0 is an isometry with domain H− and range H0, while

p+ : H0 → H+ is an isometry with domain H0 and range H+. One has

I = p+p−, which is the desired factorization (8.25).

If i : H+ → H0 is the imbedding operator and I is considered as an

operator from H0 into H+ then i = I∗, where I∗ is the operator adjoint to

I : H0 → H+, I∗ : H+ → H0. Indeed

(If, g)+ = (f, g)0 = (f, ig)0 ∀f, g ∈ H+. (8.26)

If one assumes that the imbedding operator i : H+ → H0 is in the

Hilbert-Schmidt class as an operator in H_0, then p_+ is in the Hilbert-

Schmidt class as an operator in H0. Indeed, p+ : H0 → H+ is an isometry.

Therefore it sends a bounded set in H0, say the unit ball of H0 :‖ u ‖0≤ 1,

into a bounded set in H+, ‖ p+u ‖+=‖ u ‖0≤ 1 if ‖ u ‖0≤ 1. Since the

imbedding i : H+ → H0 is Hilbert-Schmidt, the operator p+ considered as

an operator on H0 is in the Hilbert-Schmidt class:

p+ : H0 → H0 = (i : H+ → H0)(p+ : H0 → H+). (8.27)

The right side of (8.27) is the product of an operator in the Hilbert-Schmidt

class and a bounded operator. Therefore the product is in the Hilbert-

Schmidt class.


Define p_+^{-1} := q_+, q_+ : H_+ → H_0, ‖q_+f‖_0 = ‖f‖_+, and p_−^{-1} := q_−, q_− : H_0 → H_−, ‖q_−f‖_− = ‖f‖_0. One has

(q_+h, q_+g)_0 = (h, g)_+,   h, g ∈ H_+,   (8.28)
(p_+h, p_+g)_+ = (h, g)_0,   h, g ∈ H_0,   (8.29)
(p_−h, p_−g)_0 = (h, g)_−,   h, g ∈ H_−,   (8.30)
(q_−h, q_−g)_− = (h, g)_0,   h, g ∈ H_0.   (8.31)

Because p+p− = I, one has

q−q+ = I−1. (8.32)

Note that

(f, q+u)0 = (q−f, u)0 f ∈ H0, u ∈ H+, (8.33)

so that q− = q∗+ is the adjoint to q+ in H0.

To check (8.33) one writes

(f, q+u)0 = (q−f, q−q+u)− = (q−f, I−1u)− = (q−f, u)0 (8.34)

which is equation (8.33).

8.2 Eigenfunction expansions for elliptic selfadjoint opera-

tors

8.2.1 Resolution of the identity and integral representation

of selfadjoint operators

Every selfadjoint operator A on a Hilbert space H can be represented as

A = ∫_{−∞}^{∞} λ dE_λ,   (8.35)

where E_λ is a family of orthogonal projection operators such that

E_λ^2 = E_λ,   E_{−∞} = 0,   E_{+∞} = I,   (8.36)

E_∆ E_{∆′} = E_{∆∩∆′},   (8.37)


where 0 is the zero operator, I is the identity operator, ∆ = (a, b], −∞ <

a < b < ∞, E∆ := Eb − Ea. The family Eλ is called the resolution of the

identity of A. The domain of definition of A is:

Dom A = {f : f ∈ H, ∫_{−∞}^{∞} λ^2 d(E_λf, f) < ∞}.   (8.38)

A function φ(A) is defined as

φ(A) = ∫_{−∞}^{∞} φ(λ) dE_λ,   (8.39)

where

Dom φ(A) = {f : f ∈ H, ∫_{−∞}^{∞} |φ(λ)|^2 d(E_λf, f) < ∞}.

The operator integrals (8.35), (8.39) can be understood as improper oper-

ator Stieltjes integrals which converge strongly.
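For example (a standard illustration, not used in the sequel), let A be the operator of multiplication by x in H = L^2(R^1). Then E_λ is multiplication by the characteristic function of (−∞, λ], (8.38) reads Dom A = {f : ∫ x^2|f(x)|^2 dx < ∞}, and φ(A) in (8.39) is the operator of multiplication by φ(x).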

8.2.2 Differentiation of operator measures

A family E(∆) of selfadjoint bounded nonnegative operators, defined on the Borel sets ∆ ⊂ R^1 and taking values in the set B(H) of bounded linear operators on a Hilbert space H, is called an operator measure if E(⋃_{j=1}^{∞} ∆_j) = Σ_{j=1}^{∞} E(∆_j), where the limit on the right is taken in the sense of weak convergence of operators, ∆_i ∩ ∆_j = ∅ for i ≠ j, and E(∅) = 0. Assume that for ∆ bounded one has Tr E(∆) < ∞, where Tr A is the trace of A. Then ρ(∆) := Tr E(∆) ≥ 0 is a usual (scalar) measure on R^1. Let |Ψ| denote the Hilbert-Schmidt norm of a linear operator Ψ, |Ψ| = (Σ_{j=1}^{∞} ‖Ψe_j‖^2)^{1/2}, where {e_j} is an orthonormal basis of H.

where ej is an orthonormal basis of H. In this chapter the star will

stand for the adjoint operator and the bar for complex conjugate or for the

closure.

Lemma 8.1 For ρ-a.e. (almost every) λ there exists a HS (Hilbert-Schmidt) operator-valued function Ψ(λ) ≥ 0 with ‖Ψ(λ)‖ ≤ Tr Ψ(λ) ≤ 1 such that

E(∆) = ∫_∆ Ψ(λ) dρ(λ).   (8.40)

The function Ψ(λ) is uniquely defined ρ-a.e. and can be obtained as a weak limit

Ψ(λ) = w-lim E(∆_j) ρ^{-1}(∆_j) as ∆_j → λ.   (8.41)

The integral in (8.40) converges in the norm of operators for any bounded ∆. The limit in (8.41) means that λ ∈ ∆_j and |∆_j| → 0, where |∆_j| is the length of ∆_j, and ∆_j is a suitable sequence of intervals.

Let A be a selfadjoint operator on H and E(∆) be its resolution of the

identity. In general, Tr E(∆) is not finite, so that Lemma 8.1 is not applicable.

In order to be able to use this lemma, let us take an arbitrary linear densely

defined closed operator T on H with the properties: i) RanT = H, T is

injective, ii) T−1 ∈ σ2 where σ2 is the class of Hilbert-Schmidt operators.

Definition 8.1 A linear compact operator K on H belongs to the class σ_p, K ∈ σ_p, if and only if Σ_{n=1}^{∞} s_n^p(K) < ∞, where s_n(K) are the s-values of K, which are defined by the formula s_n(K) = λ_n[(K^*K)^{1/2}], and λ_n(B) are the eigenvalues of a compact selfadjoint nonnegative operator B ordered so that λ_1 ≥ λ_2 ≥ · · · ≥ 0. If K ∈ σ_1 it is called a trace class operator; if K ∈ σ_2 it is called a HS operator.

More information about trace class operators is given in Section 8.3.3.

Having chosen T as above, define the operator measure θ(∆) := (T^{-1})^* E(∆) T^{-1}. If A ∈ σ_2, define its HS norm by the formula |A|^2 := Σ_{j=1}^{∞} ‖Ae_j‖^2, where ‖Af‖ is the norm in H of the vector Af, and {e_j} is an orthonormal basis of H. By ‖A‖ we denote the operator norm of A. Note that |AB| ≤ |A| ‖B‖, ‖A‖ ≤ |A|, |λA| = |λ| |A|, |A + B| ≤ |A| + |B|, and that |A| does not depend on the choice of the orthonormal basis {e_j} of H. Since Tr(B^*AB) ≤ ‖A‖ |B|^2, one has Tr θ(∆) ≤ ‖E(∆)‖ |T^{-1}|^2 ≤ |T^{-1}|^2 < ∞. Therefore Lemma 8.1 is applicable to θ(∆), so that

(E(∆)f, g) = ((T^{-1})^* E(∆) T^{-1} Tf, Tg) = ∫_∆ (Ψ(λ)Tf, Tg) dρ(λ),   (8.42)

where dρ is a nonnegative measure, ρ((−∞,∞)) < ∞, Ψ(λ) is a nonnegative operator function, Ψ(λ) ≥ 0, and |Ψ(λ)| ≤ Tr Ψ(λ) = 1.

Let, for a fixed λ, φ_α(λ) and ν_α(λ), α = 1, 2, . . . , N_λ ≤ ∞, be respectively the orthonormal system of eigenvectors of Ψ(λ) and the corresponding eigenvalues. Then

(Ψ(λ)Tf, Tg) = Σ_{α=1}^{N_λ} ν_α (Tf, φ_α) \overline{(Tg, φ_α)} = Σ_{α=1}^{N_λ} (Tf, ψ_α) \overline{(Tg, ψ_α)},   (8.43)


where the bar denotes complex conjugate,

ψ_α := ν_α^{1/2}(λ) φ_α,   Σ_{α=1}^{N_λ} ‖ψ_α‖^2 = Σ_{α=1}^{N_λ} ν_α = Tr Ψ(λ) = 1.   (8.44)

One can write

(E(∆)f, g) = ∫_∆ Σ_{α=1}^{N_λ} (Tf, ψ_α(λ)) \overline{(Tg, ψ_α(λ))} dρ(λ).   (8.45)

If F(λ) is a Borel measurable function on Λ, then

(F(A)f, g) = ∫_{−∞}^{∞} F(λ) (Ψ(λ)Tf, Tg) dρ(λ),   (8.46)

where f ∈ D(F(A)) ∩ D(T), g ∈ D(T). If one takes T = q_+ and if the imbedding i : H_+ → H_0 is in the Hilbert-Schmidt class, then the operator P(λ) := T^* Ψ(λ) T, which appears in (8.42):

(E(∆)f, g) = ∫_∆ (P(λ)f, g) dρ(λ),   (8.47)

and in (8.46):

(F(A)f, g) = ∫_{−∞}^{∞} F(λ) (P(λ)f, g) dρ(λ),   (8.48)

can be considered as an operator from H+ into H− for any fixed λ ∈ R1.

This operator is in the Hilbert-Schmidt class because it is a product of two

bounded operators and a Hilbert-Schmidt operator Ψ(λ): T ∗ = q− : H0 →H− is bounded, T = q+ : H+ → H0 is bounded, Ψ(λ) : H0 → H0 is in

the Hilbert-Schmidt class. The range of the operator P (λ) is a generalized

eigenspace of the operator A corresponding to the point λ. This eigenspace

belongs to H_−. Formula (8.47) can be written as

E(∆) = ∫_∆ P(λ) dρ(λ),   (8.49)

where the integral (8.49) converges weakly, as follows from formula (8.47), but it also converges in the Hilbert-Schmidt norm of the operators in L(H_+, H_−), where by L(H_+, H_−) we denote the set of linear bounded operators from H_+ into H_−. Indeed,

|P(λ)| ≤ ‖T^*‖ |Ψ(λ)| ‖T‖ ≤ |Ψ(λ)| ≤ 1,   (8.50)

where we took into account that ‖T‖ = ‖q_+‖ = 1, ‖T^*‖ = ‖q_−‖ = 1.


The operator P (λ) is an orthogonal projector in the following sense. If

φ ∈ H+ and

(P (λ)u, φ)0 = 0 ∀u ∈ H+ (8.51)

then

P (λ)φ = 0. (8.52)

Indeed,

0 = (P(λ)u, φ)_0 = (q_−Ψ(λ)q_+u, φ)_0 = (Ψ(λ)q_+u, q_+φ)_0
  = (q_+u, Ψ(λ)q_+φ)_0 = (u, q_−Ψ(λ)q_+φ)_0 = (u, P(λ)φ)_0   ∀u ∈ H_+.   (8.53)

Thus, equation (8.52) follows from (8.53). Therefore, if φ is orthogo-

nal to the range of P (λ) then the projection of φ onto the range of P (λ)

vanishes.

Let us rewrite formula (8.43) in terms of the generalized eigenvectors.

Define

T ∗ψα = q−ψα := ηα ∈ H−. (8.54)

Then (8.43) can be rewritten as

(P(λ)f, g) = Σ_{α=1}^{N_λ} f_α(λ) \overline{g_α(λ)},   (8.55)

where

fα(λ) := (f, ηα)0, f ∈ H+. (8.56)

Formula (8.45) becomes

(E(∆)f, g) = ∫_∆ Σ_{α=1}^{N_λ} f_α(λ) \overline{g_α(λ)} dρ(λ).   (8.57)

Since P(λ) is in the Hilbert-Schmidt class, it is an integral operator with a kernel Φ(x, y, λ), the (generalized) spectral kernel (see the Remark at the end of Section 3.6). The operator E_λ is an integral operator with the kernel

E(x, y, λ) = ∫_{−∞}^{λ} Φ(x, y, µ) dρ(µ).   (8.58)


The operator F(A) is an integral operator with the kernel

F(A)(x, y) = ∫_{−∞}^{∞} F(λ) Φ(x, y, λ) dρ(λ).   (8.59)
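As a formal illustration of (8.58) (the weights q_± are omitted here, so ρ is not a finite measure; this is only an example), take A = −d^2/dx^2 in L^2(R^1). Then

E(x, y, λ) = sin(√λ (x − y)) / (π(x − y)) for λ > 0,   E(x, y, λ) = 0 for λ ≤ 0,

and, with dρ(λ) = dλ, the spectral kernel is Φ(x, y, λ) = cos(√λ (x − y)) / (2π√λ), λ > 0.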

8.2.3 Carleman operators

An integral operator

Af = ∫ A(x, y) f(y) dy,   ∫ := ∫_{R^r},   (8.60)

is called a Carleman operator if

ν(A) := sup_{x∈R^r} ∫ |A(x, y)|^2 dy < ∞.   (8.61)

A selfadjoint operator L is called a Carleman operator if there exists a

continuous function φ(λ),

φ(λ) ∈ C(Λ), 0 < |φ(λ)| ≤ c ∀λ ∈ Λ, (8.62)

where c = const > 0, Λ is the spectrum of L, such that the operator

A = φ(L) is an integral operator with the kernel A(x, y) which satisfies

(8.61).

Let H_0 = L^2(R^r) and take H_+ = L^2(R^r, p(x)), where p(x) ≥ 1 and

∫ p^{-1}(x) dx < ∞.   (8.63)

For example, one can take p(x) = (1 + |x|2)(r+ε)/2, where ε > 0 is any

positive number. Then the operator A defined by the formula (8.60) is in

the Hilbert-Schmidt class σ2(H0,H+) if the condition (8.63) holds.

Indeed, if {f_j}, 1 ≤ j < ∞, is an orthonormal basis of H_0, then

Σ_{j=1}^{∞} ‖Af_j‖_+^2 = Σ_{j=1}^{∞} ∫ p^{-1}(x) | ∫ A(x, y) f_j(y) dy |^2 dx
  = ∫ dx p^{-1}(x) ∫ |A(x, y)|^2 dy ≤ ν(A) ∫ dx p^{-1}(x) < ∞,   (8.64)

where Parseval's equality was used to get the second equality, and the conditions (8.61) and (8.63) were used to get the final inequality (8.64).


If A = φ(L) and A ∈ σ_2(H_0, H_+), one can use the triple

L^2(R^r, p(x)) ⊂ L^2(R^r) ⊂ L^2(R^r, p^{-1}(x))   (8.65)

for eigenfunction expansions of the operator L. Here

H_+ = L^2(R^r, p(x)),   H_0 = L^2(R^r),   H_− = L^2(R^r, p^{-1}(x)).   (8.66)

Indeed, the basic condition which has to be satisfied for the theory developed above to be valid is

Tr[(T^{-1})^* E(∆) T^{-1}] < ∞,   (8.67)

provided that ∆ is bounded. Since φ(λ) is continuous and (8.62) holds, one has

ξ_∆(λ) ≤ |φ(λ)|^2 c(∆),   λ ∈ Λ,   (8.68)

where ξ_∆(λ) is the characteristic function of ∆:

ξ_∆(λ) = 1 if λ ∈ ∆,   ξ_∆(λ) = 0 if λ ∉ ∆,   (8.69)

and c(∆) = const > 0. Inequality (8.68) implies

E(∆) ≤ c(∆) ∫_Λ |φ(λ)|^2 dE_λ = c(∆) φ^*(L) φ(L).   (8.70)

Therefore

Tr[(T^{-1})^* E(∆) T^{-1}] ≤ c(∆) Tr{[φ(L)T^{-1}]^* φ(L)T^{-1}} = c(∆) |φ(L)T^{-1}|^2.   (8.71)

Thus (8.67) holds if

|φ(L)T^{-1}| < ∞.   (8.72)

If T^{-1} = p_+, then condition (8.72) becomes

|φ(L)p_+| < ∞.   (8.73)

In the case of the triple (8.65)-(8.66) the operator p_+ is the multiplication operator given by the formula

p_+f = p^{-1/2}(x) f(x).   (8.74)


Condition (8.73) holds if p(x) satisfies condition (8.63) and φ(L) = A is a Carleman operator, so that condition (8.61) holds. Indeed, inequality (8.73) holds if

∫∫ |A(x, y)|^2 p^{-1}(y) dy dx < ∞.   (8.75)

We assume that the function φ(λ) is such that (8.61) implies

sup_{y∈R^r} ∫ |A(x, y)|^2 dx < ∞.   (8.76)

If this is the case, then inequality (8.75) holds provided that (8.63) and (8.76) hold. Therefore in this case one can use the triple (8.65) for eigenfunction expansions of the operator L, and the generalized eigenfunctions of L are elements of L^2(R^r, p^{-1}(x)), so that they belong to L^2_{loc}(R^r).

Inequality (8.61) implies (8.76), for example, if φ(λ) is a real-valued function. In this case A = A^*, so that

A(x, y) = \overline{A(y, x)},   (8.77)

and if (8.77) holds then clearly (8.61) implies (8.76).
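For example (an illustration only), for L = −Δ the generalized eigenfunctions are the plane waves e^{iξ·x}. They do not belong to L^2(R^r), but ∫ |e^{iξ·x}|^2 p^{-1}(x) dx = ∫ p^{-1}(x) dx < ∞ by (8.63), so they are elements of H_− = L^2(R^r, p^{-1}(x)), in agreement with the statement above.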

In many applications one takes

φ(λ) = (λ − z)^{-m},   (8.78)

where z is a complex number and m is a sufficiently large positive integer. If φ is so chosen, then

A(x, y; z) = ∫_{−∞}^{∞} (λ − z)^{-m} Φ(x, y, λ) dρ(λ)   (8.79)

and

A(y, x; z) = ∫_{−∞}^{∞} (λ − z)^{-m} Φ(y, x, λ) dρ(λ)
  = ∫_{−∞}^{∞} (λ − z)^{-m} \overline{Φ(x, y, λ)} dρ(λ)
  = [ ∫_{−∞}^{∞} (λ − z^*)^{-m} Φ(x, y, λ) dρ(λ) ]^*
  = \overline{A(x, y; z^*)},   (8.80)

where we have used the equation

Φ(x, y, λ) = \overline{Φ(y, x, λ)},   (8.81)


which follows from the assumed selfadjointness of L. Therefore, if both kernels A(x, y; z) and A(x, y; z^*) satisfy inequality (8.61), then (8.76) holds.

8.2.4 Elements of the spectral theory of elliptic operators

in L2(Rr)

Let

Lu = Σ_{|j|≤s} a_j(x) D^j u,   D^j = D_1^{j_1} · · · D_r^{j_r},   D_p = −i ∂/∂x_p,   (8.82)

where x ∈ R^r, j is a multiindex, a_j(x) ∈ C^{|j|}(R^r),

a_s(x, ξ) := Σ_{|j|=s} a_j(x) ξ^j ≠ 0 for (x, ξ) ∈ R^r × (R^r \ {0}),   (8.83)

and assume that L is formally selfadjoint

L = L∗ (8.84)

that is

(Lφ, ψ) = (φ,Lψ) ∀φ, ψ ∈ C∞0 (Rr). (8.85)

The function as(x, ξ) is called the symbol of the elliptic operator (8.82),

and condition (8.83) is the ellipticity condition. If (8.83) holds and r ≥ 3

then s is necessarily an even number.

Often one assumes that L is strongly elliptic. This means that

Σ_{|j|=s} Re a_j(x) ξ^j ≠ 0 for (x, ξ) ∈ R^r × (R^r \ {0}).   (8.86)

The assumptions (8.84) and (8.86) imply that the operator L is bounded from below on C^∞_0(R^r):

(Lu, u)_0 ≥ c_1 ‖u‖^2_{H^{s/2}(R^r)} − c_2 ‖u‖^2_{L^2(R^r)}   ∀u ∈ C^∞_0(R^r),   (8.87)

where c1 and c2 are positive constants.
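For instance (a minimal check of (8.87), with the normalization ‖u‖_{H^1}^2 = ‖u‖^2 + ‖∇u‖^2), take L = −Δ, so that s = 2. Integration by parts gives, for u ∈ C^∞_0(R^r),

(−Δu, u)_0 = ‖∇u‖^2_{L^2} = ‖u‖^2_{H^1(R^r)} − ‖u‖^2_{L^2(R^r)},

so (8.87) holds with c_1 = c_2 = 1.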

Define the minimal operator in L^2(R^r) generated by the formally selfadjoint differential expression (8.82) as the closure of the symmetric operator u → Lu with the domain of definition C^∞_0(R^r). Any densely defined symmetric operator L on a Hilbert space H is closable. Recall that \bar L, the closure of L, is defined as follows. Let u_n ∈ Dom L := D(L), u_n → u in H and Lu_n → f in H. Then one declares that u ∈ D(\bar L) and \bar L u = f. This definition implies that \bar L is defined on the closure of D(L) in the graph norm ‖u‖_L := ‖u‖ + ‖Lu‖, and the graph of \bar L is the closure in the graph norm of the graph of L, that is, of the set of ordered pairs {u, Lu}, u ∈ D(L). One says that L is closable if and only if the closure of the graph of L is itself a graph. In other words, L is closable if and only if there is no pair {0, f}, f ≠ 0, in the closure of the graph of L. This means that if

un ∈ D(L), un → 0 and Lun → f (8.88)

then

f = 0. (8.89)

If L is symmetric and densely defined and φ ∈ D(L), then

(f, φ) = lim_{n→∞} (Lu_n, φ) = lim_{n→∞} (u_n, Lφ) = 0   ∀φ ∈ D(L).   (8.90)

Since D(L) is dense, one concludes that f = 0. Therefore L is closable. We denote \bar L by L_m.

Under some assumptions on the coefficients a_j(x) it turns out that L_m is selfadjoint. If no assumptions on the growth of a_j(x) as |x| → ∞ are made, then L_m may fail to be selfadjoint (see [Ma, p. 156] for an example). Let us give some sufficient conditions for L_m to be selfadjoint.

Note that if L is densely defined symmetric and bounded from below it

has always a (unique) selfadjoint extension with the same lower bound, the

Friedrichs extension LF . This extension is characterized by the fact that

its domain of definition belongs to the energy space of the operator L, that

is to the Hilbert space HL defined as the closure of D(L) in the metric

[u, u] = (Lu, u) + c(u, u), (8.91)

where c > 0 is a sufficiently large constant such that L + cI is positive

definite on D(L).

Since the closure L of L is the minimal closed extension of L and since

LF is a closed extension of L, one concludes that if L is bounded from

below in H = L2(Rr) and Lm is selfadjoint then

Lm = LF . (8.92)

In order to give conditions for Lm to be selfadjoint, consider first the case

when

a_j(x) = \overline{a_j(x)} = const,   |j| ≤ s.   (8.93)

In this case L is symmetric on C∞0 (Rr).


Lemma 8.2 If (8.93) holds then L_m is selfadjoint. Its spectrum consists of the set

{λ : λ = Σ_{|j|≤s} a_j ξ^j, ξ ∈ R^r}.   (8.94)

Proof. Let

Fu = (2π)^{-r/2} ∫ exp(−iξ·x) u(x) dx := \tilde u(ξ),   (8.95)

u(x) = F^{-1}\tilde u = (2π)^{-r/2} ∫ exp(iξ·x) \tilde u(ξ) dξ,   (8.96)

F(D_p u) = ξ_p \tilde u(ξ),   1 ≤ p ≤ r,   D_p = −i ∂/∂x_p.   (8.97)

Therefore

F(Σ_{|j|≤s} a_j D^j u) = Σ_{|j|≤s} a_j ξ^j \tilde u,

and

Lu = F^{-1} L(ξ) F u,   (8.98)

where

L(ξ) := Σ_{|j|≤s} a_j ξ^j.   (8.99)

The operator F of the Fourier transform is unitary in L^2(R^r). Formula (8.98) shows that the operator L is unitarily equivalent to the operator of multiplication by the polynomial L(ξ) defined by formula (8.99). This multiplication operator is defined on the set F(C^∞_0(R^r)) of the Fourier transforms of the functions belonging to the set C^∞_0(R^r), that is, to the domain of definition of the operator L. Consider the closure M in L^2(R^r_ξ) of the operator of multiplication by the function L(ξ):

M\tilde u := L(ξ)\tilde u(ξ).   (8.100)

The domain of definition of M is

D(M) = {\tilde u : L(ξ)\tilde u(ξ) ∈ L^2(R^r)}.   (8.101)


The operator M is clearly selfadjoint since the function L(ξ) is real-valued.

Therefore

Lm = F−1MF (8.102)

is also selfadjoint. Since the spectra of the unitarily equivalent operators

are identical and the spectrum of M is the set (8.94), the spectrum of L_m is the set (8.94). Lemma 8.2 is proved.
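For example, for L = −Δ = Σ_{p=1}^{r} D_p^2 one has L(ξ) = |ξ|^2, Dom L_m = H^2(R^r), and (8.94) gives for the spectrum of L_m the half-axis [0, ∞).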

The following well-known simple lemmas are useful for proving, under

suitable assumptions, that Lm is selfadjoint.

Lemma 8.3 Let A and B be linear operators on a Hilbert space H, A is

selfadjoint, D(A) ⊂ D(B), B is symmetric, and

‖ Bu ‖≤ ε ‖ Au ‖ +c ‖ u ‖, ∀u ∈ D(A), (8.103)

where 0 < ε < 1 is a fixed number and c is a positive number. Then the

operator A +B with Dom(A +B) = DomA is self-adjoint.

Lemma 8.4 Let A be a symmetric densely defined operator in a Hilbert space H. Then \bar A, the closure of A, is selfadjoint if and only if one of the following conditions holds:

cl Ran(A ± iλ) = H   (8.104)

or

N(A^* ± iλ) = {0}.   (8.105)

Here λ > 0 is an arbitrary fixed number, cl Ran A is the closure of the range of A,

N(B) := {u : Bu = 0},   (8.106)

and A^* is the adjoint of A.

For convenience of the reader let us prove these lemmas. We start with

Lemma 8.4.

Proof of Lemma 8.4. a) The argument is the same for any λ > 0, so let us take λ = 1. Suppose that \bar A is selfadjoint. Then (\bar A)^* = A^* = \bar A. If A^*u = iu, then

(A^*u, u) = i(u, u).   (8.107)


Since A^* = \bar A is selfadjoint, the quadratic form (A^*u, u) is real-valued. Thus, equation (8.107) implies u = 0, and (8.105) is established. To derive (8.104) one uses the formula

cl Ran(A ± i) ⊕ N(A^* ∓ i) = H,   (8.108)

where ⊕ is the symbol of the orthogonal sum of the subspaces. From (8.108) and (8.105) one gets (8.104).

b) Assume now that (8.105) holds and let us prove that \bar A is selfadjoint. If (8.105) holds, then (8.104) holds, as follows from (8.108). Conversely, (8.104) implies (8.105). Note that

cl Ran(\bar A ± i) = Ran(\bar A ± i)   (8.109)

because Ran(\bar A ± i) are closed subspaces. Indeed, if (\bar A ± i)u_n = f_n and f_n → f, then

‖(\bar A ± i)u_{nm}‖ → 0,   n, m → ∞,   (8.110)

where u_{nm} := u_n − u_m. Since \bar A is symmetric, one obtains from (8.110) that

‖(\bar A ± i)u_{nm}‖^2 = ‖\bar A u_{nm}‖^2 + ‖u_{nm}‖^2 → 0,   n, m → ∞.   (8.111)

Therefore {u_n} is a Cauchy sequence. Let u_n → u. Then, since \bar A is closed, one obtains

(\bar A ± i)u = f.   (8.112)

Therefore Ran(\bar A ± i) are closed subspaces. If (8.105) holds, then

Ran(\bar A ± i) = H.   (8.113)

This and the symmetry of \bar A imply that \bar A is selfadjoint. Indeed, let

((\bar A + i)u, v) = (u, f)   ∀u ∈ D(\bar A).   (8.114)

Using (8.113), find w such that

(\bar A − i)w = f.   (8.115)

This is possible because of (8.113). Use (8.114) and the symmetry of \bar A to obtain

((\bar A + i)u, v) = (u, (\bar A − i)w) = ((\bar A + i)u, w).   (8.116)


By (8.113) one has Ran(\bar A + i) = H. This and (8.116) imply v = w. Therefore v ∈ D(\bar A), and \bar A is selfadjoint on D(\bar A). Lemma 8.4 is proved.

Proof of Lemma 8.3 It is sufficient to prove that, for some λ > 0,

Ran(A +B ± iλ) = H. (8.117)

One has

A + B ± iλ = [I + B(A ± iλ)^{-1}](A ± iλ).   (8.118)

Equation (8.117) is established as soon as we prove that

‖ B(A ± iλ)−1 ‖< 1. (8.119)

Indeed, if (8.119) holds then the operator I+B(A+iλ)−1 is an isomorphism

of H onto H and, since A is selfadjoint, Ran(A ± iλ) = H, so

Ran(A + B ± iλ) = Ran{[I + B(A ± iλ)^{-1}](A ± iλ)} = H.   (8.120)

Use the basic assumption (8.103) to prove (8.119). Let

(A+ iλ)−1u = f, B(A + iλ)−1u = Bf. (8.121)

Then

‖B(A + iλ)^{-1}u‖ = ‖Bf‖ ≤ ε‖Af‖ + c‖f‖ = ε‖A(A + iλ)^{-1}u‖ + c‖(A + iλ)^{-1}u‖ ≤ ε‖u‖ + cλ^{-1}‖u‖ = (ε + cλ^{-1})‖u‖.   (8.122)

If ε < 1 and λ > 0 is large enough, then

ε+ cλ−1 < 1. (8.123)

Thus (8.119) holds, and Lemma 8.3 is proved. In deriving (8.122) we

have used the inequalities

‖ (A ± iλ)−1 ‖≤ λ−1, λ > 0, (8.124)

and

‖ A(A± iλ)−1 ‖≤ 1, (8.125)


both of which follow immediately from the spectral representation of a selfadjoint operator A: if

φ(A) = ∫ φ(t) dE_t,   (8.126)

then

‖φ(A)‖ ≤ max_t |φ(t)|.   (8.127)

Both inequalities (8.124) and (8.125) can be derived in a simple way which

does not use the result (8.126)-(8.127). This derivation is left for the reader

as an exercise.

We now return to the question of the selfadjointness of L_m. If L_m is selfadjoint, then L is called essentially selfadjoint. We wish to prove that if the principal part of L is an operator with constant coefficients and the remaining part is an operator with smooth bounded coefficients, then L is essentially selfadjoint. The principal part of L is the differential expression

L_0 := Σ_{|j|=s} a_j(x) D^j.   (8.128)

Let us recall that a polynomial P(ξ) is subordinate to the polynomial Q(ξ) if

|P(ξ)| / (1 + |Q(ξ)|) → 0 as |ξ| → ∞, ξ ∈ R^r.   (8.129)

If (8.129) holds then we write

P ≺≺ Q. (8.130)

We say that Q(ξ) is stronger than P(ξ), and write

P ≺ Q,   (8.131)

if

\tilde P(ξ) / \tilde Q(ξ) ≤ c   ∀ξ ∈ R^r,   (8.132)

where c > 0 is a constant, and

\tilde P(ξ) := ( Σ_{|j|≥0} |P^{(j)}(ξ)|^2 )^{1/2}.   (8.133)


In the following lemma a characterization of elliptic polynomials is given. A homogeneous polynomial Q(ξ) of degree s is called elliptic if

Q(ξ) ≠ 0 for ξ ∈ R^r \ {0}.   (8.134)

Lemma 8.5 A homogeneous polynomial Q(ξ) of degree s is elliptic if and only if it is stronger than every polynomial of degree ≤ s. In particular,

c|ξ|^s ≤ |Q(ξ)|,   (8.135)

where c = const > 0.

A proof of this result can be found in [Hormander (1983-85), vol. II, p.

37].

Lemma 8.6 Assume that

Lu = L_0 u + Σ_{|j|<s} a_j(x) D^j u   (8.136)

and

L_0 u = Σ_{|j|=s} a_j D^j u,   (8.137)

where

L_0(ξ) := Σ_{|j|=s} a_j ξ^j is an elliptic polynomial,   (8.138)

and

sup_{x∈R^r} |a_j(x)| ≤ c,   |j| < s.   (8.139)

Then L is essentially selfadjoint on C^∞_0(R^r) and \bar L = L_m is selfadjoint on H^s(R^r).

Proof. By Lemma 8.2 L0 defined on C∞0 (Rr) is essentially selfadjoint.

If (8.138) holds then inequality (8.135) holds. Therefore the closure of L0,

the operator L0m, is selfadjoint and

DomL0m = Hs(Rr). (8.140)

Let us apply Lemma 8.3 with A = L0m and B = L − L0. The basic

condition to check is inequality (8.103) on Hs(Rr). It is sufficient to check

this inequality on C^∞_0(R^r), since C^∞_0(R^r) is dense in H^s(R^r). If (8.103) is established for any u ∈ C^∞_0(R^r), then one takes u ∈ H^s(R^r) and a sequence


un ∈ C∞0 (Rr) such that ‖ un − u ‖Hs(Rr)→ 0, n → ∞, and passes to the

limit n → ∞ in (8.103). This yields inequality (8.103) for any u ∈ Hs(Rr).

In order to check inequality (8.103) it is sufficient to prove that

‖ aj(x)Dju ‖≤ ε ‖ L0u ‖ +c(ε) ‖ u ‖, ∀u ∈ C∞0 (Rr) (8.141)

for any ε > 0, however small, and any j such that |j| < s. Using (8.139)

and Parseval’s equality one obtains:

‖a_j(x)D^j u‖ ≤ c‖D^j u‖ = c‖ξ^j \tilde u‖.   (8.142)

On the other hand, Parseval's equality, condition (8.138) and inequality (8.135) yield

‖L_0 u‖ = ‖L_0(ξ)\tilde u‖ ≥ c‖ |ξ|^s \tilde u‖.   (8.143)

If |j| < s, then

|ξ^j| ≤ ε|ξ|^s for |ξ| > R = R(ε).   (8.144)

In the region |ξ| ≤ R one estimates

|ξ^j \tilde u| ≤ c(R)|\tilde u|,   (8.145)

where, for example, one can take c(R) = R^{|j|}. Therefore

∫_{|ξ|≤R} |ξ^j \tilde u|^2 dξ ≤ c^2(R) ‖\tilde u‖^2 = c^2(R) ‖u‖^2,   (8.146)

and

∫_{|ξ|>R} |ξ^j \tilde u|^2 dξ ≤ ε^2 ∫ |L_0(ξ)\tilde u|^2 dξ = ε^2 ‖L_0 u‖^2,   (8.147)

where Parseval's equality and estimates (8.135) and (8.144) were used.

From (8.142), (8.146) and (8.147) one obtains the desired inequality (8.141).

Lemma 8.6 is proved.

Remark 8.1 Note that the method of the proof allows one to relax con-

dition (8.139). For example, one could use some integral inequalities to

estimate the Fourier transform.
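A simple example: L = −Δ + q(x) with q real-valued, smooth and bounded on R^r. Here L_0 = −Δ, and (8.141) holds trivially, since ‖qu‖ ≤ sup|q| ‖u‖ ≤ ε‖L_0u‖ + sup|q| ‖u‖ for any ε > 0; hence, by Lemma 8.6, L is essentially selfadjoint on C^∞_0(R^r) and L_m is selfadjoint on H^2(R^r).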

Let us now prove the following result.

Lemma 8.7 If L defined by (8.82) has smooth and uniformly bounded

in Rr coefficients such that the principal part of L is an elliptic operator


of uniformly constant strength, then L is essentially selfadjoint on C∞0 (Rr)

and Lm is selfadjoint on Hs(Rr).

Proof. Let us recall that the principal part L_0 of L,

L_0 u = Σ_{|j|=s} a_j(x) D^j u,   (8.148)

is an elliptic operator of uniformly constant strength if the principal symbol

a_s(x, ξ) := Σ_{|j|=s} a_j(x) ξ^j

satisfies the ellipticity condition (8.83) and the condition of uniformly constant strength

\tilde a_s(x, ξ) / \tilde a_s(y, ξ) ≤ c   ∀x, y ∈ R^r,   (8.149)

where c does not depend on x, y, and

\tilde a_s(x, ξ) := ( Σ_{|j|≥0} |D_ξ^j a_s(x, ξ)|^2 )^{1/2}.   (8.150)

By Lemma 8.4 the operator Lm is selfadjoint on Hs(Rr) if the equations

(Lm ± iλ)u = f, (8.151)

are solvable in Hs(Rr) for any f ∈ C∞0 (Rr) and for some λ > 0. Indeed, in

this case Ran(Lm ± iλ) ⊃ C∞0 (Rr) and therefore is dense in H = L2(Rr).

Since Lm is symmetric Ran(Lm ± iλ) is closed in L2(Rr) and, being dense

in L2(Rr), has to coincide with L2(Rr). This implies, by Lemma 8.4 that

Lm is selfadjoint.

Existence of the solution to (8.151) in H^s(R^r) for any f ∈ C^∞_0(R^r) follows from the existence of the fundamental solution E(x, y, λ):

(L_m ± iλ)E(x, y, λ) = δ(x − y) in R^r,   (8.152)

and the estimate

|E(x, y, λ)| ≤ c|x − y|^{s−r} if r is odd or r > s, |x − y| ≤ 1;
|E(x, y, λ)| ≤ c|x − y|^{s−r} + c_1 log(1/|x − y|) if r is even and r ≤ s, |x − y| ≤ 1,   (8.153)


where c and c_1 are positive constants, and

|E(x, y, λ)| ≤ c exp(−a(λ)|x − y|) if |x − y| ≥ 1,   (8.154)

where c > 0 is a constant and a(λ) > 0 is a constant, a(λ) → +∞ as λ → +∞. Also E(x, y, λ) is smooth away from the diagonal x = y, and the following estimates hold:

|D^j E(x, y, λ)| ≤ c exp(−a(λ)|x − y|) if |x − y| ≥ 1,   (8.155)

|D^j E(x, y, λ)| ≤ c_0 + c_1 |x − y|^{s−r−|j|} if s ≠ r + |j|, |x − y| ≤ 1;
|D^j E(x, y, λ)| ≤ c_0 + c_1 |log |x − y|| if s = r + |j|, |x − y| ≤ 1.   (8.156)

Indeed, if the fundamental solution with the properties (8.152)-(8.156) exists, then

u = ∫_{R^r} E(x, y, λ) f(y) dy   (8.157)

solves (8.151) and u ∈ H^s(R^r), so that L_m is selfadjoint.

Existence of the fundamental solution with the properties (8.152)-

(8.156) for elliptic selfadjoint operators with constant coefficients can be

established if one uses the Fourier transform [Hormander (1983-85), vol.

I, p. 170], and for the operators of uniformly constant strength it is es-

tablished in [Hormander (1983-85), vol. II, p. 196]. Thus, Lemma 8.7 is

proved.

It is not difficult now to establish that the function (L^N − iλ)^{-1} := φ(L), λ > 0, has a kernel which defines a Carleman operator if N > r/(2s). Indeed, it follows from the estimate (8.156) that the singularity of the kernel of the operator φ(L) is O(|x − y|^{Ns−r}), so that this kernel is locally in L^2 if N > r/(2s). On the other hand, the estimate (8.154) implies that the kernel of φ(L) is in L^2(R^r) globally. Since the constants in the inequalities (8.154) and (8.156) do not depend on x, one concludes that φ(L) is a Carleman operator. Let us formulate this result as Lemma 8.8.

Lemma 8.8 Suppose that N > r/(2s) and the assumptions of Lemma 8.7 hold. Then the operator (L^N − iλ)^{-1}, λ > 0, is a Carleman operator.
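For example (a standard computation, given only as an illustration), let L = −Δ in R^3, so s = 2, r = 3, and one can take N = 1 > r/(2s) = 3/4. The kernel of (−Δ − z)^{-1}, Im z > 0, is

A(x, y; z) = exp(i√z |x − y|) / (4π|x − y|),   Im √z > 0,

and sup_x ∫ |A(x, y; z)|^2 dy = (4π)^{-1} ∫_0^∞ e^{-2 Im√z t} dt = [8π Im √z]^{-1} < ∞, so (8.61) holds.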


8.3 Asymptotics of the spectrum of linear operators

In this section we develop an abstract theory of perturbations preserving

asymptotics of the spectrum of linear operators. As a by-product a proof

of Theorem 2.3 is obtained.

8.3.1 Compact operators

8.3.1.1 Basic definitions

Let H be a separable Hilbert space and A : H → H be a linear operator.

The set of all bounded operators we denote L(H). The set of all linear

compact operators on H we denote σ∞. Let us recall that an operator A

is called compact if it maps bounded sets into relatively compact sets. It is

well known that A is compact if and only if one of the following conditions

hold

1) f_n ⇀ f, g_n ⇀ g implies (Af_n, g_n) → (Af, g);

2) f_n ⇀ f implies Af_n → Af;

3) from any bounded sequence of elements of H one can select a subsequence {f_n} such that (Af_{nm}, f_{nm}) converges as n, m → ∞, where f_{nm} := f_n − f_m.

By ⇀ we denote weak convergence and → stands for convergence in the norm of H (strong convergence).

If A is compact and B is a bounded linear operator then AB and BA

are compact. A linear combination of compact operators is compact. If

An ∈ σ∞ and ‖ An −A ‖→ 0 as n→ ∞, then A ∈ σ∞. The operator A is

compact if and only if A∗ is compact. In this section we denote the adjoint

operator by A∗. If H is a separable Hilbert space and A is compact then

there exists a sequence An of finite rank operators such that ‖ An−A ‖→ 0.

An operator B is called a finite rank operator if rank B := dim Ran B < ∞.

If A is compact then A∗A ≥ 0 is compact and selfadjoint. The spectrum

of a selfadjoint compact operator is discrete, the eigenvalues λn(A∗A) are

nonnegative and have at most one limit point λ = 0. We define the singular


values of a compact operator A (s-values of A) by the equation

sj(A) = λ1/2j (A∗A). (8.158)

One has

s1(A) ≥ s2(A) ≥ · · · ≥ 0. (8.159)

Note that

s1(A) =‖ A ‖ (8.160)

and if A = A∗ then

sj(A) = |λj(A)|. (8.161)

The following properties of the s-values are known

sj(A) = sj(A∗), (8.162)

sj(BA) ≤ ‖ B ‖ sj(A), (8.163)

sj(AB) ≤ ‖ B ‖ sj(A) (8.164)

for any bounded linear operator B. Obviously

sj(cA) = |c|sj(A), c = const. (8.165)
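For example, if Af = (f, φ)ψ is a rank-one operator, then A^*Af = ‖ψ‖^2 (f, φ)φ, so s_1(A) = ‖φ‖ ‖ψ‖ = ‖A‖ and s_j(A) = 0 for j ≥ 2, in agreement with (8.160).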

Any bounded linear operator A can be represented as

A = U |A| (8.166)

where

|A| := (A∗A)1/2 (8.167)

and U is a partial isometry which maps Ran(A∗) onto RanA. Representa-

tion (8.166) is called polar representation of A.

The operator |A| is selfadjoint. If A is compact then |A| is compact.

Let φj be its eigenvectors and sj = sj(A) be its eigenvalues:

|A|φj = sjφj, (φj, φm) = δjm. (8.168)

Then

|A| = Σ_{j=1}^{∞} s_j(A)(·, φ_j)φ_j,   (8.169)


where the series (8.169) converges to |A| in the norm of operators:

‖ |A| − Σ_{j=1}^{n} s_j(A)(·, φ_j)φ_j ‖ → 0 as n → ∞.   (8.170)

Let

ψj := Uφj. (8.171)

Then

A = Σ_{j=1}^{∞} s_j(·, φ_j)ψ_j.   (8.172)

Formula (8.172) is the canonical representation of a compact operator A.

Note that

(ψj, ψm) = δjm

since U is a partial isometry.

It follows from (8.172) that A is a limit in the norm of operators of the

finite rank operator Σ_{j=1}^{n} s_j(·, φ_j)ψ_j. Moreover,

A^* = Σ_{j=1}^{∞} s_j(A)(·, ψ_j)φ_j.   (8.173)

In the formulas (8.172) and (8.173) the summation is actually taken over all j for which s_j(A) ≠ 0. If rank A < ∞, then s_j(A) = 0 for j > rank A.

If A is compact and normal, that is A∗A = AA∗, then its eigenvectors form

an orthonormal basis of H

A = Σ_{j=1}^{∞} λ_j(A)(·, φ_j)φ_j,   Aφ_j = λ_j(A)φ_j,   (8.174)

and

sj(A) = |λj(A)|. (8.175)

8.3.1.2 Minimax principles and estimates of eigenvalues and sin-

gular values

Lemma 8.9 Let A be a selfadjoint and compact operator on H. Let

λ_1^+ ≥ λ_2^+ ≥ · · ·   (8.176)

be its positive eigenvalues counted according to their multiplicities, and let φ_j be the corresponding eigenvectors:

Aφ_j = λ_j^+ φ_j.   (8.177)

Then

λ_{j+1}^+ = min_{L_j} max_{φ⊥L_j} (Aφ, φ)/(φ, φ),   (8.178)

where the minimum is taken over all j-dimensional subspaces L_j ⊂ H. The maximum in (8.178) is attained on the subspace

L_j(A) := span{φ_1, . . . , φ_j}   (8.179)

spanned by the first j eigenvectors of A corresponding to the positive eigen-

values.

Remark 8.2 Maximum may be attained not only on the subspace (8.179).

The sign φ ⊥ L means that φ is orthogonal to the subspace L.

Lemma 8.10 If A ∈ σ_∞ then

s_{j+1}(A) = min_{L_j} max_{φ⊥L_j} ‖Aφ‖/‖φ‖.   (8.180)

Lemma 8.10 follows immediately from Lemma 8.9 and from the definition of the s-values given by formula (8.158).

Lemma 8.11 If A ∈ σ_∞ then

s_{j+1}(A) = min_{K∈K_j} ‖A − K‖,   (8.181)

where K_j is the set of operators of rank ≤ j.

The following inequalities for eigenvalues and singular values of compact

operators are known.

Lemma 8.12 If A and B are selfadjoint compact operators and A ≥ B,

that is

(Aφ, φ) ≥ (Bφ, φ) ∀φ ∈ H, (8.182)

then

λ_j^+(A) ≥ λ_j^+(B).   (8.183)


Lemma 8.13 If A and B are selfadjoint and compact operators, then

λ_{m+n−1}^+(A + B) ≤ λ_m^+(A) + λ_n^+(B),   (8.184)

and

λ_{m+n−1}^-(A + B) ≥ λ_n^-(A) + λ_m^-(B).   (8.185)

Moreover,

|λ_j^+(A) − λ_j^+(A + B)| ≤ ‖B‖   (8.186)

and

|λ_j^-(A) − λ_j^-(A + B)| ≤ ‖B‖,   (8.187)

where λ_1^-(A) ≤ λ_2^-(A) ≤ · · · < 0 are the negative eigenvalues of a selfadjoint compact operator A counted according to their multiplicities.

Lemma 8.14 If A is compact and B is a finite rank operator, rank B = ν,

then

sj+ν(A) ≤ sj(A+ B) ≤ sj−ν(A). (8.188)

Lemma 8.15 If A,B ∈ σ∞ then

sm+n−1(A+ B) ≤ sm(A) + sn(B), (8.189)

sm+n−1(AB) ≤ sm(A)sn(B),

|sn(A) − sn(B)| ≤‖ A −B ‖ . (8.190)

Lemma 8.16 If A ∈ σ_∞ then

Π_{j=1}^{n} |λ_j(A)| ≤ Π_{j=1}^{n} s_j(A).   (8.191)

Lemma 8.17 If A,B ∈ σ∞ and f(x), 0 ≤ x < ∞, is a real-valued

nondecreasing, convex, and continuous function vanishing at x = 0, then

Σ_{j=1}^{n} f(s_j(A + B)) ≤ Σ_{j=1}^{n} f(s_j(A) + s_j(B))   (8.192)

for all n = 1, 2, . . . ,∞.


In particular, if f(x) = x, one obtains

Σ_{j=1}^{n} s_j(A + B) ≤ Σ_{j=1}^{n} s_j(A) + Σ_{j=1}^{n} s_j(B)   (8.193)

for all n = 1, 2, . . . ,∞. If f(x), f(0) = 0, 0 ≤ x < ∞, is such that the

function φ(t) := f (exp(t)) is convex, −∞ < t <∞, then

Σ_{j=1}^{n} f(s_j(AB)) ≤ Σ_{j=1}^{n} f(s_j(A)s_j(B))   (8.194)

for all n = 1, 2, . . . ,∞.

In particular, if f(x) = x, then

Σ_{j=1}^{n} s_j(AB) ≤ Σ_{j=1}^{n} s_j(A)s_j(B)   (8.195)

for all n = 1, 2, . . . ,∞.

Lemma 8.18 Let A, B ∈ σ_∞ and

lim_{n→∞} n^a s_n(A) = c,   (8.196)

where a > 0 and c = const > 0. Assume that

lim_{n→∞} n^a s_n(B) = 0.   (8.197)

Then

lim_{n→∞} n^a s_n(A + B) = c.

Proofs of the above results can be found in [GK].

8.3.2 Perturbations preserving asymptotics of the spectrum

of compact operators

8.3.2.1 Statement of the problem

Here we are interested in the following question. Suppose that A and Q

are linear compact operators on H, and B is defined by the formula

B = A(I + Q). (8.198)


Question 1: Under what assumptions are the singular values of B asymp-

totically equivalent to the singular values of A in the following sense:

lim_{n→∞} s_n(B)/s_n(A) = 1.   (8.199)

Assume now that

s_n(A) = cn^{-p}[1 + O(n^{-p_1})],   n → ∞,   (8.200)

where p and p_1 are positive numbers, and c > 0 is a constant.

Question 2: Under what assumptions is the asymptotics of the singular values of B given by the formula

s_n(B) = cn^{-p}[1 + O(n^{-q})],   n → ∞?   (8.201)

When is q = p_1?

We will answer these questions and give some applications of the results.

8.3.2.2 A characterization of the class of linear compact operators

We start with a theorem which gives a characterization of the class of linear

compact operators on H.

In order to formulate this theorem let us introduce the notion of limit

dense sequence of subspaces. Let

L_n ⊂ L_{n+1} ⊂ · · ·,   dim L_n = n,   (8.202)

be a sequence of finite-dimensional subspaces of H such that

ρ(f, Ln) → 0 as n→ ∞ for any f ∈ H (8.203)

where ρ(f, L) is the distance from f to the subspace L.

Definition 8.2 A sequence of the subspaces Ln is called limit dense in

H if the conditions (8.202) and (8.203) hold.
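For example, if {e_j} is an orthonormal basis of H and L_n := span{e_1, . . . , e_n}, then ρ(f, L_n)^2 = Σ_{j>n} |(f, e_j)|^2 → 0 for every f ∈ H by Parseval's equality, so the sequence {L_n} is limit dense in H.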

Theorem 8.2 A linear operator A : H → H is compact if and only if

there exists a limit dense in H sequence {L_n} of subspaces such that

sup_{h⊥L_n} ‖Ah‖/‖h‖ → 0 as n → ∞.   (8.204)


If (8.204) holds for a limit dense in H sequence Ln then it holds for every

limit dense in H sequence of subspaces.

Proof. Sufficiency. Assume that Ln is a limit dense in H sequence of

subspaces and condition (8.204) holds. We wish to prove that A is compact.

Let Pn denote the orthoprojector in H onto Ln. Condition (8.204) can be

written as

γ_n := sup_{‖h‖=1, h⊥L_n} ‖Ah‖ → 0 as n → ∞.   (8.205)

Therefore

‖A − AP_n‖ = sup_{‖h‖≤1} ‖Ah − AP_nh‖ = sup_{‖h‖≤1} ‖A(I − P_n)h‖
  = sup_{g=(I−P_n)h, ‖g‖≤1, ‖h‖≤1} ‖Ag‖ ≤ sup_{g⊥L_n, ‖g‖≤1} ‖Ag‖ = γ_n → 0.   (8.206)

Therefore A is the norm limit of the sequence of the operators APn.

The operator APn is of finite rank ≤ n. Therefore A is compact. Note that

in the sufficiency part of the argument the assumption that the sequence

Ln is limit dense in H is not used. In fact, if condition (8.204) holds for

any sequence of subspaces Ln ⊂ Ln+1 then A is compact as we have proved

above.

Necessity. Assume now that A is compact and Ln is a limit dense

in H sequence of subspaces. We wish to derive (8.204). We have

sup_{h⊥L_n, ‖h‖=1} ‖Ah‖ = sup_{P_nh=0, ‖h‖=1} ‖Ah − AP_nh‖ ≤ sup_{‖h‖=1} ‖A(I − P_n)h‖ = ‖A(I − P_n)‖ → 0 as n → ∞.   (8.207)

The last conclusion follows from the well known result which is formu-

lated as Proposition 8.1.

Proposition 8.1 If A is compact and the selfadjoint orthoprojections P_n converge strongly to the identity operator I, then

‖ A(I − Pn) ‖→ 0 as n→ ∞. (8.208)

Note that Pn → I strongly if and only if the sequence Ln is limit dense

in H. Let us prove Proposition 8.1. Let A be compact and B∗n = Bn → 0


strongly. In our case B_n = I − P_n. Represent A = K + F_ε, where K is a

finite rank operator and ‖ Fε ‖< ε. Then

‖ ABn ‖≤‖ FεBn ‖ + ‖ KBn ‖≤ cε+ ‖ KBn ‖ . (8.209)

Here c ≥‖ Bn ‖ does not depend on n and ε. Choose n sufficiently large.

Then

‖ KBn ‖< ε (8.210)

since Bn → 0 strongly and K is a finite rank operator. Indeed

KB_nh = Σ_{j=1}^{m} s_j (B_nh, φ_j)ψ_j,   (8.211)

where

K := Σ_{j=1}^{m} s_j(·, φ_j)ψ_j,   s_j = const.   (8.212)

It is known that Bn → 0 strongly does not imply B∗n → 0 strongly, in

general. But since we have assumed that Bn = B∗n, we have

‖KB_nh‖ ≤ Σ_{j=1}^{m} |s_j| ‖ψ_j‖ ‖h‖ ‖B_nφ_j‖ ≤ ε(n) ‖h‖,   (8.213)

where ε(n) → 0 as n → ∞ because

‖ Bnφj ‖→ 0 as n → ∞, 1 ≤ j ≤ m. (8.214)

Proposition 8.1 and Theorem 8.2 are proved.

Note that in the proof of the necessity part the assumption that the sequence L_n is limit dense in H plays the crucial role: it allows one to claim that P_n → I strongly. If the sequence L_n is not limit dense, then there may exist a fixed vector h ≠ 0 such that ‖Ah‖ > 0 and h ⊥ ⋃_{n=1}^{∞} L_n. In this case condition (8.204) does not hold. This is the case, for example, if h is the first eigenvector of a selfadjoint compact operator A, and L_n := span{φ_2, . . . , φ_{n+1}}, where Aφ_j = λ_jφ_j.

8.3.2.3 Asymptotic equivalence of s-values of two operators

We are now ready to answer Question 1. Recall that N(A) := {u : Au = 0}.


Theorem 8.3 Assume that A, Q ∈ σ_∞, N(I + Q) = {0}, and rank A = ∞. Then

lim_{n→∞} s_n(A(I + Q))/s_n(A) = 1   (8.215)

and

lim_{n→∞} s_n((I + Q)A)/s_n(A) = 1.   (8.216)

Proof. By the minimax principle for singular values one has

s_{n+1}(A(I + Q)) = min_{L_n} max_{φ⊥L_n} ‖A(I + Q)φ‖/‖φ‖
  = min_{L_n} max_{φ⊥L_n} [‖A(I + Q)φ‖/‖(I + Q)φ‖]·[‖(I + Q)φ‖/‖φ‖]
  ≤ max_{φ⊥M_n} [‖A(I + Q)φ‖/‖(I + Q)φ‖]·(1 + max_{φ⊥M_n} ‖Qφ‖/‖φ‖)
  = s_{n+1}(A)(1 + ε_n),   (8.217)

where

ε_n := max_{φ⊥M_n} ‖Qφ‖/‖φ‖ → 0 as n → ∞.   (8.218)

Here Mn is so chosen that the condition φ ⊥ Mn is equivalent to the

condition (I + Q)φ ⊥ Ln(A), where Ln(A) is the linear span of the first n

eigenvectors of the operator (A∗A)1/2

Mn := (I + Q∗)Ln(A). (8.219)

Since N (I + Q) = 0 and Q is compact, the operator I + Q is an iso-

morphism of H onto H and so is I + Q∗. Therefore the limit dense in H

sequence of the subspaces Ln(A) is mapped by the operator I + Q∗ into

a limit dense in H sequence of the subspaces Mn. Indeed, suppose that

f ⊥ Mn ∀n, that is

(f, (I +Q∗)φj) = 0 ∀j (8.220)

where φj is the set of all eigenvectors of the operator (A∗A)1/2 includ-

ing the eigenvectors corresponding to the eigenvalue λ = 0 if zero is an


eigenvalue of (A∗A)1/2. Then

((I +Q)f, φj) = 0 ∀j. (8.221)

Since the set of all the eigenvectors of the operator (A∗A)1/2 is complete in

H, we conclude that

(I +Q)f = 0. (8.222)

This implies that f = 0 since I +Q is an isomorphism.

The fact that εn → 0 follows from the compactness of Q and Theorem

1. Let B = A(I +Q). Since (I +Q)−1 = I +Q1 where Q1 := −Q(I +Q)−1

is a compact operator, one has A = B(I + Q1). Therefore one obtains as

above the inequality:

sn+1(A) ≤ sn+1(B)(1 + δn), δn → 0 as n → ∞. (8.223)

From (8.217) and (8.223)equation (8.215) follows. The proof of (8.216)

reduces to (8.215) if one uses property (8.162) of s-values. Theorem 8.3 is

proved.

The result given in Theorem 8.3 is optimal in some sense. Namely, if

Q is not compact but, for example, an operator with small norm, then

the conclusion of Theorem 8.3 does not hold in general (take, for instance,

Q = εI where I is the identity operator). The assumption rank A = ∞is necessary since if rank A < ∞ one has only a finite number of nonzero

singular values. The assumption N (I + Q) = 0 is often easy to verify

and it is natural. It could be dropped if the assumption about the rate of

decay of sn(A) is

sn(A) ∼ cn−p, p > 0

but we do not go into detail.
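The conclusion of Theorem 8.3 is easy to test on finite-dimensional truncations. The following Python sketch is only an informal illustration (the truncation size, the choice A = diag(1/j) and the particular compact Q below are assumptions made for the experiment, not taken from the text): it compares sn(A(I + Q)) with sn(A).

import numpy as np

rng = np.random.default_rng(0)
N = 400
A = np.diag(1.0 / np.arange(1.0, N + 1))                 # truncation of a compact operator
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
V, _ = np.linalg.qr(rng.standard_normal((N, N)))
Q = U @ np.diag(0.5 / np.arange(1.0, N + 1) ** 2) @ V.T  # compact perturbation, ||Q|| < 1, so N(I+Q) = {0}
s_A = np.linalg.svd(A, compute_uv=False)                 # singular values, decreasing order
s_B = np.linalg.svd(A @ (np.eye(N) + Q), compute_uv=False)
for n in (10, 50, 150):                                  # keep n well below N: only these truncated
    print(n, s_B[n - 1] / s_A[n - 1])                    # s-values are meaningful; ratios approach 1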

8.3.2.4 Estimate of the remainder

Let us now answer the second question.

Theorem 8.4 Assume that A and Q are linear compact operators on H, N(I + Q) = {0}, B := A(I + Q),

sn(A) = cn^{−p} [1 + O(n^{−p1})] as n → ∞,    (8.224)

where p, p1 and c are positive numbers, and

‖Qf‖ ≤ c ‖Af‖^a ‖f‖^{1−a}, a > 0.    (8.225)

Then

sn(B) = cn^{−p} [1 + O(n^{−q})],    (8.226)

where

q := min(p1, pa/(1 + pa)).    (8.227)

In particular,

if pa/(1 + pa) > p1 then q = p1,    (8.228)

and therefore not only the main term of the asymptotics of sn(A) is preserved but the order of the remainder as well.

Remark 8.3 The estimate (8.227) of the remainder in (8.226) is sharp in the sense that it is attained for some Q.

Proof. Let n and m be integers. It follows from (8.180) that

sn+m+1(B) = min_{Ln+m} max_{h⊥Ln+m} ‖Bh‖/‖h‖
 ≤ max_{h⊥Mn} [‖A(I + Q)h‖/‖(I + Q)h‖] · (1 + max_{h⊥Lm(A)} ‖Qh‖/‖h‖).    (8.229)

Here, as in the proof of Theorem 8.3, Mn is defined by formula (8.219), and Lm(A) is the linear span of the first m eigenvectors of the operator (A∗A)^{1/2}. This means that we have chosen Ln+m to be the direct sum of the subspaces Mn + Lm(A). Since the sequence Lm(A) is limit dense in H, one can use Theorem 8.2 and conclude from (8.229) that

sn+m+1(B) ≤ sn+1(A)(1 + εm), εm → 0 as m → ∞,    (8.230)

and

εm = max_{h⊥Lm(A)} ‖Qh‖/‖h‖ ≤ c max_{h⊥Lm(A)} ‖Ah‖^a/‖h‖^a = c s^a_{m+1}(A).    (8.231)

Therefore

sn+m+1(B) ≤ sn+1(A)[1 + c s^a_{m+1}(A)].    (8.232)

Unfortunately our assumptions now do not allow us to use an argument similar to the one used at the end of the proof of Theorem 8.3. The reason is that our assumptions now are no longer symmetric with respect to A and B. For example, inequality (8.225) is not assumed with B in place of A. In applications it is often possible to establish inequality (8.225) with B in place of A, and in this case the argument can be simplified: one can use by symmetry the estimate (8.232) in which B and A exchange places.

With the assumptions formulated in Theorem 8.4 we proceed as follows. Write

A = B(I + Q1), Q1 = −Q(I + Q)−1.    (8.233)

Choose

M1n := (I + Q∗1)Ln(B)    (8.234)

and use inequalities similar to (8.229)–(8.232) to obtain

sn+m+1(A) ≤ sn(B)[1 + c1 s^a_{m+1}(A)].    (8.235)

It follows from (8.232) and (8.235) that

sn+2m+1(A) ≤ sn+m+1(B)[1 + c1 s^a_{m+1}(A)]
 ≤ sn+1(A)[1 + c s^a_{m+1}(A)][1 + c1 s^a_{m+1}(A)]
 ≤ sn+1(A)[1 + c2 s^a_{m+1}(A)],    (8.236)

where we took into account that

0 < sm(A) → 0 as m → ∞,    (8.237)

so that s2m(A) ≤ sm(A) for all sufficiently large m.

Therefore, for all sufficiently large m one has

sn+2m+1(A) / { sn+m+1(A)[1 + O(s^a_m(A))] } ≤ sn+m+1(B)/sn+m+1(A)
 ≤ [sn+1(A)/sn+m+1(A)] [1 + O(s^a_m(A))].    (8.238)

Choose

m = n^{1−x}, 0 < x < 1.    (8.239)

It follows from (8.224) and (8.239) that

sn+m(A)/sn(A) = ((n + m)/n)^{−p} [1 + O((n + m)^{−p1}) + O(n^{−p1})]
 = 1 + O(m/n) + O(n^{−p1}) = 1 + O(n^{−x}) + O(n^{−p1}).    (8.240)

From (8.240) and (8.238) one obtains

sn+m+1(B)/sn+m+1(A) = 1 + O(n^{−p1}) + O(n^{−x}) + O(n^{−(1−x)pa}).    (8.241)

Let

q := min{p1, x, (1 − x)pa}.    (8.242)

Then

sn+m+1(B)/sn+m+1(A) = 1 + O(n^{−q}).    (8.243)

Since

(n + m + 1)/n ∼ 1 as n → ∞,    (8.244)

it follows from (8.243) that formula (8.226) holds. Choose now x, 0 < x < 1, so that

min(x, (1 − x)pa) = max,    (8.245)

that is, so that min(x, (1 − x)pa) is maximal. An easy calculation shows that (8.245) holds if

x = pa/(1 + pa).    (8.246)

Therefore

q = min(p1, pa/(1 + pa)).

This is formula (8.227). Theorem 8.4 is proved.

We leave for the reader as an exercise to check the validity of the Remark.

Hint. A trivial example which shows that for some Q the order of the remainder in (8.226) is attained is the following one. Let A > 0 be a selfadjoint compact operator. In this case sj(A) = λj(A). Take Q = φ(A). Then

λn(B) = λn(A[1 + φ(A)]) = λn(A)[1 + φ(λn)]    (8.247)

by the spectral mapping theorem. If one chooses φ(λ) such that

φ(λn) = o(n^{−p1}),    (8.248)

then q = p1 is the order of the remainder in the formula

λn(B) = cn^{−p} [1 + O(n^{−p1})].    (8.249)
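The hint can be checked in a finite-dimensional model. In the sketch below (an informal illustration; the exponents p, p1 and the choice φ(λ) = λ^{p1/p} are assumptions of the experiment) one takes A = diag(n^{−p}) and Q = φ(A), so that by the spectral mapping theorem λn(B) = n^{−p}(1 + n^{−p1}), in agreement with (8.247)–(8.249).

import numpy as np

p, p1, N = 1.0, 0.5, 300
n = np.arange(1.0, N + 1)
A = np.diag(n ** (-p))                         # A = diag(n^{-p}), selfadjoint, positive
phiA = np.diag(n ** (-p1))                     # phi(A) with phi(lambda) = lambda^{p1/p}
B = A @ (np.eye(N) + phiA)                     # B = A(I + Q), Q = phi(A)
lam_A = np.sort(np.linalg.eigvalsh(A))[::-1]
lam_B = np.sort(np.linalg.eigvalsh(B))[::-1]
rem = lam_B / lam_A - 1.0                      # remainder; it equals phi(lam_n) = n^{-p1}
print(np.max(np.abs(rem - n ** (-p1))))        # essentially machine precision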

8.3.2.5 Unbounded operators

Note that the results of Theorems 8.3 and 8.4 can be used in the cases when the operators we are interested in are unbounded. For example, suppose that L is an elliptic selfadjoint operator in H = L2(D) of order s and ℓ is a selfadjoint differential operator in H of lower order, ord ℓ = m. We wish to check that

lim_{n→∞} λn(L + ℓ)/λn(L) = 1.    (8.250)

Since λn(L) → +∞ and λn(L + cI) = λn(L) + c, where c is a constant, one can take L + cI in place of L in (8.250) and choose c > 0 such that the operator L + cI is positive definite in H. Then the operator A := (L + cI)−1 is compact in H. Moreover,

L + cI + ℓ = [I + ℓ(L + cI)−1](L + cI),

so that

B := (L + cI + ℓ)−1 = (L + cI)−1[I + ℓ(L + cI)−1]−1.    (8.251)

If ord ℓ < ord L, then the operator

S := ℓ(L + cI)−1 is compact in H.    (8.252)

One can always choose the constant c > 0 such that N(I + S) = {0}, so that

(I + S)−1 = I + Q,    (8.253)

where Q is compact in H. Then (8.251) can be written as

B = A(I + Q),    (8.254)

and the assumptions of Theorem 8.3 are satisfied. In fact, since A and B are selfadjoint and λn(A) and λn(B) are positive for all sufficiently large n, one has

sn(B) = λn(B), sn(A) = λn(A), ∀n > n0.    (8.255)

By Theorem 8.3 one has

lim_{n→∞} λn(B)/λn(A) = 1.    (8.256)

Since λn(B−1) = λ−1n(B), it follows from (8.256) that

lim_{n→∞} λn(B−1)/λn(A−1) = 1.    (8.257)

This is equivalent to (8.250) because, as was mentioned above,

lim_{n→∞} λn(L + ℓ + cI)/λn(L + ℓ) = 1    (8.258)

for any constant c.

8.3.2.6 Asymptotics of eigenvalues

In this section we prove some theorems about perturbations preserving the asymptotics of the spectrum. In order to formulate these theorems, in which unbounded operators appear, we need some definitions.

Let A be a closed linear densely defined operator in a Hilbert space H; D(A) is its domain of definition, R(A) is its range, N(A) = {u : Au = 0} is its null-space, and σ(A) is its spectrum.

Definition 8.3 We say that the spectrum of A is discrete if it consists of isolated eigenvalues with the only possible limit point at infinity, each of the eigenvalues being of finite algebraic multiplicity, and if for each eigenvalue λj the whole space can be represented as a direct sum of the root subspace Mj corresponding to λj and a subspace Hj which is invariant with respect to A and in which the operator A − λjI has bounded inverse. In this case λj is called a normal eigenvalue.

The root linear manifold of the operator A corresponding to the eigenvalue λ is the set

Mλ := {u : (A − λI)^n u = 0 for some n}.    (8.259)

The algebraic multiplicity ν(λ) of the eigenvalue λ is

ν(λ) := dim Mλ.    (8.260)

If Mλ is closed in H it is called a root subspace. The geometric multiplicity n(λ) of λ is the dimension of the eigenspace corresponding to λ, n(λ) = dim N(A − λI). If ε > 0 is small enough so that there is only one eigenvalue in the disc |z − λ| < ε, then

Pλ := −(1/(2πi)) ∫_{|z−λ|=ε} R(z) dz, R(z) := (A − zI)−1,    (8.261)

is a projection, that is, P²λ = Pλ. The subspace PλH is invariant for A, Pλ commutes with A, PλA = APλ, and the spectrum of the restriction of A onto PλH consists of only one point λ, which is its eigenvalue of algebraic multiplicity ν(λ).

An example of operators with discrete spectrum is the class of operators for which the operator (A − λ0I)−1 is compact for some λ0 ∈ C. Such are elliptic operators in a bounded domain.

If A is an operator with discrete spectrum then

|λn(A)| → ∞ as n → ∞.    (8.262)

Thus, for any constant c,

lim_{n→∞} λn(A + cI)/λn(A) = 1.    (8.263)

Therefore it is not too restrictive to assume that A−1 exists and is compact: if A−1 does not exist then choose c such that (A + cI)−1 exists and study the asymptotics of λn(A + cI) = λn(A) + c. Note that if (A − λ0I)−1 is compact for some λ0, then (A − λI)−1 is compact for any λ for which A − λI is invertible. This follows from the resolvent identity

(A − λI)−1 = (A − λ0I)−1 + (λ − λ0)(A − λI)−1(A − λ0I)−1.    (8.264)

If A−1 is compact we define the singular values of A by the formula

sn(A) = [sn(A−1)]−1.    (8.265)

If

A = A∗ ≥ m > 0,    (8.266)

then we denote by HA the Hilbert space which is the completion of D(A) in the norm ‖u‖A = (Au, u)^{1/2}. Clearly HA ⊂ H, ‖u‖ ≤ m^{−1/2}‖u‖A, and (u, v)A := (Au, v) is the inner product in HA. The inner product can also be written as (u, v)A = (A^{1/2}u, A^{1/2}v). If B = B∗ ≥ −m, then by HB we mean the Hilbert space which is the completion of D(B) in the norm ‖u‖B := ((B + m + 1)u, u)^{1/2}. All unbounded operators are always assumed densely defined in H.

Theorem 8.5 Let A = A∗ ≥ m > 0 be a linear closed operator with discrete spectrum, T be a linear operator, D(A) ⊂ D(T), B := A + T, D(B) = D(A). Assume that A−1T is compact in HA, B = B∗ and HA ⊂ D(T). Then

lim_{n→∞} λn(B)/λn(A) = 1.    (8.267)

The conclusion (8.267) remains valid if A ≥ −m and [A + (m + 1)I]−1T is compact in HA.

Remark 8.4 If T > 0 then A−1T is compact in HA if and only if the imbedding operator i : HA → HT is compact. By HT we mean the Hilbert space which is the completion of D(T) in the norm (Tu, u)^{1/2}. If T is not positive but |(Tf, f)| ≤ (Qf, f) for some Q > 0 and all f ∈ D(T), D(T) ⊂ D(Q), and if the imbedding i : HA → HQ is compact, then A−1T is compact.

The reader can prove these statements as an exercise or find a proof in [Glazman (1965), §4].

To prove Theorem 8.5 we need a lemma.

Lemma 8.19 If the operator A−1T is compact in HA then HA = HB and the spectrum of B is discrete.

Assuming the validity of this lemma, let us prove Theorem 8.5 and then prove the Lemma.

Proof of Theorem 8.5 Let us use the symbol ⊥A for orthogonality in HA and ⊥ for orthogonality in H. If Ln(A) is the linear span of the first n eigenvectors of A−1, then f ⊥ Ln(A) is equivalent to f ⊥A Ln(A). Indeed, if A−1φj = λjφj, λj ≠ 0, then

0 = (f, φj) = λj(f, Aφj) = λj(f, φj)A ⇔ (f, φj)A = 0.

Note also that

inf α(1 + β) ≥ (1 − sup |β|) inf α if α ≥ 0 and −1 < β < 1.    (8.268)

We will use the following statement:

γn := sup_{f⊥Ln(A)} |(Tf, f)|/(f, f)A = sup_{f⊥A Ln(A)} |(A−1Tf, f)A|/(f, f)A → 0 as n → ∞,    (8.269)

which follows from Theorem 8.2 and the assumed compactness of A−1T in HA.

We are now ready to prove Theorem 8.5. By the minimax principle one has

λn+1(B) = sup_{Ln} inf_{f⊥Ln} (Bf, f)/(f, f)
 ≥ inf_{f⊥Ln(A)} (Bf, f)/(f, f) = inf_{f⊥Ln(A)} [(Af, f)/(f, f)] [1 + (Tf, f)/(Af, f)]
 ≥ λn+1(A)(1 − γn), γn → 0 as n → ∞,    (8.270)

where we have used (8.268) and (8.269).

By symmetry, for all sufficiently large n, one has

λn+1(A) ≥ λn+1(B)(1 − δn), δn → 0 as n → ∞.    (8.271)

We leave it for the reader as an exercise to check that under the assumptions of Theorem 8.5 the operator (B + cI)−1T is compact if B + cI is invertible.

From (8.270) and (8.271) the desired conclusion (8.267) follows. If A ≥ −m and [A + (m + 1)I]−1T is compact in HA, then we argue as above and obtain in place of (8.270) the following inequality:

λn+1(B + (m + 1)I) ≥ λn+1(A + (m + 1)I)(1 − γn), γn → 0 as n → ∞.    (8.272)

Since λn(A + cI) = λn(A) + c, λn(A) → +∞ and λn(B) → +∞, inequality (8.272) implies

λn+1(B)[1 + o(1)] ≥ λn+1(A)[1 + o(1)](1 − γn) as n → ∞,    (8.273)

and the rest of the argument is the same as above. Theorem 8.5 is proved.


Proof of Lemma 8.19 Note that B = A(I + S), where S := A−1T is compact in HA. Let us represent S in the form S = Q + F, where ‖Q‖A < 1 and F is a finite rank operator. The operator S is selfadjoint in HA. Indeed,

(Sf, g)A = (Tf, g) = (f, Tg) = (f, Sg)A,

where we used the symmetry of T = B − A on H. We choose Q and F to be selfadjoint in HA. The operator I + Q is positive definite in HA, while

(Fu, u)A = ∑_{j=1}^N aj |(u, φj)A|²

for some set of functions φj orthonormal in HA, some constants aj, and some number N = rank F. Since D(A) is dense in HA, one can find vj ∈ D(A) such that ‖φj − vj‖A < ε, where ε > 0 is arbitrarily small. Then

|(Fu, u)A| ≤ ∑_{j=1}^N |aj| |(u, φj − vj)A + (u, Avj)|² ≤ c1ε‖u‖²A + c2‖u‖²,    (8.274)

where c1 and c2 are some positive constants which do not depend on u. It follows from (8.274) that

(Bu, u) = (A(I + Q)u, u) + (AFu, u) = ((I + Q)u, u)A + (Fu, u)A ≥ c0‖u‖²A − c1ε‖u‖²A − c2‖u‖²,    (8.275)

where c0 > 0. It follows from (8.275), if one chooses ε so small that c0 − εc1 > 0, that B is bounded from below in H. Since, clearly, (Bu, u) ≤ c(u, u)A, one concludes that the metrics of HA and HB are equivalent, so that HA = HB.

It remains to be proved that the spectrum of B is discrete. Since B is selfadjoint it is sufficient to prove that no point of the spectrum σ(B) of B belongs to the essential spectrum σess(B). Recall that λ ∈ σess(B), where B = B∗, if and only if dim E(∆ε)H = ∞ for any ε > 0, where ∆ε = (λ − ε, λ + ε) and E(∆) is the resolution of the identity corresponding to the selfadjoint operator B. Assume that λ ∈ σ(B) and dim E(∆εn)H = ∞ for some sequence εn → 0. Then there exists an orthonormal sequence un ∈ H such that ‖Bun − λun‖ → 0 as n → ∞. Thus

‖un + A−1Tun − λA−1un‖ → 0 as n → ∞.    (8.276)

Since ‖un‖ = 1 we have

(Aun, un) + (Tun, un) − λ(un, un) → 0, n → ∞.    (8.277)

If A−1T is compact in HA, we have proved that, for any ε > 0,

|(Tu, u)| ≤ ε‖u‖²A + c(ε)‖u‖², u ∈ HA.    (8.278)

It follows from (8.277) and (8.278) that

‖un‖A ≤ c,    (8.279)

where c > 0 is a constant which does not depend on n. Since A−1T is compact in HA, inequality (8.279) implies that a subsequence of the sequence un exists (we denote this subsequence again by un) such that A−1Tun converges in HA and, therefore, in H. Since the set un is orthonormal, un converges weakly to zero in H:

un ⇀ 0, n → ∞.    (8.280)

Therefore

‖A−1Tun‖ → 0 as n → ∞.    (8.281)

From (8.281) and (8.276) it follows that

‖un − λA−1un‖ → 0 as n → ∞,    (8.282)

where un is an orthonormal subsequence. This means that if λ ≠ 0 then λ ∈ σess(A), which is a contradiction since, by assumption, A does not have essential spectrum. If λ = 0 then (8.282) cannot hold since ‖un‖ = 1. Therefore B does not have essential spectrum and its spectrum is discrete. Lemma 8.19 is proved.

Example 8.1 Let A be the Dirichlet Laplacian −∆ in a bounded domain D ⊂ Rr and B = −∆ + q(x), where q(x) is a real-valued function. In this case Tu = q(x)u is a multiplication operator. The condition that A−1T is compact in HA means that (−∆)−1q is compact in H̊1(D). This condition holds if and only if A−1/2T is compact in H = L2(D), D ⊂ Rr, that is, (−∆)−1/2q is compact in L2(D). The operator (−∆)−1/2q(x) is compact in L2(D) provided that q ∈ Lγ(D), γ > r, and Theorem 8.5 asserts that, in this case,

lim_{n→∞} λn(−∆ + q)/λn(−∆) = 1.    (8.283)

In the calculation of the Lp class to which q belongs we have used the known imbedding theorem which says that the imbedding i : W^{k,p}(D) → L^{rp/(r−kp)}(D) is compact for kp < r, where W^{k,p}(D) is the Sobolev space of functions with derivatives of order ≤ k belonging to Lp(D). If q ∈ Lγ(D) and u ∈ L2(D) then qu ∈ Lp(D):

∫_D |qu|^p dx ≤ (∫_D |q|^{pα} dx)^{1/α} (∫_D |u|^{pβ} dx)^{1/β},

where α > 1, β = α/(α − 1), pβ = 2, pα = γ. Thus

p = 2(α − 1)/α,  p = γ/α,  so that α = 1 + γ/2

and p = 2γ/(γ + 2). On the other hand, if qu ∈ Lp(D) then (−∆)^{−1/2}(qu) ∈ W^{1,p}(D) ⊂ L^{rp/(r−p)}(D). If rp/(r − p) > 2, that is, p > 2r/(r + 2), then γ > r. The condition on q(x) for which (8.283) holds can be relaxed.
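For r = 1 the relation (8.283) is easy to illustrate numerically. The sketch below is only an informal check (the interval (0, π), the grid size and the potential q(x) = 50 cos 3x are assumptions of the experiment): it discretizes the Dirichlet Laplacian on (0, π) by central differences and compares the spectra of −∆ and −∆ + q.

import numpy as np
from scipy.linalg import eigvalsh_tridiagonal

N = 3000
x = np.linspace(0.0, np.pi, N + 2)[1:-1]      # interior grid points
h = x[1] - x[0]
main = np.full(N, 2.0) / h ** 2               # tridiagonal Dirichlet Laplacian
off = np.full(N - 1, -1.0) / h ** 2
q = 50.0 * np.cos(3.0 * x)                    # bounded real-valued potential
lam_A = eigvalsh_tridiagonal(main, off)       # eigenvalues of the discrete -Laplacian
lam_B = eigvalsh_tridiagonal(main + q, off)   # eigenvalues of -Laplacian + q
for n in (5, 50, 300):                        # use n << N, where the discretization is reliable
    print(n, lam_B[n - 1] / lam_A[n - 1])     # ratios tend to 1 as n grows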

In the next theorem we assume compactness of A−1T and TA−1 in H rather than in HA.

Theorem 8.6 Assume that A = A∗ ≥ m > 0 is an operator with discrete spectrum, D(A) ⊂ D(T), B = A + T, D(B) = D(A), B is normal, 0 ∉ σ(B), and the operator A−1T is compact in H. Then the spectrum of B is discrete and

lim_{n→∞} λn(B)/λn(A) = 1.    (8.284)

Proof. First we prove that the spectrum of B is discrete. Since A is selfadjoint, positive definite, and its spectrum is discrete, it follows that A−1 is compact. Let

Au + Tu = λu + f,  u + A−1Tu = λA−1u + A−1f.    (8.285)

Since 0 ∉ σ(B) the operator I + A−1T has bounded inverse. Therefore

u = λ(I + A−1T)−1A−1u + (I + A−1T)−1A−1f.    (8.286)

The operator (I + A−1T)−1A−1 is compact, being a product of a bounded and a compact operator. Equations (8.285) and (8.286) are equivalent. Therefore

(B − λI)−1 = [I − λ(I + A−1T)−1A−1]−1 (I + A−1T)−1A−1.    (8.287)

It follows from (8.287) that λ ∈ σ(B) if and only if λ−1 ∈ σ(F), F := (I + A−1T)−1A−1. Since F is compact, each λ is an isolated eigenvalue of finite algebraic multiplicity and σ(B) is discrete. In this part of the argument we did not use the assumption that B is normal.

If B is normal then |λn(B)| = sn(B), where sn(B) are the singular values of B. Since B = A(I + A−1T) and A−1T is compact, since sn(B) = s−1n(B−1), and since A−1 is compact, we can apply Theorem 8.3 and get

lim_{n→∞} sn(B−1)/sn(A−1) = lim_{n→∞} sn(A)/sn(B) = 1.    (8.288)

Since A > 0, we have sn(A) = λn(A). Therefore the desired result (8.284) will be proved if we prove that

lim_{n→∞} |λn(B)|/λn(B) = 1.    (8.289)

Let us prove (8.289). Let

Aφj + Tφj = λjφj.    (8.290)

Since B is normal we can assume that

(φj, φm) = δjm.    (8.291)

Rewrite (8.290) as

φj + A−1Tφj = λjA−1φj.    (8.292)

Multiply (8.292) by φj to get

1 + (A−1Tφj, φj) = λj(A−1φj, φj).    (8.293)

Since A−1T is compact and φj ⇀ 0 as j → ∞, we have

(A−1Tφj, φj) → 0 as j → ∞.    (8.294)

Note that (A−1φj, φj) > 0. Therefore it follows from (8.293) that

Im λj / Re λj = Im(A−1Tφj, φj) / [1 + Re(A−1Tφj, φj)] → 0 as j → ∞.    (8.295)

This implies (8.289). Theorem 8.6 is proved.


8.3.2.7 Asymptotics of eigenvalues (continuation)

In this section we continue to study perturbations preserving the asymptotics of the spectrum of linear operators. Let us give a criterion for compactness of the resolvent R(λ) := (A − λI)−1 for λ ∉ σ(A), where A is a closed densely defined linear operator in H.

Theorem 8.7 The operator (A − λI)−1, λ ∉ σ(A), is compact if and only if the operator (I + A∗A)−1 is compact.

Proof. Sufficiency. Suppose (I + A∗A)−1 is compact and λ ∉ σ(A). Let

‖gn‖ ≤ c, (A − λI)−1gn = fn.    (8.296)

Then

‖fn‖ ≤ c and ‖Afn‖ ≤ c,    (8.297)

where c denotes various positive constants. Therefore

‖(I + A∗A)^{1/2}fn‖² = ‖fn‖² + ‖Afn‖² ≤ c.    (8.298)

The operators (I + A∗A)−1 and (I + A∗A)−1/2 are selfadjoint positive operators. They are simultaneously compact or non-compact. Therefore if (I + A∗A)−1 is compact then (I + A∗A)−1/2 is compact, and (8.298) implies that the sequence fn is relatively compact. Therefore the operator (A − λI)−1, λ ∉ σ(A), maps any bounded sequence gn into a relatively compact sequence fn. This means that (A − λI)−1 is compact.

Necessity. Assume that (A − λI)−1 is compact and ‖hn‖ ≤ c. Then the sequence (A − λI)−1hn is relatively compact. We wish to prove that the sequence qn := (I + A∗A)−1hn is relatively compact. The sequence (I + A∗A)qn = hn is bounded. Thus

((I + A∗A)qn, qn) = ‖qn‖² + ‖Aqn‖² ≤ c.    (8.299)

Define pn := (A − λI)qn, so that qn = (A − λI)−1pn. We have

‖pn‖ ≤ ‖Aqn‖ + |λ| ‖qn‖ ≤ c,    (8.300)

where c denotes various constants. From (8.300) and the compactness of (A − λI)−1 it follows that the sequence qn = (A − λI)−1pn is relatively compact. Theorem 8.7 is proved.


Remark 8.5 Let T be a linear operator in H, D(A) ⊂ D(T), and let 0 ∉ σ(A).

Definition 8.4 If for any sequence fn such that

‖fn‖ + ‖Afn‖ ≤ c

the sequence Tfn is relatively compact, then T is called A-compact.

In other words, T is A-compact if it is a compact operator from the space GA into H. The space GA is the closure of D(A) in the graph norm ‖f‖GA := ‖f‖ + ‖Af‖. If A is closed, which we assume, then D(A) = GA is a Banach space if it is equipped with the graph norm.

Proposition 8.2 The operator T is A-compact if and only if the operator TA−1 is compact in H.

Proof. Suppose T is A-compact. Let ‖fn‖ ≤ c, and define gn = A−1fn. Then ‖gn‖ + ‖Agn‖ ≤ c. Therefore the sequence Tgn is relatively compact. This means that the sequence TA−1fn is relatively compact. Therefore TA−1 is compact in H. Conversely, suppose TA−1 is compact in H and ‖fn‖ + ‖Afn‖ ≤ c. Then the sequence Tfn = TA−1Afn is relatively compact. Proposition 8.2 is proved.

8.3.2.8 Asymptotics of s-values

In this section we prove

Theorem 8.8 Let A be a closed linear operator in H. Suppose that σ(A), the spectrum of A, is discrete and 0 ∉ σ(A). Let T be a linear operator, D(A) ⊂ D(T), B = A + T, D(B) = D(A).

If the operator TA−1 is compact then B is closed. If, in addition, A−1 is compact and, for some number k ∉ σ(A), the operator B + kI is injective, then σ(B) is discrete and

lim_{n→∞} sn(B)/sn(A) = 1.    (8.301)

The following lemma is often useful.

Lemma 8.20 Suppose that fn ∈ H is a bounded sequence which does not contain a convergent subsequence. Then there is a sequence ψm = fnm+1 − fnm such that

ψm ⇀ 0 as m → ∞    (8.302)

and ψm does not contain a convergent subsequence.

Proof. Since {fn} is bounded we can assume that it converges weakly:

fn ⇀ f    (8.303)

(passing to a subsequence and using the well known fact that bounded sets in a Hilbert space are relatively weakly compact). Since {fn} does not contain a convergent subsequence, one can find a subsequence such that

‖fnm − fnk‖ ≥ ε > 0 for all m ≠ k.    (8.304)

If

ψm := fnm+1 − fnm,    (8.305)

then (8.303) implies (8.302), and the sequence {ψm} does not contain a convergent subsequence because

‖ψm‖ ≥ ε > 0,    (8.306)

and if there were a convergent subsequence ψmj it would have to converge to zero, since its weak limit is zero. Lemma 8.20 is proved.

This lemma can be found in [Glazman (1965), §5], where it is used in the proof of the following result: if A is a closed linear operator in H and K is a compact operator, then σc(A + K) = σc(A), where σc(A) is the continuous spectrum of A, that is, the set of points λ such that there exists a bounded sequence ψm ∈ D(A) which does not contain a convergent subsequence and which has the property ‖Aψm − λψm‖ → 0 as m → ∞.

Proof of Theorem 8.8 (1) Let us first prove that B is closed. Assume that

fn → f, Bfn = Afn + Tfn → g,    (8.307)

and fn ∈ D(B) = D(A). Suppose we have the estimate

‖Afn‖ ≤ c.    (8.308)

Then the sequence Tfn = TA−1Afn contains a convergent subsequence since TA−1 is compact. This and the second equation in (8.307) imply that the sequence Afn contains a convergent subsequence, which we denote again Afn. Since A is closed by the assumption, we conclude that f ∈ D(A) = D(B) and

Af + Tf = g,

where we took into account that

lim_{n→∞} Tfn = lim_{n→∞} TA−1Afn = TA−1Af = Tf.

Thus, the operator B is closed provided that (8.308) holds. Let us prove inequality (8.308). Suppose

‖Afn‖ → ∞, n → ∞.    (8.309)

Define

gn := fn/‖Afn‖, ‖gn‖ → 0, ‖Agn‖ = 1.    (8.310)

Equation (8.307) implies

Agn + Tgn → 0, n → ∞.    (8.311)

As above, compactness of the operator TA−1 and the last equation in (8.310) imply that one can assume that the subsequence Tgnm, which we denote again Tgn, converges in H. This and equation (8.311) imply that Agn converges to an element h:

Agn → h.    (8.312)

Since A is closed and gn → 0, one concludes that h = 0. This is a contradiction:

1 = lim_{n→∞} ‖Agn‖ = ‖h‖ = 0.

This contradiction proves estimate (8.308). We have proved that the operator B is closed.

(2) Let us prove that σ(B) is discrete. We have

(B − λI)−1 = (A + T − λI)−1 = (A + kI)−1(I + Q − µS)−1,    (8.313)

where

Q := TS, S := (A + kI)−1,    (8.314)

µ := λ + k, k ∉ σ(A).    (8.315)

The operators S and Q are compact. If B + kI is injective then I + Q is injective. Since Q is compact, this implies that I + Q is an isomorphism of H onto H. Therefore

(I + Q − µS)−1 = (I + Q)−1(I − µK)−1,    (8.316)

where

K := S(I + Q)−1 is compact.    (8.317)

Therefore the set of µ for which the operator B − λI is not invertible is a discrete set, namely the set of characteristic values of the compact operator K. Recall that µj is a characteristic value of K if

φj = µjKφj, φj ≠ 0.    (8.318)

Thus the set {µj} has the only possible limit point at infinity. Each µj is an isolated characteristic value of K of finite algebraic multiplicity, and therefore λj = µj − k is an isolated eigenvalue of B of finite algebraic multiplicity. Finally, the projection operator (8.261) corresponding to λj is finite dimensional, so that λj is a normal eigenvalue. We have proved that σ(B) is discrete.

(3) Let us prove the last statement of the theorem, i.e. formula (8.301). We have

sn(B) = s−1n(B−1) = s−1n(A−1(I + TA−1)−1).    (8.319)

We can assume without loss of generality that k = 0. In this case the operator I + TA−1 is invertible, and since TA−1 is compact one can write (I + TA−1)−1 = I + S, where S is a compact operator. The operator A−1 is compact by the assumption. We can now apply Theorem 8.3 and obtain

lim_{n→∞} sn(A−1(I + S))/sn(A−1) = 1.    (8.320)

This is equivalent to the desired result (8.301).
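A finite-dimensional illustration of (8.301) (an informal sketch; the model matrices below are assumptions of the experiment): A = diag(j²) imitates an operator of order two, and T is a tridiagonal perturbation with entries of size j, so that TA−1 has entries of size O(1/j) and plays the role of a compact operator.

import numpy as np

N = 600
j = np.arange(1.0, N + 1)
A = np.diag(j ** 2)                       # s_n(A) = n^2 (increasing enumeration)
off = j[:-1]                              # off-diagonal entries ~ n: "lower order" than A
T = np.diag(off, 1) + np.diag(off, -1)    # symmetric; T @ inv(A) has entries O(1/n)
s_A = np.linalg.eigvalsh(A)               # equals s_n(A), since A = A* > 0
s_B = np.linalg.eigvalsh(A + T)           # B = A + T is selfadjoint in this model
for n in (50, 200, 550):
    print(n, s_B[n - 1] / s_A[n - 1])     # ratios tend to 1 as n grows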

8.3.2.9 Asymptotics of the spectrum for quadratic forms

In this section we study perturbations preserving spectral asymptotics for quadratic forms. As a motivation for this study let us consider the following classical problem.

Let D ⊂ Rr be a bounded domain with a smooth boundary Γ. Consider the problems

(−∆ + 1)uj = λjuj in D, uN = 0 on Γ,    (8.321)

(−∆ + 1)uj = µjuj in D, uN + σu = 0 on Γ,    (8.322)

where σ = σ(s) ∈ C1(Γ) and N is the outer normal to Γ.

The question is: how does one see that

lim_{n→∞} µn/λn = 1?    (8.323)

The usual argument uses relatively complicated variational estimates. The eigenvalues λn are minima of the ratio of the quadratic forms

∫_D [|∇u|² + |u|²] dx / ∫_D |u|² dx = min, u ∈ H1(D),    (8.324)

while the µn are minima of the ratio

{∫_D [|∇u|² + |u|²] dx + ∫_Γ σ|u|² ds} / ∫_D |u|² dx = min, u ∈ H1(D).    (8.325)

The desired conclusion (8.323) follows immediately from the abstract result we will prove and from the fact that the quadratic form ∫_Γ σ|u|² ds is compact with respect to the quadratic form ∫_D [|∇u|² + |u|²] dx.

Let A[u, v] and T[u, v] be bounded from below quadratic forms in a Hilbert space, T[u, u] ≥ 0 and A[u, u] > m‖u‖², m > 0. Assume that D[A] ⊂ D[T], where D[A] is the domain of definition of the form A, and that the form A is closed and densely defined in H. The form A is called closed if D[A] is closed in the norm

‖u‖A := A[u, u]^{1/2}.    (8.326)

If A[u, u] is not positive definite but bounded from below, A[u, u] ≥ −m‖u‖², ∀u ∈ D[A], then the norm ‖u‖A is defined by

‖u‖A = {A[u, u] + (m + 1)(u, u)}^{1/2}.    (8.327)

The following proposition is well known (see e.g. [Kato (1995)]).

Proposition 8.3 Every closed, bounded from below quadratic form A[u, v] is generated by a uniquely defined selfadjoint operator A.

This means that

A[u, v] = (Au, v) ∀u ∈ D(A), v ∈ D[A],

and D(A) ⊂ D[A] ⊂ H is dense in D[A] in the norm (8.327). The spectrum of a closed, bounded from below quadratic form is the spectrum of the corresponding selfadjoint operator A.

Definition 8.5 A quadratic form T is called A-compact if from any sequence fn such that ‖fn‖A ≤ c one can select a subsequence fnk such that T[fnk − fnm, fnk − fnm] → 0 as m, k → ∞.

Theorem 8.9 If A[u, u] is a closed positive definite quadratic form in H with discrete spectrum λn(A), and T[u, u] is a positive A-compact quadratic form, D[A] ⊂ D[T], then the form B[u, u] := A[u, u] + T[u, u], D[B] = D[A], is closed, its spectrum is discrete, and

lim_{n→∞} λn(B)/λn(A) = 1.    (8.328)

The conclusions of the theorem remain valid if T[u, u] is not positive but |T[u, u]| ≤ T1[u, u] and T1 is A-compact.

We need a couple of lemmas for the proof.

Lemma 8.21 Under the assumptions of Theorem 8.9 the quadratic form T[u, u] > 0 can be represented as

T[u, v] = [Tu, v],    (8.329)

where [u, v] is the inner product in HA := D[A] and T > 0 is a compact selfadjoint operator in HA.

Proof. Consider the quadratic form T[u, v] in the Hilbert space HA. Since T[u, u] is A-compact, it is bounded in HA. If T[u, v] is not closed in HA, consider its closure and denote it again by T[u, v]. By Proposition 8.3 there exists a selfadjoint operator T > 0 in HA such that (8.329) holds. Let us prove that T is compact in HA. Suppose ‖un‖A ≤ c. Since T[u, u] is A-compact there exists a subsequence, which we denote un again, such that

T[un − um, un − um] → 0, n, m → ∞.

Thus

[T(un − um), un − um] → 0, n, m → ∞.    (8.330)

Since T > 0 is selfadjoint, T^{1/2} is well defined and (8.330) can be written as

‖T^{1/2}(un − um)‖A → 0, n, m → ∞.    (8.331)

This implies that T^{1/2} is compact in HA. Therefore T is compact. Lemma 8.21 is proved.

Lemma 8.22 Under the assumptions of Theorem 8.9 one has HB = HA.

Proof. It is sufficient to prove that

T[u, u] ≤ εA[u, u] + c(ε)‖u‖² ∀ε > 0.    (8.332)

If (8.332) holds then

(1 − ε)A[u, u] − c(ε)‖u‖² ≤ B[u, u] ≤ (1 + ε)A[u, u] + c(ε)‖u‖² ≤ c2(ε)A[u, u],

so that the norm {B[u, u] + c(ε)‖u‖²}^{1/2} is equivalent to the norm ‖u‖A. This means that HB = HA. The proof of (8.332) is the same as the proof of the corresponding estimate in Lemma 8.19 used in the proof of Theorem 8.5. Lemma 8.22 is proved.

Proof of Theorem 8.9 We need only prove formula (8.328) and the fact that B has a discrete spectrum. The other conclusions of Theorem 8.9 have been proved in Lemmas 8.21 and 8.22. Since the form B[u, u] is bounded from below in H we may assume that it is positive definite. If not, we choose a constant m such that Bm[u, u] := B[u, u] + m(u, u) is positive definite. Since λn(Bm) = λn(B) + m and since λn(A) → +∞, the equation

lim_{n→∞} λn(Bm)/λn(A) = 1

is equivalent to (8.328).

Note first that the spectrum of the form B[u, u] is discrete. Indeed, the following known proposition (Rellich's lemma) implies this.

Proposition 8.4 Let B[u, u] be a positive definite closed quadratic form in H. The spectrum of B is discrete if and only if the imbedding operator i : HB → H is compact.

For the convenience of the reader we prove Proposition 8.4 after we finish the proof of Theorem 8.9.

Returning to the proof of Theorem 8.9, we note that A has a discrete spectrum by the assumption. Therefore i : HA → H is compact. Since HA = HB, the imbedding i : HB → H is compact. By Proposition 8.4 this implies that the spectrum of B is discrete.

To prove formula (8.328) we use the minimax principle:

λn+1(B) = sup_{Ln} inf_{u⊥Ln} B[u, u]/(u, u)
 ≥ inf_{u⊥Ln(A)} [A[u, u]/(u, u)] (1 + T[u, u]/A[u, u])
 ≥ λn+1(A) (1 − sup_{u⊥Ln(A)} T[u, u]/‖u‖²A)
 = λn+1(A)(1 − γn), γn → 0 as n → ∞.    (8.333)

Here we used Theorem 8.2 and denoted by Ln(A) the linear span of the first n eigenvectors of the operator A generated by the quadratic form A[u, u].

Interchanging A and B we get

λn+1(A) ≥ λn+1(B)(1 − δn), δn → 0, n → ∞.    (8.334)

From (8.333) and (8.334) formula (8.328) follows.

The last statement of Theorem 8.9 follows from Proposition 8.5 below.

Proposition 8.5 If |T[u, u]| ≤ T1[u, u] and T1 is A-compact, then the operator A−1T is compact in HA.

We will prove this proposition after the proof of Proposition 8.4. Proposition 8.5 granted, the proof of the last statement of Theorem 8.9 is quite similar to the one given above and is left to the reader. Theorem 8.9 is proved.

Proof of Proposition 8.4 Assume that the spectrum of B[u, u] is discrete. Then the corresponding selfadjoint operator B has only isolated eigenvalues 0 < m ≤ λn(B) → +∞. Therefore the operator B−1 is compact in H. This implies that B−1/2 is compact in H. Assume that ‖un‖B ≤ c, that is, ‖B^{1/2}un‖ ≤ c. Then the sequence un = B−1/2B^{1/2}un contains a subsequence convergent in H. Thus, the imbedding i : HB → H is compact.

Conversely, suppose i : HB → H is compact. Then any sequence un such that ‖un‖B = ‖B^{1/2}un‖ < c contains a convergent subsequence. This means that B−1/2 is compact. Since B−1/2 is selfadjoint it follows that B−1 is compact. Since B ≥ m > 0 this implies that the spectrum of B is discrete. Proposition 8.4 is proved.

Proof of Proposition 8.5 Denote Q := A−1T, Q1 := A−1T1. Then

|[Qu, u]| ≤ [Q1u, u],    (8.335)

where the brackets denote the inner product in HA. The operator Q1 is nonnegative and compact in HA. Indeed,

[Q1u, u] = (T1u, u) ≥ 0,

so Q1 ≥ 0 in HA. Suppose ‖un‖A ≤ c. Then T1un contains a Cauchy sequence in HT1, that is,

(T1unm − T1unk, unm − unk) → 0, m, k → ∞.    (8.336)

Thus

[Q1(unm − unk), unm − unk] → 0, m, k → ∞.    (8.337)

Since Q1 ≥ 0, equation (8.337) implies that Q1^{1/2} is compact in HA. Therefore Q1 is compact in HA. Conversely, if Q1 is compact in HA then T1 ≥ 0 is A-compact. Indeed, if ‖un‖A ≤ c then T1un = AQ1un, so that there is a subsequence unm such that

(T1(unm − unk), unm − unk) = [Q1(unm − unk), unm − unk] → 0

as m, k → ∞, because Q1 is compact in HA.

So we have proved that

T1 ≥ 0 is A-compact if and only if A−1T1 is compact in HA.    (8.338)

Let us now prove that Q is compact in HA if Q1 is compact in HA; this is the conclusion of Proposition 8.5. According to Section 8.3.1.1 it is sufficient to prove that if fn ⇀ 0 and gn ⇀ 0 in HA then

[Qfn, gn] → 0, n → ∞.    (8.339)

Indeed, if

fn ⇀ 0 and gn ⇀ 0 ⇒ [Qfn, gn] → 0,    (8.340)

then

fn ⇀ f and gn ⇀ g ⇒ [Qfn, gn] → [Qf, g],    (8.341)

that is, Q is compact. To check that (8.340) implies (8.341) one writes

[Qfn, gn] = [Q(fn − f), gn] + [Qf, gn]
 = [Q(fn − f), gn − g] + [Q(fn − f), g] + [Qf, gn − g] + [Qf, g]
 = [Q(fn − f), gn − g] + [fn − f, Q∗g] + [Qf, gn − g] + [Qf, g].    (8.342)

It follows from (8.342) that (8.340) implies (8.341).

Let us check that (8.335) implies (8.340). One uses the well known polarization identity

[Qf, g] = (1/4){[Q(f + g), f + g] − [Q(f − g), f − g] − i[Q(f + ig), f + ig] + i[Q(f − ig), f − ig]}.    (8.343)

It is clear from (8.343) and (8.335) that

|[Qfn, gn]| ≤ (1/4){|[Q1(fn + gn), fn + gn]| + |[Q1(fn − gn), fn − gn]|
 + |[Q1(fn + ign), fn + ign]| + |[Q1(fn − ign), fn − ign]|} → 0, n → ∞.    (8.344)

The last conclusion follows from the assumed compactness of Q1 and the fact that if fn ⇀ 0 and gn ⇀ 0 then any linear combination c1fn + c2gn converges weakly to zero. Proposition 8.5 is proved.

Example 8.2 It is now easy to see that (8.323) holds. Indeed, the imbedding i : H1(D) → L2(Γ, |σ|) is compact. Therefore the quadratic form ∫_Γ σ|u|² ds is A-compact, where A[u, u] = ∫_D (|∇u|² + |u|²) dx. From this and Theorem 8.9 the formula (8.323) follows.
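For r = 1 the formula (8.323) can also be observed numerically. The sketch below is an informal illustration (the interval (0, 1), the grid, and the constant σ = 5 are assumptions of the experiment): the quadratic forms (8.324) and (8.325) are discretized by finite differences with a lumped mass matrix; the Robin form differs from the Neumann one by a rank-two term, which is the discrete counterpart of the A-compactness of ∫_Γ σ|u|² ds.

import numpy as np

N = 1000
h = 1.0 / N
sigma = 5.0
d = np.full(N + 1, 2.0); d[0] = d[-1] = 1.0
K = (np.diag(d) + np.diag(-np.ones(N), 1) + np.diag(-np.ones(N), -1)) / h   # form for the integral of u'^2
A_form = K + h * np.eye(N + 1)                # form (8.324): integral of (u'^2 + u^2), lumped mass
B_form = A_form.copy()
B_form[0, 0] += sigma
B_form[-1, -1] += sigma                       # adds sigma(|u(0)|^2 + |u(1)|^2), cf. (8.325)
lam = np.linalg.eigvalsh(A_form) / h          # Neumann eigenvalues lambda_n
mu = np.linalg.eigvalsh(B_form) / h           # Robin eigenvalues mu_n
for n in (5, 50, 300):
    print(n, mu[n - 1] / lam[n - 1])          # mu_n / lambda_n approaches 1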

8.3.2.10 Proof of Theorem 2.3

In this section we prove Theorem 2.3.

First let us note that if

R(x, y) = ∫_{−∞}^{∞} ω(λ)Φ(x, y, λ) dρ(λ),    (8.345)

where ω(λ) ∈ C(R1), ω(∞) = 0, then the operator R : L2(D) → L2(D), where D ⊂ Rr is a bounded domain and

Rh := ∫_D R(x, y)h(y) dy,    (8.346)

is compact. This is proved in Section 4.2 (cf. the argument after formula (4.71)). If ω(λ) ≥ 0 then R = R∗ ≥ 0. Suppose that

ω1(λ) = ω(λ)[1 + φ(λ)], φ(±∞) = 0,    (8.347)

where φ(λ) ∈ C(R1), 1 + φ(λ) > 0. Then the corresponding operator R1 can be written as

R1 = R(I + Q),    (8.348)

where Q is a compact operator with the kernel

Q(x, y) = ∫_{−∞}^{∞} φ(λ)Φ(x, y, λ) dρ(λ).    (8.349)

By Theorem 8.3 one has

lim_{n→∞} sn(R1)/sn(R) = 1.    (8.350)

Note that the operator I + Q is injective since 1 + φ(λ) > 0. Since R1 ≥ 0 and R ≥ 0 one has sn(R1) = λn(R1), sn(R) = λn(R). Therefore (8.350) can be written as

lim_{n→∞} λn(R1)/λn(R) = 1.    (8.351)

Therefore it is sufficient to prove formula (2.31) for ω(λ) = (1 + λ²)−a/2.

Secondly, let us note that if one defines

N(λ) := ∑_{λn≤λ} 1,    (8.352)

then the formulas

λn = cn^p [1 + o(1)] as n → +∞, c = const > 0, p > 0,    (8.353)

and

N(λ) = c^{−1/p}λ^{1/p} [1 + o(1)], λ → +∞, p > 0,    (8.354)

are equivalent. This follows from the fact that the function N(λ) is the inverse function of λ(N) := λN in the sense that N(λN) = N and λ(N(λ)) = λN. Therefore if one knows the asymptotics of N(λ) then one knows the asymptotics of λn, and vice versa. In [Ramm (1975), p. 339] it is proved that if o(1) in (8.353) is O(n^{−p1}), p1 > 0, then o(1) in (8.354) is O(λ^{−p1/p}).

Thirdly, let us recall a well known fact: an elliptic operator L of order s with smooth coefficients and regular boundary conditions in a bounded domain D ⊂ Rr with a smooth boundary (these are the regularity assumptions) has a discrete spectrum and

N(λ, L) = γλ^{r/s} [1 + o(1)], λ → +∞,    (8.355)

where N(λ, L) is defined by (8.352) with λn = λn(L), and γ = const > 0 is defined by formula (2.32). By formula (8.353) one obtains

λn(L) = γ^{−s/r}n^{s/r} [1 + o(1)], n → +∞.    (8.356)

The operator R is a rational function of L, so that by the spectral mapping theorem one obtains

λn(R) = λ−an(L) [1 + o(1)] = γ^{as/r}n^{−as/r} [1 + o(1)], n → ∞.    (8.357)

This is formula (2.31).
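As an informal numerical check of (8.355) (only an illustration; the choice of domain and operator is an assumption of the experiment): for the Dirichlet Laplacian on the unit square one has r = s = 2, the eigenvalues are known explicitly, π²(j² + k²), and the Weyl constant is γ = |D|/(4π) = 1/(4π).

import numpy as np

J = 400                                           # large enough to cover all eigenvalues used below
j = np.arange(1, J + 1)
eig = np.pi ** 2 * (j[:, None] ** 2 + j[None, :] ** 2).ravel()   # Dirichlet eigenvalues on the unit square
for lam in (1e4, 1e5, 1e6):
    N_lam = np.count_nonzero(eig <= lam)          # counting function N(lambda)
    print(lam, N_lam / (lam / (4.0 * np.pi)))     # ratio tends to 1, as in (8.355)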

For the function ω(λ) = (1 + λ²)−a/2 and even a, a proof of formula (2.31) is given in [Ramm (1980), p. 62]. This proof goes as follows. The problem

Rφn := ∫_D R(x, y)φn(y) dy = λnφn(x), x ∈ D,    (8.358)

is equivalent to the problem

∫_{Rr} R(x, y)φn(y) dy = λnφn(x) for x ∈ D,  ∫_{Rr} R(x, y)φn(y) dy = un(x) for x ∈ Ω,    (8.359)

where

φn(x) := 0 in Ω,    (8.360)

Q(L)un = 0 in Ω,    (8.361)

un(∞) = 0, ∂^j_N un = λn ∂^j_N φn on Γ, 0 ≤ j ≤ as/2 − 1,    (8.362)

where Q(λ) := (1 + λ²)^{a/2}. The equivalence means that every solution to (8.358) generates a solution to (8.359)–(8.362) and vice versa. The problem (8.359)–(8.362) can be written as

λnQ(L)φn = χD(x)φn(x) in Rr,    (8.363)

where

χD(x) = 1 for x ∈ D,  χD(x) = 0 for x ∈ Ω.

This problem has been studied in [Tulovskii (1979)] and formula (8.357) has been established.

For the general case of ω(λ) = (1 + λ²)−a/2, a > 0, one can use results from the spectral theory of elliptic pseudo-differential operators. Under suitable regularity assumptions the following formula for the number N(λ) := #{λn : λn ≤ λ} of eigenvalues of such an operator R in a bounded domain D ⊂ Rr is valid:

N(λ) = (2π)−r meas{(x, ξ) ∈ D × Rr : r(x, ξ) < λ} [1 + o(1)], λ → +∞.    (8.364)

Here meas is the Lebesgue measure, and r(x, ξ) is the symbol of the pseudo-differential operator R. This means that

Rh := (2π)−r ∫∫ exp{i(x − y) · ξ} r(x, ξ)h(y) dy dξ,  ∫ := ∫_{Rr}.    (8.365)

The symbol of the elliptic operator (2.5) is ∑_{|j|≤s} aj(x)(iξ)^j. Only the principal symbol, that is, ∑_{|j|=s} aj(x)ξ^j, defines the main term of the asymptotics of N(λ). Since s = ord L is even, one chooses L so that L0(x, ξ) := ∑_{|j|=s} aj(x)ξ^j > 0 for |ξ| ≠ 0. For example, one chooses L = −∆ rather than ∆. In this case

(2π)−r meas{(x, ξ) ∈ D × Rr : L0(x, ξ) < λ} = λ^{r/s}(2π)−r ∫_D η dx = γλ^{r/s},    (8.366)

where η is given by (2.33) for the operator L0 in the selfadjoint form.

The asymptotic behavior of the function N(λ) has been studied extensively for wide classes of differential and pseudo-differential operators (see, e.g., [Levitan (1971); Hormander (1983-85); Safarov and Vassiliev (1997); Shubin (1986)] and references therein).

8.3.3 Trace class and Hilbert-Schmidt operators

In this section we summarize, for the convenience of the reader, some results on Hilbert-Schmidt and trace-class operators. This material can be used for reference.

One writes A ∈ σp, 1 ≤ p < ∞, if

∑_{j=1}^{∞} s^p_j(A) < ∞.

8.3.3.1 Trace class operators

The operators A ∈ σ1 are called trace class (or nuclear) operators. The operators A ∈ σ2 are called Hilbert-Schmidt (HS) operators. The class σ∞ denotes the class of compact operators. If A ∈ σp then A ∈ σq with q ≥ p. We summarize some basic known results about trace class and HS operators.

Lemma 8.23 A ∈ σ1 if and only if the sum Tr A := ∑_{j=1}^{∞}(Aφj, φj) is finite for any orthonormal basis {φj} of H; the sum does not depend on the choice of the basis. In fact, Tr A = ∑_{j=1}^{ν(A)} λj(A), where ν(A) is the sum of the algebraic multiplicities of the eigenvalues of A.

The following properties of the trace are useful.

Lemma 8.24 If Aj ∈ σ1, j = 1, 2, then

1) Tr(c1A1 + c2A2) = c1 Tr A1 + c2 Tr A2, cj = const, j = 1, 2;
2) Tr A∗ = (Tr A)*, the star standing here for complex conjugation;
3) Tr(A1A2) = Tr(A2A1);
4) Tr(B−1AB) = Tr A, where A ∈ σ1 and B is a linear isomorphism of H onto H;
5) Tr(A1A2) ≥ 0 if A1 ≥ 0 and A2 ≥ 0;
6) Tr(A1A2)^{1/2} ≤ (Tr A1 + Tr A2)/2 if A1 ≥ 0 and A2 ≥ 0;
7) |Tr A| ≤ ∑_{j=1}^{∞} sj(A) := ‖A‖1;
8) ‖A‖1 = sup ∑_{j=1}^{∞} |(Afj, hj)|, where the sup is taken over all orthonormal bases {fj} and {hj} of H;
9) ‖c1A1 + c2A2‖1 ≤ |c1| ‖A1‖1 + |c2| ‖A2‖1;
10) Assume that Aj ∈ σ1, 1 ≤ j < ∞, ‖Aj‖1 ≤ c, where c does not depend on j, and Aj ⇀ A. Then A ∈ σ1 and ‖A‖1 ≤ sup_j ‖Aj‖1; the symbol ⇀ denotes weak convergence of operators;
11) if A ∈ σ1 and B is a bounded linear operator, then

‖AB‖1 ≤ ‖A‖1 ‖B‖,  ‖BA‖1 ≤ ‖A‖1 ‖B‖.
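In finite dimensions every operator belongs to σ1, and several items of Lemma 8.24 can be verified directly; the sketch below is an informal sanity check (the random matrices are assumptions of the experiment).

import numpy as np

rng = np.random.default_rng(2)
n = 6
A1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
nuc = lambda M: np.linalg.svd(M, compute_uv=False).sum()          # ||M||_1 = sum of s-values
print(np.isclose(np.trace(A1 @ A2), np.trace(A2 @ A1)))           # property 3)
print(np.isclose(np.trace(A1.conj().T), np.conj(np.trace(A1))))   # property 2)
print(abs(np.trace(A1)) <= nuc(A1) + 1e-12)                       # property 7)
print(nuc(A1 @ B) <= nuc(A1) * np.linalg.norm(B, 2) + 1e-9)       # property 11)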

8.3.3.2 Hilbert-Schmidt operators

The operators A ∈ σ2 are called Hilbert-Schmidt (HS) operators.

Lemma 8.25 A ∈ σ2 if and only if the sum ‖A‖²2 := ∑_{j=1}^{∞} ‖Aφj‖² is finite for any orthonormal basis {φj} of H. If A ∈ σ2 then

‖A‖²2 = ∑_{j=1}^{∞} s²j(A).

If cj = const and Aj ∈ σ2, then

‖c1A1 + c2A2‖2 ≤ |c1| ‖A1‖2 + |c2| ‖A2‖2.

If A ∈ σ2 and B is a bounded linear operator, then

‖A‖2 = ‖A∗‖2,  ‖AB‖2 ≤ ‖A‖2 ‖B‖.    (8.367)

Lemma 8.26 A ∈ σ2 if and only if there exists an orthonormal basis {φj} of H for which

∑_{j=1}^{∞} ‖Aφj‖² < ∞.    (8.368)

In particular, if (8.368) holds for one orthonormal basis of H, then it holds for every orthonormal basis of H.

Lemma 8.27 A ∈ σ1 if and only if there exists an orthonormal basis of H for which

∑_{j=1}^{∞} ‖Aφj‖ < ∞.    (8.369)

However, if (8.369) holds for one orthonormal basis of H it may not hold for another orthonormal basis of H.

Example 8.3 Let H = ℓ², f = c(1, 1/2, . . . , 1/n, . . .), where c = const > 0 is chosen so that ‖f‖ = 1. Let A be the orthogonal projection onto the one-dimensional subspace spanned by f, and let {φj} be the standard orthonormal basis of ℓ² (the i-th coordinate of φj is δij). Then Aφj = (c/j)f, and ∑_{j=1}^{∞} ‖Aφj‖ = ∑_{j=1}^{∞} c/j = ∞, although A is of rank one and therefore A ∈ σ1.

Lemma 8.28 A ∈ σ1 if and only if it can be represented in the form A = A1A2, where Aj ∈ σ2, j = 1, 2.

Lemma 8.29 The classes σ1 and σ2 are ideals in the algebra L(H) of all linear bounded operators on H. If H is a separable Hilbert space then the set σ∞ of all compact operators on H is the only closed proper non-zero ideal in L(H). An ideal is called proper if it is not L(H) itself. Closedness is understood as closedness in the norm of linear operators on H.

8.3.3.3 Determinants of operators

Definition 8.6 If A ∈ σ1 then

d(µ) := det(I − µA) := ∏_{j=1}^{ν(A)} [1 − µλj(A)].

One has:

1) |d(µ)| ≤ exp(|µ| ‖A‖1);
2) d(µ) = exp(−∫_0^µ Tr[A(I − λA)−1] dλ), if the operator I − λA, 0 ≤ λ ≤ µ, is invertible;
3) det(I − A) = lim_{n→∞} det[δij − (Aφi, φj)]_{i,j=1,...,n}, where {φj} is an arbitrary orthonormal basis of H;
4) det(I − AB) = det(I − BA) if AB ∈ σ1, BA ∈ σ1, A ∈ σ∞, B ∈ L(H);
5) det[(I − A)(I − B)] = det[(I − B)(I − A)], A, B ∈ σ1;
6) if A(z) ∈ σ1 is an analytic operator function in a domain ∆ of the complex plane z, then d(1, z) := det(I − A(z)) is analytic in ∆; here d(µ, z) := det(I − µA(z));
7) (d/dz) Tr F(A(z)) = Tr[F′(A(z)) dA(z)/dz], where F(λ) is holomorphic in a domain which contains the spectrum of A(z) for all z ∈ ∆, and F(0) = 0;
8) det(I + A) = exp{Tr log(I + A)}, A ∈ σ1, where log(I + A) can be defined by analytic continuation of log(I + zA). This function is well defined for |z| ‖A‖ < 1.
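For a finite matrix, which is automatically in σ1, Definition 8.6 and property 4) can be checked in a few lines (an informal sketch; the matrices are assumptions of the experiment).

import numpy as np

rng = np.random.default_rng(3)
n, mu = 5, 0.7
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
d_mu = np.prod(1.0 - mu * np.linalg.eigvals(A))                   # Definition 8.6
print(np.isclose(d_mu, np.linalg.det(np.eye(n) - mu * A)))        # agrees with det(I - mu A)
print(np.isclose(np.linalg.det(np.eye(n) - A @ B),
                 np.linalg.det(np.eye(n) - B @ A)))               # property 4)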

If A ∈ σ2 then the series ∑_{j=1}^{∞} |λj(A)| may diverge and Definition 8.6 is not applicable. One gives

Definition 8.7

d2(µ) := det2(I − µA) := ∏_{j=1}^{ν(A)} [1 − µλj(A)] exp[µλj(A)].

One has:

9) |d2(µ)| ≤ exp{(|µ|²/2) Tr(A∗A)};
10) d2(1) = lim_{n→∞} det[δij − (Aφi, φj)]_{1≤i,j≤n} exp[∑_{j=1}^n (Aφj, φj)], where {φj} is an arbitrary orthonormal basis of H;
11) if A, B ∈ σ2 and I − C = (I − A)(I − B), then

det2(I − C) exp[Tr(AB)] = det2(I − A) det2(I − B).

If B ∈ σ1 then

12) det2(I − C) = det2(I − A) det(I − B) exp{Tr[(I − A)B]}.

Definition 8.8 If A ∈ σp then

13) dp(µ) := detp(I − µA) := ∏_{j=1}^{∞} [1 − µλj(A)] exp[∑_{m=1}^{p−1} (µ^m/m) λ^m_j(A)].

Carleman's inequality: if A ∈ σ2, λj are the eigenvalues of A counted according to their multiplicities, |λ1| ≥ |λ2| ≥ · · ·, and φλ(A) := ∏_{j=1}^{∞}(1 − λjλ−1) exp(λjλ−1), then

14) ‖φλ(A)(A − λI)−1‖ ≤ |λ|−1 exp{(1/2)(1 + |λ|−2 ‖A‖²2)}.

8.4 Elements of probability theory

8.4.1 The probability space and basic definitions

A probability space is a triple {Ω, U, P}, where Ω is a set, U is a sigma-algebra of its subsets, and P is a measure on U such that P(Ω) = 1; so it is a measure space {Ω, U} equipped with a normalized countably additive measure:

P(∪_{j=1}^{∞} Aj) = ∑_{j=1}^{∞} P(Aj), Aj ∩ Am = ∅ for j ≠ m.

A random variable ξ is a U-measurable function on Ω, that is, a U-measurable map Ω → R1. A random vector ξ is a U-measurable map Ω → Rr.

A distribution function of ξ is F(x) = P(ξ < x). It has the properties: F(−∞) = 0, F(+∞) = 1, F is nondecreasing, F(x + 0) − F(x) = P(ξ = x), F(x − 0) = F(x).

The probability density f(x) := F′(x) is defined in the classical sense if F(x) is absolutely continuous, so that F(x) = ∫_{−∞}^x f(t) dt. If ξ is a discrete random variable, that is, ξ takes values in a discrete set of points, then its distribution function is F(x) = ∑_i Pi θ(x − xi), where

θ(x) := 1 for x > 0, θ(x) := 0 for x ≤ 0;  Pi > 0,  ∑_i Pi = 1.

The probability density for this distribution is f(x) = ∑_i Pi δ(x − xi), where δ(x) is the delta-function.

The probability density has the properties:

1) f ≥ 0,
2) ∫_{−∞}^{∞} f dt = 1.

A random vector ξ = (ξ1, . . . , ξr) has a distribution function

F(x1, . . . , xr) := P(ξ1 < x1, . . . , ξr < xr).

This function has the characteristic properties:

1) F(+∞, . . . , +∞) = 1,
2) F(x1, . . . , xm = −∞, . . . , xr) = 0 for any 1 ≤ m ≤ r,
3) F(x1, . . . , xm = +∞, . . . , xr) = F(x1, . . . , xm−1, xm+1, . . . , xr),
4) F is continuous from the left, i.e. F(x1, . . . , xm − 0, . . . , xr) = F(x1, . . . , xm, . . . , xr), and nondecreasing in each of the variables x1, . . . , xr.

The probability density f(x1, . . . , xr) is defined by the formula

f(x1, . . . , xr) = ∂^r F(x1, . . . , xr) / (∂x1 . . . ∂xr),

so that

F(x1, . . . , xr) = ∫_{−∞}^{x1} dx1 . . . ∫_{−∞}^{xr} dxr f(x1, . . . , xr).

This formula holds if the measure defined by the distribution function F is absolutely continuous with respect to the Lebesgue measure in Rr, i.e. if P(ξ ∈ ∆) = 0 for any ∆ ⊂ Rr such that meas ∆ = 0, where meas is the Lebesgue measure in Rr.

Example 8.4 A random vector ξ is said to be uniformly distributed in a set ∆ ⊂ Rr if its probability density is

f(x) = 1/meas ∆ for x ∈ ∆,  f(x) = 0 for x ∉ ∆.    (8.370)

Example 8.5 A random vector ξ is said to be Gaussian (or normally distributed, or normal) if its probability density is

f(x) = (2π)−r/2 [det C]^{1/2} exp{ −(1/2) ∑_{i,j=1}^{r} cij(xi − mi)(xj − mj) }.    (8.371)

Here C = (cij) is a positive definite matrix, M[ξ] = ξ̄ = m = (m1, . . . , mr), and the matrix C−1 is the covariance matrix of ξ:

C−1 := (c^{(−1)}_{ij}), c^{(−1)}_{ij} = M[(ξi − mi)(ξj − mj)], M[ξi] = mi,    (8.372)

where all the quantities are real-valued and the bar (equivalently, M[·]) denotes the mean value, defined as follows:

ξ̄i = M[ξi] := ∫ xi dF,  ∫ := ∫_{Rr}.
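A brief numerical illustration of (8.371)–(8.372) (an informal sketch; the particular matrix C, the mean m, the test point and the sample size are assumptions of the experiment): the density (8.371) agrees with scipy's multivariate normal with covariance C−1, and the empirical covariance of samples approximates C−1.

import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(4)
C = np.array([[2.0, 0.5], [0.5, 1.0]])       # positive definite matrix C = (c_ij), r = 2
m = np.array([1.0, -2.0])
cov = np.linalg.inv(C)                       # C^{-1} is the covariance matrix, cf. (8.372)
x = np.array([0.3, -1.0])
f = (2 * np.pi) ** (-1.0) * np.sqrt(np.linalg.det(C)) * np.exp(-0.5 * (x - m) @ C @ (x - m))   # (8.371)
print(np.isclose(f, multivariate_normal(mean=m, cov=cov).pdf(x)))
samples = rng.multivariate_normal(m, cov, size=200000)
print(np.cov(samples.T))                     # close to C^{-1}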

If g(x₁, . . . , x_r) is a measurable function defined on R^r, g : R^r → R¹, then η = g(ξ₁, . . . , ξ_r) is a random variable with the distribution function

F_η(x) := P(η < x) = ∫_{g(x₁,...,x_r) < x} dF(x₁, . . . , x_r),

where F(x₁, . . . , x_r) is the distribution function of the vector (ξ₁, . . . , ξ_r). One has

\overline{g(ξ₁, . . . , ξ_r)} = ∫ g(x₁, . . . , x_r) dF(x₁, . . . , x_r).

In particular, the variance of a random variable is defined as

D[ξ] := \overline{(ξ − \overline{ξ})²} = ∫_{−∞}^{∞} (x − m)² dF(x),

where

\overline{ξ} = m = ∫_{−∞}^{∞} x dF(x).

Let us define conditional probabilities. If A and B are random events, then

P(A | B) := P(AB) / P(B),    (8.373)

where AB is the event A ∩ B, and P(A | B) is called the conditional probability of A under the assumption that B occurred. The conditional mean value (expectation) of a random variable ξ = f(u) under the assumption that B occurred is defined by

M(ξ | B) = P^{−1}(B) ∫_B f(u) P(du).    (8.374)

In particular, if

ξ = { 1 if A occurs;  0 otherwise },

then (8.374) reduces to (8.373).

If A = ∪_{j=1}^{n} E_j, E_j ∩ E_{j′} = ∅ for j ≠ j′, then

P(A) = Σ_{j=1}^{n} P(A | E_j) P(E_j)    (8.375)

and

P(E_j | A) = P(A | E_j) P(E_j) P^{−1}(A).    (8.376)

This is the Bayes formula. The conditional distribution function of a random variable ξ with respect to an event A is defined as

F_ξ(x | A) := P({ξ < x} ∩ A) / P(A).    (8.377)

One has

M(ξ | A) = ∫_{−∞}^{∞} x dF_ξ(x | A).    (8.378)

The characteristic function of a random variable ξ with the distribution function F(x) is defined by

φ(t) := M[exp(itξ)] = ∫_{−∞}^{∞} exp(itx) dF(x).    (8.379)

It has the properties:

1) φ(0) = 1, |φ(t)| ≤ 1 for −∞ < t < ∞, φ(−t) = φ*(t);
2) φ(t) is uniformly continuous on R¹;
3) φ(t) is positive definite in the sense that

Σ_{j,m=1}^{n} φ(t_j − t_m) z_j z_m* ≥ 0    (8.380)

for any complex numbers z_j and real numbers t_j, 1 ≤ j ≤ n.

Theorem 8.10 (Bochner-Khintchine)  A function φ(t) is a characteristic function if and only if it has properties 1)-3).

One has

lim_{T→∞} (1/(2T)) ∫_{−T}^{T} φ(t) exp(−itx) dt = 0

if F(x) is continuous at the point x. More generally,

lim_{T→∞} (1/(2T)) ∫_{−T}^{T} φ(t) exp(−itx) dt = F(x + 0) − F(x).

If conditions 2) and 3) hold but condition 1) does not hold, then one still has the formula

φ(t) = ∫_{−∞}^{∞} exp(itx) dF(x),    (8.381)

where F(x) is a monotone nondecreasing function, but the condition F(+∞) = 1 does not hold and is replaced by F(+∞) < ∞.
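The averaging formula above recovers the jump of F at x from the characteristic function. A minimal numerical illustration, for a hypothetical mixture (an atom of mass 0.3 at x = 1 plus a standard Gaussian component), is sketched below; the closed-form φ(t) used here follows directly from (8.379).

import numpy as np

p, x0 = 0.3, 1.0
phi = lambda t: p * np.exp(1j * t * x0) + (1 - p) * np.exp(-t**2 / 2)

for T in (10.0, 100.0, 1000.0):
    t = np.linspace(-T, T, 200001)
    val = np.trapz(phi(t) * np.exp(-1j * t * x0), t) / (2 * T)
    print(T, val.real)   # tends to 0.3, the jump F(x+0) - F(x) at x = 1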

Let us define the notion of a random function. Let (Ω, U, P) be the probability space and ξ(t, ω), ω ∈ Ω, be a family of random variables depending on a parameter t ∈ D ⊂ R^r. The space X to which the variable ξ(t, ω) belongs for a fixed ω ∈ Ω is called the phase space of the random function ξ(t, ω). If r = 1 the random function is called a random process; if r > 1 it is called a random field. Usually one writes ξ(t) in place of ξ(t, ω). If X = R^m then ξ(t) is a vector random field; if m = 1 then ξ(t) is a scalar random field. One assumes that X is a measurable space (X, B), where B is the Borel sigma-algebra generated by all open sets of X.

If one takes n points t₁, . . . , t_n ∈ D ⊂ R^r, then one obtains a random vector (ξ(t₁, ω), . . . , ξ(t_n, ω)). Let F_{ξ₁,...,ξ_n}(x₁, . . . , x_n) be the distribution function of this random vector. For various choices of the points t_j one obtains various distribution functions. The collection of these functions is consistent in the following sense:

1) F_{ξ₁,...,ξ_n}(x₁, x₂, . . . , x_{m−1}, x_m ∈ X, . . . , x_n) = F(x₁, . . . , x_{m−1}, x_{m+1}, . . . , x_n),
2) F_{ξ₁,...,ξ_n}(x₁, . . . , x_n) = F_{ξ_{i₁},...,ξ_{i_n}}(x_{i₁}, . . . , x_{i_n}) for any permutation (i₁, . . . , i_n) of the set (1, . . . , n).

The following theorem gives conditions for a family F(x₁, . . . , x_n) of functions to be the family of finite-dimensional distribution functions corresponding to a random function ξ(t).

Theorem 8.11 (Kolmogorov)  There exists a random function for which a given family of functions F is the family of its finite-dimensional distribution functions if and only if the family is consistent and each of the functions has the characteristic properties 1)-4) of a distribution function.

Moment functions of a random function are defined as

m_j(t₁, . . . , t_j) = M[ξ(t₁) · · · ξ(t_j)] = \overline{ξ(t₁) · · · ξ(t_j)}.    (8.382)

Especially often one uses the mean value

M[ξ(t, ω)] = m(t) = \overline{ξ(t)}    (8.383)

and the covariance function

R(t, τ) := \overline{[ξ(t) − m(t)]* [ξ(τ) − m(τ)]}.    (8.384)

The characteristic property of the class of covariance functions is positive definiteness in the sense

Σ_{i,j=1}^{n} R(t_i, t_j) z_i* z_j ≥ 0    (8.385)

for any choice of real t_i and complex z_i. The star stands for complex conjugate.

8.4.2 Hilbert space theory

Let us assume that a family of random variables with finite second moments is equipped with the inner product defined as

(ξ, η) := \overline{ξ η*}    (8.386)

and the norm defined as

‖ξ‖ = (ξ, ξ)^{1/2}.    (8.387)

Then the random variables belong to the space L² = L²(Ω, U, P). Convergence in this space is defined as convergence in the norm (8.387).

If ξ(t) is a random function, then its correlation function is defined as

B(t, τ) := \overline{ξ*(t) ξ(τ)}.    (8.388)

If \overline{ξ(t)} = 0, then B(t, τ) = R(t, τ), where R(t, τ) is the covariance function (8.384). The characteristic property of the class of correlation functions is positive definiteness in the sense (8.385).

A random function ξ(t) ∈ L²(Ω, U, P) is called continuous (in the L² sense) at a point t₀ if

‖ξ(t) − ξ(t₀)‖ → 0 as ρ(t, t₀) → 0,    (8.389)

where ρ(t, t₀) is the distance between t and t₀.

Lemma 8.30  For (8.389) to hold it is necessary and sufficient that B(t, τ) be continuous at the point (t₀, t₀).

A random function ξ(t) ∈ L² is called differentiable (in the L² sense) at a point t₀ if there exists in L² the limit

ξ′(t₀) := l.i.m._{ε→0} ε^{−1} [ξ(t₀ + ε) − ξ(t₀)],    (8.390)

where l.i.m. (limit in mean) stands for the limit in L².

Lemma 8.31  For (8.390) to hold it is necessary and sufficient that ∂²B(t₀, t₀)/∂t∂τ exist, that is,

∂²B(t₀, t₀)/∂t∂τ := lim_{ε₁→0, ε₂→0} (1/(ε₁ε₂)) [ B(t₀ + ε₁, t₀ + ε₂) − B(t₀, t₀ + ε₂) − B(t₀ + ε₁, t₀) + B(t₀, t₀) ].    (8.391)

If ∂²B(t, τ)/∂t∂τ exists, then

\overline{ξ′*(t) ξ′(τ)} = ∂²B(t, τ)/∂t∂τ,   \overline{ξ′*(t) ξ(τ)} = ∂B(t, τ)/∂t.    (8.392)

Let ξ(x), x ∈ D ⊂ R^r, be a random function and μ(x) be a finite measure. The Lebesgue integral of ξ(x) is defined as

∫_D ξ(x) dμ(x) = l.i.m._{n→∞} ∫_D ξ_n(x) dμ(x),    (8.393)

where ξ_n(x) ≤ ξ_{n+1}(x), ξ_n(x) ∈ L², and ξ_n(x) →_P ξ(x), that is,

lim_{n→∞} P(|ξ(x) − ξ_n(x)| > ε) = 0, ∀x ∈ D,    (8.394)

for every ε > 0.

If μ(D) < ∞ and

∫_D B(x, x) dμ(x) < ∞,    (8.395)

then

\overline{∫_D |ξ(x)|² dμ(x)} = ∫_D B(x, x) dμ(x).    (8.396)

Assume that ξ(x) ∈ L² and B(x, y) is continuous in D × D, where D ⊂ R^r is a finite domain. Then, by Mercer's theorem (see p. 61), one has

B(x, y) = Σ_{j=1}^{∞} λ_j φ_j(x) φ_j*(y),    (8.397)

where

Bφ_j = λ_j φ_j,   λ₁ ≥ λ₂ ≥ · · · > 0,    (8.398)

(φ_j, φ_m) = ∫_D φ_j φ_m* dx = δ_{jm},    (8.399)

Bφ := ∫_D B(x, y) φ(y) dy.    (8.400)

Put

ξ_n := ∫_D ξ(x) φ_n(x) dx.    (8.401)

Then

\overline{ξ_n* ξ_m} = ∫_D ∫_D B(x, y) φ_m(y) φ_n*(x) dy dx = λ_m δ_{nm},    (8.402)

and

\overline{ξ*(x) ξ_n} = ∫_D B(x, y) φ_n(y) dy = λ_n φ_n(x).    (8.403)

Lemma 8.32  The series

ξ(x) = Σ_{j=1}^{∞} ξ_j φ_j*(x)    (8.404)

converges in L² for every x ∈ D if the function B(x, y) is continuous in D × D.
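A discretized sketch of (8.397)-(8.404) (often called the Karhunen-Loève expansion) is given below. It assumes, as an illustrative choice, the covariance B(x, y) = min(x, y) of Brownian motion on (0, 1]; the grid, the sample size and the use of an eigendecomposition of the discretized kernel are numerical conveniences, not part of the text.

import numpy as np

n = 400
x = np.arange(1, n + 1) / n            # grid on (0, 1]
dx = 1.0 / n
B = np.minimum.outer(x, x)             # covariance matrix B(x_i, x_j)

# Discrete analogue of (8.398)-(8.400): eigenpairs of the integral operator
lam, phi = np.linalg.eigh(B * dx)      # weight dx approximates the integral
lam, phi = lam[::-1], phi[:, ::-1] / np.sqrt(dx)   # descending order, L2-normalized

# Draw sample paths and compute the coefficients (8.401)
rng = np.random.default_rng(0)
paths = rng.multivariate_normal(np.zeros(n), B, size=5000)
xi = paths @ phi * dx                  # xi_m = ∫ ξ(x) φ_m(x) dx

# (8.402): the coefficients are uncorrelated with variances λ_m
print(lam[:3])
print(np.var(xi[:, :3], axis=0))       # close to λ_1, λ_2, λ_3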

Remark 8.6  If one defines B₁(t, τ) by the formula

B₁(t, τ) := \overline{ξ(t) ξ*(τ)},    (8.405)

so that B₁ = B*, then formulas (8.397)-(8.400) hold for B₁; in formula (8.401) one puts φ_n* in place of φ_n, and in formula (8.404) one puts φ_j in place of φ_j*.

Let us define the notion of a stochastic, or random, measure. Consider a random function ζ(x) ∈ L², x ∈ D. Let B be a sigma-algebra of Borel subsets of D. Suppose that to any ∆ ∈ B there corresponds a random variable μ(∆) with the properties:

1) μ(∆) ∈ L², μ(∅) = 0;
2) μ(∆₁ ∪ ∆₂) = μ(∆₁) + μ(∆₂) if ∆₁ ∩ ∆₂ = ∅;
3) \overline{μ(∆₁) μ*(∆₂)} = m(∆₁ ∩ ∆₂),

where m(∆) is a certain deterministic function on B. Note that

m(∆) = \overline{|μ(∆)|²} ≥ 0

and

m(∆₁ ∪ ∆₂) = \overline{|μ(∆₁) + μ(∆₂)|²} = m(∆₁) + m(∆₂),

provided that ∆₁ ∩ ∆₂ = ∅, so that m(∆) has some of the basic properties of a measure. It is called the structural function of μ(∆), and μ(∆) is called an elementary orthogonal stochastic measure.

Assume that m(∆) is semiadditive in the following sense: for any ∆ ∈ B the inclusion ∆ ⊂ ∪_{j=1}^{∞} ∆_j, ∆_j ∈ B, implies

m(∆) ≤ Σ_{j=1}^{∞} m(∆_j).    (8.406)

Then m(∆) can be extended to a Borel measure on B.

Let f(x) ∈ L²(D, B, m). Define the stochastic integral as the following limit:

∫_D f(x) ζ(dx) = l.i.m._{n→∞} ∫_D f_n(x) ζ(dx),    (8.407)

where f_n(x) is a sequence of simple functions such that

‖f − f_n‖_{L²(D,m)} := ( ∫_D |f − f_n|² m(dx) )^{1/2} → 0,   n → ∞.    (8.408)

A simple function is a function of the type

f_n(x) = Σ_{j=1}^{n} c_j χ_{A_j},    (8.409)

where c_j = const and χ_{A_j} is the characteristic function of the set A_j ∈ B.

Lemma 8.33  If f_i ∈ L²(D, m), i = 1, 2, and c_i = const, then

∫_D (c₁ f₁ + c₂ f₂) ζ(dx) = c₁ ∫_D f₁ ζ(dx) + c₂ ∫_D f₂ ζ(dx)    (8.410)

and

\overline{ ∫_D f₁(x) ζ(dx) ∫_D f₂*(x) ζ*(dx) } = ∫_D f₁(x) f₂*(x) m(dx).    (8.411)
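A Monte-Carlo sketch of the stochastic integral (8.407) and the isometry (8.411) follows; it assumes, as an example, that ζ(dx) is the white-noise (Brownian-increment) measure on D = (0, 1), so that the structural function is m(dx) = dx, and that f₁, f₂ are real-valued; the grid and sample sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
n, trials = 1000, 20000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx

f1 = np.sin(2 * np.pi * x)
f2 = np.cos(2 * np.pi * x)

# ζ(∆_j) for the grid cells: independent, mean zero, variance m(∆_j) = dx
zeta = rng.normal(0.0, np.sqrt(dx), size=(trials, n))

I1 = zeta @ f1          # ≈ ∫ f1(x) ζ(dx), via the simple-function approximation (8.409)
I2 = zeta @ f2

lhs = np.mean(I1 * I2)              # left side of (8.411) (real-valued case)
rhs = np.sum(f1 * f2) * dx          # ∫ f1 f2 m(dx)
print(lhs, rhs)                     # both close to 0 here; try f2 = f1 as well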

Using the notion of the stochastic integral one can construct an integral

representation of random functions.


Suppose that a random function ξ(x), x ∈ D, has the covariance function of the form

B(x, y) = ∫_Λ g*(x, λ) g(y, λ) m(dλ),    (8.412)

where m(dλ) is a Borel measure on the set Λ, g(x, λ) ∈ L²(Λ, m(dλ)) for all x ∈ D, and the set of functions {g(x, λ), x ∈ D} is complete in L²(Λ, m(dλ)).

Lemma 8.34  Under the above assumptions there exists an orthogonal stochastic measure ζ(dλ) such that

ξ(x) = ∫_Λ g(x, λ) ζ(dλ).    (8.413)

Equation (8.413) holds with probability one, and m(∆) := ∫_∆ m(dλ) is the structural function corresponding to ζ(dλ). There is an isometric isomorphism between L²(Λ, m(dλ)) and L²_ξ, where L²_ξ is the closure of the set of random variables of the form Σ_{j=1}^{n} c_j ζ(∆_j), ∆_j ⊂ Λ, in the norm (8.387). This isomorphism is established by the correspondence

ξ(x) ↔ g(x, λ),   ζ(∆) ↔ χ_∆(λ).    (8.414)

If h_i(λ) ∈ L²(Λ, m(dλ)), i = 1, 2, then

(h₁, h₂)_{L²(Λ,m)} := ∫_Λ h₁ h₂* m(dλ) = \overline{ξ₁ ξ₂*},    (8.415)

where

ξ_i = ∫_Λ h_i(λ) ζ(dλ).    (8.416)

This theory extends to the case of random vector-functions.

8.4.3 Estimation in Hilbert space L2(Ω, U , P )

Assume that L²_ξ is a subspace of L²(Ω, U, P), a random variable η ∈ L²(Ω, U, P), and we want to give the best estimate of η by an element of L²_ξ, that is, to find η₀ ∈ L²_ξ such that

δ := ‖η − η₀‖ = inf_{φ∈L²_ξ} ‖η − φ‖,    (8.417)

where the norm is defined by (8.387).

The element η₀ ∈ L²_ξ does exist, is unique, and is the projection of η onto the subspace L²_ξ in the Hilbert space L²(Ω, U, P).

The error of the estimate, the quantity δ defined by (8.417), can be calculated analytically in some cases. For example, if (ξ₁, . . . , ξ_n) is a finite set of random variables, then

η₀ = (1/Γ) \begin{vmatrix} (ξ₁, ξ₁) & \cdots & (ξ₁, ξ_n) & ξ₁ \\ \vdots & & \vdots & \vdots \\ (ξ_n, ξ₁) & \cdots & (ξ_n, ξ_n) & ξ_n \\ (η, ξ₁) & \cdots & (η, ξ_n) & 0 \end{vmatrix},    (8.418)

where Γ = Γ(ξ₁, . . . , ξ_n) is the Gramian of (ξ₁, . . . , ξ_n):

Γ := \begin{vmatrix} (ξ₁, ξ₁) & \cdots & (ξ₁, ξ_n) \\ \vdots & & \vdots \\ (ξ_n, ξ₁) & \cdots & (ξ_n, ξ_n) \end{vmatrix}.    (8.419)

One has

δ² = Γ(ξ₁, . . . , ξ_n, η) / Γ(ξ₁, . . . , ξ_n).    (8.420)
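The Gramian formulas above can be checked numerically. The sketch below is an illustrative Monte-Carlo example (the particular ξ_i, η and sample size are arbitrary choices): the best coefficients are obtained by solving the normal equations with the Gram matrix, and δ² computed directly is compared with the ratio of Gramians in (8.420).

import numpy as np

rng = np.random.default_rng(2)
N = 200000                               # Monte-Carlo sample size
xi = rng.normal(size=(3, N))             # ξ_1, ξ_2, ξ_3
eta = 1.0 * xi[0] - 2.0 * xi[1] + rng.normal(size=N)   # η = combination + independent noise

G = xi @ xi.T / N                        # Gram matrix ((ξ_i, ξ_j))
b = xi @ eta / N                         # ((η, ξ_i))
c = np.linalg.solve(G, b)                # coefficients of η_0 = Σ c_i ξ_i

eta0 = c @ xi
delta2_direct = np.mean((eta - eta0) ** 2)

# (8.420): δ² = Γ(ξ_1,...,ξ_n, η) / Γ(ξ_1,...,ξ_n)
Gext = np.block([[G, b[:, None]], [b[None, :], np.array([[np.mean(eta**2)]])]])
delta2_gram = np.linalg.det(Gext) / np.linalg.det(G)
print(delta2_direct, delta2_gram)        # both ≈ 1 (the variance of the independent noise)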

The optimal estimate η₀ ∈ L²_ξ satisfies the orthogonality equation

(η₀ − η, ξ(x)) = 0, ∀x ∈ D,    (8.421)

which means geometrically that η − η₀ is orthogonal to L²_ξ.

Equation (1.6) for the optimal filter is a particular case of (8.421): if, in the notation of Chapter 1,

η₀(z) = ∫_D h(z, y) U(y) dy,   η = s(z),    (8.422)

then (8.421) becomes

∫_D h(z, y) \overline{U(y) U*(x)} dy = \overline{s(z) U*(x)},

which, according to (1.3), can be written as

∫_D h(z, y) R(x, y) dy = f(x, z).    (8.423)

This is equation (1.6).


8.4.4 Homogeneous and isotropic random fields

If ξ(x) is a random field and

\overline{ξ(x)} = 0,   \overline{ξ*(x) ξ(y)} = R(x − y),    (8.424)

then ξ(x) is called a (wide-sense) homogeneous random field. It is called a homogeneous random field if for any n and any x₁, . . . , x_n, x the distribution function of the n random variables ξ(x₁ + x), . . . , ξ(x_n + x) does not depend on x. Here x ∈ R^r or, if x ∈ D ⊂ R^r, then one assumes that x, y ∈ D implies x + y ∈ D. The function R(x) is positive definite in the sense (8.380). Therefore, by the Bochner-Khintchine theorem, there exists a monotone nondecreasing function F(x), F(+∞) < ∞, such that

R(x) = ∫ exp(ix · y) dF(y),   ∫ := ∫_{R^r},    (8.425)

x · y = Σ_{j=1}^{r} x_j y_j.    (8.426)

One often writes dF(y) = F(dy) to emphasize that F determines a measure on R^r. Monotonicity in the case r > 1 is understood as monotonicity in each of the variables.

If r > 1, then a positive definite function R(x) is the Fourier transform of a positive finite measure on R^r; this measure is given by the function F(x), which satisfies the characteristic properties 2)-4) of a distribution function. It follows from (8.425) that

0 < R(0) = F(R^r) < ∞,    (8.427)
|R(x)| ≤ R(0),    (8.428)
R(−x) = R*(x).    (8.429)

A homogeneous random field is called isotropic if

\overline{ξ*(x) ξ(y)} = \overline{ξ*(gx) ξ(gy)}    (8.430)

for all x, y ∈ R^r and all g ∈ SO(r), where SO(r) is the group of rotations of R^r around the origin. Equation (8.430) for homogeneous random fields is equivalent to

R(x) = R(gx),  ∀g ∈ SO(r).    (8.431)


This means that

R(x) = R(|x|),    (8.432)

where |x| = (x₁² + · · · + x_r²)^{1/2} is the length of the vector x. This and formula (8.425) imply that dF(y) = dφ(|y|). If

∫ |R(x)| dx < ∞,    (8.433)

then dF = f(y) dy and f(y) is continuous:

f(y) = (2π)^{−r} ∫ exp(−ix · y) R(x) dx.    (8.434)

If

∫ |R|² dx < ∞,    (8.435)

then dF = f(y) dy, f(y) ∈ L²(R^r), and formula (8.434) holds in the L² sense. It is known that

∫_{|x|=ρ} exp(ix · y) ds = (2πρ/|y|)^{r/2} |y| J_{(r−2)/2}(ρ|y|),    (8.436)

where J_n(t) is the Bessel function and ds is the element of the surface area of the sphere |x| = ρ in R^r. Using formula (8.436) one obtains

Lemma 8.35  Assume that R(ρ) is a continuous function. This function is a correlation function of a homogeneous isotropic random field in R^r if and only if it is of the form

R(ρ) = 2^{(r−2)/2} Γ(r/2) ∫_0^∞ [ J_{(r−2)/2}(λρ) / (λρ)^{(r−2)/2} ] dg(λ),    (8.437)

where g(λ) is a monotone nondecreasing bounded function, g(+∞) < ∞, and Γ(z) is the Gamma-function.

If r = 2, formula (8.437) becomes

R(ρ) = ∫_0^∞ J_0(λρ) dg(λ),    (8.438)

and for r = 3 one gets

R(ρ) = ∫_0^∞ [ sin(λρ)/(λρ) ] dg(λ).    (8.439)


From formula (8.425) and Lemma 8.34 it follows that a homogeneous random field admits the spectral representation

ξ(x) = ∫ exp(ix · y) ζ(dy),   ∫ := ∫_{R^r},    (8.440)

where ζ(dy) is an orthogonal stochastic measure on R^r.
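A discretized sketch of the spectral representation (8.440) in R¹ is given below. The spectral density f(y) = e^{−y²/2} is an arbitrary illustrative choice, and the synthesized field is complex-valued (no Hermitian symmetry of ζ is imposed), which is enough to check that the sample correlation reproduces R(x) = ∫ e^{ixy} f(y) dy.

import numpy as np

rng = np.random.default_rng(3)
y = np.linspace(-8, 8, 801)            # frequency grid
dy = y[1] - y[0]
f = np.exp(-y**2 / 2)                  # spectral density (assumed example)

x = np.linspace(0, 5, 101)
R_true = np.array([np.sum(np.exp(1j * xi * y) * f) * dy for xi in x]).real

trials = 4000
# ζ(dy): independent complex Gaussian weights with variance f(y) dy
zeta = rng.normal(size=(trials, y.size)) + 1j * rng.normal(size=(trials, y.size))
zeta *= np.sqrt(f * dy / 2)
field = zeta @ np.exp(1j * np.outer(y, x))        # ξ(x) = Σ e^{ixy} ζ(dy)

R_emp = np.mean(np.conj(field[:, [0]]) * field, axis=0).real   # ξ*(0) ξ(x) ≈ R(x)
print(np.max(np.abs(R_emp - R_true)))             # small sampling/discretization error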

If the random field is homogeneous, isotropic, and continuous in the L² sense, then

ξ(x) = c_r Σ_{m=0}^{∞} Σ_{j=1}^{h(m,r)} S_{m,j}(θ) ∫_0^∞ [ J_{m+(r−2)/2}(λ|x|) / (λ|x|)^{(r−2)/2} ] ζ_{mj}(dλ).    (8.441)

Here c_r = const, S_{m,j}(θ) is the system of spherical harmonics orthonormalized in L²(S^{r−1}), S^{r−1} is the unit sphere in R^r, θ ∈ S^{r−1}, and

h(m, r) = (2m + r − 2)(m + r − 3)! / [(r − 2)! m!],   r ≥ 2,

is the number of linearly independent spherical harmonics corresponding to the fixed m. For example, if r = 3 then h(m, 3) = 2m + 1. The stochastic orthogonal measures ζ_{mj}(dλ) have the properties

\overline{ζ_{mj}(dλ)} = 0,    (8.442)

\overline{ζ_{mj}(∆₁) ζ*_{pq}(∆₂)} = δ_{mp} δ_{jq} m(∆₁ ∩ ∆₂),    (8.443)

where ∆₁ and ∆₂ are arbitrary Borel sets in the interval (0, ∞), and m(∆) is a finite measure on (0, ∞).

If ξ(x) is a homogeneous random field with correlation function (8.425), then one can apply a differential operator Q(−i∂), ∂ = (∂₁, . . . , ∂_r), ∂_j = ∂/∂x_j, to ξ(x) in the L² sense if and only if

∫ |Q(y)|² F(dy) < ∞.    (8.444)

If condition (8.444) holds, then Q(−i∂)ξ(x) is a homogeneous random field, its correlation function is Q*(−i∂)Q(−i∂)R(x), and the corresponding spectral density is |Q(y)|² f(y), where F(dy) = f(y) dy. By the spectral density of the homogeneous random field with correlation function (8.425) one means the function f(y) defined by F(dy) = f(y) dy in the case when F(dy) is absolutely continuous with respect to Lebesgue measure, so that f(y) ∈ L¹(R^r).


8.4.5 Estimation of parameters

Let ξ be a random variable with the distribution function F(x, θ), which depends on a parameter θ. The problem is to estimate the unknown θ given n sample values of ξ. Let us denote the estimated value of θ by θ̂ = θ̂(x₁, . . . , x_n), where x_j, 1 ≤ j ≤ n, are the observed values of ξ. What are the properties one seeks in an estimate? If ρ(θ, θ̂) is the risk function which measures the distance of the estimate θ̂ from the true value of the parameter θ, then the estimate is good if

\overline{ρ(θ, θ̂)} ≤ \overline{ρ(θ, θ̂₁)},    (8.445)

where θ̂₁ is any other estimate, and

\overline{ρ(θ, θ̂)} = ∫ ρ(θ, θ̂(x₁, . . . , x_n)) dF(x₁, θ) · · · dF(x_n, θ),    (8.446)

where F(x, θ) is the distribution function of ξ for a fixed θ.

A minimax estimate is one for which

sup_θ \overline{ρ(θ, θ̂)} = min,    (8.447)

where θ runs through the set Θ in which it takes values.

A Bayes estimate is one for which

∫_Θ \overline{ρ(θ, θ̂)} dμ(θ) = min,    (8.448)

where μ(θ) is an a priori given distribution function on the set Θ. This means that one prescribes a priori more weight to some values of θ.

An unbiased estimate is one for which

\overline{θ̂} = θ.    (8.449)

An efficient estimate is one for which

\overline{|θ̂ − θ|²} ≤ \overline{|θ̂₁ − θ|²} for any other estimate θ̂₁.    (8.450)

The Cramer-Rao inequality gives a lower bound for the variance of the estimate:

σ²_{θ̂}(θ) := \overline{|θ̂ − θ|²} ≥ 1/(n I(θ)),    (8.451)

where one assumes that the estimate θ̂ is unbiased (see (8.449)),

I(θ) := ∫ |∂ log p(x, θ)/∂θ|² p(x, θ) ν(dx),    (8.452)

and one assumes that dF has a density p(x, θ) with respect to a σ-finite measure ν(dx),

dF = p(x, θ) ν(dx),    (8.453)

and p(x, θ) is differentiable in θ.

A measure ν on a set E in a measure space is called σ-finite if E is a countable union of sets E_j with ν(E_j) < ∞.

In particular, if the measure ν is concentrated on the discrete finite set of points y₁, . . . , y_n, then

I(θ) = Σ_{j=1}^{n} |∂ log p(y_j, θ)/∂θ|² p(y_j, θ).    (8.454)

The quantity I(θ) is called the information quantity. A measure ν is called concentrated on a set A ⊂ E if ν(B) = ν(B ∩ A) for every B ⊂ E, that is, if ν(B) = 0 whenever B ∩ A = ∅.

Sometimes an estimate θ̂ is called efficient if the equality sign holds in (8.451). An estimate θ̂ is called sufficient if dF(x, θ) = p(x, θ) ν(dx) and

p(x₁, θ) · · · p(x_n, θ) = g_θ(θ̂) h(x₁, . . . , x_n),

where g_θ and h are nonnegative functions, h does not depend on θ, and g_θ depends on x₁, . . . , x_n only through θ̂ = θ̂(x₁, . . . , x_n). Suppose that the volume of the sample grows, i.e. n → ∞. An estimate θ̂(x₁, . . . , x_n) := θ̂_n is called consistent if

lim_{n→∞} P(|θ̂_n − θ| > ε) = 0 for every ε > 0.    (8.455)
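A quick numerical illustration of the Cramer-Rao bound (8.451): for the hypothetical model ξ ~ N(θ, σ²) with σ known, the information quantity is I(θ) = 1/σ², so the bound is σ²/n, and the sample mean (an unbiased estimate) attains it. The particular numbers below are arbitrary.

import numpy as np

rng = np.random.default_rng(4)
theta, sigma, n, trials = 2.0, 1.5, 50, 100000

samples = rng.normal(theta, sigma, size=(trials, n))
theta_hat = samples.mean(axis=1)              # θ̂(x_1,...,x_n), the sample mean

var_hat = np.var(theta_hat)                   # variance of the estimate
cr_bound = sigma**2 / n                       # 1/(n I(θ)) with I(θ) = 1/σ²
print(var_hat, cr_bound)                      # nearly equal: the bound is attained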

There are many methods for constructing estimates.

The maximum-likelihood estimate of θ is the estimate obtained from the equations

∂ log L(θ, x₁, . . . , x_n)/∂θ_j = 0,   1 ≤ j ≤ m.    (8.456)

Here θ is a vector parameter, θ = (θ₁, . . . , θ_m); the function L(θ, x₁, . . . , x_n) is called the likelihood function and is defined by

L(θ, x₁, . . . , x_n) := Π_{i=1}^{n} p(x_i, θ),    (8.457)

where p(x, θ) is the density of dF defined by dF = p(x, θ) dx.

Cramer proved that if:

1) ∂^k log p(x, θ)/∂θ^k, k ≤ 3, exists for all θ ∈ Θ and almost all x ∈ R¹;
2) |∂^k p(x, θ)/∂θ^k| ≤ g_k(x), where g_k(x) ∈ L¹(R¹), k = 1, 2, and sup_{θ∈Θ} ∫_{−∞}^{∞} g₃(x) p(x, θ) dx < ∞;
3) I(θ) is positive and finite for every θ ∈ Θ, where I(θ) is defined by (8.452) with ν(dx) = dx;

then equation (8.456) has a solution θ̂(x₁, . . . , x_n) which is a consistent, asymptotically efficient and asymptotically Gaussian estimate of θ.

Here asymptotic efficiency is understood in the sense that inequality (8.451) becomes an equality asymptotically as n → ∞. More precisely, define

eff(θ̂) := [ n I(θ) σ²_{θ̂}(θ) ]^{−1}.    (8.458)

Then the estimate θ̂ is asymptotically efficient if

lim_{n→∞} eff(θ̂_n) = 1.    (8.459)

The estimate θ̂_n is asymptotically Gaussian in the sense that

[n I(θ)]^{1/2} [ θ̂(x₁, . . . , x_n) − θ ] ∼ N(0, 1) as n → ∞,    (8.460)

where N(0, 1) is the Gaussian distribution with zero mean value and variance one, and we assumed for simplicity that θ is a scalar parameter.

We do not discuss other methods for constructing estimates (such as the method of moments, the minimum χ² method, confidence intervals, Bayes estimates, etc.).

8.4.6 Discrimination between hypotheses

One observes n values x₁, . . . , x_n of a random quantity ξ, and assumes that there are two hypotheses, H₀ and H₁, about ξ. If H₀ holds, then the probability density of the observed values is f_n(x₁, . . . , x_n | H₀); otherwise it is f_n(x₁, . . . , x_n | H₁). Given the observed values x₁, . . . , x_n one has to decide whether H₀ or H₁ occurred. Let γ_i, i = 0, 1, denote the decision that H_i occurred. The decision γ₀ is taken if (x₁, . . . , x_n) ∈ D₀, where D₀ is a certain domain in R^n. The choice of such a domain is the choice of the decision rule. If (x₁, . . . , x_n) ∉ D₀, then the decision γ₁ is taken. The error α₁₀ of the first kind is defined as

α₁₀ = P(γ₁ | H₀) = ∫_{D₁} f(x | H₀) dx,   x = (x₁, . . . , x_n),    (8.461)

where D₁ = R^n \ D₀. Thus α₁₀ is the probability of deciding that H₁ occurred when in fact H₀ occurred. The error of the second kind is

α₀₁ = P(γ₀ | H₁) = ∫_{D₀} f(x | H₁) dx,   x = (x₁, . . . , x_n).    (8.462)

The conditional probabilities of taking the right decisions are

P(γ₀ | H₀) = 1 − α₁₀,   P(γ₁ | H₁) = 1 − α₀₁.    (8.463)

One cannot decrease both α₁₀ and α₀₁ without limit: if α₁₀ decreases, then D₁ decreases, therefore D₀ increases and α₀₁ increases. The problem is to choose a decision rule that is optimal in some sense.

Let us describe some approaches to this problem. The Neyman-Pearson approach gives the decision rule which minimizes α₀₁ under the condition that α₁₀ ≤ α, where α is a fixed confidence level.

Let us define the likelihood ratio

ℓ(x) := f(x | H₁) / f(x | H₀),   x = (x₁, . . . , x_n),    (8.464)

and the threshold c > 0, which is given by the equation

P( ℓ(x) ≥ c | H₀ ) = α.    (8.465)

The Neyman-Pearson decision rule is:

if ℓ(x) < c then H₀ occurred, otherwise H₁ occurred.    (8.466)

If φ(t) is a monotone increasing function, then ℓ(x) < c if and only if φ(ℓ(x)) < φ(c). In particular, the rule

if log ℓ(x) < log c then H₀ occurred, otherwise H₁ occurred    (8.467)

is equivalent to (8.466).
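A small sketch of the Neyman-Pearson rule (8.464)-(8.466) for a hypothetical example: x₁, . . . , x_n i.i.d., H₀ : N(0, 1) versus H₁ : N(1, 1). Here log ℓ(x) = Σ x_i − n/2 is Gaussian under either hypothesis, so the threshold in (8.465) is explicit; the sample sizes below are arbitrary.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
n, alpha, trials = 25, 0.05, 200000

def log_lr(x):
    return x.sum(axis=-1) - n / 2.0

# Under H0: log ℓ ~ N(-n/2, n); choose log c so that P(log ℓ >= log c | H0) = alpha
log_c = -n / 2.0 + np.sqrt(n) * norm.ppf(1 - alpha)

x0 = rng.normal(0.0, 1.0, size=(trials, n))      # data generated under H0
x1 = rng.normal(1.0, 1.0, size=(trials, n))      # data generated under H1

alpha10 = np.mean(log_lr(x0) >= log_c)           # first-kind error, ≈ 0.05
alpha01 = np.mean(log_lr(x1) < log_c)            # second-kind error of this rule
print(alpha10, alpha01)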

Assume that the a priori probability p₀ of H₀ is known, so that the a priori probability of H₁ is 1 − p₀. Then the maximum a posteriori probability decision rule is:

if ℓ(x) < p₀/(1 − p₀) then H₀ occurred, otherwise H₁ occurred.    (8.468)

If no a priori information about p₀ is known, then one can use the maximum likelihood decision rule:

if ℓ(x) < 1 then H₀ occurred, otherwise H₁ occurred.    (8.469)

All these rules are threshold rules with various thresholds; other decision rules are discussed in the literature.

8.4.7 Generalized random fields

A generalized random field ξ(x), x ∈ Rr is defined as follows. Suppose that

φ1(x), . . . , φm(x) is a set of C∞0 (Rr) functions, and to each such set there

corresponds a random vector ξ(φ1), . . . , ξ(φm), such that the distribution

functions for all such vectors for all m and all choices of φ1, . . . , φm are

consistent. Then one says that a generalized random field ξ is defined.

The theories of generalized random functions of one and several variables are similar. A linear combination Σ_{j=1}^{N} c_j ξ_j(x) of generalized random functions is defined by the formula

( Σ_{j=1}^{N} c_j ξ_j )(φ) = Σ_{j=1}^{N} c_j ξ_j(φ).

Similarly, if ψ ∈ C^∞(R^r), then ψξ(φ) := ξ(ψφ) and ξ(x + h)(φ) := ξ(φ(x − h)).

If \overline{ξ(φ)} := m(φ) is a continuous linear functional on C^∞₀, then m := \overline{ξ} is called the mean value of ξ,

m(φ) = ∫ x dF(x),  where F(x) = P( ξ(φ) < x ).    (8.470)

Recall that m(φ) is called continuous in C^∞₀ if φ_n(x) → φ(x) implies m(φ_n) → m(φ), where φ_n → φ means that all φ_n and φ vanish outside of a fixed compact set E of R^r and max_{x∈E} |φ_n^{(j)} − φ^{(j)}| → 0 as n → ∞ for all multi-indices j.

The correlation functional of a generalized random field is defined as

B(φ, ψ) = \overline{ξ*(φ) ξ(ψ)}.    (8.471)

The covariance functional is defined as

R(φ, ψ) = \overline{[ξ*(φ) − m*(φ)][ξ(ψ) − m(ψ)]} = B(φ, ψ) − m*(φ) m(ψ).    (8.472)

Both functionals are nonnegative definite:

B(φ, φ) ≥ 0,   R(φ, φ) ≥ 0.

If the random vectors (ξ(φ₁), . . . , ξ(φ_m)) are Gaussian for all m, then ξ is called a generalized random Gaussian field. If B(φ, ψ) and m(φ) are continuous in C^∞₀(R^r) bilinear and, respectively, linear functionals, and R(φ, φ) ≥ 0, then there is a generalized random Gaussian field for which B(φ, ψ) is the correlation functional and m(φ) is the mean value functional.

A generalized random field is called homogeneous (stationary) if the random vectors

( ξ(φ₁(x + h)), . . . , ξ(φ_m(x + h)) ) and ( ξ(φ₁(x)), . . . , ξ(φ_m(x)) )

have the same distribution function for any h ∈ R^r. The mean value functional of a homogeneous generalized random field is

m(φ) = const ∫ φ dx,

and its correlation functional is

B(φ, ψ) = ∫ φ(λ) ψ*(λ) μ(dλ),

where φ(λ) is the Fourier transform of φ(x), and μ(dλ) is a positive measure on R^r which satisfies, for some p ≥ 0, the condition

∫_{R^r} (1 + |λ|²)^{−p} μ(dλ) < ∞.

The measure μ is called the spectral measure of ξ. One can introduce the spectral representation of a generalized random field similar to (8.440). An important example of a Gaussian generalized random field is the Brownian motion.

8.4.8 Kalman filters

Let us start with the basic equation for the optimal Wiener filter for the filtering problem:

σ² h(t, τ) + ∫_{t₀}^{t} h(t, τ′) R_s(τ, τ′) dτ′ = f(τ, t),   t₀ < τ < t,    (8.473)

where U = s + n is the observed signal, s and n are uncorrelated,

\overline{n*(t) n(τ)} = σ² δ(t − τ),   R_s(τ, t) := \overline{s*(τ) s(t)},
\overline{s(t)} = \overline{n(t)} = 0,   f(τ, t) := \overline{U*(τ) s(t)}.    (8.474)

The optimal estimate of s is

ŝ(t) = ∫_{t₀}^{t} h(t, τ) U(τ) dτ.    (8.475)

The error of this estimate is

s̃(t) := s(t) − ŝ(t),    (8.476)

and

D[s̃(t)] = R_s(t, t) − ∫_{t₀}^{t} h*(t, τ) f(τ, t) dτ.    (8.477)

Let us assume that s(t) satisfies the following differential equation:

ṡ(t) = A(t) s + w,    (8.478)

where for simplicity we assume all functions to be scalar, w to be white noise, and

\overline{w} = 0,   \overline{w*(t) w(τ)} = Q δ(t − τ),   Q = const > 0.    (8.479)

One could assume that

U = H s(t) + n,    (8.480)

where n is white noise and H is a linear operator, but the argument would be essentially the same, and, for simplicity, we assume (8.479).

Note that

f(τ, t) = \overline{U*(τ) s(t)} = \overline{[s*(τ) + n*(τ)] s(t)} = R_s(τ, t),    (8.481)

assuming that the noise n(τ) and the signal s(t) are uncorrelated:

\overline{n*(τ) s(t)} = 0.    (8.482)

Also,

R(τ, t) := \overline{U*(τ) U(t)} = R_s(τ, t) + σ² δ(t − τ),    (8.483)

provided that (8.482) holds and

\overline{n*(τ) n(t)} = σ² δ(t − τ),   \overline{n} = 0.    (8.484)

To derive a differential equation for the optimal impulse function, differentiate (8.473) in t, using (8.481):

∂R_s(τ, t)/∂t = h(t, t) R(τ, t) + ∫_{t₀}^{t} [∂h(t, τ′)/∂t] R(τ, τ′) dτ′.    (8.485)

For τ < t equation (8.483) becomes

R(τ, t) = R_s(τ, t),   τ < t.    (8.486)

This and equation (8.473) yield, after multiplication by h(t, t):

h(t, t) R(τ, t) = ∫_{t₀}^{t} h(t, t) h(t, τ′) R(τ, τ′) dτ′.    (8.487)

From (8.478) one obtains

∂_t R_s(τ, t) = A R_s(τ, t),   τ < t,    (8.488)

where one used the equation

\overline{s*(τ) w(t)} = 0 for τ < t.    (8.489)

To derive (8.489), note that equation (8.478) implies

s(t) = ψ(t, t₀) s(t₀) + ∫_{t₀}^{t} ψ(t, τ) w(τ) dτ,    (8.490)

where ψ(t, τ) is the transition function for the operator d/dt − A. Since

\overline{w*(t) w(τ)} = 0 for τ < t,    (8.491)

it follows from (8.490) that (8.489) holds.

From (8.488), (8.481) and (8.473) one has

∂_t R_s(τ, t) = A ∫_{t₀}^{t} h(t, τ′) R(τ, τ′) dτ′.    (8.492)

From (8.485), (8.492) and (8.487) one obtains

0 = ∫_{t₀}^{t} [ −A h(t, τ′) + h(t, t) h(t, τ′) + ∂h(t, τ′)/∂t ] R(τ, τ′) dτ′    (8.493)

for τ < t. Since R is positive definite (see (8.483)), equation (8.493) has only the trivial solution, and one gets

∂h(t, τ)/∂t = A h(t, τ) − h(t, t) h(t, τ),   τ < t.    (8.494)

This is a differential equation for the optimal filter h(t, τ).

Let us find a differential equation for the optimal estimate ŝ(t) defined by (8.475):

dŝ/dt = h(t, t) U(t) + ∫_{t₀}^{t} [∂h(t, τ)/∂t] U(τ) dτ,    (8.495)

where ḟ := df/dt. From (8.494) and (8.495) one gets

dŝ/dt = h(t, t) U(t) + ∫_{t₀}^{t} [A h(t, τ) − h(t, t) h(t, τ)] U(τ) dτ
      = h(t, t) U(t) + A ŝ(t) − h(t, t) ŝ(t)
      = A ŝ(t) + h(t, t) [U(t) − ŝ(t)].    (8.496)

This is the differential equation for ŝ(t). The initial condition is ŝ(t₀) = 0, according to (8.475). Let us express h(t, t) in terms of the variance of the error (8.476). If this is done, then (8.496) can be used for computations.

error (8.373). If this is done then (8.496) can be used for computations.

From (8.473) and (8.481) one obtains:

Rs(τ, t) = σ2h(t, τ ) +

∫ t

t0

Rs(τ, τ′)h(t, τ ′)dτ ′. (8.497)

Put τ = t in (8.497) assume that h(t, τ ) is real-valued and use (8.477) to

get

h(t, t) = σ−2D [s(t)] . (8.498)

Let us finally derive a differential equation for h(t, t). From (8.476), (8.478) and (8.496) it follows that

ds̃/dt = A s + w − A ŝ − h(t, t) [s(t) + n(t) − ŝ(t)]
      = [A − h(t, t)] s̃(t) + w − h(t, t) n(t).    (8.499)

The solution to (8.499) is

s̃(t) = ψ(t, t₀) s̃(t₀) + ∫_{t₀}^{t} ψ(t, τ) [w(τ) − h(τ, τ) n(τ)] dτ.    (8.500)

One obtains, using (8.499), that

ḣ(t, t) = σ^{−2} [ \overline{(ds̃*/dt) s̃(t)} + \overline{s̃*(t) (ds̃/dt)} ]
        = σ^{−2} { 2 Re[A*(t) − h(t, t)] σ² h(t, t) + 2 Re \overline{w*(t) s̃(t)} − 2 h(t, t) Re \overline{n*(t) s̃(t)} }
        = [A*(t) + A(t)] h(t, t) − 2h²(t, t) + Q σ^{−2} + h²(t, t)
        = [A*(t) + A(t)] h(t, t) − h²(t, t) + Q σ^{−2},    (8.501)

where we assumed that w, n and s̃(t₀) are uncorrelated, took into account that h(t, t) > 0 (see (8.498)), and used formula (8.500) to get

\overline{w*(t) s̃(t)} = (1/2) Q ψ(t, t) = Q/2    (8.502)

and

\overline{n*(t) s̃(t)} = −h(t, t) σ²/2.    (8.503)

Note that ψ(t, t) = 1, and the factor 1/2 in (8.502) and (8.503) appeared because we used the formula ∫_{t₀}^{t} δ(t − τ) f dτ = (1/2) f(t).

Equation (8.501) is the Riccati equation for h(t, t). Equations (8.494), (8.496) and (8.501) define Kalman's filter. This filter consists in computing the optimal estimate (8.475) by solving the differential equation (8.496), in which h(t, t) is obtained by solving the Riccati equation (8.501). The initial data for equation (8.501) is

h(t₀, t₀) = σ^{−2} D[s̃(t₀)] = σ^{−2} D[s(t₀)] = σ^{−2} R_s(t₀, t₀).    (8.504)

Here we used equation (8.476) and took into account that ŝ(t₀) = 0.
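A discretized sketch of the scalar filter defined by (8.496) and (8.501) is given below; it assumes A = const and real-valued quantities, uses a simple Euler scheme, and the particular values of A, Q, σ² and of the step size are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(6)
A, Q, sigma2 = -1.0, 0.5, 0.2         # drift, state-noise and observation-noise intensities
dt, T = 1e-3, 5.0
steps = int(T / dt)

s = rng.normal(0.0, 1.0)              # true signal at t0, with R_s(t0,t0) = 1
s_hat = 0.0                           # ŝ(t0) = 0, see (8.475)
h = 1.0 / sigma2                      # h(t0,t0) = σ^{-2} R_s(t0,t0), see (8.504)

err = []
for _ in range(steps):
    w = rng.normal(0.0, np.sqrt(Q / dt))        # white noise w in (8.478)
    n = rng.normal(0.0, np.sqrt(sigma2 / dt))   # observation noise
    s += (A * s + w) * dt                       # (8.478)
    U = s + n                                   # observed signal
    s_hat += (A * s_hat + h * (U - s_hat)) * dt # (8.496)
    h += (2 * A * h - h**2 + Q / sigma2) * dt   # Riccati equation (8.501), A real
    err.append((s - s_hat) ** 2)

print(np.mean(err[steps // 2:]))      # time-averaged squared error ≈ σ² h(∞), cf. (8.498)
# Steady state of (8.501): 2A h - h² + Q/σ² = 0  =>  h(∞) = A + sqrt(A² + Q/σ²)
h_inf = A + np.sqrt(A**2 + Q / sigma2)
print(sigma2 * h_inf)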

The ideas of the derivation of Kalman’s filter are the same for random

vector-functions. In this case A(t) is a matrix. For random fields there is

no similar theory due to the fact that there is no causality in the space

variables in contrast with the time variable.


Appendix A

Analytical Solution of the Basic Integral Equation for a Class of One-Dimensional Problems

In this Appendix we develop the theory for a class of random processes, because this theory is analogous to the estimation theory for random fields, developed in Appendix B.

Let

(∗)   Rh = f,   0 ≤ x ≤ L,   Rh := ∫_0^L R(x, y) h(y) dy,

where the kernel R(x, y) satisfies the equation QR = Pδ(x − y). Here Q and P are formal differential operators of order n and m < n, respectively; n and m are nonnegative even integers, n > 0, m ≥ 0;

Qu := q_n(x) u^{(n)} + Σ_{j=0}^{n−1} q_j(x) u^{(j)},   Ph := h^{(m)} + Σ_{j=0}^{m−1} p_j(x) h^{(j)},   q_n(x) ≥ c > 0;

the coefficients q_j(x) and p_j(x) are smooth functions defined on R; δ(x) is the delta-function; f ∈ H^α(0, L), α := (n − m)/2, where H^α is the Sobolev space.

An algorithm for finding analytically the unique solution h ∈ H^{−α}(0, L) to (∗) of minimal order of singularity is given. Here H^{−α}(0, L) is the dual space to H^α(0, L) with respect to the inner product of L²(0, L).

Under suitable assumptions it is proved that R : H^{−α}(0, L) → H^α(0, L) is an isomorphism.

Equation (∗) is the basic equation of random processes estimation theory. Some of the results are generalized to the case of the multidimensional equation (∗), in which case it is the basic equation of random fields estimation theory. The presentation in Appendix A follows the paper [Ramm (2003)].


A.1 Introduction

In Chapter 2 estimation theory for random fields and processes is constructed. The estimation problem for a random process is as follows. Let u(x) = s(x) + n(x) be a random process observed on the interval (0, L), where s(x) is a useful signal and n(x) is noise. Without loss of generality we assume that \overline{s(x)} = \overline{n(x)} = 0, where the overbar stands for the mean value, \overline{u*(x) u(y)} := R(x, y), R(x, y) = R(y, x), \overline{u*(x) s(y)} := f(x, y), and the star here stands for complex conjugate. The covariance functions R(x, y) and f(x, y) are assumed known. One wants to estimate s(x) optimally in the sense of minimum of the variance of the estimation error. More precisely, one seeks a linear estimate

Lu = ∫_0^L h(x, y) u(y) dy,    (A.1)

such that

\overline{|(Lu)(x) − s(x)|²} = min.    (A.2)

This is a filtering problem. Similarly one can formulate the problem of optimal estimation of (As)(x), where A is a known operator acting on s(x). If A = I, where I is the identity operator, then one has the filtering problem; if A is the differentiation operator, then one has the problem of optimal estimation of the derivative of s; if As = s(x + x₀), then one has an extrapolation problem, etc. The kernel h(x, y) is, in general, a distribution.

As in Chapter 1, one derives a necessary condition for h to satisfy (A.2):

∫_0^L R(x, y) h(y, z) dy = f(x, z),   0 ≤ x, z ≤ L.    (A.3)

Since z enters as a parameter in (A.3), the basic equation of estimation theory is

Rh := ∫_0^L R(x, y) h(y) dy = f(x),   0 ≤ x ≤ L.    (A.4)

The operator in L²(0, L) defined by (A.4) is symmetric. In Chapter 1 it is assumed that the kernel

R(x, y) = ∫_{−∞}^{∞} [P(λ)/Q(λ)] Φ(x, y, λ) dρ(λ),    (A.5)

where P(λ) and Q(λ) are positive polynomials, Φ(x, y, λ) and dρ(λ) are the spectral kernel and, respectively, the spectral measure of a selfadjoint ordinary differential operator ℓ in L²(R), deg Q(λ) = q, deg P(λ) = p < q, p ≥ 0, ord ℓ := σ > 0, x, y ∈ R^r, r ≥ 1, and ℓ is a selfadjoint elliptic operator in L²(R^r).

It is proved in [4] that the operator R : H^{−α}(0, L) → H^α(0, L), α := (q − p)σ/2, is an isomorphism. By H^α(0, L) the Sobolev space W^{α,2}(0, L) is denoted, and H^{−α}(0, L) is the dual space to H^α(0, L) with respect to the L²(0, L) := H⁰(0, L) inner product. Namely, H^{−α}(0, L) is the space of distributions h which are linear bounded functionals on H^α(0, L). The norm of h ∈ H^{−α}(0, L) is given by the formula

‖h‖_{H^{−α}(0,L)} = sup_{g∈H^α(0,L)} |(h, g)| / ‖g‖_{H^α(0,L)},    (A.6)

where (h, g) is the L²(0, L) inner product if h and g belong to L²(0, L). One can also define H^{−α}(0, L) as the subset of the elements of H^{−α}(R) with support in [0, L].

We generalize the class of kernels R(x, y) defined in (A.5): we do not use the spectral theory, we do not assume ℓ to be selfadjoint, and we do not assume that the operators Q and P commute.

We assume that

QR = Pδ(x − y),    (A.7)

where Q and P are formal differential operators of orders n and m respectively, n > m ≥ 0, n and m are even integers, δ(x) is the delta-function,

Qu := Σ_{j=0}^{n} q_j(x) u^{(j)},   q_n(x) ≥ c > 0,   Ph := h^{(m)} + Σ_{j=0}^{m−1} p_j(x) h^{(j)},    (A.8)

and q_j and p_j are smooth functions defined on R. We also assume that the equation Qu = 0 has n/2 linearly independent solutions u_j^− ∈ L²(−∞, 0) and n/2 linearly independent solutions u_j^+ ∈ L²(0, ∞). In particular, this implies that if Qh = 0, h ∈ H^α(R), α > 0, then h = 0, and the same conclusion holds for h ∈ H^β(R) for any fixed real number β, including negative β, because any solution to the equation Qh = 0 is smooth: it is a linear combination of n linearly independent solutions to this equation, each of which is smooth and none of which belongs to L²(R).


Let us assume that R(x, y) is a selfadjoint kernel such that

c₁ ‖φ‖²₋ ≤ (Rφ, φ) ≤ c₂ ‖φ‖²₋,   c₁ = const > 0,   ∀φ ∈ C^∞₀(R),    (A.9)

where (·, ·) is the L²(R) inner product, ‖φ‖₋ := ‖φ‖_{H^{−α}(R)} := ‖φ‖_{−α}, α := (n − m)/2, ‖φ‖_β := ‖φ‖_{H^β(R)}, and we use below the notation ‖φ‖₊ := ‖φ‖_{H^α(0,L)} := ‖φ‖_{H₊}. The spaces H^α(0, L) and H^{−α}(0, L) are dual to each other with respect to the L²(0, L) inner product, as was mentioned above. If φ ∈ H^{−α}(0, L), then φ ∈ H^{−α}(R), and inequality (A.9) holds for such φ. For this reason we also use (for example, in the proof of Theorem A.1 below) the notation H₋ for the space H^{−α}(0, L).

Assumption (A.9) holds, for example, for the equation

Rh = ∫_{−1}^{1} exp(−|x − y|) h(y) dy = f(x),   −1 ≤ x ≤ 1.

Its solution of minimal order of singularity is

h(x) = (−f″ + f)/2 + δ(x + 1)[−f′(−1) + f(−1)]/2 + δ(x − 1)[f′(1) + f(1)]/2.
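A quick numerical sanity check of this example is sketched below: for a hypothetical smooth f (here f(x) = cos x, an arbitrary choice), R is applied to the distributional solution h (regular part plus the two boundary delta terms, which contribute exp(−|x ∓ 1|) factors) and the result is compared with f.

import numpy as np

f   = np.cos
fp  = lambda x: -np.sin(x)     # f'
fpp = lambda x: -np.cos(x)     # f''

h_reg = lambda y: (-fpp(y) + f(y)) / 2.0                 # regular part of h
c_minus = (-fp(-1.0) + f(-1.0)) / 2.0                    # coefficient of δ(x+1)
c_plus  = ( fp( 1.0) + f( 1.0)) / 2.0                    # coefficient of δ(x-1)

y = np.linspace(-1.0, 1.0, 20001)
K = lambda x: np.exp(-np.abs(x - y))

x_test = np.linspace(-1.0, 1.0, 9)
Rh = np.array([np.trapz(K(x) * h_reg(y), y)              # ∫ e^{-|x-y|} h_reg(y) dy
               + c_minus * np.exp(-np.abs(x + 1.0))      # delta-function terms
               + c_plus  * np.exp(-np.abs(x - 1.0))
               for x in x_test])
print(np.max(np.abs(Rh - f(x_test))))                    # small quadrature error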

One can see that the singular part of the solution is a distribution supported at the boundary of the domain. Assumption (A.9) holds if the following inequalities (A.10) and (A.11) hold:

c₃ ‖φ‖_{−α+n} ≤ ‖Q*φ‖_{−α} ≤ c₄ ‖φ‖_{−α+n},   c₃, c₄ = const > 0,   ∀φ ∈ C^∞₀(R),    (A.10)

c₅ ‖φ‖²_{(n+m)/2} ≤ (PQ*φ, φ) ≤ c₆ ‖φ‖²_{(n+m)/2},   ∀φ ∈ C^∞₀(R),    (A.11)

where Q* is the differential expression formally adjoint to Q, and c₅ and c₆ are positive constants independent of φ ∈ C^∞₀(R). The right inequality in (A.11) is obvious because ord PQ* = n + m, and the right inequality in (A.10) is obvious because ord Q* = n.

Let us formulate our basic results.

Theorem A.1  If (A.9) holds, then the operator R, defined in (A.5), is an isomorphism of H^{−α}(0, L) onto H^α(0, L), α = (n − m)/2.

Theorem A.2  If (A.7), (A.10) and (A.11) hold, then (A.9) holds and R : H^{−α}(0, L) → H^α(0, L) is an isomorphism.

Theorem A.3  If (A.7), (A.10) and (A.11) hold, and f ∈ H^n(0, L), then the solution to (A.4) in H^{−α}(0, L) does exist, is unique, and can be calculated analytically by the following formula:

h = ∫_0^x G(x, y) Qf dy + Σ_{j=0}^{n−α−1} [ a_j^− (−1)^j G_y^{(j)}(x, 0) + a_j^+ (−1)^j G_y^{(j)}(x, L) ],    (A.12)

where the a_j^± are some constants and G(x, y) is the unique solution to the problem

PG = δ(x − y),   G(x, y) = 0 for x < y.    (A.13)

The constants a_j^± are uniquely determined from the condition h(x) = 0 for x > L.

Remark A.1  The solution h ∈ H^{−α}(0, L) is the solution to equation (A.4) of minimal order of singularity.

Remark A.2  If P = 1 in (A.7), then the solution h to (A.4) of minimal order of singularity, h ∈ H^{−n/2}(0, L), can be calculated by the formula h = QF, where F is given by (A.22) (see below), and u₊ and u₋ are the unique solutions of the problems Qu₊ = 0 for x > L, u₊^{(j)}(L) = f^{(j)}(L), 0 ≤ j ≤ n/2 − 1, u₊(∞) = 0, and Qu₋ = 0 for x < 0, u₋^{(j)}(0) = f^{(j)}(0), 0 ≤ j ≤ n/2 − 1, u₋(−∞) = 0.

A.2 Proofs

Proof of Theorem A.1.  The set C^∞₀(0, L) is dense in H^{−α}(0, L) (in the norm of H^{−α}(R)). Using the right inequality in (A.9), one gets

‖R‖_{H₋→H₊} = sup_{h∈H₋} (Rh, h)/‖h‖²₋ ≤ c₂,    (A.14)

by the symmetry of R in L²(0, L). This implies ‖R‖_{H₋→H₊} ≤ c₂. Using the left inequality in (A.9), one gets c₁ ‖h‖²₋ ≤ ‖Rh‖₊ ‖h‖₋, so

c₁ ‖h‖₋ ≤ ‖Rh‖₊.    (A.15)

Therefore

‖R^{−1}‖_{H₊→H₋} ≤ 1/c₁.    (A.16)

Consequently, the range Ran(R) of R is a closed subspace of H₊. In fact, Ran(R) = H₊. Indeed, if Ran(R) ≠ H₊, then there exists a g ∈ H₋ such that 0 = (Rψ, g) for all ψ ∈ H₋. Taking ψ = g and using the left inequality in (A.9), one gets ‖g‖₋ = 0, so g = 0. Thus Ran(R) = H₊.

Theorem A.1 is proved.

Proof of Theorem A.2.  From (A.7) and (A.8) it follows that the kernel R(x, y) defines a pseudodifferential operator of order −2α = m − n. In particular, this implies the right inequality in (A.9). In this argument inequalities (A.10) and (A.11) were not used.

Let us prove that (A.10) and (A.11) imply the left inequality in (A.9). One has

‖Q*φ‖_{−α} ≤ C ‖φ‖_{n−α},   ∀φ ∈ C^∞₀(R),    (A.17)

because ord Q* = n. Inequality (A.10) reads

c₃ ‖φ‖_{−α+n} ≤ ‖Q*φ‖_{−α} ≤ c₄ ‖φ‖_{−α+n},   ∀φ ∈ C^∞₀(R),    (A.18)

where c₃ and c₄ are positive constants. If (A.18) holds, then Q* : H^{−α+n}(R) → H^{−α}(R) is an isomorphism of H^{−α+n}(R) onto H^{−α}(R) provided that N(Q) := {w : Qw = 0, w ∈ H^α(R)} = {0}. Indeed, if the range of Q* is not all of H^{−α}(R), then there exists a w ≠ 0, w ∈ H^α(R), such that (Q*φ, w) = 0 for all φ ∈ C^∞₀(R), so Qw = 0. If Qw = 0 and w ∈ H^α(R), then, as was mentioned below formula (A.8), it follows that w = 0. This proves that Ran(Q*) = H^{−α}(R).

Inequality (A.11) is necessary for the left inequality in (A.9) to hold. Indeed, let ψ = Q*φ, φ ∈ C^∞₀(R); then (A.9) implies

c₅ ‖φ‖²_{−α+n} ≤ c ‖Q*φ‖²_{−α} ≤ (RQ*φ, Q*φ) = (QRQ*φ, φ) = (PQ*φ, φ),    (A.19)

where c > 0 here (and elsewhere in this Appendix) stands for various estimation constants. Because −α + n = (n + m)/2, inequality (A.19) is the left inequality in (A.11). The right inequality in (A.11) is obvious because the order of the operator PQ* equals n + m.

Let us prove now that inequalities (A.11) and (A.10) are sufficient for the left inequality in (A.9) to hold.

Using the right inequality in (A.10) and the left inequality in (A.11), one gets

c ‖ψ‖²_{−α} ≤ c₅ ‖φ‖²_{(n+m)/2} ≤ (PQ*φ, φ) = (Rψ, ψ),   ψ = Q*φ,   ∀φ ∈ C^∞₀(R).    (A.20)

Let us prove that the set {ψ = Q*φ : φ ∈ C^∞₀(R)} is dense in H^{−α}(0, L). Assume the contrary. Then there is an h ∈ H^{−α}(0, L), h ≠ 0, such that (Q*φ, h) = 0 for all φ ∈ C^∞₀(R). Thus (φ, Qh) = 0 for all φ ∈ C^∞₀(R). Therefore Qh = 0, and, by the argument given below formula (A.8), it follows that h = 0. This contradiction proves that the set {Q*φ : φ ∈ C^∞₀(R)} is dense in H^{−α}(0, L).

Consequently, (A.20) implies the left inequality in (A.9). The right inequality in (A.9) is an immediate consequence of the observation made earlier: (A.7) and (A.8) imply that R is a pseudodifferential operator of order −2α = −(n − m).

Theorem A.2 is proved.

Proof of Theorem A.3.  Equations (A.4) and (A.7) imply

Ph = g := QF.    (A.21)

Here

F := { u₋, x < 0;  f, 0 ≤ x ≤ L;  u₊, x > L },    (A.22)

where

Qu₋ = 0, x < 0,    (A.23)
Qu₊ = 0, x > L,    (A.24)

and u₋ and u₊ are chosen so that F ∈ H^α(R). This choice is equivalent to the conditions

u₋^{(j)}(0) = f^{(j)}(0),   0 ≤ j ≤ α − 1,    (A.25)
u₊^{(j)}(L) = f^{(j)}(L),   0 ≤ j ≤ α − 1.    (A.26)

If F ∈ H^α(R), then g := QF ∈ H^{α−n}(R) = H^{−(n+m)/2}(R), and, by (A.22), one gets

g = Qf + Σ_{j=0}^{n−α−1} [ a_j^− δ^{(j)}(x) + a_j^+ δ^{(j)}(x − L) ],    (A.27)

where the a_j^± are some constants. There are n − α = (n + m)/2 constants a_j^+ and the same number of constants a_j^−.

Let G(x, y) be the fundamental solution of the equation

PG = δ(x − y) in R,    (A.28)

which vanishes for x < y:

G(x, y) = 0 for x < y.    (A.29)

Claim. Such a G(x, y) exists and is unique. It solves the following Cauchy problem:

PG = 0, x > y;   G_x^{(j)}(x, y)|_{x=y+0} = δ_{j,m−1},   0 ≤ j ≤ m − 1,    (A.30)

satisfies condition (A.29), and can be written as

G(x, y) = Σ_{j=1}^{m} c_j(y) φ_j(x),   x > y,    (A.31)

where φ_j(x), 1 ≤ j ≤ m, is a linearly independent system of solutions to the equation

Pφ = 0.    (A.32)

Proof of the claim. The coefficients c_j(y) are defined by conditions (A.30):

Σ_{j=1}^{m} c_j(y) φ_j^{(k)}(y) = δ_{k,m−1},   0 ≤ k ≤ m − 1.    (A.33)

The determinant of the linear system (A.33) is the Wronskian W(φ₁, . . . , φ_m) ≠ 0, so the c_j(y) are uniquely determined from (A.33).

The fact that the solution to (A.30) which satisfies (A.29) equals the solution to (A.28)–(A.29) follows from the uniqueness of the solutions to (A.28)–(A.29) and to (A.30), (A.29), and from the observation that the solution to (A.28)–(A.29) solves (A.30), (A.29). The uniqueness of the solution to (A.30), (A.29) is a well-known result.

Let us prove uniqueness of the solution to (A.28)–(A.29). If there were two solutions, G₁ and G₂, to (A.28)–(A.29), then their difference G := G₁ − G₂ would solve the problem

PG = 0 in R,   G = 0 for x < y.    (A.34)

By the uniqueness of the solution to the Cauchy problem, it follows that G ≡ 0. Note that this conclusion holds in the space of distributions as well, because equation (A.34) has only classical solutions, as follows from the ellipticity of P. Thus the claim is proved.

From (A.21) and (A.27)–(A.29) one gets

h = ∫_0^x G(x, y) Qf dy + ∫_0^x G(x, y) Σ_{j=0}^{n−α−1} [ a_j^− δ^{(j)}(y) + a_j^+ δ^{(j)}(y − L) ] dy
  = ∫_0^x G(x, y) Qf dy + Σ_{j=0}^{n−α−1} (−1)^j [ G_y^{(j)}(x, y)|_{y=0} a_j^− + G_y^{(j)}(x, y)|_{y=L} a_j^+ ]
  := ∫_0^x G(x, y) Qf dy + H(x).    (A.35)

It follows from (A.35) that h ∈ H^{−α}(R) and

h = 0 for x < 0,    (A.36)

that is, (h, φ) = 0 for all φ ∈ C^∞₀(R) such that supp φ ⊂ (−∞, 0). In order to guarantee that h ∈ H^{−α}(0, L) one has to satisfy the condition

h = 0 for x > L.    (A.37)

Conditions (A.36) and (A.37) together are equivalent to supp h ⊂ [0, L]. Note that although Qf ∈ H^{−(n+m)/2}(0, L), so that Qf is a distribution, the integral ∫_0^x G(x, y) Qf dy = ∫_{−∞}^{∞} G(x, y) Qf dy is well defined as the unique solution to the problem Pw = Qf, w = 0 for x < 0.

Let us prove that conditions (A.36) and (A.37) determine the constants a_j^±, 0 ≤ j ≤ (n + m)/2 − 1, uniquely.

If this is proved, then Theorem A.3 is proved, and formula (A.35) gives an analytical solution to equation (A.4) in H^{−α}(0, L), provided that an algorithm for finding the a_j^± is given. Indeed, an algorithm for finding G(x, y) consists of solving (A.29)–(A.30). Solving (A.29)–(A.30) is accomplished analytically by solving the linear algebraic system (A.33) and then using formula (A.31). We assume that m linearly independent solutions φ_j(x) to (A.32) are known.

Let us derive an algorithm for calculating the constants a_j^±, 0 ≤ j ≤ (n + m)/2 − 1, from conditions (A.36)–(A.37).

Because of (A.29), condition (A.36) is satisfied automatically by the h defined in (A.35).

To satisfy (A.37) it is necessary and sufficient to have

∫_0^L G(x, y) Qf dy + H(x) ≡ 0 for x > L.    (A.38)

By (A.31), and because the system {φ_j}_{1≤j≤m} is linearly independent, equation (A.38) is equivalent to the following set of equations:

∫_0^L c_k(y) Qf dy + Σ_{j=0}^{(n+m)/2−1} (−1)^j [ c_k^{(j)}(0) a_j^− + c_k^{(j)}(L) a_j^+ ] = 0,   1 ≤ k ≤ m.    (A.39)

Let us check that there are exactly m independent constants a_j^± and that all the constants a_j^± are uniquely determined by the linear system (A.39).

If there are m independent constants a_j^± and the other constants can be linearly represented through these, then the linear algebraic system (A.39) is uniquely solvable for these constants provided that the corresponding homogeneous system has only the trivial solution. If f = 0, then h = 0, as follows from Theorem A.1, and g = 0 in (A.27). Therefore a_j^± = 0 for all j, and system (A.39) determines the constants a_j^± uniquely.

Finally, let us prove that there are exactly m independent constants a_j^±. Indeed, in formula (A.21) there are n/2 linearly independent solutions u_j^− ∈ L²(−∞, 0), so

u₋ = Σ_{j=1}^{n/2} b_j^− u_j^−,    (A.40)

and, similarly, u₊ in (A.21) is of the form

u₊ = Σ_{j=1}^{n/2} b_j^+ u_j^+,    (A.41)

where u_j^+ ∈ L²(0, ∞). The condition F ∈ H^α(R) implies

Σ_{j=1}^{n/2} b_j^− (u_j^−)^{(k)} = f^{(k)} at x = 0,   0 ≤ k ≤ α − 1 = (n − m)/2 − 1,    (A.42)

and

Σ_{j=1}^{n/2} b_j^+ (u_j^+)^{(k)} = f^{(k)} at x = L,   0 ≤ k ≤ (n − m)/2 − 1.    (A.43)

Equations (A.42) and (A.43) imply that there are n/2 − (n − m)/2 = m/2 independent constants b_j^− and m/2 independent constants b_j^+, and the remaining n − m constants b_j^− and b_j^+ can be represented through these m constants by solving the linear systems (A.42) and (A.43) with respect to, say, the first (n − m)/2 constants — for example, for system (A.42), with respect to the constants b_j^−, 1 ≤ j ≤ (n − m)/2. This can be done uniquely because the matrices of the linear systems (A.42) and (A.43) are nonsingular: they are Wronskians of the linearly independent solutions {u_j^−}_{1≤j≤(n−m)/2} and {u_j^+}_{1≤j≤(n−m)/2}.

The constants a_j^± can be expressed in terms of the b_j^± and f by linear relations. Thus, there are exactly m independent constants a_j^±. This completes the proof of Theorem A.3.

Remark A.3  In Chapter 5 a theory of singular perturbations for equations of the form

εh_ε + Rh_ε = f    (A.44)

is developed for a class of integral operators with convolution kernels R(x, y) = R(x − y). This theory can be generalized to the class of kernels R(x, y) studied here. The basic interesting problem is: for any ε > 0 equation (A.44) has a unique solution h_ε ∈ L²(0, L); how can one find the asymptotic behavior of h_ε as ε → 0? The limit h of h_ε as ε → 0 should solve the equation Rh = f, and, in general, h is a distribution, h ∈ H^{−α}(0, L). The theory presented in Chapter 5 allows one to solve the above problem for the class of kernels studied here.

Remark A.4 Theorems A.1 and A.2 and their proofs remain valid in

the case when equation (A.4) is replaced by the

Rh :=

D

R(x, y)h(y)dy = f, x ∈ D. (A.45)


Here D ⊂ R^r, r > 1, is a bounded domain with a smooth boundary S, D̄ is the closure of D, and R(x,y) solves (A.7), where P and Q are uniformly elliptic differential operators with smooth coefficients, ord P = m ≥ 0, ord Q = n > m, and the equation Qh = 0 has only the trivial solution in H^β(R^r) for any fixed real number β. Under the above assumptions one can prove that the operator defined by the kernel R(x,y) is an elliptic pseudodifferential operator of order −2α, where α := (n−m)/2. We do not assume that P and/or Q are selfadjoint or that P and Q commute. An analog of Remark 2.1 holds for the multidimensional equation (A.44) as well. Equation (A.45) is the basic integral equation of random fields estimation theory.


Appendix B

Integral Operators Basic in Random Fields Estimation Theory

B.1 Introduction

The theory of integral equations has been well developed since the beginning of the last century. Of special interest are the classes of integral equations which can be solved in closed form or reduced to boundary-value problems for differential equations. There are relatively few such classes of integral equations. They include equations with convolution kernels whose domain of integration is the whole space; these equations can be solved by applying the Fourier transform. Another class of integral equations solvable in closed form consists of the Wiener-Hopf equations. Yet another class consists of one-dimensional equations with special kernels (singular integral equations which are reducible to Riemann-Hilbert problems for analytic functions, equations with logarithmic kernels, etc.); see, e.g., [Zabreiko et. al. (1968)], [Gakhov (1966)]. In Chapter 5 a new class of multidimensional integral equations is introduced. Equations of this class are solvable in closed form or reducible to a boundary-value problem for elliptic equations. This class consists of equations (B.3) (see below), whose kernels R(x,y) are kernels of positive rational functions of an arbitrary selfadjoint elliptic operator in L²(R^n), where n ≥ 1. In Appendix A this theory is generalized to the class of kernels R(x,y) which solve the problem QR = Pδ(x−y), where δ(x) is the delta-function, Q and P are elliptic differential operators, and x ∈ R¹. Ellipticity in this case means that the coefficient in front of the senior derivative does not vanish. In Appendix A integral equations (B.3) with kernels of the above class are solved in closed form by reducing them to a boundary-value problem for an ODE. Our aim is to generalize the approach proposed in Appendix A to the multidimensional equations (B.3) whose kernel solves the equation QR = Pδ(x−y) in R^n, where n > 1. This is


not only of theoretical interest, but also of great practical interest, because, as shown in Chapter 1, equations (B.3) are the basic equations of random fields estimation theory. Thus, solving such equations for a larger class of kernels amounts to solving estimation problems for a larger class of random fields. The kernel R(x,y) is the covariance function of a random field. The class of kernels R which solve the equation QR = Pδ(x−y) in R^n contains the class of kernels introduced and studied in Chapters 1-4.

Our theory is not only basic in random fields estimation theory, but can also be considered as a contribution to the general theory of integral equations. Any new class of integral equations which can be solved analytically or reduced to some boundary-value problem is certainly of interest and can potentially be used in many applied areas.

For the convenience of the reader, the notation and auxiliary material are put in Section B.4. This Appendix follows closely the paper [Kozhevnikov and Ramm (2005)].

Let P be a differential operator in R^n of order μ,
\[
P := P(x,D) := \sum_{|α| \le μ} a_α(x)\,D^α,
\]
where a_α(x) ∈ C^∞(R^n). The polynomials
\[
p(x,ξ) := \sum_{|α| \le μ} a_α(x)\,ξ^α \qquad \text{and} \qquad p_0(x,ξ) := \sum_{|α| = μ} a_α(x)\,ξ^α
\]
are called, respectively, the symbol and the principal symbol of P.

Suppose that the symbol p(x,ξ) belongs to the class SG^{(μ,0)}(R^n) consisting of all C^∞ functions p(x,ξ) on R^n × R^n such that for any multiindices α, β there exists a constant C_{α,β} such that
\[
\bigl| D_x^α D_ξ^β\, p(x,ξ) \bigr| \le C_{α,β}\, ⟨ξ⟩^{μ-|β|}\, ⟨x⟩^{-|α|}, \qquad x, ξ ∈ R^n, \quad ⟨ξ⟩ := (1+|ξ|^2)^{1/2}. \tag{B.1}
\]
It is known (cf. [Wloka et. al. (1995), Prop. 7.2]) that the map P(x,D): S(R^n) → S(R^n) is continuous, where S(R^n) is the Schwartz space of smooth rapidly decaying functions. Let H^s(R^n) (s ∈ R) be the usual Sobolev space. It is known that the operator P(x,D) acts naturally on the Sobolev spaces, that is, P(x,D) is a bounded operator H^s(R^n) → H^{s-μ}(R^n) for all s ∈ R (cf. [Wloka et. al. (1995), Sec. 7.6]).

The operator P(x,D) is called elliptic if p_0(x,ξ) ≠ 0 for any x ∈ R^n, ξ ∈ R^n \ {0}.


Let P(x,D) and Q(x,D) both be elliptic differential operators of even orders μ and ν respectively, 0 ≤ μ < ν, with symbols satisfying (B.1) (for Q(x,D) we replace p and μ in (B.1) by q and ν, respectively). The case μ ≥ ν is simpler: it leads to an elliptic operator perturbed by a compact integral operator in a bounded domain.

We assume also that P(x,D) and Q(x,D) are invertible operators, that is, there exist bounded inverse operators P^{-1}(x,D): H^{s-μ}(R^n) → H^s(R^n) and Q^{-1}(x,D): H^{s-ν}(R^n) → H^s(R^n) for all s ∈ R.

Let R := Q^{-1}(x,D) P(x,D). The invertibility of P(x,D) and Q(x,D) implies that R is an invertible pseudodifferential operator of negative order μ − ν acting from H^s(R^n) onto H^{s+ν-μ}(R^n) (s ∈ R).

Since P and Q are elliptic, their orders μ and ν are even for n > 2. If n = 2, we assume that μ and ν are even numbers. Therefore, the number a := (ν − μ)/2 > 0 is an integer.
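For constant-coefficient operators, R = Q^{-1}P acts simply as the Fourier multiplier p(ξ)/q(ξ). The sketch below is an illustration only (the periodic 1-D grid and the choice P = 1 − d²/dx², μ = 2, Q = (1 − d²/dx²)², ν = 4, are assumptions, not taken from the text); it checks numerically that R has order μ − ν = −2, i.e. it gains two derivatives.

```python
import numpy as np

# Constant-coefficient model: p(xi) = 1 + xi^2, q(xi) = (1 + xi^2)^2,
# so R = Q^{-1}P is the Fourier multiplier p/q = (1 + xi^2)^{-1}, of order mu - nu = -2.
N = 2 ** 10
x = 2 * np.pi * np.arange(N) / N
xi = np.fft.fftfreq(N, d=2 * np.pi / N) * 2 * np.pi   # integer frequencies
p, q = 1 + xi ** 2, (1 + xi ** 2) ** 2

u = np.sign(np.sin(x))                    # a rough (discontinuous) test function
Ru = np.fft.ifft((p / q) * np.fft.fft(u)).real

# R smooths: each Fourier coefficient of Ru is smaller than that of u by the factor 1 + xi^2.
k = 101
print(abs(np.fft.fft(u)[k]) / abs(np.fft.fft(Ru)[k]), 1 + xi[k] ** 2)
```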

Let Ω denote a bounded connected open set in R^n with a smooth boundary ∂Ω (a C^∞-class surface) and Ω̄ its closure, Ω̄ = Ω ∪ ∂Ω. The smoothness restriction on the domain can be weakened, but we do not go into detail.

The restriction R_Ω of the operator R to the domain Ω ⊂ R^n is defined as
\[
R_Ω := r_Ω R e_{Ω^-}, \tag{B.2}
\]
where e_{Ω^-} is the extension by zero to Ω^- := R^n \ Ω̄ and r_Ω is the restriction to Ω.

It is known (cf. [Grubb (1990), Th. 3.11, p. 312]) that the operator R_Ω defines a continuous mapping
\[
R_Ω: H^s(Ω) → H^{s+ν-μ}(Ω) \qquad (s > -1/2),
\]
where H^s(Ω) is the space of restrictions of elements of H^s(R^n) to Ω with the usual infimum norm (see Section B.4).

The pseudodifferential operator R of negative order μ − ν and its restriction R_Ω can be represented as integral operators with kernel R(x,y):
\[
Rh = \int_{R^n} R(x,y)\,h(y)\,dy, \qquad R_Ω h = \int_Ω R(x,y)\,h(y)\,dy \quad (x ∈ Ω),
\]
where R(x,y) ∈ C^∞(R^n × R^n \ Diag), Diag being the diagonal in R^n × R^n. Moreover, R(x,y) has a weak singularity:
\[
|R(x,y)| \le C\,|x-y|^{-σ}, \qquad n+μ-ν \le σ < n.
\]


For n + μ − ν < 0, R(x,y) is continuous. Let γ := n + μ − ν and r_{xy} := |x−y| → 0. Then R(x,y) = O(r_{xy}^{-γ}) if n is odd or if n is even and ν < n, and R(x,y) = O(r_{xy}^{-γ} log r_{xy}) if n is even and ν > n.
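A concrete illustration (not part of the original text): take n = 3, P = I (μ = 0) and Q = −Δ + a² (ν = 2), for which R(x,y) = e^{−a|x−y|}/(4π|x−y|) is the classical fundamental solution. Then γ = n + μ − ν = 1 and indeed R = O(r^{−1}). The short symbolic check below verifies that this kernel satisfies QR = 0 away from the diagonal and has the stated singularity order.

```python
import sympy as sp

r, a = sp.symbols('r a', positive=True)
R = sp.exp(-a * r) / (4 * sp.pi * r)      # kernel of (-Delta + a^2)^{-1} in R^3, r = |x - y|

# Radial Laplacian in R^3: Delta u = u'' + (2/r) u'
laplace_R = sp.diff(R, r, 2) + (2 / r) * sp.diff(R, r)
print(sp.simplify(-laplace_R + a**2 * R))  # 0 for r > 0, i.e. QR = 0 off the diagonal

# Singularity order: r * R(r) stays bounded as r -> 0, consistent with R = O(r^{-gamma}), gamma = 1
print(sp.limit(r * R, r, 0))               # 1/(4*pi)
```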

In Chapter 1, the equation
\[
R_Ω h = f ∈ H^a(Ω), \qquad h ∈ H^{-a}_0(Ω), \qquad a = \frac{ν-μ}{2}, \tag{B.3}
\]
is derived as a necessary and sufficient condition for the optimal estimate of random fields by the criterion of minimum of the variance of the error of the estimate. The kernel R(x,y) is a known covariance function, and h(x,y) is the distributional kernel of the operator of the optimal filter. The kernel h(x,y) should be of minimal order of singularity, because only in this case does this kernel solve the estimation problem: the variance of the error of the estimate is infinite for solutions to equation (B.3) which do not have minimal order of singularity. In Chapters 1-4, equation (B.3) was studied under the assumption that P and Q are polynomial functions of a selfadjoint elliptic operator defined in the whole space. In Appendix A some generalizations of this theory are given. In particular, the operators P and Q are not necessarily selfadjoint and commuting.

In this Appendix an extension to multidimensional integral equations of some results from Appendix A is given.

We want to prove that, under some natural assumptions, the operator R_Ω is an isomorphism of the space H^{-a}_0(Ω) onto H^a(Ω), where a = (ν − μ)/2 > 0, and H^s_0(Ω), s ∈ R, denotes the subspace of H^s(R^n) that consists of the elements supported in Ω̄.

To prove the isomorphism property, we reduce the integral equation (B.3) to an equivalent elliptic exterior boundary-value problem. Since we look for a solution u belonging to the space H^a(Ω^-) = H^{(ν-μ)/2}(Ω^-), and the differential operator Q is of order ν, Qu should belong to some Sobolev space of negative order. This means that we need results on the solvability of equation (B.3) in Sobolev spaces of negative order. Such spaces, as well as the solvability in them of elliptic differential boundary value problems in bounded domains, have been investigated in [Roitberg (1996)] and later in [Kozlov et. al. (1997)]. The case of pseudodifferential boundary value problems has been studied in [Kozhevnikov (2001)]. In [Erkip and Schrohe (1992)] and in [Schrohe (1999)] the solvability of elliptic differential and pseudodifferential boundary value problems on unbounded manifolds, and in particular in exterior domains, has been established.


These solvability results have been obtained in weighted Sobolev spaces of positive order s. To obtain the isomorphism property, we need similar solvability results for the exterior domain in weighted Sobolev spaces of negative order. The definition of these spaces can be found in Section B.4 (cf. [Roitberg (1996)]).

B.2 Reduction of the basic integral equation to a boundary-value problem

In Theorem B.1 the differentiation D_n^j along the normal to the boundary is used. This operator is defined in Section B.4.

Theorem B.1 Integral equation (B.3) is equivalent to the following system (B.4), (B.5), (B.6):
\[
Qu = 0 \ \text{in } Ω^-, \qquad D_n^j u = D_n^j f \ \text{on } ∂Ω, \ \ 0 \le j \le a-1, \tag{B.4}
\]
\[
Ph = QF, \qquad h ∈ H^{-a}_0(Ω), \tag{B.5}
\]
where u ∈ H^a(Ω^-) is an extension of f:
\[
F ∈ H^a(R^n), \qquad F := \begin{cases} f ∈ H^a(Ω) & \text{in } Ω,\\ u ∈ H^a(Ω^-) & \text{in } Ω^-. \end{cases} \tag{B.6}
\]

Proof. Let h ∈ H^{-a}_0(Ω) solve equation (B.3), R_Ω h = f ∈ H^a(Ω). Let us define F := Q^{-1}Ph. Since h ∈ H^{-a}_0(Ω), it follows that Ph ∈ H^{-a-μ}(R^n) and F = Q^{-1}Ph ∈ H^{-a+ν-μ}(R^n) = H^a(R^n). We have f = R_Ω h = r_Ω Q^{-1}Ph = r_Ω F, so F is an extension of f. Therefore, F can be represented in the form (B.6). Furthermore, since F = Q^{-1}Ph, then Ph = QF, that is, h solves (B.5). Since h ∈ H^{-a}_0(Ω), then QF = Ph ∈ H^{a-ν}_0(Ω). It follows that Qu = 0 in Ω^-. Since F ∈ H^a(R^n), we get D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a−1. This means that u ∈ H^a(Ω^-) solves the boundary-value problem (B.4). Thus, it is proved that any solution to (B.3) solves problem (B.4), (B.5).

Conversely, let a pair (u,h) ∈ H^a(Ω^-) × H^{-a}_0(Ω) solve system (B.4), (B.5), (B.6). Since Ph = QF, then Rh = Q^{-1}Ph = F. It follows from (B.6) that R_Ω h = Rh|_Ω = F|_Ω = f, i.e. h solves (B.3).

Remark B.1 If μ > 0, the boundary value problem (B.4) is underdetermined, because Q is an elliptic operator of order ν, which needs ν/2 boundary conditions, but we have only a (a < ν/2) conditions in (B.4). Therefore, the next step is a transformation of equation (B.5) into μ/2 extra boundary conditions for the boundary value problem (B.4). This will be done in Theorem B.2.

Let us define κ(ξ′,λ) := (1 + |ξ′|² + λ²)^{1/2}. Choose a function ρ(τ) ∈ S(R) with supp F^{-1}ρ ⊂ R_- and ρ(0) = 1. Let a > 2 sup|∂_τ ρ(τ)|. Let Ξ^t_{+,λ} denote a family (λ ∈ R_+, t ∈ Z) of order-reducing pseudodifferential operators Ξ^t_{+,λ} := F^{-1} χ_+(ξ,λ) F, where
\[
χ_+(ξ,λ) := \Bigl( κ(ξ′,λ)\,ρ\Bigl(\frac{ξ_n}{a\,κ(ξ′,λ)}\Bigr) + iξ_n \Bigr)^t
\]
are their symbols. It has been proved in [Grubb (1996), Sec. 2.5] that the operator Ξ^t_{+,λ} maps the space S_0(R̄^n_+) := {u ∈ S(R^n): supp u ⊂ R̄^n_+} onto itself and has the following isomorphism properties for s ∈ R:
\[
Ξ^t_{+,λ}: H^s(R^n) \simeq H^{s-t}(R^n), \tag{B.7}
\]
\[
Ξ^t_{+,λ}: H^s_0(R̄^n_+) \simeq H^{s-t}_0(R̄^n_+). \tag{B.8}
\]
It is known ([Grubb (1996)], [Schrohe (1999)]) that using Ξ^t_{+,λ} and an appropriate partition of unity one can obtain, for sufficiently large λ, an operator Λ^t_+ which is an isomorphism:
\[
Λ^t_+: H^s(R^n) \simeq H^{s-t}(R^n), \quad ∀ s ∈ R,
\]
and
\[
Λ^t_+: H^s_0(Ω) \simeq H^{s-t}_0(Ω), \quad ∀ s ∈ R. \tag{B.9}
\]

Lemma B.1 Let P(x,D) be an invertible differential operator of order μ, that is, there exists the inverse operator P^{-1}(x,D) which is bounded: P^{-1}(x,D): H^{s-μ}(R^n) → H^s(R^n) for all s ∈ R. Then a solution h to the equation
\[
P(x,D)\,h = g, \qquad g ∈ H^{-a-μ}_0(Ω)
\]
belongs to the space H^{-a}_0(Ω) if and only if g satisfies the following μ/2 boundary conditions:
\[
r_{∂Ω} D_n^j Λ_+^{-a-μ/2} P^{-1}(x,D)\,g = 0 \qquad (j = 0, ..., μ/2 - 1).
\]


Proof. Necessity. Let h = P^{-1}(x,D)g, h ∈ H^{-a}_0(Ω), solve the equation P(x,D)h = g, g ∈ H^{-a-μ}_0(Ω). By (B.9), we have Λ_+^{-a-μ/2} h ∈ H^{μ/2}_0(Ω). Therefore, r_{∂Ω} D_n^j Λ_+^{-a-μ/2} h = 0 (j = 0,...,μ/2−1).

Sufficiency. Assume that the equalities r_{∂Ω} D_n^j Λ_+^{-a-μ/2} h = 0 (j = 0,...,μ/2−1) hold. Since g ∈ H^{-a-μ}_0(Ω) ⊂ H^{-a-μ}(R^n), we have h = P^{-1}(x,D)g ∈ H^{-a}(R^n). Therefore, Ψ := Λ_+^{-a-μ/2} h ∈ H^{μ/2}(R^n). Since r_{∂Ω} D_n^j Ψ = 0 (j = 0,...,μ/2−1), we have Ψ = Ψ_+ + Ψ_-, where Ψ_+ := e_{Ω^-} r_Ω Ψ ∈ H^{μ/2}_0(Ω) and Ψ_- := e_Ω r_{Ω^-} Ψ ∈ H^{μ/2}_0(Ω^-). Since Λ_+^{ν/2}: H^{μ/2}_0(Ω) ≃ H^{-a}_0(Ω), it follows that Λ_+^{ν/2} Ψ_+ ∈ H^{-a}_0(Ω). Moreover, Λ_+^{ν/2} is a differential operator with respect to the variable x_n, hence supp Ψ_- ⊂ Ω̄^- implies supp Λ_+^{ν/2} Ψ_- ⊂ Ω̄^-. Since P is a differential operator,
\[
\operatorname{supp}\bigl(PΛ_+^{ν/2}\bigr)Ψ_- \subset \operatorname{supp} Λ_+^{ν/2}Ψ_- \subset Ω̄^-.
\]
On the other hand, we have
\[
Φ := \bigl(PΛ_+^{ν/2}\bigr)Ψ = \bigl(PΛ_+^{ν/2}\bigr)(Ψ_+ + Ψ_-) = \bigl(PΛ_+^{ν/2}\bigr)Ψ_+ + \bigl(PΛ_+^{ν/2}\bigr)Ψ_-.
\]
For any ϕ ∈ C_0^∞(Ω^-) one has (note that Φ = PΛ_+^{ν/2}Ψ = Ph = g is supported in Ω̄):
\[
0 = ⟨Φ, ϕ⟩ = \bigl⟨\bigl(PΛ_+^{ν/2}\bigr)Ψ_+, ϕ\bigr⟩ + \bigl⟨\bigl(PΛ_+^{ν/2}\bigr)Ψ_-, ϕ\bigr⟩ = \bigl⟨\bigl(PΛ_+^{ν/2}\bigr)Ψ_-, ϕ\bigr⟩.
\]
Thus, supp(PΛ_+^{ν/2})Ψ_- ⊂ Ω̄. It follows that supp(PΛ_+^{ν/2})Ψ_- ⊂ ∂Ω.

For any Ψ_- ∈ C_0^∞(Ω^-), we have (PΛ_+^{ν/2})Ψ_- ∈ C^∞(R^n) and supp(PΛ_+^{ν/2})Ψ_- ⊂ ∂Ω. Therefore, (PΛ_+^{ν/2})Ψ_- = 0. Since P is invertible, Λ_+^{ν/2}Ψ_- = 0 for Ψ_- ∈ C_0^∞(Ω^-). Since C_0^∞(Ω^-) is dense in H^{μ/2}_0(Ω^-), one gets Λ_+^{ν/2}Ψ_- = 0 for Ψ_- ∈ H^{μ/2}_0(Ω^-). It follows that
\[
h = Λ_+^{ν/2}Ψ = Λ_+^{ν/2}Ψ_+ + Λ_+^{ν/2}Ψ_- = Λ_+^{ν/2}Ψ_+ ∈ H^{-a}_0(Ω).
\]
Lemma B.1 is proved.

Let F ∈ C^∞(Ω̄) ∩ S(Ω̄^-). Assume that F has finite jumps F_k of the normal derivative of order k (k = 0, 1, ...) on ∂Ω. For x′ ∈ ∂Ω, we will use the following notation:
\[
F_0(x′) := [F]_{∂Ω}(x′) := \lim_{ε→+0} \bigl( F(x′+εn) - F(x′-εn) \bigr), \qquad F_k(x′) := \bigl[ D_n^k F \bigr]_{∂Ω}(x′).
\]


Let f ∈ C^∞(Ω̄) and u ∈ S(Ω̄^-), and define γ_k f(x′) := r_{∂Ω} D_n^k f(x′), γ_k u(x′) := r_{∂Ω} D_n^k u(x′).

Let δ_{∂Ω} denote the Dirac measure supported on ∂Ω, that is, the distribution acting as
\[
(δ_{∂Ω}, ϕ) := \int_{∂Ω} ϕ(x)\,dS, \qquad ϕ(x) ∈ C_0^∞(R^n).
\]
It is known that for any differential operator Q of order ν there exists a representation Q = \sum_{j=0}^{ν} Q_j D_n^j, where Q_j is a tangential differential operator of order ν − j (cf. Section B.4). We denote by {D^α F}(x) the classical derivative at the points where it exists.

The following Lemma B.2 is essentially known, but for the convenience of the reader a short proof of this lemma is given.

Lemma B.2 The following equality holds for the distribution QF:
\[
QF = \{QF\} - i \sum_{j=0}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \bigl( F_{j-1-k}\,δ_{∂Ω} \bigr). \tag{B.10}
\]

Proof. Let cos(n,x_j) denote the cosine of the angle between the exterior unit normal vector n to the boundary ∂Ω of Ω and the x_j-axis. We use the known formulas
\[
\int_Ω \frac{∂u}{∂x_j}\,dx = \int_{∂Ω} u(x) \cos(n,x_j)\,dσ, \qquad u(x) ∈ C^∞(Ω̄), \quad j = 1,...,n,
\]
\[
\int_{Ω^-} \frac{∂v}{∂x_j}\,dx = -\int_{∂Ω} v(x) \cos(n,x_j)\,dσ, \qquad v(x) ∈ C_0^∞(Ω̄^-), \quad j = 1,...,n,
\]
where dσ is the surface measure on ∂Ω. Applying these formulas to the products u(x)ϕ(x) and v(x)ϕ(x), where ϕ(x) ∈ C_0^∞(R^n), u(x) ∈ C^∞(Ω̄), v(x) ∈ C_0^∞(Ω̄^-), we get
\[
\int_Ω \frac{∂u}{∂x_j}\,ϕ(x)\,dx = -\int_Ω u(x)\,\frac{∂ϕ}{∂x_j}\,dx + \int_{∂Ω} u(x)\,ϕ(x) \cos(n,x_j)\,dσ, \qquad j = 1,...,n, \tag{B.11}
\]


\[
\int_{Ω^-} \frac{∂v}{∂x_j}\,ϕ(x)\,dx = -\int_{Ω^-} v(x)\,\frac{∂ϕ}{∂x_j}\,dx - \int_{∂Ω} v(x)\,ϕ(x) \cos(n,x_j)\,dσ, \qquad j = 1,...,n. \tag{B.12}
\]
By (B.11), (B.12), we have
\[
\Bigl( \frac{∂F}{∂x_j}, ϕ \Bigr) = -\Bigl( F, \frac{∂ϕ}{∂x_j} \Bigr) = -\int_{R^n} F(x)\,\frac{∂ϕ(x)}{∂x_j}\,dx
= \int_{R^n} \Bigl\{ \frac{∂F(x)}{∂x_j} \Bigr\} ϕ(x)\,dx + \int_{∂Ω} [F]_{∂Ω}(x) \cos(n,x_j)\,ϕ(x)\,dS
\]
\[
= \Bigl( \Bigl\{ \frac{∂F}{∂x_j} \Bigr\} + [F]_{∂Ω} \cos(n,x_j)\,δ_{∂Ω},\ ϕ(x) \Bigr), \qquad ϕ(x) ∈ C_0^∞(R^n).
\]
This means that
\[
\frac{∂F}{∂x_j} = \Bigl\{ \frac{∂F}{∂x_j} \Bigr\} + [F]_{∂Ω} \cos(n,x_j)\,δ_{∂Ω}, \qquad j = 1,...,n.
\]
It follows that D_n F = {D_n F} − iF_0 δ_{∂Ω}. Furthermore, using the last formula we have D_n^2 F = D_n{D_n F} − iD_n(F_0 δ_{∂Ω}) = {D_n^2 F} − iF_1 δ_{∂Ω} − iD_n(F_0 δ_{∂Ω}), and so on. By induction one gets:
\[
D_n^j F = \{ D_n^j F \} - i \sum_{k=0}^{j-1} D_n^k \bigl( F_{j-1-k}\,δ_{∂Ω} \bigr) \qquad (j = 1, 2, ...).
\]
Substituting this formula for D_n^j F into the representation Q = \sum_{j=0}^{ν} Q_j D_n^j, we get (B.10). Lemma B.2 is proved.
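The basic jump identity used in this proof can be checked numerically in one dimension (a sketch, not from the book; the sample pieces f, u and the test function are illustrative choices). With ∂Ω replaced by the point {0}, the formula reads F′ = {F′} + [F]δ₀ with jump [F] = u(0) − f(0), and the check compares the distributional pairing −⟨F, φ′⟩ with ⟨{F′}, φ⟩ + [F]φ(0).

```python
import numpy as np

# 1-D analogue of the jump formula: F = f on (-inf, 0), F = u on (0, inf).
f  = lambda x: np.cos(x)                 # interior piece (illustrative)
fp = lambda x: -np.sin(x)
u  = lambda x: np.exp(-x) + 2.0          # exterior piece (illustrative)
up = lambda x: -np.exp(-x)
phi  = lambda x: np.exp(-x**2)           # smooth, rapidly decaying test function
phip = lambda x: -2 * x * np.exp(-x**2)

x = np.linspace(-10, 10, 400001)
F       = np.where(x < 0, f(x), u(x))
F_class = np.where(x < 0, fp(x), up(x))  # classical derivative {F'} away from 0

lhs = -np.trapz(F * phip(x), x)                                   # <F', phi> by definition
rhs = np.trapz(F_class * phi(x), x) + (u(0.0) - f(0.0)) * phi(0.0)
print(lhs, rhs)                                                   # agree up to quadrature error
```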

Denoting in the sequel by f^0 and u^0 the extensions by zero to R^n of functions f(x) ∈ C^∞(Ω̄), u(x) ∈ S(Ω̄^-), and using Lemma B.2, we obtain the following formulas:
\[
(Qf)^0 = Q(f^0) - i \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \Bigl( \bigl(D_n^{j-1-k} f\bigr)\big|_{∂Ω}\,δ_{∂Ω} \Bigr) \qquad \bigl(f ∈ C^∞(Ω̄)\bigr), \tag{B.13}
\]


\[
(Qu)^0 = Q(u^0) + i \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \Bigl( \bigl(D_n^{j-1-k} u\bigr)\big|_{∂Ω}\,δ_{∂Ω} \Bigr) \qquad \bigl(u ∈ S(Ω̄^-)\bigr), \tag{B.14}
\]
where (D_n^j f)|_{∂Ω} := r_{∂Ω} D_n^j f. Using these formulas one can define the action of the operator Q upon the elements of the spaces H^{s,ν}(Ω) and H^{s,ν}(Ω^-) (s ∈ R) (defined in Section B.4) as follows (cf. [Kozlov et. al. (1997), Sect. 3.2], [Roitberg (1996), Sect. 2.4]):
\[
\bigl(Q(f,ψ)\bigr)^0 := Q(f^0) - i \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \bigl( ψ_{j-k}\,δ_{∂Ω} \bigr) \qquad \bigl((f,ψ) ∈ H^{s,ν}(Ω)\bigr), \tag{B.15}
\]
\[
\bigl(Q(u,φ)\bigr)^0 := Q(u^0) + i \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \bigl( φ_{j-k}\,δ_{∂Ω} \bigr) \qquad \bigl((u,φ) ∈ H^{s,ν}(Ω^-)\bigr). \tag{B.16}
\]

It is known ([Roitberg (1996)], [Kozlov et. al. (1997)], [Kozhevnikov (2001)]) that Q, defined respectively in (B.15) and (B.16), is a bounded mapping
\[
Q: H^{s,ν}(Ω) → H^{s-ν}(Ω) \qquad \text{and} \qquad Q: H^{s,ν}(Ω^-) → H^{s-ν}(Ω^-).
\]
Moreover, Q is respectively the closure of the mapping f → Q(x,D)f (f ∈ C^∞(Ω̄)) or u → Q(x,D)u (u ∈ S(Ω̄^-)) between the corresponding spaces.

Let W_{mℓ} (m = 1,...,μ/2, ℓ = a+1,...,ν) be the operator acting as follows:
\[
W_{mℓ}(φ) := i\,γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \sum_{ℓ=a+1}^{ν} \sum_{j=ℓ}^{ν} Q_j D_n^{j-ℓ} \bigl( φ\,δ_{∂Ω} \bigr), \qquad φ ∈ C^∞(∂Ω), \tag{B.17}
\]
where γ_k is the restriction to ∂Ω of D_n^k (cf. Section B.4).

The mapping W_{mℓ} is a pseudodifferential operator of order m − μ + ν/2 − 1 − ℓ. Therefore, for any real s, this mapping is a bounded operator:
\[
W_{mℓ}: H^s(∂Ω) → H^{s-m+μ-ν/2+1+ℓ}(∂Ω).
\]
For (f,ψ) ∈ H^{a,ν}(Ω), one has g := Q(f,ψ) ∈ H^{a-ν}_0(Ω), and we set
\[
w_{a+m} := -γ_{m-1} Λ_+^{-a-μ/2} P^{-1} g^0 \qquad (m = 1,...,μ/2), \tag{B.18}
\]


where the operator γ_{m-1} Λ_+^{-a-μ/2} P^{-1}(x,D) is a trace operator of order m − 1 − a − 3μ/2. It follows that w_{a+m} ∈ H^{μ/2-m+1/2}(∂Ω).

Theorem B.2 Integral equation (B.3),
\[
R_Ω h = f ∈ H^a(Ω), \qquad h ∈ H^{-a}_0(Ω),
\]
is equivalent to the following boundary-value problem:
\[
Qu = 0 \ \text{in } Ω^-, \qquad D_n^j u = D_n^j f \ \text{on } ∂Ω, \ 0 \le j \le a-1, \qquad \sum_{ℓ=a+1}^{ν} W_{mℓ}(γ_{ℓ-1} u) = w_{a+m} \ \text{on } ∂Ω, \ 1 \le m \le μ/2, \tag{B.19}
\]
where the functions u, f and h are related by the formulas
\[
h = P^{-1} Q F, \qquad F ∈ H^a(R^n), \qquad F := \begin{cases} f ∈ H^a(Ω) & \text{in } Ω,\\ u ∈ H^a(Ω^-) & \text{in } Ω^-. \end{cases}
\]

Proof. Our starting point is Theorem B.1. Consider the equation Ph = QF, h ∈ H^{-a}_0(Ω). Since F ∈ H^a(R^n) and Qu = 0 in Ω^- by (B.4), then QF ∈ H^{a-ν}_0(Ω) = H^{-a-μ}_0(Ω). By Lemma B.1, a solution h to the equation Ph = QF ∈ H^{-a-μ}_0(Ω) belongs to the space H^{-a}_0(Ω) if and only if QF satisfies the following μ/2 boundary conditions:
\[
r_{∂Ω} D_n^{m-1} Λ_+^{-a-μ/2} P^{-1} QF = 0, \qquad m = 1,...,μ/2. \tag{B.20}
\]
Since F = f^0 + u^0, one has QF = Q(f^0) + Q(u^0). Substituting the last expression into (B.20) we have
\[
γ_{m-1} Λ_+^{-a-μ/2} P^{-1} Q(u^0) = -γ_{m-1} Λ_+^{-a-μ/2} P^{-1} Q(f^0), \qquad m = 1,...,μ/2.
\]
From (B.15) and (B.16), one gets:
\[
i\,γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \bigl( φ_{j-k}\,δ_{∂Ω} \bigr)
= γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \bigl( Q(f,ψ) \bigr)^0
+ i\,γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \sum_{j=1}^{ν} Q_j \sum_{k=0}^{j-1} D_n^k \bigl( ψ_{j-k}\,δ_{∂Ω} \bigr). \tag{B.21}
\]


Since
\[
F := \begin{cases} f ∈ H^a(Ω) & \text{in } Ω,\\ u ∈ H^a(Ω^-) & \text{in } Ω^- \end{cases} \qquad \text{and} \qquad F ∈ H^a(R^n),
\]
it follows that γ_{j-1}u = γ_{j-1}f, j = 1,...,a. Therefore, φ_j = γ_{j-1}u = γ_{j-1}f = ψ_j, j = 1,...,a.

We identify the space H^a(Ω) with the subspace of H^{a,(ν)}(Ω) of all (f,ψ) = (f,ψ_1,...,ψ_ν) such that ψ_{a+1} = ... = ψ_ν = 0. Let (f,ψ) belong to this subspace and (u,φ) = (u,φ_1,...,φ_ν) ∈ H^{a,(ν)}(Ω^-). Then we can rewrite (B.21) as
\[
γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \bigl( Q(f,ψ) \bigr)^0 + i\,γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \sum_{j=1}^{ν} Q_j \sum_{ℓ=a+1}^{j} D_n^{j-ℓ} \bigl( φ_ℓ\,δ_{∂Ω} \bigr) = 0. \tag{B.22}
\]
Changing the order of the summation,
\[
\sum_{j=1}^{ν} \sum_{ℓ=a+1}^{j} = \sum_{ℓ=a+1}^{ν} \sum_{j=ℓ}^{ν},
\]
we get
\[
i\,γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \sum_{ℓ=a+1}^{ν} \sum_{j=ℓ}^{ν} Q_j D_n^{j-ℓ} \bigl( φ_ℓ\,δ_{∂Ω} \bigr) = -γ_{m-1} Λ_+^{-a-μ/2} P^{-1} \bigl( Q(f,ψ) \bigr)^0, \tag{B.23}
\]
where m = 1,...,μ/2. In view of (B.17) and (B.18), formula (B.23) can be rewritten as the μ/2 equations
\[
\sum_{ℓ=a+1}^{ν} W_{mℓ}(φ_ℓ) = w_{a+m} \quad \text{on } ∂Ω, \qquad m = 1,...,μ/2.
\]
Since φ_j = γ_{j-1}u for u ∈ S(Ω̄^-), we get
\[
\sum_{ℓ=a+1}^{ν} W_{mℓ}(γ_{ℓ-1}u) = w_{a+m} \quad \text{on } ∂Ω, \qquad m = 1,...,μ/2.
\]
These equations define μ/2 extra boundary conditions for the boundary-value problem Qu = 0 in Ω^-, D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a−1.


Theorem B.2 is proved.

B.3 Isomorphism property

We look for a solution u ∈ H^a(Ω^-) to the boundary-value problem (B.19). Let us consider the following non-homogeneous boundary-value problem associated with (B.19):
\[
Qu = w \ \text{in } Ω^-, \qquad γ_0 B_j u := γ_0 D_n^{j-1} u = w_j \ \text{on } ∂Ω, \ 1 \le j \le a, \qquad γ_0 B_{a+m} u := \sum_{ℓ=a+1}^{ν} W_{mℓ}(u_{ℓ-1}) = w_{a+m} \ \text{on } ∂Ω, \ 1 \le m \le μ/2, \tag{B.24}
\]
where w and w_j, j = 1,...,ν/2, are arbitrary elements of the corresponding Sobolev spaces (see Theorems B.3 and B.4 below).

For the formulation of the Shapiro-Lopatinskii condition we need some notation.

Let ε > 0 be a sufficiently small number. Denote by U (an ε-conic neighborhood) the union of all balls B(x, ε⟨x⟩) centered at x ∈ ∂Ω with radius ε⟨x⟩. Let y = (y′, y_n) = (y_1,...,y_{n-1}, y_n) be normal coordinates in an ε-conic neighborhood U of ∂Ω, that is, ∂Ω may be identified with {y_n = 0}, y_n is the normal coordinate, and the normal derivative D_n is D_{y_n} near ∂Ω. Each differential operator on R^n with SG-symbol can be written in U as a differential operator with respect to D_{y′} and D_{y_n}:
\[
Q = \sum_{j=0}^{ν} Q_j(y, D_{y′})\,D_{y_n}^j,
\]
where the Q_j(y, D_{y′}) are differential operators with symbols belonging to SG^{(ν,0)}(R^n). Let
\[
q(y,ξ) = q(y,ξ′,ξ_n) = \sum_{j=0}^{ν} q_j(y,ξ′)\,ξ_n^j
\]
be the symbol of Q, where ξ′ and ξ_n are the cotangent variables associated with y′ and y_n.

Assumption 1. We assume that the operator Q is md-properly elliptic (cf. [Erkip and Schrohe (1992), Assumption 1, p. 40]), that is, for all large |y| + |ξ′| the polynomial q(y,ξ′,z) in the complex variable z has exactly ν/2 zeros with positive imaginary parts τ_1(y′,ξ′),...,τ_{ν/2}(y′,ξ′).


We conclude from Assumption 1 that the polynomial q(y,ξ′,z) has no real zeros and has exactly ν/2 zeros with negative imaginary part for all large |y| + |ξ′|. In particular, the Laplacian Δ in the space R^n (n ≥ 2) is elliptic in the usual sense but not md-properly elliptic, while the operator I − Δ is md-properly elliptic.

Let
\[
χ(y′,ξ′) := \Bigl( 1 + \sum_{i,j=1}^{n-1} ξ_i \bigl( g(y)^{-1} \bigr)_{ij} ξ_j \Bigr)^{1/2},
\]
where g = (g_{ij}) is a Riemannian metric on ∂Ω. We denote
\[
q_+(y′,ξ′,z) := \prod_{j=1}^{ν/2} \bigl( z - χ(y′,ξ′)^{-1} τ_j(y′,ξ′) \bigr).
\]
Consider the operators B_m (m = 1,...,ν/2) from (B.24). Each of them is of the form
\[
B_m = \sum_{j=0}^{ν-1} B_{mj}(y′, D_{y′})\,D_{y_n}^j
\]
in the normal coordinates y = (y′,y_n) = (y_1,...,y_{n-1},y_n) in an ε-conic neighborhood of ∂Ω. Here B_{mj}(y′,D_{y′}) is a pseudodifferential operator of order ρ_m − j (ρ_m ∈ N) acting on ∂Ω. Let b_{mj}(y′,ξ′) denote the principal symbol of B_{mj}(y′,D_{y′}). The operators B_m in the boundary-value problem (B.24) are operators of this type. We set
\[
b_m(y′,ξ′,z) := \sum_{j=0}^{ν-1} b_{mj}(y′,ξ′)\,χ(y′,ξ′)^{-ρ_m+j}\,z^j.
\]
Define the following polynomials with respect to z:
\[
r_m(y′,ξ′,z) = \sum_{j=1}^{ν/2} r_{mj}(y′,ξ′)\,z^{j-1}
\]
as the residues of b_m(y′,ξ′,z) modulo q_+(y′,ξ′,z); i.e., we obtain the r_{mj}(y′,ξ′) by representing b_m(y′,ξ′,z) in the form
\[
b_m(y′,ξ′,z) = q_m(z)\,q_+(y′,ξ′,z) + \sum_{j=1}^{ν/2} r_{mj}(y′,ξ′)\,z^{j-1},
\]
where q_m(z) is a polynomial in z.

Assumption 2 (Shapiro-Lopatinskii condition). The determinant det(r_{mj}(y′,ξ′)) is bounded and bounded away from zero, that is, there exist two positive constants c and C such that
\[
0 < c \le \det\bigl( r_{mj}(y′,ξ′) \bigr) \le C.
\]

Remark B.2 The following Theorem B.3 has been proved in [Erkip and Schrohe (1992), Th. 3.1] in the more general case of SG-manifolds. The latter include the exteriors of bounded domains, which are a particular case of SG-manifolds; this particular case was chosen for simplicity of the exposition. Moreover, the results in [Erkip and Schrohe (1992), Th. 3.1], [Schrohe (1999)] have been obtained for operators acting in weighted Sobolev spaces. The usual Sobolev spaces in Theorem B.3 are particular cases of the weighted Sobolev spaces with zero order of the weight.

Theorem B.3 (cf. [Erkip and Schrohe (1992), Th. 3.1], [Schrohe (1999)]) If the differential operator Q of even order ν satisfies Assumptions 1 and 2, that is, Q is md-properly elliptic and the Shapiro-Lopatinskii condition holds for the operator (Q, γ_0B_1,...,γ_0B_{ν/2}), then the mapping
\[
(Q, γ_0B_1,...,γ_0B_{ν/2}): H^s(Ω^-) → H^{s-ν}(Ω^-) \times \prod_{j=1}^{ν/2} H^{s-ρ_j-1/2}(∂Ω), \qquad s \ge ν,
\]
is a Fredholm operator.

Assumption 3. The Fredholm operator (Q, γ_0B_1,...,γ_0B_{ν/2}) has trivial kernel and cokernel.

For example, if the kernel R(x,y) has the property (Rh,h) ≥ c‖h‖²_{H^{-a}_0} for all h ∈ H^{-a}_0, where c = const > 0 does not depend on h, then the operator in Assumption 3 is invertible (see Chapter 1).

Corollary B.1 Under the assumptions of Theorem B.3 and, in addition, under Assumption 3, for any s ∈ R there exists a bounded (Poisson) operator
\[
K: \prod_{j=0}^{m-1} H^{s+2m-j-1/2}(∂Ω) → H^{s+2m}(Ω^-) \tag{B.25}
\]
which gives a unique solution u = Kχ to the boundary-value problem
\[
Qu = 0 \ \text{in } Ω^-, \qquad γ_0B_1u = χ_1, \ ..., \ γ_0B_{ν/2}u = χ_{ν/2}, \tag{B.26}
\]
with
\[
χ = (χ_1,...,χ_{ν/2}) ∈ \prod_{j=0}^{ν/2-1} H^{s+ν-j-1/2}(∂Ω).
\]
More precisely, the operator u = Kχ solves the problem with s < 0 in the sense that u = Kχ is the limit in the space H^{s+ν}(Ω^-) of a sequence u_n in H^{ν}(Ω^-) with
\[
Qu_n = 0, \qquad γ_0B_ju_n = χ_{j,n} \ (j = 1,...,m), \qquad \lim_{n→∞} χ_n = χ \quad \text{in } \prod_{j=0}^{ν/2-1} H^{s+ν-j-1/2}(∂Ω).
\]

Proof. The statement of the Corollary is an immediate consequence of Theorem B.3, due to the fact that the solution operator of the boundary-value problem (B.26) with the homogeneous equation Qu = 0 in Ω^- is a Poisson operator. The latter acts in the full scale of Sobolev spaces [Schrohe (1999)], that is, (B.25) holds for all s ∈ R.

Theorem B.4 Under the assumptions of Theorem B.3 and, in addition, under Assumption 3, the mapping R_Ω, defined in the Introduction, is an isomorphism: H^{-a}_0(Ω) → H^a(Ω).

Proof. Let us consider the operator (Q, γ_0B_1,...,γ_0B_{ν/2}) generated by the boundary value problem (B.24). Taking into account that
\[
ρ_j = \operatorname{ord} B_j = j - 1 \ \text{ for } j = 1,...,a, \qquad ρ_j = \operatorname{ord} B_j = j - μ + ν/2 - 2 \ \text{ for } j = a+1,...,ν/2,
\]
one concludes by Theorem B.3 that the mapping
\[
(u,φ) \mapsto \bigl( Q(u,φ),\, γ_0B_1(u,φ),\, ...,\, γ_0B_{ν/2}(u,φ) \bigr) = (w, w_1,...,w_{ν/2})
\]
is a Fredholm operator. It maps the space H^s(Ω^-) to the space
\[
H^{s-ν}(Ω^-) \times \prod_{j=1}^{a} H^{s-j+1/2}(∂Ω) \times \prod_{j=a+1}^{ν/2} H^{s-j+μ-ν/2+3/2}(∂Ω) \qquad (s \ge ν).
\]
Assumption 3 implies that this mapping is an isomorphism. By the Corollary, the operator K, solving the boundary-value problem
\[
Qu = 0, \qquad γ_0B_ju = χ_j \ (j = 1,...,m),
\]
is a Poisson operator
\[
K: \prod_{j=0}^{m-1} H^{s+2m-j-1/2}(∂Ω) → H^{s+2m}(Ω^-) \qquad (s ∈ R).
\]


Choosing s = a and using Theorem B.2, we conclude that for any f ∈ H^a(Ω) the function u is a unique solution to the boundary-value problem (B.19). Therefore, again by Theorem B.2, the operator R_Ω is an isomorphism of the space H^{-a}_0(Ω) onto H^a(Ω). Theorem B.4 is proved.

Example B.1 Let P = I be the identity operator (its order is μ = 0) and Q = I − Δ (ν = ord Q = 2). Then, by Theorem B.4, the corresponding operator R_Ω is an isomorphism: H^{-1}_0(Ω) → H^1(Ω).

Under the assumptions of Theorem B.4 there exists a unique solution to the integral equation (B.3). Let us find this solution.

Examples of analytical formulas for the solution to the integral equation (B.3) can be found in [Ramm (1990)]. Analytical formulas for the solution, in the cases when the corresponding boundary-value problems are solvable analytically, can be obtained only for domains Ω of special shape, for example when Ω is a ball, and for special operators Q and P, for example operators with constant coefficients.

We give such a formula for the solution of equation (B.3) assuming P = I and Q = −Δ + a²I. Consider the equation
\[
R_Ω h(x) = \int_Ω \frac{\exp(-a|x-y|)}{4π|x-y|}\, h(y)\,dy = f(x), \qquad x ∈ Ω \subset R^3, \quad a > 0, \tag{B.27}
\]
with the kernel R(x,y) := exp(−a|x−y|)/(4π|x−y|), P = I, and Q = −Δ + a²I. By formula (2.24), one obtains a unique solution to equation (B.27) in H^{-1}_0(Ω):
\[
h(x) = (-Δ + a^2) f + \Bigl( \frac{∂f}{∂n} - \frac{∂u}{∂n} \Bigr) δ_{∂Ω}, \tag{B.28}
\]
where u is the unique solution to the exterior Dirichlet boundary-value problem
\[
(-Δ + a^2) u = 0 \ \text{in } Ω^-, \qquad u|_{∂Ω} = f|_{∂Ω}. \tag{B.29}
\]
For any ϕ ∈ C_0^∞(R^n) one has:
\[
\bigl( (-Δ + a^2) Rh, ϕ \bigr) = \bigl( Rh, (-Δ + a^2) ϕ \bigr)
= \int_Ω f\,(-Δ + a^2) ϕ\,dx + \int_{Ω^-} u\,(-Δ + a^2) ϕ\,dx
\]
\[
= \int_Ω (-Δ + a^2) f\,ϕ\,dx + \int_{Ω^-} (-Δ + a^2) u\,ϕ\,dx - \int_{∂Ω} \bigl( f\,∂_nϕ - ∂_nf\,ϕ \bigr)\,ds + \int_{∂Ω} \bigl( u\,∂_nϕ - ∂_nu\,ϕ \bigr)\,ds
\]
\[
= \int_Ω (-Δ + a^2) f\,ϕ\,dx + \int_{∂Ω} \bigl( ∂_nf - ∂_nu \bigr)\,ϕ\,ds,
\]
where the condition u = f on ∂Ω was used. Thus, we have checked that formula (B.28) gives the solution to equation (B.27) which is unique in H^{-1}_0(Ω). This solution has minimal order of singularity.
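A one-dimensional analogue of this check can be carried out numerically (a sketch, not from the book; the interval, the test function f, and the 1-D kernel e^{−a|x−y|}/(2a), the Green's function of Q = −d²/dx² + a² on R, are illustrative assumptions). The analogue of (B.28) is h = (−f″ + a²f) on (0,L) plus boundary deltas with weights c₀ = a f(0) − f′(0) and c_L = a f(L) + f′(L), obtained from the exterior solution u(x) = f(0)e^{ax} for x < 0 and u(x) = f(L)e^{−a(x−L)} for x > L.

```python
import numpy as np

# 1-D analogue of (B.27)-(B.28): R(x,y) = exp(-a|x-y|)/(2a), Q = -d^2/dx^2 + a^2, P = I.
a, L, N = 2.0, 1.0, 4001
x = np.linspace(0.0, L, N)
f   = np.sin(x) + 2.0          # illustrative smooth f
fp  = np.cos(x)                # f'
fpp = -np.sin(x)               # f''

h_reg = -fpp + a**2 * f        # regular part of the solution inside (0, L)
c0 = a * f[0]  - fp[0]         # weight of the delta at x = 0
cL = a * f[-1] + fp[-1]        # weight of the delta at x = L

R = lambda s, t: np.exp(-a * np.abs(s - t)) / (2 * a)
# (R_Omega h)(x) = int_0^L R(x,y) h_reg(y) dy + c0 R(x,0) + cL R(x,L)
Rh = np.array([np.trapz(R(xi, x) * h_reg, x) for xi in x]) + c0 * R(x, 0.0) + cL * R(x, L)

print(np.max(np.abs(Rh - f)))  # small: only quadrature error remains
```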

B.4 Auxiliary material

We denote by R the set of real numbers and by C the set of complex numbers. Let Z := {0, ±1, ±2, ...}, N := {0, 1, ...}, N_+ := {1, 2, ...}, R^n := {x = (x_1,...,x_n): x_i ∈ R, i = 1,...,n}. Let α be a multi-index, α := (α_1,...,α_n), α_j ∈ N, |α| := α_1 + ... + α_n, i := √−1, D_j := i^{-1}∂/∂x_j, D^α := D_1^{α_1} D_2^{α_2} ⋯ D_n^{α_n}.

Let C^∞(Ω̄) be the space of functions infinitely differentiable in Ω up to the boundary. A normal vector field n(x) = (n_1(x),...,n_n(x)) is defined in a neighborhood of the boundary ∂Ω as follows: for x_0 ∈ ∂Ω, n(x_0) is the unit normal to ∂Ω pointing into the exterior of Ω. We set
\[
n(x) := n(x_0) \quad \text{for } x \text{ of the form } x = x_0 + s\,n(x_0) =: ζ(x_0,s),
\]
where x_0 ∈ ∂Ω, s ∈ (−δ,δ). Here δ > 0 is taken so small that the representation of x in terms of x_0 ∈ ∂Ω and s ∈ (−δ,δ) is unique and smooth, that is, ζ is bijective and C^∞ with C^∞ inverse, from ∂Ω × (−δ,δ) to the set ζ(∂Ω × (−δ,δ)) ⊂ R^n.

We call differential operators tangential when, for x ∈ ζ(∂Ω × (−δ,δ)), they are either of the form
\[
Af = \sum_{j=1}^{n} a_j(x)\,\frac{∂f}{∂x_j}(x) + a_0(x) f \qquad \text{with} \qquad \sum_{j=1}^{n} a_j(x)\,n_j(x) = 0,
\]


or they are products of such operators. The derivative along n is denoted ∂_n:
\[
∂_n f := \sum_{j=1}^{n} n_j(x)\,\frac{∂f}{∂x_j}(x)
\]
for x ∈ ζ(∂Ω × (−δ,δ)). Let D_n := i^{-1}∂_n.

Let Ω^- := R^n \ Ω̄ denote the exterior of the domain Ω, and let r_{∂Ω}, r_Ω be respectively the restriction operators to ∂Ω and Ω: r_{∂Ω}f := f|_{∂Ω}, r_Ω f := f|_Ω.

Let S(R^n) be the space of rapidly decreasing functions, that is, the space of all u ∈ C^∞(R^n) such that
\[
\sup_{|α| \le k} \sup_{x ∈ R^n} \bigl| (1+|x|^2)^m D^α u(x) \bigr| < ∞ \quad \text{for all } k, m ∈ N.
\]
Let S(Ω̄^-) be the space of restrictions of the elements u ∈ S(R^n) to Ω^- (this space is equipped with the factor topology).

Let u ∈ C^∞(Ω̄) and v ∈ S(Ω̄^-); then we set γ_k u := r_{∂Ω} D_n^k u = (D_n^k u)|_{∂Ω}, γ_k v := r_{∂Ω} D_n^k v = (D_n^k v)|_{∂Ω}.

Let H^s(R^n) (s ∈ R) be the usual Sobolev space:
\[
H^s(R^n) := \bigl\{ f ∈ S' : F^{-1}(1+|ξ|^2)^{s/2} F f ∈ L^2(R^n) \bigr\}, \qquad \|f\|_{H^s(R^n)} := \bigl\| F^{-1}(1+|ξ|^2)^{s/2} F f \bigr\|_{L^2(R^n)},
\]
where F denotes the Fourier transform f ↦ F_{x→ξ} f(x) = \int_{R^n} e^{-ix·ξ} f(x)\,dx, F^{-1} is its inverse, and S' = S'(R^n) denotes the space of tempered distributions, which is dual to the space S(R^n).
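On a periodic grid the H^s-norm can be mimicked with the discrete Fourier transform. The sketch below is an illustrative discrete analogue (not a definition from the text): it computes Σ(1+|ξ|²)^s|û(ξ)|² for a smooth and a discontinuous sample function, showing that the latter's "norm" keeps growing with the grid size once s exceeds 1/2.

```python
import numpy as np

def sobolev_norm_sq(u, s):
    """Discrete periodic analogue of ||u||_{H^s}^2 = sum_xi (1+|xi|^2)^s |u_hat(xi)|^2."""
    N = u.size
    xi = np.fft.fftfreq(N, d=1.0 / N)      # integer frequencies -N/2 .. N/2-1
    u_hat = np.fft.fft(u) / N              # Fourier coefficients
    return float(np.sum((1.0 + xi ** 2) ** s * np.abs(u_hat) ** 2))

for N in [2 ** 10, 2 ** 14]:
    x = np.arange(N) / N
    smooth = np.sin(2 * np.pi * x)
    step = (x < 0.5).astype(float)         # a discontinuous sample function
    # The smooth function's norms are essentially independent of N; the step's value
    # stays bounded for s = 0.4 but grows with N for s = 0.6, reflecting the fact
    # that a jump belongs to H^s only for s < 1/2.
    print(N, sobolev_norm_sq(smooth, 0.4), sobolev_norm_sq(step, 0.4), sobolev_norm_sq(step, 0.6))
```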

Let H^s(Ω) and H^s(Ω^-) (0 ≤ s ∈ R) be respectively the spaces of restrictions of elements of H^s(R^n) to Ω and Ω^-. The norms in the spaces H^s(Ω) and H^s(Ω^-) are defined by the relations
\[
\|f\|_{H^s(Ω)} := \inf \|g\|_{H^s(R^n)} \quad (s \ge 0), \qquad \|f\|_{H^s(Ω^-)} := \inf \|g\|_{H^s(R^n)} \quad (s \ge 0),
\]
where the infimum is taken over all elements g ∈ H^s(R^n) which are equal to f in Ω, respectively in Ω^-.


By H^s_0(Ω) (s ∈ R) and H^s_0(Ω^-) we denote the closed subspaces of the space H^s(R^n) which consist of the elements with supports respectively in Ω̄ or in Ω̄^-, that is,
\[
H^s_0(Ω) := \{ f ∈ H^s(R^n) : \operatorname{supp} f \subseteq Ω̄ \} \subset H^s(R^n), \quad s ∈ R,
\]
\[
H^s_0(Ω^-) := \{ f ∈ H^s(R^n) : \operatorname{supp} f \subseteq Ω̄^- \} \subset H^s(R^n), \quad s ∈ R.
\]
We define the spaces
\[
H^s(Ω) := \begin{cases} H^s(Ω), & s > 0,\\ H^s_0(Ω), & s \le 0, \end{cases} \qquad H^s(Ω^-) := \begin{cases} H^s(Ω^-), & s > 0,\\ H^s_0(Ω^-), & s \le 0. \end{cases}
\]

For s ≠ k + 1/2 (k = 0, 1, ..., ℓ−1), we define the spaces H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω^-) respectively as the sets of all
\[
(u,φ) = (u,φ_1,...,φ_ℓ) \qquad \text{and} \qquad (v,ψ) = (v,ψ_1,...,ψ_ℓ),
\]
where u ∈ H^s(Ω), v ∈ H^s(Ω^-), and φ = (φ_1,...,φ_ℓ), ψ = (ψ_1,...,ψ_ℓ) are vectors in \prod_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω) satisfying the condition
\[
φ_j = D_n^{j-1} u \big|_{∂Ω}, \qquad ψ_j = D_n^{j-1} v \big|_{∂Ω} \qquad \text{for } j < \min(s, ℓ).
\]
The norms in H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω^-) can be defined as
\[
\|(u,φ)\|^2_{H^{s,ℓ}(Ω)} = \|u\|^2_{H^s(Ω)} + \sum_{j=1}^{ℓ} \|φ_j\|^2_{H^{s-j+1/2}(∂Ω)}, \qquad \|(v,ψ)\|^2_{H^{s,ℓ}(Ω^-)} = \|v\|^2_{H^s(Ω^-)} + \sum_{j=1}^{ℓ} \|ψ_j\|^2_{H^{s-j+1/2}(∂Ω)}.
\]
Since only the components φ_j and ψ_j with index j < s can be chosen independently of u, we can identify H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω^-) with the following spaces.


For s ≠ k + 1/2 (k = 0, 1, ..., ℓ−1),
\[
H^{s,ℓ}(Ω) = \begin{cases}
H^s(Ω), & ℓ = 0,\\
H^s(Ω), & 1 \le ℓ < s + 1/2,\\
H^s(Ω) \times \prod_{j=[s+1/2]+1}^{ℓ} H^{s-j+1/2}(∂Ω), & 0 < [s+\tfrac12] < ℓ,\\
H^s(Ω) \times \prod_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω), & s < \tfrac12,
\end{cases}
\]
\[
H^{s,ℓ}(Ω^-) = \begin{cases}
H^s(Ω^-), & ℓ = 0,\\
H^s(Ω^-), & 1 \le ℓ < s + 1/2,\\
H^s(Ω^-) \times \prod_{j=[s+1/2]+1}^{ℓ} H^{s-j+1/2}(∂Ω), & 0 < [s+\tfrac12] < ℓ,\\
H^s(Ω^-) \times \prod_{j=1}^{ℓ} H^{s-j+1/2}(∂Ω), & s < \tfrac12.
\end{cases}
\]
Finally, for s = k + 1/2 (k = 0, 1, ..., ℓ−1), we define the spaces H^{s,ℓ}(Ω), H^{s,ℓ}(Ω^-) by the method of complex interpolation.

Let us note that for s ≠ k + 1/2 (k = 0, 1, ..., ℓ−1) the spaces H^{s,ℓ}(Ω), H^{s,ℓ}(Ω^-) are the completions of C^∞(Ω̄) and S(Ω̄^-), respectively, in the norms
\[
\|(u, γ_0u, ..., γ_{ℓ-1}u)\|^2_{H^{s,ℓ}(Ω)} = \|u\|^2_{H^s(Ω)} + \sum_{j=0}^{ℓ-1} \|γ_j u\|^2_{H^{s-j-1/2}(∂Ω)},
\]
\[
\|(v, γ_0v, ..., γ_{ℓ-1}v)\|^2_{H^{s,ℓ}(Ω^-)} = \|v\|^2_{H^s(Ω^-)} + \sum_{j=0}^{ℓ-1} \|γ_j v\|^2_{H^{s-j-1/2}(∂Ω)}.
\]


Bibliographical Notes

The estimation theory optimal by the criterion of minimum of the error variance has been created by N. Wiener (1942) for stationary random processes and for an infinite interval of the time of observation. A large bibliography can be found in [Kailath (1974)]. The theory had been essentially finished by the mid-sixties for a finite interval of the time of observation and for stationary random processes with rational spectral density. Many attempts were made in the engineering literature to construct a generalization of the Wiener theory to the case of random fields. The reason is that such a theory is needed in many applications, e.g., TV and optical signal processing, geophysics, underwater acoustics, radiophysics, etc. The attempts to give an analytical estimation theory in the engineering literature (see [Ekstrom (1982)] and references therein) were based on some type of scanning, and the problem was not solved as an optimization problem for random fields.

The first analytical theory of random fields estimation and filtering, which is a generalization of Wiener's theory, has been developed in the series of papers [Ramm (1969); Ramm (1969b); Ramm (1969c); Ramm (1970b); Ramm (1970c); Ramm (1970d); Ramm (1971); Ramm (1971c); Ramm (1973c); Ramm (1975); Ramm (1976); Ramm (1978); Ramm (1978b); Ramm (1978c); Ramm (1978d); Ramm (1979); Ramm (1980b); Ramm (1980c); Ramm (1984b); Ramm (1985); Ramm (1987d); Ramm (2002); Ramm (2003); Kozhevnikov and Ramm (2005)] and in [Ramm (1980), Chapter 1]. This theory is presented in Chapters 2-4. Its applications are given in Chapter 7, and its generalizations to a wider class of random fields are given in Appendices A and B. The material in Chapter 3 is based on the paper [Ramm (1985)]. The material in Section 7.7 is taken from [Ramm (1980)], where a reference to the paper Katznelson, J. and Gould, L., Construction of nonlinear filters and control systems, Information and Control,


5, (1962), 108-143, can be found together with a critical remark concerning this paper. In Section 7.2 the paper [Ramm (1973b)] is used; in Section 7.3.4 the papers [Ramm (1968); Ramm (1973b); Ramm (1978); Ramm (1981); Ramm (1985b); Ramm (1984); Ramm (1987b); Ramm (1987c)] are used; the stable differentiation formulas (7.66), (7.67) and (7.71) were first given in [Ramm (1968)]; in Section 7.5 the papers [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)] are used. There is a large literature in which various aspects of the theory presented in Section 7.6 are discussed; see [Fedotov (1982)], [Ivanov et. al. (1978)], [Lattes and Lions (1967)], [Lavrentiev and Romanov (1986)], [Morozov (1984)], [Payne (1975)], [Tanana (1981)], [Tikhonov (1977)] and references therein. The presentation in Section 7.6 is self-contained and partly based on [Ramm (1981)].

The class R of random fields was introduced by the author in 1969 [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)]. It was found (see [Molchan (1975)], [Molchan (1974)]) that Gaussian random fields have the Markov property if and only if they are in the class R and P(λ) = 1 (see formula (1.10)).

Chapter 5 contains a singular perturbation theory for the class of integral equations basic in estimation theory. This chapter is based on [Ramm and Shifrin (2005)] (see also [Ramm and Shifrin (1991)], [Ramm and Shifrin (1993)], [Ramm and Shifrin (1995)]).

Random fields have been studied extensively [Adler (1981)], [Gelfand and Vilenkin (1968)], [Koroljuk (1978)], [Pitt (1971)], [Rosanov (1982)], [Vanmarcke (1983)], [Wong (1986)], [Yadrenko (1983)], but there is no intersection between the theory given in this book and the material presented in the literature. In the presentation of the material in Section 8.1.2 the author used the book [Berezanskij (1968)]. Theorem 8.1 in Section 8.1.1 is taken from [Mazja (1986), p. 60], and the method of obtaining the eigenfunction expansion theorem in Section 8.2 is taken from [Berezanskij (1968)]. There is a large literature on the material presented in Section 8.2.4. Of course, it is not possible in this book to cover this material in depth (and it was not our goal). Only some facts, useful for a better understanding of the theory presented in this book, are given. For second-order elliptic operators L a number of stronger conditions sufficient for L to be selfadjoint or essentially selfadjoint are known (see [Kato (1981)]).

The assumption that L is selfadjoint is basic for the eigenfunction expansion theory developed in [Berezanskij (1968)]. In some cases an eigenfunction expansion theory sufficient for our purposes can be developed


for certain non-selfadjoint operators (see [Ramm (1981b); Ramm (1981c); Ramm (1982); Ramm (1983)]). We have discussed the case when D, the domain of observation, is finite. For the Schrodinger operator the spectral and scattering theory in some domains with infinite boundaries is developed in [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987); Ramm (1988b)].

The material in Section 8.3.1 is well known. The proofs of the results about s-values can be found in [Gohberg and Krein (1969)].

The material in Section 8.3.2 belongs to the author [Ramm (1980); Ramm (1981b); Ramm (1981c)], and the proofs of all of the results are given in detail. The material in Section 8.3.3 is known, and proofs of the results can be found in [Gohberg and Krein (1969)], [Konig (1986)], and [Pietsch (1987)].

In Section 8.4 some reference material in probability theory and statistics is given. One can find much more material in this area in [Koroljuk (1978)]. The purpose of Chapter 6 is to explain some connections between estimation and scattering theory.

In the presentation of the scattering theory in Section 6.1 the papers [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d)] are used, where the scattering theory was developed for the first time in some domains with infinite boundaries. Most of the results in Section 6.1 are well known, except for Theorems 6.1 and 6.2, which are taken from [Ramm (1987e); Ramm (1987f); Ramm (1988); Ramm (1988c); Ramm (1989)]. It is not possible here to give a bibliography on scattering theory. Povsner (1953-1955) and then Ikebe (1960) studied the scattering problem in R³. Much work has been done since then (see [Hormander (1983-85)], vol. II, IV, and references therein). A short and self-contained presentation of the scattering theory given in Section 6.1 may be useful for many readers who would like to get quick access to basic results and do not want to worry about extra assumptions on the rate of decay of the potential.

Lemma 6.1 in Section 6.2 is well known, equation (6.13) is derived in [Newton (1982)], our presentation partly follows [Ramm (1987e); Ramm and Weaver (1987)], and Theorem 6.1 is taken from [Ramm (1987e)].

A connection between estimation and scattering theory for one-dimensional problems has been known for quite a while. In [Levy and Tsitsiklis (1985)] and [Yagle (1988)] some multidimensional problems of


estimation theory were discussed. In Section 6.3 some of the ideas from [Yagle (1988)] are used. Our arguments are given in more detail than in [Yagle (1988)].

The estimation problems discussed in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are problems in which the noise has a white component and the covariance function has a special structure. In [Levy and Tsitsiklis (1985)] it is assumed that r = 2 and R(x,y) = R(|x−y|), that is, the random field is isotropic, and in [Yagle (1988)] r ≥ 2 and Δ_x R(x,y) = Δ_y R(x,y). The objective in these papers is to develop a generalization of the Levinson recursion scheme for estimation in the one-dimensional case. The arguments in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are not applicable in the case when the noise is colored, that is, when there is no white component in the noise. There has been much work on the efficient inversion of Toeplitz matrices [Friedlander et. al. (1979)]. These matrices arise when one discretizes equation (3.4). However, as ε → 0, one cannot invert the corresponding Toeplitz matrix, since its condition number grows quickly as ε → 0. If ε = 1 in (3.4), then there are many efficient ways to solve equation (3.4). It would be of interest to compare the numerical efficiency of various methods. It may be that an iterative method, or a projection method, will be more efficient than the discretization method with equidistant nodes used together with an efficient method of inverting the resulting Toeplitz matrix.
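The growth of the condition number mentioned here can be observed directly. The following sketch rests on illustrative assumptions (the kernel e^{−|t|}/2, equidistant nodes on [0,1], and trapezoidal weights are mine, and the matrix below is a discretization of εI + R rather than of equation (3.4) itself); note that the matrix is Toeplitz only up to the endpoint weights.

```python
import numpy as np

# Discretize (eps*I + R)h = f on [0,1] with R(x,y) = exp(-|x-y|)/2 at equidistant nodes.
N = 300
x = np.linspace(0.0, 1.0, N)
w = (x[1] - x[0]) * np.ones(N); w[0] *= 0.5; w[-1] *= 0.5
T = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :])) * w[None, :]

for eps in [1.0, 1e-2, 1e-4, 0.0]:
    # The condition number grows rapidly as eps -> 0, since the integral operator is smoothing.
    print(eps, np.linalg.cond(eps * np.eye(N) + T))
```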

The results in Appendix A are taken from [Ramm (2003)] and the results in Appendix B are taken from [Kozhevnikov and Ramm (2005)].

The author has tried to make the material in this book accessible to a large audience. The material from the theory of elliptic pseudodifferential equations (see, e.g., [Hormander (1983-85)]) was not used. The class R of kernels is a subset of the set of pseudodifferential operators. The results obtained in this book concerning equations in the class R are final in the sense that an exact description of the range of the operators with kernels in the class R is given and analytical formulas for the solutions of these equations are obtained. The general theory of pseudodifferential operators does not provide analytical formulas for the solutions. It was possible to derive such formulas in this book because of the special structure of the kernels in the class R. In [Eskin (1981), §27] an asymptotic solution is obtained for a class of pseudo-differential equations with a small parameter.


Bibliography

Adler, R. (1981). The geometry of random fields, J. Wiley, New York.
Agmon, S. (1982). Lectures on exponential decay of solutions of second-order elliptic equations, Princeton Univ. Press, Princeton.
Akhieser, N. (1965). Lectures on approximation theory, Nauka, Moscow.
Aronszajn, N. (1950). Theory of reproducing kernels, Trans. Am. Math. Soc. 68, pp. 337-404.
Berezanskij, Yu. (1968). Expansions in eigenfunctions of selfadjoint operators, Amer. Math. Soc., Providence, RI.
Deimling, K. (1985). Nonlinear functional analysis, Springer Verlag, New York.
Ekstrom, M. (1982). Realizable Wiener filtering in two dimensions, IEEE Trans. on Acoustics, Speech and Signal Processing, 30, pp. 31-40.
Ekstrom, M. and Woods, J. (1976). Two-dimensional spectral factorization with applications in recursive digital filtering, ibid, 2, pp. 115-128.
Erkip, A. and Schrohe, E. (1992). Normal solvability of elliptic boundary-value problems on asymptotically flat manifolds, J. of Functional Analysis 109, pp. 22-51.
Eskin, G. (1981). Boundary value problems for elliptic pseudodifferential equations, Amer. Math. Soc., Providence, RI.
Fedotov, A. (1982). Linear ill-posed problems with random errors in the data, Nauka, Novosibirsk.
Friedlander, B., Morf, M., Kailath, T. and Ljung, L. (1979). New inversion formulas for matrices classified in terms of their distance from Toeplitz matrices, Linear Algebra and its Applications, 27, pp. 31-60.
Gakhov, F. (1966). Boundary-value problems, Pergamon Press, Oxford.
Glazman, I. (1965). Direct methods of qualitative spectral analysis of singular differential operators, Davey, New York.
Gohberg, I. and Krein, M. (1969). Introduction to the theory of linear nonselfadjoint operators, AMS, Providence.
Gilbarg, D. and Trudinger, N. (1977). Elliptic partial differential equations of second order, Springer Verlag, New York.
Gelfand, I. and Vilenkin, N. (1968). Generalized functions, vol. 4, Acad. Press, New York.


Grubb, G. (1990). Pseudo-differential problems in Lp spaces, Commun. in Partial Differ. Eq., 15 (3), pp. 289-340.
Grubb, G. (1996). Functional calculus of pseudodifferential boundary problems, Birkhauser, Boston.
Hormander, L. (1983-85). The analysis of linear partial differential operators, vol. I-IV, Springer Verlag, New York.
Ivanov, V., Vasin, V. and Tanana, V. (1978). Theory of linear ill-posed problems and applications, Nauka, Moscow.
Kato, T. (1995). Perturbation theory for linear operators, Springer Verlag, New York.
Kato, T. (1981). Spectral theory of differential operators, North Holland, Amsterdam (Ed. Knowles, I. and Lewis, R.), pp. 253-266.
Kato, T. (1959). Growth properties of solutions of the reduced wave equation, Comm. Pure Appl. Math., 12, pp. 403-425.
Kailath, T. (1974). A view of three decades of linear filtering theory, IEEE Trans. on Inform. Theory, IT-20, pp. 145-181.
Kantorovich, L. and Akilov, G. (1980). Functional analysis, Pergamon Press, New York.
Klibanov, M. (1985). On uniqueness of the determination of a compactly supported function from the modulus of its Fourier transform, Dokl. Acad. Sci. USSR, 32, pp. 668-670.
Konig, H. (1986). Eigenvalue distribution of compact operators, Birkhauser, Stuttgart.
Koroljuk, V., ed. (1978). Reference book in probability and statistics, Naukova Dumka, Kiev.
Kozhevnikov, A. (2001). Complete scale of isomorphisms for elliptic pseudodifferential boundary-value problems, J. London Math. Soc. (2) 64, pp. 409-422.
Kozhevnikov, A. and Ramm, A.G. (2005). Integral operators basic in random fields estimation theory, Intern. J. Pure and Appl. Math., 20, N3, pp. 405-427.
Kozlov, V., Maz'ya, V. and Rossmann, J. (1997). Elliptic boundary-value problems in domains with point singularities, AMS, Providence.
Krasnoselskii, M., et. al. (1972). Approximate solution of operator equations, Walters-Noordhoff, Groningen.
Lattes, R. and Lions, J. (1967). Methode de quasi-reversibilite et applications, Dunod, Paris.
Lavrentiev, M., Romanov, V. and Shishatskii, S. (1986). Ill-posed problems of mathematical physics and analysis, Amer. Math. Soc., Providence, RI.
Levitan, B. (1971). Asymptotic behavior of spectral function of elliptic equation, Russ. Math. Survey, 6, pp. 151-212.
Levy, B. and Tsitsiklis, J. (1985). A fast algorithm for linear estimation of two-dimensional random fields, IEEE Trans. Inform. Theory, IT-31, pp. 635-644.
Mazja, V. (1986). Sobolev spaces, Springer Verlag, New York.
Molchan, G. (1975). Characterization of Gaussian fields with Markov property, Sov. Math. Doklady, 12, pp. 563-567.
Molchan, G. (1974). L-Markov Gaussian fields, ibid. 15, pp. 657-662.


Morozov, V. (1984). Methods for solving incorrectly posed problems, Springer Verlag, New York.
Naimark, M. (1969). Linear differential operators, Nauka, Moscow.
Newton, R. (1982). Scattering of waves and particles, Springer Verlag, New York.
Payne, L.E. (1975). Improperly posed problems, Part. Dif. Equ., Regional Conf. Appl. Math., Vol. 22, SIAM, Philadelphia.
Pietsch, A. (1987). Eigenvalues and s-numbers, Cambridge Univ. Press, Cambridge.
Piterbarg, L. (1981). Investigation of a class of integral equations, Diff. Uravnenija, 17, pp. 2278-2279.
Pitt, L. (1971). A Markov property for Gaussian processes with a multidimensional parameter, Arch. Rat. Mech. Anal., 43, pp. 367-391.
Preston, C. (1967). Random fields, Lect. Notes in Math. N34, Springer Verlag, New York.
Ramm, A.G. (1963). Spectral properties of the Schrodinger operator in some domains with infinite boundaries. Doklady Acad. Sci. USSR, 152, pp. 282-285.
Ramm, A.G. (1963b). Investigation of the scattering problem in some domains with infinite boundaries I, II, Vestnik 7, pp. 45-66; 19, pp. 67-76.
Ramm, A.G. (1965). Spectral properties of the Schrodinger operator in some infinite domains, Mat. Sbor. 66, pp. 321-343.
Ramm, A.G. (1968). On numerical differentiation. Math., Izvestija vuzov, 11, 1968, 131-135. 40 # 5130.
Ramm, A.G. (1968b). Some theorems on analytic continuation of the Schrodinger operator resolvent kernel in the spectral parameter. Izv. Ac. Nauk Arm. SSR, Mathematics, 3, pp. 443-464.
Ramm, A.G. (1969). Filtering of nonstationary random fields in optical systems. Opt. and Spectroscopy, 26, pp. 808-812.
Ramm, A.G. (1969b). Apodization theory. Optics and Spectroscopy, 27, (1969), pp. 508-514.
Ramm, A.G. (1969c). Filtering of nonhomogeneous random fields. Ibid. 27, pp. 881-887.
Ramm, A.G. (1969d). Green's function study for differential equation of the second order in domains with infinite boundaries. Diff. eq. 5, pp. 1509-1516.
Ramm, A.G. (1970). Eigenfunction expansion for nonselfadjoint Schrodinger operator. Doklady, 191, pp. 50-53.
Ramm, A.G. (1970b). Apodization theory II. Opt. and Spectroscopy, 29, pp. 390-394.
Ramm, A.G. (1970c). Increasing of the resolution ability of the optical instruments by means of apodization. Ibid. 29, pp. 594-599.
Ramm, A.G. (1970d). On resolution ability of optical systems. Ibid., 29, pp. 794-798.
Ramm, A.G. (1971). Filtering and extrapolation of some nonstationary random processes. Radiotech. i Electron. 16, pp. 80-87.
Ramm, A.G. (1971b). Eigenfunction expansions for exterior boundary problems. Ibid. 7, pp. 737-742.

Ramm, A.G. (1971c). On multidimensional integral equations with the translation kernel, Diff. Eq., 7, pp. 2234-2239.

Ramm, A.G. (1972). Simplified optimal differentiators, Radiotech. i Electron., 17, pp. 1325-1328.

Ramm, A.G. (1973). On some class of integral equations, ibid., 9, pp. 931-941.

Ramm, A.G. (1973b). Optimal harmonic synthesis of generalized Fourier series and integrals with randomly perturbed coefficients, Radiotechnika, 28, pp. 44-49.

Ramm, A.G. (1973c). Discrimination of random fields in noises, Probl. Peredaci Informacii, 9, pp. 22-35. 48 # 13439.

Ramm, A.G. (1975). Approximate solution of some integral equations of the first kind, Diff. Eq., 11, pp. 582-586, 440-443.

Ramm, A.G. (1976). Investigation of a class of integral equations, Doklady Acad. Sci. USSR, 230, pp. 283-286.

Ramm, A.G. (1978). A new class of nonstationary processes and fields and its applications, Proc. 10 All-Union Sympos. "Methods of representation and analysis of random processes and fields", Leningrad, 3, pp. 40-43.

Ramm, A.G. (1978b). On eigenvalues of some integral equations, Diff. Equations, 15, pp. 932-934.

Ramm, A.G. (1978c). Investigation of a class of systems of integral equations, Proc. Intern. Congr. on Appl. Math., Weimar, DDR, pp. 345-351.

Ramm, A.G. (1978d). Investigation of some classes of integral equations and their application, in: "Abel inversion and its generalizations", ed. N. Preobrazhensky, Siberian Dep. of Acad. Sci. USSR, Novosibirsk, pp. 120-179.

Ramm, A.G. (1979). Linear filtering of some vectorial nonstationary random processes, Math. Nachrichten, 91, pp. 269-280.

Ramm, A.G. (1980). Theory and applications of some new classes of integral equations, Springer Verlag, New York.

Ramm, A.G. (1980b). Investigation of a class of systems of integral equations, Journ. Math. Anal. Appl., 76, pp. 303-308.

Ramm, A.G. (1980c). Analytical results in random fields filtering theory, Zeitschr. Angew. Math. Mech., 60, pp. T361-T363.

Ramm, A.G. (1981). Stable solutions of some ill-posed problems, Math. Meth. in Appl. Sci., 3, pp. 336-363.

Ramm, A.G. (1981b). Spectral properties of some nonselfadjoint operators, Bull. Am. Math. Soc., 5, N3, pp. 313-315.

Ramm, A.G. (1981c). Spectral properties of some nonselfadjoint operators and some applications, in: "Spectral theory of differential operators", Math. Studies, North Holland, Amsterdam, ed. I. Knowles and R. Lewis, pp. 349-354.

Ramm, A.G. (1982). Perturbations preserving asymptotics of spectrum with a remainder, Proc. A.M.S., 85, N2, pp. 209-212.

Ramm, A.G. (1983). Eigenfunction expansions for some nonselfadjoint operators and the transport equation, J. Math. Anal. Appl., 92, pp. 564-580.

Ramm, A.G. (1984). Estimates of the derivatives of random functions, J. Math. Anal. Appl., 102, pp. 244-250.

Ramm, A.G. (1984b). Analytic theory of random fields estimation and filtering, Proc. of the Intern. Sympos. on Mathematics in Systems Theory (Beer Sheva, 1983), Lecture Notes in Control and Inform. Sci. N58, Springer Verlag, pp. 764-773.

Ramm, A.G. (1985). Numerical solution of integral equations in a space of distributions, J. Math. Anal. Appl., 110, pp. 384-390.

Ramm, A.G. (1985b). Estimates of the derivatives of random functions II (with T. Miller), J. Math. Anal. Appl., 110, pp. 429-435.

Ramm, A.G. (1986). Scattering by obstacles, Reidel, Dordrecht.

Ramm, A.G. (1987). Sufficient conditions for zero not to be an eigenvalue of the Schrodinger operator, J. Math. Phys., 28, pp. 1341-1343.

Ramm, A.G. (1987b). Optimal estimation from limited noisy data, Journ. Math. Anal. Appl., 125, pp. 258-266.

Ramm, A.G. (1987c). Signal estimation from incomplete data, Journ. Math. Anal. Appl., 125, pp. 267-271.

Ramm, A.G. (1987d). Analytic and numerical results in random fields estimation theory, Math. Reports of the Acad. of Sci., Canada, 9, pp. 69-74.

Ramm, A.G. (1987e). Characterization of the scattering data in multidimensional inverse scattering problem, in: Inverse Problems: An Interdisciplinary Study (Ed. P. Sabatier), Acad. Press, New York, pp. 153-167.

Ramm, A.G. (1987f). Completeness of the products of solutions to PDE and uniqueness theorems in inverse scattering, Inverse Problems, 3, L77-L82.

Ramm, A.G. (1988). Multidimensional inverse problems and completeness of the products of solutions to PDE, J. Math. Anal. Appl., 134, 1, pp. 211-253.

Ramm, A.G. (1988b). Conditions for zero not to be an eigenvalue of the Schrodinger operator, J. Math. Phys., 29, pp. 1431-1432.

Ramm, A.G. (1988c). Recovery of potential from the fixed energy scattering data, Inverse Problems, 4, pp. 877-886.

Ramm, A.G. (1989). Multidimensional inverse scattering problems and completeness of the products of solutions to homogeneous PDE, Zeitschr. f. angew. Math. u. Mech., T305, N4-5, T13-T22.

Ramm, A.G. (1990). Random fields estimation theory, Longman Scientific and Wiley, New York, pp. 1-273.

Ramm, A.G. (1990b). Stability of the numerical method for solving the 3D inverse scattering problem with fixed energy data, Inverse Problems, 6, L7-L12; J. Reine Angew. Math., 414 (1991), pp. 1-21.

Ramm, A.G. (1990c). Is the Born approximation good for solving the inverse problem when the potential is small? J. Math. Anal. Appl., 147, pp. 480-485.

Ramm, A.G. (1991). Symmetry properties of scattering amplitudes and applications to inverse problems, J. Math. Anal. Appl., 156, pp. 333-340.

Ramm, A.G. (1992). Multidimensional inverse scattering problems, Longman, New York (Russian edition: Mir, Moscow, 1993).

Ramm, A.G. (1996). Random fields estimation theory, MIR, Moscow, pp. 1-352.

Ramm, A.G. (2002). Estimation of Random Fields, Theory of Probability and Math. Statistics, 66, pp. 95-108.

Ramm, A.G. (2003). Analytical solution of a new class of integral equations, Differential and Integral Equations, 16, N2, pp. 231-240.

Ramm, A.G. (2003a). On a new notion of regularizer, J. Phys. A, 36, pp. 2191-2195.

Ramm, A.G. (2004). One dimensional inverse scattering and spectral problems, Cubo Math. Journ., 6, N1, pp. 313-426.

Ramm, A.G. (2005). Inverse problems, Springer, New York.

Ramm, A.G. and Shifrin, E.I. (1991). Asymptotics of the solution to a singularly perturbed integral equation, Appl. Math. Lett., 4, pp. 67-70.

Ramm, A.G. and Shifrin, E.I. (1993). Asymptotics of the solutions to singularly perturbed integral equations, Journal of Mathematical Analysis and Applications, 178, No 2, pp. 322-343.

Ramm, A.G. and Shifrin, E.I. (1995). Asymptotics of the solutions to singularly perturbed multidimensional integral equations, Journal of Mathematical Analysis and Applications, 190, No 3, pp. 667-677.

Ramm, A.G. and Shifrin, E.I. (2005). Singular perturbation theory for a class of Fredholm integral equations arising in random fields estimation theory, Journal of Integral Equations and Operator Theory.

Ramm, A.G. and Weaver, O. (1987). A characterization of the scattering data in 3D inverse scattering problem, Inverse Problems, 3, L49-L52.

Ramm, A.G. and Weaver, O. (1989). Necessary and sufficient condition on the fixed energy data for the potential to be spherically symmetric, Inverse Problems, 5, pp. 445-447.

Roitberg, Ya. A. (1996). Elliptic boundary-value problems in the spaces of distributions, Kluwer, Dordrecht.

Rosanov, Yu. (1982). Markov random fields, Springer Verlag, New York.

Rudin, W. (1973). Functional Analysis, McGraw Hill, New York.

Safarov, Yu. and Vassiliev, D. (1997). The asymptotic distribution of eigenvalues of partial differential operators, American Mathematical Society, Providence, RI.

Saito, Y. (1982). Some properties of the scattering amplitude and the inverse scattering problem, Osaka J. Math., 19, pp. 527-547.

Schrohe, E. (1987). Spaces of weighted symbols and weighted Sobolev spaces on manifolds, in: Pseudo-Differential Operators, Cordes, H.O., Gramsch, B., and Widom, H. (eds.), Springer LN Math. 1256, Springer-Verlag, Berlin, pp. 360-377.

Schrohe, E. (1999). Frechet algebra techniques for boundary value problems on noncompact manifolds, Math. Nachr., 199, pp. 145-185.

Shubin, M. (1986). Pseudodifferential operators and spectral theory, Springer Verlag, New York.

Skriganov, M. (1978). High-frequency asymptotics of the scattering amplitude, Sov. Physics, Doklady, 241, pp. 326-329.

Somersalo, E. et al. (1988). Inverse scattering problem for the Schrodinger equation in three dimensions, IMA preprint 449, pp. 1-7.

Tanana, V. (1981). Methods for solving operator equations, Nauka, Moscow.

Tikhonov, A. and Arsenin, V. (1977). Solutions of ill-posed problems, Winston, Washington.

Tulovskii, V. (1979). Asymptotic distribution of eigenvalues of differential equations, Matem. Sborn., 89, pp. 191-206.

Vanmarcke, E. (1983). Random fields: analysis and synthesis, MIT Press, Cambridge.

Vishik, M.I. and Lusternik, L. (1962). Regular degeneration and boundary layer for linear differential equations with a small parameter, Amer. Math. Soc. Transl., 20, pp. 239-264.

Wloka, J.T. (1987). Partial differential equations, Cambridge University Press.

Wloka, J.T., Rowley, B. and Lawruk, B. (1995). Boundary value problems for elliptic systems, Cambridge University Press.

Wong, E. (1986). In search of multiparameter Markov processes, in: Communications and Networks, ed. I.F. Blake and H.V. Poor, Springer Verlag, New York, pp. 230-243.

Yadrenko, M. (1983). Spectral theory of random fields, Optimization Software, New York.

Yagle, A. (1988). Connections between 3D inverse scattering and linear least-squares estimation of random fields, Acta Appl. Math., 13, N3, pp. 267-289.

Yagle, A. (1988b). Generalized split Levinson, Schur, and lattice algorithms for 3D random fields estimation problem (preprint).

Zabreiko, P., et al. (1968). Integral equations, Reference Text, Nauka, Moscow.

Symbols

Spaces

H+ ⊂ H0 ⊂ H−, 239

C`(D), 236

Lp(D), Lp(D,µ), 233

W `,p(D), 233

H`(D), 238

D = C∞0 , 236

D′, 236

S, 236

S ′, 236

H`(D), 238

H−`(D), H−`(D), 239

V , 70

W , 71

H`(D) = H1(D) ∩ H`(D)

Classes of domains

C0,1

With cone property

EW `,p, 238

Classes of operators

σp, 243

σ1 trace class, 243

σ2 Hilbert-Schmidt class, 243

σ2(H1,H2)

L elliptic operator, 255

Special functions

Jν(x), 26

K0(x), 27

hn(r) spherical Hankel functions, 160

Yn(Θ) spherical harmonics, 160

δ(x) delta function, 4

Symbols used in the definition of class R

dρ spectral measure of L, 3

P (λ), Q(λ) polynomials, 3

p = degP (λ), 3

q = degQ(λ), 3

s = ordL, 3

R class of kernels, 3

Φ(x, y, λ) spectral kernel of L , 3

Λ spectrum of L, 3

Various symbols

Rr Euclidean r-dimensional space, 1

S2 the unit sphere in R3 , 161

s(x) useful signal, 1

n(x) noise, 1

U(x) observed signal, 1

h(x, y) optimal filter, 2

H0,H1 hypotheses, 163

`(u1, . . . , un) the likelihood ratio, 163

N (A) null-space of A, 275

RanA range of A, 275

D(A) domain of A, 275

σ(A) spectrum of A, 275

→ strong convergence, 39

⇀ weak convergence, 260

∫ = ∫Rr, 9

Index

Approximation of the kernel, 46

Asymptotic efficiency, 317

Bochner-Khintchine theorem, 304

Characteristic function, 304

Characterization of the scattering data, 138

Completeness property of the scattering solution, 131

Conditional mean value, 303

Correlation function, 306

Covariance function, 305

Direct scattering problem, 111

Distributions, 236

Elliptic estimate, 76

Estimate, Bayes’, 315

Estimate, efficient, 315

Estimate, maximum likelihood, 316

Estimate, minimax, 315

Estimate, unbiased, 315

Estimation in Hilbert Space, 310

Integral representation of random functions, 309

Inverse scattering problem, 134

Iterative method, 38

Mean value, 305

Mercer's theorem, 61

Moment functions, 305

Order of singularity, 3

Projection methods, 39

Random function, 305

Reproducing kernel, 59

Rigged triple of Hilbert spaces, 239

Singular support, 15

Sobolev spaces, 233

Solution of (2.12) of minimal order of singularity (mos solution), 14

Spectral density, 314

Spectral measure, 18

Stochastic integral, 309

Transmission problem, 15

Variance, 303

Weakly lower semicontinuous, 210
