NONLINEAR FUNCTIONAL ANALYSIS Jacob T Schwartz notes on mathematics and its applications GORDON AND BREACH SCIENCE PUBLISHERS


Nonlinear Functional Analysis

J. T. SCHWARTZ
Courant Institute of Mathematical Sciences
New York University

Notes by H. Fattorini, R. Nirenberg and H. Porta

with an additional chapter by Hermann Karcher

GORDON AND BREACH SCIENCE PUBLISHERS
NEW YORK LONDON PARIS


Copyright © 1969 by GORDON AND BREACH SCIENCE PUBLISHERS INC.
150 Fifth Avenue, New York, N. Y. 10011

Library of Congress Catalog Card Number: 68-25643

Editorial office for the United Kingdom:
Gordon and Breach Science Publishers Ltd.
12 Bloomsbury Way
London W.C.1.

Editorial office for France:
Gordon & Breach
7-9 rue Emile Dubois
Paris 14e

Distributed in Canada by:
The Ryerson Press
299 Queen Street West
Toronto 2b, Ontario

All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publishers. Printed in East Germany.


Editors' Preface

A large number of mathematical books begin as lecture notes; but, since mathematicians are busy, and since the labor required to bring lecture notes up to the level of perfection which authors and the public demand of formally published books is very considerable, it follows that an even larger number of lecture notes make the transition to book form only after great delay or not at all. The present lecture note series aims to fill the resulting gap. It will consist of reprinted lecture notes, edited at least to a satisfactory level of completeness and intelligibility, though not necessarily to the perfection which is expected of a book. In addition to lecture notes, the series will include volumes of collected reprints of journal articles as current developments indicate, and mixed volumes including both notes and reprints.

JACOB T. SCHWARTZ

MAURICE LÉVY


Contents

Introduction

Chapter I: Basic Calculus

Chapter II: Hard Implicit Functional Theorems

Chapter III: Degree Theory and Applications

Chapter IV: Morse Theory on Hilbert Manifolds

Chapter V: Category

Chapter VI: Applications of Morse Theory to Calculus of Variations in the Large

Chapter VII: Applications

Chapter VIII: Closed Geodesics on Topological Spheres

Index


Introduction

Nonlinear functional analysis is of course not so much a subject, as the complement of another subject, namely, linear functional analysis. In studying our negatively defined field, we will, however, find a certain unity; partly because we shall exclude from functional analysis those analytic theories which do not make use of the characteristic procedure of functional analysis. This characteristic procedure is, of course, the treatment of a given problem, or the construction or study of a desired function, by imbedding the problem or function into a space (generally infinite dimensional) of related problems or functions. In accordance with this distinction we shall, for example, regard much of the asymptotic study (by topological methods) of solutions of nonlinear differential equations as belonging to nonlinear analysis but not to nonlinear functional analysis, while, for instance, the Morse theory of geodesics, or the construction of solutions of partial differential equations by application of the Schauder fixed point theorem, will definitely be considered to belong to nonlinear functional analysis. The distinction suggested is not always clear-cut, however.

We may orient ourselves toward our subject of study as follows. Nonlinear functional analysis is nonlinear analysis in the context of infinite dimensional topological spaces, manifolds, etc. Naturally, our knowledge of nonlinear analysis in this case cannot be more complete than our knowledge of nonlinear analysis in the finite dimensional case. Therefore the finite dimensional case can serve as a model for the infinite dimensional case. We can formulate our aim as follows: to extend known theorems of nonlinear analysis from the finite to the infinite dimensional case; to analyze any particular difficulties, not present in the finite dimensional case, which arise in the infinite dimensional case.

Now, what are the main branches of nonlinear analysis in finitely many dimensions? They may be listed under five general headings:

1. Elementary calculus.

2. The implicit function theorem and related results.

3. Topological principles for establishing the existence of solutions to systems of equations: the Brouwer fixed point theorem, the theory of degree, the Jordan separation theorem, and, more generally, the Lefschetz fixed point theorem and the general topological intersection theory.

4. Topological theories for establishing the existence of critical points: the Morse critical point theory, and the Lusternik-Schnirelman "category" theory.

5. Theorems following by the powerful special methods of complex function theory.

We shall find infinite dimensional generalizations of theorems belonging to each of these five categories.

1. Elementary calculus goes over to B-spaces (and even slightly more general spaces) in a routine way. Integration theory is developed for vector-valued functions defined on a measure space in Linear Operators, Chapter 3, and contains no surprises. The proper notion of derivative (as already in two-dimensional spaces) is that of directional derivative or Gateaux derivative, which may be defined as follows. Let $\phi$ be a function mapping one B-space $X$ into another $Y$. Then if, for each $x, y \in X$, the function $\phi(x + ty)$ of the real variable $t$ is differentiable at $t = 0$, we say that $\phi$ is (Gateaux) differentiable, and write

\[ d\phi(x; y) = \frac{d}{dt}\, \phi(x + ty) \Big|_{t=0}. \]

A certain amount of basically elementary and unsurprising real variable theory is connected with this notion. Thus, for instance, under suitable hypotheses $d\phi(x; y)$ is linear in $y$; so that we may, if we like, speak of the derivative $d\phi(x)$ as a linear operator mapping $X$ into $Y$. Rather than study this elementary calculus for its own sake, we will develop results which belong to it as needed for other purposes.

2. The implicit function theorem in B-spaces exists in two versions. On the one hand, we have the classical "soft" implicit function theorem, which states that if $\phi$ is a mapping of a B-space $X$ into a space $Y$, if $\phi(0) = 0$, and if $\phi$ is continuously differentiable and $\phi'(0)$ is a bounded operator with a bounded inverse, then $\phi$ maps a neighborhood of zero (in $X$) homeomorphically onto a neighborhood of zero (in $Y$). This basic version of the theorem has several interesting variants, one of which is the so-called theory of "monotone" mappings. Another class of theorems closely related to this implicit function theorem form the so-called "bifurcation theory". The main idea of this latter theory may be explained as follows. Suppose that the solutions of a functional equation $\phi(x) = 0$ are to be studied in the vicinity of a given solution $x = 0$. (Here, $\phi$ is taken to be a differentiable mapping of a B-space $X$ into itself.) We may write $\phi(x) = x + Kx + \psi(x)$, where $|\psi(x)| = O(|x|^2)$ for $x$ near 0, and where $K$ is a linear transformation. If $(I + K)^{-1}$ exists as a bounded operator, then, by the implicit function theorem, $x = 0$ is an isolated zero. In the bifurcation theory, we suppose only that $K$ is compact, and wish to consider the case in which $(I + K)^{-1}$ does not exist. In this case, it follows by the Riesz theory of compact operators that $X$ decomposes as a direct sum $X = Y \oplus Z$ of two subspaces, both invariant under $K$, the second being finite dimensional, such that $(I + K)$ is a bounded mapping of $Y$ onto itself having a bounded inverse. Correspondingly, we may write $x = [y, z]$, and write the equation $\phi(x) = 0$ as a pair of equations:

\[ \phi_1(y, z) = (I + K)y + \psi_1(y, z) = 0 \]
\[ \phi_2(y, z) = (I + K)z + \psi_2(y, z) = 0. \]

By the implicit function theorem, the first equation may be solved for $y$ in terms of $z$: $y = Y(z)$. Substituting this solution into the second equation, we find that the solutions of the original equation $\phi(x) = 0$ are in one-to-one correspondence with the solutions of the equation $(I + K)z + \psi_2(Y(z), z) = 0$. This last equation, however, may be regarded as a finite system of equations in a finite number of variables, upon which all the resources of finite dimensional analysis may be brought to bear.
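The reduction just described can be tried out on a toy problem. The following is a minimal sketch, not taken from the notes: it assumes $X = R^2$, $K = \mathrm{diag}(0, -1)$ (so that $I + K$ is invertible on $Y = \mathrm{span}(e_1)$ and vanishes on $Z = \mathrm{span}(e_2)$), and two hypothetical quadratic-order terms `psi1`, `psi2` chosen only for illustration. The first equation is solved for $y = Y(z)$ by fixed-point iteration, and the reduced equation in the single variable $z$ is then evaluated.

```python
# Toy model on X = R^2 with K = diag(0, -1), so I + K = diag(1, 0).
# Y = span(e1) is where I + K is invertible; Z = span(e2) is its kernel.
# psi1 and psi2 are illustrative (hypothetical) terms of order |x|^2.

def psi1(y, z):
    return y * y + z * z

def psi2(y, z):
    return y * z + z ** 3

def solve_y(z, iterations=200):
    """Solve the first equation y + psi1(y, z) = 0 for y = Y(z).

    Uses the fixed-point iteration y <- -psi1(y, z), which is a
    contraction for small |z| (an assumption of this sketch).
    """
    y = 0.0
    for _ in range(iterations):
        y = -psi1(y, z)
    return y

def reduced(z):
    """Left side of the reduced equation (I + K) z + psi2(Y(z), z).

    On Z the operator I + K is zero, so only psi2 contributes.
    """
    return 0.0 * z + psi2(solve_y(z), z)

if __name__ == "__main__":
    for z in (-0.1, 0.0, 0.1):
        print(z, solve_y(z), reduced(z))
```

In this example the reduced function changes sign only at $z = 0$ on the sampled points, which is consistent with $x = 0$ being the only nearby zero; other choices of `psi1` and `psi2` would give genuinely branching solution sets.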

An introduction to the theory of bifurcation as outlined above may be found in Graves' article Remarks on singular points of functional equations, Trans. Amer. Math. Soc., V. 79, 150-157 (1955).

In addition to the "soft" version of the implicit function theorem described above, there exists in the functional-analytic case a separate "hard" version of the theorem. The precise statement of this second version of the implicit function theorem will be given in a later lecture. At present we shall only remark that this theorem applies even in cases where the Gateaux derivative of $\phi$ is unbounded as a linear operator, and has an unbounded linear inverse. The theorem is due to J. Nash: The imbedding problem for Riemannian manifolds, Ann. Math. 63, pp. 20-63 (1956). J. Moser (A new technique for the construction of solutions of non-linear differential equations, Proc. Nat. Acad. Sci. U.S.A., V. 47, 1961, pp. 1824-1831) made the useful observation that Nash's "hard" version of the implicit function theorem could be proved by an appropriate modification of the "Newton's Method" of finite dimensional analysis, the superrapid convergence of Newton's method compensating, in an appropriate sense, for the unboundedness of the Frechet derivative and its inverse. Moser has subsequently made interesting applications and extensions of this basic idea, cf. Moser: On invariant curves of area preserving mappings of an annulus, Göttinger Nachrichten, 1962, pp. 1-20, and subsequent publications. Cf. also a lecture by Serge Lang in the 1962 Séminaire Bourbaki.
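The "superrapid convergence" referred to here is the quadratic convergence of the ordinary Newton iteration. The following sketch shows only that finite dimensional phenomenon for a scalar equation; it is not the Nash-Moser scheme itself, and the test equation $x^2 - 2 = 0$ is an illustrative choice.

```python
def newton(f, df, x0, steps):
    """Return the Newton iterates x0, x1, ..., x_steps for f(x) = 0."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        # Newton step: solve the linearized equation f(x) + df(x) * h = 0.
        xs.append(x - f(x) / df(x))
    return xs

if __name__ == "__main__":
    f = lambda x: x * x - 2.0
    df = lambda x: 2.0 * x
    root = 2.0 ** 0.5
    for x in newton(f, df, 1.0, 5):
        # The error is roughly squared at every step.
        print(x, abs(x - root))
```

Each error is bounded by the square of the previous one in this example, so the number of correct digits roughly doubles per step; it is this rate that leaves room to absorb a loss at each step in the "hard" setting.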

3. Finite codimensional topology. The attentive reader will have observed that in our listing above of theorems of finite dimensional topology we have separated these theorems into two groups. This separation, somewhat unnatural in the finite dimensional case, is essential in the infinite dimensional case. Consider, for example, the Brouwer fixed point theorem. As is well known, this theorem is equivalent to the statement that the boundary of the unit sphere is not continuously deformable to a point on itself. In infinite dimensions, however, this statement is false. E.g., if we examine the boundary $\partial S$ of the unit sphere in the Hilbert space $L_2(0, 1)$, and follow the homotopy $f(x) \to f_t(x)$, $1 \ge t \ge 1/2$, where

\[ f_t(x) = t^{-1/2} f(x/t), \quad 0 \le x \le t; \qquad f_t(x) = 0, \quad t \le x \le 1, \]

by the homotopy $f_{1/2}(x) \to t f_{1/2}(x) + \sqrt{1 - t^2}\, \sigma(x)$, $1 \ge t \ge 0$, where $\sigma \in \partial S$ and $\sigma(x) = 0$ for $0 \le x \le 1/2$, we obtain a continuous deformation of $\partial S$ along itself, to the single point $\sigma$. This implies a set of topological consequences rather different from the corresponding results in finite dimensions. We owe to Schauder and Leray the important observation that the most familiar results of finite dimensional topology can be carried over to infinitely many dimensions if attention is restricted to the special category of maps $\phi$ having the form $\phi = I + \psi$, where $I$ is the identity, and $\psi$ is a mapping whose range is compact. Thus, for instance, if we confine our attention to this special category of maps, the boundary of the unit sphere is not continuously deformable to a single point along itself. Moreover, again for maps of this category, a straightforward generalization of the finite dimensional theory of degree can be established, and infinite dimensional generalizations of many of the basic theorems of finite dimensional topology obtained. As a basic reference, see Schauder and Leray: Topologie et equations fonctionelles, Ann. Sci. Ecole Norm. Sup. (3) 51 (1934) pp. 45-78. The infinite dimensional theory of degree is based upon the finite dimensional theory, which we shall develop by a simplified method patterned after the procedure of Heinz: An elementary analytic theory of degree in n-dimensional space, J. Math. Mech. 8 (1959) pp. 231-247.


An especially useful theorem belonging to this circle of ideas is the fixed point theorem of Schauder: any continuous mapping into itself of a compact convex set in a locally convex linear topological space possesses a fixed point. Cf. Schauder: Der Fixpunktsatz in Funktionalräumen, Studia Math. 2 (1936) pp. 171-180. Krein and Rutman (Uspekhi Math. Nauk 3, No. 1, pp. 3-95) give an interesting application of fixed point theory to the "projective space" of a B-space.

A connected account of many of the principal results in the type of functional topology discussed above is given by A. Granas: The theory of compact vector fields and some of its applications to topology of functional spaces (I). Rozprawy Mat. XXX, Warsaw 1962. Granas lays stress on the homotopy theory of compact maps and on the Borsuk antipodal point theorem, but avoids the theory of degree.

4. Finite dimensional topology. The second category of topological results available in functional spaces is distinguished by the fact that it makes reference to the ordinary singular homology and cohomology groups, defined similarly in the functional case and in the finite dimensional case. The Morse theory and the Lusternik-Schnirelman theory both begin with the same construction. A manifold is defined to be a topological space locally homeomorphic to a given B-space in such a way that the "transition mappings" between the various "local coordinate patches" which cover the manifold are infinitely often differentiable. On such a manifold, all the ordinary local notions of analysis such as directional derivative, differentiable function, etc., are available. Let $M$ be such a manifold, and let $f$ be a smooth real-valued function defined on $M$. A critical point of $f$ is by definition a point in $M$ at which the directional derivative of $f$ in every direction vanishes. If $M$ has a Riemannian metric, we may in the usual way define the gradient $\nabla f$ of $f$, which is a field of vectors tangent to $M$; in this case, the critical points of $f$ are the points $p$ where $\nabla f(p) = 0$. If there exists no critical point $p$ of $f$ such that $a \le f(p) \le b$ (and assuming certain additional, technical hypotheses), then the subsets $M_a = \{q \in M \mid f(q) \le a\}$ and $M_b = \{q \in M \mid f(q) \le b\}$ are diffeomorphic. To see this, we have only to note that if each point $q$ such that $a \le f(q) \le b$ is pushed down in the direction of the gradient field $\nabla f$ until $M_a$ is reached, we obtain the desired diffeomorphism. This statement is the first main lemma of the Morse theory. The second observation on which the Morse theory is built gives a corresponding result for the case in which $\{q \in M \mid a \le f(q) \le b\}$ contains an isolated set of critical points. In this case, and under the further assumption that the critical points are all nondegenerate in an appropriate sense, a closer analysis shows that the space $M_b$ is diffeomorphic to a space $M_a \cup H$ obtained from $M_a$ by affixing a certain collection of "handles". Thus the sequence of critical points of $f$ describes the construction of $M$ by the successive addition of "handles" to a "ball". This connection may be exploited in either of two directions: to conclude from the known topology of $M$ that any function $f$ defined on $M$ must admit critical points of certain numbers and types, or, conversely, to deduce information about the topology of $M$ from a knowledge of the critical points of some particular function on $M$.

If we let $M$ be the space of all smooth curves on a finite dimensional manifold $N$, and regard $M$ in an appropriate way as being an infinite dimensional manifold, then the general Morse theory outlined above reduces to the special Morse theory of geodesics.

A lucid account of the Morse theory, especially in the finite dimensional case, is to be found in Milnor: Morse theory, Ann. of Math., Study 51, Princeton 1963. The generalization to infinite dimensional manifolds is developed in Palais: Lectures on Morse theory, Notes, Harvard, 1963, to be republished in 1964 in the journal Topology. Palais gives the application of the general theory to the Morse geodesic theory, developing in detail an account of the necessary compactness properties of the infinite dimensional manifold $M$ and the function $f$ on it. Further applications of the general theory to establish the existence of higher type critical structures in theories of minimal surfaces, etc., are to be hoped for. We may also refer to a set of notes, entitled Lectures of Smale on Differential Topology (Columbia, 1963). These notes give extensions of various qualitative theorems of finite dimensional differential topology to the infinite dimensional case.

The Lusternik-Schnirelman theory of critical points agrees with the Morse theory in making use of the deformations along gradient curves on a manifold $M$. However, the methods of Lusternik-Schnirelman are more point set theoretic than those of Morse, and lead to more general but less precise results. If $A$ and $M$ are topological spaces, and $\phi$ maps $A$ into $M$ and is continuous, call $\phi$ a map of category 1 if it is homotopic to a constant map, and call $\phi$ a map of category $k$ if $A$ can be divided into $k$ sets $A_1, \ldots, A_k$, but no fewer, such that $\phi \mid A_i$ is of category 1. If $A \subset M$, the category $\mathrm{cat}(A)$ is the category of the identity map of $A$ into $M$. It is not hard to establish that $\mathrm{cat}(A) - 1$ is a lower bound for the topological dimension of $A$. If $f$ is a real valued function defined on $A$, and $m \le \mathrm{cat}(A)$, put

\[ c_m(f) = \inf_{\mathrm{cat}(B) \ge m} \, \sup \{ f(x) \mid x \in B \}. \]

Then $c_1(f) \le c_2(f) \le \cdots$. It may be shown, under suitable compactness hypotheses, that for each $m \le \mathrm{cat}(A)$ there exists a set $B_m \subset A$ such that $\mathrm{cat}(B_m) \ge m$, and $\sup \{ f(x) \mid x \in B_m \} = c_m(f)$. Were it the case that $A$ contained no critical point $q$ with $f(q) = c_m(f)$, we could push all the points $p \in B_m$ down in the direction of the gradient field $\nabla f$, obtaining a set $B_m'$ of category $m$ such that $\sup \{ f(x) \mid x \in B_m' \} < c_m(f)$, a contradiction. Thus we see that each value $c_m(f)$ must be a critical value of $f$. A refinement of this argument shows that if $c_n(f) = c_m(f) = c$ with $n \le m$, then $\{ x \in A \mid f(x) = c \text{ and } \nabla f(x) = 0 \}$ must be of category at least $m - n + 1$. Thus any smooth function on $A$ must admit at least $\mathrm{cat}(A)$ critical points. This last result makes it important to be able to establish lower bounds for the category of a space. We will see in a subsequent lecture that such results follow from an analysis of the singular cohomology ring of a topological space. For an introductory account of the theory of category and some of its applications, cf. Lusternik and Schnirelman, Méthodes topologiques dans les problèmes variationnels, Gauthier-Villars, Paris, 1934.

Chapter VIII of the present notes, generously contributed by Dr. Hermann Karcher*, gives an account of some of the Morse Theory of closed geodesics on manifolds which are topological spheres, according to methods stemming from Klingenberg.

5. The complex analytic case. A few results applying specifically to complex analytic functional mappings between complex linear spaces are known. In the first place, one has the usual elementary results guaranteeing the power series expansion of complex analytic mappings, etc. The bifurcation theory, where applicable, shows that the set of zeroes of an analytic functional equation $\phi(x) = 0$ is in one-to-one bianalytic correspondence with the set of zeroes of a similar set of analytic equations in a finite number of complex variables. A good deal is known about the structure of such analytic varieties, and the bifurcation theory enables one to carry all this information over to the functional case.

It follows readily from the definition of degree, in the cases where this definition is applicable, that the degree of a complex analytic map $x \to \phi(x)$ near any isolated zero is non-negative. According to an interesting theorem of Jane Cronin (cf. Cronin: Analytic Functional Mappings, Ann. Math. 58 (1953) pp. 175-181) the degree of such a zero is actually positive. This result, combined with the results available from the general theory of degree, leads to a principle of permanence of zeroes that generalizes the well-known theorem of Rouché to the functional case.

* The work of H. Karcher was supported at the Courant Institute of Mathematical Sciences, New York University, by the National Science Foundation under Grant NSF-GR8114.

6. Miscellany: In addition to the five principal categories of results outlined above, a variety of miscellaneous special results must be included in our subject. These will be noted as they arise in our subsequent lectures. In the present introduction, we shall note the work of Hammerstein (cf. Nichtlineare Integralgleichungen nebst Anwendungen, Acta Math., V. 54 (1930) pp. 117-176) on integral equations, in which the order properties of the integral operators studied are exploited. This work is related to the theory of monotone operators alluded to above.

We may also mention the existence of various investigations, notably those of E. Rothe, devoted to the variational method in functional analysis, i.e., to the possibility of solving functional equations $\phi(x) = 0$ by casting them into the form $\Phi(x) = \min$, where $\Phi$ is an appropriately selected functional.

While the literature on the subject of the present course of lectures is somewhat scattered, a number of useful books have appeared. We mention in the first place the book of Krasnoselskii: Topological methods in the theory of non-linear integral equations, Moscow, 1956, 392 pp. An English translation of a related survey article by Krasnoselskii appears in the AMS Translations, Ser. 2, No. 10, pp. 345-409. Krasnoselskii gives a good account of the available information on continuity and compactness of nonlinear integral operators of various forms, a good summary account of a number of other important topics in nonlinear theory, as well as an extensive bibliography. A less closely related, but still relevant work is the article Functional analysis and applied mathematics by Kantorovič in Uspekhi Math. Nauk 3 (No. 6) (1948), pp. 89-185, as well as this author's treatise Approximate methods of higher analysis. An account of the differential calculus in B-spaces is to be found in the book of Michal: Le calcul différentiel dans les espaces de Banach (V. I, Fonctions analytiques--Équations intégrales), Gauthier-Villars, 1958 (150 pp.), in the well-known treatise by Hille and Phillips on semigroups, and in the texts of advanced calculus by Dieudonné and by Serge Lang.

The reader wishing to extend his knowledge of nonlinear functional analysis beyond the necessarily limited material contained in the present notes will find it useful to consult the comprehensive survey article by James Eells: A Setting for Global Analysis, Bull. Amer. Math. Soc., v. 72, 1966, pp. 751-809. This excellent review may also serve as a guide to the literature of the subject.


CHAPTER I

Basic Calculus

A. Some Definitions and a Lemma on Topological Linear Spaces

B. Elementary Calculus

C. The "Soft" Implicit Function Theorem

D. The Hilbert Space Case

E. Compact Mappings

F. Higher Differentials and Taylor's Theorem

G. Complex Analyticity

H. Derivatives of Quadratic Forms

A. Some Definitions and a Lemma on Topological Linear Spaces

1.1. Definition: We say that $E$ is a topological linear space if $E$ is a linear space which is given a topology such that addition and multiplication by scalars are continuous functions, i.e.: $+ : E \times E \to E$ and $\cdot : E \times R \to E$ are continuous functions, where $E \times E$ and $E \times R$ have the product topology.

1.2. Definition: Let $E$ be a T.L.S. We say that $E$ is locally convex if there exists a family of convex sets $\{U\}$ which is a basis for the family of neighborhoods of 0.

(A set $K$ is called convex iff $x, y \in K$ implies $tx + (1 - t)y \in K$ for every $t \in [0, 1]$.)

1.3. Definition: A T.L.S. will be called an F-space, or Frechet space, if, as a topological space, it is metric and complete, with a topology given by a "norm" function $|x|$ which satisfies: (i) $|x|$ real $\ge 0$; (ii) $|x| = 0$ iff $x = 0$; (iii) $|x + y| \le |x| + |y|$. (See Linear Operators*, Chapter 2.) We shall write L.C.F.-space for locally convex F-spaces.

* Linear Operators, Nelson Dunford and Jacob T. Schwartz, Wiley-Interscience, Vol. I, 1958, Vol. II, 1963.


1.4. Definition: A T.L.S. will be called a Banach space, or a B-space, iff it is complete and its topology is given by a norm, which in addition to conditions (i), (ii) and (iii) of the above definition, satisfies (iv) $|\lambda x| = |\lambda|\,|x|$.

The spaces with which we shall ordinarily deal are L.C.F.-spaces.

1.5. Definition: Let $E$ be an F-space. We say that $K \subset E$ is bounded if for any neighborhood $U$ of 0 there exists $\varepsilon > 0$ such that $\varepsilon K \subset U$.

This condition is easily seen to be equivalent to the following one: $\varepsilon_n \to 0$ and $k_n \in K$ implies $\varepsilon_n k_n \to 0$.

We shall now prove a lemma relating L.C.F.-spaces to B-spaces.

1.6. Lemma: A L.C.F.-space is a B-space if it contains a bounded open set.

First, we note that boundedness is unaffected by translations, so we can assume that $0 \in U$, where $U$ is bounded and open. By definition of an L.C.F.-space, $U$ will contain a convex neighborhood $U'$ of 0, which a fortiori is bounded. Now, we can replace $U'$ by $V = U' \cap (-U')$, which is also convex, bounded, and a symmetric neighborhood of 0, i.e., $V = -V$. By the definition of a bounded set, the family $\{\varepsilon V\}$, $\varepsilon$ real $> 0$, is a neighborhood basis at 0. We consider now the support function of $V$, $p(x)$, defined by

\[ p(x) = \Big[ \sup_{tx \in V} |t| \Big]^{-1}. \]

(Obviously if in a B-space $V$ is the unit sphere, $p(x) = |x|$.) The function $p(x)$ has the four properties of a norm function:

(i) $p(x)$ real and $\ge 0$. Obvious. It is a finite number because $V$ is absorbing.

(ii) $p(x) = 0 \iff x = 0$. If $p(x) = 0$, then $\sup_{tx \in V} |t| = \infty$, and this means that $tx \in V$ for all $t$, because $V$ is convex. Hence $x \in \frac{1}{t} V$ for all $t > 0$, whence $x$ is in every neighborhood of 0, since $\{\frac{1}{t} V\}$ is a neighborhood basis. Therefore $x = 0$, because the space is Hausdorff.

(iii) $p(x + y) \le p(x) + p(y)$. It is apparent that $p(x)$ can be defined by $p(x) = \inf_{t > 0,\ x \in tV} t$. Let $\alpha, \beta > 0$ be such that $x \in \alpha V$ and $y \in \beta V$. Then $x + y \in \alpha V + \beta V$; since $V$ is convex, $\alpha V + \beta V = (\alpha + \beta) V$, whence $x + y \in (\alpha + \beta) V$. Therefore

\[ \inf_{t > 0,\ x + y \in tV} t \;\le\; \inf_{t > 0,\ x \in tV} t \;+\; \inf_{t > 0,\ y \in tV} t, \]

which is (iii).

(iv) $p(\alpha x) = |\alpha|\, p(x)$. If $\alpha > 0$, it is easy to see that $p(\alpha x) = \alpha p(x)$. But since $V$ is symmetric, (iv) holds for any $\alpha$, since $\alpha V = -\alpha V$.

Next we note that $V = \{x \mid p(x) < 1\}$ if we assume $V$ to be open. For then clearly $x \in V$ implies $p(x) < 1$. Also $p(x) < 1$ implies $x \in tV$ for some $t < 1$, i.e., $x = tv$, $v \in V$; since $V$ is convex, $x \in V$. We see at once then that $\varepsilon V = \{x \mid p(x) < \varepsilon\}$. Therefore $p(x)$ is continuous at 0, and consequently at every $x$.

Conversely, for any $\varepsilon > 0$, there exists $\delta > 0$ such that $p(x) < \delta$ implies $|x| < \varepsilon$. Simply choose $\delta$ so that $\delta V \subset S_\varepsilon$, where $S_\varepsilon = \{x \mid |x| < \varepsilon\}$. This is possible since $\{\varepsilon V\}$ is a neighborhood basis at 0.

Thus we have shown that $|\cdot|$ and $p(x)$ determine the same topology; hence $p(x)$ is the required norm.

Q.E.D.
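The function $p(x) = \inf \{ t > 0 : x \in tV \}$ used in this proof can be computed numerically in a simple case. The sketch below assumes a finite dimensional stand-in: $V$ is the open set $|x_1| + |x_2| < 1$ in the plane, for which $p$ should come out as the $l_1$ norm. The names `gauge` and `in_diamond` are hypothetical and serve only this illustration.

```python
def gauge(in_V, x, hi=1e6, steps=100):
    """Approximate p(x) = inf{t > 0 : x in t V} by bisection.

    in_V is a membership test for a convex, symmetric, bounded open
    neighborhood V of 0 in R^n, and p(x) is assumed to be below hi.
    """
    lo = 0.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        # x lies in mid * V exactly when x / mid lies in V.
        if in_V([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_diamond(v):
    """Membership test for the open set V = {|v1| + |v2| < 1}."""
    return abs(v[0]) + abs(v[1]) < 1.0

if __name__ == "__main__":
    x, y = [1.0, 2.0], [-0.5, 0.25]
    s = [a + b for a, b in zip(x, y)]
    print(gauge(in_diamond, x), gauge(in_diamond, y), gauge(in_diamond, s))
```

The printed values can be compared against properties (iii) and (iv): the gauge of the sum does not exceed the sum of the gauges, and scaling $x$ by $-2$ doubles the gauge.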

B. Elementary Calculus

1.7. Definition: Let $X$ and $Y$ be T.L.S. Let $U$ be an open subset of $X$ and $f : U \to Y$. We say that $f$ has a Gateaux derivative $df(x, y)$ at $x \in U$ iff

\[ \frac{d}{dt} f(x + ty) \Big|_{t=0} = df(x, y) \]

exists for every $y \in X$.

We call this derivative the derivative of $f$ at $x$ in the direction $y$, and shall write it often as $(df(x))(y)$ or $(f'(x))(y)$.
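Definition 1.7 can be checked numerically in finite dimensions. The following sketch estimates $df(x, y)$ by a central difference in $t$ and compares it with a hand-computed derivative; the map `f` on $R^2$ and the helper names are illustrative assumptions, not objects from the notes.

```python
def gateaux(f, x, y, h=1e-6):
    """Central-difference estimate of df(x, y) = d/dt f(x + t y) at t = 0."""
    xp = [xi + h * yi for xi, yi in zip(x, y)]
    xm = [xi - h * yi for xi, yi in zip(x, y)]
    return [(a - b) / (2.0 * h) for a, b in zip(f(xp), f(xm))]

def f(x):
    """Illustrative map f : R^2 -> R^2, f(x) = (x1^2 + x2, x1 * x2)."""
    return [x[0] ** 2 + x[1], x[0] * x[1]]

def df_exact(x, y):
    """Closed form: df(x, y) = (2 x1 y1 + y2, x2 y1 + x1 y2)."""
    return [2.0 * x[0] * y[0] + y[1], x[1] * y[0] + x[0] * y[1]]

if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, -1.0]
    print(gateaux(f, x, y), df_exact(x, y))
```

For this smooth map the directional derivative is linear in $y$, as the closed form shows; that linearity is a property of the example, not a consequence of Definition 1.7 alone.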

1.8. Definition: Let $X$ and $Y$ be T.L.S. and let $\phi : U \to Y$, where $U$ is a neighborhood of 0 in $X$. We say that $\phi$ is horizontal at 0 if for each neighborhood $V$ of 0 in $Y$ there exists a neighborhood $U'$ of 0 in $X$, and a function $o(t)$ (that is, a real function with $o(t)/t \to 0$ as $t \to 0$) such that

\[ \phi(tU') \subset o(t)\, V. \]

1.9. Definition: Let $X$, $Y$ be T.L.S. and $U$ open in $X$. Let $f : U \to Y$ and $x_0 \in U$. We say that $f$ is Frechet differentiable, or F-differentiable at $x_0$, if there exists a continuous linear map $A : X \to Y$ such that if we write

\[ f(x_0 + y) = f(x_0) + Ay + \phi(y) \]

then $\phi$ is horizontal at 0.

We call $A$ the derivative of $f$ at $x_0$, and we write it $df(x_0, y)$ as in Definition 1.7.

1.10. Remark: If the spaces are B-spaces, then the definition of a function horizontal at 0 is equivalent to

\[ |\phi(x)| \le |x|\, \psi(x) \]

where $\psi$ is real valued and $\lim_{x \to 0} \psi(x) = 0$. Thus, in a B-space, the condition of F-differentiability can be expressed as follows:

\[ f(x_0 + y) = f(x_0) + Ay + o(|y|). \]

1.11. Remark: If a linear function $A$ is horizontal at 0, then $A = 0$, as follows at once from the definition. Thus we see that the F-derivative of a function is unique, because if

\[ f(x_0 + y) = f(x_0) + Ay + \phi(y) \]

and

\[ f(x_0 + y) = f(x_0) + By + \phi'(y) \]

where $A$ and $B$ are continuous and linear, and $\phi$ and $\phi'$ are horizontal at 0, then $A - B = \phi' - \phi$ is horizontal at 0 (sums and differences of functions horizontal at 0 are still horizontal at 0), whence $A = B$.

1.12. Remark: The domain of $f$ in the definition of G-differentiability can be assumed to be a "finitely open set in $X$", where $X$ is simply a linear space. Also, for complex spaces, it is easy to see that the Gateaux derivative is always linear in $y$, and that the hypothesis of linearity is also unnecessary for F-derivatives. (Cf. Hille and Phillips [1], Sections 3.13 and 26.3.)

In the case $X$ is a B-space, it is easy to show that F-differentiability implies Gateaux differentiability.

1.13. Lemma: Let $f : U \to Y$, where $U$ is open in a B-space $X$ and $Y$ is a T.L.S. Then if $f$ has an F-derivative at $x_0$, it also has a Gateaux derivative at $x_0$, and they are equal.

Proof: We writef(xo + ty) - f(xo) = My + o (I tyl ),

where A is linear and continuous. But o(Ityl) = o(Itl lyl) and

Q.E.D.lim 1 (f(x0 + ty) -f(xo)) - Ay.f40 t

The next lemma gives the chain rule for F-derivatives.

1.14. Lemma: If $f : U \to V$ is F-differentiable at $x_0$, and $g : V \to W$ is F-differentiable at $f(x_0)$, then $g(f(x))$ is F-differentiable at $x_0$, and its derivative is given by:

$$(d(gf))(x_0, y) = dg\,(f(x_0), df(x_0, y)).$$

Here $U$, $V$ and $W$ are open sets contained in $X$, $Y$, $Z$, which are T.L.S.


Proof: We have only to write:

$$g[f(x_0 + y)] = g[f(x_0) + df(x_0, y) + \phi(y)]$$

$$= g[f(x_0)] + dg[f(x_0), df(x_0, y)] + dg[f(x_0), \phi(y)] + \psi[df(x_0, y) + \phi(y)],$$

where $\phi$ and $\psi$ are horizontal at 0. It is easy to see that the last term is horizontal at 0 as a function from $X$ to $Z$. We note that if $\phi$ is horizontal at 0, and if $A$ is linear and continuous, then $A \circ \phi$ is also horizontal at 0. This follows immediately from the definition of horizontality. Thus $dg[f(x_0), \phi(y)]$ is horizontal at 0, so $g(f(x))$ is F-differentiable at $x_0$, and its derivative is $dg[f(x_0), df(x_0, y)]$.

Q.E.D.

We next prove another lemma relating Gateaux and F-differentiability.

1.15. Lemma: Let $X$ and $Y$ be B-spaces, $U$ open in $X$ and $f : U \to Y$. If $f$ has a Gateaux derivative $f'(x, y)$ in $U$, which is linear in the variable $y$, and if, when regarded as a linear operator, $f'(x)$ is bounded for $x \in U$ and depends continuously on $x$ in the uniform topology, then $f$ is F-differentiable in $U$.

Proof: Our point of departure is the formula

$$\frac{d}{dt} f(x + ty) = f'(x + ty)(y)$$

which one can prove easily. It follows that

$$f(x + y) = f(x) + \int_0^1 f'(x + ty)(y)\,dt = f(x) + f'(x)(y) + \int_0^1 [f'(x + ty) - f'(x)](y)\,dt.$$

But now:

$$\left| \int_0^1 [f'(x + ty) - f'(x)](y)\,dt \right| \le |y| \int_0^1 |f'(x + ty) - f'(x)|\,dt = |y|\,o(1) = o(|y|).$$

Q.E.D.

1.16. Remark: In the last lemma we integrated functions of a real (or complex) variable with values in a B-space (cf., for example, Hille and Phillips [1], Chapter III). The basic fact we used was that

$$\left| \int_S f(s)\,d\mu(s) \right| \le \int_S |f(s)|\,d\mu(s).$$


This is not true in general for F-spaces, because its proof depends upon the inequality $|\sum a_i f_i| \le \sum |a_i|\,|f_i|$. In L.C.F. spaces, however, one can define weak integration by appropriate use of linear functionals. (Cf. loc. cit.) Local convexity implies separation theorems which assure the uniqueness of the integral.

1.17. Lemma: (Contracting Mapping Principle.) Let $X$ be a complete metric space and $\phi : U \to X$, $U$ open in $X$, and assume $\varrho(\phi(x), \phi(y)) \le \alpha\,\varrho(x, y)$ with $0 \le \alpha < 1$, where $\varrho(x, y)$ is the distance between $x$ and $y$. Moreover, suppose there exists $z_0 \in U$ such that $\varrho(z_0, X - U) > M$, and $\varrho(z_0, \phi(z_0)) < M(1 - \alpha)$. Then there exists a fixed point $z_\infty = \phi(z_\infty)$ such that $\varrho(z_0, z_\infty) < M$.

Proof: $\varrho(z_0, \phi(z_0)) < M(1 - \alpha) < M < \varrho(z_0, X - U)$, so $\phi(z_0)$ is also in $U$, and inductively $\phi^2(z_0), \ldots, \phi^n(z_0), \ldots$ are all in $U$, where $\phi^n(z_0) = \phi(\phi^{n-1}(z_0))$. The sequence $z_0, \phi(z_0), \ldots, \phi^n(z_0), \ldots$ is Cauchy, as follows from the contracting hypothesis. Hence we can set $z_\infty = \lim_{n \to \infty} \phi^n(z_0)$. By the continuity of $\phi$, $\phi(z_\infty) = z_\infty$. The formula

$$\varrho(z_0, \phi^n(z_0)) < M(1 - \alpha^n)$$

is easily proved by induction on $n$. Then

$$\varrho(z_0, z_\infty) = \varrho(z_0, \lim \phi^n(z_0)) = \lim \varrho(z_0, \phi^n(z_0)) < M.$$

Q.E.D.
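The iteration used in this proof can be tried numerically. The following is a minimal sketch, assuming the real line with the usual distance and a hypothetical contraction $\phi(x) = \frac{1}{2}\cos x$ (so $\alpha = 1/2$); neither the map nor the helper name comes from the text. It iterates $z \to \phi(z)$ from $z_0 = 0$ and compares the distance from $z_0$ to the limit with the geometric-series bound $\varrho(z_0, \phi(z_0))/(1 - \alpha)$.

```python
import math


def iterate_to_fixed_point(phi, z0, tol=1e-12, max_iter=10_000):
    """Iterate z -> phi(z) from z0 until successive iterates differ by < tol."""
    z = z0
    for _ in range(max_iter):
        z_next = phi(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("iteration did not converge")


# Hypothetical contraction on the real line: |phi'(x)| = |sin x| / 2 <= 1/2.
def phi(x):
    return 0.5 * math.cos(x)


alpha = 0.5
z0 = 0.0
z_inf = iterate_to_fixed_point(phi, z0)

# Bound obtained by summing the geometric series in the proof.
bound = abs(z0 - phi(z0)) / (1.0 - alpha)
print(z_inf, abs(z_inf - z0), bound)
```

Any $M$ with $\varrho(z_0, \phi(z_0)) < M(1 - \alpha)$ then satisfies $\varrho(z_0, z_\infty) < M$, as the lemma asserts.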

C. The "soft" Implicit Function Theorem

1.18. Lemma: Let $X$ be an F-space, $U$ the sphere $\{x : |x| < r\}$, and $\phi : U \to X$ such that $\phi(x) = x + \psi(x)$, where $\psi(x)$ satisfies:

$$|\psi(x) - \psi(y)| \le \alpha\,|x - y| \text{ with } 0 \le \alpha < 1, \text{ and } \psi(0) = 0.$$

Then: (i) $\phi(U)$ covers a sphere of radius $r(1 - \alpha)$ about 0. (ii) $\phi$ is one-to-one and the inverse $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/(1 - \alpha)$.

Proof: (i) We apply the last lemma to the function $f(x) = -\psi(x) + p$ where $p \in X$ and $|p| < r(1 - \alpha)$. If we put $z_0 = 0$, this inequality implies that $|z_0 - f(z_0)| = |p| < r(1 - \alpha)$. Hence there exists a point $z_\infty$ in $U$ such that $z_\infty = -\psi(z_\infty) + p$, i.e., $\phi(z_\infty) = p$. (ii) Suppose $\phi(x) = x + \psi(x) = p$


and $\phi(y) = y + \psi(y) = q$. Then $x - y + \psi(x) - \psi(y) = p - q$, and $|x - y| - |\psi(x) - \psi(y)| \le |p - q|$, so $(1 - \alpha)\,|x - y| \le |p - q|$, and we are done.

Q.E.D.
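The argument of part (i) is constructive: to solve $\phi(x) = p$ one iterates $x \to p - \psi(x)$. Here is a small sketch of that iteration, assuming $X = R^2$ with the Euclidean norm, $r = 1$, and a hypothetical perturbation $\psi(x_1, x_2) = (x_2^2/4, x_1^2/4)$, whose Lipschitz constant on the unit sphere is at most $\alpha = 1/2$. The map and the helper names are illustrative assumptions, not part of the text.

```python
import math

ALPHA = 0.5  # Lipschitz constant of psi on the unit ball (assumed example)


def psi(x):
    """Hypothetical perturbation with psi(0) = 0 and Lipschitz constant <= 1/2 for |x| < 1."""
    x1, x2 = x
    return (0.25 * x2 * x2, 0.25 * x1 * x1)


def phi(x):
    """phi = identity + psi."""
    px = psi(x)
    return (x[0] + px[0], x[1] + px[1])


def norm(x):
    return math.hypot(x[0], x[1])


def solve(p, tol=1e-13, max_iter=1000):
    """Solve phi(x) = p by iterating x -> p - psi(x), starting from 0."""
    z = (0.0, 0.0)
    for _ in range(max_iter):
        pz = psi(z)
        z_next = (p[0] - pz[0], p[1] - pz[1])
        if norm((z_next[0] - z[0], z_next[1] - z[1])) < tol:
            return z_next
        z = z_next
    raise RuntimeError("iteration did not converge")


# Two right-hand sides inside the sphere of radius r(1 - alpha) = 0.5.
p = (0.3, -0.2)
q = (-0.1, 0.25)
x = solve(p)
y = solve(q)

residual = norm((phi(x)[0] - p[0], phi(x)[1] - p[1]))
dist_xy = norm((x[0] - y[0], x[1] - y[1]))
dist_pq = norm((p[0] - q[0], p[1] - q[1]))
print(residual, dist_xy, dist_pq / (1.0 - ALPHA))
```

The last line compares $|x - y|$ with $|p - q|/(1 - \alpha)$, the Lipschitz estimate of part (ii).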

1.19. Corollary: If $X$ is a B-space and if, in the notation of the above lemma, $\psi'(x)$ exists and $|\psi'(x)| \le \alpha < 1$ in $U$, and $\psi(0) = 0$, then (i) and (ii) are true.

Proof: We only have to note that

$$|\psi(x) - \psi(y)| \le \int_0^1 |\psi'(y + t(x - y))|\,|x - y|\,dt \le \alpha\,|x - y|.$$

Q.E.D.

Now we can prove the following important theorem:

1.20. Theorem: (Implicit function theorem.) Let $X$, $Y$ be B-spaces and $\phi : U \to Y$, where $U$ is an open neighborhood of 0 in $X$ and $\phi(0) = 0$.

Assume: (a) $\phi$ is F-differentiable in $U$. (b) $\phi'(x)$ depends continuously on $x$ in the uniform operator topology. (c) $\phi'(0)$ is a bounded linear map with a bounded linear inverse. Then $\phi$ maps a sufficiently small neighborhood of zero homeomorphically onto a neighborhood of zero.

Proof: Let $A = \phi'(0)$. We put $\eta = A^{-1} \circ \phi$. Then $\eta : U \to X$, $\eta$ has an F-derivative $\eta'(x)$ which is continuous in $x$ in the uniform operator topology, and $\eta'(0) = I$, the identity operator. Let $\psi = \eta - I$. Then $\psi'(0) = 0$, and

$$\psi(x) - \psi(y) = \eta(x) - \eta(y) - (x - y) = \int_0^1 \eta'(y + t(x - y))(x - y)\,dt - (x - y) = \int_0^1 \bigl(\eta'(y + t(x - y)) - I\bigr)(x - y)\,dt.$$

Thus

$$|\psi(x) - \psi(y)| \le |x - y| \int_0^1 |\eta'(y + t(x - y)) - I|\,dt.$$

But we can make the integral on the right of the last formula less than some fixed $\alpha < 1$, by taking $x$ and $y$ in a sufficiently small neighborhood $V \subset U$ of 0. Then the preceding lemma applies to $\eta$, and, a fortiori, $\phi = A\eta$ maps $V$ homeomorphically onto a neighborhood of 0 in $Y$.

Q.E.D.

1.21. Corollary: Given the conditions of 1.20, the inverse map $\phi^{-1} : \phi(V) \to V$ is F-differentiable. Setting $\psi = \phi^{-1}$, we have, for $y \in \phi(V)$, the formula

$$\psi'(y) = \bigl(\phi'(\phi^{-1}(y))\bigr)^{-1}.$$


Proof: If $\phi(x_1) = y_1$, $\phi(x_2) = y_2$, then

$$|\phi^{-1}(y_2) - \phi^{-1}(y_1) - (\phi'(x_1))^{-1}(y_2 - y_1)|$$

$$= |(\phi'(x_1))^{-1}\bigl(\phi'(x_1)(x_2 - x_1) - (y_2 - y_1)\bigr)|$$

$$\le |(\phi'(x_1))^{-1}|\;|\phi'(x_1)(x_2 - x_1) - \phi(x_2) + \phi(x_1)|.$$

This last expression is $o(|x_2 - x_1|)$, whence, $\phi^{-1}$ being Lipschitz, the first expression is $o(|y_2 - y_1|)$, and the result follows.

Q.E.D.
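The formula of Corollary 1.21 can be checked numerically in one dimension. The sketch below assumes the hypothetical map $\phi(x) = x + x^3/3$ on the real line (so $\phi(0) = 0$, $\phi'(x) = 1 + x^2$), inverts it with Newton's method, which is a convenience chosen here and not the argument of the text, and compares a central difference quotient of $\psi = \phi^{-1}$ with $(\phi'(\psi(y)))^{-1}$.

```python
def phi(x):
    """Hypothetical scalar example with phi(0) = 0 and phi'(0) = 1."""
    return x + x ** 3 / 3.0


def dphi(x):
    return 1.0 + x * x


def phi_inverse(y, tol=1e-14, max_iter=100):
    """Invert phi by Newton's method (a numerical convenience, not the book's proof)."""
    x = y
    for _ in range(max_iter):
        step = (phi(x) - y) / dphi(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


y0 = 0.5
h = 1e-5

# Central difference approximation of psi'(y0), where psi = phi^{-1}.
numeric = (phi_inverse(y0 + h) - phi_inverse(y0 - h)) / (2.0 * h)

# Value predicted by the corollary: (phi'(phi^{-1}(y0)))^{-1}.
formula = 1.0 / dphi(phi_inverse(y0))
print(numeric, formula)
```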

An induction argument easily yields the fact that if $\phi$ has derivatives of higher order (definition given later), then so does $\phi^{-1}$. Similarly, if $\phi$ depends continuously on some parameter, so does $\phi^{-1}$.

The following theorem is a global version of the "local" implicit function theorem:

1.22. Theorem: Let $X$ and $Y$ be B-spaces, and $\phi : X \to Y$ a continuously F-differentiable function, and suppose $\phi'$ is invertible (as a linear operator) at every $x \in X$, and moreover, that $|[\phi'(x)]^{-1}| \le K < \infty$ uniformly in $x$. Then $\phi$ is a homeomorphism of $X$ onto $Y$.

The proof will depend on the following lemma :

1.23. Lemma: Under the same hypothesis as the theorem, if $\Delta$ is the square $0 \le s \le 1$, $0 \le t \le 1$, and if $F(s, t)$ satisfies the conditions:

(i) $F(s, t) : \Delta \to Y$.

(ii) $F(s, t)$ is continuous in $(s, t)$ and for every fixed $s$, $0 \le s \le 1$, $F(s, t)$ is F-differentiable in $t$.

(iii) $F(s, t)$ has fixed endpoints, i.e. there exist $y_0, y_1 \in Y$ such that $F(s, 0) = y_0$, $F(s, 1) = y_1$ for all $0 \le s \le 1$.

Then there exists a function $G(s, t)$ from $\Delta$ to $X$ which also satisfies (ii) and in addition

$$\phi[G(s, t)] = F(s, t) \text{ for all } (s, t) \in \Delta.$$

Proof of the lemma: By the local implicit function theorem, there exist neighborhoods $V$ of $y_0$ and $U$ of $x_0$ (where $\phi(x_0) = y_0$) such that $\phi$ is a homeomorphism of $U$ onto $V$. Then, for sufficiently small $\varepsilon$, we can define $G(s, t)$ as $\phi^{-1}(F(s, t))$ if $0 \le t \le \varepsilon$ and for $0 \le s \le 1$. We call $\alpha$ the largest of the values such that $G(s, t)$ can be defined in the rectangle $0 \le t < \alpha$, $0 \le s \le 1$. Assume $\alpha < 1$. If $G(s, t)$ is defined for $t = \alpha$, consider the curve $G(s, \alpha)$ and its image $\phi(G(s, \alpha)) = F(s, \alpha)$. For each $s$, $0 \le s \le 1$, we can select a neighborhood $U_s$ of $G(s, \alpha)$ and a neighborhood $V_s$ of $F(s, \alpha)$


such that $\phi$ is a homeomorphism of $U_s$ onto $V_s$. But $G(s, \alpha)$, $0 \le s \le 1$, is compact, and therefore there exists a finite subcovering of the curve $G(s, \alpha)$ with neighborhoods $U_{s_i}$, $i = 1, \ldots, n$. In each of these neighborhoods, we can define the function $G(s, t)$ for all $s$ and $0 \le t < \alpha + \varepsilon_{s_i}$ by the local implicit function theorem. So $G(s, t)$ can be defined for the rectangle $0 \le s \le 1$, $0 \le t < \alpha + \min \varepsilon_{s_i}$, contradicting the fact that $\alpha$ was the largest of such numbers. Now $G(s, t)$ must be F-differentiable in $t$, for $F(s, t)$ satisfies (ii) and $\phi^{-1}$ is locally F-differentiable. By the chain rule, we have for all $0 \le s \le 1$:

$$\phi'[G(s, t)]\,G'(s, t) = F'(s, t)$$

where the prime denotes differentiation with respect to $t$. So:

$$G'(s, t) = [\phi'(G(s, t))]^{-1} F'(s, t)$$

and

$$|G'(s, t)| \le |[\phi'(G(s, t))]^{-1}|\,|F'(s, t)| \text{ for all } 0 \le s \le 1.$$

Then

$$|G'(s, t)| \le K\,|F'(s, t)| \le A_s.$$

Now, integrating with respect to $t$ between $t_0$ and $t_1$, we get:

$$|G(s, t_1) - G(s, t_0)| \le A_s\,|t_1 - t_0|,$$

a Lipschitz condition for $G(s, t)$. Therefore $\lim_{t \to \alpha} G(s, t)$ exists for each $s$, and $G(s, t)$ can be defined at $t = \alpha$. We have proved that $\alpha = 1$.

Q.E.D.

Proof of the theorem: Let $y_0 = \phi(0)$, and $y$ any point in $Y$. Consider the straight line segment joining $y_0$ and $y$; we write it $y(t)$, $0 \le t \le 1$. As a particular case of the lemma, there exists a curve $x(t)$, $0 \le t \le 1$, in $X$ such that $\phi[x(t)] = y(t)$. Then $\phi[x(1)] = y$, and $\phi$ is onto.

Let, as before, $y_0 = \phi(0)$, and suppose there are two points $x_0, x_1 \in X$, $x_0 \ne x_1$, such that $\phi(x_0) = \phi(x_1) = y$. We take two curves $x_0(t)$ and $x_1(t)$ joining respectively 0 to $x_0$ and 0 to $x_1$, with $0 \le t \le 1$. Then both image curves $y_0(t) = \phi(x_0(t))$ and $y_1(t) = \phi(x_1(t))$ will join $y_0$ and $y$. As $Y$ is simply connected, there exists a function $F(s, t)$ from $\Delta$ to $Y$ continuous in $(s, t)$ and such that $F(0, t) = y_0(t)$, $F(1, t) = y_1(t)$, and $F(s, 0) = y_0$, $F(s, 1) = y$.

By the argument of the lemma, we find a function $G(s, t)$ from $\Delta$ to $X$, continuous and such that $\phi(G(s, t)) = F(s, t)$, with $G(0, t) = x_0(t)$ and $G(1, t) = x_1(t)$. But then the continuous curve $G(s, 1)$ with endpoints $x_0$ and $x_1$ is mapped by $\phi$ onto $y$. This contradicts the local implicit function


theorem, and therefore $\phi$ is 1-1. $\phi^{-1}$ is obviously continuous (it is F-differentiable), so $\phi$ is a homeomorphism of $X$ onto $Y$.

Q.E.D.

D. The Hilbert Space Case

Now we shall prove some implicit function results for Hilbert space:

1.24. Lemma: Let $H$ be a real Hilbert space, $L$ a bounded linear operator mapping $H$ into $H$. Suppose that for every $x \in H$, $(Lx, x) \ge a\,(x, x)$ where $a > 0$. Then $L^{-1}$ is defined everywhere, and $|L^{-1}| \le a^{-1}$.

Proof: $L$ is 1-1, for if $Lx = 0$, then $0 = (Lx, x) \ge a\,(x, x)$, and so $x = 0$. The range of $L$ is closed, for if $x_n \in$ range of $L$ and $x_n \to x$, then $L^{-1}x_n$ is a Cauchy sequence:

$$(L^{-1}x_n - L^{-1}x_m,\; L^{-1}x_n - L^{-1}x_m) \le \frac{1}{a^2}\,(x_n - x_m,\; x_n - x_m) = \frac{1}{a^2}\,|x_n - x_m|^2.$$

Therefore $L^{-1}x_n \to y \in H$; since $L$ is continuous, $Ly = x$. Now let $z \in H$ be in the orthogonal complement of the range of $L$. Then $(z, Lx) = 0$ for every $x \in H$ implies $(z, Lz) = 0$ implies $(z, z) = 0$, and $z = 0$. We have proved that the range of $L = H$, and $L$ is onto. From the inequality $(Lx, x) \ge a\,(x, x)$ we see that $|Lx| \ge a\,|x|$. Hence $|L^{-1}x| \le a^{-1}\,|x|$.

Q.E.D.
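As a finite-dimensional illustration of Lemma 1.24, the sketch below takes $H = R^2$ and a hypothetical non-symmetric matrix $L$ for which $(Lx, x) = 2\,(x, x)$, so that $a = 2$, and checks that the operator norm of $L^{-1}$ does not exceed $a^{-1}$. The matrix and the helper functions are assumptions made for this example only.

```python
import math


def apply(m, x):
    """Apply the 2x2 matrix m to the vector x."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])


def inner(x, y):
    return x[0] * y[0] + x[1] * y[1]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def operator_norm_2x2(m):
    """Largest singular value: square root of the largest eigenvalue of m^T m."""
    (a, b), (c, d) = m
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    tr, det = p + r, p * r - q * q
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt(0.5 * (tr + math.sqrt(disc)))


# Hypothetical example: (Lx, x) = 2|x|^2 because the off-diagonal part is skew.
L = ((2.0, 1.0), (-1.0, 2.0))
a = 2.0

samples = [(1.0, 0.0), (0.3, -0.7), (-2.0, 5.0)]
coercive = all(inner(apply(L, x), x) >= a * inner(x, x) - 1e-12 for x in samples)

norm_L_inv = operator_norm_2x2(inverse_2x2(L))
print(coercive, norm_L_inv, 1.0 / a)
```

Here $|L^{-1}| = 1/\sqrt{5}$, strictly below the bound $1/2$; the bound of the lemma need not be attained when $L$ is not symmetric.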

1.25. Corollary: Let $\phi : H \to H$ where $H$ is a Hilbert space, and suppose $\phi$ is continuously F-differentiable, and $(\phi'(x)y, y) \ge a\,(y, y)$ for every $x$ and $y$. Then $\phi$ is a homeomorphism of $H$ onto $H$.

Proof: Apply the last lemma and Theorem 1.22.

Q.E.D.

We shall state the hypothesis of this corollary in a slightly different way.

1.26. Definition: We say that $\phi : H \to H$ ($H$ is a Hilbert space) is strongly monotone if for every $x, y \in H$ we have

$$(\phi(x) - \phi(y),\; x - y) \ge a\,(x - y,\; x - y)$$

for some $a > 0$.


It is easy to see that a differentiable $\phi$ is strongly monotone if and only if $(\phi'(x)y, y) \ge a\,(y, y)$ for every $x, y \in H$. In fact, suppose $\phi$ is strongly monotone. Then for any real $t > 0$:

$$(\phi(y + tz) - \phi(y),\; z) \ge t\,a\,|z|^2, \text{ where } y, z \in H,$$

and dividing by $t$ and taking the limit, we get $(\phi'(y)z, z) \ge a\,(z, z)$. Conversely, we get the condition of strong monotonicity by integrating the condition involving the derivative.

The following definition will be useful in the sequel:

1.27. Definition: $\phi : H \to H$ will be called monotone if for every $x, y \in H$, $(\phi(x) - \phi(y), x - y) \ge 0$. If the sign $>$ holds for $x - y \ne 0$, $\phi$ will be called strictly monotone.

1.28. Remark: Obviously strongly monotone implies strictly monotone implies monotone. Furthermore, $\phi$ is strongly monotone with a constant $a$ if and only if $\frac{1}{a}\phi - I$ is monotone, and if $\phi$ is strictly monotone (a fortiori, if $\phi$ is strongly monotone), $\phi$ is 1-1.

We prove now a useful lemma on Euclidean space:

1.29. Lemma: (Kirszbraun) Suppose $\{x_1, \ldots, x_n\}$ and $\{x_1', \ldots, x_n'\}$ are two sets of points in $E^n$, and let $p$ be also in $E^n$. Assume that for every $i, j$, $1 \le i, j \le n$, we have $|x_i' - x_j'| \le |x_i - x_j|$ ($|\;|$ is the standard norm $(\sum |x_i|^2)^{1/2}$). Then there exists $p' \in E^n$ such that

$$|p' - x_j'| \le |p - x_j| \text{ for every } j,\; 1 \le j \le n.$$

Proof: Let

$$\lambda = \inf_{p' \in E^n} \; \max_{1 \le i \le n} \frac{|p' - x_i'|}{|p - x_i|}.$$

This infimum is assumed at some point $p^+ \in E^n$, for $\max_{1 \le i \le n} |p' - x_i'| / |p - x_i|$ becomes large when $p'$ is large. Hence we can put

$$\max_{1 \le i \le n} \frac{|p^+ - x_i'|}{|p - x_i|} = \lambda.$$

Now, suppose that for $1 \le i \le k$ we have $|p^+ - x_i'| = \lambda\,|p - x_i|$, and that for $k + 1 \le i \le n$ we have $|p^+ - x_i'| < \lambda\,|p - x_i|$. We shall show that $p^+ \in \mathrm{co}\,(x_1', \ldots, x_k')$ (the convex hull of $\{x_1', \ldots, x_k'\}$). Suppose that $p^+ \notin \mathrm{co}\,(x_1', \ldots, x_k')$. Then we can separate $p^+$ from $\mathrm{co}\,(x_1', \ldots, x_k')$ with a hyperplane $\Lambda$. If we move the point $p^+$ toward $\Lambda$ perpendicularly, it is obvious that the distance from $p^+$ to every point in the halfspace not containing $p^+$


decreases. We can move $p^+$ by so little as to preserve the inequalities $|p^+ - x_i'| < \lambda\,|p - x_i|$, $k + 1 \le i \le n$, and now we get $|p^+ - x_i'| < \lambda\,|p - x_i|$ for every $1 \le i \le n$, which is impossible, because $p^+$ realizes the infimum. Thus we can express $p^+$ as $\sum_{i=1}^k c_i x_i'$, where $c_i \ge 0$ and $\sum_{i=1}^k c_i = 1$. Let $R_i = p - x_i$ and $R_i' = p^+ - x_i'$. Now suppose $\lambda$ is greater than 1. Then

(1) $R_i'^2 > R_i^2$ for $1 \le i \le k$.

On the other hand, we have by the hypothesis:

$$(R_i' - R_j')^2 \le (R_i - R_j)^2,$$

and after expanding and using (1):

(2) $R_i' R_j' > R_i R_j$, $1 \le i, j \le k$.

Now, as $\sum_{i=1}^k c_i = 1$, we have $\bigl(\sum_{i=1}^k c_i\bigr)\,p^+ = \sum_{i=1}^k c_i x_i'$, and therefore $\sum_{i=1}^k c_i R_i' = 0$, and, by (1) and (2), $0 = \bigl(\sum_{i=1}^k c_i R_i'\bigr)^2 > \bigl(\sum_{i=1}^k c_i R_i\bigr)^2$, a contradiction. We have proved that $\lambda \le 1$.

Q.E.D.

Now, it is easy to generalize this result for Hilbert space:

1.30. Corollary: Let $\{x_\alpha\}$ and $\{x_\alpha'\}$ be two sets of points in the Hilbert space $H$, and $p \in H$. Suppose $|x_\alpha' - x_\beta'| \le |x_\alpha - x_\beta|$. Then there exists $p' \in H$ such that $|x_\alpha' - p'| \le |x_\alpha - p|$ for all $\alpha$.

Proof: We want to prove that the intersection of the infinite family of spheres with center $x_\alpha'$ and radius $|x_\alpha - p|$ is non-void. But spheres are compact in the weak topology for $H$, so it is sufficient to prove that every finite subfamily of spheres has a non-void intersection. If, then, there are only finitely many $x_\alpha$'s, the set $\{x_\alpha\} \cup \{x_\alpha'\}$ generates a finite dimensional Euclidean space, and we have only to apply the lemma.

Q.E.D.

1.30A. As the following counterexample (due to Charles McCarthy) shows, the obvious generalization of Kirszbraun's lemma to Banach spaces that are not Hilbert spaces is not true in general. We give the following

Theorem: Let $l_p^n$, $1 < p < \infty$, $n \ge 1$, be $n$-dimensional Euclidean space with the norm

$$|x|_p = \Bigl(\sum_{i=1}^n |x_i|^p\Bigr)^{1/p}.$$


Then if $n > 1$, $p \ne 2$, the generalization of Kirszbraun's lemma does not hold.

Proof: Take in $l_p^n$, $p > 2$, $n > 1$, the points $x_1 = (0, 0, \ldots, 0)$, $x_2 = (0, 1, 0, \ldots, 0)$, $x_3 = (1, 0, \ldots, 0)$. Evidently

$$|x_1 - x_2|_p = |x_1 - x_3|_p = 1, \quad |x_2 - x_3|_p = 2^{1/p}.$$

Choose now spheres $S_1$, $S_2$, $S_3$ around $x_1$, $x_2$, $x_3$ of radii $2^{(1-p)/p}$. We have

$$S_1 \cap S_2 \cap S_3 = \{(\tfrac{1}{2}, \tfrac{1}{2}, 0, \ldots, 0)\}.$$

Now let

$$x_1' = (0, 0, \ldots, 0), \quad x_2' = \bigl((1 - 2^{1-p})^{1/p},\; 2^{(1-p)/p},\; 0, \ldots, 0\bigr), \quad x_3' = \bigl((1 - 2^{1-p})^{1/p},\; -2^{(1-p)/p},\; 0, \ldots, 0\bigr).$$

Again

$$|x_1' - x_2'|_p = |x_1' - x_3'|_p = 1, \quad |x_2' - x_3'|_p = 2^{1/p}.$$

But if we take spheres $S_1'$, $S_2'$, $S_3'$ of radii $2^{(1-p)/p}$ around $x_1'$, $x_2'$, $x_3'$, their intersection will be void. In fact, by uniform convexity

$$S_2' \cap S_3' = \{((1 - 2^{1-p})^{1/p}, 0, \ldots, 0)\} = \{y\}.$$

Since $(1 - 2^{1-p})^{1/p} > 2^{(1-p)/p}$ if $p > 2$, $S_1' \cap S_2' \cap S_3' = \emptyset$. The case $l_p^n$, $1 < p < 2$, $n > 1$, may be handled in a similar way; the points $x_1$, $x_2$, $x_3$ are replaced by $x_1'$, $x_2'$, $x_3'$ and vice versa, and the radii of $S_1$, $S_2$, $S_3$ become $(1 - 2^{1-p})^{1/p}$, $2^{(1-p)/p}$, $2^{(1-p)/p}$ respectively.

Q.E.D.
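The distances that enter this counterexample are easy to confirm numerically. The sketch below assumes the particular values $p = 3$ and $n = 2$ (any $p > 2$ would serve) and checks that the two configurations have the same mutual distances, that $(\frac{1}{2}, \frac{1}{2})$ lies at distance $2^{(1-p)/p}$ from each $x_i$, and that the point $y$ lies at that distance from $x_2'$ and $x_3'$ but farther from $x_1'$. It does not verify the uniqueness claim $S_2' \cap S_3' = \{y\}$, which rests on uniform convexity.

```python
def lp_dist(u, v, p):
    """Distance between u and v in the l_p norm."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


p = 3.0                                   # assumed sample exponent, p > 2
r = 2.0 ** ((1.0 - p) / p)                # common radius of the spheres
c = (1.0 - 2.0 ** (1.0 - p)) ** (1.0 / p)

xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]     # x_1, x_2, x_3
xps = [(0.0, 0.0), (c, r), (c, -r)]           # x_1', x_2', x_3'
mid = (0.5, 0.5)                              # common point of S_1, S_2, S_3
y = (c, 0.0)                                  # the only candidate in S_2' and S_3'

pairs = [(0, 1), (0, 2), (1, 2)]
same_distances = all(
    abs(lp_dist(xs[i], xs[j], p) - lp_dist(xps[i], xps[j], p)) < 1e-12
    for i, j in pairs
)
mid_on_all = all(abs(lp_dist(mid, x, p) - r) < 1e-12 for x in xs)
y_on_two = all(abs(lp_dist(y, xps[k], p) - r) < 1e-12 for k in (1, 2))
y_outside_first = lp_dist(y, xps[0], p) > r

print(same_distances, mid_on_all, y_on_two, y_outside_first)
```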

1.31. Theorem: Let $H$ be a Hilbert space, $S$ any subset of $H$, and $\phi : S \to H$. Suppose $|\phi(x) - \phi(y)| \le K\,|x - y|$ for all $x, y \in S$. Then $\phi$ can be extended to all of $H$ in such a way that the extension satisfies the same Lipschitz condition.

Proof: Without loss of generality we can suppose that $K = 1$. By Zorn's lemma, there exists a maximal extension (again denoted $\phi$) subject to the same Lipschitz condition. Suppose $p \notin$ domain of $\phi$. We have $|\phi(x) - \phi(y)| \le |x - y|$ for $x, y \in$ domain $\phi$. Therefore, by the last corollary, we can find $p' \in H$ such that $|\phi(x) - p'| \le |x - p|$ for all $x \in$ domain $\phi$.

If we define $\phi(p) = p'$, we have extended $\phi$ to one more point preserving the Lipschitz condition, and thus contradicting the maximality of $\phi$. Hence domain of $\phi = H$.

Q.E.D.

We now make some definitions preparatory for the next theorem.


1.32. Definition: We say that $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is feebly continuous if the mapping $t \to \phi(x + ty)$ is continuous from $R$ to $Y$ with the weak topology for every pair $x, y \in X$.

1.33. Definition: $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is slightly continuous if $x_n \to x$ strongly in $X$ implies $\phi(x_n) \to \phi(x)$ weakly in $Y$.

1.34. Remark: As is easily seen, continuity implies slight continuity implies feeble continuity.

1.35. Theorem (Minty): (a) Let $\phi : S \to H$ ($H$ is a Hilbert space) be defined in an open set $S \subset H$, and suppose $\phi$ is feebly continuous and strongly monotone. Then $\phi$ is an open mapping. (b) Let $\phi : H \to H$ be defined everywhere, and suppose $\phi$ is slightly continuous and strongly monotone. Then $\phi$ maps $H$ onto $H$.

Proof: (a) As we remarked earlier, we can assume without loss of generality that $\phi = \mathrm{id} + T$, where $T$ is monotone. Now consider the Hilbert direct sum $H \oplus H$. We introduce the relations:

(1) $[x, y]\; M\; [x', y']$ iff $(x - x', y - y') \ge 0$

and

(2) $[x, y]\; L\; [x', y']$ iff $|y - y'| \le |x - x'|$.

(Note that neither is transitive.)

Let $\Phi : H \oplus H \to H \oplus H$ (Cayley transformation) be defined by

(3) $\Phi([x, y]) = \frac{1}{\sqrt{2}}\,[x + y,\; x - y]$.

It is easy to see that $\Phi$ is an isometry (of course $|[x, y]|^2 = |x|^2 + |y|^2$), and that $\Phi^2 = \mathrm{id}$.

Now let $p = [x, y]$ and $q = [x', y']$. We have:

(4) $p\,M\,q$ iff $\Phi(p)\; L\; \Phi(q)$.

For $\Phi(p)\; L\; \Phi(q)$

iff $|x - y - x' + y'|^2 \le |x + y - x' - y'|^2$

iff $-2\,(x - x', y - y') \le 2\,(x - x', y - y')$

iff $(x - x', y - y') \ge 0$ iff $p\,M\,q$.
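The properties of the Cayley transformation used here can be checked numerically in the simplest case. The sketch below assumes $H = R$, so that $H \oplus H$ is the plane, and normalizes the transformation by $1/\sqrt{2}$, which is what the isometry property requires. It tests $\Phi^2 = \mathrm{id}$, the preservation of the norm, and the equivalence (4) on pseudo-random pairs; pairs lying within rounding error of the boundary case are skipped.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def cayley(p):
    """Cayley transformation on H + H with H = R (assumed example)."""
    x, y = p
    return ((x + y) / SQRT2, (x - y) / SQRT2)


def rel_M(p, q):
    """Relation (1): (x - x', y - y') >= 0."""
    return (p[0] - q[0]) * (p[1] - q[1]) >= 0.0


def rel_L(p, q):
    """Relation (2): |y - y'| <= |x - x'|."""
    return abs(p[1] - q[1]) <= abs(p[0] - q[0])


rng = random.Random(0)
pairs = [
    ((rng.uniform(-1, 1), rng.uniform(-1, 1)), (rng.uniform(-1, 1), rng.uniform(-1, 1)))
    for _ in range(1000)
]
# Skip pairs too close to the boundary (x - x')(y - y') = 0 to avoid rounding ambiguity.
pairs = [(p, q) for p, q in pairs if abs((p[0] - q[0]) * (p[1] - q[1])) > 1e-12]

equivalence_holds = all(rel_M(p, q) == rel_L(cayley(p), cayley(q)) for p, q in pairs)

sample = (0.3, -1.7)
twice = cayley(cayley(sample))
involution_error = math.hypot(twice[0] - sample[0], twice[1] - sample[1])
isometry_error = abs(math.hypot(*cayley(sample)) - math.hypot(*sample))
print(equivalence_holds, involution_error, isometry_error)
```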


Call $\Gamma \subset H \oplus H$ the graph of $T$. Since $T$ is monotone,

(5) $p\,M\,q$ for all $p, q \in \Gamma$.

By (4), if we put $\Gamma_1 = \Phi(\Gamma)$, we get:

(6) $p\,L\,q$ for all $p, q \in \Gamma_1$.

This means that $\Gamma_1$ is the graph of some function $S_1$ satisfying a Lipschitz condition with $K = 1$. Obviously the domain of $S_1$ is the set of points $\frac{1}{\sqrt{2}}(x + y) \in H$ such that $[x, y] \in \Gamma$, i.e.,

(7) domain $(S_1) = \frac{1}{\sqrt{2}}$ range $(\mathrm{id} + T)$.

By the previous theorem, we can extend $S_1$ to a function $S_2$ defined on all of $H$ and satisfying the same Lipschitz condition. Let $\Gamma_2$ = graph of $S_2$ and $\Gamma_3 = \Phi(\Gamma_2)$. Then $\Gamma_3 \supseteq \Gamma$, because $\Gamma_2 \supseteq \Gamma_1$. $S_2$ satisfies the Lipschitz condition, whence

(8) $p, q \in \Gamma_2$ implies $p\,L\,q$

and

(9) $p, q \in \Gamma_3$ implies $p\,M\,q$

(apply (4) and recall that $\Phi^2 = \mathrm{id}$). Now, by (3) and (7) we have:

(10) $\Gamma = \bigl\{ \frac{1}{\sqrt{2}}\,[(\mathrm{id} + S_2)x,\; (\mathrm{id} - S_2)x] \;;\; x \in \frac{1}{\sqrt{2}}\,\text{range}\,(\mathrm{id} + T) \bigr\}$

and

(11) $\Gamma_3 = \bigl\{ \frac{1}{\sqrt{2}}\,[(\mathrm{id} + S_2)x,\; (\mathrm{id} - S_2)x] \;;\; x \in H \bigr\}$.

Suppose now that the range of $\mathrm{id} + T$ = range of $\phi$ is not open; then there exists a point in this range which is a limit of a sequence of points not in the range, and by (10) and (11), this means that there exists a point $[y, z] \in \Gamma$ such that $[y, z] = \lim\,[y_n, z_n]$ and $\{[y_n, z_n]\} \subset \Gamma_3 - \Gamma$. But, by hypothesis, the domain of $T$ is open; hence for some $n_0$, $y_{n_0} \in$ domain of $T$. Putting $y^* = y_{n_0}$, $z^* = z_{n_0}$, we arrive at


the following conclusion: There exists a pair $[y^*, z^*]$ such that:

(i) $y^* \in$ domain of $T$,

(ii) $[y, z]\; M\; [y^*, z^*]$ for every pair $[y, z] \in \Gamma$,

(iii) $z^* \ne Ty^*$.

Now we show that this leads to a contradiction. By (i), for small $\varepsilon > 0$, $y = y^* + \varepsilon\,(z^* - Ty^*)$ belongs to the domain of $T$, so, by (ii), we have:

$$(y^* - y,\; z^* - Ty) \ge 0,$$

and using the definition of $y$:

$$-\varepsilon\,\bigl(z^* - Ty^*,\; z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \ge 0$$

or

$$\bigl(z^* - Ty^*,\; z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \le 0.$$

As $\varepsilon \to 0$, we have, using the feeble continuity of $T$:

$$|z^* - Ty^*|^2 \le 0.$$

Thus $z^* = Ty^*$, which contradicts (iii). This proves that range of $\phi$ is open.

(b) As we remarked before, $\phi$ is 1-1, because it is strongly monotone, and moreover $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/a$, as follows from Definition 1.26 and Schwarz's inequality. Now to prove that $\phi$ is onto we use the same argument as was used in Theorem 1.22, in proving that if $x(t)$ is defined for $t < \alpha$, it is defined for $t = \alpha$; as before, we use the Lipschitz condition on $\phi^{-1}$ to prove that $x(t)$ has a limit as $t \to \alpha$. Then we use the slight continuity of $\phi$ to prove that $\phi(x(\alpha)) = y(\alpha)$.

Q.E.D.

We now establish an additional theorem for monotone functions:

1.36. Theorem: Suppose $\phi : H \to H$ ($H$ a Hilbert space) is monotone and continuous. If $p \in H$ is such that $(x, \phi(x) - p) \ge 0$ for $|x| \ge R_p$ (where $R_p$ is a number depending on $p$), then $p$ belongs to the range of $\phi$.

Proof: We can assume that $p = 0$. Let $\phi_\varepsilon(x) = \varepsilon x + \phi(x)$ with $\varepsilon > 0$. Then $\phi_\varepsilon$ is strongly monotone, and by Minty's theorem there exists $x_\varepsilon \in H$ such that

(1) $\varepsilon x_\varepsilon + \phi(x_\varepsilon) = 0$.


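So that multiplying by $x_\varepsilon$ we have

$$\varepsilon\,|x_\varepsilon|^2 + (\phi(x_\varepsilon), x_\varepsilon) = 0.$$

Therefore, for every $\varepsilon > 0$, $|x_\varepsilon|$ must be smaller than $R$, so that the $x_\varepsilon$ form a bounded set. Hence there exists a sequence $\varepsilon_n \to 0$ such that the sequence $\{x_{\varepsilon_n}\}$ tends weakly to a point $x_\infty$, and we can suppose that $|x_{\varepsilon_n}|$ also converges. Now, by the monotonicity of $\phi$ and (1), we get:

$$(x_\delta - x_\varepsilon,\; \delta x_\delta - \varepsilon x_\varepsilon) \le 0 \text{ for every } \delta > 0 \text{ and } \varepsilon > 0.$$

Then, if we put $\delta = \varepsilon_n$ and let $n \to \infty$,

$$(x_\infty - x_\varepsilon,\; -\varepsilon x_\varepsilon) \le 0 \text{ for every } \varepsilon > 0$$

or

$$(x_\infty - x_\varepsilon,\; x_\varepsilon) \ge 0.$$

Therefore

(2) $|x_\infty|^2 \ge \lim |x_{\varepsilon_n}|^2$.

On the other hand, spheres in B-spaces are weakly closed, and this means that:

(3) $|x_\infty| \le \lim |x_{\varepsilon_n}|$.

Hence, by (2) and (3),

$$|x_\infty| = \lim |x_{\varepsilon_n}|$$

and so

$$x_{\varepsilon_n} \to x_\infty \text{ strongly.}$$

By the continuity of $\phi$ we get $\phi(x_\infty) = 0$.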

Q.E.D.
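The regularization $\phi_\varepsilon(x) = \varepsilon x + \phi(x)$ used in this proof can be watched at work in one dimension. The sketch below assumes $H = R$ and the hypothetical map $\phi(x) = x^3 - 8$, which is monotone and continuous but not strongly monotone, and for which $(x, \phi(x)) \ge 0$ when $|x| \ge R = 2$. The regularized equation is solved by bisection, a numerical convenience chosen here in place of Minty's theorem.

```python
def phi(x):
    """Hypothetical monotone map on the real line; phi'(0) = 0, so not strongly monotone."""
    return x ** 3 - 8.0


def solve_regularized(eps, lo=-3.0, hi=3.0, iters=200):
    """Solve eps * x + phi(x) = 0 by bisection on [lo, hi]."""
    def g(x):
        return eps * x + phi(x)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


R = 2.0
eps_values = [1.0, 0.1, 0.01, 0.001, 1e-6]
roots = [solve_regularized(eps) for eps in eps_values]

for eps, x_eps in zip(eps_values, roots):
    print(eps, x_eps, phi(x_eps))
```

In this example every $x_\varepsilon$ stays below $R$ in absolute value, and $x_\varepsilon$ increases to the zero $x_\infty = 2$ of $\phi$ as $\varepsilon \to 0$.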

1.37. Example: Suppose $(S, \Sigma, \mu)$ is a finite measure space, and $V$ a finite dimensional linear space. Let $f : S \times V \to V$ be continuous in $V$ for every $s$, and such that $|f(s, u)| \le K\,|u| + 1$ for some constant $K \ge 1$ and every $s \in S$, $u \in V$. Then $\phi(s) \to f(s, \phi(s))$ maps $L_2(S, V)$ into $L_2(S, V)$ because

$$\int_S |f(s, \phi(s))|^2\,d\mu \le K^2 \int_S (|\phi(s)| + 1)^2\,d\mu.$$

We call this map $F(\phi)$. If a sequence $\{\phi_n(s)\}$ converges in $L_2$ to $\phi(s)$, then there exists a subsequence $\{\phi_{n_i}(s)\}$ converging to $\phi(s)$ a.e. Hence $F(\phi_{n_i})(s) \to F(\phi)(s)$ a.e. But $|F(\phi_{n_i})(s)| \le K\,|\phi_{n_i}(s)| + 1$, whence there exists a subsequence $\{\phi_{m_j}\}$ such that $|\phi_{m_j}(s)| \le \psi(s)$ for all $j$, where $\psi(s)$ is a summable


function. (Simply take $\{\phi_{m_j}\}$ such that $\sum_{j=1}^{\infty} |\phi_{m_{j+1}} - \phi_{m_j}| < \infty$.) By the Lebesgue theorem, $F(\phi_{m_j}) \to F(\phi)$ in $L_2$. This proves that $F$ is continuous.

Now let $D$ be a linear operator in $L_2$ with bounded inverse, and suppose we want to solve the equation

$$D^*FDx = y.$$

By Minty's theorem, if $D^*FD$ is strongly monotone then there exists a solution for any $y \in L_2$. The condition for strong monotonicity is

$$(D^*FDx_1 - D^*FDx_2,\; x_1 - x_2) \ge \varepsilon\,|x_1 - x_2|^2 \text{ where } \varepsilon > 0,\; x_1, x_2 \in L_2,$$

or

$$(Fx_1' - Fx_2',\; x_1' - x_2') \ge \varepsilon'\,|x_1' - x_2'|^2 \text{ where } \varepsilon' > 0,\; x_1', x_2' \in L_2.$$

(Calling $x_i' = Dx_i$, $i = 1, 2$, and remembering that $D$ is bounded and has a bounded inverse.)

We therefore see that for solvability it is sufficient to have

$$(f(s, v) - f(s, v'),\; v - v') \ge \varepsilon'\,|v - v'|^2 \text{ for all } s \in S,\; v, v' \in V.$$

(Note that here the scalar product is that in $R^n$, whereas before it was the one in $L_2$.)

This condition is implied by:

$$(df(s, v)\,v',\; v') \ge \varepsilon'\,|v'|^2 \text{ for all } s \in S,\; v, v' \in V$$

which is equivalent to the following condition: There exists an $\varepsilon > 0$ such that the symmetric matrix ${}^tJ + J - \varepsilon I$ is positive definite, where $J$ is the Jacobian matrix of $f$. Thus for the existence of solutions at every point of the above equation, it is sufficient to require that the matrix

$$A = \Bigl(\frac{\partial f_i}{\partial x_j} + \frac{\partial f_j}{\partial x_i}\Bigr)_{i,j}$$

has smallest eigenvalue $\ge \varepsilon > 0$ at every point.
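The eigenvalue criterion can be illustrated on a concrete map. The sketch below assumes $V = R^2$ and a hypothetical $f(v) = (2v_1 + \sin v_2,\; 2v_2 - \sin v_1)$ with no dependence on $s$; its Jacobian $J$ has symmetric part ${}^tJ + J$ with eigenvalues $4 \pm |\cos v_2 - \cos v_1| \ge 2$. The code evaluates the smallest eigenvalue on a grid and tests the monotonicity inequality with $\varepsilon' = 1$ on pseudo-random pairs. None of these choices comes from the text.

```python
import math
import random


def f(v):
    """Hypothetical map on R^2 (no dependence on s)."""
    return (2.0 * v[0] + math.sin(v[1]), 2.0 * v[1] - math.sin(v[0]))


def symmetrized_jacobian(v):
    """Entries (a, b, d) of the symmetric matrix J^T + J = [[a, b], [b, d]] at v."""
    off = math.cos(v[1]) - math.cos(v[0])
    return (4.0, off, 4.0)


def min_eigenvalue_sym(a, b, d):
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, d]]."""
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)


grid = [(-3.0 + 0.25 * i, -3.0 + 0.25 * j) for i in range(25) for j in range(25)]
smallest = min(min_eigenvalue_sym(*symmetrized_jacobian(v)) for v in grid)

rng = random.Random(1)
eps_prime = 1.0
monotone = True
for _ in range(1000):
    v = (rng.uniform(-5, 5), rng.uniform(-5, 5))
    w = (rng.uniform(-5, 5), rng.uniform(-5, 5))
    fv, fw = f(v), f(w)
    lhs = (fv[0] - fw[0]) * (v[0] - w[0]) + (fv[1] - fw[1]) * (v[1] - w[1])
    rhs = eps_prime * ((v[0] - w[0]) ** 2 + (v[1] - w[1]) ** 2)
    monotone = monotone and lhs >= rhs - 1e-12

print(smallest, monotone)
```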

E. Compact Mappings

1.38. Definition: Let $E$, $F$ be two T.L.S. $\phi : E \to F$ is compact iff it is continuous and maps bounded sets into relatively compact sets, i.e., if $B \subset E$ is bounded, then $\phi(B)$ is relatively compact.

1.39. Definition: $\phi : E \to F$ is called locally compact at a point $p \in E$ iff $\phi$ is continuous in a neighborhood $V$ of $p$, and maps $V$ into a relatively compact set.


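1.40. Theorem: Let $E$, $F$ be T.L.S., $F$ complete. Let $\phi : E \to F$ be F-differentiable at $p \in E$ and locally compact at $p$. Then $d\phi(p)$ is a compact linear operator.

Proof: We may suppose that $p = 0$ and that $\phi(p) = 0$. Let $A = d\phi(p)$ and suppose $A$ is not compact. Let $S$ be a bounded subset of $E$, with non-compact $A(S)$. By the completeness of $F$, we can find a family $\{x_\alpha\} \subset A(S)$ and a neighborhood $U$ of 0 in $F$ such that $x_\alpha - x_\beta \notin U$ whenever $\alpha \ne \beta$. Now let $\{y_\alpha\} \subset S$ be such that $Ay_\alpha = x_\alpha$, and for any $\delta > 0$, let us define:

$$\eta_\delta(x_\alpha) = \phi(\delta y_\alpha).$$

Then we have:

(1) $\eta_\delta(x_\alpha) - \delta x_\alpha = \phi(\delta y_\alpha) - \delta x_\alpha = A(\delta y_\alpha) - \delta x_\alpha + \psi(\delta y_\alpha) = \psi(\delta y_\alpha)$

where $\psi$ is a function horizontal at 0, i.e. for every neighborhood $V$ of 0 in $F$, there exists a neighborhood $W$ of 0 in $E$ such that

$$\psi(\delta W) \subset o(\delta)\,V.$$

Now

(2) $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) = (\delta x_\alpha - \delta x_\beta) + (\eta_\delta(x_\alpha) - \delta x_\alpha) + (\delta x_\beta - \eta_\delta(x_\beta))$.

Choose a symmetric and circled neighborhood $V$ of 0 in $F$ such that $V + V + V \subset U$. For such a neighborhood there exists another one $W$ in $E$ such that

$$\psi(\delta W) \subset o(\delta)\,V.$$

Since $S$ is bounded, there exists $\lambda > 0$ such that $\lambda S \subset W$. Then for every $\alpha$,

$$\eta_\delta(x_\alpha) - \delta x_\alpha = \psi(\delta y_\alpha) \in \psi\bigl(\tfrac{\delta}{\lambda} W\bigr) \subset o\bigl(\tfrac{\delta}{\lambda}\bigr)\,V.$$

But as $o(\delta)/\delta \to 0$ as $\delta \to 0$, it follows that for sufficiently small $\delta$

$$o\bigl(\tfrac{\delta}{\lambda}\bigr)\,V \subset \delta V.$$

Therefore, by (2), $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \notin \delta V$ whenever $\alpha \ne \beta$, because if $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \in \delta V$, then $\delta x_\alpha - \delta x_\beta \in \delta V + \delta V + \delta V \subset \delta U$, contrary to our assumption. Hence $\phi(\delta y_\alpha) - \phi(\delta y_\beta) \notin \delta V$ for $\alpha \ne \beta$, and for sufficiently small $\delta$, this contradicts the local compactness of $\phi$.

Q.E.D.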


F. Higher Differentials and Taylor's Theorem

We recall that if $X_1, X_2, \ldots, X_n, Z$ are linear spaces over the same scalar field, a function $M : (X_1 \times X_2 \times \ldots \times X_n) \to Z$ is multilinear or $n$-linear if it is linear in each of the variables separately. If $X_1, \ldots, X_n, Z$ are B-spaces, $M$ is continuous iff there exists a constant $K$ such that $|M(x_1, \ldots, x_n)| \le K\,|x_1|\,|x_2| \ldots |x_n|$ for all $x_i$ in $X_i$, i.e. iff it is bounded. The minimum of the numbers $K$ satisfying this inequality will be called the norm of $M$, $|M|$. The set of all $n$-linear bounded maps $M$ from $X_1 \times X_2 \times \ldots \times X_n$ to $Z$ will be denoted by $B(X_1, \ldots, X_n; Z)$, and it is easy to verify that if the $X_i$ and $Z$ are B-spaces then $B(X_1, \ldots, X_n; Z)$ is a B-space with the usual addition and scalar multiplication, and with the norm defined above. In the case $X_1 = X_2 = \ldots = X_n = X$, $B(X_1, \ldots, X_n; Z)$ will be written $B^n(X, Z)$.

1.41. Lemma: Let $X_1, \ldots, X_n, Z$ be B-spaces over the same scalar field. Then there is an isometric isomorphism between $B(X_1, \ldots, X_n; Z)$ and $B(X_1, B(X_2, \ldots, B(X_n, Z)) \ldots)$.

The proof is left as an exercise for the reader.

Suppose that $f$ is F-differentiable on a set $D_0$ in $X$ with range of $f$ in $Z$. Then the function $f_1$, defined for $x \in D_0$ by $f_1(x) = df(x)$, has its values in the B-space $B(X, Z)$. It makes sense to ask if $f_1$ is differentiable. If it is, then the differential of $f_1 = df$ will have its values in the space $B^2(X, Z)$ of bounded bilinear functions of $X$ to $Z$, where, by the lemma above, we have identified $B^2(X, Z)$ and $B(X, B(X, Z))$. We define the differential of $f_1 = df$ at a point $c$ to be the second differential of $f$ at $c$ and we denote this second differential by $d^2f(c)$. Hence $d^2f(c)$ is a bounded bilinear function on $X$ to $Z$. Higher order differentials are defined by induction.

1.42. Definition: A function $f$ on $D \subset X$ to $Z$ is said to be in class $C^n$ on $D$, written $f \in C^n$, iff the $n$-th differential $d^nf$ exists at every point of $D$ and the mapping $x \to d^nf(x)$ of $D$ into $B^n(X; Z)$ is continuous. If $f \in C^n$ for all $n$, we say that $f \in C^\infty$. Observe that if $f : X \to Y \in C^n$ on a neighborhood of a point $c \in X$ and if $g : Y \to Z \in C^n$ on a neighborhood of the point $b = f(c)$, then $h = g \circ f : X \to Z \in C^n$ on a neighborhood of $c$.

We now prove Taylor's theorem for B-spaces. We shall write $x^{(k)}$ for the $k$-tuple $(x, x, \ldots, x)$.

1.43. Theorem: Suppose that $f \in C^n$ on an open set $D$ which contains the line segment joining $c$ to $c + x$. Then


$$f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \frac{1}{2!}\,d^2f(c; x^{(2)}) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,d^nf(c + tx; x^{(n)})\,dt.$$

Proof: Since the map $t \to d^nf(c + tx; x^{(n)})$ is continuous on $[0, 1]$ to $Z$, it is clear that both sides of the equation have a meaning. To establish the equality let $z^*$ be a continuous linear functional on $Z$ and let $F$ be defined on $[0, 1]$ to the scalar field by $F(t) = z^*f(c + tx)$. Then $F^{(k)}(t) = z^*[d^kf(c + tx; x^{(k)})]$ for $0 \le k \le n$ and we can apply the scalar-valued form of Taylor's theorem to $F$. If we observe that $z^*$ commutes with integration, we can apply the Hahn-Banach theorem and obtain the result.

Q.E.D.
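The integral form of the remainder can be tested numerically in the scalar case, to which the proof reduces the theorem. The sketch below assumes $X = Z = R$, $f = \exp$, $n = 3$, and sample values $c = 0.3$, $x = 0.7$, and evaluates the remainder integral with Simpson's rule; all of these choices are illustrative assumptions.

```python
import math


def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0


c, x, n = 0.3, 0.7, 3

# For f = exp every differential is d^k f(c; x^(k)) = exp(c) * x^k.
taylor_part = sum(math.exp(c) * x ** k / math.factorial(k) for k in range(n))

# Remainder: (1/(n-1)!) * integral over [0, 1] of (1 - t)^(n-1) d^n f(c + t x; x^(n)) dt.
remainder = simpson(
    lambda t: (1.0 - t) ** (n - 1) * math.exp(c + t * x) * x ** n, 0.0, 1.0
) / math.factorial(n - 1)

print(taylor_part + remainder, math.exp(c + x))
```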

1.44. Corollary: Under the hypotheses of Taylor's theorem, there exists a bounded $n$-linear function $R_n$ from $X$ to $Z$ such that

$$f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + R_n(x^{(n)}).$$

Proof: Let

$$R_n = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,d^nf(c + tx)\,dt.$$

Q.E.D.

1.45. Corollary: Under the hypotheses of Taylor's theorem, there exists a function $Q$ on a neighborhood of the origin in $X$ to $Z$ such that

$$f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + \frac{1}{n!}\,d^nf(c; x^{(n)}) + Q(x),$$

where $Q(x) = o(|x|^n)$.

Proof: Observe that

$$\frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,dt = \frac{1}{n!},$$


and define $Q(x)$ to be

$$Q(x) = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\,\bigl[d^nf(c + tx; x^{(n)}) - d^nf(c; x^{(n)})\bigr]\,dt.$$

Q.E.D.

Note: The reader may easily generalize the definitions and results of thissection to the case of locally convex T.L.S.

G. Complex Analyticity

Let X and Y be complex B-spaces.

1.46. Definition: We say that $\phi : X \to Y$ is complex analytic on an open subset $D$ of $X$ iff $\phi(z_1x_1 + z_2x_2 + \ldots + z_nx_n)$ is analytic in $z_1, \ldots, z_n$ for every $x_1, \ldots, x_n$ in $D$, $z_i$ complex.

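If $\phi$ is complex analytic, we have immediately

$$\phi(x) = \frac{1}{2\pi i} \oint \frac{\phi(x + \zeta y)}{\zeta}\,d\zeta \quad \text{(Cauchy's formula)}$$

and

$$d^n\phi(x; y^{(n)}) = \frac{n!}{2\pi i} \oint \frac{\phi(x + \zeta y)}{\zeta^{n+1}}\,d\zeta.$$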

1.47. Lemma: If $v(x + iy)$ is a $C^1$ vector-valued function of $x$ and $y$, and if $\frac{\partial v}{\partial x} = \frac{1}{i}\,\frac{\partial v}{\partial y}$, then $v$ is complex analytic.

Proof: Let $v^*$ be a continuous linear functional. $v^*(v(x + iy))$ is a complex valued function which satisfies the Cauchy-Riemann equations and is therefore analytic. Thus

$$v^*(v(z)) = v^*\Bigl(\frac{1}{2\pi i} \oint \frac{v(\zeta)}{\zeta - z}\,d\zeta\Bigr)$$

and by the Hahn-Banach theorem, the Cauchy formula holds, and $v$ is analytic.

Q.E.D.

Suppose $\phi$ is analytic in $D$. Assume that $0 \in D$ and $\phi'(0)$ is invertible. By the implicit function theorem, $\phi$ has a local inverse, $\psi$. If $u$ and $v + zu$ are sufficiently near 0,

$$\phi(\psi(u)) = u \quad\text{and}\quad \phi(\psi(v + zu)) = v + zu.$$

Then

$$\phi'(\psi(v))\,\frac{d}{dt}\psi(v + tu)\Big|_{t=0} = u$$

and

$$\phi'(\psi(v))\,\frac{d}{dt}\psi(v + itu)\Big|_{t=0} = iu.$$

Since $\phi'(\psi(v))$ is invertible,

$$\frac{d}{dt}\psi(v + itu)\Big|_{t=0} = i\,\frac{d}{dt}\psi(v + tu)\Big|_{t=0}.$$

By the preceding lemma, $\psi(v + zu)$ is analytic in $z$, and consequently $\psi$ is analytic. We have therefore proved the implicit function theorem in the analytic case:

1.48. Theorem: Under the hypotheses of the implicit function theorem 1.20, and if $\phi$ is analytic, $\phi^{-1}$ is also complex analytic.

H. Derivatives of Quadratic Forms

1.49. Definition: If $B$, $V$ are linear spaces and $f : B \to V$, we say that $f$ is a quadratic form if the expression

$$\beta(x, y) = f(x + y) - f(x) - f(y)$$

is bilinear in $x$ and $y$ and

$$f(\lambda x) = \lambda^2 f(x)$$

for every $x \in B$ and every scalar $\lambda$.

It is clear that in such a case $f(-x) = f(x)$ and $f(0) = 0$. $\beta$ is called the bilinear form associated with $f$. From the definition it follows that

(3) $f(x) = \tfrac{1}{2}\beta(x, x)$,

and therefore $f$ and $\beta$ determine each other. Suppose now that $B$ and $V$ are Banach spaces. It is clear that $f$ is continuous if and only if $\beta$ is continuous. If $f$ is continuous, then

(4) $|f(x)| \le \tfrac{1}{2}\|\beta\|\,|x|^2$,

where $\|\beta\|$ stands for the norm of $\beta$ as a bilinear function. If we define $\|f\| = \tfrac{1}{2}\|\beta\|$, (4) may be written

$$|f(x)| \le \|f\|\,|x|^2.$$


The main theorem on derivatives of quadratic forms is the following:

1.50. Theorem: Every continuous quadratic form has F-derivatives of all orders. Denote by $\beta$ the bilinear form associated with $f$ and by $\|f\|$ the number $\tfrac{1}{2}\sup |\beta(x, y)|$, $|x| \le 1$, $|y| \le 1$. The first and second derivatives of $f$ are:

$$f'(z)h = \beta(z, h), \qquad f''(z) = \beta,$$

and the higher derivatives vanish identically. From the equations above it follows that:

$$|f'(z)| \le 2\|f\|\,|z|, \qquad \|f''(z)\| = 2\|f\|.$$

Proof: From the definition of $\beta$ it follows that

$$f(z + h) - f(z) = f(h) + \beta(z, h)$$

and (4) implies that $f(h) = o(|h|)$. Since $\beta(z, h)$ is linear in $h$, it follows that $f$ has an F-derivative at every $z$ and the equality $f'(z)h = \beta(z, h)$ holds. Now $f' : B \to \mathrm{Hom}(B, V)$ (being equal to $\beta$) is obviously linear. Then from the general fact that the derivative of a linear mapping at any point is that linear mapping, we conclude that $f''(z) = f'$, or $f''(z) = \beta$.

Q.E.D.
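The algebra behind Theorem 1.50 can be checked on a finite-dimensional example. The sketch below assumes the quadratic form $f(x) = x^T A x$ on $R^2$ with a hypothetical integer matrix A (neither is from the text), and verifies with exact integer arithmetic that the associated form is bilinear and that $f(x) = \tfrac12\beta(x, x)$.

```python
def f(x, A):
    """Quadratic form f(x) = x^T A x on R^n (A is a list of rows)."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def beta(x, y, A):
    """Associated bilinear form beta(x, y) = f(x + y) - f(x) - f(y)."""
    s = [xi + yi for xi, yi in zip(x, y)]
    return f(s, A) - f(x, A) - f(y, A)

A = [[2, 1], [0, 3]]               # hypothetical coefficients
z, h, k = [1, -2], [3, 5], [-4, 1]

# beta(z, .) is linear, so it can serve as the derivative f'(z).
additive = beta(z, [a + b for a, b in zip(h, k)], A) == beta(z, h, A) + beta(z, k, A)
homogeneous = beta(z, [2 * a for a in h], A) == 2 * beta(z, h, A)
# Formula (3): f(z) = beta(z, z) / 2.
half_rule = 2 * f(z, A) == beta(z, z, A)
```

Since $f(z + h) - f(z) - \beta(z, h) = f(h)$ holds identically and $f(h)$ is quadratic in $h$, linearity of $\beta(z, \cdot)$ is exactly what makes it the F-derivative.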


CHAPTER II

Hard Implicit Functional Theorems

A. Newton's Method and the Nash Implicit Functional Theorem

B. A Partial Differential Equation

C. Embedding of Riemannian Manifolds

A. Newton's Method and the Nash Implicit Functional Theorem

The following theorem will be proved by the so-called Newton's method.

2.1. Theorem: Let $B$ be a Banach space, and let $f$ be a mapping whose domain $D(f)$ is the unit sphere of $B$. Suppose that:

(i) $f$ has two continuous Fréchet derivatives in $D(f)$, both bounded above by a constant $M$, which we assume to exceed 2.

(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the space $\mathcal{B}(B)$ of bounded linear maps of $B$ into itself, such that

(iia) $|L(u)h| \le M|h|$, $h \in B$, $u \in D(L)$,

(iib) $df(u)L(u)h = h$, $h \in B$, $u \in D(L)$.

Then, if $|f(0)| < M^{-5}$, it follows that $f(D(f))$ contains the origin.

Proof: Let $\kappa = 3/2$, and let $\beta > 0$ be a real number to be specified later. Put $u_0 = 0$ and, proceeding inductively, put

(2.1) $u_{n+1} = u_n - L(u_n)f(u_n)$.

We will prove inductively that

(2.2; n) $u_n \in D(f)$,

(2.3; n) $|u_n - u_{n-1}| \le e^{-\beta\kappa^n}$, $n \ge 1$.

We proceed as follows. Suppose that statements (2.2; j) and (2.3; j) are true for $j \le n$. Then

(2.4) $|u_n| \le \sum_{j=1}^{n} e^{-\beta\kappa^j} \le \sum_{j=1}^{\infty} e^{-\beta(\kappa-1)j} = \dfrac{e^{-\beta(\kappa-1)}}{1 - e^{-\beta(\kappa-1)}}$,

so that if $\beta$ is sufficiently large, (2.2; n) follows. Therefore Definition (2.1) makes sense. Observe now that if $g$ is any function twice continuously F-differentiable, the mean-value theorem with Lagrange remainder applied to $g(u + th)$ yields

$$g(u + h) = g(u) + dg(u)h + \int_0^1 (1 - t)\,d^2g(u + th; h, h)\,dt.$$

Combining this with (ii) and our induction hypothesis yields

(2.5) $|u_{n+1} - u_n| = |L(u_n)f(u_n)| \le M\,|f(u_n)| \le M^2\,|u_n - u_{n-1}|^2 \le M^2 e^{-2\beta\kappa^n}$.

Thus we have only to choose $\beta$ so that

$$M^2 e^{-2\beta\kappa^n} \le e^{-\beta\kappa^{n+1}}$$

or

(2.6) $M^2 \le e^{(2-\kappa)\beta\kappa^n}$.

Since $\kappa < 2$, it is clear that (2.6) will hold for $\beta$ sufficiently large, and then (2.3; n + 1) follows, completing our induction. Thus we have only to prove the correctness of (2.3; 1) to finish our proof. But this statement is simply

(2.7) $|L(0)f(0)| \le e^{-\beta\kappa}$

and it is therefore implied by $M|f(0)| \le e^{-\beta\kappa}$. Since $|f(0)| \le M^{-5}$, if we choose $\beta$ so that $M^2 = e^{(3/4)\beta}$, (2.7) follows. Then $u_n$ converges to some element $u$ in $D(f)$. By (iib) and (2.1)

$$f(u_n) = df(u_n)(u_n - u_{n+1}),$$

so that

$$|f(u_n)| \le M\,|u_{n+1} - u_n| \le M e^{-\beta\kappa^{n+1}},$$

so $f(u) = 0$.

Q.E.D.
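The iteration (2.1) is easy to try numerically. The following is a minimal sketch on the Banach space $B = R$, with a hypothetical scalar map f and with L(u) taken to be the exact inverse of df(u); neither choice comes from the text, and the theorem itself only requires a bounded right inverse.

```python
def newton(f, L, u0=0.0, steps=6):
    """Iterate u_{n+1} = u_n - L(u_n) f(u_n), as in (2.1).

    L(u) plays the role of a right inverse of df(u); here it is a number.
    Returns the list of iterates u_0, ..., u_steps.
    """
    us = [u0]
    for _ in range(steps):
        u = us[-1]
        us.append(u - L(u) * f(u))
    return us

# Hypothetical scalar example: f(0) is small and df(u) is invertible.
f = lambda u: u + 0.25 * u**3 - 0.01
L = lambda u: 1.0 / (1.0 + 0.75 * u**2)   # inverse of df(u) = 1 + 0.75 u^2

us = newton(f, L)
# Successive step lengths |u_{n+1} - u_n|; compare with (2.3; n).
steps_taken = [abs(b - a) for a, b in zip(us, us[1:])]
```

The step lengths collapse very quickly (each is roughly the square of the previous one), which is the behaviour that estimate (2.5) quantifies.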

The following theorem, which gives an important generalization of Theorem 2.1, is proved by a modified Newton's method. We weaken the hypotheses of Theorem 2.1, requiring not that the "inverting" operator $L(u)$ be bounded, but only that it be an unbounded operator acting somewhat like a differential operator of order $\alpha$.


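Given a compact n-dimensional manifold $K$ we introduce the space $C^r(K) = C^r$ of (possibly vector-valued), r times continuously differentiable functions with the norm

$$|u|_r = \max_{|\alpha| \le r}\,\max_{x \in K}\,|D^\alpha u(x)|,$$

where $\alpha = (\alpha_1, \ldots, \alpha_n)$, $\alpha_i$ non-negative integers, $|\alpha| = \alpha_1 + \cdots + \alpha_n$, and

$$D^\alpha = \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1}\cdots\left(\frac{\partial}{\partial x_n}\right)^{\alpha_n}.$$

Note that $C^r \supseteq C^{r+1}$ and that, if $u \in C^{r+1}$, $|u|_r \le |u|_{r+1}$; we write $|u|_r = \infty$ if $u \notin C^r$. In the sequel we shall refer to a certain range $m - \alpha \le r \le m + 10\alpha$ of spaces $C^r$ and to a certain constant $M \ge 1$. We suppose that $M$ is sufficiently large so that there exist smoothing operators $S(t)$, $t \ge 1$, such that

(S1) $|S(t)u|_\rho \le M t^{\rho - r}\,|u|_r$, $u \in C^r$,

(S2) $|(I - S(t))u|_r \le M t^{r - \rho}\,|u|_\rho$, $u \in C^\rho$,

(S3) $\left|\dfrac{d}{dt}S(t)u\right|_r \le M t^{r - \rho - 1}\,|u|_\rho$, $u \in C^\rho$,

(S4) $\lim_{t \to \infty} |(I - S(t))u|_r = 0$, $u \in C^r$,

for $m - \alpha \le r \le \rho \le m + 10\alpha$. (We will show later how to construct these operators for any compact manifold.)

We proceed now to the statement of the main result of this chapter: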

2.2. Nash implicit functional theorem: Let $f$ be a mapping whose domain $D(f)$ is the unit sphere of $C^m$ with range in $C^{m-\alpha}$. Suppose that

(i) $f$ has two continuous F-derivatives, both bounded by $M$.

(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the space $\mathcal{B}(C^m, C^{m-\alpha})$ of bounded linear operators on $C^m$ to $C^{m-\alpha}$, such that:

(iia) $|L(u)h|_{m-\alpha} \le M|h|_m$, $u \in D(L)$, $h \in C^m$,

(iib) $df(u)L(u)h = h$, $u \in D(L)$, $h \in C^{m+\alpha}$,

(iic) $|L(u)f(u)|_{m+9\alpha} \le M(1 + |u|_{m+10\alpha})$, $u \in C^{m+10\alpha}$.

Then, if

$$|f(0)|_{m+9\alpha} \le 2^{-40}M^{-202},$$

$f(D(f))$ contains the origin.

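Proof: Let $\kappa = 3/2$ and let $\beta, \mu, \nu > 0$ be real numbers to be specified later. Put $u_0 = 0$, and proceeding inductively, put

(2.8; n) $u_{n+1} = u_n - S_n L(u_n)f(u_n)$,

where $S_n = S(e^{\beta\kappa^n})$. We will prove inductively that

(2.9; n) $u_n \in D(f)$,

(2.10; n) $|u_n - u_{n-1}|_m \le e^{-\mu\alpha\beta\kappa^n}$,

(2.11; n) $u_n \in C^{m+10\alpha}$,

(2.12; n) $1 + |u_n|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa^n}$.

Suppose (2.10; j) is true for $j \le n$. Then

$$|u_n|_m \le \sum_{j=1}^{n} e^{-\mu\alpha\beta\kappa^j} \le \sum_{j=1}^{\infty} e^{-\mu\alpha\beta j(\kappa-1)} = \frac{e^{-\mu\alpha\beta(\kappa-1)}}{1 - e^{-\mu\alpha\beta(\kappa-1)}},$$

which implies (2.9; n) if $\beta$, $\mu$ are sufficiently large. Suppose now that (2.9; j), (2.10; j), (2.11; j), (2.12; j) are true for $j \le n$. Then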

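$$\begin{aligned}
|u_{n+1} - u_n|_m &= |S_n L(u_n)f(u_n)|_m \\
&\le M e^{\alpha\beta\kappa^n}\,|L(u_n)f(u_n)|_{m-\alpha} \le M^2 e^{\alpha\beta\kappa^n}\,|f(u_n)|_m \\
&\le M^2 e^{\alpha\beta\kappa^n}\,|f(u_{n-1}) - df(u_{n-1})\,S_{n-1}L(u_{n-1})f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n}\,|u_n - u_{n-1}|_m^2 \\
&\le M^2 e^{\alpha\beta\kappa^n}\,|df(u_{n-1})(I - S_{n-1})L(u_{n-1})f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n} e^{-2\mu\alpha\beta\kappa^n} \\
&\le M^3 e^{\alpha\beta\kappa^n}\bigl[M e^{-9\alpha\beta\kappa^{n-1}}\,|L(u_{n-1})f(u_{n-1})|_{m+9\alpha} + e^{-2\mu\alpha\beta\kappa^n}\bigr] \\
&\le M^4 e^{\alpha\beta\kappa^n}\bigl[e^{-9\alpha\beta\kappa^{n-1}} M(1 + |u_{n-1}|_{m+10\alpha}) + e^{-2\mu\alpha\beta\kappa^n}\bigr] \\
&\le M^5 e^{\alpha\beta\kappa^n}\bigl[e^{-9\alpha\beta\kappa^{n-1}} e^{\nu\alpha\beta\kappa^{n-1}} + e^{-2\mu\alpha\beta\kappa^n}\bigr] \\
&\le M^5\bigl\{\exp[\alpha\beta\kappa^{n-1}(\nu - 9 + \kappa)] + \exp[\alpha\beta\kappa^n(1 - 2\mu)]\bigr\}.
\end{aligned}$$

The desired inequality will be then implied by

(2.13; n) $M^5\bigl\{\exp[\alpha\beta\kappa^{n-1}(\nu - 9 + \kappa)] + \exp[\alpha\beta\kappa^n(1 - 2\mu)]\bigr\} \le e^{-\mu\alpha\beta\kappa^{n+1}}$,

which (noting that $\kappa = 3/2$) will follow for $\beta$ sufficiently large if we choose

(2.14) $\mu > 2$, $\tfrac{9}{4}\mu + \nu < \tfrac{15}{2}$.

Thus (2.10; n + 1) follows. Next we note that

$$\begin{aligned}
1 + |u_{n+1}|_{m+10\alpha} &\le 1 + \sum_{j=0}^{n} |S_j L(u_j)f(u_j)|_{m+10\alpha} \\
&\le 1 + M\sum_{j=0}^{n} e^{\alpha\beta\kappa^j}\,|L(u_j)f(u_j)|_{m+9\alpha} \\
&\le 1 + M^2\sum_{j=0}^{n} e^{\alpha\beta\kappa^j}(1 + |u_j|_{m+10\alpha}) \\
&\le 1 + M^2\sum_{j=0}^{n} e^{\alpha\beta(1+\nu)\kappa^j}.
\end{aligned}$$

Thus

(2.16) $(1 + |u_{n+1}|_{m+10\alpha})\,e^{-\nu\alpha\beta\kappa^{n+1}} \le e^{-\nu\alpha\beta\kappa^{n+1}} + M^2\sum_{j=0}^{n} e^{\alpha\beta((1+\nu)\kappa^j - \nu\kappa^{n+1})}$.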

If $\nu > 2$ the right side of (2.16) will be less than 1 for sufficiently large $\beta$, and so statement (2.12; n + 1) will follow from (2.16), completing our induction. If we take $\nu$ and $\mu$ so that $\nu > 2$ and condition (2.14) is satisfied, we have only to verify the correctness of statements (2.10; 1) and (2.12; 1) and our proof will be complete. These statements, however, are simply the inequalities

(2.17) $|S_0 L(0)f(0)|_m \le e^{-\mu\alpha\beta\kappa}$

and

(2.18) $1 + |S_0 L(0)f(0)|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa}$,

and they in turn follow from the bound for $|f(0)|$ and (iic). The conclusion of the proof is now just as in Theorem 2.1.

We proceed now to the construction of the family of smoothing operators whose existence was assumed, for the special case $K$ = n-dimensional torus; then $C^k(K)$ is simply the space of all k-times differentiable functions $u(x)$, defined in $E^n$ and periodic with period $2\pi$ in each variable. Take a sufficiently large constant $M$ and a function $\hat a \in C^\infty(E^n)$, vanishing outside a compact set and identically equal to 1 in a neighborhood of 0, and let $a$ be its Fourier transform. It is well known that for any $\alpha$, $N$

$$|D^\alpha a(x)| \le A_{\alpha,N}(1 + |x|)^{-N}.$$

Moreover

$$\int_{E^n} a(x)\,dx = 1, \qquad \int_{E^n} x^\alpha a(x)\,dx = 0, \quad x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}, \quad |\alpha| > 0.$$

Now we set

$$(S(t)u)(x) = t^n\int_{E^n} a(t(x - y))\,u(y)\,dy.$$

It is clear that $S(t)u \in C^\infty$ and, since $S(t)$ commutes with partial differentiation operators, we have to prove statements (S1), (S2) above only for $r = 0$. In fact, suppose (S1) is true for $r = 0$. Then

$$|S(t)u|_\rho \le M t^\rho\,|u|_0.$$

Taking any $\alpha$, $|\alpha| \le r$,

$$|D^\alpha S(t)u|_{\rho - r} = |S(t)D^\alpha u|_{\rho - r} \le M t^{\rho - r}\,|D^\alpha u|_0 \le M t^{\rho - r}\,|u|_r.$$

But then

$$|S(t)u|_\rho \le M t^{\rho - r}\,|u|_r.$$

One deals similarly with (S2). Suppose then that $r = 0$. (S1) reduces to

$$|S(t)u|_\rho \le M t^\rho\,|u|_0.$$

Let $|\alpha| \le \rho$. We have

$$|D^\alpha S(t)u|_0 = \left|\,t^{n + |\alpha|}\int_{E^n} (D^\alpha a)(t(x - y))\,u(y)\,dy\,\right|_0 \le M t^{|\alpha|}\,|u|_0 \le M t^\rho\,|u|_0$$

if $|\alpha| \le \rho$, so (S1) is established. (S2) reduces to

$$|(I - S(t))u|_0 \le M t^{-\rho}\,|u|_\rho.$$


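To prove this, apply Taylor's theorem with integral remainder:

$$f(1) = \sum_{k=0}^{m-1}\frac{f^{(k)}(0)}{k!} + \frac{1}{(m-1)!}\int_0^1 (1 - \mu)^{m-1} f^{(m)}(\mu)\,d\mu$$

to the function $f(t) = u(x + ty)$. We obtain

$$u(x + y) = \sum_{k=0}^{\rho-1}\frac{1}{k!}\sum_{|\alpha| = k} y^\alpha D^\alpha u(x) + \frac{1}{(\rho-1)!}\sum_{|\alpha| = \rho} y^\alpha\int_0^1 (1 - \mu)^{\rho-1} D^\alpha u(x + \mu y)\,d\mu.$$

Thus

$$u - S(t)u = t^n\int_{E^n} a(t(x - y))\,(u(x) - u(y))\,dy = -\frac{1}{(\rho-1)!}\sum_{|\alpha| = \rho} t^n\int_{E^n}\int_0^1 a(t(x - y))\,(1 - \mu)^{\rho-1}\,y^\alpha D^\alpha u(x + \mu y)\,d\mu\,dy.$$

Making the change of variable $ty = z$, we obtain

$$t^n\int_{E^n}\int_0^1 a(tx - ty)\,(1 - \mu)^{\rho-1}\,y^\alpha D^\alpha u(x + \mu y)\,d\mu\,dy = t^{-|\alpha|}\int_{E^n}\int_0^1 a(tx - z)\,(1 - \mu)^{\rho-1}\,z^\alpha D^\alpha u(x + \mu t^{-1}z)\,d\mu\,dz.$$

But then it is easy to conclude

$$|u - S(t)u|_0 \le M t^{-\rho}\,|u|_\rho,$$

which proves (S2).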

Let us pass now to (S3). As before, we can suppose without loss of generality that $r = 0$, in which case (S3) reduces to

$$\left|\frac{d}{dt}S(t)u\right|_0 \le M t^{-\rho-1}\,|u|_\rho.$$

But

$$\frac{d}{dt}S(t)u = \frac{d}{dt}\,t^n\int_{E^n} a(t(x - y))\,u(y)\,dy = t^{n-1}\int_{E^n}\Bigl(n\,a(t(x - y)) + \sum_{i=1}^{n} t\,(x_i - y_i)\,(D_i a)(t(x - y))\Bigr)u(y)\,dy.$$


Reasoning entirely analogous to that used in the proof of (S2) yields the desired result. As for (S4), it is a well-known result in the theory of singular integrals, and therefore we omit the proof.
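On the torus a convolution operator of this kind acts on each exponential $e^{ikx}$ by multiplication with a value of the cutoff function at $k/t$ (up to the normalization chosen for the Fourier transform). The sketch below works in one dimension on trigonometric polynomials stored as coefficient tables. The particular cutoff (equal to 1 on $|\xi| \le 1$ and 0 on $|\xi| \ge 2$), the helper names and the sample data are all assumptions made for illustration, not the construction of the text.

```python
import math

def bump(s):
    """exp(-1/s) for s > 0, else 0: the usual smooth building block."""
    return math.exp(-1.0 / s) if s > 0 else 0.0

def cutoff(xi):
    """Smooth cutoff: equals 1 for |xi| <= 1 and 0 for |xi| >= 2."""
    a, b = bump(2.0 - abs(xi)), bump(abs(xi) - 1.0)
    return a / (a + b)          # a + b > 0 for every xi

def smooth(coeffs, t):
    """Apply S(t) to a trigonometric polynomial given as {k: c_k}."""
    return {k: cutoff(k / t) * c for k, c in coeffs.items()}

u = {-7: 0.5, -3: 1.0, 3: 1.0, 7: 0.5}   # hypothetical sample data
low = smooth(u, 2.0)     # frequencies with |k| >= 4 are removed
high = smooth(u, 10.0)   # frequencies with |k| <= 10 pass unchanged
```

Large $t$ leaves more and more of $u$ untouched, which is the mechanism behind (S2) and (S4), while the removal of high frequencies is what makes derivative bounds such as (S1) possible.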

We note that the construction of the smoothing operators could be carried out for any compact manifold, and not only for the torus. For the proof, we refer the reader to J. Schwartz, On Nash's Implicit Functional Theorem, Comm. Pure Appl. Math., vol. 13 (1960), pp. 509-530. We note also that the use of spaces $C^r$ and the norms $|\cdot|_r$ is by no means essential in the proof of Nash's implicit functional theorem; indeed, these spaces can be replaced, for example, by spaces like $L^p_r(K) = L^p_r$ = space of all (possibly vector-valued) functions $f$ for which

$$\int_K |D^\alpha f|^p\,dx < \infty, \qquad |\alpha| \le r,$$

with the norm

$$\|f\|^p = \sum_{|\alpha| \le r}\int_K |D^\alpha f|^p\,dx.$$

We present now a useful corollary of Nash's Implicit Functional Theorem.

2.3. 2nd implicit functional theorem: Let $T$ = n-dimensional torus, let $f : C^k \to C^{k-\beta}$ be defined on the unit sphere of $C^k$, and suppose that

(i) $f$ has infinitely many continuous F-derivatives.

(ii) $f$ is translation invariant, i.e. if $u \in C^k$, $|u|_k < 1$,

$$f(u(\cdot + h))(x) = \{f(u)\}(x + h).$$

(iii) There exists a mapping $L(u)$ defined in the unit sphere of $C^k$ with values in $\mathcal{B}(C^k, C^{k-\alpha})$ such that $L(u)$ is translation invariant in the same sense as $f$, such that $L(u)$ has infinitely many continuous F-derivatives, and such that

(iiia) $|L(u)h|_{k-\alpha} \le M|h|_k$, $u \in C^k$, $h \in C^k$,

(iiib) $df(u)L(u)h = h$, $u, h \in C^k$.

Then, if $f(0) = 0$, $f(D(f))$ contains a $C^\infty$-neighborhood of zero.

Proof: Note that, since $f$ is translation invariant, it commutes with derivatives, so if we apply $f$ to a function in $C^{k'}$, $k' > k$, we obtain a function in $C^{k'-\beta}$; similarly $L(u)$ can be considered as a function whose domain is the unit sphere of $C^{k'}$ and range $C^{k'-\alpha}$. The inequalities and identities

(iiia)' $|L(u)h|_{k'-\alpha} \le M|h|_{k'}$, $u \in C^{k'}$, $h \in C^{k'}$,

(iiib)' $df(u)L(u)h = h$, $u, h \in C^{k'+\alpha}$

also hold. We now have only to apply Nash's implicit functional theorem, for which we need (iic) of its statement. But this is a consequence of the translation-invariance of $f$ and $L$ together with inequality (iiia) and the boundedness of the derivatives of $f$. Applying Nash's implicit functional theorem, it follows that if a point $v$ is sufficiently near to the origin in $C^{k'-\beta}$, there is a point in $C^k$ whose image is $v$. Therefore, $f(D(f))$ contains a $C^{k'-\beta}$-neighborhood of the origin, and thus a $C^\infty$-neighborhood also.

We show now how the implicit functional theorem can be applied, first to an artificial example and then to a natural one.

B. A Partial Differential Equation

Consider functions of n variables, of period $2\pi$ in each variable, i.e. functions on the n-dimensional torus. The partial differential operator

a a a 4 a 4} +: +)2

ax1) \ ax2 / aX3 / axa .

GO- a 2 a l2axs/ - ... - ax,/

has, by deliberate choice, an extremely unfortunate "mixed" character from the point of view of the theory of partial differential operators. But it is easy to see that $\Box$ admits the complete orthogonal set of functions

$$\exp\bigl(i(m_1x_1 + \cdots + m_nx_n)\bigr)$$

as eigenfunctions, and that the eigenvalues of $\Box$ are Gaussian integers. Therefore, the equation

(B1) $(\Box + \tfrac{1}{2})u = v$

is invertible in the following sense: if $v$ is a function in $L^2(K)$, then there is a function $u \in L^2(K)$ such that (B1) is valid in the $L^2$-sense.

Our aim is to show that the equation

$$f(u) = \Box u + \tfrac{1}{2}u + u^3\exp(\Box u) = v$$

has a solution $u \in C^\infty$ for sufficiently small $v$ in $C^\infty$. Observe first that $f$ maps $C^k$ into $C^{k-4}$ for any $k$ and that it has infinitely many continuous F-derivatives, the first of which is

$$df(u)h = \bigl(1 + u^3\exp(\Box u)\bigr)\,\Box h + \bigl(\tfrac{1}{2} + 3u^2\exp(\Box u)\bigr)\,h.$$


To find $L(u)$ we have to invert

(B2) $\Box h + \dfrac{\tfrac{1}{2} + 3u^2\exp(\Box u)}{1 + u^3\exp(\Box u)}\,h = \dfrac{g}{1 + u^3\exp(\Box u)}$.

For $u = 0$, (B2) reduces to (B1), so by a standard perturbation theory argument, (B2) will be invertible for any $u$ sufficiently close to zero in $C^k$. The operator $L(u)g = h$ will then be defined and certainly continuous as an operator from $C^k$ to $C^{k-4}$; thus inequality (iiia) follows for $L$, and (iiib) is an immediate consequence of the definition of $L(u)$, i.e., of the fact that $L(u)$ has infinitely many derivatives and is translation invariant. But we have now verified all the hypotheses of Theorem 2.3, so our result follows at once.

We note next that our "translation invariance" requirements on $f$ do not prevent us from treating some apparently unmanageable cases, such as

$$f(u) = \Box u(x) + c_1(x)u(x) + c_2(x)u^3(x)\exp(\Box u(x)) = v(x),$$

where $c_1(x)$ and $c_2(x)$ are $C^\infty$ functions on the n-dimensional torus. In fact, we have only to look at this problem as if it were that of solving the system of equations

$$\Box u + d_1u + d_2u^3\exp(\Box u) = v,$$

$$d_1 = c_1,$$

$$d_2 = c_2.$$

If we suppose that the operator $u \to (\Box + c_1)u$ has a bounded inverse in $L^2$ (as was the case for $c_1 = \tfrac{1}{2}$), then the first Fréchet derivative of the infinitely differentiable mapping

$$F : [d_1, d_2, u] \to [\Box u + d_1u + d_2u^3\exp(\Box u),\ d_1,\ d_2]$$

will have an inverse; in fact

$$dF[d_1, d_2, u]\,[s_1, s_2, h] = [\Box h + s_1u + d_1h + s_2u^3\exp(\Box u) + 3d_2u^2h\exp(\Box u) + d_2u^3\exp(\Box u)\,\Box h,\ s_1,\ s_2],$$

which, for $d_1$ and $d_2$ near $c_1$ and $c_2$ and $u$ sufficiently close to zero in suitable senses, may be solved as in the previous case. Observing that $F$ is translation-invariant, and reasoning as before, our result follows.


C. Embedding of Riemannian Manifolds

Bibliography

1. N. Bourbaki, Espaces vectoriels topologiques.

2. S. Helgason, Differential Geometry and Symmetric Spaces.

3. S. Lang, "Fonctions implicites et plongements Riemanniens", Sém. Bourbaki, E.N.S., exposé 237 (1961-62).

4. J. Nash, "The imbedding problem for Riemannian manifolds", Ann. of Math. vol. 63, pp. 20-63 (1956).

Now we shall consider the problem of isometric embeddings of Riemannian manifolds in Euclidean spaces. This problem was successfully treated for the first time by John Nash (see [4]), and it provides a natural application for theorems such as Theorem 2.2 above. The problem can be stated as follows. Is every Riemannian manifold (say of class $C^k$) isometrically embeddable in $R^m$? (Throughout this section, embedding means diffeomorphic mapping with injective differential at each point (= regular at each point).) Nash's answer is in the affirmative (technically, when $k \ge 3$), and he also asserts that $m$ may be chosen less than or equal to an explicit function of the dimension $n$ of the manifold (namely $m \le \tfrac{1}{2}(3n^3 + 14n^2 + 11n)$ for the general case and $m \le \tfrac{1}{2}n(3n + 11)$ if $M$ is compact). Actually we shall here prove only a weak result: our final statement will deal only with $C^\infty$ compact Riemannian manifolds and no bounds for $m$ will be determined. $M$ will henceforth denote a compact Riemannian manifold of dimension $n$. The manifold itself and its metric will be supposed of class $C^\infty$.

I. Remark: Without loss of generality the manifold $M$ may be supposed to be a torus ([3], No. 1). In fact, by Whitney's theorem (cf. G. de Rham, Variétés différentiables, or Milnor, Notes on Differential Topology, Princeton, 1959) $M$ can be represented as a closed smooth bounded submanifold of some Euclidean space $E^s$. But then by properly choosing everything we can assume that the projection of $E^s$ on some torus is 1-1 on the manifold $M$. This represents $M$ as a closed smooth submanifold of a torus.

Now we have some Riemannian metric defined on the submanifold $M$ of the torus. By a standard procedure using partitions of unity it is possible to extend this metric to a metric on all of the torus. If we now isometrically embed the torus equipped with this metric we obtain by restriction an isometric embedding of $M$. Thus we may always suppose that our manifold is a torus; this will simplify some constructions. Nevertheless we begin by discussing an arbitrary compact manifold, because as far as VI below the assumption that $M$ is a torus makes no difference in the proofs.

II. Consider the Banach space $C^r(M, R^m)$ of r-times differentiable functions on $M$ with values in $R^m$, and, more generally, the Banach space $S^r$ of symmetric, doubly covariant, r times continuously differentiable tensor fields on $M$, defined by dealing locally with matrices instead of with real numbers (see above). Such tensor fields are metrics and $R^m$ has a canonical metric, namely the Euclidean metric. Each $z \in C^r(M, R^m)$ induces "by devolution" of this metric an element of $S^{r-1}$, and therefore we have a mapping $f : C^r(M, R^m) \to S^{r-1}$ (for a more explicit definition see below). We shall show that for $m$ large enough, the image of $f$ covers an open set of $S$. To do so we shall prove the hypothesis of Theorem 2.2, and then establish our claim easily.

III. First of all we want to know the Fréchet derivatives of $f$ (and $f$ itself). Suppose that $z_1, \ldots, z_m$ are the canonical coordinates in $R^m$, and that $x_1, \ldots, x_n$ is a coordinate system defined on some open set $U$ of $M$. Then if $z \in C^r(M, R^m)$ and $f(z)$ denotes, as above, the tensor on $M$ induced by $z$, we have the following expression in coordinates:

(1) $(f(z))_{i,j} = \sum_\alpha \dfrac{\partial z_\alpha}{\partial x_i}\dfrac{\partial z_\alpha}{\partial x_j}$.

This formula is standard and may be taken as the starting point, but (at the risk of being more boring than necessary) we add the following exposition. If $X$ is a manifold and $p \in X$, denote by $TX_p$ the tangent space to $X$ at $p$. Now if $z : M \to R^m$ is smooth, $p \in M$, $q = z(p)$, $z$ has a differential at $p$, i.e., $z$ induces a map

$$z_* : TM_p \to T(R^m)_q$$

which is linear (see [2], § 3, No. 1). But the metric on $R^m$ induces an isomorphism

$$\mu : T(R^m)_q \to (T(R^m)_q)^* \qquad (* = \text{dual space}).$$

Consider now the linear mapping $A$ obtained by composition of the mappings:

$$TM_p \xrightarrow{\ z_*\ } T(R^m)_q \xrightarrow{\ \mu\ } (T(R^m)_q)^* \xrightarrow{\ {}^tz_*\ } (TM_p)^*,$$

where ${}^tz_*$ stands for the transpose of $z_*$. Clearly $A : TM_p \to (TM_p)^*$. As there exists a canonical identification:

$$\mathrm{Hom}_R(TM_p, (TM_p)^*) = (TM_p)^* \otimes_R (TM_p)^*,$$


$A$ may be considered as a doubly covariant tensor at $p$. The correspondence $z \to A$ is what we called $f$, i.e. we define $f$ by $(f(z))_p = A$. Observe that the fact that the values of $z$ are in $R^m$ has not played any special role, and any Riemannian manifold could replace $R^m$. But in our case, we know that $T(R^m)_q$ and $(T(R^m)_q)^*$ may be identified with $R^m$ itself, and that the isomorphism $\mu$ is the identity. Finally we get

(1a) $(f(z))_p = z_* \cdot {}^tz_*$.

In terms of coordinates, $z_*$ is the Jacobian matrix $z_* = J_z = \left(\dfrac{\partial z_\alpha}{\partial x_i}\right)$, and from (1a) it follows that

(1b) $f(z) = J_z\,{}^tJ_z$.

But then

$$(f(z))_{i,j} = \sum_\alpha \frac{\partial z_\alpha}{\partial x_i}\frac{\partial z_\alpha}{\partial x_j},$$

which is (1) above. From formula (1a) or (1b) it follows at once that $f : C^{r+1} \to S^r$ is a quadratic form (see Section H, Chap. I); the bilinear form $\beta$ associated with $f$ is

(2) $\beta(x, y) = x_*\,{}^ty_* + y_*\,{}^tx_*$.

Another consequence of (1b) is the continuity of $f$ (as a function from $C^{r+1}$ into $S^r$). This is clear. We may therefore apply Theorem 1.50 and conclude that $f$ has derivatives of all orders:

(3) $f'(z)h = z_*\,{}^th_* + h_*\,{}^tz_*$,

$f''(z)(h, k) = h_*\,{}^tk_* + k_*\,{}^th_*$,

$f^{(n)}(z) = 0$ if $n \ge 3$,

and that the norms satisfy

(3') $|f'(z)| \le 2\|f\|\,|z|$,

$\|f''(z)\| = 2\|f\|$.

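In terms of coordinates, (3) may be written as:

(3a) $f'(z)h = J_z\,{}^tJ_h + J_h\,{}^tJ_z$,

(3b) $(f'(z)h)_{i,j} = \sum_\alpha\left(\dfrac{\partial z_\alpha}{\partial x_i}\dfrac{\partial h_\alpha}{\partial x_j} + \dfrac{\partial z_\alpha}{\partial x_j}\dfrac{\partial h_\alpha}{\partial x_i}\right)$.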
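Formulas (1b) and (3a) can be made concrete on a small example. The sketch below assumes the standard map of the 2-torus into $R^4$, $z(x_1, x_2) = (\cos x_1, \sin x_1, \cos x_2, \sin x_2)$, which is a choice made here for illustration; the helper names are hypothetical. It computes the induced tensor and checks the special case $f'(z)z = 2f(z)$ of (3a).

```python
import math

def jacobian(x1, x2):
    """Rows: d/dx_i of z = (cos x1, sin x1, cos x2, sin x2), a map T^2 -> R^4."""
    return [[-math.sin(x1), math.cos(x1), 0.0, 0.0],
            [0.0, 0.0, -math.sin(x2), math.cos(x2)]]

def pair(Jz, Jh):
    """Entries sum_a (dz_a/dx_i)(dh_a/dx_j), i.e. J_z times transpose(J_h)."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in Jh] for ri in Jz]

def metric(J):
    """Induced tensor f(z) = J_z * transpose(J_z), formula (1b)."""
    return pair(J, J)

def derivative(Jz, Jh):
    """f'(z)h = J_z tJ_h + J_h tJ_z, formula (3a)."""
    P, Q = pair(Jz, Jh), pair(Jh, Jz)
    return [[P[i][j] + Q[i][j] for j in range(len(P))] for i in range(len(P))]

J = jacobian(0.3, 1.1)
g = metric(J)              # the flat metric: the 2 x 2 identity matrix
twice = derivative(J, J)   # f'(z)z = beta(z, z) = 2 f(z)
```

For this embedding the induced metric is the identity at every point, so the torus inherits its flat metric from $R^4$.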


Naturally we plan to show that $f'(z)h$ is invertible (as a function of $h$) in a very smooth way, i.e.,

(4) given $g \in S^r$ and $z \in C^r$ we want to find $h$ as differentiable as possible such that $g = f'(z)h$.

IV. To achieve this we use a trick invented by Nash ([4], p. 31) which adds some new conditions to the problem of solving (4). In other words, we require that the solution $h$ have an additional property given a priori, namely

(5) ${}^tz_*(h(p)) = 0$ for every $p \in M$.

Since $(T(R^m)_q)^*$ was identified with $R^m$ and

$${}^tz_* : (T(R^m)_q)^* \to (TM_p)^*, \qquad h(p) \in R^m,$$

it is clear that (5) makes sense. Of course (4) and (5) may be written in coordinates as:

(4a) $g_{ij} = \sum_\alpha\left(\dfrac{\partial z_\alpha}{\partial x_i}\dfrac{\partial h_\alpha}{\partial x_j} + \dfrac{\partial z_\alpha}{\partial x_j}\dfrac{\partial h_\alpha}{\partial x_i}\right)$,

(5a) $\sum_\alpha \dfrac{\partial z_\alpha}{\partial x_i}\,h_\alpha = 0$, $i = 1, \ldots, n$.

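We now prove that the conditions (4) and (5) may be satisfied simultaneously by a suitable $h$. From (5a) we conclude that

$$\sum_\alpha \frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} + \sum_\alpha \frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha = 0$$

or

$$\sum_\alpha \frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} = -\sum_\alpha \frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha,$$

and then (4a) becomes

(4b) $g_{ij} = -2\sum_\alpha \dfrac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha$.

This shows the point of adding the condition (5): now (4b) and (5) give a system of algebraic linear equations, equivalent to (4) and (5), which are a system of partial differential equations of first order.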
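The passage from (4a) to (4b) can be checked numerically in the simplest case n = 1. The sketch below assumes the unit circle $z(x) = (\cos x, \sin x)$ in $R^2$ and a normal field $h = \varphi(x)\,z(x)$, which automatically satisfies (5a); the function name and the choice $\varphi = \sin$ are illustrative assumptions only.

```python
import math

def check(x, phi, dphi):
    """Compare (4a) and (4b) for z(x) = (cos x, sin x), h = phi(x) * z(x)."""
    z   = (math.cos(x), math.sin(x))
    dz  = (-math.sin(x), math.cos(x))        # dz/dx
    d2z = (-math.cos(x), -math.sin(x))       # d^2 z / dx^2
    h   = (phi(x) * z[0], phi(x) * z[1])     # normal field, so (5a) holds
    dh  = (dphi(x) * z[0] + phi(x) * dz[0],
           dphi(x) * z[1] + phi(x) * dz[1])  # dh/dx by the product rule
    dot = lambda a, b: a[0] * b[0] + a[1] * b[1]
    constraint = dot(dz, h)                  # left side of (5a)
    g_4a = 2.0 * dot(dz, dh)                 # (4a) with n = 1
    g_4b = -2.0 * dot(d2z, h)                # (4b) with n = 1
    return constraint, g_4a, g_4b

constraint, g_4a, g_4b = check(0.7, math.sin, math.cos)
```

Both expressions reduce to $2\varphi(x)$ here: the differential expression (4a) and the purely algebraic expression (4b) agree once the constraint (5a) holds, which is the point of Nash's trick.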

Equations (4) and (5) (or (4b) and (5a)) can be written for every $z \in C^{r+1}(M, R^m)$. Nevertheless, we can assure the existence of a solution only for a non-void open set of $z$'s in $C^3(M, R^m)$, where $m$ is large enough ($m$ may be chosen to be $m = 2n^2 + 3n$; see [4], p. 53, but remember that we don't care about bounds for $m$).

V. Choose a mapping of $M$ into $R^s$ by functions $v_1, \ldots, v_s$. Now define a mapping $\tilde z$ of $M$ into $R^{s + (1/2)s(s+1)}$ by means of the functions

$v_1, \ldots, v_s$,

$v_1^2, v_1v_2, \ldots, v_1v_s$,

$v_2^2, \ldots, v_2v_s$,

$\ldots$

$v_s^2$.

Write

$$\tilde z_1 = v_1,\ \tilde z_2 = v_2,\ \ldots,\ \tilde z_s = v_s,\quad \tilde z_{i,j} = v_iv_j,\ i \le j.$$
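The bookkeeping for this auxiliary map is simple to reproduce. The sketch below lists, at a single point, the coordinate functions followed by their products with index pairs $i \le j$, and confirms the count $s + \tfrac12 s(s+1)$; the sample values and the function name are hypothetical.

```python
from itertools import combinations_with_replacement

def z_tilde(v):
    """Components v_1..v_s followed by the products v_i v_j with i <= j."""
    products = [a * b for a, b in combinations_with_replacement(v, 2)]
    return list(v) + products

v = [2, 3, 5, 7]                 # hypothetical values of v_1..v_s at one point
s = len(v)
components = z_tilde(v)
n = s // 2                       # if s = 2n, the count equals 2n^2 + 3n
```

With $s = 2n$ the number of components is $2n + n(2n + 1) = 2n^2 + 3n$, which matches the value of $m$ quoted above from [4].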

If $v = (v_1, \ldots, v_s)$ is a regular $C^\infty$ embedding of $M$ into $R^s$ (the existence of such is guaranteed by Whitney's Theorem), we claim that $f'(z)h$ is invertible as a function of $h$ for every $z$ in a neighborhood of $\tilde z$. In fact, if $v = (v_1, \ldots, v_s)$ is a regular embedding, one can take as local coordinates for $M$ some appropriate subset of the $v_i$'s. Suppose that $M$ has been covered by the open sets $U_1, \ldots, U_N$ in such a way that on each $U_i$ one such subset works as a coordinate system. We consider the linear systems (4b) and (5a) in this particular case. In order to simplify the notation, order the $\tilde z$'s once and for all by

$$(\tilde z_1, \ldots, \tilde z_s, \tilde z_{1,1}, \tilde z_{1,2}, \ldots, \tilde z_{s,s})$$

and write $\alpha$ as a general index for them. Let us fix one particular $U_i$ (call it simply $U$) and suppose that $x_1 = v_1$, $x_2 = v_2$, ..., $x_n = v_n$ is a system of coordinates on $U$. Consider $\tilde z$. The coefficients of (4b) and (5a) are first or second derivatives of the $\tilde z_\alpha$'s with respect to the $x_i$'s, and the matrix $B$ of the system of linear equations has the following form at every point of $U$:

[Figure: block form of the matrix $B$, with column blocks of widths $n$, $s - n$ and $k$.]

In the above, shading indicates arbitrary coefficients and $k = \tfrac{1}{2}n(n + 1)$. $I_n$, $I_k$ are the identity matrices of dimension $n$ and $k$ respectively.

This shows that the matrix $B$ has maximal rank $n + \tfrac{1}{2}n(n + 1)$ at every point of $U$ (its rows being linearly independent). The same is true (for obvious reasons) for every $z$ which is sufficiently near $\tilde z$ in the $C^2$ sense.

This remark has two strong consequences. First, it clearly shows that the systems (4b) and (5a) have solutions at every point of $U$ for all $z$, $C^2$ near $\tilde z$. This is basic in finding $h$. Second, it will follow from this that there exists a solution of (4b) and (5a) defined by a mapping that is as smooth as $g$ and the second derivatives of $z$ are. This needs some explanation. At each point of $U$, we know that (4b) and (5a) can be solved. Among the solutions of this system of linear equations we pick out one by the condition

(6) $\sum_\alpha (h_\alpha)^2 = $ minimum.

It is easy to conclude from the fact that the solutions form a convex set in Euclidean space, that one well defined solution is thereby selected.

This defines a mapping $h : U \to R^l$, $l = s + \tfrac{1}{2}s(s + 1)$, and two problems arise. (a) Is $h$ differentiable? (b) Is $h$ defined independently of $U$?

(a) Differentiability of $h$. Fix a point $p$ in $U$. Since $B$ has maximal rank, $B\,{}^tB$ is non-singular (this follows from the fact that $\det(B\,{}^tB)$ is the Gram determinant of the rows of $B$, and consequently different from zero if these rows are linearly independent). But then the equation

$$B\,{}^tB\,D = G$$

has a unique solution $D = (B\,{}^tB)^{-1}G$, for all $G$. Define

(7) $H = {}^tB\,D = {}^tB\,(B\,{}^tB)^{-1}G$.

Clearly $H$ is a solution of (4b) and (5a) at the given point of $U$. We claim that this is the solution satisfying (6). In fact, if $\tilde H$ is another solution, $B\tilde H = G$, we have (writing $(\ ,\ )$ for the scalar product in $R^l$):

$$(\tilde H, \tilde H) - (H, H) = (\tilde H - H, \tilde H - H) + 2(H, \tilde H - H)$$

and

$$(H, \tilde H - H) = ({}^tB\,D, \tilde H - H) = (D, B\tilde H - BH) = (D, G - G) = 0.$$

Then

$$(\tilde H, \tilde H) - (H, H) = (\tilde H - H, \tilde H - H),$$

and this proves that among all solutions $\tilde H$ of $B\tilde H = G$, $(\tilde H, \tilde H)$ is minimum when $\tilde H = H$.
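Formula (7) and the minimality argument can be replayed on a small system. The sketch below assumes a hypothetical 2 x 3 matrix B with independent rows and uses exact rational arithmetic; the helper `min_norm_solution` is written only for this two-row case and is not a general routine.

```python
from fractions import Fraction as Fr

def min_norm_solution(B, G):
    """H = tB (B tB)^{-1} G for a 2 x l matrix B with independent rows."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    # Gram matrix B tB of the two rows and its determinant.
    a, b, d = dot(B[0], B[0]), dot(B[0], B[1]), dot(B[1], B[1])
    det = a * d - b * b
    # D = (B tB)^{-1} G by the explicit 2 x 2 inverse.
    D = [(d * G[0] - b * G[1]) / det, (a * G[1] - b * G[0]) / det]
    # H = tB D.
    return [D[0] * r0 + D[1] * r1 for r0, r1 in zip(B[0], B[1])]

B = [[Fr(1), Fr(0), Fr(1)], [Fr(0), Fr(1), Fr(1)]]   # hypothetical data
G = [Fr(1), Fr(2)]
H = min_norm_solution(B, G)

# Any other solution differs from H by a vector in the kernel of B.
kernel = [Fr(1), Fr(1), Fr(-1)]
H_other = [h + k for h, k in zip(H, kernel)]

dot = lambda a, b: sum(x * y for x, y in zip(a, b))
apply_B = lambda X: [dot(row, X) for row in B]
```

Here `H` is orthogonal to the kernel of B, so the squared length of `H_other` exceeds that of `H` by exactly the squared length of the kernel vector, as in the identity displayed above.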


If the point $p$ in $U$ is now permitted to vary, formula (7) shows that $H$ varies as smoothly as $B$ and $G$ do. In particular it follows that $h$ is r times differentiable provided that $z$ is r + 2 times differentiable and $g$ is r times differentiable.

(b) It remains to show that the solution $H$ is independent of the coordinates chosen in $U$. But that is easy. If $h_1$, $h_2$ are solutions constructed on $U_1$ and $U_2$ respectively, since property (6) is coordinate independent, $h_1$ and $h_2$ both must possess it at every point of $U_1 \cap U_2$; by uniqueness $h_1$ and $h_2$ agree there. This proves that $h$ may be defined everywhere, and thus we are through.

The expression

(7) $H = {}^tB\,(B\,{}^tB)^{-1}G$

for $H$ at $p$ in terms of $G$ at $p$ (and $z$ locally), assures that the correspondence $g \to h$ ($z$ supposed fixed) is linear.

But it tells us still more. In fact, the matrix $B$ involves first and second derivatives of $z$ ($B$ is the matrix of (4b) and (5a)). Then from (7) it is apparent that $H$ is a smooth function of $g$ and of the first and second derivatives of $z$.

Let $L(z)g$ denote the function $H$ of (7). The smoothness of $H$ as a function of $z$ and $g$ may be stated as follows: the mapping $L(z)g$ is continuous in both variables $z$ and $g$ simultaneously for the topologies: $z \in C^{r+2}(M, R^m)$, $g \in S^r$, $L(z)g \in C^r(M, R^m)$.

Naturally the equivalence between (4) and (5) and (4b) and (5a) can be proved so long as $h$ has at least first derivatives (otherwise (4) does not make sense). For that we need $L(z)g$ to belong at least to $C^1(M, R^m)$. This requires r to be 3 or more.

We sum up as follows:

(8) For every $r \ge 3$ there exists an open set $O$ in $C^{r+2}(M, R^m)$ such that a function $L(z)g$ is defined on $O \times S^r$, is continuous (in both variables simultaneously), and has values in $C^r(M, R^m)$ satisfying:

(8i) $f'(z) \circ L(z)g = g$, $(z, g) \in O \times S^r$;

(8ii) $L(z)g$ is linear in $g$ for every fixed $z$ in $O$;

(8iii) the elements in $O$ are 1-1 and regular at every point.

(8iii) follows from the fact that we may choose $O$ to be a small neighborhood of the embedding $\tilde z$ above, and that the set of embeddings is open in $C^r$, $r \ge 1$.

VI. We now assume that $M$ is a torus (cf. I). Hence there exists a global set of (local) coordinates (the angular parameters $x_1, \ldots, x_n$) and consequently there is a standard way of expressing the doubly covariant tensor fields as n x n matrices of functions, just by taking the components of such tensors in coordinates $x_1, \ldots, x_n$ at each point. This means that it is possible to identify $S^r$ with $(C^r(M, R))^{n^2}$. But then the mapping $f : C^{r+1}(M, R^m) \to S^r$ may be assumed to have range in $(C^r(M, R))^{n^2}$ and hence to split in $n^2$ components each with range in $C^r(M, R)$.

Thus each component inherits from $f$ all properties vis-à-vis derivatives. We leave to the reader now the verification that these components satisfy the hypotheses of Theorem 2.2.

We can then apply Theorem 2.2 and conclude that:

(9) The image under $f$ of the set of infinitely often continuously differentiable embeddings of $M$ in some Euclidean space covers an open set of $S$.

Remark: We may restrict our embeddings to be embeddings of $M$ in some fixed $R^m$, and the conclusion should remain the same. But our next step will consist of adding directly two such embeddings and the bound $m$ will vanish. For that reason (9) is stated without any reference to the ranges of the embeddings considered.

VII. Let K' = the set of all tensor fields in Sr that are metric, that is tosay positive definite at each point of M. Clearly K' is a convex cone, open forthe- S' topology.

For every C'+1 embedding zof M in some Euclidean space, f(z) belongsto K'. Let E' c K' be the set of all such f(z).

Lemma: $E^\infty$ is a convex cone dense in $K^\infty$ for the $S^\infty$ topology.

Proof: (i) $E^\infty$ is a convex cone. For every $\lambda \ge 0$ we have $\lambda f(z) = f(\sqrt{\lambda}\, z)$ (f is a quadratic form). If $z : M \to R^m$, $u : M \to R^l$, then the embedding $t = z \oplus u : M \to R^m \oplus R^l$ (defined in the obvious way) satisfies $f(t) = f(z) + f(u)$. Both properties together define a convex cone.

(ii) $E^\infty$ is dense in $K^\infty$.

Proof: Suppose the contrary, and let $\bar E^\infty$ be the closure of $E^\infty$. Then there exists a point $g \in K^\infty$ such that $g \notin \bar E^\infty$. By the separating hyperplane property of the locally convex F-space $S^\infty$ of all $C^\infty$ tensors on the manifold M, there exists a continuous linear functional $\phi$ on $S^\infty$ such that $\phi(\bar E^\infty) \le 0$ and $\phi(g) > 0$. Let z be any arbitrary embedding of M into a Euclidean space, and let u be any arbitrary smooth mapping of M into a Euclidean space. Then, for any positive $\varepsilon$, $\varepsilon z$ is an embedding, and hence so is $\varepsilon z \oplus u$. Since $\phi(f(\varepsilon z \oplus u)) \le 0$ for all $\varepsilon > 0$, and $f(\varepsilon z \oplus u) = \varepsilon^2 f(z) + f(u)$ by (i), it follows on letting $\varepsilon \to 0$ that $\phi(f(u)) \le 0$ for every smooth mapping u of M into a Euclidean space (cf. formula (1) above). By formula (1) above, this is equivalent to the statement that $\phi(f(u)) \le 0$ for each smooth mapping u of M into 1-dimensional Euclidean space, that is, that $\phi(f(u)) \le 0$ for each smooth real-valued function u on M.

We now let $V \subseteq M$ be a coordinate patch, introduce coordinates $[x_1, x_2, \dots, x_n] = [x_1, y] = x$ in V mapping V onto the unit sphere in Euclidean space, and restrict the functional $\phi$ to the set $S_V$ of tensors in $S^\infty$ vanishing outside V. For $h \in S_V$, $\phi(h)$ may be written as

$\phi(h) = \sum_{i,j=1}^n D^{ij}(h_{ij}(\cdot))$,

where $D^{ij} = D^{ji}$ is a distribution defined in the unit sphere of n-dimensional Euclidean space, and where $h_{ij}(x)$ is the coordinate expression of the tensor $h \in S_V$. The above condition $\phi(f(u)) \le 0$ evidently implies that

$\sum_{i,j=1}^n D^{ij}\left(\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\right) \le 0$

for all smooth functions u vanishing outside a compact subset of the unit sphere.

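Let $\eta$ be a smooth non-negative function in $R^n$, of total integral 1, vanishing outside the unit sphere. We know from the general theory of distributions that the "convolutions"

$D_\varepsilon^{ij}(y) = D^{ij}\big(\varepsilon^{-n}\,\eta((\cdot - y)/\varepsilon)\big)$

are a family of $C^\infty$ functions, defined in the sphere of radius $1 - \varepsilon$, and converging as $\varepsilon \to 0$ to $D^{ij}$ in the sense of the theory of distributions. From the statement $\sum_{i,j=1}^n D^{ij}\left(\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\right) \le 0$ it is easily verified that

[*] $\sum_{i,j=1}^n \int D_\varepsilon^{ij}(x)\,\frac{\partial u}{\partial x_i}(x)\,\frac{\partial u}{\partial x_j}(x)\,dx \le 0$

for each smooth function u vanishing outside the sphere of radius $1 - \varepsilon$. We shall show that [*] implies that

[**] $\sum_{i,j=1}^n D_\varepsilon^{ij}(x)\,\xi_i\,\xi_j \le 0$ for each $|x| < 1 - \varepsilon$

and each vector $\xi \in R^n$.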

We proceed as follows. First note that, by the rotational symmetry and the homogeneity of the condition [*], it is sufficient to prove [**] in the special, notationally simpler case $\xi = [1, 0, \dots, 0]$ and $x = [c, y]$, i.e., to prove that $D_\varepsilon^{11}(c, y) \le 0$ for each small c and $|y| < 1 - \varepsilon$. To establish this last inequality, let $\psi$ be a smooth function of a real variable t, equal to a constant in a small neighborhood of $t = 0$ and vanishing for $|t| > \tfrac12(1 - \varepsilon)$, let $\varphi$ be a smooth function of a real variable with compact support, and put $u_\delta(x) = \delta^{1/2}\,\varphi((x_1 - c)/\delta)\,\psi(|y|)$. Then

$\frac{\partial u_\delta}{\partial x_1}(x)\,\frac{\partial u_\delta}{\partial x_1}(x) = \delta^{-1}\big(\varphi'((x_1 - c)/\delta)\big)^2\,(\psi(|y|))^2$,

so that

$\int \left|\frac{\partial u_\delta}{\partial x_1}(x)\right|^2 dx = \iint (\varphi'(x_1 - c))^2\,\psi(|y|)^2\,dx$

is independent of $\delta$, while all the other products of partial derivatives of $u_\delta(x)$ have integrals which go to zero as $\delta \to 0$. Thus, choosing $\varphi$ so that

$\iint |\varphi'(x_1 - c)|^2\,|\psi(|y|)|^2\,dx > 0$,

putting $u_\delta$ for u in [*], and letting $\delta \to 0$, we find that

$\int D_\varepsilon^{11}(c, y)\,(\psi(|y|))^2\,dy \le 0$.

Therefore, letting $\psi$ vary through a sequence of functions approaching a $\delta$-function, we find that $D_\varepsilon^{11}(c, y) \le 0$ for $|y| < 1 - \varepsilon$. Therefore, as already observed, [**] follows.

Now note that if $-A$ is a positive symmetric matrix and B is a positive symmetric matrix, then $\operatorname{tr}(AB) \le 0$. Indeed, we have

$\operatorname{tr}(AB) = \operatorname{tr}(A B^{1/2} B^{1/2}) = -\operatorname{tr}\big(B^{1/2}(-A)^{1/2}(-A)^{1/2}B^{1/2}\big) = -\operatorname{tr}(CC^*)$,

where $C = B^{1/2}(-A)^{1/2}$. Since $CC^* \ge 0$ we have $\operatorname{tr}(CC^*) \ge 0$ and our conclusion follows. Therefore it is a consequence of [**] that

[***] $\sum_{i,j=1}^n D_\varepsilon^{ij}(x)\,h_{ij}(x) \le 0$

for every smooth positive symmetric tensor $h_{ij}(x)$ vanishing outside $|x| < 1 - \varepsilon$. Integrating the inequality [***] and letting $\varepsilon \to 0$ we conclude that

$\sum_{i,j=1}^n D^{ij}(h_{ij}) \le 0$

for every positive symmetric tensor vanishing outside a compact subset of the unit sphere, i.e. that $\phi(h) \le 0$ for each positive $h \in S^\infty$ vanishing outside the coordinate neighborhood V. Since, by use of an appropriate partition of unity, any positive-definite $g \in S^\infty$ can be written as a sum of positive elements $h_\alpha \in S^\infty$, each vanishing outside a certain coordinate patch of M, it follows that $\phi(g) \le 0$ for each $g \in K^\infty$. But this contradicts our original statement $\phi(g) > 0$, and thus completes the proof of assertion (ii).

Q.E.D.
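The trace inequality used in the proof above (tr(AB) is non-positive when -A and B are positive symmetric) can be checked numerically. The following is only a sketch in plain Python: the matrices M and N are hypothetical sample data, and the Gram products M M^T and N N^T stand in for B and -A, which avoids computing the square roots that appear in the text.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def gram(M):
    """Return M M^T, which is symmetric and positive semi-definite."""
    return matmul(M, transpose(M))

# Hypothetical sample data (not from the text).
M = [[1.0, 2.0, 0.0], [0.0, 1.0, -1.0], [3.0, 0.0, 1.0]]
N = [[2.0, -1.0, 0.5], [1.0, 1.0, 0.0], [0.0, 4.0, 1.0]]

B = gram(M)                                  # B >= 0
A = [[-x for x in row] for row in gram(N)]   # -A >= 0

tr_AB = trace(matmul(A, B))

# With C = N^T M one has tr(AB) = -tr(C C^T), i.e. minus the squared
# Frobenius norm of C, which mirrors the identity tr(AB) = -tr(CC*).
C = matmul(transpose(N), M)
frobenius_sq = sum(x * x for row in C for x in row)

print(tr_AB, -frobenius_sq)
```

The two printed numbers should agree up to rounding, and both should be non-positive.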

2.4. Theorem: (Nash, [4]). For every compact Riemannian manifold M with a $C^\infty$ metric, there exists a $C^\infty$ isometric embedding of M in some Euclidean space.

Proof: If $E^\infty$, $K^\infty$ are the cones defined above, our theorem states simply that $E^\infty = K^\infty$. By the lemma above we know that $E^\infty$ is dense in $K^\infty$; and from (9) we also know that $E^\infty$ has interior points.

Let $g \in K^\infty$, and let $g_0$ be an interior point of $E^\infty$. As $K^\infty$ is open, there exists an element $\bar g$ in $K^\infty$ such that g is a convex combination of $g_0$ and $\bar g$. But then $\bar g$ is also a cluster point of $E^\infty$ because it belongs to $K^\infty$. Moreover, $g_0$ is interior to $E^\infty$ and $E^\infty$ is convex. This implies that all points in the open segment joining $g_0$ and $\bar g$ (in particular g) are interior points of $E^\infty$ (see [1], E.V.T., Chap. II, § 1, prop. 15), and we are done: $E^\infty = K^\infty$.


CHAPTER III

Degree Theory and Applications

A. A Form of Sard's Lemma
B. Definition of the Degree of a $C^1$ Mapping in $R^n$
C. Some Functions are Divergences
D. Back to the Definition
E. The Continuous Case
F. The Multiplicative Property and Consequences
G. Borsuk's Theorem
H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Space
I. Preliminaries: Restriction to a Subspace
J. Degree of Finite Dimensional Perturbations of the Identity
K. Properties
L. Limits
M. Compact Perturbations
N. Multiplication Property and Generalized Jordan's Theorem for Banach Spaces
O. Fixed Point Theorems in Banach Spaces

A. A Form of Sard's Lemma

Our aim is to prove the following Theorem 3.1, which is related to Sard's lemma (cf. de Rham, "Variétés différentiables", Sec. 3, Th. 4).

3.1. Theorem: Let D be an open set in $R^n$, let f be a continuously differentiable mapping of D into $R^n$, and let J(x) be the Jacobian determinant of f at x. Then for any measurable subset E of D the set f(E) is measurable and

$m(f(E)) \le \int_E |J(x)|\,dx.$


We begin by recalling a few definitions. For any point $v_0$ of $R^n$ and any set of n linearly independent vectors $a_1, \dots, a_n$ in $R^n$, the parallelotope P with initial vertex $v_0$ and edge-vectors $a_1, \dots, a_n$ is the set of all points of $R^n$ of the form $x = v_0 + \sum_{i=1}^n \lambda_i a_i$, where $\lambda_1, \dots, \lambda_n$ are real numbers such that $0 \le \lambda_i \le 1$, $i = 1, \dots, n$. The point $v_0 + \tfrac12 \sum_{i=1}^n a_i$ is called the center of the parallelotope.

For fixed k the set of those points of P for which $\lambda_k$ has a fixed value equal to either 0 or 1 is called an (n - 1)-dimensional face (or, briefly, face) of P, so that the number of faces of P is 2n. The point $v_0 + \lambda_k a_k + \tfrac12 \sum_{i \ne k} a_i$ is called the center of the face.

It is immediate that the parallelotope P with initial vertex at the origin and edge-vectors $a_1, \dots, a_n$ is the image of the unit cube

$\{x = (x^1, \dots, x^n) : 0 \le x^i \le 1,\ i = 1, \dots, n\}$

under the non-singular linear transformation $h : R^n \to R^n$ given by

$h(x) = h(x^1, \dots, x^n) = \sum_{i=1}^n x^i a_i$;

moreover the image of the unit cube by any non-singular linear transformation of $R^n$ onto itself is a parallelotope of this form. It follows that P is compact, and that the frontier of P is the union of the 2n faces of P. Moreover (see, for example, Zaanen [4], p. 160) the n-dimensional measure of P is equal to $|\det(h)| = |\det(a_i^j)|$, where $a_i^j$ is the j-th coordinate of $a_i$; and obviously these last results extend to a parallelotope with any initial vertex.
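As a small illustration of the definitions above in the plane case, the following Python sketch builds a parallelogram from an initial vertex and two edge-vectors, and compares the determinant formula for its measure with the area computed directly from its vertices (shoelace formula). The vertex and edge-vectors are hypothetical sample values, not taken from the text.

```python
def det2(a1, a2):
    """Determinant of the 2 x 2 matrix whose rows are a1 and a2."""
    return a1[0] * a2[1] - a1[1] * a2[0]

def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def add(u, w):
    return (u[0] + w[0], u[1] + w[1])

# Hypothetical data: initial vertex v0 and edge-vectors a1, a2.
v0 = (1.0, -2.0)
a1 = (3.0, 1.0)
a2 = (1.0, 2.0)

# Vertices of P in order around the boundary.
vertices = [v0, add(v0, a1), add(add(v0, a1), a2), add(v0, a2)]

area_from_det = abs(det2(a1, a2))
area_from_vertices = shoelace_area(vertices)

# Center of P: v0 + (a1 + a2) / 2.
center = (v0[0] + 0.5 * (a1[0] + a2[0]), v0[1] + 0.5 * (a1[1] + a2[1]))

print(area_from_det, area_from_vertices, center)
```

Both area computations agree, and translating the initial vertex does not change the result, as stated for a parallelotope with any initial vertex.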

Throughout the following discussion we use the ordinary Euclidean norm for points of $R^n$ and the corresponding norm for linear transformations of $R^n$ into itself, and we denote the inner product of x and y by $x \cdot y$. We use A(n) to denote a positive constant depending only on n, not necessarily the same on any two occurrences.

For our proof of Theorem 3.1 we require two simple geometrical inequalities, which we state below.

1. Let F be a set in $R^n$ contained in a hyperplane H, let $x_0$ be a fixed point of F, and let $\|x - x_0\| \le d$ whenever $x \in F$. Let also G be the set of points of $R^n$ whose distance from F is less than $\delta$. Then G is measurable (since it is open) and

(α.1) $m(G) \le 2^n\,(d + \delta)^{n-1}\,\delta$.


It is evident that G lies between the two hyperplanes parallel to H and distance $\delta$ from it, and to prove (α.1) we construct a parallelotope containing G with two of its faces in these hyperplanes.

By a suitable translation we may suppose that H contains the origin, so that H is an (n - 1)-dimensional vector subspace of $R^n$. We can therefore find a unit vector $a_1$ such that $x \cdot a_1 = 0$ for all $x \in H$ (i.e. such that $a_1$ is orthogonal to H), and then we can find vectors $a_2, \dots, a_n$ such that $\{a_1, a_2, \dots, a_n\}$ is a complete orthonormal set in $R^n$. Let now $y \in G$. Since every vector in $R^n$ can be expressed as a linear combination of the $a_i$, there exist real numbers $\lambda_1, \dots, \lambda_n$ such that

$y - x_0 = \sum_{i=1}^n \lambda_i a_i$.

Further, since the distance of y from F is less than $\delta$, there exists $x \in F$ (possibly identical with y) such that $\|y - x\| < \delta$, and then writing

$y - x = (y - x_0) - (x - x_0)$,

and noting that $(x - x_0) \cdot a_1 = 0$ because x and $x_0$ lie in H, we obtain

$\lambda_1 = (y - x_0) \cdot a_1 = (y - x) \cdot a_1 + (x - x_0) \cdot a_1 = (y - x) \cdot a_1$,

whence

$|\lambda_1| \le \|y - x\|\,\|a_1\| = \|y - x\| < \delta$.

Also

$\|y - x_0\| \le \|y - x\| + \|x - x_0\| < \delta + d$,

so that for $i = 2, \dots, n$,

$|\lambda_i| = |(y - x_0) \cdot a_i| \le \|y - x_0\|\,\|a_i\| = \|y - x_0\| \le d + \delta$.

It follows that G is contained in the (fixed) parallelotope with center $x_0$ and edge-vectors $2\delta a_1$, $2(d + \delta) a_i$, $i = 2, \dots, n$, and since the measure of this parallelotope is $2^n (d + \delta)^{n-1} \delta\,|\det(a_i^j)| = 2^n (d + \delta)^{n-1} \delta$, the result follows.

2. Let h be a linear transformation of $R^n$ into itself, let P be the image by h of the unit cube $C = \{x = (x^1, \dots, x^n) : 0 \le x^i \le 1,\ i = 1, \dots, n\}$, and let Q be the set of points of $R^n$ whose distance from P is less than $\delta$. Then Q is measurable (since it is open) and

$m(Q) \le |\det(h)| + A(n)\,(\|h\| + \delta)^{n-1}\,\delta$.

Suppose first that $\det(h) = 0$, so that h is singular. In this case P is contained in a hyperplane, and we apply statement 1 to F = P, taking $x_0$ to be the image of the center $w_0$ of C. Since

$\|h(w) - h(w_0)\| = \|h(w - w_0)\| \le \|h\|\,\|w - w_0\| \le \tfrac12\sqrt{n}\,\|h\|$

whenever $w \in C$, we have $\|x - x_0\| \le \tfrac12\sqrt{n}\,\|h\|$ whenever $x \in P$, whence statement 1 above gives

$m(Q) \le 2^n\,(\tfrac12\sqrt{n}\,\|h\| + \delta)^{n-1}\,\delta \le A(n)\,(\|h\| + \delta)^{n-1}\,\delta$,

as required.

Suppose next that $\det(h) \ne 0$. In this case P is a parallelotope with measure $m(P) = |\det(h)|$, and it is therefore enough to prove that the open set $Q - P$ has measure not exceeding $A(n)\,(\|h\| + \delta)^{n-1}\,\delta$. Since P is compact, for each $y \in Q - P$ there exists $x \in P$ such that $\|y - x\|$ is equal to the distance of y from P, and evidently x is a frontier point of P, so that x lies on one or more (n - 1)-dimensional faces of P. Since P has 2n such faces and each face is the image by h of a face of C, it is now enough to prove that if B is a face of C and E is the set of points of $R^n$ whose distance from h(B) is less than $\delta$, then

(α.2) $m(E) \le A(n)\,(\|h\| + \delta)^{n-1}\,\delta$.

To prove this last result we observe that h(B) is contained in a hyperplane, so that we can apply 1 to F = h(B). We choose $x_0$ to be the center of the face h(B) of P, so that $\|x - x_0\| \le \tfrac12\sqrt{n - 1}\,\|h\|$ whenever $x \in h(B)$, and then 1 gives

$m(E) \le 2^n\,(\tfrac12\sqrt{n - 1}\,\|h\| + \delta)^{n-1}\,\delta \le A(n)\,(\|h\| + \delta)^{n-1}\,\delta$.

This proves (α.2), and completes the proof of 2.

From 2 we deduce immediately:

3. Let C be a closed cube in $R^n$ with sides parallel to the axes and of length a, let h be a linear transformation of $R^n$ into itself, and let Q be the set of points of $R^n$ whose distance from the set h(C) is less than $a\delta$. Then Q is measurable (since it is open) and

(α.3) $m(Q) \le m(C)\,\{|\det(h)| + A(n)\,(\|h\| + \delta)^{n-1}\,\delta\}$.

* In the case in which $\det(h) \ne 0$ it is tempting to estimate m(Q) by using the inequality $m(Q) \le m(P')$, where P' is the smallest parallelotope containing Q with sides parallel to those of P, but unfortunately the measure m(P') tends to infinity as we approach the singular case, i.e. as $\det(h)$ tends to 0 (this is easily seen from a diagram illustrating the plane case). Most proofs of the change of variable formula in which the estimate of the measure of a parallelotope appears do in fact use an estimate of the form $m(Q) \le m(P')$, and it is for this reason that the hypothesis $\inf |J(x)| > 0$ is essential to such proofs.


By applying 3 to the derivative of a differentiable mapping, we obtain the following result; in this we use the definition of derivative given in 1.9.

4. Let C be a closed cube in $R^n$ with center $x_0$ and with sides parallel to the axes, let f be a differentiable mapping of C into $R^n$, and let J(x) be the Jacobian determinant of f at x. Then

(α.4) $m^*(f(C)) \le m(C)\,\{|J(x_0)| + A(n)\,(\|f'(x_0)\| + \eta)^{n-1}\,\eta\}$,

where $\eta = \sup_{x \in C} \|f'(x) - f'(x_0)\|$ and $m^*$ denotes outer Lebesgue measure.

To prove (α.4) let a be the length of the sides of C, and let P be the image of C by the linear transformation $f'(x_0) : R^n \to R^n$. By the mean value theorem applied to the function $f - f'(x_0)$ (cf. the proof of Corollary 1.45), we have for each x of C

$\|f(x) - f(x_0) - f'(x_0)(x - x_0)\| \le \eta\,\|x - x_0\| < \eta a \sqrt{n}$,

and this inequality expresses the fact that the point $f(x) - f(x_0) + f'(x_0)(x_0)$ of the translate $f(C) - f(x_0) + f'(x_0)(x_0)$ of f(C) is at a distance less than $\eta a \sqrt{n}$ from the point $f'(x_0)(x)$ of P. It follows that this translate of f(C) is contained in the set of points of $R^n$ whose distance from P is less than $\eta a \sqrt{n}$, and applying 3 (and noting that $\det(f'(x_0)) = J(x_0)$) we immediately obtain the inequality (α.4).

5. Let D be an open set in $R^n$, let f be a continuously differentiable mapping of D into $R^n$, and let J(x) be the Jacobian determinant of f at x. Then for any measurable subset E of D

(β.1) $m^*(f(E)) \le \int_E |J(x)|\,dx$,

where $m^*$ denotes outer Lebesgue measure.

Suppose first that E is a closed cube C with sides parallel to the axes. Since f' is continuous on C, given $\varepsilon > 0$ we can divide C into a finite number of non-overlapping closed cubes $C_1, \dots, C_N$ with centers $x_1, \dots, x_N$ and with sides parallel to the axes such that $\|f'(x) - f'(x_k)\| \le \varepsilon$ whenever $x \in C_k$ ($k = 1, \dots, N$). By 4, for each cube $C_k$ we have

$m^*(f(C_k)) \le m(C_k)\,\{|J(x_k)| + A\varepsilon\}$,

where A is independent of k, so that also

$m^*(f(C)) \le \sum m^*(f(C_k)) \le \sum |J(x_k)|\,m(C_k) + A\varepsilon\,m(C)$,

the summations being extended over all cubes $C_k$. When the maximum diameter of the cubes $C_k$ tends to 0 the sum $\sum |J(x_k)|\,m(C_k)$ tends to the Riemann integral of $|J(x)|$ over C, and since $\varepsilon$ is arbitrary we therefore obtain

(β.2) $m^*(f(C)) \le \int_C |J(x)|\,dx$,

which is (β.1) for E = C.

Suppose next that E is a measurable subset of D. Then we can find a set $E_1$ containing E and with measure equal to that of E such that $E_1$ is the intersection of a contracting sequence of open sets $O_n \subset D$. If now C is a closed cube contained in D with sides parallel to the axes, then for each fixed n the set $C \cap O_n$ is a countable union of non-overlapping closed cubes with sides parallel to the axes, and applying (β.2) to each such cube and summing we obtain

$m^*(f(C \cap O_n)) \le \int_{C \cap O_n} |J(x)|\,dx$,

whence also

(β.3) $m^*(f(C \cap E)) \le \int_{C \cap O_n} |J(x)|\,dx$

(since $E \subset O_n$). Since J is bounded on C, the integral on the right of (β.3) is finite, and so tends to $\int_{C \cap E_1} |J(x)|\,dx$ as n tends to $+\infty$, whence

$m^*(f(C \cap E)) \le \int_{C \cap E_1} |J(x)|\,dx = \int_{C \cap E} |J(x)|\,dx$.

Since D is a countable union of non-overlapping cubes such as C, the general result (β.1) follows.

6. Let D be an open set contained in $R^n$, and let f be a continuously differentiable mapping of D into $R^n$. Then f(E) is measurable for every measurable set $E \subset D$.

(For a proof of this under more general hypotheses see Rado and Reichelderfer [1], pp. 337, 214).

Let J(x) be the Jacobian determinant of f at x. It follows immediately from 5 that if $E_0$ is the subset of D where J(x) = 0, then $f(E_0)$ has measure 0, so that $m(f(E \cap E_0)) = 0$ for every measurable $E \subset D$. Since $D - E_0$ is open, it is therefore enough to prove the above result when $J(x) \ne 0$ on D.

Suppose then that $J(x) \ne 0$ on D, so that f is locally a homeomorphism. The open set D is a countable union of closed cubes, and, by the Heine-Borel theorem, we can cover each of these cubes with a finite number of closed cubes on each of which f is a homeomorphism. Hence D is a countable union of closed cubes $C_j$ on each of which f is a homeomorphism, and since $f(E) = \bigcup_j f(E \cap C_j)$, it is enough to prove that $f(E \cap C)$ is measurable whenever E is measurable and C is a closed cube in D on which f is a homeomorphism.

If E is closed, so are $E \cap C$ and $f(E \cap C)$, and hence if E is a countable union of closed sets, then $f(E \cap C)$ is measurable. Since any measurable set is the union of a set of measure zero and a set which is a countable union of closed sets, it is now enough to prove that $f(E \cap C)$ is measurable when E is of measure zero, and this follows immediately from 5. This completes the proof of 6, and hence also of Theorem 3.1.
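A one-dimensional example may help to see why Theorem 3.1 is an inequality rather than an equality. The sketch below, in Python, takes the hypothetical map f(x) = x^2 on E = [-1, 1] (an illustration chosen here, not an example from the text). The image f(E) = [0, 1] has measure 1, while the integral of |J| = |2x| over E is 2, because f covers the image twice.

```python
def f(x):
    return x * x

def jacobian(x):
    # In one dimension the Jacobian determinant is just f'(x).
    return 2.0 * x

def integral_abs_jacobian(a, b, n=100000):
    """Midpoint-rule approximation of the integral of |J| over [a, b]."""
    h = (b - a) / n
    return sum(abs(jacobian(a + (k + 0.5) * h)) for k in range(n)) * h

def image_measure(a, b, n=100000):
    """Length of f([a, b]); f is continuous, so the image is the interval
    [min f, max f], approximated here by sampling on a uniform grid."""
    values = [f(a + k * (b - a) / n) for k in range(n + 1)]
    return max(values) - min(values)

measure = image_measure(-1.0, 1.0)
bound = integral_abs_jacobian(-1.0, 1.0)
print(measure, bound)
```

On an interval where f is one-to-one, for instance [0, 1], the two quantities coincide, which is the usual change of variables formula.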

References

1. T. Rado and P. V. Reichelderfer, Continuous transformations in analysis (Berlin, 1955).
2. G. de Rham, Variétés différentiables (Paris, 1955).
3. J. Schwartz, "The formula for change in variables in a multiple integral", Amer. Math. Monthly 61 (1954), 81-5.
4. A. C. Zaanen, An introduction to the theory of integration (Amsterdam, 1958).

B. Definition of the Degree of a $C^1$ Mapping in $R^n$

Notation: Until further comment, D will denote an open bounded set of $R^n$, the Euclidean space whose coordinates are $x = (x^1, \dots, x^n)$. Let $\partial D$ denote the boundary of D.

Most of the mappings appearing below are continuous on $\bar D$. We shall write C for the space $C(\bar D, R^n)$ of continuous mappings defined on $\bar D$ and having values in $R^n$. By a $C^r$ function on $\bar D$ we mean a function having derivatives on a neighborhood of $\bar D$ up to order r which coincide with restrictions of continuous functions on D. Cf. Chapter I for the topology of $C^r$.

Suppose that $\phi \in C$ is $C^1$ on D and that $p \in R^n$ is a point not belonging to $\phi(\partial D)$. We shall define the degree of $\phi$ with respect to p and D; it will be an integer denoted by $\deg(p, \phi, D)$.

If $Z \subset D$ is the set of critical points of $\phi$, i.e. points at which the Jacobian of $\phi$ vanishes, and $\phi^{-1}(p) \cap Z = \emptyset$, then the set $\phi^{-1}(p)$ is discrete, by the implicit function theorem; since $\bar D$ is compact, this set is finite. At each $x \in \phi^{-1}(p)$, $J_\phi$ does not vanish. Then its sign is unambiguously defined and we define

(II.1) $\deg(p, \phi, D) = \sum_{x \in \phi^{-1}(p)} \operatorname{sign} J_\phi(x)$.

Suppose now that $\phi^{-1}(p) \cap Z \ne \emptyset$. By Sard's Lemma 3.1, $\phi(Z)$ has measure zero in $R^n$, and in particular, has empty interior. This implies that the point p may be approximated as closely as desired by points q for which $\phi^{-1}(q) \cap Z = \emptyset$. For each q, the degree is defined as above. Then, by definition, the degree of p is

(II.2) $\deg(p, \phi, D) = \lim_{q \to p,\ \phi^{-1}(q) \cap Z = \emptyset} \deg(q, \phi, D)$.

In order to justify this definition we must prove that the limit exists (and that it is independent of the choice of the q's). This will follow from more general results to be obtained later (see Corollary 3.10). As a first remark we note that if $p \notin \phi(\bar D)$, then $\deg(p, \phi, D) = 0$.
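Definition (II.1) can be tried out on a simple plane example. The sketch below, in Python, uses the hypothetical map $\phi(x, y) = (x^2 - y^2, 2xy)$ (the complex squaring map, chosen here for illustration) on the open disc D of radius 2. Its Jacobian is $4(x^2 + y^2)$, so the only critical point is the origin, and for $p \ne 0$ the preimages are the two complex square roots of p.

```python
import cmath
import math

def phi(x, y):
    """The map z -> z^2 written in real coordinates."""
    return (x * x - y * y, 2.0 * x * y)

def jacobian(x, y):
    """Jacobian determinant of phi."""
    return 4.0 * (x * x + y * y)

def preimages(p):
    """The two solutions of phi(x, y) = p, for p different from the origin."""
    w = cmath.sqrt(complex(p[0], p[1]))
    return [(w.real, w.imag), (-w.real, -w.imag)]

def degree(p, radius):
    """Sum of sign J over the preimages of p lying in the open disc of the
    given radius, as in (II.1). Assumes p is not the critical value 0 and
    that |p| differs from radius**2, so that p is not in phi(boundary)."""
    total = 0
    for (x, y) in preimages(p):
        if math.hypot(x, y) < radius:
            J = jacobian(x, y)
            total += (J > 0) - (J < 0)
    return total

print(degree((1.0, 0.0), 2.0))   # both square roots lie in D
print(degree((9.0, 0.0), 2.0))   # p lies outside phi(closure of D)
```

The second call illustrates the remark that the degree vanishes when p is not in the image of the closure of D.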

Let us consider the special case when $\phi^{-1}(p) \cap Z = \emptyset$. Suppose that $\{f_\varepsilon\}$ is a family of continuous functions $f_\varepsilon : R^n \to R$ with the properties

(II.3)
(i) $\int_{R^n} f_\varepsilon(x)\,dx = 1$;
(ii) $K_\varepsilon$ = support $f_\varepsilon$ = sphere of radius $\varepsilon$ and center at p.

Let us consider the integrals

$I_\varepsilon = \int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx$.

We shall prove that $I_\varepsilon = \deg(p, \phi, D)$ for every $\varepsilon$ small enough.

Let $p^1, \dots, p^s$ be the elements of $\phi^{-1}(p)$ (as was observed, this set is finite). When $\varepsilon$ is small enough, there exist neighborhoods $A_\varepsilon^1, \dots, A_\varepsilon^s$ of $p^1, \dots, p^s$ such that $\phi$ maps each $A_\varepsilon^i$ homeomorphically onto $K_\varepsilon$, the sphere of radius $\varepsilon$ around p. Observe that $f_\varepsilon(\phi(x))$ vanishes outside $\bigcup_{i=1}^s A_\varepsilon^i$. It follows that

(1) $\int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx = \sum_{i=1}^s \int_{A_\varepsilon^i} f_\varepsilon(\phi(x))\,J_\phi(x)\,dx$.

Since $J_\phi(p^i) \ne 0$ for $i = 1, \dots, s$ (recall that $\phi^{-1}(p) \cap Z = \emptyset$), by choosing $\varepsilon$ small enough, we may assume that $J_\phi \ne 0$ at every point in every $A_\varepsilon^i$. In that case sign $J_\phi$ is (defined and) constant on every $A_\varepsilon^i$. Hence we may consider $\phi : A_\varepsilon^i \to K_\varepsilon$ as a change of variables and apply the classical theorems on change of variables in an integral. This leads to the equality:

(2) $\int_{A_\varepsilon^i} f_\varepsilon(\phi(x))\,J_\phi(x)\,dx = \operatorname{sign} J_\phi(p^i) \int_{K_\varepsilon} f_\varepsilon(x)\,dx = \operatorname{sign} J_\phi(p^i)$.

Combining (1) and (2) yields

$\int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx = \sum_{i=1}^s \operatorname{sign} J_\phi(p^i)$, $\{p^i\} = \phi^{-1}(p)$,

i.e.,

(II.4) $\deg(p, \phi, D) = \int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx$

for every family of functions with the properties (II.3) and provided $\varepsilon$ is small enough.

This expression may be adopted as an alternate definition of $\deg(p, \phi, D)$ whenever $\phi^{-1}(p) \cap Z = \emptyset$ (see E. Heinz, [1]). We shall prove some properties of the degree as so expressed. First we need some lemmas.
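The integral expression (II.4) can be compared numerically with the sum of signs (II.1) in one dimension. The sketch below, in Python, uses the hypothetical map $\phi(x) = x^3 - x$ on D = (-2, 2) with p = 0, and takes for $f_\varepsilon$ a continuous "hat" function of integral 1 supported in $|t - p| \le \varepsilon$; all of these choices are assumptions made for illustration. The preimages of p are -1, 0, 1, with derivative signs +, -, +, so both expressions should give 1.

```python
def phi(x):
    return x ** 3 - x

def dphi(x):
    # In one dimension the Jacobian determinant is phi'(x).
    return 3.0 * x * x - 1.0

def hat(t, p, eps):
    """Continuous function of integral 1 supported in |t - p| <= eps."""
    s = abs(t - p) / eps
    return (1.0 - s) / eps if s < 1.0 else 0.0

def degree_by_integral(p, eps, a=-2.0, b=2.0, n=200000):
    """Midpoint-rule approximation of the integral in (II.4) over (a, b)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += hat(phi(x), p, eps) * dphi(x)
    return total * h

def degree_by_signs(roots):
    """Sum of the signs of phi' at the given preimages, as in (II.1)."""
    return sum((dphi(x) > 0) - (dphi(x) < 0) for x in roots)

integral_value = degree_by_integral(p=0.0, eps=0.1)
sign_value = degree_by_signs([-1.0, 0.0, 1.0])
print(integral_value, sign_value)
```

Here eps = 0.1 is small enough because the critical values of this map are about ±0.385, so the support of the hat function contains no critical value.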

C. Some Functions are Divergences

3.2. Lemma: Suppose v is a $C^1$ vector function $(v^1, \dots, v^n) = v : E^n \to E^n$, and that $f = \operatorname{div} v$ $\big(= \sum_i \partial v^i / \partial x^i\big)$. If v vanishes outside a bounded set K, then

$\int_{R^n} f(x)\,dx = 0.$

Proof: Compute.

Suppose now that $\phi \in C$ and $\phi$ is $C^2$ on D, and that v is as in Lemma 3.2. K is the (compact) support of f.

3.3. Lemma: If $K \cap \phi(\partial D) = \emptyset$, the function

$g(x) = f(\phi(x))\,J_\phi(x)$

is the divergence of some vector valued $C^1$ function u with support included in D.


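Proof: Let $a^{i,j}$ be the cofactor (signed minor determinant) of the entry $\partial\phi^j/\partial x_i$ in the Jacobian matrix, and define u by

$u^i(x) = \sum_j v^j(\phi(x))\,a^{i,j}(x)$, $(i = 1, 2, \dots, n)$.

Then u is $C^1$ since $\phi$ is $C^2$ and v is $C^1$. From the hypothesis $\phi(\partial D) \cap K = \emptyset$ it also follows that u may be considered as defined on all of $R^n$ (with the value zero outside D) and that its support is then included in D, as required. It only remains to prove that div u = g. We simply compute as follows, writing $\phi^k_{,i} = \partial\phi^k/\partial x_i$ and using the cofactor expansion $\sum_i \phi^k_{,i}\,a^{i,j} = \delta^{kj} J_\phi$:

$\operatorname{div} u = \sum_i u^i_{,i} = \sum_{i,j,k} v^j_{,k}(\phi(x))\,\phi^k_{,i}(x)\,a^{i,j}(x) + \sum_{i,j} v^j(\phi(x))\,a^{i,j}_{,i}(x)$
$= (\operatorname{div} v)(\phi(x))\,J_\phi(x) + \sum_{i,j} v^j(\phi(x))\,a^{i,j}_{,i}(x)$
$= g(x) + \sum_{i,j} v^j(\phi(x))\,a^{i,j}_{,i}(x)$.

By definition we have

$(n - 1)!\,a^{i,j} = \sum \varepsilon(i, i_2, \dots, i_n)\,\varepsilon(j, j_2, \dots, j_n)\,\phi^{j_2}_{,i_2} \cdots \phi^{j_n}_{,i_n}$

(where $\varepsilon$ means the sign of the permutation, and the sum runs over all $i_2, \dots, i_n$ and $j_2, \dots, j_n$), and then, differentiating, the antisymmetry leads to $\sum_i a^{i,j}_{,i} = 0$, which implies the desired result.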

3.4. Lemma: Let f be a continuous function defined on $R^n$ having support K contained in D. Let $x^0 \in R^n$ and suppose that the convex hull of $K \cup (K - x^0)$ (where $K - x^0$ denotes the set obtained from K by the translation induced by $-x^0$) is contained in D. Then the function

$f(x) - f(x + x^0)$

is the divergence of some mapping $v : R^n \to R^n$ whose support is contained in D.


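Proof: Let $\psi(x) = f(x) - f(x + x^0)$. Clearly $\psi$ has support equal to $K \cup (K - x^0)$.

Now define

$\Psi(x) = \int_{-\infty}^0 \psi(x + t x^0)\,dt$,

$v^i(x) = x^0_i\,\Psi(x)$.

It is easily verified that $v^i$ has support contained in the convex hull of $K \cup (K - x^0)$. Moreover, if $v = (v^1, \dots, v^n)$, we claim that div v = $\psi$. In fact, we have

$\operatorname{div} v = \sum_i v^i_{,i} = \sum_i x^0_i\,\frac{\partial \Psi}{\partial x_i}$.

But the last term is the directional derivative of $\Psi$ in the direction $x^0$, and consequently equals

$\frac{d}{dt}\,\Psi(x + t x^0)\Big|_{t=0}$.

By definition of $\Psi$, it follows that

$\frac{d}{dt}\,\Psi(x + t x^0)\Big|_{t=0} = \frac{d}{dt}\left(\int_{-\infty}^0 \psi(x + (t + u) x^0)\,du\right)\Big|_{t=0} = \int_{-\infty}^0 \frac{d}{dt}\,\psi(x + (t + u) x^0)\Big|_{t=0}\,du$

$= \int_{-\infty}^0 \left(\frac{d}{du}\,\psi(x + u x^0)\right) du = \psi(x)$,

and then $\operatorname{div} v(x) = \psi(x) = f(x) - f(x + x^0)$ as desired.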
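The construction in this proof can be checked numerically in one dimension, where the divergence is just the derivative. The Python sketch below is an illustration under assumptions chosen here: f is the hypothetical $C^1$ bump $(1 - x^2)^2$ on (-1, 1), the translation is X0 = 0.5, and the integral over $(-\infty, 0]$ is rewritten as minus the integral of f(x + s X0) over s in [0, 1], which follows by splitting the difference f(x) - f(x + X0) and shifting the variable.

```python
def f(x):
    """A C^1 bump supported in K = [-1, 1] (hypothetical sample function)."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

X0 = 0.5  # the translation x^0 (a number, since n = 1 here)

def diff(x):
    """The function f(x) - f(x + x^0) that should equal div v."""
    return f(x) - f(x + X0)

def primitive(x, n=2000):
    """Integral of diff(x + t*X0) over t in (-inf, 0], computed as
    -integral_0^1 f(x + s*X0) ds with the midpoint rule."""
    h = 1.0 / n
    return -sum(f(x + (k + 0.5) * h * X0) for k in range(n)) * h

def v(x):
    """The mapping v of the proof: x^0 times the integral above."""
    return X0 * primitive(x)

def dv(x, h=1e-4):
    """Central-difference approximation of v'(x), i.e. div v for n = 1."""
    return (v(x + h) - v(x - h)) / (2.0 * h)

for x in (-1.2, -0.3, 0.4, 0.9):
    print(x, dv(x), diff(x))
```

In this example v vanishes outside [-1.5, 1], which is the convex hull of K and K - X0, in agreement with the support statement of the lemma.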

3.5. Corollary: Let x(s) be a continuous curve in $R^n$, $0 \le s \le 1$, and let $f : R^n \to R$ be continuous with support K contained in D. Suppose

(i) K is contained in a convex compact set M contained in D.
(ii) M - x(s) never touches the boundary of D.

Then f(x + x(0)) - f(x + x(1)) is the divergence of some $C^1$ mapping $v : R^n \to R^n$ whose support is contained in D.

Proof: Let us define an equivalence relation on the set of values of s by $s_1 \sim s_2$ iff $f(x + x(s_1)) - f(x + x(s_2))$ is the divergence of a mapping with support in D. Lemma 3.4 allows us to conclude that every class modulo $\sim$ is open. In fact, for every s, M - x(s) being a compact convex subset of D, there exists an open convex neighborhood of M - x(s) also contained in D. By the continuity of x(s), the convex hull of $(M - x(s)) \cup (M - x(s'))$ is contained in such a neighborhood when s' is close to s. Then Lemma 3.4 applies and $s \sim s'$. Thus we conclude that every class is open. But the connectedness of [0, 1] therefore implies that there is only one class. This means that $0 \sim 1$, or that f(x + x(0)) - f(x + x(1)) is a divergence, and we are done.

D. Back to the Definition

We begin by proving a lemma.

3.6. Lemma: Let $\phi \in C$ and be $C^2$ on D. Consider two points $p_1, p_2 \in R^n - \phi(\partial D)$ such that $\phi^{-1}(p_1) \cap Z = \phi^{-1}(p_2) \cap Z = \emptyset$. Then if $p_1$ and $p_2$ belong to the same component of the open set $R^n - \phi(\partial D)$, we have

$\deg(p_1, \phi, D) = \deg(p_2, \phi, D)$.

Proof: Since $\phi^{-1}(p_1) \cap Z = \emptyset$, we know that $\deg(p_1, \phi, D)$ may be computed by means of a family of functions with the simple properties described in (II.3) as follows:

$\deg(p_1, \phi, D) = \int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx$

for $\varepsilon$ small.

If we suppose that $p_2$ lies in the same component of $R^n - \phi(\partial D)$ as does $p_1$, then there exists a continuous curve x(s), $0 \le s \le 1$, such that x(0) = 0, $x(1) = p_2 - p_1$ and $x(s) + p_1$ lies in that component. Since the curve x(s) is compact there exists $\varepsilon > 0$ such that if $K_\varepsilon$ is the $\varepsilon$-sphere around $p_1$, $K_\varepsilon + x(s)$ never touches the boundary of $R^n - \phi(\partial D)$. Then Corollary 3.5 may be applied and yields the conclusion that

$f_\varepsilon(x) - f_\varepsilon(x + p_2 - p_1)$

is a divergence. Therefore by Lemma 3.3,

$f_\varepsilon(\phi(x))\,J_\phi(x) - f_\varepsilon(\phi(x) + p_2 - p_1)\,J_\phi(x)$

is also the divergence of a mapping with support in D. In such a case, Lemma 3.2 implies that

(a) $\int_D f_\varepsilon(\phi(x))\,J_\phi(x)\,dx = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\,J_\phi(x)\,dx$

or, according to formula (II.4):

(b) $\deg(p_1, \phi, D) = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\,J_\phi(x)\,dx$.

But now the functions $g_\varepsilon(x) = f_\varepsilon(x + p_2 - p_1)$, $\varepsilon > 0$, have the properties (II.3) around $p_2$, and consequently

$\deg(p_2, \phi, D) = \int_D g_\varepsilon(\phi(x))\,J_\phi(x)\,dx = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\,J_\phi(x)\,dx$.

Formulas (a), (b) and the last one together imply

$\deg(p_1, \phi, D) = \deg(p_2, \phi, D)$

and we are done.

Consequences: Suppose that $p \in R^n - \phi(\partial D)$ but $\phi^{-1}(p) \cap Z \ne \emptyset$. In that case, for every q sufficiently close to p, q belongs to some well-defined component of $R^n - \phi(\partial D)$, namely that containing p. But then by Lemma 3.6, $\deg(q, \phi, D)$ is constant when q is near p and $\phi^{-1}(q) \cap Z = \emptyset$. This justifies the definition (II.2) when $\phi$ is $C^2$. Taken together Lemma 3.6 and the definition (II.2) imply:

3.7. Corollary: Let $\phi \in C$ and be $C^2$ on D. Then

$\deg(p, \phi, D)$

is constant on every component of $R^n - \phi(\partial D)$.

This corollary will also be true for continuous mappings after we define the degree for such mappings. The unnatural hypothesis that $\phi$ is $C^2$ needs to be eliminated first; we will do so in the first corollary of the next lemma.

3.8. Lemma: Let $\phi \in C$ and be $C^1$ on D. Then for each $p \notin \phi(\partial D) \cup \phi(Z)$, there exists a $C^1$ neighborhood U of $\phi$ such that for every $\psi \in U$ we have $p \notin \psi(\partial D)$ and

$\deg(p, \phi, D) = \deg(p, \psi, D)$.

Proof: Let $y_j$, $j = 1, \dots, k$, be the elements of the finite set $\phi^{-1}(p)$. Let $B_j$ be an open ball around $y_j$, whose radius $r_j$ will be determined below.

First of all, if each $r_j$ is small enough, the family $\{B_j\}$ is disjoint. Our aim now is to prove that there exists a $C^1$ neighborhood U such that for every $\psi$ in U, the equation $\psi(x) = p$ has one and only one solution in each $B_j$ and no others.

The hypothesis $p \notin \phi(Z)$ implies that the derivative $\phi'$ of $\phi$ (the Jacobian matrix) has an inverse at each $y_j$. By decreasing the radii $r_j$ it is possible to guarantee that $\phi'$ has an inverse at every point of each $B_j$ and moreover that:

(1) $|(\phi'(y_j))^{-1}(\phi'(y) - \phi'(y_j))| < \tfrac12$, $y \in B_j$,

where the norm $|\ |$ stands for the norm for operators on $R^n$ into itself.

Let us suppose that this is the case for radii $r_1, \dots, r_k$ and let r be the smallest of these. If we let F be the set $\bar D - \bigcup_j B_j$, then F is compact and $p \notin \phi(F)$. Finally, let a be a positive number such that $|(\phi'(y_j))^{-1}|^{-1} > a$, $j = 1, \dots, k$.

Now we are able to define the $C^1$ neighborhood of $\phi$ that will give a desired solution.

Choose U to be a ball (in the $C^1$ sense) such that for every $\psi \in U$ the following holds:

(a) $p \notin \psi(F)$.
(b) $\psi'(y)$ is invertible when $y \in \bigcup_j B_j$, and $|(\psi'(y_j))^{-1}|^{-1} > a$.
(c) $|(\psi'(y_j))^{-1}(\psi'(y) - \psi'(y_j))| < \tfrac12$, when $y \in B_j$ and for $j = 1, \dots, k$.
(d) $|\phi(x) - \psi(x)| < \tfrac12 ra$ for all $x \in \bar D$.

We remark that each one of the properties (a), (b), (c), (d) defines an open set in the $C^1$ sense (call these sets A, B, C, D), that the sets A, B, D obviously contain $\phi$ and that $\phi \in C$ by (1).

Therefore U may be chosen as any $C^1$ ball around $\phi$ contained in $A \cap B \cap C \cap D$.

Observe that since $\phi'$ is invertible over each $B_j$, the sign of the Jacobian determinant is constant. U being a ball, it is connected. Then the same result follows for every $\psi$ in U and clearly

(2) $\operatorname{sign} J_\psi(y) = \operatorname{sign} J_\phi(y_j)$

when $y \in B_j$ and $\psi \in U$.

We now show that properties (a), (b), (c) and (d) imply that the function $\psi$ has one and only one root of the equation $\psi(x) = p$ in each $B_j$. Of course property (a) tells us that there are no roots outside $\bigcup_j B_j$ (remember that $F = \bar D - \bigcup_j B_j$). The only problem is to see what happens in one $B_j$, say, in $B_1$.

Define

$\eta(y) = (\psi'(y_1))^{-1}(\psi(y))$.

From (c) it follows that

$|\eta'(y) - I| < \tfrac12$, $y \in B_1$.

Then by Corollary 1.19 it follows that $\eta$ is one-to-one on $B_1$. This implies that $\psi$ is one-to-one on $B_1$ and therefore it has at most one root there.

But from the same corollary we also conclude that $\eta(B_1)$ covers a ball B of radius $(1 - \tfrac12) r_1 = \tfrac12 r_1 \ge \tfrac12 r$ around $\eta(y_1)$.

As $|(\psi'(y_1))^{-1}|^{-1} > a$, it follows that $\psi'(y_1)(B)$ covers a ball V of radius $\tfrac12 ra$ around $\psi'(y_1)(\eta(y_1))$. But $\psi(y) = \psi'(y_1)(\eta(y))$ and then $\psi(B_1)$ covers V. This means that every point $x \in R^n$ for which $|x - \psi(y_1)| < \tfrac12 ra$ is of the form $x = \psi(b)$, $b \in B_1$. But by (d), $|p - \psi(y_1)| = |\phi(y_1) - \psi(y_1)| < \tfrac12 ra$, and then the equation $\psi(x) = p$ has at least one solution in $B_1$.

The same holds for every $B_j$ and we have $\psi^{-1}(p) \cap B_j = \{\bar y_j\}$, $j = 1, \dots, k$. But then, by (2),

$\deg(p, \psi, D) = \sum_j \operatorname{sign} J_\psi(\bar y_j) = \sum_j \operatorname{sign} J_\phi(y_j) = \deg(p, \phi, D)$.

Q.E.D.

3.9. Corollary: Let $\phi \in C(\bar D)$ and be $C^1$ on $D$. If $p$, $q$ do not belong to $\phi(\partial D) \cup \phi(Z)$ and belong to the same component of $R^n - \phi(\partial D)$, then

$$\deg(p, \phi, D) = \deg(q, \phi, D).$$

Proof: Choose a continuous curve $x(s)$ joining $p = x(0)$ and $q = x(1)$. Since the curve $x(s)$ is compact and disjoint from $\phi(\partial D)$, when $\psi$ is $C^0$ near $\phi$ the curve is also disjoint from $\psi(\partial D)$. This means that $p$ and $q$ belong to the same component of $R^n - \psi(\partial D)$. Therefore if we choose $\psi$ to be $C^2$,

$$\deg(p, \psi, D) = \deg(q, \psi, D).$$

This holds for every $\psi$ of class $C^2$ and $C^0$ near $\phi$. But since $\deg(p, \psi, D) \to \deg(p, \phi, D)$ and $\deg(q, \psi, D) \to \deg(q, \phi, D)$ as $\psi \to \phi$ in the $C^1$ sense, we conclude that

$$\deg(p, \phi, D) = \deg(q, \phi, D).$$

Q.E.D.


3.10. Corollary: Let $p$ not belong to $\phi(\partial D)$. The expression

$$\deg(q_j, \phi, D)$$

has a limit when $q_j \to p$ and the $q_j$'s belong to $R^n - (\phi(\partial D) \cup \phi(Z))$.

Proof: Obvious by the previous lemma.

The conclusion is that the degree of $\phi$ has a meaning for every point $p \notin \phi(\partial D)$ in the sense of the definition (II.1, II.2), which may therefore be adopted for every $C^1$ mapping.

The next step is to remove the condition $\phi \in C^2$ in Corollary 3.7.

3.11. Proposition: Let $\phi \in C$ and be $C^1$ on $D$. Then the degree

$$\deg(p, \phi, D)$$

as defined in (II.1), (II.2) is constant on every component of $R^n - \phi(\partial D)$.

Proof: This follows from 3.9 and the fact that $\phi(Z)$ has empty interior.

Next we remove the condition $p \notin \phi(Z)$ in 3.8.

3.12. Proposition: Let $\phi \in C$ and be $C^1$, and let $p$ be a point not in $\phi(\partial D)$. In that case there exists a $C^1$ neighborhood $U$ of $\phi$ such that for every $\psi$ in $U$, $p \notin \psi(\partial D)$ and

$$\deg(p, \psi, D) = \deg(p, \phi, D).$$

Proof: This follows from 3.9 and the fact that $\phi(Z)$ has empty interior.

3.13. Corollary: Let $\{\phi_t\}$ be a family of $C^1$ mappings in $C$ depending continuously in the $C^1$ sense on the real parameter $t$, $0 \le t \le 1$. If $p \notin \phi_t(\partial D)$ for every $t$, then

$$\deg(p, \phi_0, D) = \deg(p, \phi_1, D).$$

Proof: Essentially the same argument as in the proof of Corollary 3.5: the equality $\deg(p, \phi_t, D) = \deg(p, \phi_u, D)$ defines an equivalence relation $t \sim u$. Proposition 3.12 implies that each class is open, and the connectedness of $[0, 1]$ allows us to conclude that $0 \sim 1$. This is the claim.

E. The Continuous Case

We are approaching the most important point in this chapter: the definition of the degree for every continuous mapping. This will demonstrate the topological character of this concept of degree. The definition is as follows:

3.14. Definition: Let $\phi \in C$ and let $\{\phi_n\}$ be a sequence of functions of class $C^1$ converging uniformly to $\phi$. Then for every $p \notin \phi(\partial D)$ a sequence $\deg(p, \phi_n, D)$, $n > N$, is defined, the limit

$$\lim_{n\to\infty} \deg(p, \phi_n, D)$$

exists and does not depend on the sequence $\{\phi_n\}$, and we then define

$$\deg(p, \phi, D) = \lim_{n\to\infty} \deg(p, \phi_n, D).$$

Justification of the definition. Let $d$ equal the distance between $p$ and the compact set $\phi(\partial D)$.

Choose $N$ so large that if $n \ge N$, then $|\phi_n - \phi| < \tfrac12 d$ ($|\cdot|$ stands here for the uniform norm, that is, convergence in the $C^0$ sense).

Since $p$ does not belong to any ball of radius $\tfrac12 d$ and center at $\phi(x)$, $x \in \partial D$, it follows that $p$ is not a convex combination of the form $t\phi_n(x) + (1 - t)\phi_m(x)$, $0 \le t \le 1$, $x \in \partial D$, $n, m > N$, because $\phi_n(x)$ and $\phi_m(x)$ belong to one such ball.

But then we may fix $n, m > N$ and apply Corollary 3.13 to the family

$$t\phi_n + (1 - t)\phi_m, \quad 0 \le t \le 1.$$

Hence $\deg(p, \phi_n, D) = \deg(p, \phi_m, D)$. In other words, the limit $\lim \deg(p, \phi_n, D)$ does exist and 3.14 is legitimate.

We may reformulate the definition as follows.

3.15. Given $p \notin \phi(\partial D)$ there exists a $C^0$ neighborhood $U$ of $\phi$ such that for every $\psi$ of class $C^1$ belonging to $U$, the degree $\deg(p, \psi, D)$ is the same. Henceforth we define

$$\deg(p, \phi, D) = \deg(p, \psi, D), \quad \psi \in U.$$

Remark: The correct way to think about the degree is that it is defined for every continuous function $\phi$ and that Definitions II.1 and II.4 in B are only methods of computing it in the special case $\phi \in C^1$.

Our next aim is to extend to the continuous case the statements that we have obtained for the $C^1$ or $C^2$ cases. This is done in the following summary theorem.

3.16. Theorem: To every continuous map $\phi : \bar D \to R^n$ and every $p \notin \phi(\partial D)$ there is associated an integer $\deg(p, \phi, D)$ with the properties:

1. Invariance under homotopy. The integer $\deg(p, \phi, D)$ depends only on the homotopy class of $\phi$ in the following sense: if $\phi_t$ is a family of mappings $\phi_t \in C$ depending continuously in the uniform topology on the parameter $t$, $0 \le t \le 1$, and such that $p \notin \phi_t(\partial D)$ for every $t$, then

$$\deg(p, \phi_0, D) = \deg(p, \phi_1, D).$$

2. Dependence only on the boundary values. As a consequence of (1), we have: if $\phi|_{\partial D} = \psi|_{\partial D}$ and $p \notin \phi(\partial D) = \psi(\partial D)$, then

$$\deg(p, \phi, D) = \deg(p, \psi, D).$$

3. Continuity. The function $\deg(p, \phi, D)$ is continuous in $\phi$ in the uniform topology in the following sense: given $\phi$ and $p \notin \phi(\partial D)$, there exists a uniform neighborhood $U$ of $\phi$ such that if $\psi \in U$, then $p \notin \psi(\partial D)$ and

$$\deg(p, \phi, D) = \deg(p, \psi, D).$$

4. If $p \notin \phi(D)$, then $\deg(p, \phi, D) = 0$. If $p$ and $q$ belong to the same component of $R^n - \phi(\partial D)$, then $\deg(p, \phi, D) = \deg(q, \phi, D)$.

5. Decomposition of the domain. If $D = \bigcup_i D_i$ where each $D_i$ is open, the family $\{D_i\}$ is disjoint, and $\partial D_i \subset \partial D$, then for every $p \notin \phi(\partial D)$:

$$\deg(p, \phi, D) = \sum_i \deg(p, \phi, D_i).$$

6. The excision property. If $p \notin \phi(\partial D)$, $K \subset \bar D$, $K$ is closed and $p \notin \phi(K)$, then

$$\deg(p, \phi, D) = \deg(p, \phi, D - K).$$

7. Cartesian products. If $D \subset R^n$, $D' \subset R^m$ and $\phi : \bar D \to R^n$, $\psi : \bar D' \to R^m$, then

$$\deg((p, q), (\phi, \psi), D \times D') = \deg(p, \phi, D)\,\deg(q, \psi, D')$$

whenever each term makes sense.

Proof of 3. The proof follows obviously from Definition 3.15.

1 follows from 3. In fact, the function $\deg(p, \phi_t, D)$ is continuous in $t$. Since the range (the integers) is discrete and $[0, 1]$ is connected, it must be constant.

2 follows from 1. Consider the family $t\phi + (1 - t)\psi$.

Proof of 4. We remarked after Definition II.2 that 4 holds for $C^1$ mappings. The general case follows immediately by 3 and approximation.

Proof of 5. We assumed that $\bar D$ is compact. This can happen only if $\{D_i\}$ is a finite family. But then we can select a single $C^1$ mapping $\psi$ such that $\psi$ and each restriction $\psi_i = \psi|_{\bar D_i}$ belong to the $C^0$ neighborhood corresponding to $\phi$ and each $\phi|_{\bar D_i}$, respectively, according to Definition 3.15. The statement is thus reduced to the $C^1$ case, and now it is enough to consider the case $p \notin \psi(Z)$. But then our result follows trivially from the associative law of addition:

$$\deg(p, \psi, D) = \sum_{\psi(x) = p} \operatorname{sign} J_\psi(x) = \sum_i \Big[ \sum_{\psi(x) = p,\ x \in D_i} \operatorname{sign} J_\psi(x) \Big] = \sum_i \deg(p, \psi_i, D_i).$$

Proof of 6. Here again it is easy to see how to reduce the continuous case to the $C^1$ case. If $\phi$ is supposed to be $C^1$, Definition II.1 gives

$$\deg(p, \phi, D) = \sum_{\phi(x) = p} \operatorname{sign} J_\phi(x)$$

and under the assumption $p \notin \phi(K)$, it follows that

$$\deg(p, \phi, D) = \sum_{\phi(x) = p,\ x \in D - K} \operatorname{sign} J_\phi(x) = \deg(p, \phi, D - K).$$
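The counting formula used in this proof can be tried out in one dimension, where the Jacobian is simply the derivative. The following is a minimal sketch under stated assumptions: the map $x^3 - x$, the interval $(-2, 2)$, the excised set $[0.2, 0.8]$ and the helper name `degree_1d` are all hypothetical choices made for illustration, and the roots are supplied by hand rather than computed.

```python
def degree_1d(roots, dphi, intervals):
    """Sum of sign(phi'(x)) over the given roots of phi(x) = p lying in the domain.

    roots     -- all solutions of phi(x) = p (assumed known and regular)
    dphi      -- the derivative phi' (the one-dimensional Jacobian)
    intervals -- disjoint open intervals (a, b) whose union is the domain
    """
    total = 0
    for x in roots:
        if any(a < x < b for a, b in intervals):
            slope = dphi(x)
            if slope == 0:
                raise ValueError("p is a critical value; the formula does not apply")
            total += 1 if slope > 0 else -1
    return total


# Hypothetical example: phi(x) = x**3 - x, p = 0, D = (-2, 2).
roots = [-1.0, 0.0, 1.0]
dphi = lambda x: 3 * x * x - 1

deg_D = degree_1d(roots, dphi, [(-2.0, 2.0)])
# Excise the closed set K = [0.2, 0.8], which contains no root of phi(x) = 0.
deg_D_minus_K = degree_1d(roots, dphi, [(-2.0, 0.2), (0.8, 2.0)])
```

Both calls count the same three roots with signs $+1, -1, +1$, so the two degrees agree, as the excision property predicts.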

Proof of 7. Once again the reduction to the $C^1$ case is immediate and the result follows from the remark that $J_{(\phi, \psi)} = J_\phi J_\psi$.

Q.E.D.
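For maps of the plane the degree on the unit disc coincides with the winding number of the curve $\phi(\partial D)$ about $p$; this is a standard fact that is not proved in these notes, and it is used here only to illustrate properties 2 and 4 numerically. The sketch below is an assumption-laden illustration: the function name `planar_degree`, the sample count and the test maps are hypothetical, and $R^2$ is identified with the complex numbers.

```python
import cmath
import math


def planar_degree(phi, p=0j, samples=4000):
    """Winding number about p of the image under phi of the unit circle.

    Assumes p does not lie on phi(boundary) and that `samples` is large
    enough for consecutive image points to subtend an angle below pi at p.
    """
    total = 0.0
    prev = phi(1 + 0j) - p
    for k in range(1, samples + 1):
        z = cmath.exp(2j * math.pi * k / samples)
        cur = phi(z) - p
        # Angle swept between consecutive image points, in (-pi, pi].
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


deg_square = planar_degree(lambda z: z ** 2)                 # two preimages, both orientation preserving
deg_conj = planar_degree(lambda z: z.conjugate())            # orientation reversing
deg_outside = planar_degree(lambda z: z ** 2, p=2 + 0j)      # p is not in the image of the disc
deg_perturbed = planar_degree(lambda z: z ** 2 + 0.3 * z)    # homotopic to z**2 without hitting 0 on the boundary
```

Only boundary values enter the computation, which is exactly the content of property 2.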

We now generalize 1 and 2.

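3.17. Corollary: If $\phi$ and $\psi$ have homotopic restrictions to $\partial D$, i.e., if there exists a family $\theta_t$, $0 \le t \le 1$, of mappings $\theta_t : \partial D \to R^n$ such that $p \notin \theta_t(\partial D)$ for every $t$ and $\phi|_{\partial D} = \theta_0$, $\psi|_{\partial D} = \theta_1$, then

$$\deg(p, \phi, D) = \deg(p, \psi, D).$$

Proof: Consider the cylinder $L = \bar D \times [0, 1]$. The homotopy $\theta_t$ may be considered as a continuous function $\theta$ from $\partial D \times [0, 1]$ into $R^n$ by defining $\theta(t, x) = \theta_t(x)$. Now extend $\theta$ to a mapping $T$ defined on all of $L$ and call $\tilde\phi$, $\tilde\psi$ the mappings $\tilde\phi(x) = T(0, x)$, $\tilde\psi(x) = T(1, x)$, $x \in \bar D$. Clearly by (3.16; 1) we have

$$\deg(p, \tilde\phi, D) = \deg(p, \tilde\psi, D).$$

But since $\tilde\phi|_{\partial D} = \phi|_{\partial D}$ and $\tilde\psi|_{\partial D} = \psi|_{\partial D}$, by (3.16; 2) it also follows that $\deg(p, \phi, D) = \deg(p, \psi, D)$ as desired.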

An easy consequence of 3.16 (and one which could have been obtained earlier, if desired) is the Brouwer fixed point theorem and its equivalent, the "no-retraction" theorem.

3.18. Corollary: ("No-retraction" Theorem.) Let $B \subset R^n$ be the open unit ball. There is no continuous mapping $\phi : \bar B \to \partial B$ such that the restriction $\phi|_{\partial B}$ is the identity.

Proof: Under the conditions stated, $\phi$ should satisfy

$$\deg(p, \phi, B) = \deg(p, \mathrm{id}, B)$$

for every properly situated $p$ (by 3.16; 2). The second member is 1 at $p = 0$ (actually at any point in $B$). This implies, according to (3.16; 4), that $0 \in \phi(B)$. But this contradicts $\phi(\bar B) \subset \partial B$, and we are done.

Q.E.D.

3.19. Theorem: (Brouwer fixed point theorem.) Let $B$ be any open ball in $R^n$. If $\phi : \bar B \to \bar B$ is continuous, there exists $x \in \bar B$ such that $\phi(x) = x$.

Proof: We may suppose that $B$ is the unit ball with center at the origin. If $\phi$ has no fixed points, then for each $x \in \bar B$ the points $x$ and $\phi(x)$ define a straight line. Define $\psi(x)$ as the only point of the form $\lambda x + (1 - \lambda)\phi(x)$, $\lambda \ge 1$, having norm 1. $\psi$ is continuous, $\psi : \bar B \to \partial B$ and $\psi|_{\partial B} = \mathrm{id}$. But we have just seen that such a mapping cannot exist.

Q.E.D.
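The point on the unit sphere used in the proof of the Brouwer theorem, namely the point $\lambda x + (1 - \lambda)\phi(x)$ with $\lambda \ge 1$ and norm 1, is the larger root of a quadratic in $\lambda$. The following sketch computes it for a single pair $x$, $\phi(x)$; the function name `retract_to_sphere` and the sample points are hypothetical, and both points are assumed to lie in the closed unit ball with $x \ne \phi(x)$.

```python
import math


def retract_to_sphere(x, fx):
    """Return (lam, point) with point = lam*x + (1 - lam)*fx, lam >= 1, |point| = 1.

    x and fx are coordinate lists of two distinct points of the closed unit ball.
    """
    d = [xi - fi for xi, fi in zip(x, fx)]           # direction from fx to x
    a = sum(di * di for di in d)
    if a == 0.0:
        raise ValueError("x is a fixed point; the line through x and phi(x) is undefined")
    b = 2.0 * sum(fi * di for fi, di in zip(fx, d))
    c = sum(fi * fi for fi in fx) - 1.0              # c <= 0 because |fx| <= 1
    # |fx + lam*d|^2 = 1  <=>  a*lam^2 + b*lam + c = 0; take the larger root.
    lam = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    point = [fi + lam * di for fi, di in zip(fx, d)]
    return lam, point


lam_in, pt_in = retract_to_sphere([0.5, 0.0], [-0.25, 0.0])   # interior point
lam_bd, pt_bd = retract_to_sphere([0.0, 1.0], [0.3, 0.2])     # boundary point
```

For a point $x$ already on the sphere the larger root is $\lambda = 1$ and the point returned is $x$ itself, which is why the map restricts to the identity on $\partial B$.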

F. The Multiplicative Property and Consequences

As we have seen, if $\phi$ is a continuous mapping $\phi : \bar D \to R^n$, then $\deg(p, \phi, D)$ is constant on every component of $R^n - \phi(\partial D)$ (see (3.16; 4)). This allows us to introduce the notation

$$\deg(\Delta, \phi, D)$$

where $\Delta$ is any non-void set contained in a single component of $R^n - \phi(\partial D)$. The definition is:

$$\deg(\Delta, \phi, D) = \deg(p, \phi, D), \quad p \in \Delta.$$

However, we can point out one distinguished component, namely the unbounded one. Certainly, $\phi(\partial D)$ being compact, there is at least one unbounded component of $R^n - \phi(\partial D)$. But this component contains the exterior of any ball containing $\phi(\partial D)$. Therefore it is unique. Call it $\Delta_\infty$. Since $\phi(\bar D)$ is compact, there exist points in $\Delta_\infty$ not in $\phi(\bar D)$. For any such point $p$, $\deg(p, \phi, D) = 0$, by (3.16; 4). Thus $\deg(\Delta_\infty, \phi, D) = 0$.

3.20. Theorem: (Multiplication Property.) Let $\phi : \bar D \to R^n$, $\psi : R^n \to R^n$ be two continuous mappings and $\Delta_i$ the bounded components of $R^n - \phi(\partial D)$. Suppose that $p \notin \psi \circ \phi(\partial D)$. Then

$$\deg(p, \psi \circ \phi, D) = \sum_i \deg(p, \psi, \Delta_i)\,\deg(\Delta_i, \phi, D).$$
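The multiplication property can be checked numerically in the plane, using the standard fact (not proved in these notes) that the degree on a disc equals the winding number of the image of its boundary. In the sketch below everything specific is an assumption made for illustration: $D$ is the unit disc, the maps are $z^2$ and $z^3 - 0.1$, and `winding` is a hypothetical helper. With these choices $R^2 - \phi(\partial D)$ has a single bounded component, the open unit disc, so the sum has one term.

```python
import cmath
import math


def winding(curve, p=0j, samples=6000):
    """Winding number about p of the closed curve t -> curve(t), 0 <= t <= 2*pi."""
    total = 0.0
    prev = curve(0.0) - p
    for k in range(1, samples + 1):
        cur = curve(2 * math.pi * k / samples) - p
        total += cmath.phase(cur / prev)   # small signed angle between samples
        prev = cur
    return round(total / (2 * math.pi))


circle = lambda t: cmath.exp(1j * t)   # boundary of D, and also of the bounded component
phi = lambda z: z ** 2                 # phi maps the unit circle onto itself
psi = lambda z: z ** 3 - 0.1

deg_composite = winding(lambda t: psi(phi(circle(t))))   # deg(0, psi o phi, D)
deg_phi = winding(lambda t: phi(circle(t)))              # deg(Delta, phi, D), evaluated at 0 in Delta
deg_psi = winding(lambda t: psi(circle(t)))              # deg(0, psi, Delta)
```

Here the composite has degree 6, the product of the degrees 2 and 3 of the two factors.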

Proof: We place ourselves in the pleasant situation of "counting zeros" by assuming that $\phi$ and $\psi$ are $C^1$ mappings and that $p$ is not the image of a critical point. In that case

$$\deg(p, \psi \circ \phi, D) = \sum_{y:\ \psi(\phi(y)) = p} \operatorname{sign} J_{\psi \circ \phi}(y)$$

$$= \sum_{y:\ \psi(\phi(y)) = p} \operatorname{sign} J_\psi(\phi(y))\ \operatorname{sign} J_\phi(y)$$

$$= \sum_{z:\ \psi(z) = p} \operatorname{sign} J_\psi(z) \sum_{y:\ \phi(y) = z} \operatorname{sign} J_\phi(y)$$

$$= \sum_{z \in R^n - \phi(\partial D),\ \psi(z) = p} \deg(z, \phi, D)\ \operatorname{sign} J_\psi(z).$$

But $R^n - \phi(\partial D)$ is the union of the disjoint sets $\Delta_i$, so that

$$\deg(p, \psi \circ \phi, D) = \sum_i \Big[ \deg(\Delta_i, \phi, D) \sum_{z \in \Delta_i,\ \psi(z) = p} \operatorname{sign} J_\psi(z) \Big] = \sum_i \deg(\Delta_i, \phi, D)\,\deg(p, \psi, \Delta_i)$$

and the proof is done.

As an illustration of the power of this theorem we give, following Leray, an immediate proof of the Jordan separation theorem (cf. Jean Leray, Proc. Int. Congress, 1950, vol. II, pp. 202-208).

3.21. Theorem: (Jordan) Let $K$ and $L$ be homeomorphic compact sets in $R^n$. Then $R^n - K$ and $R^n - L$ have the same number of components.

Proof: (Leray) Suppose that $h : K \to L$ is a homeomorphism and that $\phi$ and $\psi$ are extensions respectively of $h$ and $h^{-1}$ to all of $R^n$. Let $\Delta_i$ be the components of $R^n - K$ and $D_i$ those of $R^n - L$. Now the mapping $\psi \circ \phi : R^n \to R^n$ is the identity on $K$. Consider $\Delta_j$. The restriction $\psi \circ \phi|_{\partial \Delta_j}$ is also the identity since $\partial \Delta_j \subset K$. Therefore if $p$ is properly located

$$\deg(p, \psi \circ \phi, \Delta_j) = \deg(p, \mathrm{id}, \Delta_j)$$

by (3.16; 2).

But for every $i$, the degree $\deg(\Delta_i, \mathrm{id}, \Delta_j)$ is equal to $\delta_{ij}$ (Kronecker's $\delta$), and then:

$$\deg(\Delta_i, \psi \circ \phi, \Delta_j) = \delta_{ij}.$$

Using the multiplicative property we shall obtain

$$\delta_{ij} = \sum_h \deg(\Delta_i, \psi, D_h)\,\deg(D_h, \phi, \Delta_j).$$

This is not immediate, because the sets $D_h$ are not the components of $R^n - \phi(\partial \Delta_j)$, but subsets of them.

Fix $j$ and compare the sets:

$$R^n - L = \bigcup_i D_i, \qquad R^n - \phi(\partial \Delta_j) = \bigcup_i G_i,$$

where the $G_i$ are the components of $R^n - \phi(\partial \Delta_j)$. The sets of the family $\{G_i\}$ are disjoint.

Since $L$ and $\phi(\partial \Delta_j)$ are compact there is only one unbounded component in each case. Call these $D_\infty$ and $G_\infty$. Since $\phi(\partial \Delta_j) \subset L$, or

$$R^n - L \subset R^n - \phi(\partial \Delta_j),$$

it follows by connectedness arguments that the family $\{D_i\}$ splits into several subfamilies $\{D^i_k\}$ such that

$$U_1 = D^1_1 \cup \ldots \cup D^1_k \cup \ldots \subset G_1, \qquad U_2 = D^2_1 \cup \ldots \subset G_2, \qquad \ldots$$

$D_\infty$ is necessarily contained in $G_\infty$. Consequently

(1) $\deg(D^i_k, \phi, \Delta_j) = \deg(G_i, \phi, \Delta_j)$.

Let $M_i = G_i - U_i$. It is easy to see that $M_i \subset L$ for every $i$. But now if $p \in \Delta_i$ and $\psi(z) = p$, $z$ cannot belong to $L$, because $\psi(L) = h^{-1}(L) = K$, and, a fortiori, $z$ cannot belong to any $M_i$ either. This implies that

$$\deg(p, \psi, G_i) = \deg(p, \psi, G_i - M_i)$$

as was seen in (3.16; 6). But then, since $G_i - M_i = \bigcup_k D^i_k$, we conclude (3.16; 5) that

(2) $\deg(p, \psi, G_i) = \deg(p, \psi, \bigcup_k D^i_k) = \sum_k \deg(p, \psi, D^i_k)$.

Now we recall that the multiplicative property means exactly

$$\deg(p, \psi \circ \phi, \Delta_j) = \sum_{h \ne \infty} \deg(p, \psi, G_h)\,\deg(G_h, \phi, \Delta_j)$$

and, by (1) and (2), it follows that

$$\deg(p, \psi \circ \phi, \Delta_j) = \sum_h \Big( \sum_k \deg(p, \psi, D^h_k)\,\deg(D^h_k, \phi, \Delta_j) \Big) = \sum_{D_l \ne D_\infty} \deg(p, \psi, D_l)\,\deg(D_l, \phi, \Delta_j).$$

Since $p$ was any point in $\Delta_i$, we conclude

(3) $\delta_{ij} = \deg(\Delta_i, \psi \circ \phi, \Delta_j) = \sum_{D \ne D_\infty} \deg(\Delta_i, \psi, D)\,\deg(D, \phi, \Delta_j)$.

By the symmetry of the argument it is also true that

(4) $\delta_{ij} = \sum_{\Delta \ne \Delta_\infty} \deg(D_i, \phi, \Delta)\,\deg(\Delta, \psi, D_j)$.

Suppose that both families $\{\Delta_i\}$ and $\{D_i\}$ are finite. Then they define two matrices

$$A = (\deg(D_i, \phi, \Delta_j)), \qquad B = (\deg(\Delta_i, \psi, D_j))$$

and (3) and (4) imply

(5) $AB = I_n$, $BA = I_m$,

if we assume $A$ to be $n \times m$ and $B$ to be $m \times n$, where $n$ = number of $D_i$'s, $m$ = number of $\Delta_j$'s. But now it is an easy exercise to show that the equalities (5) imply $n = m$, and this is what we were looking for.
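One possible way to settle the exercise just mentioned, given here as a sketch and not necessarily the argument the author had in mind, uses the trace: for an $n \times m$ matrix $A$ and an $m \times n$ matrix $B$ with $AB = I_n$ and $BA = I_m$, the identity $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ holds even though the two products have different sizes.

```latex
\begin{aligned}
\operatorname{tr}(AB) &= \sum_{i=1}^{n}\sum_{k=1}^{m} a_{ik} b_{ki}
                       = \sum_{k=1}^{m}\sum_{i=1}^{n} b_{ki} a_{ik}
                       = \operatorname{tr}(BA), \\
n &= \operatorname{tr}(I_n) = \operatorname{tr}(AB) = \operatorname{tr}(BA) = \operatorname{tr}(I_m) = m .
\end{aligned}
```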

The case $\{D_i\}$ finite, $\{\Delta_i\}$ infinite is easily seen to be impossible by the equalities (3) and (4). If both families are infinite, then, being disjoint families of open sets, they are both denumerably infinite.

Q.E.D.

We can now draw some topological consequences.

3.22. Corollary: (Domain Invariance) Let $U$ be an open set in $R^n$, $\phi : U \to R^n$ a continuous and one-to-one mapping. Then $\phi$ is an open mapping.

Proof: Pick a point $p \in U$ and let $\bar D$ be the closure of an open ball $D$ containing $p$ and contained in $U$.

Since $\bar D$ is compact, $\phi : \bar D \to \phi(\bar D)$ is a homeomorphism and we can apply Jordan's theorem, which implies that $R^n - \phi(\bar D)$ has one component and that $R^n - \phi(\partial D)$ has two components.

Take a point $q$ in the bounded component $\Delta$ of $R^n - \phi(\partial D)$. If $q \in R^n - \phi(\bar D)$, it follows from the connectedness of $R^n - \phi(\bar D)$ that $q$ can be joined by a continuous curve with every other point in $R^n - \phi(\bar D)$. But this set certainly intersects the unbounded component of $R^n - \phi(\partial D)$, which is a contradiction. We conclude that $\Delta \cap (R^n - \phi(\bar D)) = \emptyset$, or $\Delta \subset \phi(\bar D)$. As we know that $\phi(p) \notin \phi(\partial D)$, because $\phi$ is 1-1, $\phi(p)$ must belong to $\Delta$. Hence $\phi$ is open.

Q.E.D.

G. Borsuk's Theorem

We shall prove the following

3.23. Theorem: Let $D$ be a symmetric bounded open set in $R^n$ containing the origin and $\phi : \bar D \to R^n$ an odd mapping ($\phi(x) = -\phi(-x)$ for all $x \in \bar D$) such that $0 \notin \phi(\partial D)$. Then $\deg(0, \phi, D)$ is an odd number (in particular different from zero).

The proof follows from a sequence of lemmas.

3.24. Lemma: Let $K \subset R^n$ be a compact set, $\phi : K \to R^{n+1}$ a continuous mapping such that $0 \notin \phi(K)$. Then $\phi$ can be extended to a continuous never vanishing mapping defined on a cube $C \supset K$.

Proof: For $\varepsilon > 0$ choose $\psi$ to be $C^1$ and such that $|\phi(x) - \psi(x)| < \varepsilon$ for $x \in K$. $\psi(D)$ has measure zero for every $D \subset R^n$ by Sard's lemma, and so it is possible to pick a point $y_0$ such that $x \to \psi(x) + y_0$ never assumes the value 0. Suppose then that $\psi$ itself is never vanishing.

Let $c = \inf |\phi(x)|$, $x \in K$, and choose a continuous function $\eta$ defined for $t > 0$ with values in $R$ such that

$$\eta(t) = 1 \ \text{if}\ t \ge \frac{c}{2}, \qquad \eta(t) = \frac{2t}{c} \ \text{if}\ t \le \frac{c}{2}.$$

If we define $\theta$ as the mapping

$$\theta(x) = \frac{\psi(x)}{\eta(|\psi(x)|)}$$

then $|\theta(x)| \ge c/2$ for all $x$ and $|\theta(x) - \phi(x)| < \varepsilon$ on $K$. Suppose that $\varepsilon$ has been chosen so that $\varepsilon < c/2$.
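The rescaling step of this proof, dividing $\psi(x)$ by $\eta(|\psi(x)|)$ so that the result has norm at least $c/2$, can be sketched as follows. The function names `eta` and `theta` mirror the symbols of the proof but are otherwise hypothetical, the value of $\psi(x)$ is passed in as a coordinate list, and $\psi(x) \ne 0$ is assumed, as in the text.

```python
import math


def eta(t, c):
    """Cutoff from the proof: 1 for t >= c/2, and 2t/c for t <= c/2."""
    return 1.0 if t >= c / 2.0 else 2.0 * t / c


def theta(psi_x, c):
    """Rescale the nonzero vector psi_x so that its norm is at least c/2."""
    norm = math.sqrt(sum(v * v for v in psi_x))
    scale = eta(norm, c)        # scale > 0 because psi_x is assumed nonzero
    return [v / scale for v in psi_x]


small = theta([0.1, 0.0], 1.0)   # norm 0.1 < c/2, pushed out to norm c/2
large = theta([3.0, 4.0], 1.0)   # norm 5 >= c/2, left unchanged
```

Vectors of norm at least $c/2$ are untouched, which is why $\theta$ agrees with $\psi$ on $K$ once $\varepsilon < c/2$.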

By the Tietze extension theorem there exists a function $\delta : C \to R^{n+1}$ such that $\delta(x) = \theta(x) - \phi(x)$ if $x \in K$, and $|\delta(x)| \le \varepsilon$ for $x \in C$. Define $\phi_1(x) = \theta(x) - \delta(x)$, $x \in C$.

Then $\phi_1(x) = \phi(x)$ if $x \in K$,

$$|\phi_1(x)| = |\theta(x) - \delta(x)| \ge |\theta(x)| - |\delta(x)| \ge \frac{c}{2} - \varepsilon > 0,$$

and $\phi_1$ is a solution of our problem.

3.25. Lemma: Let $D \subset R^n$ be a symmetric open bounded set such that $0 \notin \bar D$. Let $\phi$ be a mapping of $\partial D$ in $R^m$, $m > n$, which is odd and non-vanishing. Then $\phi$ can be extended to $\bar D$ to be odd and non-vanishing.

Proof: We shall use induction on the dimension $n$. For $n = 1$, $D$ looks like:

[Figure: the set $D$ on the real line, symmetric about 0, with the point $\varepsilon$ marked.]

By the previous lemma we can extend the function $\psi = \phi$ restricted to $[\varepsilon, \infty] \cap \partial D$ to a non-vanishing function $\tilde\psi$ defined on some interval $[\varepsilon, N]$. By symmetry, we may define a function extending $\phi$ and never vanishing.

Suppose now that the lemma is established for $n_1 < n$. Let $x \in R^n$, $\tilde x \in R^{n-1}$ (suppose furthermore that $R^{n-1}$ has been identified with the hyperplane $x_1 = 0$ in $R^n$). Considering $R^{n-1} \cap D$, $\phi$ can be extended to $\partial D \cup (R^{n-1} \cap D)$ to be odd and non-vanishing (this is our inductive step): call the extension again $\phi$.

Now split $R^n$ into $R^{n-1}$, $R^n_+$, $R^n_-$, where $x_1 = 0$, $x_1 > 0$, $x_1 < 0$ respectively, and let $D_+ = D \cap R^n_+$, $D_- = D \cap R^n_-$. By the previous lemma $\phi$ has a further extension to $\partial D \cup (R^{n-1} \cap D) \cup D_+$, continuous and non-vanishing. Now, by symmetry, the final extension can be defined.

Q.E.D.

3.26. Lemma: Let $D \subset R^n$ be a bounded open symmetric set such that $0 \notin \bar D$, $\phi : \partial D \to R^n$ a continuous, odd and never vanishing mapping. Then $\phi$ can be extended to $\bar D$ to be continuous and odd, and furthermore non-vanishing on $\bar D \cap R^{n-1}$ (again with the identification $R^{n-1} \subset R^n$).

It follows from the previous lemma, applied to $\phi$ restricted to $\partial(D \cap R^{n-1}) = \partial D \cap R^{n-1}$, that we can obtain a never-vanishing extension to $\bar D \cap R^{n-1}$. Such an extension can be extended at once to the desired map on $\bar D$ by symmetry.

Q.E.D.

3.27. Lemma: If $D \subset R^n$ is a bounded open symmetric set and $0 \notin \bar D$, for every $\phi : \partial D \to R^n$ continuous and odd such that $0 \notin \phi(\partial D)$, $\deg(0, \phi, D)$ is an even integer.

Proof: Extend $\phi$ to $\bar D$ so as to be a continuous odd mapping never vanishing on $\bar D \cap R^{n-1}$. The lemma above assures the existence of such an extension. Call the extended mapping also $\phi$. Approximate $\phi$ by a mapping $\psi$ of class $C^1$ and odd (replace, if necessary, an approximating $\psi$ by its odd part $\tfrac12[\psi(x) - \psi(-x)]$). If $\psi$ is close enough to $\phi$, it follows that

$$0 \notin \psi(\partial D),$$

$$0 \notin \psi(\bar D \cap R^{n-1}),$$

$$\deg(0, \psi, D) = \deg(0, \phi, D).$$

We want to compute $\deg(0, \psi, D)$. Consider the sets $D_+ = R^n_+ \cap D$, $D_- = R^n_- \cap D$ (where $R^n_+ = \{x \mid x_1 > 0\}$, $R^n_- = \{x \mid x_1 < 0\}$).

By construction $\psi$ never vanishes on $\bar D \cap R^{n-1}$, so we can avoid this set and obtain:

(1) $\deg(0, \psi, D) = \deg(0, \psi, D_+ \cup D_-) = \deg(0, \psi, D_+) + \deg(0, \psi, D_-)$.

Choose $p$ close to 0 and such that $p$ is not the image under $\psi$ of a critical point of $\psi$. Observe now that since $\psi$ is odd, each partial derivative $\partial \psi_i / \partial x_j$ is even. But then $J_\psi$ is also an even mapping. This implies in particular that $-p$ is not the image of a critical point either. Compute

$$\deg(0, \psi, D_+) = \sum_{\psi(y) = p,\ y \in D_+} \operatorname{sign} J_\psi(y),$$

$$\deg(0, \psi, D_-) = \sum_{\psi(z) = p,\ z \in D_-} \operatorname{sign} J_\psi(z).$$

Since $\psi$ is odd, the set $\{z \mid \psi(z) = p,\ z \in D_-\}$ can be obtained by taking the opposite of the elements in $\{y \mid \psi(y) = -p,\ y \in D_+\}$. But $J_\psi(z) = J_\psi(-z)$, and $-p$ is also close to 0, so the second sum equals $\deg(-p, \psi, D_+) = \deg(0, \psi, D_+)$, and we conclude that

$$\deg(0, \psi, D_+) = \deg(0, \psi, D_-).$$

Then (1) implies that $\deg(0, \phi, D) = \deg(0, \psi, D)$ is an even number.

Q.E.D.

Now we are ready to prove Borsuk's theorem.

Consider a small open ball $U$ with center at 0, and a mapping $f : \bar D \to R^n$ such that

(a) $f$ is odd,
(b) $f|_{\partial D} = \phi|_{\partial D}$,
(c) $f|_U$ = identity.

The existence of such a function follows from the observation that if $g$ is an extension satisfying (b) and (c) (such an extension certainly exists) then $f(x) = \tfrac12[g(x) - g(-x)]$ satisfies (a), (b) and (c).

We know that $f|_{\partial D} = \phi|_{\partial D}$ implies

$$\deg(0, f, D) = \deg(0, \phi, D)$$

(as follows from 3.16; 2). But if $f = \mathrm{id}$ on $U$, it is clear that $f \ne 0$ on $\partial U$ and then

$$\deg(0, f, D) = \deg(0, f, U \cup (D - \bar U)) = \deg(0, f, U) + \deg(0, f, D - \bar U) = 1 + \deg(0, f, D - \bar U).$$

But the second term is known to be even by the last lemma. This proves that

$$\deg(0, \phi, D) = \deg(0, f, D)$$

is odd.

Q.E.D.
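Borsuk's theorem can be illustrated in the plane, where the degree at 0 on the unit disc equals the winding number about 0 of the image of the unit circle (a standard fact not proved in these notes). The sketch below is hypothetical throughout: the helper names `planar_degree` and `odd_part`, the map `g` and the sample count are illustrative choices. It forms the odd part $\tfrac12[g(x) - g(-x)]$ of a map, as in the proof above, and checks that the resulting degree is odd.

```python
import cmath
import math


def planar_degree(phi, samples=4000):
    """Winding number about 0 of the image under phi of the unit circle."""
    total = 0.0
    prev = phi(1 + 0j)
    for k in range(1, samples + 1):
        cur = phi(cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)   # small signed angle between samples
        prev = cur
    return round(total / (2 * math.pi))


def odd_part(g):
    """Return the odd map x -> (g(x) - g(-x)) / 2."""
    return lambda z: 0.5 * (g(z) - g(-z))


# g is not odd; its odd part is z**3 + 0.5*conj(z), which does not vanish on |z| = 1.
g = lambda z: z ** 3 + 0.5 * z.conjugate() + 0.3 * z ** 2 + 0.2
f = odd_part(g)

deg_f = planar_degree(f)                                  # odd map
deg_reflection = planar_degree(lambda z: z.conjugate())   # odd map, orientation reversing
deg_even_map = planar_degree(lambda z: z ** 2)            # even map, for contrast
```

The even map $z^2$ has degree 2, which shows that the oddness hypothesis cannot simply be dropped.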

We now draw some consequences from Borsuk's theorem.

3.28. Corollary: Let $D$ be as in Borsuk's theorem and $\psi : \partial D \to R^n$ a continuous odd mapping. Then there exists no homotopy $\psi_t$ of $\psi$ into a constant mapping such that $\psi_t(x) \ne 0$ for all $t$, $x \in \partial D$.

Proof: First extend $\psi$ to some $\phi$ defined on $\bar D$. Replacing $\phi$ by $\tfrac12(\phi(x) - \phi(-x))$ we may suppose that $\phi$ itself is odd. But then Borsuk's theorem implies that $\deg(0, \phi, D) \ne 0$ and the impossibility of the existence of the homotopy described is then apparent.

Q.E.D.

3.29. Corollary: Let $D$ be as above. If $\psi : \partial D \to R^n$ is an odd continuous mapping whose image is contained in a subspace $E \ne R^n$, then $\psi$ assumes the value 0 at some point of $\partial D$.

Proof: Extend $\psi$ to a continuous odd mapping $\phi : \bar D \to E$. If $0 \notin \psi(\partial D)$, then by Borsuk's theorem, $\deg(0, \phi, D)$, being odd, is different from zero. But this implies that $\deg(p, \phi, D)$ differs from zero on the component $\Delta$ of $R^n - \psi(\partial D)$ containing 0. Now (3.16; 4) implies that $\phi(\bar D)$ contains such a component, and, a fortiori, $E$ does also. But this is impossible, since $\Delta$ is open, non-void, and $E \ne R^n$.

Q.E.D.

3.30. Corollary: Let $D$ be as above, and let $\psi$ be any continuous mapping $\psi : \partial D \to R^n$ whose image is contained in a subspace $E \ne R^n$. Then there exists $p \in \partial D$ such that $\psi(p) = \psi(-p)$.

Proof: Apply the corollary above to the mapping $\tfrac12(\psi(x) - \psi(-x))$.

Q.E.D.

3.31. Corollary: Let $D$ be as above, and $\phi : \bar D \to R^n$ a continuous mapping never vanishing on $\partial D$, such that for every $x \in \partial D$,

(1) $a\phi(x) \ne (1 - a)\phi(-x)$

for all $a$, $\tfrac12 \le a \le 1$. Then $\phi(D)$ contains a neighborhood of the origin.

Proof: Observe that the conclusion follows from the statement

$$\deg(0, \phi, D) \ne 0.$$

This property is an immediate consequence of the fact that if $\psi$ is the mapping $\psi(x) = \tfrac12(\phi(x) - \phi(-x))$, then by Borsuk's theorem, $\deg(0, \psi, D) \ne 0$. Under condition (1), the family

$$t\phi(x) - (1 - t)\phi(-x), \quad \tfrac12 \le t \le 1,$$

is a homotopy between $\phi$ and $\psi$, which implies

$$\deg(0, \phi, D) = \deg(0, \psi, D).$$

Q.E.D.


Degree Theory-General Case

H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Vector Space

Suppose $E$ is a real vector space of dimension $n$. By choosing a basis in $E$ we can identify $E$ with $R^n$. This should allow us to define $\deg(p, \phi, D)$ as it was done for $R^n$, and of course the only important thing is to see what happens after a change of basis. The answer is that the degree is basis-independent. More precisely, given a basis $B = \{b_1, \ldots, b_n\}$, we shall for the moment denote by $\deg_B(p, \phi, D)$ the degree computed with respect to $B$; then we have

3.32. Proposition: For every pair of bases $B$, $\mathcal{P}$

$$\deg_B(p, \phi, D) = \deg_{\mathcal{P}}(p, \phi, D),$$

whenever the expressions make sense.

Proof: It suffices to prove this for $C^1$ mappings. But then we only need to know what happens to the sign of the Jacobian of a mapping when the basis is changed. This is easily seen to be invariant, whence the result.

Q.E.D.

I. Preliminaries: Restriction to a Subspace

Suppose that $D \subset R^n$ is an open and bounded set, and that $R^m \subset R^n$, where the inclusion is made by identifying $R^m$ with the subspace of $R^n$ whose points are the $x$ such that $x_{m+1} = x_{m+2} = \ldots = x_n = 0$.

3.33. Proposition: If $\phi : \bar D \to R^m$ is continuous and $\psi : \bar D \to R^n$ is the mapping $\psi = \mathrm{id} + \phi$, then for every $p \in R^m$ not belonging to $\psi(\partial D)$:

$$\deg(p, \psi, D) = \deg(p, \psi|_{R^m \cap \bar D}, D \cap R^m).$$

Proof: Let us begin by noting that $\psi(R^m \cap \bar D) \subset R^m$, as can be verified easily; thus the expression $\deg(p, \psi|_{R^m \cap \bar D}, R^m \cap D)$ makes sense.

Suppose that $\phi$ is $C^1$. By definition it suffices to prove the statement for this case, and under the assumption that $p$ is not the image of a critical point of $\psi$. As the degree is then computed by counting zeros, it is necessary to look for the points $y$ in $\psi^{-1}(p)$. If $\psi(y) = y + \phi(y) = p$, then $y = p - \phi(y) \in R^m$. Hence $\psi^{-1}(p) \subset R^m \cap D$.

This implies that the points to be counted for $\psi : \bar D \to R^n$ and for $F = \psi|_{R^m \cap \bar D}$, $F : R^m \cap \bar D \to R^m$, are the same, and the only possible difference in degree lies in the signs assigned to them.

Our proposition will follow from the fact that at each such point $y$, we have

(1) $\operatorname{sign} J_\psi(y) = \operatorname{sign} J_F(y)$.

To prove (1), first observe that the Jacobian matrix of $\psi$ has the form

$$\begin{pmatrix} I_m + \left( \dfrac{\partial \phi_i}{\partial x_j} \right) & * \\ 0 & I_{n-m} \end{pmatrix}$$

($i$, $j$ run from 1 to $m$). This implies immediately that $J_\psi(x) = J_F(x)$ for every $x \in R^m \cap D$, which clearly implies (1) above. Hence our proposition has been proved.

Q.E.D.

J. Degree of Finite Dimensional Perturbations of the Identity

Let $X$ be a real Hausdorff locally convex T.L.S., and $D \subset X$ an open subset of $X$ such that $E \cap D$ is bounded for every finite dimensional subspace $E$ of $X$ (in that case we say that $D$ is "finitely bounded"). This is the most general case we shall consider.

We now give some definitions.

3.34. Definition: If $T$ is a topological space and $\phi : T \to X$ is a continuous mapping, we shall say that $\phi$ is finite dimensional if $\phi(T)$ is contained in some finite dimensional subspace of $X$. If $T$ is also a subset of $X$, we define a finite dimensional perturbation of the identity to be a mapping $\psi : T \to X$ of the form $\psi = 1 + \phi$ where $1$ is the identity $1 : T \to X$ and $\phi$ is finite dimensional. $\phi = \psi - 1$ is called the perturbation of $\psi$.

Our aim is to define the degree $\deg(p, \psi, D)$ for every finite dimensional perturbation of the identity $\psi = 1 + \phi$ (defined on $T = \bar D$, $D$ as above). Let $p \in X$ be a point not in $\psi(\partial D)$ and choose a finite dimensional subspace $E \subset X$ such that

(a) $p \in E$, $\phi(\bar D) \subset E$.

Letting $f$ denote the restriction $f = \psi|_{\bar D \cap E} : \bar D \cap E \to E$, we have

3.35. Definition:

$$\deg(p, \psi, D) = \deg(p, f, D \cap E),$$

where the second member is computed in $E$ according to the theory for finite dimensional spaces.

We must justify 3.35. Suppose $F$ is a subspace of $X$ satisfying properties (a) above and that $F \subset E$. Then Proposition 3.33 applies and we conclude

$$\deg(p, f|_{F \cap \bar D}, D \cap F) = \deg(p, f, D \cap E).$$

If $F$ satisfies (a) but $F$ and $E$ are not nested, we reduce to that case by considering $E \subset E + F$ and $F \subset E + F$ separately. Thus 3.35 is justified.

3.36. Remark: We know that in the finite dimensional case the degree $\deg(p, \phi, D)$ depends only on the restriction of $\phi$ to $\partial D$ (see 3.16; 2).

Let us suppose that we have a finite dimensional mapping defined only on $\partial D$, $\tilde\phi : \partial D \to X$. The degree of all finite dimensional perturbations of the identity $1 + \phi$ by means of a finite dimensional extension $\phi$ of $\tilde\phi$ defined on all of $\bar D$ will be the same, as follows from Definition 3.35 and the finite dimensional theory. But we cannot assure the existence of such extensions unless we assume $X$ to have additional properties (for instance, to be normal). Nevertheless a notion of degree of $1 + \tilde\phi$ may be defined. Suppose $p \notin (1 + \tilde\phi)(\partial D)$. Choose a finite dimensional subspace $E$ containing both $p$ and $\tilde\phi(\partial D)$. $E$ is finite dimensional and so there are extensions $\phi$ of $\tilde\phi|_{\partial D \cap E}$ to all of $\bar D \cap E$. Thus it is possible to define $\deg(p, \tilde\phi + 1, D)$ by

$$\deg(p, \tilde\phi + 1, D) = \deg(p, \phi + 1, D),$$

where the second member is computed in $E$.

Hence whenever we have a finite dimensional mapping $\tilde\phi : \partial D \to X$ we can define the degree $\deg(p, \tilde\phi + 1, D)$, and this coincides with the degree of every finite dimensional perturbation of $1$ by means of an extension of $\tilde\phi$ to all of $\bar D$.


K. Properties

We have shown that the definition of degree can be generalized to obtain a notion of degree for finite dimensional perturbations of the identity with respect to domains "finitely bounded" in an arbitrary locally convex T.L.S. Now we shall list the properties of degree that remain valid in this situation.

From Definition 3.35 and 3.16; 4, 5, 6 and 7 we obtain:

3.37. Proposition: For every finite dimensional perturbation of the identity $\psi = 1 + \phi : \bar D \to X$, and every $p \notin \psi(\partial D)$, the following results hold:

1. If $p \notin \psi(D)$, then $\deg(p, \psi, D) = 0$.

2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then

$$\deg(p, \psi, D) = \deg(q, \psi, D).$$

3. If $D = \bigcup_i D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then

$$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i).$$

4. If $K \subset \bar D$, $K$ is closed, $p \notin \psi(K)$, then

$$\deg(p, \psi, D) = \deg(p, \psi, D - K).$$

5. If $f : \bar D_1 \to X_1$ is a mapping satisfying the same conditions as $\phi$, then

$$\deg((p \times q), 1 + (\phi \times f), D \times D_1) = \deg(p, 1 + \phi, D)\,\deg(q, 1 + f, D_1)$$

whenever these expressions make sense.

L. Limits

The family of finite dimensional mappings is closed under addition and product by scalars. Hence to proceed we consider limits of such mappings. From this point of view the important thing is that the compact mappings (definition below) are such limits, and so we will be able to define degrees of compact perturbations of the identity.

Let us recall and introduce some notations: $D$ is an open set in $X$ such that $D \cap E$ is bounded for every finite dimensional subspace $E \subset X$. $C(\bar D)$ will denote the (linear) space of all continuous mappings $\phi : \bar D \to X$; likewise, $C(\partial D)$ will be the space of continuous mappings $\phi : \partial D \to X$. There exists a natural mapping $\rho : C(\bar D) \to C(\partial D)$ defined by restriction, $\rho(\phi) = \phi|_{\partial D}$.

Similarly, we denote by $\mathcal{F}(\bar D)$ and $\mathcal{F}(\partial D)$ the subspaces of $C(\bar D)$ and $C(\partial D)$ whose elements are the finite dimensional mappings. Of course, $\rho : \mathcal{F}(\bar D) \to \mathcal{F}(\partial D)$.

Let us give $C(\bar D)$ and $C(\partial D)$ the topologies of uniform convergence. The open sets of, say, $C(\bar D)$, are those defined by

$$\{\phi \mid \phi(\bar D) \subset G\}$$

where $G$ runs over all the open sets of $X$.

Warning: These topologies are not linear space topologies, but merely group topologies (i.e., the mapping $(\phi, \psi) \to \phi + \psi$ is continuous, while the mapping $(\lambda, \phi) \to \lambda\phi$, $\lambda \in R$, $\phi \in C(\bar D)$, is not necessarily continuous).

By 3.36, for every $\phi \in \mathcal{F}(\partial D)$ and every $p \notin (1 + \phi)(\partial D)$, the degree $\deg(p, 1 + \phi, D)$ is defined, and for every $g \in C(\bar D)$ such that $\rho g = \phi$,

$$\deg(p, 1 + g, D) = \deg(p, 1 + \phi, D).$$

3.38. Lemma: Let $\phi \in C(\partial D)$, $p$ be a point of $X$ and $V$ be a convex symmetric neighborhood of $0 \in X$ such that:

(a) $(p + V) \cap (I + \phi)(\partial D) = \emptyset$.

Then, if $f \in \mathcal{F}(\partial D)$ satisfies $f(x) - \phi(x) \in V$ for every $x \in \partial D$, the degree

$$\deg(p, I + f, D)$$

is defined; moreover, these degrees are equal for all such $f$.

Proof: Suppose $p = x + f(x)$, $x \in \partial D$. Then $x + \phi(x) = p + (\phi(x) - f(x)) \in p + V$, which contradicts (a). Hence $p \notin (I + f)(\partial D)$ and the degree is defined.

Suppose that $g \in \mathcal{F}(\partial D)$ also satisfies $g(x) - \phi(x) \in V$ for every $x \in \partial D$, and call $F = I + f$, $G = I + g$, $\psi = I + \phi$.

For every $x \in \partial D$, $F(x)$ and $G(x)$ both belong to $\psi(x) + V$ and this set is convex. Hence any convex combination $(1 - t)G(x) + tF(x)$, $0 \le t \le 1$, also belongs to $\psi(x) + V$. Using (a) this implies that for every $t$, $0 \le t \le 1$, and every $x \in \partial D$,

$$p \ne (1 - t)G(x) + tF(x).$$

Let $E$ be a finite dimensional subspace of $X$ such that

$$p \in E, \qquad g(\partial D) \subset E, \qquad f(\partial D) \subset E.$$

Considering the domain $D \cap E$ and the homotopy $(1 - t)G + tF$ between $G$ and $F$, we conclude by 3.17 that $\deg(p, G, D \cap E) = \deg(p, F, D \cap E)$, or, according to Definition 3.35, $\deg(p, G, D) = \deg(p, F, D)$ as desired.

Q.E.D.
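The content of Lemma 3.38 can be checked numerically in the simplest setting. The sketch below is only an illustration under stated assumptions: $X$ is taken to be the plane (identified with the complex numbers), $D$ is the unit disc, and the degree is computed as the winding number of the boundary image around $p$, a standard fact in the plane. The maps `phi`, `f`, `g` and the point `p` are hypothetical choices, with $V$ the disc of radius 0.1.

```python
import cmath
import math

def winding_number(psi, p, samples=4000):
    """Winding number around p of the closed curve theta -> psi(e^{i theta})."""
    total = 0.0
    prev = psi(complex(1.0, 0.0)) - p
    for k in range(1, samples + 1):
        z = cmath.exp(2j * math.pi * k / samples)
        cur = psi(z) - p
        total += cmath.phase(cur / prev)  # small signed angle increment
        prev = cur
    return round(total / (2 * math.pi))

# Hypothetical data: phi is the perturbation; f and g satisfy
# |f - phi| < 0.1 and |g - phi| < 0.1 on the boundary (V = disc of radius 0.1).
def phi(z):
    return 0.3 * z * z

def f(z):
    return phi(z) + 0.05 * z.conjugate()

def g(z):
    return phi(z) - 0.05j

# |z + phi(z)| >= 0.7 on the unit circle, so p + V misses (I + phi)(boundary).
p = 0.1 + 0.1j

deg_F = winding_number(lambda z: z + f(z), p)
deg_G = winding_number(lambda z: z + g(z), p)
print(deg_F, deg_G)
```

Both perturbations stay inside $\phi + V$ on the boundary, so the lemma predicts that the two printed degrees agree.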

3.39. Proposition: Let $\phi \in C(\partial D)$ and $p$ a point of $X$ not in the closure of $(I + \phi)(\partial D)$. Then there exists a neighborhood $U$ of $\phi$ in $C(\partial D)$ such that the degree

$$\deg(p, I + f, D)$$

takes on but a single value for all $f$ belonging to $U \cap \mathcal{F}(\partial D)$.

Proof: Follows from the lemma above.

Q.E.D.

This proposition will permit us to define the degree for perturbations of the identity by limits of finite dimensional mappings.

Call $\mathcal{L}(\partial D)$, $\mathcal{L}(\bar{D})$ the closure of $\mathcal{F}(\partial D)$ (respectively $\mathcal{F}(\bar{D})$) in $C(\partial D)$ (respectively $C(\bar{D})$). Plainly $Q : \mathcal{L}(\bar{D}) \to \mathcal{L}(\partial D)$.

3.40. Definition: Let $\phi = \psi - I$ be a mapping in $\mathcal{L}(\partial D)$. Let $p$ be a point of $X$ not in the closure of $\psi(\partial D)$. The common value of the degrees $\deg(p, I + f, D)$ when $f \in \mathcal{F}(\partial D)$ is near $\phi$ is defined to be $\deg(p, \psi, D)$.

If $\phi = \psi - I$ is a mapping defined on all of $\bar{D}$ and such that $Q(\phi) \in \mathcal{L}(\partial D)$, we shall write simply $\deg(p, \psi, D)$ instead of $\deg(p, Q\psi, D)$. In particular, for every $\phi \in \mathcal{L}(\bar{D})$, the degree is defined.

From 3.37 the next proposition follows easily.

3.41. Proposition: If $\phi = \psi - I \in \mathcal{L}(\bar{D})$ and $p$ and $q$ do not belong to the closure of $\psi(\partial D)$, then:

1. If $p \notin \psi(D)$, then $\deg(p, \psi, D) = 0$.

2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then

$$\deg(p, \psi, D) = \deg(q, \psi, D).$$

3. If $D = \bigcup_i D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then

$$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i).$$

4. If $K \subset \bar{D}$, $K$ is closed and $p \notin \psi(K)$, then

$$\deg(p, \psi, D) = \deg(p, \psi, D - K).$$

5. If $f : \bar{D}_1 \to X_1$ is a mapping of $\mathcal{L}(\bar{D}_1)$, then

$$\deg((p, p_1), I + (\phi, f), D \times D_1) = \deg(p, I + \phi, D)\,\deg(p_1, I + f, D_1).$$

Proof: Left to the reader. (Hint: Approximate and use Proposition 3.16.)

In the same way it is possible to prove the following generalization of 3.33:

3.42. Proposition: Let $Y$ be a closed subspace of $X$ and $p$ be a point of $Y$ not in $\psi(\partial D)$, where $\psi - I \in \mathcal{L}(\partial D)$. Then

$$\deg(p, \psi, D) = \deg(p, \psi|_{\bar{D} \cap Y}, D \cap Y).$$

Finally, we have a generalization of Borsuk's theorem:

3.43. Proposition: If $D \subset X$ is an open set which is finitely bounded, symmetric, and contains the origin, then for every odd $\psi$ such that $\psi - I \in \mathcal{L}(\partial D)$, and if $0 \notin \psi(\partial D)$, then $\deg(0, \psi, D)$ is an odd number.

The proof follows at once from Definition 3.40 and the Borsuk theorem 3.23.

3.44. Corollary: If $D$ is a domain as in the proposition above, if $\psi - I$ is odd and belongs to $\mathcal{L}(\bar{D})$, and if $0 \notin \psi(\partial D)$, then $\psi(D)$ covers a neighborhood of 0.

M. Compact Perturbations

Here we shall show that the compact mappings are in $\mathcal{L}(\partial D)$ and $\mathcal{L}(\bar{D})$ and obtain some additional properties of the degree of compact perturbations of the identity.

We begin with some purely topological results.

Let $X$ be a locally convex T.L.S., $T$ a topological space. Let $C(T)$ denote the set of continuous mappings $\phi : T \to X$. $C(T)$ has a natural structure as a topological space (see the beginning of section L).

Consider the subspace $\mathcal{F}(T)$ of $C(T)$ whose elements are the mappings $\phi$ such that $\phi(T)$ is contained in a finite dimensional subspace of $X$ and the subspace $K(T)$ of the mappings $\phi$ for which the set $\phi(T)$ is pre-compact.

3.45. Proposition:

$$K(T) \subset \overline{\mathcal{F}(T) \cap K(T)}.$$

Proof: Choose an open, convex, symmetric neighborhood $V$ of 0 in $X$, and let $\phi \in K(T)$. Suppose the points $y_1, \ldots, y_n \in X$ have the property

$$\phi(T) \subset \bigcup_{i=1}^{n} \{y_i + V\}.$$

Letting $\varrho$ be the gauge induced by $V$,

$$\varrho(x) = \inf_{x \in \lambda V} |\lambda|, \qquad x \in X,$$

we define mappings $\mu_i : T \to R$ by

$$\mu_i(x) = \max(0, 1 - \varrho(\phi(x) - y_i)).$$

Each $\mu_i$ is continuous (since $\varrho$ is continuous).

Since $\phi(T) \subset \bigcup_i \{y_i + V\}$, for each $x$ there exists at least one $\mu_i(x)$ different from zero. Thus the function $\mu(x) = \sum_i \mu_i(x)$ never vanishes on $T$, and we can define

$$\lambda_i(x) = \frac{\mu_i(x)}{\mu(x)}.$$

These mappings satisfy $1 \ge \lambda_i \ge 0$, $\sum_i \lambda_i(x) = 1$. Now define

$$\phi_V(x) = \sum_i \lambda_i(x)\, y_i.$$

This mapping belongs to $\mathcal{F}(T) \cap K(T)$ and

$$\phi(x) - \phi_V(x) = \sum_i \lambda_i(x)\,(\phi(x) - y_i).$$

We see that if $\phi(x) - y_i \notin V$, then $\varrho(\phi(x) - y_i) \ge 1$, and consequently $\mu_i(x) = 0$, which implies $\lambda_i(x) = 0$. This means that $\phi(x) - \phi_V(x)$ is a convex combination of elements of $V$, which belongs to $V$ since $V$ is convex. Thus $\phi$ is a limit point of $\mathcal{F}(T) \cap K(T)$, as desired.

Q.E.D.
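The construction in this proof is explicit enough to try out. The following is a minimal sketch under simplifying assumptions: $X$ is replaced by the plane, $V$ by the open disc of radius `r`, and the map `phi` and the centers $y_i$ are hypothetical choices. It builds the weights $\mu_i$, $\lambda_i$ and the approximant $\phi_V$ as above and checks that $\phi(x) - \phi_V(x)$ stays in $V$.

```python
import math

def gauge(v, r):
    """Gauge of the open ball V = {v : |v| < r} in R^2."""
    return math.hypot(v[0], v[1]) / r

def finite_dim_approximation(phi, centers, r):
    """Return phi_V(x) = sum_i lambda_i(x) * y_i for the given centers y_i."""
    def phi_V(x):
        y = phi(x)
        mu = [max(0.0, 1.0 - gauge((y[0] - c[0], y[1] - c[1]), r)) for c in centers]
        total = sum(mu)  # positive because the sets y_i + V cover phi(T)
        lam = [m / total for m in mu]
        return (sum(l * c[0] for l, c in zip(lam, centers)),
                sum(l * c[1] for l, c in zip(lam, centers)))
    return phi_V

# Hypothetical example: T = [0, 2*pi], phi(T) = unit circle, V = disc of radius 0.6.
def phi(x):
    return (math.cos(x), math.sin(x))

r = 0.6
centers = [phi(2 * math.pi * k / 12) for k in range(12)]  # the points y_1, ..., y_12
phi_V = finite_dim_approximation(phi, centers, r)

samples = [2 * math.pi * k / 1000 for k in range(1000)]
max_err = max(math.dist(phi(x), phi_V(x)) for x in samples)
print(max_err < r)
```

Shrinking `r` (and adding centers so that the discs still cover the image) gives approximants that are uniformly closer to `phi`, which is the limit statement of the proposition.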

Using this proposition we may return to our initial situation $D \subset X$, $D$ an open finitely bounded set.

We shall say that a mapping $\phi \in C(\bar{D})$ (or $\phi \in C(\partial D)$) is compact iff $\overline{\phi(\bar{D})}$ is a compact set (respectively: $\overline{\phi(\partial D)}$).

(Not to be confused with the mappings of Definition 1.38.)

3.46. Proposition: Any compact mapping $\phi \in C(\bar{D})$ (respectively $\phi \in C(\partial D)$) belongs to $\mathcal{L}(\bar{D})$ (respectively to $\mathcal{L}(\partial D)$).

Proof: Immediate from 3.45.

Q.E.D.

The perturbations of the identity by elements of $\mathcal{L}(\bar{D})$ do not behave "nicely" topologically and in the preceding it was necessary to consider such artificial sets as the closure of $(I + \phi)(\partial D)$. The compact ones however look like finite dimensional mappings, and we have

3.47. Proposition: If $D$ is any domain $D \subset X$, and $\phi \in C(\partial D)$ (respectively $\phi \in C(\bar{D})$) is compact, then $\psi = I + \phi$ is proper and closed.

Proof: To say that $\psi$ is proper means that the inverse image of a compact set is also compact. Suppose $K$ is compact, and let $A = \psi^{-1}(K)$. Suppose $\{x_\alpha\}$ is an indexed family in $A$. Then $\{x_\alpha + \phi(x_\alpha)\}$, being contained in $K$, has a convergent subfamily $x_\beta + \phi(x_\beta) \to y$. But $\phi$ being a compact mapping, there exists a third subfamily such that $\phi(x_\gamma) \to z$. This implies that $x_\gamma \to y - z$, and so $A$ is compact. Suppose now that $F \subset \bar{D}$ is closed and that $x_\alpha + \phi(x_\alpha) \to z$, $x_\alpha \in F$. $\phi$ being compact, there exists $\{x_\beta\}$ such that $\phi(x_\beta) \to y$. Then $x_\beta \to z - y$; by continuity, $z - y + \phi(z - y) = z$. $F$ is closed, so $z - y = \lim x_\beta$ belongs to $F$, which implies that $z \in (I + \phi)(F)$. This means that $(I + \phi)(F)$ is closed as desired.

Q.E.D.

3.48. Corollary: $\psi(\partial D)$ is closed.

This property makes the statement "$p$ is not in the closure of $\psi(\partial D)$" in most of the statements above equivalent to "$p$ is not in $\psi(\partial D)$", the same statement which appears in the finite dimensional case.

We leave to the reader the work (and the delight thereby engendered) of rewriting Propositions 3.41 and 3.43 for the case $\psi - I$ compact, with the assumption $p \notin \psi(\partial D)$.

3.49. Corollary: Suppose $D$ is a domain as in 3.43 and $\psi \in C(\bar{D})$ a map such that $\psi - I$ is compact. If $\psi$ maps $\bar{D}$ into a proper linear subspace of $X$, then $\psi(x) = \psi(-x)$ for some $x \in \partial D$.

Proof: Consider the map $j(x) = \psi(x) - \psi(-x)$. If $j(x) \ne 0$ when $x \in \partial D$, then by 3.44 $j(D)$ would cover a neighborhood of 0. But $j(D)$ is contained in every subspace containing $\psi(D)$; by hypothesis there is a proper one, without interior points.

Q.E.D.

3.50. Corollary: Let $D$ be as above and $\psi \in C(\bar{D})$ such that $\psi - I$ is compact. If $\psi(x)$ is never in the positive direction of $\psi(-x)$, $x \in \partial D$, then $\psi(x) = 0$ for some $x \in D$.

We shall define a notion of homotopy for compact mappings:


3.51. Definition: Two compact mappings $\phi_0, \phi_1 \in C(T)$ ($T$ is any topological space) are said to be compact homotopic if there exists a compact mapping $F : [0, 1] \times T \to X$ such that $F(0, x) = \phi_0(x)$, $F(1, x) = \phi_1(x)$.

3.52. Proposition: Let $D$ be as above, $\phi_0 = \psi_0 - I$, $\phi_1 = \psi_1 - I$ two compact mappings $\phi_i \in C(\partial D)$. If $\phi_0$ and $\phi_1$ are compact homotopic under $F(t, x) = \phi_t(x) = \psi_t(x) - x$ and $p$ is a point in $X$ such that $p \ne \psi_t(x)$ for every $t$ and every $x \in \partial D$, then

$$\deg(p, \psi_0, D) = \deg(p, \psi_1, D).$$

Proof: If $F$ is compact, it may be approximated by finite dimensional mappings. The restrictions of such mappings also provide close approximations of $\phi_0$ and $\phi_1$, and then the proposition follows from the finite dimensional case.

Q.E.D.

N. Multiplicative Property and Generalized Jordan's Theorem for Banach Spaces

$X$ is now a Banach space. Let $D \subset X$ be a bounded domain, $\psi : \bar{D} \to X$, where $\phi = \psi - I$ is compact.

3.53. Lemma: $\psi(\bar{D})$ is bounded.

Of course $\psi(\bar{D}) \subset \bar{D} + \phi(\bar{D})$, and $\overline{\phi(\bar{D})}$ being compact, both $\bar{D}$ and $\phi(\bar{D})$ are bounded. Then $\psi(\bar{D})$ is bounded.

Q.E.D.

Since $\psi(\partial D)$ is closed (see 3.48), the set $\Delta = X - \psi(\partial D)$ is open and therefore has the form

$$\Delta = \bigcup_i \Delta_i$$

where the $\Delta_i$ are the components of $\Delta$. Among these there is one and only one unbounded component, $\Delta_\infty$, because $\psi(\partial D)$ is bounded. Let $G = \bigcup_{i \ne \infty} \Delta_i$, and suppose furthermore that $g : \bar{G} \to X$ is a mapping such that $g - I$ is compact.


3.54. Theorem (multiplicative property): Under the hypotheses above, and if $p \notin g\psi(\partial D)$, then:

$$\text{(I)} \qquad \deg(p, g\psi, D) = \sum_{i \ne \infty} \deg(p, g, \Delta_i)\,\deg(\Delta_i, \psi, D).$$

Remark: We have used the notation $\deg(\Delta_i, \psi, D)$ as in section F (cf. 3.20): the justification for this comes from 3.41; 2.

Proof: First of all, as $g$ is proper (3.47), it follows that $K = g^{-1}(p) \subset G$ is compact. Therefore from the covering $K \subset \bigcup_{i \ne \infty} \Delta_i$ we can select a finite family satisfying $K \subset \bigcup \Delta_i$. Thus in the expression (I) all but a finite number of terms vanish so that the sum is meaningful.

Moreover, if $g - I$ is approximated very closely (uniformly over $\bar{G}$) by a finite dimensional mapping $\tilde{g} - I$,

$$\deg(p, g, \Delta_i) = \deg(p, \tilde{g}, \Delta_i), \qquad i \ne \infty,$$

as follows from Definition 3.40.

Hence we can prove (I) assuming $g - I$ itself finite dimensional, and the general case will follow immediately.

Observe that if $\psi - I$ is also finite dimensional we are done, because we then have just the finite dimensional result proved in 3.20. Thus the only thing to be proved is that $\phi$ may be approximated by finite dimensional mappings, for which (I) is already known.

When $\psi - I$ is approximated closely, the composition $g\psi$ is also approximated, and the left member of (I) remains unchanged:

$$\deg(p, g\psi, D) = \deg(p, g\psi', D),$$

where $\psi'$ is the mapping corresponding to a suitable approximation to $\phi = \psi - I$.

Of course the terms $\deg(q, \psi, D)$ don't change either after the substitution of $\psi'$ for $\psi$. The only difficulty arises when we consider the sets $\Delta_i$, which obviously do change. But $K = g^{-1}(p)$ is compact, so the new sets $\Delta_i'$ will differ from the old ones by some (closed) sets, disjoint from $K$. By 3.41; 4 applied to these closed sets, the desired equality follows.

Q.E.D.

Suppose again that $D$ is open and bounded and that $\psi : \bar{D} \to X$ with $\phi = \psi - I$ compact.

3.55. Lemma: If $\psi : \bar{D} \to X$ is one-to-one, then $\psi^{-1} - I : \psi(\bar{D}) \to X$ is compact.

Proof: It is easy to see that $\psi^{-1} - I = -\phi \circ \psi^{-1}$ (if $y = x + \phi(x)$, then $\psi^{-1}(y) - y = -\phi(x)$), which yields the lemma.

Q.E.D.

3.56. Lemma: $\psi$ can be extended to $\tilde{\psi} : X \to X$ in such a way that $\tilde{\psi} - I$ is still compact.

The proof will follow immediately from Proposition 3.58 below.

3.57. Theorem (generalized Jordan's theorem): If $D$ and $D^*$ are bounded open sets in a Banach space $X$ and there exists a homeomorphism $\psi : D \to D^*$ such that $(\psi - I)(D)$ is compact, then the number of components of $X - D$ and $X - D^*$ is the same.

Proof: By Lemma 3.55, the inverse mapping $\psi^{-1}$ is also of the form (identity) + (compact); thus the hypotheses are symmetric.

But by Lemma 3.56 it is also possible to assume that $\psi$ and $\psi^{-1}$ are restrictions of globally defined compact perturbations of the identity. The proof now is the same as that of 3.21, except for the fact that the appeals to 3.20 are replaced by references to 3.54.

Q.E.D.

We now give a proof of Lemma 3.56. This lemma is an immediate consequence of the following generalization of the Tietze theorem due to J. Dugundji (An extension of Tietze's theorem, Pacif. Journal, Vol. 1, pp. 353-367 (1951)).

3.58. Proposition: Let $A$ be a closed subset of a metric space $X$, and $C$ a convex set in a locally convex T.L.S. $E$ over the real or the complex field. Then any continuous $f : A \to C$ has a continuous extension $F : X \to C$.

Proof: For each $y \in X - A$ choose an open $V_y$ containing $y$ such that $\operatorname{diam} V_y \le \varrho(V_y, A)$. Then $\{V_y\}$ is an open covering of $X - A$ and since $X - A$ is paracompact there exists a locally finite refinement $\{U\}$, i.e., the $U$'s are open and cover $X - A$, each $U \subset$ some $V_y$, and for each $x \in X - A$ there exists an open $O_x$ containing $x$ and disjoint from all but a finite number of the $U$'s.

Let $U_0 \in \{U\}$ and define for $x \in X - A$

$$\lambda_{U_0}(x) = \frac{\varrho(x, X - U_0)}{\sum_U \varrho(x, X - U)}.$$

Since $\varrho(x, X - U) > 0$ iff $x \in U$, and since each $x \in X - A$ is contained in some $U$, we have $0 \le \lambda_{U_0}(x) \le 1$. For any $x \in X - A$, $\lambda_{U_0}|O_x$ has the form

$$\frac{\varrho(x, X - U_0)}{\sum_{\text{finite no. of } U\text{'s}} \varrho(x, X - U)},$$

and since each $\varrho(x, X - U)$ is continuous (because

$$|\varrho(x, X - U) - \varrho(y, X - U)| \le \varrho(x, y)),$$

$\lambda_{U_0}|O_x$ is continuous. Therefore $\lambda_{U_0}$ is continuous on $X - A$ and $\lambda_{U_0}(x) = 0$ iff $x \notin U_0$.

Now for each $U$ choose $a_U \in A$ such that $\varrho(a_U, U) < 2\varrho(A, U)$, and let the extension $F$ be given by

$$F(x) = \sum_U \lambda_U(x) f(a_U) \quad \text{for } x \in X - A,$$

and

$$F(x) = f(x) \quad \text{for } x \in A.$$

For each $x \in X - A$, $\lambda_U(x) = 0$ except for finitely many $U$'s, and since $\sum_U \lambda_U(x) = 1$ and $f(a_U) \in C$ it follows that $F(x) \in C$. If $x \in X - A$, $F|O_x$ is a finite sum of continuous functions and hence $F$ is continuous on $X - A$. Since $F$ is continuous on the interior of $A$ by assumption, it only remains to show the continuity of $F$ on the boundary of $A$.

Let $x_0 \in$ boundary $A$, and let $W \subset E$ be any convex open set containing the origin. Since $f$ is continuous on $A$ there exists an $\alpha > 0$ such that if $a \in A$ and $\varrho(x_0, a) < \alpha$ then $f(a) - f(x_0) \in W$. Let $O = \{x \in X : \varrho(x, x_0) < \alpha/6\}$. We will show that if $x \in O$, then $F(x) - F(x_0) \in W$.

Assume $x \in X - A$, $\varrho(x, x_0) < \alpha/6$ and $\varrho(x, a_U) < \alpha/2$. Then

$$\varrho(x_0, a_U) \le \varrho(x_0, x) + \varrho(x, a_U) < \frac{\alpha}{6} + \frac{\alpha}{2} < \alpha$$

implies $f(a_U) - f(x_0) \in W$. On the other hand assume $x \in X - A$, $\varrho(x, x_0) < \alpha/6$ and $\varrho(x, a_U) \ge \alpha/2$. Then $\varrho(x, a_U) \ge 3\varrho(x, x_0) \ge 3\varrho(x, A)$. If $x \in U$ then

$$\varrho(x, a_U) \le \varrho(a_U, U) + \operatorname{diam} U < 2\varrho(A, U) + \operatorname{diam} U.$$

Since $U \subset V_y$ for some $y$ and $\operatorname{diam} V_y \le \varrho(V_y, A)$ we have

$$\operatorname{diam} U \le \operatorname{diam} V_y \le \varrho(V_y, A) \le \varrho(U, A).$$

Therefore $\varrho(x, a_U) < 3\varrho(U, A) \le 3\varrho(A, x)$, contradicting the inequality above. Hence $\varrho(x, x_0) < \alpha/6$ and $\varrho(x, a_U) \ge \alpha/2$ imply $x \notin U$ and therefore $\lambda_U(x) = 0$.

Finally for $x \in X - A$ and $\varrho(x, x_0) < \alpha/6$ we have

$$F(x) - F(x_0) = \sum_U \lambda_U(x) f(a_U) - f(x_0) = \sum_U \lambda_U(x)\,(f(a_U) - f(x_0))$$

and by the above, for each $U$ either $\lambda_U(x) = 0$ or $f(a_U) - f(x_0) \in W$. Since the sum is actually a finite sum and $\sum_U \lambda_U(x) = 1$ with $0 \le \lambda_U(x) \le 1$ it follows that $F(x) - F(x_0) \in W$.

Q.E.D.

O. Fixed Point Theorems in Banach Spaces

Let $X$ be a Banach space, $K \subset X$ a convex and compact set.

3.59. Lemma: Every continuous mapping $f : K \to K$ has a fixed point, i.e., a point $x$ such that $f(x) = x$.

Proof: Since $K$ is compact it is contained in some ball $B$ around the origin. Each ball in a Banach space is a metric space, so by 3.58 we can assume that $f$ is the restriction of a continuous mapping (also called $f$) from $\bar{B}$ into $K$. Clearly any fixed point of the extension is a fixed point of the original mapping.

Consider now the family of mappings

$$\psi_t = I - tf, \qquad 0 \le t \le 1.$$

Since $K$ is compact, $\psi_t$ depends continuously on $t$ in the uniform topology (it suffices to observe that $t \to 0$ implies $tf \to 0$ uniformly).

The homotopy $F(t, x) = tf(x)$ is compact because the set $\{ty;\ 0 \le t \le 1,\ y \in K\}$ is the continuous image of the compact set $[0, 1] \times K$ under the mapping $(t, y) \to ty$, and is therefore compact. That implies that

$$1 = \deg(0, \psi_0, B) = \deg(0, \psi_1, B)$$

which establishes at once the existence of a point $x \in B$ satisfying $\psi_1(x) = 0$, or $x - f(x) = 0$, in other words, a fixed point of $f$.

Q.E.D.
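In one dimension the degree argument above reduces to a sign change of $\psi_1(x) = x - f(x)$ across the interval, and the fixed point can be located by bisection. The sketch below is only an illustration of the lemma under that assumption ($X = R$, $K = [0, 1]$); the helper name `fixed_point_1d` and the choice $f = \cos$ are hypothetical.

```python
import math

def fixed_point_1d(f, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection on psi(x) = x - f(x) for a continuous f mapping [lo, hi] into itself."""
    psi = lambda x: x - f(x)
    if psi(lo) == 0:
        return lo
    if psi(hi) == 0:
        return hi
    # Here psi(lo) < 0 < psi(hi), because f(lo) >= lo and f(hi) <= hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# cos maps [0, 1] into [cos 1, 1], a subset of [0, 1].
x_star = fixed_point_1d(math.cos)
print(x_star, math.cos(x_star))
```

In higher dimensions no such elementary search is available, which is why the lemma is proved through the homotopy invariance of the degree.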

3.60. Proposition (Schauder): Let $A$ be a closed convex set contained in the Banach space $X$. Every compact mapping $f : A \to A$ has a fixed point.

Proof: Let $K_0 = \overline{f(A)}$, which is compact. Let $K$ be the closed convex hull of $K_0$. $K$ is compact and convex, is contained in $A$, and is mapped into itself by $f$; the restriction $f|_K$ has a fixed point by the lemma above.

Q.E.D.

3.61. Proposition (Rothe): Let $A$ be a convex bounded open set in some Banach space $X$. Suppose $\phi : \bar{A} \to X$ is compact and $\phi(\partial A) \subset \bar{A}$. Then $\phi$ has a fixed point.


Proof: Let us suppose that $0 \in A$. Define $p$ by

$$p(x) = \inf\{\lambda;\ 0 < \lambda;\ x/\lambda \in A\}.$$

Clearly $p$ is continuous. Let $q(x) = \max\{p(x), 1\}$. Then $q$ is also continuous. Hence the map $g(x) = x/q(x)$ is continuous, sends the whole space $X$ into $\bar{A}$ and is the identity on $\bar{A}$. Moreover, the properties $p(x) = 1$ and $x \in \partial A$ are equivalent.

Let $B$ be a ball containing $\bar{A}$ and $\phi(\bar{A})$.

The mapping $\tilde{\phi} = \phi \circ g : B \to B$ is compact (because $\phi$ is compact) and by Schauder's theorem has a fixed point $x = \tilde{\phi}(x)$. This point is easily seen to be in $\bar{A}$, and thus it is a fixed point for $\phi$.

Q.E.D.

3.62. Proposition (Altman): Let $A$ be any convex open bounded set in a Banach space $X$ containing 0, and $\phi : \bar{A} \to X$ a compact mapping satisfying

$$|x - \phi(x)|^2 \ge |\phi(x)|^2 - |x|^2, \qquad x \in \partial A.$$

Then $\phi$ has a fixed point.

Proof: For $0 \le t \le 1$ and $x \in \bar{A}$ define the mapping

$$F(t, x) = t\phi(x).$$

$F$ is easily seen to be compact.

For $t$ fixed, write

$$d(t) = \deg(0, I - F(t, \cdot), A).$$

Clearly $d(0) = 1$.

Furthermore, if it is supposed that $x_0 = F(t_0, x_0)$ for some $x_0 \in \partial A$, $0 < t_0 \le 1$, then

$$t_0\,\phi(x_0) = x_0,$$

or

$$|\phi(x_0)| = \frac{1}{t_0}\,|x_0|,$$

$$|\phi(x_0) - x_0|^2 = (1 - t_0)^2\,|\phi(x_0)|^2 = \frac{(1 - t_0)^2}{t_0^2}\,|x_0|^2,$$

$$|\phi(x_0)|^2 - |x_0|^2 = |x_0|^2 \left(\frac{1}{t_0^2} - 1\right) = |x_0|^2\,\frac{1 - t_0^2}{t_0^2}.$$

Using the hypothesis, we get

$$|x_0|^2\,\frac{(1 - t_0)^2}{t_0^2} \ge |x_0|^2\,\frac{1 - t_0^2}{t_0^2}$$

or

$$(1 - t_0)^2 \ge 1 - t_0^2,$$

which cannot occur for any $0 < t_0 < 1$. (The value $t_0 = 0$ is excluded because $0 \notin \partial A$, and if $t_0 = 1$ then $x_0$ is already a fixed point of $\phi$.) Then the homotopy $F(t, x)$ is compact and $I - F(t, \cdot)$ avoids the point 0 on $\partial A$. Therefore

$$1 = d(0) = d(1)$$

which means that $\phi(x) = x$ has a solution for some $x \in A$.

Q.E.D.
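The last step of the proof rests on the elementary identity $(1 - t)^2 - (1 - t^2) = 2t(t - 1)$, which is negative on the open interval $(0, 1)$ and vanishes at $t = 1$. A small numerical check, offered only as a sanity test (the helper name `altman_gap` is hypothetical):

```python
def altman_gap(t):
    """(1 - t)^2 - (1 - t^2); the boundary hypothesis would force this to be >= 0."""
    return (1 - t) ** 2 - (1 - t ** 2)

# Sample the open interval (0, 1) on a uniform grid.
grid = [k / 1000 for k in range(1, 1000)]
all_negative = all(altman_gap(t) < 0 for t in grid)
gap_at_one = altman_gap(1.0)
print(all_negative, gap_at_one)
```

The borderline value $t = 1$ is exactly the case in which the boundary point is itself a fixed point.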

As an application of homotopy invariance we shall obtain now a separation property.

3.63. Proposition: Let $K \subset X$ be bounded and closed; $x_1$, $x_2$ elements of $X$ belonging to different components of $X - K$. If $F(t, x)$, $x \in K$, is a compact homotopy such that $F(0, x) = 0$, and if $\phi_t(x) = x - F(t, x)$ is such that $\phi_t(K)$ never contains $x_1$ or $x_2$, then $x_1$ and $x_2$ belong to different components of $X - \phi_1(K)$.

Proof: At least one of $x_1$ and $x_2$ belongs to a bounded component. Suppose it is $x_1$. Let $\Delta$ denote the component of $X - K$ containing $x_1$. Now compare

$$\deg(x_1, \phi_0, \Delta) \quad \text{and} \quad \deg(x_2, \phi_0, \Delta).$$

They are different. After the homotopy, they remain different, which implies that $x_1$, $x_2$ don't belong to the same component.

Q.E.D.


CHAPTER IV

Morse Theory on Hilbert Manifolds

Part 1

A. Manifolds
B. Functions
C. Tangent Vectors
D. Alternative Definitions of Tangent Vectors
E. More on Linear Topology
F. More on Elementary Calculus
G. A Short Outline of Smooth Linear Bundles
H. The Tangent Bundle
I. Ordinary Differential Equations
J. Submanifolds
K. Riemannian Manifolds

Part 2

A. The Non-critical Neck Principle
B. The Palais-Smale Condition
C. Local Study of Critical Points
D. Global Study of Critical Points
E. The Morse Inequalities

This chapter consists of two parts: a discussion of the elementary properties of smooth manifolds modelled on general linear topological spaces and the theory of critical points of mappings defined on such manifolds (actually, when the L.T.S. is a Hilbert space). The two main references for the second part are (1) and (2) of the bibliography. Moreover, for the classical bibliography, references may be found in J. Milnor, Morse Theory, Ann. of Math. Studies, No. 51, Princ. (1963). We assume that the reader is familiar with finite dimensional differentiable manifolds (de Rham's book, Chern's lecture notes (Chicago), Helgason's book). All the definitions and properties in Part 1 below are easy generalizations of the corresponding ones in the classical case.

Part 1

A. Manifolds

We recall that a mapping from an open set of a linear topological space with values in another L.T.S. is called smooth if it is differentiable infinitely many times in the Fréchet sense.

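4.1. Definition: If $V$ is a L.T.S. and $M$ a topological space, we shall define a smooth $V$-structure on $M$ as a pair $(\mathcal{U}, \mathcal{J})$, where

1. $\mathcal{U} = \{U_\alpha\}$ is an open covering of $M$.

2. $\mathcal{J} = \{\phi_\alpha\}$ is a family of mappings $\phi_\alpha : U_\alpha \to V$ such that if $U'_\alpha = \phi_\alpha(U_\alpha)$ then $U'_\alpha$ is open and

(a) $\phi_\alpha : U_\alpha \to U'_\alpha$ is a homeomorphism,

(b) $\phi_\beta \phi_\alpha^{-1} : \phi_\alpha(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta)$ is smooth.

If $M$ has been provided with a smooth $V$-structure, we shall say that $M$ is a smooth manifold modelled on $V$. Every $\phi_\alpha \in \mathcal{J}$ will be called a chart, a coordinate system or a map on $M$, $U_\alpha$ being its domain.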

Remark: If (b) is replaced by

(b') $\phi_\beta \phi_\alpha^{-1}$ is $k$-times differentiable,

or by

(b'') $\phi_\beta \phi_\alpha^{-1}$ is analytic,

we obtain the concepts of $C^k$-$V$-structure on $M$ and of $C^\omega$-$V$-structure, respectively. In most cases, the $V$-structures are smooth ones (or $C^\infty$-$V$-structures, as one says), and we shall restrict ourselves to this case. Parallel results for other cases (when true!) can be proved by the reader.


Examples:

1. Every open set $M \subset V$ can be modelled on $V$ by means of the smooth (indeed analytic) $V$-structure provided by the family $\mathcal{J} = \{i\}$, where $i : M \to V$ is the inclusion.

2. Let $Z$ be a discrete subgroup of $V$, and define $M = V/Z$, with the quotient topology. Consider all pairs $V_\alpha$, $V'_\alpha$, $V_\alpha$ = open set of $V$, $V'_\alpha$ = open set of $M$, such that the canonical projection $\pi : V \to M$ sends $V_\alpha$ homeomorphically onto $V'_\alpha$. If $\mathcal{J}$ is defined as the family of inverses $\pi^{-1} : V'_\alpha \to V_\alpha$, then $M$ together with $\mathcal{J}$ satisfies the requirements above; hence $M$ has a $C^\infty$ structure modelled on $V$, or, briefly, $M$ is a $C^\infty$ $V$-manifold. (If $Z$ is not assumed to be discrete, $\mathcal{J}$ does not define a smooth structure nor even a $C^1$ structure. Why?)

3. We shall see later (4.51, (a)) that in every Hilbert space $\mathcal{H}$, the unit sphere $\{x \mid |x| = 1\}$ can be modelled on $V$, where $V$ is any quotient $\mathcal{H}/\mathcal{H}_0$, $\mathcal{H}_0$ being a one dimensional subspace of $\mathcal{H}$.

4. New examples can be obtained from known ones by means of the two following procedures.

(a) if $G \subset M$ is open and $M$ is a $V$-manifold (say $(\mathcal{U}, \mathcal{J})$), then $G$ has an induced $V$-structure defined by the restrictions of the mappings in $\mathcal{J}$ to the sets $U \cap G$, $U \in \mathcal{U}$.

(b) if $M$ has a $V$-structure and $L$ has a $W$-structure then $M \times L$ has a $V \times W$-structure in the obvious way.

B. Functions

Let $M$ be a $V$-manifold and $M'$ a $V'$-manifold. Let $\mathcal{J}$, $\mathcal{J}'$, $\phi$, $\phi'$, $U$, $U'$ define the manifold structures.

4.2. Definition: A mapping $f : M \to M'$ is called a smooth mapping if it is continuous and for every pair $\phi' \in \mathcal{J}'$, $\phi \in \mathcal{J}$, the mapping $\phi' f \phi^{-1}$, which is defined on some open set of $V$ (precisely: $\phi(U \cap f^{-1}(U'))$) and takes its values in $V'$, is smooth. The set of all smooth mappings $f : M \to M'$ will be denoted by Hom $(M, M')$.

4.3. Definition: $M$ and $M'$ are diffeomorphic if there exist $f \in$ Hom $(M, M')$ and $g \in$ Hom $(M', M)$ such that $fg = \mathrm{id} : M' \to M'$ and $gf = \mathrm{id} : M \to M$.

If $M''$ is a $V''$-manifold, there exists a natural mapping

Hom $(M, M') \times$ Hom $(M', M'') \to$ Hom $(M, M'')$

defined by composition of mappings.


4.4. Definition: Let $p$ be a point in $M$, $U$ an open set of $M$ containing $p$. A mapping $f \in$ Hom $(U, M')$ is horizontal at $p$ if $(f\phi^{-1})'(\phi(p)) = 0$ for one (and hence for every) chart $\phi \in \mathcal{J}$ whose domain contains $p$. If $f$ is a real function horizontal at $p$, we shall also say that $p$ is a critical point of $f$ and that the real number $f(p)$ is a critical level of $f$.

Suppose now that $\psi$ is another chart $\psi \in \mathcal{J}$ defined near $p$. By the chain rule applied to $(f\phi^{-1}) \circ (\phi\psi^{-1})$ we see that:

$$(f\psi^{-1})'(\psi(p)) = ((f\phi^{-1})(\phi\psi^{-1}))'(\psi(p))$$

$$= [(f\phi^{-1})'((\phi\psi^{-1})(\psi(p)))]\,[(\phi\psi^{-1})'(\psi(p))]$$

$$= [(f\phi^{-1})'(\phi(p))]\,[(\phi\psi^{-1})'(\psi(p))].$$

Thus $(f\phi^{-1})'(\phi(p)) = 0$ iff $(f\psi^{-1})'(\psi(p)) = 0$, because the mapping $(\phi\psi^{-1})'(\psi(p))$ is invertible (by the implicit function theorem), and this shows the definition of being horizontal at $p$ to be independent of the chart chosen.

We shall henceforth write $C^\infty(M)$ for Hom $(M, R)$, $R$ with its standard structure. Note that $C^\infty(M)$ is a real algebra.

C. Tangent Vectors

Our aim is to define the tangent vector to a curve in a manifold at any point through which the curve passes. Let $x$ be a point in the $E$-manifold $M$. Consider the set of all the smooth real functions defined on some open set containing $x$, and the equivalence relation $\sim$ on that set defined by

$f \sim g$ if $f = g$ on some neighborhood of $x$.

4.5. Definition: We shall define a germ of smooth functions at $x$ to be any class of functions modulo the relation $\sim$. The set of germs at $x$ will be denoted $G(x)$ and if $f$ is a function, $\gamma(f)$ will denote its germ (i.e., the germ containing $f$).

It is clear that $G(x)$ is a real algebra. We want to consider tangent vectors at $x$. Roughly speaking, they will be "classes of curves going through $x$ with the same velocity". We shall check both properties by means of the elements in $G(x)$.

4.6. Definition: A curve through $x \in M$ is a smooth mapping $p : J \to M$, where $J$ is an open set of $R$ such that $p(u) = x$ for some $u \in J$.


4.7. Definition: If $p$ is a curve, $p(u) = x$, we shall define the tangent vector to $p$ at $x = p(u)$ as the mapping $\left.\frac{dp}{dt}\right|_{t=u} : G(x) \to R$ defined by

$$\left.\frac{dp}{dt}\right|_{t=u} \gamma(f) = \left.\frac{d(f \circ p)}{dt}\right|_{t=u}.$$

The second member is meaningful because the mapping $f \circ p$ is a smooth real function of a real variable and for every such mapping $g$, we identify the linear mapping $g'(x) : R \to R$ with the number $g'(x, 1)$.

4.8. Definition: The set $TM_x$ of all the tangent vectors $\left.\frac{dp}{dt}\right|_{t=u}$, where $p$ is a curve and $p(u) = x$, is called the tangent space to $M$ at $x$.

A tangent vector $\left.\frac{dp}{dt}\right|_{t=u}$ is easily seen to be linear on $G(x)$, whence there is an injection $TM_x \subset (G(x))^*$, where $*$ denotes the algebraic dual space.

Of course we may assume, by changing the parameter, that every vector is of the form $\left.\frac{dp}{dt}\right|_{t=0}$, and we will do so hereafter. Now choose a chart $\phi$ at $x$.

Every curve $p$ with $p(0) = x$ induces (after cutting down its domain $J$, if necessary) a mapping $\phi \circ p : J \to E$. But then $(\phi \circ p)'(0) : R \to E$ is linear. This means that $(\phi \circ p)'(0)$ may be identified with $e(p) = (\phi \circ p)'(0, 1)$. For every $\gamma(f) \in G(x)$ we have (cf. 1.7 for notation and the chain rule 1.14):

$$\left.\frac{dp}{dt}\right|_{t=0} \gamma(f) = (f \circ p)'(0, 1) = (f \circ \phi^{-1} \circ \phi \circ p)'(0, 1)$$

$$= (f \circ \phi^{-1})'(\phi(p(0)), (\phi \circ p)'(0, 1))$$

$$= (f \circ \phi^{-1})'(\phi(x), e(p)).$$

Thus

$$\text{(1)} \qquad \left.\frac{dp}{dt}\right|_{t=0} \gamma(f) = (f \circ \phi^{-1})'(\phi(x), e(p)).$$
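Formula (1) can be tested numerically in a finite dimensional model. The following is a minimal sketch under stated assumptions: $M = E = R^2$, and the chart `chart`, the function `f` and the curve `p` are hypothetical choices; derivatives are approximated by central differences. It compares the derivative of $f \circ p$ at 0 with the directional derivative of $f \circ \phi^{-1}$ at $\phi(x)$ along $e(p)$.

```python
import math

H = 1e-5  # step for central differences

def chart(u):            # hypothetical chart phi on M = R^2
    return (u[0], u[1] + u[0] ** 2)

def chart_inv(v):        # its inverse phi^{-1}
    return (v[0], v[1] - v[0] ** 2)

def f(u):                # a smooth real function near x
    return u[0] * u[1] + math.sin(u[0])

def p(t):                # a curve with p(0) = x = (1, 2)
    return (1.0 + t, 2.0 - t + t * t)

def ddt(curve, h=H):
    """Central-difference derivative at t = 0 of a scalar- or tuple-valued curve."""
    a, b = curve(h), curve(-h)
    if isinstance(a, tuple):
        return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))
    return (a - b) / (2 * h)

x = p(0.0)
lhs = ddt(lambda t: f(p(t)))        # left member of (1): derivative of f o p at 0
e = ddt(lambda t: chart(p(t)))      # e(p) = (phi o p)'(0, 1)
base = chart(x)
# right member of (1): derivative of f o phi^{-1} at phi(x) in the direction e(p)
rhs = ddt(lambda t: f(chart_inv((base[0] + t * e[0], base[1] + t * e[1]))))
print(lhs, rhs)
```

For this particular choice the common value can also be computed by hand as $1 + \cos 1$.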

Hence if $p$ and $q$ are two curves such that $e(p) = e(q)$, then the tangent vectors $\left.\frac{dp}{dt}\right|_{t=0}$, $\left.\frac{dq}{dt}\right|_{t=0}$ coincide. Therefore we can define a mapping $\phi_*(x) : TM_x \to E$ as follows.

4.9. Definition:

$$\phi_*(x) \left( \left.\frac{dp}{dt}\right|_{t=0} \right) = e(p).$$


The notation is chosen to emphasize the fact that the mapping $\phi_*(x)$ depends on the chart $\phi$ chosen (and of course on $x \in M$).

4.10. Definition: If $v \in TM_x$, the element $\phi_*(x)v$ of $E$ is called the coordinate of $v$ in the chart $\phi$.

In the new notation, formula (1) becomes:

$$\text{(2)} \qquad v(\gamma(f)) = (f \circ \phi^{-1})'(\phi(x), \phi_*(x)v).$$

4.11. Lemma: For every chart $\phi$, the mapping $\phi_*(x) : TM_x \to E$ is one-to-one and onto. If $\psi$ is another chart covering $x$, then

$$\text{(3)} \qquad \phi_*(x) = S \circ \psi_*(x),$$

where $S$ is the following automorphism of $E$: $S = (\phi \circ \psi^{-1})'(\psi(x))$.

Proof: From formula (2) we conclude that $\phi_*(x)v = \phi_*(x)w$ implies $v\gamma = w\gamma$ for every $\gamma$, or $v = w$. Hence $\phi_*(x)$ is one-to-one. Let $e$ be an element of $E$, and define the curve $p(t) = \phi^{-1}(\phi(x) + te)$ ($t$ small). It is very easy to prove that if $v = \left.\frac{dp}{dt}\right|_{t=0}$, then $\phi_*(x)v = e$. The formula (3) follows by standard computations.

A useful consequence of Lemma 4.11 is the fact that any $\phi_*(x)$ induces, by transport of structure from $E$, a structure of L.T.S. on $TM_x$, and that all $\phi_*(x)$ induce the same structure (this follows from formula (3) above). We sum up in the following proposition.

4.12. Proposition: Every $TM_x$ has a canonical structure of L.T.S. given in such a way that for every chart $\phi$ around $x$, the mapping $\phi_*(x) : TM_x \to E$ characterized by

$$\text{(2)} \qquad v\gamma(f) = (f \circ \phi^{-1})'(\phi(x), \phi_*(x)v)$$

is a L.T.S. isomorphism. Moreover, the linear structure so induced on $TM_x$ has the following properties:

$$(v + w)\gamma = v\gamma + w\gamma, \qquad (\lambda v)\gamma = \lambda \cdot v\gamma \qquad \text{for } \gamma \in G(x),$$

or, in other words, coincides with that induced by the injection $TM_x \subset (G(x))^*$.

The last part of the proposition is very easy, and is left to the reader.

The definitions make the following proposition obvious.


4.13. Proposition: Every $v \in TM_x$ is a derivation of the algebra $G(x)$ in $R$, i.e.

$$v(\gamma(f)\gamma(g)) = (v\gamma(f)) \cdot g(x) + f(x) \cdot (v\gamma(g)).$$

Warning: The converse assertion (that every derivation is a tangent vector) is false (see below) unless $E$ is finite dimensional.

D. Alternative Definitions of Tangent Vectors

We continue with the notation of the last section.

(a) First alternative.

Let G(x) be the algebra of germs of smooth functions defined near x e M.Choose a chart ¢ around x (sending x into 0 e E). Then everyy e G(x) is thegerm of a function y = y(f), and by the Taylor formula

fo4-1 =c+x'+swhere c is a real constant, x' is an element of E' and s is a second order-map (from a neighborhood of 0 E E onto R). If Y is another chart (also send-ing x into 0), and

PIP-1 =c+. '+sis the corresponding decompositon, then c = d and .z = x' o T, whereT = (4 o yr-1)' (0). This is clear after easy computation. T is an automorphismof E, both algebraic and topological.

Chosing 4$, define 4(f) = Y. From the remark above, it follows that 0_(f)= (f) o [(4 o ,-1)' (0)]. This means that the mappings i : G(x) -> E' in-duced by all charts 0 are essentially the same. In particular it is true that atopology F on G(x) may be defined as the weakest among those linear topo-logies for which 4 is continuous, when E' is supposed to be endowed withthe weak topology of the duality (E, E'): this topology F is independentof 0.

Now define $D(x)$ as the set $D(x) = \{d\}$, where $d$ are the following mappings:

(a) $d : G(x) \to \mathbb{R}$, $d$ is linear and continuous for $\mathcal{F}$;

(b) $d$ is a derivation: $d(\gamma\mu) = d(\gamma) \cdot \mu(x) + \gamma(x)\, d(\mu)$.

Of course, if $f \circ \phi^{-1} = c + x' + s$, then $d(\gamma(f)) = d(\gamma(x' \circ \phi))$. For every $x' \in E'$ and $d \in D(x)$, write $d(x') = d(\gamma(x' \circ \phi))$. If $d(x') = d^*(x')$ for all $x' \in E'$, then $d(\gamma) = d^*(\gamma)$ for all $\gamma \in G(x)$, or $d = d^*$. This means that if $\phi$ is chosen, there is an injection $D(x) \to (E')^*$. But the continuity required of the elements $d$ implies that $D(x) \subset E \subset (E')^*$.

One may now show immediately that $D(x) = E$. $D(x)$ might be called the tangent space at $x$, and all the theory based upon this choice.

(b) Second alternative. Again we use the same notation.

Using equality (1) of section C, we may define the tangent vectors by means of their coordinates in all charts. In fact, if $e$ is the coordinate of $v$ in the chart $\phi$, then its coordinate in $\psi$ is

[*] $g = (\psi \circ \phi^{-1})'(\phi(x), e)$

That means that $v$ can be identified with the class of all pairs $(\phi, e)$, $(\psi, g)$, ..., where $\phi$ (resp. $\psi$, ...) is a chart and $e \in E$ ($g \in E$, ...), satisfying [*]. Call $V_x$ the set of all such classes. The correspondence: class of $(\phi, e) \to e$ defines, for a given $\phi$, a map that identifies $V_x$ with $E$, and we have a structure of L.T.S. for $V_x$.

Of course if we define a mapping

class of $(\phi, e) \to \left.\dfrac{dp}{dt}\right|_{t=0}$

where $p$ is the curve defined as $p(t) = \phi^{-1}(\phi(x) + te)$ ($t$ small), then we obtain a one-to-one-onto correspondence between the elements in $V_x$ and the tangent vectors to smooth curves. On the other hand, if we define the mapping

class of $(\phi, e) \to d$

where $d \in D(x)$ is defined by $d(\gamma(f)) = (f \circ \phi^{-1})'(\phi(x), e)$, then we obtain a correspondence between elements in $V_x$ and elements in $D(x)$.

This proves that both approaches lead essentially to the same result.

As a final remark, let us observe that:

- In the first definition of tangent space as the set of tangent vectors to curves, addition is easily defined (by means of $TM_x \subset (G(x))^*$), but not the topology of $TM_x$. On the other hand, the geometrical meaning is very illuminating.

- In the second definition (alternative a), everything is natural (algebraic-topological structure), except the meaning.

- The third definition (b) is neither natural nor meaningful.


E. More on Linear Topology

This section is used only in section K, and not in its full generality, but only in a very particular case.

Let $E$ be any linear topological space. We shall denote by $\mathrm{End}(E)$ the space of continuous linear mappings $u : E \to E$.

4.14. Lemma: A linear (and Hausdorff if $E$ is Hausdorff) topology for $\mathrm{End}(E)$ is defined by uniform convergence on bounded sets: a fundamental system of neighborhoods of the origin for this topology is the family of sets $L(K, V) = \{u \in \mathrm{End}(E);\ u(K) \subset V\}$, where $K$ ranges over the family of bounded sets of $E$ and $V$ over the neighborhoods of $0 \in E$.

Suppose now that $E$ and $\hat E$ are two linear topological spaces, and consider the space $B(E, \hat E)$ of continuous bilinear forms on $E \times \hat E$: $B(E, \hat E) = \{P;\ P : E \times \hat E \to \mathbb{R},\ P \text{ is bilinear and continuous}\}$. (The continuity of $P$ is in the product topology.)

4.15. Lemma: $B(E, \hat E)$ can be made into a Hausdorff L.T.S. by the topology of uniform convergence on bounded sets. A fundamental system of neighborhoods of $0 \in B(E, \hat E)$ is provided by the sets $B_a(K, \hat K) = \{P \in B(E, \hat E);\ P(K \times \hat K) \subset [-a, +a]\}$, where $K$ and $\hat K$ range over the family of bounded sets of $E$ and $\hat E$, respectively, and $a$ over the positive numbers.

4.16. Lemma: The bounded sets of $B(E, \hat E)$ are those sets $D$ for which $D(K \times \hat K)$ (defined as $\{P(x, y);\ P \in D,\ x \in K,\ y \in \hat K\}$) is bounded for every pair $K$, $\hat K$ of bounded sets of $E$ and $\hat E$, respectively.

The proof is left to the reader.

A subset $S \subset B(E, \hat E)$ is called equicontinuous (or, better, equicontinuous at the origin) if for every neighborhood $N$ of $0 \in \mathbb{R}$, there exists a neighborhood $N'$ of $(0, 0) \in E \times \hat E$ such that all $P$ in $S$ satisfy $P(N') \subset N$.

4.17. Lemma: All equicontinuous sets in $B(E, \hat E)$ are bounded.

The proof follows immediately from (4.16). Suppose that $N = [-1, +1]$ and that $K \times \hat K \subset \lambda N'$ (definition of bounded sets); then, by bilinearity, $|P(x, y)| \le \lambda^2$ for all $P \in S$, $x \in K$, $y \in \hat K$.

There is a natural way to make $\mathrm{End}(E) \oplus \mathrm{End}(\hat E)$ operate on $B(E, \hat E)$; the following lemma establishes this.

4.18. Lemma: The mapping $j : \mathrm{End}(E) \times \mathrm{End}(\hat E) \to \mathrm{End}(B(E, \hat E))$ defined by $([j(u \oplus v)]P)(x, y) = P(ux, vy)$ is bilinear.


We note the following relation involving the mapping j:

4.19. Proposition: Suppose that both $E$ and $\hat E$ are locally convex. The mapping $j$ is continuous iff the converse of 4.17 holds.

Continuity of j from the Converse of 4.17

Let $S$ be the neighborhood of $0 \in \mathrm{End}(B(E, \hat E))$ defined by $S = \{u;\ uK \subset V\}$, where $K$ is a bounded set of $B(E, \hat E)$ and $V$ a neighborhood of $0 \in B(E, \hat E)$.

Assume that $V = \{P;\ P(L \times \hat L) \subset [-a, +a]\}$, where $L$ and $\hat L$ are bounded and $a$ is a positive real number. Since we assume the converse of 4.17, $K$ is also equicontinuous. Let $N$, $\hat N$ be neighborhoods of 0 in $E$ and $\hat E$ respectively, such that if $P \in K$, then $P(N \times \hat N) \subset [-a, +a]$.

Define the neighborhoods

$W = \{u \in \mathrm{End}(E);\ u(L) \subset N\}$

$\hat W = \{u \in \mathrm{End}(\hat E);\ u(\hat L) \subset \hat N\}.$

Then, if $u \in W$, $v \in \hat W$, it follows for every $P \in K$ that:

$(j(u, v)P)(L \times \hat L) = P(uL \times v\hat L) \subset P(N \times \hat N) \subset [-a, +a],$

which proves that $j(W \times \hat W)$ is contained in $S$. Hence $j$ is continuous.

Converse of 4.17 From the Continuity of j

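Let $K$ be a bounded set in $B(E, \hat E)$; $a \in \mathbb{R}$, $a > 0$. Choose two bounded sets $L$, $\hat L$ in $E$, $\hat E$ respectively, both different from $\{0\}$, and define

$V = \{P \in B(E, \hat E);\ P(L \times \hat L) \subset [-a, +a]\}.$

Clearly $V$ is a neighborhood of $0 \in B(E, \hat E)$. Now consider the set

$S = \{u \in \mathrm{End}(B(E, \hat E));\ uK \subset V\}.$

$S$ is a neighborhood of $0 \in \mathrm{End}(B(E, \hat E))$. Since we assume that $j$ is continuous, there exist neighborhoods $N$, $\hat N$ of 0 in $\mathrm{End}(E)$ and $\mathrm{End}(\hat E)$ such that $j(N \times \hat N) \subset S$, and we may suppose that $N$ and $\hat N$ are of the form

$N = \{u \in \mathrm{End}(E);\ uL_1 \subset T_1\}$

$\hat N = \{u \in \mathrm{End}(\hat E);\ uL_2 \subset T_2\},$

where the $L_i$ ($i = 1, 2$) are bounded and the $T_i$ are neighborhoods of 0. We may also suppose that $L_1 \supset L$ and $L_2 \supset \hat L$. But the inclusion $j(N \times \hat N) \subset S$ means that $j(n_1, n_2)P \in V$ for all $n_1 \in N$, $n_2 \in \hat N$, $P \in K$,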


or

$(j(n_1, n_2)P)(L \times \hat L) \subset [-a, +a]$

and, finally

$P(n_1L \times n_2\hat L) \subset [-a, +a]$ for all $n_1 \in N$, $n_2 \in \hat N$, $P \in K$.

Since $L \ne \{0\}$ and $\hat L \ne \{0\}$ this implies, using the Hahn-Banach Theorem, that

$P(T_1 \times T_2) \subset [-a, +a].$

Hence $K$ is equicontinuous.

We shall now discuss some special cases for which the converse of 4.17 holds.

4.20. Definition: A linear topological space is called a Baire space if whenever the union of a countable family of closed subsets covers the whole space, then at least one of the subsets has non-empty interior.

The classical Baire Category Theorem (see Kelley or Bourbaki) implies the following proposition.

4.21. Proposition: Every Frechet space is a Baire space.

Moreover, the statement below follows easily from the definition.

4.22. Proposition: Every locally convex linear topological space which is a Baire space is a barrelled space.

(Cf. Bourbaki, E.V.T., Chapter III, § 1, Prop. 1.)

4.23. Proposition: If $E$ and $\hat E$ are locally convex linear topological spaces and $E \times \hat E$ is a Baire space, then a subset of $B(E, \hat E)$ is equicontinuous iff it is bounded (and consequently, the map $j$ in 4.18 is continuous).

Preliminary remark: If $E \times \hat E$ is a Baire space, then both $E$ and $\hat E$ are Baire spaces (hence barrelled spaces).

Proof: Suppose that $D \subset B(E, \hat E)$ is bounded and that $N = [-a, +a]$, $a \in \mathbb{R}$, $a > 0$. The set $G = \{(x, y) \in E \times \hat E;\ P(x, y) \in N \text{ for all } P \in D\}$ is clearly closed. Moreover, $D$ being bounded, for every pair $(x, y)$ there exists a positive integer $\lambda$ such that $P(x, y) \in \lambda N$ for all $P \in D$. Therefore $E \times \hat E = \bigcup_n nG$, $n$ = positive integer. Since $E \times \hat E$ is a Baire space, some $nG$ must have non-empty interior; hence $G = \frac{1}{n}(nG)$ has non-empty interior: call this interior $U$ and choose $(\alpha, \beta) \in U$.


Now consider the set

$G_1 = \{x \in E;\ P(x, \beta) \in N \text{ for all } P \in D\}.$

$G_1 \subset E$ is clearly a barrel, and therefore (by our preliminary remark) a neighborhood of $0 \in E$. The same argument applies to $\hat E$ and so we conclude that there exist neighborhoods $G_1$, $\hat G_1$ of $0 \in E$ and $0 \in \hat E$ respectively such that

(1) $P(\alpha, y) \in N$ for all $P \in D$ and all $y \in \hat G_1$,
$P(x, \beta) \in N$ for all $P \in D$ and all $x \in G_1$.

Furthermore, we may find new neighborhoods $V \subset G_1$, $\hat V \subset \hat G_1$ such that $(\alpha + V) \times (\beta + \hat V) \subset G$; in particular

(2) $P((\alpha + V) \times (\beta + \hat V)) \subset N$ for all $P \in D$.

Suppose finally that $(x, y) \in V \times \hat V$. Then, for every $P \in D$

$P(x, y) = P(x + \alpha, y + \beta) - P(\alpha, \beta) - P(x, \beta) - P(\alpha, y)$

$\in P((\alpha + V) \times (\beta + \hat V)) + P(\alpha, \hat V) + P(V, \beta) - P(\alpha, \beta).$

From (1) and (2) we now get:

(3) $P(x, y) \in 3N - P(\alpha, \beta)$, for all $P \in D$.

Let us observe now that since $D$ is bounded, the set $\{P(\alpha, \beta);\ P \in D\}$ is also bounded and hence contained in some $qN$, $q$ a positive integer.

Therefore, by (3):

$\{P(x, y);\ P \in D,\ x \in V,\ y \in \hat V\} \subset (q + 3)N,$

or

(4) $\{P(x, y);\ P \in D,\ x \in W,\ y \in \hat W\} \subset N,$

where $W$ and $\hat W$ are neighborhoods satisfying $(q + 3)W \subset V$, $\hat W = \hat V$. The formula (4) shows that $D$ is equicontinuous, as desired. As a corollary, we obtain the following statement.

4.24. Corollary: If $E$ and $\hat E$ are locally convex spaces and $E \times \hat E$ is a Baire space, then the mapping $j : \mathrm{End}(E) \oplus \mathrm{End}(\hat E) \to \mathrm{End}(B(E, \hat E))$ is continuous.


F. More on Elementary Calculus

Let $M$ be a topological space and $F$ a L.T.S. Denote by $C(M, F)$ the set of continuous mappings from $M$ into $F$. Clearly $C(M, F)$ has a natural linear structure. Suppose now that $Z$ is another L.T.S. Then there is a natural identification

$C(M, F) \oplus C(M, Z) = C(M, F \oplus Z)$

which is a linear isomorphism (defined by $(\phi \oplus \psi)(x) = \phi(x) \oplus \psi(x)$). Suppose now that $u : F \to Z$ is a continuous mapping. $u$ induces a map $C(M, F) \to C(M, Z)$, $\phi \mapsto u \circ \phi$.

Our aim is to consider the case in which $M$ is a smooth manifold and to deal with smooth mappings.

Assume that $M$ is a smooth manifold.

4.25. Lemma: If $\phi \in C(M, F)$ and $\psi \in C(M, Z)$ are smooth, then so is $\phi \oplus \psi$ and furthermore

$\delta(\phi \oplus \psi)(x, h) = \delta\phi(x, h) \oplus \delta\psi(x, h).$

The proof follows from the remark that if $u \in C(M, F)$ and $v \in C(M, Z)$ are both $o(x)$, then so is $u \oplus v$.

Let now Z, F and G be three L.T.S.

4.26. Lemma: Every continuous bilinear mapping $u : Z \times F \to G$ is differentiable, and moreover its derivatives are

$\delta u[(x, y); (h, k)] = u(x, k) + u(h, y)$

$\delta^2 u[(x, y), (x', y'); (h, k)] = u(x', k) + u(h, y')$

$\delta^n u = 0$ if $n \ge 3$.

The proof follows from elementary remarks and the formula

$u(x + h, y + k) - u(x, y) = \{u(x, k) + u(h, y)\} + u(h, k)$

(see also "Quadratic Forms," in Chapter I).

Assume again that $M$ is a smooth manifold, and that $F$, $Z$ are locally convex L.T.S. such that in $B(F, Z)$ every bounded set is equicontinuous.
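Before turning to Proposition 4.27, the expansion behind Lemma 4.26 can be checked numerically in finite dimensions. The sketch below is only an illustration under assumptions not made in the notes: it takes $Z = \mathbb{R}^3$, $F = \mathbb{R}^4$, $G = \mathbb{R}$ and the bilinear form $u(x, y) = x^{T} A y$ for a randomly chosen matrix `A`; the names `u`, `du` and `remainder` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # illustrative choice, not from the notes

def u(x, y):
    """Bilinear map u(x, y) = x^T A y on R^3 x R^4."""
    return float(x @ A @ y)

def du(x, y, h, k):
    """First derivative stated in Lemma 4.26: u(x, k) + u(h, y)."""
    return u(x, k) + u(h, y)

x, h = rng.standard_normal(3), rng.standard_normal(3)
y, k = rng.standard_normal(4), rng.standard_normal(4)

# u(x + h, y + k) - u(x, y) minus the linear part should leave exactly u(h, k).
exact_gap = u(x + h, y + k) - u(x, y) - du(x, y, h, k)

def remainder(t):
    """Remainder along the increment t*(h, k); it equals t**2 * u(h, k)."""
    return u(x + t * h, y + t * k) - u(x, y) - t * du(x, y, h, k)
```

Since the remainder is exactly $u(h, k)$, it scales like $t^2$ along $t(h, k)$, which is what differentiability with derivative $u(x, k) + u(h, y)$ requires.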

4.27. Proposition: If $\phi \in C(M, \mathrm{End}(F))$ and $\psi \in C(M, \mathrm{End}(Z))$ are differentiable and we call $j$ the natural mapping $\mathrm{End}(F) \oplus \mathrm{End}(Z) \to \mathrm{End}(B(F, Z))$, then the mapping $\beta \in C(M, \mathrm{End}(B(F, Z)))$ defined by $\beta = j \circ (\phi \oplus \psi)$ is also differentiable.

Proof: By Lemma 4.25, $\phi \oplus \psi \in C(M, \mathrm{End}(F) \oplus \mathrm{End}(Z))$ is differentiable. But by Proposition 4.19 and the hypothesis on $B(F, Z)$, $j$ is continuous, hence (Lemma 4.26, above) differentiable. Therefore $\beta$, as a composition of two differentiable mappings, is itself differentiable.

4.28. Corollary: Proposition 4.27 holds under the hypothesis that $F$ and $Z$ are locally convex and $F \times Z$ is a Baire space.

4.29. Notation: If $\phi \in C(M, \mathrm{End}(F))$ and $\psi \in C(M, \mathrm{End}(Z))$, the mapping $\beta = j \circ (\phi \oplus \psi) \in C(M, \mathrm{End}(B(F, Z)))$ defined in 4.27 will be called $s(\phi, \psi)$. With this notation, Proposition 4.27 reads: if $\phi$ and $\psi$ are smooth, so is $s(\phi, \psi)$.

G. A Short Outline of Smooth Linear Bundles

The reference is: Lang, Introduction to differentiable manifolds.

(a) Definition

Let $M$ be a smooth $E$-manifold.

4.30. Definition: A smooth linear bundle on $M$ consists of the following objects:

1. a space $X$;
2. a surjective mapping $\pi : X \to M$ (called the projection);
3. an L.T.S. structure on every set $\pi^{-1}(x)$, $x \in M$ (the set $\pi^{-1}(x)$ is called the fiber of the bundle over $x$, and denoted by $X_x$);
4. a linear topological space $F$;
5. an open covering $\{U, V, \ldots\}$ of $M$; and
6. for every $U$ in this covering, a map $\tau_U : \pi^{-1}(U) \to U \times F$.

These maps must satisfy:

7. the maps $\tau_U$, $\tau_V$, ..., commute with the projections, i.e., the diagram formed by $\tau_U : \pi^{-1}(U) \to U \times F$, $\pi : \pi^{-1}(U) \to U$ and $\mathrm{pr}_1 : U \times F \to U$ is commutative, $\mathrm{pr}_1 \circ \tau_U = \pi$ (where $\mathrm{pr}_1(x, e) = x$);
8. the restrictions $(\tau_U)_x : X_x \to \{x\} \times F$ are linear isomorphisms (here we assume that $\{x\} \times F$ and $F$ are identified);
9. if $U$ and $V$ are two members of the covering, the map $\tau_{UV} : U \cap V \to \mathrm{End}(F)$ defined by

$\tau_{UV}(x) = (\tau_U)_x\, [(\tau_V)_x]^{-1}, \quad x \in U \cap V$

is a smooth mapping ($\mathrm{End}(F)$ has the topology defined in 4.14).

Examples

1. Let $F$ be any L.T.S. Take $X = M \times F$, $\pi(m, f) = m$, the covering consisting only of the set $M$, and $\tau_M$ to be the identity. This is a smooth linear bundle, called the trivial bundle.

2. The tangent bundle of a manifold $M$, whose description appears in section H below.

Remark: Sometimes, for short, the expression "let $X$ be a bundle" is used. The space $F$ is called the typical fiber of the bundle (in the case of the tangent bundle, it coincides with the space on which $M$ is modelled).

The notions of sub-bundle of a bundle, direct sum of two bundles and homomorphism from one bundle into another may be defined in the standard way.

Assume that the covering $\{U\}$ consists of domains of charts $\{\phi\}$. Then the mappings:

$\pi^{-1}(U) \xrightarrow{\ \tau_U\ } U \times F \xrightarrow{\ \phi \times \mathrm{Id}\ } \phi(U) \times F \subset E \times F$

are charts for an $E \times F$ manifold structure on $X$.

Remark: Bundles of class $C^k$ and analytic bundles are similarly defined.

4.31. Definition: A section of a bundle $X$ is any mapping $s : M \to X$ such that $\pi \circ s = \mathrm{Id}$.

(b) A construction

Remark: Here again, the full generality will not be necessary. The reader may assume that all spaces are Hilbert spaces and hence obtain the propositions more easily.

We shall give a procedure for constructing new bundles from known ones. Our procedure is suggested by the general description in Lang (III, § 4).

Assume that $M$ is a smooth $E$-manifold ($E$ is not necessarily locally convex).


Let $X$ and $\hat X$ be bundles on $M$; $F$ and $\hat F$ their typical fibers, $\pi$ and $\hat\pi$ their projections and $\tau_U$, $\hat\tau_{\hat U}$ their structural mappings. Define the set $B(X, \hat X)$ to be

(a) $B(X, \hat X) = \bigcup B(X_x, \hat X_x)$, $x \in M$,

where $B(X_x, \hat X_x)$ denotes, as above, the space of bilinear forms on $X_x \times \hat X_x$.

We may assume that the coverings of $M$ defining $X$ and $\hat X$ are the same, replacing the original ones by the intersections $U \cap \hat U$, if necessary. Let

(b) $\mathcal{A} = \{W, Q, \ldots\}$

denote this covering. Now consider:

(c) $p : B(X, \hat X) \to M$,

the mapping defined as: if $u \in B(X_x, \hat X_x)$, then $p(u) = x$;

(d) for every $W \in \mathcal{A}$, define

$\lambda_W : p^{-1}(W) \to W \times B(F, \hat F)$

as follows: for every $x \in W$, the maps $(\tau_W)_x$ and $(\hat\tau_W)_x$ give an identification of $X_x \times \hat X_x$ with $F \times \hat F$. Then every $u \in B(X_x, \hat X_x)$ induces an element $\bar u \in B(F, \hat F)$, which defines $\lambda_W$. In other words, if $u \in B(X_x, \hat X_x)$, then

$\lambda_W(u) = (x, \bar u) = (x,\ u[(\tau_W)_x^{-1}, (\hat\tau_W)_x^{-1}]).$

The reader will verify that these objects satisfy properties (7) and (8) of the definition of bundle.

Let $\lambda_{QW} : Q \cap W \to \mathrm{End}(B(F, \hat F))$ denote the mappings $\lambda_{QW}(x) = (\lambda_Q)_x\, [(\lambda_W)_x]^{-1}$, where $Q, W \in \mathcal{A}$.

We see that this construction leads to a smooth fiber bundle as long as these maps $\lambda_{QW}$ are smooth (condition (9)). But it is easily verified that

$\lambda_{QW} = s(\tau_{QW}, \hat\tau_{QW}),$

with the notation of 4.29.

4.32. Proposition: Let $X$, $\hat X$ be two smooth fiber bundles on $M$ (any smooth manifold), $F$ and $\hat F$ their typical fibers. If both $F$ and $\hat F$ are locally convex and in $B(F, \hat F)$ every bounded set is equicontinuous, then the objects $B(X, \hat X)$, $p$, $\mathcal{A}$ and $\lambda_W$ described in (a), (b), (c) and (d) define a smooth linear bundle on $M$ (its typical fiber being $B(F, \hat F)$).

Proof: Obvious from the fact that the hypothesis on $B(F, \hat F)$ implies (by 4.28, 4.29) that the mappings $\lambda_{QW} = s(\tau_{QW}, \hat\tau_{QW})$ are smooth.


Remarks (1): Note that no assumptions are made on the local topology of $M$, but on the fibers of the bundles considered.

(2): Observe that the differentiability (or continuity, by 4.26) of $j : \mathrm{End}(F) \oplus \mathrm{End}(\hat F) \to \mathrm{End}(B(F, \hat F))$ is the weakest condition that can be imposed on the fibers in order to obtain the statement: if $\phi$ and $\psi$, with values in $\mathrm{End}(F)$ and $\mathrm{End}(\hat F)$, are differentiable, then $j \circ (\phi \oplus \psi)$ is also differentiable. But the continuity of $j$ is equivalent (see 4.19) to the property: every bounded set in $B(F, \hat F)$ is equicontinuous. Hence we conclude that this latter property of $B(F, \hat F)$ is a natural condition to impose in order to obtain the validity of the last proposition.

Nevertheless, an exception occurs in the case $X = \hat X$ = trivial bundle, where the proposition holds for any fiber.

(3): Suppose that $\hat X$ is the trivial bundle with fiber $\mathbb{R}$, the real numbers. Then $B(F, \hat F) = B(F, \mathbb{R}) = F'$, the topological dual of $F$ provided with the strong topology.

Here the condition on $F$ about the bounded sets is equivalent (see Bourbaki, EVT Chap. III, § 3, Ex. 6) to: the completion of $F$ is barrelled. Therefore:

4.33. Corollary: Every linear bundle such that the completion of its fiber is a barrelled space has a smooth dual.

Once again the requirement on the fiber is "necessary" (see Bourbaki, loc. cit.).

(4): Perhaps the reader cannot resist the temptation to define a tensor product of smooth linear bundles. The following might be a way: $X \otimes \hat X$ is defined by fibers as $(X \otimes \hat X)_x$ = subspace generated in the dual $(B(X_x, \hat X_x))'$ of $B(X_x, \hat X_x)$ by the image of $X_x \times \hat X_x$ under the canonical mapping $X_x \times \hat X_x \to (B(X_x, \hat X_x))'$. This object is a smooth bundle provided that in $B(F, \hat F)$ and in $(B(F, \hat F))'$ (the latter with the strong topology) every bounded set is equicontinuous.

This is always the case if $F$ and $\hat F$ are spaces of type (s.F), for example.

Of course when the product $X \otimes \hat X$ exists, it has the standard universal property.

H. The Tangent Bundle

Throughout this section, $M$ will denote a smooth $E$-manifold. We shall define a smooth linear bundle on $M$, the fiber being $E$, called the tangent bundle of $M$ (notation: $T(M)$).


4.34. Definition: Let $T(M) = \bigcup_{x \in M} TM_x$, where $TM_x$ is the tangent space to $M$ at $x$; let $\pi : T(M) \to M$ be the projection $\pi(y) = x$ if $y \in TM_x$. Let $\mathcal{A} = \{U, V, \ldots\}$ be the open covering of $M$ by the domains of charts $\phi, \psi, \ldots$ Put $T(U) = \bigcup_{x \in U} TM_x = \pi^{-1}(U)$ for every $U \in \mathcal{A}$ and define $u_U : T(U) \to U \times E$ as follows: if $y \in TM_x$, $u_U(y) = (x, \phi_*(x)\,y)$ (notation of 4.9). The system $T(M)$, $\pi$, $\{U, V, \ldots\}$, $\{u_U, u_V, \ldots\}$ defines a smooth linear bundle with fiber $E$. We call it the tangent bundle of $M$.

Of course the diagram formed by $u_U : T(U) \to U \times E$, $\pi : T(U) \to U$ and $\mathrm{pr}_1 : U \times E \to U$ (where $\mathrm{pr}_1(x, e) = x$) is commutative.

Consider now two charts $\phi$, $\psi$, with domains $U$ and $V$, respectively. The composition of the mappings:

$(U \cap V) \times E \xrightarrow{\ u_V^{-1}\ } T(U \cap V) \xrightarrow{\ u_U\ } (U \cap V) \times E$

is:

$u_U (u_V)^{-1}(x, e) = (x,\ \phi_*(x)\,[\psi_*(x)]^{-1} e),$

as follows immediately from the definition of $u_U$ and $u_V$. But then, since the mapping

[**] $\tau_{U,V} : x \to \phi_*(x)\,[\psi_*(x)]^{-1}$

(from $U \cap V$ into $\mathrm{End}(E)$) is a smooth mapping, the mappings $u_U$ make $T(M)$ into a smooth linear bundle and consequently into a smooth $E \times E$ manifold. (If $M$ is a manifold of class $C^k$, $T(M)$ is a manifold of class $C^{k-1}$; if $M$ is analytic, so is $T(M)$.)

Recall that in 4.31 the notion of section of a bundle was defined.

4.35. Definition: A vector field on $M$ is any section of $T(M)$.

A vector field may be of class $C^k$, $k \ge 0$, $C^\infty$ (analytic if $M$ is analytic).

4.36. Definition: The set of vector fields of class $C^k$, $k < \infty$, will be denoted by $Q^k$ (or $Q^k(M)$, if necessary), and that of those of class $C^\infty$ by $Q$ (or $Q(M)$).

Vector fields defined only on an open set of $M$ will often be considered.


Lifting of mappings. Suppose now that $M$ and $N$ are smooth manifolds (modelled on $E$ and $F$, respectively) and that $f : M \to N$ is also smooth. Then a mapping $f_*$ completing the diagram

$$\begin{array}{ccc} T(M) & \xrightarrow{\ f_*\ } & T(N) \\ \downarrow & & \downarrow \\ M & \xrightarrow{\ f\ } & N \end{array}$$

may be defined by means of

$f_*\!\left(\left.\dfrac{dp}{dt}\right|_{t=0}\right) = \left.\dfrac{d(f \circ p)}{dt}\right|_{t=0}.$

Sometimes we shall write $f_*(x)$ for the restriction of $f_*$ to $TM_x$:

[***] $f_*(x) : TM_x \to TN_{f(x)}.$

4.37. Definition: The mapping $f_*(x)$ defined in [***] is called the differential of $f$ at $x$.

4.38. Proposition: If $f : M \to N$ is $C^1$, then $f_*(x) : TM_x \to TN_{f(x)}$ is a continuous linear mapping for every $x \in M$. $f$ is one-to-one on some neighborhood of $x$ iff $f_*(x)$ is one-to-one. $f(U)$ covers a neighborhood of $f(x)$ for every neighborhood $U$ of $x$ iff $f_*(x)$ is onto.

The proof, which uses the implicit function theorem, is left to the reader.

Observe that the differential at $x$ of a chart $\phi : U \to E$ (where $U \subset M$) is the mapping denoted by $\phi_*(x)$ in definition 4.9 (assume the identification $TE_e = E$ induced by the identity).

The mapping $f_*$ can be computed as the derivative of a mapping when charts are chosen around $x$ and $f(x)$. In fact, suppose that $\phi$ and $\psi$ are such charts. Then, if $v = \left.\dfrac{dp}{dt}\right|_{t=0} \in TM_x$, and $y = f(x)$, we have

$$\begin{aligned}
\psi_*(y)\,(f_*(x)\,v) &= \psi_*(y) \left.\frac{d(f \circ p)}{dt}\right|_{t=0} \\
&= [(\psi f p)'(0)]\,(1) \\
&= [(\psi f \phi^{-1} \phi p)'(0)]\,(1) \\
&= [(\psi f \phi^{-1})'(\phi p(0))]\,((\phi p)'(0)\,(1)) \\
&= [(\psi f \phi^{-1})'(\phi(x))]\,(\phi_*(x)\,v).
\end{aligned}$$


Hence the following diagram, in which $u = (\psi f \phi^{-1})'(\phi(x))$, commutes.

$$\begin{array}{ccc} E & \xrightarrow{\quad u \quad} & F \\ {\scriptstyle \phi_*(x)}\uparrow & & \uparrow{\scriptstyle \psi_*(y)} \\ TM_x & \xrightarrow{\ f_*(x)\ } & TN_{f(x)} \end{array}$$

I. Ordinary Differential Equations

In this section, $M$ will denote a smooth manifold modelled on a Banach space $E$. Similar results may be obtained for manifolds modelled on Montel spaces (see E. Dubinsky, "Differential equations and differential calculus in Montel spaces", Trans. Am. Math. Soc., Vol. 110, No. 1 (1964)). The propositions below seem to have appeared for the first time in Michal and Elconin, "Completely integrable differential equations in abstract spaces", Acta Math., Vol. 68 (1937) pp. 71-107. Most proofs are omitted here: reference to the book of Serge Lang will provide them.

4.39. Notation: If $p$ is a curve in $M$, we shall write $p'(u)$ instead of $\left.\dfrac{dp}{dt}\right|_{t=u}$.

4.40. Proposition: Let $p_0 \in M$, and let $v$ be a vector field defined on some neighborhood of $p_0$. Consider the equation

(I) $p'(t) = v(p(t))$, $\quad p(0) = p_0$

where the unknown $p$ is a curve in $M$, its parameter running over some interval containing $0 \in \mathbb{R}$. If $v$ is of class $C^1$, the equation (I) has solutions and such solutions agree in the intersection of their intervals of definition.

The statement being local, $M$ may be replaced by $E$ itself. The proof is then a slight modification of Picard's proof of the corresponding statement in the finite dimensional case. (Observe that the special form of the equation (I) implies immediately that if a solution is $C^p$, then it is necessarily $C^{p+1}$.) As in the finite dimensional case, the above proposition implies the existence of maximal integral curves, in the following sense:

4.41. Proposition: Suppose $v$ is a $C^n$ vector field defined on the manifold $M$; then, for every $p \in M$, there exists a smooth curve $a(t, p)$ such that


(a) $a(t, p)$ is defined for $t$ belonging to some interval $(t_-(p), t_+(p))$ containing $0 \in \mathbb{R}$, and is of class $C^{n+1}$ there.

(b) $a(0, p) = p$ for every $p$.

(c) $a(t, p)$ satisfies the equation

$\left.\dfrac{da(t, p)}{dt}\right|_{t=u} = v(a(u, p))$

(d) Given $p \in M$, there is no $C^1$ curve defined on an interval properly containing $(t_-(p), t_+(p))$ and satisfying (b) and (c) above.

4.42. Proposition: If $u$, $t$, $u + t \in (t_-(p), t_+(p))$, then $a(u + t, p) = a(u, a(t, p))$.

((d) of Proposition 4.41 applies here.)

4.43. Proposition: The mappings $p \to t_+(p)$ and $p \to t_-(p)$ are lower and upper semicontinuous respectively.

(Follows from the proof of 4.40.)

4.44. Definition: Given a vector field $v$ on $M$ of class $C^1$, the mapping $a$ will be called the flow of $v$.

The flow $a$ is a function of two variables: $p \in M$, $t \in (t_-(p), t_+(p))$. By 4.43, this set of pairs $(p, t)$ is an open subset of $M \times \mathbb{R}$, hence a smooth manifold modelled on $E \oplus \mathbb{R}$. Let $D_v$ denote this manifold. From the proof of Proposition 4.40, we obtain:

4.45. Proposition: For every $v$ of class $C^1$, the flow $a : D_v \to M$ is a smooth mapping.

Finally we have:

4.46. Proposition: If $p \in M$, the mapping $a(\cdot, p) : (t_-(p), t_+(p)) \to M$ cannot be continuously prolonged to either endpoint, nor can $a(t, p)$ have a limit point as $t \to t_+(p)$ or $t \to t_-(p)$.

For suppose $a$ has a limit point as $t \to t_+(p)$. Let $p_\infty$ be this limit point $a(t_+(p), p)$. By the definition of $a$, there exist a neighborhood $U$ of $p_\infty$ and an $\varepsilon > 0$ such that $a$ is defined on $U \times (-\varepsilon, +\varepsilon)$. Let $t_0$ be a real number such that $t_+(p) - \varepsilon < t_0 < t_+(p)$ and such that $a(t_0, p) \in U$. Such a point exists. Then define a curve $p$ by $p(t) = a(t, p)$ if $t_- < t < t_+$, $p(t) = a(t - t_0, a(t_0, p))$ if $t_+ \le t < t_0 + \varepsilon$. Clearly (from 4.42) $p(t)$ is well defined for $t_- < t < t_0 + \varepsilon$, is smooth and satisfies the differential equation. This contradicts the maximality of $a(t, p)$, as it is described in 4.41(d).
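A one-dimensional example may help fix ideas about Propositions 4.41 to 4.46. The sketch below is an illustration only, under assumptions not taken from the notes: $M = E = \mathbb{R}$ and the vector field $v(p) = p^2$, whose flow can be written in closed form as $a(t, p) = p/(1 - tp)$. The function names `flow`, `t_plus` and `t_minus` are hypothetical.

```python
def flow(t, p):
    """Flow a(t, p) of the vector field v(p) = p**2 on the real line.

    Solves p'(t) = p(t)**2 with initial value p, giving a(t, p) = p / (1 - t*p).
    """
    if 1.0 - t * p <= 0.0:
        raise ValueError("t lies outside the maximal interval (t_-(p), t_+(p))")
    return p / (1.0 - t * p)

def t_plus(p):
    """Right endpoint of the maximal interval: 1/p for p > 0, +infinity otherwise."""
    return 1.0 / p if p > 0 else float("inf")

def t_minus(p):
    """Left endpoint of the maximal interval: 1/p for p < 0, -infinity otherwise."""
    return 1.0 / p if p < 0 else float("-inf")

p0, u, t = 1.5, 0.3, 0.2
lhs = flow(u + t, p0)           # a(u + t, p)
rhs = flow(u, flow(t, p0))      # a(u, a(t, p)), the composition in 4.42

# Values of the integral curve shortly before t_+(p0): they grow without bound.
blow_up = [flow(t_plus(p0) - eps, p0) for eps in (1e-1, 1e-2, 1e-3)]
```

In this example $t_+(p) = 1/p$ for $p > 0$, and $a(t, p)$ leaves every bounded set as $t \to t_+(p)$, so the integral curve has no limit point at the endpoint, in line with 4.46.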


J. Submanifolds

4.47. Definition: Let $M$ be a smooth $E$-manifold. A subspace $N \subset M$ is called a regularly imbedded submanifold if there exist a covering of $M$ by domains of charts ($U_i$ = domain of $\phi_i$) and a closed linear subspace $F$ of $E$ such that, for every $i$,

$\phi_i(N \cap U_i) = \phi_i(U_i) \cap F.$

The covering $\{U_i \cap N\}$ and the restrictions $\phi_i \,|\, U_i \cap N$ provide a smooth $F$-structure for $N$. It is easy to see that this structure does not depend on the particular choice of the original covering $\{U_i\}$ of $M$.

The following statement can be easily proved.

4.48. Proposition: Every regularly imbedded submanifold of $M$ is a closed subset of $M$.

Examples

1. $M$ and every point in $M$ are regularly imbedded submanifolds.
2. If $M$ is an open set of the Banach space $E$, then for every closed linear subspace $F$ of $E$ the set $F \cap M$ is a regularly imbedded submanifold of $M$.

The following proposition provides less trivial examples.

4.49. Proposition: Let $M$ be a smooth $E$-manifold and $f : M \to \mathbb{R}$ a smooth mapping. If $c \in \mathbb{R}$ is not a critical level for $f$, the set $N = f^{-1}(\{c\})$ is a regularly imbedded submanifold of $M$ (modelled on a hyperplane of $E$).

The reader will see that the lemma below implies the above proposition.

4.50. Lemma: If $x \in M$, $f$ is defined near $x$, smooth, and non-horizontal at $x$, then there exists a chart $y \to \psi(y)$ around $x$ such that $f(y) = x'(\psi(y)) + c$, where $x' \in E'$.

Proof: Considering $f - f(x)$ we may suppose that $f(x) = 0$. Choose a chart $\phi$ around $x$ such that $\phi(x) = 0$. We know that $x' = (f\phi^{-1})'(0)$ does not vanish. Let $e \in E$ be a vector such that $x'(e) = 1$, and let $F$ be the kernel of $x'$. Clearly $E = F \oplus \mathbb{R}e$. Define a mapping $\theta$ by:

$\theta(y \oplus te) = y \oplus [f\phi^{-1}(y \oplus te)]\,e$ ($y \in F$, $y$ and $t$ small).

The derivative at the origin of $\theta$ is

$\theta'(0)(y \oplus te) = y \oplus x'(y \oplus te)\,e = y \oplus te,$


or $\theta'(0)$ = identity. Therefore near the origin $\theta$ is a smooth mapping with smooth inverse.

Consider the chart $\psi = \theta\phi$. We have

(1) $f\psi^{-1}(y \oplus te) = f\phi^{-1}\theta^{-1}(y \oplus te) = f\phi^{-1}(y \oplus ue)$

where $\theta(y \oplus ue) = y \oplus te$. But by the definition of $\theta$, this last equality implies that $t = f\phi^{-1}(y \oplus ue)$. From (1) we conclude that

$f\psi^{-1}(y \oplus te) = t = x'(y \oplus te).$

Q.E.D.
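To make the construction in the proof concrete, here is a small worked example. The choices $M = E = \mathbb{R}^2$, $\phi = \mathrm{id}$, $x = 0$ and $f(y_1, y_2) = y_2 + y_1^2$ are assumptions made only for illustration; $\theta$ denotes the auxiliary map of the proof and $\psi = \theta\phi$ the new chart.

```latex
\begin{align*}
x' &= (f\phi^{-1})'(0) : (h_1,h_2)\mapsto h_2, \qquad e = (0,1), \qquad
      F = \ker x' = \mathbb{R}\times\{0\}, \\
\theta\bigl((y_1,0)\oplus te\bigr) &= (y_1,0)\oplus f(y_1,t)\,e = (y_1,\ t + y_1^2),
      \qquad \theta'(0) = \mathrm{identity}, \\
\psi^{-1}(y_1,t) &= \theta^{-1}(y_1,t) = (y_1,\ t - y_1^2), \\
f\psi^{-1}(y_1,t) &= (t - y_1^2) + y_1^2 = t = x'(y_1,t).
\end{align*}
```

In the chart $\psi$ the level set $f = c$ becomes the affine line $t = c$, so the parabola $y_2 = c - y_1^2$ is a regularly imbedded submanifold, as Proposition 4.49 asserts.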

4.51. More Examples

(a) Let $\mathcal{H}$ be a Hilbert space, $f : \mathcal{H} \to \mathbb{R}$ the mapping $f(x) = (x, x)$. $f$ is a quadratic form. Its derivative is $f'(x)\,y = 2(x, y)$, and is horizontal only at $x = 0$. Therefore the unit sphere $S = \{x \mid f(x) = 1\}$ is a regularly imbedded submanifold of $\mathcal{H}$ (modelled on any hyperplane).

(b) Let $(\Omega, \Gamma, \mu)$ be a measure space, and consider the space $X = L_p(\Omega, \Gamma, \mu)$, $p > 1$. The mapping $f : X \to \mathbb{R}$ defined by

$f(x) = \int |x(s)|^p\, \mu(ds)$

has as many continuous derivatives as the integer part $n$ of $p$ ("greatest integer less than or equal to $p$").

Considering $X$ as a manifold of class $C^n$, a form of Proposition 4.49 applies and we conclude that the unit sphere $S = \{x \mid |x| = 1\} = \{x \mid f(x) = 1\}$ is a regularly imbedded submanifold of $X$, of class $C^n$.

(c) Consider again the case of a Hilbert space $\mathcal{H}$, and let $S$ be the unit sphere of the Hilbert space $\mathcal{H} \oplus \mathbb{R}$. According to example (a), $S$ is a smooth manifold modelled on any hyperplane of $\mathcal{H} \oplus \mathbb{R}$, in particular on $\mathcal{H}$. Define now on $S$ the equivalence relation $x \sim y$ if $x = \pm y$. The quotient $S/\!\sim$ has a natural structure as a smooth manifold modelled on $\mathcal{H}$, called the projective space on $\mathcal{H}$ and denoted $P(\mathcal{H})$.

K. Riemannian Manifolds

Some Preliminary Remarks on Bilinear Forms

If $E$ is a real Hilbert space we have denoted by $B(E, E)$ the linear topological space of bilinear continuous forms on $E$ (see 4.15). The following is clear.


4.52. Lemma: The topology of $B(E, E)$ may be defined by means of the norm

(1) $|p| = \sup\{p(x, y);\ |x| \le 1,\ |y| \le 1\}.$

Consider now the L.T.S. $\mathrm{End}(E)$ as defined in 4.14. Clearly we have

4.53. Lemma: The topology of $\mathrm{End}(E)$ may be defined by means of the norm

(2) $|u| = \sup\{|u(x)|;\ |x| \le 1\}.$

Our next step is to prove

4.54. Lemma: $B(E, E)$ and $\mathrm{End}(E)$ are isometrically isomorphic in a canonical way. Symmetrical bilinear forms go onto self-adjoint operators.

Proof: Let $p \in B(E, E)$ and for every $x \in E$, consider the map $x^* : y \to p(x, y)$. Clearly $x^*$ is a continuous linear functional on $E$. Hence, there exists an element $x^{**} \in E$ such that $x^*(y) = (x^{**}, y)$. Call $u_p$ the map $u_p : x \to x^{**}$. It satisfies

$(u_p x, y) = p(x, y).$

Of course $u_p \in \mathrm{End}(E)$ and we have a mapping $p \to u_p$ from $B(E, E)$ into $\mathrm{End}(E)$.

It is obvious that $p \to u_p$ is linear and onto. Let us note that it is an isometry:

$$|u_p|^2 = \sup_{|x| \le 1} |u_p x|^2 = \sup_{|x| \le 1,\ |y| \le 1} |(u_p x, u_p y)| \le |u_p| \sup_{|x| \le 1,\ |y| \le 1} |(u_p x, y)| = |u_p| \sup_{|x| \le 1,\ |y| \le 1} |p(x, y)| \le |u_p|\,|p|,$$

whence $|u_p| \le |p|$. Conversely

$$|p| = \sup_{|x| \le 1,\ |y| \le 1} p(x, y) = \sup_{|x| \le 1,\ |y| \le 1} (u_p x, y) \le |u_p|.$$

Hence $|p| = |u_p|$. Q.E.D.

4.55. Corollary: $B(E, E)$ is a Frechet space.

Let us consider now two Hilbert spaces $F$ and $E$ (let $(\cdot, \cdot)_F$ and $(\cdot, \cdot)_E$ be their inner products respectively) and assume that $T : F \to E$ is a continuous linear operator. Denote by $p$ the bilinear form on $F$ defined by $p(x, y) = (Tx, Ty)_E$.


4.56. Lemma: The norm of $p$ in $B(F, F)$ is the square of that of $T$ as a bounded operator from $F$ into $E$.

Proof: By formula (1) we have

$|p| = \sup\{p(x, y);\ |x|_F \le 1,\ |y|_F \le 1\}.$

But then

$|p| = \sup\{(Tx, Ty)_E;\ |x|_F \le 1,\ |y|_F \le 1\} \le |T|^2.$

On the other hand,

$$|T|^2 = \sup_{|x|_F \le 1} (Tx, Tx)_E \le \sup_{|x|_F \le 1,\ |y|_F \le 1} (Tx, Ty)_E = |p|,$$

and we are through.

Remark: The same result follows from more general statements upon noticing that ${}^{t}T\,T = u_p$.
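Lemmas 4.54 and 4.56 can be sanity-checked in finite dimensions. The following sketch is an illustration under assumptions not made in the notes: $F = \mathbb{R}^3$, $E = \mathbb{R}^5$ with their standard inner products, and a random matrix `T`; the names `U_p`, `p`, `norm_T`, `norm_up` and `sup_value` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((5, 3))   # T : F = R^3 -> E = R^5 (illustrative choice)
U_p = T.T @ T                     # operator with (U_p x, y)_F = (T x, T y)_E

def p(x, y):
    """Bilinear form p(x, y) = (T x, T y)_E on F."""
    return float((T @ x) @ (T @ y))

norm_T = np.linalg.norm(T, 2)     # operator norm of T (largest singular value)
norm_up = np.linalg.norm(U_p, 2)  # operator norm of u_p, equal to |p| by 4.54

# The supremum defining |p| is attained at the top right singular vector of T.
x_top = np.linalg.svd(T)[2][0]
sup_value = p(x_top, x_top)

# Random unit vectors never exceed the bound |T|^2.
samples = rng.standard_normal((200, 2, 3))
samples /= np.linalg.norm(samples, axis=2, keepdims=True)
max_sampled = max(p(x, y) for x, y in samples)
```

Here `norm_up` and `sup_value` both agree with `norm_T**2` up to rounding, which is the content of Lemma 4.56 in this finite-dimensional setting.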

The Definition of Riemannian Manifolds and Some Properties

Let M be a smooth manifold modelled on a Hilbert space E. Since B(E, E) is a Fréchet space (Corollary 4.55), Proposition 4.21 allows us to apply Proposition 4.32 and its Corollary 4.33 to conclude that the duals of the bundles T(M) and B(T(M), T(M)) (notation of 4.32) both exist.

4.57. Definition: The bundles B(T(M), T(M)) and (T(M))' will be denoted by $T_2(M)$ and $T_1(M)$ respectively.

4.58. Definition: A pseudometric on M is any smooth symmetric section of $T_2(M)$.

In this definition, the word symmetric means that if s is the section, then for every $x \in M$, $s(x) \in B(TM_x, TM_x)$ is symmetric.

4.59. Definition: A pseudometric s will be called a metric if for every $x \in M$, the bilinear form s(x) is an inner product defining a Hilbert space structure on $TM_x$ compatible with its topology.

Of course the set of all pseudometrics on M is a linear space; that of the metrics is a cone in it.

4.60. Definition: A Riemannian manifold is a pair (M, g) where M is a Hilbert manifold and g is a metric on M.

Assume that g is a pseudometric for M. For every element $u \in TM_x$ ($x \in M$) we shall denote by $|u|$ (or $|u|_g$) the norm of u: $|u| = (g(x)(u, u))^{1/2}$. This norm is then a function from T(M) into R.

Assume now that $\phi$ is a chart of domain $U \subset M$.

[Diagram relating $T(U)$, $\phi_*$, $U$, $E$, $B(E, E)$ and $T_2(U)$.]

If $\delta$ and g are smooth sections of T(U) and $T_2(U)$ respectively, then

$$|\delta(x)|^2 = g(x)(\delta(x), \delta(x)) = g(x)\big([\phi_*(x)]^{-1} \phi_*(x)\delta(x),\ [\phi_*(x)]^{-1} \phi_*(x)\delta(x)\big) = \big[g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1})\big]\big(\phi_*(x)\delta(x),\ \phi_*(x)\delta(x)\big),$$

which, being the composition of $\phi_*(x)\delta(x)$ and $g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1})$, is also smooth.

This proves that, given a pseudometric g, for every smooth vector field $\delta$ the mapping $x \mapsto |\delta(x)|^2$ is also smooth.

Assume now that g is a metric. Since g induces the inner product g(x) on every $TM_x$ and $\phi_*(x) : TM_x \to E$ is continuous, it is natural (and useful) to consider the norms $|\phi_*(x)|$ and $|(\phi_*(x))^{-1}|$ of $\phi_*(x)$ and its inverse as operators from one Hilbert space ($TM_x$ with g(x)) into another, namely E. First of all, let us define $h : U \to B(E, E)$ by

$$h(x) = g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1}).$$

By definition, h is smooth. Moreover, h never assumes on U the value $0 \in B(E, E)$, because g is a metric. Then the real function $x \mapsto |h(x)|$ is continuous. Lemma 4.56 implies now that $|h(x)| = |(\phi_*(x))^{-1}|^2$ for every $x \in U$. Hence:

4.61. Proposition: $x \mapsto |(\phi_*(x))^{-1}|$ is continuous on U and never vanishes.

Length of a curve.

Assume that p is a curve in M, its domain being $[a, b] \subset R$.

4.62. Definition: We define the length of p, L(p), by

$$L(p) = \int_a^b \|p'(t)\|\,dt,$$

where $p'(t) = (p_*(t))(1)$.

Observe that L(p) might be $+\infty$.

This mapping L from the set of curves in M into $R \cup \{\infty\}$ will be thoroughly studied later; indeed much of our further study is devoted to statements about L (mainly about its critical points, namely, the geodesics).

Assume now that M is connected. As in the classical case, topological and arcwise connectedness are equivalent. This follows from the fact that every topologically connected, locally arcwise connected space is also arcwise connected. Then for every pair x, y of points in M, there exist smooth curves p joining them: p(a) = x, p(b) = y. Consider all the numbers L(p).

4.63. Definition: $d(x, y) = \inf L(p)$, the infimum taken on the (non-void) set of smooth curves joining x and y.

4.64. Proposition: The mapping d is a distance on M defining the original topology.

Proof: It suffices to prove that d induces the original topology on some open neighborhood of every point. Assume then that $\phi$ is a chart and U an open set in the domain of $\phi$ on which $|(\phi_*(x))^{-1}|$ is bounded (since $|(\phi_*(x))^{-1}|$ is continuous by 4.61, such a U does exist) and such that $\phi(U)$ is a ball in E. Such U's cover M and are d-open. We prove now that d induces the relative topology on U. If $y, z \in U$, let

$$p(t) = \phi^{-1}(t\phi(y) + (1 - t)\phi(z)), \qquad 0 \le t \le 1.$$

Then

$$\phi_*(p(t))\,p'(t) = \frac{d}{du}\,\phi(p(u))\Big|_{u=t} = \phi(y) - \phi(z)$$

and hence

$$p'(t) = [\phi_*(p(t))]^{-1}(\phi(y) - \phi(z)).$$

Therefore

$$|p'(t)| \le |(\phi_*(p(t)))^{-1}|\,|\phi(y) - \phi(z)|.$$

Since we have assumed that $|(\phi_*(x))^{-1}|$ is bounded, we have

$$|p'(t)| \le K\,|\phi(y) - \phi(z)|,$$

from which it follows that

$$L(p) \le K\,|\phi(y) - \phi(z)|.$$

Finally:

$$d(y, z) \le K \int_0^1 \Big|\frac{d}{dt}\big(t\phi(y) + (1 - t)\phi(z)\big)\Big|\,dt = K\,|\phi(y) - \phi(z)|.$$

This implies that if $z \to y$ in the topology of M, then $d(y, z) \to 0$. On the other hand, if q is any curve joining y and z, we have (enlarging K, and shrinking U if necessary, so that K also bounds $|\phi_*(x)|$ on U)

$$L(q) = \int_0^1 |q'(t)|\,dt \ge \frac{1}{K}\int_0^1 |\phi_*(q(t))\,q'(t)|\,dt \ge \frac{1}{K}\Big|\int_0^1 \phi_*(q(t))\,q'(t)\,dt\Big| = \frac{1}{K}\Big|\int_0^1 \frac{d}{dt}\,(\phi q(t))\,dt\Big| = \frac{1}{K}\,|\phi(y) - \phi(z)|,$$

and then, choosing L(q) near d(y, z), we see that when $d(z, y) \to 0$ it necessarily follows that $|\phi(y) - \phi(z)| \to 0$, which implies that $z \to y$ in M.

Q.E.D.

Remark: Of course even if M is not connected it is possible to define a distance on M induced by the distance defined above on each of the components.

4.65. Definition: A Riemannian manifold is said to be complete if each of its components is complete under the metric defined in 4.63.

Generally, we shall deal with complete Riemannian manifolds.
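To make Definitions 4.62, 4.63 and 4.65 concrete, here is a small worked example. It is a sketch under assumptions made only for illustration: the manifold $M = (0, \infty)$ with the single identity chart, and the metric $g(x)(u, v) = uv/x^2$, neither of which appears in the text.

```latex
% Length of a curve p : [a, b] -> (0, \infty) for the metric g(x)(u, v) = u v / x^2:
\[
L(p) = \int_a^b \frac{|p'(t)|}{p(t)}\,dt
\;\ge\; \Big| \int_a^b \frac{p'(t)}{p(t)}\,dt \Big|
= \big| \log p(b) - \log p(a) \big| ,
\]
% with equality when p is monotone. Taking the infimum over curves joining x and y:
\[
d(x, y) = | \log y - \log x | .
\]
% Hence x -> log x is an isometry of (M, d) onto R with its usual distance,
% so (M, g) is complete in the sense of 4.65.
```

By contrast, the same set $(0, \infty)$ with the usual metric $g(x)(u, v) = uv$ has $d(x, y) = |x - y|$ and is not complete.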

Gradient

Let (M, g) be a Riemannian manifold. Every $TM_x$ is endowed with an inner product g(x) that makes it into a Hilbert space. That means that there exist canonical isometries $r_x : (TM_x)' \to TM_x$.

4.66. Definition: If f is a smooth function on M, the gradient of f is the vector field $\nabla f$ defined by $(\nabla f)_x = r_x(f_*(x))$. That is, $\nabla f$ satisfies

$$g(x)((\nabla f)_x, v) = vf$$

for all $v \in TM_x$ and all $x \in M$.

It is clear that $f \mapsto \nabla f$ is a linear mapping from $C^\infty(M)$ into $\mathfrak{X}(M)$. Moreover, if $\nabla f \equiv 0$, then f is constant on every component of M. Hence, if M is connected, $\nabla$ is an injection of $C^\infty(M)/R$ into $\mathfrak{X}(M)$.
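The defining relation of the gradient can be unwound in a single chart. The following sketch assumes that M is an open subset of the Hilbert space E and that the metric has the form $g(x)(u, v) = (G(x)u, v)$ with $G(x)$ symmetric, positive and invertible; the operator $G$ and the notation $\operatorname{grad}$ for the gradient with respect to the fixed inner product of E are introduced here for illustration only.

```latex
% grad f(x) is defined by f'(x) v = (grad f(x), v) for the fixed inner product of E.
\[
\big( G(x) (\nabla f)_x , v \big) = g(x)\big( (\nabla f)_x , v \big) = v f
= \big( \operatorname{grad} f(x), v \big) \quad \text{for all } v \in E ,
\]
\[
\text{hence} \qquad (\nabla f)_x = G(x)^{-1} \operatorname{grad} f(x) .
\]
% Example: E = R^2, G = diag(1, 4), f(x) = x_1 + x_2 gives
\[
\nabla f = \big( 1, \tfrac14 \big) .
\]
```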


Part 2

Morse Theory

A. The Non-critical Neck Principle

Assume that M is a smooth manifold modelled on a Banach space E. We recall that, given a smooth function $f : M \to R$, a real number c is called a critical level for f if there exists a point $x \in M$ such that f(x) = c and $f_*(x) = 0$ (see 4.4 and the notation in 4.9).

4.67. Theorem (Non-critical Neck Principle): Let f be a smooth function $f : M \to R$ and consider the sets $N \subset W \subset M_1 \subset M$ defined by:

$$W = \{x \in M;\ a < f(x) < b\}$$
$$M_1 = \{x \in M;\ a - \varepsilon < f(x) < b + \varepsilon\}$$
$$N = f^{-1}(c),$$

where a, b, c and $\varepsilon$ are real numbers, $\varepsilon > 0$ and a < c < b. Suppose that v is a smooth vector field defined on M, and let $\sigma(t, p)$ denote its flow (see 4.44). Assume also that

(a) $vf \ge \delta > 0$

and that

(b) for every fixed $p \in N$, the function $f(\sigma(t, p))$ assumes values greater than b and less than a in its interval of definition $t_-(p) < t < t_+(p)$.

Then the manifolds W and $(a, b) \times N$ are diffeomorphic.

Remark: The fact that N is a manifold follows from 4.49.

Proof: Without loss of generality we may assume that a = -1, b = +1, c = 0 and vf = 1 (this last condition is achieved by replacing the original v by $(vf)^{-1}v$). That means that

$$\frac{d}{dt} f(\sigma(t, p)) = vf = 1$$

and hence (after integrating), that:

[*] $f(\sigma(t, p)) = f(p) + t$.

Assume now that $n \in N$. Then $f(\sigma(t, n)) = t$ and, from (b), we conclude that $t_-(n) < -1$, $t_+(n) > +1$. This means in particular that the domain of the mapping $\sigma$ contains $(-1, +1) \times N$. Consider now the restriction

$$A : (-1, +1) \times N \to M$$

of $\sigma$ (i.e., $A(t, n) = \sigma(t, n)$).

We shall show that A is a diffeomorphism between $(-1, +1) \times N$ and W.

1. A assumes values in W. In fact, from [*] we obtain $|f(A(t, n))| = |f(\sigma(t, n))| = |t| < 1$.

2. A maps $(-1, +1) \times N$ onto W. Assume that $p \in W$. Set t = f(p), $n = \sigma(-t, p) = \sigma(-f(p), p)$. Then $f(n) = f(\sigma(-f(p), p))$, and by [*], we have $f(n) = f(p) - f(p) = 0$, whence $n \in N$. Clearly (see the definition of W) we also have -1 < t < +1. Now using Proposition 4.42 we obtain:

$$A(t, n) = \sigma(t, n) = \sigma(f(p), \sigma(-f(p), p)) = \sigma(0, p) = p$$

and A is therefore onto.

3. A is smooth and has a smooth inverse. If p = A(t, n), using 4.42 again and also using [*] we conclude that

$$t = f(p)$$

$$n = \sigma(-t, p) = \sigma(-f(p), p).$$

But then 4.45 implies that both A and $A^{-1}$ are smooth.

Remark 1: If the hypothesis (b) is not satisfied then the proposition is false, as the following example shows: let V be the surface of a vertical circular cylinder in Euclidean space $R^3$. Denote by f(x) the vertical coordinate of a point $x \in V$ and by v(x) a vertical unit vector with origin at $x \in V$. Now remove a point $z \in V$ such that f(z) = 0 and let M denote the remaining manifold.

For the values a = -1, b = +1 the proposition does not hold for M. Nevertheless, all the hypotheses except (b) are satisfied.

Remark 2: Observe that the diffeomorphism $A : (a, b) \times N \to W$ sends $\{z\} \times N$ diffeomorphically onto $f^{-1}(z)$ for every a < z < b (this follows from the formula [**] in the above proof).
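A case in which the flow of Theorem 4.67 can be written in closed form may help fix ideas. The sketch below assumes, for illustration only, that M is the plane annulus $\{x \in R^2;\ a - \varepsilon < |x|^2 < b + \varepsilon\}$ with $0 < \varepsilon < a$, that $f(x) = |x|^2$, and that v is already normalized so that vf = 1; none of these choices comes from the text.

```latex
% Normalized field and its flow on the annulus:
\[
v(x) = \frac{x}{2|x|^2}, \qquad vf = 2\,(x, v(x)) = 1, \qquad
\sigma(t, p) = \Big( 1 + \frac{t}{|p|^2} \Big)^{1/2} p ,
\]
% so that, in agreement with formula [*] of the proof,
\[
f(\sigma(t, p)) = |p|^2 + t , \qquad a - \varepsilon - |p|^2 < t < b + \varepsilon - |p|^2 .
\]
% For p in N = { |x|^2 = c } the values c + t fill (a - eps, b + eps), so (b) holds.
% The resulting diffeomorphism of (a, b) x N onto W = { a < |x|^2 < b }:
\[
(s, n) \longmapsto \sigma(s - c, n) = \Big( \frac{s}{c} \Big)^{1/2} n ,
\qquad f\big( \sigma(s - c, n) \big) = s .
\]
```

Here each circle $\{s\} \times N$ is carried onto the level set $f^{-1}(s)$, as stated in Remark 2.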


4.68. Proposition: Assume the same hypotheses as in 4.67 and furthermore assume that (b) is satisfied in the following stronger version:

(b') for every fixed $p \in N$, the function $f(\sigma(t, p))$ assumes values greater than $b + \eta$ and less than $a - \eta$ for some $\eta > 0$ (the same for all $p \in N$).

Then if $W_1 = \{x \in M;\ a \le f(x) \le b\}$, there exists a diffeomorphism between $W_1$ and $N \times [a, b]$ (as long as they are manifolds, i.e. when N is a manifold without boundary).

Proof: From the proposition above we obtain the existence of a mapping

(1) $A : N \times (a - \eta/2,\ b + \eta/2) \to W_2$

where $W_2 = \{x \in M;\ a - \eta/2 < f(x) < b + \eta/2\}$.

This mapping sends $N \times \{z\}$ onto $f^{-1}(z)$ for every $a - \eta/2 < z < b + \eta/2$. Then the restriction of A to $N \times [a, b]$ is the desired diffeomorphism.

4.69. Corollary: Under the hypotheses of 4.68, there exists a homotopy $H : M \times I \to M$, where I = [0, 1], such that if $H_s(m) = H(m, s)$ then

1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \eta/4 \le f(m) \le b + \eta/4$, then $H_s(m) = m$ for all s;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.

Proof: Let h be a smooth function as shown below such that h'(x) > 0 for all x.


Let $F = \{x;\ a - \eta/4 \le f(x) \le b + \eta/4\}$ and let G = M - F. Clearly G is open and $M = G \cup W_2$.

Now if A is the mapping defined in (1) of the proof of 4.68, then for every s between 0 and 1 we define $H_s$ as follows:

if $m \in G$, then $H_s(m) = m$;

if $m = A(n, t) \in W_2$, then $H_s(m) = A(n, (1 - s)t + s\,h(t))$.

Observe that $H_s$ is well defined (and equal to the identity) on $W_2 \cap G$. Indeed, the reader can now verify that the $H_s$ have the properties 1), ..., 4).

Remark: Part (4) of this corollary says in particular that $\{x;\ f(x) \le a\}$ and $\{x;\ f(x) \le b\}$ are diffeomorphic. This fact will be used very often. An important generalization appears in 4.72.

B. The Palais-Smale Condition

Let us assume now that (M, g) is a Riemannian manifold.

4.70. Definition: If $f \in C^\infty(M)$, we shall say that f satisfies the Palais-Smale condition ("P-S condition") if whenever S is a set in M on which f is bounded and $\|\nabla f\|$ is not bounded away from zero, then there exists a critical point of f adherent to S.

Of course this is equivalent to: if $\{x_n\}$ is a sequence in M such that $\{f(x_n)\}$ is bounded and $\|(\nabla f)_{x_n}\| \to 0$, then there exists a convergent subsequence of $\{x_n\}$ (the limit being necessarily a critical point).

Remark: This condition appears in (2) and (3) of the bibliography.
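Two one-dimensional examples show what the condition rules out. They are a sketch assuming M = R with its usual metric, so that $\|\nabla f\| = |f'|$; the functions are chosen here for illustration and are not taken from the text.

```latex
% A function satisfying the P-S condition:
\[
f_1(x) = x^2 : \quad f_1(x_n) \text{ bounded} \ \Longrightarrow\ x_n \text{ bounded}
\ \Longrightarrow\ \{x_n\} \text{ has a convergent subsequence.}
\]
% A function violating it: bounded, gradient tending to 0 along x_n = n, no critical point.
\[
f_2(x) = \arctan x : \quad |f_2(x)| < \tfrac{\pi}{2}, \qquad
f_2'(n) = \frac{1}{1 + n^2} \to 0, \qquad f_2'(x) \ne 0 \ \text{ for all } x .
\]
```

For $f_2$ the sequence $x_n = n$ has no convergent subsequence, so the condition fails "at infinity".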

4.71. Theorem (Non-critical Neck Principle for Riemannian manifolds): Let M be a complete Riemannian manifold, $f \in C^\infty(M)$ and consider the sets $N \subset W \subset W_1 \subset M_1 \subset M$ defined by

$$W = \{x;\ a < f(x) < b\}$$
$$W_1 = \{x;\ a < f(x) \le b\}$$
$$M_1 = \{x;\ a - \varepsilon < f(x) < b + \varepsilon\}$$
$$N = f^{-1}(c),$$

where a, b, c and $\varepsilon$ are real numbers, $\varepsilon > 0$ and a < c < b. Assume that the following hypotheses are satisfied:

($\alpha$) f satisfies the P-S condition on $M_1$;
($\beta$) f has no critical point in $M_1$.

Then W is diffeomorphic to $N \times (a, b)$ and (if N has empty boundary) $W_1$ is diffeomorphic to $N \times (a, b]$. Moreover, the diffeomorphism may be chosen so as to send $N \times \{z\}$ onto $f^{-1}(z)$ diffeomorphically for every z between a and b.

Proof: Define a vector field v by $v = \nabla f$ (and let $\sigma$ be its flow). We shall show that v and f satisfy hypothesis (a) of 4.67 and (b') of 4.68.

First of all, we see that $vf = (\nabla f)f = (\nabla f, \nabla f) = \|\nabla f\|^2$ and that hypotheses ($\alpha$) and ($\beta$) imply that $\|\nabla f\|$ is bounded away from 0 on $M_2 = \{x;\ a - \varepsilon/2 < f(x) < b + \varepsilon/2\}$, i.e., $\|\nabla f(z)\| \ge \delta > 0$ if $z \in M_2$. Thus, hypothesis (a) of 4.67 is satisfied. So is hypothesis (b'). In fact, assume that

[*] $f(\sigma(t, p)) \le b + \varepsilon/2$ for $0 \le t < t_+ = t_+(p)$.

Then

$$\frac{d}{dt} f(\sigma(t, p)) = (vf)(\sigma(t, p)) = \|\nabla f(\sigma(t, p))\|^2,$$

and hence

[**] $$\int_0^{t_+} \|\nabla f(\sigma(t, p))\|^2\,dt = \lim_{t \to t_+} \int_0^{t} \frac{d}{du} f(\sigma(u, p))\,du \le \sup_{0 \le u < t_+} f(\sigma(u, p)) - f(p) \le b + \frac{\varepsilon}{2} - f(p).$$

Since $\|\nabla f\| \ge \delta$ on $M_2$, it follows from [*] and [**] that

$$\delta^2 t_+ \le b + \frac{\varepsilon}{2} - f(p),$$

whence $t_+$ is finite.

But then we also have (from [**]):

[***] $$\int_0^{t_+} \|\nabla f(\sigma(t, p))\|\,dt < +\infty.$$

Since $\sigma$ is the flow of $\nabla f$, we conclude that $\nabla f(\sigma(t, p)) = \frac{d\sigma}{dt}(t, p)$, and then [***] implies:

[****] $$\int_0^{t_+} \Big\|\frac{d\sigma}{dt}\Big\|\,dt < +\infty.$$

Assume now that $\eta > 0$ is given. Then $T < t_+$ may be chosen so that

$$\int_T^{t_+} \Big\|\frac{d\sigma}{dt}(t, p)\Big\|\,dt < \eta.$$

Consider now two points $\sigma(x, p)$, $\sigma(y, p)$ with $T \le x \le y < t_+$. They may be joined by a curve $\gamma(t) = \sigma(t, p)$, $x \le t \le y$, whose length by the last formula is less than $\eta$. Therefore the distance between them is less than $\eta$, and we have proved thereby that the net $\sigma(t, p)$, $0 < t < t_+$, is a Cauchy net (cf. Kelley, General Topology). Since M is complete, $\lim_{t \to t_+} \sigma(t, p)$ exists, which together with $t_+ < +\infty$ contradicts Prop. 4.45. Thus hypothesis (b') of 4.68 is satisfied (for $\eta = \varepsilon/2$) and then 4.68 applies.

Q.E.D.

Remark: Observe that hypotheses ($\alpha$) and ($\beta$) are independent of c. Therefore, the conclusion is true for any c between a and b.

4.72. Corollary: Under the hypotheses of Theorem 4.71, there exists a homotopy $H : M \times I \to M$ (I = [0, 1]) having the properties:

1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \varepsilon/8 \le f(m) \le b + \varepsilon/8$, then $H_s(m) = m$ for all s;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.

Proof: We have shown that the hypotheses of 4.71 imply those of 4.68, and hence of its corollary.

C. Local Study of Critical Points

Let E be a Hilbert space and f a smooth real function defined on a neighborhood of $0 \in E$. Using Taylor's expansion we write

$$f(x) = f(0) + f'(0)(x) + \tfrac12 f''(0)(x, x) + R(x),$$

where R(x) is a function of order 3.

Assume that 0 is a critical point of f. Then:

[*] $f(x) = f(0) + \tfrac12 f''(0)(x, x) + R(x)$.


Since the bilinear form $f''(0)$ is continuous and symmetric, there exists (see 4.54) a (unique) symmetric operator $A \in \mathrm{End}(E)$ such that

$$f''(0)(x, y) = (Ax, y) = (x, Ay).$$

Formula [*] then becomes

[**] $f(x) = f(0) + \tfrac12 (Ax, x) + R(x)$.

Let us consider a smooth change of coordinates $y \mapsto x(y)$, such that x(0) = 0. Then $f(y) = f_1(x(y))$ and using the chain rule we obtain:

$$f''(y)(z_1, z_2) = f_1''(x(y))(x'(y)z_1,\ x'(y)z_2) + f_1'(x(y))(x''(y)(z_1, z_2)).$$

Since 0 is a critical point, we get:

$$f''(0)(z_1, z_2) = f_1''(0)(x'(0)z_1,\ x'(0)z_2).$$

This formula shows that the operator A transforms according to:

[***] $A = u^* A_1 u$, where $u = x'(0)$.

4.73. Definition: Let f be a smooth function defined on some Riemannian manifold, x a critical point of f. We shall say that x is a non-degenerate critical point of f if in any chart, the operator A defined in [**] is invertible.

Remark: The formula [***] shows this notion to be coordinate independent.

We are led to the same concept as follows. If x is a critical point and $\phi$ is a chart around x, the bilinear form $H(f)_x$ defined on $TM_x$ by

$$H(f)_x(\mu, \lambda) = [(f\phi^{-1})''(\phi(x))](\phi_*(x)\mu,\ \phi_*(x)\lambda)$$

does not depend on $\phi$. Hence we may make the following definition.

4.74. Definition: The bilinear form $H(f)_x$ is called the Hessian of f at x.

According to our definitions, the Hessian of a smooth function is a smooth section of the bundle $B(T(M), T(M)) = T_2(M)$ defined on the set of critical points of f.

4.75. Definition: A critical point x is called non-degenerate if $H(f)_x$ is a scalar product defining the given topology of $TM_x$.

It is obvious that 4.73 and 4.75 are equivalent.

Remark: A very elegant definition of $H(f)_x$ is given in Milnor, "Morse Theory" (Ann. of Math. Studies, No. 51, Princeton 1963).


Let F be a complex Hilbert space. Denote by End(F) the space of all the continuous linear operators $T : F \to F$ and by Aut(F) (respectively H(F)) the subset of invertible operators (respectively, the subspace of the Hermitian operators).

4.76. Lemma: If $A \in \mathrm{Aut}(F) \cap H(F)$, then the mapping $\psi : \mathrm{End}(F) \to H(F) \times H(F)$ defined by

$$\psi(B) = (B^*A + AB,\ i(B^*A - AB))$$

is one-to-one onto and both $\psi$ and $\psi^{-1}$ are continuous.

Proof: In fact, given S and T symmetric, define

$$B = \tfrac12 A^{-1}(S + iT).$$

Then $B^*A + AB = S$ and $i(B^*A - AB) = T$, and hence

$$(S, T) \mapsto \tfrac12 A^{-1}(S + iT)$$

is the inverse of $\psi$. Clearly both are continuous.
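The verification left to the reader in this proof is short; written out, using only that A, S and T are Hermitian and $B = \tfrac12 A^{-1}(S + iT)$, it reads as follows.

```latex
\[
B^{*} = \tfrac12 (S - iT) A^{-1}
\qquad (\text{since } A^{*} = A,\ S^{*} = S,\ T^{*} = T),
\]
\[
B^{*} A = \tfrac12 (S - iT), \qquad A B = \tfrac12 (S + iT),
\]
\[
B^{*} A + A B = S, \qquad i\,(B^{*} A - A B) = i\,(-iT) = T .
\]
```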

4.77. Lemma: Assume that $A \in \mathrm{Aut}(F) \cap H(F)$. Then the mapping $\phi : \mathrm{Aut}(F) \to H(F) \times H(F)$ defined by

$$\phi(B) = (B^*AB,\ i(B^*AB^{-1} - A))$$

is differentiable and its derivative at B = I is

$$\delta\phi(I, B) = \psi(B).$$

Proof: Observe that $\delta(B^{-1}) = -\delta B$ at B = I and compute.

4.78. Corollary: $\phi$ maps a neighborhood of $I \in \mathrm{Aut}(F)$ diffeomorphically onto a neighborhood of $(A, 0) = \phi(I)$.

Proof: Use the implicit function theorem.

Assume now that $x \mapsto (A(x), D(x))$ is a smooth mapping from an open set of F into $H(F) \times H(F)$ such that A(0) = A and D(0) = 0. Then

4.79. Lemma: $x \mapsto B(x) = \phi^{-1}(A(x), D(x))$ is a smooth mapping such that B(0) = I and $A(x) = B^*(x)A(0)B(x)$ and

$$D(x) = i(B^*(x)A(0)B^{-1}(x) - A(0)).$$

Proof: Follows from the corollary above.

Assume now that E is a real Hilbert space.


4.80. Proposition: Let $x \mapsto A(x)$ be a smooth mapping from a neighborhood of $0 \in E$ into End(E) such that A(x) is symmetric and A(0) is also invertible. Then there exists a smooth mapping $x \mapsto B(x)$ of some neighborhood of $0 \in E$ into Aut(E) such that:

$$A(x) = B^*(x)A(0)B(x).$$

Proof: Let F be the complexification of E: $F = E \otimes C = E \oplus iE$. Define the mappings $x + iy = z \mapsto A(z)$ by $A(z)(u + iv) = A(x)u + iA(x)v$ and $z \mapsto D(z)$ by D(z) = 0. Apply Lemma 4.79 to prove that there exists a map $z \mapsto B(z)$ such that:

(1) $B^*(z)A(0)B(z) = A(z)$

(2) $B^*(z)A(0)(B(z))^{-1} = A(0)$.

From (1) and (2) we conclude that:

(3) $(B(z))^2 = (A(0))^{-1}A(z)$

(4) $(B^*(z))^2 = A^*(z)(A^*(0))^{-1}$.

Now observe that for every $x \in E$, A(x) and A(0) leave E invariant: $A(x)E \subset E$, $A(0)E = E$. Then (3) and (4) imply that E is also invariant under $(B(x))^2$ and $(B^*(x))^2$. Now we shall use the following statement (see below for a justification):

(5) If an operator T satisfies $\|I - T^2\| < 1$, then the invariant (closed) subspaces of T and $T^2$ are the same.

From (5), it follows that E is also invariant under B(x) and $B^*(x)$, provided that $x \in E$ is near 0. Hence, by restriction to E (and calling $B(x) = B(x)|_E$) we obtain from (1)

$$B^*(x)A(0)B(x) = A(x), \qquad x \in E,\ x \text{ near } 0,$$

as desired.

Justification of (5): observe that (for T close to I, which is the case for T = B(x) with x near 0)

$$T = (I - (I - T^2))^{1/2} = I - \frac12 (I - T^2) - \frac18 (I - T^2)^2 - \dots - \frac{(2(n-1))!}{2^{2n-1}\,n\,((n-1)!)^2}\,(I - T^2)^n - \dots,$$

where the series converges in the uniform topology of operators if $\|I - T^2\| < 1$.
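The coefficients of this series are those of the binomial expansion of $(1 - w)^{1/2}$. As a sanity check on the general term, the first few can be computed directly; the notation $w$ (standing for $I - T^2$) and $c_n$ is introduced here only for this check.

```latex
\[
(1 - w)^{1/2} = 1 - \sum_{n \ge 1} c_n w^n , \qquad
c_n = \frac{(2(n-1))!}{2^{2n-1}\, n\, ((n-1)!)^2} ,
\]
\[
c_1 = \frac{0!}{2 \cdot 1 \cdot 1} = \frac12 , \quad
c_2 = \frac{2!}{8 \cdot 2 \cdot 1} = \frac18 , \quad
c_3 = \frac{4!}{32 \cdot 3 \cdot 4} = \frac1{16} , \quad
c_4 = \frac{6!}{128 \cdot 4 \cdot 36} = \frac5{128} .
\]
% All c_n are positive, so the operator series is dominated in norm by the
% scalar series \sum c_n \|w\|^n, which converges for \|w\| < 1.
```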


4.81. Proposition: Let A be a symmetric invertible operator in End(E) (E = real Hilbert space). Then there exist $T \in \mathrm{Aut}(E)$ and a projector $P \in \mathrm{End}(E)$ such that

$$(Ax, x) = \|PTx\|^2 - \|(1 - P)Tx\|^2, \qquad x \in E.$$

Proof: Let h be the characteristic function of $[0, \infty)$ and g the function $g(\lambda) = |\lambda|^{-1/2}$, $\lambda$ real, $\lambda \ne 0$. Since A is invertible, g is continuous on the spectrum of A. Then S = g(A) is defined.

Clearly S (being a function of A) is symmetric and commutes with A. Moreover S is invertible (because $g \ne 0$ on Spectrum(A)). Call $T = S^{-1}$; T is symmetric and invertible.

Now define P = h(A). P is clearly a projector (because $h^2 = h$) commuting with A, hence also with T. Since we have

$$\lambda\,(g(\lambda))^2 = h(\lambda) - (1 - h(\lambda)),$$

we conclude that $AT^{-2} = P - (1 - P)$, and hence $A = PT^2 - (1 - P)T^2$. But then

$$(Ax, x) = (PT^2x, x) - ((1 - P)T^2x, x) = \|PTx\|^2 - \|(1 - P)Tx\|^2,$$

as desired.
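In finite dimensions the construction of T and P can be followed on a diagonal operator. The example below is a sketch with $E = R^2$ and an operator A chosen here for illustration; it is not taken from the text.

```latex
\[
A = \begin{pmatrix} 4 & 0 \\ 0 & -9 \end{pmatrix}, \qquad
S = g(A) = |A|^{-1/2} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/3 \end{pmatrix}, \qquad
T = S^{-1} = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix},
\]
\[
P = h(A) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
A T^{-2} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = P - (1 - P),
\]
\[
(Ax, x) = 4 x_1^2 - 9 x_2^2 = (2 x_1)^2 - (3 x_2)^2
= \| P T x \|^2 - \| (1 - P) T x \|^2 .
\]
```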

4.82. Proposition (Morse Lemma): Let f be a smooth function defined on a Riemannian manifold M (modelled on E). If x is a non-degenerate critical point of f, then there exists a chart $\phi$ around x (sending x into 0) and a projector P in E such that

(1) $f(y) = f(x) + |P\phi y|^2 - |(1 - P)\phi y|^2$

when y belongs to the domain of $\phi$.

Proof: Let $\psi$ be any chart around x (sending x into 0) and put $g(e) = (f\psi^{-1})(e) - f(x)$, where $e \in E$ and e is near 0. Then

(2) $$g(y) = g(0) + g'(0)y + \int_0^1 [g''(sy)](y, y)(1 - s)\,ds = \Big[\int_0^1 (1 - s)\,g''(sy)\,ds\Big](y, y).$$

Now, since $\alpha_y = \int_0^1 (1 - s)\,g''(sy)\,ds$ is a symmetric bilinear form on E, there exists a mapping $y \mapsto A(y) \in \mathrm{End}(E)$ defined by

(3) $(A(y)u, v) = \alpha_y(u, v)$.

Clearly every A(y) is symmetric and

$$(A(0)u, v) = \alpha_0(u, v) = \tfrac12 g''(0)(u, v).$$

Together with the fact that x is a non-degenerate critical point of f this implies that A(0) is invertible. Now we apply 4.80 and obtain B(y) satisfying:

$$A(y) = B^*(y)A(0)B(y).$$

This implies that

(4) $(A(y)y, y) = (A(0)B(y)y,\ B(y)y)$,

and from (2), (3) and (4) we obtain:

$$g(y) = (A(0)B(y)y,\ B(y)y).$$

Using 4.81, we conclude that there exist T and P such that

(5) $g(y) = |PTB(y)y|^2 - |(1 - P)TB(y)y|^2$.

Define $\phi$ by

$$\phi(y) = T(B(\psi(y))\,\psi(y)).$$

By (5), we have:

$$f(y) = f(x) + g(\psi(y)) = f(x) + |PTB(\psi(y))\psi(y)|^2 - |(1 - P)TB(\psi(y))\psi(y)|^2 = f(x) + |P\phi y|^2 - |(1 - P)\phi y|^2,$$

as desired.

We must observe that $\phi$ is actually a chart; indeed it is the composition of

$$y \mapsto \psi(y)$$

$$e \mapsto B(e)e$$

$$z \mapsto T(z)$$

and clearly the first and third mappings are diffeomorphisms while $e \mapsto B(e)e$, having the identity as derivative at the origin (compute!), is also a local diffeomorphism. Hence $\phi$ is a chart on some domain around x.

Q.E.D.
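The steps of this proof can be traced on a one-variable function. The following is a sketch assuming M = E = R, the critical point x = 0, the initial chart $\psi$ equal to the identity, and the function $f(y) = y^2 + y^3$, all chosen here for illustration.

```latex
% Here g = f, g''(y) = 2 + 6y, and the form alpha_y of the proof is a number:
\[
\alpha_y = \int_0^1 (1 - s)\,(2 + 6 s y)\,ds = 1 + y = A(y), \qquad A(0) = 1 .
\]
% Proposition 4.80: B(y)^2 = A(0)^{-1} A(y), with B(0) = 1.
\[
B(y) = (1 + y)^{1/2}, \qquad A(y) = B(y)\,A(0)\,B(y) .
\]
% Proposition 4.81 with A(0) = 1 gives T = 1, P = 1, so the Morse chart is
\[
\phi(y) = T\,B(y)\,y = y\,(1 + y)^{1/2}, \qquad
f(y) = y^2 (1 + y) = |\phi(y)|^2 \quad (y > -1),
\]
% and phi'(0) = 1, so phi is a chart near 0.
```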

D. Global Study of Critical Points

We begin by defining handles and the attaching of handles.

4.83. Definition: For every cardinal number k we shall denote by $D^k$ the closed unit ball around 0 in a Hilbert space having an orthonormal basis of cardinality k. $\partial D^k$ will denote its boundary.


4.84. Definition: Let M and $\tilde M$ be two smooth manifolds (possibly with boundary).

We shall say that $\tilde M$ has been obtained from M by attaching a handle of type (k, l) if the following conditions are satisfied:

1. M is a regularly embedded submanifold of $\tilde M$;
2. There exists a closed subset $H \subset \tilde M$ and a mapping $h : D^k \times D^l \to H$, such that:
2a. $M \cup H = \tilde M$,
2b. h is a homeomorphism,
2c. $h(\partial D^k \times D^l) = H \cap M \subset \partial M$,
2d. H - M is a submanifold of $\tilde M$ with boundary,
2e. the restriction $h\,|\,((D^k - \partial D^k) \times D^l)$ is a diffeomorphism of $(D^k - \partial D^k) \times D^l$ onto H - M, and
2f. the restriction $h\,|\,(\partial D^k \times D^l)$ is a regular embedding of $\partial D^k \times D^l$ into M.

In this situation, we use the following notation for $\tilde M$: $\tilde M = M \cup_h H$ (k, l).

Remark: Obviously $\dim \tilde M = k + l$.

We shall say that H is a handle.

As a generalization, we state the following definition:

4.85. Definition: We shall say that $\tilde M$ has been obtained from M by attaching n handles of types $(k_1, l_1), \dots, (k_n, l_n)$ if, separately attaching n such handles $H_1, \dots, H_n$ to M in such a way that $H_i \cap H_j = \emptyset$ for $i \ne j$, the manifold obtained is $\tilde M$.

In this situation, we use the following notation for $\tilde M$: $\tilde M = M \cup_{h_1} H_1 \dots \cup_{h_n} H_n$. (See next figure.)


Assume now that f is a smooth real-valued function defined on some Riemannian manifold M (modelled on E) and that $x \in M$ is a non-degenerate critical point of f. By 4.82, f may be represented, in some chart $\phi$ around x, as

$$f(y) = f(x) + |P\phi y|^2 - |(1 - P)\phi y|^2,$$

where P is a projector in E.

4.86. Definition: The index of f at x (or the index of x, if there is no confusion) is the pair (k, l), where $k = \dim(PE)$, $l = \dim((1 - P)E)$.

Of course both k and l may be infinite cardinal numbers.

We arrive now at the most important theorems of this section.

The symbol $\sim$ will mean "is diffeomorphic to".

Let (M, g) be a complete Riemannian manifold.

Given $f \in C^\infty(M)$, then for every $s, t \in R$, define

$$[f \le s] = \{x \in M;\ f(x) \le s\}$$
$$[s \le f \le t] = \{x \in M;\ s \le f(x) \le t\}.$$

4.87. Theorem (Critical Neck Principle): Assume that $f \in C^\infty(M)$ and that the only critical level of f between a and b is c, a < c < b; a and b are supposed to be non-critical. Assume also that f satisfies the P-S condition on $[a \le f \le b]$ and that its critical points $p_1, \dots, p_n$ (at level c) are non-degenerate, their indices being $(k_i, l_i)$, $i = 1, \dots, n$. Then

[*] $[f \le b] \sim [f \le a] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$

where (for every $i = 1, \dots, n$), $H_i$ is a $(k_i, l_i)$ handle.

Remark: The number of critical points is finite.


Proof: 1. Considering f - c instead of f, we may assume that c = 0, a < 0 < b.

2. Moreover, by 4.72, it follows that

$$[f \le s] \sim [f \le t] \quad \text{if } a \le s, t < 0$$

and

$$[f \le s] \sim [f \le t] \quad \text{if } 0 < s, t \le b.$$

Hence [*] is equivalent to

$$[f \le t] \sim [f \le s] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$$

for some s, t, with $a \le s < 0 < t \le b$.

3. Consider now charts $\phi_1, \dots, \phi_n$ around $p_1, \dots, p_n$, sending $p_i$ into $0 \in E$ and such that, for x near $p_i$,

$$f(x) = \|P_i\phi_i x\|^2 - \|Q_i\phi_i x\|^2,$$

where $P_i = I - Q_i$ is a projector of E. The existence of such charts follows from 4.82.

The ball of radius 10r is mapped by $\phi_i^{-1}$ into a neighborhood $U_i(r)$ of $p_i$, and the chart $\phi_i' : x \mapsto \frac{1}{r}\phi_i(x)$ sends $U_i(r)$ onto the ball of radius 10 in E: call this ball B. We have

$$f(x) = r^2\,(\|P_i\phi_i' x\|^2 - \|Q_i\phi_i' x\|^2).$$

If r is chosen small enough, we have $ar^{-2} < -1$ and $2 < br^{-2}$. Replacing f by $r^{-2}f$ and $\phi_i$ by $\phi_i'$, we reduce the problem to the case in which there are charts $\phi_1, \dots, \phi_n$ around $p_1, \dots, p_n$ whose domains $U_1, \dots, U_n$ have disjoint closures and which are all mapped onto the ball of radius 10 around $0 \in E$.


3. Now f may be expressed near $p_i$ as

$$f(x) = \|P_i\phi_i x\|^2 - \|Q_i\phi_i x\|^2,$$

where $P_i = I - Q_i$ is a projector of E.

All the levels c between $\alpha$ and $\beta$ are non-critical ($\alpha$ and $\beta$ included) except c = 0, where $\alpha$ and $\beta$ satisfy $\alpha < -1$, $2 < \beta$.

4. The old sets $[f \le a]$ and $[f \le b]$ correspond to the new levels $\alpha$ and $\beta$ and then coincide with the new $[f \le \alpha]$ and $[f \le \beta]$; but, from 2. we have $[f \le \alpha] \sim [f \le -1]$ and $[f \le 2] \sim [f \le \beta]$ and hence [*] now becomes

(4a) $[f \le 2] \sim [f \le -1] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$.

5. To simplify notations, define $u_i$ and $v_i$ on $U_i$ by

$$u_i(x) = \|P_i\phi_i x\|$$

$$v_i(x) = \|Q_i\phi_i x\|.$$

Then we have

$$f(x) = (u_i(x))^2 - (v_i(x))^2, \qquad x \in U_i.$$

6. The remaining proof will depend on the existence of a function $\Lambda$ having the following properties:

(6a) $\Lambda \in C^\infty(M)$;

(6b) $\Lambda \ge 0$; $\Lambda = 0$ outside $U_1 \cup \dots \cup U_n$;

(6c) $\Lambda(p_i) = \tfrac32$, $i = 1, \dots, n$;

(6d) if $f(x) \ge 2$, then $\Lambda(x) = 0$.

Put $g = f - \Lambda$; we also require that

(6h) g has the same critical points as f;

(6j) g satisfies the P-S condition;

(6k) $[f \le -1] \cup_{h_1} H_1 \dots \cup_{h_n} H_n = [g \le -1]$.

7. Let us first see why the existence of such a $\Lambda$ implies (4a) and hence the theorem. First of all, from (6b) and (6d) it follows that

(7a) $[f \le 2] = [g \le 2]$.

On the other hand, by (6c), $g(p_i) = -\tfrac32 < -1$. Hence, by (6h), all the levels between -1 and 2 are non-critical for g.

That means (see 4.72) that

(7b) $[g \le -1] \sim [g \le 2]$.

Finally, from (6k), (7a) and (7b) we obtain (4a), as desired.

8. Existence of $\Lambda$

Let $\lambda$ and $\eta$ be two smooth real functions of a real variable as in the diagrams below:

[Figure: graphs of $\lambda(x)$ and $\eta(x)$; the horizontal axis of the graph of $\eta$ is marked at 2 and 8.]

We also assume that

(8a) $\eta' > -\tfrac23$.

Let us denote by $p, \phi, U, P, Q, u, v$ a particular (but not specified) set $p_i, \phi_i, U_i, P_i, Q_i, u_i, v_i$, $i = 1, \dots, n$. Let us agree, moreover, that whenever x and $\phi$, U, P, Q, u or v appear in the same formula, then the choice is determined by $x \in U$, and u, v mean u(x), v(x), respectively.

Define $\Lambda$ by

(8b) $\Lambda(x) = \tfrac32\,\lambda(u^2)\,\eta(v^2)$ if $x \in U_1 \cup \dots \cup U_n$; $\Lambda(x) = 0$ otherwise.

Clearly, $\Lambda$ is smooth (because $\Lambda = 0$ outside $\phi^{-1}$(ball of radius 3)) and non-negative.

Put $g = f - \Lambda$.

9. Proof of the desired properties (6a) - (6k)

Properties 6a, b and c are clearly satisfied. Moreover, since $f = u^2 - v^2$, if $f \ge 2$, necessarily $u^2 \ge 2$ and then $\lambda(u^2) = 0$. Thus we have (6d).

Since $\nabla g = \nabla f - \nabla\Lambda$, $x \in M$ is a critical point of g if and only if $(\nabla f)_x = (\nabla\Lambda)_x$.

Now, for $x \in [-1 \le f \le 2]$ but outside $U_1 \cup \dots \cup U_n$, we have $(\nabla\Lambda)_x = 0$ and $(\nabla f)_x \ne 0$, because the only critical points of f are $p_1, \dots, p_n$.

That means that g (like f) has no critical points in $[-1 \le f \le 2]$ outside $U_1 \cup \dots \cup U_n$.

Assume now that $x \in U$, $x \ne p$. Then the differential of g is

(9.1) $g_*(x) = 2u\,\big(1 - \tfrac32\lambda'(u^2)\,\eta(v^2)\big)\,u_*(x) + 2v\,\big(-1 - \tfrac32\lambda(u^2)\,\eta'(v^2)\big)\,v_*(x)$.

We shall verify that $g_*(x) \ne 0$.

In order to prove this we need the following statement.

(9.2) If u(x) and v(x) are both non-zero, then $u_*(x)$ and $v_*(x)$ are linearly independent.

In fact, from $u(x) \ne 0$, $v(x) \ne 0$ it follows that $e = P\phi x$ and $e_1 = Q\phi x$ are both different from $0 \in E$. Now consider the curves

$$\alpha(t) = \phi^{-1}(\phi(x) + te)$$

$$\beta(t) = \phi^{-1}(\phi(x) + te_1).$$

Then

$$u_*(x)\,\alpha'(0) = \frac{d}{dt}\,(u \circ \alpha(t))\Big|_{t=0} = \frac{d}{dt}\,\|P\phi(\phi^{-1}(\phi(x) + te))\|\,\Big|_{t=0} = \frac{d}{dt}\,\|(1 + t)e\|\,\Big|_{t=0} = \|e\| \ne 0.$$

Similarly we may obtain

$$u_*(x)\,\beta'(0) = 0$$

$$v_*(x)\,\alpha'(0) = 0$$

$$v_*(x)\,\beta'(0) \ne 0.$$

That proves that $u_*(x)$ and $v_*(x)$ are linearly independent, establishing (9.2).

Assume now that $g_*(x) = 0$. If u = 0, then by (9.1),

$$g_*(x) = 2v\,\big(-1 - \tfrac32\lambda(u^2)\,\eta'(v^2)\big)\,v_*(x) = 0.$$

Since $x \ne p$ and u = 0, it must be that $v_*(x) \ne 0$, and from $\eta'(v^2) > -\tfrac23$, $0 \le \lambda \le 1$, it follows that $2v\,(-1 - \tfrac32\lambda(u^2)\,\eta'(v^2)) = 0$ is only satisfied when v = 0 also, contradicting $x \ne p$.

The case v = 0 is treated similarly.

Finally, if $u \ne 0$, $v \ne 0$, then $u_*(x)$ and $v_*(x)$ are linearly independent by (9.2) and $g_*(x) = 0$ implies $2v\,(-1 - \tfrac32\lambda(u^2)\,\eta'(v^2)) = 0$, which we have seen to be true only if v = 0, while we are assuming $v \ne 0$. Thus g has no critical points other than $p_1, \dots, p_n$, which are, indeed, critical because $\Lambda$ is constant on $u < \sqrt2/2$, $v < \sqrt2$, whence $(\nabla g)_{p_i} = (\nabla f)_{p_i} = 0$.

Thus, property (6h) is satisfied by $\Lambda$.

In order to establish (6k) we observe that outside $U_1 \cup \dots \cup U_n$, f and g coincide by (6b). We shall consider each $U_i$ separately and prove that

(9.3) $U \cap [f \le -1] \cup_h H = U \cap [g \le -1]$.

Clearly this implies (6k).Of course we shall deal with (9.3) after transposing it into E by means

oft.Put X = P(E), Y = Q(E), Q = I - P, and B = open ball of radius 10

around 0 e E. Clearly E = X ® Y, X 1 Y. We shall let x, y denote elementsin X and Y respectively and x2, y2 denote x2 = 1x12, y2 = Jy12. If e, x and yappear in the same formula, it must be understood that e e E, e = x + yand x e X, y e Y. Applying 0, statement (9.3) may be written as follows.

(9.4) In B, the submanifolds:

(9.4.1) V = [x2 - y2 < -1]

(9.4.2) W = [x2 - y2 - 1 2(x2) rl(y2) < -1 ]


satisfy

$$V \cup_h H = W, \tag{9.4.3}$$

where $H$ is a $(k, l)$ handle.

The proof depends on elementary (though sly) computations. First of all we observe that $W$ may be defined more conveniently as

$$W = [x^2 - y^2 - \tfrac{3}{2}\lambda(x^2) \le -1]. \tag{9.5}$$

In fact, if $y^2 \le 2$, then $\eta(y^2) = 1$ and

$$x^2 - y^2 - \tfrac{3}{2}\lambda(x^2)\,\eta(y^2) = x^2 - y^2 - \tfrac{3}{2}\lambda(x^2);$$

if $y^2 > 2$ and $x^2 \ge 1$, then

$$x^2 - y^2 - \tfrac{3}{2}\lambda(x^2)\,\eta(y^2) = x^2 - y^2 - \tfrac{3}{2}\lambda(x^2) = x^2 - y^2;$$

finally, if $y^2 > 2$ and $x^2 < 1$, then necessarily $x^2 - y^2 \le -1$ and consequently

$$x^2 - y^2 - \tfrac{3}{2}\lambda(x^2)\,\eta(y^2) \le -1, \qquad x^2 - y^2 - \tfrac{3}{2}\lambda(x^2) \le -1.$$

This proves (9.5).

Define $K \subset B$ by

$$K = \{e \in B : x^2 + 1 - \tfrac{3}{2}\lambda(x^2) \le y^2 \le x^2 + 1\}, \tag{9.6}$$

so that $K$ is the set of elements $e \in B$ satisfying

$$x^2 + 1 - \tfrac{3}{2}\lambda(x^2) \le y^2 \tag{9.6.1}$$

and

$$y^2 \le x^2 + 1. \tag{9.6.2}$$

Let $D^k$ and $D^l$ be the unit balls of $Y$ and $X$ respectively and define $h : D^k \times D^l \to K$ by

$$h(y, x) = (\sigma(y^2))^{1/2}\, x + (1 + \sigma(y^2)\, x^2)^{1/2}\, y, \tag{9.7}$$

where $\sigma$ is the smooth function defined as follows: if $0 \le t \le 1$, $\sigma(t)$ is the unique solution in $[0, 1]$ of

$$\frac{3}{2}\,\frac{\lambda(\sigma(t))}{1 + \sigma(t)} = 1 - t. \tag{9.8}$$


This mapping is smooth and has the form shown in the following graph.

[Figure: graph of $\sigma$.]

Since $\sigma$ is smooth, $h$ is also smooth. The reader will check that the image

$$H = h(D^k \times D^l)$$

of $h$ is contained in $K$; this follows from the inequality

$$\sigma(y^2)\,x^2 + 1 - \tfrac{3}{2}\lambda(\sigma(y^2)\,x^2) \le (1 + \sigma(y^2)\,x^2)\, y^2,$$

which is a consequence of

$$1 - y^2 = \tfrac{3}{2}\lambda(\sigma(y^2))\,(1 + \sigma(y^2))^{-1} \le \tfrac{3}{2}\lambda(\sigma(y^2)\,x^2)\,(1 + \sigma(y^2)\,x^2)^{-1}.$$

It follows from this inequality that $H \subset W$; the other condition ($y^2 \le 1 + x^2$) is even easier and is left to the reader.
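The construction of $\sigma$ and $h$ can be checked numerically in the simplest case where $X$ and $Y$ are both one-dimensional. The sketch below is only an illustration under stated assumptions: the cutoff `lam` is a hypothetical choice (equal to 1 on $[0, 1/2]$, equal to 0 on $[1, \infty)$, decreasing in between) standing in for the function $\lambda$ of the text, the coefficient $3/2$ is read off from (9.8), and `sigma` solves (9.8) by bisection.

```python
import math

def lam(s):
    """Assumed cutoff: 1 on [0, 1/2], 0 on [1, inf), smooth and decreasing between."""
    if s <= 0.5:
        return 1.0
    if s >= 1.0:
        return 0.0
    u = (s - 0.5) / 0.5                      # rescale (1/2, 1) to (0, 1)
    g = lambda r: math.exp(-1.0 / r)
    return g(1.0 - u) / (g(u) + g(1.0 - u))

def sigma(t, iters=80):
    """Solve (3/2) * lam(s) / (1 + s) = 1 - t for s in [1/2, 1] by bisection."""
    lo, hi = 0.5, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.5 * lam(mid) / (1.0 + mid) > 1.0 - t:
            lo = mid                         # left side still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h(y, x):
    """Scalar version of (9.7): returns the (X, Y) components of h(y, x)."""
    s = sigma(y * y)
    return math.sqrt(s) * x, math.sqrt(1.0 + s * x * x) * y

def in_K(xe, ye, tol=1e-9):
    """Conditions (9.6.1) and (9.6.2), up to a numerical tolerance."""
    x2, y2 = xe * xe, ye * ye
    return x2 + 1.0 - 1.5 * lam(x2) <= y2 + tol and y2 <= x2 + 1.0 + tol

# Sample the unit square D^k x D^l (both one-dimensional here).
grid = [i / 10.0 for i in range(-10, 11)]
all_in_K = all(in_K(*h(y, x)) for y in grid for x in grid)
```

With this choice of `lam`, `sigma(0)` comes out as 1/2 and `sigma` increases towards 1, which is the range of values used in the argument that $H = K \cap [x^2 \le 1]$.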

Now consider the function $S : H \to X \times Y$ defined by

$$S(e) = \left((1 + x^2)^{-1/2}\, y,\ \left[\sigma\!\left(\frac{y^2}{1 + x^2}\right)\right]^{-1/2} x\right). \tag{9.9}$$

$S$ is smooth and clearly $Sh$ = identity, $hS$ = identity.

In order to finish the proof it suffices to show that $H$ is a $(k, l)$ handle and that

$$V \cup H = W.$$

To do this, we first show that

$$H = K \cap [x^2 \le 1].$$

In fact, it is obvious that $H \subset K \cap [x^2 \le 1]$.

Assume now that $e = x + y \in K$ and $x^2 \le 1$. If $x^2 \le \tfrac12$, then $x^2 \le \sigma(y^2/(1 + x^2))$ is trivial. If $x^2 \ge \tfrac12$, then there exists $0 \le t \le 1$ such that $\sigma(t) = x^2$. From the formula

$$x^2 + 1 - \tfrac{3}{2}\lambda(x^2) = x^2 + 1 - \tfrac{3}{2}\lambda(\sigma(t)) \le y^2$$


it follows (use (9.8)) that

$$x^2 + 1 - (1 + \sigma(t))(1 - t) \le y^2,$$

so that

$$(1 + x^2) - (1 + x^2)(1 - t) \le y^2, \qquad (1 + x^2)\,t \le y^2,$$

whence $t \le y^2/(1 + x^2)$. But since $\sigma$ is increasing, we have

$$x^2 = \sigma(t) \le \sigma\!\left(\frac{y^2}{1 + x^2}\right),$$

and this shows that

(i) $x^2 \left[\sigma\!\left(\dfrac{y^2}{1 + x^2}\right)\right]^{-1} \le 1$.

On the other hand, it is obvious from $y^2 \le x^2 + 1$ that

(ii) $\left|(1 + x^2)^{-1/2}\, y\right| \le 1$.

Now (i) and (ii) together imply that $S(e) \in D^k \times D^l$ and hence

$$e = hS(e) \in h(D^k \times D^l) = H.$$

This proves that $H \supset K \cap [x^2 \le 1]$ and we conclude that $H = K \cap [x^2 \le 1]$.

As a corollary, it follows that

(9.10) $H$ is closed.

(This is the required Property 4.84, 2 from the definition of "handle".)

We now show that

$$V \cup H = W. \tag{9.11}$$

(This is Property 4.84, 2a.)

Plainly $V \cup H \subset W$. Let $e = x + y$ be a point such that

(a) $x^2 - y^2 - \tfrac{3}{2}\lambda(x^2) \le -1$.

If $e$ is not in $V$ we have

(b) $x^2 - y^2 > -1$.

(a) and (b) together imply that $\lambda(x^2) > 0$, whence

(c) $x^2 < 1$.

Now (a), (b) and (c) imply that $e$ belongs to $K \cap [x^2 \le 1] = H$, and we obtain the desired assertion $W \subset V \cup H$.


If $e \in H \cap V \subset K \cap V$, plainly $x^2 - y^2 = -1$, so $H \cap V \subset \partial V$ and $y^2 (1 + x^2)^{-1} = 1$, verifying (4.84.2c).

Similarly, if $e \in H - V$ then $x^2 - y^2 > -1$, so that $y^2 (x^2 + 1)^{-1} < 1$, which with (9.9) and (9.7) verifies (4.84.2d-2e). This completes the proof of (9.3) and of Theorem (4.87). Q.E.D.

E. The Morse Inequalities

We begin with a rapid review of homology theory. Our description will be based on the axiomatic characterization of the homology groups, as given for example in Eilenberg-Steenrod ([4]).

We denote by $(X, Y)$ a pair of topological spaces, $Y$ a subspace of $X$; $(X, \emptyset)$ is written as $X$. Under suitable restrictions on the class of spaces considered, we can associate with each pair $(X, Y)$ Abelian groups $H_k(X, Y)$, $k$ an integer ($H_k = \{0\}$ if $k < 0$). These groups will depend on a fixed group $G$ (the "coefficient group" of the theory), so we should denote them by $H_k(X, Y; G)$. We are mainly interested in two specific cases, namely $G$ = integers = $\mathbb{Z}$ or $G$ = real numbers = $\mathbb{R}$. In the case $G = \mathbb{R}$ the $H_k$ are vector spaces (over $\mathbb{R}$) and the Betti numbers $\beta_k(X, Y)$ of the pair $X$, $Y$ are defined by

$$\beta_k(X, Y) = \dim H_k(X, Y; \mathbb{R}). \tag{1.1}$$

We write

$$\phi : (X, Y) \to (\tilde X, \tilde Y)$$

if $\phi$ is continuous, $\phi : X \to \tilde X$, $\phi(Y) \subset \tilde Y$.

Any such function induces a homomorphism for each $k$

$$\phi_* : H_k(X, Y) \to H_k(\tilde X, \tilde Y)$$

having the following properties:

(a) $(\phi\psi)_* = \phi_* \psi_*$.

(b) If $i$ = identity, $i_*$ = identity.

(We can already deduce that homeomorphic pairs have isomorphic groups.)

(c) Let $\phi, \psi : (X, Y) \to (\tilde X, \tilde Y)$ be homotopic (here $\phi_t : (X, Y) \to (\tilde X, \tilde Y)$). Then $\phi_* = \psi_*$.

We say that $(X, Y)$ and $(\tilde X, \tilde Y)$ are homotopically equivalent if there exist $\phi : (X, Y) \to (\tilde X, \tilde Y)$ and $\psi : (\tilde X, \tilde Y) \to (X, Y)$ such that $\psi\phi$ and $\phi\psi$ are homotopic to the respective identity mappings. Property (c) implies that homotopically equivalent pairs have isomorphic homology groups.


(d) Excision property: Take a pair $(X, Y)$, let $O \subset Y$ be an open set such that also $\bar O \subset Y$. Let $i$ = identity map from $(X - O, Y - O)$ to $(X, Y)$. Then

$$i_* : H_k(X - O, Y - O) \to H_k(X, Y)$$

is an isomorphism onto.

(e) The homology groups are related to the coefficient group $G$ by

$$H_k(P) = \begin{cases} G & \text{if } k = 0 \\ \{0\} & \text{if } k > 0, \end{cases}$$

where $P = (P, \emptyset)$ is a space consisting of a single point.

(f) There exist maps $\partial_k : H_k(X, Y) \to H_{k-1}(Y)$ (which we write simply as $\partial$, omitting the subindex) such that, if $\phi : (X, Y) \to (\tilde X, \tilde Y)$, then

$$\partial \phi_* = (\phi|Y)_*\, \partial$$

(here we have designated by $\phi|Y$ the restriction of $\phi$ to $Y$, and by $(\phi|Y)_*$ the induced map on $H_k(Y) = H_k(Y, \emptyset)$).

(g) Exactness principle. Let $X \supseteq Y \supseteq \emptyset$; let

$$k : Y \to X, \qquad j : X \to (X, Y)$$

be inclusion maps, $j_*$, $k_*$ the induced maps on the homology groups. We can construct the sequence

$$\cdots \to H_k(X) \xrightarrow{\,j_*\,} H_k(X, Y) \xrightarrow{\,\partial\,} H_{k-1}(Y) \xrightarrow{\,k_*\,} H_{k-1}(X) \xrightarrow{\,j_*\,} H_{k-1}(X, Y) \to \cdots \to H_0(Y) \xrightarrow{\,k_*\,} H_0(X) \xrightarrow{\,j_*\,} H_0(X, Y) \to 0$$

(the homology sequence of the pair $X$, $Y$). The exactness principle tells us that the homology sequence of any pair $(X, Y)$ is exact (i.e. the image of any group in the sequence under the corresponding homomorphism is equal to the kernel of the next homomorphism).

We note the following results for future use.

1. Let $S^n$ be the $n$-sphere, $G = \mathbb{Z}$, $n \neq 0$. Then

$$H_k(S^n) = 0 \ \text{ if } k > 0,\ k \neq n, \qquad H_n(S^n) = \mathbb{Z}, \qquad H_0(S^n) = \mathbb{Z}.$$


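2. Let $G$ be as before, $D^n$ be the $n$-disk. Then

$$H_k(D^n, S^{n-1}) = H_k(D^n, \partial D^n) = \{0\} \ \text{ if } k \neq n, \qquad H_n(D^n, S^{n-1}) = \mathbb{Z}.$$

3. Suppose $(X, Y) = \bigcup_{i=1}^{l} (X_i, Y_i)$, all $X_i$ disjoint. Then

$$H_k(X, Y) = \sum_{i=1}^{l} \oplus\, H_k(X_i, Y_i).$$

Example (i). To illustrate the use of the exactness principle we will deduce 2) from 1) and (e). Consider

$$H_k(S^{n-1}) \xrightarrow{\,k_*\,} H_k(D^n) \xrightarrow{\,j_*\,} H_k(D^n, S^{n-1}) \xrightarrow{\,\partial\,} H_{k-1}(S^{n-1}) \xrightarrow{\,k_*\,} H_{k-1}(D^n).$$

Since $D^n$ is homotopic to a point for any $n$, we have $H_k(D^n) = \{0\}$ for any $k \neq 0$. Therefore, if $k > 0$, $k \neq n$, we get the sequence

$$\{0\} \to H_k(D^n, S^{n-1}) \xrightarrow{\,\partial\,} \{0\} \to \{0\},$$

whose exactness implies readily that $H_k(D^n, S^{n-1}) = \{0\}$.

On the other hand, if $k = n$, the sequence is

$$\{0\} \to H_n(D^n, S^{n-1}) \xrightarrow{\,\partial\,} \mathbb{Z} \to \{0\}.$$

This time, exactness implies that $H_n(D^n, S^{n-1}) = \mathbb{Z}$.

Example (ii). We now note a result more general than the exactness principle. Let $X \supseteq Y \supseteq Z$; consider the inclusion maps

$$j : (X, Z) \to (X, Y), \qquad k : (Y, Z) \to (X, Z), \qquad l : Y \to (Y, Z).$$

Using the induced mappings $j_*$, $k_*$, $l_*$ we can form the sequence

$$\cdots \to H_k(X, Z) \xrightarrow{\,j_*\,} H_k(X, Y) \xrightarrow{\,\partial'\,} H_{k-1}(Y, Z) \xrightarrow{\,k_*\,} H_{k-1}(X, Z) \xrightarrow{\,j_*\,} H_{k-1}(X, Y) \xrightarrow{\,\partial'\,} \cdots \xrightarrow{\,\partial'\,} H_0(Y, Z) \xrightarrow{\,k_*\,} H_0(X, Z) \xrightarrow{\,j_*\,} H_0(X, Y) \to \{0\} \tag{1.2}$$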


where $\partial'$ is the composition of $l_*$ and $\partial$, i.e.

$$\partial' = l_*\, \partial : H_k(X, Y) \xrightarrow{\,\partial\,} H_{k-1}(Y) \xrightarrow{\,l_*\,} H_{k-1}(Y, Z). \tag{1.3}$$

As an exercise the reader should prove, using the exactness principle, that (1.2) is an exact sequence.

Consider now the case $G = \mathbb{R}$; the homology groups are then vector spaces. Let $K_k(X, Y)$ be the subspace of $H_k(X, Y)$ which is either the image of the preceding homomorphism or the kernel of the next (similarly define $K_k(X, Z)$, etc.) and set $\varepsilon_k(X, Y) = \dim K_k(X, Y)$, etc. From the exactness of (1.2) we easily see that

$$\beta_k(X, Y) = \varepsilon_k(X, Y) + \varepsilon_{k-1}(Y, Z) \tag{1.4a}$$

$$\beta_k(Y, Z) = \varepsilon_k(Y, Z) + \varepsilon_k(X, Z) \tag{1.4b}$$

$$\beta_k(X, Z) = \varepsilon_k(X, Z) + \varepsilon_k(X, Y) \tag{1.4c}$$

Now,

$$\begin{aligned}
\sum_{j \le m} (-1)^j \beta_j(X, Z) &- \sum_{j \le m} (-1)^j \beta_j(X, Y) - \sum_{j \le m} (-1)^j \beta_j(Y, Z) \\
&= \sum_{j \le m} (-1)^j \left\{\varepsilon_j(X, Z) + \varepsilon_j(X, Y) - \varepsilon_j(X, Y) - \varepsilon_{j-1}(Y, Z) - \varepsilon_j(Y, Z) - \varepsilon_j(X, Z)\right\} \\
&= -\sum_{j \le m} (-1)^j \left(\varepsilon_j(Y, Z) + \varepsilon_{j-1}(Y, Z)\right) = (-1)^{m+1}\, \varepsilon_m(Y, Z).
\end{aligned} \tag{1.5}$$

Now define

$$\eta_m(X, Z) = (-1)^m \sum_{j \le m} (-1)^j \beta_j(X, Z). \tag{1.6}$$

Clearly it follows that

$$\eta_m(X, Z) = \eta_m(X, Y) + \eta_m(Y, Z) - \text{non-negative integer},$$

i.e.

$$\eta_m(X, Z) \le \eta_m(X, Y) + \eta_m(Y, Z). \tag{1.8}$$

We now apply these results to Morse theory. Let $M$ be a Hilbert manifold, $f$ a smooth function on $M$ satisfying the P-S condition on $a \le f \le b$; let $c$, $a < c < b$, be its only critical level, and suppose that the critical points of $f$, $p_1, \ldots, p_n$, are non-degenerate, their indices being $(n_i, m_i)$, $i = 1, \ldots, n$.


We know, by Theorem 4.87, that

$$[f \le b] \approx [f \le a] \cup_{h_1} H_1 \cup \ldots \cup_{h_n} H_n. \tag{1.9}$$

Our aim is to compute $H_k([f \le b], [f \le a])$.

Theorem 4.88.

$$H_k(\{f \le b\}, \{f \le a\}) = \sum_{i=1}^{n} \oplus\, \delta_{k n_i}\, \mathbb{Z}. \tag{1.10}$$

Before beginning the proof, let us note the surprising fact (R. S. Palais ([3], p. 336)) that according to Theorem 4.88 the homotopy type of the pair $(\{f \le b\}, \{f \le a\})$ will depend only on the critical points with finite index at intermediate levels, those with infinite index being "homotopically invisible". This unexpected fact saves us from making the rather inelegant assumption of finiteness of the indices.

On to the proof! By Theorem 4.87, we have

$$\{f \le b\} = \{f \le a\} \cup_{h_1} H_1 \cup_{h_2} H_2 \cup \ldots \cup_{h_n} H_n,$$

$h_i : D^{n_i} \times D^{m_i} \to H_i$. Let $R^{n_i}$ denote $D^{n_i}$ with a ball removed from its interior, and let $\mathcal{R}_i = h_i(R^{n_i} \times D^{m_i})$. It is easy to see that the pair $(\{f \le b\}, \{f \le a\})$ is homotopic to $(\{f \le b\}, \{f \le a\} \cup \mathcal{R}_1 \cup \ldots \cup \mathcal{R}_n)$. By excision, this pair will have the same homology groups as $(\{f \le b\} - \{f < a\}, \mathcal{R}_1 \cup \ldots \cup \mathcal{R}_n)$ and also the same groups as $(H_1 \cup \ldots \cup H_n, \mathcal{R}_1 \cup \ldots \cup \mathcal{R}_n)$.

But by a previous observation, the homology groups of the last pair are

$$\sum_i H_k(H_i, \mathcal{R}_i) = \sum_i H_k(D^{n_i}, \partial D^{n_i}).$$

By another of our observations,

$$H_k(D^{n_i}, \partial D^{n_i}) = \delta_{k n_i}\, \mathbb{Z}, \qquad n_i < \infty,$$

and the same formula holds for $n_i = \infty$, since $D^{\infty}$ is homotopically trivial modulo $\partial D^{\infty}$ in this case. This completes the proof.

Corollary: $\beta_k(\{f \le b\}, \{f \le a\})$ = number of critical points of type $(k, \infty)$ between $a$ and $b$.

We can now obtain the best known results in Morse theory.

Theorem 4.89 (Morse Inequalities): Let $M$ be a complete Hilbert manifold, $f$ a smooth function satisfying the P-S condition on $a \le f \le b$, and suppose that $a$, $b$ are regular values of $f$ and that all critical points of $f$ are non-degenerate. For each non-negative integer $m$ let $\beta_m$ be the $m$-th Betti number of the


pair $(\{f \le b\}, \{f \le a\})$, and let $c_m$ denote the number of critical points of $f$ of index $(m, \infty)$ in $f^{-1}([a, b])$. Then

$$\beta_0 \le c_0, \qquad \beta_1 - \beta_0 \le c_1 - c_0, \qquad \ldots, \qquad \sum_{m=0}^{k} (-1)^{k-m} \beta_m \le \sum_{m=0}^{k} (-1)^{k-m} c_m, \tag{1.11}$$

and

$$\sum_{m=0}^{\infty} (-1)^m \beta_m = \sum_{m=0}^{\infty} (-1)^m c_m. \tag{1.12}$$

Corollary 1. $\beta_m \le c_m$ for all $m$.

Corollary 2. If $f$ is bounded below, then the conclusions of the theorem and of Corollary 1 remain valid if we interpret $\beta_m$ = $m$-th Betti number of $[f \le b]$ and $c_m$ = number of critical points of $f$ having index $m$ in $\{f \le b\}$ respectively.

Proof of Theorem 4.89. Let $c_1 < c_2 < \ldots < c_n$ be the critical values of $f$ in $[a, b]$. Choose $a_i$, $i = 0, 1, \ldots, n$, so that $a = a_0 < c_1 < a_1 < c_2 < \ldots < c_n < a_n = b$ and call $X_i = \{f \le a_i\}$. Then by the corollary to Theorem 4.88 it follows that $\beta_k(X_{i+1}, X_i)$ = number of critical points of index $k$ in the level $c_{i+1}$. We thus have (see (1.6))

$$\eta_m(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{k-m} \beta_k(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{k-m}\, (\text{number of critical points of index } k \text{ on level } c_{i+1}),$$

so that

$$\sum_{i} \eta_m(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{k-m}\, (\text{number of critical points of index } k \text{ in } f^{-1}([a, b])).$$

But an iteration of (1.8) yields

$$\eta_m(\{f \le b\}, \{f \le a\}) \le \sum_{i} \eta_m(X_{i+1}, X_i)$$

and by definition of $\eta_m$, (1.11) follows. Equation (1.12) can be obtained by taking $m$ large enough.

The proof of Corollary 1 is trivial, and the proof of 2 follows easily from Theorem 4.89.
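As an illustration of how (1.11) and (1.12) are used in practice, the following sketch checks the inequalities for finite lists of Betti numbers and critical-point counts. The function names are hypothetical, and the sample data (the torus with a perfect Morse function, the 2-sphere with an extra minimum and saddle) are standard examples chosen here, not taken from the text.

```python
def eta(betti, m):
    """eta_m = sum over j <= m of (-1)^(m-j) * beta_j, as in (1.6)."""
    return sum((-1) ** (m - j) * b for j, b in enumerate(betti[: m + 1]))

def morse_inequalities_hold(betti, crit):
    """Check (1.11) for every m, and the alternating-sum equality (1.12)."""
    n = max(len(betti), len(crit))
    b = list(betti) + [0] * (n - len(betti))   # pad with zeros to equal length
    c = list(crit) + [0] * (n - len(crit))
    strong = all(eta(b, m) <= eta(c, m) for m in range(n))
    euler = sum((-1) ** j * (bj - cj) for j, (bj, cj) in enumerate(zip(b, c))) == 0
    return strong and euler

# Torus: beta = (1, 2, 1); a height function with 1 minimum, 2 saddles, 1 maximum.
torus_ok = morse_inequalities_hold([1, 2, 1], [1, 2, 1])
# 2-sphere: beta = (1, 0, 1); a function with 2 minima, 1 saddle, 1 maximum.
sphere_ok = morse_inequalities_hold([1, 0, 1], [2, 1, 1])
# Impossible data: a function on the torus with only one saddle.
bad = morse_inequalities_hold([1, 2, 1], [1, 1, 1])
```

The last case fails already at $m = 1$, since $\beta_1 - \beta_0 = 1$ while $c_1 - c_0 = 0$; this is how the inequalities force the existence of critical points.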


Bibliography

1. Lang, Serge, Introduction to Differential Manifolds (Interscience, John Wiley, New York,1962).

2. Palais, Richard, "Morse theory on Hilbert manifolds", Topology Vol. 2, pp. 299-340(Dec. 1963).

3. Palais, R. and Smale, S., "A generalized Morse theory", Bull. A.M.S. Vol. 70, pp.165-172 (January 1964).

4. Eilenberg, Steenrod, Foundations of Algebraic Topology (Princeton University Press,1953).

5. Milnor, Morse Theory (Princeton University Press, 1963).


CHAPTER V

Category

A. Definition and Elementary Properties

B. Category and Homology

C. Category and Calculus of Variations in the Large

A. Definition and Elementary Properties

Remark: We introduce our concepts for arbitrary topological spaces $X$, but in the applications $X$ will be a manifold.

5.1. Definition: The closed set $A \subseteq X$ is said to be of first category with respect to $X$ (in symbols $\mathrm{cat}_X(A) = 1$ or simply $\mathrm{cat}(A) = 1$) if the injection $i : A \to X$ is homotopic to a constant.

5.2. Definition: The closed set $A \subseteq X$ is said to be of $k$-th category with respect to $X$ ($\mathrm{cat}_X(A) = \mathrm{cat}(A) = k$) if

(a) $A = A_1 \cup \ldots \cup A_k$, $A_i$ closed and $\mathrm{cat}(A_i) = 1$, $1 \le i \le k$.

(b) If $A = A_1 \cup \ldots \cup A_l$, $l < k$, then some $A_i$ is not of the first category.

The closed set $A \subseteq X$ is said to have infinite category with respect to $X$ ($\mathrm{cat}_X(A) = \mathrm{cat}(A) = \infty$) if no decomposition of the form (a) in Def. (5.2) is possible.

Examples: Let $E^n$, $E^\infty$ denote the Euclidean $n$-space and separable Hilbert space respectively. Then, if $A = \{x : |x| = 1\}$,

$$\mathrm{cat}_{E^n}(A) = 1, \qquad 1 \le n \le \infty,$$

$$\mathrm{cat}_A(A) = \begin{cases} 2 & \text{for } 1 \le n < \infty \\ 1 & \text{for } n = \infty. \end{cases}$$

We now prove some elementary properties of the notion of category.


5.3. Lemma: Let $A$, $B$ be closed subsets of $X$. Then

(5.3.1) $\mathrm{cat}(A \cup B) \le \mathrm{cat}(A) + \mathrm{cat}(B)$.

(5.3.2) $A \subseteq B$ implies $\mathrm{cat}(A) \le \mathrm{cat}(B)$.

(5.3.3) Let $I$ be the closed unit interval, $\eta : X \times I \to X$ a continuous function such that $\eta(x, 0) = x$ for $x \in X$. Then if $\eta(x, 1) = \eta_1(x)$, $\mathrm{cat}(A) \le \mathrm{cat}(\eta_1(A))$.

Proof: (5.3.1) and (5.3.2) are trivial. To prove (5.3.3) we can evidently suppose that $\mathrm{cat}(\eta_1(A))$ is finite, say, $k$. Then $\eta_1(A) = B_1 \cup B_2 \cup \ldots \cup B_k$, $B_i$ closed and $\mathrm{cat}(B_i) = 1$. Let $A_i = \eta_1^{-1}(B_i) \cap A$. Then $A = A_1 \cup \ldots \cup A_k$. The identity mapping in each $A_i$ is clearly homotopic to a constant, so $\mathrm{cat}(A) \le k$.

Q.E.D.

5.4. Definition: We recall that the dimension of a compact metric space $X$ (in symbols, $\dim X$) is equal to $n$ iff

(a) Every open covering $\{U_i\}$ of $X$ has a refinement $\{V_i\}$ such that the order of $\{V_i\}$ is not greater than $n$, i.e., such that no $n + 2$ of the $V_i$ have non-void intersection.

(b) There exists a covering $\{U_i\}$ such that for every refinement $\{V_i\}$ of $\{U_i\}$ there exist $V_{i_1}, \ldots, V_{i_{n+1}}$ in $\{V_i\}$ with non-void intersection (thus $\{V_i\}$ has order $n$).

A fundamental relation between category and dimension is given by

5.5. Theorem: Let $M$ be a Hilbert manifold, $A \subseteq M$ a compact set. Then

$$\mathrm{cat}(A) \le \dim(A) + 1. \tag{5.5}$$

First we prove

5.6. Lemma: Let $M$ be a Hilbert manifold, $A \subseteq M$ a compact set; let $\eta : I \times A \to M$ be a homotopy between the identity mapping and a constant mapping ($\eta(0, x) = x$, $\eta(1, x)$ = constant). Then there is an extension $\tilde\eta$ of $\eta$ to $I \times U$, $U$ a neighborhood of $A$, such that $\tilde\eta$ is again a homotopy between the identity mapping (in $U$) and a constant.

From Lemma 5.6 we deduce immediately:

Corollary: For $A$, $M$ as in the lemma, suppose $\mathrm{cat}_M(A) \le k$. Then there exists a neighborhood $U$ of $A$ such that $\mathrm{cat}_M(\bar U) \le k$.

Proof of Lemma 5.6: Embed $M$ as a regular submanifold of some Hilbert space $H$. Let $\Gamma \subset H \times H$ denote the set of all pairs $(p, x)$, $p \in M$, $x \perp M_p$


($M_p$ the tangent space to $M$ at $p$). Let $\alpha : \Gamma \to H$, $\psi : \Gamma \to H$ be defined by

$$\alpha(p, x) = p, \qquad \psi(p, x) = p + x.$$

Let $[A] = \bigcup_{0 \le t \le 1} \eta(t, A) \supseteq A$. It is easy to see, using the implicit function theorem, that there exists a neighborhood $V$ of $[A]$ and an $\varepsilon > 0$ such that, for $(p, x) \in \Gamma$, $p \in V$, $|x| < \varepsilon$, $\psi(p, x)$ covers a neighborhood $O$ of $[A]$ in $H$ and is smoothly invertible there. Let $\psi^{-1}$ be the inverse, and let $\phi = \alpha\psi^{-1}$. Then $\phi$ is a smooth map from $O$ into $M$, and $\phi|[A]$ = identity.

It is easy to see that $\eta$ can be extended to $\hat\eta$ defined in $I \times M$ with values in $H$ in such a way that $\hat\eta(0, x) = x$, $x \in M$, $\hat\eta(1, x)$ = constant. Choose a neighborhood $U$ of $A$ such that $\hat\eta(t, U) \subset O$ and set

$$\tilde\eta(t, p) = \phi(\hat\eta(t, p)), \qquad p \in U,\ 0 \le t \le 1.$$

Then $\tilde\eta(0, p) = p$, $\tilde\eta(1, p)$ = constant, $\tilde\eta$ is $M$-valued, continuous and defined in $I \times U$. Q.E.D.

Proof of Theorem 5.5: Let $\{U_j\}$ be any covering of $A$. Then there exists a refinement $\{V_k\}$ of $\{U_j\}$ such that $\mathrm{cat}_M(\bar V_k) = 1$ for all $k$ (for instance, a refinement consisting of coordinate patches).

Since $\dim A = n$ and by our observation, we can find an open covering $\{U_j\}$ such that

(a) $\mathrm{cat}\, \bar U_j = 1$.

(b) $\bigcap_{s=1}^{n+2} U_{j_s} = \emptyset$ for any $(n + 2)$-ple of sets in $\{U_j\}$.

(c) There exist $U_{j_1}, \ldots, U_{j_{n+1}}$ such that $\bigcap_{s=1}^{n+1} U_{j_s} \neq \emptyset$.

Let $U = \bigcup \left(\bigcap_{s=1}^{n+1} U_{j_s},\ U_{j_s} \in \{U_j\}\right)$, where the intersection is taken over sets $\{j_s\}$ of distinct indices.

By (c), $U \neq \emptyset$. By (a) each of the intersections has category 1, for it is contained in some $U_j$. Hence $U$ itself has category 1, since it is the disjoint union of sets of category 1. Observe that

$$A = (A - U) \cup U.$$


$A - U \subset \bigcup \{U_j - U\}$ and it is easy to see that $\{U_j - U\}$ satisfies (b) with $n + 1$ replacing $n + 2$. Indeed, since every possible intersection of $n + 1$ different sets $U_j$ has been subtracted from $U_j$, the intersection of $n + 1$ different sets $U_j - U$ with different indices must be empty. Proceeding by induction (the case $n = 1$ being trivial), $\mathrm{cat}_M(A) \le n + \mathrm{cat}\, U \le n + 1$.

Q.E.D.

B. Category and Homology

We now define the singular homology groups of a topological space. These amount to a concrete realisation of the homology groups considered axiomatically at the end of Chapter 4. We will define cubical homology groups rather than the usual simplicial ones.

Let $I = [-1, +1]$.

5.7. Definition: A singular $n$-cube in the topological space $X$ is a continuous mapping $\phi : I^n \to X$.

5.8. Definition: A singular $n$-cube is called degenerate if $\phi$ does not depend on all of its coordinates.

If $\phi(x_1, \ldots, x_k, \ldots, x_n) = \phi(x_1, \ldots, x_k', \ldots, x_n)$ we say that $\phi$ is degenerate along its $k$-th coordinate.

Let $Q_n(X)$ be the free Abelian group generated by all the singular $n$-cubes in $X$. Let $D_n(X)$ be the free Abelian group generated by all degenerate $n$-cubes in $X$. Then

5.9. Definition: $C_n(X)$ = $n$-th singular cubic chain group is defined as

$$C_n(X) = Q_n(X)/D_n(X).$$

Given a singular $n$-cube $\phi$, we obtain from it two $(n-1)$-cubes (the $k$-th faces of $\phi$), $\phi_k^+$, $\phi_k^-$, as follows.

Given $(x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n) \in I^{n-1}$, define

$$\phi_k^+ : (x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n) \to \phi(x_1, \ldots, x_{k-1}, 1, x_{k+1}, \ldots, x_n)$$

and

$$\phi_k^- : (x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_n) \to \phi(x_1, \ldots, x_{k-1}, -1, x_{k+1}, \ldots, x_n).$$

We can then define a boundary operator as follows. If $\phi$ is an $n$-cube,

$$\partial\phi = \sum_{k=1}^{n} (-1)^k (\phi_k^+ - \phi_k^-),$$

and $\partial$ is extended linearly to arbitrary $n$-chains. Since (as the reader may easily verify) $\partial : D_n(X) \to D_{n-1}(X)$, we can define a boundary $\partial$ in $C_n(X)$


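(Definition 5.7 should be compared)

$$\partial : C_n(X) \to C_{n-1}(X).$$

5.10. Lemma: $\partial\partial = 0$.

Proof: It is evidently sufficient to consider only singular $n$-cubes $\phi : I^n \to X$. Writing $\phi_{k,l}^{*\circ}$ for the $l$-th face, with sign $\circ$, of the $k$-th face, with sign $*$, of $\phi$, we have

$$\begin{aligned}
\partial\partial\phi &= \sum_{k=1}^{n} (-1)^k\, \partial(\phi_k^+ - \phi_k^-) \\
&= \sum_{k=1}^{n} \sum_{l=1}^{n-1} (-1)^k (-1)^l \left(\phi_{k,l}^{++} - \phi_{k,l}^{+-}\right) - \sum_{k=1}^{n} \sum_{l=1}^{n-1} (-1)^k (-1)^l \left(\phi_{k,l}^{-+} - \phi_{k,l}^{--}\right).
\end{aligned}$$

Observe now that if $l < k$,

$$\phi_{k,l}^{*\circ} = \phi_{l,k-1}^{\circ *},$$

where $(*, \circ)$ stands for any combination of the signs $+$, $-$; and if $l \ge k$,

$$\phi_{k,l}^{*\circ} = \phi_{l+1,k}^{\circ *}.$$

This shows that $\partial\partial\phi$ will be an element of $D_{n-2}(X)$, and hence $\partial\partial\phi = 0$ in $C_{n-2}(X)$. Q.E.D.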
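The sign bookkeeping in this proof can be checked mechanically on the faces of the standard cube. The sketch below is a toy model, not the singular chain complex itself: a face of $I^n$ is encoded as a tuple whose entries are `None` (free coordinate) or $\pm 1$ (coordinate fixed at that end), and `boundary` applies $\partial\phi = \sum_k (-1)^k(\phi_k^+ - \phi_k^-)$, with $k$ counting the free coordinates. The encoding and the function names are assumptions made for this illustration.

```python
from collections import defaultdict

def faces(cube):
    """Yield (sign, face) pairs for the boundary of a formal cube.

    cube: tuple with None for a free coordinate and +1 / -1 for a fixed one.
    """
    free = [i for i, c in enumerate(cube) if c is None]
    for k, pos in enumerate(free, start=1):
        for eps in (+1, -1):
            face = cube[:pos] + (eps,) + cube[pos + 1:]
            # (-1)^k for the k-th free coordinate, times eps for the +/- face
            yield ((-1) ** k) * eps, face

def boundary(chain):
    """Extend the boundary linearly to integer combinations of formal cubes."""
    out = defaultdict(int)
    for cube, coeff in chain.items():
        for sign, face in faces(cube):
            out[face] += coeff * sign
    return {c: v for c, v in out.items() if v != 0}

# Apply the boundary twice to the top cube of I^n for small n.
double_boundaries = [boundary(boundary({(None,) * n: 1})) for n in range(1, 6)]
```

Every entry of `double_boundaries` is the empty chain: fixing coordinates $a < b$ in the two possible orders produces the same face with opposite signs, which is the cancellation described by the two index identities above.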

Having defined chain groups and boundary operators we can define homology groups in the usual way, i.e.

5.11. Definition: Let

$$Z_n(X) = \ker \partial \subset C_n(X), \qquad B_n(X) = \partial C_{n+1}(X) \subset C_n(X).$$

Then $H_n(X)$ = $n$-th (cubical) singular homology group of $X$ = $Z_n(X)/B_n(X)$.

To define relative cubical groups of a pair $(X, Y)$ offers no new difficulties; we simply define the $n$-th singular chain group of the pair $(X, Y)$ as

$$C_n(X, Y) = Q_n(X)/(D_n(X) + Q_n(Y)),$$

and the boundary operator $\partial$ as the relativization of the former $\partial$, and the $n$-th cubical singular homology group of $(X, Y)$ as

$$H_n(X, Y) = \frac{Z_n(X, Y)}{B_n(X, Y)}, \qquad Z_n(X, Y) = \ker \partial, \quad B_n(X, Y) = \partial C_{n+1}(X, Y).$$

The cohomology groups are constructed as usual starting from the cochain groups and the coboundary operators. The reader may benefit by consulting


Hocking-Young or better still Hilton-Wylie, and by proving some of the Eilenberg-Steenrod axioms for our cubical singular groups.

It can be seen (H. and Y., p. 306, H. and W., p. 362, B., p. 110) that a multiplicative structure can be introduced into the cohomology groups, by means of the so-called cup product. Since we are mainly interested in cohomology groups of finite-dimensional manifolds, we will define this product in terms of differential forms, via De Rham's theorem.

5.12. Definition: Let $M$ be an $n$-manifold, $Y \subset M$. Let $C^k(M, Y)$ denote the vector space of all smooth exterior forms on $M$ of degree $k$ which vanish in a neighborhood of $Y$. The coboundary operator is here simply the differential $d$ as ordinarily defined for exterior forms,

$$d : C^k(M, Y) \to C^{k+1}(M, Y).$$

As usual $Z^k(M, Y) = \{\omega \in C^k(M, Y) : d\omega = 0\}$, $B^k(M, Y) = dC^{k-1}(M, Y)$ and

5.13. Definition: $H^k(M, Y) = Z^k(M, Y)/B^k(M, Y)$.

De Rham's theorem essentially states that these cohomology groups defined by means of differential forms coincide with the cohomology groups of $M$ defined by the singular cubical groups introduced above.

Since differential forms can be multiplied, the definition of cup product is already in view. Let $Y_1, Y_2 \subseteq M$, $\omega_1 \in C^{k_1}(M, Y_1)$, $\omega_2 \in C^{k_2}(M, Y_2)$. Then $\omega_1 \wedge \omega_2 \in C^{k_1 + k_2}(M, Y_1 \cup Y_2)$. Suppose that $\omega_1$, $\omega_2$ are cocycles, i.e. that $d\omega_1 = d\omega_2 = 0$. Then

$$d(\omega_1 \wedge \omega_2) = d\omega_1 \wedge \omega_2 + (-1)^{\deg \omega_1}\, \omega_1 \wedge d\omega_2 = 0.$$

If $\omega_1$ is a cocycle and $\omega_2$ is a coboundary, so that $\omega_2 = d\omega$, then

$$d(\omega_1 \wedge \omega) = d\omega_1 \wedge \omega + (-1)^{\deg \omega_1}\, \omega_1 \wedge \omega_2 = \pm\, \omega_1 \wedge \omega_2,$$

which shows that $\omega_1 \wedge \omega_2$ is a coboundary. This allows us to define a product (we will use the sign $\cup$) in the cohomology groups, operating as follows:

$$H^{k_1}(M, Y_1) \times H^{k_2}(M, Y_2) \to H^{k_1 + k_2}(M, Y_1 \cup Y_2).$$

We are going to establish now some rather interesting relations between the cohomology structure of a pair $(M, Y)$ and the category of $Y$.

5.14. Theorem: Let $\mathrm{cat}(X) = n$. Then any cup product of $n$ elements of degree $> 0$ vanishes.


This theorem can be reformulated as follows. Define cuplength $(X)$ = greatest number of elements of non-zero degree with nonvanishing cup product. Then we have the following corollary.

Corollary:

$$\mathrm{cat}(X) \ge \mathrm{cuplength}(X) + 1. \tag{5.6}$$

Proof: Since the case $n = 1$ is evident (it reduces simply to the fact that, if the identity mapping $i : X \to X$ is homotopic to a constant, then $X$ has trivial homology) we can suppose $n > 1$. Let $Y \subseteq X$ be a set of the first category, i.e. such that $i : Y \to X$ is homotopic to a constant map $m : Y \to X$, $m(y) = p \in X$. Consider the exact cohomology sequence

$$H^{k-1}(Y) \to H^k(X, Y) \xrightarrow{\,j^*\,} H^k(X) \xrightarrow{\,i^*\,} H^k(Y).$$

Our assumption on $Y$ implies that $i^* = 0$, whence by exactness, $j^*$ is onto. Suppose now that $X$ is of category $n$, $X = Y_1 \cup \ldots \cup Y_n$, $Y_i$ of the first category. Let $\gamma_1, \ldots, \gamma_n$ be $n$ elements in the cohomology ring of degrees $k_1, \ldots, k_n$. Then $\gamma_i = j^* \tilde\gamma_i$, $\tilde\gamma_i \in H^{k_i}(X, Y_i)$, $i = 1, \ldots, n$. The mapping $j^*$ commutes with the cup product, so $(\gamma_1 \cup \gamma_2 \cup \ldots) = j^*(\tilde\gamma_1 \cup \tilde\gamma_2 \cup \ldots)$. But $\tilde\gamma_1 \cup \tilde\gamma_2 \cup \ldots \in H^{k_1 + \ldots + k_n}(X, Y_1 \cup \ldots \cup Y_n) = H^{k_1 + \ldots + k_n}(X, X) = 0$.

Q.E.D.

Let $X$, $\tilde X$ be two manifolds, $\omega$ a form on $X$. Then we can use $\omega$ in an evident way to define a form $\hat\omega$ in $X \times \tilde X$, and the same is true for forms in $\tilde X$. Let $\pi$, $\tilde\pi$ be the mappings of forms and of cohomology groups defined in this way,

$$\pi : H^k(X) \to H^k(X \times \tilde X), \qquad \tilde\pi : H^k(\tilde X) \to H^k(X \times \tilde X).$$

It may be seen that $\pi H^k(X)$, $\tilde\pi H^k(\tilde X)$ together generate the cohomology ring of $X \times \tilde X$. But this proves the following statement.

5.15. Lemma:

$$\mathrm{cuplength}(X \times \tilde X) \ge \mathrm{cuplength}(X) + \mathrm{cuplength}(\tilde X). \tag{5.7}$$

Applying this lemma to the torus $T^n$, $\mathrm{cat}(T^n) - 1 \ge \mathrm{cuplength}(T^n) \ge n$.

Indeed, equality holds here. The proof is left as an exercise.

Even though we have defined the cup product only for finite-dimensional manifolds, we have mentioned the fact that it can be defined in more general situations. We mention the following result, discussed in more detail later.


5.16. Theorem: Let $X$ be a finite dimensional simply connected manifold, $\Omega(X)$ its loop space. Then cuplength $\Omega(X) = \infty$.

Corollary: $\mathrm{cat}\, \Omega(X) = \infty$.

Example: If $P(\mathcal{H})$ is the Hilbert projective space over $\mathcal{H}$, then $\mathrm{cat}\, P(\mathcal{H}) = \infty$.
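Returning to the torus estimate $\mathrm{cuplength}(T^n) \ge n$ of this section, the algebra behind it can be made concrete. The sketch below models only the exterior algebra generated by the closed 1-forms $d\theta_1, \ldots, d\theta_n$ with constant coefficients; the representation (a dictionary from sorted index tuples to integer coefficients) and the function names are assumptions made for this illustration. It shows that the wedge of all $n$ generators is the nonzero form $d\theta_1 \wedge \ldots \wedge d\theta_n$; that its cohomology class is nonzero needs the separate remark that its integral over $T^n$ does not vanish.

```python
def wedge(a, b):
    """Wedge product of forms given as {sorted index tuple: coefficient}."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            if set(ia) & set(ib):
                continue                     # a repeated d(theta_i) gives zero
            merged = ia + ib
            # parity of the permutation that sorts the concatenated indices
            inversions = sum(1 for p in range(len(merged))
                             for q in range(p + 1, len(merged))
                             if merged[p] > merged[q])
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def dtheta(i):
    """The basic 1-form d(theta_i)."""
    return {(i,): 1}

# Wedge of all generators on T^4.
top = dtheta(1)
for i in range(2, 5):
    top = wedge(top, dtheta(i))
```

Here `top` is the single monomial indexed by `(1, 2, 3, 4)` with coefficient 1, while `wedge(dtheta(1), dtheta(1))` is empty, reflecting the antisymmetry of the product.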

C. Category and Calculus of Variations in the Large

Let $f$ be a smooth function on a Hilbert manifold $M$ satisfying the P-S condition. Define $\Gamma_k(M)$ as the set of all subsets of $M$ of category $\ge k$.

5.17. Definition:

$$c_m(f) = \inf_{A \in \Gamma_m(M)} \left\{\sup_{p \in A} f(p)\right\},$$

where we put $c_m(f) = \infty$ if $\Gamma_m(M) = \emptyset$.

Let $m < m'$. Then $c_m(f)$ is an infimum over more elements than $c_{m'}(f)$, therefore

$$-\infty \le c_1(f) \le c_2(f) \le \ldots \le \infty. \tag{5.8}$$

5.18 Lemma: Suppose (as always) that the pair $(M, f)$ satisfies Condition P-S. Let $a(t)$ be a $C^\infty$ real-valued function defined for $t \ge 0$, such that $a(t) = 1$ for $0 \le t \le 1$, such that $t^2 a(t)$ is monotone increasing for $t \ge 0$, and such that $t^2 a(t) = 2$ for $t \ge 2$. Let $V(p) = -a(|\nabla f(p)|)\, \nabla f(p)$, so that $V$ is a $C^\infty$ tangent vector field on $M$, and let $\eta_t(p)$ be the flow defined by $V$. Then $\eta_t(p)$ is defined for all $p \in M$ and all $-\infty < t < +\infty$.

Proof: It is plain from the description of the function $a$ that the vector field $V(p)$ is uniformly bounded; let $K$ be its upper bound. Since $d\eta_t(p)/dt = V(\eta_t(p))$, it follows from the definition of distance on the manifold $M$ that $\delta(\eta_t(p), \eta_s(p)) \le K|s - t|$ for $t_-(p) < s, t < t_+(p)$. Thus, if $t_+(p) < \infty$, and $t_i$ approaches $t_+(p)$ from below, $\{\eta_{t_i}(p)\}$ is a Cauchy sequence, contradicting Proposition 4.46 in virtue of the completeness of $M$. It follows that $t_+(p) = \infty$. We may prove in the same way that $t_-(p) = -\infty$, and our lemma follows.

5.19 Lemma: Let $\eta_t(p)$ be as in the preceding lemma. Let $c$ be a real number, and set

$$K_c = \{p \in M \mid f(p) = c,\ (\nabla f)(p) = 0\}.$$


Then $K_c$ is compact. Moreover, if for each $\varepsilon > 0$ we set

$$N_\varepsilon = \{p \in M \mid |f(p) - c| < \varepsilon \text{ and } |(\nabla f)(\eta_t(p))| < \varepsilon \text{ for some } t \text{ such that } 0 \le t \le 1\}, \tag{5.9}$$

then any neighborhood $U$ of $K_c$ contains one of the neighborhoods $N_\varepsilon$ of $K_c$.

Proof: The assertion that $K_c$ is compact follows immediately from Condition P-S; thus only our second assertion requires proof. Suppose that this second assertion is false. Then there exists a neighborhood $U$ of $K_c$ not containing any of the sets $N_\varepsilon$, and hence there exists a sequence $\{p_n\}$ of points and a sequence $\{t_n\}$ of numbers such that $0 \le t_n \le 1$, such that $p_n \notin U$, $f(p_n) \to c$ and $(\nabla f)(\eta_{t_n}(p_n)) \to 0$ as $n \to \infty$. Passing to a subsequence, we may suppose without loss of generality that $t_n \to t_*$ as $n \to \infty$. Now

$$\frac{d}{dt} f(\eta_t(p)) = V(\eta_t(p))\, f = -a(|\nabla f(\eta_t(p))|)\, (\nabla f(\eta_t(p)))\, f, \tag{5.10}$$

and thus, from the definition of the gradient, we have

$$\frac{d}{dt} f(\eta_t(p)) = -a(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2. \tag{5.11}$$

It is clear from (5.11) and from the definition of $a$ that $|df(\eta_t(p))/dt|$ is uniformly bounded for all $p \in M$ and real $t$; thus there exists a finite constant $K$ such that $|f(\eta_t(p)) - f(p)| \le K|t|$. Since $f(p_n) \to c$, it follows from this last that $f(\eta_{t_n}(p_n))$ is uniformly bounded. Thus, by Condition P-S, $\{\eta_{t_n}(p_n)\}$ has a convergent subsequence, and we may suppose without loss of generality that $\eta_{t_n}(p_n)$ converges to a point $q \in M$. Since $(\nabla f)(\eta_{t_n}(p_n)) \to 0$, $q$ is evidently a critical point of $f$. We have

$$p_n = \eta_{-t_n}(\eta_{t_n}(p_n)) \to \eta_{-t_*}(q) = q \tag{5.12}$$

by Lemma 4.45 and by the fact that $q$ is a critical point. Thus $q \in K_c$ is the limit of $p_n$, and since $p_n \notin U$ we have a contradiction which completes our proof.

5.20 Corollary: Let $0 < \varepsilon < 1$. Let $\eta_t$ and $N_\varepsilon$ be as in the preceding Lemma. Then if $f(p) \le c + \varepsilon^2/2$ and $p \notin N_\varepsilon$, we have $f(\eta_1(p)) \le c - \varepsilon^2/2$.

Proof: It is plain from (5.11) of the preceding proof that $f(\eta_t(p))$ decreases as $t$ increases, and, in fact, that

$$f(\eta_1(p)) - f(p) = -\int_0^1 a(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2\, dt. \tag{5.13}$$


If $f(p) \le c - \varepsilon$ we have nothing to prove; thus we may suppose without loss of generality that $|f(p) - c| < \varepsilon$. Then $p \notin N_\varepsilon$ implies that $|\nabla f(\eta_t(p))| \ge \varepsilon$ for $0 \le t \le 1$, so that, since $t^2 a(t)$ is monotone increasing (cf. the first paragraph of the preceding proof) we may conclude, using (5.13), that $f(\eta_1(p)) - f(p) \le -\varepsilon^2$. But then

$$f(\eta_1(p)) \le c + \frac{\varepsilon^2}{2} - \varepsilon^2 = c - \frac{\varepsilon^2}{2},$$

and the present corollary is proved.

We are now in a position to prove the principal theorem of Lusternik and Schnirelman in a generalized form.

5.21 Theorem: Let $(M, f)$ satisfy Condition P-S, and let $\{c_m(f)\}$ be as in Definition 5.17. Suppose that $m \le n$, and that $-\infty < c = c_m(f) = c_n(f) < \infty$. Then the set $K_c$ of critical points (cf. (1)) is of category $n - m + 1$ at least; moreover, even if $m = n$, the set $K_c$ is non-empty.

5.22 Corollary: Under the hypotheses of the preceding theorem the set $K_c$ is of dimension $n - m$ at least.

Proof: The corollary follows immediately from the theorem and Theorem 5.5. To prove the theorem, first suppose that $n > m$ and that $\mathrm{cat}(K_c) \le n - m$, and then use the Corollary of Lemma 5.6 to find a neighborhood $U$ of $K_c$ such that $\mathrm{cat}(\bar U) \le n - m$. Using Lemma 5.19 and Lemma 5.3.2, we may suppose without loss of generality that $U$ is one of the neighborhoods $N_\varepsilon$ described by (2). By Definition 5.17, there exists a closed subset $A$ of $M$ such that $\mathrm{cat}(A) \ge n$ and such that $\sup\{f(p) \mid p \in A\} \le c + \varepsilon^2/2$. Put $A_0 = A - N_\varepsilon$. Then, by Lemma 5.3.1, $\mathrm{cat}(A_0) \ge m$. Thus, if $\eta_t$ is as in Lemma 5.19 and Corollary 5.20, it follows from Lemma 5.3.3 that $\mathrm{cat}(\eta_1(A_0)) \ge m$. On the other hand, by Corollary 5.20, $f(\eta_1(p)) \le c - \varepsilon^2/2$ for $p \in A_0$. This contradicts the Definition 5.17 of $c_m$ and thus completes the proof of Theorem 5.21 in case $n > m$. In case $n = m$ and $K_c$ is void, we may let $U$ be the null set, and arrive by the same argument at the same contradiction. Thus Theorem 5.21 follows in every case. Q.E.D.

Reference

L. Lusternik and L. Schnirelman, Méthodes topologiques dans les problèmes variationnels (Hermann & Cie, Éditeurs, Paris, 1934).


CHAPTER VI

Applications of Morse Theory to Calculus of Variations in the Large

Bibliography

1. R. S. Palais, "Morse theory on Hilbert Manifolds", Topology, Vol. 2, pp. 299-340.

2. J. Milnor, Morse theory (Ann. of Math. Studies, Princeton, 1963).

3. I. M. Singer, Notes on Differential Geometry (Mimeographed, M.I.T., 1962).

4. S. S. Chern, Differentiable manifolds (Mimeographed, Chicago Univ., 1959).

We consider now the set of all suitably smooth paths in a finite-dimensio-nal compact Riemannian manifold M. We shall see that a natural (infinite-dimensional) Riemannian structure can be introduced into this set, allowingus to apply our infinite-dimensional Morse theory. Extremals of a convenient-ly chosen f on this set will correspond to geodesics in M, so that our resultswill relate to the geodesics of M.

6.1. Definition: Let $R^n$ be n-dimensional Euclidean space and let $I = [0, 1]$. Define $H_0(I, R^n) = L_2(I, R^n)$, i.e. the space of all functions $\sigma, \varrho, \ldots$ such that $\int_0^1 |\sigma(t)|^2\,dt < \infty$, with the scalar product

$$(\sigma, \varrho)_0 = \int_0^1 (\sigma(t), \varrho(t))\,dt .$$

6.2. Definition: Let $H_1(I, R^n)$ be the set of all absolutely continuous maps $\sigma : I \to R^n$ such that $\sigma' \in H_0(I, R^n)$. $H_1(I, R^n)$ is a Hilbert space under the inner product $(\sigma, \varrho)_1 = (\sigma(0), \varrho(0)) + (\sigma', \varrho')_0$. In fact, if $(p, g) \in R^n \oplus H_0(I, R^n)$, the map

$$(p, g) \to p + \int_0^t g(s)\,ds \in H_1(I, R^n)$$

is an isometry onto.


6.3. Definition: We define $L : H_1(I, R^n) \to H_0(I, R^n)$ by $L\sigma = \sigma'$ and we define $H_1^*(I, R^n) = \{\sigma \in H_1(I, R^n) \mid \sigma(0) = \sigma(1) = 0\}$.

Then the following is immediate:

6.4. Theorem: $L$ is a bounded linear transformation of norm 1. $H_1^*(I, R^n)$ is a closed linear subspace of codimension $2n$ in $H_1(I, R^n)$ and $L$ maps $H_1^*(I, R^n)$ isometrically onto the set of $g \in H_0(I, R^n)$ such that

$$\int_0^1 g(t)\,dt = 0,$$

i.e. onto the orthogonal complement in $H_0(I, R^n)$ of the set of constant maps of $I$ into $R^n$.

6.5. Theorem: If $\varrho \in H_1^*(I, R^n)$ and $\lambda$ is absolutely continuous from $I$ into $R^n$, then

$$\int_0^1 (\lambda'(t), \varrho(t))\,dt = (\lambda, -L\varrho)_0 .$$

6.6. Definition: $C^0(I, R^n)$ = set of all continuous maps of $I$ into $R^n$.

$C^0(I, R^n)$ is a Banach space with the usual norm $\|\cdot\|_\infty$. The inclusion of $C^0(I, R^n)$ into $H_0(I, R^n)$ is evidently bounded.

6.7. Theorem: Let $\sigma \in H_1(I, R^n)$. Then

$$|\sigma(t) - \sigma(s)| \le |t - s|^{1/2}\,\|L\sigma\|_0 .$$

Proof: Apply Schwarz's inequality.

Corollary 1: If $\sigma \in H_1(I, R^n)$ then $\|\sigma\|_\infty \le 2\,\|\sigma\|_1$.

Corollary 2: The inclusion maps of $H_1(I, R^n)$ into $C^0(I, R^n)$ and into $H_0(I, R^n)$ are completely continuous.

Proof of Corollary 1 is trivial. For 2 apply the Arzela-Ascoli Theorem.
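
The appeal to Schwarz's inequality in Theorem 6.7, and the step to Corollary 1, can be spelled out as follows. This is only a sketch in the notation of Definitions 6.1 to 6.3, with $\sigma$ the path, $L\sigma = \sigma'$, and $s \le t$ in $I = [0,1]$; taking $s = 0$ in the second display is one convenient route to the constant 2, not necessarily the one the notes intend.

```latex
% Theorem 6.7: Schwarz's inequality applied to the integrand 1 * sigma'(u).
\begin{align*}
|\sigma(t)-\sigma(s)|
  &= \Big|\int_s^t \sigma'(u)\,du\Big|
   \le \Big(\int_s^t 1\,du\Big)^{1/2}
       \Big(\int_s^t |\sigma'(u)|^2\,du\Big)^{1/2}
   \le |t-s|^{1/2}\,\|L\sigma\|_0 .
\end{align*}
% Corollary 1: put s = 0 and use |t| <= 1, then a + b <= sqrt(2) (a^2 + b^2)^{1/2}.
\begin{align*}
|\sigma(t)|
  &\le |\sigma(0)| + \|L\sigma\|_0
   \le \sqrt{2}\,\big(|\sigma(0)|^2 + \|L\sigma\|_0^2\big)^{1/2}
   \le 2\,\|\sigma\|_1 .
\end{align*}
```

The last step uses $\|\sigma\|_1^2 = |\sigma(0)|^2 + \|\sigma'\|_0^2$, which is the inner product of Definition 6.2 evaluated on the diagonal.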

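6.8. Lemma: Let $\varphi : R^n \to R^p$ be a smooth map, and let $\bar\varphi : H_1(I, R^n) \to H_1(I, R^p)$ be defined by $\bar\varphi(\sigma) = \varphi \circ \sigma$. Then $\bar\varphi$ is smooth. Moreover, if $1 \le m \le k$, then

$$d^m\bar\varphi_\sigma(\lambda_1, \ldots, \lambda_m)(t) = d^m\varphi_{\sigma(t)}(\lambda_1(t), \ldots, \lambda_m(t)) .$$

This follows from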

6.9. Lemma: Let $F$ be a $C^1$-map of $R^n$ into $L^s(R^n, R^p)$, the space of all $s$-linear maps from $R^n$ to $R^p$. Then the map $\bar F$ of $H_1(I, R^n)$ into $L^s(H_1(I, R^n), H_1(I, R^p))$ defined by

$$\bar F(\sigma)(\lambda_1, \ldots, \lambda_s)(t) = F(\sigma(t))(\lambda_1(t), \ldots, \lambda_s(t))$$

is continuous. Moreover, if $F$ is $C^3$ then $\bar F$ is $C^1$ and

$$d\bar F = \overline{dF} .$$

Proof: Observe that

$$\big(\bar F(\sigma)(\lambda_1, \ldots, \lambda_s)\big)'(t) = dF_{\sigma(t)}(\sigma'(t))(\lambda_1(t), \ldots, \lambda_s(t)) + \sum_{i=1}^{s} F(\sigma(t))(\lambda_1(t), \ldots, \lambda_i'(t), \ldots, \lambda_s(t)),$$

which implies

$$\big|\big(\bar F(\sigma)(\lambda_1, \ldots, \lambda_s)\big)'(t)\big| \le |dF_{\sigma(t)}|\,|\sigma'(t)|\,|\lambda_1(t)| \cdots |\lambda_s(t)| + \sum_{i=1}^{s} |F(\sigma(t))|\,|\lambda_1(t)| \cdots |\lambda_i'(t)| \cdots |\lambda_s(t)| .$$

Since $\|\lambda_i\|_\infty \le 2\|\lambda_i\|_1$, and putting $k = \sup_t |dF_{\sigma(t)}|$, we have

$$\big\|dF_{\sigma(t)}(\sigma'(t))(\lambda_1, \ldots, \lambda_s)\big\|_0 \le k\,2^s\,\|L\sigma\|_0\,\|\lambda_1\|_1 \cdots \|\lambda_s\|_1 .$$

Since also

$$\big\|F(\sigma(t))(\lambda_1, \ldots, \lambda_i', \ldots, \lambda_s)\big\|_0 \le 2^s \sup_t |F(\sigma(t))|\,\|\lambda_1\|_1 \cdots \|\lambda_i'\|_0 \cdots \|\lambda_s\|_1,$$

if we recall that $\|\varrho\|_1^2 = |\varrho(0)|^2 + \|\varrho'\|_0^2$ we see that

$$\|\bar F(\sigma)(\lambda_1, \ldots, \lambda_s)\|_1 \le k(\sigma)\,\|\lambda_1\|_1 \cdots \|\lambda_s\|_1,$$

where $k(\sigma)$ is a constant depending on $\sigma$. It follows that (since $\bar F(\sigma)$ is plainly multilinear) $\bar F(\sigma) \in L^s(H_1(I, R^n), H_1(I, R^p))$. If $\varrho \in H_1(I, R^n)$ then

$$\big\|(\bar F(\sigma) - \bar F(\varrho))(\lambda_1, \ldots, \lambda_s)\big\|_\infty \le 2^s \sup_t |F(\sigma(t)) - F(\varrho(t))|\,\|\lambda_1\|_1 \cdots \|\lambda_s\|_1$$

and it is plain that

$$\big\|\big((\bar F(\sigma) - \bar F(\varrho))(\lambda_1, \ldots, \lambda_s)\big)'\big\|_0 \le 2^s M(\sigma, \varrho)\,\|\lambda_1\|_1 \cdots \|\lambda_s\|_1,$$

where

$$M(\sigma, \varrho) = \sup_t |dF_{\sigma(t)}|\,\|\sigma' - \varrho'\|_0 + \sup_t |dF_{\sigma(t)} - dF_{\varrho(t)}|\,\|\varrho'\|_0 + s \sup_t |F(\sigma(t)) - F(\varrho(t))| .$$

Hence

$$\|\bar F(\sigma) - \bar F(\varrho)\| \le k(\sigma, \varrho),$$

where $\|\cdot\|$ is the norm in $L^s(H_1(I, R^n), H_1(I, R^p))$ and where the constant $k(\sigma, \varrho) \to 0$ if $\sup_t |F(\sigma(t)) - F(\varrho(t))|$, $\sup_t |dF_{\sigma(t)} - dF_{\varrho(t)}|$ and $\|\sigma' - \varrho'\|_0$ all approach zero. But if $\varrho \to \sigma$ in $H_1(I, R^n)$ then $\|\sigma' - \varrho'\|_0 \le \|\sigma - \varrho\|_1$ goes to zero and then $\varrho \to \sigma$ uniformly. Hence, since $F$ and $dF$ are continuous, $F(\varrho(t)) \to F(\sigma(t))$ uniformly and $dF_{\varrho(t)} \to dF_{\sigma(t)}$ uniformly, so $k(\sigma, \varrho) \to 0$. This shows that $\bar F$ is continuous, and it can be proved similarly that $\bar F$ is $C^1$ whenever $F$ is $C^3$.

Q.E.D.

Having disposed of the preliminaries we proceed to the applications.

6.10. Definition: Let $V$ be a finite dimensional smooth manifold. Denote by $H_1(I, V)$ the set of all continuous mappings $\sigma : I \to V$ such that $\varphi \circ \sigma$ is absolutely continuous and $|(\varphi \circ \sigma)'|$ is locally square-integrable for each chart $\varphi$ in $V$. Let $H_1(I, V)_\sigma = \{\lambda \in H_1(I, T(V)) \mid \lambda(t) \in V_{\sigma(t)} \text{ for all } t \in I\}$, where $T(V)$ is the tangent bundle to $V$, and $V_{\sigma(t)}$ is the tangent space at $\sigma(t)$. If $p, q \in V$ we define $\Omega(V; p, q)$ as $\{\sigma \in H_1(I, V) \mid \sigma(0) = p,\ \sigma(1) = q\}$ and if $\sigma \in \Omega(V; p, q)$ we define $\Omega(V; p, q)_\sigma = \{\lambda \in H_1(I, V)_\sigma \mid \lambda(0) = 0_p,\ \lambda(1) = 0_q\}$, where $0_p$ (resp. $0_q$) is the zero of $V_p$ (resp. $V_q$).

Remark: $H_1(I, V)_\sigma$ is a vector space under pointwise operations and $\Omega(V; p, q)_\sigma$ is a subspace of $H_1(I, V)_\sigma$.

6.11. Theorem: Let $V$ be a smooth submanifold of $R^n$. Then (a) $H_1(I, V)$ consists of all $\sigma \in H_1(I, R^n)$ such that $\sigma(I) \subset V$. (b) $H_1(I, V)$ is a closed submanifold of the Hilbert space $H_1(I, R^n)$. (c) If $p, q \in V$, then $\Omega(V; p, q)$ is a closed submanifold of $H_1(I, V)$. (d) If $\sigma \in H_1(I, V)$ then the tangent space to $H_1(I, V)$ at $\sigma$ is $H_1(I, V)_\sigma = \{\lambda \in H_1(I, R^n) \mid \lambda(t) \in V_{\sigma(t)},\ t \in I\}$, and (e) if $\sigma \in \Omega(V; p, q)$ then the tangent space to $\Omega(V; p, q)$ at $\sigma$ is just $\Omega(V; p, q)_\sigma = \{\lambda \in H_1(I, V)_\sigma \mid \lambda(0) = \lambda(1) = 0\}$.

Proof: (a) is clear. It is equally clear that $H_1(I, V)$ is a closed set in $H_1(I, R^n)$ and that $\Omega(V; p, q)$ is a closed set in $H_1(I, V)$. Since $V$ is a smooth submanifold of $R^n$ we can find a smooth Riemann metric for $R^n$ such that $V$ is a totally geodesic submanifold. Then if $E : R^n \times R^n \to R^n$ is the corresponding exponential map (i.e. $t \to E(p, tv)$ is the geodesic starting from $p$ with tangent vector $v$), then $E$ is a smooth map. Let $\sigma \in H_1(I, V)$ and define $\varphi : H_1(I, R^n) \to H_1(I, R^n)$ by $\varphi(\lambda)(t) = E(\sigma(t), \lambda(t))$. Then $\varphi$ is smooth and $\varphi(0) = \sigma$. Moreover $d\varphi_0(\lambda)(t) = dE^{\sigma(t)}_0(\lambda(t))$, where $E^{\sigma(t)}(v) = E(\sigma(t), v)$. Since $dE^{\sigma(t)}_0$ is the identity map of $R^n$, $d\varphi_0$ is the identity in $H_1(I, R^n)$. Thus by the inverse function theorem $\varphi$ maps a neighborhood of zero in $H_1(I, R^n)$ $C^k$-isomorphically onto a neighborhood of $\sigma$ in $H_1(I, R^n)$. Since $V$ is totally geodesic, given $\lambda$ near zero in $H_1(I, R^n)$, $\varphi(\lambda) \in H_1(I, V)$ if and only if $\lambda \in H_1(I, V)_\sigma$. Similarly, if $\sigma \in \Omega(V; p, q)$ then $\varphi(\lambda) \in \Omega(V; p, q)$ if and only if $\lambda \in \Omega(V; p, q)_\sigma$. So $\varphi^{-1}$ restricted to a neighborhood of $\sigma$ in $H_1(I, V)$ (resp. $\Omega(V; p, q)$) is a chart in $H_1(I, V)$ (resp. $\Omega(V; p, q)$) which is a restriction of a chart for $H_1(I, R^n)$. This completes the proof of (b) and (c), and the verification of (d) and (e) is routine.

Q.E.D.

Since any smooth map of a submanifold $V \subset R^n$ into a submanifold $W \subset R^m$ can be smoothly extended to a map from $R^n$ to $R^m$, we can apply our results to obtain the following statement.

6.12. Theorem: Let $V \subset R^n$, $W \subset R^m$ be smooth submanifolds, $\varphi : V \to W$ a smooth map. Then $\bar\varphi : H_1(I, V) \to H_1(I, W)$ defined by $\bar\varphi(\sigma) = \varphi \circ \sigma$ is a smooth map of $H_1(I, V)$ into $H_1(I, W)$. Moreover the Fréchet derivative

$$d\bar\varphi_\sigma : H_1(I, V)_\sigma \to H_1(I, W)_{\bar\varphi(\sigma)}$$

is given by

$$d\bar\varphi_\sigma(\lambda)(t) = d\varphi_{\sigma(t)}(\lambda(t)) .$$

Observe now that every manifold can, by Whitney's Theorem, be imbedded in a Euclidean space. Hence

6.13. Theorem: $H_1(I, V)$ and $\Omega(V; p, q)$ are Hilbert manifolds, and, by Theorem 6.12, their manifold structure does not depend on the particular imbedding of $V$ used.

The function to which general Morse Theory will be applied in what follows will be the action integral $J^V(\sigma)$, defined for a Riemann manifold $V$ as follows:

6.14. Definition: For $\sigma \in H_1(I, V)$,

$$J^V(\sigma) = \tfrac{1}{2} \int_0^1 |\sigma'(t)|^2\,dt .$$

We leave to the reader proofs of the following properties of $J^V(\sigma)$.

6.15. Lemma: Let $V$, $W$ be smooth manifolds, $\varphi : V \to W$ an isometry. Then $J^V = J^W \circ \bar\varphi$.

6.16. Lemma: Let $V$ be a smooth submanifold of $W$. Then $J^V = J^W \mid H_1(I, V)$.

6.17. Lemma: $J^V$ is a smooth functional.

Advice: Prove Lemma 6.17 first for smooth submanifolds of $R^n$ and then for general manifolds using Nash's imbedding theorem for Riemannian manifolds, which was proved as Theorem 2.4.

Next observe that $H_1(I, R^n)$, as a Hilbert space, has a natural Riemannian structure. Hence for all manifolds $V$, $H_1(I, V)$ will also have a Riemannian structure. But the situation here is not so pleasant as in connection with the differentiable structure of $H_1(I, V)$, since now, in general, this Riemannian structure will depend on the imbedding $V \to R^n$. However, this will not bother us at all.

A second observation is the following. Let $W$ be a complete Riemannian manifold, $W_1$ a closed submanifold of $W$ inheriting from $W$ its Riemannian structure; let $\varrho_{W_1}$, $\varrho_W$ be the respective Riemannian metrics. It is clear that if $p, q \in W_1$, then $\varrho_{W_1}(p, q) \ge \varrho_W(p, q)$, since, by definition, the right side is given as an infimum over a larger set than the infimum on the left. Hence the Riemannian structure of $W_1$ is also complete. Putting our observations together, we get

6.18. Theorem: Let $V$ be a smooth submanifold of $R^n$. Then $H_1(I, V)$ is a complete smooth Riemannian manifold with the Riemannian structure inherited from $H_1(I, R^n)$, where $R^n$ is any Euclidean space in which $V$ is isometrically imbedded.

The same reasoning shows that $\Omega(V; p, q)$ is also a complete Riemannian manifold. Regarded as a submanifold of $H_1(I, R^n)$, the scalar product in it is simply $(\varrho, \lambda)_\sigma = (L\varrho, L\lambda)_0$.

One more preliminary needed for the application of Morse Theory is the verification of the Palais-Smale condition for the action integral. To remind the reader of the nature of this condition we write it down again:

P-S condition: Suppose that, for a sequence $\sigma_n$, $\nabla J^V(\sigma_n) \to 0$, and $J^V(\sigma_n)$ is bounded. Then there exists a subsequence $\sigma_{n_k}$ convergent to an element $\sigma \in H_1(I, V)$.

We proceed to establish this condition for $J^V$ in a series of substeps. We suppose throughout that $V$ has been isometrically imbedded in a Euclidean space $R^n$. In what follows, $L$ is the operator of Definition 6.3.

6.19. Lemma: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $\|L(\sigma_n - \sigma_m)\|_0 \to 0$ as $n, m \to \infty$. Then $\sigma_n$ converges in $\Omega(V; p, q)$.

Proof: Evidently $\sigma_n - \sigma_m \in H_1^*(I, R^n)$. Since $L$ is an isometry on $H_1^*(I, R^n)$ (Theorem 6.4), $\{\sigma_n\}$ is Cauchy in $H_1(I, R^n)$ and hence convergent. But $\Omega(V; p, q)$ is closed in $H_1(I, R^n)$.

Q.E.D.

6.20. Definition: Let $p, q \in V$. If $\sigma \in \Omega(V; p, q)$ we define $h(\sigma)$ to be the orthogonal projection of $L\sigma$ onto the orthogonal complement of $L(\Omega(V; p, q)_\sigma)$ in $H_0(I, R^n)$.

6.21. Theorem: Let $J = J^V \mid \Omega(V; p, q)$. If we consider $\Omega(V; p, q)$ as a Riemannian manifold with the structure induced on it as a closed submanifold of $H_1(I, R^n)$, then for each $\sigma \in \Omega(V; p, q)$, $\nabla J(\sigma)$ can be characterized as the unique element of $\Omega(V; p, q)_\sigma$ mapped by $L$ onto $L\sigma - h(\sigma)$. Moreover $\|\nabla J(\sigma)\|_\sigma = \|L\sigma - h(\sigma)\|_0$.

Proof: Note that $\Omega(V; p, q)_\sigma$ is a closed subspace of $H_1(I, R^n)$ and is contained in $H_1^*(I, R^n)$. It follows from Theorem 6.4 that $L$ maps $\Omega(V; p, q)_\sigma$ isometrically onto a closed subspace of $H_0(I, R^n)$. Since $L\sigma - h(\sigma)$ is orthogonal to $L(\Omega(V; p, q)_\sigma)^\perp$, we have $L\sigma - h(\sigma) = L\lambda$, $\lambda \in \Omega(V; p, q)_\sigma$, with $\lambda$ unique and $\|\lambda\|_\sigma = \|L\lambda\|_0 = \|L\sigma - h(\sigma)\|_0$. It will suffice to prove that $dJ_\sigma(\varrho) = (\lambda, \varrho)_\sigma$ for $\varrho \in \Omega(V; p, q)_\sigma$, i.e., that $dJ_\sigma(\varrho) = (L\lambda, L\varrho)_0 = (L\sigma - h(\sigma), L\varrho)_0$ for $\varrho \in \Omega(V; p, q)_\sigma$. Since $(h(\sigma), L\varrho)_0 = 0$ for $\varrho \in \Omega(V; p, q)_\sigma$, we must prove that $dJ_\sigma(\varrho) = (L\sigma, L\varrho)_0$ for $\varrho \in \Omega(V; p, q)_\sigma$. But $J^{R^n}(\sigma) = \tfrac{1}{2}\|L\sigma\|_0^2$, so $dJ^{R^n}_\sigma(\varrho) = (L\sigma, L\varrho)_0$ for $\varrho \in H_1(I, R^n)$. Since $J = J^{R^n} \mid \Omega(V; p, q)$, it follows that $dJ_\sigma = dJ^{R^n}_\sigma \mid \Omega(V; p, q)_\sigma$.

Q.E.D.
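
The formula for the differential of the action in Euclidean space, used at the end of the proof above, comes from expanding a quadratic. The following is a minimal sketch, assuming $\sigma, \varrho \in H_1(I, R^n)$, a real parameter $\varepsilon$, and the action $J^{R^n}(\sigma) = \tfrac12\|L\sigma\|_0^2$ with $L\sigma = \sigma'$ as in Definition 6.3.

```latex
% Expand the action along the line sigma + eps * rho in H_1(I, R^n).
\begin{align*}
J^{R^n}(\sigma + \varepsilon\varrho)
  &= \tfrac12\,\big(L\sigma + \varepsilon L\varrho,\; L\sigma + \varepsilon L\varrho\big)_0 \\
  &= \tfrac12\,\|L\sigma\|_0^2
     + \varepsilon\,(L\sigma, L\varrho)_0
     + \tfrac12\,\varepsilon^2\,\|L\varrho\|_0^2 .
\end{align*}
% The coefficient of eps is the differential; the eps^2 term gives the second variation.
\begin{align*}
dJ^{R^n}_\sigma(\varrho)
  = \frac{d}{d\varepsilon}\, J^{R^n}(\sigma + \varepsilon\varrho)\Big|_{\varepsilon = 0}
  = (L\sigma, L\varrho)_0 .
\end{align*}
```

Because the expansion terminates at second order, $J^{R^n}$ is a continuous quadratic form on $H_1(I, R^n)$ and hence smooth there.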

6.22. Definition: Let $\overline{\Omega(V; p, q)_\sigma}$ be the closure of $\Omega(V; p, q)_\sigma$ in $H_0(I, R^n)$, and let $P_\sigma$ be the orthogonal projection of $H_0(I, R^n)$ on $\overline{\Omega(V; p, q)_\sigma}$. For each point $r \in V$, let $\Omega(r)$ denote the orthogonal projection of $R^n$ onto the tangent space $V_r$ to $V$ at $r$.

6.23. Theorem: The functional $J$ of Theorem 6.21 satisfies the Palais-Smale condition.

Proof: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $|J(\sigma_n)| < M$ and $\|\nabla J(\sigma_n)\| \to 0$. Since, by Theorem 6.21,

$$\|\nabla J(\sigma_n)\| = \|L\sigma_n - h(\sigma_n)\|_0,$$

we have $\|L\sigma_n - h(\sigma_n)\|_0 \to 0$. Since each $P_{\sigma_n}$ is a projection, hence norm-decreasing, and since $P_{\sigma_n} L\sigma_n = L\sigma_n$ (Corollary 2 of Lemma 6.24 below), it follows that $\|L\sigma_n - P_{\sigma_n} h(\sigma_n)\|_0 \to 0$, and by Corollary 2 of 6.7 we can assume on passing to a subsequence that $\|\sigma_n - \sigma_m\|_\infty \to 0$ as $m, n \to \infty$. We need only to prove that $\|L(\sigma_n - \sigma_m)\|_0 \to 0$ for $m, n \to \infty$, for then it follows from Lemma 6.19 that $\sigma_n$ will converge in $\Omega(V; p, q)$ to a $\sigma$ in $\Omega(V; p, q)$. But

$$\|L(\sigma_n - \sigma_m)\|_0^2 = (L\sigma_n, L(\sigma_n - \sigma_m))_0 - (L\sigma_m, L(\sigma_n - \sigma_m))_0 .$$

Thus it suffices to prove that $(L\sigma_n, L(\sigma_n - \sigma_m))_0 \to 0$ as $m, n \to \infty$. Since $\|L\sigma_n\|_0^2 = 2J(\sigma_n)$ is bounded, $\|L(\sigma_n - \sigma_m)\|_0$ is bounded also and, since $L\sigma_n - P_{\sigma_n} h(\sigma_n) \to 0$ in $H_0(I, R^n)$, it suffices to prove that

$$(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0 \to 0 \quad \text{as } m, n \to \infty .$$

We now refer to Lemma 6.24 below and note that it follows from this Lemma that if $\sigma \in \Omega(V; p, q)$ then $P_\sigma f$ belongs to $\Omega(V; p, q)_\sigma$ if $f$ is smooth and vanishes for $t = 0$ and $t = 1$. Since $h(\sigma)$ is orthogonal to $L P_\sigma f$ in this case, we have $(h(\sigma), L P_\sigma f)_0 = 0$ for all such $\sigma$ and $f$. Thus

$$(*) \qquad (P_\sigma h(\sigma), Lf)_0 = (h(\sigma), (P_\sigma L - L P_\sigma) f)_0$$

for $\sigma \in \Omega(V; p, q)$ and smooth $f$ vanishing at $t = 0, 1$. If we put $Q_\sigma(t) = (d/dt)\,\Omega(\sigma(t))$, it follows by differentiation from (*) that

$$(P_\sigma h(\sigma), Lf)_0 = -(h(\sigma), Q_\sigma f)_0 = -(Q_\sigma h(\sigma), f)_0$$

for smooth $f$ vanishing at $t = 0, 1$, and hence, by a limit argument, for all $f \in H_1^*(I, R^n)$.

Since $\sigma_n - \sigma_m \in H_1^*(I, R^n)$ it follows that

$$|(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0| = \Big|\int_0^1 \big(Q_{\sigma_n}(t)\,h(\sigma_n)(t),\ (\sigma_n - \sigma_m)(t)\big)\,dt\Big| \le \|\sigma_n - \sigma_m\|_\infty \int_0^1 |Q_{\sigma_n}(t)\,h(\sigma_n)(t)|\,dt .$$

Since $\|\sigma_n - \sigma_m\|_\infty \to 0$, it suffices to show that the integral on the right is bounded. Let $A$ be a compact set such that $\sigma_n(I) \subset A$. Then there exists $K$ such that

$$\int_0^1 |Q_{\sigma_n}(t)\,h(\sigma_n)(t)|\,dt \le K\,\|L\sigma_n\|_0\,\|h(\sigma_n)\|_0 .$$

Now, since $\|L\sigma_n\|_0$ is bounded and since $\|L\sigma_n - h(\sigma_n)\|_0 \to 0$, $\|h(\sigma_n)\|_0$ is bounded and the theorem follows.

Q.E.D.

Finally, we relate critical points of the action integral $J$ with geodesics, and find conditions under which these critical points are non-degenerate. We will not discuss the geometry of geodesics of a finite-dimensional manifold in detail, but refer the reader instead to (3) or (4) of the Bibliography.

6.24. Lemma: Let $\sigma \in \Omega(V; p, q)$. Then $\overline{\Omega(V; p, q)_\sigma} = \{\lambda \in H_0(I, R^n) \mid \lambda(t) \in V_{\sigma(t)} \text{ for almost all } t \in I\}$. If $\lambda \in H_0(I, R^n)$ then $(P_\sigma\lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$.

Proof: Let $\pi_\sigma \in L(H_0(I, R^n), H_0(I, R^n))$ be defined by $(\pi_\sigma\lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$. Since $\Omega(\sigma(t))$ is an orthogonal projection in $R^n$ for each $t \in I$, it follows from the definition of the inner product in $H_0(I, R^n)$ that $\pi_\sigma$ is an orthogonal projection. From the characterization of $\Omega(V; p, q)_\sigma$ it is clear that $\pi_\sigma$ maps $H_1^*(I, R^n)$ onto $\Omega(V; p, q)_\sigma$. Since $H_1^*(I, R^n)$ is dense in $H_0(I, R^n)$ it follows that the range of $\pi_\sigma$ is $\overline{\Omega(V; p, q)_\sigma}$, so $\pi_\sigma = P_\sigma$. On the other hand, $\lambda \in H_0(I, R^n)$ is fixed under $\pi_\sigma$ if and only if $\lambda(t) \in V_{\sigma(t)}$ for almost all $t \in I$. Since the range of a projection is its set of fixed points, this proves our lemma.

Q.E.D.

The following are obvious consequences of the lemma.

Corollary 1: If $\sigma \in \Omega(V; p, q)$, then

$$P_\sigma(H_1(I, R^n)) = H_1(I, V)_\sigma \quad \text{and} \quad P_\sigma(H_1^*(I, R^n)) = \Omega(V; p, q)_\sigma .$$

Corollary 2: If $\sigma \in \Omega(V; p, q)$ then $P_\sigma L\sigma = L\sigma$.

Another simple result, whose proof is left as an exercise, is

6.25. Lemma: Let $T \in H_0(I, L(R^n, R^p))$ and define for each $\lambda \in H_0(I, R^n)$ a measurable function $\bar T(\lambda) : I \to R^p$ by $\bar T(\lambda)(t) = T(t)\,\lambda(t)$. Then

(1) $\bar T$ is bounded from $H_0(I, R^n)$ to $L_1(I, R^p)$;
(2) If $T$ and $\lambda$ are absolutely continuous then so is $\bar T(\lambda)$ and $(\bar T\lambda)'(t) = T'(t)\,\lambda(t) + T(t)\,\lambda'(t)$;
(3) If $T \in H_1(I, L(R^n, R^p))$, $\lambda \in H_1(I, R^n)$, then $\bar T(\lambda) \in H_1(I, R^p)$.

6.26. Definition: Let $\sigma \in \Omega(V; p, q)$. Define $G_\sigma \in H_1(I, L(R^n, R^n))$ by $G_\sigma = \Omega \circ \sigma$ and $Q_\sigma \in H_0(I, L(R^n, R^n))$ by $Q_\sigma = G_\sigma'$.

6.27. Theorem: Let $\sigma \in \Omega(V; p, q)$. Let $P_\sigma$ be as in Definition 6.22. If $\varrho \in H_1(I, R^n)$, then $((L P_\sigma - P_\sigma L)\varrho)(t) = Q_\sigma(t)\,\varrho(t)$. Given $f \in H_0(I, R^n)$, define an absolutely continuous map $g : I \to R^n$ by

$$g(t) = \int_0^t Q_\sigma(s)\,f(s)\,ds .$$

Then, if $\varrho \in H_1^*(I, R^n)$,

$$(f, (L P_\sigma - P_\sigma L)\varrho)_0 = (g, -L\varrho)_0 .$$

Proof: Since $(P_\sigma\varrho)(t) = G_\sigma(t)\,\varrho(t)$ and $(P_\sigma L\varrho)(t) = G_\sigma(t)\,\varrho'(t)$ by 6.24, $((L P_\sigma - P_\sigma L)\varrho)(t) = Q_\sigma(t)\,\varrho(t)$ follows immediately by differentiation. By (1) of Lemma 6.25, $s \to Q_\sigma(s)\,f(s)$ is summable, so $g$ is absolutely continuous. Next note that, since $G_\sigma(t) = \Omega(\sigma(t))$ is self-adjoint for all $t$, $Q_\sigma(t) = G_\sigma'(t)$ is self-adjoint wherever defined, and hence

$$(f, (L P_\sigma - P_\sigma L)\varrho)_0 = \int_0^1 (f(t), Q_\sigma(t)\,\varrho(t))\,dt = \int_0^1 (Q_\sigma(t)\,f(t), \varrho(t))\,dt = \int_0^1 (g'(t), \varrho(t))\,dt .$$

Then if $\varrho \in H_1^*(I, R^n)$ Theorem 6.5 gives

$$(f, (L P_\sigma - P_\sigma L)\varrho)_0 = (g, -L\varrho)_0 .$$

Q.E.D.
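
The step described as following "immediately by differentiation" in the proof of Theorem 6.27 is the product rule of Lemma 6.25 (2) applied to $T = G_\sigma$. A short sketch, assuming $\varrho \in H_1(I, R^n)$ and the identifications $(P_\sigma\lambda)(t) = G_\sigma(t)\lambda(t)$ from Lemma 6.24 and $Q_\sigma = G_\sigma'$ from Definition 6.26:

```latex
% Differentiate t -> G_sigma(t) rho(t) and subtract the term P_sigma L rho.
\begin{align*}
(L P_\sigma \varrho)(t)
  &= \frac{d}{dt}\big(G_\sigma(t)\,\varrho(t)\big)
   = G_\sigma'(t)\,\varrho(t) + G_\sigma(t)\,\varrho'(t) \\
  &= Q_\sigma(t)\,\varrho(t) + (P_\sigma L \varrho)(t), \\
\big((L P_\sigma - P_\sigma L)\varrho\big)(t)
  &= Q_\sigma(t)\,\varrho(t) .
\end{align*}
```

So the commutator of $L$ with the pointwise projection $P_\sigma$ is multiplication by the derivative of the projection field along $\sigma$, which is why only first derivatives of $\sigma$ enter the later estimates.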

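6.28. Theorem: Let $h(\sigma)$ be as in Definition 6.20. If $\sigma \in \Omega(V; p, q)$ then $P_\sigma h(\sigma)$ is absolutely continuous and $(P_\sigma h(\sigma))'(t) = Q_\sigma(t)\,h(\sigma)(t)$.

Proof: If $\varrho \in H_1^*(I, R^n)$ then

$$(P_\sigma h(\sigma), L\varrho)_0 = (h(\sigma), P_\sigma L\varrho)_0 = (h(\sigma), (P_\sigma L - L P_\sigma)\varrho)_0$$

since $(h(\sigma), L P_\sigma\varrho)_0 = 0$. Hence, by Theorem 6.27, $(P_\sigma h(\sigma), L\varrho)_0 = (g, L\varrho)_0$ if we define $g$ to be

$$g(t) = \int_0^t Q_\sigma(s)\,h(\sigma)(s)\,ds .$$

Then $P_\sigma h(\sigma) - g \perp L(H_1^*(I, R^n))$, whence $P_\sigma h(\sigma) - g$ = constant. Since $g$ is absolutely continuous so is $P_\sigma h(\sigma)$ and they have the same derivative. But $g'(t) = Q_\sigma(t)\,h(\sigma)(t)$.

Q.E.D.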

6.29. Theorem: Let $\sigma$ be a critical point of $J$. Then $\sigma$ is smooth and, moreover, $\sigma'' \perp V$ everywhere. Conversely, if $\sigma \in \Omega(V; p, q)$, $\sigma'$ is absolutely continuous, and $\sigma'' \perp V$ almost everywhere, then $\sigma$ is a critical point of $J$.

Proof: By Theorem 6.21, if $\sigma$ is a critical point of $J$, then $L\sigma = h(\sigma)$. Since $P_\sigma L\sigma = L\sigma$, it follows that $P_\sigma h(\sigma) = h(\sigma)$, so by Theorem 6.28 $\sigma'$ is absolutely continuous (so that $\sigma$ is $C^1$) and

$$(*) \qquad \sigma''(t) = Q_\sigma(t)\,\sigma'(t) .$$

Now since $\Omega : V \to L(R^n, R^n)$ is smooth, using 6.26 we have

$$Q_\sigma(t) = \frac{d}{dt}\,\Omega(\sigma(t)) .$$

It follows that if $\sigma$ is $C^m$, then $Q_\sigma$ is $C^{m-1}$, so by (*) $\sigma''$ is $C^{m-1}$, which implies that $\sigma$ is $C^{m+1}$. Since we already know $\sigma$ is $C^1$, it follows that $\sigma$ is smooth. If $\varrho \in \Omega(V; p, q)_\sigma$ then $L\sigma = h(\sigma)$ is orthogonal to $L\varrho$, so that $\sigma''$ is orthogonal to $\varrho$. Since $\sigma''$ and $\varrho$ are continuous, it follows that $(\sigma''(t), \varrho(t)) = 0$, $t \in I$. If $t \in I$ is not an endpoint and $v_0 \in V_{\sigma(t)}$, then there exists $\varrho \in \Omega(V; p, q)_\sigma$ such that $\varrho(t) = v_0$, hence $\sigma''(t)$ is orthogonal to $V_{\sigma(t)}$ and, by continuity, this holds also at the endpoints. Conversely, if $\sigma \in \Omega(V; p, q)$ is such that $\sigma'$ is absolutely continuous and $\sigma'' \perp V_{\sigma(t)}$ for almost all $t \in I$, then $L\sigma \perp L(\Omega(V; p, q)_\sigma)$, so $L\sigma = h(\sigma)$ and $\sigma$ is a critical point of $J$.

Q.E.D.
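
As an illustration of the criterion $\sigma'' \perp V$, consider a great-circle arc on the unit sphere. Everything specific here is an assumption made for the example and is not taken from the notes: $V = S^{n-1} \subset R^n$, orthonormal vectors $p, e \in R^n$, an angle $\theta > 0$, and the projection field $\Omega(x) = 1 - x x^{T}$ onto the tangent space at $x$.

```latex
% Great-circle arc from p = sigma(0) to q = sigma(1) on the unit sphere.
\begin{align*}
\sigma(t)   &= \cos(\theta t)\,p + \sin(\theta t)\,e, \qquad 0 \le t \le 1, \\
\sigma'(t)  &= \theta\,\big(-\sin(\theta t)\,p + \cos(\theta t)\,e\big),
               \qquad |\sigma'(t)| = \theta, \\
\sigma''(t) &= -\theta^2\,\sigma(t) .
\end{align*}
% The tangent space at sigma(t) is the orthogonal complement of sigma(t),
% so sigma'' is normal to the sphere at every t.
\begin{align*}
V_{\sigma(t)} = \{\, v \in R^n : (v, \sigma(t)) = 0 \,\}
  \quad\Longrightarrow\quad \sigma''(t) \perp V_{\sigma(t)} .
\end{align*}
% Consistency with (*): Q_sigma = -(sigma' sigma^T + sigma sigma'^T) and (sigma, sigma') = 0.
\begin{align*}
Q_\sigma(t)\,\sigma'(t)
  = -\sigma'(t)\,(\sigma(t), \sigma'(t)) - \sigma(t)\,(\sigma'(t), \sigma'(t))
  = -\theta^2\,\sigma(t) = \sigma''(t) .
\end{align*}
```

In this example the action is $J(\sigma) = \tfrac12\theta^2$, and the arc is traversed at constant speed $\theta$, in line with the parametrization proportional to arc length that appears in the characterization of critical points.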

The last step in the characterization of critical points of $J$ is supplied by the well-known result of classical differential geometry (see (3) and (4)) that, if $\sigma \in C^2(I, V)$, $\sigma$ is a geodesic of $V$ parametrized proportionately to arc-length if and only if $\sigma'' \perp V$ everywhere. We obtain the following conclusion.

6.30. Theorem: If $\sigma \in \Omega(V; p, q)$, then $\sigma$ is a critical point of $J$ if and only if $\sigma$ is a geodesic of $V$ parametrized proportionately to arc length.

We must now determine when an extremal point of $J$ will be degenerate. We limit ourselves to a brief exposition and to suggesting that the reader consult (3).

Let $E$ denote the exponential map of $V_p$ into $V$; i.e. if $v \in V_p$, then $E(v) = \sigma(|v|)$ where $\sigma$ is the geodesic starting from $p$ with tangent vector $v/|v|$. Then $E$ is smooth. Given $v \in V_p$ we define $\lambda(v)$ = dimension of null-space of $dE_v$. If $\lambda(v) > 0$, we call $v$ a conjugate vector at $p$. A point of $V$ is called a conjugate point of $p$ if it is in the image under $E$ of the set of conjugate vectors at $p$. By Sard's theorem the set of conjugate points of $p$ has measure zero in $V$.

Given $v \in E^{-1}(q)$ define $\tilde v \in \Omega(V; p, q)$ by $\tilde v(t) = E(tv)$. Then $\tilde v$ is a geodesic parametrized proportionately to arc length (factor: $|v|$) and hence a critical point of $J$. Conversely, any critical point of $J$ is of the form $\tilde v$ for a unique $v \in E^{-1}(q)$. We may now state the following two theorems:

6.30. Non-degeneracy theorem: If $v \in E^{-1}(q)$ then $\tilde v$ is a degenerate critical point of $J$ if and only if $v$ is a conjugate vector at $p$. Hence $J$ has only non-degenerate critical points if and only if $q$ is not a conjugate point of $p$. This condition is satisfied if $q$ lies outside of a set of measure zero in $V$.

6.31. Morse index theorem: Let $v \in E^{-1}(q)$. Then there are only a finite number of $t$ satisfying $0 < t < 1$ such that $tv$ is a conjugate vector at $p$. The index of $\tilde v$ is

$$\sum_{0 < t < 1} \lambda(tv) .$$

In particular each critical point of $J$ has finite index.
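
To see what the two theorems say in a concrete case, one can work them out on the round unit sphere. The setting below is an assumption made for illustration: $V = S^j \subset R^{j+1}$ with $j \ge 2$, base point $p$, tangent space $V_p = \{v : (v, p) = 0\}$, and the explicit exponential map of the sphere.

```latex
% Exponential map of the unit sphere S^j at p, for v in V_p, v != 0.
\begin{align*}
E(v) &= \cos|v|\;p + \sin|v|\;\frac{v}{|v|} .
\end{align*}
% Differential of E at v, split into the radial direction and directions w
% in V_p orthogonal to v (a space of dimension j - 1); note |v + tau w| = |v| + O(tau^2).
\begin{align*}
dE_v(v) &= |v|\,\Big(-\sin|v|\;p + \cos|v|\;\frac{v}{|v|}\Big) \ne 0, \\
dE_v(w) &= \frac{\sin|v|}{|v|}\;w \qquad \text{for } (w, v) = 0 .
\end{align*}
% Hence the nullity and the conjugate vectors:
\begin{align*}
\lambda(v) &=
  \begin{cases}
    j - 1, & |v| = k\pi,\ k = 1, 2, \ldots, \\
    0,     & \text{otherwise}.
  \end{cases}
\end{align*}
% Morse index of the geodesic t -> E(tv) when k*pi < |v| < (k+1)*pi:
\begin{align*}
\sum_{0 < t < 1} \lambda(tv) = k\,(j - 1) .
\end{align*}
```

The conjugate points of $p$ are thus $p$ itself and its antipode, a set of measure zero, and a geodesic of length between $k\pi$ and $(k+1)\pi$ has index $k(j-1)$.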

Proof of the Non-degeneracy Theorem 6.30 and the Morse Index Theorem 6.31

Let $\sigma$ be a geodesic on the manifold $V$. By Definition 6.14 of $J^V(\sigma)$ we have

$$J^V(\sigma) = \frac{1}{2} \int_0^1 \Big(\frac{d\sigma}{dt}(t), \frac{d\sigma}{dt}(t)\Big)\,dt,$$

so that, if we introduce coordinates in the neighborhood of the curve $\sigma$, we have

$$J^V(\sigma) = \frac{1}{2} \int_0^1 \sum_{i,j} g_{ij}(\sigma(t))\,\frac{d\sigma^i}{dt}(t)\,\frac{d\sigma^j}{dt}(t)\,dt .$$

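If we then put $\sigma_\varepsilon = \sigma + \varepsilon\varrho$, and calculate the terms of second order in $\varepsilon$ in order to evaluate the Hessian quadratic form $\delta^2 J^V(\sigma; \varrho, \varrho)$, we find that

$$\delta^2 J^V(\sigma; \varrho, \varrho) = \frac{1}{2} \int_0^1 \Big\{ g_{ij}(\sigma(t))\,\frac{d\varrho^i(t)}{dt}\,\frac{d\varrho^j(t)}{dt} + A_{ij}(t)\,\varrho^i(t)\,\frac{d\varrho^j(t)}{dt} + B_{ij}(t)\,\varrho^i(t)\,\varrho^j(t) \Big\}\,dt,$$

where $A_{ij}(t)$ and $B_{ij}(t)$ may readily be expressed in terms of $\sigma$ and of the first and second partial derivatives of $g_{ij}$ (repeated indices are summed).

If we are careful to choose coordinates in the neighborhood of the geodesic curve $\sigma$ in such a way that $\sigma$ itself is the first axis and curves perpendicular to $\sigma$ give the remaining axes, we have $g_{ij}(\sigma(t)) = \delta_{ij}$, and the above expression for the Hessian reduces to

$$[\dagger] \qquad \delta^2 J^V(\sigma; \varrho, \varrho) = \frac{1}{2} \int_0^1 \Big\{ \sum_i \Big(\frac{d\varrho^i(t)}{dt}\Big)^2 + A_{ij}(t)\,\varrho^i(t)\,\frac{d\varrho^j(t)}{dt} + B_{ij}(t)\,\varrho^i(t)\,\varrho^j(t) \Big\}\,dt .$$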

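The Hessian matrix $\delta^2 J(\sigma; \cdot, \cdot)$ will be singular if and only if there exists a non-zero function $\varrho \in \Omega(V; p, q)_\sigma$ such that $\delta^2 J(\sigma; \varrho, \tilde\varrho) = 0$ for all $\tilde\varrho \in \Omega(V; p, q)_\sigma$. That is, the Hessian matrix will be singular if and only if there exists a non-zero function $(\varrho^i) \in H_1(I)$ such that $\varrho^i(0) = \varrho^i(1) = 0$ and such that

$$Q_\sigma(\varrho, \tilde\varrho) = \int_0^1 \Big\{ \frac{d\varrho^i(t)}{dt}\,\frac{d\tilde\varrho^i(t)}{dt} + \frac{1}{2} A_{ij}(t)\,\varrho^i(t)\,\frac{d\tilde\varrho^j(t)}{dt} + \frac{1}{2} A_{ij}(t)\,\tilde\varrho^i(t)\,\frac{d\varrho^j(t)}{dt} + B_{ij}(t)\,\varrho^i(t)\,\tilde\varrho^j(t) \Big\}\,dt = 0$$

for all $(\tilde\varrho^i) \in H_1([0, 1])$ such that $\tilde\varrho^i(0) = \tilde\varrho^i(1) = 0$ (here we may take $B_{ij} = B_{ji}$).

Integrating the above expression by parts, we see that $\delta^2 J(\sigma; \cdot, \cdot)$ will be singular if and only if the second order differential system

$$(*) \qquad -\frac{d^2\varrho^i(t)}{dt^2} + \frac{1}{2} A_{ij}(t)\,\frac{d\varrho^j(t)}{dt} - \frac{1}{2}\,\frac{d}{dt}\big\{A_{ji}(t)\,\varrho^j(t)\big\} + B_{ij}(t)\,\varrho^j(t) = \lambda\,\varrho^i(t)$$

has a non-zero solution $(\varrho^i)$ satisfying $\lambda = 0$, $\varrho^i(0) = 0$, $\varrho^i(1) = 0$.

Call the real numbers $\lambda$ for which there exists a non-zero function $(\varrho^i)$ with $\varrho^i(0) = 0 = \varrho^i(1)$ satisfying (*) the eigenvalues of the Hessian form $Q_\sigma$; call the corresponding functions $\varrho$ the eigenfunctions belonging to the eigenvalue $\lambda$; and call the number of linearly independent eigenfunctions belonging to the eigenvalue $\lambda$ the multiplicity of the eigenvalue $\lambda$.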

Then the classical theory of Sturm-Liouville equations supplies the following results.

a. Every eigenvalue of the boundary value problem defined by the differential equation (*) and the boundary conditions $\varrho^i(0) = \varrho^i(1) = 0$ is of finite multiplicity. The eigenvalues form an infinite sequence of isolated points bounded below. Thus, if the eigenvalues are enumerated in increasing order, each being repeated a number of times equal to its multiplicity, they form an increasing sequence $\lambda_1, \lambda_2, \ldots$ of real numbers approaching infinity.

b. (Minimax principle). Let $H_1^{(0)}([0, 1])$ denote the set of functions $\varrho = (\varrho^i) \in H_1([0, 1])$ such that $\varrho^i(0) = 0 = \varrho^i(1)$. Put

$$(\varrho, \tilde\varrho) = \int_0^1 \varrho^i(t)\,\tilde\varrho^i(t)\,dt,$$

and $|\varrho| = (\varrho, \varrho)^{1/2}$. Then the k-th eigenvalue $\lambda_k$ in the above sequence is given by the expression

$$(**) \qquad \lambda_k = \max_{\varrho_1, \ldots, \varrho_{k-1}} \ \min_{\substack{\varrho \in H_1^{(0)}([0, 1]),\ |\varrho| = 1 \\ (\varrho, \varrho_j) = 0,\ j = 1, \ldots, k-1}} Q_\sigma(\varrho, \varrho) .$$

Let $0 < a \le 1$, and let $H_1([0, a])$ denote the set of all functions $\varrho = (\varrho^i)$ on the interval $[0, a]$ which have square-integrable first derivatives; let $H_1^{(0)}([0, a])$ denote all those which vanish at both end-points of the interval $[0, a]$. Let $\lambda_n(a)$ be the n-th eigenvalue, in increasing order and with repetitions according to multiplicity, of the equation (*), with boundary conditions $\varrho^i(0) = 0 = \varrho^i(a)$. Then, applying the minimax principle (**), we find that

$$(***) \qquad \lambda_k(a) = \max_{\varrho_1, \ldots, \varrho_{k-1}} \ \min_{\substack{\varrho \in H_1^{(0)}([0, a]),\ |\varrho| = 1 \\ (\varrho, \varrho_j) = 0,\ j = 1, \ldots, k-1}} Q_\sigma(\varrho, \varrho) .$$

Since $H_1^{(0)}([0, a])$ may also be regarded as the subset of $H_1^{(0)}([0, 1])$ consisting of all functions $\varrho = (\varrho^i)$ such that $\varrho^i(t) = 0$, $a \le t \le 1$, it follows immediately, on comparing (**) and (***), that $\lambda_k(a) \ge \lambda_k(1) = \lambda_k$ for all $k$. By the same argument but more generally we have $\lambda_k(a) \ge \lambda_k(b)$ for $a \le b$. Thus the eigenvalues $\lambda_k(a)$, regarded as functions of the parameter $a$, are monotone-decreasing. For all $a$ and all $0 \le t \le a$ we have

$$|\varrho^i(t)| = \Big|\int_0^t (\varrho^i(s))'\,ds\Big| \le a^{1/2} \Big(\int_0^a |(\varrho^i(s))'|^2\,ds\Big)^{1/2};$$

thus for all sufficiently small values of $a$ the first term of the integral $[\dagger]$ dominates the others, and the expression $[\dagger]$ is necessarily positive. By the minimax principle (**), this implies that for sufficiently small $a$, all the eigenvalues $\lambda_k(a)$ are positive. Thus, for each $a > 0$, the number of negative eigenvalues is precisely equal to the number of eigenvalues which have crossed from positive to negative as $a$ has increased from zero to its given value.
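
The behavior of the eigenvalues as the interval $[0, a]$ grows can be followed explicitly in a one-component model problem. The choice of coefficients is an assumption made only for illustration (no first-order term, and a constant zero-order coefficient $-c$ with $c > 0$); on the unit sphere it would correspond to $c = |v|^2$ for the components normal to the geodesic, but that identification is not derived here.

```latex
% Model eigenvalue problem on [0, a]: A = 0, B = -c (constant c > 0).
\begin{align*}
-\varrho''(t) - c\,\varrho(t) &= \lambda\,\varrho(t),
  \qquad \varrho(0) = \varrho(a) = 0 .
\end{align*}
% Eigenfunctions and eigenvalues, k = 1, 2, ...
\begin{align*}
\varrho_k(t) &= \sin\!\Big(\frac{k\pi t}{a}\Big), &
\lambda_k(a) &= \Big(\frac{k\pi}{a}\Big)^{2} - c .
\end{align*}
% Each lambda_k(a) decreases in a, is positive for small a,
% and crosses zero exactly once, at
\begin{align*}
a_k &= \frac{k\pi}{\sqrt{c}} .
\end{align*}
% Number of negative eigenvalues at a = 1:
\begin{align*}
\#\{\,k : \lambda_k(1) < 0\,\} &= \#\{\,k \ge 1 : k\pi < \sqrt{c}\,\} .
\end{align*}
```

In this model every eigenvalue starts positive and crosses zero exactly once, at $a = a_k$, so counting negative eigenvalues at $a = 1$ is the same as counting the values $a_k < 1$ at which the boundary value problem with $\lambda = 0$ has a non-zero solution.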

The above arguments establish the following lemma:

6.32. Lemma:

(i) The Hessian matrix $\delta^2 J^V(\sigma; \varrho, \tilde\varrho)$ is singular, i.e., the critical point $\sigma$ of the functional $J^V$ is degenerate, if and only if the equation (*), with $\lambda = 0$, has a non-zero solution $\varrho = (\varrho^i)$ satisfying $\varrho^i(0) = \varrho^i(1) = 0$.

(ii) Let $0 < a_1 < a_2 < \cdots < a_r < 1$ be the values of $a$ for which the differential equation (*), with $\lambda = 0$, has a non-zero solution $\varrho = (\varrho^i)$ satisfying $\varrho^i(0) = \varrho^i(a) = 0$; and let $n(a)$ be the number of such linearly independent solutions attaching to the value $a$. Then the Morse index of the critical point $\sigma$, i.e., the number of negative eigenvalues of the Hessian matrix $\delta^2 J^V$, is equal to the sum $n(a_1) + \cdots + n(a_r)$.

Next we shall need the following Lemma.

6.33. Lemma: Let $n(a)$ be defined for each $a$ as in (ii) of the preceding Lemma. Let $v \to E(v)$ be the exponential transformation, which sends each tangent vector $v$ at the point $p$ of the manifold $V$ into the point $\sigma_v(|v|)$, where $\sigma_v$ is the geodesic starting from $p$ with tangent vector $v/|v|$. Let $v_0$ be such that $\sigma = \sigma_{v_0}$. Let $dE_{v_0}$ be the gradient of the map $E$ at the point $v_0$. Then $n(1)$ is equal to the dimension of the null-space of the linear transformation $dE_{v_0}$.

Before giving the proof of this Lemma, let us note that the Non-degeneracy Theorem and the Morse Index Theorem follow readily from the two preceding lemmas. Indeed, since $n(1) = 0$ is the criterion for non-degeneracy according to the first of our two lemmas, the Non-degeneracy Theorem follows immediately from the second lemma. As to the Morse Index Theorem, we note that, applying the second of our Lemmas to each of the geodesic segments $\sigma(t)$, $0 \le t \le a$, with $a \le 1$, we find that $n(a) = \lambda(av_0)$ for $0 \le a \le 1$. Thus the Morse Index Theorem follows at once from part (ii) of the first of our Lemmas.

Let us now give the proof of the second lemma:

Proof: Put $\sigma_v(t) = E(vt)$, so that $\sigma_v$ is a geodesic curve parametrized proportional to arc-length, whose tangent vector at $t = 0$ is $v$. Since for any $v$, $\sigma_v$ is a critical point of the functional $J^V(\sigma)$, we have

$$\frac{d}{d\varepsilon}\,J^V(\sigma_v + \varepsilon\varrho)\Big|_{\varepsilon = 0} = 0$$

for every function $\varrho = (\varrho^i)$ vanishing for $t = 0$ and $t = 1$. Thus, if $\tilde v$ is any vector tangent to $V$ at the same point $p$ as $v$, we have

$$\frac{\partial^2}{\partial\varepsilon\,\partial\tau}\,J^V(\sigma_{v + \tau\tilde v} + \varepsilon\varrho)\Big|_{\varepsilon = 0,\ \tau = 0} = 0$$

for all $\varrho$. That is,

$$\delta^2 J^V\Big(\sigma_v;\ \frac{d}{d\tau}\,\sigma_{v + \tau\tilde v}\Big|_{\tau = 0},\ \varrho\Big) = 0$$

for all $\varrho$ vanishing for $t = 0$ and $t = 1$. If we note that the differential equation (*) is derived from the variational condition $[\dagger]$ on integrating by parts, we see at once from this last equation that the function

$$\Delta_{\tilde v}(t) = \frac{d}{d\tau}\,\sigma_{v + \tau\tilde v}(t)\Big|_{\tau = 0}$$

satisfies the linear differential equation (*) with $\lambda = 0$. We have

$$\frac{\partial^2 \sigma_{v + \tau\tilde v}(t)}{\partial t\,\partial\tau}\Big|_{t = 0,\ \tau = 0} = \frac{d}{d\tau}\,(v + \tau\tilde v)\Big|_{\tau = 0} = \tilde v;$$

thus $\Delta_{\tilde v}$ satisfies the initial conditions $\Delta_{\tilde v}(0) = 0$, $\Delta_{\tilde v}'(0) = \tilde v$. On the other hand, taking $t = 1$, we find that

$$\Delta_{\tilde v}(1) = \frac{d}{d\tau}\,\sigma_{v + \tau\tilde v}(1)\Big|_{\tau = 0} = \frac{d}{d\tau}\,E(v + \tau\tilde v)\Big|_{\tau = 0} = dE_v(\tilde v) .$$

Thus the dimension of the null-space of $dE_v$ is at the same time the dimension of the null-space of $\tilde v \to \Delta_{\tilde v}(1)$; and hence equals the dimension of the space of vectors $\tilde v$ such that $\Delta_{\tilde v}$ satisfies both the boundary conditions $\Delta_{\tilde v}(0) = 0$ and $\Delta_{\tilde v}(1) = 0$. That is, the dimension of the null-space of $dE_v$ is the integer $n(1)$ of Lemma 6.33, and thus the proof of Lemma 6.33 is complete.

Q.E.D.

CHAPTER VII

Applications

A. Applications to Homotopy Theory

B. A proof of Theorem 5.16

C. The Homotopy of Some Lie Groups

A. Applications to Homotopy Theory

We first recall the definition of the homotopy groups of a space. Let $X$ be a topological space, $A$ a subspace of $X$ and $p \in A$. Let $I^n$ denote the n-dimensional cube, $I^{n-1} \subset I^n$ the bottom face, and $J^{n-1}$ the union of all the other faces, so that $J^{n-1} = \overline{\partial I^n - I^{n-1}}$. We shall write

$$f : (I^n, I^{n-1}, J^{n-1}) \to (X, A, p)$$

for any continuous function $f : I^n \to X$ which maps $I^{n-1}$ into $A$ and $J^{n-1}$ on $p$. We denote by $\Omega_n(X, A, p)$ the space of all such functions, and by $\pi_n(X, A, p)$ the set of all components of $\Omega_n(X, A, p)$. $\pi_n(X, A, p)$ has a well known group structure (by taking representatives of two elements of $\pi_n$, reparametrizing and then "joining" them). We call it the n-dimensional homotopy group of $X$ relative to $A$ with base point $p$.

The following are easily proved properties of the groups $\pi_n$:

(i) By reparametrizing we get for $n \ge 2$

$$\Omega_n(X, A, p) \cong \Omega_1(\Omega_{n-1}(X, A, p), \theta_0, \theta_0),$$

where $\theta_0$ is the constant map sending $I^{n-1}$ to $p$. Hence, $\pi_n(X, A, p) = \pi_1(\Omega_{n-1}(X, A, p), \theta_0, \theta_0)$.

(ii) In the same way we prove that for $n \ge 2$

$$\pi_n(X, A, p) = \pi_{n-1}(\Omega_1(X, A, p), \theta_0, \theta_0),$$

where $\theta_0$ sends $I^1$ onto $p$.

(iii) The identity of $\pi_n(X, A, p)$ is the class of maps homotopic to the constant map $\theta_0 : I^n \to p$.

(iv) If $\varphi(I^n) \subset A$, then $\varphi$ represents the identity, i.e. is homotopic to $\theta_0$. (Shrink $I^n$ by means of a function $m_t$, so that $\varphi \circ m_t$ is a homotopy between $\varphi$ and $\theta_0$.)

In other words, $\pi_n(X, X, p)$ is trivial for any $p \in X$.

In the sequel we shall write $\pi_n(X, p)$ for $\pi_n(X, p, p)$. It is easy to see that $\pi_n(X, p)$ is, in fact, the usual "absolute" n-dimensional homotopy group of $X$ with base point $p$. Also, $\pi_n(X, A, p)$ will often be written $\pi_n(X, A)$, when no confusion can arise. (Of course if $A$ is arc-wise connected $\pi_n(X, A, p)$ does not depend on $p$.)

Suppose we have a map $\psi : (X, A, p) \to (Y, B, q)$. Then $\psi$ induces a map $\psi_* : \pi_n(X, A, p) \to \pi_n(Y, B, q)$. (Just send $\varphi \in \Omega_n(X, A, p)$ into $\psi \circ \varphi \in \Omega_n(Y, B, q)$.)

7.1. Definition: The boundary homomorphism $\partial : \pi_n(X, A, p) \to \pi_{n-1}(A, p, p)$, or briefly $\partial : \pi_n(X, A) \to \pi_{n-1}(A)$, is defined as follows.

Given $\eta \in \pi_n(X, A)$, take $\varphi \in \eta$; then $\varphi \mid I^{n-1}$ belongs to $\Omega_{n-1}(A, p, p)$ and so determines a class $\partial\eta \in \pi_{n-1}(A)$.

We state the following without proof.

7.2. Theorem: Let $i$ be the injection $(A, p) \to (X, p)$ and $j$ the injection $(X, p) \to (X, A)$. Then the following sequence is exact:

$$\cdots \to \pi_n(X, A) \xrightarrow{\ \partial\ } \pi_{n-1}(A, p) \xrightarrow{\ i_*\ } \pi_{n-1}(X, p) \xrightarrow{\ j_*\ } \pi_{n-1}(X, A) \to \cdots$$

This is the analog for homotopy groups of the Exactness Principle given in Chapter IV, Part 2, § E, of these Notes.

Now, suppose we have a manifold $M$, $f \in C^\infty(M)$, satisfying the P-S condition, and let as usual $M^a = \{x \in M;\ f(x) \le a\}$, $M^b = \{x \in M;\ f(x) \le b\}$. If there are only non-degenerate critical levels between $a$ and $b$, $M^b$ is deformable to $M^a$ with handles attached:

(1) $M^b \sim M^a \cup_{h_1} (D^{k_1} \times D^{l_1}) \cup_{h_2} (D^{k_2} \times D^{l_2}) \cup_{h_3} \cdots$

Let $A$ be another manifold, and $\phi$ a mapping $\phi : A \to M^b$. Assume that $\dim(A)$ is less than the index $k_i$ of any critical point in (1) and that $A$ is compact. Next, note that $\phi$ can be deformed to a smooth map, and set $A_1 = \phi^{-1}(h_1(D^{k_1} \times D^{l_1}))$. Consider $h_1^{-1}\phi : A_1 \to D^{k_1} \times D^{l_1}$. Let $p_1$ be the projection map of $D^{k_1} \times D^{l_1}$ onto $D^{k_1}$. Then $p_1 h_1^{-1} \phi$ is smooth and maps $A_1$ into $D^{k_1}$. But $\dim(A_1) < k_1$, whence some point $q_1$ in $D^{k_1}$ does not belong to the range of $p_1 h_1^{-1}\phi$, and the same holds for the other indices $k_2$, etc.


Now a manifold of the type

$$M^a \cup_{h_1} ((D^{k_1} - q_1) \times D^{l_1}) \cup_{h_2} ((D^{k_2} - q_2) \times D^{l_2}) \cup_{h_3} \cdots$$

can be deformed into $M^a$ (see drawing). Hence, $\phi$ can be deformed to a map $A \to M^a$.

As a special case we get:

7.3. Theorem: $\pi_n(M^b, M^a) = 0$ if $n$ is less than the index of any critical point between $a$ and $b$.

7.4. Corollary: If Morse theory applies to $(M, f)$ and if above some non-critical level $c$ all critical points have indices greater than $n$, then $\pi_n(M, M^c) = 0$.

We will now apply our results to the topology of spheres, in order to obtain the so-called Freudenthal suspension relation between homotopy groups.

First we recall that in relation to $H_1(S^j; p, q)$ and the function $J$, the geodesics joining $p$ and $q$ are critical points whose indices depend on the length of the geodesic: if length$(\gamma) = \pi - \varepsilon$, where $0 < \varepsilon < \pi$, then index$(\gamma) = 0$; if length$(\gamma) = \pi + \varepsilon$, then index$(\gamma) = j - 1$; if length$(\gamma) = 3\pi - \varepsilon$, then index$(\gamma) = 2(j - 1)$; and so on.

This follows from the Morse Index Theorem 6.31. Suppose that we have, as before, a map $\phi : A \to H_1(S^j; p, q)$ where $p \ne q$ and $p \ne q'$, the conjugate of $q$, and that $\dim(A) < 2(j - 1)$. Then by 7.3 $\phi$ is homotopic to a map whose range contains only curves of length at most $\pi + 2\varepsilon$. Now assume that length$(\sigma) < \pi + 2\varepsilon$. Let $m$ be the midpoint of $\sigma$: $m = \sigma(\tfrac12)$. It is easy to see that $m : H_1(S^j; p, q) \to S^j$ is a smooth map.

We have $d(p, m) < \tfrac12\pi + \varepsilon$ and $d(q, m) < \tfrac12\pi + \varepsilon$. This implies that there are unique geodesics $\sigma_1$ joining $p$ and $m$ and $\sigma_2$ joining $q$ and $m$ (see drawing below), if $\varepsilon$ is small enough.


Then $d(\sigma(t), \sigma_1(t)) \le \tfrac12(2\varepsilon + 2\varepsilon) + \tfrac12(\pi + 2\varepsilon) < \pi$ if $\varepsilon$ is small enough ($\varepsilon$ being the distance between $p'$ and $q$). Hence in this case $\sigma(t)$ and $\sigma_1(t)$ are connected by a unique shortest geodesic varying continuously with $t$, whence $\sigma$ can be deformed through these geodesics into $\sigma_1$. The same holds for $\sigma_2$. Thus any map $\phi : A \to H_1(S^j; p, q)$ is homotopic to a map $\tilde\phi : A \to H_1(S^j; p, q)$ such that each value $\tilde\phi(a)$ is a broken geodesic of two segments and total length less than $\pi + 2\varepsilon$. It follows that the space of maps $A \to H_1$ is of the same homotopy type as the space of maps with values in a "belt", and hence of the same homotopy type as the space of maps $A \to S^{j-1}$ (see figure below).

Now, it can readily be proved that $H_1(S^j; p, q)$ is of the same homotopy type as $H_1(S^j; \bar p, \bar q)$; that is, the homotopy type does not depend on the points $p$ and $q$. Thus our result is independent of the relative position of $p$ and $q$.

In particular, we obtain :

7.5. Theorem: If $\dim(A) < 2(j - 1)$, the space of maps $A \to \Omega^1(S^j; p, q)$ is of the same homotopy type as the space of maps $A \to S^{j-1}$.


7.6. Corollary: For $n < 2(j - 1)$,

$$\pi_n(\Omega^1(S^j; p, q)) \cong \pi_n(S^{j-1}).$$

By property (ii) of the homotopy groups, we obtain

7.7. Corollary: $\pi_{n+1}(S^{j+1}) \cong \pi_n(S^j)$, for $n < 2j$.

Corollary 7.7 is known as the Freudenthal suspension relation.

7.8. Corollary: $\pi_n(S^n) \cong \pi_{n+1}(S^{n+1})$, if $n > 0$,

whence $\pi_n(S^n) = Z$ if $n > 0$.

B. A Proof of Theorem 5.16


Let $X \xrightarrow{p} B$ be a fiber space. Also let $\phi : A \to X$ and $\psi = p\phi : A \to B$. We say that the homotopy $\psi_t$ of $\psi$ has the "lifting property" if there exists a homotopy $\phi_t$ of $\phi$ such that $\psi_t = p\phi_t$.

Example: If $X = B \times C$ and $p_1 : X \to B$ is the natural projection on $B$ and $p_2 : X \to C$ that on $C$, and if $\phi$ and $\psi$ are two functions as above, then given a homotopy $\psi_t$ the map

$$\phi_t = (\psi_t,\ p_2\phi)$$

has the required properties.

We state without proof the following

7.9. Theorem (Künneth) [Cf., for example, Hilton and Wylie, Homology Theory.]

$$H_n(B \times C; G) = \bigoplus_{k+l=n} H_k(B; H_l(C; G)).$$

7.10. Corollary: Suppose $G$ = real numbers. Then

$$b_n(B \times C) = \sum_{k+l=n} b_k(B)\, b_l(C),$$

where $b_n$ are the Betti numbers.

If we form the Betti polynomials

$$b(B, z) = \sum_{n \ge 0} z^n b_n(B),$$

Corollary 7.10 implies that

$$b(B \times C, z) = b(B, z)\, b(C, z).$$
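The product rule for Betti polynomials amounts to multiplying coefficient lists. The following is a minimal sketch, not part of the original notes; the helper names `betti_product` and `betti_polynomial` are hypothetical, and the input lists are the standard real Betti numbers of the circle and the 2-sphere.

```python
def betti_product(b, c):
    """Betti numbers of B x C from those of B and C (Corollary 7.10).

    b[k] and c[l] are the k-th and l-th Betti numbers; the result is the
    convolution sum over k + l = n.
    """
    out = [0] * (len(b) + len(c) - 1)
    for k, bk in enumerate(b):
        for l, cl in enumerate(c):
            out[k + l] += bk * cl
    return out


def betti_polynomial(b, z):
    """Evaluate the Betti polynomial sum_n z^n b_n at the number z."""
    return sum(bn * z ** n for n, bn in enumerate(b))


circle = [1, 1]        # b_0 = b_1 = 1
sphere2 = [1, 0, 1]    # b_0 = b_2 = 1, b_1 = 0

torus = betti_product(circle, circle)
print(torus)
print(betti_product(sphere2, circle))
```

Evaluating `betti_polynomial` at any number on both sides of the product formula gives the same value, which is a quick consistency check on the convolution.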


Consider now the following fiber space: take a topological space $B$, a point $b \in B$ and let $X$ be the space of all curves in $B$ starting at $b$, with the usual topology. Of course $p : X \to B$ assigns its end point to each curve.

Let $\phi : A \to X$, and $\psi = p\phi$. For a given homotopy $\psi_t$ of $\psi$, let $\phi_t(a)$ be the curve $\phi(a)$ followed by the curve $s \mapsto \psi_s(a)$, $0 \le s \le t$. This provides a lifting. So $X \xrightarrow{p} B$ has the lifting homotopy property. Furthermore, $X$ has the homotopy type of a point (just shrink each curve to the point $b$). Therefore $H_n(X) = 0$ for $n > 0$.

Returning to Theorem 7.9, set $Z^{k,l} = H_k(B, H_l(C, G))$ for $k, l \ge 0$, and denote by $Z$ the whole double sequence $\{Z^{k,l};\ k, l \ge 0\}$. More generally, assume we have two arbitrary double sequences of Abelian groups, $Z$ and $\tilde Z$. Then we make the following

7.11. Definition: We say that $\tilde Z$ is derived from $Z$ by an r-boundary operation if there exists a "boundary" operator $d : \bigoplus Z^{k,l} \to \bigoplus Z^{k,l}$ such that $d^2 = 0$,

(#) $d : Z^{k,l} \to Z^{k-r,\, l+r-1}$

and

$$\tilde Z^{k,l} = \frac{\{dz = 0\} \cap Z^{k,l}}{dZ \cap Z^{k,l}}.$$

(It should be understood that $Z^{k,l}$ is the trivial group for $k$ or $l < 0$.) $d$ is called an operator of type $r$. In this case we shall write $\tilde Z = \mathcal{H}_r(Z)$.

Observe that for $r$ large and $k + l$ small, $\tilde Z^{k,l} = Z^{k,l}$, because in this case $\{dz = 0\} \cap Z^{k,l} = Z^{k,l}$ and $dZ \cap Z^{k,l} = \{0\}$.

7.12. Lemma: If we have operators $d_i$ of type $i = 2, 3, \dots$ and, starting with $Z$, sequences $\mathcal{H}_2(Z)$, $\mathcal{H}_3(\mathcal{H}_2(Z))$ of groups, etc., the limit

$$\mathcal{H}_\infty(Z) = \lim_{r \to \infty} \mathcal{H}_r(\mathcal{H}_{r-1}(\cdots \mathcal{H}_2(Z) \cdots))$$

exists.

This follows from the above observation.

Next we quote the following fundamental theorem on the homology of fiber spaces, but without giving its proof.

7.13. Theorem (Leray-Serre): [Cf. Serre, "Homologie Singulière des Espaces Fibrés", Ann. Math. 54 (1951).] Let $X \xrightarrow{p} B$ be a fiber space, with $B$ connected and simply connected, and connected fiber $F = p^{-1}(b)$. Put $Z^{k,l} = H_k(B, H_l(F))$. Then $H_n(X)$ has a composition series with factors $\tilde Z^{k,l}$, $k + l = n$, such that $\tilde Z = \mathcal{H}_\infty(Z)$. (A composition series for $G$ is a sequence of subgroups $G_i$ of $G$ such that $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq 0$, and the factors are the groups $G_i / G_{i+1}$.)


Let us consider once more the fiber space $X \xrightarrow{p} B$ of curves beginning at $b \in B$, with fiber $F = p^{-1}(b) = \Omega(B)$. As we said before, all the homology groups $H_n(X)$ are zero for $n > 0$. As in Theorem 7.13, put $Z^{k,l} = H_k(B, H_l(\Omega(B)))$, where $Z^{k,0} = H_k(B)$. Suppose that $H_n(B)$ is the first non-vanishing homology group of $B$ of positive dimension (see diagram below). If $n > 2$, by Theorem 7.13, $Z^{0,1}$ must be zero, because this group does not change when homology with respect to an r-boundary operation, $r \ge 2$, is taken; since the final result must be 0, all $\tilde Z^{k,l}$ being zero, $H_1(\Omega)$ itself must vanish. This implies that all the groups in the column of $H_1(\Omega)$ are 0. Similarly, if $n > 3$, all the groups in the column of $H_2(\Omega)$ are zero. Using these remarks, we may prove the following theorem.

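[Diagram: the array of groups $Z^{k,l}$, with corner entry $H_0(B) = H_0(\Omega)$ and the groups $H_1(\Omega), H_2(\Omega), \dots, H_{n-1}(\Omega)$ along one axis; the entry $Z^{1,1}$ and the operator $d_2$ are marked, and the vanishing entries are labelled 0.]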

7.14. Theorem: If $B$ is connected and simply connected, the first non-vanishing homology group $H_n(B)$ of positive dimension is isomorphic to the first non-vanishing homology group of positive dimension of $\Omega(B)$, which is $H_{n-1}(\Omega(B))$.

Thus

$$H_n(B) \cong H_{n-1}(\Omega(B)).$$

Proof: Suppose, for example, that $n = 3$. After homology with respect to the 2-boundary operation is taken, $Z^{3,0}$ remains the same, for $Z^{1,1}$ is zero by the above remark. The same is true of $Z^{0,2}$. Taking homology with respect to the 3-boundary operation may change both groups, but all the other homologies leave invariant the groups in the places (3, 0) and (0, 2). But the limit groups $\mathcal{H}_\infty(Z)^{3,0}$ and $\mathcal{H}_\infty(Z)^{0,2}$ must be zero, so the 3-boundary homology gives us zero in both places. In other words, the sequence

$$0 \to H_3(B) \xrightarrow{d_3} H_2(\Omega) \to 0$$

is exact, which proves the theorem.

Q.E.D.

7.15. Corollary (Hurewicz): If $B$ is connected and simply connected, the first non-vanishing homology group of positive dimension, $H_n(B)$, is isomorphic to the first non-vanishing homotopy group of positive dimension, $\pi_n(B)$.

Proof:

$$H_n(B) = H_{n-1}(\Omega(B)) = \cdots = H_1(\Omega^{n-1}(B)) = \pi_1(\Omega^{n-1}(B)) = \pi_n(B).$$

Q.E.D.

Now assume that $B$ is a finite dimensional space. Consider homology groups with real coefficients and let $D^{k,l} = \dim Z^{k,l} = b_k(B)\, b_l(\Omega(B))$, where the $b_k$ are Betti numbers.

Suppose that $\Omega(B)$ has only finitely many non-vanishing Betti numbers; let $b_r(\Omega(B))$ be distinct from zero, and $b_l(\Omega(B)) = 0$ for $l > r$. Similarly, let $b_n(B) \ne 0$ and $b_k(B) = 0$ for $k > n$. Then $D^{n,r}$ is different from zero, and remains fixed throughout the sequence of homologies of Lemma 7.12 and Theorem 7.13 (same argument as before). But this is a contradiction, for the final result gives the trivial homology of the path-space $X$ and hence must be 0. Thus $\Omega(B)$ always has infinitely many non-vanishing homology groups. Suppose next that one of the numbers, say $b_s(\Omega(B))$, is infinite, and that for $l < s$, $b_l(\Omega(B))$ is finite. Then the number at the node $(s, 0)$ of the above diagram remains infinite throughout the sequence of homologies, which is again a contradiction. We have thus proved

7.16. Theorem: If $B$ is connected, simply connected and finite dimensional, $\Omega(B)$ has infinitely many non-vanishing real homology groups and all of them have finite dimensions.

Q.E.D.

The space $\Omega(B)$ is an example of the more general concept of a "group-like space".

7.17. Definition: Let $X$ be a topological space. Then $X$ is called a group-like space if there is a binary operation defined on it, a distinguished element called the identity, and a mapping $x \to x^{-1}$ such that all the properties defining a group are satisfied up to homotopy (e.g. $m \cdot m^{-1} \simeq e$ = identity).

For group-like spaces we have the following theorem of Hopf, which we quote from the cited paper of Serre but shall not prove.

7.18. Theorem: The cohomology ring with real coefficients of a group-like space with finite dimensional homology groups is the direct product of a polynomial algebra and an exterior algebra.

7.19. Corollary: Under the above hypotheses, if the group-like space $X$ has infinitely many non-vanishing Betti numbers, the cup-length of $X$ equals $\infty$. (See § 2 of Chapter 5 for the definition of cup-length.)

7.20. Corollary: If $B$ is connected, simply connected and finite dimensional, then

cup-length $(\Omega(B)) = \infty$.

[Compare with Theorem 5.16.]

7.21. Corollary: Under the above hypotheses, any two points of a Riemannian manifold $B$ are connected by indefinitely long geodesics.

Remark: It can be proved that if $B$ is compact, simple connectedness is not necessary.

C. The Homotopy of Some Lie Groups

We first recall the definition and some properties of the unitary group. For more details, see Milnor's book on Morse theory. The unitary group $U(n)$ is the group of all $n \times n$ complex matrices preserving the inner product in $C^n$, or equivalently, the group of all $n \times n$ complex matrices such that $UU^* = I$, where $U^*$ is the conjugate transpose of $U$.

This is a Lie group, and the tangent space at the identity $I$ is the space of matrices $\{iH\}$, where $H$ is hermitian, i.e. $H = H^*$. Analogously, the tangent space at $U_0$ is the space of matrices $\{iU_0H\} = \{iHU_0\}$. The matrix exponential function defined by

$$\exp A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$

coincides with the exponential function defined on the Lie algebra $\{iH\}$ with values in the Lie group $U(n)$. The scalar product

$$\langle A, B \rangle = \mathrm{Re}\ \mathrm{trace}\,(AB^*)$$

defines a Riemannian structure on U(n).
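As a numerical illustration of the statement that the exponential series maps $\{iH\}$, $H$ hermitian, into $U(n)$, here is a minimal sketch in plain Python. It is not from the original notes: the helper names (`mat_mul`, `mat_exp`, `unitarity_defect`) and the sample matrix are assumptions made for the example, and the series is simply truncated after a fixed number of terms.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=40):
    """Truncated series I + A + A^2/2! + A^3/3! + ... (illustrative only)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def conj_transpose(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def unitarity_defect(h):
    """Largest entry of |U U* - I| for U = exp(iH)."""
    n = len(h)
    u = mat_exp([[1j * x for x in row] for row in h])
    p = mat_mul(u, conj_transpose(u))
    return max(abs(p[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))


# A sample hermitian matrix (H equals its conjugate transpose).
H = [[1.0, 2 - 1j], [2 + 1j, -0.5]]
print(unitarity_defect(H))  # expected to be at rounding-error level
```

The truncation length 40 is ample for a matrix of this size; for matrices of large norm a scaling-and-squaring scheme would be the more robust choice.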


The geodesics beginning at $I$ are the curves of the form $\sigma(t) = \exp(itH)$, with $H$ hermitian. We say that $\sigma(t)$ has $H$ as initial velocity.

Our aim now is to determine at which points of the tangent space at the origin, that is, at which hermitian matrices $H$, the exponential function has a vanishing Jacobian. The image of these points under the exponential is the set of conjugate points to the identity $I$.

In general, let $f$ be any analytic function of a matrix. Then $f$ has a Cauchy integral representation,

$$f(M) = \frac{1}{2\pi i} \oint \frac{f(z)}{z - M}\, dz.$$

If $\delta f(M, N)$ denotes the first variation of $f$ at the point $M$ applied to $N$, we have:

(1) $\delta f(M, N) = \dfrac{1}{2\pi i} \oint f(z)\, \delta[(z - M)^{-1}, N]\, dz.$

On the other hand,

(2) $\delta (z - M)^{-1} = (z - M)^{-1}\, \delta M\, (z - M)^{-1}.$

Now consider the operators $\rho(A)$ and $\lambda(A)$ on matrices, defined as right and left multiplication by $A$ respectively. Since the mapping $A \to \rho(A)$ is a homomorphism from the group of non-singular matrices to the group of non-singular linear operators on the linear space of all matrices, and similarly for $\lambda$, we obtain:

$$\rho((z - A)^{-1}) = (z - \rho(A))^{-1}$$

and

$$\lambda((z - A)^{-1}) = (z - \lambda(A))^{-1}.$$

Hence from formula (2) we obtain

$$\delta((z - M)^{-1}) = (z - \rho(M))^{-1} (z - \lambda(M))^{-1}\, \delta M,$$

and therefore using (1) it follows that

(3) $\delta f(M, N) = \left\{ \dfrac{1}{2\pi i} \oint f(z)\, (z - \rho(M))^{-1} (z - \lambda(M))^{-1}\, dz \right\} (N).$

Set

$$\Phi(\xi_1, \xi_2) = \frac{1}{2\pi i} \oint \frac{f(z)}{(z - \xi_1)(z - \xi_2)}\, dz.$$

Then

$$\delta f(M, N) = \Phi(\rho(M), \lambda(M))\, (N).$$

Moreover,

$$\frac{1}{(z - \xi_1)(z - \xi_2)} = \frac{1}{\xi_1 - \xi_2} \left( \frac{1}{z - \xi_1} - \frac{1}{z - \xi_2} \right),$$

so

$$\Phi(\xi_1, \xi_2) = \frac{f(\xi_1) - f(\xi_2)}{\xi_1 - \xi_2}.$$

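We now return to the function $f(H) = \exp(iH)$. In this case, we have

$$\delta \exp(iH) = \Phi(\rho(H), \lambda(H))\, \delta H,$$

where

$$\Phi(\xi_1, \xi_2) = \frac{\exp(i\xi_1) - \exp(i\xi_2)}{\xi_1 - \xi_2}.$$

But $\rho(A) - \lambda(A) = \mathrm{Ad}(A)$, and $\rho(H)$ and $\lambda(H)$ commute, so that $\exp(i\rho(H)) = \exp(i\lambda(H)) \exp(i\, \mathrm{Ad}(H))$.

So finally we get the formula

$$\delta \exp(iH) = \exp(iH)\, \psi(\mathrm{Ad}(H))\, \delta H,$$

where

$$\psi(z) = \frac{\exp(iz) - 1}{z}.$$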

Furthermore, the eigenvalues of $\psi(\mathrm{Ad}(H))$ are equal to $\psi$(eigenvalues of $\mathrm{Ad}(H)$). The zeros of $\psi$ are $z = 2\pi n$, $n = \pm 1, \pm 2, \dots$ (the point $z = 0$ is not a zero, since $\psi$ has a finite non-zero limit there). Hence the matrices $H$ which give rise to conjugate points to the identity are those whose $\mathrm{Ad}(H)$ has an eigenvalue of the form $2\pi n$, $n = \pm 1, \pm 2, \dots$ Now, the eigenvalues of $\mathrm{Ad}(H)$ are differences of the eigenvalues of $H$, hence the matrices we are looking for are those having eigenvalues differing by $2\pi n$, $n = \pm 1, \pm 2, \dots$
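The location of the zeros of $\psi$ can be checked numerically. The sketch below is illustrative only and assumes the sign convention $\psi(z) = (\exp(iz) - 1)/z$; an overall sign change of $\psi$ would not move its zeros. The function name `psi` is chosen for this example.

```python
import cmath
import math


def psi(z):
    """psi(z) = (exp(iz) - 1) / z, extended by its limit i at z = 0."""
    if z == 0:
        return 1j
    return (cmath.exp(1j * z) - 1) / z


# psi vanishes at the non-zero multiples of 2*pi ...
for n in (-3, -2, -1, 1, 2, 3):
    print(n, abs(psi(2 * math.pi * n)))

# ... but not at z = 0, nor at a generic point such as z = 1.
print(abs(psi(0)), abs(psi(1.0)))
```

So two eigenvalues of $H$ that coincide (difference 0) do not by themselves produce a conjugate point; only non-zero multiples of $2\pi$ do.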

All these calculations apply to the Special Unitary group $SU(n)$ also, but in this case the tangent space at the identity, that is, the Lie algebra, is the space of matrices $\{iH\}$ where $H$ is hermitian and has trace 0.

The following considerations apply to the group U(2m) or the groupSU (2m).

We choose an element $E$ near $-I$ having the form

$$E = \mathrm{diag}\left( e^{i(\pi + \varepsilon_1)},\ e^{-i(\pi + \varepsilon_1)},\ e^{i(\pi + \varepsilon_2)},\ e^{-i(\pi + \varepsilon_2)},\ \dots \right).$$


We wish to study the geodesics joining $I$ and $E$. Therefore we have to find the solutions of $\exp(iH) = E$. But such an element $H$ commutes with $E$, and since $E$ is diagonal with distinct entries, $H$ too must be diagonal. Set

$$H = \mathrm{diag}(h_1, h_2, \dots, h_{2m}).$$

Then $\exp(ih_1) = \exp(i(\pi + \varepsilon_1))$; $\exp(ih_2) = \exp(i(\pi - \varepsilon_1))$; etc. Hence,

$$h_1 = \pi + \varepsilon_1 + 2\pi n_1 = \varepsilon_1 + (2n_1 + 1)\pi,$$

$$h_2 = \pi - \varepsilon_1 + 2\pi n_2 = -\varepsilon_1 + (2n_2 + 1)\pi, \quad \text{etc.}$$

This means that the $h_j$ are of the form:

$$(2k + 1)\pi \pm \varepsilon.$$

The length of the geodesic with initial velocity $H$ is

$$L = \left| \frac{d}{dt} \exp(itH) \right| = (\mathrm{tr}(HH^*))^{1/2} = (\mathrm{tr}(H^2))^{1/2} = \left\{ \sum_j ((2n_j + 1)\pi \pm \varepsilon)^2 \right\}^{1/2}.$$

Choosing $\pm 1$ for the coefficients of $\pi$, we obtain $2^{2m}$ geodesics of minimal length $\approx \pi\sqrt{2m}$.

The next shortest geodesics are obtained when all coefficients but one are $\pm 1$, and one of them is $\pm 3$; then the length is $\approx \pi\sqrt{2m + 8}$.

Conjugate points to the identity appear along a geodesic when $t(h_i - h_j) = 2\pi n$ for some $t$, $0 < t < 1$, and for some $h_i \ne h_j$. Given $h_i$ and $h_j$ there are $\left[ \dfrac{|h_i - h_j|}{2\pi} \right]$ ($[\ ]$ = integer part) conjugate points. The total number of conjugate points along the geodesic is therefore

$$\sum_{h_i \ne h_j} \left[ \frac{|h_i - h_j|}{2\pi} \right] = 2 \sum_{h_i > h_j} \left[ \frac{h_i - h_j}{2\pi} \right].$$

But $h_j = \pi(2n_j + 1) \pm \varepsilon$. Hence the total number of conjugate points is $\ge 2 \sum_{n_j > n_i} (n_j - n_i - 1)$ for $\varepsilon$ small enough.

Consider now the special case of the Special Unitary group $SU(2m)$. Then we have


7.22. Lemma: Unless $m$ of the $n_j$'s in the formula for the $h_j$'s are 0 and the $m$ others are $-1$, the geodesic with initial velocity $H$ passes through at least $2(m + 1)$ conjugate points to the identity.

Proof: Since the trace of $H$ is zero, $\sum h_j = \pi(\sum 2n_j + 2m) = 0$, so $\sum n_j = -m$. Thus there are two possibilities: either $\sum_{n_j < 0} n_j < -m$, or $\sum_{n_j < 0} n_j = -m$. In the first case, there is at least one positive $n_j$, call it $n_1$, so that if $N$ denotes the number of conjugate points

$$N \ge 2 \sum_{n_j > n_i} (n_j - n_i - 1) \ge 2 \sum_{n_j < 0} (n_1 - n_j - 1) \ge 2 \sum_{n_j < 0} (-n_j) \ge 2(m + 1).$$

In the second case, $\sum_{n_j < 0} n_j = -m$, there are no positive $n_j$'s. If our hypothesis is violated, some negative $n_j$ must be less than $-1$, so the number of $n_j$'s equal to zero is $\ge m + 1$. Now,

$$N \ge 2 \sum_{n_k = 0,\ n_j < -1} (n_k - n_j - 1) \ge 2\, (\text{number of } n\text{'s equal to } 0) \ge 2(m + 1).$$

Q.E.D.
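The counting argument of Lemma 7.22 can be checked by brute force for small $m$. The following sketch is an illustration added here, not part of the original proof: it enumerates integer tuples $(n_1, \dots, n_{2m})$ with $\sum n_j = -m$ in a small window (the window bounds `lo` and `hi` are arbitrary assumptions) and evaluates the lower bound $2 \sum_{n_j > n_i} (n_j - n_i - 1)$ used above.

```python
from itertools import combinations_with_replacement


def conjugate_point_lower_bound(ns):
    """2 * sum over pairs with n_j > n_i of (n_j - n_i - 1)."""
    return 2 * sum(nj - ni - 1 for nj in ns for ni in ns if nj > ni)


def is_exceptional(ns, m):
    """True when m of the n_j are 0 and the m others are -1."""
    return sorted(ns) == [-1] * m + [0] * m


def lemma_holds(m, lo=-4, hi=3):
    """Check the bound >= 2(m + 1) for all non-exceptional tuples in the window."""
    for ns in combinations_with_replacement(range(lo, hi + 1), 2 * m):
        if sum(ns) != -m or is_exceptional(ns, m):
            continue
        if conjugate_point_lower_bound(ns) < 2 * (m + 1):
            return False
    return True


print(lemma_holds(1), lemma_holds(2), lemma_holds(3))
print(conjugate_point_lower_bound((0, 0, 0, -2)))  # a case where the bound is attained for m = 2
```

Since the bound only depends on the multiset of the $n_j$, enumerating sorted tuples with `combinations_with_replacement` is enough.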

7.23. Corollary: All the geodesics joining $I$ and $E$ of non-minimal length contain at least $2(m + 1)$ conjugate points and therefore have index $\ge 2(m + 1)$.

7.24. Corollary: For the loop space $H_1(SU(2m), I, E)$ and the length function $J$, the relative homotopy group

$$\pi_j(H_1, \{J \le \pi\sqrt{2m} + \delta\}) = 0$$

for all $j \le 2m + 1$ and $\delta$ small enough.

The proof is an application of Corollary 7.4.

We consider now the geodesics joining $I$ and $-I$ in $SU(2m)$. If $H$ is the initial velocity of such a geodesic, $H$ has eigenvalues $(2n_j + 1)\pi$, and the length of the corresponding geodesic is

$$L = \left[ \sum_{j=1}^{2m} ((2n_j + 1)\pi)^2 \right]^{1/2}.$$
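To see which lengths actually occur, one can enumerate the admissible integers $n_j$. The sketch below is an added illustration under stated assumptions: it works with $L^2/\pi^2 = \sum (2n_j + 1)^2$, imposes the trace condition $\sum n_j = -m$, and only searches a small window of values (the bounds `lo` and `hi` are arbitrary). The function names are hypothetical.

```python
from itertools import combinations_with_replacement


def squared_length_over_pi2(ns):
    """L^2 / pi^2 for a geodesic whose velocity has eigenvalues (2 n_j + 1) pi."""
    return sum((2 * n + 1) ** 2 for n in ns)


def two_shortest(m, lo=-3, hi=2):
    """Two smallest values of L^2 / pi^2 subject to trace(H) = 0."""
    values = set()
    for ns in combinations_with_replacement(range(lo, hi + 1), 2 * m):
        if sum(ns) == -m:  # trace(H) = pi * sum(2 n_j + 1) = 0
            values.add(squared_length_over_pi2(ns))
    ordered = sorted(values)
    return ordered[0], ordered[1]


for m in (1, 2, 3):
    print(m, two_shortest(m))
```

For $m = 2$ and $m = 3$ the two smallest values are $2m$ and $2m + 8$. For $m = 1$ the trace condition forces the second value to be larger than $2m + 8$, so in that case the gap between the minimal and the next length is even wider.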

We obtain the geodesics of minimal length when all the coefficients $2n_j + 1$ of $\pi$ are $\pm 1$. This length is $\pi\sqrt{2m}$, and the other geodesics have length $\ge \pi\sqrt{2m + 8}$. For the geodesics of minimal length, the fact that trace$(H) = 0$ implies that there are $m$ eigenvalues equal to $+\pi$ and $m$ eigenvalues equal to $-\pi$. In this case, the matrix $H$ is completely determined by giving the subspace of eigenvectors of the eigenvalue $\pi$, the other subspace, corresponding to $-\pi$, being orthogonal to the first one. Therefore we have a homeomorphism between the manifold of minimal geodesics joining $I$ to $-I$ in $SU(2m)$ and the Grassmann manifold $G_m(2m)$ of m-dimensional linear subspaces of $C^{2m}$.

We shall now prove the following theorem due to Bott (compare to Lemma 22.5 of Milnor's Morse Theory):

7.25. Theorem (Bott): Let $M$ be a complete Hilbert manifold, $f$ a smooth function satisfying the P-S condition. Suppose that the set $\{f \le a\}$ has only one critical level $c$, with critical set $K = f^{-1}(c)$, and assume that $K$ is a finite dimensional submanifold of $M$. Then

$$\pi_k(\{f \le a\}, K) = 0 \quad \text{for all } k,$$

provided that $f$ is bounded below.

For the proof we need the following lemma.

7.26. Lemma: Under the hypotheses above, if $\{x_n\}$ is a sequence such that $f(x_n) \to c$, then there exists a subsequence $\{x_{n_j}\}$ such that $x_{n_j} \to x \in K$.

Proof: Take $c$ to be 0. Consider the vector field $v = -\nabla f$, and let $\sigma(x, t)$ be the flow of $v$ ($\sigma(x, 0) = x$; see Definition 4.44). For each $n$, let $t_n$ be the first value of $t$ for which

$$\|\nabla f(\sigma(x_n, t))\| \le \frac{1}{n}.$$

We prove the existence of such a $t_n$ as follows.

If $g(t) = f(\sigma(t))$, $\sigma(t)$ being any solution of $\sigma'(t) = v(\sigma(t))$, then

$$g'(t) = df_{\sigma(t)}(\sigma'(t)) = df_{\sigma(t)}(-\nabla f_{\sigma(t)}) = -\|\nabla f_{\sigma(t)}\|^2,$$

so $g(t)$ is monotone decreasing. Since $g$ is bounded below, its derivative cannot be bounded away from zero for positive $t$. Thus $t_n$ exists. Set $y_n = \sigma(x_n, t_n)$. Then we have

$$f(y_n) \le f(x_n).$$

By the P-S condition, there is a subsequence $\{y_{n_j}\}$ which tends to a critical point $y_\infty$. Since 0 is the only critical level, $f(y_{n_j}) \to 0$. Assume without loss of generality that $n f(x_n) \to 0$. Then $n f(y_n) \to 0$ also. But

$$d(x_{n_j}, y_{n_j}) = d(\sigma(x_{n_j}, 0), \sigma(x_{n_j}, t_{n_j})) \le \int_0^{t_{n_j}} \|\sigma'(x_{n_j}, t)\|\, dt = \int_0^{t_{n_j}} \|\nabla f(\sigma(x_{n_j}, t))\|\, dt.$$

In the interval $(0, t_{n_j})$, $\|\nabla f(\sigma(x_{n_j}, t))\| > 1/n_j$, whence the last expression is less than

(*) $n_j \displaystyle\int_0^{t_{n_j}} \|\nabla f(\sigma(x_{n_j}, t))\|^2\, dt.$

Now, $\dfrac{d}{dt} f(\sigma(t)) = -\|\nabla f(\sigma(t))\|^2$. So (*) equals

$$-n_j \int_0^{t_{n_j}} \frac{d}{dt} f(\sigma(x_{n_j}, t))\, dt = n_j\, (f(x_{n_j}) - f(y_{n_j})),$$

which tends to zero. Thus $\{x_{n_j}\}$ also tends to $y_\infty \in K$.

Q.E.D.
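The key estimate of this proof, $d(x, y) \le n\,(f(x) - f(y))$ with $y$ the first point on the flow line where the gradient norm drops to $1/n$, can be seen on a one-dimensional toy example. The sketch below is an added illustration, not from the notes: it assumes $f(x) = x^2/2$ on the real line, for which the flow of $-\nabla f$ is $\sigma(x, t) = x e^{-t}$ in closed form; all function names are hypothetical.

```python
import math


def f(x):
    return 0.5 * x * x


def flow(x, t):
    """Flow of v = -grad f for f(x) = x^2 / 2: sigma(x, t) = x * exp(-t)."""
    return x * math.exp(-t)


def first_time_small_gradient(x, n):
    """First t >= 0 with |f'(sigma(x, t))| <= 1/n."""
    if abs(x) <= 1.0 / n:
        return 0.0
    return math.log(n * abs(x))


def distance_and_bound(x, n):
    """Return d(x, y_n) and the bound n * (f(x) - f(y_n))."""
    y = flow(x, first_time_small_gradient(x, n))
    return abs(x - y), n * (f(x) - f(y))


for x in (1.0, 0.5, 0.1):
    for n in (1, 2, 5, 20):
        d, bound = distance_and_bound(x, n)
        print(x, n, round(d, 6), round(bound, 6))
```

In this example the only critical level is $c = 0$ with $K = \{0\}$, and the printed distances never exceed the corresponding bounds.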

Proof of Theorem 7.25: By our hypotheses, $K$ has a neighborhood $N$ in $M$ homeomorphic to $K \times$ disc. [Cf. O. Hanner, "Some theorems on absolute neighborhood retracts", Arkiv för Matematik, Vol. 1 (1950), pp. 389-408.]

Assume that $S$ is a topological space and $\phi : S \to \{f \le a\}$ is continuous. Using Morse Theory it follows that for any $\varepsilon > 0$, $\phi$ is homotopic to a $\phi_1$ such that $c - \varepsilon \le f(\phi_1(S)) \le c + \varepsilon$ [because $\{f \le a\}$ and $\{f \le c + \varepsilon\}$ have the same homotopy type]. Therefore there is a homotopic $\phi_1$ such that $f(\phi_1(S)) \le c + \varepsilon$ and $\phi_1(S) \subseteq N$. If this were not the case, we would construct a sequence $\{x_n\}$ such that $f(x_n) \to c$ and $x_n \notin N$ for all $n$, contradicting Lemma 7.26. Since $N$ is homeomorphic to $K \times$ disc, it follows by squeezing the disc that $\phi_1$ is homotopic to a function $\phi_2$ with values in $K$.

Q.E.D.

We return to our study of the groups $U(2m)$ and $SU(2m)$. By Corollary 7.24, any map from a space $X$ of dimension $< 2m + 1$ into $H_1(SU(2m), I, E)$ can be pushed down homotopically into a map of $X$ with values in curves whose length is $\le \pi\sqrt{2m} + \varepsilon$, for any $\varepsilon > 0$. The same result is true if we consider instead the space of loops $H_1(SU(2m), I, -I)$; the points $-I$ and $E$ being joined by a unique minimal geodesic, one can prove that $H_1(SU(2m), I, E)$ and $H_1(SU(2m), I, -I)$ have the same homotopy type.

7.27. Lemma: For $k < 2m + 1$, $\pi_k(\Omega(SU(2m), I, -I), K) = 0$, where $K$ is the manifold of minimal geodesics joining $I$ and $-I$.

Proof: By the remark above, a map from the k-cube into $\Omega(SU(2m), I, -I)$ can be pushed down to a map into the space $\Omega_0$ of curves of length at most $\pi\sqrt{2m} + \varepsilon$. But by Bott's theorem, the space $\Omega_0$ has vanishing homotopy groups relative to $K$.

Q.E.D.

7.28. Corollary: For $k < 2m + 1$, $\pi_k(\Omega(SU(2m), I, -I)) \cong \pi_k(G_m(2m))$.

Proof: By Theorem 7.2, the sequence

$$\cdots \to \pi_k(X, A) \xrightarrow{\partial} \pi_{k-1}(A) \to \pi_{k-1}(X) \to \pi_{k-1}(X, A) \to \cdots$$

is exact. The first and the last written terms are zero by Lemma 7.27, whence the middle terms are isomorphic, i.e.

$$\pi_k(\Omega(SU(2m), I, -I)) = \pi_k(K).$$

But, as noted preceding Theorem 7.25, $K$ is homeomorphic to the Grassmann manifold $G_m(2m)$.

Q.E.D.

7.29. Corollary (Bott's isomorphisms):

$$\pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m)) \quad \text{for } k \le 2m.$$

We now proceed to obtain corresponding results for the group $U(n)$. Let $X \xrightarrow{p} B$ be a fiber space, $X$ and $B$ connected.

Let $F = p^{-1}(b)$ be the fiber. Then $p$ induces an isomorphism $p_* : \pi_k(X, F) \to \pi_k(B)$ [cf. Steenrod, The Topology of Fiber Bundles, or Hilton and Wylie, Homology Theory, pp. 288-289]. Using the exact sequence for homotopy we see that

$$\cdots \to \pi_k(B) \to \pi_{k-1}(F) \to \pi_{k-1}(X) \to \pi_{k-1}(B) \to \cdots$$

is exact [this is the so-called exact bundle sequence].

If $G$ is a connected Lie group, and $H$ a subgroup of $G$, then $G$ is a fiber bundle with base space $G/H$ (the factor space of $G$ by $H$). The projection $p$ is the natural one. [This uses the existence of local cross sections of $G$ over $G/H$. See Steenrod's book.] Consider now the inclusion $SU(n) \subset U(n)$. The factor space $U(n)/SU(n)$ is the unit circle $C$. Then:

$$\pi_k(U(n)) = \pi_k(SU(n)), \quad k > 2.$$

Next, consider the inclusion $U(n) \subset U(n + 1)$. The factor space $U(n + 1)/U(n)$ is the sphere $S^{2n+1}$ [cf. Steenrod's book]. Hence,

$$\pi_k(S^{2n+1}) \to \pi_{k-1}(U(n)) \to \pi_{k-1}(U(n + 1)) \to \pi_{k-1}(S^{2n+1})$$

is exact. So, for $k < 2n + 2$ we get:

$$\pi_k(U(n)) = \pi_k(U(n + 1))$$

(Stable homotopy groups). It is natural to write $\pi_k(U(\infty)) = \lim_{n \to \infty} \pi_k(U(n))$.

The space $U(m) \times U(m)$ is included in $U(2m)$, since the matrix

$$\begin{pmatrix} A(m) & 0 \\ 0 & A'(m) \end{pmatrix}$$

is in $U(2m)$ for any $A(m), A'(m) \in U(m)$.

It is easy to verify that the factor space $U(2m)/U(m) \times U(m)$ is homeomorphic to the Grassmann manifold $G_m(2m)$. Therefore

$$\pi_k(G_m(2m)) \to \pi_{k-1}(U(m) \times U(m)) \to \pi_{k-1}(U(2m)) \to \pi_{k-1}(G_m(2m))$$

is exact. This geometric fact justifies the following

Remark: Given a Lie group $G$ and subgroups $H_1 \subset H_2$, $G/H_1$ has a bundle structure over $G/H_2$, with natural projection and fiber $H_2/H_1$. [Same proof as in the somewhat less general case considered before.] If we then consider the fiber space

$$U(2m)/U(m) \to U(2m)/U(m) \times U(m),$$

and use the exact bundle sequence we find that

$$\pi_k(U(2m)/U(m)) \to \pi_k(G_m(2m)) \to \pi_{k-1}(U(m)) \to \pi_{k-1}(U(2m)/U(m))$$

is exact.

7.30. Theorem (Bott's Periodicity Theorem):

$$\pi_{k-1}(U(\infty)) \cong \pi_{k+1}(U(\infty)) \quad \text{for } k \ge 1.$$

Proof: First we prove that in the exact sequence noted above the first and the last groups are zero provided $m$ is large enough. Indeed, we have seen already that the space $U(2m)/U(2m - 1)$ is the sphere $S^{4m-1}$. Thus

$$\pi_k(U(2m)/U(2m - 1)) = 0 \quad \text{for } m \text{ large}.$$

Similarly

$$\pi_k(U(2m - 1)/U(2m - 2)) = 0 \quad \text{for } m \text{ large, etc.},$$

and we get

$$\pi_k(U(2m)/U(m)) = 0 \quad \text{for } m \text{ large},$$

by the above remark.


Hence, by the exactness of the above sequence of homotopy groups, we get

$$\pi_k(G_m(2m)) \cong \pi_{k-1}(U(m)) \quad \text{for } m \text{ large}.$$

Now,

$$\pi_{k-1}(U(m)) \cong \pi_{k-1}(U(2m)) \quad \text{by stability, for } m \text{ large}.$$

Therefore

$$\pi_k(G_m(2m)) \cong \pi_{k-1}(U(2m)) \quad \text{for } m \text{ large}.$$

Using Corollary 7.29 (Bott's isomorphisms), we may assert that

$$\pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m)) \quad \text{for } m \text{ large}.$$

But we have seen that

$$\pi_k(U(n)) \cong \pi_k(SU(n)) \quad \text{for } k > 2.$$

Thus

$$\pi_{k+1}(U(2m)) \cong \pi_k(G_m(2m)) \quad \text{for } m \text{ large},$$

and for $k > 1$. Hence, finally,

$$\pi_{k+1}(U(2m)) \cong \pi_{k-1}(U(2m)) \quad \text{if } m \text{ is large}.$$

Q.E.D.

7.31. Corollary: The homotopy groups $\pi_k(U(\infty))$ are zero if $k$ is even, and isomorphic to $Z$ if $k$ is odd.

Proof: Observe that

$$\pi_2(U(\infty)) = \pi_2(SU(2)) = 0,$$

$$\pi_3(U(\infty)) = \pi_3(SU(2)) = Z,$$

as $SU(2)$ is nothing but $S^3$.

Q.E.D.
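The way Corollary 7.31 follows from the two base cases and the period-2 shift of Theorem 7.30 can be written as a tiny lookup. This is only an illustrative sketch; the function name and the string labels "0" and "Z" are choices made for the example.

```python
def stable_unitary_homotopy(k):
    """Label of pi_k(U(infinity)) obtained from the base cases
    pi_2 = 0 and pi_3 = Z by shifting k in steps of 2 (Theorem 7.30)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    base = {2: "0", 3: "Z"}
    while k > 3:
        k -= 2  # pi_{k+1} is isomorphic to pi_{k-1}
    while k < 2:
        k += 2
    return base[k]


print([stable_unitary_homotopy(k) for k in range(8)])
```

Every even index reduces to the base case $k = 2$ and every odd index to $k = 3$, which is exactly the statement of the corollary.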


CHAPTER VIII

Closed Geodesics on Compact Riemannian Manifolds

(Chapter by Hermann Karcher)

In this chapter closed geodesics on a Riemannian manifold $M$ will be studied using infinite dimensional Morse Theory in much the same way as in Chapter IV where geodesics joining two fixed points were treated. We shall study a Hilbert manifold $H_1(S^1, M)$ of closed, sufficiently regular curves ($H_1$-curves) on $M$. The coordinate spaces for $H_1(S^1, M)$ are (as in Chapter IV) Hilbert spaces whose elements are $H_1$-vector fields along curves on $M$. In Chapter IV one defined scalar products for the coordinate spaces (following Palais) with the aid of Nash's embedding theorem. In this chapter (cf. Theorem 8.6) we use instead Klingenberg's intrinsic scalar product (first introduced in a lecture given in Bonn) which depends only on the Riemannian structure of $M$. In Theorem 8.9 we prove that this scalar product and various other possible scalar products on the Hilbert spaces of $H_1$-vector fields lead to equivalent norms. The differentiable structure of $H_1(S^1, M)$ and useful coordinate systems are discussed in 8.11 to 8.18. Theorem 8.19 states the differentiability of the energy function and in 8.20 we introduce a Riemannian metric for $H_1(S^1, M)$ based on Klingenberg's intrinsic scalar products. These developments are somewhat more complicated than the corresponding ones in Chapter IV since it does not seem possible to obtain the intrinsic Riemannian metric of $H_1(S^1, M)$ via an embedding $M \subset R^N$. In 8.22 to 8.26 we carry out an auxiliary discussion of differentiable curves on $H_1(S^1, M)$ and their representation on $M$. In the second half of the present chapter our use of the intrinsic scalar product allows simpler proofs than in Chapter IV; it also seems that notions such as the gradient vector field of $J$ on $H_1(S^1, M)$ can be more readily interpreted in terms of $M$. We prove a few geometric results concerning the Riemannian structure of $H_1(S^1, M)$ in 8.27 to 8.33.


Lemma 8.34 contains basic estimates which we need to derive the explicit formula for the first derivative of the energy $J$ in 8.35 and to prove the validity of the Palais-Smale condition for $J$ in 8.41. In 8.36 to 8.39 we introduce the gradient of $J$ as a vector field on $H_1(S^1, M)$ and identify the critical points of $J$ as the closed geodesics on $M$. In 8.41 to 8.50 standard arguments from infinite dimensional Morse Theory are used to show the existence of at least one nontrivial closed geodesic on every compact $C^6$-Riemannian manifold. In 8.48 we prove that those flow lines of the gradient deformation (cf. 8.43) which start at points $f$ with sufficiently small energy $J(f) < \varepsilon$ have uniformly bounded length. As an immediate consequence we obtain the important result 8.47 that $J^{-1}(0)$ is a deformation retract of $J^{-1}([0, \varepsilon])$.

We conclude with a summary of recent developments.

Preliminaries. $M$ will be a compact Riemannian manifold of class $C^k$ ($k \ge 6$). (Metric completeness rather than compactness is sufficient for most of our general developments but not for the desired application to closed geodesics.) $M_p$ denotes the tangent space to $M$ at $p$, $TM$ the tangent bundle of $M$. The scalar product in $M_p$ is denoted by $g(p)(v, w)$ or more briefly by $(v, w)$; in local coordinates on $M$ the metric is written $g_{ik}(p)\, v^i w^k$. The distance on $M$ induced by this infinitesimal metric (cf. Chapter VI) is called $d_M(p, q)$.

Absolutely continuous curves (resp. vector fields) with locally square integrable derivatives will be called $H_1$-curves (resp. $H_1$-vector fields). For $H_1$-curves we may define an energy integral $J$ and a length $L_M$ as follows.

$$J(f) = \tfrac12 \int (\dot f(t), \dot f(t))\, dt, \qquad L_M(f) = \int (\dot f(t), \dot f(t))^{1/2}\, dt.$$

We shall be interested in closed $H_1$-curves parametrized by the interval $[0, 1]$ (not necessarily proportionally to arc length). Hence we find it useful to define the following space:

$$H_1(S^1, M) = \{f \mid f : [0, 1]/\{0, 1\} \to M \text{ and } J(f) < \infty\}.$$

(We always identify the circle $S^1$ with the factor space $[0, 1]/\{0, 1\}$.)

The covariant derivative of a vector field $v(t)$ along $f(t)$ will be written $Dv/dt$; this derivative is given in local coordinates on $M$ by the formula

$$\frac{dv^i}{dt} + \Gamma^i_{jk}(f(t))\, \dot f^j(t)\, v^k(t)$$

(where $\Gamma^i_{jk} \in C^{k-2}$ are the Christoffel symbols). Differential equations with square-integrable coefficients can be treated by the Picard-Lindelöf iteration scheme. Using this fact and the last formula we see that Levi-Civita parallel transport is well defined along any curve $f \in H_1(S^1, M)$. In particular a parallel vector field along such a curve is absolutely continuous, and for any continuous vector field along $f$ the following statement holds: $dv/dt$ is locally square integrable if and only if $Dv/dt$ is locally square integrable.

The exponential map available on the manifold M may be described asfollows. For 0 , v e Mo, let c : [0, oo) - M be the geodesic ray starting at pwith tangent vector v, such that its parameter t is proportional to arc length sand ds/dt = I vl m.. Define exp (v) = c(1). Then exp: TM -+ M. We denotethe restriction expIMD = exp,. If convenient, we write exp, = exp. It fol-lows immediately from the differential equation for geodesics (i.e., fromD/8t c = 0) that exp, (t v) = c(t), in other words that exp, is radial iso-metric. Since r,,,, e Ck-2, it follows that exp : TM -- M is a C'`-2 map, andthe differential equation for geodesics also shows that the linear map in-duced by exp, at the origin of M. is the identity map.

Geodesic parallel coordinates on $M$ are easily defined in terms of the exponential map $\exp$. Given a geodesic arc $c: [0, T] \to M$ (arc length $t$ as parameter), choose an orthonormal $n$-frame $F_0 = \{\dot c(0), v_2(0), \dots, v_n(0)\}$ in $M_{c(0)}$ and define $n$-frames $F_t$ in $M_{c(t)}$ by parallel transport of $F_0$ along $c$. The map $[0, T] \times \mathbf{R}^{n-1} \to M$ given by

$$(t, u^2, \dots, u^n) \mapsto \exp_{c(t)}\Big(\sum_{i=2}^{n} u^i\, v_i(t)\Big)$$

is $C^{k-2}$ and the induced linear map at $(t, 0, \dots, 0)$ is the identity. Therefore a neighborhood of $[0, T] \times \{0\}$ is mapped $C^{k-2}$ diffeomorphically onto a tubular neighborhood of $c$ in $M$. The inverse map gives the desired coordinates. We have $g_{ik}(c(t)) = \delta_{ik}$ and $\Gamma^i_{jk}(c(t)) = 0$. Since $M$ is compact these remarks prove the following lemma.

8.1. Lemma: There exists $\varepsilon_1 > 0$ such that the geodesic parallel coordinates just defined are valid in the $\varepsilon_1$-tubular neighborhood of any geodesic arc which is sufficiently short so that its $\varepsilon_1$-tube does not cover any point twice. Moreover there exist constants $C > 0$ and $0 < m_1 \le 1 \le m_2 < \infty$ such that for parallel coordinates in any $\varepsilon_1$-tube we have

$$(1)\qquad |\Gamma^i_{jk}| \le C \quad\text{and}\quad m_1 \sum_{i=1}^{n} (v^i)^2 \le g_{ik}\, v^i v^k \le m_2 \sum_{i=1}^{n} (v^i)^2.$$

A consequence of the second inequality in (1) is given in

8.2. Lemma: If $(\ ,\ )$ and $((\ ,\ ))$ are two scalar products such that $m_1((v, v)) \le (v, v) \le m_2((v, v))$ for all $v$ and if $m_1 \le 1 \le m_2$ then

$$|(v, w) - ((v, w))| \le 16\,(m_2 - m_1)\,((v, v))^{1/2}\,((w, w))^{1/2}.$$


Proof: Using $(v, w) = \frac{1}{4}\big((v + w, v + w) - (v - w, v - w)\big)$ one first gets

$$|(v, w) - ((v, w))| \le 8\,(m_2 - m_1)\,\big(((v, v)) + ((w, w))\big).$$

Now $(v, w) - ((v, w)) = b(v, w)$ is bilinear and

$$|b(v, w)| \le c\,(|v|^2 + |w|^2)$$

implies

$$|b(v, w)| \le 2c\,|v|\,|w|,$$

since for $\lambda \neq 0$ we have

$$|b(v, w)| = |b(\lambda v, \lambda^{-1} w)| \le c\,\big(\lambda^2 |v|^2 + \lambda^{-2} |w|^2\big).$$

Q.E.D.
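The scaling step at the end of the proof can be written out explicitly. The following is a short worked version, assuming $v, w \neq 0$ (otherwise both sides vanish) and writing $|v|^2 = ((v, v))$; the particular choice of $\lambda$ is the standard one and is not spelled out in the text.

```latex
% Choose \lambda so that both terms on the right-hand side are equal:
\lambda^2 = \frac{|w|}{|v|}
\quad\Longrightarrow\quad
c\,\bigl(\lambda^2 |v|^2 + \lambda^{-2} |w|^2\bigr)
  = c\,\bigl(|v|\,|w| + |v|\,|w|\bigr)
  = 2c\,|v|\,|w| .
% With c = 8 (m_2 - m_1) this gives the constant 16 (m_2 - m_1) of the lemma.
```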

The following result is well known for H1-curves in RN.

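8.3. Lemma: Every subset of $H_1(S_1, M)$ on which the energy integral is bounded consists of equicontinuous curves on $M$.

Proof:

$$d_M(f(t), f(t_0)) \le L_M\big(f|_{[t_0, t]}\big) = \int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))^{1/2}\,d\tau \le |t - t_0|^{1/2} \Big(\int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))\,d\tau\Big)^{1/2}$$

(by the Cauchy-Schwarz inequality)

$$\le \sqrt{2\,|t - t_0|\,J(f)}.$$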

8.4. Corollary: $L_M^2(f) \le 2J(f)$, and $L_M^2(f) = 2J(f)$ if and only if $f$ is parametrized proportional to arc length, i.e., $(\dot f(\tau), \dot f(\tau)) = \text{const}$.
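The inequality between length and energy can be observed on sampled curves. The sketch below is only an illustration under simplifying assumptions: $M$ is replaced by the Euclidean plane, the curve is sampled on a uniform periodic grid, and derivatives are approximated by forward differences. The function `energy_and_length` is a hypothetical helper.

```python
import numpy as np

def energy_and_length(curve, n=4000):
    """Approximate J(f) = 1/2 * int |f'|^2 dt and L(f) = int |f'| dt on [0, 1]."""
    t = np.arange(n) / n
    pts = curve(t)                                  # samples, shape (n, dim)
    diffs = np.roll(pts, -1, axis=0) - pts          # periodic forward differences
    speeds = np.linalg.norm(diffs, axis=1) * n      # approximate |f'(t)|
    return 0.5 * np.mean(speeds ** 2), np.mean(speeds)

def circle(t):
    """Unit circle traversed at constant speed."""
    return np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=1)

def ellipse(t):
    """Ellipse with semi-axes 2 and 1; the speed is not constant."""
    return np.stack([2 * np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=1)

J_circle, L_circle = energy_and_length(circle)
J_ellipse, L_ellipse = energy_and_length(ellipse)
```

For the constant-speed circle the two sides agree up to rounding, while for the ellipse the gap between $2J$ and $L^2$ is clearly visible.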

We next wish to define a differentiable structure on $H_1(S_1, M)$ with the aid of coordinate spaces which have a geometric interpretation on $M$.

8.5. Definition: For any $f \in H_1(S_1, M)$ consider the set of $H_1$-vector fields along $f$:

$$(2)\qquad H_1(S_1, TM_f) = \{v \mid v: [0, 1]/\{0, 1\} \to TM$$

such that $v(t) \in M_{f(t)}$ and $v$ is absolutely continuous and has locally square integrable derivatives$\}$.

8.6. Theorem: $H_1(S_1, TM_f)$ is a separable Hilbert space with the scalar product $\langle v, w\rangle$ defined below.


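Proof: $H_1(S_1, TM_f)$ is obviously a vector space. For $v, w \in H_1(S_1, TM_f)$ define

$$(3)\qquad \langle v, w\rangle = \int_0^1 \Big\{(v(t), w(t)) + \Big(\frac{Dv}{dt}, \frac{Dw}{dt}\Big)\Big\}\,dt.$$

(Of course, $D/dt$ denotes the covariant derivative along $f$.) It is clear from (2) of Definition 8.5 that we have $\langle v, v\rangle < \infty$, $\langle w, w\rangle < \infty$. Moreover using $|(v(t), w(t))| \le |v(t)|\,|w(t)|$ (in $M_{f(t)}$) and the Cauchy-Schwarz inequality, it follows that

$$\langle v, w\rangle^2 \le \Big(\int_0^1 \Big\{|v(t)|\,|w(t)| + \Big|\frac{Dv}{dt}\Big|\,\Big|\frac{Dw}{dt}\Big|\Big\}\,dt\Big)^2 \le \int_0^1 \Big\{|v(t)|^2 + \Big|\frac{Dv}{dt}\Big|^2\Big\}\,dt \cdot \int_0^1 \Big\{|w(t)|^2 + \Big|\frac{Dw}{dt}\Big|^2\Big\}\,dt = \langle v, v\rangle \cdot \langle w, w\rangle.$$

Hence $\langle v, w\rangle$ is defined for all $v, w \in H_1(S_1, TM_f)$. Clearly $\langle v, w\rangle$ is bilinear and positive definite, and therefore a scalar product.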

For the proof of completeness of the space $H_1(S_1, TM_f)$ we need the following definition and lemma.

8.7. Definition: For $v \in H_1(S_1, TM_f)$ put

$$\|v\|_\infty = \max_{t \in [0, 1]} (v(t), v(t))^{1/2}.$$

8.8. Lemma: $\|v\|_\infty^2 \le 2\,\langle v, v\rangle$.

Proof: The formula

$$(4)\qquad \frac{d}{dt}(v(t), w(t)) = \Big(\frac{Dv}{dt}, w\Big) + \Big(v, \frac{Dw}{dt}\Big)$$

is well known for $f, v, w \in C^1$ and generalizes by an easy limit argument to all $f \in H_1(S_1, M)$ and $v, w \in H_1(S_1, TM_f)$. Now choose $t_m$ such that $\|v\|_\infty^2 = (v(t_m), v(t_m))$ and note that

$$\|v\|_\infty^2 = (v(t), v(t)) + \int_t^{t_m} \frac{d}{d\tau}(v(\tau), v(\tau))\,d\tau \le (v(t), v(t)) + 2\int_0^1 |v(\tau)| \cdot \Big|\frac{Dv}{d\tau}\Big|\,d\tau.$$

Since the left side of this formula is independent of $t$ and since $2ab \le a^2 + b^2$ we get

$$\|v\|_\infty^2 \le \int_0^1 (v(t), v(t))\,dt + \int_0^1 \Big\{(v(\tau), v(\tau)) + \Big(\frac{Dv}{d\tau}, \frac{Dv}{d\tau}\Big)\Big\}\,d\tau \le 2\,\langle v, v\rangle.$$

Q.E.D.
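The bound of Lemma 8.8 can be tested numerically in the flat case. The sketch below assumes $M = \mathbf{R}^2$ with the Euclidean metric, so that $D/dt$ is the ordinary derivative, and uses periodic samples with forward differences; `h1_inner` and `sup_norm_sq` are hypothetical helper names, and the test fields are random trigonometric polynomials with a fixed seed.

```python
import numpy as np

def h1_inner(v, w):
    """Discrete <v, w> = int (v.w + v'.w') dt for periodic samples of shape (n, dim)."""
    n = v.shape[0]
    dv = (np.roll(v, -1, axis=0) - v) * n
    dw = (np.roll(w, -1, axis=0) - w) * n
    return np.mean(np.sum(v * w, axis=1) + np.sum(dv * dw, axis=1))

def sup_norm_sq(v):
    """Discrete version of the squared sup norm, max_t (v(t), v(t))."""
    return np.max(np.sum(v * v, axis=1))

def random_trig_field(rng, n=2048, modes=4, dim=2):
    """Periodic vector field on [0, 1] built from a few random Fourier modes."""
    t = np.arange(n) / n
    v = rng.normal(size=(1, dim)) * np.ones((n, 1))   # constant part
    for k in range(1, modes + 1):
        a = rng.normal(size=(1, dim))
        b = rng.normal(size=(1, dim))
        v = v + a * np.cos(2 * np.pi * k * t)[:, None] + b * np.sin(2 * np.pi * k * t)[:, None]
    return v

rng = np.random.default_rng(0)
ratios = []
for _ in range(20):
    v = random_trig_field(rng)
    ratios.append(sup_norm_sq(v) / (2.0 * h1_inner(v, v)))
```

Every ratio stays at or below 1. For a constant field the ratio equals 1/2, which shows that the bound still has some room in this flat setting.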

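We may now complete the proof of Theorem 8.6 in regard to completeness and separability.

Since $f$ is absolutely continuous we can subdivide $f$ into finitely many subarcs $f_r$ such that each of them is representable in some geodesic parallel coordinates by functions $f_r^i(t)$ ($i = 1, \dots, n = \dim M$ and $t \in I_r$) with $|f_r(t)| < \varepsilon_1$ (cf. Lemma 8.1) and $\bigcup I_r = [0, 1]$, $\sum |I_r| = 1$. If and only if $f \in H_1(S_1, M)$ will all the $f_r^i$ be $H_1$ functions. Since $J(f) < \infty$ we can also assume that these subarcs are so short that

$$\int_{I_r} (\dot f(t), \dot f(t))\,dt < \frac{m_1}{8\,C^2 m_2 n^3},$$

where $C, m_1, m_2$ are the constants of 8.1 (1), and $n$ is the dimension of $M$. We now consider the restriction of any $v \in H_1(S_1, TM_f)$ to $I_r$, i.e. to a vector field along $f_r$. The coordinates of $v$ will be called $v^i$. Then by 8.1

$$\langle v, v\rangle_{I_r} = \int_{I_r} \Big\{(v(t), v(t)) + \Big(\frac{Dv}{dt}, \frac{Dv}{dt}\Big)\Big\}\,dt \ge m_1 \int_{I_r} \sum_i \Big\{(v^i(t))^2 + \big(\dot v^i(t) + \Gamma^i_{jk}(f_r(t))\,\dot f_r^j(t)\,v^k(t)\big)^2\Big\}\,dt.$$

We use $(A + B)^2 \ge \frac{1}{2}A^2 - B^2$ to obtain

$$\langle v, v\rangle_{I_r} \ge m_1 \int_{I_r} \sum_i \Big\{(v^i)^2 + \tfrac{1}{2}(\dot v^i)^2 - \big(\Gamma^i_{jk}\,\dot f_r^j v^k\big)^2\Big\}\,dt \ge m_1 \int_{I_r} \Big\{\sum_i \Big((v^i)^2 + \tfrac{1}{2}(\dot v^i)^2\Big) - C^2 n \Big(\sum_j |\dot f_r^j|\Big)^2 \Big(\sum_k |v^k|\Big)^2\Big\}\,dt.$$

Note that by 8.1 (1) and 8.8

$$\Big(\sum_k |v^k|\Big)^2 \le n \sum_k (v^k)^2 \le \frac{n}{m_1}(v(t), v(t)) \le \frac{2n}{m_1}\langle v, v\rangle_{I_r}$$

and note also that $\big(\sum_j |\dot f_r^j|\big)^2 \le \frac{n}{m_1}(\dot f(t), \dot f(t))$, so that

$$\langle v, v\rangle_{I_r} \ge \frac{m_1}{2} \int_{I_r} \sum_i \big((v^i)^2 + (\dot v^i)^2\big)\,dt - \frac{2C^2 n^3}{m_1}\,\langle v, v\rangle_{I_r} \int_{I_r} (\dot f(t), \dot f(t))\,dt.$$

By the choice of the $I_r$, we have

$$1 + \frac{2C^2 n^3}{m_1} \int_{I_r} (\dot f(t), \dot f(t))\,dt < 1 + \frac{1}{4},$$

which implies

$$\langle v, v\rangle_{I_r} \ge \frac{m_1}{3} \int_{I_r} \sum_i \big((v^i(t))^2 + (\dot v^i(t))^2\big)\,dt.$$

Similarly, we may deduce

$$\langle v, v\rangle_{I_r} \le 2m_2 \int_{I_r} \sum_i \big((v^i)^2 + (\dot v^i)^2\big)\,dt + \frac{4C^2 n^3 m_2}{m_1}\,\langle v, v\rangle_{I_r} \int_{I_r} (\dot f, \dot f)\,dt,$$

so that we obtain

$$\langle v, v\rangle_{I_r} \le 4m_2 \int_{I_r} \sum_i \big((v^i(t))^2 + (\dot v^i(t))^2\big)\,dt.$$

Therefore $\langle v, v\rangle$ and the norms $\sum_r \int_{I_r} \sum_i \big((v^i)^2 + (\dot v^i)^2\big)\,dt$ are equivalent. But the completeness and separability of $H_1$ in this latter norm is well known. Q.E.D.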

The Definition 8.5 of the vector space $H_1(S_1, TM_f)$ involves only the differentiable structure of $M$. In Theorem 8.6 we prove completeness with respect to a scalar product which is defined intrinsically in terms of a Riemannian metric on $M$ (justifying the name intrinsic scalar product). It will be helpful in what follows to know that various possible scalar products on $H_1(S_1, TM_f)$ are in fact equivalent.

8.9. Theorem: (i) Let $g^{(1)}$ and $g^{(2)}$ be Riemannian metrics on $M$. The corresponding scalar products on the various spaces $H_1(S_1, TM_f)$ as defined in 8.6 (3) are then equivalent, i.e. $c_1 \|v\|^{(1)} \le \|v\|^{(2)} \le c_2 \|v\|^{(1)}$ for each $v \in H_1(S_1, TM_f)$ with constants depending only on the energy $J(f)$ of the curve $f$.

(ii) Let $M \subset \mathbf{R}^N$ be a differentiable submanifold, so that $H_1(S_1, M)$ may be regarded as a closed submanifold of the Hilbert space $H_1(S_1, \mathbf{R}^N)$ (Theorem 6.11), and so that the usual $H_1$-norm for $H_1(S_1, \mathbf{R}^N)$ induces a scalar product on each tangent space $H_1(S_1, TM_f)$ of $H_1(S_1, M)$. Then any two embeddings induce scalar products in $H_1(S_1, TM_f)$ which define norms for these spaces which are uniformly equivalent on any subset of $H_1(S_1, M)$ of bounded energy $J(f)$.

(iii) Any two scalar products in $H_1(S_1, TM_f)$ of the sort described in (i) and (ii) above are uniformly equivalent on any subset of $H_1(S_1, M)$ of bounded energy.

Proof: We first prove (i). By compactness of $M$ there exist constants $0 < m_1 < m_2 < \infty$ such that

$$(1)\qquad m_1\, g^{(1)}_{ik} v^i v^k \le g^{(2)}_{ik} v^i v^k \le m_2\, g^{(1)}_{ik} v^i v^k.$$

This implies $m_1 J^{(1)}(f) \le J^{(2)}(f) \le m_2 J^{(1)}(f)$. This observation allows us to repeat the equivalence-of-norms proof given above as the second part of the proof of Theorem 8.6 with only minor changes. The number of subintervals $I_r$ which are needed in that proof can trivially be estimated in terms of a bound for $J(f)$; the other constants appearing in the proof of Theorem 8.6 are independent of $f$. All additional details are left to the reader.

We now prove (iii). Given an embedding $M \subset \mathbf{R}^N$, consider the norm $\|\ \|^*$ on $H_1(S_1, TM_f)$ which is induced by the corresponding embedding $H_1(S_1, M) \subset H_1(S_1, \mathbf{R}^N)$ as described in (ii). Note also that the Riemannian metric on $M$ induced by the embedding $M \subset \mathbf{R}^N$ defines an intrinsic norm $\|\ \|'$ on $H_1(S_1, TM_f)$. We claim that $\|\ \|'$ and $\|\ \|^*$ are equivalent, uniformly for any subset of $H_1(S_1, M)$ of bounded energy. Indeed, for $v \in H_1(S_1, TM_f)$ we have

$$(\|v\|')^2 = \int_0^1 (v(t), v(t))_{M_{f(t)}}\,dt + \int_0^1 \Big(\frac{Dv}{dt}, \frac{Dv}{dt}\Big)_{M_{f(t)}}\,dt$$

while

$$(\|v\|^*)^2 = \int_0^1 (v(t), v(t))_{\mathbf{R}^N}\,dt + \int_0^1 \Big(\frac{dv}{dt}, \frac{dv}{dt}\Big)_{\mathbf{R}^N}\,dt.$$

Of course $(v(t), v(t))_{M_{f(t)}} = (v(t), v(t))_{\mathbf{R}^N}$ by the choice of the metric for $M$. Let the map $\Phi: M \to \mathbf{R}^N$ define the embedding with which we are concerned. The second fundamental form of $M \subset \mathbf{R}^N$ in terms of local coordinates for $M$ is a symmetric matrix of vector valued functions of these coordinates, equal specifically to the projection of $\partial^2\Phi/\partial u^i\,\partial u^j$ onto the hyperplane of directions normal to $M$. We denote this form by $l_{ij}$. (Cf. Milnor, Morse Theory.) The definition of the covariant derivative of a vector field $v$ along a curve $f$ (in local coordinates $v^i$, $f^i$) can be seen to imply the formula

$$\Big(\frac{dv}{dt}, \frac{dv}{dt}\Big)_{\mathbf{R}^N} = \Big(\frac{Dv}{dt}, \frac{Dv}{dt}\Big)_{M_{f(t)}} + \big(l_{ij}\, v^i \dot f^j,\ l_{rs}\, v^r \dot f^s\big)_{\mathbf{R}^N},$$

from which it is plain that $\|\ \|' \le \|\ \|^*$. On the other hand, letting $K_m$ be the least upper bound for the principal curvatures of $M$ in all directions, it follows by the compactness of $M$ (the embedding being fixed) that $K_m < \infty$. Moreover, by the definition of the principal curvatures,

$$\big(l_{ij}\, v^i \dot f^j,\ l_{rs}\, v^r \dot f^s\big)_{\mathbf{R}^N} \le K_m^2\, |v(t)|^2\, |\dot f|^2.$$

Hence, using 8.8,

$$(\|v\|^*)^2 \le (\|v\|')^2 + K_m^2 \cdot 2J(f) \cdot 2\,(\|v\|')^2.$$

This proves the equivalence of the norms $\|\ \|^*$ and $\|\ \|'$. Combining (i) and (iii), (ii) follows trivially; details are left to the reader. Q.E.D.

Remark: It is no coincidence that the constants $c_1, c_2$ such that $c_1 \|v\|^{(1)} \le \|v\|^{(2)} \le c_2 \|v\|^{(1)}$ in the preceding theorem depend on an upper bound for the energy $J(f)$. One can easily check by explicit calculations (carried out most simply on the sphere or on manifolds with vanishing curvature) that the best possible constants $c_1$ and $c_2$ may indeed approach $0$ and $\infty$ respectively as $J(f) \to \infty$.

Our next developments will depend on Palais' Theorem 6.12, which we restate as follows.

8.10. Theorem: Let $M$ and $N$ be closed $C^k$-submanifolds of $\mathbf{R}^m$ and $\mathbf{R}^n$ respectively. Then $H_1(S_1, M)$ and $H_1(S_1, N)$ are closed $C^{k-4}$-submanifolds of $H_1(S_1, \mathbf{R}^m)$ and $H_1(S_1, \mathbf{R}^n)$ respectively (cf. Theorem 6.11). Let $\Phi: M \to N$ be a $C^{k-2}$ map.

(i) It is an elementary result of differential topology that $\Phi$ can be extended to a $C^{k-2}$ map from $\mathbf{R}^m$ to $\mathbf{R}^n$. Then, by Theorem 6.8, the extended $\Phi$ gives rise to a $C^{k-4}$ map of $H_1(S_1, \mathbf{R}^m)$ into $H_1(S_1, \mathbf{R}^n)$.

(ii) Consequently the map $\tilde\Phi: H_1(S_1, M) \to H_1(S_1, N)$ defined by $\tilde\Phi(f) = \Phi \circ f$ is a $C^{k-4}$ map. Moreover $d\tilde\Phi_f(v)(t) = d\Phi_{f(t)}(v(t))$ for $v \in H_1(S_1, TM_f)$.

8.11. Definition: As in Chapter VI we take as the differentiable structure of $H_1(S_1, M)$ that structure which it inherits as a submanifold of $H_1(S_1, \mathbf{R}^N)$ in virtue of an embedding of $M \subset \mathbf{R}^N$. It is clear from Theorem 8.10 (cf. also Theorem 6.11) that the differentiable structure thus defined does not depend on the embedding. Although the differentiable structure of $H_1(S_1, M)$ is independent of the Riemannian metric of $M$ it will be of considerable advantage to have coordinate neighborhoods and maps on $H_1(S_1, M)$ available which are closely related to the Riemannian structure of $M$. We introduce such coordinates in the following definition and lemmas.

8.12. Lemma: The set $\mathcal{O}(f) = \{v \mid v \in H_1(S_1, TM_f),\ \|v\|_\infty < \varepsilon\}$ is an open subset of the Hilbert space $H_1(S_1, TM_f)$.


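Proof: Let $v \in \mathcal{O}(f)$, i.e. let $\|v\|_\infty < \varepsilon$ so that $\|v\|_\infty \le \varepsilon - 2\delta$ for some $\delta > 0$. Then if $\langle w, w\rangle^{1/2} = \|w\| < \delta$ we have $\|w\|_\infty < 2\delta$ (cf. Lemma 8.8 above). Hence $\|v + w\|_\infty < \varepsilon$, so that the $\delta$-ball around $v$ is in $\mathcal{O}(f)$. Q.E.D.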

8.13. Definition: For $f, h \in H_1(S_1, M)$ put

$$(1)\qquad d_\infty(f, h) = \max_{t \in S_1} d_M(f(t), h(t));$$

given $f \in H_1(S_1, M)$ and $\varepsilon < \frac{1}{2}\varepsilon_1$ (cf. Lemma 8.1), put

$$(2)\qquad U(f) = \{h \mid h \in H_1(S_1, M) \text{ and } d_\infty(f, h) < \varepsilon\}.$$

This set is introduced as a standard coordinate neighborhood of $f$, a definition justified by the two following lemmas. Note that $U(f)$ is an open subset of $H_1(S_1, M)$ since the original Riemannian metric of $M$ and the metric induced by an embedding $M \subset \mathbf{R}^N$ (as used in Definition 8.11) are equivalent and $V(f) = \{h \in H_1(S_1, \mathbf{R}^N);\ \max_t d_{\mathbf{R}^N}(h(t), f(t)) < \delta\}$ is open in $H_1(S_1, \mathbf{R}^N)$ by the proof of Lemma 8.12.

8.14. Lemma: The following formula defines a 1-1 correspondence $\eta$ between $U(f)$ and $\mathcal{O}(f)$:

$$h(t) = \exp_{f(t)}(v(t)).$$

Proof: We have $|v(t)| = d_M(f(t), h(t))$ by the radial isometry of the map $\exp$, and hence $\|v\|_\infty = d_\infty(f, h)$. Since $\varepsilon < \varepsilon_1$, and assuming $h \in U(f)$, the inverse $\exp_{f(t)}^{-1}$ is well defined at every point $h(t)$ and hence the equation displayed in the statement of the lemma is inverted by $v(t) = \exp_{f(t)}^{-1}(h(t))$. Finally, the maps $\exp_p^{-1}(\cdot)$ depend differentiably on $p$, which implies $h$ is an $H_1$-curve if and only if $v$ is an $H_1$-vector field. Q.E.D.

8.15. Lemma: The 1-1 mapping $\eta$ of Lemma 8.14, given by $\eta(h) = v$, is a $C^{k-5}$ diffeomorphism between the open subset $U(f)$ of the manifold $H_1(S_1, M)$ and the open set $\mathcal{O}(f)$ of the Hilbert space $H_1(S_1, TM_f)$.

8.16. Corollary: The mappings $\eta: U(f) \to \mathcal{O}(f)$ of Lemma 8.14 and 8.15 define valid $C^{k-5}$ coordinates on the manifold $H_1(S_1, M)$. We refer to them as standard coordinates near $f$.

Remark: It is possible to define the differentiable structure of $H_1(S_1, M)$ directly with the aid of Lemma 8.14 independently of Definition 8.11; in this case one has to show that the change-of-coordinates maps $\eta_2 \eta_1^{-1}$ are of class $C^{k-5}$.


The proof of Lemma 8.15 is based on Theorem 8.10 and Whitney's embedding theorem. We start with some considerations which we shall need again.

8.17. Using Whitney's embedding theorem, we may take $M$ as a $C^k$-submanifold of some $\mathbf{R}^N$. The Whitney sum $TM \oplus \nu M$ of the tangent bundle $TM$ and the normal bundle $\nu M$ is then the trivial bundle $M \times \mathbf{R}^N$. Using $M \subset \mathbf{R}^N$ we embed the trivial bundle $M \times \mathbf{R}^N$ in $\mathbf{R}^N \times \mathbf{R}^N$. Since the tangent bundle $TM$ is a $C^{k-1}$-subbundle of the trivial bundle we thus get $TM$ as a $C^{k-1}$-submanifold of $\mathbf{R}^N \times \mathbf{R}^N$. This embedding has the following properties. If we identify $M$ with the zero section of $TM$, then this submanifold of $TM$ is embedded in $\mathbf{R}^N \times \{0\}$. Moreover the tangent space $M_p$ of $M$ at $p$ is embedded as a linear subspace of $\{p\} \times \mathbf{R}^N$ in such a way that the linear structure of $M_p$ is preserved. In view of Theorem 8.10 we have then $H_1(S_1, TM)$ as a $C^{k-5}$-submanifold of $H_1(S_1, \mathbf{R}^N \times \mathbf{R}^N)$, and for fixed $f \in H_1(S_1, M)$, $H_1(S_1, TM_f)$ is a linear subspace (and therefore a $C^\infty$ submanifold) of $H_1(S_1, \mathbf{R}^N \times \mathbf{R}^N)$ and consequently $H_1(S_1, TM_f)$ is a $C^{k-5}$-submanifold of $H_1(S_1, TM)$.

In the same way the Whitney sum $TM \oplus TM$ is a $C^{k-1}$-submanifold of $\mathbf{R}^N \times \mathbf{R}^N \times \mathbf{R}^N$ so that the linear structure of the fibers is preserved. Consequently $H_1(S_1, TM \oplus TM)$ is a $C^{k-5}$-submanifold of $H_1(S_1, \mathbf{R}^N \times \mathbf{R}^N \times \mathbf{R}^N)$ (Theorem 8.10) and $H_1(S_1, TM_f) \times H_1(S_1, TM_f)$ is a linear subspace of $H_1(S_1, \mathbf{R}^N \times \mathbf{R}^N \times \mathbf{R}^N)$ and a $C^{k-5}$-submanifold of $H_1(S_1, TM \oplus TM)$.

Proof of Lemma 8.15: We assume 8.17. Consider the map $\Phi: TM \to M$ defined by $\Phi(v) = \exp_{p(v)}(v)$ (where $p(v)$ is the base point of $v$). Then $\Phi \in C^{k-2}$ and by Theorem 8.10 the induced map $\tilde\Phi: H_1(S_1, TM) \to H_1(S_1, M)$ belongs to $C^{k-5}$ (not $C^{k-4}$ since $TM$ is only $C^{k-1}$). But the restriction of $\tilde\Phi$ to the $C^{k-5}$-submanifold $H_1(S_1, TM_f)$ (cf. 8.17) is the map $\eta^{-1}$ of our Lemma, proving that $\eta^{-1} \in C^{k-5}$.

To show that we also have $\eta \in C^{k-5}$, we argue as follows. On the open subset $U = \{(p, q) \in M \times M;\ d(p, q) < \varepsilon_1\}$ the map $\psi: U \to TM$ given by $\psi(p, q) = \exp_p^{-1}(q)$ is well defined and $C^{k-2}$. By Theorem 8.10 $\psi$ induces a $C^{k-5}$ map $\tilde\psi$ of an open subset of $H_1(S_1, M) \times H_1(S_1, M)$ into $H_1(S_1, TM)$. The domain of $\tilde\psi$ contains the $C^{k-5}$ submanifold $\{f\} \times U(f)$ (cf. 8.13) and the restriction of $\tilde\psi$ to $\{f\} \times U(f)$ coincides with $\eta$ by the proof of Lemma 8.14. Q.E.D.

The next Lemma shows that the Hilbert manifold $H_1(S_1, TM)$ may be identified with the tangent manifold $TH_1(S_1, M)$.

8.18. Lemma: Let $\Phi: TM \to M$ be the map given by $\Phi(v) = \exp(v)$, and let $\tilde\Phi$ be the induced map (Theorem 8.10) of $H_1(S_1, TM) \to H_1(S_1, M)$. Let $\Theta$ be the map defined by $\Theta(v) = \frac{d}{ds}\tilde\Phi(sv)\big|_{s=0}$, so that $\Theta: H_1(S_1, TM) \to TH_1(S_1, M)$. Then $\Theta$ is a $C^{k-6}$ diffeomorphism of $H_1(S_1, TM)$ onto $TH_1(S_1, M)$.

Proof: From the proof of Lemma 8.15 we have that $\tilde\Phi \in C^{k-5}$ so that $\Theta \in C^{k-6}$. Write $p(v)$ for the base point of the vector $v \in TM$, or for the base point curve of the $H_1$-vector field $v \in H_1(S_1, TM)$, as the case may be. Let $v \in H_1(S_1, TM)$. The point $\tilde\Phi(v) \in H_1(S_1, M)$ has by Corollary 8.16 the coordinate $v$ in the standard coordinate neighborhood $U(p(v))$ near $p(v)$. Thus $\frac{d}{ds}\tilde\Phi(sv)\big|_{s=0} = \Theta(v)$ is that tangent vector of $H_1(S_1, M)$ at $p(v)$ which, in the coordinate system of $TH_1(S_1, M)$ corresponding to $U(p(v))$, has the coordinate $(0, v)$. This shows that $\Theta$ is a 1-1 $C^{k-6}$ mapping of $H_1(S_1, TM)$ onto $TH_1(S_1, M)$.

To prove that $\Theta^{-1}$ is also of class $C^{k-6}$ consider the coordinate system of $TH_1(S_1, M)$ corresponding to $U(p(v))$ and denote the $C^{k-6}$-coordinate map by $\tilde\eta$, i.e. $\tilde\eta: TU(p(v)) \to \mathcal{O}(p(v)) \times H_1(S_1, TM_{p(v)})$. We shall find a $C^{k-5}$ map $\tilde\sigma$ such that $\Theta^{-1} = \tilde\sigma \circ \tilde\eta$, which proves $\Theta^{-1} \in C^{k-6}$.

Now use 8.17. Let $p = p(v_1)$ and $q = \exp_p(v_1)$. By Lemma 8.1 there exists $\varepsilon_1 > 0$ such that if $v_1, v_2 \in M_p$ and $|v_2| < \varepsilon_1$ then $\varrho(v_1, v_2) = \exp_q^{-1}(\exp_p(v_1 + v_2))$ defines a $C^{k-2}$ function whose value is a vector tangent to $M$ at $q$. Thus $\sigma(v_1, v_2) = \frac{\partial}{\partial s}\varrho(v_1, sv_2)\big|_{s=0}$ defines a $C^{k-3}$ map $\sigma: TM \oplus TM \to TM$. By Theorem 8.10 $\sigma$ induces a $C^{k-5}$ map $H_1(S_1, TM \oplus TM) \to H_1(S_1, TM)$ and therefore by restriction a $C^{k-5}$ map (cf. 8.17)

$$\tilde\sigma: H_1(S_1, TM_f) \times H_1(S_1, TM_f) \to H_1(S_1, TM).$$

For $(v_1, v_2) \in H_1(S_1, TM_f) \times H_1(S_1, TM_f)$ and $h = \tilde\Phi(v_1)$ we have

$$\tilde\sigma(v_1, v_2) = \frac{d}{ds}\Big(\exp_h^{-1}\big(\exp_f(v_1 + sv_2)\big)\Big)\Big|_{s=0}$$

(cf. 8.10 (ii); more explicitly we have $R(s, v_1, v_2) = \varrho(v_1, sv_2)$ and $\sigma(v_1, v_2) = dR_{(0, v_1, v_2)}(1, 0, 0)$. Apply 8.10 (ii) to $R$ and obtain

$$\tilde\sigma(v_1, v_2)(t) = dR_{(0, v_1(t), v_2(t))}(1, 0, 0) = \frac{d}{ds}\Big(\exp_{h(t)}^{-1}\big(\exp_{f(t)}(v_1(t) + sv_2(t))\big)\Big)\Big|_{s=0}.)$$

Thus

$$\exp_h^{-1}(\exp_f(v_1 + sv_2)) = s \cdot \tilde\sigma(v_1, v_2) + o(s) \in H_1(S_1, TM_h).$$

Since $\tilde\Phi\big(\exp_h^{-1}(\exp_f(v_1 + sv_2))\big) = \tilde\Phi(v_1 + sv_2)$ and since $\eta^{-1} = \tilde\Phi\big|_{H_1(S_1, TM_f)}$ (by the proof of 8.15) it follows that

$$\frac{d}{ds}\tilde\Phi\big(s\,\tilde\sigma(v_1, v_2)\big)\Big|_{s=0} = \frac{d}{ds}\tilde\Phi(v_1 + sv_2)\Big|_{s=0},$$

i.e. $\Theta(\tilde\sigma(v_1, v_2))$ is that tangent vector of $H_1(S_1, M)$ at the point $h = \tilde\Phi(v_1)$ whose coordinates in the coordinate neighborhood $TU(f)$ of $TH_1(S_1, M)$ are $(v_1, v_2)$. In other words $\tilde\eta(\Theta(\tilde\sigma(v_1, v_2))) = (v_1, v_2)$ or $\tilde\sigma \circ \tilde\eta = \Theta^{-1}$, which completes the proof. Q.E.D.

Our next aim is to prove the differentiability of the energy integral and to introduce the intrinsic Riemannian metric for $H_1(S_1, M)$, more precisely:

8.19. Theorem: The energy $J$ is a $C^{k-5}$ function on $H_1(S_1, M)$.

8.20. Theorem: Suppose that we represent the tangent space to $H_1(S_1, M)$ at $f$ by $H_1(S_1, TM_f)$ (cf. Corollary 8.16) and take as scalar product in the tangent space the intrinsic scalar product for $H_1(S_1, TM_f)$ defined in 8.6 (3). Then this scalar product defines a $C^{k-6}$ Riemannian metric for $H_1(S_1, M)$.

Before we prove these two theorems, we make the following observations.

8.21. Let $M$ be embedded in $\mathbf{R}^N$ and $TM$ in $\mathbf{R}^N \times \mathbf{R}^N$ as described in 8.17. The $C^{k-1}$ Riemannian metric $g$ of $M$ can be extended to a $C^{k-2}$ Riemannian metric of $\mathbf{R}^N$ in such a way as to make $M$ a totally geodesic submanifold (using the normal bundle of $M$ in $\mathbf{R}^N$ and partitions of unity). This implies that the covariant derivative along a curve $f \in H_1(S_1, M)$ is the same if $f$ is considered as a curve in $M$ or as a curve in $\mathbf{R}^N$. We write the scalar product as $g(p)(v, w)$ for $(p, v)$ and $(p, w) \in \mathbf{R}^N \times \mathbf{R}^N$ (this is of course bilinear in $v, w$). The following statements are very similar to Lemma 6.9 and are proved in the same way.

(1) Let $b(p)(v, w)$ be bilinear in $v, w \in \mathbf{R}^N$ and of class $C^k$ in $p \in \mathbf{R}^N$, and define a $C^{k-2}$ function

$$\tilde b: H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N) \to \mathbf{R}$$

by

$$\tilde b(f)(v, w) = \int_0^1 b(f(t))\,(\dot v(t), \dot w(t))\,dt.$$

(2) Let $b(\ )(\ ,\ ): \mathbf{R}^N \to L^2(\mathbf{R}^M)$ be a $C^k$ map from $\mathbf{R}^N$ into the bilinear forms on $\mathbf{R}^M$. Then we define a $C^{k-2}$ map

$$\tilde b(\ )(\ ,\ ): H_1(S_1, \mathbf{R}^N) \to L^2\big(H_1(S_1, \mathbf{R}^M)\big)$$

by

$$\tilde b(f)(v, w) = \int_0^1 b(f(t))\,(v(t), \dot w(t))\,dt.$$

(3) Let $c(\ ,\ )(\ ,\ ): \mathbf{R}^N \times \mathbf{R}^N \to L^2(\mathbf{R}^M)$ be a $C^k$ map from $\mathbf{R}^N \times \mathbf{R}^N$, which is linear in the second argument, into $L^2(\mathbf{R}^M)$. Then we define a $C^{k-1}$ map

$$\tilde c: H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N) \to L^2\big(H_1(S_1, \mathbf{R}^M)\big)$$

by

$$\tilde c(f, h)(v, w) = \int_0^1 c\big(f(t), \dot h(t)\big)\,(v(t), \dot w(t))\,dt.$$

We show $\tilde c(f, h) \in L^2(H_1(S_1, \mathbf{R}^M))$ to indicate the kind of changes which have to be made to adapt the proof of Lemma 6.9. We use Lemma 8.8 and Schwarz' inequality and we denote by $e_i$ the $i$-th unit vector in $\mathbf{R}^N$ so that $\dot h(t) = \sum_i e_i\, \dot h^i(t)$.

$$\big|c(f(t), \dot h(t))\,(v(t), \dot w(t))\big| \le \|c(f(t), \dot h(t))\|_{L^2(\mathbf{R}^M)} \cdot |v(t)|_{\mathbf{R}^M} \cdot |\dot w(t)|_{\mathbf{R}^M} \le \sum_i |\dot h^i(t)|\, \|c(f(t), e_i)\|_{L^2(\mathbf{R}^M)} \max_{t \in [0, 1]} |v(t)|\, |\dot w(t)|$$

and therefore

$$|\tilde c(f, h)(v, w)| \le \max_{t \in [0, 1]} \Big(\sum_i \|c(f(t), e_i)\|^2_{L^2(\mathbf{R}^M)}\Big)^{1/2} \|h\|_{H_1(S_1, \mathbf{R}^N)} \cdot \sqrt{2}\,\|v\|_{H_1(S_1, \mathbf{R}^M)} \cdot \|w\|_{H_1(S_1, \mathbf{R}^M)}.$$

Proof of Theorem 8.19: With the notations of 8.21, $2J(f) = \int_0^1 g(f(t))\,(\dot f(t), \dot f(t))\,dt$ is the restriction of the $C^{k-4}$ function $\tilde g$ on $H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N)$ to the diagonal of the $C^{k-4}$ submanifold $H_1(S_1, M) \times H_1(S_1, M) \times H_1(S_1, M)$. Q.E.D.

Proof of 8.20: We identify $TH_1(S_1, M)$ and $H_1(S_1, TM)$ using Lemma 8.18, i.e. we represent tangent vectors of $H_1(S_1, M)$ by elements of $H_1(S_1, TM)$. For $v \in H_1(S_1, TM)$ we denote the covariant derivative along the base point curve $p(v)$ by $Dv/dt$ (cf. 8.21). Then the scalar product according to 8.6 (3) is

$$\langle v, w\rangle = \int_0^1 \Big\{g\big(p(v(t))\big)\,(v(t), w(t)) + g\big(p(v(t))\big)\Big(\frac{Dv}{dt}, \frac{Dw}{dt}\Big)\Big\}\,dt,$$

where $v$ and $w$ have the same base point curve $p(v)$. We have to show that this formula defines a $C^{k-6}$ section from $H_1(S_1, M)$ into the positive-definite, symmetric, continuous bilinear forms on $TH_1(S_1, M) = H_1(S_1, TM)$, cf. Lemma 8.18. Taking, for example, Cartesian coordinates for $\mathbf{R}^N$ we may write the covariant derivative along $f$ as

$$\Big(\frac{Dv}{dt}\Big)^i = \frac{dv^i}{dt} + \sum_{j,k=1}^{N} \Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t), \qquad i = 1, \dots, N,$$

where the $\Gamma^i_{jk}$ are $C^{k-3}$ functions defined on $\mathbf{R}^N$ (cf. 8.21). We write $\Gamma(f(t))\,[v(t), \dot f(t)]$ for the vector $\big\{\sum_{j,k} \Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t)\big\} \in \mathbf{R}^N$.

By 8.21 (2) and (3) we may define a $C^{k-5}$ map

$$F: H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N) \to L^2\big(H_1(S_1, \mathbf{R}^N)\big)$$

by

$$F(f, h)(v, w) = \int_0^1 \Big\{g(f(t))\,(v(t), w(t)) + g(f(t))\big(\dot v(t) + \Gamma(f(t))\,[v(t), \dot h(t)],\ \dot w(t) + \Gamma(f(t))\,[w(t), \dot h(t)]\big)\Big\}\,dt.$$

The restriction of $F$ first to the diagonal of $H_1(S_1, \mathbf{R}^N) \times H_1(S_1, \mathbf{R}^N)$, which we identify with $H_1(S_1, \mathbf{R}^N)$, and then to the submanifold $H_1(S_1, M)$, is again $C^{k-5}$. In other words, we have a $C^{k-5}$ Riemannian metric on the trivial vector bundle $H_1(S_1, M) \times H_1(S_1, \mathbf{R}^N)$ and consequently also a $C^{k-5}$ Riemannian metric on the $C^{k-5}$ subbundle $H_1(S_1, TM)$, cf. 8.17. We lose one more order of differentiability since the identification of $H_1(S_1, TM)$ with the tangent bundle $TH_1(S_1, M)$ is only $C^{k-6}$ (cf. Lemma 8.18). This Riemannian metric induces the right topology by Theorem 8.9. Q.E.D.

We continue with some observations concerning a useful family of differentiable curves on $H_1(S_1, M)$.

8.22. Notation: An element $f \in H_1(S_1, M)$ will be called a curve on $M$ or a point of $H_1(S_1, M)$ depending on the situation. A curve $\kappa$ on $H_1(S_1, M)$ will always be a map $\kappa: [a, b] \to H_1(S_1, M)$.


8.23. Definition: Let $f_0, f_1 \in H_1(S_1, M)$ be such that $d_\infty(f_0, f_1) < \varepsilon_1$, which implies that the shortest geodesics on $M$ joining $f_0(t)$ and $f_1(t)$ are unique. Then put:

$$\gamma(s, t) = \exp_{f_0(t)}\big(s \cdot \exp_{f_0(t)}^{-1}(f_1(t))\big).$$

From this function of two variables we obtain a curve $\gamma: s \to \gamma(s) \in H_1(S_1, M)$ by writing $\gamma(s)(t) = \gamma(s, t)$.
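The construction of Definition 8.23 can be tried out where $\exp$ and its inverse are known in closed form. The sketch below again assumes $M$ is the unit sphere in $\mathbf{R}^3$ and works with sampled curves; `sphere_exp`, `sphere_log`, and `gamma_curve` are hypothetical helper names. The two curves are the equator and a nearby latitude circle, so each connecting geodesic is a meridian arc of length 0.3.

```python
import numpy as np

def sphere_exp(p, v):
    """exp_p(v) on the unit sphere."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-15:
        return np.array(p, dtype=float)
    return np.cos(norm_v) * p + np.sin(norm_v) * (v / norm_v)

def sphere_log(p, q):
    """Inverse of exp_p at q, valid when q is not antipodal to p."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-15:
        return np.zeros_like(p)
    return theta * (q - cos_theta * p) / np.sin(theta)

def gamma_curve(f0, f1, s):
    """Samples of gamma(s, .) = exp_{f0(t)}(s * exp_{f0(t)}^{-1}(f1(t)))."""
    return np.array([sphere_exp(p, s * sphere_log(p, q)) for p, q in zip(f0, f1)])

n = 200
phi = 2.0 * np.pi * np.arange(n) / n
lat = 0.3                                              # latitude of the second curve
f0 = np.stack([np.cos(phi), np.sin(phi), np.zeros(n)], axis=1)
f1 = np.stack([np.cos(lat) * np.cos(phi),
               np.cos(lat) * np.sin(phi),
               np.sin(lat) * np.ones(n)], axis=1)

mid = gamma_curve(f0, f1, 0.5)
# Distance from f0(t) to gamma(s, t), which should equal s * d_M(f0(t), f1(t)).
dist_mid = np.arccos(np.clip(np.sum(f0 * mid, axis=1), -1.0, 1.0))
```

At $s = 0$ and $s = 1$ the construction returns $f_0$ and $f_1$, and the intermediate curve sits at half the distance along every meridian, reflecting the fact that each path $s \mapsto \gamma(s, t)$ is a geodesic traversed at constant speed.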

8.24. Lemma: The $\gamma$-curves of Definition 8.23 are $C^{k-5}$ differentiable. The tangent vector $\frac{\partial\gamma}{\partial s}(0) \in H_1(S_1, TM_{f_0})$ is the coordinate of $f_1$ in the standard coordinate system near $f_0$ (cf. Lemma 8.14 and Corollary 8.16). We have $\big|\frac{\partial\gamma}{\partial s}(s, t)\big| = d_M(f_0(t), f_1(t))$ (for all $s$, $0 \le s \le 1$); hence $\big\|\frac{\partial\gamma}{\partial s}\big\|_\infty = d_\infty(f_0, f_1)$.

Proof: In the coordinate system centered at $f_0$ the $\gamma$-curves have the following representation.

$$v(s)(t) = s \cdot \exp_{f_0(t)}^{-1}(f_1(t)) \quad\text{where } v(s) \in H_1(S_1, TM_{f_0}).$$

This implies that the $\gamma$-curves are as often differentiable as the change of coordinates map, i.e. are $C^{k-5}$. Moreover, $\frac{d}{ds}v(s)\big|_{s=0}$ is the coordinate of $f_1$. $\big|\frac{\partial\gamma}{\partial s}(s, t)\big|$ is the length of the tangent vector to the geodesic $\gamma(s, t)$, $t = \text{const.}$, and therefore equals $d_M(f_0(t), f_1(t))$.

8.25. Lemma: A $C^1$-curve $\kappa: [0, 1] \to H_1(S_1, M)$ considered as a map $[0, 1] \times [0, 1] \to M$ by putting $\kappa(s, t) = \kappa(s)(t)$ is a homotopy between the end points $\kappa(0), \kappa(1) \in H_1(S_1, M)$. Moreover $\frac{\partial\kappa}{\partial s}(s, t)$ is a continuous vector field on $M$ along $\kappa$. This implies that the deformation paths $\kappa(s, t_0)$, $t_0 = \text{constant}$, are rectifiable curves on $M$ and that their length depends continuously on $t_0$.

Proof: Since $J$ is continuous and $\kappa[0, 1]$ is compact (in $H_1(S_1, M)$) we have $\max_{s \in [0, 1]} J(\kappa(s)) = A < \infty$. Therefore, by Lemma 8.3, the $\kappa(s)$ are an equicontinuous family of curves on $M$. Equicontinuity in one variable and continuity in the other implies continuity in both variables. In this case $\kappa(s)(t_0)$ is also equicontinuous in $s$ for $t_0 \in [0, 1]$. To see this let $v(s)$ be the coordinate of $\kappa(s)$ in some standard coordinate system on $H_1(S_1, M)$ (cf. 8.13 (2)), i.e. for some $f \in H_1(S_1, M)$ and $\delta > 0$ let $\kappa(s) \in U(f)$ and $v(s) \in H_1(S_1, TM_f)$ for $|s - s_0| < \delta$. By Lemma 8.8 we have

$$\|v(s) - v(s_0)\|_\infty^2 = \max_t |v(s)(t) - v(s_0)(t)|^2 \le 2\,\langle v(s) - v(s_0),\ v(s) - v(s_0)\rangle,$$

and therefore the continuity of $v(s)$ implies the stated equicontinuity. To prove the second part of the Lemma observe that the derivative of a $C^1$-curve in $H_1(S_1, M)$ represents a tangent vector, so that $\frac{\partial\kappa}{\partial s} \in H_1(S_1, TM_{\kappa(s)})$. Hence $\frac{\partial\kappa}{\partial s}(s, t)$ is continuous in $t$ for fixed $s$. Using coordinates we see as before that $\frac{\partial\kappa}{\partial s}(s, t_0)$ is equicontinuous in $s$ for fixed $t_0$, $t_0 \in [0, 1]$. This implies continuity of $\frac{\partial\kappa}{\partial s}(s, t)$ in both variables.

Q.E.D.

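8.26. Lemma: Let $s \to \kappa(s)$ be a $C^1$-curve on $H_1(S_1, M)$ and $s \to w(s)$ be a $C^1$-vector field along $\kappa(s)$. Define $\Phi(s, t) = \kappa(s)(t)$ and $v(s, t) = w(s)(t)$. Then

$$\frac{D}{\partial t}\frac{\partial\Phi}{\partial s} = \frac{D}{\partial s}\frac{\partial\Phi}{\partial t} \quad\text{(almost everywhere)},$$

and

$$\frac{D}{\partial t}\frac{D}{\partial s}v(s, t) - \frac{D}{\partial s}\frac{D}{\partial t}v(s, t) = R\Big(\frac{\partial\Phi}{\partial t}, \frac{\partial\Phi}{\partial s}\Big)\,v(s, t) \quad\text{almost everywhere},$$

where $R$ is the $C^{k-3}$ curvature tensor of $M$ and $D/\partial t$ (resp. $D/\partial s$) is the covariant derivative along the curves $\Phi(s_0, t)$ (resp. $\Phi(s, t_0)$) on $M$.

Proof: The formulae are well known for $\Phi, v \in C^2$ and extend by the usual limit arguments to the above situation.

Q.E.D.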

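8.27. Definition: Let $\kappa: [0, 1] \to H_1(S_1, M)$ be a $C^1$-curve; then

$$L(\kappa) = \int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|\,ds = \int_0^1 \Big(\int_0^1 \Big\{\Big(\frac{\partial\kappa}{\partial s}, \frac{\partial\kappa}{\partial s}\Big) + \Big(\frac{D}{\partial t}\frac{\partial\kappa}{\partial s}, \frac{D}{\partial t}\frac{\partial\kappa}{\partial s}\Big)\Big\}\,dt\Big)^{1/2} ds$$

is the length of $\kappa$ and

$$E(\kappa) = \frac{1}{2}\int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|^2 ds$$

is the energy of $\kappa$.

8.28. Remark: $L^2(\kappa) \le 2E(\kappa)$ may be proved along precisely the lines of the proof of Lemma 8.3.

8.29. Definition: In view of Lemma 8.25, define the Riemannian distance between $f_0$ and $f_1$ by

$$d(f_0, f_1) = \begin{cases} \inf\limits_{\kappa(0) = f_0,\ \kappa(1) = f_1} L(\kappa) & \text{if } f_0 \text{ and } f_1 \text{ are homotopic as curves on } M, \\ \infty & \text{if } f_0, f_1 \text{ are not homotopic.} \end{cases}$$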

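8.30. Theorem: $\big|\sqrt{2J(f_0)} - \sqrt{2J(f_1)}\big| \le d(f_0, f_1)$.

Proof: Either $d(f_0, f_1) = \infty$ and nothing has to be proved or there is a $C^1$-curve $\kappa$ joining $f_0$ and $f_1$. In this latter case we have

$$J(\kappa(s)) = \frac{1}{2}\int_0^1 \Big(\frac{\partial\kappa}{\partial t}(s, t), \frac{\partial\kappa}{\partial t}(s, t)\Big)\,dt,$$

$$\frac{d}{ds}\sqrt{2J(\kappa(s))} = \frac{1}{\sqrt{2J(\kappa(s))}} \int_0^1 \Big(\frac{D}{\partial s}\frac{\partial\kappa}{\partial t}, \frac{\partial\kappa}{\partial t}\Big)\,dt.$$

Hence, using Lemma 8.26 to change the order of differentiation and noting the identity of Definition 8.27,

$$\frac{d}{ds}\sqrt{2J(\kappa(s))} \le \frac{1}{\sqrt{2J(\kappa(s))}} \Big(\int_0^1 \Big|\frac{\partial\kappa}{\partial t}(s, t)\Big|^2 dt\Big)^{1/2} \Big(\int_0^1 \Big|\frac{D}{\partial t}\frac{\partial\kappa}{\partial s}\Big|^2 dt\Big)^{1/2} \le \Big\|\frac{\partial\kappa}{\partial s}\Big\|.$$

Therefore $\sqrt{2J(\kappa(1))} - \sqrt{2J(\kappa(0))} \le L(\kappa)$. It follows by symmetry that $\big|\sqrt{2J(\kappa(1))} - \sqrt{2J(\kappa(0))}\big| \le L(\kappa)$, and taking infima over $\kappa$ we obtain the theorem. Q.E.D.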
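A flat model case makes the inequality of Theorem 8.30 concrete. The sketch below assumes $M = \mathbf{R}^2$ with the Euclidean metric (not compact, so it serves only as an illustration) and joins two closed curves by the straight-line homotopy $\kappa(s) = (1 - s)f_0 + sf_1$. In this model $\partial\kappa/\partial s = f_1 - f_0$ does not depend on $s$, so $L(\kappa)$ is the $H_1$-norm of $f_1 - f_0$, while $\sqrt{2J(f)}$ is the $L^2$-norm of $\dot f$. All function names are hypothetical.

```python
import numpy as np

def derivative(f):
    """Periodic forward-difference derivative of samples of shape (n, dim) on [0, 1]."""
    n = f.shape[0]
    return (np.roll(f, -1, axis=0) - f) * n

def sqrt_two_energy(f):
    """sqrt(2 J(f)) = (int |f'|^2 dt)^(1/2) in the flat model."""
    df = derivative(f)
    return np.sqrt(np.mean(np.sum(df * df, axis=1)))

def h1_norm(v):
    """Flat H_1 norm (int |v|^2 + |v'|^2 dt)^(1/2)."""
    dv = derivative(v)
    return np.sqrt(np.mean(np.sum(v * v, axis=1) + np.sum(dv * dv, axis=1)))

def straight_homotopy_length(f0, f1):
    """L(kappa) for kappa(s) = (1 - s) f0 + s f1: the integrand is constant in s."""
    return h1_norm(f1 - f0)

n = 2000
t = np.arange(n) / n
f0 = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)], axis=1)
f1 = np.stack([1.5 * np.cos(2 * np.pi * t) + 0.2,
               0.8 * np.sin(2 * np.pi * t) + 0.1 * np.sin(6 * np.pi * t)], axis=1)

lhs = abs(sqrt_two_energy(f0) - sqrt_two_energy(f1))
rhs = straight_homotopy_length(f0, f1)
```

Here `lhs <= rhs` is just the triangle inequality for the $L^2$-norm of the derivative, which is the flat shadow of the argument in the proof.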

The following theorem is a generalization of Lemma 8.8.


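8.31. Theorem: $d_\infty^2(f_0, f_1) \le 2\,d^2(f_0, f_1)$.

Proof: We must show $d_\infty^2(f_0, f_1) \le 2L^2(\kappa)$ for any $C^1$-curve $\kappa$ joining $f_0$ and $f_1$, cf. Definition 8.13 (1) and 8.29. Let $t_m \in [0, 1]$ be such that $d_\infty(f_0, f_1) = d_M(f_0(t_m), f_1(t_m))$. Then by Lemma 8.25 $\kappa(s, t_m)$ is a $C^1$-curve on $M$ and we have

$$d_\infty^2(f_0, f_1) = d_M^2(f_0(t_m), f_1(t_m)) \le \Big(\int_0^1 \Big|\frac{\partial\kappa}{\partial s}(s, t_m)\Big|\,ds\Big)^2 \le \Big(\int_0^1 \max_t \Big|\frac{\partial\kappa}{\partial s}(s, t)\Big|\,ds\Big)^2 \le 2\Big(\int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|\,ds\Big)^2 = 2L^2(\kappa)$$

by Lemma 8.8. Q.E.D.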

8.32. Theorem: $H_1(S_1, M)$ is a complete metric space.

Proof: Let $\{f_n\}$ be a Cauchy sequence in $H_1(S_1, M)$. By Theorem 8.31 $\{f_n\}$ is a $d_\infty$-Cauchy sequence, i.e. $f_n$ converges uniformly to a continuous curve $f$ on $M$. Since the coordinate neighborhoods on $H_1(S_1, M)$ are defined using the distance $d_\infty$ (cf. Definition 8.13 (2)) it follows that for large $n$ the $f_n$ and $f$ all belong to some single coordinate neighborhood. The coordinates $v_n$ of the curves $f_n$ form a Cauchy sequence in the corresponding coordinate Hilbert space; this Cauchy sequence converges to some limit $v$. The point of $H_1(S_1, M)$ with the coordinate $v$ coincides as continuous curve on $M$ with $f$; this proves that $f_n \to f \in H_1(S_1, M)$ in the topology of $H_1(S_1, M)$. Q.E.D.

8.33. Theorem: If we consider each point of M to define a closed, con-stant curve, then M is embedded isometrically as a totally geodesic closedsubmanifold M of H, (S1, M).

Proof: Every "constant" curve p: S, -+ p e M is determined by its imagepoint p on M. The set

M = J-'(0) = If If : S, -, M is a constant curve)

is a closed subset of H, (S,, M) since J(f) = 0 if and only if f is constantLet U(p) be the standard neighborhood (cf. Corollary 8.16) of the constantcurve p in H, (S,, M). Then f e U(p) n M if and only if the coordinateof f is a constant vector field along p. The constant vector fields form ann- (= dim M) dimensional (hence closed) subspace of the coordinate spaceH, (S1, TM;) and therefore 9 is a closed submanifold of H, (S1, M). Next

Page 223: J.T.schwartz--Nonlinear Functional Analysis Notes on Mathematics and Its Applications

218 NONLINEAR FUNCTIONAL ANALYSIS

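we show that if two constant curves $p_0$, $p_1$ are joined in $H_1(S^1, M)$ by a $C^1$-curve $x$, then they can be joined by a curve $\tilde{x} : I \to \tilde{M}$ such that $L(\tilde{x}) \le L(x)$. Indeed, by Lemma 8.25, the curve $x_\tau : I \to M$, defined by $x_\tau(s) = x(s, \tau)$, is a $C^1$-curve on $M$. Using any such curve, we may define a curve $\tilde{x}_\tau : I \to \tilde{M}$ on $\tilde{M}$ by putting $\tilde{x}_\tau(s)(t) = x(s, \tau)$, $0 \le t \le 1$. We have

$$L(\tilde{x}_\tau) = \int_0^1 ds \left( \int_0^1 dt \left\{ \left| \frac{\partial \tilde{x}_\tau}{\partial s}(s)(t) \right|^2 + \left| \frac{D}{\partial t} \frac{\partial \tilde{x}_\tau}{\partial s} \right|^2 \right\} \right)^{1/2} = \int_0^1 ds \left| \frac{\partial x}{\partial s}(s, \tau) \right| = L_M(x_\tau),$$

since $\partial \tilde{x}_\tau / \partial s$ is independent of $t$ and $\frac{D}{\partial t} \frac{\partial \tilde{x}_\tau}{\partial s} = 0$. Therefore the curve $x_\tau$ on $M$ and the curve $\tilde{x}_\tau$ on $\tilde{M}$ have the same length. Moreover

$$\inf_\tau L_M(x_\tau) = \inf_\tau \int_0^1 ds \left| \frac{\partial x}{\partial s}(s, \tau) \right| \le \int_0^1 d\tau \int_0^1 ds \left| \frac{\partial x}{\partial s}(s, \tau) \right| \le \int_0^1 ds \left( \int_0^1 \left| \frac{\partial x}{\partial s}(s, \tau) \right|^2 d\tau \right)^{1/2},$$

$$\inf_\tau L_M(x_\tau) \le \int_0^1 ds \left\| \frac{\partial x}{\partial s} \right\| = L(x).$$

By Lemma 8.25, $L_M(x_\tau)$ depends continuously on $\tau$, hence there is a $\tau^*$ for which the above infimum is assumed. Putting $\tilde{x} = \tilde{x}_{\tau^*}$ we have $L(\tilde{x}) = L_M(x_{\tau^*}) \le L(x)$. From this it follows that the shortest geodesic joining $p_0$ and $p_1$ on $M$, if considered as a curve on $\tilde{M}$ joining $p_0$ and $p_1$, is also the shortest curve connecting $p_0$ and $p_1$ in $H_1(S^1, M)$, and is therefore a geodesic in $H_1(S^1, M)$. Thus plainly $d_M(p_0, p_1) = d(p_0, p_1)$, so that $\tilde{M} \subset H_1(S^1, M)$ is totally geodesic. Q.E.D.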
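The chain of inequalities behind $L(\tilde{x}) \le L(x)$ can be spelled out step by step. The following is only a sketch; the shorthand $a(s,\tau)$ is introduced here for illustration and does not appear in the text.

```latex
% Shorthand (an assumption of this sketch):
%   a(s,\tau) := |\partial x/\partial s (s,\tau)|, the pointwise speed on M.
\begin{align*}
\inf_\tau \int_0^1 a(s,\tau)\,ds
  &\le \int_0^1\!\!\int_0^1 a(s,\tau)\,ds\,d\tau
     && \text{(an infimum is at most the mean over } \tau \in [0,1]\text{)} \\
  &=   \int_0^1 \Bigl( \int_0^1 a(s,\tau)\cdot 1\,d\tau \Bigr)\,ds
     && \text{(interchange of the order of integration)} \\
  &\le \int_0^1 \Bigl( \int_0^1 a(s,\tau)^2\,d\tau \Bigr)^{1/2} ds
     && \text{(Cauchy-Schwarz in } \tau\text{)} \\
  &\le \int_0^1 \Bigl\| \frac{\partial x}{\partial s}(s) \Bigr\|\,ds = L(x)
     && \text{(the } H_1\text{-norm dominates the } L_2\text{-norm).}
\end{align*}
```

The last step uses only that the norm on the tangent space of $H_1(S^1, M)$ contains the additional nonnegative term $\int_0^1 |\frac{D}{\partial t}\frac{\partial x}{\partial s}|^2\,d\tau$.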

We proceed to a more detailed discussion of the $\gamma$-curves. These $\gamma$-curves are "short enough" to eliminate the need to refer explicitly to geodesics on $H_1(S^1, M)$ in some situations with which we shall deal below.


For use in the following proofs we recall some facts which have been established above.

$$d_\infty = d_\infty(f_0, f_1) \le \sqrt{2}\, d(f_0, f_1) \le \sqrt{2}\, L(\gamma) \le 2\sqrt{E(\gamma)}$$

(cf. Lemmas 8.24, 8.31, 8.29, and 8.28).

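8.34. Lemma: Let $\|R\|$ be a bound for the norm of the curvature tensor on $M$ and let $\left| \frac{\partial \gamma}{\partial s} \right| \le d_\infty$. Then we have

(1) $\displaystyle \max_{0 \le s \le 1} \sqrt{2J(\gamma(s))} \le \left( 1 - \|R\|\, d_\infty^2 \right)^{-1} \left( \sqrt{2J(f_0)} + \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \right)$

(2) $\displaystyle \left|\, \left\| \frac{\partial \gamma}{\partial s}(\sigma) \right\| - \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \,\right| \le 2 \|R\|\, d_\infty^2\, \sigma \max_{0 \le s \le \sigma} \sqrt{2J(\gamma(s))}$

(3) $\displaystyle \left|\, L(\gamma) - \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \,\right| \le 2 \|R\|\, d_\infty^2 \max \sqrt{2J(\gamma(s))}$

(4) $\displaystyle \left|\, \sqrt{2E(\gamma)} - \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \,\right| \le 2 \|R\|\, d_\infty^2 \max \sqrt{2J(\gamma(s))}.$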

Remarks: (1) gives a bound for the energy integral along a $\gamma$-curve in terms of the energy of the one end point, $J(f_0)$, and in terms of the coordinate of the other end point, $\frac{\partial \gamma}{\partial s}(0)$.

(2) shows that a $\gamma$-curve is parametrized almost proportionally to arc length.

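Proof: (3) and (4) are immediate consequences of (2). To prove (2), we note that, by 8.6 (3) and 8.8 (4),

$$\frac{d}{ds} \left\| \frac{\partial \gamma}{\partial s}(s) \right\| = \frac{1}{\left\| \frac{\partial \gamma}{\partial s}(s) \right\|} \int_0^1 \left\{ \left( \frac{D}{\partial s} \frac{\partial \gamma}{\partial s}, \frac{\partial \gamma}{\partial s} \right) + \left( \frac{D}{\partial s} \frac{D}{\partial t} \frac{\partial \gamma}{\partial s}, \frac{D}{\partial t} \frac{\partial \gamma}{\partial s} \right) \right\} dt.$$

By integrating this equation and using $\frac{D}{\partial s} \frac{\partial \gamma}{\partial s} = 0$, 8.26, $\left| \frac{\partial \gamma}{\partial s}(s, t) \right| \le d_\infty$ and the Cauchy-Schwarz inequality we obtain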

<_ I ds,Jo

'a

oy (s)

as

\Ras

;II RI1 d f ds 2J (V(s))0

< II R I; d, (i ds v'2J (y(s)).

I

J'C

ay _y D ayat as at as

D ay

at asdt

2

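This proves (2). From 8.30 we have

$$\left| \sqrt{2J(\gamma(\sigma))} - \sqrt{2J(f_0)} \right| \le \int_0^\sigma \left\| \frac{\partial \gamma}{\partial s}(s) \right\| ds.$$

Therefore (2) gives

$$\left| \sqrt{2J(\gamma(\sigma))} - \sqrt{2J(f_0)} \right| \le \left\| \frac{\partial \gamma}{\partial s}(0) \right\| + \|R\|\, d_\infty^2 \max \sqrt{2J(\gamma(s))},$$

which implies (1). Q.E.D.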

8.35. Theorem: The first derivative of the energy integral at $f \in H_1(S^1, M)$ is the continuous linear map $dJ_f : H_1(S^1, TM_f) \to R$ given by the formula

$$dJ_f(w) = \int_0^1 \left( \frac{D}{\partial t} w(t), \frac{\partial f}{\partial t} \right) dt, \qquad w \in H_1(S^1, TM_f).$$

Moreover,

$$|dJ_f(w)| \le \sqrt{2J(f)}\, \|w\|.$$

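Proof: Let $f_1 \in U(f_0)$ and let $\gamma(s)$ be the $\gamma$-curve joining $f_0$ and $f_1$. Then the coordinate of $f_1$ is $\frac{\partial \gamma}{\partial s}(0)$ (cf. 8.24) and we must prove that

$$\left| J(f_1) - J(f_0) - dJ_{f_0}\!\left( \frac{\partial \gamma}{\partial s}(0) \right) \right| = o\!\left( \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \right).$$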


Now, using 8.26, we see from 8.8 (4), from1 1

J(fi) - J(fo) dsd

1ay

,ay

dtfo ds \2,10\at at} /

and from

f 0 as at(s, r),

at) dt

o

=

fo

da

fo dt

J\at as at

+ (D D ay ay 11

as as at atthat

1J(fl) - AM - f 1 (D ay (0), afo dto at as at )

1 dss

dor1 dt j(7 D ay D ay - R ey ayl ay

'ayf0 fo f0 as at a;) ( at' as as atI

< f ' ds {Ilas (s)0

2 ay

as

2

II R Il 2J(y(s)) } Cl2

as (0)

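where $C_1 = C_1\!\left( \|R\|, J(f_0), \left\| \frac{\partial \gamma}{\partial s}(0) \right\| \right)$ by Lemma 8.34 (1) and 8.34 (2). Therefore

$$dJ_{f_0}\!\left( \frac{\partial \gamma}{\partial s}(0) \right) = \int_0^1 \left( \frac{D}{\partial t} \frac{\partial \gamma}{\partial s}(0), \frac{\partial f_0}{\partial t} \right) dt,$$

which proves our first conclusion. The inequality $|dJ_f(w)| \le \sqrt{2J(f)}\, \|w\|$ now follows from the Cauchy-Schwarz inequality.

Q.E.D.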

Remark: If one carefully notes those parts of earlier results which have been used in the above proof, one finds that it not only establishes a formula for the derivative assuming differentiability, but actually proves the differentiability of $J$.
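As a sanity check of the derivative formula, one can look at the simplest situation. The sketch below assumes, purely for illustration, that $M$ is a flat torus $R^n / Z^n$, so that $D/\partial t$ is the ordinary derivative (written with a prime), the curvature vanishes, and the $\gamma$-curve from $f_0$ to a nearby $f_1 = f_0 + w$ is $\gamma(s) = f_0 + s w$. It also assumes the normalization $J(f) = \frac12 \int_0^1 |f'|^2\,dt$, which is the one compatible with the bound $|dJ_f(w)| \le \sqrt{2J(f)}\,\|w\|$.

```latex
% Flat-case check (assumptions: M = R^n/Z^n, J(f) = (1/2) \int_0^1 |f'|^2 dt,
% \|w\|^2 = \int_0^1 (|w|^2 + |w'|^2) dt).
\begin{align*}
J(f_0 + w) - J(f_0)
  &= \tfrac12 \int_0^1 \bigl( |f_0' + w'|^2 - |f_0'|^2 \bigr)\,dt \\
  &= \underbrace{\int_0^1 (w', f_0')\,dt}_{=\,dJ_{f_0}(w)}
     \;+\; \tfrac12 \int_0^1 |w'|^2\,dt, \\[4pt]
\Bigl| J(f_0 + w) - J(f_0) - dJ_{f_0}(w) \Bigr|
  &= \tfrac12 \int_0^1 |w'|^2\,dt \;\le\; \tfrac12 \|w\|^2 = o(\|w\|), \\[4pt]
|dJ_{f_0}(w)|
  &\le \Bigl( \int_0^1 |w'|^2 dt \Bigr)^{1/2} \Bigl( \int_0^1 |f_0'|^2 dt \Bigr)^{1/2}
   \le \|w\|\, \sqrt{2 J(f_0)} .
\end{align*}
```

In the curved case the remainder picks up curvature terms, which is where the bounds of Lemma 8.34 enter.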

8.36. Definition: The continuous linear functional $dJ_f(w)$ on the Hilbert space $H_1(S^1, TM_f)$ can be represented as the inner product of $w$ and of a vector which we call $\operatorname{grad} J(f)$. More specifically, we have (cf. 8.35)

$$dJ_f(w) = \langle \operatorname{grad} J(f), w \rangle = \int_0^1 \left( \frac{D}{\partial t} w(t), \frac{\partial f}{\partial t} \right) dt = \int_0^1 \left\{ \bigl( \operatorname{grad} J(f)(t), w(t) \bigr) + \left( \frac{D}{\partial t} \operatorname{grad} J(f)(t), \frac{D}{\partial t} w(t) \right) \right\} dt.$$

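8.37. Remark: From 8.35 it follows that $\|\operatorname{grad} J(f)\| \le \sqrt{2J(f)}$, and from 8.36 that

$$\|\operatorname{grad} J(f)\|^2 = \int_0^1 \left( \frac{D}{\partial t} \operatorname{grad} J(f), \frac{\partial f}{\partial t} \right) dt.$$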

8.38. Theorem: The vector $\operatorname{grad} J(f)$ tangent to $H_1(S^1, M)$ at $f$ may be interpreted as a vector field along the curve $f$ on $M$, and is determined by the integral equation given in Definition 8.36. If $f$ is smooth enough so that $\partial f / \partial t \in H_1(S^1, TM_f)$, then $\operatorname{grad} J(f)(t)$ is the unique periodic solution of the differential equation

$$\frac{D^2}{\partial t^2} \operatorname{grad} J(f)(t) - \operatorname{grad} J(f)(t) = \frac{D}{\partial t} \frac{\partial f}{\partial t}.$$

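Proof: Our first assertion is obvious. To prove the second, note that since $\frac{\partial f}{\partial t}$ and $w(t)$ are continuous, we have $\left( w(t), \frac{\partial f}{\partial t} \right) \Big|_0^1 = 0$. Hence integrating the left side of the integral equation of 8.36 by parts gives (use 8.8 (4))

$$- \int_0^1 \left( w(t), \frac{D}{\partial t} \frac{\partial f}{\partial t}(t) \right) dt = \int_0^1 \left\{ \bigl( \operatorname{grad} J(f)(t), w(t) \bigr) + \left( \frac{D}{\partial t} \operatorname{grad} J(f)(t), \frac{D}{\partial t} w(t) \right) \right\} dt.$$

This can hold for every $w \in H_1(S^1, TM_f)$ only if $\frac{D}{\partial t} \operatorname{grad} J(f)(t)$ is a continuous vector field along $f$, in which case integration of the last term by parts gives the desired result. Q.E.D.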
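The differential equation for the gradient can be solved explicitly in a simple model case. The sketch below is an illustration under assumptions not made in the text: $M$ is taken to be a flat torus $R^2 / Z^2$ (so $D/\partial t$ is the ordinary derivative), and $f$ is a small circle of radius $a < \frac12$ traversed $k$ times; the symbols $a$, $k$, $\omega$, $\lambda$ are introduced only for this example.

```latex
% Model case (assumptions): M = R^2/Z^2 flat, f(t) = a(\cos\omega t, \sin\omega t),
% \omega = 2\pi k, k a nonzero integer, so f'' = -\omega^2 f.
\begin{align*}
&\text{Ansatz } y = \lambda f:\qquad
  y'' - y = -(\omega^2 + 1)\lambda f \overset{!}{=} f'' = -\omega^2 f
  \;\Longrightarrow\; \lambda = \frac{\omega^2}{1 + \omega^2}, \\[4pt]
&\operatorname{grad} J(f) = \frac{\omega^2}{1+\omega^2}\, f, \\[4pt]
&\|\operatorname{grad} J(f)\|^2
   = \lambda^2 \int_0^1 \bigl( |f|^2 + |f'|^2 \bigr) dt
   = \lambda^2 a^2 (1 + \omega^2)
   = \frac{\omega^4 a^2}{1 + \omega^2}, \\[4pt]
&2J(f) = \int_0^1 |f'|^2 dt = \omega^2 a^2
  \;\Longrightarrow\;
  \|\operatorname{grad} J(f)\|^2 = \frac{\omega^2}{1+\omega^2}\; 2J(f) \;<\; 2J(f).
\end{align*}
```

The result is consistent with the bound $\|\operatorname{grad} J(f)\| \le \sqrt{2J(f)}$ of 8.37, and since $\operatorname{grad} J(f) \ne 0$ the circle is not a critical point, as expected for a curve that is not a geodesic of the flat metric.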

Recall that $f$ is called a critical point of $J$ if $\|\operatorname{grad} J(f)\| = 0$.

8.39. Theorem: The critical points of $J$ correspond precisely to the closed geodesics on $M$ (including the constant curves, cf. 8.33).

Proof: Since the constant curves are of minimal energy, i.e., of energy zero, there is nothing to be proved for these curves. Next, let $f$ be a non-constant closed geodesic and hence a $C^2$-curve. Then $\frac{\partial f}{\partial t} \in H_1(S^1, TM_f)$ and by Theorem 8.35 we have

$$dJ_f(w) = \int_0^1 \left( \frac{D}{\partial t} w(t), \frac{\partial f}{\partial t} \right) dt = - \int_0^1 \left( w(t), \frac{D}{\partial t} \frac{\partial f}{\partial t} \right) dt = 0,$$

since $\frac{D}{\partial t} \frac{\partial f}{\partial t} = 0$ for geodesics and $\left( w(t), \frac{\partial f}{\partial t} \right) \Big|_0^1 = 0$ by continuity. Therefore by 8.36 $\|\operatorname{grad} J(f)\| = 0$, so that $f$ is a critical point of $J$.

Next assume that $f$ is any critical point of $J$, so that $\|\operatorname{grad} J(f)\| = 0$. We shall prove $\frac{D}{\partial t} \frac{\partial f}{\partial t} = 0$ by constructing a vector field $z$ which is parallel along $f$ and which will turn out to be equal to $\frac{\partial f}{\partial t}$. (In what follows $D/\partial t$ will denote the covariant derivative along $f$.) To construct the vector field $z$, we first solve the equation $\frac{D}{\partial t} y = \frac{\partial f}{\partial t}$, putting $y(0) = 0$. Then $y$ is an $H_1$-vector field along $f$, continuous everywhere except possibly for the fact that $y(0) \ne y(1)$. Next solve the differential equation $\frac{D}{\partial t} z = 0$ with the boundary condition $z(1) = y(1)$. Again $z$ is $H_1$ and continuous everywhere except possibly for the fact that $z(1) \ne z(0)$.

Note also that the solution of $\frac{D}{\partial t} x = z(t)$ with the end condition $x(1) = y(1)$ is $x(t) = t z(t)$.

Put $v(t) = x(t) - y(t)$. Then $v(t)$ is an $H_1$-vector field along $f$ and $v(0) = v(1) = 0$. Hence $v \in H_1(S^1, TM_f)$. Since $\|\operatorname{grad} J(f)\| = 0$ it follows by Theorem 8.35 that

$$dJ_f(v) = 0 = \int_0^1 \left( \frac{Dv}{\partial t}, \frac{\partial f}{\partial t} \right) dt.$$

Moreover, integrating by parts and using $v(0) = v(1) = 0$,

$$\int_0^1 \left( \frac{Dv}{\partial t}, z \right) dt = - \int_0^1 \left( v(t), \frac{D}{\partial t} z \right) dt = 0.$$

Forming the difference of the last two equations and noting that $Dv/\partial t = Dx/\partial t - Dy/\partial t = z(t) - \partial f/\partial t$, we get

$$\int_0^1 \left( z(t) - \frac{\partial f}{\partial t},\; z(t) - \frac{\partial f}{\partial t} \right) dt = 0.$$

This implies that $\partial f/\partial t = z(t)$ almost everywhere. Since $\partial f/\partial t$ is the derivative of an absolutely continuous function it follows that the indefinite integrals of $\partial f/\partial t$ and of $z(t)$ (in local coordinates) agree. But $z(t)$ is continuous and therefore everywhere equal to the derivative of $f$. Thus $\partial f/\partial t$ is continuous except possibly for a jump at $t = 0+, 1-$. Arguing in a similar way, however, we can show that $\partial f/\partial t$ is continuous except possibly for a jump at $t = \tfrac12-, \tfrac12+$. Thus $\partial f/\partial t$ is continuous everywhere and parallel along $f$, so that $f$ is a closed geodesic. Q.E.D.

8.40. Remark: We restate the Palais-Smale condition for the energy $J$ for the convenience of the reader:

If $\{f_n\}$ is a sequence on $H_1(S^1, M)$ such that $J(f_n) \le A$ and $\|\operatorname{grad} J(f_n)\|$ converges to 0, then $\{f_n\}$ has a subsequence which converges to a critical point.

8.41. Theorem: The Palais-Smale condition holds for J.

Proof: Since $H_1(S^1, M)$ is complete (by 8.32) and since $\|\operatorname{grad} J(\cdot)\|$ is continuous it suffices to find a subsequence of $\{f_n\}$ which is a Cauchy sequence. Since by Lemma 8.3 the $\{f_n\}$ are an equicontinuous family on a compact manifold we can use Arzela's theorem to find a subsequence of $\{f_n\}$ which converges uniformly; suppose, without loss of generality, that $\{f_n\}$ has this property. We then have $d_\infty(f_n, f_m) < \varepsilon$ for $n, m \ge N(\varepsilon)$. We now show that $\{f_n\}$ is a Cauchy sequence. The proof will rest on Lemma 8.34 and the following formula for $\gamma$-curves (cf. 8.23, 8.8 (4) and 8.26)

aJ(y(s))I_ f

D y ay (s, t)dtas o at tas ' at /

z(s, t) dt ds

fu (aas

(o, t), ay (o, t)) dt + fo fo at as

+ f o f o(R at' dt ds, (5)

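Let $\gamma_{nm}$ be the $\gamma$-curve (cf. 8.23) such that $\gamma_{nm}(0) = f_n$, $\gamma_{nm}(1) = f_m$. Then by 8.29 and 8.34, we have

$$d(f_n, f_m) \le L(\gamma_{nm}) \le \frac{2 \|R\|\, d_\infty^2(f_n, f_m)}{1 - \|R\|\, d_\infty^2(f_n, f_m)} \left\{ \sqrt{2A} + \left\| \frac{\partial \gamma_{nm}}{\partial s}(0) \right\| \right\} + \left\| \frac{\partial \gamma_{nm}}{\partial s}(0) \right\|.$$

We now suppress the indices $n$ and $m$ and write $\partial \gamma / \partial s$ for $\partial \gamma_{nm} / \partial s$, $d_\infty$ for $d_\infty(f_n, f_m)$ and $\varepsilon$ for $\dfrac{2 \|R\|\, d_\infty^2}{1 - \|R\|\, d_\infty^2}$.

Then the above formula yields

(6) $\displaystyle d(f_n, f_m) \le L(\gamma) \le (1 + \varepsilon) \left\| \frac{\partial \gamma}{\partial s}(0) \right\| + \varepsilon \sqrt{2A}.$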

Since

dJ (y(s))

ds

we obtain from (5)

D ay (s, t)at as

and usingay

as

and 8.34 (1)

(7)

f1 1

J 0J0

E=zi ADat as'

at)dt < 11 grad J (Y(s))1I

2

dt ds 5 (1 grad J (Y(1))IIas (1)

+ 11 grad J (Y(0)) II 11as (0)11

'f'+ J 0 0 ((as , at as at

}

= d., 8.34 (2), s = 211 R II d.2

(1 - IIRII d.)

D(s, t)

at as

2

dt ds S Ilgrad J(y(1))II (

+ Eas

(0)) + 118rad J (Y(0)) II

+ (A +11

as

(0)112).

On the other hand by 8.27 we haveJ1 ff1

OJOD ay (s, t)at as

as(0)

dt ds,

as (0)

2

dt ds > 2E(y) - d.2

and, after multiplying 8.34 (4) by 2E(y) +

1

f1

foo

D 2(3, t) dtds>-

at as sy (0)

2

and using 8.34 (1)

as (0) as (0)

+ 2A + II ay(0)

ns


Combining this inequality with (7) we obtain

2(0) + (e + 2e2) A(1 - 8 - 282) < d + e 2A (1 + E)

11

a1'

as11

+ ((1 + e) 11 grad J(y(l))11 + 11 grad J(y(0))I1)

as (0)+ e N/2A 11 grad J(y(l))ll

or, since e = O(de)

(8)2

as(0) O (dam) + (11 grad J (y(O))II= O (dam) +

+ pgradJ(y(l))II)

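From (8) it follows first that $\left\| \frac{\partial \gamma_{nm}}{\partial s}(0) \right\|$ stays bounded as $n, m \to \infty$ and then, since $d_\infty \to 0$ and $\|\operatorname{grad} J(\gamma_{nm}(0))\| \to 0$, it also follows from (8) that $\left\| \frac{\partial \gamma_{nm}}{\partial s}(0) \right\| \to 0$ as $n, m \to \infty$. This and (6) complete the proof. Q.E.D.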

We may now draw various easy consequences of the Palais-Smale condition (for more details see Chapter IV).

8.42. Lemma: If $J$ has no critical points in $J^{-1}([a, b])$ then there exist $\delta > 0$ and $\varepsilon > 0$ such that $\|\operatorname{grad} J\| \ge \varepsilon$ on $J^{-1}([a - \delta, b + \delta])$.

Proof: Were this false, we could find a sequence $\{f_n\}$ such that $\lim J(f_n) \in [a, b]$ and $\lim \|\operatorname{grad} J(f_n)\| = 0$. But then the Palais-Smale condition implies that $J$ has a critical point in $J^{-1}([a, b])$.

Q.E.D.

8.43. Definition: As in Chapter IV, we define a vector field on $H_1(S^1, M)$ by $v(f) = - \operatorname{grad} J(f)$. Integration of $\frac{\partial \phi}{\partial s} = v(\phi)$ with the initial conditions $\phi(0, f) = f$ defines the "gradient deformation" $\phi(s, f)$.

8.44. Lemma: We have $\dfrac{\partial J(\phi(s, f))}{\partial s} = - \|\operatorname{grad} J(\phi(s, f))\|^2$.

Proof:

$$\frac{\partial J(\phi(s, f))}{\partial s} = \left\langle \operatorname{grad} J(\phi), \frac{\partial \phi}{\partial s} \right\rangle = - \|\operatorname{grad} J(\phi)\|^2$$

(cf. 4.71). Q.E.D.

Let the singular cycle (or curve) $z$ be a representative of any nontrivial homology class (or homotopy class) $H$ of the pair $(H_1(S^1, M), \tilde{M})$ (cf. 8.33). The range (or carrier) of $z$ is a compact subset of $H_1(S^1, M)$ on which $J$ assumes a maximum. We make the following definition.

8.45. Definition:

$$c_H = \inf_{z \in H} \Bigl( \max_{f \in \operatorname{range} z} J(f) \Bigr).$$

$c_H$ is called the critical value of $H$.

8.46. Theorem: $c_H$ is a critical level of $J$.

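Proof: We assume $c_H > 0$ since 0 is a critical level of $J$. If $c_H$ is not a critical level, then for some $\delta > 0$ and $\varepsilon > 0$ it follows by 8.42 that $\|\operatorname{grad} J\| \ge \varepsilon$ on $J^{-1}([c_H - \delta, c_H + \delta])$. By definition of $c_H$ we can find $z \in H$ such that $\max_{f \in \operatorname{range} z} J(f) \le c_H + \delta$. Now deform $z$ using the gradient deformation for $0 \le s \le 2\delta/\varepsilon^2$. Then $\phi(2\delta/\varepsilon^2, z) \in H$ and

$$\max_{f \in \operatorname{range} \phi(2\delta/\varepsilon^2,\, z)} J(f) \le c_H - \delta$$

by 8.44, contradicting the definition of $c_H$. Q.E.D.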
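The choice of the deformation time $2\delta/\varepsilon^2$ can be read off from Lemma 8.44. The following short derivation is a sketch of that step; the abbreviation $s_0$ and the case distinction are spelled out here for illustration and are not part of the original argument.

```latex
% Sketch: why flowing for time s_0 := 2\delta/\varepsilon^2 lowers the level by 2\delta.
% Assumptions: \|grad J\| \ge \varepsilon on J^{-1}([c_H-\delta, c_H+\delta]),
% and J(f) \le c_H + \delta for the starting point f.
\begin{align*}
\text{Case 1: } & J(\phi(\sigma, f)) \ge c_H - \delta \text{ for all } \sigma \in [0, s_0]. \text{ Then by 8.44} \\
& J(\phi(s_0, f)) = J(f) - \int_0^{s_0} \|\operatorname{grad} J(\phi(\sigma, f))\|^2\, d\sigma
   \;\le\; (c_H + \delta) - \varepsilon^2 s_0 \;=\; c_H - \delta . \\[4pt]
\text{Case 2: } & J(\phi(\sigma_1, f)) < c_H - \delta \text{ for some } \sigma_1 \le s_0. \text{ Since } s \mapsto J(\phi(s,f)) \\
& \text{is nonincreasing (8.44), } J(\phi(s_0, f)) \le J(\phi(\sigma_1, f)) < c_H - \delta .
\end{align*}
```

In both cases every point of the deformed cycle lies at a level of at most $c_H - \delta$, which is what the proof uses.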

The preceding theorem does not by itself imply the existence of a single nontrivial closed geodesic since the possibility $c_H = 0$ is not excluded and since we do not know the existence of nontrivial homology or homotopy classes of $(H_1(S^1, M), \tilde{M})$. The remainder of our reasoning is aimed at overcoming this difficulty.

8.47. Theorem: There exists $\varepsilon > 0$ such that $\tilde{M}$ (cf. 8.33) is a deformation retract of $J^{-1}([0, \varepsilon])$.

Remark: From 8.47 it follows that $c_H > 0$ for any nontrivial homology (or homotopy) class $H$ of $(H_1(S^1, M), \tilde{M})$.

The proof of 8.47 will follow from Corollary 8.49.


8.48. Theorem: There exists $\varepsilon > 0$ such that the flow lines of the gradient deformation (cf. 8.43) which start in $J^{-1}([0, \varepsilon])$ have uniformly bounded length as $s \to \infty$, and such that each of these flow lines has a well defined limit point in $\tilde{M}$ as $s \to \infty$.

8.49. Corollary: The uniform boundedness of the length of the flow lines which start in $J^{-1}([0, \varepsilon])$ implies that for these flow lines the end point in $\tilde{M}$ and the length depend continuously on the starting point. Consequently we can parametrize these flow lines proportional to arc length and thus get a retraction of $J^{-1}([0, \varepsilon])$ to $\tilde{M}$, proving Theorem 8.47.

Proof of Theorem 8.48: We shall find $\varepsilon > 0$ such that for $f \in J^{-1}([0, \varepsilon])$ we have

(1) IIgrad J(f)112 > I J(f)

This estimate implies 8.48 in the following manner. By 8.44 the flow line starting at $\phi(0)$ satisfies

$$\int_0^s \|\operatorname{grad} J(\phi(\sigma))\|^2\, d\sigma = J(\phi(0)) - J(\phi(s)) \le \varepsilon.$$

Therefore $\|\operatorname{grad} J(\phi(s))\|$ is not bounded away from 0 on a flow line. By the Palais-Smale condition, there exists a sequence $\{\phi(s_n)\}$ converging to a critical point. By (1) it follows that $\lim_{n \to \infty} \phi(s_n) \in \tilde{M}$. Since $J(\phi(s))$ is monotone decreasing this proves $\lim_{s \to \infty} J(\phi(s)) = 0$. Therefore, using 8.44 and (1), we obtain the uniform bound for the length:

(2) L(0) = io 11 ' 11ds = J 11grad J(0(s))11 ds0

dJco ds

dsJ(0(0)) dJsJs

fo 11grad J(0(s))II fQ=J(O)) -1/J

= 2 J (0(0)) 5 J20 e

If a flow line had two different limit points in $\tilde{M}$, its length would have to be infinite. Thus (2) proves

$$\lim_{s \to \infty} \phi(s) \in \tilde{M},$$

as desired.
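The mechanism by which an estimate of type (1) bounds the length of a flow line can be isolated in a few lines. In the sketch below the constant $c > 0$ is a placeholder for whatever constant appears in (1); it is an assumption of this illustration, as is the restriction to the part of the flow line on which $J > 0$.

```latex
% Sketch: a gradient inequality \|grad J\|^2 \ge c J (c > 0 a placeholder constant)
% implies finite length of the flow line \phi(s), using dJ/ds = -\|grad J\|^2 (8.44).
\begin{align*}
L(\phi)
  &= \int_0^\infty \Bigl\| \frac{\partial \phi}{\partial s} \Bigr\|\, ds
   = \int_0^\infty \|\operatorname{grad} J(\phi(s))\|\, ds
   = \int_0^\infty \frac{-\,dJ(\phi(s))/ds}{\|\operatorname{grad} J(\phi(s))\|}\, ds \\
  &\le \int_0^\infty \frac{-\,dJ(\phi(s))/ds}{\sqrt{c\, J(\phi(s))}}\, ds
   = \frac{1}{\sqrt{c}} \int_0^{J(\phi(0))} \frac{dJ}{\sqrt{J}}
   = 2 \sqrt{\frac{J(\phi(0))}{c}}
   \;\le\; 2 \sqrt{\frac{\varepsilon}{c}} .
\end{align*}
```

The substitution $J = J(\phi(s))$ is legitimate because $J(\phi(s))$ decreases monotonically to 0, so the bound depends only on the starting energy and is uniform on $J^{-1}([0, \varepsilon])$.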

To complete the proof of 8.48, it only remains to prove (1). We do this by introducing local coordinates on $M$. Let $\varepsilon_0$, $C$, $m_1$, $m_2$ be as in Lemma 8.1 and in addition let $\varepsilon_0$ be so small that $m_2 - m_1 \le m_1/64$. We define the quantity $\varepsilon$ above as

2s(8C (n)3'Z)_2) (n = dim M).= mi(

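By Corollary 8.4, every $f \in J^{-1}([0, \varepsilon])$ satisfies $L_M(f) < 2\varepsilon_0$, and therefore is completely contained in the domain of some geodesic parallel coordinate system. Since both sides of (1) are continuous it suffices to prove (1) for $f \in C^2$. Then by 8.38 $\operatorname{grad} J(f)$ is the periodic solution of

(3) $\displaystyle \frac{D^2}{dt^2} y - y = \frac{D}{dt} \dot{f}$

(where here and in the following proof we write $y$ for $\operatorname{grad} J(f)$ and $\dot{f}$ for $\partial f / \partial t$).

Moreover, by the proof of 8.38, $Dy/dt$ is absolutely continuous. Next define

(4) $\displaystyle z = \frac{Dy}{dt} - \dot{f}.$

We find using (3) that

(5) $\displaystyle \frac{D}{dt} z = \frac{D^2 y}{dt^2} - \frac{D}{dt} \dot{f} = y$ and therefore $\displaystyle \frac{D^2 z}{dt^2} - z = \dot{f}.$

From 8.37 we have $\|y\|^2 = \int_0^1 \left( \frac{Dy}{dt}, \dot{f} \right) dt$ and hence, using (4), we get

(6) $\displaystyle \|y\|^2 = \int_0^1 (z + \dot{f}, \dot{f})\, dt = 2J(f) + \int_0^1 (z, \dot{f})\, dt.$

We shall prove (1) by estimating $\int_0^1 (z, \dot{f})\, dt$ in (6) as follows. Let $\dot{f}^i(t)$ and $z^i(t)$ be the coordinates of $\dot{f}(t)$ and $z(t)$ in a parallel coordinate system whose domain contains $f$ (see choice of $\varepsilon_0$ above). Then

(7) $\displaystyle \int_0^1 (z(t), \dot{f}(t))\, dt = \int_0^1 \delta_{ik} z^i(0) \dot{f}^k(t)\, dt + \int_0^1 \delta_{ik} \bigl( z^i(t) - z^i(0) \bigr) \dot{f}^k(t)\, dt + \int_0^1 \bigl\{ (z(t), \dot{f}(t)) - \delta_{ik} z^i(t) \dot{f}^k(t) \bigr\}\, dt.$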


Since f is a closed curve contained in a single coordinate system we have

(8)

1/2 1 1/2E Iz'(t)I If'(t)1 dt n (1'

0 1Izf12 dt)

(fo tY I.f`IZ dt0l.1 )

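Before we estimate the next term observe that we have from (4) and (5), using 8.37 again,

(9) $\displaystyle \|z\|^2 = \int_0^1 \left\{ (z, z) + \left( \frac{Dz}{dt}, \frac{Dz}{dt} \right) \right\} dt = \int_0^1 \left\{ \left( \frac{Dy}{dt} - \dot{f},\; \frac{Dy}{dt} - \dot{f} \right) + (y, y) \right\} dt = 2J(f) - \|y\|^2.$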

Lemma 8.2, 8.1 (1), the Cauchy-Schwarz inequality, and (9) give

(10) if {(z(t), f(t)) = 8,kz'(t) fk(t)} dtl

< f 16»i2-m'

Iz(t)I'I1(t)Idto m1

16 m2 -m1

11z11 ti (J < 16 m2- m1 2J(f)

m1 m1

To estimate the second term in (7) note thatdz

= y (cf. (5)) is in coordi-nates equivalent to

z(t) = y'(:) - r df(t)) z'(t) At),

so that integration and 8.1 (1) give

(11) Iz'(t) - z'(0)1

J 6tkz'(0) .f k(t) dt = 0.0

0frAl) dt + c t E Iz'(t)I If`(t)fI dt.

0 J.1

By the Cauchy-Schwarz inequality, 8.1 (1), and (9), we have

(12)

Since

< nIlzll ;5 2J(f).

mi m1

f'/;;j fo r latkllk(t)I dt < n (f(t), f(t)) dt


anddT I fk(t) dtl

f1

Y_

8`k

ifo01.k J0

1 +1 1/2 1 1/2

< - ` (AT), Y(T)) dr (f (f(t), f(t)) dtm1 (J o o JJJ

we obtain from (11), and (12) using also IIYII2j(f) IIYII2 +J(f),

3)ifo ark(z`(t) - z(0))fk(t)dt

(1

5 1 (11yI12+JJ(f))+C(n

2J(f))3/2

m,

The estimates (8), (10) and (13) of the right side of (7) inserted in (6) give

(14) IIY112 J(f) (2 - 32n12 - m, _ 1 - 2C (n IIYII2

MI 2m, m, m,

so that I1yJ12 ? I J(f) by the choices made for s, and e.Q.E.D.

8.50. Theorem: On every compact Riemannian manifold $M$ of class $C^k$, $k \ge 6$, there is at least one nontrivial closed geodesic.

Proof: If $M$ is not simply connected, then by Theorem 8.46 and 8.47 there exists a closed geodesic in every nontrivial homotopy class of closed curves. If $M$ is simply connected there is a first nonvanishing homotopy group $\pi_l(M)$, $l \ge 2$. We claim that $\pi_{l-1}(H_1(S^1, M), \tilde{M})$ is nontrivial. This together with Theorem 8.46 and the Remark following Theorem 8.47 implies the present theorem.

We will prove our claim for the case $M = S^2$, and indicate the modifications needed to treat the general situation thereafter. Consider the sphere $S^2$ in $R^3$ and a line $l$ tangent at $p$ to $S^2$. Take the tangent plane to $S^2$ at $p$ and rotate it around $l$, through 180°, until it is again tangent. The intersections of the intermediate planes with $S^2$ form a family of circles. Parametrize the intermediate planes with a parameter $s$ running from 0 to 1 and call the corresponding circles of intersection $c(s)$. Then $c(0)$ and $c(1)$ are constant curves (cf. 8.33). Parametrize each circle with a parameter $t$ running from 0 to 1, so that $c(s)(0) = c(s)(1) = p$ (e.g. take $t$ proportional to arc length on $c(s)$). Then $c : [0, 1] \to H_1(S^1, M)$ is a curve in $H_1(S^1, M)$ which represents an element of $\pi_1(H_1(S^1, M), \tilde{M})$. We claim that this homotopy element is nontrivial.

To prove this, first consider the map $\hat{c} : I \times I \to S^2 = M$ given by $\hat{c}(s, t) = c(s)(t)$. Note that $\hat{c}(s, 0) = \hat{c}(s, 1)$ for each $s$ and $\hat{c}(0, t) = p = \hat{c}(1, t)$ for all $t$. We shall make boundary identifications in $I \times I$ so that the square becomes a sphere and $\hat{c}$ induces a map $c^* : S^2 \to S^2 = M$ such that $c^*$ is homotopic to the identity and therefore homotopically nontrivial. This is done as follows. For each $s \ne 0, 1$ identify the two points $(s, 0)$ and $(s, 1)$; for $s = 0$ identify all the points $(0, t)$ to a single point; and for $s = 1$ identify all the points $(1, t)$ to a single point.

Assume now that $c$ represents a trivial element of $\pi_1(H_1(S^1, M), \tilde{M})$. Then there is a deformation $\Phi_r$ of $c$ in $H_1(S^1, M)$ which deforms $c$ to a curve $d : [0, 1] \to \tilde{M}$ and which leaves the end points of $c$ fixed, i.e. $\Phi_r(0) = c(0) = p = c(1) = \Phi_r(1)$ for all $r$. We can assume that $\Phi_r$ is a differentiable deformation. Using Lemma 8.25, we can interpret $\Phi_r$ as a homotopy of $\hat{c}$ on $M$. Since $\Phi_r(s)(0) = \Phi_r(s)(1)$ for each $r$ and $s$ (since we deal with spaces of closed curves), and also since $\Phi_r(0)(t) = p = \Phi_r(1)(t)$ for all $t$, $\Phi_r$ gives a homotopy of $c^*$. Now note that the curves $\Phi_1(s) = d(s)$ are constant curves on $M$, i.e. $\Phi_1(s)(t)$ is independent of $t$, and note also that $d(0) = d(1) = p$. We have thus constructed a homotopy from the nontrivial map $c^* : S^2 \to S^2 = M$ to a map $d^* : S^2 \to S^2 = M$ such that the image $d^*(S^2)$ lies on the closed curve in $S^2 = M$ which is given by $s \to \Phi_1(s)(0)$. Clearly $d^*$ is homotopically trivial, a contradiction.
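The family of circles $c(s)$ used above can be written down in coordinates. The formulas below are one possible realization, given as a sketch under choices that are not in the text: $S^2$ is the unit sphere in $R^3$, $p = (0, 0, -1)$, $l$ is the line $\{(x, 0, -1)\}$, and the rotation angle of the plane is $\theta = \pi s$.

```latex
% One explicit choice (assumptions: unit sphere, p = (0,0,-1), l = {(x,0,-1)}, \theta = \pi s).
% The plane through l with rotation angle \theta is  y\sin\theta + (z+1)\cos\theta = 0.
\begin{align*}
\hat{c}(s,t) = \bigl(\;
   & -\sin\theta \,\sin 2\pi t, \\
   & -\sin\theta\cos\theta\,(1 - \cos 2\pi t), \\
   & -\cos^2\theta - \sin^2\theta \,\cos 2\pi t \;\bigr),
   \qquad \theta = \pi s,\quad 0 \le s, t \le 1 . \\[6pt]
\text{Checks:}\quad
   & |\hat{c}(s,t)|^2 = 1, \qquad
     \hat{c}(s,0) = \hat{c}(s,1) = (0,0,-1) = p, \\
   & \hat{c}(0,t) = \hat{c}(1,t) = p \quad (\sin\theta = 0,\ \cos^2\theta = 1), \\
   & y\sin\theta + (z+1)\cos\theta
       = \sin^2\theta\cos\theta\,\bigl[(\cos 2\pi t - 1) + (1 - \cos 2\pi t)\bigr] = 0 .
\end{align*}
```

For $0 < s < 1$ the circle $c(s)$ has radius $\sin \pi s$, and every point of the sphere other than $p$ lies on exactly one of these circles, which is the geometric reason why the induced map $c^*$ covers the sphere once.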

In regard to the general case, we remark only that starting with a nontrivial element of the first nonvanishing homotopy group $\pi_l(M)$, i.e. with a (differentiable) map $F : S^l \to M$, one may define an associated $(l-1)$-parameter family of circles on $S^l$ such that the $F$-image of the family of circles represents a differentiable element of $\pi_{l-1}(H_1(S^1, M), \tilde{M})$. If this element is trivial, one constructs, as above, a homotopy which deforms $F$ to a map $G$ which can be considered as an element of $\pi_{l-1}(M)$ and is therefore trivial by the choice of $\pi_l(M)$, a contradiction which proves our theorem in the general case. Q.E.D.

Comments on Further Developments of the Theory of Closed Geodesics

When one tries to prove the existence of more than one nontrivial closed geodesic, one runs into the disturbing fact that associated with each closed geodesic one automatically has a one-parameter family of geodesics arising from the various different starting points for the parametrization of a closed curve. (Note however that critical points of $J$ are always parametrized proportional to arc length.) It is easy to prove that the action of $O(2)$ on $H_1(S^1, M)$, which for $\alpha \in O(2)$ is given by $f(t) \to f(t + \alpha)$ ($t$ and $\alpha$ of course taken mod 1), is continuous. By identifying orbits we then get a new space $\Pi(M)$ (in Klingenberg's notation) in which a single nontrivial closed geodesic is represented by exactly one point. (Multiple coverings of a geodesic are not identified to a single point by this process.) The space $\Pi(M)$ is no longer a manifold, but since the Riemann scalar product of $H_1(S^1, M)$ and the energy function are compatible with the action of $O(2)$ (i.e., are equivariant under the action of $O(2)$) one can prove many statements concerning the space $\Pi(M)$ by "lifting" them to the manifold $H_1(S^1, M)$. For example, the gradient deformation in $H_1(S^1, M)$ induces an energy decreasing deformation in $\Pi(M)$, so that Theorem 8.47 also holds for $\Pi(M)$. This is remarkable since the result seems to be inaccessible via the classic techniques using broken geodesics.

In a paper to appear in the journal Topology, W. Klingenberg gives a complete description of the $Z_2$-homology of both $H_1(S^1, M)$ and $\Pi(M)$ for $M = S^n$. His method of calculation also applies to the projective spaces and the other symmetric manifolds of rank 1. Using this information he obtains a number $g(n)$ of "algebraically different" nontrivial closed geodesics for the case of $M = S^n$; specifically $g(n) = 2n - s - 1$ where $0 \le s = n - 2^k < 2^k$. "Algebraically different" means that the energies of these $g(n)$ geodesics are the critical values (8.45) corresponding to $g(n)$ pairwise subordinated homology classes; here and below, we call a homology class $\alpha$ subordinated to a homology class $\beta$ if there exists a cohomology class $\xi$ such that $\alpha$ can be written as a cap product $\alpha = \xi \cap \beta$. Unfortunately the possibility cannot be excluded that all the geodesics obtained are multiple coverings of a single geodesic.

The same result was proved using different methods by S. L. Alber. (On periodicity problems in the calculus of variations in the large, Amer. Math. Soc. Transl. (2) 14 (1960).) A. I. Fet proved that on every compact manifold there are at least 3 algebraically different nontrivial closed geodesics. (On the algebraic number of closed extremals on a manifold, Dokl. Akad. Nauk SSSR (N.S.) 88 (1953), 619-621.) Klingenberg obtains this result also.

No criterion is known which allows one to decide whether two algebraically different geodesics are also geometrically different (i.e. whether the underlying simply covered closed geodesics are different). However, Lusternik and Schnirelmann (Sur le problème de trois géodésiques fermées sur les surfaces de genre 0, C. R. Acad. Sci. Paris 189 (1929), 269-271) showed that on manifolds of the type of the 2-sphere there exist 3 closed geodesics without self intersections.

Fet proves the following result: If all closed geodesics on a compact manifold are nondegenerate as critical points of $J$ then there are at least 2 prime geodesics (A periodic problem in the calculus of variations, Dokl. Akad. Nauk SSSR (N.S.) 160 (1965), 287-289). Alber and Klingenberg announced that under restrictions on the curvature of a manifold $M$ ($\tfrac14 < \min_M K / \max_M K \le 1$) one can prove without difficulty that the geodesics constructed from subordinated homology classes are geometrically different. Finally, Klingenberg announced that certain special closed geodesics constructed from subordinated homology classes turn out to be simple and without self intersection.

Index

Absolutely continuous maps 165
Associated bilinear form 31
Attaching of handles 137

Banach space 10
Bilinear forms 121
Borsuk's theorem 78
Bott periodicity theorem 197
Bounded set 10
  of mappings 107
B-space 10
Bundles,
  analytic 113
  direct sum of 113
  homomorphism of 113
  of class C' 113
  smooth linear 112
  sub bundles of 113
  tangent 115

$C^1$ mappings in $R^n$ 61
Calculus of variations in the large 162
Category theory 155
  and homology 158
  principal theorem of 164
Cohomology ring of a group-like space 189
Compact mappings 26
Complete Riemannian manifold 126
Complex analytic mapping 30
Contracting mapping principle 14
Convex set 9
Coordinate of a vector 104
Critical neck principle 139
Critical points, global study of 137
Critical points of functions 132
Cup-length 189
  of a space 161
Cup product 160
Curve on a manifold 102

Degree,
  and generalized Jordan's theorem for Banach spaces 92
  multiplicative property of 74
  of a continuous mapping 70
  of finite dimensional perturbations of the identity 84
  theory 55
Derivative, Gateaux 11
Diffeomorphic manifolds 101
Dimension of a compact metric space 156
Domain invariance 77

Embedding of Riemannian manifolds 43
Equicontinuous set of mappings 107
Exactness principle 149
Excision property of homology 149

F-differentiable function 11
Feebly continuous mapping 22
Fixed point theorems 96
Frechet differentiable function 11
Frechet space 9
Freudenthal suspension relation 185
F-space 9

Gateaux derivative 11
Geodesics on a finite-dimensional manifold 172
Germ of smooth functions 102
Gradient of a function 126

Handles, attaching of 137
Hard implicit functional theorems 33
Hessian of a function 133
Higher differentials 28
Hilbert manifolds 169
Homology of n-sphere 149
Homology sequence 149
Homotopically equivalent spaces 148
Homotopy,
  of Lie groups 189
  theory 181
Horizontal function 11
Hurewicz isomorphism theorem 188

Implicit function theorem 15
  hard 33
  soft 14

Jordan separation theorem 75

Kirszbraun lemma 19

Length of a curve on a Riemannian manifold 124
Locally compact mapping 26
Locally convex space 9

Manifold,
  of curves 168
  Riemannian 121
  smooth 100
  tangent space to 103
Mapping horizontal at P 102
Minty's theorem 22
Monotone mapping 19
Morse index theorem 175
  inequalities 148
  lemma on critical points 136
  theory, applications of 181

Newton's method 33
Nash implicit functional theorem 33
Non-critical neck principle 127
Non-degeneracy theorem 175
Non-degenerate critical point 133

Palais-Smale condition 130, 171

Quadratic form 31

Regularly imbedded submanifold 120
Relative cubical groups of a pair 159
Riemannian manifold 121
  geodesic manifold 175

Sard's lemma 55
Section of a bundle 113
Set,
  of first category 155
  of K-th category 155
Singular cubical chain group 158
  N cube 158
Slightly continuous mapping 22
Smooth linear bundles 112
Smooth manifold 100
Strictly monotone mapping 19
Strongly monotone mapping 18

Tangent bundle 115
Tangent space,
  to a manifold 103
  to manifold of curves 168
Tangent vectors to a manifold 102
Taylor's theorem for B-spaces 28
Topological linear space 9

Vector field on a manifold 116

NONLINEAR FUNCTIONAL ANALYSIS
by J.T. Schwartz, Courant Institute of Mathematical Sciences, New York University, USA

This book delves into the subject of nonlinear analysis within the context of infinite dimensional topological spaces and manifolds. It aims to extend known theorems of nonlinear analysis from the finite to the infinite dimensional case and to analyze difficulties which arise in the infinite dimensional case. The authors address calculus on a basic level and work their way up to closed geodesics on topological spheres. Mathematicians will find this a clear explication of the theorems and applications in nonlinear functional analysis.

Related titles of interest from Gordon & Breach

SOME METHODS IN THE MATHEMATICAL ANALYSIS OF SYSTEMS AND THEIR CONTROL
by J.L. Lions

FINITE ELEMENT METHODS
Proceedings of the Symposium on Finite Element Methods, Hefei, China (May 18-23, 1981)
edited by He Guangqian and Y.K. Cheung

DIFFERENTIAL GEOMETRY AND TOPOLOGY
by J.T. Schwartz

GORDON AND BREACH SCIENCE PUBLISHERS
NEW YORK LONDON PARIS MONTREUX TOKYO
ISBN 0-677-01500-3
ISSN 0888-6113