Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
An Introduction to KAM Theory
C. Eugene Wayne
January 22, 2008
1 Introduction
Over the past thirty years, the Kolmogorov-Arnold-Moser (KAM) theory hasplayed an important role in increasing our understanding of the behavior ofnon-integrable Hamiltonian systems. I hope to illustrate in these lectures thatthe central ideas of the theory are, in fact, quite simple. With this in mind, Iwill concentrate on two examples and will forego generality for concreteness and(I hope) clarity. The results and methods which I will present are well-knownto experts in the field but I hope that by collecting and presenting them in assimple a context as possible I can make them somewhat more approachable tonewcomers than they are often considered to be.
The outline of the lectures is as follows. After a short historical introduction,I will explain in detail one of the simplest situations where the KAM techniquesare used â the case of diffeomorphisms of a circle. I will then go on to discuss thetheory in its original context, that of nearly-integrable Hamiltonian systems.
The problem which the KAM theory was developed to solve first arose incelestial mechanics. More than 300 years ago, Newton wrote down the dif-ferential equations satisfied by a system of massive bodies interacting throughgravitational forces. If there are only two bodies, these equations can be ex-plicitly solved and one finds that the bodies revolve on Keplerian ellipses abouttheir center of mass. If one considers a third body (the âthree-body-problemâ),no exact solution exists â even if, as in the solar system, two of the bodies aremuch lighter then the third. In this case, however, one observes that the mutualgravitational force between these two âplanetsâ is much weaker than that be-tween either planet and the sun. Under these circumstances one can try to solvethe problem perturbatively, first ignoring the interactions between the planets.This gives an integrable system, or one which can be solved explicitly, witheach planet revolving around the sun oblivious of the otherâs existence. Onecan then try to systematically include the interaction between the planets ina perturbative fashion. Physicists and astronomers used this method exten-sively throughout the nineteenth century, developing series expansions for thesolutions of these equations in the small parameter represented by the ratio of
1
the mass of the planet to the mass of the sun. However, the convergence ofthese series was never established â not even when the King of Sweden offereda very substantial prize to anyone who succeeded in doing so. The difficulty inestablishing the convergence of these series comes from the fact that the termsin the series have small denominators which we shall consider in some detaillater in these lectures. One can obtain some physical insight into the origin ofthese convergence problems in the following way. As one learns in an elemen-tary course in differential equations, a harmonic oscillator has a certain naturalfrequency at which it oscillates. If one subjects such an oscillator to an externalforce of the same frequency as the natural frequency of the oscillator, one hasresonance effects and the motion of the oscillator becomes unbounded. Indeed,if one has a typical nonlinear oscillator, then whenever the perturbing force hasa frequency that is a rational multiple of the natural frequency of the oscillator,one will have resonances, because the nonlinearity will generate oscillations ofall multiples of the basic driving frequency.
In a similar way, one planet exerts a periodic force on the motion of a second,and if the orbital periods of the two are commensurate, this can lead to resonanceand instability. Even if the two periods are not exactly commensurate, but onlyapproximately so the effects lead to convergence problems in the perturbationtheory.
It was not until 1954 that A. N. Kolmogorov [8] in an address to the ICMin Amsterdam suggested a way in which these problems could be overcome.His suggestions contained two ideas which are central to all applications of theKAM techniques. These two basic ideas are:
âą Linearize the problem about an approximate solution and solve the lin-earized problem â it is at this point that one must deal with the smalldenominators.
âą Inductively improve the approximate solution by using the solution of thelinearized problem as the basis of a Newtonâs method argument.
These ideas were then fleshed out, extended, and applied in numerous othercontexts by V. Arnold and J. Moser, ([1], [9]) over the next ten years or so,leading to what we now know as the KAM theory.
As I said above, we will consider the details of this procedure in two cases.The first, the problem of showing that diffeomorphisms of a circle are conjugateto rotations, was chosen for its simplicity â the main ideas are visible withfewer technical difficulties than appear in other applications. We will then lookat the KAM theory in its original setting of small perturbations of integrableHamiltonian systems. Iâll attempt to parallel the discussion of the case of circlediffeomorphisms as closely as possible in order to keep our focus on the mainideas of the theory and ignore as much as possible the additional technicalcomplications which arise in this context.
2
Acknowledgments: It is a pleasure to thank Percy Deift and Andrew Torokfor many helpful comments about these notes.
3
2 Circle Diffeomorphisms
Let us begin by discussing one of the simplest examples in which one encounterssmall denominators, and for which the KAM theory provides a solution. Itmay not be apparent for the moment what this problem has to do with theproblems of celestial mechanics discussed in the introduction, but almost all ofthe difficulties encountered in that problem also appear in this context but inways which are less obscured by technical difficulties â this is, if you like, ourwarm-up exercise.
We will consider orientation preserving diffeomorphisms of the circle, orequivalently, their lifts to the real line:
Ï : R1 â R1
Ï(x) = x + η(x) with η(x + 1) = η(x) and ηâČ(x) > â1 .
We wish to consider Ï as a dynamical system, and study the behavior of itsâorbitsâ â i.e. we want to understand the behavior of the sequences of pointsÏ(n)(x)(mod1)ân=0, where Ï(n) means the n-fold composition of Ï with itself.Typical questions of interest are whether or not these orbits are periodic, ordense in the circle.
The simplest such diffeomorphism is a rotation Rα(x) = x + α. Note thatwe understand âeverythingâ about its dynamics. For instance, if α is rational,all the orbits of Rα are periodic, and none are dense. However, we would liketo study more complicated dynamical systems than this. Thus we will supposethat
Ï(x) = x + α + η(x) , (1)
where as before, η(x + 1) = η(x) and ηâČ(x) > â1. As I said in the introduction,I will not attempt to consider the most general case, but rather will focus onsimplicity of exposition. Thus I will consider only analytic diffeomorphisms.Define the strips SÏ = z â C | |Imz| < Ï. Then I will assume that
η â BÏ =η | η(z) is analytic on SÏ,
η(x + 1) = η(x) and sup|Imz|<Ï
|η(z)| ⥠âηâÏ < â .
Note that one can assume that Ï < 1, without loss of generality.Our goal in this section will be to understand the dynamics of Ï(x) = x +
α + η(x) when η has small norm. One way to do this is to show that thedynamics of Ï are âlikeâ the dynamics of a system we understand â for instance,suppose that we could find a change of variables which transformed Ï into a purerotation. Then since we understand the dynamics of the rotation, we would alsounderstand those of Ï. If we express this change of variables as x = H(Ο), whereH(Ο +1) = 1+H(Ο) preserves the periodicity of Ï, then we want to find H suchthat
Hâ1 Ï H(Ο) = RÏ(Ο) ,
4
or equivalentlyÏ H(Ο) = H RÏ(Ο) . (2)
Such a change of variables is said to conjugate Ï to the rotation RÏ.
Remark 2.1 The relationship between this problem and the celestial mechanicsquestions discussed in the introduction now becomes more clear. In that casewe wanted to understand the extent to which the motion of the solar systemwhen we included the effects of the gravitational interaction between the variousplanets was âlikeâ that of the simple Kepler system.
In order to answer this question we need to introduce an important charac-teristic of circle diffeomorphisms, the rotation number
Definition 2.1 The rotation number of Ï is
Ï(Ï) = limnââ
Ï(n)(x)â x
n.
Remark 2.2 It is a standard result of dynamical systems theory that for anyhomeomorphism of the circle the limit on the right hand side of this equationexists and is independent of x. (See [6], p. 296.)
Remark 2.3 Note that from the definition of the rotation number, it followsimmediately that for any homeomorphism H, the map Ï = Hâ1 Ï H has thesame rotation number as Ï. (Since Ï(n) = Hâ1 Ï(n) H, and the initial andfinal factors of H and Hâ1 have no effect on the limit.)
As a final remark about the the rotation number we note that if Ï(x) = x +α+η(x), then an easy induction argument shows that Ï(Ï) = α+limnââ
1n
ânâ1j=0 η
Ï(j)(x). In particular, if α = Ï, we have limnââ1n
ânâ1j=0 η Ï(j)(x) = 0, so we
have proved:
Lemma 2.1 If Ï(x) = x + Ï + η(x) has rotation number Ï, then there existssome x0 such that η(x0) = 0.
We must next ask about the properties we wish the change of variablesH to have. If we only demand that H be a homeomorphism, then DenjoyâsTheorem ([6] p. 301) says that if the rotation number of Ï is irrational, wecan always find an H which conjugates Ï to a rotation. However, if we wantmore detailed information about the dynamics it makes sense to ask that Hhave additional smoothness. In fact, it is natural to ask that H be as smoothas the diffeomorphism itself â in this case, analytic. (There will, in general,be some loss of smoothness even in this case. We will find, for example, thatwhile there exists an analytic conjugacy function, H, its domain of analyticitywill be somewhat smaller than that of Ï.) Surprisingly, the techniques which
5
Denjoy used fail completely in this case, and the answer was not known untilthe late fifties when Arnold applied KAM techniques to answer the question inthe case when η is small. Even more surprisingly, in order to even state Arnoldâstheorem, we have to discuss a little number theory.
Any irrational number can be approximated arbitrarily well by rational num-bers, and in fact, Dirichletâs Theorem even gives us an estimate of how goodthis approximation is. More precisely, it says that given any irrational number Ï,there exist infinitely many pairs of integers (m,n) such that |Ïâ(m/n)| < 1/n2.On the other hand, most irrational numbers canât be approximated much betterthan this.
Definition 2.2 The real number Ï is of type (K, Îœ) if there exist positive num-bers K and Îœ such that |Ïâ (m/n)| > K|n|âÎœ , for all pairs of integers (m,n).
Proposition 2.1 For every Îœ > 2, almost every irrational number Ï is of type(K, Îœ) for some K > 0.
Proof: The proof is not difficult, but would take us a bit out of our way. Thedetails can be found in [3], page 116, for example. Note also, that we can assumewithout loss of generality that K †1, since if Ï is of type (K, Îœ) for some K > 1,it is also of type (1, Îœ).
Theorem 2.1 (Arnoldâs Theorem [1]) Suppose that Ï is of type (K, Îœ). Thereexists Δ(K, Îœ,Ï) > 0 such that if Ï(x) = x + Ï + η(x) has rotation number Ï,and âηâÏ < Δ(K, Îœ,Ï), then there exists an analytic and invertible change ofvariables H(x) which conjugates Ï to RÏ.
As mentioned above, Arnoldâs proof of this theorem used the KAM theory.The proof can be broken into two main parts â an analysis of a linearizedequation, and a Newtonâs method iteration step. These same two steps willreappear in the next section when we discuss nearly integrable Hamiltoniansystems, and they are characteristic of almost all applications of the KAMtheory.
Remark 2.4 It may seem that by assuming that the diffeomorphism is of theform Ï(x) = x+Ï+η(x), where Ï is the rotation number of Ï, we are consideringa less general situation than that described above in which we allowed Ï to havethe form x + α + η(x). As we shall see below, there is no real loss of generalityin this restriction.
Step 1: Analysis of the Linearized equation
Note that since âηâÏ is small, the diffeomorphism Ï is âcloseâ to the purerotation RÏ. Thus, we might hope that if a change of variables H which satisfies(2) exists is would be close to the identity i.e. H(x) = x + h(x), where h is
6
âsmallâ. If we make this assumption and substitute this form of H in (2), wefind that h should satisfy the equation
h(x + Ï)â h(x) = η(x + h(x)) (3)
If we now expand both sides of this equation, retaining only terms of first orderin the (presumably) small quantities h and η, we find:
h(x + Ï)â h(x) = η(x) (4)
Since all the functions in this equation are periodic, and the equation is linear inthe unknown function h, we can immediately write down a (formal) solution forthe coefficients in the Fourier series of h. If η(n) is the nth Fourier coefficientof η, then the nth Fourier coefficient of h is
h(n) =η(n)
e2ÏinÏ â 1, n )= 0 . (5)
In just a moment, we will address whether or not the function
h(x) =â
n %=0
h(n)e2Ïinx =â
n %=0
η(n)e2ÏinÏ â 1
e2Ïinx
makes any sense, however, we first note that even if (5) defines a well-behavedfunction, it will not solve (4) but rather:
h(x + Ï)â h(x) = η(x)ââ« 1
0η(x)dx = η(x)â η(0) . (6)
This is because the zeroth Fourier coefficient of h drops out of (4). The factthat h does not solve (4) will complicate the estimates below. The problemswith showing that (5) converges arise due to the presence of the factors ofe2ÏinÏ â 1 in the denominator of the summands, and these are the (in)famoussmall denominators which plagued celestial mechanics in the last centuryand which the KAM theory finally overcame. We first note that if Ï is rational,there is little hope that the sum defining h will converge since the denominatorsin this sum will vanish for the infinitely many n for which Ïn = m for somem â Z. Thus, we can only hope for success if Ï /â Q. If Ï is irrational, thedenominator will still be large whenever nÏ â m. However, by assuming that Ïis of type (K, Îœ), we have some control over how close to zero the denominatorcan become. In fact, the following lemma immediately allows us to estimateh(x).
Lemma 2.2 If Ï is of type (K, Îœ), then
|e2ÏinÏ â 1| = |e2Ïim(e2Ïi(Ïnâm) â 1)| â„ 4K|n|â(Îœâ1) if n )= 0 .
7
Proof: Since Ï is of type (K, Îœ), we know that |Ïnâm| â„ K|n|â(Îœâ1) and thelemma follows by writing |e2Ïi(Ïnâm) â 1| = 2| sin(Ï(Ïnâm))|, and then usingthe fact that | sin(Ïx)| â„ 2|x|, for |x| †1/2.
The other fact which we must use to estimate h is the fact that since η â BÏ,Cauchyâs theorem immediately gives an estimate on its Fourier coefficients ofthe form |η(n)| †âηâÏeâ2ÏÏ|n|. Combining this remark with Lemma 2.2, wesee that if |Imz| â€ Ï â ÎŽ, one has
|h(z)| = |â
n %=0
η(n)e2Ïinz
e2ÏinÏ â 1| â€
â
n %=0
|n|(Îœâ1)
4KâηâÏeâ2ÏÏ|n|e2Ï|n|(ÏâÎŽ)
†Î(Îœ)K(2ÏÎŽ)Îœ
âηâÏ ,
where Î(Îœ) =â«â0 xÎœâ1eâxdx, and we have assumed that 2Ï(K+1)ÎŽ †4ÏÎŽ < 1.
Thus we have proven,
Proposition 2.2 If Ï is of type (K, Îœ), and η â BÏ, then h(x), defined by (5)is an element of BÏâÎŽ for any ÎŽ > 0, and if 4ÏÎŽ < 1, we have the estimate:
âhâÏâÎŽ â€Î(Îœ)
K(2ÏÎŽ)ÎœâηâÏ .
Remark 2.5 Note that we do not get an estimate on h in BÏ â we lose someanalyticity. This is why we canât use an ordinary Implicit Function Theorem tosolve (2). Indeed, if we were to attempt to proceed with an ordinary iterativemethod to solve (2), we would find that we gradually lost all of the analyticityof our approximate solution. This is where the second âbig ideaâ of the KAMtheory enters the picture, namely:
Step2: Newtonâs Method in Banach Space
Recall that Newtonâs method says that if you want to find the roots ofsome nonlinear equation, you should take an approximate solution and then usea linear approximation to the function whose roots you seek to improve yourapproximation to the solution. You then use this improved approximation asyour new starting point and iterate this procedure. In the present circumstance,we regard Ï as an approximation to RÏ and then use the linear approximation,H(x) = x + h(x), to the conjugating function to improve that approximation.Recall that if h(x) had solved (2) exactly, then Hâ1 Ï H = RÏ. Our hope is
8
that if we use the H(x) that comes from solving (6), Hâ1 Ï H will be closerto RÏ than Ï was and then we can iterate this process.
The first thing we must do is check that H is invertible. Since H(z) = z +h(z), H will be invertible with analytic inverse on any domain on which âhâČâ < 1.Cauchyâs theorem and Proposition 2.2 imply that âhâČâÏâ2ÎŽ †2ÏÎ(Îœ)
K(2ÏÎŽ)Îœ+1 âηâÏ,so we conclude
Lemma 2.3 If 2ÏÎ(Îœ)âηâÏ < K(2ÏÎŽ)(Îœ+1) and 4ÏÎŽ < 1, then H(z) has ananalytic inverse on the image of SÏâ2ÎŽ.
Remark 2.6 Note that if we combine the inequalities in Lemma 2.3 and Propo-sition 2.2, we find âhâÏâÎŽ < ÎŽ. Thus, if z â SÏâ2ÎŽ, H(z) â SÏâÎŽ. Furthermore,H maps the real axis into itself, and the images of the lines Imz = ±(Ï â 2ÎŽ)lie outside the strip SÏâ3ÎŽ. With this information it is easy to show thatRange(H|SÏâ2ÎŽ) â SÏâ3ÎŽ, so that Hâ1(z) is defined for all z â SÏâ3ÎŽ.
In addition to knowing that the inverse exists, we need an estimate on itsproperties which the following proposition provides.
Proposition 2.3 If
2ÏÎ(Îœ)âηâÏ < K(2ÏÎŽ)(Îœ+1) and 4ÏÎŽ < 1 ,
then Hâ1(z) = z â h(z) + g(z), where
âgâÏâ4ÎŽ â€2ÏÎ(Îœ)2
K2(2ÏÎŽ)(2Îœ+1)âηâ2Ï .
Proof: If we define g(z) by g(z) = Hâ1(z)â z + h(z), then we see that
z = Hâ1 H(z) = z + h(z)â h(z + h(z)) + g(z + h(z)) .
Thus, g(z + h(z)) = h(z + h(z)) â h(z) =â« 10 hâČ(z + sh(z))h(z)ds. Setting
Ο = H(z), this becomes
g(Ο) =⫠1
0hâČ(Hâ1(Ο) + sh(Hâ1(Ο)))h(Hâ1(Ο))ds .
In a fashion similar to that in the remark just above, Range(H|SÏâ3ÎŽ) â SÏâ4ÎŽ,so if Ο â SÏâ4ÎŽ, Hâ1(Ο) â SÏâ3ÎŽ, and hence Hâ1(Ο) + sh(Hâ1(Ο)) â SÏâ2ÎŽ, soapplying the estimates on h and hâČ from above we obtain
âgâÏâ4ÎŽ †2Ï(Î(Îœ)
K
)2 âηâ2Ï(2ÏÎŽ)2Îœ+1
9
Let us now examine the transformed diffeomorphism Ï(x) = Hâ1 ÏH(x).Since h is only an approximate solution of (2), Ï will not be exactly a rotation,but since h did solve the linearized equation (6), we hope that Ï will differfrom a rotation only by terms that are of second order in the small quantitiesh and η. Using the form of Hâ1 given by Proposition 2.3, we find
Ï(x) = Hâ1 Ï H(x) = Hâ1(x + h(x) + Ï + η(x + h(x)))= x + h(x) + Ï + η(x + h(x))â h(x + Ï + h(x) + η(x + h(x))) +
+ g(x + h(x) + Ï + η(x + h(x)))= x + Ï + h(x)â h(x + Ï) + η(x)+ η(x + h(x))â η(x)+
+ h(x + Ï)â h(x + Ï + h(x) + η(x + h(x)))++ g(x + h(x) + Ï + η(x + h(x))) .
We first note that because h solves (6), the first expression in braces inthe second to last line of this sequence of equalities is equal to η0. The nextexpression in braces equals
â« 10 ηâČ(x+sh(x))h(x)ds. If 2ÏÎ(Îœ)âηâÏ < K(2ÏÎŽ)Îœ+1,
4ÏÎŽ < 1, and x â SÏâ4ÎŽ, we can bound the norm of this expression on BÏâ4ÎŽ by2ÏâηâÏ
Î(Îœ)K(2ÏÎŽ)Îœ+1 âηâÏ. Similarly, the quantity in braces in the next to last line
may be rewritten asâ« 10 hâČ(x + Ï + s(h(x) + η(x + h(x)))(h(x) + η(x + h(x))ds.
Once again, assuming that the conditions on âηâÏ and ÎŽ described above hold,and that x â SÏâ4ÎŽ, then we can bound the norm of this expression on BÏâ4ÎŽ
by2ÏÎ(Îœ)
K(2ÏÎŽ)Îœ+1
Î(Îœ)K(2ÏÎŽ)Îœ
âηâÏ + âηâÏ
âηâÏ <
4Ï(Î(Îœ))2
K2(2ÏÎŽ)2Îœ+1âηâ2Ï ,
where the last inequality used the fact that 2ÏÎŽK < 1. Finally, if |Imx| < Ïâ6ÎŽ,then |Im(x + h(x) + Ï + η(x + h(x)))| < Ï â 4ÎŽ, (since âηâÏ < ÎŽ), so that thelast term in this expression is bounded by Proposition 2.3.
Define η(x) by Ï(x) = x + Ï + η(x). By Remark 2.3, Ï has rotation numberÏ, so by Lemma 2.1 there exists x0 such that η(x0) = 0. Combining this remarkwith the expression for Ï just above, we find
η(0) = âη(x0 + h(x0))â η(x0)â h(x0 + Ï)â h(x0 + Ï + h(x0) + η(x0 + h(x0)))â g(x0 + h(x0) + Ï + η(x0 + h(x0))) .
In the previous paragraph we bounded the norm of each of the expressions onthe right hand side of this equality, so we conclude that
|η(0)| †2Ïâηâ2ÏÎ(Îœ)
K(2ÏÎŽ)Îœ+1+
4Ï(Î(Îœ))2
K2(2ÏÎŽ)2Îœ+1âηâ2Ï +
2ÏÎ(Îœ)2
K2(2ÏÎŽ)2Îœ+1âηâ2Ï .
Combining this estimate with the bounds above, we have proven,
10
Proposition 2.4 If 2ÏÎ(Îœ)âηâÏ < K(2ÏÎŽ)(Îœ+1), and 4ÏÎŽ < 1, then Ï(x) =Hâ1 Ï H(x) = x + Ï + η(x), where
âηâÏâ6ÎŽ â€16Ï(Î(Îœ))2
K2(2ÏÎŽ)2Îœ+1âηâ2Ï .
Remark 2.7 The important thing to note about the estimate of η is that inspite of the mess, it is second order in the small quantity âηâÏ as we had hoped.That is, there exists a constant C(K, ÎŽ, Îœ) such that âηâÏâ6ÎŽ †C(K, ÎŽ, Îœ)âηâ2Ï.This is what makes the Newtonâs method argument work.
The proof of Arnoldâs Theorem is completed by inductively repeating theabove procedure. The principal point which we must check is that we donâtlose all of our domain of analyticity as we go through the argument â notethat Ï is analytic on a narrower strip than was our original diffeomorphism Ï.The essential reason that there is a nonvanishing domain of analyticity at thecompletion of the argument is that the amount by which the analyticity stripshrinks at the nth step in the induction will be proportional to the amount bywhich our diffeomorphism differs from a rotation at the nth iterative step, andthanks to the extremely fast convergence of Newtonâs method, this is very small.
The Inductive Argument:
Let Ï0(x) ⥠Ï(x), be our original diffeomorphism, and set η0(x) = η(x).Define Ï1(x) = Hâ1
0 Ï0H0(x), and by induction, Ïn+1(x) = Hâ1n ÏnHn(x) =
x + Ï + ηn+1(x) where Hn(x) = x + hn(x), and hn solves
hn(x + Ï)â hn(x) = ηn(x)â η(0) .
Also define the sequence of inductive constants:
âą ÎŽn = Ï36(1+n2) , n â„ 0 (Note that this insures that 4ÏÎŽ0 < 1.)
âą Ï0 = Ï, and Ïn+1 = Ïn â 6ÎŽn, if n â„ 0.
⹠Δ0 = âηâÏ, and Δn = Δ(3/2)n
0 , if n â„ 0.
Note that Ïâ = limnââ Ïn > Ï/2. We now have:
Lemma 2.4 (Inductive Lemma) If
Δ0 <
(K
16ÏÎ(Îœ)(
Ï
36)(Μ+1)
)8
,
then Ïn+1(x) = x + Ï + ηn+1(x), with ηn+1 â BÏn+1 , and âηn+1âÏn+1 †Δn+1.
11
Furthermore, Hn(x) = x + hn(x) satisfies
âhnâÏnâÎŽn â€Î(Îœ)Δn
K(2ÏÎŽn)Îœ,
while Hâ1n (x) = xâ hn(x) + gn(x), where
âgnâÏâ4ÎŽn â€2ÏÎ(Îœ)2Δ2n
K2(2ÏÎŽn)2Îœ+1.
Proof: Note that Proposition 2.2 and Proposition 2.3 imply that the estimateson hn and gn hold for n = 0. The estimate on âη1âÏ1 follows by noting thatfrom Proposition 2.4,
âη1âÏ1 â€16Ï(Î(Îœ))2
K2(2ÏÎŽ)2Îœ+1âηâ2Ï
and the hypothesis on the inductive constants in the Inductive Lemma waschosen so that this last expression is less than Δ(3/2)
0 = Δ1. This completes thefirst induction step.
Now suppose that the induction holds for n = 0, 1, . . . , Nâ1, so that we knowthat âηNâÏN †ΔN . To prove that it holds for n = N , first choose hN to solvehN (x + Ï) â hN (x) = ηN (x) â ηN (0). By Proposition 2.2, and the inductiveestimate on ηN , we will have âhNâÏNâÎŽN †Î(Îœ)ΔN
K(2ÏÎŽN )Îœ , while Proposition 2.3
implies that Hâ1N (x) = x â hN (x) + gN (x), with âgNâÏNâ4ÎŽN †2ÏÎ(Îœ)2Δ2N
K2(2ÏÎŽN )2Îœ+1 .Finally, if we define ÏN+1 = Hâ1
N ÏN HN = x+ Ï + ηN+1, with ηN+1 definedin analogy with η in Proposition 2.4, then we see that
âηN+1âÏN+1 â€16ÏÎ(Îœ)2
K2(2ÏÎŽN )2Îœ+1Δ2N .
Once again, if use the hypotheses on the inductive constants we see that this ex-pression is bounded by Δ(3/2)
N = ΔN+1, which completes the proof of the lemma.
With the aid of the Inductive Lemma it is easy to complete the proof ofArnoldâs Theorem. Define
HN (x) = H0 H1 . . . HN (x)= x + hN (x) + hNâ1(x + hn(x) + hNâ2(x + hN (x) + hNâ1(x + hN (x)))
+ . . . + h0(x + h1(x + . . .) . . .)
12
By the Induction Lemma, HN is analytic on SÏNâ2ÎŽN and HN (z)âz is boundedby
âân=0
Î(Îœ)Δn
K(2ÏÎŽn)Îœ ⥠â. (This sum converges as a consequence of the hypotheseson the induction constants in the Induction Lemma.) Furthermore,
HN+1(z)âHN (z) = HN HN (z)âHN (z) =â« 1
0HâČ
N (z + thN (z))hN (z)ds ,
so âHN+1 âHNâÏN+1 †( 4âÎŽN
+ 1) Î(Îœ)ΔN
K(2ÏÎŽN )Îœ . Note that by the definition of theinductive constants, the right hand side of this inequality converges if summedover N . Thus, HN converges uniformly to some limit H on SÏâ , and H isanalytic. Furthermore, H(z) = z + h(z), where the estimates on HN (z) â z,plus Cauchyâs Theorem imply that if ÎŽâ = Ïâ/16, then âhâČâÏââÎŽâ †â/ÎŽâ < ÎŽâ,again using the definition of the inductive constants. By an argument similarto that following Lemma 2.3, we see that H is invertible on the image of SÏââÎŽâ
and that this image contains SÏââ2ÎŽâ .Finally, note that ÏHN (z) = HN ÏN (z) = HN (z+Ï+ηN (z)). As N ââ,
we see that
Ï H(z) = limNââ
HN ÏN (z) = limNââ
HN (z + Ï + ηN (z)) = H RÏ(z) ,
for all z â SÏââ2ÎŽâ . (The last equality used the inductive estimate on ηN .)Since H is invertible on this domain, this implies Hâ1 Ï H = RÏ, so H is thediffeomorphism whose existence was asserted in Arnoldâs Theorem.
Remark 2.8 Suppose that in Arnoldâs Theorem we were given a diffeomor-phism of the (apparently) more general form
Ï(x) = x + α + ”(x) .
but still with rotation number Ï of type (K, Îœ) (where α )= Ï.) We can rewriteÏ(x) = x+Ï+(αâÏ+”(x)) ⥠x+Ï+η(x). If âηâÏ = â(αâÏ)+”âÏ â€ Î”(K, Îœ,Ï),then Theorem 2.1 implies that Ï is analytically conjugate to RÏ
Remark 2.9 In examples it may be difficult to determine from inspection ofthe initial diffeomorphism what the rotation number is. In such cases there isoften a parameter in the problem which can be varied and which allows one toshow that the conjugacy in Arnoldâs Theorem exists at least for most parametervalues. For instance, the following result can be proven by easy modifications ofthe previous methods: (See, [1], page 271.)
Theorem 2.2 Consider the family of diffeomophisms:
Ïα,Δ(x) = x + α + Δη(x) , (7)
13
for α â [0, 1]. For every ÎŽ > 0, there exists Δ0 > 0 such that if |Δ| < Δ0, thereexists a set A(Δ) â [0, 1] such that for α â A(Δ), ÏΔ,α is analytically conjugateto a rotation of the circle, and |Lebesgue measure (A(Δ))â 1| < ÎŽ.
Remark 2.10 It is not necessary to work with analytic functions. For instance,Moser [10] showed that if the original diffeomorphism is Ck, and if the rotationnumber is of type (K, Îœ), then if k is sufficiently large (depending on Îœ), andthe diffeomorphism is a sufficiently small perturbation of a rotation, the diffeo-morphism is conjugate to a rotation, with a CkâČ conjugacy function, for some1 †kâČ < k. Note that this theorem is still âlocalâ in that it demands that thediffeomorphism which we start with be âcloseâ to a pure rotation. More recentwork of Hermann [7] and Yoccoz [12], has lead to a remarkably complete under-standing of the global picture of the dynamics of circle diffeomorphisms. Forinstance (see [12]), one can write the real numbers as a union of two disjointsets A and B, and prove that any analytic circle diffeomorphism, Ï, with rota-tion number Ï(Ï) â B is analytically conjugate to the rotation RÏ(Ï), while forany α â A, there exists an analytic circle diffeomorphism with rotation numberα which is not analytically conjugate to Rα.
14
3 Nearly Integrable Hamiltonian Systems
In this section we address the KAM theory in its original setting, namely nearlyintegrable Hamiltonian systems. Recall that a Hamiltonian system (in Euclideanspace) is a system of 2N differential equations whose form is given by
pj = ââH
âqj; j = 1, . . . , N ,
qj =âH
âpj; j = 1, . . . , N ,
for some function H(p,q). (Here p = (p1, . . . , pN ) and q = (q1, . . . , qN ).)In general these equations are just as hard to solve as any other system of 2N
coupled, nonlinear, ordinary differential equations, but in special circumstances(the integrable Hamiltonian systems) there exists a special set of variablesknown as the action-angle variables, (I,Ï), I â RN and Ï â TN , such thatin terms of these variables, H(I,Ï) = h(I). Since the Hamiltonian does notdepend on the angle variables Ï, the equations of motion are very simple â theybecome
Ij = â âH
âÏj= 0 ; j = 1, . . . , N ,
Ïj =âH
âIj⥠Ïj(I) ; j = 1, . . . , N .
We can solve these equations immediately, and we find that I(t) = I(0), andÏ(t) = Ï(I)t + Ï(0). Thus, for an integrable system with bounded trajectories,the action variables I are constants of the motion, while the angle variables Ïjust precess around an N -dimensional torus with angular frequencies Ï givenby the gradient of the Hamiltonian with respect to the actions. (In particular,if the components of Ï(I) are irrationally related to one another, Ï(t) is a quasi-periodic function.)
Remark 3.1 The three-body (or N -body) problem, in which we ignore the mu-tual interaction between the planets is an integrable system.
Now suppose that we start with an integrable Hamiltonian h(I) and make asmall perturbation which depends on both the action and the angle variables âas in the case of the solar system when we consider the gravitational interactionsbetween the planets. Then the Hamiltonian takes the form:
H(I,Ï) = h(I) + f(I,Ï) . (8)
As before we will assume that the Hamiltonian function is analytic in orderto avoid complications. More precisely, if we think of f(I,Ï) as a function
15
on RN Ă RN , which is periodic in Ï, then we assume that there exists someIâ â RN such that H can be extended to an analytic function on the setAÏ,Ï(Iâ) = (I,Ï) â CN ĂCN | |IâIâ| < Ï , |Im(Ïj)| < Ï , j = 1, . . . , N . (Iwill always use the .1 norm for N -vectors âi.e. |I| =
âNj=1 |Ij |.) We define the
norm of a function f , analytic on AÏ,Ï(Iâ) by âfâÏ,Ï = sup(I,Ï)âAÏ,Ï(Iâ) |f(I,Ï)|.(As in the previous section, one can assume without loss of generality thatÏ < 1.)
In addition, since we are interested in nearly integrable Hamiltonian systems,we will assume that f has small norm. (Just as we assumed that η was smallin the previous section.) Furthermore, we can assume that
â«TN f(I,Ï)dNÏ = 0,
since if this were nonzero it could be absorbed by redefining h(I).Question: Do the trajectories of this perturbed Hamiltonian system still lie oninvariant tori, at least for f sufficiently small?
To state the answer of this question more precisely, we need an analogue ofthe numbers of type (K, Μ) introduced in the previous section.
Definition 3.1 We say that a vector Ï â RN is of type (L, Îł) if
|ăÏ,nă| = |Nâ
j=1
Ïjnj | â„ L|n|âÎł , for all n â ZN\0 .
Remark 3.2 Note that if Ï is of type (K, Îœ), then the vector (Ï,â1) is of type(L, Îł) with K = L and Îł = Îœ â 1. Also, we again assume without loss ofgenerality that L †1.
Given this remark, and the fact that we know that the numbers of type(K, Μ) are a subset of the real line of full Lebesgue measure, the following result(whose proof we omit) is not surprising.
Proposition 3.1 If Îł > N , almost every Ï â RN is of type (L, Îł) for someL < 1.
We are now in a position to state the KAM theorem.
Theorem 3.1 (KAM) Suppose that Ï(Iâ) ⥠Ïâ is of type (L, Îł), and that thethe Hessian matrix â2h
âI2 is invertible at Iâ. (And hence on some neighborhood ofIâ.) Then there exists Δ0 > 0 such that if âfâÏ,Ï < Δ0, the Hamiltonian system(8) has a quasi-periodic solution with frequencies Ï(Iâ).
Remark 3.3 Although we have claimed in the theorem only that at least onequasi-periodic solution exists in the perturbed hamiltonian system, we will seein the course of the proof that the whole torus, I = Iâ, survives.
16
Remark 3.4 One might wonder why we study quasi-periodic orbits rather thanthe apparently simpler periodic orbits. If one considers values of the action vari-ables for which the frequencies Ïj(I) are all rationally related, then the integrablehamiltonian will have an invariant torus, filled with periodic orbits. However,under a typical perturbation, all but finitely many of these periodic orbits willdisappear. Hence, the quasi-periodic orbits are, in this sense, more stable thanthe periodic ones.
As we will see, the proof follows very closely the outline of the previoussection. In particular, we begin with:
Step 1: Analysis of the Linearized equation
The basic idea is to find new variables (I, Ï) such that in terms of these newvariables (8) will be integrable. However, not just any change of variables isallowed, because most changes of variables will not preserve the Hamiltonianform of the equations of motion. We will admit only those changes of variableswhich do preserve the Hamiltonian form of the equations. Such transformationsare known as canonical changes of variables. There is a large literature oncanonical transformations, (for a nice introduction see [2]), but pursuing it wouldtake us too far afield. In order to come to the point in as expeditious a fashionas possible, let us just note the following:
Proposition 3.2 Suppose that there exists a smooth function ÎŁ(I,Ï) such thatthe equations:
I =âÎŁâÏ
, Ï =âÎŁâI
,
can be inverted to find (I,Ï) = Ί(I, Ï). Then Ί is a canonical transformation,and ÎŁ is called its generating function.
Proof: See [2], section 48.
Remark 3.5 Note that ÎŁ(I,Ï) = ăI,Ïă is the generating function for the iden-tity transformation. (Here, ă·, ·ă is the inner product in RN .)
Remark 3.6 There are other ways of generating canonical transformations.In particular, the Lie transform method has proven to be very convenient forcomputational purposes [5]. However, the generating function method offers asimple and direct way to prove the KAM theorem and for that reason I havechosen it here.
We would like to find a canonical transformation (I,Ï) = Ί(I, Ï) such thatH(I, Ï) = H Ί(I, Ï) = h(I), or
H(âÎŁâÏ
(I,Ï),Ï) = h(I) . (9)
17
(This, by the way, is the Hamilton-Jacobi equation. In the last century, Jacobiproved the integrability of a number of physical systems by finding solutions ofthis equation.) In our example, (9) can be written as:
h((âÎŁâÏ
(I,Ï)) + f((âÎŁâÏ
(I,Ï),Ï) = h(I) . (10)
Since H is âcloseâ to an integrable Hamiltonian for f small, we can hope thatthe canonical transformation is âcloseâ to the identity transformation. Usingthe fact that we know the generating function for the identity transformation,we will look for canonical transformations whose generating functions are of theform ÎŁ(I,Ï) = ăI,Ïă+ S(I,Ï), where S is O(âfâÏ,Ï), the amount by which ourHamiltonian differs from an integrable one. If we substitute this form for ÎŁ into(10) and expand, retaining only terms that are formally of first order in thesmall quantities âfâÏ,Ï and âSâÏ,Ï, we obtain the linearized Hamilton-Jacobiequation:
ăÏ(I),âS
âÏ(I,Ï)ă+ f(I,Ï) = h(I)â h(I) . (11)
Once again, we now have a linear equation involving periodic functions, so if weexpand f(I,Ï) =
ânâZN f(I,n)ei2Ïăn,Ïă, we can solve (11) and we find
S(I,Ï) =i
2Ï
â
nâZN\0
f(I,n)ei2Ïăn,Ïă
ăÏ(I),nă(12)
Remark 3.7 Once again, as in (6), the function S defined (formally) by (12)does not satisfy (11), but rather
ăÏ(I),âS
âÏ(I,Ï)ă+ f(I,Ï) = 0 , (13)
and we will be forced to estimate the difference between these two equationsbelow.
Note that once again, we will face small denominators. Indeed, for a denseset of points I, the denominators in (12) will vanish for infinitely many choicesof n. This is the reason that many people (including Poincare) at the end ofthe last century believed that these series diverged. Nonetheless, the resultsof Kolmogorov, Arnold and Moser show that âmostâ (in the sense of Lebesguemeasure) points I give rise to a convergent series. Having S be defined only onthe complement of a dense set of points I would be a problem, since we would behard pressed to take the derivatives we need in order to compute the canonicaltransformation in Proposition 3.2. To proceed, we take advantage of the factthat because of the analyticity of f , the Fourier coefficients f(I,n) are decayingto zero exponentially fast as |n| becomes large. Thus, if we truncate the sumdefining S to consider only |n| < M , for some large M we will make only a
18
relatively small error in the solution of (11). On the other hand, since there arenow only finitely many terms in the sum defining S, we can find open sets ofaction-variables on which the generating function is defined. Before stating theprecise estimate on S, we introduce a few preliminaries.
First, define Ω ℠1, such that
max
(( sup|IâIâ|â€Ï
ââ2h
âI2â), ( sup
|IâIâ|â€Ïâ(â2h
âI2)â1â)
)< Ω .
(Here, â ·â is the norm of the matrix considered as an operator from CN â CN
with the .1 norm.) Analogously, define Ω such that sup|IâIâ|â€Ï â(â3hâI3 )â < Ω.
(In this case, â · â is the norm of (â3hâI3 ) considered as a bilinear operator from
CN ĂCN â CN .) Next note that if we define
S<(I,Ï) =i
2Ï
â
nâZN\0|n|â€M
f(I,n)ei2Ïăn,Ïă
ăÏ(I),nă,
S< will no longer be a solution of (13), but rather will solve
ăÏ(I),âS
âÏ(I,Ï)ă+ f<(I,Ï) = 0 ,
where f<(I,Ï) âĄâ
|n|â€M f(I,n)ei2Ïăn,Ïă. Note that we have already discardedall terms that were formally of more than first order in âfâÏ,Ï in order to derive(11). Thus, if in deriving this equation for S<, we change (11) only by amountsof this order, we wonât have qualitatively worsened our approximation. We willchoose M in order to insure that this is the case.
Proposition 3.3 Choose 0 < ÎŽ < Ï, and set M = | log(âfâÏ,Ï)|/(ÏÎŽ). IfÏ < L/(2ΩMÎł+1) and 4ÏÎŽ < 1, then S< is analytic on AÏâÎŽ,Ï(Iâ), and
âS<âÏâÎŽ,Ï â€(
8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N (2NÎł)âfâÏ,Ï
2ÏL
Proof: Recall that we chose our domain AÏ,Ï(Iâ) so that it was centered (inthe I variables) at a point with Ï(Iâ) = Ïâ. Now suppose that we choose|n| < M , and consider ăÏ(I),nă for some other point I in our domain. WritingI = Iâ+(Iâ Iâ), we see that |ăÏ(I),năâ ăÏ(Iâ),nă| †Ω|n|Ï. If we then use thefact that Ïâ is of type (L, Îł), we find that for |n| †M and all |Iâ Iâ| < Ï,
|ăÏ(I),nă| = |ăÏâ,nă+ (ăÏ(I),nă â ăÏâ,nă)| â„ L
|n|Îł â Ω|n|Ï â„ L
2|n|Îł ,
19
where the last inequality used the hypothesis on Ï and the fact that |n| †M .If we combine this observation with the fact that |f(I,n)| †âfâÏ,Ïeâ2ÏÏ|n|, byCauchyâs theorem, we find
âS<âÏâÎŽ,Ï â€â
|n|â€M
2|n|Îł
2ÏLâfâÏ,Ïe
â2ÏÎŽ|n|
†2âfâÏ,Ï
2ÏLNÎł(1 + 2
Mâ
m=0
mÎłeâ2ÏÎŽ|m|)N
â€(
8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎłâfâÏ,Ï
2ÏL.
In going from the first to second line of this inequality, we used the fact that
|n|Îłeâ2ÏÎŽ|n| †NÎł(maxj
|nj |)Îłeâ2ÏÎŽ|n| †NÎłNâ
j=1
max(1, |nj |)eâ2ÏÎŽ|nn| ,
so thatâ
|n|â€M |n|Îłeâ2ÏÎŽ|n| †NÎł(1 + 2âM
m=1 meâ2ÏÎŽm)N .
Now that we know that the generating function is well-defined, we can pro-ceed to check that the canonical transformation is defined and analytic, just aswe did in Proposition 2.3 in the previous section.
Proposition 3.4 If(
8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 16NÎł+1âfâÏ,Ï
2ÏÎŽÏL< 1 ,
Ï < L/(2ΩMÎł+1) and 4ÏÎŽ < 1, then the equations
I = I +âS<
âÏ, and Ï = Ï +
âS<
âI, (14)
define an analytic and invertible canonical transformation (I,Ï) = Ί(I, Ï) onthe set AÏâ3ÎŽ,Ï/4.
Proof: Just as in the proof of Lemma 2.3 we begin by using the analytic in-verse function theorem to check that (14) can be inverted. In both of theexpressions in this equation, the inverse function theorem can be applied pro-vided ââ2S<
âIâÏ âÏâ2ÎŽ,Ï/2 < 1. This in turn, follows immediately from the estimatein Proposition 3.3 and Cauchyâs Theorem.
20
The remainder of the proposition follows if we check that the transformationis onto the domain AÏâ3ÎŽ,Ï/4. (This is analogous to the proof of Proposition2.3.) Note that if (I,Ï) â AÏâ2ÎŽ,Ï/2,
ââS<
âÏâÏâ2ÎŽ,Ï/2 â€
(8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎł+1âfâÏ,Ï
2ÏÎŽL< Ï/8 ,
by the hypothesis of the Proposition, while
ââS<
âIâÏâÎŽ,Ï/4 â€
(8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 8NÎł+1âfâÏ,Ï
2ÏÏL< ÎŽ/2 ,
again by the hypotheses of the Proposition. Thus, |Iâ I| < Ï/8, while |Ïâ Ï| <ÎŽ/2. This implies that the canonical transformation maps the set AÏâ2ÎŽ,Ï/2 ontoAÏâ3ÎŽ,Ï/4, and hence that (I,Ï) = Ί(I, Ï) on this set.
Remark 3.8 For a vector valued function like âS<
âÏ on a domain AÏ,Ï,ââS<
âÏ âÏ,Ï âĄ supAÏ,Ï|âS<
âÏ |, where we recall that |âS<
âÏ | is the .1 norm of âS<
âÏ .This is the origin of the extra factor of N in these estimates.
Step 2: The Newton Step
Now, just as we did in the case of circle diffeomorphisms, where we trans-formed our original diffeomorphism with the approximate conjugacy functionobtained by solving the linearized conjugacy equation, we transform our originalHamiltonian with the approximate canonical transformation, whose generatingfunction is S<, and show that the difference between the transformed Hamil-tonian and an integrable Hamiltonian is of second order in the small quantityâfâÏ,Ï. As before, we will use this fact as the basis for a Newtonâs methodargument.
Proposition 3.5 Define H(I, Ï) = H Ί(I, Ï) ⥠h(I) + f(I, Ï). If(
8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 16NÎł+1âfâÏ,Ï
2ÏÎŽÏL< 1 ,
Ï < L/(2ΩMÎł+1) and 4ÏÎŽ < 1, then H is analytic on AÏâ3ÎŽ,Ï/4, and one hasthe estimates,
âhâ hâÏâ3ÎŽ,Ï/4 †(Ω + 2)
((8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎł+1âfâÏ,Ï
2ÏÎŽÏL
)2
,
21
and
âfâÏâ3ÎŽ,Ï/4 †2(Ω + 2)
((8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎł+1âfâÏ,Ï
2ÏÎŽÏL
)2
.
Remark 3.9 The important thing to note is that f , the amount by which ourtransformed Hamiltonian fails to be integrable is quadratic in the small quantityâfâ2Ï,Ï. Just as in Proposition 2.4 in the previous section, this will form thebasis of a Newtonâs method argument, which will allow us to prove the existenceof a quasi-periodic solution with frequencies Ïâ.
Proof: Using Taylorâs Theorem, we can rewrite
H(I, Ï) = H(I +âS<
âÏ,Ï(I, Ï))
= h(I +âS<
âÏ) + f(I +
âS<
âÏ,Ï(I, Ï))
= h(I) + ăÏ(I),âS<
âÏă+
â« 1
0
â« t
0ă(âÏ
âI(I + v
âS<
âÏ)âS<
âÏ),
âS<
âÏădvdt
+f(I,Ï) +â« 1
0ăâf
âI(I + t
âS<
âÏ,Ï),
âS<
âÏădt .
From the definition of S<, we know that ăÏ(I), âS<
âÏ ă + f(I,Ï) = fâ„(I,Ï) âĄâ
|n|â„M f(I,n)e2ÏiăÏ,nă. Thus, we can define
h(I) = h(I) + averageâ« 1
0
â« t
0ă(âÏ
âI(I + v
âS
âÏ)âS
âÏ),
âS
âÏădvdt
+averageâ« 1
0ăâf
âI(I + t
âS
âÏ,Ï),
âS
âÏădt+ averagef(I,Ï(I, Ï)) ,
where averageg(I, Ï) âĄâ«TN g(I, Ï)dÏ, and
f(I, Ï) = fâ„(I,Ï(I, Ï))
+â« 1
0
â« t
0ă(âÏ
âI(I + v
âS<
âÏ)âS<
âÏ),
âS<
âÏădvdt
+â« 1
0ăâf
âI(I + t
âS<
âÏ,Ï(I,Ï),
âS<
âÏădt
âaverageâ« 1
0
â« t
0ă(âÏ
âI(I + v
âS<
âÏ)âS<
âÏ),
âS<
âÏădvdt
âaverageâ« 1
0ăâf
âI(I + t
âS<
âÏ,Ï),
âS<
âÏ)ădt
â averagef(I,Ï(I, Ï)).
22
Remark 3.10 Subtracting the average of the three quantities in f insures thatwhen we expand f in a Fourier series, there will be no n = 0 coefficient â thiswas used in solving (11).
Both h and f are easy to estimate using the estimates of Proposition 3.3and Cauchyâs Theorem. For instance,
ââ« 1
0ăâf
âI(I + t
âS<
âÏ,Ï),
âS<
âÏădtâÏâ3ÎŽ,Ï/4
†2âfâÏ,Ï
Ï
(8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎł+2âfâÏ,Ï
2ÏÎŽL,
while,
ââ« 1
0
â« t
0ă(âÏ
âI(I + v
âS<
âÏ)âS<
âÏ),
âS<
âÏădvdtâÏâ3ÎŽ,Ï/4
†Ω
((8Î(Îł + 1)(2ÏÎŽ)Îł+1
)N 2NÎł+1âfâÏ,Ï
2ÏÎŽL
)2
.
Finally, we have the estimate
âfâ„âÏâ3ÎŽ,Ï/4 â€â
|n|â„M
âfâÏ,Ïeâ2ÏÎŽ|n| †âfâÏ,Ïe
âÏÎŽMâ
|n|â„M
eâÏÎŽ|n|
†(4ÏÎŽ
)Nâfâ2Ï,Ï ,
where the last of these inequalities came from using the definition of M inProposition 3.3.
If we combine these remarks, we immediately obtain the estimates stated inthe Proposition.
The Induction Argument:The induction follows closely the lines of the induction step in the case of the
circle diffeomorphisms. We have to keep track of two more inductive constantsâ Ïn to control the size of the domain of the action variables, and Mn to controlhow we cut off the sum defining S< at the nth stage of the iteration. Thus,we define our original Hamiltonian H(I,Ï) = H0(I,Ï) and set h(I) = h0(I) andf(I,Ï) = f0(I,Ï). Also define
âą ÎŽn = Ï36(1+n2) , n â„ 0.
23
âą Ï0 = Ï, and Ïn+1 = Ïn â 4ÎŽn, if n â„ 0.
âą Ï0 †Ï, and Ïn+1 = Ïn/8, with Ï0 chosen to satisfy the hypothesis of thefollowing Lemma.
⹠Δ0 = âfâÏ,Ï, and Δn = Δ(3/2)(n/Îł)
0 , if n â„ 0.
âą Mn = | log Δn|/(ÏÎŽn).
We set Hn+1 = Hn Ίn = hn+1 + fn+1, with fn+1(I, 0) = 0, where Ίn is thecanonical transformation whose generating function S<
n solves the equation
ăÏn(I),âS<
n
âÏ(I,Ï)ă+ f<
n (I,Ï) = 0 ,
with f<n (I,Ï) âĄ
â|n|â€Mn
fn(I,n)ei2Ïăn,Ïă, and Ïn(I) = âhnâI (I). At the nth
stage of the iteration we will work on the domain AÏn,Ïn(In) = (I,Ï) â CN ĂCN | |I â In| < Ïn , |Im(Ïj)| < Ïn , j = 1, . . . , N , where In is chosen sothat Ïn(In) = Ïâ, and we define Ωn = max(1, sup ââ2hn
âI2 â, â(â2hnâI2 )â1â), with
the supremum in these expressions running over all I with |Iâ In| < Ïn.We then have
Lemma 3.1 (KAM Induction Lemma) There exists a positive constant c1
such that if
Δ0 < 2âc1N(Îł+1) Ï8N(4Îł+1)Ï8
0L16
Î(Îł + 1)16NΩ8, and Ï0 <
2âc1L
ΩMγ+10
.
then
âą The generating function S<n satisfies
âS<n âÏnâÎŽn,Ïn â€
(8Î(Îł + 1)(2ÏÎŽn)Îł+1
)N 2NγΔn
2ÏL.
⹠Ίn is defined and analytic on AÏnâ3ÎŽn,Ïn/4(In) and maps this set intoAÏnâ2ÎŽn,Ïn/2(In).
âą âfn+1âÏn+1,Ïn+1 †Δn+1.
âą âhn+1 â hnâÏn+1,Ïn+1 †Δn+1.
âą |In+1 â In| < Ïn/8.
24
Before proving this lemma, we show how the KAM theorem follows from it.If the perturbation f in our Hamiltonian is sufficiently small, the hypotheses ofthe Induction Lemma will be satisfied, and roughly speaking, the idea is that asn ââ, Hn(I,Ï) â hâ(I), an integrable system, since fn â 0. Since all of theorbits of an integrable system are quasiperiodic, this would complete the proof.However, as n becomes larger and larger, the size of the domain in the actionvariables on which Hn is defined goes to zero. Thus, we must be a little carefulwith this limit.
Begin by defining Κn = Ί0 Ί1 . . .Ίn. By the induction lemma,Κn : AÏnâ3ÎŽn,Ïn/4(In) â AÏ0,Ï0(I0), and Hn = H0 Κnâ1. In particular,if (In(t),Ïn(t)) is a solution of Hamiltonâs equations with Hamiltonian Hn, thenΚnâ1(In(t),Ïn(t)) is a solution of Hamiltonâs equations with Hamiltonian H0.
Consider the equations of motion of Hn:
I = ââfn
âÏ, Ï = Ïn(I) +
âfn
âI.
Since ââfn
âI âÏn,Ïn/2 †2ΔnN/Ïn, and ââfn
âÏ âÏnâÎŽn,Ïn †ΔnN/ÎŽn, the trajectorywith initial conditions (In,Ï0) (for any Ï0 â TN ), will remain inAÏnâ3ÎŽn,Ïn/4(In)for all times |t| †Tn = 2n, by our hypothesis on Δ0, and the definition of theinduction constants. Furthermore, if (In(t),Ïn(t)) is the solution with theseinitial conditions, we have
max
(sup|t|â€Tn
|In(t)â In|, sup|t|â€Tn
|Ïn(t)â (Ïât + Ï0)|)†22n+2ΩΔnN/ÏnÎŽn .
Noting that the inductive estimates on In imply that there exists Iâ withlimnââ In = Iâ, we see that for t in any compact subset of the real line,(In(t),Ïn(t)) â (Iâ,Ïât + Ï0) (again using the definition of the inductive con-stants). Using the inductive bounds on the canonical transformation one canreadily establish that
âΚn(I,Ï)â (I,Ï)âÏn+1,Ïn+1 â€ââ
j=0
2N
(8Î(Îł + 1)(2ÏÎŽj)Îł+1
)N (8NγΔj
2ÏÎŽjÏjL
)⥠â ,
while
âΚn(I,Ï) â Κnâ1(I,Ï)âÏn+1,Ïn+1
= âΚnâ1 Ίn(I,Ï)âΚnâ1(I,Ï)âÏn+1,Ïn+1
†(2N +16âÏnÎŽn
)(
8Î(Îł + 1)(2ÏÎŽn)Îł+1
)N (8NγΔn
2ÏÎŽnÏnL
).
Using the definition of the inductive constants, we see that the sum over n of thislast expression converges and hence limnââ Κn(Iâ,Ïât + Ï0) = (Iâ(t),Ïâ(t))
25
exists and is a quasi-periodic function with frequency Ïâ. Similarly,limnââ |Κn(Iâ,Ïât + Ï0)âΚn(In(t),Ïn(t))| = 0, for t in any compact subsetof the real line.
Combining these two remarks, find thatlimnââ Κn(In(t),Ïn(t)) = (Iâ(t),Ïâ(t)), so (Iâ(t),Ïâ(t)) is a quasi-periodic so-lution of Hamiltonâs equations for the system with Hamiltonian H0 as claimed.
Remark 3.11 Note that this argument is independent of the point Ï0 that wetake on the original torus. Thus it shows that every trajectory on the unper-turbed torus is preserved.
Proof: (of Lemma 3.1.) Note that Propositions 3.3, 3.4, and 3.5, plus theassumption on the induction constants imply that we can start the induction,provided AÏ0â3ÎŽ0,Ï0/4(I0) â AÏ1,Ï1(I1). From the definitions of the domains andthe inductive constants, we see that this will follow provided |I0 â I1| < Ï0/8.To see that this is so we note that Ï0(I0) = Ï1(I1). Thus, Ï0(I0) â Ï0(I1) =â(h1âh0)
âI (I1). But, ââ(h1âh0)âI (I1)âÏ0â3ÎŽ0,Ï0/6 †12Δ1/Ï0, while
Ï0(I0)â Ï0(I1) =âÏ0
âI(I0)(I0 â I1)
+â« 1
0
â« t
0(â2Ï0
âI2(I0 + sI1)(I1 â I0))2dsdt .
Since â(
âÏ0âI
)â1 â †Ω and ââ2Ï0âI2 â †Ω, this implies that |I0â I1| < Ï0/8 by the
definition of the induction constants, provided ΩΩÏ0 < 1/2, which will followif the constant c1 in the Lemma is sufficiently large. This completes the firstinduction step.
Suppose that the induction argument holds for n = 0, 1, . . . ,K â 1. To proveit for n = K we first note if S<
K is defined by:
S<K(I,Ï) =
i
2Ï
â
nâZN\0|n|â€MK
fK(I,n)ei2Ïăn,Ïă
ăÏK(I),nă,
then by Proposition 3.3, we have
âS<KâÏK ,ÏKâÎŽK â€
(8Î(Îł + 1)(2ÏÎŽK)Îł+1
)N 2NγΔK
2ÏL.
Note that the hypothesis in Proposition 3.3 becomes ÏK < L/(2ΩKMÎł+1K ).
26
where,
ΩK = max(1, sup|IâIK |<ÏK
ââ2hK
âI2â, sup
|IâIK |<ÏK
â(â2hK
âI2)â1â)
†Ω max(1 +Kâ
j=1
64NΔj
Ï2j
, (1âKâ
j=1
64ΩNΔj
Ï2j
)â1) †2Ω ,
using the definition of the inductive constants. This observation, plus the hy-pothesis on Ï0 in the inductive lemma, guarantees that the hypothesis of Propo-sition 3.3 is satisfied. Thus, by Proposition 3.4, the canonical transformationΊK defined by
I = I +âS<
K
âÏ, and Ï = Ï +
âS<K
âI, (15)
is analytic and invertible on the set AÏKâ3ÎŽK ,ÏK/4(IK), and maps this set intoAÏK ,ÏK (IK).
If we then define fK+1 and hK+1, as we defined f and h in Proposition 3.5we see that
âfK+1âÏKâ3ÎŽK ,ÏK/4 †2(ΩK + 2)
((8Î(Îł + 1)(2ÏÎŽK)Îł+1
)N 2Nγ+1ΔK
2ÏÎŽKÏKL
)2
while
âhK â hK+1âÏKâ3ÎŽK ,ÏK/4 †(ΩK + 2)
((8Î(Îł + 1)(2ÏÎŽK)Îł+1
)N 2Nγ+1ΔK
2ÏÎŽKÏKL
)2
If we use the bound on Δ0, and the definitions of the inductive constants, wesee that the quantities on the right hand sides of both of these inequalities areless than ΔK+1. The proof of the inductive lemma will be completed if we canshow that AÏK+1,ÏK+1(IK+1) â AÏKâ3ÎŽK ,ÏK/4(IK). This follows in a fashionvery similar to the proof that AÏ1,Ï1(I1) â AÏ0â3ÎŽ0,Ï0/4(I0), which we demon-strated above, so we omit the details.
Remark 3.12 From the point of view of applications of this theory it is oftenconvenient to know not just what happens to a single trajectory, but rather thebehavior of whole sets of trajectories. Simple modifications of the precedingargument allow one to demonstrate the following variant of the KAM theorem.(See [4].) Consider the family of Hamiltonian systems
HΔ = h(I) + Δf(I,Ï) . (16)
27
Suppose that there exists a bounded set V â RN such that â2hâI2 (I) is invertible
for all I â V , and that for every Δ in some neighborhood of zero HΔ is analyticon a set of the form AÏ,Ï(V ) = (I,Ï) â CN ĂCN | |Iâ I| < Ï , for some I âV , and |Im(Ïj)| < Ï , j = 1, . . . , N .
Theorem 3.2 For every ÎŽ > 0, there exists Δ0 > 0 such that if |Δ| < Δ0, thereexists a set PΔ â V Ă TN , such that the Lebesgue measure of (V Ă TN )\PΔ isless than ÎŽ and for any point (I0,Ï0) â PΔ, the trajectory of (16) with initialconditions (I0,Ï0) is quasi-periodic.
Thus an informal way of stating the KAM theorem is to say thatâmostâ trajec-tories of a nearly integrable Hamiltonian systems remain quasi-periodic.
Remark 3.13 Just as in the case of Arnoldâs theorem about circle diffeomor-phisms, the KAM theorem also remains true when the Hamiltonian is onlyfinitely differentiable, rather than analytic. For a nice exposition of this the-ory, see [11].
References
[1] V. Arnold. Small denominators, 1: Mappings of the circumference ontoitself. AMS Translations, 46:213â288, 1965 (Russian original published in1961).
[2] V. Arnold. Mathematical Methods of Classical Mechanics. Springer-Verlag,New York, 1978.
[3] V. Arnold. Geometrical Methods in the Theory of Ordinary DifferentialEquations. Springer-Verlag, New York, 1982.
[4] G. Gallavotti. Perturbation theory for classical hamiltonian systems. InJ. Frohlich, editor, Scaling and Self-Similarity in Physics, pages 359â246.Birkhauser, Boston, 1983.
[5] W. Grobner. Die Lie-Reihen und ihre Anwendungen. Deutscher Verlag derWissenschaften, Berlin, 1960.
[6] J. Guckenheimer and P. Holmes. Nonlinear Oscillations, Dynamical Sys-tems, and Bifurcations of Vector-Fields. Springer-Verlag, New York, 1983.
[7] M. R. Herman. Sur la conjugaison differentiable des diffeomorphismes ducercle a des rotations. Publ. Math. I.H.E.S., 49:5â234, 1979.
[8] A. N. Kolmogorov. On conservation of conditionally periodic motions undersmall perturbations of the hamiltonian. Dokl. Akad. Nauk, SSSR, 98:527â530, 1954.
28
[9] J. Moser. On invariant curves of area-preserving mappings of an annulus.Nachr. Akad. Wiss., Gottingen, Math. Phys. Kl., pages 1â20, 1962.
[10] J. Moser. A rapidly convergent interation method, II. Ann. Scuola Norm.Sup. di Pisa, Ser. III, 20:499â535, 1966.
[11] J. Poschel. Integrability of hamiltonian systems on Cantor sets. Comm.Pure. and Appl. Math., 35:653â695, 1982.
[12] J.-C. Yoccoz. An introduction to small divisors problems. In From NumberTheory to Physics (Les Houches, 1989), chapter 14. Springer Verlag, Berlin,1992.
29