
Advanced canonical methods: Hamilton-Jacobi equation, action-angle variables, adiabatic invariants

Sergei Winitzki

July 23, 2006

Contents

1 Advanced canonical methods
  1.1 Action evaluated on solutions
  1.2 Hamilton-Jacobi equation
    1.2.1 Separation of variables in general
    1.2.2 Separation of variables in HJ equation
    1.2.3 Examples
  1.3 Action-angle variables
    1.3.1 Examples
  1.4 Adiabatic invariants
    1.4.1 Change of adiabatic invariant

This document is distributed under the terms of the GNU Free Documentation License. This license permits, among other things, unrestricted verbatim copying.

1 Advanced canonical methods

In this chapter we shall see that there exists a canonical transformation $(\vec q, \vec p) \to (\vec Q, \vec P)$ such that the new variables $(\vec Q, \vec P)$ are constant in time. This canonical transformation therefore yields a general solution of the equations of motion, expressing $\vec q(t)$ and $\vec p(t)$ through constants of integration. The Hamilton-Jacobi (HJ) equation allows one to find such a canonical transformation. Often the method of separation of variables can be used to find suitable solutions of the HJ equation. If a canonical pair of variables $(q_1, p_1)$ is separable and if the values of these variables are bounded, one can perform a canonical transformation to “action-angle” variables, $(q_1, p_1) \to (\phi, J)$, such that $J = \text{const}$ and $\dot\phi = \text{const}$. These variables conveniently describe the motion in the $(q_1, p_1)$ phase plane. The real usefulness of the action-angle variables is in the application to the case when the Hamiltonian is a slowly changing function of time. In that case, $J$ is an adiabatic invariant with respect to slow changes in the Hamiltonian. The change in $J$ is exponentially small, even if the total change of the Hamiltonian is large (but spread over a long time).

1.1 Action evaluated on solutions

Let us consider a mechanical system whose trajectories are completely known. We may consider a bunch of trajectories $\vec q(t)$ that start at a fixed initial time $t_0$ from a fixed initial point $\vec q_0$ in all possible directions (i.e. with all possible initial velocities). It is clear that the initial portions of these trajectories will cover the entire neighborhood of the initial point $\vec q_0$, and that there will be a unique trajectory that reaches a nearby point $\vec q$ at a later time $t$, for each $\vec q$ and for sufficiently small $t - t_0$. We can compute the action along the trajectories $\vec q(t)$,

$$S(\vec q, t; \vec q_0, t_0) = \int_{t_0}^{t} L(\vec q, \dot{\vec q}, t)\,dt.$$

Note that we are using the actual trajectories of the system, i.e. paths $\vec q(t)$ that solve the equations of motion (EOM).[1] We have just argued that (for small enough $t - t_0$) the function $S(\vec q, t; \vec q_0, t_0)$ is well-defined for every $\vec q$ in a neighborhood of $\vec q_0$. We can compute this function $S$ if we know all the trajectories of the system. Note that at late times $t$, trajectories may turn around or cross each other, so that there will not be a unique trajectory reaching $\vec q$ at time $t$, and the function $S$ will be undefined.

You may be wondering why the function $S$ is interesting. Let us substitute $\vec Q$ instead of $\vec q_0$, set $t_0$ to a fixed value, and treat $S(\vec q, \vec q_0 \equiv \vec Q, t)$ as a generating function of a canonical transformation, $(\vec q, \vec p) \to (\vec Q, \vec P)$. This canonical transformation defines new coordinates and a new Hamiltonian through the relations

$$\vec p = \frac{\partial S(\vec q, \vec q_0, t)}{\partial \vec q}, \qquad \vec P = -\frac{\partial S(\vec q, \vec q_0, t)}{\partial \vec q_0}, \qquad H' = H + \frac{\partial S}{\partial t}. \tag{1}$$

We shall now show by an explicit calculation that $H' = 0$, which is certainly a great simplification. This is why the function $S$ is interesting.

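To proceed, we need to compute $\partial S/\partial \vec q$, $\partial S/\partial \vec q_0$, and $\partial S/\partial t$. Note that the derivative $\partial S/\partial \vec q$ measures the variation $\delta S$ under a change $\delta\vec q$ of the final point $\vec q$ of the trajectory $\vec q(t)$, while the initial point $\vec q_0$ is held fixed. A change of the final point, of course, also changes the entire trajectory $\vec q(t)$. Suppose the trajectory is thereby changed by $\delta\vec q(t)$. Then we apply the familiar derivation of the variation $\delta S$,

$$\begin{aligned}
\delta S &= \int_{t_0}^{t} \left( \frac{\partial L}{\partial \vec q}\,\delta\vec q + \frac{\partial L}{\partial \dot{\vec q}}\,\delta\dot{\vec q} \right) dt \\
&= \left. \frac{\partial L}{\partial \dot{\vec q}}\,\delta\vec q \,\right|_{t_0}^{t} + \int_{t_0}^{t} \left( \frac{\partial L}{\partial \vec q} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\vec q}} \right) \delta\vec q\, dt \\
&= \vec p(t)\,\delta\vec q(t) - \vec p(t_0)\,\delta\vec q(t_0),
\end{aligned}$$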

since the trajectory $\vec q(t)$ is a solution of the EOM. Presently, $\delta\vec q(t_0) = 0$ since the initial point $\vec q(t_0) \equiv \vec q_0$ is held fixed. It follows that $\partial S/\partial \vec q = \vec p(t)$, which is the canonical momentum $\vec p$ evaluated on the correct trajectory $\vec q(t)$ at the final time $t$. Similarly, we could vary the initial point $\vec q_0$ and obtain $\partial S/\partial \vec q_0 = -\vec p(t_0)$. Finally, we need to compute $\partial S/\partial t$, which is the derivative with respect to the change of the final time $t$ while the initial and final coordinates, $\vec q_0$ and $\vec q$, are held fixed; under these conditions, a change of the final time will of course change the entire trajectory $\vec q(t)$ as well. To compute the quantity $\partial S/\partial t$, we note that the total derivative of the action, $dS/dt$, is (by definition) equal to the Lagrangian:

$$\frac{d}{dt} S(\vec q(t), t; \vec q_0) = L(\vec q(t), \dot{\vec q}(t), t).$$

[1] One can prove (although the proof is not simple) that the action functional, $S[q(t)]$, really has a minimum (not merely an extremum) at the solution $\vec q(t)$ of the EOM, if the neighbor trajectories emanating from $(\vec q_0, t_0)$ do not cross each other. In that case, the principle of “least action” holds literally and $S(\vec q, t, \vec q_0, t_0)$ is in fact the value of the minimum of the action over all trajectories connecting $(\vec q_0, t_0)$ and $(\vec q, t)$.

GNU Free Documentation License: http://www.gnu.org/copyleft/fdl.html

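On the other hand,

$$\frac{dS(\vec q, t; \vec q_0)}{dt} = \frac{\partial S(\vec q, t; \vec q_0)}{\partial t} + \frac{\partial S(\vec q, t; \vec q_0)}{\partial \vec q}\,\dot{\vec q} = \frac{\partial S}{\partial t} + \vec p\,\dot{\vec q}.$$

Therefore,

$$\frac{\partial S}{\partial t} = L - \vec p\,\dot{\vec q} = -H,$$

where $H$ is the Hamiltonian, $H = \vec p\,\dot{\vec q} - L$.

Now we can rewrite the canonical transformation (1) as

$$\vec p = \frac{\partial S}{\partial \vec q} = \vec p(t), \qquad \vec P = -\frac{\partial S}{\partial \vec q_0} = \vec p(t_0), \qquad H' = H + \frac{\partial S}{\partial t} = 0.$$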

We can draw two conclusions about this canonical transformation: (i) the new coordinates and momenta, $(\vec Q, \vec P)$, are simply equal to the initial conditions $\vec q(t_0)$ and $\vec p(t_0)$ at a fixed time $t_0$, and (ii) the new Hamiltonian is identically zero, indicating trivial EOM,

$$\frac{d\vec Q}{dt} = \frac{d\vec P}{dt} = 0.$$

So the new variables $(\vec Q, \vec P)$ are in a sense “constants of motion.” Writing the canonical transformation in the form $\vec q = \vec q(\vec Q, \vec P, t)$, $\vec p = \vec p(\vec Q, \vec P, t)$, we obtain a complete set of trajectories for the system, with arbitrary constants $(\vec Q, \vec P)$ representing initial conditions at time $t_0$. This is an explicit general solution of the EOM. Thus, the knowledge of a canonical transformation $(\vec q, \vec p) \to (\vec Q, \vec P)$ such that $H' = 0$ gives us a complete solution of the mechanical problem.
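The three derivatives of the action function can be checked numerically on a system whose trajectories are known. The sketch below assumes a unit-mass harmonic oscillator with $t_0 = 0$, for which the action along the true path has the standard closed form $S = \omega\,[(q^2 + q_0^2)\cos\omega t - 2 q q_0]/(2\sin\omega t)$ (valid for $0 < \omega t < \pi$, before neighbor trajectories cross). The numerical values and the helper names are illustrative choices, not part of the original text.

```python
import math

OMEGA = 1.3  # illustrative frequency; unit mass assumed


def action(q, t, q0):
    """Action along the true path from (q0, t0 = 0) to (q, t); needs 0 < OMEGA*t < pi."""
    s, c = math.sin(OMEGA * t), math.cos(OMEGA * t)
    return OMEGA * ((q * q + q0 * q0) * c - 2.0 * q * q0) / (2.0 * s)


def trajectory(q0, p0, t):
    """Exact solution (q(t), p(t)) of the EOM with initial data (q0, p0) at t = 0."""
    s, c = math.sin(OMEGA * t), math.cos(OMEGA * t)
    return q0 * c + p0 * s / OMEGA, p0 * c - OMEGA * q0 * s


def hamiltonian(q, p):
    return 0.5 * p * p + 0.5 * OMEGA ** 2 * q * q


def derivative(f, x, h=1e-5):
    """Central finite difference of a function of one variable."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


q0, p0, t = 0.4, -0.7, 0.9
q, p = trajectory(q0, p0, t)

dS_dq = derivative(lambda x: action(x, t, q0), q)    # expected: p(t)
dS_dq0 = derivative(lambda x: action(q, t, x), q0)   # expected: -p(t0)
dS_dt = derivative(lambda x: action(q, x, q0), t)    # expected: -H

print("dS/dq  - p(t)  =", dS_dq - p)
print("dS/dq0 + p(t0) =", dS_dq0 + p0)
print("dS/dt  + H     =", dS_dt + hamiltonian(q, p))
```

All three printed residuals should vanish up to finite-difference error, which is the statement $H' = H + \partial S/\partial t = 0$ for this particular system.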

Note that a generating function $S(\vec q, \vec Q, t)$ defines a valid canonical transformation only if the following determinant is nonzero,

$$\left| \frac{\partial^2 S(\vec q, \vec Q, t)}{\partial q_i\, \partial Q_j} \right| \neq 0. \tag{2}$$

One can show that this determinant is always nonzero for the action function $S(\vec q, \vec q_0, t)$, where one sets $\vec q_0 \equiv \vec Q$, under the assumption that neighbor trajectories do not cross. Indeed, the formula

$$\vec p_0 = -\frac{\partial S(\vec q, \vec q_0, t)}{\partial \vec q_0} \tag{3}$$

determines the required initial momentum $\vec p_0$ for reaching the final position $\vec q$ from the initial position $\vec q_0$. If the neighbor trajectories do not cross, there is only one such initial momentum for every final position $\vec q$, and, conversely, only one final position $\vec q$ for each initial momentum $\vec p_0$. Therefore, the formula (3) can be viewed as a system of equations for determining $\vec q$ when $\vec q_0$ and $\vec p_0$ are known. This system is always solvable with respect to $\vec q$ and the solution is unique, therefore the corresponding determinant (2) is always nonzero.

Thus, we find that the generating function $S$ yields a canonical transformation representing an explicit general solution of the EOM. Of course, the way we found this canonical transformation was by using the action function $S(\vec q, t; \vec q_0, t_0)$, while one can only determine this function if one already knows the complete solution of the EOM. It would be useful if we could determine such a canonical transformation without knowing the solutions $\vec q(t)$. One method is to use the Hamilton-Jacobi equation.

1.2 Hamilton-Jacobi equation

We have seen that the action function $S(\vec q, t; \vec q_0, t_0)$ is a generating function of a canonical transformation that transforms the Hamiltonian into zero. The new canonical variables are the initial values $(\vec q_0, \vec p_0)$, which makes it very easy to find a complete solution of the EOM. It is true that we cannot find $S(\vec q, t; \vec q_0, t_0)$ without first having a complete solution of the EOM. So at first the idea of finding this canonical transformation may seem hopeless.

However, now that we appreciate the idea that such a transformation could exist, we realize that we do not necessarily need the canonical transformation $(\vec q, \vec p) \to (\vec q_0, \vec p_0)$, where the new variables are the initial conditions. In fact, any canonical transformation,

$$\vec p = \frac{\partial F}{\partial \vec q}, \qquad \vec P = -\frac{\partial F}{\partial \vec Q}, \qquad H' = H + \frac{\partial F}{\partial t},$$

such that the new Hamiltonian $H' = 0$, would do just as well. Let us try to determine a generating function, say $F(\vec q, \vec Q, t)$, for such a canonical transformation.

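From the requirement $H' = 0$ we obtain the following condition for the generating function $F(\vec q, \vec Q, t)$,

$$\frac{\partial F(\vec q, \vec Q, t)}{\partial t} + H(\vec q, \vec p, t) = \frac{\partial F(\vec q, \vec Q, t)}{\partial t} + H\!\left(\vec q, \frac{\partial F(\vec q, \vec Q, t)}{\partial \vec q}, t\right) = 0. \tag{4}$$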

This is a partial differential equation for $F(\vec q, \vec Q, t)$ called the Hamilton-Jacobi equation. Additionally, the function $F$ should define a valid canonical transformation, i.e. we must require the nondegeneracy condition

$$\left| \frac{\partial^2 F(\vec q, \vec Q, t)}{\partial q_i\, \partial Q_j} \right| \neq 0. \tag{5}$$

Note that the constants $\vec Q$ entering the function $F(\vec q, \vec Q, t)$ are largely arbitrary parameters and could be arbitrarily redefined, $\vec Q \to \tilde{\vec Q}$, as long as the nondegeneracy condition (5) holds. For instance, we could always replace $Q_1 \to f(Q_1)$, where $f$ is an arbitrary function such that $f' \neq 0$, because this will not change the condition (5). At the same time, an additive constant (e.g. $F = F(\vec q, \vec Q, t) + Q_0$) is not desired: since $\partial F/\partial Q_0 = 1$, the matrix row $\partial^2 F/\partial Q_0 \partial q_i$ is entirely equal to zero, so the determinant (5) would be equal to zero.

For example, consider the Hamiltonian of a harmonic oscillator with mass $m$ and frequency $\omega$,

$$H(p, q) = \frac{p^2}{2m} + \frac{m\omega^2 q^2}{2}.$$

Then the generating function $F$ depends on $q$, $Q$, $t$, and the Hamilton-Jacobi equation is

$$\frac{\partial F(q, Q, t)}{\partial t} + \frac{1}{2m}\left( \frac{\partial F(q, Q, t)}{\partial q} \right)^2 + \frac{m\omega^2 q^2}{2} = 0.$$

The variable $Q$ is not being differentiated; one says that it enters into the equation as a parameter.

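Let us remark that the HJ equation can be simplified for conservative systems ($\partial H/\partial t = 0$). We can look for the generating function $F$ in the form

$$F = F(\vec q, \vec Q) + f(t, \vec Q),$$

where, with a slight abuse of notation, $F(\vec q, \vec Q)$ on the right-hand side denotes the time-independent part. This gives

$$-\frac{\partial f(t, \vec Q)}{\partial t} = H\!\left(\vec q, \frac{\partial F(\vec q, \vec Q)}{\partial \vec q}\right).$$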

Since $H$ does not depend on $t$ while $f$ does not depend on $\vec q$, a solution is possible only if both $-\partial f/\partial t$ and $H$ are equal to a function only of the parameters $\vec Q$. (If you are unfamiliar with this logical step, see Sec. 1.2.1.) Let this function be $E(\vec Q)$. Therefore, we find

$$f(t, \vec Q) = -t\,E(\vec Q),$$

while the rest of the HJ equation is

$$H\!\left(\vec q, \frac{\partial F(\vec q, \vec Q)}{\partial \vec q}\right) = E(\vec Q). \tag{6}$$

Since the choice of the parameters $\vec Q$ is arbitrary (subject only to the nondegeneracy condition), we may replace one of these parameters, say $Q_n$, with $E$. In this way we obtain a simpler HJ equation.
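As an illustration, the simplified equation (6) can be solved in a few lines for the harmonic oscillator introduced above, taking the single parameter to be $Q \equiv E$. The following is a sketch under two assumptions that are not in the original text: the positive branch of the square root is chosen, and the lower limit of integration is set to $q = 0$.

```latex
% Harmonic oscillator: H = p^2/(2m) + m\omega^2 q^2/2, parameter Q \equiv E,
% generating function F(q, E) - E t.
\begin{align*}
&\frac{1}{2m}\left(\frac{\partial F(q,E)}{\partial q}\right)^{2}
   + \frac{m\omega^{2}q^{2}}{2} = E
 \quad\Longrightarrow\quad
 F(q,E) = \int_{0}^{q}\sqrt{2mE - m^{2}\omega^{2}q'^{2}}\;dq', \\
&P = -\frac{\partial}{\partial E}\bigl[F(q,E) - Et\bigr]
   = t - \int_{0}^{q}\frac{m\,dq'}{\sqrt{2mE - m^{2}\omega^{2}q'^{2}}}
   = t - \frac{1}{\omega}\arcsin\!\left(q\,\omega\sqrt{\frac{m}{2E}}\right), \\
&q(t) = \sqrt{\frac{2E}{m\omega^{2}}}\;\sin\bigl(\omega\,(t-P)\bigr).
\end{align*}
```

The two constants $E$ and $P$ play the role of the energy and of the moment of time at which $q = 0$. The nondegeneracy condition (5) holds here because $\partial^2 F/\partial q\,\partial E = m/\sqrt{2mE - m^2\omega^2 q^2} \neq 0$.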

1.2.1 Separation of variables in general

The method of separation of variables is usually applied to partial differential equations. You may skip this section if you understand the idea of separating variables.

Let us start with a simple example. Suppose someone gives you the following equation:

$$1 + a x^3 = e^{by}, \tag{7}$$

and asks you to determine the values of $a$ and $b$ such that Eq. (7) holds for all values of $x$ and $y$. It is easy to see that the only possible solution is $a = b = 0$. Indeed, if Eq. (7) holds for all $x$ and $y$, let us fix the value of $y$, e.g. set $y = 3$. Then we find $1 + a x^3 = e^{3b}$, which should hold for all $x$, say for $x = 0$ and $x = 1$. Then we get $1 = e^{3b}$ and $1 + a = e^{3b}$, so $a = 0$. But then Eq. (7) says that we have $1 = e^{by}$ for all $y$, say for $y = 1$. Then we have $1 = e^{b}$, so $b = 0$.

Let us make this reasoning more general. Equation (7) is of the form $f(x) = g(y)$, where $f$ and $g$ are some functions. Let us fix the value of $y$, say $y = y_0$; then we get $f(x) = g(y_0)$. The right-hand side, $g(y_0)$, is a fixed number, so the only way $f(x) = g(y_0)$ can hold for all $x$ is if the left-hand side, $f(x)$, is in fact independent of $x$. Similarly, $g(y)$ should be independent of $y$. In other words, the only way $f(x) = g(y)$ can hold for all $x, y$ is when both $f(x)$ and $g(y)$ are equal to a constant, $f(x) = g(y) = C$. This is the main logical step in the method of separation of variables.

As another example, consider the equation

$$y\,(A + Bx) + (C + y)\,(1 + x) = 0. \tag{8}$$

Our goal is to determine the values of the parameters $A, B, C$ such that Eq. (8) holds for all $x, y$. Let us transform Eq. (8) to

$$\frac{A + Bx}{1 + x} = -\frac{C + y}{y}.$$

Since the left-hand side is a function only of $A, B, x$ while the right-hand side is a function of $C, y$, we conclude that both sides must be equal to a constant, say $D$:

$$\frac{A + Bx}{1 + x} = -\frac{C + y}{y} = D.$$

It follows that $A = -1$, $B = -1$, $C = 0$, $D = -1$. One says that the variables $x$ and $y$ can be separated in Eq. (8).
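The result for Eq. (8) is easy to spot-check by brute force. The following sketch evaluates the left-hand side of Eq. (8) on a small grid of $(x, y)$ values (the grid and the function names are arbitrary illustrative choices) and confirms that it vanishes for $A = B = -1$, $C = 0$ but not for a perturbed value of $C$.

```python
import itertools


def lhs(x, y, A, B, C):
    """Left-hand side of Eq. (8): y (A + B x) + (C + y)(1 + x)."""
    return y * (A + B * x) + (C + y) * (1 + x)


GRID = [-2.0, -0.5, 0.0, 1.0, 3.5]  # arbitrary sample points


def holds_everywhere(A, B, C):
    """True if Eq. (8) is satisfied at every grid point (x, y)."""
    return all(abs(lhs(x, y, A, B, C)) < 1e-12
               for x, y in itertools.product(GRID, GRID))


solution_ok = holds_everywhere(-1.0, -1.0, 0.0)   # the separated solution
perturbed_ok = holds_everywhere(-1.0, -1.0, 0.1)  # C slightly off

print(solution_ok, perturbed_ok)
```

A finite grid cannot prove the identity for all $x, y$, of course; it only guards against algebra slips.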

More generally, we may have an equation of the form

$$f(x_1, x_2, ..., x_n, y_1, y_2, ..., y_n) = g(y_1, y_2, ..., y_n, z_1, z_2, ..., z_n),$$

i.e. both sides share a set of variables $(y_1, ..., y_n)$. Again, it is given that this equation holds for all values of the vectors $\vec x, \vec y, \vec z$. In that case, it is clear that both sides must be equal to a function $h(\vec y)$ depending only on the shared variables, so we may write

$$f(\vec x, \vec y) = h(\vec y) = g(\vec y, \vec z).$$

In this case, one says that the variables $\vec x$ and $\vec z$ can be separated from other variables. The function $h(\vec y)$ is so far unknown; but the condition that $f(\vec x, \vec y)$ is independent of $\vec x$ and that $g(\vec y, \vec z)$ is independent of $\vec z$ is a very strong restriction which usually allows one to achieve significant progress in the calculations.

1.2.2 Separation of variables in HJ equation

The HJ equation (4) is an equation in partial derivatives and appears to be much more complicated than the set of EOM ($\dot{\vec q} = \partial H/\partial \vec p$, $\dot{\vec p} = -\partial H/\partial \vec q$), which are ordinary differential equations. However, we do not need a general solution of the HJ equation; we only need to find a particular solution containing some constants $\vec Q$. In many cases, this task turns out to be easier than solving the EOM directly. In fact, in many cases the solution can be guessed, and this turns out to be easier than guessing the solutions of the EOM directly. Since we are merely looking for one suitable solution, we are free to choose any ansatz and substitute it into the HJ equation. If that ansatz works, we will find a solution and be done; if that ansatz does not work, we can try another ansatz.

A common method to find suitable solutions of the HJ equation (with the constants) is to make a guess that the solution has the form

$$F(\vec q, \vec Q, t) = S_1(q_1, Q_1) + W(q_2, q_3, ..., q_n, \vec Q, t). \tag{9}$$

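This ansatz is called separating the variable $q_1$. Substituting the ansatz (9) into the HJ equation, we find

$$H\!\left(q_1, \frac{\partial S_1}{\partial q_1}, q_2, \frac{\partial W}{\partial q_2}, ..., q_n, \frac{\partial W}{\partial q_n}, t\right) + \frac{\partial W}{\partial t} = 0. \tag{10}$$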

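Note that the entire dependence on $q_1$ is contained in $S_1$. Suppose that we can rewrite Eq. (10) in the form

$$f_1\!\left(q_1, \frac{\partial S_1}{\partial q_1}\right) = g\!\left(q_2, \frac{\partial W}{\partial q_2}, ..., q_n, \frac{\partial W}{\partial q_n}, \frac{\partial W}{\partial t}, t\right), \tag{11}$$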

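i.e. suppose we can “solve for $q_1$” in this sense. The left-hand side of Eq. (11) is a function only of $q_1$ and $Q_1$, while the right-hand side is a function of $q_2, ..., q_n$, $t$, and $\vec Q$. Therefore, both sides of Eq. (11) must be equal to some function of the parameter $Q_1$:

$$f_1\!\left(q_1, \frac{\partial S_1}{\partial q_1}\right) = f(Q_1) = g\!\left(q_2, \frac{\partial W}{\partial q_2}, ..., q_n, \frac{\partial W}{\partial q_n}, \frac{\partial W}{\partial t}, t\right).$$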

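We may redefine the constant $Q_1$ as $Q_1 \to f(Q_1)$, so that both sides of Eq. (11) are now equal to $Q_1$. This is a convenient way to introduce the necessary constant $Q_1$ into the solution. Then we find two equations

$$f_1\!\left(q_1, \frac{\partial S_1}{\partial q_1}\right) = Q_1, \tag{12}$$

$$g\!\left(q_2, \frac{\partial W}{\partial q_2}, ..., q_n, \frac{\partial W}{\partial q_n}, \frac{\partial W}{\partial t}, t\right) = Q_1. \tag{13}$$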

The first of these equations is an ordinary differential equation (since it contains only derivatives with respect to $q_1$) and can be easily integrated to compute $S_1(q_1, Q_1)$. Thus we have separated the variable $q_1$, introduced the necessary constant $Q_1$, and obtained a simplified remaining part of the HJ equation that does not contain $q_1$.

Equation (12) is easy to solve if we invert the function $f_1$, so that

$$\frac{\partial S_1}{\partial q_1} = \tilde f_1(q_1, Q_1), \qquad S_1(q_1, Q_1) = \int^{q_1} \tilde f_1(q_1, Q_1)\, dq_1,$$

where $\tilde f_1$ is the function inverse to $f_1$ with respect to its second argument. Since $\partial F/\partial q_1 = \partial S_1/\partial q_1$, which follows from Eq. (9), the function $\tilde f_1(q_1, Q_1)$ is actually equal to the canonical momentum $p_1$. Therefore, the trajectory of the system in the phase plane $(q_1, p_1)$ is fully determined by the equation $p_1 = \tilde f_1(q_1, Q_1)$, independently of all the other variables $p_j, q_j$, $j \neq 1$. It is in this sense that the variable $q_1$ is called separable from the other variables.

The remaining part (13) of the HJ equation can be treated similarly: we could try separating the variable $q_2$ (which would in turn introduce the constant $Q_2$ into the solution). If the Hamiltonian is such that all the variables can be separated in the HJ equation, the resulting function $F$ has the form

$$F(\vec q, \vec Q, t) = S_1(q_1, Q_1) + S_2(q_2, Q_2) + ... + S_n(q_n, Q_n) - E(\vec Q)\,t.$$

In this case the trajectories of the canonical pair $(q_1, p_1)$ are independent of trajectories of every other canonical pair $(q_i, p_i)$, and all these trajectories can be found explicitly. Then the Hamiltonian is called completely integrable.

1.2.3 Examples

We shall now consider some examples where a mechanical system is solved using the method of the HJ equation. In these examples, we consider relatively simple systems that can be solved using ordinary methods. The purpose of these examples is to illustrate the Hamilton-Jacobi theory.[2]

Let us start with a system consisting of two noninteracting particles, described by the Hamiltonian

$$H = \frac{p_1^2}{2} + V_1(q_1) + \frac{p_2^2}{2} + V_2(q_2). \tag{14}$$

It is perhaps obvious by looking at this Hamiltonian that the variables $(q_1, p_1)$ are separable from $(q_2, p_2)$, since the two particles do not interact at all and their motions are independent. Let us check that the HJ equation indeed allows one to separate these variables, in the sense defined in Sec. 1.2.2 above. Since the Hamiltonian (14) is time-independent, we write the ansatz as

$$F(q_1, q_2, t) = S_1(q_1) + S_2(q_2) - Et. \tag{15}$$

(So far there is only one constant, $E$, so we are waiting to introduce another one.) The simplified HJ equation (6) then becomes

$$\frac{1}{2} S_1'^2 + V_1(q_1) + \frac{1}{2} S_2'^2 + V_2(q_2) = E, \tag{16}$$

where we simply write $S_1'$ instead of $\partial S_1/\partial q_1$ since $S_1$ depends only on $q_1$, and similarly for $S_2$. Now we notice that only the first two terms in Eq. (16) depend on $q_1$, so we rewrite that equation as

$$\frac{1}{2} S_1'^2(q_1) + V_1(q_1) = E - \frac{1}{2} S_2'^2(q_2) - V_2(q_2),$$

which shows explicitly that the variable $q_1$ separates. Both sides of the above equation are equal to a constant which we may call $E_1$,

$$\frac{1}{2} S_1'^2(q_1) + V_1(q_1) = E_1, \qquad E - \frac{1}{2} S_2'^2(q_2) - V_2(q_2) = E_1.$$

It is easy to obtain $S_{1,2}(q)$ in the form of integrals,

$$S_1(q_1) = \int^{q_1} \sqrt{2E_1 - 2V_1(q_1)}\, dq_1, \qquad S_2(q_2) = \int^{q_2} \sqrt{2E - 2E_1 - 2V_2(q_2)}\, dq_2.$$

[2] Of course, the real usefulness of the HJ formalism is in its power to solve problems that could not otherwise be solved. In fact, large classes of completely integrable systems were discovered using this formalism. But such complicated systems are beyond the scope of the present text.

Let us denote $E - E_1 \equiv E_2$ for convenience (these constants will appear in the solution more symmetrically). The solution of the HJ equation is thus

$$F(q_1, E_1, q_2, E_2, t) = \sqrt{2} \int^{q_1} \sqrt{E_1 - V_1(q_1)}\, dq_1 + \sqrt{2} \int^{q_2} \sqrt{E_2 - V_2(q_2)}\, dq_2 - (E_1 + E_2)\, t. \tag{17}$$

This is the generating function of the canonical transformation $(q_1, p_1, q_2, p_2) \to (E_1, D_1, E_2, D_2)$, where $E_{1,2}$ are the new “coordinates” and $D_{1,2}$ the new “momenta.” The new Hamiltonian vanishes,

$$H'(E_1, D_1, E_2, D_2) = H + \frac{\partial F}{\partial t} = 0,$$

therefore the new canonical variables are constant in time. We can now obtain explicit formulas relating the old canonical variables $(q_1, p_1, q_2, p_2)$ to the new ones. We write the definition of the canonical transformation generated by $F$,

$$p_j = \frac{\partial F}{\partial q_j}, \qquad D_j = -\frac{\partial F}{\partial E_j}, \qquad j = 1, 2.$$

Substituting Eq. (17) for $F$, we get

$$D_j = -\frac{\partial F}{\partial E_j} = -\frac{1}{\sqrt{2}} \int^{q_j} \frac{dq_j}{\sqrt{E_j - V_j(q_j)}} + t, \qquad p_j = \sqrt{2}\, \sqrt{E_j - V_j(q_j)}.$$

These relations can be viewed as explicit formulas determining $q_j(t)$ and $p_j(t)$ through the arbitrary constants $E_j$ and $D_j$. If we can evaluate the integrals in closed form as certain functions, e.g.

$$\frac{1}{\sqrt{2}} \int_0^{z} \frac{dz'}{\sqrt{E_j - V_j(z')}} \equiv h_j(z), \qquad j = 1, 2$$

(the initial value $z = 0$ was set arbitrarily), then we may write the solution as

$$h_1(q_1(t)) = t - D_1, \qquad h_2(q_2(t)) = t - D_2,$$

and determine $q_j(t)$. Thus the equations of motion are completely integrated. The interpretation of the new canonical variables is that $E_j$ is the energy of the particle $j$, while $D_j$ is the moment of time when $q_j = 0$.
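To see the relation $h_j(q_j(t)) = t - D_j$ at work, one can pick a potential for which everything is known in closed form. The sketch below assumes the hypothetical choice $V(z) = z^2/2$, for which $h(z) = \arcsin(z/\sqrt{2E})$ and hence $q(t) = \sqrt{2E}\,\sin(t - D)$ on the branch $|t - D| < \pi/2$. It evaluates $h$ by a simple midpoint rule (the function names and numerical values are illustrative) and compares it with $t - D$.

```python
import math


def h_numeric(z, energy, potential, steps=20000):
    """Midpoint-rule value of h(z) = (1/sqrt 2) * int_0^z dz' / sqrt(E - V(z'))."""
    dz = z / steps
    total = 0.0
    for k in range(steps):
        zk = (k + 0.5) * dz
        total += dz / math.sqrt(energy - potential(zk))
    return total / math.sqrt(2.0)


def potential(z):
    """Assumed example potential V(z) = z^2 / 2 (not prescribed by the text)."""
    return 0.5 * z * z


E, D = 0.8, 0.25   # the two arbitrary constants for one particle
t = 1.0            # chosen so that |t - D| < pi/2

q_t = math.sqrt(2.0 * E) * math.sin(t - D)   # closed-form trajectory
residual = h_numeric(q_t, E, potential) - (t - D)

print("h(q(t)) - (t - D) =", residual)
```

For a potential without a closed-form integral the same routine still gives $h_j$, and $q_j(t)$ would then be obtained by numerically inverting $h_j(q) = t - D_j$.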

As a second example, consider the following Hamiltonian (describing a particle in a magnetic field),

$$H = \frac{p_1^2}{2} + \frac{(p_2 + q_1)^2}{2} + V(q_1).$$

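This time it may not be obvious whether $(q_1, p_1)$ are separable from $(q_2, p_2)$. Let us try to substitute the ansatz (15) into the HJ equation (the formulas below correspond to $V(q_1) = 0$),

$$\frac{1}{2} S_1'^2(q_1) + \frac{1}{2} \left( S_2'(q_2) + q_1 \right)^2 = E. \tag{18}$$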

This equation is not yet in an explicitly separable form because $q_1$ and $q_2$ are mixed together; for instance, we cannot integrate $S_1$ since

$$S_1'^2(q_1) = 2E - \left( S_2'(q_2) + q_1 \right)^2 \tag{19}$$

depends on both $q_1$ and $q_2$. We would like to gather $q_1$ on one side of the equation and $q_2$ on the other side. So we solve Eq. (18) for $S_2'$,

$$S_2'(q_2) = -q_1 + \sqrt{2E - S_1'^2(q_1)}.$$

Now the HJ equation is reduced to an explicitly separable form: both sides are equal to a constant that we may denote $P_2$,

$$S_2'(q_2) = P_2, \tag{20}$$

$$\sqrt{2E - S_1'^2(q_1)} - q_1 = P_2. \tag{21}$$

The variables are separated, and it remains to integrate the equations. Equation (20) yields simply

$$S_2(q_2) = P_2 q_2.$$

(Note that we do not need to add another constant to $P_2 q_2$ when we integrate; as we showed in Sec. 1.2, additive constants are not desired in the solution of the HJ equation.) Finally, we can express $S_1'$ from Eq. (21) or directly from Eq. (19) and integrate:

$$S_1(q_1) = \int^{q_1} \sqrt{2E - (P_2 + q_1)^2}\, dq_1.$$

This integral is elementary (i.e. can be computed in terms of elementary functions such as $\exp$, $\sin$, etc.) but we do not need its explicit form at this time.

We can now put together the solution of the HJ equation,

$$F(q_1, q_2, E, P_2, t) = \int^{q_1} \sqrt{2E - (P_2 + q_1)^2}\, dq_1 + P_2 q_2 - Et.$$

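This generating function yields the canonical transformation to new variables $(D, E, Q_2, P_2)$ defined by

$$D = -\frac{\partial F}{\partial E} = -\int^{q_1} \frac{dq_1}{\sqrt{2E - (P_2 + q_1)^2}} + t,$$

$$Q_2 = -\frac{\partial F}{\partial P_2} = -q_2 + \int^{q_1} \frac{(P_2 + q_1)\, dq_1}{\sqrt{2E - (P_2 + q_1)^2}}.$$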

These integrals are elementary, so we find

$$D = t - \arcsin \frac{P_2 + q_1}{\sqrt{2E}}, \qquad Q_2 = -q_2 - \sqrt{2E - (P_2 + q_1)^2}.$$

Finally, we invert these equations to obtain explicit solutions as functions of the four constants,

$$q_1(t) = -P_2 + \sqrt{2E}\, \sin(t - D),$$

$$q_2(t) = -Q_2 - \sqrt{2E}\, \cos(t - D).$$
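As a cross-check of this solution, one can integrate Hamilton's equations directly and compare with the closed-form expressions. The sketch below assumes $V(q_1) = 0$ (as the formulas above do), takes $p_1 = \dot q_1 = \sqrt{2E}\cos(t - D)$ and $p_2 = P_2$, and uses a hand-written fourth-order Runge-Kutta step; the step size and the values of the four constants are arbitrary illustrative choices.

```python
import math


def rhs(state):
    """Hamilton's equations for H = p1^2/2 + (p2 + q1)^2/2 (V = 0 assumed)."""
    q1, q2, p1, p2 = state
    return (p1, p2 + q1, -(p2 + q1), 0.0)


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the autonomous system above."""
    def shifted(k, factor):
        return tuple(s + factor * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * dt))
    k3 = rhs(shifted(k2, 0.5 * dt))
    k4 = rhs(shifted(k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


E, D, P2, Q2 = 0.9, 0.3, -0.2, 0.5   # the four constants of the HJ solution


def exact(t):
    """Closed-form (q1, q2, p1, p2) obtained from the HJ method."""
    amp = math.sqrt(2.0 * E)
    return (-P2 + amp * math.sin(t - D),
            -Q2 - amp * math.cos(t - D),
            amp * math.cos(t - D),
            P2)


dt, steps = 0.001, 5000
state = exact(0.0)
for _ in range(steps):
    state = rk4_step(state, dt)

max_error = max(abs(a - b) for a, b in zip(state, exact(dt * steps)))
print("max deviation from the HJ solution:", max_error)
```

The deviation should be at the level of the integrator's truncation error, confirming that the four constants $(D, E, Q_2, P_2)$ parametrize genuine trajectories.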

These examples show how the HJ method yields an explicit solution of a Hamiltonian system. The HJ equation reveals the structure of the relationships between the canonical variables, which is not always easy to see by looking directly at the Hamiltonian. This is why the HJ method is one of the most powerful methods used to study Hamiltonian systems.

1.3 Action-angle variables

Suppose that we have a Hamiltonian $H(\vec q, \vec p)$ for which at least one pair of canonical variables, say $(q_1, p_1)$, is separable in the HJ equation. Then, as we found in Sec. 1.2.2, the trajectory in the phase plane $(q_1, p_1)$ is independent of other canonical variables and is determined by an equation of the form

$$f_1\!\left(q_1, \frac{\partial S_1(q_1, Q_1)}{\partial q_1}\right) \equiv f_1(q_1, p_1) = Q_1. \tag{22}$$

In other words, the trajectories in the phase plane are level curves of the function $f_1(q_1, p_1)$. Moreover, let us now suppose that the motion in the $(q_1, p_1)$ plane is bounded (i.e. the values of $q_1$ and $p_1$ during motion along a particular trajectory are always smaller than some maximal values $q_{\max}$ and $p_{\max}$). Then it is clear that at least some of the trajectories are closed curves. In particular, suppose $f_1(q_1, p_1)$ has a local minimum or a local maximum at some point in the phase plane. Then trajectories around that point will look like deformed circles or “ovals.” These trajectories correspond to periodic motion.[3] Physically, such motion is interpreted as an oscillation with a frequency that in general depends on the amplitude of the oscillation.

It is convenient to describe these periodic trajectories in terms of two canonical variables: one, denoted $\phi$, that represents the “angle” along the circle, and another, denoted $J$, that labels the different “ovals.” Let us remark that the “oval” is already labeled by the value $Q_1 = f_1(q_1, p_1)$, which is a constant of motion that varies only from one “oval” to another. However, one could redefine $f_1$ and replace the constant $Q_1$ by any function of $Q_1$, while we would like to have in some sense a “standard” choice of this variable, so that we could compare two different Hamiltonians in an unambiguous way. A convenient requirement that fixes the choice of canonical variables $(\phi, J)$ is that the “angle” $\phi$ should vary from $0$ to $2\pi$ as the system traverses any “oval” once.

Here is how one could motivate this requirement. Let us assume for simplicity that the canonical variables $(q_1, p_1)$ are the only variables present in the system, and that $H(q_1, p_1)$ is the full Hamiltonian of the system. Suppose $(q_1, p_1) \to (\phi, Q_1)$, $H \to H'$ is a canonical transformation such that $Q_1$ is a constant of motion. It means that the Hamilton equations for the variables $(\phi, Q_1)$ are

$$\dot Q_1 = -\frac{\partial H'(\phi, Q_1)}{\partial \phi} = 0, \qquad \dot\phi = \frac{\partial H'(\phi, Q_1)}{\partial Q_1}.$$

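It follows from the first equation that $H'$ is a function only of $Q_1$, and then it follows from the second equation that $\dot\phi$ is a function only of $Q_1$, i.e. a constant. Therefore

$$\phi(t) = t\, \frac{\partial H'}{\partial Q_1} \equiv \omega(Q_1)\, t, \qquad \omega(Q_1) \equiv \frac{\partial H'(Q_1)}{\partial Q_1},$$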

in other words, $\phi(t)$ grows linearly with $t$. Since the motion is along a closed curve (“oval”), it is natural to interpret $\phi$ as the “angle” and $\omega$ as the “angular velocity” (in quotation marks, since nothing is really rotating and we are merely interpreting the trajectory in the phase plane $(q_1, p_1)$ in a geometrical way). A redefinition $Q_1 \to f(Q_1)$ will change the value of this “angular velocity” as $\omega \to \omega f'(Q_1)$. Thus we could set $\omega$ to any function of $Q_1$ if we wish. Then it is natural to require that $\omega(Q_1)$ be such that the angle $\phi$ always varies from $0$ to $2\pi$ as the system traverses any “oval,” for any fixed $Q_1$. It is clear that this can be arranged by a suitable redefinition $Q_1 \to f(Q_1)$.

It turns out that this requirement uniquely singles out a choice of canonical variables. The variables chosen in this way are commonly denoted by $J$ and $\phi$ and called the action-angle variables. We shall now determine the required canonical transformation $(q_1, p_1) \to (\phi, J)$, assuming that the function $f_1(q_1, p_1)$ from Eq. (22) and the corresponding solution $S_1(q_1, Q_1)$ are known.

Let us suppose that the generating function for the canonical transformation is $F(q_1, J)$, where $J$ is interpreted as the new momentum and $\phi$ as the new coordinate. Then we have

$$\phi(q_1, J) = \frac{\partial F(q_1, J)}{\partial J}, \qquad p_1(q_1, J) = \frac{\partial F(q_1, J)}{\partial q_1}. \tag{23}$$

[3] Of course, there may be other trajectories that are not periodic and/or not bounded. The study of the geometry of trajectories in phase space is a fascinating area of modern theoretical mechanics that serves as a basis of chaos theory. But this material is beyond the scope of the present text.

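We now use the following trick: We write the total change of the variable $\phi$ as the system traverses one “oval” as an integral over the closed curve, using $q_1$ as the integration variable,

$$\begin{aligned}
\Delta\phi &= \oint d\phi = \oint \frac{d\phi}{dq_1}\, dq_1 = \oint \frac{\partial^2 F(q_1, J)}{\partial q_1\, \partial J}\, dq_1 \\
&= \frac{\partial}{\partial J} \oint \frac{\partial F(q_1, J)}{\partial q_1}\, dq_1 = \frac{\partial}{\partial J} \oint p_1(q_1, J)\, dq_1.
\end{aligned}$$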

We now require that $\Delta\phi = 2\pi$. Consider the function

$$A(J) \equiv \oint p_1(q_1, J)\, dq_1.$$

Since $\Delta\phi = dA/dJ = 2\pi$, we have $A(J) = 2\pi J$, or

$$J = \frac{1}{2\pi} \oint p_1(q_1)\, dq_1.$$

(Note that $J$ has dimensions of action.) The above equation yields the value of $J = J(Q_1)$ for the “oval” corresponding to the given value of $Q_1$, determined in the phase plane by the algebraic equation (22). We may invert that equation (at fixed $Q_1$) to obtain $p_1 = \tilde f_1(q_1, Q_1)$. Then $J(Q_1)$ is expressed as

$$J(Q_1) = \frac{1}{2\pi} \oint \tilde f_1(q_1, Q_1)\, dq_1, \tag{24}$$

where the integral is performed over the “oval” given by the equation $p_1 = \tilde f_1(q_1, Q_1)$ in the phase plane, at a fixed value of $Q_1$. (Note that $J$ is equal to $1/(2\pi)$ times the area inside the “oval” in the $(q, p)$ plane.) Since $J$ is now known as a function of $Q_1$, we can also express $Q_1$ as a function of $J$, i.e. $Q_1 = Q_1(J)$.

Having defined the “action” variable $J$ as a function of $Q_1$ (and thus, through Eq. (22), as a function of $q_1$ and $p_1$), it remains to define the “angle” $\phi$. We note that we already have the function $S_1(q_1, Q_1)$, which satisfies

$$p_1 = \tilde f_1(q_1, Q_1) = \frac{\partial S_1(q_1, Q_1)}{\partial q_1}. \tag{25}$$

However, we need the generating function $F(q_1, J)$ of the canonical transformation $(q_1, p_1) \to (\phi, J)$. We shall now argue that $F$ equals $S_1$ after substituting $Q_1$ as a function of $J$, i.e. $F(q_1, J) = S_1(q_1, Q_1(J))$. To see this, let us compare Eqs. (23) and (25). We find that

$$\frac{\partial S_1(q_1, Q_1)}{\partial q_1} = p_1(q_1, Q_1) \equiv p_1(q_1, J(Q_1)) = \frac{\partial F(q_1, J(Q_1))}{\partial q_1},$$

where the middle identity reflects the fact that the canonical momentum $p_1(q_1)$ is a fixed function of $q_1$ for a given $Q_1$, and thus $p_1(q_1, J)$ must be the same function of $q_1$ at the fixed value of $J = J(Q_1)$. We can now integrate the above equation and obtain $F(q_1, J(Q_1)) = S_1(q_1, Q_1)$.

Having obtained $F(q_1, J)$, we define the variable $\phi$ through

$$\phi = \frac{\partial F(q_1, J)}{\partial J} = \frac{\partial S_1(q_1, Q_1)}{\partial Q_1}\, \frac{1}{\partial J/\partial Q_1}$$

as a function of $q_1$ and $Q_1$, and thus as a function of $q_1$ and $p_1$. This completes an explicit construction of the action-angle variables for the canonical pair $(q_1, p_1)$.

The construction can be performed for every separable canonical pair $(q_j, p_j)$ with bounded motion. For a completely integrable system, the result is a canonical transformation to $n$ “action” and $n$ “angle” variables $(J_1, ..., J_n, \phi_1, ..., \phi_n)$, with the new Hamiltonian depending only on $J_1, ..., J_n$. The equations of motion then become

$$\dot J_k = 0, \qquad \dot\phi_k = \frac{\partial H(\vec J)}{\partial J_k} \equiv \omega_k.$$

Note that the total motion of the system with several separable variables may not be periodic, even though each pair executes independent periodic motion. The only case when the total motion is periodic is when all the frequencies $\omega_k$ are commensurate (i.e. every ratio $\omega_j/\omega_k$ is equal to a ratio of integers).

1.3.1 Examples

As a first example, let us consider a simple harmonic oscillator, with the Hamiltonian

$$H = \frac{p^2}{2} + \omega^2 \frac{q^2}{2}.$$

Since the Hamiltonian is a constant of motion, $H(q(t), p(t)) = E = \text{const}$, the trajectories in the phase plane are curves determined by the equation

$$\frac{p^2}{a^2} + \frac{q^2}{b^2} \equiv \frac{p^2}{2E} + \frac{q^2}{2E\omega^{-2}} = 1,$$

that is, ellipses centered at $(0, 0)$ with semiaxes $a = \sqrt{2E}$ and $b = \omega^{-1}\sqrt{2E}$. We see that all the trajectories are simple closed curves. Let us now carry out the construction of the action-angle variables. The result will be a canonical transformation, $(q, p) \to (\phi, J)$, where the new variables will be determined as explicit functions of the old ones.

The “simplified” HJ equation (6) for the function $S(q)$ is

$$\frac{1}{2} S'^2(q) + \frac{1}{2} \omega^2 q^2 = E.$$

This is already in the form (22) with $Q_1 \equiv E$ and $f_1(q, p) \equiv H(q, p)$. The solution $S(q)$ is found as

$$S(q) = \int^{q} \sqrt{2E - \omega^2 q^2}\, dq.$$

(We shall not need to integrate this explicitly.) The next step is to determine the function $J(E)$,

$$J(E) = \frac{1}{2\pi} \oint p(q, E)\, dq = \frac{1}{2\pi} \oint \sqrt{2E - \omega^2 q^2}\, dq.$$

Instead of computing this integral directly, let us use a geometric consideration. Since $J(E)$ is equal to $1/(2\pi)$ of the area of the ellipse $f_1(q, p) = E$, whose area is equal to $\pi a b$, we have

$$J(E) = \frac{1}{2\pi} (\pi a b) = \frac{1}{2\pi}\, \pi \left( \sqrt{2E} \right) \left( \omega^{-1} \sqrt{2E} \right) = \frac{E}{\omega}.$$
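The geometric shortcut can be compared against a direct numerical evaluation of the contour integral. The sketch below (function name, step count, and parameter values are illustrative assumptions) computes $\oint p\,dq$ as twice the integral of $\sqrt{2E - \omega^2 q^2}$ between the turning points $q = \pm b$, using a midpoint rule that never samples the turning points themselves.

```python
import math


def action_variable(energy, omega, steps=100000):
    """J = (1/2pi) * closed integral of p dq for H = p^2/2 + omega^2 q^2/2."""
    b = math.sqrt(2.0 * energy) / omega      # turning points are at q = -b, +b
    dq = 2.0 * b / steps
    upper_branch = 0.0
    for k in range(steps):
        q = -b + (k + 0.5) * dq              # midpoints stay strictly inside (-b, b)
        upper_branch += math.sqrt(max(0.0, 2.0 * energy - (omega * q) ** 2)) * dq
    # The lower branch (p < 0, traversed with dq < 0) contributes the same amount.
    return 2.0 * upper_branch / (2.0 * math.pi)


E, OMEGA = 1.7, 2.3
J_numeric = action_variable(E, OMEGA)

print("numerical J :", J_numeric)
print("E / omega   :", E / OMEGA)
```

The same routine applies to any oval for which $p(q, E)$ and the turning points are known, which is useful when the enclosed area has no simple closed form.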

Therefore, $E(J) = \omega J$, which also means that the Hamiltonian in the new variables is simply $H(J) = \omega J$. The next step is to define the generating function $F(q, J)$ of the canonical transformation $(q, p) \to (\phi, J)$,

$$F(q, J) = S(q, E(J)) = S(q, \omega J) = \int^{q} \sqrt{2\omega J - \omega^2 q^2}\, dq.$$

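The variable $\phi$ is then found as

$$\phi(q, J) = \frac{\partial F(q, J)}{\partial J} = \int^{q} \frac{\omega\, dq}{\sqrt{2\omega J - \omega^2 q^2}} = \arcsin \frac{q \sqrt{\omega}}{\sqrt{2J}}.$$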

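Thus the old variables and the new ones are related by

$$\begin{aligned}
q &= \sqrt{\frac{2J}{\omega}}\, \sin\phi, \qquad p = \sqrt{2\omega J - \omega^2 q^2} = \sqrt{2\omega J}\, \cos\phi; \\
J &= \frac{H(q, p)}{\omega} = \frac{1}{2\omega} \left( p^2 + \omega^2 q^2 \right), \qquad \phi = \arcsin \frac{q \sqrt{\omega}}{\sqrt{2J}}.
\end{aligned} \tag{26}$$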
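The relations (26) can be tested numerically, including the requirement that the transformation be canonical, $\{\phi, J\} = 1$. In the sketch below the angle is computed with `atan2(ω q, p)`, which is an implementation choice made here so that the formula covers the whole oval; for $p > 0$ it agrees with the $\arcsin$ expression in Eq. (26). The partial derivatives in the Poisson bracket are approximated by central differences, and all names and numbers are illustrative.

```python
import math

OMEGA = 0.8  # illustrative frequency


def action_var(q, p):
    """J(q, p) from Eq. (26)."""
    return (p * p + OMEGA ** 2 * q * q) / (2.0 * OMEGA)


def angle_var(q, p):
    """phi(q, p); atan2 selects the branch (equals the arcsin formula for p > 0)."""
    return math.atan2(OMEGA * q, p)


def poisson_bracket(f, g, q, p, h=1e-6):
    """{f, g} = df/dq dg/dp - df/dp dg/dq, via central differences."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2.0 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2.0 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2.0 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2.0 * h)
    return df_dq * dg_dp - df_dp * dg_dq


q, p = 0.6, 1.1
J, phi = action_var(q, p), angle_var(q, p)

bracket = poisson_bracket(angle_var, action_var, q, p)
q_back = math.sqrt(2.0 * J / OMEGA) * math.sin(phi)
p_back = math.sqrt(2.0 * OMEGA * J) * math.cos(phi)
phi_arcsin = math.asin(q * math.sqrt(OMEGA) / math.sqrt(2.0 * J))

print("{phi, J} =", bracket)
print("round trip:", (q_back, p_back), "vs", (q, p))
```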

One can check that the Poisson bracket is correct, $\{\phi, J\} = 1$. Since $H(\phi, J) = H(J) = \omega J$, the new equations of motion are

$$\dot J = 0, \qquad \dot\phi = \omega,$$

describing uniform motion around a circle of fixed $J$.

Let us now consider another, somewhat unusual Hamiltonian,

$$H = \sinh\left[ \frac{1}{2} p^2 + \frac{1}{2} \omega^2 q^2 \right].$$

The “simplified” Hamilton-Jacobi equation for the function $S(q)$ is

\[ E = \sinh\left[\frac{1}{2}S'^2 + \frac{1}{2}\omega^2 q^2\right]. \]

We can rewrite this as

\[ \operatorname{arcsinh} E = \frac{1}{2}S'^2 + \frac{1}{2}\omega^2 q^2. \]

It is now clear that the entire construction of the action-angle variables for a harmonic oscillator can be repeated if we replace $E$ by $\operatorname{arcsinh} E$. In particular, Eq. (26) still holds, so it is straightforward to obtain the general solution $q(t)$, $p(t)$. The Hamiltonian in the new variables is

\[ H(J) = \sinh(\omega J), \]

and thus the equations of motion are

\[ \dot{J} = 0, \qquad \dot{\phi} = \frac{\partial H(J)}{\partial J} = \omega\cosh(\omega J). \]

Since the value of $J$ describes the amplitude of oscillations, we find that the frequency $\dot{\phi}$ of oscillations depends on the amplitude.
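The amplitude dependence of the frequency can be checked by integrating Hamilton's equations for this Hamiltonian directly and timing one cycle. The sketch below is an added illustration under stated assumptions: the function names, the fourth-order Runge-Kutta integrator, and the initial condition $(q_0, 0)$ with $q_0 > 0$ are choices made here, not part of the original text. For this initial condition $\omega J = \operatorname{arcsinh} E = \frac{1}{2}\omega^2 q_0^2$, so the predicted period is $2\pi / [\omega\cosh(\frac{1}{2}\omega^2 q_0^2)]$.

```python
import math


def measure_period(q0: float, omega: float, dt: float = 1e-3) -> float:
    """Integrate Hamilton's equations for H = sinh(p^2/2 + omega^2 q^2/2)
    with RK4 from (q0, 0), q0 > 0, and return the time of one full cycle."""

    def rhs(q, p):
        h = 0.5 * p * p + 0.5 * (omega * q) ** 2
        c = math.cosh(h)                       # derivative of sinh(h)
        return c * p, -c * omega * omega * q   # (dq/dt, dp/dt)

    def step(q, p):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
        return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6,
                p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6)

    q, p, t = q0, 0.0, 0.0
    for _ in range(10_000_000):
        q_new, p_new = step(q, p)
        # p starts at 0, becomes negative, then positive; the first
        # crossing from positive to non-positive closes the cycle.
        if p > 0.0 and p_new <= 0.0:
            return t + dt * p / (p - p_new)    # linear interpolation
        q, p, t = q_new, p_new, t + dt
    raise RuntimeError("no full cycle detected")


def predicted_period(q0: float, omega: float) -> float:
    """Period 2*pi / (omega*cosh(omega*J)) with omega*J = omega^2 q0^2 / 2."""
    omega_J = 0.5 * (omega * q0) ** 2
    return 2.0 * math.pi / (omega * math.cosh(omega_J))
```

Comparing `measure_period(q0, omega)` with `predicted_period(q0, omega)` for several amplitudes `q0` shows the period shortening as the amplitude grows, in contrast to the ordinary harmonic oscillator.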

1.4 Adiabatic invariants

We have seen how to describe periodic motion in terms of action-angle variables. The other significant use of these variables is in situations when parameters of the system change slowly with time. In that case, the motion of the system is only approximately periodic, since both the frequency and the amplitude of oscillations will slowly change with time. The action-angle variables are particularly suitable for describing such systems with good precision.

Consider for simplicity a one-dimensional system with a Hamiltonian of the form $H_0(q, p; \lambda)$, depending on a parameter $\lambda$ (such as mass or frequency). Assume that the motion is periodic, so that the trajectories in the $(q, p)$ plane are closed curves and the action-angle variables $(\phi, J)$ can be introduced. Now suppose that $\lambda$ is a slow-changing function of time. Quantitatively, $\lambda(t)$ is a slow-changing function when its change is small over a characteristic time $\Delta t$ of the order of one oscillation period,

\[ |\Delta\lambda| \approx \left|\dot{\lambda}\,\Delta t\right| = \left|\dot{\lambda}\right|\frac{2\pi}{\omega} \ll \lambda. \tag{27} \]

This is called the adiabaticity condition, and the change of $\lambda$ is called adiabatic if the condition (27) holds.

The Hamiltonian is now explicitly time-dependent, $H(q, p; t) = H_0(q, p; \lambda(t))$, and is no longer a conserved quantity (although the change of $H$ is slow). Let us now perform the canonical transformation to the action-angle variables $(\phi, J)$ using the same generating function $F(q, J)$ as in the case of constant $\lambda$. Of course, since $\lambda$ is now a function of time, the generating function $F(q, J)$ will also be a function of time through its dependence on $\lambda$. The new Hamiltonian is

\[ H'(\phi, J) = H_0(J; \lambda) + \frac{\partial F}{\partial t} = H_0(J; \lambda) + \frac{\partial F}{\partial\lambda}\dot{\lambda} \equiv H_0 + f\dot{\lambda}, \]

where we introduced an auxiliary function

\[ f(\phi, J; \lambda) \equiv \left.\frac{\partial F(q, J; \lambda)}{\partial\lambda}\right|_{q \to q(\phi, J)}, \]

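which must be expressed as a function of $\phi$ and $J$ after evaluating the derivative $\partial F/\partial\lambda$. With the Hamiltonian $H'$, the equations of motion for $\phi$, $J$ are

\[
\begin{aligned}
\dot{J} &= -\frac{\partial H'}{\partial\phi} = -\dot{\lambda}\,\frac{\partial f(\phi, J)}{\partial\phi}, \\
\dot{\phi} &= \frac{\partial H'}{\partial J} = \frac{\partial H_0(J; \lambda)}{\partial J} + \dot{\lambda}\,\frac{\partial f(\phi, J)}{\partial J}.
\end{aligned}
\]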

It is clear that $J$ is not constant any more, although its time derivative is small due to the smallness of $\dot{\lambda}$. Also, the frequency $\dot{\phi}$ is not constant but is slow-changing due to the dependence of $H_0$ on $\lambda$.

Note that the equation for $J$ contains a small factor $\dot{\lambda}(t)$ multiplied by a periodically oscillating term $\partial f/\partial\phi$ (recall that $\phi$ is a circular variable with period $2\pi$). Qualitatively, one may expect that the oscillations of $\dot{\lambda}\,\partial f/\partial\phi$ during one period approximately cancel, since $\dot{\lambda}$ changes very little during one period. For this reason, $J(t)$ is almost constant even when traced over a long period of time, during which $\lambda(t)$ changes considerably. Quantities that remain approximately constant under an adiabatic change of parameters are called adiabatic invariants.

One may ask why the Hamiltonian itself is not an adiabatic invariant; after all, $H(t)$ also changes slowly with time. The answer is that $J(t)$ changes much more slowly than $H(t)$, although this is perhaps not immediately obvious. To make this point more evident, in the next section we shall estimate the change of $J(t)$ for a harmonic oscillator with an adiabatically time-dependent frequency. The role of the parameter $\lambda(t)$ will be played by the frequency $\omega(t)$. A change of the frequency $\Delta\omega$ over a time $T$ is adiabatic if $\Delta\omega/(\omega T) \ll \omega$, which will hold for large $T$. So the frequency may change appreciably, say $\Delta\omega = 1000\omega$, provided that the change is spread over a very long time $T$. The conclusion will be that the relative change of $J$ is exponentially small, $\Delta J/J \sim \exp(-\omega T)$, even if the total change in the frequency $\Delta\omega$ is not small. Since $J = H/\omega$ for the harmonic oscillator, it follows that both $\omega$ and $H$ may change significantly during the time $T$, while the adiabatic invariant $J$ will remain essentially constant if $T$ is large.

1.4.1 Change of adiabatic invariant

Consider a harmonic oscillator with a time-dependent frequency $\omega(t) \neq 0$,

\[ H = \frac{1}{2}\left(p^2 + \omega^2(t)\, q^2\right). \]

The action-angle variables are introduced as before, according to Eq. (26). The generating function of the canonical transformation is

\[ F(q, J) = \int^q \sqrt{2\omega J - \omega^2 q^2}\, dq \]

and is now time-dependent through its dependence on $\omega(t)$. The new Hamiltonian is

\[ H'(\phi, J, t) = \omega J + \dot{\omega}\left.\frac{\partial F(q, J)}{\partial\omega}\right|_{q = q(\phi, J)}, \]

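where we need to express $q$ through $\phi$, $J$ using Eq. (26) only after evaluating the derivative $\partial F/\partial\omega$. We first compute

\[ \frac{\partial F(q, J)}{\partial\omega} = \int^q \frac{J - \omega q^2}{\sqrt{2\omega J - \omega^2 q^2}}\, dq. \]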

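Then we substitute $q = \sqrt{2J/\omega}\,\sin\phi$, change the variable as $dq = \sqrt{2J/\omega}\,\cos\phi\, d\phi$, and find

\[ \left.\frac{\partial F(q, J)}{\partial\omega}\right|_{q = q(\phi, J)} = \frac{J}{2\omega}\sin 2\phi. \]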
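This derivative can be verified numerically by differentiating the integral for $F$ with respect to $\omega$ at fixed $q$ and $J$. The sketch below is an added illustration; the lower limit $q = 0$ of the integral (left unspecified in the text), the function names, Simpson's rule, and the step `eps` are all assumptions made for this check.

```python
import math


def F(q, J, omega, n=2000):
    """Integral from 0 to q of sqrt(2*omega*J - omega^2 x^2) dx by
    composite Simpson's rule (n even).  The lower limit 0 is an assumption;
    q must lie strictly inside the turning points."""
    h = q / n

    def g(x):
        return math.sqrt(2.0 * omega * J - (omega * x) ** 2)

    s = g(0.0) + g(q)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(i * h)
    return s * h / 3.0


def dF_domega_numeric(q, J, omega, eps=1e-5):
    """Central difference in omega at fixed q and J."""
    return (F(q, J, omega + eps) - F(q, J, omega - eps)) / (2 * eps)


def dF_domega_formula(q, J, omega):
    """Closed form (J / 2 omega) sin(2 phi), with phi taken from Eq. (26)."""
    phi = math.asin(q * math.sqrt(omega / (2.0 * J)))
    return J / (2.0 * omega) * math.sin(2.0 * phi)
```

The order of operations matters here: the derivative is taken at fixed $q$, and only afterwards is $q$ expressed through $\phi$, which is exactly what the two functions compare.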

Therefore, the new Hamiltonian is

\[ H'(\phi, J, t) = \omega(t) J + \frac{\dot{\omega} J}{2\omega}\sin 2\phi. \]

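The equations of motion in the action-angle variables are

\[
\begin{aligned}
\dot{J} &= -\frac{\partial H'}{\partial\phi} = -\frac{\dot{\omega}}{\omega} J\cos 2\phi, \\
\dot{\phi} &= \frac{\partial H'}{\partial J} = \omega + \frac{1}{2}\frac{\dot{\omega}}{\omega}\sin 2\phi.
\end{aligned}
\]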
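Since the canonical transformation is exact, these equations must reproduce the same $J(t)$ as the original equations $\dot{q} = p$, $\dot{p} = -\omega^2(t) q$. The following sketch, added here as an illustration, integrates both systems and compares the results. The smooth frequency profile, its parameter values, the initial data, and all function names are assumptions chosen only for this test.

```python
import math

# Illustrative (assumed) frequency profile: a smooth step from 1 to 2.
OMEGA_1, OMEGA_2, T = 1.0, 2.0, 2.0


def omega(t):
    return 0.5 * (OMEGA_1 + OMEGA_2) + 0.5 * (OMEGA_2 - OMEGA_1) * math.tanh(t / T)


def omega_dot(t):
    return 0.5 * (OMEGA_2 - OMEGA_1) / (T * math.cosh(t / T) ** 2)


def rhs_qp(t, y):
    """Original Hamilton equations for H = (p^2 + omega(t)^2 q^2) / 2."""
    q, p = y
    return (p, -omega(t) ** 2 * q)


def rhs_action_angle(t, y):
    """Equations of motion for (phi, J) derived from H'."""
    phi, J = y
    r = omega_dot(t) / omega(t)
    return (omega(t) + 0.5 * r * math.sin(2 * phi), -r * J * math.cos(2 * phi))


def rk4(rhs, y, t0, t1, n):
    """Classical fourth-order Runge-Kutta with n equal steps."""
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(a + h / 2 * b for a, b in zip(y, k2)))
        k4 = rhs(t + h, tuple(a + h * b for a, b in zip(y, k3)))
        y = tuple(a + h / 6 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(y, k1, k2, k3, k4))
        t += h
    return y


def compare(phi0=0.3, J0=1.0, t0=-10.0, t1=10.0, n=20000):
    """Return J at t1 computed from (q, p) and from the (phi, J) equations."""
    w0 = omega(t0)
    q0 = math.sqrt(2 * J0 / w0) * math.sin(phi0)   # Eq. (26) at t0
    p0 = math.sqrt(2 * w0 * J0) * math.cos(phi0)
    q, p = rk4(rhs_qp, (q0, p0), t0, t1, n)
    _, J = rk4(rhs_action_angle, (phi0, J0), t0, t1, n)
    w1 = omega(t1)
    return (p * p + (w1 * q) ** 2) / (2 * w1), J
```

The two values returned by `compare()` should coincide up to the integration error, which confirms that no approximation has been made up to this point; the adiabatic approximation enters only in the analysis that follows.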

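Let us now analyze these equations. The adiabaticity condition (27) with $\lambda \equiv \omega$ becomes $|\dot{\omega}/\omega| \ll \omega$, therefore the equation for $\phi$ can be approximately replaced by $\dot{\phi} \approx \omega(t)$ and integrated as

\[ \phi(t) \approx \int_{t_0}^{t} \omega(t)\, dt, \tag{28} \]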

where $t_0$ is an arbitrary reference point. We now turn to the equation for $J(t)$. Since $\phi$ grows monotonically with time, it is possible to use $\phi$ instead of $t$ as the time variable ($d\phi = \omega\, dt$) and to consider $J$ as a function of $\phi$. We can express $\omega(t)$ as a function of $\phi$ as

\[ \omega(t) \equiv \Omega(\phi(t)), \]

where $\Omega(x)$ is an auxiliary function. The equation for $J(\phi)$ is

\[ \frac{d}{d\phi}\ln J(\phi) = -\frac{\Omega'(\phi)}{\Omega(\phi)}\cos 2\phi, \]

which can be immediately integrated, yielding

\[ \ln\frac{J(\phi)}{J(\phi_0)} = -\int_{\phi_0}^{\phi} \cos 2\phi\,\frac{\Omega'}{\Omega}\, d\phi. \tag{29} \]

At this point, we cannot simplify the integral in Eq. (29) any more. To proceed further, we consider the case when the frequency $\omega(t)$ smoothly changes from a constant value $\omega_1$ at $t \to -\infty$ to another constant value $\omega_2$ at $t \to +\infty$, while the characteristic timescale of change is $T$. One such function $\omega(t)$ is

\[ \omega(t) = \frac{\omega_1 + \omega_2}{2} + \frac{\omega_2 - \omega_1}{2}\tanh\frac{t}{T}. \tag{30} \]

This function $\omega(t)$ will be adiabatic if $|\omega_2 - \omega_1| \ll \omega_1^2 T$. So we do not need to assume that $\omega_1 \approx \omega_2$; we might even have $\omega_2 = 1000\omega_1$ if $T$ is sufficiently large. Note that $\omega(t)$ has the form $f(t/T)$, where $f(x)$ is a function that changes between $\omega_1$ and $\omega_2$ on scales $\Delta x \sim 1$. When we choose $\omega(t)$ in this way, we can adjust the slowness of change of $\omega(t)$ by adjusting the value of $T$ while keeping the same overall shape of the function $\omega(t)$.
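How the adiabaticity of the profile (30) depends on $T$ can be examined by sampling the dimensionless ratio $|\dot{\omega}|/\omega^2$ along the transition. The sketch below is an added illustration; the function name, the sampling window, and the omission of the factor $2\pi$ from Eq. (27) are assumptions of this example.

```python
import math


def adiabaticity_ratio(omega_1, omega_2, T, n=4001, span=10.0):
    """Largest sampled value of |d(omega)/dt| / omega^2 for the profile (30),
    using n points with -span*T <= t <= span*T.  The factor 2*pi of
    Eq. (27) is omitted, so "adiabatic" means a result much less than 1."""
    worst = 0.0
    for i in range(n):
        t = (-span + 2.0 * span * i / (n - 1)) * T
        w = (0.5 * (omega_1 + omega_2)
             + 0.5 * (omega_2 - omega_1) * math.tanh(t / T))
        w_dot = 0.5 * (omega_2 - omega_1) / (T * math.cosh(t / T) ** 2)
        worst = max(worst, abs(w_dot) / (w * w))
    return worst
```

Because $\omega(t)$ depends on time only through $t/T$, the returned ratio scales as $1/T$: increasing $T$ tenfold makes the change ten times more adiabatic, whatever the values of $\omega_1$ and $\omega_2$.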

Suppose that the system starts at $t = -\infty$ with a value $J = J_1$ of the adiabatic invariant. We can now use Eq. (29) to estimate the value $J_2$ at time $t \to +\infty$ when the frequency has changed from $\omega_1$ to $\omega_2$. We rewrite Eq. (29) as

\[ \ln\frac{J_2}{J_1} = -\operatorname{Re}\int_{-\infty}^{+\infty} e^{2i\phi}\,\frac{\Omega'}{\Omega}\, d\phi, \]

so that we can use the methods of complex variable theory to evaluate the integral. The contour of integration can be closed in the upper half-plane of complex $\phi$. Then the integral will be equal to $2\pi i$ times the sum of residues of the function $e^{2i\phi}\Omega'(\phi)/\Omega(\phi)$ in the upper half-plane. Note that each root or pole $\phi_*$ of $\Omega(\phi)$ leads to a simple pole of $e^{2i\phi}\Omega'/\Omega$ with residue of order $e^{2i\phi_*}$. A pole at $\phi_* = \phi_1 + i\phi_2$ (where we always have $\phi_2 > 0$) will yield a factor $e^{-2\phi_2}$, thus only the poles with the smallest value of $\phi_2$ will contribute significantly. Suppose that $\phi_*$ is the pole of $\Omega'/\Omega$ with the smallest value of $\phi_2 \equiv \operatorname{Im}\phi_*$, then we can estimate the integral as

\[ \ln\frac{J_2}{J_1} \sim -\operatorname{Re}\left(2\pi i\, e^{-2\phi_2} e^{2i\phi_1}\right) \equiv C e^{-2\phi_2}, \]

where $C$ is a constant typically of order 1 (in any case, not large).

Finally, we need to estimate the value of $\phi_2 \equiv \operatorname{Im}\phi_*$, where $\phi_*$ is the complex pole or root of $\Omega(\phi)$ that is closest to the real axis. Since $\Omega(\phi)$ is simply $\omega(t)$ expressed through $\phi$, while $\phi$ is defined by Eq. (28), the complex roots of $\omega(t)$ are also roots of $\Omega(\phi)$, while the complex poles of $\omega(t)$ are poles of $\Omega(\phi)$ at $\phi = \infty$ and are irrelevant. Therefore,

\[ \phi_* = \int_{t_0}^{z_*} \omega(z)\, dz, \]

where $z_*$ is the complex root of $\omega(z) = 0$ such that the imaginary part of $\phi_*$ is the smallest among all such complex roots. If $\omega(t)$ is of the form $f(t/T)$, then we have

\[ \phi_* = T\int_{t_0/T}^{z'_*} f(z')\, dz', \qquad z' \equiv z/T, \]

where now $z'_*$ is the corresponding complex root of $f(z) = 0$. Since $f(z)$ is a function that changes on scales $\Delta z \sim 1$, we expect that $f(z)$ has complex roots of order 1 but not significantly smaller. (In a moment, we shall compute the complex roots of the function in Eq. (30) to see an explicit illustration of this statement.) Therefore, the change of the adiabatic invariant is estimated as

\[ \ln\frac{J_2}{J_1} = C e^{-2bT}, \qquad b \equiv \operatorname{Im}\int_{t_0/T}^{z'_*} f(z')\, dz'. \]

This is the main result of this section. It shows that the change in the adiabatic invariant is exponentially small in the timescale $T$.

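Let us now find the specific value of the constant $b$ for the example (30). Setting $\omega(t) = 0$ in Eq. (30) gives $e^{2t/T} = -\omega_1/\omega_2$, so the function $\omega(t)$ has complex roots at

\[ \frac{t}{T} = \frac{i\pi}{2} - \frac{1}{2}\ln\frac{\omega_2}{\omega_1} + i\pi n, \qquad n = 0, \pm 1, \pm 2, \ldots \]

The root in the upper half-plane that is closest to the real axis is

\[ \frac{t_*}{T} = \frac{i\pi}{2} - \frac{1}{2}\ln\frac{\omega_2}{\omega_1}. \]

If $\omega_2 > \omega_1$, integrating Eq. (30) along a path that runs from the real axis straight up to $t_*$ gives

\[ b = \operatorname{Im}\int_{t_0}^{t_*} \omega(t)\,\frac{dt}{T} = \frac{\pi}{2}\cdot\frac{\omega_1 + \omega_2}{2} - \frac{\pi}{2}\cdot\frac{\omega_2 - \omega_1}{2} = \frac{\pi\omega_1}{2}, \]

\[ \ln\frac{J_2}{J_1} = C\exp\left(-\pi\omega_1 T\right). \]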

(The quantity $\ln(J_2/J_1)$ can be estimated more precisely, but we only performed an order-of-magnitude estimate.)

The final conclusion is that the change of the adiabatic invariant is of order $\exp(-\omega T)$, where $\omega$ is a typical frequency and $T$ is the typical timescale of change of frequency.
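The rapid decrease of the change in $J$ with growing $T$ can be observed in a direct numerical experiment. The sketch below is an added illustration under stated assumptions: it integrates $\ddot{q} = -\omega^2(t) q$ through the transition (30) with a fourth-order Runge-Kutta scheme and evaluates $J = (p^2 + \omega^2 q^2)/(2\omega)$ before and after. The function names, the step size, the finite integration window of $\pm 12T$ standing in for $t \to \pm\infty$, and the use of two initial phases are choices made here. Two phases are sampled because, by Eq. (29), the sign and size of the change depend on the phase through $\cos 2\phi$.

```python
import math


def log_action_change(omega_1, omega_2, T, phi0, dt=0.005, span=12.0):
    """Integrate q'' = -omega(t)^2 q through the profile (30) with RK4 and
    return ln(J2/J1), where J = (p^2 + omega^2 q^2) / (2 omega)."""

    def omega(t):
        return (0.5 * (omega_1 + omega_2)
                + 0.5 * (omega_2 - omega_1) * math.tanh(t / T))

    def rhs(t, q, p):
        return p, -omega(t) ** 2 * q

    t = -span * T
    n = int(round(2.0 * span * T / dt))
    w = omega(t)
    # Initial data from Eq. (26) with J1 = 1 and phase phi0.
    q = math.sqrt(2.0 / w) * math.sin(phi0)
    p = math.sqrt(2.0 * w) * math.cos(phi0)
    for _ in range(n):
        k1q, k1p = rhs(t, q, p)
        k2q, k2p = rhs(t + dt / 2, q + dt / 2 * k1q, p + dt / 2 * k1p)
        k3q, k3p = rhs(t + dt / 2, q + dt / 2 * k2q, p + dt / 2 * k2p)
        k4q, k4p = rhs(t + dt, q + dt * k3q, p + dt * k3p)
        q += dt / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
        p += dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        t += dt
    w = omega(t)
    J2 = (p * p + (w * q) ** 2) / (2.0 * w)
    return math.log(J2)  # J1 = 1 by construction


def max_change(omega_1, omega_2, T):
    """Largest |ln(J2/J1)| over two initial phases differing by pi/4."""
    return max(abs(log_action_change(omega_1, omega_2, T, phi0))
               for phi0 in (0.0, math.pi / 4))
```

Evaluating `max_change(1.0, 2.0, T)` for $T = 1, 2, 3$ should show the change in $J$ falling much faster than any power of $1/T$, even though the frequency doubles in every case.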
