
Chapter 1

Linear differential equations

We are going to study some interesting systems of nonlinear differential equations. Before doing so, however, it is necessary to make a few remarks about linear differential equations.

Any first order linear equation can be integrated directly. That is, the general solution of the equation

y′ + py = 0

can be written explicitly as

\[ y = C e^{-\int p(x)\,dx} \]

where C is an (arbitrary) constant. Here, p is a given function of x, y is the function of x to be found, and y′ means dy/dx.

We regard this as “the answer” even though the integral of the function p may not be computable in terms of elementary functions, that is, functions which are combinations of familiar functions such as rational functions, trigonometric functions, and logarithms.
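As a quick sanity check of the formula above, the following minimal sketch compares the explicit solution with a numerical integration. The choice p(x) = 2x (so that the integral of p is x²), the helper names and the step count are all assumptions made for illustration; they do not come from the text.

```python
import math

def p(x):
    # Hypothetical coefficient chosen for illustration: p(x) = 2x,
    # whose antiderivative is x**2.
    return 2.0 * x

def closed_form(x, C=1.0):
    # y = C * exp(-integral of p), with integral of p equal to x**2 here.
    return C * math.exp(-x * x)

def rk4_first_order(x_end, C=1.0, steps=1000):
    # Integrate y' = -p(x) y from x = 0, y(0) = C, by the classical
    # fourth order Runge-Kutta method.
    f = lambda x, y: -p(x) * y
    h = x_end / steps
    x, y = 0.0, C
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

numeric = rk4_first_order(1.5)
exact = closed_form(1.5)
print("difference between numerical and explicit solution:", abs(numeric - exact))
```

When the integral of p has no elementary expression, the same numerical routine still applies; only `closed_form` would have to be replaced by a numerical quadrature.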

The next case is the second order equation

y′′ + py′ + qy = 0.

As in the first order case, a formula for y which involves integrals (of expressions containing p and q) is regarded as acceptable, but unfortunately such a formula cannot be found, in general.


In the special case where p and q are constant functions, there is a well known method of finding the general solution (in terms of exponential functions). Textbooks on differential equations for beginners discuss various other special cases. But, in general, there is no systematic method to “integrate” a second order equation. Evidently, the situation for higher order equations is going to be even more complicated.

On the other hand, indirect methods are available. First, if $y_1$ and $y_2$ are any solutions, then $A_1 y_1 + A_2 y_2$ is also a solution, for any constants $A_1, A_2$. This holds because the equation is linear.

Next, the general theory of differential equations tells us, for any point $x_0$ where p and q are sufficiently differentiable, that there exists a unique solution in some open neighbourhood of $x_0$ to the second order equation y′′ + py′ + qy = 0 which satisfies the “initial condition”

\[ y(x_0) = a, \quad y'(x_0) = b. \]

Moreover, if p and q are sufficiently differentiable on some closed interval $[x_0 - c, x_0 + d]$, then the solution is also defined on the same interval $[x_0 - c, x_0 + d]$. This holds for any choice of (a, b). It follows that the set of all solutions defined on this interval is a two-dimensional vector space.

Therefore, to find the general solution, it suffices to find two linearly independent solutions. One can hope to find solutions with Taylor expansions by substituting

\[ y = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + \cdots \]

into y′′ + py′ + qy = 0 (although even if we find all the coefficients, we still have to show that the series converges). These observations are the starting point of the classical (19th century) theory of ordinary differential equations. In principle, it applies to nonlinear equations as well as linear equations, but the theory has been developed much further in the linear case. For example, in the nonlinear case, the solution space may not be a vector space; it may not be two-dimensional in any reasonable sense.
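To illustrate the substitution of a Taylor series, here is a short worked example for the equation y′′ + xy = 0 with $x_0 = 0$. The choice of this particular equation is ours, made only for illustration. Comparing coefficients of $x^n$ gives a recurrence relation for the $a_n$, and the two free constants $a_0, a_1$ reflect the two-dimensional solution space.

```latex
\[
y = \sum_{n \ge 0} a_n x^n
\quad\Longrightarrow\quad
y'' + xy = \sum_{n \ge 0} (n+2)(n+1)\,a_{n+2}\,x^n + \sum_{n \ge 1} a_{n-1}\,x^n = 0,
\]
\[
a_2 = 0, \qquad a_{n+2} = -\frac{a_{n-1}}{(n+2)(n+1)} \quad (n \ge 1),
\]
\[
y = a_0\Bigl(1 - \frac{x^3}{6} + \frac{x^6}{180} - \cdots\Bigr)
  + a_1\Bigl(x - \frac{x^4}{12} + \frac{x^7}{504} - \cdots\Bigr).
\]
```

In this example the ratio of successive nonzero coefficients tends to zero, so both series converge for all x.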

Alternatively, thanks to 20th century technology, using numerical methods and a computer [1] we can draw the graph of a solution y. By choosing various initial conditions, we can study various solutions (in the two-dimensional vector space of all solutions). This experimental approach is a very helpful companion to the classical theory. It applies equally well to the nonlinear case, which is a great advantage. However, we only see part of the solution on a computer screen, and it is difficult to tell whether numerical errors [2] are significant. Both of these difficulties are greater in the nonlinear case.

All this information is certainly helpful, but unfortunately it does not help to answer the simple question: “When is y′′ + py′ + qy = 0 integrable in the same sense as y′ + py = 0?” The Galois theory of linear differential equations addresses this question, and it leads to deep and interesting mathematics, related to the algebraic properties of the differential equation. However, it turns out that this is not the only kind of integrability. Another direction, based on the idea of “conserved quantities”, lies perhaps even deeper, and has even wider ramifications, not just in algebra but also in geometry.

In these lectures we are going to approach the problem of “recognizing integrability” by combining theory and computer experiment.

Problem 1.1. Consider the solutions y of the following differential equations with y(0) = 1, y′(0) = 0.

(a) In which cases can you find a formula for y?

(b) Using a computer, sketch [3] the graph of the solution for −1 ≤ x ≤ 5. (Warning: such solutions are not guaranteed to exist.)

(1) y′′ = y

(2) y′′ = −y

(3) y′′ + xy = 0

(4) y′′ + (sin x)y = 0

(5) y′′ + xy′ + y = 0

(6) x²y′′ + 2xy′ + y = 0

[1] 3D-XplorMath: In footnotes like this we shall give hints on using the software 3D-XplorMath, which can be downloaded from http://3d-xplormath.org/. Instructions of the form AAA/BBB/CCC... refer to successive menus and choices, e.g. from menu AAA, select item BBB, then select item CCC etc.

[2] 3D-XplorMath: For example, the difference between the Euler method and the Runge-Kutta method is indicated in 3D-XplorMath:

[3] 3D-XplorMath: In ODE(1D)2nd order the graph of (y, y′) is drawn first; the graphs of (x, y) and (x, y′) can be obtained by choosing Show Projected Orbits from the Action menu. The graph of (y, y′) is a useful graphical visualization. For example, for y′′ = y and the real solution y = cosh x, the graph of (x, y) is a catenary, and the graph of (y, y′) is a hyperbola.
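For readers who prefer to experiment without 3D-XplorMath, the following is a minimal sketch of a numerical solver for equations of the form y′′ = f(x, y, y′), written in plain Python. The function name `solve_second_order`, the step counts and the choice of test equation are assumptions made for illustration. The sketch compares the Euler method with the classical Runge-Kutta method on equation (2), where the exact solution cos x is available, and then integrates equation (3).

```python
import math

def solve_second_order(f, x0, y0, v0, x_end, steps, method="rk4"):
    """Integrate y'' = f(x, y, y') as the first order system y' = v, v' = f."""
    h = (x_end - x0) / steps          # h is negative when x_end < x0
    x, y, v = x0, y0, v0
    points = [(x, y, v)]
    for _ in range(steps):
        if method == "euler":
            # Both right-hand sides use the old values of y and v.
            y, v = y + h * v, v + h * f(x, y, v)
        else:
            k1y, k1v = v, f(x, y, v)
            k2y, k2v = v + h * k1v / 2, f(x + h / 2, y + h * k1y / 2, v + h * k1v / 2)
            k3y, k3v = v + h * k2v / 2, f(x + h / 2, y + h * k2y / 2, v + h * k2v / 2)
            k4y, k4v = v + h * k3v, f(x + h, y + h * k3y, v + h * k3v)
            y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
            v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        x += h
        points.append((x, y, v))
    return points

# Equation (2): y'' = -y with y(0) = 1, y'(0) = 0, exact solution cos x.
eq2 = lambda x, y, v: -y
euler_err = abs(solve_second_order(eq2, 0.0, 1.0, 0.0, 5.0, 500, "euler")[-1][1] - math.cos(5.0))
rk4_err = abs(solve_second_order(eq2, 0.0, 1.0, 0.0, 5.0, 500)[-1][1] - math.cos(5.0))
print("Euler error at x = 5:", euler_err)
print("RK4 error at x = 5:  ", rk4_err)

# Equation (3): y'' + x y = 0, integrated forwards to 5 and backwards to -1.
eq3 = lambda x, y, v: -x * y
forward = solve_second_order(eq3, 0.0, 1.0, 0.0, 5.0, 500)
backward = solve_second_order(eq3, 0.0, 1.0, 0.0, -1.0, 100)
```

The lists `forward` and `backward` contain triples (x, y, y′), so either the graph of (x, y) or the graph of (y, y′) can be plotted from them with any plotting tool.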

Project 1.2. Investigate the special case

\[ y'' = \frac{1}{y}(y')^2 - \frac{1}{x}y' + y^3 - \frac{1}{y} \]

of the Painlevé III equation

\[ y'' = \frac{1}{y}(y')^2 - \frac{1}{x}y' + \frac{1}{x}(\alpha y^2 + \beta) + \gamma y^3 + \frac{\delta}{y}. \]

(There are six types of Painlevé equation, and there is a substantial theory which explains their “integrability”. They may be regarded as nonlinear generalizations of the linear equations which define “special functions” such as Bessel functions.)

Chapter 2

Systems of linear differential equations

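Consider the system

\[ \begin{aligned} y_1' &= a y_1 + b y_2 \\ y_2' &= c y_1 + d y_2 \end{aligned} \]

of two linear first order differential equations for the (real) functions $y_1(x), y_2(x)$, where a, b, c, d are (real) constants. In matrix form we can write this system as

\[ Y' = AY, \]

where

\[ Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

The unique solution with $y_1(0) = \alpha$, $y_2(0) = \beta$ can be written down very tidily as

\[ Y = e^{xA} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \]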

A more computational (and perhaps more informative) method is to make a linear change of variables:

Z = PY

where P is an invertible 2 × 2 matrix. We obtain

\[ Z' = PY' = PAY = PAP^{-1}Z, \]

and we can choose the matrix P so that $PAP^{-1}$ has the simplest possible form.

For example, if A is symmetric ($A^t = A$), then, by linear algebra, we may choose P such that

\[ PAP^{-1} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]

The eigenvalues $\lambda_1, \lambda_2$ of A are real. The columns $v_1, v_2$ of the matrix $P^{-1}$ are eigenvectors of A:

\[ A \begin{pmatrix} | & | \\ v_1 & v_2 \\ | & | \end{pmatrix} = AP^{-1} = P^{-1} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} | & | \\ \lambda_1 v_1 & \lambda_2 v_2 \\ | & | \end{pmatrix}. \]

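The differential equation becomes

\[ \begin{aligned} z_1' &= \lambda_1 z_1 \\ z_2' &= \lambda_2 z_2 \end{aligned} \]

and the solution of this, with $z_1(0) = \gamma$, $z_2(0) = \delta$, is

\[ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} e^{\lambda_1 x} & 0 \\ 0 & e^{\lambda_2 x} \end{pmatrix} \begin{pmatrix} \gamma \\ \delta \end{pmatrix}. \]

The corresponding solution of Y′ = AY is

\[ Y = P^{-1}Z = \begin{pmatrix} | & | \\ v_1 & v_2 \\ | & | \end{pmatrix} \begin{pmatrix} e^{\lambda_1 x} & 0 \\ 0 & e^{\lambda_2 x} \end{pmatrix} \begin{pmatrix} \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} | & | \\ e^{\lambda_1 x}v_1 & e^{\lambda_2 x}v_2 \\ | & | \end{pmatrix} \begin{pmatrix} \gamma \\ \delta \end{pmatrix}, \]

hence

\[ Y = \gamma e^{\lambda_1 x} v_1 + \delta e^{\lambda_2 x} v_2. \]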

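The directions of these eigenvectors are visible if one looks at the graphs of the curves $(y_1(x), y_2(x))$ in $(y_1, y_2)$-space. For example, if $\lambda_2 > \lambda_1$ and $\delta \neq 0$, then $Y \sim \delta e^{\lambda_2 x} v_2$ as $x \to \infty$ (as the term $\gamma e^{\lambda_1 x} v_1$ will be relatively insignificant).

In terms of α, β we have $\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = P^{-1} \begin{pmatrix} \gamma \\ \delta \end{pmatrix}$, hence

\[ Y = P^{-1}Z = P^{-1} e^{x \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}} P \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = e^{x P^{-1} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} P} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = e^{xA} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \]

which recovers the version of the solution that we have already seen. But this version obscures the role of the eigenvectors $v_1, v_2$.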


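Example 2.1. Let $A = \begin{pmatrix} -2 & 2 \\ -2 & 3 \end{pmatrix}$. The eigenvalues are −1, 2 and corresponding eigenvectors are (any nonzero multiples of) $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$. We can take $P^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$.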
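The claims of Example 2.1 are easy to check by machine. The following sketch, in plain Python, verifies that the stated vectors are eigenvectors and that $Y = \gamma e^{\lambda_1 x} v_1 + \delta e^{\lambda_2 x} v_2$ satisfies Y′ = AY. The helper names and the sample values of γ, δ and x are assumptions chosen for illustration.

```python
import math

A = [[-2.0, 2.0], [-2.0, 3.0]]
# (eigenvalue, eigenvector) pairs as stated in Example 2.1.
eigenpairs = [(-1.0, [2.0, 1.0]), (2.0, [1.0, 2.0])]

def matvec(M, v):
    # Product of a 2x2 matrix with a vector.
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

# Residuals of A v = lambda v for each pair.
residuals = [max(abs(matvec(A, v)[i] - lam * v[i]) for i in range(2))
             for lam, v in eigenpairs]

def Y(x, gamma, delta):
    # Y = gamma * exp(l1 x) v1 + delta * exp(l2 x) v2
    (l1, v1), (l2, v2) = eigenpairs
    return [gamma * math.exp(l1 * x) * v1[i] + delta * math.exp(l2 * x) * v2[i]
            for i in range(2)]

def Yprime(x, gamma, delta):
    # Termwise derivative of Y with respect to x.
    (l1, v1), (l2, v2) = eigenpairs
    return [gamma * l1 * math.exp(l1 * x) * v1[i] + delta * l2 * math.exp(l2 * x) * v2[i]
            for i in range(2)]

# Hypothetical sample point and constants.
x, gamma, delta = 0.7, 1.5, -0.5
AY = matvec(A, Y(x, gamma, delta))
ode_residual = max(abs(Yprime(x, gamma, delta)[i] - AY[i]) for i in range(2))
print("eigenvector residuals:", residuals)
print("residual of Y' = AY:  ", ode_residual)
```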

Example 2.2. The second order equation

y′′ + py′ + qy = 0

gives rise to a system of the above type, by the standard device of introducing $y_1 = y$, $y_2 = y'$, so that

\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}' = \begin{pmatrix} 0 & 1 \\ -q & -p \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}. \]

We are assuming here that p and q are constant. The eigenvalues are the roots of $\lambda^2 + p\lambda + q = 0$.

In particular, the famous equation

y′′ = −ky

(with k > 0) can be solved this way. The functions $y = \sin\sqrt{k}\,x$ and $y = \cos\sqrt{k}\,x$ are two linearly independent solutions. In both cases, however, the graph of the curve (y(x), y′(x)) is an ellipse (a circle when k = 1), and the eigenvectors cannot be “seen”. This is because the eigenvectors are not real.

Project 2.3. In general, the different possibilities are determined by the Jordan normal form of the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$: for any A, we can find P with

\[ PAP^{-1} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}. \]

In the first case, since we are assuming that a, b, c, d are real, the eigenvalues are both real or a complex conjugate pair. In the second case, the repeated eigenvalue must be real.

Try to classify the different kinds of pictures [1] by solving explicitly the systems corresponding to the Jordan normal forms of A. It may be necessary to consider various sub-cases.

[1] 3D-XplorMath: Use ODE(2D)1st order.

If the matrix A is not constant, we can still try to simplify the differential equation by introducing Z = PY where P depends on x. We obtain

\[ Z' = PY' + P'Y = (PAP^{-1} + P'P^{-1})Z, \]

but now we cannot use linear algebra so easily.

The graphs of the solution “curves” $x \mapsto (y_1(x), y_2(x))$ have an interesting geometrical interpretation: they are the “integral curves of a vector field”. A vector field (in the plane) means an assignment of a vector to each point (of the plane), which is mathematically the same as a function F from the plane to the plane. The usual way to visualize a vector field is to draw a few representative vectors. An integral curve of such a vector field means a curve $(y_1, y_2)$ with the property that, for each x, the velocity vector $(y_1'(x), y_2'(x))$ at the point $(y_1(x), y_2(x))$ is equal to the value $F(y_1(x), y_2(x))$ of the vector field. In our situation the vector field is given by F(Y) = AY.

Example 2.4 (show vector field).

If A is not constant, we have a “time dependent vector field”! It is convenient in many applications to think of x as time, and in fact we shall rename x as t from Chapter 4.

Of course, everything that we have said in this chapter extends to the case of n × n matrices.

Chapter 3

Example: the system Y ′′ = AY

As a generalization of the 2 × 2 first order system Y′ = AY, let us consider the 2 × 2 second order system

Y′′ = AY.

This can be written as a 4 × 4 first order system for $y_1, y_2, y_1', y_2'$, so there is a unique solution when the initial values of Y and Y′ are specified. However, we choose to keep it as a 2 × 2 second order system for $y_1, y_2$ in order to use physical intuition (Newton’s equations, energy) in the next chapter.

The graphs of solutions Y(x) in the $(y_1, y_2)$-plane are slightly more interesting (than for the case Y′ = AY). Again, they can be classified according to the Jordan normal form of the matrix A.

Problem 3.1. In the case where both eigenvalues of A are real and negative, we have solutions for which both $y_1$ and $y_2$ remain bounded (confined to a finite area of the computer screen). Explain the shapes of these bounded regions, and how they are affected by the Jordan normal form of A.

We are going to study some n × n systems of the form Y′′ = AY, and modifications of the form Y′′ = AY + nonlinear terms. In the linear case, there is little difference between the case n = 2 and general n, but in the modified case we shall see that interesting new features arise for “large n”.


Chapter 4

Conserved quantities for linear systems

In this chapter we introduce an important idea. It is a mathematical generalization of the concept of “conservation of energy” in physics. For this reason we shall call the independent variable t from now on, instead of x, and think of it as “time”. We shall focus on second-order equations, as Newton’s equations (in particular the equations of lattice motion) are of this type.

Let us begin with the simplest possible case, namely the scalar equation

y′′ = −ky

where k is a positive constant. This equation represents simple harmonic motion in physics, and the concept of “conserved quantity” from physics leads to another method of solving the equation. Namely, if we define

\[ K = \tfrac{1}{2} y'^2 \quad \text{(kinetic energy)} \]

\[ U = \tfrac{k}{2} y^2 \quad \text{(potential energy)} \]

then we see that the total energy (or Hamiltonian) H = K + U is “conserved”:

\[ \frac{d}{dt}H = y'y'' + kyy' = y'(-ky) + kyy' = 0. \]

That is, if y is a solution of the differential equation, then H is constant, i.e. $\tfrac{1}{2}y'^2 + \tfrac{k}{2}y^2 = C$ for some C.

This explains why the (y, y′)-graph is a bounded curve. More significantly, we can regard this property as a first order equation which can be solved directly by integration (by “quadrature”):

\[ \begin{aligned} y'^2 = 2C - ky^2 &\implies y' = \pm\sqrt{2C - ky^2} \\ &\implies \pm\int \frac{dy}{\sqrt{2C - ky^2}} = \int dt \\ &\implies \pm\frac{1}{\sqrt{k}} \sin^{-1}\Bigl(\sqrt{\tfrac{k}{2C}}\, y\Bigr) = t + D \\ &\implies y = \pm\sqrt{\tfrac{2C}{k}}\, \sin\sqrt{k}(t + D). \end{aligned} \]

Thus, this method does give the expected form $y = E\sin(\sqrt{k}\,t + F)$ of the general solution.

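Now let us try to apply the same idea to the 2 × 2 system Y′′ = AY.

First we shall just try to guess a suitable “energy function”. Let

\[ H = \tfrac{1}{2} y_1'^2 + \tfrac{1}{2} y_2'^2 + U \]

for some (yet to be determined) function $U(y_1, y_2)$. We may rewrite this as

\[ H = \tfrac{1}{2}\langle Y', Y'\rangle + U \]

where the inner product of two (column) vectors is defined by $\langle X, Y\rangle = X^t Y$. Then

\[ \begin{aligned} \frac{d}{dt}H &= \langle Y'', Y'\rangle + \frac{\partial U}{\partial y_1} y_1' + \frac{\partial U}{\partial y_2} y_2' = \langle Y'', Y'\rangle + \langle \nabla U, Y'\rangle \\ &= \langle Y'' + \nabla U, Y'\rangle = \langle AY + \nabla U, Y'\rangle. \end{aligned} \]

If, for example, the function U satisfies AY + ∇U = 0, namely

\[ ay_1 + by_2 = -\frac{\partial U}{\partial y_1} \quad\text{and}\quad cy_1 + dy_2 = -\frac{\partial U}{\partial y_2}, \]

then the function H will be a conserved quantity. This condition is satisfied if and only if

\[ \begin{aligned} -U &= \tfrac{1}{2} a y_1^2 + b y_1 y_2 + f(y_2) \\ -U &= c y_1 y_2 + \tfrac{1}{2} d y_2^2 + g(y_1), \end{aligned} \]

in other words b = c, i.e. A is a symmetric matrix, and we must have:

\[ U = -\tfrac{1}{2}(a y_1^2 + 2b y_1 y_2 + d y_2^2) + \text{constant} = -\tfrac{1}{2}\langle Y, AY\rangle + \text{constant}. \]

In conclusion, we have discovered that

\[ H = \tfrac{1}{2}\langle Y', Y'\rangle - \tfrac{1}{2}\langle Y, AY\rangle \]

is a conserved quantity of the system Y′′ = AY, if the matrix A is symmetric. Having discovered the formula, it is easy to verify directly that H is constant:

\[ \begin{aligned} H' &= \langle Y'', Y'\rangle - \tfrac{1}{2}\langle Y', AY\rangle - \tfrac{1}{2}\langle Y, AY'\rangle \\ &= \langle AY, Y'\rangle - \tfrac{1}{2}\langle Y', AY\rangle - \tfrac{1}{2}\langle AY, Y'\rangle = 0 \end{aligned} \]

(we use $A = A^t$ to obtain $\langle Y, AY'\rangle = \langle AY, Y'\rangle$).

Unfortunately one conserved quantity does not give enough information to solve a 2 × 2 system.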
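The conservation of H for symmetric A, and its failure otherwise, can also be observed numerically. The following is a minimal sketch in plain Python: it integrates Y′′ = AY by the Runge-Kutta method and records the largest deviation of H = ½⟨Y′, Y′⟩ − ½⟨Y, AY⟩ from its initial value. The two matrices, the initial data and the step size are hypothetical choices made for illustration.

```python
def accel(A, Y):
    # Right-hand side A Y of the system Y'' = A Y.
    return [A[0][0] * Y[0] + A[0][1] * Y[1], A[1][0] * Y[0] + A[1][1] * Y[1]]

def energy(A, Y, V):
    # H = (1/2)<Y', Y'> - (1/2)<Y, A Y>, with V standing for Y'.
    AY = accel(A, Y)
    return 0.5 * (V[0] ** 2 + V[1] ** 2) - 0.5 * (Y[0] * AY[0] + Y[1] * AY[1])

def rk4_step(A, Y, V, h):
    # One Runge-Kutta step for the first order system Y' = V, V' = A Y.
    def add(P, Q, s):
        return [P[i] + s * Q[i] for i in range(2)]
    k1y, k1v = V, accel(A, Y)
    k2y, k2v = add(V, k1v, h / 2), accel(A, add(Y, k1y, h / 2))
    k3y, k3v = add(V, k2v, h / 2), accel(A, add(Y, k2y, h / 2))
    k4y, k4v = add(V, k3v, h), accel(A, add(Y, k3y, h))
    Y = [Y[i] + h * (k1y[i] + 2 * k2y[i] + 2 * k3y[i] + k4y[i]) / 6 for i in range(2)]
    V = [V[i] + h * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i]) / 6 for i in range(2)]
    return Y, V

def energy_drift(A, Y, V, h=0.001, steps=5000):
    # Largest |H(t) - H(0)| observed along the numerical solution.
    H0 = energy(A, Y, V)
    drift = 0.0
    for _ in range(steps):
        Y, V = rk4_step(A, Y, V, h)
        drift = max(drift, abs(energy(A, Y, V) - H0))
    return drift

symmetric = [[-2.0, 1.0], [1.0, -2.0]]      # hypothetical symmetric example
nonsymmetric = [[-2.0, 1.0], [0.0, -2.0]]   # hypothetical non-symmetric example
drift_sym = energy_drift(symmetric, [1.0, 0.0], [0.0, 0.5])
drift_non = energy_drift(nonsymmetric, [1.0, 0.0], [0.0, 0.5])
print("drift of H, symmetric A:    ", drift_sym)
print("drift of H, non-symmetric A:", drift_non)
```

For the non-symmetric matrix the same calculation as in the text gives H′ = ½⟨(A − Aᵗ)Y, Y′⟩, which is not zero in general, so a visible drift is to be expected there.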

As an alternative method, we can use linear algebra to reduce the equation to two scalar equations of the form y′′ = −ky. This will give us two conserved quantities. If we assume that $A = A^t$, there exists a matrix P such that $PAP^{-1}$ is diagonal, with eigenvalues $\lambda_1, \lambda_2$; moreover, P may be chosen to be orthogonal, so that $P^{-1} = P^t$. When we make the change of variables $Y = P^{-1}Z$, we obtain

\[ \begin{aligned} H &= \tfrac{1}{2} Y'^t Y' - \tfrac{1}{2} Y^t A Y \\ &= \tfrac{1}{2} Z'^t Z' - \tfrac{1}{2} Z^t PAP^{-1} Z \\ &= \tfrac{1}{2} Z'^t Z' - \tfrac{1}{2} Z^t \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} Z. \end{aligned} \]

Now we make a simple, but crucial, observation. Since

\[ \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & \lambda_2 \end{pmatrix} \]

there is a very natural decomposition

\[ H = \underbrace{\tfrac{1}{2}(z_1'^2 - \lambda_1 z_1^2)}_{H_1} + \underbrace{\tfrac{1}{2}(z_2'^2 - \lambda_2 z_2^2)}_{H_2}. \]

Both $H_1$ and $H_2$ must be conserved quantities as well, since we have decomposed the problem Y′′ = AY into two separate problems $z_i'' = \lambda_i z_i$, i = 1, 2, and the total energy in each of these separate problems is conserved. It is obvious now that these two conserved quantities are sufficient to solve the system.

Problem 4.1. How many conserved quantities can the 2 × 2 system Y′′ = AY have? If A is symmetric, we have already found two conserved quantities. Is it possible to find three, or more? And if A is not symmetric, what happens? (Hint: consider the Jordan normal form of A, and try to find conserved quantities for each situation. Is it possible to construct a conserved quantity for a first order system?)

We can look for conserved quantities for any system of equations, even when there is no obvious physical quantity that seems likely to be conserved, and even when the system itself has no obvious physical meaning. (This direction of thinking does not worry a mathematician!) We can go on to ask whether it is possible to have more than one conserved quantity, and, if so, how many essentially different ones. Most importantly of all, we can ask these questions (and sometimes answer them) for nonlinear equations, not just for linear equations.

Chapter 5

Nonlinear differential equations

A nonlinear scalar equation might not have any solution at all, e.g.

(y′)² + y² = −1.

A nonlinear scalar equation of the form y′ = f(y, t) has a local solution for any initial condition y(0) = c, but in general we cannot say much about the space of all such solutions. However, there are many important nonlinear equations, and we shall study them by focusing on the idea of conserved quantities.

The pendulum [1] equation

y′′ = −k sin y

(with k > 0) is a familiar example of a nonlinear equation from physics. For small values of y, sin y is near to y, so the situation is similar to the case of y′′ = −ky. For general y, the total energy $H = \tfrac{1}{2}y'^2 - k\cos y$ is a conserved quantity, so we can reduce the equation to a first order equation and then express the solution as an integral. The integral gives an elliptic function. Thus, by recognising a conserved quantity, we succeed in solving the above differential equation.
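The reduction to a first order equation can be written out explicitly, in the same way as for y′′ = −ky in Chapter 4. In the sketch below, D denotes a constant of integration (our notation), and the last step uses the identity cos y = 1 − 2 sin²(y/2), which exhibits the integral as an elliptic integral of the first kind.

```latex
\[
\tfrac{1}{2} y'^2 - k\cos y = H
\;\Longrightarrow\;
y' = \pm\sqrt{2(H + k\cos y)}
\;\Longrightarrow\;
t + D = \pm\int \frac{dy}{\sqrt{2(H + k\cos y)}}
      = \pm\int \frac{dy}{\sqrt{2(H + k) - 4k\sin^2(y/2)}} .
\]
```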

Project 5.1. Using elliptic functions, solve y′′ = −k sin y, and compare the computer-generated solutions with the solutions of y′′ = −ky.

[1] 3D-XplorMath: ODE(1D)-2nd Order:Pendulum, ODE(2D)-1st Order:Pendulum

Let us now consider a system of the form Y′′ = F, i.e.

\[ \begin{aligned} y_1'' &= F_1(y_1, y_2) \\ y_2'' &= F_2(y_1, y_2) \end{aligned} \]

where $F_1, F_2$ are (not necessarily linear) functions. We shall assume that the functions $F_1, F_2$ are “sufficiently differentiable”. This means that we are not going to worry about differentiability; we will assume whatever is necessary for our purposes (if necessary, that both functions are infinitely differentiable). Note that the functions depend directly on $y_1, y_2$ but that they do not depend directly on t. Finally we assume that the zero functions $y_1(t) = 0$, $y_2(t) = 0$ are a solution of the system, i.e. that F(0, 0) = (0, 0).

In physical language, we are considering an autonomous system with an equilibrium point at the origin. The reason for the equilibrium point assumption is that we are usually going to consider systems which have (some) conserved quantities, and for which some solution curves stay in a bounded region of the plane, in which case the most interesting behaviour is that in the vicinity of an equilibrium point.

By the same calculation as in Chapter 4, we see that

\[ H = \tfrac{1}{2} y_1'^2 + \tfrac{1}{2} y_2'^2 + U = \tfrac{1}{2}\langle Y', Y'\rangle + U \]

will be a conserved quantity if F + ∇U = 0, that is, if

\[ F_1(y_1, y_2) = -\frac{\partial U}{\partial y_1} \quad\text{and}\quad F_2(y_1, y_2) = -\frac{\partial U}{\partial y_2}. \]

A necessary condition for the existence of such a function U is $\dfrac{\partial F_1}{\partial y_2} = \dfrac{\partial F_2}{\partial y_1}$.

Project 5.2. Check the last statement. Conversely, show that if $F_1, F_2$ are defined on the entire $(y_1, y_2)$-plane $\mathbb{R}^2$ (or more generally, on a simply connected region containing the solutions in question), then this condition is sufficient for the existence of U. (Hint: this is related to the idea of a “fundamental theorem of calculus” for functions of several variables, and to the properties of the operators div, grad, curl.) Can you generalise this to the n × n case?

Project 5.3. Let A be a matrix which has negative eigenvalues. Use a computer [2] to investigate solutions of equations of the form Y′′ = AY + N, where

\[ N = \begin{pmatrix} n_1 \\ n_2 \end{pmatrix}, \]

and $n_1, n_2$ are simple nonlinear functions such as $y_1^2$, $y_1 y_2$, $y_2^2$ (or linear combinations of these). Determine when there exists a conserved quantity of the form $H = \tfrac{1}{2}\langle Y', Y'\rangle + U$, and consider the effect (if any) of this on the solutions.

[2] 3D-XplorMath: E.g. use ODE(2D)-2nd Order:User.

Unfortunately it is hopeless to expect conserved quantities in general. This is easy to believe, but not easy to prove. In fact, the existence of systems with no conserved quantities is related to the existence of “chaotic” systems, such as the famous Lorenz [3] attractor.

Project 5.4. Find (from the literature) a definition of “chaotic”. Can a system of the form

\[ \begin{aligned} y_1'' &= F_1(y_1, y_2) \\ y_2'' &= F_2(y_1, y_2) \end{aligned} \]

be chaotic?

We shall not discuss chaotic systems, except to say that they are the opposite extreme from systems with many conserved quantities. It is important to keep in mind that “chaotic” is not the same as “nonlinear” (the pendulum is an example of a nonlinear system which has “many conserved quantities”).

[3] 3D-XplorMath: See ODE(3D)-1st Order:Lorenz.

Chapter 6

Normal modes

From the graphs of solutions Y(t) of a second order nonlinear system

Y′′ = AY + nonlinear terms

one can see (Project 5.3) how nonlinear terms disturb the simple geometry of the linear case.

To study this kind of nonlinear system, it is natural to begin with the case where the nonlinear term is “small”. We expect the behaviour of the nonlinear system to be similar to that of the linear system Y′′ = AY, if the solution Y also remains small.

If A is diagonalizable, the n × n linear system Y′′ = AY decomposes (after a suitable change of variables) into n independent problems

\[ z_1'' = \lambda_1 z_1, \quad \ldots, \quad z_n'' = \lambda_n z_n. \]

This decomposition is called the decomposition into normal modes. The system is said to be in the i-th mode if $z_j \equiv 0$ for all $j \neq i$. A typical solution of the system can be regarded as being in a combination of the normal modes. The conserved quantity $H_i = \tfrac{1}{2}(z_i'^2 - \lambda_i z_i^2)$ is called the energy of the i-th normal mode.

This kind of decomposition will not be possible for a nonlinear system such as Y′′ = AY + nonlinear term. (Indeed, even for the linear system Y′′ = AY, our decomposition was based on the assumption that A was symmetric.)


For the rest of this chapter we assume that n = 2. As a first generalization of the linear case (with symmetric A), let us consider the system Y′′ = −∇U where

\[ -2U = a y_1^2 + 2b y_1 y_2 + d y_2^2 + e y_1^3 + f y_1^2 y_2 + g y_1 y_2^2 + h y_2^3. \]

If $-2U = a y_1^2 + 2b y_1 y_2 + d y_2^2$, i.e. a quadratic polynomial, we have the linear case. Thus we have introduced “just a little” nonlinearity into U by allowing it to be a cubic polynomial.

The system is

\[ \begin{aligned} y_1'' &= -\frac{\partial U}{\partial y_1} = a y_1 + b y_2 + \tfrac{3}{2} e y_1^2 + f y_1 y_2 + \tfrac{1}{2} g y_2^2 \\ y_2'' &= -\frac{\partial U}{\partial y_2} = b y_1 + d y_2 + \tfrac{1}{2} f y_1^2 + g y_1 y_2 + \tfrac{3}{2} h y_2^2 \end{aligned} \]

which is of the form Y′′ = AY + quadratic term.
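The right-hand sides above can be checked against the potential U numerically. The following sketch, in plain Python, compares the stated formulae with a central-difference approximation of −∇U at a sample point. The coefficient values, the sample point and the function names are assumptions made for illustration.

```python
def U(y1, y2, c):
    # Potential with -2U equal to the cubic polynomial in the text.
    a, b, d, e, f, g, h = c
    return -0.5 * (a * y1**2 + 2 * b * y1 * y2 + d * y2**2
                   + e * y1**3 + f * y1**2 * y2 + g * y1 * y2**2 + h * y2**3)

def force(y1, y2, c):
    # Right-hand sides of the system, as written in the text.
    a, b, d, e, f, g, h = c
    F1 = a * y1 + b * y2 + 1.5 * e * y1**2 + f * y1 * y2 + 0.5 * g * y2**2
    F2 = b * y1 + d * y2 + 0.5 * f * y1**2 + g * y1 * y2 + 1.5 * h * y2**2
    return F1, F2

def minus_grad_U(y1, y2, c, eps=1e-5):
    # Central-difference approximation of -grad U.
    dU1 = (U(y1 + eps, y2, c) - U(y1 - eps, y2, c)) / (2 * eps)
    dU2 = (U(y1, y2 + eps, c) - U(y1, y2 - eps, c)) / (2 * eps)
    return -dU1, -dU2

# Hypothetical coefficients (a, b, d, e, f, g, h) and sample point.
coeffs = (-2.0, 1.0, -3.0, 0.3, -0.2, 0.5, 0.1)
F = force(0.4, -0.7, coeffs)
G = minus_grad_U(0.4, -0.7, coeffs)
gradient_error = max(abs(F[0] - G[0]), abs(F[1] - G[1]))
print("largest difference between F and -grad U:", gradient_error)
```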

A linear change of variable in this case cannot be expected to decompose both the linear term and the quadratic term of the system in an equitable way. An alternative idea is to ignore the analysis of the linear case, and divide up the total energy

\[ H = \tfrac{1}{2} y_1'^2 + \tfrac{1}{2} y_2'^2 - \tfrac{1}{2}\bigl(a y_1^2 + 2b y_1 y_2 + d y_2^2 + e y_1^3 + f y_1^2 y_2 + g y_1 y_2^2 + h y_2^3\bigr) \]

in any “reasonably symmetrical” way. For example, let us define $H_1^{\mathrm{trial}}$ and $H_2^{\mathrm{trial}}$ by

\[ \begin{aligned} H_1^{\mathrm{trial}} &= \tfrac{1}{2} y_1'^2 - \tfrac{1}{2}(a y_1^2 + b y_1 y_2) - \tfrac{1}{2} y_1 (e y_1^2 + \tfrac{1}{2} f y_1 y_2 + \tfrac{1}{2} g y_2^2) \\ H_2^{\mathrm{trial}} &= \tfrac{1}{2} y_2'^2 - \tfrac{1}{2}(b y_1 y_2 + d y_2^2) - \tfrac{1}{2} y_2 (\tfrac{1}{2} f y_1^2 + \tfrac{1}{2} g y_1 y_2 + h y_2^2). \end{aligned} \]

If we are lucky, $H_1^{\mathrm{trial}}$ and $H_2^{\mathrm{trial}}$ will be conserved quantities. If not, we can ask how $H_1^{\mathrm{trial}}$ and $H_2^{\mathrm{trial}}$ vary (for example whether they satisfy differential equations which are simpler than the original system). A better procedure, perhaps, is to diagonalize the linear part first, then divide up the cubic part.

However, if we are dealing with a solution in which $y_1, y_2$ remain small, then the contributions of the nonlinear (cubic) terms of $H_1, H_2$ are very small indeed. So the question of how to divide up the cubic terms of H might be irrelevant, and we might as well use the same $H_1, H_2$ as in the linear case. Let us introduce

\[ \begin{aligned} H_1^{\mathrm{linear}} &= \tfrac{1}{2}(z_1'^2 - \lambda_1 z_1^2) \\ H_2^{\mathrm{linear}} &= \tfrac{1}{2}(z_2'^2 - \lambda_2 z_2^2) \end{aligned} \]

and call these “the energies of the normal modes of the approximating linear system”. We shall abbreviate this to “the energies of the normal modes”, but it is important to keep in mind that $H_i^{\mathrm{linear}}$ is evaluated on solutions of the nonlinear system.
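A computer experiment of this kind might be set up as in the following minimal sketch, written in plain Python. Everything specific here is an assumption made for illustration: the symmetric matrix A (with eigenvalues −1 and −3 and orthonormal eigenvectors (1, 1)/√2 and (1, −1)/√2), the cubic coefficients e = h = 0.1 with f = g = 0, the initial data and the step size. The sketch integrates the nonlinear system, samples the energies of the normal modes, and uses the total energy H as a check on the numerical accuracy.

```python
import math

A = ((-2.0, 1.0), (1.0, -2.0))   # hypothetical symmetric matrix
LAMBDA = (-1.0, -3.0)            # its eigenvalues
E_CUBIC = 0.1                    # hypothetical cubic coefficients e = h (f = g = 0)

def force(y):
    # Y'' = AY + quadratic term, derived from the cubic potential U.
    y1, y2 = y
    return (A[0][0] * y1 + A[0][1] * y2 + 1.5 * E_CUBIC * y1**2,
            A[1][0] * y1 + A[1][1] * y2 + 1.5 * E_CUBIC * y2**2)

def total_energy(y, v):
    # H = (1/2)|Y'|^2 + U, with -2U the cubic polynomial.
    y1, y2 = y
    minus_2U = (A[0][0] * y1**2 + 2 * A[0][1] * y1 * y2 + A[1][1] * y2**2
                + E_CUBIC * (y1**3 + y2**3))
    return 0.5 * (v[0]**2 + v[1]**2) - 0.5 * minus_2U

def mode_energies(y, v):
    # Energies of the normal modes of the approximating linear system.
    s = math.sqrt(0.5)
    z = (s * (y[0] + y[1]), s * (y[0] - y[1]))
    zp = (s * (v[0] + v[1]), s * (v[0] - v[1]))
    return tuple(0.5 * (zp[i]**2 - LAMBDA[i] * z[i]**2) for i in range(2))

def rk4_step(y, v, h):
    # One Runge-Kutta step for Y' = V, V' = force(Y).
    def shift(p, q, s):
        return (p[0] + s * q[0], p[1] + s * q[1])
    k1y, k1v = v, force(y)
    k2y, k2v = shift(v, k1v, h / 2), force(shift(y, k1y, h / 2))
    k3y, k3v = shift(v, k2v, h / 2), force(shift(y, k2y, h / 2))
    k4y, k4v = shift(v, k3v, h), force(shift(y, k3y, h))
    y = tuple(y[i] + h * (k1y[i] + 2 * k2y[i] + 2 * k3y[i] + k4y[i]) / 6 for i in range(2))
    v = tuple(v[i] + h * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i]) / 6 for i in range(2))
    return y, v

y, v = (0.3, 0.0), (0.0, 0.0)    # small initial displacement, zero velocity
h, steps = 0.001, 20000
H_start = total_energy(y, v)
history = []
for n in range(steps):
    y, v = rk4_step(y, v, h)
    if n % 2000 == 0:
        history.append(mode_energies(y, v))
H_end = total_energy(y, v)
for H1, H2 in history:
    print("H1 =", H1, " H2 =", H2)
print("change in total energy:", abs(H_end - H_start))
```

Starting with $y_1 = y_2$ would be an instance of the “accidental special features” that one has to watch for: with these particular coefficients the solution then stays in the first mode for all time, so the initial data above are deliberately not of that form.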

It is difficult to guess what will happen, as various scenarios are plausible. The energies of the normal modes might remain approximately constant, or they might vary in a simple and predictable way, or one of them might eventually dominate the others, or they might vary “randomly”, etc. This is a situation where computer experiments are helpful, and it is precisely this kind of experiment which was carried out by Fermi, Pasta, and Ulam.

As usual, there are various factors to consider when we perform computer experiments. For example:

(i) Accidental mathematical (or physical) special features of the particular system.

(ii) The accuracy of the approximation of the nonlinear system by the linear system (and the accuracy of the approximation of the respective normal modes).

(iii) Numerical error in the computer calculations.

(iv) The possibility of misinterpretation of the experimental results (e.g. by running the experiment for too short a time).

In other words, we have to be very careful.

Project 6.1. Take A to be a 2 × 2 or 3 × 3 symmetric matrix. Consider some examples of systems of the form Y′′ = AY + nonlinear term. Use a computer to study the energies $H_i^{\mathrm{linear}} = \tfrac{1}{2}(z_i'^2 - \lambda_i z_i^2)$ of the normal modes. In each case, consider whether the solution Y remains small or not.

Chapter 7

The Fermi-Pasta-Ulam experiment - 1

Physical intuition was the driving force in the experimental calculations of Fermi, Pasta and Ulam when they used one of the first electronic computers, in 1954-1955, to investigate the behaviour of the energies of the normal modes of a certain nonlinear system of differential equations.

Their system was a “lattice model”, that is, Newton’s equations of motion for a collection of particles connected by springs. However, the forces in the springs were taken to be nonlinear (in contrast to a perfect spring, which would exert a linear force, by Hooke’s law). It was not possible to solve these equations explicitly. However, numerical simulations of the motion of the particles were carried out using the computer.

This was an early example of “experimental mathematics”: Fermi, Pasta and Ulam performed this experiment in order to verify their intuition (and current theories) about what would happen. The forces in the springs were taken as

linear + small nonlinear terms

i.e. approximately linear. The energies of the normal modes of the approximating linear system were computed (these were expected to yield more interesting information than the actual positions of the particles).

They expected that the effect of the nonlinearity would be “thermalization”: the energies of the normal modes should vary unpredictably at first, but eventually settle down [1], on average, to the same stable value. This is the kind of behaviour expected in thermodynamics, or more generally for nonlinear systems with many degrees of freedom.

But it did not happen!

The energies of the normal modes varied in a complicated way, but certainly not randomly. The experimenters found this behaviour amazing, and could not explain it. Over the next 20 years, a tremendous amount of mathematics was discovered in attempts to understand the situation.

A historical account of this story, and how it led to such an unexpected mathematical revolution, can be found in T. P. Weissert, The Genesis of Simulation in Dynamics. Pursuing the Fermi-Pasta-Ulam Problem, Springer, 1997.

In the next chapter we will introduce the Fermi-Pasta-Ulam system.

[1] Roughly speaking!

Chapter 8

Lattice models

The motion of a collection of N particles, interconnected by “springs”, provides an example of a system of o.d.e. which is interesting from both the mathematical and the physical points of view. It is usually impossible to solve such a complicated system explicitly. Therefore, we shall study the system indirectly, for example by looking for conserved quantities. We shall be guided by computer experiments in which we solve examples of such systems numerically.

Figure 8.1: A 2-dimensional lattice

We shall consider a one-dimensional lattice, where N particles of unit mass lie on a straight line, and we shall denote their positions by

\[ Y_1(t), Y_2(t), \ldots, Y_N(t) \]

at time t. There are N − 1 springs, and we shall assume that the “restoring” force $T_i$ in the i-th spring (connecting $Y_i$ to $Y_{i+1}$) depends only upon the extension of the spring from its equilibrium position. For example, if the spring obeys Hooke’s law, then $T_i$ is directly proportional to the extension, i.e. a linear function of the extension. We shall mainly be interested in springs where $T_i$ is a nonlinear function of the extension.

Figure 8.2: A 1-dimensional lattice

The equilibrium positions of the particles (when the springs exert no forces) will be denoted by $e_1, \ldots, e_N$ (Fig. 8.3). The initial positions will be denoted by $A_1, \ldots, A_N$ (Fig. 8.4) and the initial velocities by $v_1, \ldots, v_N$. (For simplicity, we shall assume that $v_1 = \cdots = v_N = 0$ unless stated otherwise.) That is, we write $A_i = Y_i(0)$ and $v_i = Y_i'(0)$.

Figure 8.3: Equilibrium positions

Figure 8.4: Initial positions

Our notation for the forces is as in Fig. 8.5. That is, the restoring force $T_i$ in the i-th spring gives a force $T_i$ in the positive direction on the i-th particle, and a force $T_i$ in the negative direction on the (i+1)-th particle. (Of course, if the spring is compressed, then $T_i$ will be negative.)

Figure 8.5: Forces

Let $y_i = Y_i - e_i$ be the displacement of $Y_i$ from its equilibrium position $e_i$. Then the extension of the i-th spring at time t is $y_{i+1}(t) - y_i(t)$. From the mathematical point of view, it is convenient to work with the functions $y_1, \ldots, y_N$ (rather than $Y_1, \ldots, Y_N$), so we shall define $a_i = A_i - e_i$, i.e. $y_i(0) = a_i$.

Assumption. A function T is given such that $T_i = T(y_{i+1} - y_i)$, for i = 1, …, N − 1.

When 1 < i < N, Newton’s equation for $Y_i$ is therefore

\[ (Y_i'' =)\; y_i'' = T(y_{i+1} - y_i) - T(y_i - y_{i-1}). \]

For the cases i = 1 and i = N, the equations are

\[ \begin{aligned} y_1'' &= T(y_2 - y_1) \\ y_N'' &= -T(y_N - y_{N-1}). \end{aligned} \]

In the special case when Hooke’s law holds, so that T(y) = ky for some positive constant k, Newton’s equation is

\[ y_i'' = k y_{i-1} - 2k y_i + k y_{i+1} \]

for 1 < i < N, and $y_1'' = k(y_2 - y_1)$, $y_N'' = -k(y_N - y_{N-1})$.
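The equations of motion above translate directly into a short routine. The following is a minimal sketch in plain Python; the function name `lattice_acceleration`, the sample displacements and the particular nonlinear force are assumptions made for illustration. It computes the right-hand sides $y_i''$ for a free-ended lattice with an arbitrary force function T, and checks the Hooke's law case against the formulae in the text.

```python
def lattice_acceleration(y, T):
    """Right-hand sides y_i'' for a free-ended lattice with spring force T(extension)."""
    N = len(y)
    # tension[i] is the force in the spring joining particles i and i+1
    # (indices start at 0 here, whereas the text starts at 1).
    tension = [T(y[i + 1] - y[i]) for i in range(N - 1)]
    acc = []
    for i in range(N):
        right = tension[i] if i < N - 1 else 0.0   # no spring to the right of the last particle
        left = tension[i - 1] if i > 0 else 0.0    # no spring to the left of the first particle
        acc.append(right - left)
    return acc

k = 2.0
hooke = lambda u: k * u
y = [0.1, -0.3, 0.25, 0.0, 0.4]    # hypothetical displacements, N = 5
acc = lattice_acceleration(y, hooke)

# Interior particles under Hooke's law: k y_{i-1} - 2k y_i + k y_{i+1}.
expected_interior = [k * y[i - 1] - 2 * k * y[i] + k * y[i + 1] for i in range(1, len(y) - 1)]

# A hypothetical nonlinear force: linear term plus a small quadratic term.
nonlinear = lambda u: k * u + 0.5 * u**2
total_force = sum(lattice_acceleration(y, nonlinear))
print("accelerations (Hooke):", acc)
print("sum of all forces (nonlinear springs):", total_force)
```

The sum of all the forces vanishes for any T, since each spring contributes $T_i$ to one particle and $-T_i$ to its neighbour; this is the observation that $(y_1 + \cdots + y_N)'' = 0$.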

Before beginning the mathematical analysis and the computer experiments, let us make a few remarks about what might be expected to happen (and what Fermi, Pasta and Ulam expected to happen) in the situation where

T(y) = ky + small nonlinear terms

and where the system is disturbed slightly from the equilibrium position. Intuition suggests that the particles will move in a complicated way, but not very far from the equilibrium position.

In addition, physical intuition suggests that “thermalization” will occur: when N is large, the interactions between the particles will be so complicated that, eventually, the particles will all be jiggling slightly in a rather random fashion. After a sufficiently long time, the motion of any “hyperactive” particle should be damped by its neighbours.

Of course, if T(y) = ky, there is no thermalization. In this case, we have a system of linear equations, and the coefficient matrix is symmetric. Hence, by diagonalizing this matrix, we see that the motion of the lattice will consist of N independent simple harmonic motions in disguise, each of which is exactly periodic. Thermalization is expected to occur only in the nonlinear case.

Mathematically, thermalization should be measurable by looking at theenergies of the normal modes of the approximating linear system. In thelinear case, these energies remain constant — they are conserved quantitiesof the system. In the nonlinear case, energy will “leak” between the nor-mal modes. The total energy of all the normal modes will remain (almost)constant, as the total energy is a conserved quantity of the system, even inthe nonlinear case, and eventually this energy should be shared equitablybetween the normal modes.

This was the motivation for the experiment of Fermi, Pasta and Ulam (and in fact their goal was primarily to measure the rate of thermalization; they were confident that thermalization would occur!).

Chapter 9

Lattices with two degrees of freedom

Let us consider some examples where the lattice has exactly two moving particles.

Two particles, free ends.

Figure 9.1: Two particles

In this case the equations are

y′′1 = T (y2 − y1)

y′′2 = −T (y2 − y1)


and we see immediately that these are equivalent to the simpler system

(y1 + y2)′′ = 0

(y1 − y2)′′ = 2T (y2 − y1).

Thus, we have reduced to the scalar equation y′′ = 2T (−y).

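We shall be interested in the case T (y) = ky + nonlinear terms, so let us consider the linear case T (y) = ky first (as usual k > 0). The system is

\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'' = k \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
\]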

and the eigenvalues of the matrix are 0, −2. Choosing corresponding eigenvectors $\tfrac12(1, 1)^t$, $\tfrac12(1, -1)^t$, our usual diagonalization procedure gives

\[
P^{-1} = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad P = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\]

hence z1 = y1 + y2, z2 = y1 − y2. This is exactly the change of variable that we noticed earlier. The general solution is given by

\[
y_1 + y_2 = At + B, \qquad y_1 - y_2 = C \cos\sqrt{2k}\,t + D \sin\sqrt{2k}\,t.
\]

We can identify two special kinds of solution:

(1) y1 + y2 = At + B, y1 − y2 = 0

This is motion in the first normal mode (z2 = 0). The distance between the particles remains constant (Y2 − Y1 = e2 − e1 + y2 − y1 = e2 − e1) and the whole spring slides along the line, with constant velocity.

FIG. 2

(2) $y_1 - y_2 = C \cos\sqrt{2k}\,t + D \sin\sqrt{2k}\,t$, $y_1 + y_2 = 0$

This is motion in the second normal mode (z1 = 0). The centre of the spring remains fixed, and the particles move with equal and opposite simple harmonic motion.

FIG.

From the formulae $z_1 = At + B$, $z_2 = C \cos\sqrt{2k}\,t + D \sin\sqrt{2k}\,t$, the energies of the normal modes can be computed explicitly:

\[
H_1(t) = \tfrac12 A^2, \qquad H_2(t) = k(C^2 + D^2).
\]

These are indeed independent of t; they are conserved quantities of the linear system, as expected.
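As a quick numerical check of these formulae, the following Python sketch evaluates the two mode energies along the explicit solution, using the definition of the energy of a normal mode with eigenvalues 0 and −2k. The function name `mode_energies` and the sample values of k, A, B, C, D are illustrative assumptions, not taken from the text.

```python
import math

def mode_energies(t, k, A, B, C, D):
    """Energies (H1, H2) of the two normal modes at time t, for T(y) = k*y, free ends."""
    w = math.sqrt(2.0 * k)                      # angular frequency of the second mode
    z1_dot = A                                  # z1 = A*t + B, so z1' = A (B drops out)
    z2 = C * math.cos(w * t) + D * math.sin(w * t)
    z2_dot = w * (-C * math.sin(w * t) + D * math.cos(w * t))
    h1 = 0.5 * z1_dot ** 2                      # eigenvalue 0
    h2 = 0.5 * (z2_dot ** 2 + 2.0 * k * z2 ** 2)  # eigenvalue -2k
    return h1, h2

# Arbitrary illustrative constants (assumptions of this sketch).
k, A, B, C, D = 1.5, 0.3, -0.2, 0.7, 0.4
samples = [mode_energies(t, k, A, B, C, D) for t in (0.0, 1.0, 2.5, 10.0)]
for h1, h2 in samples:
    print(h1, h2)
```

Every printed pair should agree, up to rounding, with A²/2 and k(C² + D²).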

Now we turn to the nonlinear case. We shall make the same change of variable z1 = y1 + y2, z2 = y1 − y2 as in the linear case. Even though the diagonalization was carried out only for the linear part, we have seen that this is sensible as it simplifies the problem. We have y1 + y2 = At + B as in the linear situation, and $H^{\mathrm{linear}}_1$ is again a conserved quantity. But y1 − y2 is more complicated, and $H^{\mathrm{linear}}_2$ is not independent of t.

As a concrete example, consider T (y) = sin y. The scalar equation y′′ = 2T (−y) is the pendulum equation of Chapter 5, and it can be solved in terms of elliptic functions. It follows that $H^{\mathrm{linear}}_2$ is a periodic function (it behaves “predictably”) and for small values of y1 − y2 it is approximately constant (because T (y) is approximately y).

Moreover, if (instead of $H^{\mathrm{linear}}_2$) we consider the total energy of the pendulum equation, we would obtain a conserved quantity! Alternatively, using the method of Chapter 5, we see that

\[
H = \tfrac12 (y_1')^2 + \tfrac12 (y_2')^2 + U
\]

is a conserved quantity, where U = − cos(y2 − y1) + C and C is any constant.

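Let us examine $H^{\mathrm{linear}}_2$ and its relation to H more closely. First, using $\sin y = y - \tfrac{1}{3!}y^3 + \cdots$, our nonlinear system can be written in the form

\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'' = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} -\tfrac{1}{3!}(y_2 - y_1)^3 + \cdots \\ \tfrac{1}{3!}(y_2 - y_1)^3 + \cdots \end{pmatrix}
\]

so the approximating linear system is the one above with k = 1. The eigenvalues of the matrix are 0, −2. Hence we have

\[
H^{\mathrm{linear}}_1 = \tfrac12 (z_1')^2, \qquad H^{\mathrm{linear}}_2 = \tfrac12\left((z_2')^2 + 2z_2^2\right)
\]

where z2 = y1 − y2 and y1, y2 satisfy the nonlinear system. Thus

\[
\begin{aligned}
H^{\mathrm{linear}}_1 + H^{\mathrm{linear}}_2 &= \tfrac12\left((z_1')^2 + (z_2')^2 + 2z_2^2\right) \\
&= \tfrac12\left(2(y_1')^2 + 2(y_2')^2 + 2(y_1 - y_2)^2\right).
\end{aligned}
\]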

Now, if we take the constant C to be 1, then the conserved quantity above is

\[
H = \tfrac12 (y_1')^2 + \tfrac12 (y_2')^2 + 1 - \cos(y_2 - y_1).
\]

For small values of y1 − y2, 1 − cos(y2 − y1) is close to $\tfrac12(y_1 - y_2)^2$. Hence, $H^{\mathrm{linear}}_1 + H^{\mathrm{linear}}_2$ is close to being a conserved quantity (actually, close to 2H; we could avoid the factor of 2 by choosing an orthogonal matrix P , e.g. choosing $z_1 = \tfrac{1}{\sqrt2}(y_1 + y_2)$, $z_2 = \tfrac{1}{\sqrt2}(y_1 - y_2)$).
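The relation between H and the sum of the linear mode energies can be observed numerically. The sketch below integrates y1′′ = sin(y2 − y1), y2′′ = −sin(y2 − y1) with a classical fourth-order Runge-Kutta step, and records H (with C = 1) together with the gap between the sum of the two linear mode energies and 2H. The integrator, the step size and the initial data are assumptions made for illustration only.

```python
import math

def accel(y1, y2):
    """Accelerations for two particles with free ends and T = sin."""
    f = math.sin(y2 - y1)
    return f, -f

def rk4_step(state, dt):
    """One classical Runge-Kutta step for state = (y1, y2, v1, v2)."""
    def deriv(s):
        a1, a2 = accel(s[0], s[1])
        return (s[2], s[3], a1, a2)

    def nudge(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))

    k1 = deriv(state)
    k2 = deriv(nudge(state, k1, dt / 2))
    k3 = deriv(nudge(state, k2, dt / 2))
    k4 = deriv(nudge(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def total_energy(s):
    """H = y1'^2/2 + y2'^2/2 + 1 - cos(y2 - y1)."""
    y1, y2, v1, v2 = s
    return 0.5 * v1 ** 2 + 0.5 * v2 ** 2 + 1.0 - math.cos(y2 - y1)

def linear_mode_sum(s):
    """H1_linear + H2_linear with z1 = y1 + y2, z2 = y1 - y2."""
    y1, y2, v1, v2 = s
    z2 = y1 - y2
    return 0.5 * ((v1 + v2) ** 2 + (v1 - v2) ** 2 + 2.0 * z2 ** 2)

state = (0.1, -0.1, 0.05, 0.05)   # small initial separation (illustrative choice)
dt = 0.01
h_values, gaps = [], []
for _ in range(2000):              # integrate up to t = 20
    h_values.append(total_energy(state))
    gaps.append(linear_mode_sum(state) - 2.0 * total_energy(state))
    state = rk4_step(state, dt)

h_spread = max(h_values) - min(h_values)
max_gap = max(abs(g) for g in gaps)
print(h_spread, max_gap)
```

With |y1 − y2| ≤ 0.2 the gap is bounded by roughly (y1 − y2)⁴/12, which is why the sum of the linear mode energies stays so close to 2H in this regime; larger initial separations make the gap visible.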

Four particles, both ends fixed.

Figure 9.2: Four particles, both ends fixed

Consider a lattice with four particles, whose positions are given by Y0, Y1, Y2, Y3. Let us assume that the outer particles remain fixed, i.e. Y0(t) = e0, Y3(t) = e3 for all t. In this case the equations for the middle two particles are

y′′1 = T (y2 − y1) − T (y1 − y0)

y′′2 = T (y3 − y2) − T (y2 − y1)

with y0 = y3 = 0.


In the linear case (T (y) = ky) we obtain

y′′1 = −2ky1 + ky2

y′′2 = ky1 − 2ky2

i.e.

(y1 + y2)′′ = −k(y1 + y2)

(y1 − y2)′′ = −3k(y1 − y2).

The normal mode solutions are

(1) $y_1 + y_2 = A \cos\sqrt{k}\,t + B \sin\sqrt{k}\,t$, $y_1 - y_2 = 0$

FIG. 5

(2) $y_1 - y_2 = C \cos\sqrt{3k}\,t + D \sin\sqrt{3k}\,t$, $y_1 + y_2 = 0$.

FIG. 6

The general solution is a linear combination of the two; it can be described as simple harmonic motion of frequency $\sqrt{k}$ with a simple harmonic forcing term of frequency $\sqrt{3k}$ (or vice versa).
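For comparison with the free-end example, the decoupling used here can also be read off from the coefficient matrix. The following worked computation is only a sketch of that check; the symbol μ for an eigenvalue of the matrix without the factor k is introduced here for convenience and does not appear in the text.

```latex
\[
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}''
= k \begin{pmatrix} -2 & 1 \\ 1 & -2 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix},
\qquad
\det \begin{pmatrix} -2 - \mu & 1 \\ 1 & -2 - \mu \end{pmatrix}
= (\mu + 2)^2 - 1 = (\mu + 1)(\mu + 3).
\]
% Eigenvalues of the full matrix: -k with eigenvector (1, 1)^t, and -3k with eigenvector (1, -1)^t.
\[
z_1 = y_1 + y_2: \quad z_1'' = -k z_1,
\qquad
z_2 = y_1 - y_2: \quad z_2'' = -3k z_2 .
\]
```

The angular frequencies √k and √(3k) of the two normal modes are the square roots of the negatives of these eigenvalues, in agreement with the solutions (1) and (2).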

As a nonlinear example, let us use T (y) = sin y again. We obtain

y′′1 = sin(y2 − y1) − sin y1 = −2y1 + y2 + nonlinear terms

y′′2 = − sin(y2) − sin(y2 − y1) = y1 − 2y2 + nonlinear terms,

hence

(y1 + y2)′′ = − sin y1 − sin y2

(y1 − y2)′′ = − sin y1 + 2 sin(y2 − y1) + sin y2.

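Here there are no obvious explicit solutions, and it is not clear whether the energies of the normal modes are conserved quantities. But there is a “total energy” conserved quantity, as the criterion

\[
\frac{\partial}{\partial y_2}\bigl(\sin(y_2 - y_1) - \sin y_1\bigr) = \frac{\partial}{\partial y_1}\bigl(-\sin(y_2) - \sin(y_2 - y_1)\bigr)
\]

of chapter 5 is satisfied. This gives

\[
H = \tfrac12 (y_1')^2 + \tfrac12 (y_2')^2 - \cos(y_2 - y_1) - \cos y_1 - \cos y_2,
\]

where the signs are fixed by requiring $-\partial U/\partial y_i = y_i''$, exactly as for $U = -\cos(y_2 - y_1) + C$ in the first example.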

A comparison.

In both of the above examples, for any function T (assumed smooth on R), we have a system of the form

y′′1 = F1(y1, y2)

y′′2 = F2(y1, y2)

and it is easy to check that $\partial F_1/\partial y_2 = \partial F_2/\partial y_1$. Therefore we have a total energy function H, and this is a conserved quantity. In both examples, when T (y) = ky (the linear case), we have H = H1 + H2, and the energies H1, H2 of the normal modes are also conserved quantities. But when T is nonlinear, there is an important difference. In the first example, $H^{\mathrm{linear}}_1$ is a conserved quantity, hence $H - H^{\mathrm{linear}}_1$ is also a conserved quantity (but $H^{\mathrm{linear}}_2$ is not a conserved quantity, in general). In the second example, neither $H^{\mathrm{linear}}_1$ nor $H^{\mathrm{linear}}_2$ is a conserved quantity, in general; the existence of other conserved quantities (besides H) is not clear.
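The difference can be seen in a small numerical experiment for the second example (fixed ends, T = sin). The sketch below uses the potential U = −cos(y2 − y1) − cos y1 − cos y2, built from the antiderivative V = −cos of T, and the linear mode energies for the eigenvalues −1 and −3 (k = 1). The Runge-Kutta integrator, the step size and the deliberately large initial displacement are assumptions chosen only to make the effect visible.

```python
import math

def accel(y1, y2):
    """Accelerations for two moving particles between fixed ends, T = sin."""
    a1 = math.sin(y2 - y1) - math.sin(y1)
    a2 = -math.sin(y2) - math.sin(y2 - y1)
    return a1, a2

def rk4_step(state, dt):
    """One classical Runge-Kutta step for state = (y1, y2, v1, v2)."""
    def deriv(s):
        a1, a2 = accel(s[0], s[1])
        return (s[2], s[3], a1, a2)

    def nudge(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))

    k1 = deriv(state)
    k2 = deriv(nudge(state, k1, dt / 2))
    k3 = deriv(nudge(state, k2, dt / 2))
    k4 = deriv(nudge(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def total_energy(s):
    """H = K + U with U = -cos(y2 - y1) - cos(y1) - cos(y2)."""
    y1, y2, v1, v2 = s
    u = -math.cos(y2 - y1) - math.cos(y1) - math.cos(y2)
    return 0.5 * v1 ** 2 + 0.5 * v2 ** 2 + u

def linear_mode_energies(s):
    """Energies of the linear normal modes z1 = y1 + y2 and z2 = y1 - y2 (k = 1)."""
    y1, y2, v1, v2 = s
    z1, z2 = y1 + y2, y1 - y2
    w1, w2 = v1 + v2, v1 - v2
    return 0.5 * (w1 ** 2 + z1 ** 2), 0.5 * (w2 ** 2 + 3.0 * z2 ** 2)

state = (1.0, 0.0, 0.0, 0.0)      # large displacement, so nonlinearity matters
dt = 0.01
h_values, h1_values, h2_values = [], [], []
for _ in range(2000):              # integrate up to t = 20
    h_values.append(total_energy(state))
    e1, e2 = linear_mode_energies(state)
    h1_values.append(e1)
    h2_values.append(e2)
    state = rk4_step(state, dt)

h_spread = max(h_values) - min(h_values)
h1_spread = max(h1_values) - min(h1_values)
h2_spread = max(h2_values) - min(h2_values)
print(h_spread, h1_spread, h2_spread)
```

The spread of H reflects only the integration error, whereas the spreads of the two linear mode energies are genuine features of the nonlinear motion.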

Chapter 10

The linear lattice

We shall refer to the case where T (y) = ky (and k > 0) as the “linear lattice”. For simplicity, let us take k = 1 in this chapter. We shall assume that all particles have zero initial velocity.

Only the messiness of the explicit solutions distinguishes the linear lattice with N particles from the case N = 2: the system behaves like N simple harmonic oscillators, which are uncoupled after a suitable choice of coordinates.

N + 2 particles, both ends fixed.

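First, the matrix form Y ′′ = AY of the system is

\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}'' =
\begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 \\
1 & -2 & 1 & \cdots & 0 \\
0 & 1 & -2 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & -2
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}.
\]

The eigenvalues λl of A are given by

\[
\lambda_l = -4 \sin^2 \frac{lp}{2} = 2 \cos lp - 2, \qquad p = \pi/(N + 1), \quad l = 1, \dots, N
\]

and

\[
V_l = \sqrt{\tfrac{2}{N+1}} \begin{pmatrix} \sin lp \\ \sin 2lp \\ \vdots \\ \sin Nlp \end{pmatrix}
\]

is an eigenvector with eigenvalue λl. Thus

\[
P^{-1} = \sqrt{\tfrac{2}{N+1}} \begin{pmatrix}
\sin p & \sin 2p & \sin 3p & \cdots & \sin Np \\
\sin 2p & \sin 4p & \sin 6p & \cdots & \sin 2Np \\
\sin 3p & \sin 6p & \sin 9p & \cdots & \sin 3Np \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
\sin Np & \sin 2Np & \sin 3Np & \cdots & \sin N^2 p
\end{pmatrix}.
\]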

Note that P = P−1 = P t with this normalization of the eigenvectors.
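These statements are easy to test numerically for a small value of N before attempting a proof. The following sketch builds the sine matrix and the tridiagonal matrix A as nested lists and measures how far PP is from the identity and PAP from the diagonal matrix of eigenvalues. The helper names and the choice N = 6 are assumptions of the sketch; a numerical check of this kind is of course not a substitute for the verification asked for in the problem below.

```python
import math

def sine_matrix(n):
    """Matrix with entries sqrt(2/(n+1)) * sin(i*j*p), p = pi/(n+1), 1 <= i, j <= n."""
    p = math.pi / (n + 1)
    c = math.sqrt(2.0 / (n + 1))
    return [[c * math.sin(i * j * p) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def lattice_matrix(n):
    """Tridiagonal matrix A with -2 on the diagonal and 1 next to it."""
    return [[-2.0 if i == j else (1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

n = 6                                   # illustrative size
P = sine_matrix(n)
A = lattice_matrix(n)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
eigenvalues = [-4.0 * math.sin(l * math.pi / (2 * (n + 1))) ** 2
               for l in range(1, n + 1)]
diagonal = [[eigenvalues[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

err_involution = max_abs_diff(matmul(P, P), identity)            # tests P = P^{-1}
err_diagonal = max_abs_diff(matmul(matmul(P, A), P), diagonal)   # tests P A P^{-1}
print(err_involution, err_diagonal)
```

Both errors should be at the level of floating-point rounding; the symmetry of P is visible directly from the formula for its entries.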

Problem 10.1. Verify the above statements about P . (Later we shall give an easy way to do this.)

Problem 10.2. (a) Using the criterion of Chapter 5, find the most general function U = U(y1, . . . , yN) such that $\tfrac12 (y_1')^2 + \cdots + \tfrac12 (y_N')^2 + U$ is a conserved quantity. (Answer: $U = \tfrac12 y_1^2 + \tfrac12 \sum_{i=2}^{N} (y_i - y_{i-1})^2 + \tfrac12 y_N^2 + C$.) (b) Verify that $H = \tfrac12\langle Y', Y'\rangle - \tfrac12\langle Y, AY\rangle$ is a conserved quantity. Show that this H is, in fact, the sum of the energies of the normal modes.

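Let us look at the solutions, explicitly. The change of variables Z = PY is

\[
z_i = \sqrt{\tfrac{2}{N+1}} \sum_{j=1}^{N} y_j \sin ijp, \qquad y_i = \sqrt{\tfrac{2}{N+1}} \sum_{j=1}^{N} z_j \sin ijp.
\]

We have

\[
PAP^{-1} = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_N \end{pmatrix}
\]

and the new system is Z ′′ = PAP−1Z. The solution of

\[
z_i'' = \lambda_i z_i, \quad \text{with } z_i(0) = B_i, \; z_i'(0) = 0
\]

is $z_i = B_i \cos(t\sqrt{-\lambda_i})$. Hence we obtain

\[
Z = \begin{pmatrix} z_1 \\ \vdots \\ z_N \end{pmatrix} = \begin{pmatrix} B_1 \cos(t\sqrt{-\lambda_1}) \\ \vdots \\ B_N \cos(t\sqrt{-\lambda_N}) \end{pmatrix} = \sum_{i=1}^{N} B_i \cos(t\sqrt{-\lambda_i})\, E_i
\]

where $E_i = (0, \dots, 0, 1, 0, \dots, 0)^t$, the column vector with a 1 in the i-th position and zeros elsewhere.

In terms of the original variable Y = P−1Z we obtain (using P−1Ei = Vi)

\[
Y = P^{-1}Z = \sum_{i=1}^{N} B_i \cos(t\sqrt{-\lambda_i})\, P^{-1}E_i = \sum_{i=1}^{N} z_i V_i.
\]

(This result could have been obtained directly, by looking for a solution of the form $Y = \sum_{i=1}^{N} u_i V_i$, i.e. by substituting this into the equation Y ′′ = AY and obtaining $u_i'' = \lambda_i u_i$.)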

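Motion in the l-th normal mode is represented by the following solution:

\[
z_i = \begin{cases} B_l \cos(t\sqrt{-\lambda_l}) & \text{if } i = l \\ 0 & \text{if } i \neq l \end{cases}
\]

i.e. Z = zlEl. In terms of the original variable Y we have Y = P−1Z = zlVl. Explicitly, this means

\[
\begin{aligned}
y_i &= \sqrt{\tfrac{2}{N+1}} \sum_{j=1}^{N} z_j \sin ijp \\
&= \sqrt{\tfrac{2}{N+1}}\, B_l \cos\bigl(2t \sin \tfrac{lp}{2}\bigr) \sin ilp \\
&= \gamma_l \sin(i\alpha_l) \cos(t\beta_l)
\end{aligned}
\]

where

\[
\gamma_l = \sqrt{\tfrac{2}{N+1}}\, B_l, \qquad \alpha_l = lp, \qquad \beta_l = 2 \sin \tfrac{lp}{2}.
\]

Thus¹, “motion in the l-th normal mode” means that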

—every particle moves with simple harmonic motion of “frequency” βl and

—the “amplitude” for the motion of the i-th particle is γl sin(iαl).

This suggests another point of view. Let us regard the function

\[
\{1, 2, \dots, N\} \to \mathbb{R}, \qquad i \mapsto \gamma_l \sin(i\alpha_l)
\]

as (part of) a “discrete sine curve”. It gives the initial positions of the particles, and we can draw its “graph” (which is indeed part of the graph of $x \mapsto \gamma_l \sin(x\alpha_l)$). It is tempting to connect the N dots by straight lines (Fig. 10.1).

¹ Conventionally, R sin Sx or R cos Sx is said to have amplitude R, frequency S/2π, and period 2π/S.

Figure 10.1: Transverse representation of lattice (discrete wave)

This piecewise-linear wave is the situation at t = 0. As t varies, the wave moves, because of the factor cos(tβl). Thus, we have a different way of visualizing² the motion of the lattice of particles: instead of the longitudinal (horizontal) motion, we represent the positions of the particles in the transverse (vertical) direction.

Problem 10.3. It is also tempting to go further, and regard the function y(i, t) = yi(t) as a solution of the “wave equation” $\partial^2 y/\partial t^2 = \partial^2 y/\partial i^2$, where $\partial^2 y/\partial i^2$ is interpreted as (yi+1 − yi) − (yi − yi−1). It is then natural to use “separation of variables” by trying a solution like yi = sin αi cos βt. By substituting in the original lattice equation, show that $\beta^2 = 4 \sin^2 \tfrac{\alpha}{2}$. Hence $y_i = \sin \alpha i \, \cos(2t \sin \tfrac{\alpha}{2})$ is a solution for any α. (Why should the values $\alpha = \tfrac{l\pi}{N+1}$ appear?)

² 3D-XplorMath: In ODE/Lattice Models both the longitudinal and the transverse visualizations can be selected from the Action/Set Lattice Parameters menu. The number of particles and the step size for the solution of the differential equation can also be selected from Set Lattice Parameters. The black lines represent y1, . . . , yN and the blue lines represent y′1, . . . , y′N .

N + 2 particles, periodic.

Instead of fixing the “end particles”, let us impose the condition

YN+1 = Y0 + C

for some constant C. Thus we have N + 1 “independent moving particles”, whose positions we can take as Y0, Y1, . . . , YN or Y1, Y2, . . . , YN+1. If C = 0, it is natural to interpret Yi as the angular position of the i-th particle, with the particles arranged in a circular fashion (Fig. 10.2).

Figure 10.2: Circular lattice

In any case, the equations of motion become

y′′i = T (yi+1 − yi) − T (yi − yi−1) = yi+1 − 2yi + yi−1

where yi = yi+N+1 for all (positive or negative) integers i. That is, we regard the index of yi as “i mod N + 1”. We shall see that this point of view simplifies the calculations (and as a bonus it will give an easier way to deal with the fixed endpoint case as well).


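Let us use y0, . . . , yN as the independent functions. Then the equations of motion can be written in matrix form as

\[
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{N-2} \\ y_{N-1} \\ y_N \end{pmatrix}'' =
\begin{pmatrix}
-2 & 1 & 0 & & 0 & 0 & 1 \\
1 & -2 & 1 & & 0 & 0 & 0 \\
0 & 1 & -2 & & 0 & 0 & 0 \\
 & & & \ddots & & & \\
0 & 0 & 0 & & -2 & 1 & 0 \\
0 & 0 & 0 & & 1 & -2 & 1 \\
1 & 0 & 0 & & 0 & 1 & -2
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{N-2} \\ y_{N-1} \\ y_N \end{pmatrix}
\]

Adding these equations, we obtain $\bigl(\sum_{i=0}^{N} y_i\bigr)'' = 0$, so $\sum_{i=0}^{N} y_i = At + B$ for some constants A, B. If we take the usual initial condition $y_i' = 0$, then A = 0, so $\sum_{i=0}^{N} y_i = B$. (Sometimes it is natural to impose the condition $\sum_{i=0}^{N} y_i = 0$ from the beginning; this means that we ignore the simple motion of the “centre of mass” of the system.)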

Observe that the coefficient matrix here is

\[
A = -2I + \Omega + \Omega^t = -2I + \Omega + \Omega^{-1}
\]

where

\[
\Omega = \begin{pmatrix}
0 & 1 & 0 & & 0 & 0 & 0 \\
0 & 0 & 1 & & 0 & 0 & 0 \\
0 & 0 & 0 & & 0 & 0 & 0 \\
 & & & \ddots & & & \\
0 & 0 & 0 & & 0 & 1 & 0 \\
0 & 0 & 0 & & 0 & 0 & 1 \\
1 & 0 & 0 & & 0 & 0 & 0
\end{pmatrix};
\]

this represents the cyclic linear transformation which shifts the i-th basis vector to the (i − 1)-th. Evidently A and Ω commute, so they are simultaneously diagonalizable.

Now, Ω is easy to understand. It satisfies $\Omega^{N+1} = I$, but $\Omega^{N} \neq I$, so its minimal polynomial and characteristic polynomial are both equal to $\lambda^{N+1} - 1$. The eigenvalues of Ω are therefore the (N + 1)-th roots of unity.

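Its eigenvectors are easy to construct, as we have

\[
\Omega \begin{pmatrix} 1 \\ \alpha \\ \alpha^2 \\ \vdots \\ \alpha^N \end{pmatrix} = \begin{pmatrix} \alpha \\ \alpha^2 \\ \alpha^3 \\ \vdots \\ 1 \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ \alpha \\ \alpha^2 \\ \vdots \\ \alpha^N \end{pmatrix}
\]

for any (N + 1)-th root of unity α. Thus, we can take

\[
P^{-1} = \sqrt{\tfrac{1}{N+1}} \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \omega & \omega^2 & \cdots & \omega^N \\
1 & \omega^2 & \omega^4 & \cdots & \omega^{2N} \\
\vdots & & & \ddots & \vdots \\
1 & \omega^N & \omega^{2N} & \cdots & \omega^{N^2}
\end{pmatrix}
\]

where $\omega = e^{2\pi\sqrt{-1}/(N+1)}$. The normalization factor is chosen so that $P^t = P$, $\bar{P}^t = P^{-1}$.

Let us write the l-th column as

\[
W_l = \sqrt{\tfrac{1}{N+1}} \begin{pmatrix} (\omega^l)^0 \\ (\omega^l)^1 \\ \vdots \\ (\omega^l)^N \end{pmatrix}.
\]

We have $\Omega W_l = \omega^l W_l$, so

\[
AW_l = (-2I + \Omega + \Omega^{-1})W_l = (-2 + \omega^l + \omega^{-l})W_l.
\]

Hence the eigenvalue of A for the eigenvector Wl is

\[
-2 + \omega^l + \omega^{-l} = -2 + 2 \cos \tfrac{2\pi l}{N+1} = -4 \sin^2 \tfrac{\pi l}{N+1}.
\]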

This is remarkably similar to the situation for a lattice with fixed endpoints, and we shall give the precise relation shortly.
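The eigenvalue computation for the periodic lattice can be confirmed numerically with complex arithmetic. In the sketch below the circulant matrix −2I + Ω + Ω⁻¹ is built directly, the vectors W_l are formed from powers of a root of unity, and the residual of A W_l = λ_l W_l is measured. The function names and the choice of six particles (N + 1 = 6) are assumptions of the sketch.

```python
import cmath
import math

def periodic_matrix(n):
    """The n x n matrix -2I + Omega + Omega^{-1}, for n = N + 1 particles on a circle."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] += -2.0
        a[i][(i + 1) % n] += 1.0
        a[i][(i - 1) % n] += 1.0
    return a

def fourier_vector(n, l):
    """W_l with entries omega^(l*j) / sqrt(n), omega = exp(2*pi*sqrt(-1)/n)."""
    return [cmath.exp(2j * math.pi * l * j / n) / math.sqrt(n) for j in range(n)]

def apply(a, v):
    """Matrix-vector product."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

n = 6                                    # N + 1 = 6 particles (illustrative)
A = periodic_matrix(n)
residual = 0.0
for l in range(n):
    w = fourier_vector(n, l)
    lam = -4.0 * math.sin(math.pi * l / n) ** 2
    aw = apply(A, w)
    residual = max(residual, max(abs(x - lam * y) for x, y in zip(aw, w)))

# The complex conjugate of W_l should coincide with W_{n-l}, i.e. W_{N-l+1}.
conj_gap = max(abs(x.conjugate() - y)
               for l in range(1, n)
               for x, y in zip(fourier_vector(n, l), fourier_vector(n, n - l)))
print(residual, conj_gap)
```

Both numbers should be at rounding level. Each eigenvalue with 0 < l < (N + 1)/2 occurs twice (for l and N + 1 − l), which is what makes real eigenvectors available.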

To conclude this section, we note that the normal mode solutions $Y = \sum_{i=0}^{N} z_i W_i$ are actually complex-valued solutions of Y ′′ = AY ; this is the price we pay for introducing roots of unity. However, it is easy to obtain real solutions: the equation is invariant³ under the symmetry $Y \mapsto \bar{Y}$ (because A is real), so the real solutions are obtained by taking real initial conditions.

Since A and all its eigenvalues are real, $\bar{W}_l$ is an eigenvector with the same eigenvalue as $W_l$. Hence

\[
\tfrac12 (W_l + \bar{W}_l), \qquad \tfrac{1}{2i} (W_l - \bar{W}_l)
\]

are real eigenvectors with the same eigenvalue $-4 \sin^2 \tfrac{\pi l}{N+1}$. It follows that the real initial conditions are given by taking real linear combinations of the real eigenvectors. Note that $\bar{W}_0 = W_0$, and $\bar{W}_{\frac12(N+1)} = W_{\frac12(N+1)}$ if N is odd, so in these cases $W_l$ is already real. Note also that we have $\bar{W}_l = W_{N-l+1}$.

Periodic v. fixed endpoints.

The equations of motion of the periodic linear lattice are invariant under the symmetry $y_i \mapsto -y_{-i}$. In matrix terms, this means that

\[
\tilde{Y} = \begin{pmatrix} -1 & & & \\ & & & -1 \\ & & \iddots & \\ & -1 & & \end{pmatrix} Y
\]

is a solution, whenever Y is a solution.

Problem 10.4. Verify this. Can you find other symmetries of this type?

Let us investigate the smaller system which is obtained by imposing this symmetry, i.e. by imposing the condition yi + y−i = 0 for all i.

First we consider the periodic case with N + 1 = 2M + 2, i.e. N odd. The symmetry condition can be written as yi + y2M+2−i = 0. It follows that y0 = yM+1 = y2M+2 = 0, and the equations for yM+2, . . . , y2M+1 are equivalent to the equations for y1, . . . , yM . The latter equations are exactly the equations for M + 2 particles, with the end particles fixed, i.e. with M “moving particles”. Thus, the case of fixed endpoints can be regarded as a special case of the periodic case.

Problem 10.5. Verify this.

³ This means: if Y is a solution of Y ′′ = AY , then so is $\tilde{Y} = \bar{Y}$.

For example, if N = 5 and M = 2, the system is

y′′0 = y1 − 2y0 + y5

y′′1 = y2 − 2y1 + y0

y′′2 = y3 − 2y2 + y1

y′′3 = y4 − 2y3 + y2

y′′4 = y5 − 2y4 + y3

y′′5 = y0 − 2y5 + y4

Imposing the conditions

y0 + y6 = 0, y1 + y5 = 0, y2 + y4 = 0, y3 + y3 = 0

we obtain

y′′1 = −2y1 + y2, y′′2 = y1 − 2y2.

Note that we cannot obtain the fixed endpoint lattice simply “by imposing the conditions y0 = y6 = 0”.
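The reduction in this example can be checked mechanically. The sketch below builds a state of the six-particle periodic lattice satisfying yi + y−i = 0 (indices mod 6), applies the right-hand side of the periodic equations, and compares the result with the two-particle fixed-end equations. The function names and the sample values of y1, y2 are illustrative assumptions.

```python
def periodic_rhs(y):
    """Right-hand side y_{i+1} - 2*y_i + y_{i-1} of the periodic linear lattice."""
    n = len(y)
    return [y[(i + 1) % n] - 2.0 * y[i] + y[(i - 1) % n] for i in range(n)]

def antisymmetric_state(y1, y2):
    """State (y0, ..., y5) for N = 5 with y_i + y_{-i} = 0, indices taken mod 6."""
    return [0.0, y1, y2, 0.0, -y2, -y1]

y1, y2 = 0.3, -0.7                 # arbitrary sample values
Y = antisymmetric_state(y1, y2)
R = periodic_rhs(Y)
print(R)
```

Here R[1] and R[2] reproduce −2y1 + y2 and y1 − 2y2, while R[0] and R[3] vanish and R[4], R[5] are the negatives of R[2], R[1]. So the symmetric subspace is preserved by the motion: y0 and y3 stay at zero as a consequence of the symmetry, not as an extra constraint.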

Problem 10.6. Deduce the diagonalization in the case of fixed endpoints from the diagonalization of the periodic case.

This is illustrated in Fig. 10.3.

For the periodic case with N + 1 = 2M + 1, i.e. N even, we obtain a similar result. The reduced system in this case is equivalent to a lattice with one end fixed and the other end “free”.


Figure 10.3: Periodic lattice with symmetry

APPENDIX: Fourier analysis on groups

Chapter 11

Nonlinear lattices

In this chapter we consider a lattice with fixed ends as in chapter 10, except that we use a nonlinear force T . We assume that

T (y) = y + S(y)

where S is a smooth function such that S(0) = S′(0) = 0. In particular, this means that the nonlinear term S(y) is small when y is small. These assumptions are natural for a physical system of particles.

Newton’s equations, for the motion of the N particles, are:

y′′1 = T (y2 − y1) − T (y1)

y′′2 = T (y3 − y2) − T (y2 − y1)

. . .

y′′N−1 = T (yN − yN−1) − T (yN−1 − yN−2)

y′′N = T (−yN) − T (yN − yN−1)

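From the above definition of T , these equations can be written in matrix form as follows:

\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}'' =
\begin{pmatrix}
-2 & 1 & 0 & \cdots & 0 \\
1 & -2 & 1 & \cdots & 0 \\
0 & 1 & -2 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & -2
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_N \end{pmatrix}
+
\begin{pmatrix}
S(y_2 - y_1) - S(y_1) \\
S(y_3 - y_2) - S(y_2 - y_1) \\
S(y_4 - y_3) - S(y_3 - y_2) \\
\vdots \\
S(-y_N) - S(y_N - y_{N-1})
\end{pmatrix}.
\]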

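If we make the same change of variables Z = PY as in chapter 10, we obtain a system of the form

\[
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_N \end{pmatrix}'' =
\begin{pmatrix}
\lambda_1 & 0 & 0 & \cdots & 0 \\
0 & \lambda_2 & 0 & \cdots & 0 \\
0 & 0 & \lambda_3 & \cdots & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & \lambda_N
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_N \end{pmatrix}
+
\begin{pmatrix} \ast \\ \ast \\ \ast \\ \vdots \\ \ast \end{pmatrix}
\]

where each ∗ indicates a nonlinear function of z1, . . . , zN .

The energy of the i-th normal mode (of the corresponding linear system) is defined, as in chapter 6, to be

\[
H^{\mathrm{linear}}_i = \tfrac12\left((z_i')^2 - \lambda_i z_i^2\right).
\]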

We shall always consider examples where the differences yi − yi−1 remain small throughout the motion of the lattice, so the system of differential equations for the linear lattice is a good approximation to the above system of nonlinear differential equations. Initially (for small values of t), the solution of the linear lattice is a good approximation to the solution of the nonlinear lattice, and the same applies to the energies of the normal modes. However, for larger values of t, the solutions and the energies of the normal modes can be very different.

Despite the essential differences between the linear and nonlinear cases, both systems are “conservative” in the sense that there is a “potential energy function” U , and the total energy function H = K + U is a conserved quantity. The existence of U follows from the fact that

\[
\frac{\partial}{\partial y_j}\bigl(y_{i+1} - 2y_i + y_{i-1} + S(y_{i+1} - y_i) - S(y_i - y_{i-1})\bigr) =
\frac{\partial}{\partial y_i}\bigl(y_{j+1} - 2y_j + y_{j-1} + S(y_{j+1} - y_j) - S(y_j - y_{j-1})\bigr)
\]

(here we use the criterion of chapter 5). There is an explicit formula for U in this case. Let V be any anti-derivative of the function T , i.e. V ′ = T . (Thus, V is defined only up to the addition of an arbitrary constant.) Let

\[
U(y_1, \dots, y_N) = V(y_1) + \sum_{i=2}^{N} V(y_i - y_{i-1}) + V(-y_N).
\]

It is easily verified that −∂U/∂yi = T (yi+1 − yi) − T (yi − yi−1), i.e. U is a potential function for the nonlinear lattice.
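This verification can also be carried out numerically, by comparing a finite-difference gradient of U with the force on each particle. The sketch below does this for the quadratic nonlinearity T(y) = y + αy², with V(y) = y²/2 + αy³/3. The value of α, the sample configuration and the finite-difference step h are assumptions made for illustration.

```python
ALPHA = 0.25   # illustrative strength of the nonlinear term

def T(y):
    """Force law T(y) = y + alpha*y^2."""
    return y + ALPHA * y * y

def V(y):
    """An antiderivative of T."""
    return 0.5 * y * y + ALPHA * y ** 3 / 3.0

def potential(y):
    """U = V(y_1) + sum_{i=2}^{N} V(y_i - y_{i-1}) + V(-y_N), fixed ends."""
    n = len(y)
    return V(y[0]) + sum(V(y[i] - y[i - 1]) for i in range(1, n)) + V(-y[-1])

def force(y, i):
    """T(y_{i+1} - y_i) - T(y_i - y_{i-1}) with y_0 = y_{N+1} = 0 (0-based index i)."""
    left = y[i - 1] if i > 0 else 0.0
    right = y[i + 1] if i < len(y) - 1 else 0.0
    return T(right - y[i]) - T(y[i] - left)

def minus_gradient(y, i, h=1e-5):
    """Central-difference approximation to -dU/dy_i."""
    up, down = list(y), list(y)
    up[i] += h
    down[i] -= h
    return -(potential(up) - potential(down)) / (2.0 * h)

y = [0.2, -0.1, 0.05, 0.3]        # arbitrary sample configuration, N = 4
errors = [abs(force(y, i) - minus_gradient(y, i)) for i in range(len(y))]
print(max(errors))
```

The discrepancy is limited only by the accuracy of the central difference, so it shrinks further when a symbolic derivative is used instead.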

Thus, every nonlinear lattice has at least one conserved quantity, the “total energy” function H = K + U . In chapter 9 we saw that this conserved quantity strongly restricts the motion of a two particle lattice. But, for a many particle lattice, the restraining effect of the total energy function is not very significant. In the search for other conserved quantities, the functions $H^{\mathrm{linear}}_i$ are our first candidates, but in general they are not constant. Nevertheless, it is of interest to study these functions, in the hope of finding new ideas.

Chapter 12

The Fermi-Pasta-Ulam experiment - 2

A brief description of the Fermi-Pasta-Ulam experiment, and its results, were published in

E. Fermi, J. Pasta, and S. Ulam, Studies of nonlinear problems I, Los Alamos Report LA-1940 (1955)

The report was reprinted in

Collected Works of Enrico Fermi II, ed. E. Segre, University of Chicago Press, 1965, pp. 978–988

and is available (at the time of writing) at

http://www.osti.gov/accomplishments/documents/fullText/ACC0041.pdf

Let us take a look at the experiment, following the authors’ description.

The experiment.

The motion of a system of (at most) 64 particles connected by springs, with the endpoints fixed, and with three different kinds of nonlinear spring forces, was simulated by carrying out a computer experiment. This was regarded as an approximation to the motion of a continuous string. The advantages of using this approximation, of course, are that the equations of motion are a system of ordinary differential equations, and that such a system can be solved numerically.

The first nonlinear force was taken to be $T(y) = y + \alpha y^2$ where α is a constant. The second was $T(y) = y + \beta y^3$, where β is a constant. The third was a piecewise-linear approximation to $T(y) = y + \beta y^3$.

For the first type of force, the system was written as

\[
\ddot{x}_i = (x_{i+1} + x_{i-1} - 2x_i) + \alpha\left[(x_{i+1} - x_i)^2 - (x_i - x_{i-1})^2\right]
\]

with i = 1, 2, . . . , 64. The authors do not say anything about the end particles; we have to assume that they modified the equations for i = 1, 64 as we did in Chapters 10, 11, in which case N = 64, or that they considered initial conditions for which x1 and x64 remained zero¹ throughout the motion, in which case N = 62. Here, xi is “the displacement of the i-th point from its original position”. Thus, yi = xi + C for some constant C, but this C does not affect the above equation, so their equation for xi agrees with our equation for yi (except for the treatment of the endpoints).

The size of α, β was taken to be “of the order of one tenth of the maximum displacement of the linear term”. In other words, if the linear system (given by T (y) = y) satisfies |yi(t)| ≤ M , then α, β are taken to be approximately 0.1M (perhaps 0.2M or 0.5M , but not as big as M or 5M). Obviously this is a matter of “judgement”, since the actual solution is not known. Mathematically speaking, there is no reason to restrict α at all, but in practice the authors wanted to look at situations where yi(t) remains close to the equilibrium value yi = 0, and whether this happens can be observed during the experiment.

The value of M depends on the initial conditions; this is clear from the formula for the solution in Chapter 10. Roughly speaking, if the initial values satisfy |yi(0)| ≤ M0, and the initial velocities are all zero, then we can use M = M0. The authors take the initial positions to be given by a “sine wave”, so M = 1, and then they take α = 0.25 in the first experiment whose results are reported in Figure 1 of their report. After performing several simulations, the authors could have decided to modify this restriction on α, β, but evidently they found it to be satisfactory.

The aim of the experiment was to “study the ergodic behaviour of the system”, i.e. the (expected!) phenomenon that nonlinear systems with many degrees of freedom should look like random motion eventually. Specifically, they intended to study the rate at which such random behaviour emerges. Randomness has a precise meaning here (motivated by statistical mechanics): it means that the functions $H^{\mathrm{linear}}_i$ are approximately constant (more precisely, that the time-averages of these functions are approximately equal to the same constant, over sufficiently long periods of time). This is called thermalization. The functions $H^{\mathrm{linear}}_i$ are referred to as the Fourier modes of the system (see the appendix to Chapter 10).

¹ Warning: one cannot simply “impose the conditions x1(t) = x64(t) = 0 on the system”; see Chapter 10.

In order to be able to observe this clearly, the initial conditions were chosen so that the system started in a certain normal mode (or a combination of a small number of normal modes), e.g. $H^{\mathrm{linear}}_1(0) \neq 0$ and $H^{\mathrm{linear}}_i(0) = 0$ for i = 2, . . . , 64. Thus, it was expected that the dominant first mode would gradually become less important, and that (in the long run) all modes would become equally important.

One more decision is needed before starting the experiment: the time period T (i.e. the interval 0 ≤ t ≤ T over which the differential equation is to be solved), and the step-size used in the numerical algorithm (i.e. the subdivision of this time period into small intervals). To some extent these can be adjusted as part of the experiment, but it is important to be aware of potential problems. Even with a modern computer running at full speed for T = 20 days, there is always the possibility of a surprising result appearing on the 21st day. Moreover, we need T to be large in order for the time-average to have any significance. On the other hand, numerical errors will accumulate as time and step-size increase. At this point, the authors allowed themselves to be guided by the linear system, again. They decided to take T to be several hundred “characteristic periods of the corresponding linear problem”. The l-th normal mode of the linear system is given by $Y^{(l)} = \cos(t\sqrt{-\lambda_l})\, V_l$; this is periodic motion with period $2\pi/\sqrt{-\lambda_l}$. Thus, T should be several hundred times the (largest of the) numbers $2\pi/\sqrt{-\lambda_l}$, l = 1, . . . , 64.

Then, the largest period was divided into (up to) 500 “time cycles”, for the numerical calculation, and the step-size is taken to be one such cycle. For example, in the case of Figure 1 of the report, 30,000 cycles (time steps) were used. In that case N = 32, and the largest period is 2π/ (2 sin(π/66)) ≈ 66, so 66 × 500 × 100 has the correct order of magnitude. We shall go into this in more detail later.
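The arithmetic behind the largest period is easy to reproduce. The following sketch lists the periods 2π/√(−λl) of the linear lattice with fixed ends for N = 32 and picks out the largest; dividing it by 500 gives the length of one “time cycle” in the sense described above. The function name `linear_periods` is an illustrative assumption.

```python
import math

def linear_periods(n):
    """Periods 2*pi/sqrt(-lambda_l), l = 1..n, of the fixed-end linear lattice (k = 1)."""
    p = math.pi / (n + 1)
    # sqrt(-lambda_l) = 2*sin(l*p/2), since lambda_l = -4*sin^2(l*p/2)
    return [2.0 * math.pi / (2.0 * math.sin(l * p / 2.0)) for l in range(1, n + 1)]

periods = linear_periods(32)
longest = max(periods)            # belongs to the first normal mode
cycle = longest / 500.0           # one "time cycle" if the period is split into 500 parts
print(longest, cycle)
```

The longest period belongs to l = 1 and is close to 66, as stated in the text.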

Having decided all the ingredients, the experiment was carried out (no doubt very many times, with many minor modifications), and the values of the functions $H^{\mathrm{linear}}_i$ were recorded² “every few hundred cycles”. The graphs of $H^{\mathrm{linear}}_1, \dots, H^{\mathrm{linear}}_5$ were plotted on the same diagram for various experiments, and several of these are presented in the report (Figures 1-7).

This was the Fermi-Pasta-Ulam experiment.

The results.

“Let us say here that the results of our computations show features which were, from the beginning, surprising to us. Instead of a gradual, continuous flow of energy from the first mode to the higher modes, all of the problems show an entirely different behavior.”

Translated into everyday English, this means that the authors were amazed by the results. They expected the dominant initial mode to lose its dominance, gradually sharing its energy with the other modes in an equitable way. It did not. In fact it behaved astonishingly: initially it shared quite a lot of energy with a few close neighbours, but after a while it gathered most of this energy back! It was as though a broken dish repaired itself when the pieces were tossed into the dustbin, or the steam from a boiling kettle hurried back into the kettle when the heat was turned off. It went against all physical intuition.

It is easy to reproduce this experiment using modern computer software. In 3D-XplorMath-J the default demonstration of the Lattice Models gallery does this for the case reported in Figure 1 of the Fermi-Pasta-Ulam report. Detailed instructions are given below in an appendix to this chapter; let us summarize the main points.

In the usual³ notation let us choose N = 32 and $T(y) = y + 0.25y^2$. Let us take the initial position to be given by the first normal mode, and initial velocities zero. Let us take the step size to be 0.3.
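For readers without access to 3D-XplorMath-J, the same set-up can be sketched in a few lines of Python. The numerical method is an assumption: the text does not say which integrator is used, so the sketch takes a velocity Verlet step. The initial amplitude is also an assumption (a sine wave of height 1, yi = sin(iπ/(N + 1))). The mode energies follow the definition of $H^{\mathrm{linear}}_i$ from chapter 11, and the total energy uses V(y) = y²/2 + αy³/3.

```python
import math

N, ALPHA, DT = 32, 0.25, 0.3
P = math.pi / (N + 1)

def force(y):
    """y_i'' = T(y_{i+1} - y_i) - T(y_i - y_{i-1}), T(y) = y + ALPHA*y^2, fixed ends."""
    ext = [0.0] + y + [0.0]
    out = []
    for i in range(1, N + 1):
        right = ext[i + 1] - ext[i]
        left = ext[i] - ext[i - 1]
        out.append((right + ALPHA * right ** 2) - (left + ALPHA * left ** 2))
    return out

def verlet_step(y, v, a):
    """One velocity Verlet step (an assumed choice of integrator)."""
    y_new = [yi + DT * vi + 0.5 * DT * DT * ai for yi, vi, ai in zip(y, v, a)]
    a_new = force(y_new)
    v_new = [vi + 0.5 * DT * (ai + bi) for vi, ai, bi in zip(v, a, a_new)]
    return y_new, v_new, a_new

def mode_energies(y, v):
    """Linear normal mode energies (z_l'^2 - lambda_l * z_l^2) / 2 for l = 1..N."""
    c = math.sqrt(2.0 / (N + 1))
    energies = []
    for l in range(1, N + 1):
        z = c * sum(y[j] * math.sin((j + 1) * l * P) for j in range(N))
        z_dot = c * sum(v[j] * math.sin((j + 1) * l * P) for j in range(N))
        lam = -4.0 * math.sin(l * P / 2.0) ** 2
        energies.append(0.5 * (z_dot ** 2 - lam * z ** 2))
    return energies

def total_energy(y, v):
    """H = K + U with U the sum of V(d) = d^2/2 + ALPHA*d^3/3 over the N + 1 springs."""
    ext = [0.0] + y + [0.0]
    stretches = [ext[i + 1] - ext[i] for i in range(N + 1)]
    pot = sum(0.5 * d * d + ALPHA * d ** 3 / 3.0 for d in stretches)
    return 0.5 * sum(vi * vi for vi in v) + pot

y = [math.sin((i + 1) * P) for i in range(N)]   # first normal mode, amplitude 1 (assumed)
v = [0.0] * N
a = force(y)
e_start, h_start = mode_energies(y, v), total_energy(y, v)
for _ in range(500):                              # 500 steps, i.e. up to t = 150
    y, v, a = verlet_step(y, v, a)
e_end, h_end = mode_energies(y, v), total_energy(y, v)
print(e_start[:4])
print(e_end[:4])
print(h_start, h_end)
```

At t = 0 only the first mode carries energy; after a short run the second mode has already acquired some. Raising the number of steps from 500 to 30,000 carries the integration to t = 9,000, the time scale discussed in the text, and recording `mode_energies` every few hundred steps gives data for graphs of the kind shown in Figure 1 of the report.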

² With a modern computer it would be easy to record the values after every single cycle, but such fine resolution of the resulting graph is unlikely to be helpful either on paper or on the computer screen.

³ The report does not say exactly what Fermi-Pasta-Ulam did. The data given here and in 3D-XplorMath-J represents what we believe they did. The fact that it appears to reproduce Figure 1 supports our choices, but there are some small differences. It would be interesting to know whether these differences are of a mathematical nature (our misunderstanding of what they did) or whether they are merely superficial (numerical error, inaccuracy in the drawing of Figure 1, etc.).

We observe immediately that the energy of the first normal mode startsto decrease, and the energies of the adjacent normal modes increase.

The total energy of the system (indicated by TNE in 3D-XplorMath-J) should remain constant, of course; it is affected only by numerical error, and based on the TNE display the numerical error appears to be very small. The sum of the energies of all the normal modes (indicated by TME in 3D-XplorMath-J) remains approximately constant as well. Independent of numerical error, this differs from the total energy by an amount given by the $\tfrac{\alpha}{3}y^3$ term in $V(y) = \tfrac12 y^2 + \tfrac{\alpha}{3}y^3$. However, since $\tfrac{\alpha}{3}y^3$ remains small, the difference from the total energy should also be small. As stated in the report, “it amounts to at most a few percent”.

Unexpected behaviour can be observed very quickly: instead of the first mode gradually distributing its energy to all 32 modes, the first few modes appear to pass the energy around amongst themselves, never allowing the remaining modes to participate. The really surprising phenomenon is that, around t = 9,000, all the energy appears to be back in the first normal mode. Then the whole process is repeated.

It should be emphasized that this “periodicity” cannot easily be appreciated by looking at the actual longitudinal motion of the lattice, or even the transverse representation. It is revealed by the energies of the normal modes. The bar graphs representing these quantities in real time⁴ illustrate this reasonably well, but the most dramatic representation is that given by the graphs of the energies of the first few normal modes over the entire period⁵ of the experiment. An illustration of this from 3D-XplorMath-J is shown in Fig. 12.1. It resembles Figure 1 of the report quite closely.

Interpretation.

In the rest of this book we shall say more about the mathematics that has been discovered since the 1950’s in attempts to explain this experiment. Let us just make a few general remarks here.

First of all, it cannot be stated with any certainty that the experiment has been explained! The solutions have not been obtained theoretically (e.g. represented by explicit formulae), and it is quite likely that they never will be. This applies equally to the apparently periodic behaviour. So we have to be content with a rather general kind of explanation. In other words, instead of “Why does the Fermi-Pasta-Ulam experiment turn out like this?” the question is “How is it possible that a nonlinear system can have (approximately) periodic behaviour, rather than the ergodic behaviour predicted by physical intuition?”.

Figure 12.1: Graphs of the normal mode energies

⁴ 3D-XplorMath: The bar graphs of the energies of the normal modes are shown at the bottom of the screen.

⁵ 3D-XplorMath: FPU Graph Display.

The basic underlying answer to this question is that not all nonlinearsystems behave in the same way. Some of them are actually “integrablesystems”, in the sense that they have many more conserved quantities thanare visible at first sight; they have “hidden” conserved quantities. There aretwo integrable systems which are “close” to the Fermi-Pasta-Ulam lattice,namely the KdV equation (a nonlinear wave equation, for which the Fermi-Pasta-Ulam lattice is a discrete approximation), and the Toda lattice (whichwe shall consider in detail later). The fact that systems which are “close tointegrable systems” can have similar properties to those integrable systems isknown as the KAM Theorem. All of these facts involve deep mathematics, asignificant amount of which was developed initially in response to the Fermi-

61

Pasta-Ulam experiment. Since then the subject has broadened considerablyand it has proved its worth by giving insight into many other nonlinearsystems; the theory of integrable systems has become a standard tool.

It is possible that there is a much simpler explanation of the Fermi-Pasta-Ulam experiment, of course, and various ideas were floated in the early years. For example, might not the particular numbers 16, 32 and 64 used in the report be responsible for the unexpected behaviour? A system with many symmetries would have its normal mode energies “confined” by these symmetries, and might not thermalize. Might the system in the experiment have been “too close” to the approximating linear system? (Indeed, it was demonstrated later by other authors that thermalization can occur if the initial energy is sufficiently large.) Is the explanation simply that a nonlinear system close to a linear system will always inherit some properties of that linear system? None of these suggestions has provided a satisfactory answer, however, and the most likely scenario is still the one related to “integrability”.

Project 12.1. Try to reproduce Figures 2,4,5 of the FPU report.

Project 12.2. Try to reproduce Figures 3,6,7 of the FPU report.

Project 12.3. Try to reproduce Figure 8 of the FPU report.

Project 12.4. In all Figures of the FPU report, try to estimate the step size and number of cycles (or verify that they are consistent, where they are given).

Project 12.5. In all Figures of the FPU report, try to estimate the total energy (or verify that it is consistent with the energy marked on the graph).

Project 12.6. Investigate the dependence of the FPU results on the number of particles. (For example, whether the “return time to the initial state” is proportional to $N^{2.5}$, as predicted⁶ by Toda.)

Project 12.7. Investigate the dependence of the FPU results on the size of $\alpha$, $\beta$. (Investigate, for example, the “return time to the initial state”, and whether the solutions $y_i(t)$ remain near to zero.)

Project 12.8. Investigate the dependence of the FPU results on the initial condition. (Consider, for example, increasing the total energy, by increasing the initial amplitude and/or by allowing nonzero initial velocities. Alternatively, change the mode(s) of the initial position, e.g. by starting with a “higher” mode, or with a combination of modes.)

⁶ REFERENCE


APPENDIX: Comments on the Fermi-Pasta-Ulam report.

Summary of notation.

Let us summarize the relevant formulae from Chapters 10, 11 before we compare with those in the Fermi-Pasta-Ulam report, for the case $T(y) = y + \alpha y^2$.

First, the total energy of the Fermi-Pasta-Ulam system is

$$H = \frac12 \sum_{i=1}^{N} y_i'^2 + V(y_1) + \sum_{i=2}^{N} V(y_i - y_{i-1}) + V(-y_N)$$

where $V(y) = \frac12 y^2 + \frac{\alpha}{3} y^3$.

Next, let us consider the corresponding linear system (i.e. where $T(y) = y$, $V(y) = \frac12 y^2$). For this system, the $l$-th normal mode solution is given by

$$z_k = \begin{cases} B_l \cos\left(2t \sin\dfrac{l\pi}{2N+2}\right) & \text{if } k = l \\ 0 & \text{if } k \neq l \end{cases}$$

with initial condition

$$z_k(0) = \begin{cases} B_l & \text{if } k = l \\ 0 & \text{if } k \neq l \end{cases} \qquad z_k'(0) = 0$$

(we shall choose the constant $B_l$ later). In terms of the original variables, this solution is given by

$$y_i = \sqrt{\frac{2}{N+1}}\, B_l \cos\left(2t \sin\frac{l\pi}{2N+2}\right) \sin\frac{il\pi}{N+1}$$

with initial condition

$$y_i(0) = \sqrt{\frac{2}{N+1}}\, B_l \sin\frac{il\pi}{N+1}, \qquad y_i'(0) = 0.$$

The total energy (of the linear system in its $l$-th normal mode) is simply the $l$-th normal mode energy, i.e.

$$\frac12 z_l'^2 - \frac12 \lambda_l z_l^2 = 0 + 2\sin^2\frac{l\pi}{2N+2}\, B_l^2 = 2\sin^2\frac{l\pi}{2N+2}\, B_l^2.$$

Finally, returning to the nonlinear system, we still make the change of variables

$$z_k = \sqrt{\frac{2}{N+1}} \sum_{i=1}^{N} y_i \sin\frac{ik\pi}{N+1}$$

but now we have no formula for $y_i$ or $z_k$ and no concept of normal mode solutions. However, we can take the same initial condition as the $l$-th normal mode of the linear system, namely

$$z_k(0) = \begin{cases} C & \text{if } k = l \\ 0 & \text{if } k \neq l \end{cases} \qquad z_k'(0) = 0$$

or, in terms of the original variables,

$$y_i(0) = \sqrt{\frac{2}{N+1}}\, C \sin\frac{il\pi}{N+1}, \qquad y_i'(0) = 0,$$

for some constant $C$. Then it is of interest to see how the solution deviates from the $l$-th normal mode solution of the linear system. However, our main interest will be the quantities $H_k^{\text{linear}} = \frac12 (z_k'^2 - \lambda_k z_k^2)$, and how they deviate from the corresponding quantities for the linear system. For the above initial condition, the initial values of these quantities are given by

$$H_k^{\text{linear}}(0) = \begin{cases} 2C^2 \sin^2\dfrac{l\pi}{2N+2} & \text{if } k = l \\ 0 & \text{if } k \neq l. \end{cases}$$

Although $H_k^{\text{linear}}(t)$ will vary in a complicated way, we expect $\sum_{k=1}^{N} H_k^{\text{linear}}(t)$ to remain close to the total energy of the system, if $\alpha$ is small.
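The quantities summarized above can be computed numerically. The following is a minimal sketch, not the method used in the report or in 3D-XplorMath: the function names are our own, and the velocity Verlet integrator, the step size and the parameter values are assumptions made purely for illustration.

```python
import math

def force(y, alpha):
    """Spring force T(y) = y + alpha*y^2."""
    return y + alpha * y * y

def potential(y, alpha):
    """Potential V(y) = y^2/2 + alpha*y^3/3, chosen so that V'(y) = T(y)."""
    return 0.5 * y * y + alpha * y ** 3 / 3.0

def accelerations(y, alpha):
    """y_i'' = T(y_{i+1} - y_i) - T(y_i - y_{i-1}), with fixed ends y_0 = y_{N+1} = 0."""
    padded = [0.0] + list(y) + [0.0]
    return [force(padded[i + 1] - padded[i], alpha)
            - force(padded[i] - padded[i - 1], alpha)
            for i in range(1, len(y) + 1)]

def total_energy(y, v, alpha):
    """H = kinetic energy + V(y_1) + sum of V(y_i - y_{i-1}) + V(-y_N)."""
    padded = [0.0] + list(y) + [0.0]
    kinetic = 0.5 * sum(vi * vi for vi in v)
    return kinetic + sum(potential(padded[i] - padded[i - 1], alpha)
                         for i in range(1, len(padded)))

def mode_energies(y, v):
    """H_k^linear = (z_k'^2 - lambda_k z_k^2)/2, with lambda_k = -4 sin^2(k pi/(2N+2))."""
    n = len(y)
    scale = math.sqrt(2.0 / (n + 1))
    result = []
    for k in range(1, n + 1):
        weights = [scale * math.sin(i * k * math.pi / (n + 1)) for i in range(1, n + 1)]
        z = sum(w * yi for w, yi in zip(weights, y))
        z_dot = sum(w * vi for w, vi in zip(weights, v))
        omega_half_sq = math.sin(k * math.pi / (2 * n + 2)) ** 2
        result.append(0.5 * z_dot ** 2 + 2.0 * omega_half_sq * z ** 2)
    return result

def initial_mode(n, mode, c):
    """Initial condition z_l(0) = C, all other z_k(0) = 0, zero velocities."""
    scale = math.sqrt(2.0 / (n + 1))
    y = [scale * c * math.sin(i * mode * math.pi / (n + 1)) for i in range(1, n + 1)]
    return y, [0.0] * n

def simulate(y, v, alpha, dt, steps):
    """Velocity Verlet integration (an assumed choice of numerical method)."""
    y, v = list(y), list(v)
    a = accelerations(y, alpha)
    for _ in range(steps):
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
        y = [yi + dt * vi for yi, vi in zip(y, v)]
        a = accelerations(y, alpha)
        v = [vi + 0.5 * dt * ai for vi, ai in zip(v, a)]
    return y, v

if __name__ == "__main__":
    y0, v0 = initial_mode(n=32, mode=1, c=1.0)
    y1, v1 = simulate(y0, v0, alpha=0.25, dt=0.05, steps=2000)
    print("total energy, start and end:", total_energy(y0, v0, 0.25), total_energy(y1, v1, 0.25))
    print("sum of mode energies at end:", sum(mode_energies(y1, v1)))
    print("first four mode energies   :", mode_energies(y1, v1)[:4])
```

With `alpha=0.0` the mode energies should stay at their initial values (up to numerical error), which gives a simple sanity check before the nonlinear term is switched on.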

Notation of Fermi-Pasta-Ulam.

The equations of Fermi-Pasta-Ulam (formula (1) of their report) are

$$\ddot{x}_i = (x_{i+1} + x_{i-1} - 2x_i) + \alpha\left[(x_{i+1} - x_i)^2 - (x_i - x_{i-1})^2\right]$$

with $i = 1, 2, \dots, 64$. If we assume that they modified the equations for $i = 1, 64$ as we did in Chapters 10, 11, then $N = 64$. However, they state in formula (4) that the change of variables is

$$a_k = \sum x_i \sin\frac{ik\pi}{64}$$

which suggests that $N = 63$. (In our notation, $x_i = y_i$ and $a_i = z_i$.) Presumably (as $N$ is large) this conflict does not affect the numerical calculations very much, so let us continue to assume that $N = 64$.

We may interpret formula (5a)

$$\frac12 \dot{x}_i^2 + \frac{(x_{i+1} - x_i)^2 + (x_i - x_{i-1})^2}{2}$$

as the “contribution of the $i$-th particle to the linear part of the total energy”. However, summing over $i$ (and modifying the contributions of end particles as above) gives

$$\sum_{i=1}^{64} \frac12 \dot{x}_i^2 + \frac12 x_1^2 + (x_2 - x_1)^2 + \cdots + (x_{64} - x_{63})^2 + \frac12 (-x_{64})^2$$

which does not agree with our formula

$$\sum_{i=1}^{64} \frac12 y_i'^2 + \frac12 y_1^2 + \frac12 (y_2 - y_1)^2 + \cdots + \frac12 (y_{64} - y_{63})^2 + \frac12 (-y_{64})^2.$$

On the other hand, what Fermi-Pasta-Ulam actually calculate (formula (5b) of the report) is

$$\frac12 a_k'^2 + 2a_k^2 \sin^2\frac{\pi k}{128}$$

and this does agree with our $H_k^{\text{linear}}$ (though with $N = 63$). They expected $\sum_{k=1}^{64} H_k^{\text{linear}}$ to differ from $H$ by “at most a few percent”.

Figure 1 of Fermi-Pasta-Ulam.

In Figure 1 of the Fermi-Pasta-Ulam report, $N = 32$ (rather than 64), $\alpha = 0.25$, and the step size is $1/\sqrt{8} = 0.354$ (assuming that $\delta t^2$ means the square of the step size). The initial shape was taken to be a “sine wave”. Judging from Figure 1, which shows the first mode nonzero and the other modes zero when $t = 0$, the initial condition was the same as that of the first normal mode solution of the linear system.

The largest period of the corresponding linear system is

$$\frac{2\pi}{2\sin\frac{\pi}{66}} = 66.025.$$

If this is divided up into $D$ steps, then the step size is $66.025/D = 0.354$, so $D = 186.5$, which is consistent with the report’s “up to 500 steps”.

The graph shows “first return to the initial state” after about 30,000 steps, i.e. when $t = 30{,}000 \times 0.354 = 10{,}620$. This “recurrence time” would represent $30{,}000/186.5 = 161$ periods of the linear system. This is consistent with the report’s “several hundred characteristic periods”.

The recurrence time t = 10, 620 is reasonably consistent with the 3D-XplorMath-J simulation.
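The arithmetic in this subsection is easy to recheck. The short sketch below does so; the function names are ours, and the default arguments (32 particles, step size 0.354, 30,000 steps) are simply the values quoted in the text.

```python
import math

def linear_period(n, mode=1):
    """Period 2*pi/omega of a normal mode, where omega = 2 sin(mode*pi/(2n+2))."""
    return 2.0 * math.pi / (2.0 * math.sin(mode * math.pi / (2 * n + 2)))

def recurrence_summary(n=32, step=0.354, recurrence_steps=30000):
    """Recompute the estimates in the text from the quoted values."""
    period = linear_period(n)
    steps_per_period = period / step
    return {
        "period": period,
        "steps_per_period": steps_per_period,
        "recurrence_time": recurrence_steps * step,
        "periods_until_recurrence": recurrence_steps / steps_per_period,
    }

if __name__ == "__main__":
    for name, value in recurrence_summary().items():
        print(f"{name}: {value:.3f}")
```

Changing `n` or `mode` gives the corresponding estimates for the other figures of the report, provided the step size and the number of steps can be read off from them.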


APPENDIX: Instructions for the Lattice Models gallery in 3D-XplorMath-J

WARNING: Many parts of the Lattice Models gallery are not working at this time (November 2012). The “fixed boundary condition” part (as in the Fermi-Pasta-Ulam simulation) should work. Detailed instructions are given below.

1. Choose Fermi-Pasta-Ulam from the Lattice Models menu of the Gallery menu.

The panel indicates various choices, and how you can change them. These are:

Initial position y(i) and initial velocity y′(i) of the i-th particle. (However, see 7. below.)

N = 32 means that there are 32 “moving” particles and 2 fixed endpoints.

The spring force function is $T(y) = ay + by^2 + cy^3 + dy^4$ (the standard linear lattice would be given by b = c = d = 0).

Ignore “Range of displayed values”, “Remove Parameter”, “Add Parameter”.

2. Click OK to see the simulation.

The graphs are those of the normal mode energies (only the first few are shown). The bars also show these normal mode energies. (For the standard linear lattice the graphs will be horizontal lines and the bars will remain constant.)

TNE denotes the total energy. It should remain constant (affected only by numerical error).

TME denotes the sum of all the normal mode energies. If the motion of the lattice remains near equilibrium, i.e. if y(i) and y′(i) remain small, then TME should be close to TNE.

3. Click Longitudinal Display (on the right hand side control panel) to see the actual lattice at the top of the display.

The blue lines are “space-time graphs” of the particles. They can be removed by choosing the last item in the Action menu.

The bars showing the normal mode energies remain visible. They (and the entire control panel) can be removed by choosing the appropriate items in the Action menu (e.g. if you want to save the display as a file for printing later).

4. Click Transverse Display to see a wave-like representation of the lattice.

The displacement of each particle from its equilibrium position is shown vertically (rather than horizontally, as in the Longitudinal Display).

The red arrows indicate velocities. After stopping the lattice motion (by clicking the lattice display, or by clicking Stop), you can change the positions of the particles and their velocities individually by clicking-and-dragging with the mouse. Observe that the normal mode bars will change instantaneously. Then click Continue. (If you click Restart, all positions and velocities will be restored to the original settings. If you want to restore the original settings without “starting” the motion, e.g. because you want to modify the positions and velocities immediately, click vmm.latticemodel.command.Reset.)

5. Click FPU Graph Display to return to the graphs of the normal mode energies.

Click vmm.latticemodel.FermiPastaUlam.FPUAverageDisplay to see the time-averages of those graphs. (Ignore Pendulum Display.)

6. Click Set Params to change the number of particles.

For safety, please change this in both places, i.e. “N=” and “Number of Nodes”. You can also change a, b, c, d here (Spring Constant means a). You can change the step size for the numerical solution of the differential equation (vmm.latticemodel.stepSize). A smaller step size gives greater accuracy and slows down the simulation.

7. To return to the start-up panel, choose Change User Data in the Lattice Models menu.

You can change the initial conditions here if you have an explicit formula. After entering the formula, go to the Action menu, then the Initial Shape sub-menu. Here you can choose several typical initial conditions. The last item vmm.latticemodel.UserInitialShape implements the formula in the start-up panel. (In fact, this has to be chosen here; it will not be implemented otherwise.)

You can also change the initial conditions by specifying a certain normal mode: go to the Action menu, then the Initial Mode sub-menu. Ignore the Boundary Condition sub-menu (it should always be kept at vmm.latticemodel.FixedBoundaryCondition).

Chapter 13

The Toda lattice - 1

The nonlinear lattice which has been most thoroughly studied is the Toda lattice. It was introduced¹ by M. Toda around 1967, using a force of the form

$$T(y) = \alpha(e^{\beta y} - 1).$$

From the point of view of physics this is an artificial example. Toda was motivated by purely mathematical reasons; he was searching for a nonlinear lattice which would have some interesting “explicit” solutions. Just as the linear lattice admits trigonometric functions as solutions, Toda designed his lattice so that it would admit elliptic functions as solutions. Remarkably, however, the Toda lattice turned out to have applications in physics, as well as deep mathematical properties.

Let us take the constants $\alpha$, $\beta$ with $\alpha\beta = 1$. Then the Taylor expansion of the exponential function gives

$$T(y) = \alpha\left(\beta y + \tfrac12 (\beta y)^2 + \tfrac16 (\beta y)^3 + \cdots\right) \approx y + \tfrac12 \beta y^2,$$

i.e. the Toda lattice is the same as the Fermi-Pasta-Ulam lattice, up to second order. It turns out (although this is definitely not obvious) that the Toda lattice is an integrable system. For example, the periodic Toda lattice is

$$y_i'' = \alpha e^{\beta(y_{i+1} - y_i)} - \alpha e^{\beta(y_i - y_{i-1})},$$

where $y_i = y_{i+N+1}$ for all $i$. The property of being an integrable system is that there exist $N + 1$ independent conserved quantities (with certain properties, which we shall explain later). Now, we already know two conserved quantities in this case. First we have the obvious conserved quantity $\sum_{i=0}^{N} y_i'$ (since $\sum_{i=0}^{N} y_i'' = 0$). Then we have the total energy function. But there are $N - 1$ more conserved quantities, and these do not have any immediate physical meaning; they are mathematical conserved quantities and it is rather miraculous that they were discovered at all.

¹ See M. Toda, Nonlinear Waves and Solitons, Mathematics and its Applications 5, Kluwer, 1989 and M. Toda, Theory of Nonlinear Lattices, Springer Series in Solid-State Sciences 20, Springer, 1989.

To explain this, we begin with a simpler version of the Toda lattice, which is a modification of the lattice with free ends.

N particles, free ends, with external forces.

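Newton’s equations are

$$\begin{aligned}
y_1'' &= \alpha e^{\beta(y_2 - y_1)} - \alpha \\
y_i'' &= \alpha e^{\beta(y_{i+1} - y_i)} - \alpha e^{\beta(y_i - y_{i-1})}, \quad 2 \le i \le N - 1 \\
y_N'' &= -\alpha e^{\beta(y_N - y_{N-1})} + \alpha
\end{aligned}$$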

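If we write $y_i = \lambda \tilde{y}_i + i\mu$, we obtain the same system for $\tilde{y}_i$

$$\begin{aligned}
\tilde{y}_1'' &= \tilde{\alpha} e^{\tilde{\beta}(\tilde{y}_2 - \tilde{y}_1)} - \tilde{\alpha} \\
\tilde{y}_i'' &= \tilde{\alpha} e^{\tilde{\beta}(\tilde{y}_{i+1} - \tilde{y}_i)} - \tilde{\alpha} e^{\tilde{\beta}(\tilde{y}_i - \tilde{y}_{i-1})}, \quad 2 \le i \le N - 1 \\
\tilde{y}_N'' &= -\tilde{\alpha} e^{\tilde{\beta}(\tilde{y}_N - \tilde{y}_{N-1})} + \tilde{\alpha}
\end{aligned}$$

but with $\tilde{\alpha} = e^{\beta\mu}\alpha/\lambda$ and $\tilde{\beta} = \lambda\beta$, since $y_{i+1} - y_i = \lambda(\tilde{y}_{i+1} - \tilde{y}_i) + \mu$. (Strictly speaking, the constant terms in the first and last equations become $\mp\alpha/\lambda$, which agrees with $\mp\tilde{\alpha}$ only when $\mu = 0$; as these constant terms are deleted below, the discrepancy plays no role in what follows.) We can rescale $\alpha$ and $\beta$ by arbitrary positive numbers or by arbitrary negative numbers this way. For example, if $\alpha, \beta > 0$ then we can rescale so that $\tilde{\alpha} = \tilde{\beta} = 1$.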

Deleting the “awkward” constant terms $-\alpha$, $\alpha$ in the equations for $y_1$, $y_N$ corresponds to adding a constant “outwards” force to the first and last particles (see Fig. 13.1). This additional force causes the whole lattice to expand (if $\alpha > 0$).

Figure 13.1: Lattice with external force applied at each end

We shall make this modification, and (for later convenience) adjust the notation slightly. By taking $\alpha = \beta = -2$, deleting the constant terms, and renaming $y_1, \dots, y_N$ as $q_1, \dots, q_n$, we obtain what is usually called “the open Toda lattice” or “the open Toda molecule”:

$$\begin{aligned}
q_1'' &= -2e^{2(q_1 - q_2)} \\
q_i'' &= -2e^{2(q_i - q_{i+1})} + 2e^{2(q_{i-1} - q_i)}, \quad i = 2, \dots, n - 1 \\
q_n'' &= 2e^{2(q_{n-1} - q_n)}.
\end{aligned}$$

The case $n = 2$ is easy to solve “by hand”. We have $(q_1 + q_2)'' = 0$ so $q_1 + q_2 = At + B$. It is natural to choose $A = B = 0$; this corresponds to fixing the centre of mass of the spring at the origin. So we can put $q = q_1 = -q_2$ and we have to solve

$$q'' = -2e^{4q}.$$

Let us impose the initial conditions $q(0) = a$, $q'(0) = 0$. The quantity $q'^2 + e^{4q}$ is conserved, so we can put $q'^2 + e^{4q} = C$ and integrate this first order equation to obtain (eventually)

$$q(t) = a - \tfrac12 \log\cosh(2te^{2a}).$$

Thus, the two conserved quantities lead to an explicit solution of the nonlinear differential equation.
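The explicit formula can be compared with a direct numerical solution of $q'' = -2e^{4q}$. The sketch below uses the classical fourth-order Runge-Kutta method; the function names, the choice of integrator and the sample values of $a$ and $t$ are our own assumptions, made only for this check.

```python
import math

def exact_q(t, a):
    """Explicit solution q(t) = a - (1/2) log cosh(2 t e^{2a})."""
    return a - 0.5 * math.log(math.cosh(2.0 * t * math.exp(2.0 * a)))

def rk4_solve(a, t_end, steps):
    """Integrate q'' = -2 exp(4q) with q(0) = a, q'(0) = 0; returns (q, q') at t_end."""
    def rhs(q, p):
        return p, -2.0 * math.exp(4.0 * q)

    h = t_end / steps
    q, p = a, 0.0
    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = rhs(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = rhs(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q, p

if __name__ == "__main__":
    a, t_end = 0.1, 1.0
    q, p = rk4_solve(a, t_end, steps=1000)
    print("numerical q :", q)
    print("formula   q :", exact_q(t_end, a))
    # The conserved quantity q'^2 + e^{4q} should equal its initial value e^{4a}.
    print("conserved   :", p * p + math.exp(4.0 * q), math.exp(4.0 * a))
```

The last line checks the conserved quantity $q'^2 + e^{4q}$, whose value is fixed at $C = e^{4a}$ by the initial conditions.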

The case $n = 3$, however, seems much more difficult. This case was investigated numerically by J. Ford, D. Stoddard, and J. Turner, On the integrability of the Toda lattice, Prog. Theoret. Phys. 50 (1973), 1547–1560. They were aware of the concept of integrability, and they aimed to determine whether the Toda lattice is chaotic or integrable. Their experiments suggested integrability. From the point of view of computer experiments, this was a disappointing result! A computer demonstration of chaotic behaviour would have been definitive, but a “suggestion” of integrability was just that, a suggestion. Further techniques would be needed to reach a conclusion.

Shortly afterwards, M. Henon (Integrals of the Toda lattice, Phys. Rev. B 9 (1974), 1921–1923) found the missing “third conserved quantity” which the Ford-Stoddard-Turner experiments had suggested. Then, H. Flaschka (The Toda lattice. I: Existence of integrals, Phys. Rev. B 9 (1974), 1924–1925) gave a remarkably simple explanation, which we shall reproduce in the next chapter.

Chapter 14

The Toda lattice - 2

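Concerning the integrability of the open Toda lattice (see the end of Chapter 13), H. Flaschka made the following remarkable observation. Let us define $n \times n$ matrices $L$, $M$ by

$$L = \begin{pmatrix}
p_1 & Q_1 & 0 & \cdots & 0 & 0 \\
Q_1 & p_2 & Q_2 & \cdots & 0 & 0 \\
0 & Q_2 & p_3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & p_{n-1} & Q_{n-1} \\
0 & 0 & 0 & \cdots & Q_{n-1} & p_n
\end{pmatrix}$$

and

$$M = \begin{pmatrix}
0 & Q_1 & 0 & \cdots & 0 & 0 \\
-Q_1 & 0 & Q_2 & \cdots & 0 & 0 \\
0 & -Q_2 & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & Q_{n-1} \\
0 & 0 & 0 & \cdots & -Q_{n-1} & 0
\end{pmatrix}$$

where $p_i = q_i'$ and $Q_i = e^{q_i - q_{i+1}}$.

Proposition 14.1. The open Toda lattice is equivalent to the matrix equation $L' = [L, M]$.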

The proof is a straightforward calculation.
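For the reader who wishes to check it, the calculation can be sketched entry by entry. We write $Q_0 = Q_n = 0$ (a convention introduced here only to treat the first and last rows uniformly). Since $L$ is symmetric and $M$ is skew-symmetric, $[L, M] = LM - ML$ is symmetric, so it suffices to look at the entries on and above the diagonal:

$$\begin{aligned}
[L, M]_{i,i} &= (Q_{i-1}^2 - Q_i^2) - (-Q_{i-1}^2 + Q_i^2) = 2Q_{i-1}^2 - 2Q_i^2, \\
[L, M]_{i,i+1} &= p_i Q_i - Q_i p_{i+1} = Q_i (p_i - p_{i+1}), \\
[L, M]_{i,i+2} &= Q_i Q_{i+1} - Q_i Q_{i+1} = 0.
\end{aligned}$$

The diagonal entries of $L' = [L, M]$ therefore read $p_i' = 2e^{2(q_{i-1} - q_i)} - 2e^{2(q_i - q_{i+1})}$, which are the equations of the open Toda lattice, while the off-diagonal entries read $Q_i' = Q_i(q_i' - q_{i+1}')$, which holds automatically because $Q_i = e^{q_i - q_{i+1}}$.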

Equations of this type are called Lax equations. The Lax equation immediately gives $n$ conserved quantities:

Corollary 14.2. Each eigenvalue of (the symmetric matrix) $L$ is a conserved quantity for the open Toda lattice.

Proof. We have

$$(\operatorname{trace} L)' = \operatorname{trace}(L') = \operatorname{trace}(LM - ML) = 0.$$

A similar argument gives $(\operatorname{trace} L^k)' = 0$ for any $k$. Hence $\sum_{i=1}^{n} \lambda_i^k$ is constant for any $k$, where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $L$. Hence the symmetric functions of the eigenvalues, and the eigenvalues themselves, are also constant.
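The “similar argument” can be spelled out as follows; this is a sketch which uses only the product rule and the cyclic property $\operatorname{trace}(AB) = \operatorname{trace}(BA)$:

$$\begin{aligned}
(\operatorname{trace} L^k)' &= \sum_{j=0}^{k-1} \operatorname{trace}\left(L^j L' L^{k-1-j}\right) = k \operatorname{trace}\left(L^{k-1} L'\right) \\
&= k \operatorname{trace}\left(L^{k-1}(LM - ML)\right) = k\left(\operatorname{trace}(L^k M) - \operatorname{trace}(L^{k-1} M L)\right) = 0,
\end{aligned}$$

since $\operatorname{trace}(L^{k-1} M L) = \operatorname{trace}(L \cdot L^{k-1} M) = \operatorname{trace}(L^k M)$. The passage from the power sums $\sum_i \lambda_i^k$ to the elementary symmetric functions is the standard one given by Newton's identities.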

Note that, even though $\sum_{i=1}^{n} \lambda_i^k$ is a conserved quantity for any $k$, only $n$ independent conserved quantities arise this way.

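Let us compute the conserved quantities for the case $n = 3$. It suffices to compute the symmetric functions

$$\lambda_1 + \lambda_2 + \lambda_3, \quad \lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1, \quad \lambda_1\lambda_2\lambda_3$$

of the eigenvalues. Writing

$$L = \begin{pmatrix} p_1 & Q_1 & 0 \\ Q_1 & p_2 & Q_2 \\ 0 & Q_2 & p_3 \end{pmatrix} = \begin{pmatrix} s_1 & t_1 & 0 \\ t_1 & s_2 & t_2 \\ 0 & t_2 & s_3 \end{pmatrix}$$

and computing the characteristic polynomial

$$\det(L - \lambda I) = -\lambda^3 + (s_1 + s_2 + s_3)\lambda^2 + (-s_1s_2 - s_2s_3 - s_3s_1 + t_1^2 + t_2^2)\lambda + s_1s_2s_3 - s_1t_2^2 - s_3t_1^2$$

we obtain the following conserved quantities:

$$\begin{aligned}
& s_1 + s_2 + s_3 \\
& s_1s_2 + s_2s_3 + s_3s_1 - t_1^2 - t_2^2 \\
& s_1s_2s_3 - s_1t_2^2 - s_3t_1^2
\end{aligned}$$

The first one is the obvious conserved quantity $q_1' + q_2' + q_3'$. The second one is essentially the same as the total energy. The third one is new.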
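As a numerical illustration, one can integrate the open Toda lattice for $n = 3$ and watch the three quantities above. The following is a sketch under our own assumptions: the function names, the fourth-order Runge-Kutta integrator and the initial data are chosen only for the demonstration.

```python
import math

def toda_rhs(q, p):
    """Open Toda lattice for n = 3: returns (q', p') with p_i = q_i'."""
    e12 = math.exp(2.0 * (q[0] - q[1]))
    e23 = math.exp(2.0 * (q[1] - q[2]))
    return list(p), [-2.0 * e12, 2.0 * e12 - 2.0 * e23, 2.0 * e23]

def conserved_quantities(q, p):
    """The three symmetric functions of the eigenvalues of L, with s_i = p_i, t_i = Q_i."""
    s1, s2, s3 = p
    t1 = math.exp(q[0] - q[1])
    t2 = math.exp(q[1] - q[2])
    return (s1 + s2 + s3,
            s1 * s2 + s2 * s3 + s3 * s1 - t1 ** 2 - t2 ** 2,
            s1 * s2 * s3 - s1 * t2 ** 2 - s3 * t1 ** 2)

def rk4_step(q, p, h):
    """One classical Runge-Kutta step for the first order system (q, p)."""
    def shifted(x, dx, c):
        return [xi + c * di for xi, di in zip(x, dx)]

    k1q, k1p = toda_rhs(q, p)
    k2q, k2p = toda_rhs(shifted(q, k1q, 0.5 * h), shifted(p, k1p, 0.5 * h))
    k3q, k3p = toda_rhs(shifted(q, k2q, 0.5 * h), shifted(p, k2p, 0.5 * h))
    k4q, k4p = toda_rhs(shifted(q, k3q, h), shifted(p, k3p, h))
    q_new = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new

def evolve(q, p, h, steps):
    """Apply rk4_step repeatedly."""
    for _ in range(steps):
        q, p = rk4_step(q, p, h)
    return q, p

if __name__ == "__main__":
    q0, p0 = [0.3, -0.1, -0.2], [0.5, -0.2, 0.1]  # arbitrary sample initial data
    q1, p1 = evolve(q0, p0, h=0.001, steps=1000)
    print("before:", conserved_quantities(q0, p0))
    print("after :", conserved_quantities(q1, p1))
```

The positions and velocities change substantially over the run, while the three printed values should agree to within the accuracy of the integrator.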

Having these independent conserved quantities allows us to solve the system, just as we did in the case $n = 2$. The calculations are messy but the answer is explicit. Let the initial conditions be written as

$$V = L(0) = \begin{pmatrix} 0 & v_1 & 0 \\ v_1 & 0 & v_2 \\ 0 & v_2 & 0 \end{pmatrix}.$$

This means that $q_i'(0) = 0$ for all $i$, and, if we assume that $q_1 + q_2 + q_3 = 0$,

$$\begin{aligned}
q_1(0) &= \tfrac23 \log v_1 + \tfrac13 \log v_2 \\
q_2(0) &= -\tfrac13 \log v_1 + \tfrac13 \log v_2 \\
q_3(0) &= -\tfrac13 \log v_1 - \tfrac23 \log v_2.
\end{aligned}$$

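Then the solution is

$$\begin{aligned}
q_1(t) &= q_1(0) + \tfrac12 \log\frac{\alpha(t)}{\beta(t)} \\
q_2(t) &= q_2(0) + \tfrac12 \log\frac{\beta(t)}{\gamma(t)} \\
q_3(t) &= q_3(0) + \tfrac12 \log\frac{\gamma(t)}{\alpha(t)}
\end{aligned}$$

where

$$\begin{aligned}
\alpha(t) &= v_1^2 + v_2^2 \\
\beta(t) &= v_1^2 \cosh\left(2t\sqrt{v_1^2 + v_2^2}\right) + v_2^2 \\
\gamma(t) &= v_1^2 + v_2^2 \cosh\left(2t\sqrt{v_1^2 + v_2^2}\right).
\end{aligned}$$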
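These formulae can be tested without any differential equation solver, by substituting them into the open Toda equations and approximating $q_i''$ by central differences. The sketch below does this; the function names, the finite-difference step and the sample values of $v_1$, $v_2$, $t$ are assumptions of ours.

```python
import math

def explicit_solution(t, v1, v2):
    """The stated solution (q_1, q_2, q_3) of the open Toda lattice for n = 3."""
    r = math.sqrt(v1 * v1 + v2 * v2)
    c = math.cosh(2.0 * t * r)
    alpha = v1 * v1 + v2 * v2
    beta = v1 * v1 * c + v2 * v2
    gamma = v1 * v1 + v2 * v2 * c
    q1_0 = (2.0 * math.log(v1) + math.log(v2)) / 3.0
    q2_0 = (-math.log(v1) + math.log(v2)) / 3.0
    q3_0 = (-math.log(v1) - 2.0 * math.log(v2)) / 3.0
    return (q1_0 + 0.5 * math.log(alpha / beta),
            q2_0 + 0.5 * math.log(beta / gamma),
            q3_0 + 0.5 * math.log(gamma / alpha))

def residuals(t, v1, v2, h=1e-3):
    """Central-difference estimate of q_i'' minus the right-hand sides of the equations."""
    qm = explicit_solution(t - h, v1, v2)
    q0 = explicit_solution(t, v1, v2)
    qp = explicit_solution(t + h, v1, v2)
    second = [(a - 2.0 * b + c) / (h * h) for a, b, c in zip(qm, q0, qp)]
    e12 = math.exp(2.0 * (q0[0] - q0[1]))
    e23 = math.exp(2.0 * (q0[1] - q0[2]))
    rhs = [-2.0 * e12, 2.0 * e12 - 2.0 * e23, 2.0 * e23]
    return [s - r for s, r in zip(second, rhs)]

if __name__ == "__main__":
    print("q(0)      :", explicit_solution(0.0, 1.0, 0.5))
    print("residuals :", residuals(0.3, 1.0, 0.5))
```

The residuals should be small (of the order of the finite-difference error), and at $t = 0$ the function returns the initial values $q_i(0)$ given above.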

We have to confess that we did not derive these formulae directly from the conserved quantities. We obtained them by applying the following result.

Proposition 14.3. The solution of the open Toda lattice $L' = [L, M]$ with $L(0) = V$ is given by the explicit formula

$$L(t) = (\exp tV)_1^{-1}\, V\, (\exp tV)_1.$$

In this formula, the notation $X_1$ denotes the matrix which is obtained by orthogonalizing the columns of $X$, by the Gram-Schmidt procedure, starting from the last column.


This kind of result is well known in the Lie theory approach to integrable systems. We refer to Chapter 5 of M. A. Guest, Harmonic Maps, Loop Groups, and Integrable Systems, LMS Student Texts 38, Cambridge Univ. Press, 1997 for the details, and references to the literature.

Incidentally, this formula gives another proof of the fact that the eigenvalues of $L$ are independent of $t$, because

$$\begin{aligned}
\det(L(t) - \lambda I) &= \det\left((\exp tV)_1^{-1}\, V\, (\exp tV)_1 - \lambda I\right) \\
&= \det\left((\exp tV)_1^{-1} (V - \lambda I) (\exp tV)_1\right) \\
&= \det(V - \lambda I).
\end{aligned}$$

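As an example, let us apply the formula to the simpler case $n = 2$, whose solution we have given already in Chapter 13. Writing $q_1 = q$, $q_2 = -q$ and $p_1 = p$, $p_2 = -p$, we have

$$L = \begin{pmatrix} p & Q \\ Q & -p \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & Q \\ -Q & 0 \end{pmatrix}$$

where $Q = e^{2q}$. The initial condition is

$$L(0) = \begin{pmatrix} 0 & v \\ v & 0 \end{pmatrix},$$

which means $q(0) = \frac12 \log v$ and $q'(0) = 0$. We have

$$\exp t\begin{pmatrix} 0 & v \\ v & 0 \end{pmatrix} = \begin{pmatrix} \cosh tv & \sinh tv \\ \sinh tv & \cosh tv \end{pmatrix}$$

and

$$\left[\exp t\begin{pmatrix} 0 & v \\ v & 0 \end{pmatrix}\right]_1 = \frac{1}{\sqrt{\sinh^2 tv + \cosh^2 tv}} \begin{pmatrix} \cosh tv & \sinh tv \\ -\sinh tv & \cosh tv \end{pmatrix}.$$

We conclude that

$$L(t) = \frac{v}{\sinh^2 tv + \cosh^2 tv} \begin{pmatrix} -2\sinh tv \cosh tv & 1 \\ 1 & 2\sinh tv \cosh tv \end{pmatrix}.$$

This means that

$$p(t) = -v\,\frac{\sinh 2tv}{\cosh 2tv}, \qquad Q(t) = \frac{v}{\cosh 2tv},$$

and hence

$$q(t) = \tfrac12 \log v - \tfrac12 \log\cosh 2vt,$$

which agrees with the formula in Chapter 13.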
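The same computation can be carried out numerically, which also makes the convention “Gram-Schmidt starting from the last column” concrete. The following sketch is ours (the helper names are hypothetical, and the closed form of $\exp tV$ is the one displayed above); it assumes that $X_1$ is orthogonal, so that its inverse is its transpose.

```python
import math

def gram_schmidt_from_last(x):
    """Orthonormalize the columns of a square matrix (list of rows), last column first."""
    n = len(x)
    cols = [[x[i][j] for i in range(n)] for j in range(n)]
    ortho = [None] * n
    for j in reversed(range(n)):
        w = cols[j][:]
        # Remove the components along the columns already processed (those to the right).
        for k in range(j + 1, n):
            dot = sum(a * b for a, b in zip(cols[j], ortho[k]))
            w = [wi - dot * oi for wi, oi in zip(w, ortho[k])]
        norm = math.sqrt(sum(wi * wi for wi in w))
        ortho[j] = [wi / norm for wi in w]
    return [[ortho[j][i] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def lax_matrix(t, v):
    """L(t) = X_1^{-1} V X_1 for n = 2, where X = exp(tV) and V = [[0, v], [v, 0]]."""
    ch, sh = math.cosh(t * v), math.sinh(t * v)
    x1 = gram_schmidt_from_last([[ch, sh], [sh, ch]])
    big_v = [[0.0, v], [v, 0.0]]
    # X_1 has orthonormal columns, so its inverse is its transpose.
    return matmul(transpose(x1), matmul(big_v, x1))

if __name__ == "__main__":
    t, v = 0.7, 1.3
    print("L(t) from the formula :", lax_matrix(t, v))
    print("expected p(t), Q(t)   :", -v * math.tanh(2 * t * v), v / math.cosh(2 * t * v))
```

The top-left entry of the result is $p(t)$ and the off-diagonal entries are $Q(t)$, to be compared with the closed forms above.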
